Samsara is an R-like, easy-to-use Scala DSL for machine learning and data analysis. Samsara programs are automatically parallelized and executed on massively parallel dataflow systems such as Apache Flink or Apache Spark.
The talk will introduce Samsara and give a deep dive into a selection of its optimization strategies.
About the speaker
I’m currently a PhD student in the Database Systems and Information Management Group (DIMA) at TU Berlin, working with Prof. Volker Markl.
My research aims to improve the technology for large-scale data analysis on parallel processing platforms. In terms of use cases, I focus on enabling collaborative filtering with billions of interactions and graph mining on graphs with billions of vertices and edges. I am also engaged in open source as a member of the Apache Software Foundation, where I’m a committer and PMC member in the Mahout, Giraph and Flink projects.
During my PhD, I interned at IBM Research Almaden and Twitter in California. After my upcoming graduation, I will join Amazon Berlin as a Machine Learning Scientist and Post-Doctoral Researcher.