Big data use cases come in many different flavors. Many companies start to adopt Hadoop with a few prominent and complex use cases in mind. At a later stage, when data is already in the cluster and developers are more knowledgeable about the tools, the number of “everyday jobs” usually starts to explode. These jobs often look more straightforward, but due to their sheer number they can quickly account for a major share of both maintenance effort and cluster time. In this talk we look into two everyday use cases and compare Flink with MapReduce (MR) and Hive with regard to developer productivity, ease of maintenance, and performance.
About the speaker
Michael Häusler is Head of Engineering at ResearchGate, a professional network for researchers and scientists. In 2011 he introduced Hadoop at ResearchGate, which now runs many thousands of Hadoop jobs every day.
His special interests are fun and productivity in software engineering, as well as bridging the gaps between real-time, near-real-time, and batch use cases.