YARN has opened up Hadoop to custom distributed applications that do not fit well (if at all) into the classical MapReduce framework. However, this flexibility comes at a price: to implement a custom YARN application, a developer has to deal with all the details of a low-level API. A number of attempts have been made to simplify this process, Apache Twill being the most popular one. For us, though, even that did not seem simple enough.
In this talk we will present the approach we took: building a Flink job that distributes processing across a YARN cluster. An unusual application of simple Flink primitives allowed us to achieve our goal with relatively little development effort.
About the speaker
Vyacheslav Zholudev is a software architect at ResearchGate. He started using Hadoop at ResearchGate more than 4 years ago. He is an early adopter of Flink and enjoys spreading knowledge about it within ResearchGate.
Vyacheslav has a PhD in Computer Science from Jacobs University Bremen.