LinkedIn pioneered a design pattern that uses Kafka as a central, distributed, low-latency component for stream collection: to solve its data integration problem, all of the organization's data is gathered into a central log and made available for real-time subscription.
While putting this design pattern to work at Bouygues Telecom, we quickly confronted the fact that our data comes from many different sources in several raw formats, ranging from text files to binary-encoded logs, with no readily usable business information.
After exploring different solutions, we soon realized the need for a fast, reliable, and scalable data-processing framework to transform raw data and make it available in real time.
In this talk, we'll discuss how we are using Apache Flink with Apache Kafka to achieve real-time data integration and overcome the problems we came across.
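To make the architecture concrete, here is a minimal Python sketch of the kind of per-record transformation such a pipeline applies: each raw line consumed from Kafka is parsed into a structured event before being republished for downstream subscribers. The pipe-delimited field layout, the field names, and the example records are illustrative assumptions, not Bouygues Telecom's actual formats; in the real system this map step would run inside a Flink job.

```python
import json


def parse_raw_line(line):
    """Parse one raw log line into a structured event.

    The layout (timestamp|source|payload) is a hypothetical example,
    not the actual Bouygues Telecom log format.
    """
    timestamp, source, payload = line.strip().split("|", 2)
    return {"ts": int(timestamp), "source": source, "payload": payload}


def transform(raw_lines):
    """Simulate the map step a stream processor would apply to each
    record consumed from a raw-data Kafka topic, emitting JSON strings
    ready to publish to a cleaned-data topic."""
    return [json.dumps(parse_raw_line(line)) for line in raw_lines]


# Illustrative input records (made up for this sketch).
raw = ["1436357700|cdr|call=33s", "1436357701|dsl|sync=ok"]
for event in transform(raw):
    print(event)
```

In the actual deployment, the equivalent parsing logic would be expressed as a Flink `map` operator sitting between a Kafka source and a Kafka sink, so every raw record becomes a structured, subscribable event with low latency.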
About the speaker
Mohamed Amine Abdessemed
Mohamed Amine Abdessemed has been a software engineer and solution architect at Bouygues Telecom since 2013, working on NoSQL and Big Data solutions such as Oracle semantic technologies, Elasticsearch, Kafka, and the Hadoop ecosystem. He holds an engineering degree in computer science and a master's degree in communication networks from the University of Paris 6.