Session starts - 14:00

Beyond MapReduce, Scientific data processing in real-time

Proteomics is a branch of the life sciences concerned with studying the abundance and type of proteins found in cells. Just as the Human Genome Project mapped out the genes making up human DNA, proteomics researchers are mapping out the proteins that are created from the information in those genes. Experiments using mass spectrometers typically create large volumes of data, which need complex pre-processing steps before they are useful.

In this talk we look at how a parallel algorithm was developed and how a process initially created for Hadoop MapReduce has been run successfully on Apache Flink. We compare the streaming process with a batch-based Hadoop job and consider the implications this could have for large-scale proteomics laboratories.

About the speaker

Christopher Hillman

Christopher is currently studying part-time for a PhD in Data Science at the University of Dundee, applying Big Data analytics to the data produced from experimentation into the Human Proteome. In his day job, Christopher is Principal Data Scientist in the International Advanced Analytics team at Teradata, based in London and working in the international region. He has 20+ years' experience working with data, information and analytics.


