Thursday
Workshop Room
17:40 - 18:40
(UTC+01)
Workshop (60 min)
Part 2/2: Stream the word with Apache Flink and Apache Spark
Stream processing is often compared with batch processing in terms of latency alone. Although latency is a real difference, there are many other technical ones, such as watermarks, state stores, and the micro-batch and dataflow processing models. They make streaming exciting but also more challenging for engineers used to working with batch systems.
In this workshop you're going to see two stream processing models in action. The first is the micro-batch model, which should help you enter the streaming world and learn the basics the easy way. The second is the dataflow model, which has nothing to do with batch processing and therefore requires a bigger shift in mindset. Both parts will be covered with open source data processing frameworks: Apache Spark for the micro-batch part and Apache Flink for the dataflow model.
By the end of this workshop you, coming from a batch processing background, should better understand the streaming world and be able to write your first streaming jobs while taking the gotchas into account.