What is it
Apache Beam is an open-source, unified programming model that lets you implement batch and streaming pipelines and run them on several different execution engines (runners). SDKs are available for Java, Python and Go; Scala is supported through Spotify's Scio library rather than an official Beam SDK.
Available runners are:
- Direct. For local development.
- Google Cloud Dataflow. The managed runner on GCP.
- Others, including Apache Flink, Apache Spark, Apache Samza and Apache Nemo.
When to use it
We are currently evaluating whether and when to use Apache Beam. For batch processing, our default tool is Dataform. Our hypothesis is that Apache Beam will be useful for real-time processing use cases, e.g. real-time aggregations of vehicle telematics events.
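To make the telematics use case more concrete, here is a minimal plain-Python sketch (deliberately not Beam code) of what a fixed-window, per-vehicle aggregation computes. The event shape, the field names `vehicle_id`, `timestamp` and `speed_kmh`, and the 60-second window are all assumptions made for illustration, not a description of our actual data.

```python
from collections import defaultdict

WINDOW_SECONDS = 60  # assumed window size, chosen only for illustration


def window_start(timestamp, size=WINDOW_SECONDS):
    """Return the start of the fixed window that contains `timestamp`."""
    return timestamp - (timestamp % size)


def average_speed_per_window(events, size=WINDOW_SECONDS):
    """Group events by (vehicle_id, window start) and average their speed."""
    sums = defaultdict(lambda: [0.0, 0])  # key -> [running total, count]
    for event in events:
        key = (event["vehicle_id"], window_start(event["timestamp"], size))
        sums[key][0] += event["speed_kmh"]
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


# Hypothetical telematics events; timestamps are seconds since an arbitrary epoch.
events = [
    {"vehicle_id": "v1", "timestamp": 5, "speed_kmh": 40.0},
    {"vehicle_id": "v1", "timestamp": 50, "speed_kmh": 60.0},
    {"vehicle_id": "v1", "timestamp": 65, "speed_kmh": 80.0},
    {"vehicle_id": "v2", "timestamp": 10, "speed_kmh": 30.0},
]

averages = average_speed_per_window(events)
```

This sketch works on a finite list, so it only shows the shape of the result. In Beam's Python SDK, roughly the same steps would be expressed with `beam.WindowInto(window.FixedWindows(60))` followed by `beam.CombinePerKey(beam.combiners.MeanCombineFn())`, with the runner responsible for applying them to an unbounded stream. Whether that fits our workloads is exactly what the evaluation needs to establish.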