This repository contains reference implementations of stream processing workloads across different systems.
✔️ done ❌ not possible
| Queries | Flink | Spark Structured Streaming | Kafka Streams | RedPanda | Timely Dataflow | LightSaber | Google Dataflow | Microsoft StreamInsight / Trill | Resources |
|---|---|---|---|---|---|---|---|---|---|
| YSB | ✔️ | | | | | | | | |
| NEXMark Q4 | ✔️ | | | | | | | | |
| NEXMark Q5 | ✔️ | | | | | | | | |
| NEXMark Q7 | ✔️ | | | | | | | | |
| NEXMark Q8 | ✔️ | | | | | | | | |
| NEXMark Q11 | ✔️ | | | | | | | | |
| ClusterMonitoring (LS) | ✔️ | | | | | | | | |
| SmartGrid (LS) | Organizer Description, 500M Data | | | | | | | | |
| LinearRoadBenchmark (LS) | | | | | | | | | |
Node-55 - Intel
Fatnode - AMD Ryzen
Cloud 48 - ARM
Raspberry Pi 2 - ARM
- Build the Java code using Maven.
- Ensure you have a properly configured Flink cluster.
- To run a query, execute:
/path/to/flink/bin/flink run /path/to/your/jar/im-job-vanilla-benchmarks_2.11-0.1-SNAPSHOT.jar -queryName -sourceParallelism SOURCE PARALLELISM -windowParallelism WINDOW OPERATOR PARALLELISM
- SOURCE PARALLELISM: number of threads that will execute the source operator
- WINDOW OPERATOR PARALLELISM: number of threads that will execute the window operator
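The invocation above can be wrapped in a small helper so the two parallelism values are easy to vary between runs. This is only a sketch: the function name `build_run_cmd`, the `FLINK_HOME` and `JOB_JAR` variables, and the example values are assumptions, not part of this repository. It also assumes that `-queryName` takes the name of the query as its value, which the command above does not show. The helper prints the command instead of executing it.

```shell
#!/bin/sh
# Hypothetical wrapper; variable names and defaults are assumptions.
FLINK_HOME="${FLINK_HOME:-/path/to/flink}"
JOB_JAR="${JOB_JAR:-/path/to/your/jar/im-job-vanilla-benchmarks_2.11-0.1-SNAPSHOT.jar}"

# build_run_cmd QUERY SOURCE_PARALLELISM WINDOW_PARALLELISM
# Prints the flink command line (dry run) rather than running it.
build_run_cmd() {
    printf '%s run %s -queryName %s -sourceParallelism %s -windowParallelism %s\n' \
        "$FLINK_HOME/bin/flink" "$JOB_JAR" "$1" "$2" "$3"
}

# QUERY_NAME is a placeholder; 4 and 4 are arbitrary example values.
build_run_cmd QUERY_NAME 4 4
```

To actually submit the job, the printed line can be reviewed first and then passed to the shell, for example with `build_run_cmd QUERY_NAME 4 4 | sh`.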
Please have a look at the following Flink configuration:
jobmanager.rpc.address: cloud-40 # coordinator hostname
taskmanager.compute.numa: true
-XX:+TieredCompilation -server -XX:+UseG1GC 0.3 536870912 8589934592
taskmanager.memory.fraction: 0.5
jobmanager.rpc.port: 6123
jobmanager.heap.size: 8gb
taskmanager.heap.size: 64g
taskmanager.numberOfTaskSlots: 10 # set this to the number of physical cores on the worker
parallelism.default: 1
Please make sure that the coordinator is deployed on a dedicated node and that each task manager also runs on its own dedicated node.
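Since `taskmanager.numberOfTaskSlots` should match the number of physical cores on the worker, one way to obtain that number on a Linux node is sketched below. The helper name `count_physical_cores` is an assumption. It relies on `lscpu -p=Core,Socket`, which prints one `core,socket` line per logical CPU, so counting the distinct lines gives the physical core count without hyper-threads.

```shell
#!/bin/sh
# count_physical_cores: reads `lscpu -p=Core,Socket` output on stdin and
# prints the number of distinct (core, socket) pairs.
count_physical_cores() {
    # Drop lscpu's comment header, deduplicate, count the remaining lines.
    grep -v '^#' | sort -u | wc -l | tr -d ' '
}

# Linux only; guarded so the script still loads where lscpu is missing.
if command -v lscpu >/dev/null 2>&1; then
    echo "taskmanager.numberOfTaskSlots: $(lscpu -p=Core,Socket | count_physical_cores)"
fi
```

The printed line is only a suggestion for the worker's configuration; on heterogeneous clusters it would need to be run on each worker node separately.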