Blog

Fixing Data Skew for Sparse Joins in Flink
Data skew has been one of the trickiest issues to deal with in our production workloads. Several of our customers make very heavy use of LEFT JOINs, and it takes only one of these join operations to create a processing hotspot, resulting in heavy backpressure, low throughput, and failed checkpoints.
Mar 22, 2023

Part 2: Hacking Flink to handle large scale continuous ETL
This post is part 2 of a multi-part series about how we scaled Flink to handle a large-scale continuous ETL pipeline. Check out part 1 of the series if you have not done so yet.
Oct 6, 2022

Part 1: Continuous ETL from PostgreSQL to Elasticsearch with Apache Flink
One of our customers built a platform to help recruiters source and manage candidates. Using modern cloud-native tools, a small team built a great product that scaled to a large number of customers.
Oct 6, 2022