Forget the Cloud - Building Lean Batch Pipelines from TCP Streams with Python and DuckDB

I gave this talk at PyCon & PyData DE 2025.
Conference
- PyData Berlin 2025
Video & Material
- Slides
- Video (to be published)
Description
Many industrial and legacy systems still push critical data over TCP streams. Instead of reaching for heavyweight cloud platforms, you can build fast, lean batch pipelines on-prem using Python and DuckDB.
In this talk, you’ll learn how to turn raw TCP streams into structured data sets that are ready for analysis, all running on-premises. We’ll cover key patterns for batch processing, practical architecture examples, and real-world lessons from industrial projects.
If you work with sensor data, logs, or telemetry, and you value simplicity, speed, and control, this talk is for you.
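To give a flavor of the idea, here is a minimal sketch of a TCP-to-DuckDB batch loader. It is not the code from the talk. The wire format (`sensor_id,unix_ts,value`, one record per line), the table name `readings`, the file name `telemetry.duckdb`, and all function names are assumptions made up for illustration.

```python
import socket
from typing import Iterable, Iterator


def read_lines(sock: socket.socket) -> Iterator[str]:
    """Yield non-empty, newline-terminated text lines from a socket until EOF."""
    with sock.makefile("r", encoding="utf-8", newline="\n") as stream:
        for line in stream:
            line = line.strip()
            if line:
                yield line


def parse_record(line: str) -> tuple[str, float, float]:
    """Parse the assumed 'sensor_id,unix_ts,value' line format."""
    sensor_id, ts, value = line.split(",")
    return sensor_id, float(ts), float(value)


def batches(records: Iterable[tuple], size: int) -> Iterator[list[tuple]]:
    """Group records into lists of at most `size` items."""
    batch: list[tuple] = []
    for record in records:
        batch.append(record)
        if len(batch) >= size:
            yield batch
            batch = []
    if batch:  # flush the final, possibly smaller batch
        yield batch


def ingest(host: str, port: int, db_path: str = "telemetry.duckdb",
           batch_size: int = 1000) -> None:
    """Read records from a TCP source and append them to DuckDB in batches."""
    # Third-party dependency (pip install duckdb), imported here so the
    # helpers above stay usable without it.
    import duckdb

    con = duckdb.connect(db_path)
    try:
        con.execute(
            "CREATE TABLE IF NOT EXISTS readings "
            "(sensor_id VARCHAR, ts DOUBLE, value DOUBLE)"
        )
        with socket.create_connection((host, port)) as sock:
            records = (parse_record(line) for line in read_lines(sock))
            for batch in batches(records, batch_size):
                con.executemany("INSERT INTO readings VALUES (?, ?, ?)", batch)
    finally:
        con.close()
```

Calling `ingest("192.0.2.10", 9000)` (a placeholder address) would connect, read until the sender closes the connection, and write one insert batch per 1000 records. A real pipeline would also need reconnect handling and a strategy for malformed lines, which this sketch leaves out.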