Big Data Europe 2025 Highlights
Big Data Conference Europe is a conference focused on technical discussions in the areas of Big Data, High Load, Data Science, Machine Learning and AI. I was lucky enough to attend and speak at this year's event in Vilnius, Lithuania.
Day One Highlights

Frank Munz's talk about building real-time aircraft tracking dashboards with Apache Spark Data Pipelines and Databricks was a tough act to follow, but it was really interesting, and I enjoyed seeing a different approach to the same problem I would address later in my own talk.
Olena Kutsenko was amazing as always and brought streaming security to life with her comprehensive talk covering everything you need to consider when trying to keep your company out of the news!
Even though the Cloudflare outage wreaked havoc with my demo, I still enjoyed taking my real-time, real fast talk on its final outing and had some great conversations afterwards with the audience about Apache Kafka, ClickHouse, and open-source aviation data (I'll definitely be trying out GNU Reader now!).
If you want to try out the demo for yourself (minus the Cloudflare-induced data losses) you can check out my post about it here.
Day Two Highlights

I've been curious about Apache Fluss (Incubating) for a while so Giannis Polyzos' talk on the seven deadly sins of streaming (and how Fluss addresses them) was a perfect start to my second day at BDE.
Next up there was a super interesting panel on AI and data products featuring Alessandro Pregnolato, Viktor Kessler, Emiliano Mancuso, Anton Borisov, Pavel Filinov, and Devesh Prasad. There were plenty of great takeaways from the session but I particularly liked Pavel's reminder on the importance of failing fast, learning, and not being afraid to end a project in a single meeting.
Natalia Sokolowska's session about observability for data quality was eye-opening, not least because I hadn't heard the story about Dr Wolf and spinach before! It was really useful to see the tools PAYBACK use to check for data quality at every stage of their pipelines, and it has definitely given me some ideas for how to save myself from myself in my own projects!
Day Three Highlights

I rounded out my time at BDE with a choice selection of talks from the AI track. Adebola Olomo's talk had both practical advice for preserving your unique voice in a world of AI and the inspiring story of her career, which has taken her from studying law in Lagos to founding her own businesses to leading partnerships at Women in AI.
The team at Netflix have a well-deserved reputation for industry-leading engineering, so it was no surprise that Adrian Taruc's deep dive into how Netflix built a real-time distributed graph had some great insights into how data engineering gets done at a truly massive scale.
Finally, Ana Catarina De Alencar gave a brilliant talk about AI that can read emotion and what it means for data privacy and consent. This was a fascinating, thought-provoking, and important talk in a world where AI is being integrated into more and more aspects of our lives. If you watch one recording of any of the talks from BDE, watch this one.
Bonus:
Sightseeing in Vilnius
I arrived the day before BDE and, once I was done preparing for my presentation, spent an enjoyable afternoon exploring the city. I have no idea how to be a tourist, so I watched Travel Man: 48 Hours in Vilnius for some inspiration.

Cribbing from the Travel Man episode, I headed into Vilnius proper for a wander, taking in the views from Gediminas Castle and grabbing a drink in Užupis.
After a taste of sightseeing it was time to head over to Tech Zity for the Apache Iceberg Europe Meetup.
The Apache Iceberg™ Europe Community Meetup

Viktor Kessler kicked things off with a talk that took us on a journey through the history of data analytics and shared some insights on the future of Apache Iceberg REST. The next talk, from Martynas Cibulskis, was an impressive showcase of the work the team at Vinted have done extending the Apache Iceberg FileIO abstraction layer.
We heard about OSS Lakehouse support at Google Cloud from Kęstutis Daugėla. This is especially exciting given that Google are now storing over an exabyte of Iceberg data in GCP! Following this up was a whistle-stop tour of the challenges of Iceberg CDC from Anton Borisov, who shared the work he's been doing on this problem at Fresha, as well as some hope that improved tools for Iceberg CDC may arrive in Iceberg v4.
Finally, Amit Gilad, CEO at LakeOps, dived into the optimisations that are possible when you build observability on your Iceberg metadata. Thanks to Viktor Kessler for organising and to Vakamo, Confluent, and Google for making this event possible. Looking forward to more Iceberg Meetups in Europe next year, perhaps in some warmer weather!
Final thoughts
I'm extremely grateful to Aušra Stučinskė and the rest of the BDE team for organising. It was a fantastic event! And a big thank you to Francesco Tisiot for suggesting I apply to speak.
That's a wrap on BDE and my conference travel for 2025! I'm looking forward to going home to get some rest and get cracking on putting together some awesome Apache Kafka content, with some inspiration from the cool people I've got to meet and see speak over the last few months.
Vilnius is a beautiful city and I hope to come back and see it again in future. Not least because, whilst I did manage to try Šakotis here, I didn't get time to try Cepelinai.
Looking forward to next time!
