The Data Engineering Show

The Firebolt Data Bros

58 episodes

  • The Framework Canva Uses for 200M+ Designers with Paul Tune

    28/04/2026 | 22 mins.
    In this episode of The Data Engineering Show, Benjamin sits down with Paul Tune, Staff Research Scientist at Canva, to explore the advancement of machine learning at one of the world's leading design platforms. Learn how Canva is transitioning from traditional ML like recommendation engines for templates to cutting-edge agentic workflows that allow users and AI to collaborate on complex design tasks. Whether you're interested in the infrastructure behind distributed training or the nuances of post-training LLMs for aesthetic tasks, this deep dive offers a masterclass in scaling ML for millions of creative users.
  • Llama 2 & 3 Safety: Soumya Batra on Agentic AI Training

    08/04/2026 | 22 mins.
    What if the expertise that built foundation models could reshape how you think about AI's future? In this episode, Benjamin sits down with Soumya Batra, founder and CEO of WisePort AI and former safety lead on Llama 2 and Llama 3 at Meta, to explore how foundation models evolved from traditional NLP, why post-training holds the highest leverage for safety and controllability, and what natively agentic AI means for the next frontier of AI development. Whether you're curious about the model training lifecycle or wondering what comes after large language models, this conversation unpacks the technical strategies and vision shaping tomorrow's AI systems.
  • The Data Fusion Secret & Why Custom Query Engines Fail with Nikita Lapkov

    24/03/2026 | 18 mins.
    What if building a distributed SQL engine meant rethinking everything about how query execution works at scale? In this episode, Benjamin sits down with Nikita Lapkov, Senior Software Engineer at Cloudflare, to explore how R2 SQL leverages object storage and distributed computing to power analytics across 300 global locations, why backward compatibility becomes critical when you can't control infrastructure rollouts, and the key strategies for handling joins and adaptive query execution in a stateless, point-to-point network architecture. Whether you're designing distributed systems or curious about how Cloudflare processes petabytes of data, this conversation reveals the real-world engineering challenges and innovations shaping the future of cloud data platforms.
  • How Zipline AI Turns Weeks of Engineering Into Minutes of SQL Queries ft. Nikhil Simha

    10/03/2026 | 24 mins.
    What if you could deploy ML features and real-time data pipelines without building complex infrastructure from scratch?

    In this episode, host Benjamin sits down with Nikhil Simha, CTO at Zipline AI and co-author of Chronon AI, to explore how Chronon, an open-source system that generates data infrastructure from simple queries, is transforming feature engineering at companies like OpenAI and Airbnb. Learn why iteration speed matters for fraud detection, how to serve thousands of signals at massive scale, and what the future of analytical databases looks like in an AI-first world. Whether you're scaling real-time ML systems or building customer-facing analytics, this conversation is packed with practical insights on bridging the gap between data scientists and ML engineers.
  • The Geo-Data Problem Nobody Talks About And How Voi Solved It ft. Magnus Dahlbäck

    19/02/2026 | 16 mins.
    What if your data platform could power both critical business decisions and real-time product features at scale? In this episode, host Benjamin sits down with Magnus Dahlbäck, Senior Director of Data and Platform at Voi, to explore how a metrics-first approach and semantic layers transform data accessibility, why traditional ML and LLMs require different strategies for different problems, and how to balance FinOps costs while processing billions of IoT events daily. Whether you're building data infrastructure for a high-growth company or rethinking how your organization consumes data, this conversation is packed with practical strategies for unlocking data value and preparing your platform for AI. Tune in to discover how Voi ditched traditional BI tools and revolutionized their approach to enterprise analytics.

About The Data Engineering Show

The Data Engineering Show is a podcast for data engineering and BI practitioners to go beyond theory. Learn from the biggest influencers in tech about their practical day-to-day data challenges and solutions in a casual and fun setting.

SEASON 1 DATA BROS: Eldad and Boaz Farkash shared the same stuffed toys growing up as well as a big passion for data. After founding Sisense and building it into a high-growth analytics unicorn, they moved on to their next venture, Firebolt, a leading high-performance cloud data warehouse.

SEASON 2 DATA BROS: In season 2, Eldad adopted a brilliant new little brother, and with their shared love for query processing, the connection was immediate. After excelling in his MS in Computer Science, Benjamin Wagner joined Firebolt to lead its query processing team and is a rising star in the data space.

For inquiries contact [email protected]
Website: https://www.firebolt.io