Latent Space: The AI Engineer Podcast

Available Episodes

5 of 155

Why RL Won — Kyle Corbitt, OpenPipe (acq. CoreWeave)
In this deep dive with Kyle Corbitt, co-founder and CEO of OpenPipe (recently acquired by CoreWeave), we explore the evolution of fine-tuning in the age of AI agents and the critical shift from supervised fine-tuning to reinforcement learning. Kyle shares his journey from leading YC's Startup School to building OpenPipe, initially focused on distilling expensive GPT-4 workflows into smaller, cheaper models before pivoting to RL-based agent training as frontier model prices plummeted.

The conversation reveals why 90% of AI projects remain stuck in proof-of-concept purgatory: the cause is not capability limitations but reliability issues, which Kyle believes can be solved through continuous learning from real-world experience. He discusses the breakthrough of RULER (Relative Universal Reinforcement Learning Elicited Rewards), which uses LLMs as judges to rank agent behaviors relatively rather than absolutely, making RL training accessible without complex reward engineering.

Kyle candidly assesses the challenges of building realistic training environments for agents, explaining why GRPO (despite its advantages) may be a dead end due to its requirement for perfectly reproducible parallel rollouts. He shares insights on why LoRAs remain underrated for production deployments, why GEPA and prompt optimization haven't lived up to the hype in his testing, and why the hardest part of deploying agents isn't the AI: it's sandboxing real-world systems with all their bugs and edge cases intact.

The discussion also covers OpenPipe's acquisition by CoreWeave, the launch of their serverless reinforcement learning platform, and Kyle's vision for a future where every deployed agent continuously learns from production experience. He predicts that solving the reliability problem through continuous RL could unlock 10x more AI inference demand from projects currently stuck in development, fundamentally changing how we think about agent deployment and maintenance.
Key Topics:
• The rise and fall of fine-tuning as a business model
• Why 90% of AI projects never reach production
• RULER: Making RL accessible through relative ranking
• The environment problem: Why sandboxing is harder than training
• GRPO vs PPO and the future of RL algorithms
• LoRAs: The underrated deployment optimization
• Why GEPA and prompt optimization disappointed in practice
• Building world models as synthetic training environments
• The $500B Stargate bet and OpenAI's potential crypto play
• Continuous learning as the path to reliable agents

References:
• https://www.linkedin.com/in/kcorbitt/
• Aug 2023: https://openpipe.ai/blog/from-prompts-to-models
• Dec 2023: https://openpipe.ai/blog/mistral-7b-fine-tune-optimized
• Jan 2024: https://openpipe.ai/blog/s-lora
• May 2024: https://openpipe.ai/blog/the-ten-commandments-of-fine-tuning-in-prod
• https://www.youtube.com/watch?v=-hYqt8M9u_M
• Oct 2024: https://openpipe.ai/blog/announcing-dpo-support
• AIE NYC 2025, Finetuning 500m agents: https://www.youtube.com/watch?v=zM9RYqCcioM&t=919s
• AIEWF 2025, How to train your agent (ART-E): https://www.youtube.com/watch?v=gEDl9C8s_-4&t=216s
• Sept 2025, acquisition: https://openpipe.ai/blog/openpipe-coreweave
• W&B Serverless RL: https://openpipe.ai/blog/serverless-rl?refresh=1760042248153
--------
DevDay 2025: Apps SDK, Agent Kit, MCP, Codex and why Prompting is More Important than Ever
At OpenAI DevDay, we sit down with Sherwin Wu and Christina Huang from the OpenAI Platform Team to discuss the launch of AgentKit, a comprehensive suite of tools for building, deploying, and optimizing AI agents. Christina walks us through the live demo she performed on stage, building a customer support agent in just 8 minutes using the visual Agent Builder, while Sherwin shares insights on how OpenAI is inverting the traditional website-chatbot paradigm by embedding apps directly within ChatGPT through the new Apps SDK.

The conversation explores how OpenAI is tackling the challenges developers face when taking agents to production, from writing and optimizing prompts to building evaluation pipelines. They discuss the decision to adopt Anthropic's MCP protocol for tool connectivity, the importance of visual workflows for complex agent systems, and how features like human-in-the-loop approvals and automated prompt optimization are making agent development more accessible to a broader range of developers.

Sherwin and Christina also reveal how OpenAI is dogfooding these tools internally, with their own customer support at openai.com already powered by AgentKit, and share candid insights about the evolution from plugins to GPTs to this new agent platform. They discuss the surprising persistence of prompting as a critical skill (contrary to predictions from two years ago), the challenges of serving custom fine-tuned models at scale, and why they believe visual agent builders are essential as workflows grow to span dozens of nodes.

Guests:
• Sherwin Wu: Head of Engineering, OpenAI Platform https://www.linkedin.com/in/sherwinwu1/ https://x.com/sherwinwu?lang=en
• Christina Huang: Platform Experience, OpenAI https://x.com/christinaahuang https://www.linkedin.com/in/christinaahuang/

Thanks very much to Lindsay and Shaokyi for helping us set up this great deep dive into the new DevDay launches!
Key Topics:
• AgentKit launch: Agent SDK, Builder, Evals, and deployment tools
• Apps SDK and the inversion of the app-chatbot paradigm
• Adopting MCP protocol for universal tool connectivity
• Visual agent building vs code-first approaches
• Human-in-the-loop workflows and approval systems
• Automated prompt optimization and "zero-gradient fine-tuning"
• Service Health Dashboard and achieving five nines reliability
• ChatKit as an embeddable, evergreen chat interface
• The evolution from plugins to GPTs to agent platforms
• Internal dogfooding with Codex and agent-powered support
--------
Taste is your Moat (Dylan Field of Figma)
Dylan Field (CEO of Figma) on how they are letting designers build with Figma Make, how Figma can be the context repository for aesthetics in the age of vibe coding, and why design is your only differentiator now.

Full show notes: https://www.latent.space/p/figma

00:00 Figma’s Mission: Bridging Imagination and Reality
00:56 Becoming AI-Pilled
07:44 Figma Make
08:57 Language as the Interface for Design
13:37 Source of truth between design and code
18:15 Figma as a Context Repository
21:30 Understanding and Representing Design Diffs through AI
24:20 Figma’s Role in Shaping Visual Aesthetics
31:56 Fast Fashion in Software
36:04 Limitations of Prompt-Based Software Creation
39:43 Interfaces Beyond Chat
42:12 Lessons from the Thiel Fellowship
44:58 Using X for Product Feedback
48:10 Early-Stage Recruiting at Figma
53:11 Positioning Figma Make in the Prompt-to-App Landscape
55:19 Digital Scarcity & AI
--------
Amp: The Emperor Has No Clothes
Quinn Slack (CEO) and Thorsten Ball (Amp Dictator) from Sourcegraph join the show to talk about Amp Code, how they ship 15x/day with no code reviews, and why subagents and prompt optimizers aren’t a promising direction for coding agents.

Amp Code: https://ampcode.com/
Latent Space: https://latent.space/

00:00 Introduction
00:41 Transition from Cody to Amp
03:18 The Importance of Building the Best Coding Agent
06:43 Adapting to a Rapidly Evolving AI Tooling Landscape
09:36 Dogfooding at Sourcegraph
12:35 CLI vs. VS Code Extension
21:08 Positioning Amp in Coding Agent Market
24:10 The Diminishing Importance of Model Selectors
32:39 Tooling vs. Harness
37:19 Common Failure Modes of Coding Agents
47:33 Agent-Friendly Logging and Tooling
52:31 Are Subagents Real?
56:52 New Frameworks and Agent-Integrated Developer Tools
1:00:25 How Agents Are Encouraging Codebase and Workflow Changes
1:03:13 Evolving Outer Loop Tasks
1:07:09 Version Control and Merge Conflicts in an AI-First World
1:10:36 Rise of User-Generated Enterprise Software
1:14:39 Empowering Technical Leaders with AI
1:17:11 Evaluating Product Without Traditional Evals
1:20:58 Hiring
--------
Context Engineering for Agents - Lance Martin, LangChain
Lance: https://www.linkedin.com/in/lance-martin-64a33b5/
How Context Fails: https://www.dbreunig.com/2025/06/22/how-contexts-fail-and-how-to-fix-them.html
How New Buzzwords Get Created: https://www.dbreunig.com/2025/07/24/why-the-term-context-engineering-matters.html
Context Engineering:
https://x.com/RLanceMartin/status/1948441848978309358
https://rlancemartin.github.io/2025/06/23/context_engineering/
https://docs.google.com/presentation/d/16aaXLu40GugY-kOpqDU4e-S0hD1FmHcNyF0rRRnb1OU/edit?usp=sharing
Manus Post: https://manus.im/blog/Context-Engineering-for-AI-Agents-Lessons-from-Building-Manus
Cognition Post: https://cognition.ai/blog/dont-build-multi-agents
Multi-Agent Researcher: https://www.anthropic.com/engineering/multi-agent-research-system
Human-in-the-loop + Memory: https://github.com/langchain-ai/agents-from-scratch

- Bitter Lesson in AI Engineering -
Hyung Won Chung on the Bitter Lesson in AI Research: https://www.youtube.com/watch?v=orDKvo8h71o
Bitter Lesson w/ Claude Code: https://www.youtube.com/watch?v=Lue8K2jqfKk&t=1s
Learning the Bitter Lesson in AI Engineering: https://rlancemartin.github.io/2025/07/30/bitter_lesson/
Open Deep Research: https://github.com/langchain-ai/open_deep_research https://academy.langchain.com/courses/deep-research-with-langgraph
Scaling and building things that "don't yet work": https://www.youtube.com/watch?v=p8Jx4qvDoSo

- Frameworks -
Roast framework at Shopify / standardization of orchestration tools: https://www.youtube.com/watch?v=0NHCyq8bBcM
MCP adoption within Anthropic / standardization of protocols: https://www.youtube.com/watch?v=xlEQ6Y3WNNI
How to think about frameworks: https://blog.langchain.com/how-to-think-about-agent-frameworks/
RAG benchmarking: https://rlancemartin.github.io/2025/04/03/vibe-code/
Simon's talk with memory-gone-wrong: https://simonwillison.net/2025/Jun/6/six-months-in-llms/
--------

About Latent Space: The AI Engineer Podcast

The podcast by and for AI Engineers! In 2024, over 2 million readers and listeners came to Latent Space to hear about news, papers and interviews in Software 3.0. We cover Foundation Models changing every domain in Code Generation, Multimodality, AI Agents, GPU Infra and more, directly from the founders, builders, and thinkers involved in pushing the cutting edge. We strive to give you everything from the definitive take on the Current Thing down to the first introduction to the tech you'll be using in the next 3 months! We break news and exclusive interviews from OpenAI, Anthropic, Gemini, Meta (Soumith Chintala), Sierra (Bret Taylor), tiny (George Hotz), Databricks/MosaicML (Jon Frankle), Modular (Chris Lattner), Answer.ai (Jeremy Howard), et al. Full show notes always on https://latent.space
