KubeFM

Available Episodes

5 of 58
  • Performance testing Kubernetes workloads, with Stephan Schwarz
    If you're tasked with performance testing Kubernetes workloads without much guidance, this episode offers clear, experience-based strategies that go beyond theory.

    Stephan Schwarz, a DevOps engineer at iits-consulting, walks through his systematic approach to performance testing Kubernetes applications. He covers everything from defining what performance actually means, to the practical methodology of breaking individual pods to understand their limits, and navigating the complexities of Kubernetes-specific components that affect test results.

    You will learn:
      • How to establish baseline performance metrics by systematically testing individual pods, disabling autoscaling features, and documenting each incremental change to understand real application limits
      • Why shared Kubernetes components skew results and how ingress controllers, service meshes, and monitoring stacks create testing challenges that require careful consideration of the entire request chain
      • Practical approaches to HPA configuration, including how to account for scaling latency, the time delays inherent in Kubernetes scaling operations, and planning for spare capacity based on your SLA requirements
      • The role of observability tools like OpenTelemetry in production environments where load testing isn't feasible, and how distributed tracing helps isolate performance bottlenecks across interdependent services

    Sponsor: This episode is sponsored by Learnk8s. Get started on your Kubernetes journey through comprehensive online, in-person, or remote training.

    More info: Find all the links and info for this episode here: https://ku.bz/yY-FnmGfH
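As a rough illustration of the HPA and spare-capacity point above, here is a minimal Python sketch. It is not taken from the episode. `desired_replicas` follows the scaling rule documented for the Horizontal Pod Autoscaler (ceil of current replicas times the ratio of current to target metric) and ignores the controller's tolerance band and stabilization windows. `target_for_headroom` is a hypothetical helper built on a simplifying assumption: if load can grow by a given fraction while new pods are still starting, the utilization target should leave at least that much room.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float, target_metric: float) -> int:
    """HPA scaling rule: ceil(currentReplicas * currentMetric / targetMetric).

    Ignores the tolerance band and stabilization behavior of the real controller.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    return math.ceil(current_replicas * current_metric / target_metric)


def target_for_headroom(expected_growth_during_scale_up: float) -> float:
    """Hypothetical helper: pick a utilization target that leaves spare capacity.

    Assumption: load may grow by this fraction (0.25 means +25%) before new
    pods are ready, so existing pods must absorb it without saturating.
    """
    if expected_growth_during_scale_up < 0:
        raise ValueError("growth must be non-negative")
    return 1.0 / (1.0 + expected_growth_during_scale_up)


if __name__ == "__main__":
    # Four pods at 90% average CPU against a 60% target.
    print(desired_replicas(4, 90.0, 60.0))
    # Allow for 25% load growth while scaling is in progress.
    print(round(target_for_headroom(0.25), 2))
```

The growth fraction would come from your own measurements of scaling latency and traffic ramps; the values used here are placeholders.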
    --------  
  • Managing 100s of Kubernetes Clusters using Cluster API, with Zain Malik
    Discover how to manage Kubernetes at scale with declarative infrastructure and automation principles.

    Zain Malik shares his experience managing multi-tenant Kubernetes clusters with up to 30,000 pods across clusters capped at 950 nodes. He explains how his team transitioned from Terraform to Cluster API for declarative cluster lifecycle management, contributing upstream to improve AKS support while implementing GitOps workflows.

    You will learn:
      • How to address challenges in large-scale Kubernetes operations, including node pool management inconsistencies and lengthy provisioning times
      • Why Cluster API provides a powerful foundation for multi-cloud cluster management, and how to extend it with custom operators for production-specific needs
      • How implementing GitOps principles eliminates manual intervention in critical operations like cluster upgrades
      • Strategies for handling production incidents and bugs when adopting emerging technologies like Cluster API

    Sponsor: This episode is sponsored by Learnk8s. Get started on your Kubernetes journey through comprehensive online, in-person, or remote training.

    More info: Find all the links and info for this episode here: https://ku.bz/5PLksqVlk
    --------  
  • Super-Scaling Open Policy Agent with Batch Queries, with Nicholaos Mouzourakis
    Dive into the technical challenges of scaling authorization in Kubernetes with this in-depth conversation about Open Policy Agent (OPA).

    Nicholaos Mouzourakis, Staff Product Security Engineer at Gusto, explains how his team re-architected Kubernetes native authorization using OPA to support scale, latency guarantees, and audit requirements across services. He shares detailed insights about their journey optimizing OPA performance through batch queries and solving unexpected interactions between Kubernetes resource limits and Go's runtime behavior.

    You will learn:
      • Why traditional authorization approaches (code-driven and data-driven) fall short in microservice architectures, and how OPA provides a more flexible, decoupled solution
      • How batch authorization can improve performance by up to 18x by reducing network round-trips
      • The unexpected interaction between Kubernetes CPU limits and Go's thread management (GOMAXPROCS) that can severely impact OPA performance
      • Practical deployment strategies for OPA in production environments, including considerations for sidecars, daemon sets, and WASM modules

    Sponsor: This episode is sponsored by Learnk8s. Get started on your Kubernetes journey through comprehensive online, in-person, or remote training.

    More info: Find all the links and info for this episode here: https://ku.bz/S-2vQ_j-4
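To see why batching authorization checks can help, here is a toy latency model in Python. It is a sketch under stated assumptions, not the approach or the measurements described in the episode. The function names and all numbers are hypothetical. The model assumes each check costs one network round-trip plus a fixed policy evaluation time, and that a batch pays the round-trip once.

```python
def sequential_latency_ms(n_checks: int, round_trip_ms: float, eval_ms: float) -> float:
    """Each check is sent separately: every check pays a round-trip plus evaluation."""
    return n_checks * (round_trip_ms + eval_ms)


def batched_latency_ms(n_checks: int, round_trip_ms: float, eval_ms: float) -> float:
    """All checks go in one request: one round-trip, then every evaluation."""
    return round_trip_ms + n_checks * eval_ms


def speedup(n_checks: int, round_trip_ms: float, eval_ms: float) -> float:
    """Ratio of sequential to batched latency under this simplified model."""
    return sequential_latency_ms(n_checks, round_trip_ms, eval_ms) / batched_latency_ms(
        n_checks, round_trip_ms, eval_ms
    )


if __name__ == "__main__":
    # Hypothetical numbers chosen only to illustrate the shape of the curve.
    for n in (1, 10, 50):
        print(n, round(speedup(n, round_trip_ms=1.0, eval_ms=0.05), 1))
```

Under this model the gain grows with the number of checks per request and shrinks as evaluation time starts to dominate the round-trip, so real results depend on measured values for both.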
    --------  
  • Kubernetes upgrades: beyond the one-click update, with Tanat Lokejaroenlarb
    Discover how Adevinta manages Kubernetes upgrades at scale in this episode with Tanat Lokejaroenlarb.

    Tanat shares his team's journey from time-consuming blue-green deployments to efficient in-place upgrades for their multi-tenant Kubernetes platform SHIP, detailing the engineering decisions and operational challenges they overcame.

    You will learn:
      • How to transition from blue-green to in-place Kubernetes upgrades while maintaining service reliability
      • Techniques for tracking and addressing API deprecations using tools like Pluto and Kube-no-trouble
      • Strategies for minimizing SLO impact during node rebuilds through serialized approaches and proper PDB configuration
      • Why a phased upgrade approach with "cluster waves" provides safer production deployments even with thorough testing

    Sponsor: This episode is sponsored by Learnk8s. Get started on your Kubernetes journey through comprehensive online, in-person, or remote training.

    More info: Find all the links and info for this episode here: https://ku.bz/VVHFfXGl_
    --------  
  • From Fragile to Faultless: Kubernetes Self-Healing In Practice, with Grzegorz Głąb
    Discover how to build resilient Kubernetes environments at scale with practical automation strategies from an engineer who's tackled complex production challenges.

    Grzegorz Głąb, Kubernetes Engineer at Cloud Kitchens, shares his team's journey developing a comprehensive self-healing framework. He explains how they addressed issues ranging from spot node preemptions to network packet drops caused by unbalanced IRQs, providing concrete examples of automation that prevents downtime and improves reliability.

    You will learn:
      • How managed Kubernetes services like AKS provide benefits but require customization for specific use cases
      • The architecture of an effective self-healing framework using DaemonSets and deployments with Kubernetes-native components
      • Practical solutions for common challenges like StatefulSet pods stuck on unreachable nodes and cleaning up orphaned pods
      • Techniques for workload-level automation, including throttling CPU-hungry pods and automating diagnostic data collection

    Sponsor: This episode is sponsored by Learnk8s. Get started on your Kubernetes journey through comprehensive online, in-person, or remote training.

    More info: Find all the links and info for this episode here: https://ku.bz/yg_fkP0LN
    --------  

About KubeFM

Discover all the great things happening in the world of Kubernetes, learn (controversial) opinions from the experts and explore the successes (and failures) of running Kubernetes at scale.

v7.18.3 | © 2007-2025 radio.de GmbH
Generated: 6/1/2025 - 3:28:42 PM