
Human Cognition Can’t Keep Up with Modern Networks. What’s Next?
07/01/2026 | 23 mins.
IBM’s recent acquisitions of Red Hat and HashiCorp, along with its planned purchase of Confluent, reflect a deliberate strategy to build the infrastructure required for enterprise AI. According to IBM’s Sanil Nambiar, AI depends on consistent hybrid cloud runtimes (Red Hat), programmable and automated infrastructure (HashiCorp), and real-time, trustworthy data (Confluent). Without these foundations, AI cannot function effectively. Nambiar argues that modern, software-defined networks have become too complex for humans to manage alone: operators are overwhelmed by fragmented data, escalating tool sophistication, and a widening skills gap that makes veteran “tribal knowledge” hard to transfer. Trust, he says, is the biggest barrier to AI adoption in networking, since errors can cause costly outages. To address this, IBM launched IBM Network Intelligence, a “network-native” AI solution that combines time-series foundation models with reasoning large language models. This architecture enables AI agents to detect subtle warning patterns, collapse incident response times, and deliver accurate, trustworthy insights for real-world network operations.
Learn more from The New Stack about AI infrastructure and IBM’s approach:
AI in Network Observability: The Dawn of Network Intelligence
How Agentic AI Is Redefining Campus and Branch Network Needs
Join our community of newsletter subscribers to stay on top of the news and at the top of your game. Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.

From Group Science Project to Enterprise Service: Rethinking OpenTelemetry
30/12/2025 | 17 mins.
Ari Zilka, founder of MyDecisive.ai and former Hortonworks CPO, argues that most observability vendors now offer essentially identical, reactive dashboards that highlight problems only after systems are already broken. After speaking with all 23 observability vendors at KubeCon + CloudNativeCon North America 2025, Zilka said these tools fail to meaningfully reduce mean time to resolution (MTTR), a long-standing demand he heard repeatedly from thousands of CIOs during his time at New Relic.
Zilka believes observability must shift from reactive monitoring to proactive operations, where systems automatically respond to telemetry in real time. MyDecisive.ai is his attempt to solve this, acting as a “bump in the wire” that intercepts telemetry and uses AI-driven logic to trigger actions like rolling back faulty releases.
He also criticized the rising cost and complexity of OpenTelemetry adoption, noting that many companies now require large, specialized teams just to maintain OTel stacks. MyDecisive aims to turn OpenTelemetry into an enterprise-ready service that reduces human intervention and operational overhead.
Learn more from The New Stack about OpenTelemetry:
Observability Is Stuck in the Past. Your Users Aren't.
Setting Up OpenTelemetry on the Frontend Because I Hate Myself
How to Make OpenTelemetry Better in the Browser

Why You Can't Build AI Without Progressive Delivery
23/12/2025 | 27 mins.
Former GitHub CEO Thomas Dohmke’s claim that AI-based development requires progressive delivery frames a conversation between analyst James Governor and The New Stack’s Alex Williams about why modern release practices matter more than ever. Governor argues that AI systems behave unpredictably in production: models can hallucinate, outputs vary between versions, and changes are often non-deterministic. Because of this uncertainty, teams must rely on progressive delivery techniques such as feature flags, canary releases, observability, measurement and rollback. These practices, originally developed to improve traditional software releases, now form the foundation for deploying AI safely. Concepts like evaluations, model versioning and controlled rollouts are direct extensions of established delivery disciplines. Beyond AI, Governor’s book “Progressive Delivery” challenges DevOps thinking itself. He notes that DevOps focuses on development and operations but often neglects the user feedback loop. Using a framework of four A’s (abundance, autonomy, alignment and automation), he argues that progressive delivery reconnects teams with real user outcomes. Ultimately, success is measured not just by reliability metrics but by whether users are actually satisfied.
Learn more from The New Stack about progressive delivery:
Mastering Progressive Hydration for Enhanced Web Performance
Continuous Delivery: Gold Standard for Software Development

How Nutanix Is Taming Operational Complexity
18/12/2025 | 15 mins.
Most enterprises today run workloads across multiple IT infrastructures rather than a single platform, creating significant operational challenges. According to Nutanix CTO Deepak Goel, organizations face three major hurdles: managing operational complexity amid a shortage of cloud-native skills, migrating legacy virtual machine (VM) workloads to microservices-based cloud-native platforms, and running VM-based workloads alongside containerized applications. Many engineers have deep infrastructure experience but lack Kubernetes expertise, which makes the transition especially difficult and steepens the learning curve for IT administrators. To address these issues, organizations are turning to platform engineering and internal developer platforms that abstract infrastructure complexity and provide standardized “golden paths” for deployment. Integrated development environments (IDEs) further reduce friction by embedding capabilities like observability and security. Nutanix contributes through its hyperconverged platform, which unifies compute and storage while supporting both VMs and containers. At KubeCon North America, Nutanix announced version 2.0 of Nutanix Data Services for Kubernetes (NDK), adding advanced data protection, fault-tolerant replication, and enhanced security through a partnership with Canonical to deliver a hardened operating system for Kubernetes environments.
Learn more from The New Stack about operational complexity in cloud native environments:
Q&A: Nutanix CEO Rajiv Ramaswami on the Cloud Native Enterprise
Kubernetes Complexity Realigns Platform Engineering Strategy
Platform Engineering on the Brink: Breakthrough or Bust?

Do All Your AI Workloads Actually Require Expensive GPUs?
18/12/2025 | 29 mins.
GPUs dominate today’s AI landscape, but Google argues they are not necessary for every workload. As AI adoption has grown, customers have increasingly demanded compute options that deliver high performance with lower cost and power consumption. Drawing on its long history of custom silicon, Google introduced Axion CPUs in 2024 to meet needs for massive scale, flexibility, and general-purpose computing alongside AI workloads. The Axion-based C4A instance is generally available, while the newer N4A virtual machines promise up to 2x better price performance.
Andrei Gueletii, a technical solutions consultant for Google Cloud, joined Gari Singh, a product manager for Google Kubernetes Engine (GKE), and Pranay Bakre, a principal solutions engineer at Arm, for this episode, recorded at KubeCon + CloudNativeCon North America in Atlanta. Built on Arm Neoverse V2 cores, Axion processors emphasize energy efficiency and customization, including flexible machine shapes that let users tailor memory and CPU resources. These features are particularly valuable for platform engineering teams, which must optimize centralized infrastructure for cost, FinOps goals, and price performance as they scale.
Importantly, many AI tasks, such as inference for smaller models or batch-oriented jobs, do not require GPUs. CPUs can be more efficient when GPU memory is underutilized or latency demands are low. By decoupling workloads and choosing the right compute for each task, organizations can significantly reduce AI compute costs.
Learn more from The New Stack about the Axion-based C4A:
Beyond Speed: Why Your Next App Must Be Multi-Architecture
Arm: See a Demo About Migrating a x86-Based App to ARM64



The New Stack Podcast