Season 3 of Making of the SRE Omelette is here - and it’s all about resilience.
Resilience isn’t just about surviving outages. It’s about building systems and cultures that adapt, learn, and thrive under pressure.
In our kickoff episode, we sit down with Dr. Jennifer Petoff, co-editor of Site Reliability Engineering: How Google Runs Production Systems and leader of Google’s Global SRE Education. Jennifer shares why resilience starts with people, not just technology—and how psychological safety and confidence are the secret ingredients for reliability at scale.
You’ll learn:
* How to scale learning like a production system
* Why postmortem culture drives improvement
* How to apply SRE principles beyond infrastructure
If you’ve ever wondered how to make reliability a business advantage, this episode is for you.
Check out How to SRE Anything here: https://www.reliablepgm.com/how-to-sre-anything/
Topics:
* Origins of SRE and Education at Google
How Google scaled SRE education globally.
Why education is treated like a production system (repeatable, reliable, measurable).
* Psychological Safety and Learning
Why psychological safety is critical for resilience.
Creating environments where teams can share mistakes without fear of blame.
How this accelerates learning and reliability.
* Hands-On Experience as a Learning Model
Importance of experiential learning (e.g., game days, simulations).
Why theory alone isn’t enough for building confidence under pressure.
* Scaling Knowledge Across Large Organizations
Strategies Google uses to scale SRE principles globally.
Balancing standardization with flexibility for local teams.
* Resilience Beyond Reliability
How resilience differs from reliability.
Building adaptive systems and teams that thrive through adversity.
* Culture as a Foundation
Why culture is the “secret ingredient” for successful SRE adoption.
Encouraging curiosity and collaboration across roles.
* Future of SRE Education
Trends in learning for distributed teams.
How continuous education supports evolving reliability practices.
--------
41:44
Episode 8 - AI for Sustainable IT Part 2 of 2
The conclusion of the two-part crossover podcast series explores the intersection of AI with sustainable IT operations, featuring Jerry Cuomo from The Art of AI and Kevin Yu from Making of the SRE Omelette. The discussion delves into practical measures for more efficient energy use in AI systems, emphasizing the need for data and the analysis of past behavior to inform energy-efficient decision-making. Jerry and Kevin highlight the importance of balancing AI and human inputs to achieve meaningful tasks and improve the overall quality of products. They discuss challenges such as right-sizing compute and recognize the pivotal role of data in addressing these issues, advocating for a data-driven approach to answer critical questions and provide necessary context for decision-making.
Additionally, the conversation touches on the future of AI and sustainable IT operations, emphasizing the need for diverse perspectives and the integration of SRE and sustainability as standard practices in software development. The podcast aims to provide a better understanding of how AI intersects with sustainable IT operations and how innovation can be approached responsibly.
Please be sure to catch Part 1 on Jerry's Art of AI podcast.
--------
18:21
Episode 7 - Intelligent Facilities & Assets
Mike Hollinger, Master Inventor, CTO for Applied AI & Distinguished Engineer for Maximo Application Suite, talks about how we can leverage operational insights from assets, facilities and infrastructure to drive clean energy transition and decarbonization. Mike shares stories from customers that showcase successes as well as the challenges they faced. Mike has a call to action to inspire Site Reliability Engineers to embrace the data and capabilities we have at our fingertips today and turn data into action to achieve a sustainable future.
Things to listen for:
[02:20 - 03:25] Mike's career path that led to his current role
[03:59 - 05:47] Meaning of sustainability to Mike
[07:56 - 10:06] Sustainability movement over last few years
[10:18 - 11:41] Importance of driving action from data
[12:04 - 15:45] Challenges in Facilities and Assets
[16:28 - 18:07] Civil Infrastructure example that drives action from data
[20:40 - 22:19] What Mike considers as success
[22:50 - 23:39] Importance of driving action from data
[26:06 - 29:34] Suggestion for C-suite executives to take action
[30:15 - 34:24] Call to action for SREs
[38:32 - 41:03] Mike's ingredient & recipe for a Sustainable Future
--------
41:46
Episode 6 - Pitch Master
Have you struggled to convince others of your idea, be it to tackle a reliability problem or a sustainability challenge? In this episode, I have a conversation with Danny Fontaine, host of the podcast Pitch Master, on how to pitch SRE and Sustainability ideas.
Danny shares one of his favorite stories, the origin of the elevator pitch, to get us started, and continues with many others, including how he changed a customer's way of thinking by surprising them with a fictional scenario and won the deal.
Listen in as Danny transforms how you think about presenting and helps you persuade others of your ideas.
Things to listen for:
[02:01 - 03:16] Danny pitching himself
[06:48 - 09:28] Meaning of pitching
[12:24 - 16:39] The origin story of the elevator pitch
[16:54 - 18:57] How Danny gets ready for pitching
[21:44 - 27:32] Pitching for Sustainability
[28:38 - 32:52] Pitching against detractors
[34:31 - 37:42] Pitching for head of IT vs. Business
[38:23 - 40:04] Danny's ingredient and recipe for pitching
--------
40:45
Episode 5 - Sustainability begins with Design
Design transforms the human experience with technology - including the experience for Site Reliability Engineers. And enduring Sustainable results are only achieved when we consider Sustainability in the entire solution life cycle beginning with Design.
In this episode, I have a conversation with Erin Buonomo, Executive Director of Design, and Chris Hammond, Distinguished Designer of IBM Sustainability Software, on how the SRE discipline can embrace Design for a better experience for Site Reliability Engineers and the clients we support - as well as how we partner up to achieve our Sustainability goals.
Erin & Chris encourage SRE practitioners to introduce ourselves into dialogs every day: to be that extra leg on the stool, not only to share our perspectives on achieving Sustainable goals, but also to educate others about the practice of SRE. The goal is to achieve "Sustainability Consciousness" - where it is part of everyday decisions, how we do business and part of our culture.
Things to listen for:
[03:37 - 04:58] Reason for having the conversation between SRE & Design
[05:40 - 09:23] Meaning of Sustainability to Chris & Erin
[09:49 - 15:48] AS-IS state
[16:06 - 21:09] How we get to the TO-BE state
[21:33 - 25:36] How do we know we have arrived at this future?
[25:59 - 28:57] How SRE can partner w/ Design
[32:35 - 34:55] Erin & Chris' ingredient & recipe to embrace design for a sustainable future
Wondering what an omelette has to do with SRE (Site Reliability Engineering)? It’s based on the analogy that culture is the outcome of what we do - so in the context of the chicken or the egg, it’s like an omelette. And that’s how this podcast was born: Making of the SRE Omelette.
This show explores how the practice of SRE can help organizations achieve positive business and client success outcomes.
Season 1 focused on culture—because reliability isn’t just about systems, it’s about people.
Season 2 explored Reliable Sustainability—how SRE practices can help organizations deliver reliability today while building a sustainable future.
Season 3 is all about Resilience—not just resilient systems, but resilient teams and resilient ways of working. Because resilience is what helps us adapt, respond, and thrive through adversity.
Much like an avid chef who brings their own unique flair to a classic recipe, the art of SRE lies in thoughtfully adapting industry best practices to fit the distinct culture, needs, and goals of your team and organization. This podcast is designed to inspire thoughtful experimentation and encourage personalization—empowering you to forge your own path toward greater reliability, resilience, and long-term success.
Join us on this journey to surface the ingredients that help drive business and client success through SRE.