33 talks in 9 tracks
|
Attendees, speakers & organisers are all invited to a video call. Come to meet others, ask questions and hang out. We'll explain how the conference works and give you a tour of the talks. |
|
|
|
Keynote: Why Is Resilience Testing Non-Negotiable in an Enterprise SDLC? Uma Mukkara Head of Chaos Engineering at Harness |
2026-03-19 17:30:00 |
|
|
Keynote: When the Stack Lies: Finding Root Cause in Distributed Systems David McNerney Director of Product Management at Virtana Daniel Raskin Chief Marketing Officer at Virtana |
2026-03-19 18:00:00 |
|
Human-Governed Automation Loops for Reliable AI at Planet Scale Suganya Nagarajan Software Development Manager at Amazon.com LLC |
2026-03-19 18:30:00 |
|
|
Operating Predictive Analytics at Scale: Reliability Engineering for Data-Driven Education Finance Prajakta Talathi Marketing Analytics Manager at College Ave |
2026-03-19 19:00:00 |
|
|
Operationalizing LLMs at Scale: Reliability, Resilience, and Cognitive Commerce in Grocery Retail Sanjay Basu Global Head for AI / GenAI Products at TCS |
2026-03-19 19:30:00 |
|
Compute-Sharded Stream Processing for Petabyte-Scale Real-Time Cybersecurity Analytics Abhishek Suman Senior Software Engineer at Microsoft Corporation |
2026-03-19 18:30:00 |
|
|
Autonomy with Guardrails: Operating Agentic Automation in High-Risk Production Systems Ajay Athitya Ramanathan Data & AI Engineer at FourthSquare |
2026-03-19 19:00:00 |
|
|
Reliability-First Architectures for AI Analytics in Mission-Critical Platforms Ajay Srinivas Kiran Gemidi Sr Systems Engineer at Valuguard Solutions LLC |
2026-03-19 19:30:00 |
|
|
SRE Driven Frontend Architecture for High Performance and Reliable Web Systems Murali Varma Senior Staff Software Engineer at Galileo Financial Technologies LLC |
2026-03-19 20:00:00 |
|
|
Building Highly Available Databases with Cloud Native Reliability Practices Rajesh Kumar Balusu Database Architect at AGAP Technologies Inc |
2026-03-19 20:30:00 |
|
SRE-Driven Automation of FLEX2 Analytics in AMBR250 Upstream Biologics Workflows Amogha Tenneti Senior Associate Scientist at Eurofins PSS |
2026-03-19 18:30:00 |
|
|
AI-Assisted Incident Response Using LLMs and MCP in Distributed Systems Makarand Gujarathi Senior Software Engineer at Walmart |
2026-03-19 19:00:00 |
|
|
AI-Driven Risk-Aware Decision-Making for Scalable Reliability Systems Oreoluwa Omoike Site Reliability Engineer at JPMorgan Chase & CO |
2026-03-19 19:30:00 |
|
Building Reliable AI-Driven Mobile Platforms for Healthcare and Commerce Dipta Rakshit Staff Software Engineer at Walmart Global Tech |
2026-03-19 18:30:00 |
|
|
AI-Governed Lakehouse Ingestion with Flink on Kubernetes for Reliable DataOps Jyothish Sreedharan Vice President at Independent Researcher |
2026-03-19 19:00:00 |
|
|
Reliable AI-Driven UM Letters: Automating Compliance in Healthcare Systems Prakash Easwaran Implementation Manager at ZeOmega Inc |
2026-03-19 19:30:00 |
|
|
Reliable Data at Scale: ETL vs. Unified Platforms in Modern SRE Venkata Kundavaram IT Development Manager at Goodwill Easter Seals Minnesota |
2026-03-19 20:00:00 |
|
|
Reliable API Support: Human-in-the-Loop Agentic RAG for SRE Vishal Shah Staff Software Engineer - 2 at Visa |
2026-03-19 20:30:00 |
|
Reducing On-Call Pain in Hybrid Platforms: Operating VMs and Containers Reliably on Kubernetes Shruthi Rajashekar Engineering Manager at Broadcom Inc |
2026-03-19 18:30:00 |
|
|
Why Kubernetes Default Load Balancing Doesn’t Work for HTTP/2 Traffic Mariem Sboui Senior Site Reliability Engineer at gridX |
2026-03-19 19:00:00 |
|
|
Migration from On-Prem Messaging System to The Cloud: What, How and Why Ran Tao Cloud Support Engineer at Amazon Web Services, Inc. |
2026-03-19 19:30:00 |
|
|
Predictive Workflow Integrity for SRE: Autonomous Triage at Global Scale Rohit Wadhwa Senior Software Engineer at Walmart Inc |
2026-03-19 20:00:00 |
|
Correlation Over Collection: A Layered Observability Framework for SREs Khushboo Nigam Principal Cloud Architect at Oracle |
2026-03-19 18:30:00 |
|
|
From Alerts to Answers: Modern SRE with Observability, AI, and Agentic Ops Savi Grover Site Reliability Engineer at Tekskills Inc |
2026-03-19 19:00:00 |
|
|
Beyond Observability: Proactive SRE for AI-Driven Personalisation Yury Lysak Head of Strategic Projects at Lionsoul Global |
2026-03-19 19:30:00 |
|
Unlocking Just-in-Time CPU Optimization with In-Place Pod Resize Raman Tehlan Cloud Native Consultant at Zurich Lab Shubham Rai Engineering at Truefoundry |
2026-03-19 18:30:00 |
|
|
SRE for National-Scale Regulatory Reporting: HA, DR, and Audit-Ready Design Bhargavaram Potharaju Senior Infrastructure Engineer at Wells Fargo |
2026-03-19 19:00:00 |
|
|
SRE-Ready NDC Shopping: Caching at Scale Without Pricing Drift Mukul Kumar Gaur Senior Principal - Product Management • Product Engineering at Accelya Group |
2026-03-19 19:30:00 |
|
Abhimanyu Narwal Engineering Team Lead at Bloomberg LP |
2026-03-19 18:30:00 |
|
|
Reliable AI for Clinical Handoffs: SRE Lessons for Safer Care Transitions Abhiram Potharaju Senior Software Engineer at Wells Fargo Bank N.A. |
2026-03-19 19:00:00 |
|
|
From Testing to Reliability: Strengthening Regulated Production Systems Shruthi Sepuri Software Developer Engineer at TATA Consultancy Services (TCS) |
2026-03-19 19:30:00 |
|
Program Leadership in AI-Enabled Platform Systems: Reliability Signals from Complex Organizations Sonali Galhotra Sr. Engineering Program Manager at Sony Pictures |
2026-03-19 18:30:00 |
|
|
Resilient AI Platforms for Crisis-Ready Health and Financial Systems Rakesh Kumar Kavsari Gopal Technical Architect at Osmania University, India |
2026-03-19 19:00:00 |
|
|
Scale or Fail as Spotify's Growth Exposed the Abstraction Paradox Stuart Clark Senior Developer Advocate at Spotify |
2026-03-19 19:30:00 |
|
|
From Vibes to Outages: When AI Writes the Code You Debug Sylvain Kalache Head of AI Labs at Rootly |
2026-03-19 20:00:00 |
In Resilience Testing Is Non-Negotiable in SDLC, Uma Mukkara, Head of Harness Resilience Testing and Co-Creator of LitmusChaos, explains why resilience must be built and tested throughout the Software Development Life Cycle, not treated as something to address after production issues occur. He defines resilience as the ability of business services to withstand system failures, high load, and...
SRE teams know the pattern. A service slows down, dashboards light up, and every tool points to a different culprit. Application metrics say one thing, infrastructure metrics say another, and the real cause is buried somewhere in the interactions between them. Modern incidents rarely originate in a single component. A storage queue spike can throttle Kubernetes pods. Network contention can...