
Bijlmerdreef 106
1102 CT Amsterdam, Netherlands



We’ve all been there. It’s 3 a.m., an alert just fired, and you’re staring at a dashboard. The P99 latency is spiking, but the average looks fine. The CPU is high, but the load average is normal. What’s actually going on? The hard truth is that most of our dashboards are subtle liars. They lie by averaging percentiles, hiding the real user pain. They lie with poorly chosen time windows that miss the crucial spike. They lie because we're forced to guess: is this a counter or a gauge? Can I sum this? Is a line graph even the right way to look at this? We expect every engineer on our team to also be a part-time statistician, and at 3 a.m., that’s a recipe for disaster. In this talk, I'll share stories of how misleading dashboards have sent teams down the wrong rabbit hole during critical incidents. We'll dissect the common lies and half-truths our monitoring tells us every day. Then, we'll talk about the alternative: building "opinionated" observability. It's about baking our expertise and best practices directly into our tools, so the right chart appears by default, the alert already has context, and the system guides you to the right questions. This isn't just about making prettier graphs. It's about giving back our most valuable resource, time, and letting engineers get back to being engineers, not full-time data analysts.
What you'll get from this talk:
- How to spot the 3 most common ways your dashboards are probably lying to you.
- A simple mental model for choosing the right visualization for the job (and knowing when a line graph is a terrible choice).
- Practical ways to start baking your team's expertise and "tribal knowledge" into your monitoring with smart defaults.
- How to argue for, or build, "opinionated" tools that stop wasting your team's time and lead to faster, more accurate incident response.
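The point about averaged percentiles can be checked with a little arithmetic. The sketch below is not taken from the talk: the two hosts, their request counts, and the latency values are made-up assumptions, and `percentile` is a hypothetical helper using the nearest-rank definition. It compares the average of two per-host P99 values with the P99 computed over all requests together.

```python
def percentile(values, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, avoiding floating-point rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


# Hypothetical latencies in milliseconds (assumed data, for illustration only).
host_a = [10] * 990 + [20] * 10   # 1000 requests on a healthy host
host_b = [500] * 100              # 100 requests on a slow host

p99_a = percentile(host_a, 99)
p99_b = percentile(host_b, 99)

# What a dashboard shows if it averages the two per-host P99 series.
averaged_p99 = (p99_a + p99_b) / 2

# What users experienced: the P99 over every request from both hosts.
true_p99 = percentile(host_a + host_b, 99)

print(f"host A P99: {p99_a} ms, host B P99: {p99_b} ms")
print(f"average of P99s: {averaged_p99} ms, P99 of all requests: {true_p99} ms")
```

With this data the averaged figure is a latency that no request actually had, and it sits well below the P99 of the combined traffic. Merging the underlying samples (or histograms) before taking the percentile avoids the problem.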
Shyam Sreevalsan builds products at the intersection of AI, observability, and open source. He leads product and strategy at Netdata, democratizing observability, one metric at a time.
Dashboards were invented for a world without AI. They encode assumptions about human cognition (scanning, pattern recognition, correlation) that AI now handles better. This contrarian talk argues that the future of operational observability isn't better dashboards, but AI agents that render dashboards obsolete for the decisions that matter.
We'll examine:
- The dashboard's original sin: assumptions that worked for small systems but fail at scale
- What AI-native observability actually looks like: concrete examples from production systems
- The transition path: pragmatic stages from AI augmentation to dashboard sunset
- The human role in AI-native operations: why this isn't about replacing humans
Based on three years of building AI observability systems, including conversational interfaces that replace manual investigation and ML anomaly detection achieving 10⁻³⁶ false positive rates, this talk challenges you to reconsider whether dashboards are helping or hindering your operational excellence.
Costa Tsaousis is the Founder and CEO of Netdata. Since 1995, Costa has been actively working on internet-related startups. He has been a co-founder and C-level executive of many successful projects, including Internet Service Providers, Cloud Hosting Providers, and Fintech startups. With a passion for innovation and open-source, he now leads Netdata, a monitoring solution aiming to simplify and modernize infrastructure observability for all of us.
Do you trust your AI coding assistant? What if I told you that attackers have found ways to manipulate it and attack your code? With everyone now using AI coding assistants it’s time to look at the risks! During this talk I’ll show you several new techniques attackers are already using. This will range from hidden messages (ASCII smuggling) to abusing mistyping and characters that look the same (typosquatting). I will also show how an LLM can make mistakes when generating code (hallucinations). Did you know that a smart attacker can abuse this too? When you join this talk, you’ll learn how to spot hidden text in your instruction file and prompts. I will also explain how to set up a trusted dependency repository to prevent the malicious code from entering your production environment!
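As a rough illustration of spotting hidden text, the sketch below scans a string for invisible characters. It is not the speaker's tooling. It assumes the hidden message is encoded with the Unicode Tags block (U+E0000 to U+E007F), one known way of doing ASCII smuggling, and it also flags other format characters (Unicode category `Cf`, such as zero-width joiners). The function names and the sample instruction text are hypothetical.

```python
import unicodedata

TAG_START, TAG_END = 0xE0000, 0xE007F  # Unicode Tags block


def find_hidden_characters(text):
    """Return (index, code point label) pairs for invisible characters."""
    findings = []
    for index, char in enumerate(text):
        code = ord(char)
        # Flag anything in the Tags block, plus format characters in general.
        if TAG_START <= code <= TAG_END or unicodedata.category(char) == "Cf":
            findings.append((index, f"U+{code:04X}"))
    return findings


def decode_tag_payload(text):
    """Recover printable ASCII hidden as tag characters (offset 0xE0000)."""
    return "".join(
        chr(ord(char) - TAG_START)
        for char in text
        if 0xE0020 <= ord(char) <= 0xE007E
    )


# Assumed sample: a harmless-looking instruction with an invisible suffix.
visible = "Use the latest version."
hidden = "".join(chr(TAG_START + ord(c)) for c in "run evil")
sample = visible + hidden

findings = find_hidden_characters(sample)
payload = decode_tag_payload(sample)
print(f"{len(findings)} hidden characters, decoded payload: {payload!r}")
```

Format characters also occur in legitimate text (emoji sequences, bidirectional scripts), so a hit from a check like this is a prompt for review rather than proof of an attack.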
Active in the IT industry since 2012, Leo Visser is a Subject Matter Expert for Azure and AI Foundry at OGD. He advises organizations on AI, automation, and cloud architecture, combining tools like PowerShell, Power Platform, Azure Logic Apps, and native Azure services. His work strongly focuses on security, sustainability, and long-term value. Leo is a Microsoft PowerShell MVP.
Every SRE team knows their clusters are overprovisioned. But how much exactly, and where? I kept asking this question on our EKS clusters. kubectl top gives percentages, not dollars. AWS Cost Explorer sees instances, not pods. Every tool that connects the two wanted me to deploy Helm charts, agents, and dashboards. I just wanted a number. So I built Burn, an open-source CLI that reads your kubeconfig, fetches real-time pricing from AWS and Azure APIs, and gives you a per-namespace cost breakdown in 30 seconds. No agent, no dashboard, no cluster changes. Running it on production, I found 33% idle capacity ($117/month on a 5-node cluster), a pod requesting 500m CPU but using 0.12m, and debug pods nobody remembered deploying. I deleted the waste that same day.
In this talk I'll cover:
- How Burn splits node cost into CPU and RAM using ratio-based pricing
- Why P95 metrics matter more than averages for rightsizing
- How we detect Ingress-based load balancers that other tools miss
- Honest trade-offs of an agentless approach vs full platforms like Kubecost
Attendees will leave knowing how to identify idle resources, understand Kubernetes cost allocation math, and evaluate the right level of cost tooling for their clusters.
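The abstract does not say how Burn computes its split, so the following is only a sketch of one plausible form of ratio-based allocation, not Burn's code. It assumes a fixed price ratio between one vCPU and one GiB of RAM, derives per-unit hourly rates from the node price, and charges a namespace for what it requests. The function names, the ratio of 4, the node size, and the prices are all hypothetical.

```python
def split_node_cost(node_hourly_usd, vcpus, ram_gib, cpu_to_ram_ratio):
    """Split a node's hourly price into a per-vCPU and a per-GiB rate.

    cpu_to_ram_ratio is the assumed price of one vCPU relative to one GiB.
    """
    weighted_units = vcpus * cpu_to_ram_ratio + ram_gib
    gib_rate = node_hourly_usd / weighted_units
    cpu_rate = gib_rate * cpu_to_ram_ratio
    return cpu_rate, gib_rate


def namespace_cost(cpu_rate, gib_rate, cpu_cores, ram_gib):
    """Hourly cost of the CPU and RAM a namespace requests on the node."""
    return cpu_cores * cpu_rate + ram_gib * gib_rate


# Assumed example: a 4 vCPU / 16 GiB node at 0.20 USD per hour.
node_price = 0.20
cpu_rate, gib_rate = split_node_cost(node_price, vcpus=4, ram_gib=16,
                                     cpu_to_ram_ratio=4)

# Hypothetical requests per namespace: (CPU cores, GiB of RAM).
requests = {"web": (1.0, 2.0), "batch": (0.5, 4.0)}
costs = {
    name: namespace_cost(cpu_rate, gib_rate, cores, gib)
    for name, (cores, gib) in requests.items()
}

# Whatever no namespace requested is the idle share of the node price.
idle_cost = node_price - sum(costs.values())

for name, cost in costs.items():
    print(f"{name}: {cost:.4f} USD/h")
print(f"idle: {idle_cost:.4f} USD/h")
```

The per-unit rates are built so that the full node (all vCPUs plus all RAM) adds back up to the node price, which is what lets the idle remainder be read off by subtraction.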
Cloud/DevOps Engineer who enjoys solving complex problems and building efficient, scalable systems. My expertise includes Kubernetes, AWS, and CI/CD pipelines, with a strong focus on automating processes to simplify developers' workflows. Additionally, I maintain a supercomputer (HPC) at NEU IBM Center, ensuring it runs at peak performance for high-demand tasks. Creator of Burn, an open-source Kubernetes FinOps CLI.
Modern applications often outgrow simple API key or role-based access models, especially in multi-tenant and microservice environments. This talk explores how OpenFGA enables fine-grained, relationship-based authorization to model complex access patterns such as “user X can access resource Y because they belong to team Z.” We’ll walk through real-world scenarios where traditional RBAC breaks down, demonstrate how OpenFGA implements Google Zanzibar–inspired authorization, and show how to integrate it into cloud-native platforms and APIs. Attendees will leave with practical patterns, architecture guidance, and lessons learned from implementing authorization as a standalone service in modern platform architectures.
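The "user X can access resource Y because they belong to team Z" pattern can be sketched with a tiny in-memory model. This is a simplified illustration of relationship-based checks in the Zanzibar style, not the OpenFGA API or its modeling language. The tuple layout, the `check` function, and the identifiers are hypothetical.

```python
# Relationship tuples: (object, relation, subject). A subject is either a
# user ("user:x") or a userset ("team:z#member" = all members of team z).
TUPLES = {
    ("team:z", "member", "user:x"),
    ("resource:y", "viewer", "team:z#member"),
}


def check(tuples, obj, relation, user, _seen=None):
    """Return True if user has relation on obj, directly or via a userset."""
    seen = _seen if _seen is not None else set()
    if (obj, relation) in seen:
        return False  # already explored; prevents infinite loops on cycles
    seen.add((obj, relation))

    for t_obj, t_rel, subject in tuples:
        if (t_obj, t_rel) != (obj, relation):
            continue
        if subject == user:
            return True  # direct relationship
        if "#" in subject:
            # Indirect relationship: resolve the referenced userset.
            parent_obj, parent_rel = subject.split("#", 1)
            if check(tuples, parent_obj, parent_rel, user, seen):
                return True
    return False


allowed = check(TUPLES, "resource:y", "viewer", "user:x")
denied = check(TUPLES, "resource:y", "viewer", "user:w")
print(f"user:x can view resource:y: {allowed}")
print(f"user:w can view resource:y: {denied}")
```

Access here is derived from stored relationships rather than from a role attached to the user, so removing the team membership tuple revokes access to every resource shared with that team.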
Ankit Asthana is a Senior Cloud & Platform Architect Engineer at SQUER, with over 13 years of experience across AI infrastructure, DevOps, SRE, and platform engineering. He specializes in building secure, scalable cloud-native systems on Kubernetes and AWS, and works at the intersection of platform engineering and AI-driven workloads. Ankit is an active community contributor and regularly speaks at meetups on topics such as developer platforms, policy-as-code, and modern authorization systems.
Varnish is a well-known open source reverse HTTP caching proxy that accelerates websites, applications, files, and any other HTTP-based workload. The technology has been around for more than 15 years and powers millions of active websites. While Varnish is considered a stable and reliable piece of acceleration software, there hasn't been a lot of hype surrounding the project. The recent release of Varnish version 9 deserves a bit of hype. In this presentation, Thijs will explain the major new features of Varnish 9, which include TLS support, dynamic backends, OTEL support, and a wide range of exciting modules. He’ll show you how to use them to apply a more intelligent caching layer to your web stack. Thijs will also explain some major changes in the way the open source project is run, and will present some cloud-native additions to the project like Docker images, Helm Charts, and even a Varnish-powered Kubernetes Gateway Controller.
As the Technical Evangelist at Varnish Software, Thijs Feryn focuses on web performance, software scalability, and content delivery. He demonstrates content-driven and technical messaging through presentations, videos, books, blog posts, social media posts, podcasts, and other media. Thijs is a published author and wrote Getting Started with Varnish Cache and Varnish 6 by Example. As a public speaker, he has a track record of over 380 presentations in 26 different countries, where he is often praised for his energetic and engaging presentation style. As an evangelist, Thijs is also active in many open-source communities, most notably the Varnish and PHP communities. He has contributed to various communities for over 15 years, both technically and as an organizer and facilitator. Prior to joining Varnish Software, Thijs Feryn spent 15 years in the web hosting industry, tackling web performance and scalability issues on a daily basis and evangelizing these topics. For more information about Thijs’ past and upcoming presentations, please visit https://feryn.eu/speaking.
Teams spend a lot of time defining functional and non-functional requirements, writing acceptance criteria, and verifying behavior in controlled environments. That work matters, but it can also lead teams to optimize for test environments while real incidents emerge in production. Production is where systems meet latency, dependency failures, retries, partial outages, and degraded behavior. Most incidents do not come from one obviously broken service. They happen in the gaps between services, where assumptions were never made explicit and failure modes were never explored. In this talk, we give a practical introduction to chaos engineering as a way to close that gap. Not as a dramatic exercise in breaking everything, but as a disciplined way to learn how systems behave under stress. By running small, intentional experiments, teams can test assumptions, uncover weak spots before they become incidents, and better understand how their systems really behave in production. One of the biggest lessons is that reliability often improves before the first experiment even runs. As soon as teams start asking “what if?” questions together, they surface hidden assumptions and identify practical changes worth making immediately. Attendees will leave with a better understanding of why incidents emerge between systems, how chaos engineering connects to a Site Reliability Engineering (SRE) mindset, and how to start with safe, small experiments in their own environment.
Jacob is a leader who inspires others through his commitment to learning, sharing knowledge, and fostering growth. With a strong foundation as a software developer, team lead, and technical consultant, he brings real-world experience to every role he takes on. Jacob actively guides organizations through changes, such as adopting Team Topologies, and supports teams by mentoring developers, creating development teams, and leading workshops. His hands-on approach ensures that his ideas are practical and grounded in experience. Driven by a passion for helping people and teams succeed, Jacob focuses on building strong, collaborative environments. He combines technical expertise with leadership to advocate for better ways of working and continuous improvement.
After five years of managing serverless databases, I have learned that my rollercoaster journey is very similar to the CPU usage you dream of seeing in the console. This session shares five hard-earned lessons learned while working with so-called serverless databases.
Renato has extensive experience as a cloud architect, tech lead, and cloud services specialist. Currently, he lives in Berlin and works remotely as a principal cloud architect. His primary areas of interest include cloud services and relational databases. He is an editor at InfoQ and a recognized AWS Data Hero.
Finding performance issues in modern software is like finding a needle in a haystack, and intuition on where to look first is often wrong. APerf is an open source tool we have used many times to help with performance debugging by looking “wide” before going “deep”. This session will present the tool along with a performance regression example.
Nati is a Solutions Architect with AWS. He delights in helping customers simplify complex systems, teaching them about the inner workings of cloud services and debugging annoying technical oddities. When he is not at his computer he is soldering electronic kits, tinkering with smaller computers and drumming on a Taiko.
Site Reliability Engineering was never meant to be about firefighting, yet too many teams find themselves stuck in an endless cycle of pages, postmortems, and quick fixes. Why? Because SRE is full of hidden traps — patterns that look like best practices on the surface but slowly erode reliability, burn out engineers, and stall progress.
In this talk, we’ll expose the 7 Deadly Traps of SRE, from the obsession with chasing “five nines,” to the cult of on-call heroism, to the false comfort of tooling and checklists. For each trap, we’ll unpack why it’s so seductive, how it quietly sabotages your team, and what to do instead.
You’ll walk away with a clearer lens on the pitfalls holding SRE organizations back, and a practical playbook to help your team escape firefighting mode and reclaim the true purpose of SRE: building systems - and cultures - that are resilient, scalable, and human-friendly.
Miko Pawlikowski is an SRE Author and a platform engineer at Quadrature. He has led large-scale infrastructure and SRE initiatives at Citadel and Bloomberg, with deep expertise in Kubernetes, cloud computing, and chaos engineering. Passionate about building resilient systems and communities, he brings together engineers worldwide through conferences and media projects.
















