Observability Clinic: Last night Dynatrace saved my life from a broken k8s cluster
Like any other technology transformation, Kubernetes (k8s) adoption typically starts with small “pet projects”: one k8s cluster here, another one over there. If you don’t pay attention, the clusters spread like wildfire, and you may end up like many organizations these days, with hundreds or thousands of k8s clusters owned by different teams and spread across on-premises and cloud environments, some shared, some very isolated.
As you scale your k8s clusters, you need to enforce best practices on quotas and limits. You need to invest in standardizing observability to ensure your deployed services and applications stay within your SLOs. You also need to find a way to collect and store traces, logs, events and metrics at scale, and to give DevOps teams and SREs access to this data when they troubleshoot problems.
In this Observability Clinic, Henrik Rexed, Cloud Native Advocate at Dynatrace, will remind us of the importance of observability in the k8s landscape. He will walk us through two real production horror stories related to Kubernetes. For each story he will:
- Explain how we could avoid this situation by applying Kubernetes Best Practices
- Show the various features of Dynatrace that could save our lives from a broken cluster
You will also get an outlook on further planned improvements. Make sure to bring your questions, as we will conclude with a live Q&A.