As any developer will tell you, we all accidentally add bugs to our software. Although unit testing can catch many potential issues, it is inevitable that a few bugs end up in production.
Once a bug has been identified, it is vitally important to be able to quickly investigate, triage, and diagnose the issue. While murder mysteries can be fun, no one likes playing a game of “whodunnit” when facing a bug, especially not at 3am after you’ve been woken by PagerDuty!
Cloud Native: More Services, More Layers, More Challenges
Before the era of cloud native, diagnosing and debugging systems was relatively straightforward. There was typically a monolithic application (or a small number of applications), so the search space for a problem was constrained primarily by the size of a single code base.
Applications were also typically executed as a standalone process on a virtual machine. Viewing logs or attaching a remote debugger was often as simple as SSHing into the machine.
All of this changed with the adoption of microservices, containers, and Kubernetes. These new architectures and technologies enable rapid evolution of systems, but the cost is often increased complexity and reduced understandability. This can lead to the “whodunnit” style of bug hunting!
Locating an issue in a system composed of ten microservices means that there are ten code bases where a bug could be lurking, and this isn’t counting the integrations and gaps between services where bugs can also hide. Containerizing applications and running them via Kubernetes also adds layers of complexity for viewing logs and debugging, and the applications can be rescheduled at a moment’s notice.
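Before cloud native compared with cloud native:
- Number of Services. Before: 1 (or a small number). Cloud native: many.
- Infrastructure Layers. Before: VM. Cloud native: VM, Kubernetes, containers.
- Viewing Logs. Before: tail a single process. Cloud native: tail multiple processes, or logs shipped to a centralized location.
- Correlation of User Requests. Before: single/multiple threads in a single process. Cloud native: multiple threads in multiple processes separated by network boundaries.
- Debugging. Before: debug locally running instances, or open ports in the VM to enable remote debugging. Cloud native: open ports in the firewall, VPC, K8s, and containers to enable remote debugging.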
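To make the "Correlation of User Requests" comparison concrete, here is a minimal Python sketch of what correlating centralized logs can look like. It assumes every service tags its log lines with a shared request ID; the `LogLine` type, the `correlate` function, the service names, and the sample messages are all hypothetical and only illustrate the idea.

```python
from collections import defaultdict
from typing import NamedTuple


class LogLine(NamedTuple):
    timestamp: float  # seconds since an arbitrary start point
    service: str      # which service emitted the line
    request_id: str   # assumed: one ID shared by all lines for a user request
    message: str


def correlate(lines):
    """Group log lines by request ID and order each group by time."""
    by_request = defaultdict(list)
    for line in lines:
        by_request[line.request_id].append(line)
    return {
        request_id: sorted(group, key=lambda entry: entry.timestamp)
        for request_id, group in by_request.items()
    }


# Lines as they might arrive at a centralized location: interleaved
# across services and not in time order (made-up sample data).
shipped = [
    LogLine(3.0, "payments", "req-1", "card declined"),
    LogLine(1.0, "frontend", "req-1", "checkout started"),
    LogLine(1.5, "frontend", "req-2", "search started"),
    LogLine(2.0, "orders", "req-1", "order created"),
]

# Print the path of one user request across all services.
for line in correlate(shipped)["req-1"]:
    print(f"{line.timestamp:>4} {line.service:<9} {line.message}")
```

In a single process, the thread or call stack gives you this grouping for free. Across network boundaries, nothing links the lines unless an identifier travels with the request.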
Become a Cloud Native Sherlock Holmes
Debugging services in Kubernetes effectively does not depend on a single tool or technique. A combination of approaches is required, and free community tools make this easier.
Ready to investigate and debug your own Kubernetes woes?
This learning journey walks you through the primary concepts and hands-on activities required to debug issues across your cluster and multi-service applications.
Kubernetes beginner or experienced user
Time to complete
40 minutes • 10 lessons
What you’ll need
Nothing. We’ll walk you through the concepts and install the tools you’ll need as we go.
What you’ll learn
- Annotating services to quickly identify key debugging information
- How distributed tracing helps follow requests across multiple services
- Debugging your cluster when things go wrong
- Using Telepresence to debug services locally
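
As a taste of the distributed tracing item above, the following Python sketch shows the core idea: each service keeps the caller's trace ID, creates its own span ID, and passes both downstream. It borrows the `traceparent` header layout from the W3C Trace Context specification (version, trace ID, parent span ID, flags). The `handle` function and the service names are hypothetical, and the services are simulated as plain function calls rather than real HTTP requests. In practice a tracing library would normally do this for you.

```python
import secrets


def handle(service, headers, downstream=()):
    """Simulate one service handling a request and calling others.

    Returns the spans recorded by this service and everything it called.
    """
    incoming = headers.get("traceparent")
    if incoming:
        # Continue the caller's trace: reuse its trace ID.
        _version, trace_id, parent_span_id, flags = incoming.split("-")
    else:
        # No header, so this service starts a new trace.
        trace_id, parent_span_id, flags = secrets.token_hex(16), None, "01"

    span_id = secrets.token_hex(8)  # identifies this service's unit of work
    spans = [{
        "service": service,
        "trace_id": trace_id,
        "span_id": span_id,
        "parent": parent_span_id,
    }]

    # Propagate the trace ID plus our own span ID as the new parent.
    outgoing = {"traceparent": f"00-{trace_id}-{span_id}-{flags}"}
    for next_service in downstream:
        spans += handle(next_service, outgoing)
    return spans


# A request enters at a made-up "frontend", which calls two other services.
spans = handle("frontend", {}, downstream=("orders", "payments"))
for span in spans:
    print(span["service"], span["trace_id"], span["parent"])
```

Because every span carries the same trace ID, a tracing backend can stitch the three services' work back into a single request timeline.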