From the course: DevOps Foundations


Operational feedback: Incident response and retrospectives

- Remember how we said that all of our systems are sociotechnical systems and humans are a part of their resilient operation?
- Yeah, well, you can do all the other stuff, right. You can have great design and development and testing and great monitoring, but things are still going to break.
- Since this is absolutely no surprise, part of the job is to get really good at responding to and remediating problems in your production system, which we affectionately refer to as incidents.
- Incident response is an activity that needs to be practiced. It's the place where in-depth system knowledge and a cool head make all the difference.
- There are three general activities you want to be good at for incident response: troubleshooting, understanding the system well enough to diagnose and remediate the problem; automation, having tooling already created to speed up and make safe your information-gathering and remediation activities; and communication. Incident response often requires a team of…
