From the course: DevOps Foundations

Building for reliability: Practice

- As I like to say, "Dev comes from school, but Ops comes from the street." Developers tend to come from computer science backgrounds, but a lot of system administrators are self-taught. They learn from the real world of running large-scale systems. SRE is in large part bringing that hard-won ops experience to the table and turning it from manual fiddling into disciplined engineering. You are now about to witness the strength of street knowledge. The hard truth is that all systems fail. Code is often written with the assumption that failure of the underlying systems is, if not impossible, at least very unusual and probably should result in some manual intervention. But failures and slowdowns, which are as much of a threat as outages, are common. In a modestly complex system, individual components are failing all the time, as every operations engineer who's spent time looking through log files knows, and this doesn't always cause an outage. System components are like slices of Swiss cheese…
