There are very few things as satisfying as destruction, especially when we’re frustrated.
How often have you had an issue that you could not solve and just wanted to scream or destroy things? Have you ever had a problem in production that was negatively affecting a lot of users? Were you under a lot of pressure to solve it, yet unable to “crack” it as fast as you should have? It must have happened, at least once, that you wanted to take a hammer and destroy servers in your datacenter. If something like that has never happened to you, then you have probably never been in a position with a lot of pressure. In my case, there were countless times when I wanted to destroy things. But I didn’t, for quite a few reasons. Destruction rarely solves problems, and it usually leads to negative consequences. I cannot just go and destroy a server and expect not to be punished. I cannot hope to be rewarded for such behavior.
What would you say if I told you that we can be rewarded for destruction and that we can do a lot of good by destroying stuff? If you don’t believe me, you soon will. That’s what chaos engineering is about. It is about destroying, obstructing, and delaying things in our servers and in our clusters. And we do all that, and many other things, for a very positive outcome.
Chaos engineering tries to find the limits of our system. It helps us deduce what the consequences are when bad things happen. We simulate adverse effects in a controlled way, and we do that to improve our systems, making them more resilient and capable of resisting and recovering from harmful and unpredictable events.
That’s our mission. We will try to find ways to improve our systems based on the knowledge that we obtain through chaos.