Watch for Kubernetes resource changes and trigger handlers using kwatchman
10 Aug 2019
What is the real Cost of Change?
According to research from Gartner Group, “80 percent of unplanned downtime is caused by people and process issues (changes)”.
Back when I was an SRE on on-call duty, I was looking for a “passive” and completely automated way to know when something had changed within my Kubernetes clusters: changes such as the number of replicas, container image versions, or really anything that could be relevant and end up impacting production.
After some quick research I found kubewatch from Bitnami as a potential solution. However, kubewatch did not implement the feature I needed (reporting manifest changes) and is in general quite noisy. After a couple of attempts to contribute and release that feature, I realized that despite being a popular repository with over 900 stars, it was not properly maintained and had several problems, which made it very difficult for me to implement such a large feature. It was the perfect excuse to build my own solution from scratch and even improve on it!
That is how kwatchman was born. It is a custom Kubernetes controller that lists and watches cluster resources using shared informers, and then processes the logical changes in their manifests through a chain of handlers.
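To make the "chain of handlers" idea more concrete, here is a minimal sketch in Python. It is not kwatchman's actual code: the `Event` shape, the `run_chain` function and the two example handlers are hypothetical names invented for illustration. It only shows the general pattern of passing one event through an ordered list of handlers, where any handler may stop the chain.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Event:
    """A change observed on a cluster resource (hypothetical shape)."""
    kind: str
    namespace: str
    name: str
    manifest: dict
    notes: List[str] = field(default_factory=list)


# A handler receives an event and returns it (possibly enriched),
# or returns None to stop the chain for this event.
Handler = Callable[[Event], Optional[Event]]


def run_chain(handlers: List[Handler], event: Event) -> Optional[Event]:
    """Pass the event through each handler in order; stop early on None."""
    current: Optional[Event] = event
    for handler in handlers:
        current = handler(current)
        if current is None:
            return None
    return current


def ignore_kube_system(event: Event) -> Optional[Event]:
    """Example handler: drop events coming from the kube-system namespace."""
    return None if event.namespace == "kube-system" else event


def annotate(event: Event) -> Optional[Event]:
    """Example handler: attach a human-readable note to the event."""
    event.notes.append(f"{event.kind} {event.namespace}/{event.name} changed")
    return event
```

With this shape, a notification handler (Slack, for instance) would simply be one more function at the end of the list, and the order of the list decides which filters run before it.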
Essentially, you deploy kwatchman into your cluster of choice, and you can restrict watching to a given namespace or to a set of label selectors.
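As a rough illustration of what restricting by namespace and label selectors means, the sketch below checks a manifest against both filters. The `matches` function is a hypothetical helper, not part of kwatchman, and it only covers equality-based selectors. In a real controller this filtering is normally delegated to the Kubernetes API when listing and watching, so treat this purely as a picture of the semantics.

```python
from typing import Dict, Optional


def matches(
    manifest: dict,
    namespace: Optional[str] = None,
    selector: Optional[Dict[str, str]] = None,
) -> bool:
    """Return True if the manifest is in the namespace and carries every selector label."""
    metadata = manifest.get("metadata", {})
    # A namespace of None means "watch all namespaces".
    if namespace is not None and metadata.get("namespace") != namespace:
        return False
    labels = metadata.get("labels") or {}
    # Equality-based selector: every key/value pair must be present on the resource.
    return all(labels.get(key) == value for key, value in (selector or {}).items())
```

For example, `matches(deployment, namespace="prod", selector={"app": "web"})` would only accept resources in `prod` that are labeled `app=web`.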
Every time a Kubernetes resource manifest changes, an event is generated and passed through the configured chain of handlers, triggering powerful actions.
For example, to be notified on Slack of every change made to your deployments and services, simply install kwatchman using its Helm chart.
⚠ To get notifications in Slack you have to configure it first.
The default values file does not include the Slack handler, so you will have to configure it: copy values.yaml locally from the chart repository and edit it.
Uncomment the following lines and add your webhook URL. You can also configure the resources that you want to watch here. When you are ready, redeploy kwatchman with the new changes.
$ helm upgrade kwatchman \
--namespace kwatchman \
--values values.yaml \
snl-charts/kwatchman
Of course you can customize other values and even create your own handlers!
Still lost? This is how changing the number of replicas in the cluster would cause kwatchman to trigger a notification in Slack.
The strategy ahead for kwatchman is to focus on change management: providing semantic differences by creating a structure that holds the list of changes and is made available to the handlers, so that they can apply logic based on them, such as "if this thing changed, do this".
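One possible shape for such a structure is sketched below in Python. This is only an assumption about how semantic differences could be represented, not kwatchman's design: the `Change` record, the `diff` function and the `replicas_changed` check are hypothetical names. The sketch walks two versions of a manifest and returns a flat list of changed paths with their old and new values. Lists are compared as whole values to keep it short.

```python
from dataclasses import dataclass
from typing import Any, List

# Sentinel marking a key that is absent on one side of the comparison.
_MISSING = object()


@dataclass(frozen=True)
class Change:
    """One semantic difference: a dotted path with its old and new value."""
    path: str
    old: Any
    new: Any


def diff(old: Any, new: Any, path: str = "") -> List[Change]:
    """Recursively compare two manifests and list the fields that differ."""
    if isinstance(old, dict) and isinstance(new, dict):
        changes: List[Change] = []
        for key in sorted(set(old) | set(new)):
            child = f"{path}.{key}" if path else str(key)
            changes.extend(
                diff(old.get(key, _MISSING), new.get(key, _MISSING), child)
            )
        return changes
    if old == new:
        return []
    # Added or removed fields are reported with None on the missing side.
    return [
        Change(
            path,
            None if old is _MISSING else old,
            None if new is _MISSING else new,
        )
    ]


def replicas_changed(changes: List[Change]) -> bool:
    """Example of handler logic: 'if this thing changed, do this'."""
    return any(change.path == "spec.replicas" for change in changes)
```

A handler could then react only to the paths it cares about, for example notifying a channel when `replicas_changed(diff(previous, current))` is true and ignoring everything else.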
That way, teams can streamline the deployment events of their projects for troubleshooting, and even machines (AIOps) could leverage them for root cause analysis by logging the relevant changes.