Java Operator SDK — Introduction to Event Sources

The Definition of the Problem

When creating an operator, we extend the Kubernetes API with our own custom types. The intention behind these custom resources is to hide the complexity of resource management and offer a simple API on which the user mostly executes CRUD operations. All the complex logic and workflow is then implemented in a controller. Inside a controller, however, we usually just manage other resources: we create pods, deployments, persistent volumes, or other Kubernetes or non-Kubernetes resources. Let’s call these “dependent resources”, since they depend on our custom resource. In other words, we manage these dependent resources through a custom resource. Two problems typically come up:

  1. When we create a dependent resource, it can take a long time until it is successfully created and reaches its target state (think of provisioning a database, for example). So we have two options: wait synchronously in the controller, blocking the thread until the resource is created, or react asynchronously to changes of the dependent resource. In the second case the controller is executed again when the dependent resource changes and continues the workflow from there.
  2. Let’s say we are in a state where all the dependent resources are created, but suddenly one of them is destroyed; for the sake of simplicity, a pod crashes. Until now we had no way to react to such an event before the next execution of the controller (or it was quite cumbersome to hack it into our operators), because we usually executed the controller only when the custom resource changed. We could add a timer (which we also support now) to periodically execute the controller and poll the state of the dependent resource. This works, but it is not ideal: with hundreds of custom resource instances, polling all the related APIs is inefficient. What we want is a way to get notified, in other words to trigger the controller when there are changes in the resources that we manage.

Event Sources

Event sources are a relatively simple yet powerful and extensible concept for triggering controller executions, usually based on changes of dependent resources. To solve the problems mentioned above, we effectively watch the resources we manage for changes and reconcile the state when a resource changes. Note that the resources we watch can be Kubernetes objects as well as non-Kubernetes objects. Typically, for non-Kubernetes objects or services, we can extend our operator to handle webhooks or websockets, or to react to any event coming from a service we interact with.

  • The CustomResourceEventSource is a special event source that sends events about changes of our custom resource. It is always registered for every controller by default.
  • An event is always related to a custom resource, so our API did not change: UpdateControl<R> createOrUpdateResource(R resource, Context<R> context);
    We do, however, receive the event(s) that triggered the controller execution in the context object.
  • Concurrency is still handled for you: we still guarantee that there is no concurrent execution of the controller for the same custom resource (executions run in parallel if an event is related to another custom resource instance).
    Note that if we receive multiple events while a controller is being executed, we buffer those events and execute the controller again once the previous execution has finished.
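
To make the buffering behaviour more concrete, here is a minimal sketch of the bookkeeping it implies. This is not the SDK’s actual implementation: the class PerResourceEventBuffer, its method names, and the use of plain strings for resource ids and events are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/**
 * Hypothetical sketch (not the SDK's code): allows at most one controller
 * execution per custom resource and buffers events that arrive meanwhile.
 */
final class PerResourceEventBuffer {
    // Ids of custom resources whose controller execution is in progress.
    private final Set<String> running = new HashSet<>();
    // Events received while an execution for that resource was in progress.
    private final Map<String, List<String>> buffered = new HashMap<>();

    /**
     * Called when an event arrives. Returns the events to execute the
     * controller with right now, or an empty list if the event was buffered.
     */
    synchronized List<String> onEvent(String resourceId, String event) {
        if (running.contains(resourceId)) {
            buffered.computeIfAbsent(resourceId, id -> new ArrayList<>()).add(event);
            return List.of();
        }
        running.add(resourceId);
        return List.of(event);
    }

    /**
     * Called when an execution finishes. Returns the buffered events for one
     * follow-up execution, or an empty list if the resource is now idle.
     */
    synchronized List<String> onExecutionFinished(String resourceId) {
        List<String> pending = buffered.remove(resourceId);
        if (pending == null) {
            running.remove(resourceId);
            return List.of();
        }
        return pending; // the resource stays marked as running
    }
}
```

In this sketch, events for different resource ids never block each other, while several events for the same id collapse into a single follow-up execution.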

Best Practices

During the development and usage of our first event sources, we discovered some patterns related to their usage. Here I will describe some best practices as we see them now.

Using Events Only as Triggers

When a controller executes, we receive the events that triggered the execution. From these events we can see which system has changed since the last execution. Although it might be tempting, one of the patterns we have found beneficial is not to use these events in the controller at all, that is, not to implement reconciliation logic based on them. Instead, always check the state of the dependent resources and the spec and status of the custom resource, and reconcile based on those. Note that reconciling the whole state of every dependent resource can still be very efficient when the actual state is read from a local in-memory cache (see the Caching section below).
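
As a rough illustration of this practice, the sketch below computes the actions to take purely from the desired and the actual state. The triggering events are passed in but deliberately never read. The class StateBasedReconciler, the string-based actions, and the name-to-replica-count maps are hypothetical simplifications, not part of the SDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: reconcile from state, use events only as a trigger. */
final class StateBasedReconciler {

    /**
     * @param desired          dependent resource name -> replica count taken from the spec
     * @param actual           dependent resource name -> replica count currently observed
     * @param triggeringEvents events that caused this execution; intentionally ignored
     * @return the actions needed to bring the actual state to the desired state
     */
    static List<String> reconcile(Map<String, Integer> desired,
                                  Map<String, Integer> actual,
                                  List<String> triggeringEvents) {
        List<String> actions = new ArrayList<>();
        // TreeMap only gives a stable, alphabetical order of the actions.
        for (Map.Entry<String, Integer> entry : new TreeMap<>(desired).entrySet()) {
            Integer current = actual.get(entry.getKey());
            if (current == null) {
                actions.add("create " + entry.getKey());
            } else if (!current.equals(entry.getValue())) {
                actions.add("update " + entry.getKey());
            }
        }
        for (String name : new TreeMap<>(actual).keySet()) {
            if (!desired.containsKey(name)) {
                actions.add("delete " + name);
            }
        }
        return actions;
    }
}
```

Because the result depends only on the two state maps, a missed, duplicated, or reordered event cannot lead the controller to a wrong decision; at worst it delays the next execution.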

Caching

Typically, when we work with Kubernetes (and possibly with other systems), we manage objects in a declarative way. This is also true for Event Sources. For example, if we watch a Kubernetes Deployment object for changes in an event source, we always receive the whole object from the Kubernetes API. Later, when we reconcile in the controller (without using the events), we want to check the state of this deployment and of the other dependent resources. We could read the object again from the Kubernetes API. However, since we watch for changes, we know that the Event Source always receives the most up-to-date version. So, naturally, we can cache the latest received object in the Event Source and read it from there when needed.
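
A minimal sketch of such a caching event source could look like the following. The class CachingEventSource, its callbacks onChanged and onDeleted, and the Consumer used to trigger the controller are assumptions for illustration and do not mirror the SDK’s real interfaces.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

/**
 * Hypothetical sketch: an event source that keeps the latest version of each
 * watched object so the controller can read it without calling the API again.
 */
final class CachingEventSource<T> {
    // Latest known version of each watched object, keyed for example by "namespace/name".
    private final Map<String, T> cache = new ConcurrentHashMap<>();
    // Callback that triggers a controller execution for the given key.
    private final Consumer<String> trigger;

    CachingEventSource(Consumer<String> trigger) {
        this.trigger = trigger;
    }

    /** Invoked by the watch when an object is added or modified. */
    void onChanged(String key, T latest) {
        cache.put(key, latest); // update the cache first, then trigger
        trigger.accept(key);
    }

    /** Invoked by the watch when an object is deleted. */
    void onDeleted(String key) {
        cache.remove(key);
        trigger.accept(key);
    }

    /** Read by the controller during reconciliation instead of an API call. */
    Optional<T> getLatest(String key) {
        return Optional.ofNullable(cache.get(key));
    }
}
```

The cache is updated before the controller is triggered, so an execution started by a change should already see that change when it reads from the cache.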

Filtering Events

When we watch a Kubernetes object (or any other resource) in an Event Source, we do not necessarily want to propagate an event for every change we receive, only for the changes that we know could trigger some meaningful action in a controller. So it is again up to the implementation of an Event Source to provide a (possibly extendable and reusable) interface that makes it easy to filter out the events we are not interested in.
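
One possible shape for such a filter is a predicate over the previous and the current version of the watched object, as in the sketch below. The names FilteringEventSource and ObservedDeployment, and the choice of fields, are hypothetical and only serve the example.

```java
import java.util.function.BiPredicate;
import java.util.function.Consumer;

/** Hypothetical, heavily simplified view of a watched object. */
final class ObservedDeployment {
    final String resourceVersion;
    final int readyReplicas;

    ObservedDeployment(String resourceVersion, int readyReplicas) {
        this.resourceVersion = resourceVersion;
        this.readyReplicas = readyReplicas;
    }
}

/** Hypothetical sketch: propagates only the changes accepted by a pluggable filter. */
final class FilteringEventSource<T> {
    // Decides, from the previous and current version, whether a change matters.
    private final BiPredicate<T, T> isRelevant;
    // Callback that triggers a controller execution.
    private final Consumer<T> trigger;

    FilteringEventSource(BiPredicate<T, T> isRelevant, Consumer<T> trigger) {
        this.isRelevant = isRelevant;
        this.trigger = trigger;
    }

    /** Returns true if the change was propagated to the controller. */
    boolean onUpdate(T previous, T current) {
        if (!isRelevant.test(previous, current)) {
            return false; // filtered out: the controller is not triggered
        }
        trigger.accept(current);
        return true;
    }
}
```

For instance, a filter such as (previous, current) -> previous.readyReplicas != current.readyReplicas would drop updates where only the resource version changed. Keeping the filter as a separate predicate lets the same event source be reused with different notions of a meaningful change.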

Closing Note

Event Sources are quite powerful components that support efficient implementation of controllers. We did not want to restrict the APIs or the access to information, such as which events triggered a controller, since some corner-case implementations might need it. However, this opens up more possible patterns for implementing a controller, which are now also a source of debate about the proper way to do it. In the future we also want to create higher-level abstractions that will hopefully make controllers easier to implement and to reason about.
