Installation
You may use a prepared image flant/addon-operator to install addon-operator in a cluster. The image comprises a binary addon-operator file as well as several required tools: helm, kubectl, jq, bash.
The installation incorporates the image building process with files of modules and hooks, applying the necessary RBAC rights and deploying the image in the cluster.
Examples
To experiment with modules, hooks, and values we’ve prepared some examples.
Deckhouse Kubernetes Platform was the initial reason for creating addon-operator, so its modules can be a valuable source of inspiration for implementing your own modules.
Sharing your examples of using addon-operator is much appreciated. Please use the relevant Discussions section for that.
Community
Please feel free to reach developers/maintainers and users via GitHub Discussions for any questions regarding addon-operator.
You’re also welcome to follow @flant_com to stay informed about all our Open Source initiatives.
License
Apache License 2.0, see LICENSE.
Overview
Addon-operator combines Helm charts with hooks and values storage to transform charts into smart modules that configure themselves and respond to changes in the cluster. It is a sister project for shell-operator and is actively used in Deckhouse Kubernetes Platform to implement its modules.
Features
- Discovery of values for Helm charts — parameters can be generated, calculated or retrieved from the cluster;
- Continuous discovery — parameters can be changed in response to cluster events;
- Controlled Helm execution — addon-operator monitors the Helm operation to ensure the Helm chart’s successful installation. Coming soon: use kubedog to track deploy status and more;
- Custom extra actions before and after running Helm as well as any other events via the hooks paradigm. See related shell-operator capabilities.
Additionally, addon-operator provides:
- ease of maintenance of Kubernetes clusters: use the tools that Ops are familiar with to build your modules and hooks such as Bash, kubectl, Python, etc;
- the execution queue of modules and hooks that ensures the launch sequence and repeated execution in case of an error, which simplifies programming of modules and ensures predictable outcome of their operation;
- the possibility of dynamic enabling/disabling of a module (depending on detected parameters);
- the ability to tie conditions of module activation to the activation of other modules;
- the unified ConfigMap for the configuration of all settings;
- the ability to run Helm only if parameters have changed. In this case, helm history would output only releases with changes;
- global hooks for figuring out parameters and performing actions that affect several dependent modules;
- off-the-shelf metrics for monitoring via Prometheus.
Hooks and Helm values
Hooks are triggered by Kubernetes events and in response to other stimuli.
A hook is an executable file that can make changes to Kubernetes and set values of Helm (they are stored in the memory of addon-operator) during execution.
Hooks are part of a module, alongside the module’s Helm chart. If a hook changes values, addon-operator upgrades the release of the Helm chart.
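As an illustration of a hook that sets Helm values, here is a minimal sketch. It assumes the shell-operator convention that a hook prints its binding configuration when called with --config, and that values are changed by writing a JSON Patch to the file named by $VALUES_JSON_PATCH_PATH. The module name exampleModule and the replicas value are hypothetical, and the patch assumes the exampleModule section already exists in the values.

```shell
#!/bin/sh
# Sketch of a module hook; "exampleModule" and "replicas" are hypothetical names.
hook_main() {
  if [ "${1:-}" = "--config" ]; then
    # Binding configuration (assumed format): run before Helm with ORDER 10.
    printf '%s\n' 'configVersion: v1' 'beforeHelm: 10'
    return 0
  fi
  # addon-operator passes the patch file path in VALUES_JSON_PATCH_PATH.
  # Fall back to /dev/null so the sketch is harmless outside the operator.
  patch_file="${VALUES_JSON_PATCH_PATH:-/dev/null}"
  # JSON Patch that sets one value in the module's section of the values.
  printf '%s\n' '[{"op": "add", "path": "/exampleModule/replicas", "value": 2}]' > "$patch_file"
}

hook_main "$@"
```

Because the hook changes a value, addon-operator would then upgrade the module's Helm release with the patched values.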
Modules
There can be many modules.
In addition to modules, addon-operator supports global hooks and global values, which have their own values storage. Global hooks are triggered by events and, when active, they can:
- Make changes to Kubernetes cluster;
- Make changes to global values storage.
If the global hook changes values in the global storage, then addon-operator triggers an upgrade of releases of all Helm charts.
Running Addon-operator
Environment variables
GLOBAL_HOOKS_DIR — a directory with global hook files.
MODULES_DIR — paths separated by colon where modules are located.
UNNUMBERED_MODULE_ORDER — an integer number to use as the default order for modules without numbered prefix.
ADDON_OPERATOR_NAMESPACE — a required parameter with namespace where Addon-operator is deployed.
ADDON_OPERATOR_CONFIG_MAP — a name of ConfigMap to store values. Default is addon-operator.
Namespace and config map name are used to watch for ConfigMap changes.
Example of container:
containers:
- image: addon-operator-image:latest
  env:
  - name: ADDON_OPERATOR_NAMESPACE
    valueFrom:
      fieldRef:
        fieldPath: metadata.namespace
  - name: ADDON_OPERATOR_CONFIG_MAP
    value: my-values
With these variables, Addon-operator monitors the ConfigMap/my-values object.
ADDON_OPERATOR_LISTEN_ADDRESS — address for http server. Default is 0.0.0.0.
ADDON_OPERATOR_LISTEN_PORT — port for http server. Default is 9650.
Addon-operator starts an http server and listens on ADDRESS:PORT. The server exposes a liveness probe endpoint (/healthz) and a /metrics endpoint.
env:
...
- name: ADDON_OPERATOR_LISTEN_ADDRESS
  valueFrom:
    fieldRef:
      fieldPath: status.podIP
- name: ADDON_OPERATOR_LISTEN_PORT
  value: "9090"
livenessProbe:
  httpGet:
    path: /healthz
    port: 9090
ADDON_OPERATOR_PROMETHEUS_METRICS_PREFIX — a prefix for Prometheus metrics. Default is addon_operator_.
env:
- name: ADDON_OPERATOR_PROMETHEUS_METRICS_PREFIX
  value: dev_cluster_
curl localhost:9650/metrics
...
dev_cluster_live_ticks 32
...
ADDON_OPERATOR_CRD_EXTRA_LABELS – a string with CRD label pairs. For example: heritage=my-app,scope=extra. Default is heritage=addon-operator.
ADDON_OPERATOR_CRD_FILTER_PREFIXES – a comma-separated string of filters for CRDs. Default is doc-,_.
Kubernetes client settings
KUBE_CONFIG — a path to a kubernetes client config (~/.kube/config)
KUBE_CONTEXT — a context name in a kubernetes client config (similar to the --context flag of kubectl)
KUBE_CLIENT_QPS and KUBE_CLIENT_BURST — qps and burst parameters to rate-limit requests to Kubernetes API server. Default qps is 5 and burst is 10 as in a rest/config.go file.
Helm settings
Addon-operator expects the “helm” binary to be available in $PATH. It detects the Helm version at start by executing the “helm --help” command. If this is not appropriate for some reason, you can use these settings:
HELM_BIN_PATH — a path to a Helm binary.
HELM_POST_RENDERER_PATH — a path to a Helm post-renderer binary.
HELM3 — set to “yes” to disable auto-detection and explicitly enable compatibility with helm3.
HELM_IGNORE_RELEASE — a name of the release that should not be treated as the module’s release. This prevents self-destruction when the addon-operator release is stored in the same namespace as the releases for modules.
env:
- name: HELM_IGNORE_RELEASE
  value: {{ .Release.Name }}
HELM_MONITOR_KUBE_CLIENT_QPS — QPS for a rate limiter of a kubernetes client for Helm resources monitor.
HELM_MONITOR_KUBE_CLIENT_BURST — Burst for a rate limiter of a kubernetes client for Helm resources monitor.
Logging settings
LOG_TYPE — Logging formatter type: json, text or color.
LOG_LEVEL — Logging level: debug, info, error.
LOG_NO_TIME — ‘true’ value will disable timestamp logging. Useful when output is redirected to logging system that already adds timestamps. Default is ‘false’.
Debug
Several tools are available for the debugging of addon-operator and hooks:
- You can get logs of an Addon-operator’s pod for analysis (by executing kubectl logs -f po/POD_NAME)
- You can set the environment variable LOG_LEVEL=debug to include detailed debugging data into logs
- Addon-operator inherits shell-operator’s debug CLI interface and a UNIX socket HTTP endpoint. A path to the endpoint can be configured with the DEBUG_UNIX_SOCKET environment variable; the default path is “/var/run/addon-operator/debug.socket”.
Available debug commands:
addon-operator queue list [-o text|yaml|json]
Dump tasks in all queues.
addon-operator global values [-o yaml|json]
Dump current global values.
addon-operator global patches
Dump current JSON patches for global values.
addon-operator global config [-o yaml|json]
Dump global config values.
addon-operator module list [-o text|yaml|json]
List available modules and their enabled status.
addon-operator module values [-o yaml|json] <module_name>
Dump module values by name.
addon-operator module patches <module_name>
Dump JSON patches for module values by name.
addon-operator module config [-o yaml|json] <module_name>
Dump module config values by name.
addon-operator module resource-monitor [-o text|yaml|json]
Dump resource monitors.
Lifecycle
Structure
Module files are located in the /modules directory. The directory can be set via the $MODULES_DIR variable. Global hook files are located in the /global-hooks directory (you can set your own directory with the $GLOBAL_HOOKS_DIR variable).
Startup sequence
During startup, Addon-operator finds and initializes all global hooks. For more info, see HOOKS.
After the global hooks initialization, Addon-operator executes all global onStartup hooks.
Then, the global hooks with kubernetes binding are executed with a binding context of type Synchronization and Kubernetes monitors for global hooks are started.
Reload all modules
Next, the ‘reload all modules’ process is started. First, it finds all modules and their hooks.
Then, all global hooks with beforeAll binding are executed.
Next, the ‘module discovery’ process is started. It determines which modules are enabled by executing the ‘enabled’ script for modules enabled in values.yaml and in ConfigMap/addon-operator.
Enabled modules are started.
During each module start-up, Addon-operator executes all onStartup hooks and initializes the installation of a Helm chart. Prior to the installation of a Helm chart, the beforeHelm hook is executed. The afterHelm hook is executed after the installation.
When all modules are started, all global hooks with afterAll binding are executed.
Main loop
After the first run of ‘reload all modules’, the main loop starts. It reacts to schedule and Kubernetes events, and to values changes: it restarts a particular module if its values are changed and runs the ‘reload all modules’ process again if global values are changed.
Named queues
The Addon-operator supports named queues to execute hooks in parallel for schedule and kubernetes/Event bindings.
All other actions are handled in a single “main” queue:
- global hooks:
  - onStartup
  - kubernetes/Synchronization
  - beforeAll
  - afterAll
- module hooks:
  - onStartup
  - kubernetes/Synchronization
  - beforeHelm
  - execution of helm commands
  - afterHelm
This document mainly describes modules. To get more information on hooks, see HOOKS document. To get a full view of how hooks, modules, values, binding contexts, and queues are interlinked, see LIFECYCLE-STEPS document.
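To illustrate how a hook might ask for a named queue, here is a minimal sketch. It assumes the shell-operator style binding configuration, where a schedule binding accepts the queue and allowFailure fields; the binding name, crontab and queue name are hypothetical.

```shell
#!/bin/sh
# Sketch of a hook whose schedule binding runs in a named queue.
# The binding name, crontab and queue name are hypothetical.
hook_main() {
  if [ "${1:-}" = "--config" ]; then
    cat <<'EOF'
configVersion: v1
schedule:
- name: periodic-check
  crontab: "*/5 * * * *"
  queue: periodic-checks
  allowFailure: true
EOF
    return 0
  fi
  # The hook's actual work: it is queued in "periodic-checks"
  # rather than in the "main" queue.
  echo "periodic check executed"
}

hook_main "$@"
```

With a configuration like this, a slow periodic hook would not hold up the tasks waiting in the “main” queue.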
Module lifecycle
The onStartup hooks of an enabled module are executed at the startup of the Addon-operator or later, when the module becomes enabled.
Next, the module’s chart is installed with helm upgrade --install. Before launching Helm, beforeHelm hooks are executed; after the launch, afterHelm hooks are executed.
After the launch, the module starts responding to two types of events:
- schedule — events that are generated by the crontab scheduler built in the addon-operator;
- kubernetes — events within the cluster that API server announces to the Addon-operator.
When the module is deactivated, the Addon-operator runs helm delete --purge and, after the release is deleted, executes the afterDeleteHelm hooks.
All necessary hooks are re-executed if an error occurs during module activation or deactivation. For example, if an error occurs in a hook with the afterHelm binding during the first module execution, then after a 5-second delay the onStartup and beforeHelm hooks are executed, the Helm chart is installed, and then the afterHelm hooks are executed.
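The retry behaviour described above can be imitated with a small simulation. This is only a sketch: each step is a shell function that prints its name instead of running a real hook or Helm command, and the after_helm function is made to fail on the first attempt. All function names and the RETRY_DELAY variable are hypothetical.

```shell
#!/bin/sh
# Simulation only: every step prints its name instead of running a hook or Helm.
RETRY_DELAY="${RETRY_DELAY:-5}"   # seconds between attempts, per the text above
attempt=0

on_startup()   { echo "onStartup"; }
before_helm()  { echo "beforeHelm"; }
helm_install() { echo "helm upgrade --install"; }
# Fails on the first attempt to imitate a broken afterHelm hook.
after_helm()   { echo "afterHelm (attempt $attempt)"; [ "$attempt" -gt 1 ]; }

run_module() {
  while :; do
    attempt=$((attempt + 1))
    # The whole sequence starts over after any failure.
    if on_startup && before_helm && helm_install && after_helm; then
      return 0
    fi
    echo "error, retrying in ${RETRY_DELAY}s"
    sleep "$RETRY_DELAY"
  done
}
```

Calling run_module prints the full sequence twice: the first pass ends with the failing afterHelm step, and the second pass repeats onStartup, beforeHelm and the Helm installation before afterHelm succeeds.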
Modules discovery
The Addon-operator makes a list of all enabled modules for their execution and a list of disabled modules for the deletion of their Helm releases. This process is called ‘modules discovery’ and is started in the following cases:
- during the start of addon-operator
- when an event to restart all modules occurs (see VALUES).
Modules are disabled by default. A module can be enabled by a key made of the module name suffixed by Enabled. This key should contain a boolean value and can be specified in these sources:
- $MODULES_DIR/values.yaml
- values.yaml files in modules directories
- ConfigMap/addon-operator
Boolean values from values.yaml files and ConfigMap/addon-operator are combined and if the result is equal to false or is empty, then the module is disabled.
If the value is true, an additional check is performed – the enabled script is executed (see below). If the script is present in the module and it returns false, then the module is considered disabled. If the script is not present or returns true, then the module is enabled.
If an error occurs during the ‘modules discovery’ process, then the module discovery is restarted every 5 seconds until successful execution. In this case, the execution of hooks with schedule and kubernetes bindings will be blocked in the “main” queue.
As a result of a ‘module discovery’ process, the tasks for the execution of all enabled modules, deletion of all disabled modules, and execution of all global hooks with the afterAll binding are added to the queue.
Enabled script
A script or an executable file that returns the status of the module. The script has access to the module values in the $VALUES_PATH and $CONFIG_VALUES_PATH files (see VALUES for more details about the values). The variable $MODULE_ENABLED_RESULT passes the path to the file into which the script should write the module status: true or false.
Below is an example of the enabled script that disables the module when parameter param2 is set to “stopMePlease”.
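#!/usr/bin/env bash
# Read param2 from the module values.
param2=$(jq -r '.simpleModule.param2' "$VALUES_PATH")
# Disable the module when param2 equals "stopMePlease"; enable it otherwise.
if [[ "$param2" == "stopMePlease" ]]; then
  echo "false" > "$MODULE_ENABLED_RESULT"
else
  echo "true" > "$MODULE_ENABLED_RESULT"
fi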
Examples
Keys in values.yaml files
A module named nginx-ingress may have an nginxIngressEnabled flag in two files:
$ cat modules/values.yaml
nginxIngressEnabled: true
$ cat modules/001-nginx-ingress/values.yaml
nginxIngressEnabled: false
Module nginx-ingress is enabled in modules/values.yaml but disabled in modules/001-nginx-ingress/values.yaml. The final result is that the module is disabled.
Also, note that the module’s directory name is kebab-cased but keys in values.yaml are camelCased (see VALUES).
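The naming rule can be sketched as a small helper. This is an illustration under the assumption that the numeric order prefix of the directory (such as 001-) is not part of the module name; the function name module_enabled_key is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: derive the "<moduleName>Enabled" key
# from a module directory name such as "001-nginx-ingress".
module_enabled_key() {
  printf '%s\n' "$1" | awk -F- '{
    # Skip an optional numeric order prefix ("001").
    start = ($1 ~ /^[0-9]+$/) ? 2 : 1
    key = $start
    # Capitalize the first letter of every following kebab-case part.
    for (i = start + 1; i <= NF; i++)
      key = key toupper(substr($i, 1, 1)) substr($i, 2)
    print key "Enabled"
  }'
}

module_enabled_key 001-nginx-ingress
```

For the directory 001-nginx-ingress the helper prints nginxIngressEnabled, the key used in the example above.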
values.yaml and ConfigMap
A module named ‘some-module’ has no someModuleEnabled flag in modules/001-some-module/values.yaml but this flag is defined in a ConfigMap and the module has an enabled script:
$ cat modules/values.yaml
global:
  param1: 100
someModuleEnabled: false
$ cat modules/001-some-module/values.yaml
someModule:
  param1: "String"
$ kubectl -n addon-operator get cm/addon-operator -o yaml
data:
  global: |
    param1: 200
  someModule: |
    param1: "Long string"
    param2: "FOO"
  someModuleEnabled: "true"
$ cat modules/001-some-module/enabled
#!/bin/bash
echo false > "$MODULE_ENABLED_RESULT"
Module some-module is explicitly disabled in modules/values.yaml but enabled by the someModuleEnabled key in ConfigMap/addon-operator. Thus the enabled script is executed and returns false. So the final result is that the module is disabled.
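Both examples can be checked against a small sketch of the decision. It assumes that the sources are combined in the order modules/values.yaml, the module's own values.yaml, ConfigMap, and that a later non-empty value overrides an earlier one, which matches the two outcomes above. The function names are hypothetical.

```shell
#!/bin/sh
# merged_enabled VALUE...
# Later non-empty values override earlier ones; the default is "false".
merged_enabled() {
  result=false
  for value in "$@"; do
    if [ -n "$value" ]; then
      result="$value"
    fi
  done
  echo "$result"
}

# module_is_enabled SCRIPT_RESULT VALUE...
# SCRIPT_RESULT is what the 'enabled' script wrote, or "" when there is no script.
module_is_enabled() {
  script_result="$1"
  shift
  if [ "$(merged_enabled "$@")" != "true" ]; then
    echo false            # disabled by configuration, the script is not consulted
  elif [ "$script_result" = "false" ]; then
    echo false            # enabled by configuration, but the script vetoes it
  else
    echo true
  fi
}

# nginx-ingress: true in modules/values.yaml, false in the module's values.yaml.
module_is_enabled "" true false ""
# some-module: false in modules/values.yaml, true in ConfigMap, script returns false.
module_is_enabled false false "" true
```

Both calls print false, in line with the two examples.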
Task queues
Task queues are simple FIFO queues. The Addon-operator processes an event, creates a task and adds it to the particular named queue. Each named queue has a queue handler which runs the first task and proceeds to the next.
Each task is processed until successful completion. In case of an error, the task is returned to the start of the queue and executed with an exponentially growing delay (from 5s to 30s). When executing tasks for the kubernetes and schedule events, the queue handler ignores execution errors if the allowFailure: true flag is specified in the binding configuration.
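A growing delay of this kind can be sketched as follows. The text only gives the bounds (5s and 30s); the doubling factor used here is an assumption, and retry_delay is a hypothetical name.

```shell
#!/bin/sh
# retry_delay ATTEMPT
# Prints the delay in seconds before the given retry (1-based).
# Assumption: the delay doubles on every retry and is capped at 30 seconds.
retry_delay() {
  delay=5
  n=1
  while [ "$n" -lt "$1" ] && [ "$delay" -lt 30 ]; do
    delay=$((delay * 2))
    n=$((n + 1))
  done
  if [ "$delay" -gt 30 ]; then
    delay=30
  fi
  echo "$delay"
}

for attempt in 1 2 3 4 5; do
  echo "retry $attempt: $(retry_delay "$attempt")s"
done
```

Under the doubling assumption the delays are 5, 10, 20, 30 and 30 seconds for the first five retries.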
Queue monitoring
You can use Prometheus metrics to monitor the queue. For details, see METRICS.
Steps of addon-operator lifecycle
This document is intended to give a full view of how hooks, modules, values, binding contexts, and queues are interlinked within the Addon-operator’s lifecycle.
Startup steps:
1. execute global hooks with ‘onStartup’ binding ordered by the ORDER value (see onStartup)
  - input
    - binding context ($BINDING_CONTEXT_PATH temporary file)
      - [{"binding":"onStartup"}]
    - config ($CONFIG_VALUES_PATH temporary file)
      - ‘global’ section in ConfigMap
    - values ($VALUES_PATH temporary file)
      - ‘global values’ merged from:
        - ‘global’ section in modules/values.yaml
        - ‘global’ section in ConfigMap
        - patched with patches saved from previous global hooks
  - output
    - config patches ($CONFIG_VALUES_JSON_PATCH_PATH temporary file)
      - applied to ConfigMap just after the hook execution
    - values patches ($VALUES_JSON_PATCH_PATH temporary file)
      - saved in memory
  - events after execution
    - values changes do not trigger any event
2. execute global hooks with ‘kubernetes’ binding in alphabetic order (see kubernetes)
  - a hook is executed several times, once for each defined ‘kubernetes’ binding
  - input
    - binding context ($BINDING_CONTEXT_PATH temporary file)
      - "type": "Synchronization"
      - "objects" contains all existing objects
      - "snapshots" contains existing objects from previous bindings
    - config ($CONFIG_VALUES_PATH temporary file)
      - ‘global’ section in ConfigMap
    - values ($VALUES_PATH temporary file)
      - ‘global values’ merged from:
        - ‘global’ section in modules/values.yaml
        - ‘global’ section in ConfigMap
        - patched with patches saved from previous global hooks
  - output
    - config patches ($CONFIG_VALUES_JSON_PATCH_PATH temporary file)
      - applied to ConfigMap just after the hook execution
    - values patches ($VALUES_JSON_PATCH_PATH temporary file)
      - saved in memory
  - events after execution
    - values changes do not trigger an event
3. execute global hooks with ‘beforeAll’ binding ordered by the ORDER value (see beforeAll)
  - input
    - binding context ($BINDING_CONTEXT_PATH temporary file)
      - “snapshots” contains existing objects from all ‘kubernetes’ bindings of this hook
    - config ($CONFIG_VALUES_PATH temporary file)
      - ‘global’ section in ConfigMap
    - values ($VALUES_PATH temporary file)
      - ‘global values’ merged from:
        - ‘global’ section in modules/values.yaml
        - ‘global’ section in ConfigMap
        - patched with patches saved from previous global hooks
  - output
    - config patches ($CONFIG_VALUES_JSON_PATCH_PATH temporary file)
      - applied to ConfigMap just after the hook execution
    - values patches ($VALUES_JSON_PATCH_PATH temporary file)
      - saved in memory
  - events after execution
    - values changes do not trigger an event
4. ‘discover modules’
  - get merged ‘enabled’ state from config for each module
    - false
    - ‘{moduleName}Enabled’ value from modules/values.yaml
    - ‘{moduleName}Enabled’ value from modules/{moduleName}/values.yaml
    - ‘{moduleName}Enabled’ value from ConfigMap
  - run ‘enabled’ script if merged ‘enabled’ state is true
    - input
      - config ($CONFIG_VALUES_PATH temporary file)
        - ‘global’ section in ConfigMap
        - ‘{moduleName}’ section in ConfigMap
      - values ($VALUES_PATH temporary file)
        - ‘global values’ merged from:
          - ‘global’ section in modules/values.yaml
          - ‘global’ section in ConfigMap
          - patched with patches saved from previous global hooks
          - extra field ‘global.enabledModules’ contains a list of previously enabled modules
        - ‘module values’ merged from:
          - ‘{moduleName}’ section in modules/values.yaml
          - ‘{moduleName}’ section in modules/{moduleName}/values.yaml
          - ‘{moduleName}’ section in ConfigMap
          - patched with patches saved from previous module hooks
    - output
      - ‘enabled’ state ($MODULE_ENABLED_RESULT temporary file)
        - if the script returns “false”, the module is disabled
  - no ‘enabled’ script
    - merged ‘enabled’ state is used
  - create 3 lists
    - modules to enable
    - modules to delete (disabled)
    - modules to purge (there is a Helm release, but no module directory)
5. ‘module run’ for each enabled module
  - if startup or if the module has just become enabled
    - execute module hooks with ‘onStartup’ binding ordered by the ORDER value (see onStartup)
      - input
        - binding context ($BINDING_CONTEXT_PATH temporary file)
          - {"binding":"onStartup"}
        - config ($CONFIG_VALUES_PATH temporary file)
          - ‘global’ section in ConfigMap
          - ‘{moduleName}’ section in ConfigMap
        - values ($VALUES_PATH temporary file)
          - ‘global values’ merged from:
            - ‘global’ section in modules/values.yaml
            - ‘global’ section in ConfigMap
            - patched with patches saved from previous global hooks
            - extra field ‘global.enabledModules’ contains a list of all enabled modules created by ‘discover modules’ step (4)
          - ‘module values’ merged from:
            - ‘{moduleName}’ section in modules/values.yaml
            - ‘{moduleName}’ section in modules/{moduleName}/values.yaml
            - ‘{moduleName}’ section in ConfigMap
            - patched with patches saved from previous module hooks
      - output
        - config patches ($CONFIG_VALUES_JSON_PATCH_PATH temporary file)
          - applied to ConfigMap just after hook run
        - values patches ($VALUES_JSON_PATCH_PATH temporary file)
          - saved in memory
      - events after execution
        - values changes do not trigger an event
    - execute module hooks with ‘kubernetes’ bindings in alphabetic order
      - a hook is executed several times, once for each defined ‘kubernetes’ binding
      - input
        - binding context ($BINDING_CONTEXT_PATH temporary file)
          - "type": "Synchronization"
          - "objects" contains all existing objects
          - "snapshots" contains existing objects from previous bindings
        - config ($CONFIG_VALUES_PATH temporary file)
          - ‘global’ section in ConfigMap
          - ‘{moduleName}’ section in ConfigMap
        - values ($VALUES_PATH temporary file)
          - ‘global values’ merged from:
            - ‘global’ section in modules/values.yaml
            - ‘global’ section in ConfigMap
            - patched with patches saved from previous global hooks
            - extra field ‘global.enabledModules’ contains a list of all enabled modules created by ‘discover modules’ step (4)
          - ‘module values’ merged from:
            - ‘{moduleName}’ section in modules/values.yaml
            - ‘{moduleName}’ section in modules/{moduleName}/values.yaml
            - ‘{moduleName}’ section in ConfigMap
            - patched with patches saved from previous module hooks
      - output
        - config patches ($CONFIG_VALUES_JSON_PATCH_PATH temporary file)
          - applied to ConfigMap just after the hook execution
        - values patches ($VALUES_JSON_PATCH_PATH temporary file)
          - saved in memory
      - events after execution
        - values changes do not trigger an event
  - execute module hooks with ‘beforeHelm’ binding ordered by the ORDER value (see beforeHelm)
    - input
      - binding context ($BINDING_CONTEXT_PATH temporary file)
        - {"binding":"beforeHelm"}
        - extra field "snapshots" contains existing objects from all ‘kubernetes’ bindings of this hook
      - config ($CONFIG_VALUES_PATH temporary file)
        - ‘global’ section in ConfigMap
        - ‘{moduleName}’ section in ConfigMap
      - values ($VALUES_PATH temporary file)
        - ‘global values’ merged from:
          - ‘global’ section in modules/values.yaml
          - ‘global’ section in ConfigMap
          - patched with patches saved from previous global hooks
          - extra field ‘global.enabledModules’ contains a list of all enabled modules created by ‘discover modules’ step (4)
        - ‘module values’ merged from:
          - ‘{moduleName}’ section in modules/values.yaml
          - ‘{moduleName}’ section in modules/{moduleName}/values.yaml
          - ‘{moduleName}’ section in ConfigMap
          - patched with patches saved from previous module hooks
    - output
      - config patches ($CONFIG_VALUES_JSON_PATCH_PATH temporary file)
        - applied to ConfigMap just after the hook execution
      - values patches ($VALUES_JSON_PATCH_PATH temporary file)
        - saved in memory
    - events after execution
      - values changes do not trigger an event
  - check if helm upgrade should run
    - get saved checksum from last release values
    - render templates
    - if checksum is changed → helm release should be upgraded
    - get helm resources defined in templates
    - if there are absent resources → helm release should be upgraded
  - run helm upgrade --install
    - values (unique file in a temporary directory)
      - ‘global values’ merged from:
        - ‘global’ section in modules/values.yaml
        - ‘global’ section in ConfigMap
        - patched with patches saved from previous global hooks
      - ‘module values’ merged from:
        - ‘{moduleName}’ section in modules/values.yaml
        - ‘{moduleName}’ section in modules/{moduleName}/values.yaml
        - ‘{moduleName}’ section in ConfigMap
        - patched with patches saved from previous module hooks
    - release name
      - module name in kebab-case
      - name from Chart.yaml is ignored
    - namespace
      - $ADDON_OPERATOR_NAMESPACE (see RUNNING)
  - execute module hooks with ‘afterHelm’ binding ordered by the ORDER value (see afterHelm)
    - input
      - binding context ($BINDING_CONTEXT_PATH temporary file)
        - {"binding":"afterHelm"}
        - extra field "snapshots" contains existing objects from all ‘kubernetes’ bindings of this hook
      - config ($CONFIG_VALUES_PATH temporary file)
        - ‘global’ section in ConfigMap
        - ‘{moduleName}’ section in ConfigMap
      - values ($VALUES_PATH temporary file)
        - ‘global values’ merged from:
          - ‘global’ section in modules/values.yaml
          - ‘global’ section in ConfigMap
          - patched with patches saved from previous global hooks
          - extra field ‘global.enabledModules’ contains a list of all enabled modules created by ‘discover modules’ step (4)
        - ‘module values’ merged from:
          - ‘{moduleName}’ section in modules/values.yaml
          - ‘{moduleName}’ section in modules/{moduleName}/values.yaml
          - ‘{moduleName}’ section in ConfigMap
          - patched with patches saved from previous module hooks
    - output
      - config patches ($CONFIG_VALUES_JSON_PATCH_PATH temporary file)
        - applied to ConfigMap just after the hook execution
      - values patches ($VALUES_JSON_PATCH_PATH temporary file)
        - saved in memory
    - events after execution
      - if module values are changed, restart ‘module run’
6. ‘module delete’ for each disabled module
  - run helm delete --purge
  - execute module hooks with ‘afterDeleteHelm’ binding ordered by the ORDER value (see afterDeleteHelm)
    - input
      - binding context ($BINDING_CONTEXT_PATH temporary file)
        - {"binding":"afterDeleteHelm"}
        - extra field "snapshots" contains existing objects from all ‘kubernetes’ bindings of this hook
      - config ($CONFIG_VALUES_PATH temporary file)
        - ‘global’ section in ConfigMap
        - ‘{moduleName}’ section in ConfigMap
      - values ($VALUES_PATH temporary file)
        - ‘global values’ merged from:
          - ‘global’ section in modules/values.yaml
          - ‘global’ section in ConfigMap
          - patched with patches saved from previous global hooks
          - extra field ‘global.enabledModules’ contains a list of all enabled modules created by ‘discover modules’ step (4)
        - ‘module values’ merged from:
          - ‘{moduleName}’ section in modules/values.yaml
          - ‘{moduleName}’ section in modules/{moduleName}/values.yaml
          - ‘{moduleName}’ section in ConfigMap
          - patched with patches saved from previous module hooks
    - output
      - config patches ($CONFIG_VALUES_JSON_PATCH_PATH temporary file)
        - applied to ConfigMap just after the hook execution
      - values patches ($VALUES_JSON_PATCH_PATH temporary file)
        - saved in memory
    - events after execution
      - values changes do not trigger an event
7. ‘module purge’ for each non-existent module
  - run helm delete --purge
8. execute global hooks with ‘afterAll’ binding ordered by the ORDER value (see afterAll)
  - input
    - binding context ($BINDING_CONTEXT_PATH temporary file)
      - {"binding":"afterAll"}
      - extra field “snapshots” contains existing objects from all ‘kubernetes’ bindings of this hook
    - config ($CONFIG_VALUES_PATH temporary file)
      - ‘global’ section in ConfigMap
    - values ($VALUES_PATH temporary file)
      - ‘global values’ merged from:
        - ‘global’ section in modules/values.yaml
        - ‘global’ section in ConfigMap
        - patched with patches saved from previous global hooks
  - output
    - config patches ($CONFIG_VALUES_JSON_PATCH_PATH temporary file)
      - applied to ConfigMap just after the hook execution
    - values patches ($VALUES_JSON_PATCH_PATH temporary file)
      - saved in memory
  - events after execution
    - if values are changed, re-run ‘Reload all modules’ steps 3 to 8.
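The ‘check if helm upgrade should run’ part of step 5 can be sketched as a comparison of checksums plus a check for absent resources. This is an illustration only: cksum stands in for whatever checksum addon-operator actually computes, the rendered templates are represented by a plain file, and the function name needs_upgrade and its arguments are hypothetical.

```shell
#!/bin/sh
# needs_upgrade SAVED_CHECKSUM RENDERED_FILE [ABSENT_RESOURCES]
# Prints "yes" when the release should be upgraded, "no" otherwise.
# cksum is only a stand-in for the operator's real checksum.
needs_upgrade() {
  saved="$1"
  # Checksum of the freshly rendered templates.
  current=$(cksum < "$2" | cut -d' ' -f1)
  # Number of resources defined in templates but absent in the cluster.
  absent="${3:-0}"
  if [ "$current" != "$saved" ] || [ "$absent" -gt 0 ]; then
    echo yes
  else
    echo no
  fi
}
```

With an unchanged checksum and no absent resources the helper prints no, so Helm is not run and no new release appears in the release history.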
Reaction to events
9. ‘kubernetes’ event for global hook (see kubernetes)
  - hook execution is queued in “main” or in a named queue according to the binding configuration
  - queue handler runs a hook:
    - input
      - binding context ($BINDING_CONTEXT_PATH temporary file)
        - "type": "Event"
        - "object" contains a related object
        - "filterResult" contains a result of jqFilter
        - "snapshots" contains existing objects from other bindings
      - config ($CONFIG_VALUES_PATH temporary file)
        - ‘global’ section in ConfigMap
      - values ($VALUES_PATH temporary file)
        - ‘global values’ merged from:
          - ‘global’ section in modules/values.yaml
          - ‘global’ section in ConfigMap
          - patched with patches saved from previous global hooks
    - output
      - config patches ($CONFIG_VALUES_JSON_PATCH_PATH temporary file)
        - applied to ConfigMap just after the hook execution
      - values patches ($VALUES_JSON_PATCH_PATH temporary file)
        - saved in memory
    - events after execution
      - ‘global values changed’ if global section or *Enabled flags are changed
10. ‘kubernetes’ event for module hook (see kubernetes)
  - hook execution is queued in “main” or in a named queue according to the binding configuration
  - queue handler runs a hook:
    - input
      - binding context ($BINDING_CONTEXT_PATH temporary file)
        - "type": "Event"
        - "object" contains a related object
        - "filterResult" contains a result of jqFilter
        - "snapshots" contains existing objects from other bindings
      - config ($CONFIG_VALUES_PATH temporary file)
        - ‘global’ section in ConfigMap
        - ‘{moduleName}’ section in ConfigMap
      - values ($VALUES_PATH temporary file)
        - ‘global values’ merged from:
          - ‘global’ section in modules/values.yaml
          - ‘global’ section in ConfigMap
          - patched with patches saved from previous global hooks
          - extra field ‘global.enabledModules’ contains a list of all enabled modules created by ‘discover modules’ step (4)
        - ‘module values’ merged from:
          - ‘{moduleName}’ section in modules/values.yaml
          - ‘{moduleName}’ section in modules/{moduleName}/values.yaml
          - ‘{moduleName}’ section in ConfigMap
          - patched with patches saved from previous module hooks
    - output
      - config patches ($CONFIG_VALUES_JSON_PATCH_PATH temporary file)
        - applied to ConfigMap just after the hook execution
      - values patches ($VALUES_JSON_PATCH_PATH temporary file)
        - saved in memory
    - events after execution
      - ‘global values changed’ if global values are changed
      - ‘modules values changed’ if module values are changed
11. ‘schedule’ event for global hook (see schedule)
- hook execution is queued in “main” or in a named queue according to the binding configuration
- queue handler runs a hook:
- input
- binding context ($BINDING_CONTEXT_PATH temporary file)
- "snapshots" contains existing objects from other bindings
- config ($CONFIG_VALUES_PATH temporary file)
- ‘global’ section in ConfigMap
- values ($VALUES_PATH temporary file)
- ‘global values’ merged from:
- ‘global’ section in modules/values.yaml
- ‘global’ section in ConfigMap
- patched with patches saved from previous global hooks
- output
- config patches ($CONFIG_VALUES_JSON_PATCH_PATH temporary file)
- applied to ConfigMap just after the hook execution
- values patches ($VALUES_JSON_PATCH_PATH temporary file)
- saved in memory
- events after execution
- ‘global values changed’ if global section or *Enabled flags are changed
12. ‘schedule’ event for module hook (see schedule)
- hook execution is queued in “main” or in a named queue according to the binding configuration
- queue handler runs a hook:
- input
- binding context ($BINDING_CONTEXT_PATH temporary file)
- "snapshots" contains existing objects from other bindings
- config ($CONFIG_VALUES_PATH temporary file)
- ‘global’ section in ConfigMap
- ‘{moduleName}’ section in ConfigMap
- values ($VALUES_PATH temporary file)
- ‘global values’ merged from:
- ‘global’ section in modules/values.yaml
- ‘global’ section in ConfigMap
- patched with patches saved from previous global hooks
- extra field ‘global.enabledModules’ contains a list of all enabled modules created by ‘discover modules’ step (4)
- ‘module values’ merged from:
- ‘{moduleName}’ section in modules/values.yaml
- ‘{moduleName}’ section in modules/{moduleName}/values.yaml
- ‘{moduleName}’ section in ConfigMap
- patched with patches saved from previous module hooks
- output
- config patches ($CONFIG_VALUES_JSON_PATCH_PATH temporary file)
- applied to ConfigMap just after the hook execution
- values patches ($VALUES_JSON_PATCH_PATH temporary file)
- saved in memory
- events after execution
- trigger ‘global values changed’ if global values are changed
- trigger ‘module values changed’ if module values are changed
13. ‘global values changed’ event (see global hook)
- create ‘Reload all modules’ task in the “main” queue (steps 3 to 8)
14. ‘module values changed’ event (see module hook)
- create ‘module run’ task in the “main” queue
- step 5 without onStartup and kubernetes@Synchronization hooks
15. ‘helm resources absent’ event (see auto-healing)
- create ‘module run’ task in the “main” queue
- step 5 without onStartup and kubernetes@Synchronization hooks
16. ConfigMap is changed (see ConfigMap/addon-operator)
- values in global section are changed
- create ‘Reload all modules’ task in the “main” queue (steps 3 to 8)
- *Enabled flags are changed
- create ‘Reload all modules’ task in the “main” queue (steps 3 to 8)
- values in modules sections are changed
- create ‘module run’ task in the “main” queue
- step 5 without onStartup and kubernetes@Synchronization hooks
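The decision described in step 16 can be sketched in Python. This is an illustration only, not addon-operator's code: the function name and the task labels are hypothetical, and it assumes that a single 'Reload all modules' task is enough when global and module sections change together.

```python
def tasks_for_configmap_change(changed_keys):
    """Map changed top-level ConfigMap keys to tasks for the "main" queue."""
    # A change in the global section or in any *Enabled flag reloads everything.
    if any(key == "global" or key.endswith("Enabled") for key in changed_keys):
        return ["Reload all modules"]
    # Otherwise every changed module section gets its own 'module run' task.
    return [f"module run: {key}" for key in sorted(changed_keys)]


print(tasks_for_configmap_change({"global", "simpleModule"}))
print(tasks_for_configmap_change({"simpleModule"}))
```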
Module structure
A module is a directory with files. Addon-operator searches for the modules directories in /modules or in the paths specified by the $MODULES_DIR variable. The module has the same name as the corresponding directory excluding the numeric prefix.
An example of the file structure of the module:
/modules/001-simple-module
├── crds
│ ├── doc-ru-projects.yaml
│ ├── doc-ru-projecttemplate.yaml
│ ├── projects.yaml
│ ├── projecttemplate.yaml
├── hooks
│ ├── module-hook-1.sh
│ ├── ...
│ └── module-hook-N.sh
├── openapi
│ ├── config-values.yaml
│ └── values.yaml
├── templates
│ ├── config-maps.yaml
│ ├── ...
│ └── daemon-set.yaml
├── enabled
├── README.md
├── .helmignore
├── Chart.yaml
└── values.yaml
- crds: a directory with CRD files.
- hooks: a directory with hooks.
- openapi: OpenAPI schemas for config values and for Helm values.
- enabled: a script that reports the status of the module (enabled or not). See the modules discovery process.
- Chart.yaml, .helmignore, templates: Helm chart files.
- README.md: an optional file with the module description.
- values.yaml: default values for the chart in YAML format.
The name of this module is simple-module. values.yaml should contain a section simpleModule and a simpleModuleEnabled flag (see VALUES).
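The naming rule can be sketched as follows. The helper names are hypothetical, and the sketch assumes that the numeric prefix is a run of digits followed by a hyphen and that the values key is the kebab-case name converted to camelCase.

```python
import re


def module_name(dir_name: str) -> str:
    """Strip a leading numeric prefix such as "001-" from a directory name."""
    return re.sub(r"^\d+-", "", dir_name)


def values_key(name: str) -> str:
    """Convert a kebab-case module name to its camelCase values key."""
    head, *rest = name.split("-")
    return head + "".join(part[:1].upper() + part[1:] for part in rest)


name = module_name("001-simple-module")
print(name)                          # simple-module
print(values_key(name))              # simpleModule
print(values_key(name) + "Enabled")  # simpleModuleEnabled
```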
Notes on how Helm is used
values.yaml
Addon-operator does not use values.yaml as the only source of values for the chart. It generates a new file with a merged set of values, which also includes the values from this file (see VALUES).
Chart.yaml
We recommend setting the “version” field in your Chart.yaml to “0.0.1” and using VCS to control versions. We also recommend explicitly specifying the “name” field even though it is ignored: Addon-operator passes the module name to Helm as the release name.
Releases deduplication
A module’s execution might be triggered by an event that does not change the values used by Helm templates (see modules discovery). Re-running Helm will lead to an “empty” release. To avoid this, Addon-operator runs a helm template command, compares a checksum of its output with a saved checksum, and starts the installation of a Helm chart only if there are changes.
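The comparison can be sketched like this. The source only says "a checksum", so the choice of SHA-256 and the function names are assumptions made for illustration; `rendered` stands for the output of helm template.

```python
import hashlib
from typing import Optional


def manifest_checksum(rendered: str) -> str:
    """Checksum of the rendered manifests (SHA-256 is an assumption)."""
    return hashlib.sha256(rendered.encode("utf-8")).hexdigest()


def needs_install(rendered: str, saved_checksum: Optional[str]) -> bool:
    """Install only when the rendered output differs from the saved state."""
    return manifest_checksum(rendered) != saved_checksum


saved = manifest_checksum("kind: ConfigMap\n")
print(needs_install("kind: ConfigMap\n", saved))  # False: nothing changed
print(needs_install("kind: Secret\n", saved))     # True: output changed
```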
Release auto-healing
The Addon-operator monitors resources defined by a Helm chart and triggers an update if something is deleted. This is useful for resources that Helm can’t update without deletion. Note that resource deletion by hooks is ignored to prevent needless updates.
Hooks
A hook is an executable file that the Addon-operator executes when some event occurs. It can be a script or a compiled program written in any programming language.
The Addon-operator follows a convention: information is passed to hooks via files, and the results of hook execution are also stored in files. Paths to these files are passed via environment variables. Output to stdout is written to the log, except when the hook is run with the --config flag to print its configuration. This convention simplifies working with input data and reporting the results of hook execution.
Global hooks
Global hooks are stored in the $GLOBAL_HOOKS_DIR/hooks directory. The Addon-operator recursively searches for all executable files in it (the lib subdirectory is ignored) and runs them with the --config flag. Each hook prints its event binding configuration in JSON or YAML format to stdout. If the execution fails, the Addon-operator terminates with exit code 1.
Bindings from shell-operator are available for global hooks: onStartup, schedule and kubernetes. The bindings to the events of the modules discovery process are also available: beforeAll and afterAll (see modules discovery).
During execution, a global hook receives global values. These values can be modified by the hook to share data with global hooks, module hooks, and Helm templates. If the hook changes global values, the ‘global values changed’ event is generated and all modules are reloaded. For details on values storage, see VALUES. See also an overview and a detailed description of ‘Reload all modules’ process.
Module hook
Module hooks are executable files stored in the hooks subdirectory of the module. During the ‘modules discovery’ process, if a module appears to be enabled, the Addon-operator searches for executable files in the hooks directory and executes them with the --config flag. Each hook prints its event binding configuration in JSON or YAML format to stdout. The module discovery process restarts if an error occurs.
Bindings from shell-operator are available for module hooks: schedule and kubernetes. The bindings of the module lifecycle are also available: onStartup, beforeHelm, afterHelm, afterDeleteHelm (see module lifecycle).
During execution, a module hook receives global values and module values. Module values can be modified by the hook to share data with other hooks of the same module. If the hook changes module values, the ‘module values changed’ event is generated and then the module is reloaded. For details on values storage, see VALUES. See also a module lifecycle and a module run detailed description.
Bindings
Overview
Binding | Global? | Module? | Info |
---|---|---|---|
onStartup | ✓ | – | On Addon-operator startup |
onStartup | – | ✓ | On Addon-operator startup or module enablement |
beforeAll | ✓ | – | Before any modules are executed |
afterAll | ✓ | – | After all modules are executed |
beforeHelm | – | ✓ | Before executing helm install |
afterHelm | – | ✓ | After executing helm install |
afterDeleteHelm | – | ✓ | After executing helm delete |
schedule | ✓ | ✓ | Run on schedule |
kubernetes | ✓ | ✓ | Run on event from Kubernetes |
onStartup
Example:
configVersion: v1
onStartup: ORDER
Parameters:
ORDER: an integer value that specifies an execution order. When added to the “main” queue, the hooks will be sorted by this value and then alphabetically by file name.
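The ordering rule for ORDER can be illustrated with a short Python sketch. The hook file names and ORDER values are hypothetical; the same rule applies to the other bindings that take an ORDER parameter.

```python
from operator import itemgetter

# Hypothetical hooks: file name and ORDER from the binding configuration.
hooks = [
    {"file": "b-hook.sh", "order": 10},
    {"file": "a-hook.sh", "order": 10},
    {"file": "z-hook.sh", "order": 5},
]


def queue_order(hooks):
    """Sort by ORDER first, then alphabetically by file name."""
    return sorted(hooks, key=itemgetter("order", "file"))


print([h["file"] for h in queue_order(hooks)])
# ['z-hook.sh', 'a-hook.sh', 'b-hook.sh']
```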
beforeAll
Example:
configVersion: v1
beforeAll: ORDER
Parameters:
ORDER: an integer value that specifies an execution order. When added to the “main” queue, the hooks will be sorted by this value and then alphabetically by file name.
afterAll
Example:
configVersion: v1
afterAll: ORDER
Parameters:
ORDER: an integer value that specifies an execution order. When added to the “main” queue, the hooks will be sorted by this value and then alphabetically by file name.
beforeHelm
Example:
configVersion: v1
beforeHelm: ORDER
Parameters:
ORDER: an integer value that specifies an execution order. When added to the “main” queue, the hooks will be sorted by this value and then alphabetically by file name.
afterHelm
Example:
configVersion: v1
afterHelm: ORDER
Parameters:
ORDER: an integer value that specifies an execution order. When added to the “main” queue, the hooks will be sorted by this value and then alphabetically by file name.
afterDeleteHelm
Example:
configVersion: v1
afterDeleteHelm: ORDER
Parameters:
ORDER: an integer value that specifies an execution order. When added to the “main” queue, the hooks will be sorted by this value and then alphabetically by file name.
schedule
See the schedule binding from the Shell-operator.
kubernetes
See the kubernetes binding from the Shell-operator.
Note: Addon-operator requires a ServiceAccount with the appropriate RBAC permissions. See addon-operator-rbac.yaml files in examples.
Execution on event
When an event associated with a hook is triggered, Addon-operator executes the hook without arguments and passes the global or module values from the values storage via temporary files. In response, a hook can return JSON patches to modify values. A detailed description of the values storage is available in the VALUES document.
Binding context
The binding context is a piece of information about the event which caused the hook execution.
The $BINDING_CONTEXT_PATH environment variable contains the path to a file with a JSON array of structures with the following fields:
- binding is a string from the name parameter for schedule or kubernetes bindings. Its value is a binding type if the parameter is not set and for other hooks. For example, the binding context for the beforeAll binding type:
[{"binding":"beforeAll"}]
The binding context for schedule and kubernetes hooks contains additional fields, described in Shell-operator documentation.
beforeAll and afterAll global hooks and beforeHelm, afterHelm, and afterDeleteHelm module hooks are executed with the binding context that includes a snapshots field, which contains all Kubernetes objects that match the hook’s kubernetes bindings configurations.
For example, a global hook with kubernetes and beforeAll bindings may have this configuration:
configVersion: v1
beforeAll: 10
kubernetes:
- name: monitor-pods
  apiVersion: v1
  kind: Pod
  jqFilter: ".metadata.labels"
This hook will be executed before updating the Helm release with this binding context:
[{"binding": "beforeAll",
  "snapshots": {
    "monitor-pods": [
      {
        "object": {
          "kind": "Pod",
          "apiVersion": "v1",
          "metadata": {
            "name": "pod-1r62e3",
            "namespace": "default", ...},
          ...
        },
        "filterResult": {
          "label1": "label value",
          ...
        }
      },
      ...
      more pods
      ...
    ]
  }
}]
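A hook could walk such a binding context as in the sketch below. The function name is hypothetical and the embedded JSON is a shortened copy of the example above; in a real hook the JSON would be read from the file named by $BINDING_CONTEXT_PATH.

```python
import json


def snapshot_filter_results(binding_context_json: str, binding_name: str):
    """Collect filterResult of every snapshot entry for one kubernetes binding."""
    results = []
    for context in json.loads(binding_context_json):
        # "snapshots" maps a binding name to a list of matched objects.
        for item in context.get("snapshots", {}).get(binding_name, []):
            results.append(item.get("filterResult"))
    return results


example = json.dumps([{
    "binding": "beforeAll",
    "snapshots": {
        "monitor-pods": [
            {"object": {"kind": "Pod", "apiVersion": "v1"},
             "filterResult": {"label1": "label value"}},
        ]
    },
}])
print(snapshot_filter_results(example, "monitor-pods"))
```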
Synchronization for global hooks
Synchronization is the first run of global hooks with “kubernetes” bindings. As with the Shell-operator, it executes right after the successful completion of global “onStartup” hooks, but the subsequent behavior differs slightly. By default, the Addon-operator executes “beforeAll” hooks after the completion of hooks with executeHookOnSynchronization: true. Set waitForSynchronization: false to execute these hooks in parallel with “beforeAll” hooks.
For example, a global hook with kubernetes and beforeAll bindings may have this configuration:
configVersion: v1
beforeAll: 10
kubernetes:
- name: monitor-pods
  apiVersion: v1
  kind: Pod
  jqFilter: ".metadata.labels"
- name: monitor-nodes
  apiVersion: v1
  kind: Node
  jqFilter: ".metadata.labels"
  queue: nodes-handling
  executeHookOnSynchronization: false
- name: monitor-cms
  apiVersion: v1
  kind: ConfigMap
  jqFilter: ".metadata.labels"
  queue: config-map-handling
  waitForSynchronization: false
- name: monitor-secrets
  apiVersion: v1
  kind: Secret
  jqFilter: ".metadata.labels"
  queue: secrets-handling
  executeHookOnSynchronization: false
  waitForSynchronization: false
This hook will be executed after “onStartup” as follows:
- Run hook with binding context for the “monitor-pods” binding in the “main” queue.
- Fill snapshot for the “monitor-nodes” binding, do not execute hook.
- Run in parallel:
  - hook with the “beforeAll” binding context in the “main” queue
  - hook with the “monitor-cms” binding context in the “config-map-handling” queue
  - fill snapshot for the “monitor-secrets” binding.
Note: there is no guarantee that the “beforeAll” binding context contains snapshots with ConfigMaps and Secrets.
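How the two flags sort bindings into phases can be sketched as below. This is a simplified model, not the operator's scheduler: the function name and the phase labels are hypothetical, and it assumes both flags default to true, as the example implies.

```python
def synchronization_plan(bindings):
    """Split kubernetes bindings into 'before' and 'parallel' phases."""
    plan = {"before": [], "parallel": []}
    for binding in bindings:
        run = binding.get("executeHookOnSynchronization", True)
        wait = binding.get("waitForSynchronization", True)
        action = "run hook" if run else "fill snapshot only"
        # waitForSynchronization: false moves the binding to the parallel phase.
        plan["before" if wait else "parallel"].append((binding["name"], action))
    return plan


bindings = [
    {"name": "monitor-pods"},
    {"name": "monitor-nodes", "executeHookOnSynchronization": False},
    {"name": "monitor-cms", "waitForSynchronization": False},
    {"name": "monitor-secrets", "executeHookOnSynchronization": False,
     "waitForSynchronization": False},
]
print(synchronization_plan(bindings))
```

The "before" phase completes ahead of the “beforeAll” (or, for module hooks, “beforeHelm”) hooks, while the "parallel" phase runs alongside them.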
Synchronization for module hooks
Synchronization is the first run of module hooks with “kubernetes” bindings after module enablement. It executes right after the successful completion of the module’s “onStartup” hooks. By default, the Addon-operator executes “beforeHelm” hooks after the completion of hooks with executeHookOnSynchronization: true. Set waitForSynchronization: false to execute these hooks in parallel with “beforeHelm” hooks.
For example, a module hook with kubernetes and beforeHelm bindings may have this configuration:
configVersion: v1
beforeHelm: 10
kubernetes:
- name: monitor-pods
  apiVersion: v1
  kind: Pod
  jqFilter: ".metadata.labels"
- name: monitor-nodes
  apiVersion: v1
  kind: Node
  jqFilter: ".metadata.labels"
  queue: nodes-handling
  executeHookOnSynchronization: false
- name: monitor-cms
  apiVersion: v1
  kind: ConfigMap
  jqFilter: ".metadata.labels"
  queue: config-map-handling
  waitForSynchronization: false
- name: monitor-secrets
  apiVersion: v1
  kind: Secret
  jqFilter: ".metadata.labels"
  queue: secrets-handling
  executeHookOnSynchronization: false
  waitForSynchronization: false
This hook will be executed after “onStartup” as follows:
- Run hook with binding context for the “monitor-pods” binding in the “main” queue.
- Fill snapshot for the “monitor-nodes” binding, do not execute hook.
- Run in parallel:
  - hook with the “beforeHelm” binding context in the “main” queue
  - hook with the “monitor-cms” binding context in the “config-map-handling” queue
  - fill snapshot for the “monitor-secrets” binding.
Note: there is no guarantee that the “beforeHelm” binding context contains snapshots with ConfigMaps and Secrets.
Execution rate
Hook configuration has a settings section with parameters executionMinPeriod and executionBurst. These parameters are used to throttle hook executions and wait for more events in the queue. See the execution rate section from the Shell-operator.
Values storage
The Addon-operator provides the storage for the values that will be passed to the Helm chart. You may find out more about the chart values concept in the Helm documentation: values files. Global and module hooks have access to the values in the storage and can change them.
The storage is a hash-like data structure. The global key contains all global values: they are passed to every hook and available to all Helm charts. Only global hooks may change global values.
The other keys must match the module’s name converted to camelCase. Each key stores the object with module values. These values are only available to the hooks and the enabled script of this module, and to its Helm chart. Only module hooks can change the values of the module.
Note: You cannot get the values of another module within the module hook. Shared values should be global values for now (#9).
Hook receives values via files on execution. These schemas can help you understand the flow of values for a global hook and for a module hook:
The values can be represented as:
- a structure (including empty structure)
- a list (including empty list)
Structures and lists must be JSON-compatible since hooks receive values at runtime as JSON files (see using values in hook).
Note: each module has an additional key with the Enabled suffix and a boolean value to enable or disable the module (e.g., ingressNginxEnabled: false). This key is handled by the modules discovery process.
values.yaml
On start-up, the Addon-operator loads values into storage from values.yaml files:
- $MODULES_DIR/values.yaml
- values.yaml files in module directories: only the values from the key with the camelCase name of the module
An example of global values in $MODULES_DIR/values.yaml:
global:
  param1: value1
  param2: value2
simpleModule:
  modParam1: value3
An example of module values in $MODULES_DIR/001-simple-module/values.yaml:
simpleModule:
  modParam1: value1
  modParam2: value2
ConfigMap/addon-operator
There is a key global in the ConfigMap/addon-operator that contains global values, along with keys that contain module values. The values are stored in these keys as YAML-encoded strings. Values in the ConfigMap/addon-operator override the values loaded from values.yaml files.
The Addon-operator monitors changes in the ConfigMap/addon-operator and starts the ‘reload all modules’ process in case of global values changes or ‘module run’ process if only the module section is changed. See LIFECYCLE.
An example of ConfigMap/addon-operator:
data:
  global: | # vertical bar is required here
    param1: newValue
    param3: value3
  simpleModule: | # module name should be in camelCase
    modParam2: newValue2
  anotherModuleEnabled: "false" # "false" value disables a module
Update values
Hooks can update values in the storage. To do that the hook returns a JSON Patch.
A hook can update values in the ConfigMap/addon-operator so that the updated values would be available after restarting the Addon-operator (long-term update). For example, you may store generated passwords or certificates.
The patch for a long-term update is returned via the $CONFIG_VALUES_JSON_PATCH_PATH file. After hook execution, the Addon-operator immediately applies this patch to the values in ConfigMap/addon-operator.
Another option is to store updated values for a period while the Addon-operator process is running. For example, you may store the results of the discovery of cluster resources or parameters.
The patch for temporary updates is returned via the $VALUES_JSON_PATCH_PATH file and remains in the Addon-operator volatile memory.
Merged values
When the hook or enabled script is about to be executed, or a Helm chart is to be installed, the Addon-operator generates a merged set of values. This merged set combines:
- global values from values.yaml files and ConfigMap/addon-operator;
- module values from the values.yaml files and ConfigMap/addon-operator;
- patches for the temporary updates.
The merged values are passed as a temporary JSON file to hooks or the enabled script and as a temporary values.yaml file to helm install.
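A minimal sketch of this merge, under stated assumptions: the recursive merge strategy is a guess (the source only says that ConfigMap values override values.yaml), the helper names are hypothetical, and only the "add" operation of JSON Patch on object members is handled, without JSON Pointer escaping.

```python
import copy


def deep_merge(base: dict, override: dict) -> dict:
    """Return base with override applied key by key (assumed strategy)."""
    result = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = deep_merge(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


def apply_add_patches(values: dict, patches: list) -> dict:
    """Apply JSON Patch 'add' operations whose parent objects already exist."""
    result = copy.deepcopy(values)
    for operation in patches:
        if operation["op"] != "add":
            raise ValueError("only 'add' is handled in this sketch")
        *parents, leaf = operation["path"].lstrip("/").split("/")
        target = result
        for key in parents:
            target = target[key]
        target[leaf] = operation["value"]
    return result


from_files = {"global": {"param1": 100, "param2": "Yes"},
              "someModule": {"param1": "String"}}
from_configmap = {"global": {"param1": 200},
                  "someModule": {"param1": "Long string", "param2": "FOO"}}
patches = [{"op": "add", "path": "/someModule/param3", "value": "newValue"}]

merged = apply_add_patches(deep_merge(from_files, from_configmap), patches)
print(merged)
```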
Using values in the hook
When the hook is triggered by an event, the values are passed to it via JSON files. The hook can use environment variables to get the paths of those files:
- $CONFIG_VALUES_PATH: this file contains values from the ConfigMap/addon-operator.
- $VALUES_PATH: this file contains merged values.
For global hooks, only global values are available.
For module hooks, the global values and the module values are available. Also, the enabledModules field is added to the global values in the $VALUES_PATH file. It contains the list of all enabled modules in the order of execution (see module lifecycle).
To change the values, the hook must return JSON patches via the result files. The hook can use environment variables to get the paths of those files:
- $CONFIG_VALUES_JSON_PATCH_PATH: the hook should write a patch for ConfigMap/addon-operator into this file.
- $VALUES_JSON_PATCH_PATH: the hook should write a patch for a temporary update of parameters into this file.
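A module hook written in Python might use these files as in the sketch below. The function names and the someModule/enabledCount value are hypothetical; a real hook would call run_hook(os.environ) and would also answer the --config flag with its binding configuration.

```python
import json


def build_patch(values: dict) -> list:
    """Build a JSON Patch storing the number of enabled modules (example value)."""
    enabled = values.get("global", {}).get("enabledModules", [])
    return [{"op": "add", "path": "/someModule/enabledCount", "value": len(enabled)}]


def run_hook(environ) -> None:
    """Read merged values and write a patch for a temporary update."""
    with open(environ["VALUES_PATH"]) as values_file:
        values = json.load(values_file)
    with open(environ["VALUES_JSON_PATCH_PATH"], "w") as patch_file:
        json.dump(build_patch(values), patch_file)
```

Writing the same patch to the file named by $CONFIG_VALUES_JSON_PATCH_PATH instead would make the update long-term, since that patch is applied to the ConfigMap.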
Using the values in enabled scripts
The enabled script works with values in read-only mode. It receives values in JSON files. The script can use environment variables to get the paths of those files:
- $CONFIG_VALUES_PATH: this file contains values from ConfigMap/addon-operator.
- $VALUES_PATH: this file contains merged values.
The enabledModules field with the list of previously enabled modules is added to the global key in the $VALUES_PATH file.
Using values in Helm charts
The Helm chart of the module has access to the merged values, similar to the $VALUES_PATH file but without the enabledModules field.
The Helm template variable .Values allows you to use values in the templates:
{{ .Values.global.param1 }}
{{ .Values.moduleName.modParam2 }}
Example
Let’s assume the following values are defined:
$ cat modules/values.yaml
global:
  param1: 100
  param2: "Yes"
$ cat modules/01-some-module/values.yaml
someModule:
  param1: "String"
$ kubectl -n addon-operator get cm/addon-operator -o yaml
data:
  global: |
    param1: 200
  someModule: |
    param1: "Long string"
    param2: "FOO"
The Addon-operator generates the following files with values:
$ cat $CONFIG_VALUES_PATH
{"global":{
  "param1":200
}, "someModule":{
  "param1":"Long string",
  "param2": "FOO"
}}
$ cat $VALUES_PATH
{"global":{
  "param1":200,
  "param2": "Yes"
}, "someModule":{
  "param1":"Long string",
  "param2": "FOO"
}}
A hook adds a new value with the help of a JSON patch:
$ cat /modules/001-some-module/hooks/hook.sh
#!/usr/bin/env bash
...
cat > "$CONFIG_VALUES_JSON_PATCH_PATH" <<EOF
[{"op":"add", "path":"/someModule/param3", "value":"newValue"}]
EOF
...
Now the ConfigMap/addon-operator has the following content:
data:
  global: |
    param1: 200
  someModule: |
    param1: "Long string"
    param2: "FOO"
    param3: "newValue"
Next time the hook is executed, the Addon-operator would generate the following files with values:
$ cat $CONFIG_VALUES_PATH
{"global":{
  "param1":200
},
"someModule":{
  "param1":"Long string",
  "param2": "FOO",
  "param3": "newValue"
}}
$ cat $VALUES_PATH
{"global":{
  "param1":200,
  "param2": "Yes"
}, "someModule":{
  "param1":"Long string",
  "param2": "FOO",
  "param3": "newValue"
}}
The Helm chart template
replicas: {{ .Values.global.param1 }}
would generate the string replicas: 200. As you can see, the value “100” from values.yaml is replaced by “200” from the ConfigMap/addon-operator.
Validation
The addon-operator supports OpenAPI schemas for config values and for effective values. These schemas should be stored in the $GLOBAL_HOOKS_DIR/openapi directory for global values and in the $MODULES_DIR/<module-name>/openapi directories for modules.
- openapi/config-values.yaml is a schema for values merged from values.yaml, modules/values.yaml and the ConfigMap.
- openapi/values.yaml is a schema for values merged from values.yaml, modules/values.yaml and the ConfigMap with applied values patches.
Validation occurs on startup, on ConfigMap changes, and after hook executions. If validation fails after a hook execution, the hook is restarted. If validation fails on startup, the addon-operator stops. If validation fails on a ConfigMap change, the error is logged and no new tasks are queued.
Note: Unlike the default behavior, the addon-operator sets additionalProperties: false if additionalProperties is not set.
Example
# /global/openapi/config-values.yaml
type: object
additionalProperties: false
required:
  - project
  - clusterName
minProperties: 2
properties:
  project:
    type: string
  clusterName:
    type: string
  clusterHostname:
    type: string
  discovery:
    type: object
This schema defines 2 required fields for ‘global’ values: project and clusterName. The clusterHostname field is an optional string. discovery is an optional object with no restrictions on keys.
Consider this ConfigMap/addon-operator content:
metadata:
  ...
data:
  global: |
    project: myProject
  moduleOne: |
    param1: value1
  ...
This ConfigMap has invalid ‘global’ values because the required clusterName field is missing, so the addon-operator stops with an error on startup.
Consider a valid ConfigMap/addon-operator and this config patch from a global hook:
[{"op":"add", "path":"/global/clusterHostname", "value":{}}]
This patch sets the clusterHostname field in the ‘global’ section to an object. It is not allowed because the schema defines clusterHostname as a string. This situation is handled like a hook execution error: the hook stays in the queue and restarts with exponential backoff (see LIFECYCLE).
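Both failures can be reproduced with a deliberately tiny validator. This is not the validator the addon-operator uses: the function is hypothetical and covers only required, additionalProperties and the string and object types, ignoring minProperties and everything else in OpenAPI.

```python
def validate(values: dict, schema: dict) -> list:
    """Return a list of human-readable validation errors (simplified)."""
    errors = []
    properties = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in values:
            errors.append(f"{name} is required")
    # A missing additionalProperties is treated as false, as the note above states.
    if not schema.get("additionalProperties", False):
        for name in values:
            if name not in properties:
                errors.append(f"{name} is a forbidden property")
    python_types = {"string": str, "object": dict}
    for name, value in values.items():
        declared = properties.get(name, {}).get("type")
        expected = python_types.get(declared)
        if expected is not None and not isinstance(value, expected):
            errors.append(f"{name} must be of type {declared}")
    return errors


schema = {
    "type": "object",
    "additionalProperties": False,
    "required": ["project", "clusterName"],
    "properties": {
        "project": {"type": "string"},
        "clusterName": {"type": "string"},
        "clusterHostname": {"type": "string"},
        "discovery": {"type": "object"},
    },
}
print(validate({"project": "myProject"}, schema))
print(validate({"project": "p", "clusterName": "c", "clusterHostname": {}}, schema))
```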
Extending
Values are config values with applied patches, so the schema in values.yaml should contain duplicates of properties from the config-values.yaml schema. There is a technique with allOf to reduce duplicates, but it will not eliminate duplicates when additionalProperties: false is set. To overcome this problem, we implement a custom property x-extend for the values.yaml schema.
If the values.yaml schema contains the x-extend field, the addon-operator extends the following fields in the values.yaml schema with fields from the config-values.yaml schema:
- definitions
- required
- properties
- patternProperties
- title
- description
Also, "x-*" properties are copied from the config-values.yaml schema.
Example
Consider these OpenAPI schemas:
# /global/openapi/config-values.yaml
type: object
additionalProperties: false
required:
  - project
  - clusterName
properties:
  project:
    type: string
  clusterName:
    type: string
  clusterHostname:
    type: string
# /global/openapi/values.yaml
x-extend:
  schema: config-values.yaml
type: object
additionalProperties: false
required:
  - discovery
  - param1
properties:
  discovery:
    type: object
  param1:
    type: string
The addon-operator will validate values with this effective schema:
# effective schema for values
type: object
additionalProperties: false
required:
  - project
  - clusterName
  - discovery
  - param1
properties:
  project:
    type: string
  clusterName:
    type: string
  clusterHostname:
    type: string
  discovery:
    type: object
  param1:
    type: string
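The extension can be approximated in a few lines of Python. This sketch handles only required and properties from the list of extended fields; the function name is hypothetical, and letting values.yaml win when both schemas define the same property is an assumption.

```python
def extend_schema(values_schema: dict, config_schema: dict) -> dict:
    """Build an effective schema for values from the two schemas (partial)."""
    effective = {k: v for k, v in values_schema.items() if k != "x-extend"}
    effective["required"] = (
        config_schema.get("required", []) + values_schema.get("required", [])
    )
    effective["properties"] = {
        **config_schema.get("properties", {}),
        **values_schema.get("properties", {}),
    }
    return effective


config_schema = {
    "type": "object",
    "additionalProperties": False,
    "required": ["project", "clusterName"],
    "properties": {
        "project": {"type": "string"},
        "clusterName": {"type": "string"},
        "clusterHostname": {"type": "string"},
    },
}
values_schema = {
    "x-extend": {"schema": "config-values.yaml"},
    "type": "object",
    "additionalProperties": False,
    "required": ["discovery", "param1"],
    "properties": {"discovery": {"type": "object"}, "param1": {"type": "string"}},
}
effective = extend_schema(values_schema, config_schema)
print(effective["required"])
print(sorted(effective["properties"]))
```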
Defaults
The addon-operator respects the default key in schemas and applies defaults when merging values.
Example
Consider this schema for global values:
# /global/openapi/values.yaml
x-extend:
  schema: config-values.yaml
type: object
additionalProperties: false
required:
  - param1
properties:
  discovery:
    type: object
    default: {}
  param1:
    type: string
The addon-operator will add discovery with an empty object to values if no discovery key is present in the ConfigMap, modules/values.yaml or in patches.
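Applying defaults can be pictured as follows. The sketch only looks at top-level properties and the function name is hypothetical; how the addon-operator treats nested defaults is not covered here.

```python
import copy


def apply_defaults(values: dict, schema: dict) -> dict:
    """Fill in missing top-level keys from the 'default' of each property."""
    result = dict(values)
    for name, prop in schema.get("properties", {}).items():
        if name not in result and "default" in prop:
            # Copy the default so values never share state with the schema.
            result[name] = copy.deepcopy(prop["default"])
    return result


schema = {
    "type": "object",
    "properties": {
        "discovery": {"type": "object", "default": {}},
        "param1": {"type": "string"},
    },
}
print(apply_defaults({"param1": "value"}, schema))
```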
Required fields
There is a problem with required fields defined in openapi/values.yaml: values for Helm can be constructed by multiple hooks. Different hooks return different portions of the required fields, so validation will fail on hook execution. To define a contract for Helm values in this situation, the addon-operator implements x-required-for-helm to define required values for Helm. Values are checked before Helm execution with the x-required-for-helm array merged with required.
Example
Suppose we have two hooks: one hook prepares a param1 value and the second hook prepares a param2 value. Helm requires both fields, but we can’t require both fields after each hook execution. x-required-for-helm to the rescue:
# /global/openapi/values.yaml
type: object
x-required-for-helm:
  - param1
  - param2
properties:
  param1:
    type: string
  param2:
    type: string
The addon-operator will validate values after each hook execution with this effective schema:
# effective schema for values
type: object
additionalProperties: false
properties:
  param1:
    type: string
  param2:
    type: string
The addon-operator will validate values before Helm execution with this effective schema:
# effective schema for values
type: object
additionalProperties: false
required:
  - param1
  - param2
properties:
  param1:
    type: string
  param2:
    type: string
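The difference between the two effective schemas comes down to one merge, sketched here. The function name is hypothetical, and adding additionalProperties: false is left out because it is a separate default.

```python
def schema_for_helm(schema: dict) -> dict:
    """Merge x-required-for-helm into required for the pre-Helm check."""
    effective = {k: v for k, v in schema.items() if k != "x-required-for-helm"}
    required = schema.get("required", []) + schema.get("x-required-for-helm", [])
    if required:
        effective["required"] = required
    return effective


schema = {
    "type": "object",
    "x-required-for-helm": ["param1", "param2"],
    "properties": {"param1": {"type": "string"}, "param2": {"type": "string"}},
}
print(schema_for_helm(schema)["required"])  # ['param1', 'param2']
```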
Addon-operator metrics
The Addon-operator exposes a Prometheus target at the /metrics endpoint. The default port is 9650.
Metrics
- addon_operator_binding_count{module="", hook=""}: a gauge with the bindings count for every hook. Global hooks have an empty “module” label.
- addon_operator_config_values_errors_total{}: a counter of ConfigMap validation errors after kubectl edit. See validation.
- addon_operator_global_hook_run_seconds{hook="", binding="", activation="", queue=""}: a histogram with hook execution times. The “hook” label is the name of the hook, “binding” is the binding name from the configuration, “queue” is the name of the queue where the hook is queued, and “activation” is the event that triggers the hook execution.
- addon_operator_global_hook_run_errors_total{hook="", binding="", activation="", queue=""}: a counter of hook execution errors. It only tracks errors of hooks with allowFailure disabled (i.e. the key is omitted in the configuration or allowFailure: false is set). This metric has a “hook” label with the name of the failed hook.
- addon_operator_global_hook_run_allowed_errors_total{hook="", binding="", activation="", queue=""}: a counter of hook execution errors. It only tracks errors of hooks that are allowed to exit with an error (allowFailure: true is set in the configuration). The metric has a “hook” label with the name of the failed hook.
- addon_operator_global_hook_run_success_total{hook="", binding="", activation="", queue=""}: a counter of successful hook executions. The metric has a “hook” label with the name of the succeeded hook.
- addon_operator_global_hook_run_sys_cpu_seconds{hook="", binding="", activation="", queue=""}: a histogram with global hook system CPU seconds.
- addon_operator_global_hook_run_user_cpu_seconds{hook="", binding="", activation="", queue=""}: a histogram with global hook user CPU seconds.
- addon_operator_global_hook_run_max_rss_bytes{hook="", binding="", activation="", queue=""}: a gauge with global hook max RSS usage in bytes.
- addon_operator_module_hook_run_seconds{module="", hook="", binding="", activation="", queue=""}: a histogram with module hook execution times. The “module” label is the name of the module, “hook” is the name of the hook, “binding” is the binding name from the configuration, “queue” is the name of the queue where the hook is queued, and “activation” is the event that triggers the hook execution.
- addon_operator_module_hook_run_errors_total{module="", hook="", binding="", activation="", queue=""}: a counter of hook execution errors. It only tracks errors of hooks with allowFailure disabled (i.e. the key is omitted in the configuration or allowFailure: false is set). This metric has a “hook” label with the name of the failed hook.
- addon_operator_module_hook_run_allowed_errors_total{module="", hook="", binding="", activation="", queue=""}: a counter of hook execution errors. It only tracks errors of hooks that are allowed to exit with an error (allowFailure: true is set in the configuration). The metric has a “hook” label with the name of the failed hook.
- addon_operator_module_hook_run_success_total{module="", hook="", binding="", activation="", queue=""}: a counter of successful hook executions. The metric has a “hook” label with the name of the succeeded hook.
- addon_operator_module_hook_run_sys_cpu_seconds{module="", hook="", binding="", activation="", queue=""}: a histogram with module hook system CPU seconds.
- addon_operator_module_hook_run_user_cpu_seconds{module="", hook="", binding="", activation="", queue=""}: a histogram with module hook user CPU seconds.
- addon_operator_module_hook_run_max_rss_bytes{module="", hook="", binding="", activation="", queue=""}: a gauge with module hook max RSS usage in bytes.
- addon_operator_module_discover_errors_total: a counter of errors during the modules discovery process. It increases in these cases:
  - an ‘enabled’ script is executed with an error
  - a module hook returns an invalid configuration
  - a call to the Kubernetes API ends with an error (for example, retrieving Helm releases).
- addon_operator_module_run_errors_total{module=x}: a counter of errors on module start-up.
- addon_operator_module_delete_errors_total{module=x}: a counter of errors on module deletion.
- addon_operator_module_run_seconds{module=""}: a histogram with module execution timings.
- addon_operator_module_helm_seconds{module="", activation=""}: a histogram of the module’s helm upgrade timings.
- addon_operator_helm_operation_seconds{module="", activation="", operation=""}: a histogram of timings of different Helm operations.
- addon_operator_convergence_seconds{activation=onStartup}: a counter of seconds spent executing “reload all modules” processes. The “activation=OnStartup” label value can be used to retrieve information about the first “reload all modules” process when the operator starts.
- addon_operator_convergence_total{activation=onStartup}: a counter of “reload all modules” processes.
- addon_operator_tasks_queue_length{queue=""}: a gauge showing the length of the working queue. This metric can be used to warn about stuck hooks. It has the “queue” label with the queue name.
- addon_operator_task_wait_in_queue_seconds_total{module="", hook="", binding="", queue=""}: a counter of seconds that tasks have spent waiting in the queue.
-
addon_operator_live_ticks
– a counter that increases every 10 seconds. This metric can be used for alerting about an unhealthy Addon-operator. It has no labels. -
addon_operator_kube_jq_filter_duration_seconds{module="", hook="", binding="", queue="", kind=""}
— a histogram with jq filter timings. -
addon_operator_kube_event_duration_seconds{module="", hook="", binding="", queue="", kind=""}
— a histogram with kube event handling timings. -
addon_operator_kube_snapshot_objects{module="", hook="", binding="", queue=""}
— a gauge with count of cached objects (the snapshot) for particular binding. “module” label is empty for global hook. -
addon_operator_kube_snapshot_bytes{module="", hook="", binding="", queue=""}
— a gauge with size in bytes of cached objects for particular binding. Each cached object contains a Kubernetes object and/or result of jqFilter depending on the binding configuration. The size is a sum of the length of Kubernetes object in JSON format and the length of jqFilter‘s result in JSON format. -
addon_operator_kubernetes_client_request_result_total
— a counter of requests made by kubernetes/client-go library. -
addon_operator_kubernetes_client_request_latency_seconds
— a histogram with latency of requests made by kubernetes/client-go library. -
addon_operator_tasks_queue_action_duration_seconds{queue_name="", queue_action=""}
— a histogram with measurements of low level queue operations. Use QUEUE_ACTIONS_METRICS=”no” to disable this metric.
Custom metrics
Hooks can export metrics by writing a set of operations in JSON format to the file at $METRICS_PATH.
Operation to register a counter and increase its value:
{"name":"metric_name","action":"add","value":1,"labels":{"label1":"value1"}}
Operation to register a gauge and set its value:
{"name":"metric_name","action":"set","value":33,"labels":{"label1":"value1"}}
Operation to register a histogram and observe a duration:
{"name":"metric_name","action":"observe","value":42, "buckets": [1,2,5,10,20,50], "labels":{"label1":"value1"}}
Labels are not required, but Shell-operator adds a “hook” label with the path to the hook script relative to the hooks directory.
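As an illustration of the histogram operation, a hook could time one of its own steps and report the duration with "observe". This is only a sketch: the metric name hook_step_duration_seconds and the "step" label are made up for the example, and the mktemp fallback exists only so the snippet can run outside addon-operator, which sets METRICS_PATH for real hooks.

```shell
#!/usr/bin/env bash
# addon-operator sets METRICS_PATH for real hooks; the fallback only lets
# this sketch run standalone.
METRICS_PATH="${METRICS_PATH:-$(mktemp)}"

start=$(date +%s)
true   # placeholder for the real work of the hook
end=$(date +%s)
duration=$((end - start))   # whole seconds spent in the step

# "hook_step_duration_seconds" and the "step" label are hypothetical names.
# The buckets are the same ones used in the "observe" operation above.
printf '{"name":"hook_step_duration_seconds","action":"observe","value":%s,"buckets":[1,2,5,10,20,50],"labels":{"step":"sync"}}\n' \
  "$duration" >> "$METRICS_PATH"
```

Whole-second resolution is coarse, but it matches the example buckets; a hook that needs finer timing would have to choose its own clock source and buckets.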
Several metrics can be exported at once. For example, this script will create 2 metrics:
echo '{"name":"hook_metric_count","action":"add","value":1,"labels":{"label1":"value1"}}' >> $METRICS_PATH
echo '{"name":"hook_metrics_items","action":"add","value":1,"labels":{"label1":"value1"}}' >> $METRICS_PATH
The metric name is used as-is, so several hooks can export the same metric name. It is the hook developer's responsibility to keep label cardinality consistent.
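One way to keep the label set consistent is to route every write through a single helper that always emits the same label keys. The sketch below assumes a hypothetical helper emit_count and a made-up metric name hook_objects_total; neither is part of addon-operator.

```shell
#!/usr/bin/env bash
# addon-operator sets METRICS_PATH for real hooks; the fallback only lets
# this sketch run standalone.
METRICS_PATH="${METRICS_PATH:-$(mktemp)}"

# Hypothetical helper: every call emits exactly the labels "kind" and
# "namespace", so all series of one metric name share the same label keys.
emit_count() {
  # $1 = metric name, $2 = kind, $3 = namespace, $4 = value to add
  printf '{"name":"%s","action":"add","value":%s,"labels":{"kind":"%s","namespace":"%s"}}\n' \
    "$1" "$4" "$2" "$3" >> "$METRICS_PATH"
}

emit_count hook_objects_total pod default 3
emit_count hook_objects_total secret kube-system 1
```

Note that printf does not escape JSON, so this only works for label values without quotes or backslashes. For arbitrary values, building the object with jq (which ships in the addon-operator image) would be the safer choice.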
The “add” and “set” fields can be used as shortcuts for “action” and “value”. This feature may be deprecated in future releases.
{"name":"metric_name","add":1,"labels":{"label1":"value1"}}
Note that there is no mechanism to expire metrics of this kind other than restarting addon-operator. This is the default behavior of prometheus-client.
Grouped metrics
A common reason to expire a metric is that the object it describes has been removed. The object is then no longer in the snapshot, so the hook cannot identify the metric that should be expired.
To solve this, use the “group” field in metric operations. When Shell-operator receives operations with the “group” field, it expires previous metrics with the same group and applies new metric values. This grouping works across hooks and label values.
echo '{"group":"group1", "name":"hook_metric_count", "action":"add", "value":1, "labels":{"label1":"value1"}}' >> $METRICS_PATH
echo '{"group":"group1", "name":"hook_metrics_items", "action":"add", "value":1, "labels":{"label1":"value1"}}' >> $METRICS_PATH
To expire all metrics in a group, use action “expire”:
{"group":"group_name_1", "action":"expire"}
WARNING: “observe” is currently an unsupported action for grouped metrics.
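The two grouped operations can be combined in one small helper: report a value for every object that still exists, or expire the whole group when nothing is left. This is a sketch under assumptions: report_group and the metric name hook_objects_present are hypothetical, and both calls write to one file here only for demonstration (in a real hook each run gets its own METRICS_PATH).

```shell
#!/usr/bin/env bash
# addon-operator sets METRICS_PATH for real hooks; the fallback only lets
# this sketch run standalone.
METRICS_PATH="${METRICS_PATH:-$(mktemp)}"

# Hypothetical helper: write one grouped gauge per kind, or a single
# "expire" operation for the group when no kinds are passed.
report_group() {
  group="$1"
  shift
  if [ "$#" -eq 0 ]; then
    printf '{"group":"%s","action":"expire"}\n' "$group" >> "$METRICS_PATH"
    return 0
  fi
  for kind in "$@"; do
    printf '{"group":"%s","name":"hook_objects_present","action":"set","value":1,"labels":{"kind":"%s"}}\n' \
      "$group" "$kind" >> "$METRICS_PATH"
  done
}

report_group hook1 pod replicaset   # two kinds are still present
report_group hook2                  # nothing left: expire the group
```

Because a grouped write replaces the previous values of the same group, the hook never needs to remember which series it reported earlier.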
Example
hook1.sh returns these metrics:
echo '{"group":"hook1", "name":"hook_metric", "action":"add", "value":1, "labels":{"kind":"pod"}}' >> $METRICS_PATH
echo '{"group":"hook1", "name":"hook_metric", "action":"add", "value":1, "labels":{"kind":"replicaset"}}' >> $METRICS_PATH
echo '{"group":"hook1", "name":"hook_metric", "action":"add", "value":1, "labels":{"kind":"deployment"}}' >> $METRICS_PATH
echo '{"group":"hook1", "name":"hook1_special_metric", "action":"set", "value":12, "labels":{"label1":"value1"}}' >> $METRICS_PATH
echo '{"group":"hook1", "name":"common_metric", "action":"set", "value":300, "labels":{"source":"source3"}}' >> $METRICS_PATH
echo '{"name":"common_metric", "action":"set", "value":100, "labels":{"source":"source1"}}' >> $METRICS_PATH
hook2.sh returns these metrics:
echo '{"group":"hook2", "name":"hook_metric","action":"add", "value":1, "labels":{"kind":"configmap"}}' >> $METRICS_PATH
echo '{"group":"hook2", "name":"hook_metric","action":"add", "value":1, "labels":{"kind":"secret"}}' >> $METRICS_PATH
echo '{"group":"hook2", "name":"hook2_special_metric", "action":"set", "value":42}' >> $METRICS_PATH
echo '{"name":"common_metric", "action":"set", "value":200, "labels":{"source":"source2"}}' >> $METRICS_PATH
Prometheus scrapes these metrics:
# HELP hook_metric hook_metric
# TYPE hook_metric counter
hook_metric{hook="hook1.sh", kind="pod"} 1 -------------------+---------- group:hook1
hook_metric{hook="hook1.sh", kind="replicaset"} 1 ------------+
hook_metric{hook="hook1.sh", kind="deployment"} 1 ------------+
hook_metric{hook="hook2.sh", kind="configmap"} 1 -------------|-------+-- group:hook2
hook_metric{hook="hook2.sh", kind="secret"} 1 ----------------|-------+
# HELP hook1_special_metric hook1_special_metric              |       |
# TYPE hook1_special_metric gauge                             |       |
hook1_special_metric{hook="hook1.sh", label1="value1"} 12 ----+       |
# HELP hook2_special_metric hook2_special_metric              |       |
# TYPE hook2_special_metric gauge                             |       |
hook2_special_metric{hook="hook2.sh"} 42 ---------------------|-------'
# HELP common_metric common_metric                            |
# TYPE common_metric gauge                                    |
common_metric{hook="hook1.sh", source="source3"} 300 ---------'
common_metric{hook="hook1.sh", source="source1"} 100 ---------------+---- no group
common_metric{hook="hook2.sh", source="source2"} 200 ---------------'
On the next execution of hook1.sh, the values for hook_metric{kind="replicaset"}, hook_metric{kind="deployment"}, common_metric{source="source3"}, and hook1_special_metric are expired, and the hook returns only one metric:
echo '{"group":"hook1", "name":"hook_metric", "action":"add", "value":1, "labels":{"kind":"pod"}}' >> $METRICS_PATH
Addon-operator expires the previous values for group “hook1” and updates the value of hook_metric{hook="hook1.sh", kind="pod"}. Values for group “hook2” and for common_metric without a group are left intact. Now Prometheus scrapes these metrics:
# HELP hook_metric hook_metric
# TYPE hook_metric counter
hook_metric{hook="hook1.sh", kind="pod"} 2 --------------- group:hook1
hook_metric{hook="hook2.sh", kind="configmap"} 1 ----+---- group:hook2
hook_metric{hook="hook2.sh", kind="secret"} 1 -------+
# HELP hook2_special_metric hook2_special_metric     |
# TYPE hook2_special_metric gauge                    |
hook2_special_metric{hook="hook2.sh"} 42 ------------'
# HELP common_metric common_metric
# TYPE common_metric gauge
common_metric{hook="hook1.sh", source="source1"} 100 --+-- no group
common_metric{hook="hook2.sh", source="source2"} 200 --'
The next execution of hook2.sh expires all metrics in group “hook2”:
echo '{"group":"hook2", "action":"expire"}' >> $METRICS_PATH
Addon-operator expires the previous values for group “hook2” but leaves the common_metric value for “hook2.sh” as is. Now Prometheus scrapes these metrics:
# HELP hook_metric hook_metric
# TYPE hook_metric counter
hook_metric{hook="hook1.sh", kind="pod"} 2 --------------- group:hook1
# HELP common_metric common_metric
# TYPE common_metric gauge
common_metric{hook="hook1.sh", source="source1"} 100 --+-- no group
common_metric{hook="hook2.sh", source="source2"} 200 --'