Shell-operator is a tool for running event-driven scripts in a Kubernetes cluster.
This operator is not an operator for a particular software product such as prometheus-operator or kafka-operator. Shell-operator provides an integration layer between Kubernetes cluster events and shell scripts by treating scripts as hooks triggered by events. Think of it as an operator-sdk but for scripts.
Shell-operator is used as a base for the more advanced addon-operator, which supports Helm charts and value storages.
Shell-operator provides:
- Ease of management of a Kubernetes cluster: use the tools that Ops are familiar with. It can be bash, python, kubectl, etc.
- Kubernetes object events: hook can be triggered by add, update or delete events. Learn more about hooks.
- Object selector and properties filter: shell-operator can monitor a particular set of objects and detect changes in their properties.
- Simple configuration: hook binding definition is a JSON or YAML document on script’s stdout.
- Validating webhook machinery: hook can handle validation of Kubernetes resources.
- Conversion webhook machinery: hook can handle version conversion for Kubernetes resources.
Articles & talks
Shell-operator was presented at KubeCon + CloudNativeCon Europe 2020 Virtual (Aug’20) in the talk “Go? Bash! Meet the shell-operator”.
Official publications on shell-operator:
- “shell-operator v1.0.0: the long-awaited release of our tool to create Kubernetes operators“ (Apr’21);
- “shell-operator & addon-operator news: hooks as admission webhooks, Helm 3, OpenAPI, Go hooks, and more!“ (Feb’21);
- “Kubernetes operators made easy with shell-operator: project status & news“ (Jul’20);
- “Announcing shell-operator to simplify creating of Kubernetes operators“ (May’19).
Other languages:
- Chinese: “介绍一个不太小的工具:Shell Operator“; “使用shell-operator实现Operator“;
- Dutch: “Een operator om te automatiseren – Hoe pak je dat aan?“;
- Russian: “shell-operator v1.0.0: долгожданный релиз нашего проекта для Kubernetes-операторов“; “Представляем shell-operator: создавать операторы для Kubernetes стало ещё проще“.
Community
Please feel free to reach developers/maintainers and users via GitHub Discussions for any questions regarding shell-operator.
You’re also welcome to follow @flant_com to stay informed about all our Open Source initiatives.
License
Apache License 2.0, see LICENSE.
Quickstart
You need to have a Kubernetes cluster, and kubectl must be configured to communicate with your cluster.
The simplest setup of shell-operator in your cluster consists of these steps:
- build an image with your hooks (scripts)
- create necessary RBAC objects (for kubernetes bindings)
- run Pod or Deployment with the built image
For more configuration options see RUNNING.
Build an image with your hooks
A hook is a script that, when executed with --config option, outputs configuration to stdout in YAML or JSON format. Learn more about hooks.
Let’s create a small operator that will watch for all Pods in all Namespaces and simply log the name of a new Pod.
kubernetes binding is used to tell shell-operator about objects that we want to watch. Create the pods-hook.sh file with the following content:
#!/usr/bin/env bash
if [[ $1 == "--config" ]] ; then
  # Print the binding configuration: run the hook when a Pod is added.
  cat <<EOF
configVersion: v1
kubernetes:
- apiVersion: v1
  kind: Pod
  executeHookOnEvent: ["Added"]
EOF
else
  # Read the Pod name from the first binding context.
  podName=$(jq -r '.[0].object.metadata.name' "$BINDING_CONTEXT_PATH")
  echo "Pod '${podName}' added"
fi
Make the pods-hook.sh executable:
chmod +x pods-hook.sh
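Since a hook can be any executable, the same logic can also be sketched in Python. The block below is only an illustration, not part of the project: the file name pods-hook.py and the helper added_pod_messages are hypothetical, and it assumes that a Python 3 interpreter has been added to your image. Unlike the bash version, it skips binding contexts that carry no object field.

```python
#!/usr/bin/env python3
"""Sketch of pods-hook.sh in Python (hypothetical file name: pods-hook.py)."""
import json
import os
import sys

# Same binding configuration as in the bash hook.
CONFIG = """\
configVersion: v1
kubernetes:
- apiVersion: v1
  kind: Pod
  executeHookOnEvent: ["Added"]
"""


def added_pod_messages(contexts):
    """Build one log line per binding context that carries an object."""
    messages = []
    for ctx in contexts:
        name = ctx.get("object", {}).get("metadata", {}).get("name")
        if name is not None:
            messages.append(f"Pod '{name}' added")
    return messages


def main(argv, environ):
    if len(argv) > 1 and argv[1] == "--config":
        print(CONFIG, end="")
        return
    path = environ.get("BINDING_CONTEXT_PATH")
    if not path:
        return  # nothing to do when run outside shell-operator
    with open(path) as f:
        for line in added_pod_messages(json.load(f)):
            print(line)


if __name__ == "__main__":
    main(sys.argv, os.environ)
```

As with the bash hook, the file has to be executable and placed into the /hooks directory.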
You can use a prebuilt image ghcr.io/flant/shell-operator:latest with bash, kubectl, jq and shell-operator binaries to build your own image. You just need to ADD your hook into /hooks directory in the Dockerfile.
Create the following Dockerfile in the directory where you created the pods-hook.sh file:
FROM ghcr.io/flant/shell-operator:latest
ADD pods-hook.sh /hooks
Build an image (change image tag according to your Docker registry):
docker build -t "registry.mycompany.com/shell-operator:monitor-pods" .
Push image to the Docker registry accessible by the Kubernetes cluster:
docker push registry.mycompany.com/shell-operator:monitor-pods
Create RBAC objects
We need to watch for Pods in all Namespaces. That means that we need specific RBAC definitions for shell-operator:
kubectl create namespace example-monitor-pods
kubectl create serviceaccount monitor-pods-acc --namespace example-monitor-pods
kubectl create clusterrole monitor-pods --verb=get,watch,list --resource=pods
kubectl create clusterrolebinding monitor-pods --clusterrole=monitor-pods --serviceaccount=example-monitor-pods:monitor-pods-acc
Install shell-operator in a cluster
Shell-operator can be deployed as a Pod. Put this manifest into the shell-operator-pod.yaml file:
apiVersion: v1
kind: Pod
metadata:
  name: shell-operator
spec:
  containers:
  - name: shell-operator
    image: registry.mycompany.com/shell-operator:monitor-pods
    imagePullPolicy: Always
  serviceAccountName: monitor-pods-acc
Start shell-operator by applying the shell-operator-pod.yaml file:
kubectl -n example-monitor-pods apply -f shell-operator-pod.yaml
It all comes together
Let’s deploy a kubernetes-dashboard to trigger kubernetes binding defined in our hook:
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/master/aio/deploy/recommended.yaml
Now run kubectl -n example-monitor-pods logs po/shell-operator and observe that the hook will print dashboard pod name:
...
INFO[0027] queue task HookRun:main operator.component=handleEvents queue=main
INFO[0030] Execute hook binding=kubernetes hook=pods-hook.sh operator.component=taskRunner queue=main task=HookRun
INFO[0030] Pod 'kubernetes-dashboard-775dd7f59c-hr7kj' added binding=kubernetes hook=pods-hook.sh output=stdout queue=main task=HookRun
INFO[0030] Hook executed successfully binding=kubernetes hook=pods-hook.sh operator.component=taskRunner queue=main task=HookRun
...
Note: hook output is logged with output=stdout label.
To clean up a cluster, delete namespace and RBAC objects:
kubectl delete ns example-monitor-pods
kubectl delete clusterrole monitor-pods
kubectl delete clusterrolebinding monitor-pods
This example is also available in /examples: monitor-pods.
Running Shell-operator
Configuration
Run in a cluster
- Build an image from ghcr.io/flant/shell-operator:latest (or use a specific tag).
- Copy your hooks to /hooks directory.
- Apply RBAC manifests.
- Apply Pod or Deployment manifest with the built image.
More detailed explanation is available in Quick Start, also see installing with Helm in the example.
Run outside a cluster
- Setup kube context,
- Prepare hooks directory,
- Run shell-operator with context and path to hooks directory.
It is not recommended for production but can be useful while debugging hooks. A scenario can be like this:
# Start local cluster
kind create cluster
# Start Shell-operator from outside the cluster
shell-operator start --hooks-dir $(pwd)/hooks --tmp-dir $(pwd)/tmp --log-type color
Environment variables and flags
You can configure the operator with the following environment variables and cli flags:
CLI flag | Env-Variable name | Default | Description |
---|---|---|---|
--hooks-dir | SHELL_OPERATOR_HOOKS_DIR | "" | A path to a hooks file structure |
--tmp-dir | SHELL_OPERATOR_TMP_DIR | "/tmp/shell-operator" | A path to store temporary files with data for hooks |
--listen-address | SHELL_OPERATOR_LISTEN_ADDRESS | "0.0.0.0" | Address to use for HTTP serving. |
--listen-port | SHELL_OPERATOR_LISTEN_PORT | "9115" | Port to use for HTTP serving. |
--prometheus-metrics-prefix | SHELL_OPERATOR_PROMETHEUS_METRICS_PREFIX | "shell_operator_" | A prefix for metrics names. |
--kube-context | KUBE_CONTEXT | "" | The name of the kubeconfig context to use. (as a --context flag of kubectl) |
--kube-config | KUBE_CONFIG | "" | Path to the kubeconfig file. (as a $KUBECONFIG for kubectl) |
--kube-client-qps | KUBE_CLIENT_QPS | 5 | QPS for rate limiter of k8s.io/client-go |
--kube-client-burst | KUBE_CLIENT_BURST | 10 | burst for rate limiter of k8s.io/client-go |
--object-patcher-kube-client-timeout | OBJECT_PATCHER_KUBE_CLIENT_TIMEOUT | 10s | timeout for object patcher’s requests to the Kubernetes API server |
--jq-library-path | JQ_LIBRARY_PATH | "" | Prepend directory to the search list for jq modules (works as jq -L ). |
n/a | JQ_EXEC | "" | Set to yes to use jq as executable — it is more for developing purposes. |
--log-level | LOG_LEVEL | "info" | Logging level: debug , info , error . |
--log-type | LOG_TYPE | "text" | Logging formatter type: json , text or color . |
--log-no-time | LOG_NO_TIME | false | Disable timestamp logging if flag is present. Useful when output is redirected to logging system that already adds timestamps. |
--log-proxy-hook-json | LOG_PROXY_HOOK_JSON | false | Delegate hook stdout/ stderr JSON logging to the hooks and act as a proxy that adds some extra fields before just printing the output. NOTE: It ignores LOG_TYPE for the output of the hooks; expects JSON lines to stdout/ stderr from the hooks |
--debug-keep-tmp-files | DEBUG_KEEP_TMP_FILES | "no" | Set to yes to keep files in $SHELL_OPERATOR_TMP_DIR for debugging purposes. Note that it can generate many files. |
--debug-unix-socket | DEBUG_UNIX_SOCKET | "/var/run/shell-operator/debug.socket" | Path to the unix socket file for debugging purposes. |
--validating-webhook-configuration-name | VALIDATING_WEBHOOK_CONFIGURATION_NAME | "shell-operator-hooks" | A name of a ValidatingWebhookConfiguration resource. |
--validating-webhook-service-name | VALIDATING_WEBHOOK_SERVICE_NAME | "shell-operator-validating-svc" | A name of a service used in ValidatingWebhookConfiguration. |
--validating-webhook-server-cert | VALIDATING_WEBHOOK_SERVER_CERT | "/validating-certs/tls.crt" | A path to a server certificate for service used in ValidatingWebhookConfiguration. |
--validating-webhook-server-key | VALIDATING_WEBHOOK_SERVER_KEY | "/validating-certs/tls.key" | A path to a server private key for service used in ValidatingWebhookConfiguration. |
--validating-webhook-ca | VALIDATING_WEBHOOK_CA | "/validating-certs/ca.crt" | A path to a ca certificate for ValidatingWebhookConfiguration. |
--validating-webhook-client-ca | VALIDATING_WEBHOOK_CLIENT_CA | [] | A path to a server certificate for ValidatingWebhookConfiguration. |
--conversion-webhook-service-name | CONVERSION_WEBHOOK_SERVICE_NAME | "shell-operator-conversion-svc" | A name of a service for clientConfig in CRD. |
--conversion-webhook-server-cert | CONVERSION_WEBHOOK_SERVER_CERT | "/conversion-certs/tls.crt" | A path to a server certificate for clientConfig in CRD. |
--conversion-webhook-server-key | CONVERSION_WEBHOOK_SERVER_KEY | "/conversion-certs/tls.key" | A path to a server private key for clientConfig in CRD. |
--conversion-webhook-ca | CONVERSION_WEBHOOK_CA | "/conversion-certs/ca.crt" | A path to a ca certificate for clientConfig in CRD. |
--conversion-webhook-client-ca | CONVERSION_WEBHOOK_CLIENT_CA | [] | A path to a server certificate for CRD.spec.conversion.webhook. |
Notes on JSON log proxying
- JSON log proxying (see above --log-proxy-hook-json) gives a lot of control to the hooks, which might want to use their own logger or different fields or log level.
- It is incompatible with the other log flags in regard to filtering or configuring logging for the hooks. shell-operator will always expect valid json lines and output them regardless of the other flags.
- The log lines from the hooks will be enhanced with these top-level fields, from shell-operator before being printed: ‘hook’, ‘binding’, ‘event’, ‘task’, ‘queue’.
- Configure hooks to use the msg, time and level fields for consistency with the logs coming from shell-operator. This, however, is not enforced.
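A hook that wants to cooperate with JSON log proxying could emit its log lines roughly as follows. This is a minimal sketch under assumptions: the helper names json_log_line and log are hypothetical, the extra field pod_count is made up for illustration, and the ISO 8601 timestamp is an assumption, since the exact time format used by shell-operator is not stated here.

```python
import json
import sys
from datetime import datetime, timezone


def json_log_line(level, msg, **fields):
    """Return one JSON log line with the msg, time and level fields."""
    record = {
        "level": level,
        "msg": msg,
        # Assumption: an ISO 8601 timestamp in UTC.
        "time": datetime.now(timezone.utc).isoformat(),
    }
    record.update(fields)  # extra, hook-specific fields
    return json.dumps(record)


def log(level, msg, **fields):
    """Print a single JSON line to stdout, as the proxy mode expects."""
    print(json_log_line(level, msg, **fields), file=sys.stdout, flush=True)


if __name__ == "__main__":
    log("info", "hook started", pod_count=3)
```

Each call prints exactly one line, which matches the expectation of valid JSON lines on stdout or stderr.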
Debug
The following tools for debugging and fine-tuning of Shell-operator and hooks are available:
- Analysis of logs of a Shell-operator’s pod (enter kubectl logs -f po/POD_NAME in terminal),
- The environment variable can be set to LOG_LEVEL=debug to include the detailed debugging information into logs,
- You can view the contents of the working queues with cli command from inside a Pod:
kubectl exec -ti po/shell-operator -- /bin/bash
shell-operator queue list
Hooks
A hook is an executable file that Shell-operator runs when some event occurs. It can be a script or a compiled program written in any programming language. For illustrative purposes, we will use bash scripts. An example with a hook in the form of a Python script is available here: 002-startup-python.
The hook receives the data and returns the result via files. Paths to files are passed to the hook via environment variables.
Shell-operator lifecycle
At startup Shell-operator initializes the hooks:
- The recursive search for hook files is performed in the hooks directory. You can specify it with --hooks-dir command-line argument or with the SHELL_OPERATOR_HOOKS_DIR environment variable (the default path is /hooks).
  - Every executable file found in the path is considered a hook (please note, the lib subdirectory is ignored).
- Found hooks are sorted alphabetically according to the directories’ and hooks’ names. Then they are executed with the --config flag to get bindings to events in YAML or JSON format.
- If hook’s configuration is successful, the working queue named “main” is filled with onStartup hooks.
- Then, the “main” queue is filled with kubernetes hooks with Synchronization binding context type, so that each hook receives all existing objects described in hook’s configuration.
- After executing kubernetes hook with Synchronization binding context, Shell-operator starts a monitor of Kubernetes events according to configured kubernetes binding.
  - Each monitor stores a snapshot - a refreshable list of all Kubernetes objects that match a binding definition.
Next, the main cycle is started:
- Event handler adds hooks to the named queues on events:
  - kubernetes hooks are added to the queue when desired WatchEvent is received from Kubernetes,
  - schedule hooks are added according to the schedule,
  - kubernetes and schedule hooks are added to the “main” queue or the named queue if queue field was specified.
- Each named queue has its queue handler which executes hooks strictly sequentially. If hook fails with an error (non-zero exit code), Shell-operator restarts it (every 5 seconds) until it succeeds. In case of an erroneous execution of a hook, when other events occur, a queue will be filled with new tasks, but their execution will be blocked until the failing hook succeeds.
  - You can change this behavior for a specific hook by adding allowFailure: true to the binding configuration (not available for onStartup hooks).
- Each hook is executed with a binding context, that describes an already occurred event:
  - kubernetes hook receives Event binding context with an object related to the event.
  - schedule hook receives a name of triggered schedule binding.
- If there is a sequence of hook executions in a queue, then hook is executed once with array of binding contexts.
  - If binding contains group key, then a sequence of binding context with similar group key is compacted into one binding context.
- Several metrics are available for monitoring the activity of the queues and hooks: queues size, number of execution errors for specific hooks, etc. See METRICS for more details.
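The blocking behavior of a single queue can be illustrated with a toy model. This is not Shell-operator's implementation: run_queue, flaky and the max_attempts cap are hypothetical and exist only so that the example terminates, whereas the operator itself keeps retrying a failing hook every 5 seconds.

```python
def flaky(failures):
    """Return a fake hook that fails `failures` times, then succeeds."""
    state = {"left": failures}

    def run():
        if state["left"] > 0:
            state["left"] -= 1
            return 1  # non-zero exit code means an error
        return 0

    return run


def run_queue(tasks, max_attempts=3):
    """Run (name, hook, allow_failure) tasks strictly in order.

    Returns the list of (name, exit_code) executions.
    """
    executions = []
    for name, hook, allow_failure in tasks:
        attempts = 0
        while True:
            attempts += 1
            code = hook()
            executions.append((name, code))
            if code == 0 or allow_failure:
                break  # success, or errors are skipped for this binding
            if attempts >= max_attempts:
                # Toy-model cap: later tasks stay blocked behind the failure.
                return executions
    return executions


if __name__ == "__main__":
    print(run_queue([("a", flaky(2), False), ("b", flaky(0), False)]))
```

In this model task "b" never starts while "a" keeps failing, unless "a" is marked with allow_failure, which mirrors the effect of allowFailure: true.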
Hook configuration
Shell-operator runs the hook with the --config flag. In response, the hook should print its event binding configuration to stdout. The response can be in YAML format:
configVersion: v1
onStartup: ORDER
schedule:
- {SCHEDULE_PARAMETERS}
- {SCHEDULE_PARAMETERS}
kubernetes:
- {KUBERNETES_PARAMETERS}
- {KUBERNETES_PARAMETERS}
kubernetesValidating:
- {VALIDATING_PARAMETERS}
- {VALIDATING_PARAMETERS}
settings:
  SETTINGS_PARAMETERS
or in JSON format:
{
  "configVersion": "v1",
  "onStartup": STARTUP_ORDER,
  "schedule": [
    {SCHEDULE_PARAMETERS},
    {SCHEDULE_PARAMETERS}
  ],
  "kubernetes": [
    {KUBERNETES_PARAMETERS},
    {KUBERNETES_PARAMETERS}
  ],
  "kubernetesValidating": [
    {VALIDATING_PARAMETERS},
    {VALIDATING_PARAMETERS}
  ],
  "settings": {SETTINGS_PARAMETERS}
}
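When a hook is written in a language with a JSON library, the JSON form is convenient because the configuration can be built as a data structure and serialized. The sketch below assumes a Python hook; build_config is a hypothetical helper and the binding values (order 10, the schedule and binding names) are illustrative only.

```python
import json
import sys


def build_config():
    """Return a binding configuration as a plain dict (example values)."""
    return {
        "configVersion": "v1",
        "onStartup": 10,
        "schedule": [
            {"name": "every-5-min", "crontab": "*/5 * * * *"},
        ],
        "kubernetes": [
            {"name": "pods", "kind": "Pod", "executeHookOnEvent": ["Added"]},
        ],
    }


if __name__ == "__main__" and "--config" in sys.argv[1:]:
    # Answer the --config call with the JSON document on stdout.
    json.dump(build_config(), sys.stdout)
```

Building the document from a dict avoids hand-written quoting and stray commas in the output.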
The configVersion field specifies a version of configuration schema. The latest schema version is v1 and it is described below.
Event binding is an event type (one of “onStartup”, “schedule”, “kubernetes” or “kubernetesValidating”) plus parameters required for a subscription.
onStartup
Use this binding type to execute a hook at the Shell-operator’s startup.
Syntax:
configVersion: v1
onStartup: ORDER
Parameters:
ORDER is an integer value that specifies an execution order. “OnStartup” hooks will be sorted by this value and then alphabetically by file name.
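The ordering rule can be expressed as a two-part sort key. The snippet is a sketch of the rule as described, not the operator's code; startup_order and the file names are hypothetical.

```python
def startup_order(hooks):
    """Sort (file_name, order) pairs by ORDER, then by file name."""
    return sorted(hooks, key=lambda hook: (hook[1], hook[0]))


if __name__ == "__main__":
    hooks = [("20-init.sh", 5), ("10-setup.sh", 5), ("99-first.sh", 1)]
    for file_name, order in startup_order(hooks):
        print(order, file_name)
```

Here 99-first.sh runs first because of its lower ORDER, and the two hooks with ORDER 5 follow in alphabetical order.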
schedule
Scheduled execution. You can bind a hook to any number of schedules.
Syntax
configVersion: v1
schedule:
- crontab: "*/5 * * * *"
  allowFailure: true|false
- name: "Every 20 minutes"
  crontab: "*/20 * * * *"
  allowFailure: true|false
- name: "every 10 seconds"
  crontab: "*/10 * * * * *"
  allowFailure: true|false
  queue: "every-ten"
  includeSnapshotsFrom: ["monitor-pods"]
- name: "every minute"
  crontab: "* * * * *"
  allowFailure: true|false
  group: "pods"
  ...
Parameters
- name is an optional identifier. It is used to distinguish between multiple schedules during runtime. For more information see binding context.
- crontab is a mandatory schedule with a regular crontab syntax with 5 fields. The 6-field crontab style is also supported; for more information see documentation on robfig/cron.v2 library.
- allowFailure: if ‘true’, Shell-operator skips the hook execution errors. If ‘false’ or the parameter is not set, the hook is restarted after a 5 seconds delay in case of an error.
- queue is a name of a separate queue. It can be used to execute long-running hooks in parallel with other hooks.
- includeSnapshotsFrom is a list of names of kubernetes bindings. When specified, all monitored objects will be added to the binding context in a snapshots field.
- group is a key that defines a group of schedule and kubernetes bindings. See grouping.
kubernetes
Run a hook on Kubernetes object changes.
Syntax
configVersion: v1
kubernetes:
- name: "Monitor pods in cache tier"
  apiVersion: v1
  kind: Pod  # required
  executeHookOnEvent: [ "Added", "Modified", "Deleted" ]
  executeHookOnSynchronization: true|false  # default is true
  keepFullObjectsInMemory: true|false  # default is true
  nameSelector:
    matchNames:
    - pod-0
    - pod-1
  labelSelector:
    matchLabels:
      myLabel: myLabelValue
      someKey: someValue
    matchExpressions:
    - key: "tier"
      operator: "In"
      values: ["cache"]
    # - ...
  fieldSelector:
    matchExpressions:
    - field: "status.phase"
      operator: "Equals"
      value: "Pending"
    # - ...
  namespace:
    nameSelector:
      matchNames: ["somenamespace", "proj-production", "proj-stage"]
    labelSelector:
      matchLabels:
        myLabel: "myLabelValue"
        someKey: "someValue"
      matchExpressions:
      - key: "env"
        operator: "In"
        values: ["production"]
      # - ...
  jqFilter: ".metadata.labels"
  includeSnapshotsFrom:
  - "Monitor pods in cache tier"
  - "monitor Pods"
  - ...
  allowFailure: true|false  # default is false
  queue: "cache-pods"
  group: "pods"
- name: "monitor Pods"
  kind: "pod"
  # ...
Parameters
- name is an optional identifier. It is used to distinguish different bindings during runtime. See also binding context.
- apiVersion is an optional group and version of object API. For example, it is v1 for core objects (Pod, etc.), rbac.authorization.k8s.io/v1beta1 for ClusterRole and monitoring.coreos.com/v1 for prometheus-operator.
- kind is the type of a monitored Kubernetes resource. This field is required. CRDs are supported, but the resource should be registered in the cluster before Shell-operator starts. This can be checked with kubectl api-resources command. You can specify a case-insensitive name, kind or short name in this field. For example, to monitor a DaemonSet these forms are valid:
"kind": "DaemonSet"
"kind": "Daemonset"
"kind": "daemonsets"
"kind": "DaemonSets"
"kind": "ds"
- executeHookOnEvent is the list of events which lead to a hook’s execution. By default, all events are used to execute a hook: “Added”, “Modified” and “Deleted”. Docs: Using API WatchEvent. An empty array can be used to prevent hook execution; it is useful when the binding is used only to define a snapshot.
- executeHookOnSynchronization: if false, Shell-operator skips the hook execution with Synchronization binding context. See binding context.
- nameSelector is a selector of objects by their name. If this selector is not set, then all objects of a specified kind are monitored.
- labelSelector is a standard selector of objects by labels (examples of use). If the selector is not set, then all objects of a specified kind are monitored.
- fieldSelector is a selector of objects by their fields, works like --field-selector='' flag of kubectl. Supported operators are Equals (or =, ==) and NotEquals (or !=) and all expressions are combined with AND. Also, note that a fieldSelector on the ‘metadata.name’ field is mutually exclusive with nameSelector. There are limits on fields, see Note.
- namespace defines filters to choose namespaces. If omitted, events from all namespaces will be monitored.
- namespace.nameSelector: this filter can be used to monitor events from objects in a particular list of namespaces.
- namespace.labelSelector: this filter works like labelSelector but for namespaces, and Shell-operator dynamically subscribes to events from matched namespaces.
- jqFilter is an optional parameter that specifies event filtering using jq syntax. The hook will be triggered on the “Modified” event only if the filter result is changed after the last event. See example 102-monitor-namespaces.
- allowFailure: if true, Shell-operator skips the hook execution errors. If false or the parameter is not set, the hook is restarted after a 5 seconds delay in case of an error.
- queue is a name of a separate queue. It can be used to execute long-running hooks in parallel with hooks in the “main” queue.
- includeSnapshotsFrom is an array of names of kubernetes bindings in a hook. When specified, a list of monitored objects from those bindings will be added to the binding context in a snapshots field. Self-include is also possible.
- keepFullObjectsInMemory: if not set or true, dumps of Kubernetes resources are cached for this binding, and the snapshot includes them as object fields. Set to false if the hook does not rely on full objects to reduce the memory footprint.
- group is a key that defines a group of schedule and kubernetes bindings. See grouping.
Example
configVersion: v1
kubernetes:
# Trigger on labels changes of Pods with myLabel:myLabelValue in the "default" namespace
- name: "label-changes-of-mylabel-pods"
  kind: pod
  executeHookOnEvent: ["Modified"]
  labelSelector:
    matchLabels:
      myLabel: "myLabelValue"
  namespace:
    nameSelector:
      matchNames: ["default"]
  jqFilter: .metadata.labels
  allowFailure: true
  includeSnapshotsFrom: ["label-changes-of-mylabel-pods"]
This hook configuration will execute hook on each change in labels of pods labeled with myLabel=myLabelValue in “default” namespace. The binding context will contain all pods with myLabel=myLabelValue from “default” namespace.
Notes
default Namespace
Unlike kubectl you should explicitly define namespace.nameSelector to monitor events from default namespace.
namespace:
  nameSelector:
    matchNames: ["default"]
RBAC is required
Shell-operator requires a ServiceAccount with the appropriate RBAC permissions. See examples with RBAC: monitor-pods and monitor-namespaces.
jqFilter
This filter is used to ignore superfluous “Modified” events, and to exclude object from event subscription. For example, if the hook should track changes of object’s labels, jqFilter: ".metadata.labels" can be used to ignore changes in other properties (.status, .metadata.annotations, etc.).
The result of applying the filter to the event’s object is passed to the hook in a filterResult field of a binding context.
You can use JQ_LIBRARY_PATH environment variable to set a path with jq modules.
In case you need to filter by multiple fields, you can use the form of an object or an array:
- jqFilter: "{nodeName: .spec.nodeName, labels: .metadata.labels}" returns filterResult as object:
"filterResult": { "labels": { "app": "proxy", "pod-template-hash": "cfdbfcbb8" }, "nodeName": "node-01" }
- jqFilter: "[.spec.nodeName, .metadata.labels]" returns filterResult as array:
"filterResult": [ "node-01", { "app": "proxy", "pod-template-hash": "cfdbfcbb8" } ]
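To make the effect of such filters concrete, the sketch below emulates in plain Python a filter that selects the node name and the labels, in both the object and the array form, and then compares filter results of two object versions. It only models the described behavior; dig, filter_as_object, filter_as_array and is_relevant_change are hypothetical helpers, and the real filtering is done with jq.

```python
def dig(obj, *path):
    """Follow keys through nested dicts; return None when a key is missing."""
    for key in path:
        if not isinstance(obj, dict) or key not in obj:
            return None
        obj = obj[key]
    return obj


def filter_as_object(pod):
    """Emulates '{nodeName: .spec.nodeName, labels: .metadata.labels}'."""
    return {
        "nodeName": dig(pod, "spec", "nodeName"),
        "labels": dig(pod, "metadata", "labels"),
    }


def filter_as_array(pod):
    """Emulates '[.spec.nodeName, .metadata.labels]'."""
    return [dig(pod, "spec", "nodeName"), dig(pod, "metadata", "labels")]


def is_relevant_change(old, new, jq_like_filter):
    """A change matters only if the filter result differs."""
    return jq_like_filter(old) != jq_like_filter(new)


if __name__ == "__main__":
    pod = {
        "metadata": {"labels": {"app": "proxy", "pod-template-hash": "cfdbfcbb8"}},
        "spec": {"nodeName": "node-01"},
        "status": {"phase": "Pending"},
    }
    running = dict(pod, status={"phase": "Running"})
    print(filter_as_object(pod))
    print(is_relevant_change(pod, running, filter_as_object))
```

A change that touches only .status leaves the filter result untouched, so in this model it would not count as a relevant “Modified” event.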
Added != Object created
Consider that the “Added” event is not always equal to “Object created” if labelSelector, fieldSelector or namespace.labelSelector is specified in the binding. If objects and/or namespace are updated in Kubernetes, the binding may suddenly start matching them, with the “Added” event. The same applies to “Deleted”: this event is not always equal to “Object removed”, because the object can just move out of a scope of selectors.
fieldSelector
There is no support for filtering by an arbitrary field, either for core resources or for custom resources (see issue#53459). Only metadata.name and metadata.namespace fields are commonly supported.
However fieldSelector can be useful for some resources with extended set of supported fields:
kind | fieldSelector | src url |
---|---|---|
Pod | spec.nodeName spec.restartPolicy spec.schedulerName spec.serviceAccountName status.phase status.podIP status.nominatedNodeName | 1.16 |
Event | involvedObject.kind involvedObject.namespace involvedObject.name involvedObject.uid involvedObject.apiVersion involvedObject.resourceVersion involvedObject.fieldPath reason source type | 1.16 |
Secret | type | 1.16 |
Namespace | status.phase | 1.16 |
ReplicaSet | status.replicas | 1.16 |
Job | status.successful | 1.16 |
Node | spec.unschedulable | 1.16 |
Example of selecting Pods by ‘Running’ phase:
kind: Pod
fieldSelector:
  matchExpressions:
  - field: "status.phase"
    operator: Equals
    value: Running
fieldSelector and labelSelector expressions are ANDed
Objects should match all expressions defined in fieldSelector and labelSelector, so, for example, multiple fieldSelector expressions with metadata.name field and different values will not match any object.
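The AND semantics can be modeled locally with a few lines of Python. This is only an illustration of the rule; the actual selection is performed through the Kubernetes API, and field_value and matches are hypothetical helpers.

```python
def field_value(obj, path):
    """Resolve a dotted path such as 'status.phase'; None if absent."""
    for key in path.split("."):
        if not isinstance(obj, dict) or key not in obj:
            return None
        obj = obj[key]
    return obj


def matches(obj, field_expressions, match_labels):
    """True only if every field expression and every label matches."""
    for expr in field_expressions:
        equal = field_value(obj, expr["field"]) == expr["value"]
        if expr["operator"] in ("Equals", "=", "=="):
            ok = equal
        elif expr["operator"] in ("NotEquals", "!="):
            ok = not equal
        else:
            raise ValueError("unsupported operator: " + expr["operator"])
        if not ok:
            return False
    labels = field_value(obj, "metadata.labels") or {}
    return all(labels.get(key) == value for key, value in match_labels.items())


if __name__ == "__main__":
    pod = {
        "metadata": {"name": "pod-0", "labels": {"tier": "cache"}},
        "status": {"phase": "Running"},
    }
    two_names = [
        {"field": "metadata.name", "operator": "Equals", "value": "pod-0"},
        {"field": "metadata.name", "operator": "Equals", "value": "pod-1"},
    ]
    print(matches(pod, two_names, {}))
```

The two metadata.name expressions can never hold at the same time, so no object passes the check.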
kubernetesValidating
Use a hook as a handler for ValidatingWebhookConfiguration.
See syntax and parameters in BINDING_VALIDATING.md
kubernetesCustomResourceConversion
Use a hook as handler for custom resource conversion.
See syntax and parameters in BINDING_CONVERSION.md
Binding context
When an event associated with a hook is triggered, Shell-operator executes the hook without arguments. The information about the event that led to the hook execution is called the binding context and is written in JSON format to a temporary file. The path to this file is available to hook via environment variable BINDING_CONTEXT_PATH.
Temporary files have unique names to prevent collisions between queues and are deleted after the hook run.
Binding context is a JSON-array of structures with the following fields:
- binding is a string from the name parameter. If this parameter has not been set in the binding configuration, then strings “schedule” or “kubernetes” are used. For a hook executed at startup, this value is always “onStartup”.
- type is “Schedule” for schedule bindings, “Synchronization” or “Event” for kubernetes bindings, and “Group” if group is defined.
The hook receives “Event”-type binding context on Kubernetes event and it contains more fields:
- watchEvent: the possible value is one of the values you can use with executeHookOnEvent parameter: “Added”, “Modified” or “Deleted”.
- object is a JSON dump of the full object related to the event. It contains an exact copy of the corresponding field in WatchEvent response, so it’s the object state at the moment of the event (not at the moment of the hook execution).
- filterResult is the result of jq execution with specified jqFilter on the above mentioned object. If jqFilter is not specified, then filterResult is omitted.
The hook receives existing objects on startup for each binding with “Synchronization”-type binding context:
- objects is a list of existing objects that match selectors in binding configuration. Each item of this list contains object and filterResult fields. The state of items is actual for the moment of the hook execution. If the list is empty, the value of objects is an empty array.
If group or includeSnapshotsFrom are defined, the hook receives binding context with additional field:
- snapshots is a map that contains up-to-date lists of objects for each binding name from includeSnapshotsFrom or for each kubernetes binding with a similar group. If includeSnapshotsFrom list is empty, the field is omitted.
onStartup binding context example
Hook with this configuration:
configVersion: v1
onStartup: 1
will be executed with this binding context at startup:
[{"binding": "onStartup"}]
schedule binding context example
For example, if you have the following configuration in a hook:
configVersion: v1
schedule:
- name: incremental
  crontab: "0 2 */3 * * *"
  allowFailure: true
then at 12:02, it will be executed with the following binding context:
[{ "binding": "incremental", "type":"Schedule"}]
kubernetes binding context example
A hook can monitor Pods in all namespaces with this simple configuration:
configVersion: v1
kubernetes:
- kind: Pod
“Synchronization” binding context
During startup, the hook receives all existing objects with “Synchronization”-type binding context:
[
  {
    "binding": "kubernetes",
    "type": "Synchronization",
    "objects": [
      {
        "object": {
          "kind": "Pod",
          "metadata": {
            "name": "etcd-...",
            "namespace": "kube-system",
            ...
          },
        }
      },
      {
        "object": {
          "kind": "Pod",
          "metadata": {
            "name": "kube-proxy-...",
            "namespace": "kube-system",
            ...
          },
        }
      },
      ...
    ]
  }
]
Note: hook execution at startup with “Synchronization” binding context can be turned off with executeHookOnSynchronization: false.
“Event” binding context
If pod pod-321d12 is then added into namespace ‘default’, then the hook will be executed with the “Event”-type binding context:
[
  {
    "binding": "kubernetes",
    "type": "Event",
    "watchEvent": "Added",
    "object": {
      "apiVersion": "v1",
      "kind": "Pod",
      "metadata": {
        "name": "pod-321d12",
        "namespace": "default",
        ...
      },
      "spec": {
        ...
      },
      ...
    }
  }
]
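A hook usually has to handle both context types. The following sketch shows one possible way to do that in a Python hook; describe is a hypothetical helper, and the output format is chosen only for illustration.

```python
import json
import os


def describe(contexts):
    """Return one line per object found in the binding contexts."""
    lines = []
    for ctx in contexts:
        kind = ctx.get("type")
        if kind == "Synchronization":
            # All existing objects arrive in the "objects" list.
            for item in ctx.get("objects", []):
                meta = item.get("object", {}).get("metadata", {})
                lines.append(f"existing {meta.get('namespace')}/{meta.get('name')}")
        elif kind == "Event":
            # A single object plus the event name.
            meta = ctx.get("object", {}).get("metadata", {})
            lines.append(
                f"{ctx.get('watchEvent')} {meta.get('namespace')}/{meta.get('name')}"
            )
    return lines


if __name__ == "__main__":
    path = os.environ.get("BINDING_CONTEXT_PATH")
    if path:
        with open(path) as f:
            for line in describe(json.load(f)):
                print(line)
```

Iterating over the whole array matters because a hook may be executed once with several binding contexts.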
Snapshots
“Event”-type binding context contains an object state at the moment of the event. Actual objects’ state for the moment of the execution can be received in a form of Snapshots.
Shell-operator maintains an up-to-date list of objects for each kubernetes binding. schedule and kubernetes bindings can be configured to receive these lists via includeSnapshotsFrom parameter. Also, there is a group parameter to automatically receive all snapshots from multiple bindings and to deduplicate executions.
Snapshot is a JSON array of Kubernetes objects and corresponding jqFilter results. To access the snapshot during the hook execution, there is a map snapshots in the binding context. The key of this map is a binding name, and the value is the snapshot.
snapshots example:
[
  {
    "binding": ...,
    "snapshots": {
      "binding-name-1": [
        {
          "object": {
            "kind": "Pod",
            "metadata": {
              "name": "etcd-...",
              "namespace": "kube-system",
              ...
            },
          },
          "filterResult": { ... },
        },
        ...
      ]
    }
  }
]
- object is a JSON dump of Kubernetes object.
- filterResult is a JSON result of applying jqFilter to the Kubernetes object.
Keeping dumps for object fields can take a lot of memory. There is a parameter keepFullObjectsInMemory: false to disable full dumps.
Note that disabling full objects makes sense only if jqFilter is defined, as it disables full objects in snapshots field, objects field of “Synchronization” binding context and object field of “Event” binding context.
For example, this binding configuration will execute hook with empty items in objects field of “Synchronization” binding context:
kubernetes:
- name: pods
  kind: Pod
  keepFullObjectsInMemory: false
Snapshots example
To illustrate the includeSnapshotsFrom
parameter, consider the hook that reacts to changes of labels of all Pods and requires the content of the ConfigMap named “settings-for-my-hook”. There is also a schedule to do periodic checks:
configVersion: v1
schedule:
- name: periodic-checking
  crontab: "0 */3 * * *"
  includeSnapshotsFrom: ["monitor-pods", "configmap-content"]
kubernetes:
- name: configmap-content
  kind: ConfigMap
  nameSelector:
    matchNames: ["settings-for-my-hook"]
  executeHookOnSynchronization: false
  executeHookOnEvent: []
- name: monitor-pods
  kind: Pod
  jqFilter: '.metadata.labels'
  includeSnapshotsFrom: ["configmap-content"]
This hook will not be executed for events related to the binding “configmap-content”. executeHookOnSynchronization: false accompanied by executeHookOnEvent: [] defines a “snapshot-only” binding. This is one of the techniques to reduce the number of kubectl invocations.
“Synchronization” binding context with snapshots
During startup, the hook will be executed with the “Synchronization” binding context with snapshots and objects:
[
{
"binding": "monitor-pods",
"type": "Synchronization",
"objects": [
{
"object": {
"kind": "Pod",
"metadata": {
"name": "etcd-...",
"namespace": "kube-system",
"labels": { ... },
...
},
},
"filterResult": {
"label1": "value",
...
}
},
{
"object": {
"kind": "Pod",
"metadata": {
"name": "kube-proxy-...",
"namespace": "kube-system",
...
},
},
"filterResult": {
"label1": "value",
...
}
},
...
],
"snapshots": {
"configmap-content": [
{
"object": {
"kind": "ConfigMap",
"metadata": {"name": "settings-for-my-hook", ... },
"data": {"field1": ... }
}
}
]
}
}
]
“Event” binding context with snapshots
If pod pod-321d12 is then added to the “default” namespace, the hook will be executed with the “Event” binding context with object and filterResult fields:
[
{
"binding": "monitor-pods",
"type": "Event",
"watchEvent": "Added",
"object": {
"apiVersion": "v1",
"kind": "Pod",
"metadata": {
"name": "pod-321d12",
"namespace": "default",
...
},
"spec": {
...
},
...
},
"filterResult": { ... },
"snapshots": {
"configmap-content": [
{
"object": {
"kind": "ConfigMap",
"metadata": {"name": "settings-for-my-hook", ... },
"data": {"field1": ... }
}
}
]
}
}
]
“Schedule” binding context with snapshots
Every 3 hours, the hook will be executed with a binding context that includes 2 snapshots (“monitor-pods” and “configmap-content”):
[
{
"binding": "periodic-checking",
"type": "Schedule",
"snapshots": {
"monitor-pods": [
{
"object": {
"kind": "Pod",
"metadata": {
"name": "etcd-...",
"namespace": "kube-system",
...
},
},
"filterResult": { ... },
},
...
],
"configmap-content": [
{
"object": {
"kind": "ConfigMap",
"metadata": {"name": "settings-for-my-hook", ... },
"data": {"field1": ... }
}
}
]
}
}
]
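The three binding contexts above differ mainly in their type field, so a hook can branch on it. The following is a minimal sketch of such a dispatch; the function name `summarize` and the returned strings are invented for illustration, and the field names are the ones shown in the examples above.

```python
def summarize(entry):
    """Return a one-line description of a binding context entry."""
    kind = entry.get("type")
    snapshots = entry.get("snapshots", {})
    if kind == "Synchronization":
        # "objects" holds the initial list of objects for the binding.
        return f"sync: {len(entry.get('objects', []))} objects"
    if kind == "Event":
        meta = entry["object"]["metadata"]
        return f"event {entry['watchEvent']}: {meta['namespace']}/{meta['name']}"
    if kind == "Schedule":
        # A schedule entry carries no object, only the requested snapshots.
        counts = {name: len(items) for name, items in snapshots.items()}
        return f"schedule: {counts}"
    return f"unhandled type {kind!r}"


event = {
    "binding": "monitor-pods",
    "type": "Event",
    "watchEvent": "Added",
    "object": {"metadata": {"name": "pod-321d12", "namespace": "default"}},
}
print(summarize(event))  # prints event Added: default/pod-321d12
```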
Binding context of grouped bindings
The group parameter defines a named group of bindings. A group is used when the source of the event is not important and the data in snapshots is enough for the hook. When a binding with group is triggered by an event, the hook receives snapshots from all kubernetes bindings with the same group name.
Adjacent tasks for kubernetes and schedule bindings with the same group and queue are “compacted”, and the hook is executed only once. So it is wise to use the same queue for all hooks in a group. This “compaction” mechanism is not available for kubernetesValidating and kubernetesCustomResourceConversion bindings as they’re not queued.
executeHookOnSynchronization, executeHookOnEvent and keepFullObjectsInMemory can be used with group. Their effects are as described above for non-grouped bindings.
The group parameter is compatible with the includeSnapshotsFrom parameter. includeSnapshotsFrom can be used to include additional snapshots in the binding context.
Binding context for a group contains:
- binding field with the original binding name or, if the name field wasn’t set in the binding configuration, the string “schedule” or “kubernetes”.
- type field with the value “Group”.
- snapshots field if there is at least one kubernetes binding in the group or includeSnapshotsFrom is not empty.
Group binding context example
Consider the hook that is executed on changes of labels of all Pods, changes in ConfigMap’s data and also on schedule:
configVersion: v1
schedule:
- name: periodic-checking
  crontab: "0 */3 * * *"
  group: "pods"
kubernetes:
- name: monitor-pods
  apiVersion: v1
  kind: Pod
  jqFilter: '.metadata.labels'
  group: "pods"
- name: configmap-content
  apiVersion: v1
  kind: ConfigMap
  nameSelector:
    matchNames: ["settings-for-my-hook"]
  jqFilter: '.data'
  group: "pods"
binding context for grouped bindings
Grouped bindings are used when only the occurrence of an event is important. So, the hook receives the actual state of Pods and the ConfigMap on each of these events:
- During startup.
- A new Pod is added.
- The Pod is deleted.
- Labels of the Pod are changed.
- ConfigMap/settings-for-my-hook is deleted.
- ConfigMap/settings-for-my-hook is added.
- Data field is changed in ConfigMap/settings-for-my-hook.
- Every 3 hours.
Binding contexts for these events will be nearly identical, except for the binding field, which will be equal to the name field of the binding that triggered the hook:
[
{
"binding": "monitor-pods",
"type": "Group",
"snapshots": {
"monitor-pods": [
{
"object": {
"kind": "Pod",
"metadata": {
"name": "etcd-...",
"namespace": "kube-system",
...
},
},
"filterResult": { ... },
},
...
],
"configmap-content": [
{
"object": {
"kind": "ConfigMap",
"metadata": {"name": "settings-for-my-hook", ... }
},
"filterResult": { ... },
},
...
]
}
}
]
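Because a “Group” context always carries the full snapshots, a hook can recompute its result from them without caring which binding fired. Here is a minimal sketch under one invented assumption: the ConfigMap data lists label key/value pairs that every Pod is expected to carry. The function name `pods_not_matching` is hypothetical; the binding names come from the configuration above.

```python
def pods_not_matching(entry):
    """Return names of Pods whose labels differ from the labels required by the ConfigMap."""
    snapshots = entry.get("snapshots", {})

    # With jqFilter '.data', filterResult is the ConfigMap data (or null).
    required = {}
    for item in snapshots.get("configmap-content", []):
        required.update(item.get("filterResult") or {})

    names = []
    for item in snapshots.get("monitor-pods", []):
        # With jqFilter '.metadata.labels', filterResult is the label map (or null).
        labels = item.get("filterResult") or {}
        if any(labels.get(key) != value for key, value in required.items()):
            names.append(item["object"]["metadata"]["name"])
    return names


group_entry = {
    "binding": "periodic-checking",
    "type": "Group",
    "snapshots": {
        "monitor-pods": [
            {"object": {"metadata": {"name": "pod-a"}}, "filterResult": {"team": "ops"}},
            {"object": {"metadata": {"name": "pod-b"}}, "filterResult": None},
        ],
        "configmap-content": [
            {"object": {"metadata": {"name": "settings-for-my-hook"}},
             "filterResult": {"team": "ops"}},
        ],
    },
}
print(pods_not_matching(group_entry))  # prints ['pod-b']
```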
settings
An optional block with hook-level settings.
Syntax
configVersion: v1
settings:
  executionMinInterval: 3s
  executionBurst: 1
Parameters
- executionMinInterval — defines a minimum time between hook executions.
- executionBurst — the number of allowed executions during a period.
Execution rate
executionMinInterval and executionBurst are parameters for the “token bucket” algorithm. These parameters are used to throttle hook executions and wait for more events in the queue. It is wise to use a separate queue for bindings in such a hook, as a hook with execution rate settings and the default (“main”) queue can hold up the execution of other hooks.
Example
configVersion: v1
kubernetes:
- name: "all-pods-in-ns"
  kind: Pod
  executeHookOnEvent: ["Modified"]
  namespace:
    nameSelector:
      matchNames: ["default"]
  queue: handle-pods-queue
settings:
  executionMinInterval: 3s
  executionBurst: 1
If Shell-operator receives many events for the “all-pods-in-ns” binding, the hook will be executed no more than once every 3 seconds.
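To get a feel for how these two settings interact, here is a small simulation of a generic token bucket. It is not Shell-operator’s implementation, and it ignores the compaction of queued events. The function name `throttle` is invented for this sketch, and the bucket is assumed to hold executionBurst tokens and to refill one token per executionMinInterval.

```python
def throttle(event_times, min_interval, burst):
    """Return the times at which executions may start under a token bucket.

    event_times  -- arrival times of events, in seconds
    min_interval -- seconds needed to refill one token
    burst        -- bucket capacity (the bucket starts full)
    """
    tokens = float(burst)
    clock = 0.0  # time of the last bucket update
    starts = []
    for arrival in sorted(event_times):
        now = max(arrival, clock)
        # Refill for the time that has passed, capped at the bucket capacity.
        tokens = min(burst, tokens + (now - clock) / min_interval)
        if tokens < 1.0:
            # Wait until the missing fraction of a token has been refilled.
            now += (1.0 - tokens) * min_interval
            tokens = 1.0
        tokens -= 1.0
        clock = now
        starts.append(now)
    return starts


# Four events arriving within 1.5 seconds, with the settings from the example.
print([round(t, 3) for t in throttle([0.0, 0.5, 1.0, 1.5], min_interval=3.0, burst=1)])
```

With a burst of 1 the simulated start times end up about 3 seconds apart, which matches the “no more than once every 3 seconds” behavior described above. A larger burst lets the first few executions start back to back before the spacing applies.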
Modifying Kubernetes objects
You can delegate Kubernetes object manipulation to the shell-operator.
To do this, write one or more JSON or YAML documents describing the operation type and its parameters to a file. The list of possible operations and the corresponding JSON specifications can be found below.
The path to the file is found in the $KUBERNETES_PATCH_PATH environment variable.
Operations
Create
- operation — specifies an operation’s type.
  - CreateOrUpdate — accepts a Kubernetes object. It retrieves the object and, if it already exists, computes a JSON Merge Patch and applies it (will not update the .status field). If it does not exist, the object is created.
  - Create — will fail if an object already exists.
  - CreateIfNotExists — creates an object if such an object does not already exist by namespace/name.
- object — full object specification including “apiVersion”, “kind” and all necessary metadata. Can be a normal JSON or YAML object or a stringified JSON or YAML object.
Example
{
"operation": "CreateOrUpdate",
"object": {
"apiVersion": "apps/v1",
"kind": "DaemonSet",
"metadata": {
"name": "flannel",
"namespace": "d8-flannel"
},
"spec": {
"selector": {
"matchLabels": {
"app": "flannel"
}
},
"template": {
"metadata": {
"labels": {
"app": "flannel",
"tier": "node"
}
},
"spec": {
"containers": [
{
"args": [
"--ip-masq",
"--kube-subnet-mgr"
],
"image": "flannel:v0.11",
"name": "kube-flannel",
"securityContext": {
"privileged": true
}
}
],
"hostNetwork": true,
"imagePullSecrets": [
{
"name": "registry"
}
],
"terminationGracePeriodSeconds": 5
}
},
"updateStrategy": {
"type": "RollingUpdate"
}
}
}
}
operation: Create
object:
  apiVersion: v1
  kind: ConfigMap
  metadata:
    namespace: default
    name: testcm
  data: ...

operation: Create
object: |
  {"apiVersion":"v1", "kind":"ConfigMap",
   "metadata":{"namespace":"default","name":"testcm"},
   "data":{"foo": "bar"}}
Delete
- operation — specifies an operation’s type. Deletion types map directly to Kubernetes DELETE’s propagationPolicy.
  - Delete — foreground deletion. The hook will block the queue until the referenced object and all its descendants are deleted.
  - DeleteInBackground — will delete the referenced object immediately. All its descendants will be removed by Kubernetes’ garbage collector.
  - DeleteNonCascading — will delete the referenced object immediately and orphan all its descendants.
- apiVersion — optional field that specifies the object’s apiVersion. If not present, the preferred apiVersion for the given kind is used.
- kind — object’s Kind.
- namespace — object’s namespace. If empty, implies an operation on a cluster-level resource.
- name — object’s name.
- subresource — a subresource name if a subresource is to be transformed. For example, status.
Example
{
"operation": "Delete",
"kind": "Pod",
"namespace": "default",
"name": "nginx"
}
Patch
Use JQPatch for almost everything. Consider using MergePatch or JSONPatch if you are attempting to modify a rapidly changing object, for example a status field with many concurrent changes (and an incrementing resourceVersion).
Be careful when updating a .status field. If a /status subresource is enabled on a resource, it’ll ignore updates to the .status field if you haven’t specified subresource: status in the operation spec.
More info here.
JQPatch
- operation — specifies an operation’s type.
- apiVersion — optional field that specifies the object’s apiVersion. If not present, the preferred apiVersion for the given kind is used.
- kind — object’s Kind.
- namespace — object’s Namespace. If empty, implies an operation on a Cluster-level resource.
- name — object’s name.
- jqFilter — describes transformations to perform on an object.
- subresource — a subresource name if a subresource is to be transformed. For example, status.
Example
{
"operation": "JQPatch",
"kind": "Deployment",
"namespace": "default",
"name": "nginx",
"jqFilter": ".spec.replicas = 1"
}
MergePatch
- operation — specifies an operation’s type.
- apiVersion — optional field that specifies the object’s apiVersion. If not present, the preferred apiVersion for the given kind is used.
- kind — object’s Kind.
- namespace — object’s Namespace. If empty, implies an operation on a Cluster-level resource.
- name — object’s name.
- mergePatch — describes transformations to perform on an object. Can be a normal JSON or YAML object or a stringified JSON or YAML object.
- subresource — e.g., status.
- ignoreMissingObject — set to true to ignore the error when patching a non-existent object.
Example
{
"operation": "MergePatch",
"kind": "Deployment",
"namespace": "default",
"name": "nginx",
"mergePatch": {
"spec": {
"replicas": 1
}
}
}
operation: MergePatch
kind: Deployment
namespace: default
name: nginx
ignoreMissingObject: true
mergePatch: |
  spec:
    replicas: 1
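For readers unfamiliar with merge patch semantics, the sketch below applies a patch to a plain dictionary following the JSON Merge Patch rules of RFC 7396: nested objects are merged key by key, a null value removes the key, and any other value (including an array) replaces the existing one. It only illustrates how a mergePatch document is interpreted and is not code from Shell-operator; the function name `merge_patch` is chosen for this example.

```python
def merge_patch(target, patch):
    """Apply a JSON Merge Patch (RFC 7396) to target and return the result."""
    if not isinstance(patch, dict):
        # Scalars and arrays replace the target wholesale.
        return patch
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            # null means "delete this key".
            result.pop(key, None)
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


deployment = {"spec": {"replicas": 3, "paused": True}}
print(merge_patch(deployment, {"spec": {"replicas": 1}}))
# prints {'spec': {'replicas': 1, 'paused': True}}
```

Only the keys named in the patch change; sibling fields such as `paused` in this made-up object are left as they were.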
JSONPatch
- operation — specifies an operation’s type.
- apiVersion — optional field that specifies the object’s apiVersion. If not present, the preferred apiVersion for the given kind is used.
- kind — object’s Kind.
- namespace — object’s Namespace. If empty, implies an operation on a Cluster-level resource.
- name — object’s name.
- jsonPatch — describes transformations to perform on an object. Can be a normal JSON or YAML array or a stringified JSON or YAML array.
- subresource — a subresource name if a subresource is to be transformed. For example, status.
- ignoreMissingObject — set to true to ignore the error when patching a non-existent object.
Example
{
"operation": "JSONPatch",
"kind": "Deployment",
"namespace": "default",
"name": "nginx",
"jsonPatch": [
{"op": "replace", "path": "/spec/replicas", "value": 1}
]
}
{
"operation": "JSONPatch",
"kind": "Deployment",
"namespace": "default",
"name": "nginx",
"jsonPatch": "[{\"op\": \"replace\", \"path\": \"/spec/replicas\", \"value\": 1}]"
}
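The operations described in this section all end up in the file named by $KUBERNETES_PATCH_PATH. The sketch below shows one way a Python hook might write them. The helper names are invented, and writing one JSON document per line is an assumption of this example rather than a documented requirement. The demo writes to a temporary file so that it can run outside Shell-operator.

```python
import json
import os
import tempfile


def render_operations(operations):
    """Serialize operations as JSON documents, one per line (an assumed layout)."""
    return "".join(json.dumps(operation) + "\n" for operation in operations)


def write_operations(operations, path=None):
    """Append operations to the patch file; defaults to $KUBERNETES_PATCH_PATH."""
    path = path or os.environ["KUBERNETES_PATCH_PATH"]
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(render_operations(operations))


scale_down = {
    "operation": "MergePatch",
    "kind": "Deployment",
    "namespace": "default",
    "name": "nginx",
    "mergePatch": {"spec": {"replicas": 1}},
}
delete_pod = {
    "operation": "Delete",
    "kind": "Pod",
    "namespace": "default",
    "name": "nginx",
}

# Demo against a temporary file instead of the real $KUBERNETES_PATCH_PATH.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "patch.json")
    write_operations([scale_down, delete_pod], path=demo_path)
    with open(demo_path, encoding="utf-8") as handle:
        written = [json.loads(line) for line in handle]

print(written == [scale_down, delete_pod])  # prints True
```

Building the documents with `json.dumps` avoids hand-escaping quotes, which is easy to get wrong in the stringified variants shown above.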
kubernetesValidating
This binding transforms a hook into a handler for ValidatingWebhookConfiguration. Shell-operator creates a ValidatingWebhookConfiguration, starts an HTTPS server, and runs hooks to handle AdmissionReview requests.
Note: shell-operator uses admissionregistration.k8s.io/v1, so Kubernetes 1.16+ is needed.
Syntax
configVersion: v1
onStartup: 10
kubernetes:
- name: myCrdObjects
  ...
kubernetesValidating:
- name: my-crd-validator.example.com
  # include snapshots by binding names
  includeSnapshotsFrom: ["myCrdObjects"]
  # or use group name to include all snapshots in a group
  group: "group name"
  labelSelector:   # equivalent of objectSelector
    matchLabels:
      label1: value1
      ...
  namespace:
    labelSelector: # equivalent of namespaceSelector
      matchLabels:
        label1: value1
        ...
      matchExpressions:
      - key: environment
        operator: In
        values: ["prod","staging"]
  rules:
  - apiVersions:
    - v1
    apiGroups:
    - stable.example.com
    resources:
    - CronTab
    operations:
    - "*"
  - operations: ["CREATE", "UPDATE"]
    apiGroups: ["apps"]
    apiVersions: ["v1", "v1beta1"]
    resources: ["deployments", "replicasets"]
    scope: "Namespaced"
  failurePolicy: Ignore | Fail (default)
  sideEffects: None (default) | NoneOnDryRun
  timeoutSeconds: 2 (default is 10)
Parameters
- name — a required parameter. It should be a domain with at least three segments separated by dots.
- includeSnapshotsFrom — an array of names of kubernetes bindings in a hook. When specified, a list of monitored objects from these bindings will be added to the binding context in the snapshots field.
- group — a key to include snapshots from a group of schedule and kubernetes bindings. See grouping.
- labelSelector — standard selector of objects by labels (examples of use). See objectSelector.
- namespace.labelSelector — this filter works like labelSelector but for namespaces. See namespaceSelector.
- rules — a required list of rules used to determine if a request to the Kubernetes API server should be sent to the hook. See Rules.
- failurePolicy — defines how errors from the hook are handled. See Failure policy. Default is Fail.
- sideEffects — determines whether the hook is dryRun-aware. See side effects documentation. Default is None.
- timeoutSeconds — the number of seconds the API server should wait for a hook to respond before treating the call as a failure. See timeouts. Default is 10 (seconds).
As you can see, it is a close copy of a Webhook configuration. The differences are:
- objectSelector is a labelSelector as in the kubernetes binding.
- namespaceSelector is a namespace.labelSelector as in the kubernetes binding.
- clientConfig is managed by Shell-operator. You should provide a Service for the Shell-operator HTTPS endpoint. See example 204-validating-webhook for a possible solution.
- matchPolicy is always “Equivalent”. See Matching requests: matchPolicy.
- there are additional fields group and includeSnapshotsFrom to include snapshots in the binding context.
Example
configVersion: v1
kubernetesValidating:
- name: private-repo-policy.example.com
  rules:
  - apiGroups: ["stable.example.com"]
    apiVersions: ["v1"]
    operations: ["CREATE"]
    resources: ["crontabs"]
    scope: "Namespaced"
Shell-operator will execute the hook with this configuration on every creation of a CronTab object.
See example 204-validating-webhook.
Hook input and output
Note that the group parameter is only for including snapshots. A kubernetesValidating hook is never executed on schedule or kubernetes events with a binding context with "type":"Group".
The hook receives a binding context and should return a response in $VALIDATING_RESPONSE_PATH.
$BINDING_CONTEXT_PATH file example:
[{
# Name as defined in binding configuration.
"binding": "my-crd-validator.example.com",
# Validating to distinguish from other events.
"type": "Validating",
# Snapshots as defined by includeSnapshotsFrom or group.
"snapshots": { ... },
# AdmissionReview object.
"review": {
"apiVersion": "admission.k8s.io/v1",
"kind": "AdmissionReview",
"request": {
# Random uid uniquely identifying this admission call
"uid": "705ab4f5-6393-11e8-b7cc-42010a800002",
# Fully-qualified group/version/kind of the incoming object
"kind": {"group":"autoscaling","version":"v1","kind":"Scale"},
# Fully-qualified group/version/kind of the resource being modified
"resource": {"group":"apps","version":"v1","resource":"deployments"},
# subresource, if the request is to a subresource
"subResource": "scale",
# Fully-qualified group/version/kind of the incoming object in the original request to the API server.
# This only differs from `kind` if the webhook specified `matchPolicy: Equivalent` and the
# original request to the API server was converted to a version the webhook registered for.
"requestKind": {"group":"autoscaling","version":"v1","kind":"Scale"},
# Fully-qualified group/version/kind of the resource being modified in the original request to the API server.
# This only differs from `resource` if the webhook specified `matchPolicy: Equivalent` and the
# original request to the API server was converted to a version the webhook registered for.
"requestResource": {"group":"apps","version":"v1","resource":"deployments"},
# subresource, if the request is to a subresource
# This only differs from `subResource` if the webhook specified `matchPolicy: Equivalent` and the
# original request to the API server was converted to a version the webhook registered for.
"requestSubResource": "scale",
# Name of the resource being modified
"name": "my-deployment",
# Namespace of the resource being modified, if the resource is namespaced (or is a Namespace object)
"namespace": "my-namespace",
# operation can be CREATE, UPDATE, DELETE, or CONNECT
"operation": "UPDATE",
"userInfo": {
# Username of the authenticated user making the request to the API server
"username": "admin",
# UID of the authenticated user making the request to the API server
"uid": "014fbff9a07c",
# Group memberships of the authenticated user making the request to the API server
"groups": ["system:authenticated","my-admin-group"],
# Arbitrary extra info associated with the user making the request to the API server.
# This is populated by the API server authentication layer and should be included
# if any SubjectAccessReview checks are performed by the webhook.
"extra": {
"some-key":["some-value1", "some-value2"]
}
},
# object is the new object being admitted.
# It is null for DELETE operations.
"object": {"apiVersion":"autoscaling/v1","kind":"Scale",...},
# oldObject is the existing object.
# It is null for CREATE and CONNECT operations.
"oldObject": {"apiVersion":"autoscaling/v1","kind":"Scale",...},
# options contains the options for the operation being admitted, like meta.k8s.io/v1 CreateOptions, UpdateOptions, or DeleteOptions.
# It is null for CONNECT operations.
"options": {"apiVersion":"meta.k8s.io/v1","kind":"UpdateOptions",...},
# dryRun indicates the API request is running in dry run mode and will not be persisted.
# Webhooks with side effects should avoid actuating those side effects when dryRun is true.
# See http://k8s.io/docs/reference/using-api/api-concepts/#make-a-dry-run-request for more details.
"dryRun": false
}
}
}]
Response example:
cat <<EOF > $VALIDATING_RESPONSE_PATH
{"allowed": true}
EOF
Allow with warnings (Kubernetes 1.19+):
cat <<EOF > $VALIDATING_RESPONSE_PATH
{"allowed": true, "warnings":["It might be risky because it is Tuesday", "It might be risky because your name starts with A"]}
EOF
Deny object creation and explain why:
cat <<EOF > $VALIDATING_RESPONSE_PATH
{"allowed": false, "message": "You cannot do this because it is Tuesday and your name starts with A"}
EOF
User will see an error message:
Error from server: admission webhook "policy.example.com" denied the request: You cannot do this because it is Tuesday and your name starts with A
An empty or invalid $VALIDATING_RESPONSE_PATH file is treated as "allowed": false with a short message about the problem and a more verbose error in the log.
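The same request/response cycle can be handled from a Python hook. The sketch below is only an illustration: the policy (images must come from repo.example.com), the spec.image field, and the function names are all hypothetical, and it assumes the “Validating” entry is the first item of the binding context array. Only $BINDING_CONTEXT_PATH, $VALIDATING_RESPONSE_PATH and the allowed/message response keys are taken from the text above.

```python
import json
import os

ALLOWED_PREFIX = "repo.example.com/"  # hypothetical policy value


def review_response(entry, allowed_prefix=ALLOWED_PREFIX):
    """Build the hook response for one "Validating" binding context entry."""
    # "object" is null for DELETE operations, so fall back to an empty dict.
    obj = entry["review"]["request"].get("object") or {}
    image = obj.get("spec", {}).get("image", "")  # hypothetical field
    if image.startswith(allowed_prefix):
        return {"allowed": True}
    return {
        "allowed": False,
        "message": f"image {image!r} must start with {allowed_prefix}",
    }


def main():
    with open(os.environ["BINDING_CONTEXT_PATH"], encoding="utf-8") as handle:
        context = json.load(handle)
    with open(os.environ["VALIDATING_RESPONSE_PATH"], "w", encoding="utf-8") as handle:
        json.dump(review_response(context[0]), handle)


# Only act as a hook when both paths are present in the environment.
if "BINDING_CONTEXT_PATH" in os.environ and "VALIDATING_RESPONSE_PATH" in os.environ:
    main()
```

Keeping the decision in a pure function makes the policy easy to exercise without a cluster, while `main` deals only with the two files.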
HTTP server and Kubernetes configuration
Shell-operator should create an HTTP endpoint with TLS support and register endpoints in the ValidatingWebhookConfiguration resource.
There should be a Service for shell-operator (see Availability).
Command line options:
--validating-webhook-configuration-name="shell-operator-hooks"
A name of a ValidatingWebhookConfiguration resource. Can be set with
$VALIDATING_WEBHOOK_CONFIGURATION_NAME.
--validating-webhook-service-name="shell-operator-validating-svc"
A name of a service used in ValidatingWebhookConfiguration. Can be set
with $VALIDATING_WEBHOOK_SERVICE_NAME.
--validating-webhook-server-cert="/validating-certs/tls.crt"
A path to a server certificate for service used in
ValidatingWebhookConfiguration. Can be set with
$VALIDATING_WEBHOOK_SERVER_CERT.
--validating-webhook-server-key="/validating-certs/tls.key"
A path to a server private key for service used in
ValidatingWebhookConfiguration. Can be set with
$VALIDATING_WEBHOOK_SERVER_KEY.
--validating-webhook-ca="/validating-certs/ca.crt"
A path to a ca certificate for ValidatingWebhookConfiguration. Can be set
with $VALIDATING_WEBHOOK_CA.
--validating-webhook-client-ca=VALIDATING-WEBHOOK-CLIENT-CA ...
A path to a server certificate for ValidatingWebhookConfiguration. Can be
set with $VALIDATING_WEBHOOK_CLIENT_CA.
kubernetesCustomResourceConversion
This binding transforms a hook into a handler for conversions defined in a CustomResourceDefinition. Shell-operator updates the CRD with .spec.conversion, starts an HTTPS server, and runs hooks to handle ConversionReview requests.
Note: shell-operator uses apiextensions.k8s.io/v1, so Kubernetes 1.16+ is required.
Syntax
configVersion: v1
onStartup: 10
kubernetes:
- name: additionalObjects
  ...
kubernetesCustomResourceConversion:
- name: alpha1_to_alpha2
  # Include snapshots by binding names.
  includeSnapshotsFrom: ["additionalObjects"]
  # Or use group name to include all snapshots in a group.
  group: "group name"
  # A CRD name.
  crdName: crontabs.stable.example.com
  # An array of conversions supported by this hook.
  conversions:
  - fromVersion: stable.example.com/v1alpha1
    toVersion: stable.example.com/v1alpha2
Parameters
- name — a required parameter. It is used to distinguish between multiple bindings during runtime. For more information see binding context.
- includeSnapshotsFrom — an array of names of kubernetes bindings in a hook. When specified, a list of monitored objects from these bindings will be added to the binding context in the snapshots field.
- group — a key to include snapshots from a group of schedule and kubernetes bindings. See grouping.
- crdName — a required name of a CRD.
- conversions — a required list of conversion rules. These rules are used to determine if a custom resource in ConversionReview can be converted by the hook.
  - fromVersion — a version of a custom resource that the hook can convert.
  - toVersion — a version of a custom resource that the hook can produce.
Example
configVersion: v1
kubernetesCustomResourceConversion:
- name: conversions
  crdName: crontabs.stable.example.com
  conversions:
  - fromVersion: unstable.crontab.io/v1beta1
    toVersion: stable.example.com/v1beta1
  - fromVersion: stable.example.com/v1beta1
    toVersion: stable.example.com/v1beta2
  - fromVersion: v1beta2
    toVersion: v1
The Shell-operator will execute this hook to convert custom resources ‘crontabs.stable.example.com’ from unstable.crontab.io/v1beta1 to stable.example.com/v1beta1, from stable.example.com/v1beta1 to stable.example.com/v1beta2, from unstable.crontab.io/v1beta1 to stable.example.com/v1 and so on.
See example 210-conversion-webhook.
Hook input and output
Note that the group parameter is only for including snapshots. A kubernetesCustomResourceConversion hook is never executed on schedule or kubernetes events with a binding context with "type":"Group".
The hook receives a binding context and should return a response in $CONVERSION_RESPONSE_PATH.
$BINDING_CONTEXT_PATH file example:
[{
# Name as defined in binding configuration.
"binding": "alpha1_to_alpha2",
# type "Conversion" to distinguish from other events.
"type": "Conversion",
# Snapshots as defined by includeSnapshotsFrom or group.
"snapshots": { ... },
# fromVersion and toVersion as defined in a conversion rule.
"fromVersion": "unstable.crontab.io/v1beta1",
"toVersion": "stable.example.com/v1beta1",
# ConversionReview object.
"review": {
"apiVersion": "apiextensions.k8s.io/v1",
"kind": "ConversionReview",
"request": {
"desiredAPIVersion": "stable.example.com/v1beta1",
"objects": [
{
# A source version.
"apiVersion": "unstable.crontab.io/v1beta1",
"kind": "CronTab",
"metadata": {
"name": "crontab-v1alpha1",
"namespace": "example-210",
...
},
"spec": {
"cron": [
"*",
"*",
"*",
"*",
"*/5"
],
"imageName": [
"repo.example.com/my-awesome-cron-image:v1"
]
}
}
],
"uid": "42f90c87-87f5-4686-8109-eba065c7fa6e"
}
}
}]
Response example:
cat <<EOF >$CONVERSION_RESPONSE_PATH
{"convertedObjects": [{
# A converted version.
"apiVersion": "stable.example.com/v1beta1",
"kind": "CronTab",
"metadata": {
"name": "crontab-v1alpha1",
"namespace": "example-210",
...
},
"spec": {
"cron": [
"*",
"*",
"*",
"*",
"*/5"
],
"imageName": [
"repo.example.com/my-awesome-cron-image:v1"
]
}
}]}
EOF
Return a message if something goes wrong:
cat <<EOF >$CONVERSION_RESPONSE_PATH
{"failedMessage":"Conversion of crontabs.stable.example.com is failed"}
EOF
User will see an error message:
Error from server: conversion webhook for unstable.crontab.io/v1beta1, Kind=CronTab failed: Conversion of crontabs.stable.example.com is failed
An empty or invalid $CONVERSION_RESPONSE_PATH file is treated as a failure with a short message about the problem and a more verbose error in the log.
Note: kube-apiserver applies the OpenAPI spec to the object returned by the webhook. This can remove unknown fields without notifying the user.
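A conversion hook written in Python could follow the shape below. This is a deliberately trivial sketch: it only re-stamps apiVersion with the requested version, whereas a real conversion would also migrate fields between schemas. The function names are invented, and it assumes the “Conversion” entry is the first item of the binding context array. The convertedObjects and failedMessage keys and the two environment variables are the ones described above.

```python
import json
import os


def conversion_response(entry):
    """Build a response for one "Conversion" binding context entry."""
    request = entry["review"]["request"]
    desired = request["desiredAPIVersion"]
    try:
        # Copy each object and set the requested apiVersion on the copy.
        converted = [dict(obj, apiVersion=desired) for obj in request["objects"]]
    except (TypeError, ValueError) as exc:
        return {"failedMessage": f"conversion to {desired} failed: {exc}"}
    return {"convertedObjects": converted}


def main():
    with open(os.environ["BINDING_CONTEXT_PATH"], encoding="utf-8") as handle:
        context = json.load(handle)
    with open(os.environ["CONVERSION_RESPONSE_PATH"], "w", encoding="utf-8") as handle:
        json.dump(conversion_response(context[0]), handle)


# Only act as a hook when both paths are present in the environment.
if "BINDING_CONTEXT_PATH" in os.environ and "CONVERSION_RESPONSE_PATH" in os.environ:
    main()
```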
HTTP server and Kubernetes configuration
Shell-operator should create an HTTP endpoint with TLS support and register an endpoint in the CustomResourceDefinition resource.
There should be a Service for shell-operator (see Service Reference).
There are command line options and corresponding environment variables to set up TLS certificates and a service name:
--conversion-webhook-service-name="shell-operator-conversion-svc"
A name of a service for clientConfig in CRD. Can be set with
$CONVERSION_WEBHOOK_SERVICE_NAME.
--conversion-webhook-server-cert="/conversion-certs/tls.crt"
A path to a server certificate for clientConfig in CRD. Can be set with
$CONVERSION_WEBHOOK_SERVER_CERT.
--conversion-webhook-server-key="/conversion-certs/tls.key"
A path to a server private key for clientConfig in CRD. Can be set with
$CONVERSION_WEBHOOK_SERVER_KEY.
--conversion-webhook-ca="/conversion-certs/ca.crt"
A path to a ca certificate for clientConfig in CRD. Can be set with
$CONVERSION_WEBHOOK_CA.
--conversion-webhook-client-ca=CONVERSION-WEBHOOK-CLIENT-CA ...
A path to a server certificate for CRD.spec.conversion.webhook. Can be set
with $CONVERSION_WEBHOOK_CLIENT_CA.
Shell-operator metrics
Shell-operator exports its built-in Prometheus metrics to the /metrics path. Custom metrics generated by hooks are reported to /metrics/hooks. The default port is 9115.
Metrics
- shell_operator_hook_run_seconds{hook="", binding="", queue=""} — a histogram with hook execution times. The “hook” label is the name of the hook, “binding” is a binding name from the configuration, and “queue” is the name of the queue where the hook is queued.
- shell_operator_hook_run_errors_total{hook="hook-name", binding="", queue=""} — a counter of hook execution errors. It only tracks errors of hooks with allowFailure disabled (i.e. the respective key is omitted in the configuration or the allowFailure: false parameter is set). This metric has a “hook” label with the name of the failed hook.
- shell_operator_hook_run_allowed_errors_total{hook="hook-name", binding="", queue=""} — a counter of hook execution errors. It only tracks errors of hooks that are allowed to exit with an error (the allowFailure: true parameter is set in the configuration). The metric has a “hook” label with the name of the failed hook.
- shell_operator_hook_run_success_total{hook="hook-name", binding="", queue=""} — a counter of successful hook executions. The metric has a “hook” label with the name of the succeeded hook.
- shell_operator_hook_enable_kubernetes_bindings_success{hook=""} — this gauge has two values: 0.0 if Kubernetes informers are not started and 1.0 if Kubernetes informers are successfully started for a hook.
- shell_operator_hook_enable_kubernetes_bindings_errors_total{hook=""} — a counter of failed attempts to start Kubernetes informers for a hook.
- shell_operator_hook_enable_kubernetes_bindings_seconds{hook=""} — a gauge with time of Kubernetes informers start.
- shell_operator_tasks_queue_length{queue=""} — a gauge showing the length of the working queue. This metric can be used to warn about stuck hooks. It has the “queue” label with the queue name.
- shell_operator_task_wait_in_queue_seconds_total{hook="", binding="", queue=""} — a counter of the seconds that the task to run a hook spent waiting in the queue.
- shell_operator_live_ticks — a counter that increases every 10 seconds. This metric can be used for alerting about an unhealthy Shell-operator. It has no labels.
- shell_operator_kube_jq_filter_duration_seconds{hook="", binding="", queue=""} — a histogram with jq filter timings.
- shell_operator_kube_event_duration_seconds{hook="", binding="", queue=""} — a histogram with kube event handling timings.
- shell_operator_kube_snapshot_objects{hook="", binding="", queue=""} — a gauge with the count of cached objects (the snapshot) for a particular binding.
- shell_operator_kubernetes_client_request_result_total — a counter of requests made by the kubernetes/client-go library.
- shell_operator_kubernetes_client_request_latency_seconds — a histogram with the latency of requests made by the kubernetes/client-go library.
- shell_operator_tasks_queue_action_duration_seconds{queue_name="", queue_action=""} — a histogram with measurements of low-level queue operations. Use QUEUE_ACTIONS_METRICS="no" to disable this metric.
- shell_operator_hook_run_sys_cpu_seconds{hook="", binding="", queue=""} — a histogram with system CPU seconds.
- shell_operator_hook_run_user_cpu_seconds{hook="", binding="", queue=""} — a histogram with user CPU seconds.
- shell_operator_hook_run_max_rss_bytes{hook="", binding="", queue=""} — a gauge with the maximum resident set size used, in bytes.
Custom metrics
Hooks can export metrics by writing a set of operations in JSON format into $METRICS_PATH file.
Operation to register a counter and increase its value:
{"name":"metric_name","action":"add","value":1,"labels":{"label1":"value1"}}
Operation to register a gauge and set its value:
{"name":"metric_name","action":"set","value":33,"labels":{"label1":"value1"}}
Operation to register a histogram and observe a duration:
{"name":"metric_name","action":"observe","value":42, "buckets": [1,2,5,10,20,50], "labels":{"label1":"value1"}}
Labels are not required, but Shell-operator adds a hook label with a path to a hook script relative to the hooks directory.
Several metrics can be exported at once. For example, this script will create 2 metrics:
echo '{"name":"hook_metric_count","action":"add","value":1,"labels":{"label1":"value1"}}' >> $METRICS_PATH
echo '{"name":"hook_metrics_items","action":"add","value":1,"labels":{"label1":"value1"}}' >> $METRICS_PATH
The metric name is used as-is, so several hooks can export the same metric name. It is advisable for hook developers to keep label cardinality consistent.
There are fields “add” and “set” that can be used as shortcuts for action and value. This feature may be deprecated in future releases.
{"name":"metric_name","add":1,"labels":{"label1":"value1"}}
Note that there is no mechanism to expire metrics of this kind other than restarting shell-operator. This is the default behavior of prometheus-client.
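Hooks written in Python can produce the same operations with the standard json module. The helper names below are invented for this sketch, and the file is assumed to take one JSON operation per line, as in the echo examples above.

```python
import json
import os

ACTIONS = {"add", "set", "observe"}


def metric_operation(name, action, value, labels=None, buckets=None):
    """Return one metric operation serialized as a JSON line."""
    if action not in ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    operation = {"name": name, "action": action, "value": value}
    if labels:
        operation["labels"] = labels
    if buckets is not None:
        # Buckets are only meaningful for the "observe" (histogram) action.
        operation["buckets"] = buckets
    return json.dumps(operation)


def append_metrics(lines, path=None):
    """Append serialized operations to the metrics file; defaults to $METRICS_PATH."""
    path = path or os.environ["METRICS_PATH"]
    with open(path, "a", encoding="utf-8") as handle:
        for line in lines:
            handle.write(line + "\n")


print(metric_operation("hook_metric_count", "add", 1, labels={"label1": "value1"}))
# prints {"name": "hook_metric_count", "action": "add", "value": 1, "labels": {"label1": "value1"}}
```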
Grouped metrics
A common reason to expire a metric is that the object it describes was removed. The object is then no longer in the snapshot, so the hook can’t identify the metric that should be expired.
To solve this, use the “group” field in metric operations. When Shell-operator receives operations with the “group” field, it expires previous metrics with the same group and applies new metric values. This grouping works across hooks and label values.
echo '{"group":"group1", "name":"hook_metric_count", "action":"add", "value":1, "labels":{"label1":"value1"}}' >> $METRICS_PATH
echo '{"group":"group1", "name":"hook_metrics_items", "action":"add", "value":1, "labels":{"label1":"value1"}}' >> $METRICS_PATH
To expire all metrics in a group, use action “expire”:
{"group":"group_name_1", "action":"expire"}
WARNING: “observe” is currently not supported as an action for grouped metrics.
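The grouping rules suggest a simple pattern: on every run, write one grouped operation per object that still exists, and send an explicit “expire” when nothing is left, because a run that writes no operation for the group leaves the old values in place. The sketch below assumes a hypothetical helper report_objects, a hypothetical metric name object_present, and object names passed as arguments; a real hook would take the names from its snapshot. The temporary-file fallback is an assumption for running the snippet outside shell-operator.

```shell
#!/bin/sh
# Assumption: outside shell-operator METRICS_PATH is unset, so fall back to a temp file.
METRICS_PATH="${METRICS_PATH:-$(mktemp)}"

# report_objects GROUP [OBJECT_NAME...]
# Hypothetical helper: writes one grouped gauge per object name,
# or a single "expire" operation for the group when no names are given.
report_objects() {
  _group=$1
  shift
  if [ "$#" -eq 0 ]; then
    printf '{"group":"%s","action":"expire"}\n' "$_group" >> "$METRICS_PATH"
    return 0
  fi
  for _name in "$@"; do
    printf '{"group":"%s","name":"object_present","action":"set","value":1,"labels":{"object":"%s"}}\n' \
      "$_group" "$_name" >> "$METRICS_PATH"
  done
}

# Two objects currently exist; metrics for objects from earlier runs expire with the group.
report_objects my_hook pod-a pod-b
```

Calling report_objects my_hook with no object names writes {"group":"my_hook","action":"expire"}.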
Example
hook1.sh returns these metrics:
echo '{"group":"hook1", "name":"hook_metric", "action":"add", "value":1, "labels":{"kind":"pod"}}' >> $METRICS_PATH
echo '{"group":"hook1", "name":"hook_metric", "action":"add", "value":1, "labels":{"kind":"replicaset"}}' >> $METRICS_PATH
echo '{"group":"hook1", "name":"hook_metric", "action":"add", "value":1, "labels":{"kind":"deployment"}}' >> $METRICS_PATH
echo '{"group":"hook1", "name":"hook1_special_metric", "action":"set", "value":12, "labels":{"label1":"value1"}}' >> $METRICS_PATH
echo '{"group":"hook1", "name":"common_metric", "action":"set", "value":300, "labels":{"source":"source3"}}' >> $METRICS_PATH
echo '{"name":"common_metric", "action":"set", "value":100, "labels":{"source":"source1"}}' >> $METRICS_PATH
hook2.sh returns these metrics:
echo '{"group":"hook2", "name":"hook_metric","action":"add", "value":1, "labels":{"kind":"configmap"}}' >> $METRICS_PATH
echo '{"group":"hook2", "name":"hook_metric","action":"add", "value":1, "labels":{"kind":"secret"}}' >> $METRICS_PATH
echo '{"group":"hook2", "name":"hook2_special_metric", "action":"set", "value":42}' >> $METRICS_PATH
echo '{"name":"common_metric", "action":"set", "value":200, "labels":{"source":"source2"}}' >> $METRICS_PATH
Prometheus scrapes these metrics:
# HELP hook_metric hook_metric
# TYPE hook_metric counter
hook_metric{hook="hook1.sh", kind="pod"} 1 -------------------+---------- group:hook1
hook_metric{hook="hook1.sh", kind="replicaset"} 1 ------------+
hook_metric{hook="hook1.sh", kind="deployment"} 1 ------------+
hook_metric{hook="hook2.sh", kind="configmap"} 1 -------------|-------+-- group:hook2
hook_metric{hook="hook2.sh", kind="secret"} 1 ----------------|-------+
# HELP hook1_special_metric hook1_special_metric              |       |
# TYPE hook1_special_metric gauge                             |       |
hook1_special_metric{hook="hook1.sh", label1="value1"} 12 ----+       |
# HELP hook2_special_metric hook2_special_metric              |       |
# TYPE hook2_special_metric gauge                             |       |
hook2_special_metric{hook="hook2.sh"} 42 ---------------------|-------'
# HELP common_metric common_metric                            |
# TYPE common_metric gauge                                    |
common_metric{hook="hook1.sh", source="source3"} 300 ---------'
common_metric{hook="hook1.sh", source="source1"} 100 -----------------+---- no group
common_metric{hook="hook2.sh", source="source2"} 200 -----------------'
On the next execution of hook1.sh, the values for hook_metric{kind="replicaset"}, hook_metric{kind="deployment"}, common_metric{source="source3"}, and hook1_special_metric are expired, and the hook returns only one metric:
echo '{"group":"hook1", "name":"hook_metric", "action":"add", "value":1, "labels":{"kind":"pod"}}' >> $METRICS_PATH
Shell-operator expires the previous values for group “hook1” and updates the value for hook_metric{hook="hook1.sh", kind="pod"}. The values for group “hook2” and for common_metric without a group are left intact. Now Prometheus scrapes these metrics:
# HELP hook_metric hook_metric
# TYPE hook_metric counter
hook_metric{hook="hook1.sh", kind="pod"} 2 --------------- group:hook1
hook_metric{hook="hook2.sh", kind="configmap"} 1 ----+---- group:hook2
hook_metric{hook="hook2.sh", kind="secret"} 1 -------+
# HELP hook2_special_metric hook2_special_metric     |
# TYPE hook2_special_metric gauge                    |
hook2_special_metric{hook="hook2.sh"} 42 ------------'
# HELP common_metric common_metric
# TYPE common_metric gauge
common_metric{hook="hook1.sh", source="source1"} 100 --+-- no group
common_metric{hook="hook2.sh", source="source2"} 200 --'
The next execution of hook2.sh expires all metrics in group “hook2”:
echo '{"group":"hook2", "action":"expire"}' >> $METRICS_PATH
Shell-operator expires the previous values for group “hook2” but leaves common_metric for “hook2.sh” as is. Now Prometheus scrapes these metrics:
# HELP hook_metric hook_metric
# TYPE hook_metric counter
hook_metric{hook="hook1.sh", kind="pod"} 2 --------------- group:hook1
# HELP common_metric common_metric
# TYPE common_metric gauge
common_metric{hook="hook1.sh", source="source1"} 100 --+-- no group
common_metric{hook="hook2.sh", source="source2"} 200 --'