API Machinery¶
The API machinery is the foundational layer of Kubernetes extensibility. Understanding GVK/GVR, the informer pipeline, and patch semantics is essential for writing controllers and debugging API interactions.
GVK & GVR¶
Every Kubernetes object is identified by two related concepts:
GVK (Group / Version / Kind) — identifies a type schema:
| Component | Meaning | Examples |
|---|---|---|
| Group | Logical family | "" (core), apps, batch, networking.k8s.io, rbac.authorization.k8s.io |
| Version | Schema version | v1, v1beta1, v1alpha1 |
| Kind | Type name | Deployment, Pod, CustomResourceDefinition |
GVR (Group / Version / Resource) — identifies a REST endpoint:
| Component | Meaning | Examples |
|---|---|---|
| Resource | Plural lowercase noun | deployments, pods, customresourcedefinitions |
The mapping between them:
This mapping lives in the RESTMapper. In client-go: mapper.ResourceFor(gvk). In Go code, the Scheme maps GVK ↔ Go struct type.
URL structure¶
# Core group (group = "")
/api/v1/namespaces/{ns}/pods/{name}
/api/v1/nodes/{name}
# Named groups
/apis/{group}/{version}/namespaces/{ns}/{resource}/{name}
/apis/apps/v1/namespaces/default/deployments/nginx
/apis/batch/v1/namespaces/default/jobs
/apis/custom.io/v1alpha1/namespaces/myns/foos/myfoo
# Sub-resources
/apis/apps/v1/namespaces/default/deployments/nginx/scale
/api/v1/namespaces/default/pods/mypod/log
/api/v1/namespaces/default/pods/mypod/exec
API discovery¶
Clients discover available APIs before making requests. Two endpoints:
GET /api → core group versions
GET /apis → all named groups
GET /api/v1 → resource list for core v1
GET /apis/apps/v1 → resource list for apps/v1
GET /apis/custom.io/v1alpha1 → resource list for your CRD group
Aggregated discovery (GA 1.30) returns all groups + resources in two requests:
kubectl caches discovery documents in ~/.kube/cache/discovery/. Stale cache causes "no matches for kind" errors — clear with kubectl api-resources or delete ~/.kube/cache/.
resourceVersion & watches¶
Every object has a resourceVersion field — a monotonically increasing etcd revision string:
Semantics:
- Changes on every write (spec, status, metadata, labels — anything)
- Used for optimistic concurrency: include it in writes; stale writes get
409 Conflict - Passed to LIST/WATCH to resume from a known point
Watch protocol¶
A watch is a long-lived HTTP/2 streaming request. The API server streams newline-delimited JSON events:
GET /apis/apps/v1/deployments?watch=1&resourceVersion=42891
{"type":"MODIFIED","object":{...}}
{"type":"DELETED","object":{...}}
{"type":"ADDED","object":{...}}
Error cases:
| Condition | HTTP response | Action |
|---|---|---|
resourceVersion too old (compacted) |
410 Gone |
Re-list (RV=""), restart watch |
| Network timeout / server close | Connection closed | Resume with last known RV |
| RV="" | Start from current head | No missed events, but potentially stale |
The informer library handles all this automatically. Don't implement raw watches in application code.
resourceVersion semantics on LIST¶
resourceVersion value |
Meaning |
|---|---|
"" (empty) |
Return from API server cache (may be slightly stale). Most efficient. |
"0" |
Same as empty in practice. |
| Specific value | Return only if cluster state is at least that version. |
"0" + resourceVersionMatch: Exact |
Return exactly from that revision (expensive, hits etcd). |
generation & observedGeneration¶
Two separate revision counters:
metadata.generation— increments on every spec change. Status updates don't increment it.status.observedGeneration— set by the controller to thegenerationvalue it last successfully processed.
Use this pattern to detect controller drift:
if obj.Generation != obj.Status.ObservedGeneration {
// controller hasn't caught up to latest spec yet
}
Always set observedGeneration in your status update:
Patch strategies¶
Four patch types, each with different semantics:
JSON Patch (RFC 6902)¶
Array of operations. Positional — fragile for lists.
[
{"op": "replace", "path": "/spec/replicas", "value": 3},
{"op": "add", "path": "/metadata/labels/env", "value": "prod"},
{"op": "remove", "path": "/metadata/annotations/old-key"}
]
Merge Patch (RFC 7396)¶
Partial object. Null = delete field. Lists are replaced entirely — danger for containers[].
Strategic Merge Patch¶
Kubernetes-specific extension to merge patch. List fields have merge keys defined in the Go struct tags:
// containers merges by "name"
Containers []Container `json:"containers" patchStrategy:"merge" patchMergeKey:"name"`
Patching one container by name doesn't replace others:
kubectl patch deploy/myapp --type=strategic \
-p='{"spec":{"template":{"spec":{"containers":[{"name":"app","image":"myapp:2.0"}]}}}}'
Strategic merge patch doesn't work on CRDs
SMP merge keys are defined in Go struct tags in the Kubernetes source. CRDs have no such tags — SMP falls back to merge patch behavior (list replacement). Use Server-Side Apply for CRDs.
Server-Side Apply (GA 1.22)¶
Field-level ownership tracking. The API server tracks which manager owns which field.
Conflict: two managers try to own the same field → 409 Conflict with details. Force-take ownership:
In Go with controller-runtime:
SSA is the correct approach for controllers that manage partial objects — they own only the fields they care about, and other managers (user, Helm, etc.) can coexist.
Informer architecture¶
The informer is the standard pattern for watching resources without hammering the API server.
API server (ListWatch)
↓
Reflector → ListWatch implementation; handles 410 Gone (re-list + re-watch)
↓
DeltaFIFO → queue of (object, delta-type) pairs; deduplicates
↓
Indexer → thread-safe in-memory store; GVK-indexed; supports label queries
↓
Event handlers (AddFunc, UpdateFunc, DeleteFunc)
↓
WorkQueue → rate-limited, deduplicating; items are NamespacedName keys
↓
Reconciler goroutines (n workers)
Key property: all reads in the reconcile loop come from the Indexer (local cache) — never the API server directly. This means the reconciler never adds to the API server's read load, no matter how many controllers run.
SharedInformerFactory¶
Multiple controllers for the same resource type share a single informer/watch:
factory := informers.NewSharedInformerFactory(client, 30*time.Second)
deployInformer := factory.Apps().V1().Deployments()
podInformer := factory.Core().V1().Pods()
factory.Start(stopCh)
factory.WaitForCacheSync(stopCh) // block until initial list is complete
The factory deduplicates by GVR — two controllers watching Deployments share one HTTP watch stream to the API server.
Always wait for cache sync¶
if !cache.WaitForCacheSync(stopCh, r.deploymentsSynced, r.podsSynced) {
return fmt.Errorf("timed out waiting for caches to sync")
}
Starting reconcilers before the cache is synced causes false "not found" errors for objects that exist but haven't populated the local cache yet.
Object metadata deep dive¶
metadata:
uid: 550e8400-e29b-41d4-a716-446655440000
# Immutable. Unique forever (even after object deletion + recreation).
# etcd uses (namespace, name) for storage; uid identifies a specific instance.
resourceVersion: "42891"
# etcd revision. String — compare only for equality, never parse as integer.
# Changes on every write to any field (spec, status, labels, annotations, finalizers).
generation: 3
# Incremented only on spec changes. Status and metadata changes don't increment.
# Use generation/observedGeneration to detect pending reconciliation.
creationTimestamp: "2024-01-15T10:00:00Z"
deletionTimestamp: "2024-01-16T10:00:00Z"
# Set when DELETE is received AND finalizers are present.
# Object persists until finalizers[] is empty.
deletionGracePeriodSeconds: 30
# Grace period for SIGTERM before SIGKILL on container stop.
managedFields:
# Server-Side Apply field ownership map. One entry per field-manager.
# Can be stripped with --show-managed-fields=false or server-side.