Architecture
Velero ships as a single binary that serves as both the server-side controller manager and the CLI. Understanding the internal process model is essential before reading the code.
Process topology
┌─────────────────────────────────────────────────────────────────┐
│ Kubernetes Cluster │
│ │
│ ┌── namespace: velero ──────────────────────────────────────┐ │
│ │ │ │
│ │ ┌─── velero-server pod ────┐ ┌─── node-agent DaemonSet─┐│ │
│ │ │ BackupController │ │ DataUploadController ││ │
│ │ │ RestoreController │ │ DataDownloadController ││ │
│ │ │ ScheduleController │ │ Kopia repository engine││ │
│ │ │ GCController │ │ hostPath: / (ro) ││ │
│ │ │ BSLController │ └─────────────────────────┘│ │
│ │ └──────────────────────────┘ │ │
│ │ │ │
│ │ ┌─── API Server (controller-runtime informers) ─────────┐│ │
│ │ └───────────────────────────────────────────────────────┘│ │
│ │ │ │
│ │ ┌─── Velero CRDs (etcd) ──┐ ┌─── PVC/VolumeSnapshot ──┐ │ │
│ │ └─────────────────────────┘ └─────────────────────────┘ │ │
│ └───────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘
│ │
▼ (go-plugin / gRPC) ▼
┌─── Plugin process ───┐ ┌─── Object Storage ───┐
│ ObjectStore impl │──────────│ S3 / GCS / Azure / │
│ VolumeSnapshotter │ │ custom │
└──────────────────────┘ └──────────────────────┘
Controllers
Velero uses controller-runtime (the same library as operator-sdk). Each CRD has one or more reconcilers that watch for changes and drive state.
Velero has 19 controllers across ~10 CRDs. Some CRDs are reconciled by multiple controllers at different lifecycle phases. The full inventory:
| Controller | Watches | Action |
|---|---|---|
| BackupQueueController | Backup CRD (New) | Queue ordering, namespace conflict detection, concurrency limits |
| BackupController | Backup CRD (ReadyToStart) | Executes backup via backup.Backup |
| BackupOperationsController | Backup CRD (InProgress) | Polls async plugin operations every 10s |
| BackupFinalizerController | Backup CRD (Finalizing) | Finalizes item actions, uploads metadata |
| RestoreController | Restore CRD | Downloads backup, replays resources via dynamic client |
| RestoreOperationsController | Restore CRD | Polls async plugin operations every 10s |
| RestoreFinalizerController | Restore CRD (Finalizing) | Runs finalization hooks, uploads results |
| ScheduleController | Schedule CRD | Creates Backup objects on cron cadence, skip/pause logic |
| GCController | Backup CRD (expired) | TTL-based expiration, creates DeleteBackupRequests (every 60m) |
| BackupSyncController | BackupStorageLocation | Syncs Backup objects from BSL into cluster (cross-cluster restores) |
| BackupDeletionController | DeleteBackupRequest CRD | Handles explicit backup deletion requests |
| BSLController | BackupStorageLocation | Validates storage connectivity every 10s |
| BackupRepoController | BackupRepository CRD | Establishes/maintains repos, triggers Kopia maintenance |
| DataUploadController | DataUpload CRD | Manages backup data mover pods (CSI path) |
| DataDownloadController | DataDownload CRD | Manages restore data mover pods (CSI path) |
| PodVolumeBackupController | PodVolumeBackup CRD | Legacy FS-based volume backup via node-agent |
| PodVolumeRestoreController | PodVolumeRestore CRD | Legacy FS-based volume restore via node-agent |
| DownloadRequestController | DownloadRequest CRD | Generates signed URLs for backup/restore artifacts |
| ServerStatusRequestController | ServerStatusRequest CRD | Returns server version and installed plugins |
Multiple controllers per CRD
A single Backup CRD is reconciled by 4 controllers in sequence: Queue → Backup → Operations → Finalizer. Understanding which controller owns which phase transition is essential for debugging and contributing. See Controller Deep Dive.
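As a rough illustration of the phase hand-off, the sketch below maps each Backup phase listed in the table to the controller that owns it. The phase strings are copied from the table; `phaseOwner` and `ownerOf` are hypothetical names for this example and are not part of Velero's API package.

```go
package main

import "fmt"

// phaseOwner maps a Backup phase to the controller that reconciles it.
// Phase names are taken from the controller table; they are illustrative
// strings, not constants from Velero's API package (assumption).
var phaseOwner = map[string]string{
	"New":          "BackupQueueController",
	"ReadyToStart": "BackupController",
	"InProgress":   "BackupOperationsController",
	"Finalizing":   "BackupFinalizerController",
}

// ownerOf returns the controller responsible for a phase, or false when
// no controller in this sketch claims the phase.
func ownerOf(phase string) (string, bool) {
	owner, ok := phaseOwner[phase]
	return owner, ok
}

func main() {
	// Walk the phases in the order Queue -> Backup -> Operations -> Finalizer.
	for _, phase := range []string{"New", "ReadyToStart", "InProgress", "Finalizing"} {
		owner, _ := ownerOf(phase)
		fmt.Printf("%-12s -> %s\n", phase, owner)
	}
}
```

When debugging a stuck Backup, a lookup like this narrows the search to one controller's logs: find the current phase, then read the reconciler that owns it.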
Plugin Process Model
Velero uses hashicorp/go-plugin to run plugins as separate OS processes communicating over gRPC.
This design has deliberate consequences:
- Crash isolation: a crashing plugin doesn't take down velero-server.
- Language agnostic: plugins can be written in any language that speaks gRPC (though the Go SDK is the only officially supported one).
- No hot reload: plugins are discovered at startup from the /plugins directory in the velero pod. Changing plugins requires a pod restart.
// pkg/client/factory.go: plugin manager setup (simplified)
pluginManager := clientmgmt.NewManager(logger, logLevel, pluginRegistry)
// Plugin registry scans the /plugins dir in the pod at startup
// Each binary exposes its capabilities via the SDK handshake
objectStore, err := pluginManager.GetObjectStore("velero.io/aws")
snapshotter, err := pluginManager.GetVolumeSnapshotter("velero.io/aws")
velero CLI
The velero binary is the same binary as the server: it branches on the
subcommand, and velero server starts the controller manager.
The CLI communicates with the cluster exclusively through CRD objects and the API server: there is no direct channel to the velero-server pod.
Prototyping Tip
Because CLI actions work through CRDs, you can prototype behavior by
manually creating YAML and watching reconciliation;
no UI or CLI shim is needed.
HA and Leader Election
When running multiple replicas of velero-server (for HA),
controller-runtime's built-in leader election (via Kubernetes leases)
ensures only one replica runs the reconcilers at a time.
The lease is in the velero namespace.
# velero server flags for HA
--leader-elect=true
--leader-elect-lease-duration=15s
--leader-elect-renew-deadline=10s