Fleet Management

Operating a single bare-metal Kubernetes cluster is an exercise in node and component lifecycle management. Operating fifty, five hundred, or five thousand clusters requires a fundamental shift in architecture. Standard GitOps tools designed for single-cluster continuous delivery break down under the N×M connection matrix required for fleet-wide policy and workload distribution.

Fleet management platforms provide a centralized control plane to register, select, and configure fleets of Kubernetes clusters dynamically, treating clusters as cattle rather than pets.

After working through this section, you should be able to:

  • Differentiate between infrastructure lifecycle management (Cluster API) and workload/policy fleet management (OCM, Fleet, Karmada).
  • Evaluate push-based vs. pull-based fleet architectures regarding bare-metal network topologies and firewall constraints.
  • Implement Open Cluster Management (OCM) to bootstrap a hub-and-spoke fleet topology.
  • Formulate cluster placement rules based on custom hardware labels and capacity metrics.
  • Diagnose common fleet failures, including agent disconnections, API server exhaustion, and CRD version skew.

Scaling GitOps across multiple bare-metal clusters introduces distinct operational bottlenecks. If you run one ArgoCD or Flux instance per cluster, you face configuration drift in the CD tooling itself. If you run a centralized ArgoCD instance and push to hundreds of clusters, the central controller must maintain active credentials and network routes to every remote API server.

Fleet management separates what needs to be deployed from where it should be deployed.

The most critical architectural decision in fleet management is the network direction of the control loop.

flowchart TD
    subgraph PushBased["Push-Based (e.g., Default ArgoCD, Karmada)"]
        HubA[Central Control Plane] -->|HTTPS POST| SpokeA1[Spoke API Server]
        HubA -->|HTTPS POST| SpokeA2[Spoke API Server]
    end
    subgraph PullBased["Pull-Based (e.g., OCM, Rancher Fleet)"]
        SpokeB1[Spoke Agent] -->|HTTPS GET/WATCH| HubB[Central Control Plane]
        SpokeB2[Spoke Agent] -->|HTTPS GET/WATCH| HubB
    end

Push-Based: The central control plane authenticates against the spoke cluster’s API server.

  • Pros: No agent installation required on the spoke. Immediate actuation.
  • Cons: Requires the spoke API server to be reachable from the central hub. In bare-metal environments, spoke clusters are often deployed at the edge, behind strict NAT or firewalls without inbound ingress. Storing hundreds of high-privileged kubeconfig files centrally creates a massive security blast radius.

Pull-Based: An agent runs on the spoke cluster and connects outbound to the central hub to retrieve configurations.

  • Pros: Works seamlessly through NAT and firewalls (outbound TCP/443 only). No inbound ports required on the spoke. Compromising a spoke only exposes that specific cluster.
  • Cons: Requires managing the lifecycle of the agent itself. Slightly higher latency due to polling/watch mechanisms.
| Tool | Architecture | Primary Abstraction | Best Use Case |
| --- | --- | --- | --- |
| ArgoCD ApplicationSets | Push (usually) | ApplicationSet, ClusterGenerator | 10-50 clusters on flat networks. Developer-centric GitOps. |
| Rancher Fleet | Pull | GitRepo, Bundle, ClusterGroup | Edge deployments (1000+ clusters). Tight integration with Rancher. |
| Open Cluster Management | Pull | ManagedCluster, ManifestWork, Placement | Bare metal, complex dynamic targeting, policy orchestration. |
| Karmada | Push/Pull | PropagationPolicy, ResourceBinding | Multi-cloud federation, stretching a single deployment across clusters. |

Open Cluster Management (a CNCF Incubating project) is the de facto standard for bare-metal fleet management where high customizability is required. It uses a strict hub-and-spoke, pull-based architecture.

  1. Hub Cluster: Runs the OCM control plane (Registration, Work, and Placement controllers). It does not run workloads.
  2. Klusterlet: The agent running on the Managed (Spoke) Cluster. It initiates the connection to the Hub, creates a Certificate Signing Request (CSR), and pulls ManifestWork.
  3. ManagedCluster: A cluster-scoped CRD on the Hub representing a registered spoke.
  4. ManifestWork: A CRD placed in a dedicated namespace on the Hub (e.g., cluster1-ns). The Klusterlet watches this specific namespace, pulls the manifests, and applies them locally.
  5. Placement: Defines rules for selecting ManagedClusters based on labels, claims (e.g., Kubernetes version, hardware type), or scores.
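As a concrete sketch, a Placement that selects production GPU clusters might look like the following; the `hardware` and `gpu` labels are illustrative assumptions applied by a fleet administrator, not OCM built-ins:

```yaml
apiVersion: cluster.open-cluster-management.io/v1beta1
kind: Placement
metadata:
  name: gpu-bare-metal
  namespace: team-a          # workloads bound to this Placement are scheduled from here
spec:
  numberOfClusters: 3        # optional cap on how many matching clusters are selected
  predicates:
    - requiredClusterSelector:
        labelSelector:
          matchLabels:
            hardware: bare-metal
            gpu: "true"
```

Note that a Placement only considers clusters in ManagedClusterSets that have been bound into its namespace via a ManagedClusterSetBinding.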

Bootstrapping trust on bare metal without a cloud provider IAM requires a cryptographic handshake.

  1. The Hub admin generates a bootstrap token.
  2. The Klusterlet starts on the spoke using the bootstrap token. It connects to the Hub and submits a CSR.
  3. The Klusterlet creates a ManagedCluster request on the Hub.
  4. The Hub admin (or an automated operator) approves the CSR and sets hubAcceptsClient: true on the ManagedCluster.
  5. The Hub generates a unique client certificate for the Klusterlet. The Klusterlet drops the bootstrap token and authenticates using mTLS going forward.
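The acceptance in step 4 comes down to a single field on the ManagedCluster resource held on the Hub; a minimal sketch of the accepted state:

```yaml
apiVersion: cluster.open-cluster-management.io/v1
kind: ManagedCluster
metadata:
  name: spoke1
spec:
  hubAcceptsClient: true   # flipped by the Hub admin to authorize the spoke
```

Running `clusteradm accept --clusters spoke1` performs both halves of step 4 at once: it approves the pending CSR and sets this field.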

While OCM handles the “how to deliver”, ArgoCD is often still used for “what to deliver”. ApplicationSets allow you to template ArgoCD Applications across multiple clusters.

When operating at fleet scale (100+ clusters), rely on the Matrix Generator combined with a Git directory structure, rather than configuring each cluster explicitly.

apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: fleet-prometheus
  namespace: argocd
spec:
  goTemplate: true
  goTemplateOptions: ["missingkey=error"]
  generators:
    - matrix:
        generators:
          - git:
              repoURL: https://github.com/org/fleet-config.git
              revision: HEAD
              directories:
                - path: workloads/prometheus/*
          - clusters:
              selector:
                matchLabels:
                  environment: production
                  hardware: bare-metal
  template:
    metadata:
      name: '{{.path.basename}}-prometheus'
    spec:
      project: default
      source:
        repoURL: https://github.com/org/fleet-config.git
        targetRevision: HEAD
        path: '{{.path.path}}'
        helm:
          valueFiles:
            - values.yaml
            - 'values-{{.name}}.yaml'
      destination:
        server: '{{.server}}'
        namespace: monitoring

Production Gotcha: By default, ArgoCD connects to remote clusters via push. If you integrate ArgoCD with OCM, you can use the OCM Pull Model. OCM provides an argocd-pull-integration controller that translates ArgoCD Applications on the Hub into ManifestWorks that the Klusterlet pulls, giving you GitOps UX with a Pull-based network topology.

Fleet management is not just about workloads; it is also about governance. Distributing Kyverno or OPA Gatekeeper policies across 500 bare-metal clusters requires strict consistency.

Use OCM’s Policy framework (governance-policy-propagator) to distribute policy resources across the fleet.

  1. PlacementBinding: Binds a Policy to a Placement rule.
  2. ConfigurationPolicy: Instructs the Klusterlet to enforce the existence of specific Kubernetes resources (e.g., a Kyverno ClusterPolicy).
  3. Status Sync: The Klusterlet reports the compliance status of the policy back to the Hub.
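A minimal sketch of how these pieces wire together, assuming a Policy named `require-kyverno` and a Placement named `prod-placement` (both names are illustrative):

```yaml
apiVersion: policy.open-cluster-management.io/v1
kind: PlacementBinding
metadata:
  name: bind-require-kyverno
  namespace: policies        # the Policy and Placement live in this same namespace
placementRef:
  name: prod-placement
  apiGroup: cluster.open-cluster-management.io
  kind: Placement
subjects:
  - name: require-kyverno
    apiGroup: policy.open-cluster-management.io
    kind: Policy
```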

For audit logs, do not rely on querying the hub. Configure fluent-bit/Vector via a fleet-wide DaemonSet to ship standard API audit logs directly from the spoke control plane nodes to a centralized SIEM (e.g., OpenSearch or Splunk) independent of the fleet management control channel.
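As a sketch, the Fluent Bit side of that pipeline might look like the fragment below; the audit log path must match the `--audit-log-path` flag on your kube-apiserver, and the OpenSearch endpoint is an illustrative assumption:

```ini
[INPUT]
    # tail the API server audit log on the control plane node
    Name    tail
    Path    /var/log/kubernetes/audit.log
    Parser  json
    Tag     audit.*

[OUTPUT]
    # ship directly to the SIEM, bypassing the fleet control channel
    Name    opensearch
    Match   audit.*
    Host    siem.example.internal
    Port    9200
    Index   k8s-audit
    tls     On
```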

In this lab, you will deploy an Open Cluster Management (OCM) hub, register a spoke cluster, and deploy a workload dynamically using a ManifestWork.

  • kind (Kubernetes in Docker)
  • kubectl
  • clusteradm (OCM CLI, install via curl -L https://raw.githubusercontent.com/open-cluster-management-io/clusteradm/main/install.sh | bash)

Create two independent Kubernetes clusters to simulate a bare-metal Hub and Spoke.

kind create cluster --name hub
kind create cluster --name spoke1

Switch your context to the hub cluster and initialize the OCM control plane.

kubectl config use-context kind-hub
clusteradm init --wait

Verification:

kubectl get po -n open-cluster-management

Expected Output: You should see cluster-manager pods in Running state.

Extract the bootstrap token command from the Hub.

clusteradm get token

Copy the output command. It will look similar to: clusteradm join --hub-token <token> --hub-apiserver <https://ip:port> --cluster-name <cluster_name>

Because we are running in kind, the spoke needs the internal Docker IP of the hub API server. Find it:

docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' hub-control-plane

(Assume it is 172.18.0.2 for this example)

Switch to the spoke cluster and run the join command, modifying the IP and naming the cluster spoke1.

kubectl config use-context kind-spoke1
clusteradm join \
--hub-token <YOUR_TOKEN> \
--hub-apiserver https://172.18.0.2:6443 \
--cluster-name spoke1 \
--force-internal-endpoint-lookup \
--wait

The Spoke has initiated a CSR and created a ManagedCluster record. You must approve it on the Hub.

kubectl config use-context kind-hub
clusteradm accept --clusters spoke1

Verification:

kubectl get managedclusters

Expected Output: spoke1 should show JOINED=True, AVAILABLE=True.

Step 6: Deploy a Workload via ManifestWork

Create a ManifestWork on the Hub to deploy Nginx to the spoke. Note that the namespace on the Hub must match the name of the managed cluster (spoke1).

cat <<EOF | kubectl apply -f -
apiVersion: work.open-cluster-management.io/v1
kind: ManifestWork
metadata:
  name: deploy-nginx
  namespace: spoke1
spec:
  workload:
    manifests:
      - apiVersion: apps/v1
        kind: Deployment
        metadata:
          name: nginx-fleet
          namespace: default
        spec:
          replicas: 2
          selector:
            matchLabels:
              app: nginx
          template:
            metadata:
              labels:
                app: nginx
            spec:
              containers:
                - name: nginx
                  image: nginx:1.25
EOF

Switch back to the spoke and verify the Klusterlet pulled and applied the manifest.

kubectl config use-context kind-spoke1
kubectl get deployments -n default

Expected Output: nginx-fleet is deployed and running 2 replicas.

  • Spoke stays in AVAILABLE=Unknown: The Klusterlet cannot reach the Hub API server. Verify the Docker networking and the --hub-apiserver IP provided during the join command. Check logs with kubectl logs -n open-cluster-management-agent -l app=klusterlet --context kind-spoke1.
  • ManifestWork is not applying: Check the Status field of the ManifestWork on the hub: kubectl get manifestwork deploy-nginx -n spoke1 -o yaml. OCM reports the exact remote API server error back to the hub.
Beyond the lab, several pitfalls recur at production fleet scale:

  1. Etcd Limit Exhaustion on ManifestWorks: A ManifestWork encapsulates Kubernetes manifests inside a CRD. Etcd has a hard limit of 1.5MB per object. If you attempt to distribute a massive CRD installation (like Prometheus Operator bundled with all PrometheusRules) in a single ManifestWork, the Hub API server will reject it. Fix: Split large deployments into multiple ManifestWorks or use the fleet manager to deploy a lightweight Helm/Argo Application that pulls the heavy manifests directly from Git on the spoke.
  2. Agent Disconnection and Orphaned Resources: If a Spoke cluster goes offline or the Klusterlet crashes, the Hub marks the cluster as unavailable. If an administrator deletes the ManagedCluster from the Hub while the spoke is disconnected, the Klusterlet (upon reconnecting) will garbage collect (delete) all workloads it previously deployed. Fix: Explicitly configure deleteOption: Orphan on critical ManifestWorks if you want workloads to survive fleet unregistration.
  3. Hub API Server QPS Overload: In a pull architecture, thousands of Klusterlets constantly watch the Hub for ManifestWork changes. After a Hub control plane restart, all agents reconnect simultaneously, causing a thundering herd that can crash the Hub API server. Fix: Heavily tune the --max-requests-inflight and --max-mutating-requests-inflight on the Hub API server, and ensure Klusterlets are configured with jittered sync intervals.
  4. CRD Version Skew: Distributing a ManifestWork containing v1beta1 Ingress to a fleet spanning Kubernetes 1.21 and 1.25 will fail on the newer clusters. Fix: Use Placement rules to segment clusters by Kubernetes version (kube-version claim) and maintain version-specific ManifestWork templates.
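The orphaning behavior from pitfall 2 is a spec field on the ManifestWork itself; a minimal sketch:

```yaml
apiVersion: work.open-cluster-management.io/v1
kind: ManifestWork
metadata:
  name: critical-workload
  namespace: spoke1
spec:
  deleteOption:
    propagationPolicy: Orphan   # applied resources survive deletion of this ManifestWork
  workload:
    manifests: []               # workload manifests omitted for brevity
```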

1. You are managing 300 bare-metal Kubernetes clusters deployed in retail store backrooms. The stores sit behind strict corporate firewalls that deny all inbound connections. Which fleet architecture must you adopt?

  • A) Push-based, utilizing ArgoCD centralized controllers to connect to store API servers via NodePorts.
  • B) Push-based, utilizing Karmada PropagationPolicies to inject manifests via SSH tunnels.
  • C) Pull-based, utilizing Open Cluster Management or Rancher Fleet agents that poll the central hub.
  • D) Pull-based, configuring Cluster API (CAPI) on the hub to sync workloads to the workload clusters.
  • Correct Answer: C (Pull-based agents only require outbound connections, bypassing inbound firewall restrictions. D is incorrect as CAPI manages infrastructure, not workload syncing.)

2. An application team needs to deploy an application only to bare-metal clusters that possess GPU hardware and are located in the EU region. Using Open Cluster Management, how is this dynamically achieved?

  • A) Create a ManifestWork containing a NodeSelector for gpu=true and region=eu.
  • B) Create a Placement rule on the Hub that selects ManagedClusters matching the required cluster claims/labels.
  • C) Create a separate ArgoCD Application manually for every EU cluster and hardcode the GPU requirements.
  • D) Configure the Klusterlet on EU clusters to ignore ManifestWorks that do not contain GPU workloads.
  • Correct Answer: B (Placements dynamically match the ManagedCluster labels/claims to bind workloads to the correct subset of the fleet.)

3. You have deployed a ManifestWork containing 50 ConfigMaps and a massive set of custom Prometheus Rules. The kubectl apply command fails against the Hub cluster with an API server error, though the YAML is syntactically valid. What is the most likely cause?

  • A) The Spoke cluster lacks the memory to run the ConfigMaps.
  • B) The ManifestWork exceeded the Hub’s etcd object size limit (typically 1.5MB).
  • C) The Klusterlet agent timed out while parsing the Prometheus Rules.
  • D) The Hub cluster requires a StorageClass to store ManifestWorks.
  • Correct Answer: B (ManifestWorks encapsulate resources as JSON inside the CRD structure, easily hitting etcd object size limits for massive payloads.)

4. The hubAcceptsClient field on an OCM ManagedCluster resource is currently set to false. What state is the cluster registration in?

  • A) The Spoke cluster has successfully joined but has no active ManifestWorks.
  • B) The Klusterlet agent has crashed on the Spoke cluster.
  • C) The Spoke has submitted a Certificate Signing Request (CSR), but the Hub administrator has not yet authorized the spoke to join.
  • D) The Hub API server is currently unreachable due to network partition.
  • Correct Answer: C (Setting hubAcceptsClient: true is the explicit authorization step that allows the Hub to issue the client certificate via CSR approval.)

5. A centralized ArgoCD instance is configured with a ClusterGenerator ApplicationSet pushing to 200 bare-metal clusters. The ArgoCD application-controller begins OOMKilling and logs show connection timeouts to remote clusters. What is the architectural root cause?

  • A) ArgoCD requires an Enterprise license to manage more than 100 clusters.
  • B) The push-based model forces the controller to maintain active client connections and RBAC watches across 200 remote API servers, exhausting resources.
  • C) The remote bare-metal clusters are returning incompatible Kubernetes API versions.
  • D) The ClusterGenerator is deprecated in favor of the ListGenerator.
  • Correct Answer: B (The N×M connection matrix in a push model overwhelms the centralized controller’s memory and connection limits at scale.)