Contract rules for InfraCluster

Infrastructure providers SHOULD implement an InfraCluster resource.

The goal of an InfraCluster resource is to supply whatever prerequisites (in terms of infrastructure) are necessary for running machines. Examples might include networking, load balancers, firewall rules, and so on.

The InfraCluster resource will be referenced by one of the Cluster API core resources, Cluster.

The Cluster's controller is responsible for coordinating operations of the InfraCluster, and the interaction between the Cluster's controller and the InfraCluster resource is based on the contract rules defined in this page.

Once contract rules are satisfied by an InfraCluster implementation, other implementation details could be addressed according to the specific needs (Cluster API is not prescriptive).

Nevertheless, it is always recommended to take a look at Cluster API controllers, in-tree providers, other providers and use them as a reference implementation (unless custom solutions are required in order to address very specific needs).

In order to facilitate the initial design for each InfraCluster resource, a few implementation best practices and infrastructure Provider Security Guidance are explicitly called out in dedicated pages.

Never rely on Cluster API behaviours not defined as a contract rule!

When developing a provider, you MUST consider any Cluster API behaviour that is not defined by a contract rule as a Cluster API internal implementation detail, and internal implementation details can change at any time.

Accordingly, in order to not expose users to the risk that your provider breaks when the Cluster API internal behavior changes, you MUST NOT rely on any Cluster API internal behaviour when implementing an InfraCluster resource.

Instead, whenever you need something more from the Cluster API contract, you MUST engage the community.

The Cluster API maintainers welcome feedback and contributions to the contract in order to improve how it's defined, its clarity and visibility to provider implementers and its suitability across the different kinds of Cluster API providers.

To provide feedback or open a discussion about the provider contract please open an issue on the Cluster API repo or add an item to the agenda in the Cluster API community meeting.

Rules (contract version v1beta2)

| Rule | Mandatory | Note |
|------|-----------|------|
| All resources: scope | Yes | |
| All resources: TypeMeta and ObjectMeta field | Yes | |
| All resources: APIVersion field value | Yes | |
| InfraCluster, InfraClusterList resource definition | Yes | |
| InfraCluster: control plane endpoint | No | Mandatory if control plane endpoint is not provided by other means. |
| InfraCluster: failure domains | No | |
| InfraCluster: initialization completed | Yes | |
| InfraCluster: conditions | No | |
| InfraCluster: terminal failures | No | |
| InfraClusterTemplate, InfraClusterTemplateList resource definition | No | Mandatory for ClusterClasses support |
| Externally managed infrastructure | No | |
| Multi tenancy | No | Mandatory for clusterctl CLI support |
| Clusterctl support | No | Mandatory for clusterctl CLI support |
| InfraCluster: pausing | No | |

Note:

  • All resources refers to all the provider's resources that "core" Cluster API interacts with; in the context of this page: InfraCluster, InfraClusterTemplate and the corresponding list types

All resources: scope

All resources MUST be namespace-scoped.

All resources: TypeMeta and ObjectMeta field

All resources MUST have the standard Kubernetes TypeMeta and ObjectMeta fields.

All resources: APIVersion field value

In Kubernetes, APIVersion is a combination of API group and version. Special consideration MUST apply to both API group and version for all the resources Cluster API interacts with.

All resources: API group

The domain for Cluster API resources is cluster.x-k8s.io, and infrastructure providers under the Kubernetes SIGS org generally use infrastructure.cluster.x-k8s.io as API group.

If your provider uses a different API group, you MUST grant full read/write RBAC permissions for resources in your API group to the Cluster API core controllers. The canonical way to do so is via a ClusterRole resource with the aggregation label cluster.x-k8s.io/aggregate-to-manager: "true".

The following is an example ClusterRole for a FooCluster resource in the infrastructure.foo.com API group:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
    name: capi-foo-clusters
    labels:
      cluster.x-k8s.io/aggregate-to-manager: "true"
rules:
- apiGroups:
    - infrastructure.foo.com
  resources:
    - fooclusters
  verbs:
    - create
    - delete
    - get
    - list
    - patch
    - update
    - watch
- apiGroups:
    - infrastructure.foo.com
  resources:
    - fooclustertemplates
  verbs:
    - get
    - list
    - patch
    - update
    - watch

Note: The write permissions allow the Cluster controller to set owner references and labels on the InfraCluster resources; write permissions are not used for general mutations of InfraCluster resources, unless specifically required (e.g. when using ClusterClass and managed topologies).

All resources: version

The resource Version defines the stability of the API and its backward compatibility guarantees. Examples include v1alpha1, v1beta1, v1, etc. and are governed by the Kubernetes API Deprecation Policy.

Your provider SHOULD abide by the same policies.

Note: The version of your provider does not need to be in sync with the version of core Cluster API resources. Instead, prefer choosing a version that matches the stability of the provider API and its backward compatibility guarantees.

Additionally:

Providers MUST set the cluster.x-k8s.io/<version> label on the InfraCluster Custom Resource Definitions.

The label is a map from a Cluster API contract version to your Custom Resource Definition versions. The value is an underscore-delimited (_) list of versions. Each value MUST point to an available version in your CRD Spec.

The label allows Cluster API controllers to perform automatic conversions for object references; the controllers will pick the last available version in the list if multiple versions are found.

To apply the label to CRDs it’s possible to use labels in your kustomization.yaml file, usually in config/crd:

labels:
- pairs:
    cluster.x-k8s.io/v1alpha2: v1alpha1
    cluster.x-k8s.io/v1alpha3: v1alpha2
    cluster.x-k8s.io/v1beta1: v1beta1

An example of this is in the Kubeadm Bootstrap provider.
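The "last available version in the list" rule described above can be illustrated with a minimal Go sketch. The function name latestContractVersion and the sample label value are hypothetical and only serve the example; this is not the code Cluster API itself uses.

```go
package main

import (
	"fmt"
	"strings"
)

// latestContractVersion returns the last entry of an underscore-delimited
// label value such as "v1alpha4_v1beta1", or "" when the value is empty.
// Hypothetical helper for illustration only.
func latestContractVersion(labelValue string) string {
	if labelValue == "" {
		return ""
	}
	versions := strings.Split(labelValue, "_")
	return versions[len(versions)-1]
}

func main() {
	// Hypothetical labels as they might appear on a FooCluster CRD.
	crdLabels := map[string]string{
		"cluster.x-k8s.io/v1beta1": "v1alpha4_v1beta1",
	}
	fmt.Println(latestContractVersion(crdLabels["cluster.x-k8s.io/v1beta1"]))
}
```

With the sample value above, the sketch selects v1beta1, the last entry in the list.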

InfraCluster, InfraClusterList resource definition

You MUST define an InfraCluster resource. The InfraCluster resource name must have the format produced by sigs.k8s.io/cluster-api/util/contract.CalculateCRDName(Group, Kind).

Note: Cluster API is using such a naming convention to avoid an expensive CRD lookup operation when looking for labels from the CRD definition of the InfraCluster resource.
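To give a feel for the expected name, the following is a rough Go approximation, assuming the result follows the usual Kubernetes CRD naming of lowercase plural kind, a dot, then the API group. The helper crdName is hypothetical and pluralizes naively by appending "s"; the real contract.CalculateCRDName remains the source of truth and may pluralize irregular kinds differently.

```go
package main

import (
	"fmt"
	"strings"
)

// crdName approximates the "<plural>.<group>" CRD naming format.
// Assumption: the kind is pluralized by simply appending "s", which is
// only correct for regular kinds such as FooCluster.
func crdName(group, kind string) string {
	return strings.ToLower(kind) + "s." + group
}

func main() {
	fmt.Println(crdName("infrastructure.foo.com", "FooCluster"))
}
```

For the FooCluster example used on this page, the sketch yields fooclusters.infrastructure.foo.com.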

It is a generally applied convention to use names in the format ${env}Cluster, where ${env} is a (possibly short) name for the environment in question. For example, GCPCluster is an implementation for the Google Cloud Platform, and AWSCluster is one for Amazon Web Services.

// +kubebuilder:object:root=true
// +kubebuilder:resource:path=fooclusters,shortName=foocl,scope=Namespaced,categories=cluster-api
// +kubebuilder:storageversion
// +kubebuilder:subresource:status

// FooCluster is the Schema for fooclusters.
type FooCluster struct {
    metav1.TypeMeta   `json:",inline"`
    metav1.ObjectMeta `json:"metadata,omitempty"`
    Spec   FooClusterSpec   `json:"spec,omitempty"`
    Status FooClusterStatus `json:"status,omitempty"`
}

type FooClusterSpec struct {
    // See other rules for more details about mandatory/optional fields in InfraCluster spec.
    // Other fields SHOULD be added based on the needs of your provider.
}

type FooClusterStatus struct {
    // See other rules for more details about mandatory/optional fields in InfraCluster status.
    // Other fields SHOULD be added based on the needs of your provider.
}

For each InfraCluster resource, you MUST also add the corresponding list resource. The list resource MUST be named as <InfraCluster>List.

// +kubebuilder:object:root=true

// FooClusterList contains a list of fooclusters.
type FooClusterList struct {
    metav1.TypeMeta `json:",inline"`
    metav1.ListMeta `json:"metadata,omitempty"`
    Items           []FooCluster `json:"items"`
}

InfraCluster: control plane endpoint

Each Cluster needs a control plane endpoint to sit in front of control plane machines. Control plane endpoint can be provided in three ways in Cluster API: by the users, by the control plane provider or by the infrastructure provider.

If you are developing an infrastructure provider which is responsible for providing a control plane endpoint for each Cluster, the host and port of the generated control plane endpoint MUST surface on spec.controlPlaneEndpoint in the InfraCluster resource.

type FooClusterSpec struct {
    // controlPlaneEndpoint represents the endpoint used to communicate with the control plane.
    // +optional
    ControlPlaneEndpoint APIEndpoint `json:"controlPlaneEndpoint"`
    
    // See other rules for more details about mandatory/optional fields in InfraCluster spec.
    // Other fields SHOULD be added based on the needs of your provider.
}

// APIEndpoint represents a reachable Kubernetes API endpoint.
type APIEndpoint struct {
    // host is the hostname on which the API server is serving.
    Host string `json:"host"`
    
    // port is the port on which the API server is serving.
    Port int32 `json:"port"`
}

Once spec.controlPlaneEndpoint is set on the InfraCluster resource and InfraCluster initialization is completed, the Cluster controller will surface this info in Cluster's spec.controlPlaneEndpoint.

If instead you are developing an infrastructure provider which is NOT responsible for providing a control plane endpoint, the InfraCluster controller should exit reconciliation until it sees Cluster's spec.controlPlaneEndpoint populated.
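The "wait until the endpoint is populated" decision can be sketched as follows. This is a minimal illustration using a local copy of the APIEndpoint type; the helper names isSet and shouldWaitForEndpoint are hypothetical and not part of the contract.

```go
package main

import "fmt"

// APIEndpoint mirrors the type shown above (JSON tags omitted).
type APIEndpoint struct {
	Host string
	Port int32
}

// isSet reports whether both host and port have been populated.
func (e APIEndpoint) isSet() bool {
	return e.Host != "" && e.Port != 0
}

// shouldWaitForEndpoint returns true when the provider does not supply the
// control plane endpoint itself and the Cluster's endpoint is still empty,
// i.e. when reconciliation should exit and be retried later.
func shouldWaitForEndpoint(providerSuppliesEndpoint bool, clusterEndpoint APIEndpoint) bool {
	return !providerSuppliesEndpoint && !clusterEndpoint.isSet()
}

func main() {
	empty := APIEndpoint{}
	set := APIEndpoint{Host: "cp.example.com", Port: 6443}
	fmt.Println(shouldWaitForEndpoint(false, empty)) // endpoint not yet provided by others
	fmt.Println(shouldWaitForEndpoint(false, set))   // endpoint populated, continue
}
```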

InfraCluster: failure domains

If you are developing an infrastructure provider which has a notion of failure domains in which machines should be placed, the list of available failure domains MUST surface on status.failureDomains in the InfraCluster resource.

type FooClusterStatus struct {
    // failureDomains is a list of failure domain objects synced from the infrastructure provider.
    FailureDomains clusterv1.FailureDomains `json:"failureDomains,omitempty"`
    
    // See other rules for more details about mandatory/optional fields in InfraCluster status.
    // Other fields SHOULD be added based on the needs of your provider.
}

clusterv1.FailureDomains is a map, defined as map[string]FailureDomainSpec. A unique key must be used for each FailureDomainSpec. FailureDomainSpec is defined as:

  • controlPlane bool: indicates if failure domain is appropriate for running control plane instances.
  • attributes map[string]string: arbitrary attributes for users to apply to a failure domain.

Once status.failureDomains is set on the InfraCluster resource and InfraCluster initialization is completed, the Cluster controller will surface this info in Cluster's status.failureDomains.

InfraCluster: initialization completed

Each InfraCluster MUST report when the Cluster's infrastructure is fully provisioned (initialization) by setting status.initialization.provisioned in the InfraCluster resource.

type FooClusterStatus struct {
    // initialization provides observations of the FooCluster initialization process.
    // NOTE: Fields in this struct are part of the Cluster API contract and are used to orchestrate initial Cluster provisioning.
    // +optional
    Initialization *FooClusterInitializationStatus `json:"initialization,omitempty"`
    
    // See other rules for more details about mandatory/optional fields in InfraCluster status.
    // Other fields SHOULD be added based on the needs of your provider.
}

// FooClusterInitializationStatus provides observations of the FooCluster initialization process.
type FooClusterInitializationStatus struct {
	// provisioned is true when the infrastructure provider reports that the Cluster's infrastructure is fully provisioned.
	// NOTE: this field is part of the Cluster API contract, and it is used to orchestrate initial Cluster provisioning.
	// +optional
	Provisioned bool `json:"provisioned,omitempty"`
}

Once status.initialization.provisioned is set, the Cluster "core" controller will bubble up this info in Cluster's status.initialization.infrastructureProvisioned; if defined, InfraCluster's spec.controlPlaneEndpoint and status.failureDomains will also be surfaced on Cluster's corresponding fields at the same time.

Compatibility with the deprecated v1beta1 contract

In order to ease the transition for providers, the v1beta2 version of the Cluster API contract temporarily preserves compatibility with the deprecated v1beta1 contract; compatibility will be removed tentatively in August 2026.

With regard to initialization completed:

Cluster API will continue to temporarily support InfraCluster resource using status.ready field to report initialization completed.

After compatibility with the deprecated v1beta1 contract is removed, the status.ready field in the InfraCluster resource will be ignored.
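For tooling that needs to read both the new and the deprecated field during the transition, one possible approach is sketched below. The types are simplified local stand-ins and the helper infrastructureProvisioned is hypothetical; it is not a description of how Cluster API itself selects the field.

```go
package main

import "fmt"

// InitializationStatus is a simplified stand-in for status.initialization.
type InitializationStatus struct {
	Provisioned bool
}

// InfraClusterStatus carries both the current and the deprecated field.
type InfraClusterStatus struct {
	Initialization *InitializationStatus // status.initialization
	Ready          bool                  // status.ready (deprecated v1beta1 contract)
}

// infrastructureProvisioned prefers status.initialization.provisioned and
// falls back to the deprecated status.ready when it is not reported as true.
func infrastructureProvisioned(s InfraClusterStatus) bool {
	if s.Initialization != nil && s.Initialization.Provisioned {
		return true
	}
	return s.Ready
}

func main() {
	current := InfraClusterStatus{Initialization: &InitializationStatus{Provisioned: true}}
	legacy := InfraClusterStatus{Ready: true}
	pending := InfraClusterStatus{}
	fmt.Println(infrastructureProvisioned(current), infrastructureProvisioned(legacy), infrastructureProvisioned(pending))
}
```

Once compatibility is removed, the fallback branch in such a helper should be dropped as well.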

InfraCluster: conditions

According to Kubernetes API Conventions, Conditions provide a standard mechanism for higher-level status reporting from a controller.

Provider implementers SHOULD implement status.conditions for their InfraCluster resource. If conditions are implemented on an InfraCluster resource, Cluster API will only consider conditions providing the following information:

  • type (required)
  • status (required, one of True, False, Unknown)
  • reason (optional, if omitted a default one will be used)
  • message (optional, if omitted an empty message will be used)
  • lastTransitionTime (optional, if omitted time.Now will be used)
  • observedGeneration (optional, if omitted the generation of the InfraCluster resource will be used)

Other fields will be ignored.

If a condition with type Ready exists, this condition will be mirrored in Cluster's InfrastructureReady condition.

Please note that the Ready condition is expected to surface the status of the InfraCluster during its own entire lifecycle, including initial provisioning, the final deletion process, and the period in between these two moments.

See Improving status in CAPI resources for more context.

Compatibility with the deprecated v1beta1 contract

In order to ease the transition for providers, the v1beta2 version of the Cluster API contract temporarily preserves compatibility with the deprecated v1beta1 contract; compatibility will be removed tentatively in August 2026.

With regard to conditions:

Cluster API will continue to read conditions from providers using deprecated Cluster API condition types.

Please note that providers that continue to use deprecated Cluster API condition types MUST carefully take into account the implications of this choice, which are described both in the Cluster API v1.11 migration notes and in the Improving status in CAPI resources proposal.

InfraCluster: terminal failures

Starting from the v1beta2 contract version, there is no more special treatment for provider's terminal failures within Cluster API.

If necessary, "terminal failures" should be surfaced using conditions, with a well documented type/reason; it is up to consumers to treat them accordingly.

See Improving status in CAPI resources for more context.

Compatibility with the deprecated v1beta1 contract

In order to ease the transition for providers, the v1beta2 version of the Cluster API contract temporarily preserves compatibility with the deprecated v1beta1 contract; compatibility will be removed tentatively in August 2026.

With regard to terminal failures:

If an infrastructure provider reports that an InfraCluster resource is in a state that cannot be recovered (terminal failure) by setting status.failureReason and status.failureMessage as defined by the deprecated v1beta1 contract, the "core" Cluster controller will surface this information in the corresponding fields in the Cluster's status.deprecated.v1beta1 struct.

However, unlike before, this information no longer has any impact on the Cluster lifecycle.

After compatibility with the deprecated v1beta1 contract is removed, the status.failureReason and status.failureMessage fields in the InfraCluster resource will be ignored and Cluster's status.deprecated.v1beta1 struct will be dropped.

InfraClusterTemplate, InfraClusterTemplateList resource definition

For a given InfraCluster resource, you should also add a corresponding InfraClusterTemplate resource in order to use it in ClusterClasses. The template resource MUST be named as <InfraCluster>Template.

// +kubebuilder:object:root=true
// +kubebuilder:resource:path=fooclustertemplates,scope=Namespaced,categories=cluster-api
// +kubebuilder:storageversion

// FooClusterTemplate is the Schema for the fooclustertemplates API.
type FooClusterTemplate struct {
    metav1.TypeMeta   `json:",inline"`
    metav1.ObjectMeta `json:"metadata,omitempty"`

    Spec FooClusterTemplateSpec `json:"spec,omitempty"`
}

type FooClusterTemplateSpec struct {
    Template FooClusterTemplateResource `json:"template"`
}

type FooClusterTemplateResource struct {
    // Standard object's metadata.
    // More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#metadata
    // +optional
    ObjectMeta clusterv1.ObjectMeta `json:"metadata,omitempty"`
    Spec FooClusterSpec `json:"spec"`
}

NOTE: in this example InfraClusterTemplate's spec.template.spec embeds FooClusterSpec from InfraCluster. This might not always be the best choice, depending on whether and how InfraCluster's spec fields apply to many clusters vs only one.

For each InfraClusterTemplate resource, you MUST also add the corresponding list resource. The list resource MUST be named as <InfraClusterTemplate>List.

// +kubebuilder:object:root=true

// FooClusterTemplateList contains a list of FooClusterTemplates.
type FooClusterTemplateList struct {
    metav1.TypeMeta `json:",inline"`
    metav1.ListMeta `json:"metadata,omitempty"`
    Items           []FooClusterTemplate `json:"items"`
}

Externally managed infrastructure

In some cases, users might be required (or might choose) to manage infrastructure out of band and run CAPI on top of already existing infrastructure.

In order to support this use case, the InfraCluster controller SHOULD skip reconciliation of InfraCluster resources with the cluster.x-k8s.io/managed-by: "<name-of-system>" annotation, and not update the resource or its status in any way.

Please note that when the cluster infrastructure is externally managed, it is the responsibility of the external management system to abide by the following contract rules:

  • InfraCluster: control plane endpoint
  • InfraCluster: failure domains
  • InfraCluster: initialization completed
  • InfraCluster: terminal failures

See the externally managed infrastructure proposal for more detail about this use case.

Multi tenancy

Multi tenancy in Cluster API defines the capability of an infrastructure provider to manage different credentials, each one of them corresponding to an infrastructure tenant.

See infrastructure Provider Security Guidance for considerations about cloud provider credential management.

Please also note that Cluster API does not support running multiple instances of the same provider, which some might assume to be an alternative way to implement multi tenancy; the same applies to the clusterctl CLI.

See Support running multiple instances of the same provider for more context.

However, if you want to make it possible for users to run multiple instances of your provider, your controllers SHOULD:

  • support the --namespace flag.
  • support the --watch-filter flag.

Please read the page linked above carefully to fully understand the implications and risks related to this option.

Clusterctl support

The clusterctl command is designed to work with all the providers compliant with the rules defined in the clusterctl provider contract.

InfraCluster: pausing

Providers SHOULD implement the pause behaviour for every object with a reconciliation loop. This is done by checking if spec.paused is set on the Cluster object and by checking for the cluster.x-k8s.io/paused annotation on the InfraCluster object.

If implementing the pause behavior, providers SHOULD surface the paused status of an object using the Paused condition: Status.Conditions[Paused].
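The two checks described above can be combined in a small helper. The sketch below assumes the mere presence of the annotation is enough to pause, regardless of its value; the helper name isPaused and its simplified inputs are hypothetical.

```go
package main

import "fmt"

// pausedAnnotation is the annotation named in the text above.
const pausedAnnotation = "cluster.x-k8s.io/paused"

// isPaused returns true when the owning Cluster has spec.paused set or the
// InfraCluster carries the paused annotation (any value).
func isPaused(clusterSpecPaused bool, infraClusterAnnotations map[string]string) bool {
	if clusterSpecPaused {
		return true
	}
	_, annotated := infraClusterAnnotations[pausedAnnotation]
	return annotated
}

func main() {
	fmt.Println(isPaused(true, nil))
	fmt.Println(isPaused(false, map[string]string{pausedAnnotation: ""}))
	fmt.Println(isPaused(false, map[string]string{"other": "x"}))
}
```

A controller using such a helper would typically return early from reconciliation when it reports true, after updating the Paused condition.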

Typical InfraCluster reconciliation workflow

A cluster infrastructure provider must respond to changes to its InfraCluster resources. This process is typically called reconciliation. The provider must watch for new, updated, and deleted resources and respond accordingly.

As a reference you can look at the following workflow to understand how the typical reconciliation workflow is implemented in InfraCluster controllers:

Cluster infrastructure provider activity diagram

Normal resource

  1. If the resource is externally managed, exit the reconciliation
    1. The ResourceIsNotExternallyManaged predicate can be used to prevent reconciling externally managed resources
  2. If the resource does not have a Cluster owner, exit the reconciliation
    1. The Cluster API Cluster reconciler populates this based on the value in the Cluster's spec.infrastructureRef field.
  3. Add the provider-specific finalizer, if needed
  4. Reconcile provider-specific cluster infrastructure
    1. If any errors are encountered, exit the reconciliation
  5. If the provider created a load balancer for the control plane, record its hostname or IP in spec.controlPlaneEndpoint
  6. Set status.initialization.provisioned to true
  7. Set status.failureDomains based on available provider failure domains (optional)
  8. Patch the resource to persist changes

Deleted resource

  1. If the resource has a Cluster owner
    1. Perform deletion of provider-specific cluster infrastructure
    2. If any errors are encountered, exit the reconciliation
  2. Remove the provider-specific finalizer from the resource
  3. Patch the resource to persist changes