feat: add cluster-autoscaler CRS addon #423
Would love some feedback on this before going much further 🙏
More discussion on cluster-autoscaler and ClusterClass in kubernetes-sigs/cluster-api#8217
Interesting! We solved this in DKP by having the CLI client update both the annotations AND the MachineDeployment's replicas with an imperative command. But I think the Not a huge fan that the internal annotation details are exposed to the user, but maybe that's ok for the initial implementation.
Nice work!
CRS is not applied from bootstrap to workload:
Need to specify the namespace for CA and create all resources in there.
With this approach, the CA will always be deployed on the CP nodes: workers are initially set to 0, the CA is deployed, and only then are the worker nodes scaled up.
To change the number of replicas, a user can set the min annotation to the desired number of replicas and cluster-autoscaler will scale up the MachineDeployment.
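As a minimal sketch of the annotation-based approach described here: the cluster-autoscaler Cluster API provider reads its node group bounds from two annotations on the MachineDeployment. The helper name `withAutoscalerBounds` and the plain `map[string]string` standing in for the object's annotations are assumptions made for illustration, not code from this PR.

```go
package main

import (
	"fmt"
	"strconv"
)

// Annotation keys read by the cluster-autoscaler Cluster API provider.
const (
	minSizeAnnotation = "cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size"
	maxSizeAnnotation = "cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size"
)

// withAutoscalerBounds is a hypothetical helper. It returns a copy of the
// given MachineDeployment annotations with the min/max node group sizes set.
func withAutoscalerBounds(annotations map[string]string, minSize, maxSize int) (map[string]string, error) {
	if minSize < 0 || maxSize < minSize {
		return nil, fmt.Errorf("invalid bounds: min=%d max=%d", minSize, maxSize)
	}
	out := make(map[string]string, len(annotations)+2)
	for k, v := range annotations {
		out[k] = v
	}
	out[minSizeAnnotation] = strconv.Itoa(minSize)
	out[maxSizeAnnotation] = strconv.Itoa(maxSize)
	return out, nil
}

func main() {
	// Raising the min size to 3 asks the autoscaler to keep at least 3 replicas.
	anns, err := withAutoscalerBounds(map[string]string{"example.com/owner": "team-a"}, 3, 5)
	if err != nil {
		panic(err)
	}
	fmt.Println(anns[minSizeAnnotation], anns[maxSizeAnnotation])
}
```

In a real controller or CLI the resulting map would be written back to the MachineDeployment's metadata; that step is left out to keep the sketch dependency-free.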
Yep, thank you e2e tests, should be fixed in e9c7342
The CAPI webhook will set the MD replicas based on the CA annotation https://github.com/kubernetes-sigs/cluster-api/blob/main/internal/webhooks/machinedeployment.go#L289-L291 I also see this behavior running locally when setting
We may want to support clusters with 0 replicas, but I think we should do that as a separate effort since other addons would need similar tolerations.
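The replica defaulting discussed in this comment can be approximated as follows. This is a simplified sketch of the behavior, not the upstream webhook code: the function name `defaultReplicas`, its signature, and the fallback of 1 when the annotations are missing or unparsable are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strconv"
)

const (
	minSizeAnnotation = "cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size"
	maxSizeAnnotation = "cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size"
)

// defaultReplicas sketches how a defaulting webhook could pick replicas.
// desired is the value set on the incoming object (nil if unset);
// current is the value on the existing object (nil if new or unset).
func defaultReplicas(desired, current *int32, annotations map[string]string) int32 {
	// An explicitly requested replica count always wins.
	if desired != nil {
		return *desired
	}
	minSize, errMin := strconv.ParseInt(annotations[minSizeAnnotation], 10, 32)
	maxSize, errMax := strconv.ParseInt(annotations[maxSizeAnnotation], 10, 32)
	if errMin != nil || errMax != nil {
		// No usable autoscaler bounds: fall back to a single replica.
		return 1
	}
	lo, hi := int32(minSize), int32(maxSize)
	switch {
	case current == nil || *current < lo:
		return lo
	case *current > hi:
		return hi
	default:
		return *current
	}
}

func main() {
	anns := map[string]string{minSizeAnnotation: "2", maxSizeAnnotation: "5"}
	// A new MachineDeployment without replicas starts at the min size.
	fmt.Println(defaultReplicas(nil, nil, anns))
}
```

Under this sketch, a MachineDeployment created without `replicas` starts at the min annotation value, which is why changing the min annotation is enough to change the replica count.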
Oops, I didn't see the auto-merge was enabled but the e2e tests passed 🎉
🤖 I have created a release *beep* *boop*

---

## 0.6.0 (2024-03-19)

<!-- Release notes generated using configuration in .github/release.yaml at main -->

## What's Changed

### Exciting New Features 🎉

* feat: Support HelmAddon strategy to deploy NFD by @jimmidyson in https://github.com/d2iq-labs/cluster-api-runtime-extensions-nutanix/pull/390
* feat: Upgrade AWS ESB CSI and switch to using Helm chart by @jimmidyson in https://github.com/d2iq-labs/cluster-api-runtime-extensions-nutanix/pull/393
* feat: CAPA 2.4.0 APIs and e2e by @jimmidyson in https://github.com/d2iq-labs/cluster-api-runtime-extensions-nutanix/pull/415
* feat: Single defaults namespace flag by @jimmidyson in https://github.com/d2iq-labs/cluster-api-runtime-extensions-nutanix/pull/426
* feat: add cluster-autoscaler CRS addon by @dkoshkin in https://github.com/d2iq-labs/cluster-api-runtime-extensions-nutanix/pull/423
* feat: add Cluster Autoscaler Addon with HelmAddon by @dkoshkin in https://github.com/d2iq-labs/cluster-api-runtime-extensions-nutanix/pull/427
* feat: NFD v0.15.2 by @jimmidyson in https://github.com/d2iq-labs/cluster-api-runtime-extensions-nutanix/pull/442
* feat: Include CABPK APIs by @dlipovetsky in https://github.com/d2iq-labs/cluster-api-runtime-extensions-nutanix/pull/445

### Fixes 🔧

* fix: Ensure addons defaults namespaces are correctly wired up by @jimmidyson in https://github.com/d2iq-labs/cluster-api-runtime-extensions-nutanix/pull/409
* fix: Disable hubble in Cilium deployment via CRS by @jimmidyson in https://github.com/d2iq-labs/cluster-api-runtime-extensions-nutanix/pull/411
* fix: Fix Cilium helm values to use kubernetes IPAM by @jimmidyson in https://github.com/d2iq-labs/cluster-api-runtime-extensions-nutanix/pull/413
* fix: don't use an SSH key in AWS clusters by @dkoshkin in https://github.com/d2iq-labs/cluster-api-runtime-extensions-nutanix/pull/425
* fix: set default priorityClassName on Deployment by @dkoshkin in https://github.com/d2iq-labs/cluster-api-runtime-extensions-nutanix/pull/431
* fix: set default tolerations on Deployment by @dkoshkin in https://github.com/d2iq-labs/cluster-api-runtime-extensions-nutanix/pull/430
* fix: Remove vendored types for core CAPI providers (CAPD, CABPK, KCP) by @dlipovetsky in https://github.com/d2iq-labs/cluster-api-runtime-extensions-nutanix/pull/452

### Other Changes

* test: Add initial e2e tests by @jimmidyson in https://github.com/d2iq-labs/cluster-api-runtime-extensions-nutanix/pull/360
* test(e2e): Add CNI e2e tests by @jimmidyson in https://github.com/d2iq-labs/cluster-api-runtime-extensions-nutanix/pull/383
* test(e2e): Resolve latest upstream provider releases in e2e config by @jimmidyson in https://github.com/d2iq-labs/cluster-api-runtime-extensions-nutanix/pull/388
* test(e2e): Add test for NFD addon by @jimmidyson in https://github.com/d2iq-labs/cluster-api-runtime-extensions-nutanix/pull/389
* build: Ignore controller-runtime upgrades by @jimmidyson in https://github.com/d2iq-labs/cluster-api-runtime-extensions-nutanix/pull/403
* test(e2e): Use ghcr.io/mesosphere/kind-node for bootstrap by @jimmidyson in https://github.com/d2iq-labs/cluster-api-runtime-extensions-nutanix/pull/406
* build: Update AWS CPI manifest filenames by @jimmidyson in https://github.com/d2iq-labs/cluster-api-runtime-extensions-nutanix/pull/410
* revert: Temporarily disable GOPROXY to workaround dodgy CAPA release (#395) by @jimmidyson in https://github.com/d2iq-labs/cluster-api-runtime-extensions-nutanix/pull/407
* build: Ensure release namespace is use in kustomize helm inflator by @jimmidyson in https://github.com/d2iq-labs/cluster-api-runtime-extensions-nutanix/pull/412
* docs: Update menu ordering and add some icons by @jimmidyson in https://github.com/d2iq-labs/cluster-api-runtime-extensions-nutanix/pull/414
* test(e2e): Add AWS e2e tests by @jimmidyson in https://github.com/d2iq-labs/cluster-api-runtime-extensions-nutanix/pull/408
* build: clusterawsadm v2.4.0 by @jimmidyson in https://github.com/d2iq-labs/cluster-api-runtime-extensions-nutanix/pull/424
* docs: simplify running examples in README by @dkoshkin in https://github.com/d2iq-labs/cluster-api-runtime-extensions-nutanix/pull/422
* ci: Add dependabot for api module by @jimmidyson in https://github.com/d2iq-labs/cluster-api-runtime-extensions-nutanix/pull/432
* build: Fix up third-party CAPD go.mod CAPI dependency by @jimmidyson in https://github.com/d2iq-labs/cluster-api-runtime-extensions-nutanix/pull/441
* build: controller-runtime v0.17.2 by @jimmidyson in https://github.com/d2iq-labs/cluster-api-runtime-extensions-nutanix/pull/440
* ci: Fix up release workflow by specifying workflow-dispatch version by @jimmidyson in https://github.com/d2iq-labs/cluster-api-runtime-extensions-nutanix/pull/451
* docs: Update docsy module by @jimmidyson in https://github.com/d2iq-labs/cluster-api-runtime-extensions-nutanix/pull/455
* build: Rename module to d2iq-labs/cluster-api-runtime-extensions-nutanix by @jimmidyson in https://github.com/d2iq-labs/cluster-api-runtime-extensions-nutanix/pull/454
* test(e2e): Update test config with new repo name by @jimmidyson in https://github.com/d2iq-labs/cluster-api-runtime-extensions-nutanix/pull/457
* build: Reorg example kustomizations by @jimmidyson in https://github.com/d2iq-labs/cluster-api-runtime-extensions-nutanix/pull/453

**Full Changelog**: d2iq-labs/cluster-api-runtime-extensions-nutanix@v0.5.0...v0.6.0

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Adding the cluster-autoscaler addon, starting with ClusterResourceSet.
The cluster-autoscaler addon differs from the other addons because it needs access to both the CAPI objects and the workload Cluster. It must therefore always be deployed on the management Cluster and use the CAPI kubeconfig Secret to watch the workload Cluster.
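A rough sketch of how such a management-cluster deployment might be wired up, assuming the flags documented for the cluster-autoscaler Cluster API provider. The helper names, the mount path, and the example cluster name are hypothetical, and this is not necessarily how the manifests in this PR are written.

```go
package main

import (
	"fmt"
	"strings"
)

// kubeconfigSecretName returns the name of the Secret in which Cluster API
// stores the workload cluster kubeconfig (under the "value" key).
func kubeconfigSecretName(clusterName string) string {
	return clusterName + "-kubeconfig"
}

// autoscalerArgs is a hypothetical helper that builds container args for a
// cluster-autoscaler running on the management cluster.
func autoscalerArgs(clusterName, namespace, kubeconfigPath string) []string {
	return []string{
		"--cloud-provider=clusterapi",
		// Workload cluster access comes from the mounted kubeconfig Secret.
		"--kubeconfig=" + kubeconfigPath,
		// Management cluster access uses the in-cluster service account
		// rather than falling back to the workload kubeconfig.
		"--clusterapi-cloud-config-authoritative",
		// Only manage node groups that belong to this Cluster.
		fmt.Sprintf("--node-group-auto-discovery=clusterapi:namespace=%s,clusterName=%s",
			namespace, clusterName),
	}
}

func main() {
	// Assumed mount path for the "value" key of the kubeconfig Secret.
	const kubeconfigPath = "/etc/workload-kubeconfig/value"
	fmt.Println("secret:", kubeconfigSecretName("my-cluster"))
	fmt.Println(strings.Join(autoscalerArgs("my-cluster", "default", kubeconfigPath), "\n"))
}
```

Scoping auto-discovery by namespace and cluster name keeps one autoscaler instance per workload Cluster, which fits an addon that is deployed per Cluster.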
Tested locally, but I have not been able to work on e2e tests yet, since there is no mechanism to make the Cluster self-managed.