Commit 3a40a04

Add server metadata, soft-anti-affinity and local ssd flavors for gx-scs (#742)
* Add server metadata
  Signed-off-by: Roman Hros <[email protected]>
* Add docs about server metadata
  Signed-off-by: Roman Hros <[email protected]>
* Add default flavor with local ssd for gx-scs control-plane nodes
  Signed-off-by: Roman Hros <[email protected]>
* Bump default k8s version to v1.28.11
  Signed-off-by: Roman Hros <[email protected]>
* Run CI without anti_affinity
  There is not enough hosts for local ssd flavors
  Signed-off-by: Roman Hros <[email protected]>
* Run only KaaS v2 tests
  Signed-off-by: Roman Hros <[email protected]>
* Allow the use of soft-anti-affinity for the control plane
  Signed-off-by: Roman Hros <[email protected]>
---------
Signed-off-by: Roman Hros <[email protected]>
1 parent 930b47d commit 3a40a04

13 files changed: +60 -7 lines changed

doc/configuration.md

Lines changed: 3 additions & 0 deletions
@@ -67,13 +67,16 @@ Parameters controlling the cluster creation:
 | `` | `CONTROL_PLANE_ROOT_DISKSIZE` | SCS | `20` | *If* diskless flavors are used for control plane nodes, this is the allocated root volume disk size (in GB) |
 | `` | `WORKER_ROOT_DISKSIZE` | SCS | `20` | *If* diskless flavors are used for worker nodes, this is the allocated root volume disk size (in GB) |
 | `anti_affinity` | `OPENSTACK_ANTI_AFFINITY` | SCS | `true` | Use anti-affinity server groups to prevent k8s nodes on same host (soft for workers, hard for controllers) |
+| `soft_anti_affinity_controller` | `OPENSTACK_SOFT_ANTI_AFFINITY_CONTROLLER` | SCS | `false` | Allow the use of soft-anti-affinity for the controllers (if `anti_affinity` is `true`) |
 | `` | `OPENSTACK_SRVGRP_CONTROLLER` | SCS | `nonono` | Autogenerated if `anti_affinity` is `true`, eliminated otherwise |
 | `` | `OPENSTACK_SRVGRP_WORKER` | SCS | `nonono` | Autogenerated if `anti_affinity` is `true`, eliminated otherwise |
 | `deploy_occm` | `DEPLOY_OCCM` | SCS | `true` | Deploy the given version of OCCM into the cluster. `true` (default) chooses the latest version matching the k8s version. You can specify `master` to chose the upstream master branch. Don't disable this. |
 | `deploy_cindercsi` | `DEPLOY_CINDERCSI` | SCS | `true` | Deploy the given (or latest matching for the default true value) of cinder CSI. |
 | `etcd_unsafe_fs` | `ETCD_UNSAFE_FS` | SCS | `false` | Use `barrier=0` for filesystem on control nodes to avoid storage latency. Use for multi-controller clusters on slow/networked storage, otherwise not recommended. |
 | `testcluster_name` | (cmd line) | SCS | `testcluster` | Allows setting the default cluster name, created at bootstrap (if `controller_count` is larger than 0) |
 | `restrict_kubeapi` | `RESTRICT_KUBEAPI` | SCS | `[ ]` | Allows restricting access to kubernetes API by list of CIDRs. Empty list (default) means public, `[ "none" ]` means internal access only. |
+| `controller_metadata` | `OPENSTACK_CONTROL_PLANE_MACHINE_METADATA` | SCS | `{ }` | Adds additional metadata for instances running the k8s management nodes |
+| `worker_metadata` | `OPENSTACK_NODE_MACHINE_METADATA` | SCS | `{ }` | Adds additional metadata for instances running the k8s worker nodes |
 | `` | `OPENSTACK_CLUSTER_GEN` | SCS | `geno01` | Generation counter for the OpenStackClusterTemplate resource. Increase, when changing restrict_kubeapi or other OC settings |
 | `capo_instance_create_timeout` | `CLUSTER_API_OPENSTACK_INSTANCE_CREATE_TIMEOUT` | capo | `5` | Time to wait for an OpenStack machine to be created (in minutes) |
 | `containerd_registry_files` | | SCS | `{"hosts":["./files/containerd/docker.io"], "certs":[]}` | Containerd registry hosts config files, see related [docs](./usage/containter-registry-configuration.md) for details. |

playbooks/tasks/scs_compliance.yaml

Lines changed: 1 addition & 1 deletion
@@ -31,7 +31,7 @@
 ansible.builtin.shell:
 cmd:
 ". {{ python_venv_dir }}/bin/activate &&
-python3 {{ check_dir }}/Tests/scs-compliance-check.py {{ check_dir }}/Tests/scs-compatible-kaas.yaml -v -s KaaS_V1 -a kubeconfig={{ kubeconfig_path }}"
+python3 {{ check_dir }}/Tests/scs-compliance-check.py {{ check_dir }}/Tests/scs-compatible-kaas.yaml -v -s KaaS_V1 -V v2 -a kubeconfig={{ kubeconfig_path }}"
 changed_when: false
 register: scs_compliance_results
 always:

playbooks/templates/environment.tfvars.j2

Lines changed: 8 additions & 1 deletion
@@ -6,9 +6,16 @@ availability_zone = "nova"
 external = "ext01"
 dns_nameservers = ["62.138.222.111", "62.138.222.222"]
 kind_flavor = "SCS-2V:4"
-controller_flavor = "SCS-2V:4:20"
+controller_flavor = "SCS-2V-4-20s"
 worker_flavor = "SCS-2V:4:20"
 
+controller_metadata = {
+ps_restart_after_maint = "true"
+}
+
+# FIXME: Remove when CI runs on gx-scs2 environment(3+ physical machines for local ssd flavors)
+soft_anti_affinity_controller = true
+
 controller_count = 3
 worker_count = 3

terraform/environments/environment-default.tfvars

Lines changed: 3 additions & 0 deletions
@@ -27,6 +27,7 @@ node_cidr = "<CIDR>" # defaults to "10.8.0.0/20"
 service_cidr = "<CIDR>" # defaults to "10.96.0.0/12"
 pod_cidr = "<CIDR>" # defaults to "192.168.0.0/16"
 anti_affinity = "<boolean>" # defaults to "true"
+soft_anti_affinity_controller = "<boolean>" # defaults to "false"
 use_cilium = "version/true/false" # defaults to "true", can also be set to "vx.y.z", also see cilium_binaries
 use_ovn_lb_provider = "auto/true/false" # use OVN LB if available (auto) or force (true) or never (false)
 deploy_nginx_ingress = "version/true/false" # defaults to "true", you can also set vX.Y.Z if you want
@@ -39,6 +40,8 @@ deploy_cindercsi = "<version>" # defaults to "true", dito
 etcd_unsafe_fs = "<boolean>" # defaults to "false", dangerous
 testcluster_name = "NAME" # defaults to "testcluster"
 restrict_kubeapi = [ "IP/20", "IP/22" ] # defaults to empty (fully open), use [ "none" ] for exclusive internal access
+controller_metadata = { metadata_key = "metadata_value" } # defaults to empty dict (no additional metadata)
+worker_metadata = { metadata_key = "metadata_value" } # defaults to empty dict (no additional metadata)
 containerd_registry_files = {"hosts":["<list of registry host config files>"], "certs":["<list of custom cert files>"]} # defaults to '{"hosts":["./files/containerd/docker.io"], "certs":[]}'
 deploy_harbor = "<boolean>" # defaults to "false", "true" deploys Harbor and forces deployment of flux and potentially other services (`cert_manager`, `nginx_ingress` and `cindercsi`), see `doc/usage/harbor.md`
 harbor_config = {"domain_name":"<name>", "issuer_email":"<email>", "persistence":"<boolean>", "database_size":"size", "redis_size":"size", "trivy_size":"size"} # for defaults see ../variables.tf

terraform/environments/environment-gx-scs-staging.tfvars

Lines changed: 4 additions & 1 deletion
@@ -5,7 +5,10 @@ cloud_provider = "gx-scs-staging"
 availability_zone = "nova"
 external = "ext01"
 kind_flavor = "SCS-2V:4"
-controller_flavor = "SCS-8V:16:100"
+controller_flavor = "SCS-4V-16-100s"
 worker_flavor = "SCS-8V:16:100"
 #image = "Ubuntu 22.04"
 #ssh_username = "ubuntu"
+controller_metadata = {
+ps_restart_after_maint = "true"
+}

terraform/environments/environment-gx-scs.tfvars

Lines changed: 4 additions & 1 deletion
@@ -4,10 +4,13 @@ cloud_provider = "gx-scs"
 availability_zone = "nova"
 external = "ext01"
 kind_flavor = "SCS-2V:4"
-controller_flavor = "SCS-2V:4:20"
+controller_flavor = "SCS-2V-4-20s"
 worker_flavor = "SCS-2V:4:20"
 #image = "Ubuntu 22.04"
 #ssh_username = "ubuntu"
 #kube_image_raw = "true"
 dns_nameservers = ["62.138.222.111", "62.138.222.222"]
 #controller_count = 0
+controller_metadata = {
+ps_restart_after_maint = "true"
+}

terraform/files/bin/create_cluster.sh

Lines changed: 4 additions & 1 deletion
@@ -91,7 +91,10 @@ if test "$CONTROL_PLANE_MACHINE_COUNT" -gt 0 && grep '^ *OPENSTACK_ANTI_AFFINITY
 SRVGRP_CONTROLLER=$(echo "$SRVGRP" | grep "${PREFIX}-${CLUSTER_NAME}-controller" | sed 's/^\([0-9a-f\-]*\) .*$/\1/')
 SRVGRP_WORKER=$(echo "$SRVGRP" | grep "${PREFIX}-${CLUSTER_NAME}-worker" | sed 's/^\([0-9a-f\-]*\) .*$/\1/')
 if test -z "$SRVGRP_CONTROLLER"; then
-SRVGRP_CONTROLLER=$(openstack --os-compute-api-version 2.15 server group create --policy anti-affinity -f value -c id ${PREFIX}-${CLUSTER_NAME}-controller)
+ANTI_AFFINITY_POLICY_CONTROLLER=anti-affinity
+SOFT_ANTI_AFFINITY_CONTROLLER=$(yq eval '.OPENSTACK_SOFT_ANTI_AFFINITY_CONTROLLER' $CCCFG)
+if test "$SOFT_ANTI_AFFINITY_CONTROLLER" = "true"; then ANTI_AFFINITY_POLICY_CONTROLLER=soft-anti-affinity; fi
+SRVGRP_CONTROLLER=$(openstack --os-compute-api-version 2.15 server group create --policy ${ANTI_AFFINITY_POLICY_CONTROLLER} -f value -c id ${PREFIX}-${CLUSTER_NAME}-controller)
 SRVGRP_WORKER=$(openstack --os-compute-api-version 2.15 server group create --policy soft-anti-affinity -f value -c id ${PREFIX}-${CLUSTER_NAME}-worker)
 fi
 echo "Adding server groups $SRVGRP_CONTROLLER and $SRVGRP_WORKER to $CCCFG"
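
The policy selection added in this hunk can be tried in isolation. The following is a minimal sketch, not the script itself: `controller_policy` is a hypothetical helper name, and the sample inputs (including `null`, which `yq` typically prints for a missing key) are assumptions for illustration. Only the literal string `true` switches the controller server group to the soft policy.

```shell
#!/bin/sh
# Hypothetical helper (not part of create_cluster.sh): maps the value read from
# OPENSTACK_SOFT_ANTI_AFFINITY_CONTROLLER to a server group policy name.
controller_policy() {
    # Only the exact string "true" selects the soft policy; any other value
    # (false, empty, or "null" for a missing key) keeps hard anti-affinity.
    if test "$1" = "true"; then
        echo soft-anti-affinity
    else
        echo anti-affinity
    fi
}

# Show the resulting policy for a few assumed input values.
for value in true false null ""; do
    printf '%s -> %s\n' "'$value'" "$(controller_policy "$value")"
done
```

Defaulting to the hard policy means that a configuration file written before this commit, which lacks the new key, should behave as it did previously.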

terraform/files/bin/deploy_cluster_api.sh

Lines changed: 3 additions & 0 deletions
@@ -31,6 +31,9 @@ clusterctl version --output yaml
 #MTU=`yq eval '.MTU_VALUE' ~/cluster-defaults/clusterctl.yaml`
 # Fix up nameserver list (trailing comma -- cosmetic)
 sed '/OPENSTACK_DNS_NAMESERVERS:/s@, \]"@ ]"@' -i ~/cluster-defaults/clusterctl.yaml
+# Fix metadata dicts (trailing comma -- cosmetic)
+sed '/OPENSTACK_CONTROL_PLANE_MACHINE_METADATA:/s@, }"@ }"@' -i ~/cluster-defaults/clusterctl.yaml
+sed '/OPENSTACK_NODE_MACHINE_METADATA:/s@, }"@ }"@' -i ~/cluster-defaults/clusterctl.yaml
 
 # cp clusterctl.yaml to the right place
 if test "$(dotversion "$(clusterctl version -o short)")" -ge 10500; then
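
To see what the new `sed` expressions do, the sketch below applies the same substitution to a sample line on standard input instead of editing `clusterctl.yaml` in place. The wrapper name `fix_metadata` and the sample line are assumptions made for illustration; the sample mimics a rendered metadata dict that ends in a trailing comma.

```shell
#!/bin/sh
# Hypothetical wrapper: same sed expression as in the diff, but reading stdin
# rather than rewriting ~/cluster-defaults/clusterctl.yaml with -i.
fix_metadata() {
    sed '/OPENSTACK_CONTROL_PLANE_MACHINE_METADATA:/s@, }"@ }"@'
}

# Assumed sample of a rendered line with the cosmetic trailing comma.
rendered="OPENSTACK_CONTROL_PLANE_MACHINE_METADATA: \"{ ps_restart_after_maint: 'true', }\""

# Prints the line with the final ", }" collapsed to " }".
printf '%s\n' "$rendered" | fix_metadata
```

An empty dict rendered as `"{ }"` contains no `, }"` sequence, so the expression leaves it unchanged.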

terraform/files/bin/openstack-kube-versions.inc

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@
 # (c) Kurt Garloff <[email protected]>, 3/2022
 # SPDX-License-Identifier: Apache-2.0
 # Images from https://swift.services.a.regiocloud.tech/swift/v1/AUTH_b182637428444b9aa302bb8d5a5a418c/openstack-k8s-capi-images
-k8s_versions=("v1.21.14" "v1.22.17" "v1.23.16" "v1.24.15" "v1.25.15" "v1.26.14" "v1.27.12" "v1.28.10" "v1.29.3")
+k8s_versions=("v1.21.14" "v1.22.17" "v1.23.16" "v1.24.15" "v1.25.15" "v1.26.14" "v1.27.12" "v1.28.11" "v1.29.3")
 # OCCM, CCM-RBAC, Cinder CSI, Cinder-Snapshot (TODO: Manila CSI)
 occm_versions=("v1.21.1" "v1.22.2" "v1.23.4" "v1.24.6" "v1.25.6" "v1.26.4" "v1.27.3" "v1.28.2" "v1.29.0")
 #ccmr_versions=("" "v1.22.2" "v1.23.4" "v1.24.6" "v1.25.6" "v1.26.4" "v1.27.3" "v1.28.2" "v1.29.0")

terraform/files/template/cluster-template.yaml

Lines changed: 2 additions & 0 deletions
@@ -320,6 +320,7 @@ spec:
 template:
 spec:
 flavor: ${OPENSTACK_CONTROL_PLANE_MACHINE_FLAVOR}
+serverMetadata: ${OPENSTACK_CONTROL_PLANE_MACHINE_METADATA}
 serverGroupID: ${OPENSTACK_SRVGRP_CONTROLLER}
 image: ${OPENSTACK_IMAGE_NAME}
 sshKeyName: ${OPENSTACK_SSH_KEY_NAME}
@@ -345,6 +346,7 @@ spec:
 name: ${CLUSTER_NAME}-cloud-config
 kind: Secret
 flavor: ${OPENSTACK_NODE_MACHINE_FLAVOR}
+serverMetadata: ${OPENSTACK_NODE_MACHINE_METADATA}
 serverGroupID: ${OPENSTACK_SRVGRP_WORKER}
 image: ${OPENSTACK_IMAGE_NAME}
 sshKeyName: ${OPENSTACK_SSH_KEY_NAME}

terraform/files/template/clusterctl.yaml.tmpl

Lines changed: 5 additions & 0 deletions
@@ -37,6 +37,10 @@ DEPLOY_FLUX: ${deploy_flux}
 # deploy metrics service
 DEPLOY_METRICS: ${deploy_metrics}
 
+# OpenStack instance additional metadata
+OPENSTACK_CONTROL_PLANE_MACHINE_METADATA: "{ %{ for metadata_key, metadata_value in controller_metadata ~} ${metadata_key}: '${metadata_value}', %{ endfor ~} }"
+OPENSTACK_NODE_MACHINE_METADATA: "{ %{ for metadata_key, metadata_value in worker_metadata ~} ${metadata_key}: '${metadata_value}', %{ endfor ~} }"
+
 # OpenStack flavors and machine count
 OPENSTACK_CONTROL_PLANE_MACHINE_FLAVOR: ${controller_flavor}
 CONTROL_PLANE_MACHINE_COUNT: ${controller_count}
@@ -80,6 +84,7 @@ OPENSTACK_SSH_KEY_NAME: ${prefix}-keypair
 
 # Use anti-affinity server groups
 OPENSTACK_ANTI_AFFINITY: ${anti_affinity}
+OPENSTACK_SOFT_ANTI_AFFINITY_CONTROLLER: ${soft_anti_affinity_controller}
 OPENSTACK_SRVGRP_CONTROLLER: nonono
 OPENSTACK_SRVGRP_WORKER: nonono
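
The metadata template lines emit one `key: 'value', ` fragment per map entry, which is why a non-empty map ends in a trailing comma. The sketch below emulates that loop in plain shell to make the rendered shape visible. It is an illustration only: `render_metadata` and its `key=value` argument format are hypothetical, and the whitespace handling assumes that the `~}` strip markers remove the space following each directive.

```shell
#!/bin/sh
# Hypothetical emulation of the template loop (not Terraform itself):
# takes key=value arguments and prints the quoted dict string.
render_metadata() {
    out='{ '
    for pair in "$@"; do
        key=${pair%%=*}      # text before the first '='
        value=${pair#*=}     # text after the first '='
        out="${out}${key}: '${value}', "
    done
    printf '"%s}"\n' "$out"
}

render_metadata ps_restart_after_maint=true   # one entry: ends in ", }"
render_metadata                               # empty map: "{ }"
```

The value is wrapped in single quotes so that strings such as `true` remain strings rather than being read as booleans when the YAML is parsed.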

terraform/mgmtcluster.tf

Lines changed: 4 additions & 1 deletion
@@ -313,11 +313,13 @@ resource "terraform_data" "mgmtcluster_bootstrap_files" {
 provisioner "file" {
 content = templatefile("files/template/clusterctl.yaml.tmpl", {
 anti_affinity = var.anti_affinity,
+soft_anti_affinity_controller = var.soft_anti_affinity_controller,
 availability_zone = var.availability_zone,
 capo_instance_create_timeout = var.capo_instance_create_timeout,
 cloud_provider = var.cloud_provider,
 controller_count = var.controller_count,
 controller_flavor = var.controller_flavor,
+controller_metadata = var.controller_metadata,
 deploy_cert_manager = var.deploy_cert_manager,
 deploy_cindercsi = var.deploy_cindercsi,
 deploy_flux = var.deploy_flux,
@@ -340,7 +342,8 @@ resource "terraform_data" "mgmtcluster_bootstrap_files" {
 calico_version = var.calico_version,
 use_ovn_lb_provider = var.use_ovn_lb_provider,
 worker_count = var.worker_count,
-worker_flavor = var.worker_flavor
+worker_flavor = var.worker_flavor,
+worker_metadata = var.worker_metadata
 })
 destination = "/home/${var.ssh_username}/cluster-defaults/clusterctl.yaml"
 }

terraform/variables.tf

Lines changed: 18 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -33,6 +33,18 @@ variable "worker_flavor" {
3333
default = "SCS-2V-4-20s"
3434
}
3535

36+
variable "controller_metadata" {
37+
description = "additional metadata for instances running the k8s management nodes"
38+
type = map(string)
39+
default = {}
40+
}
41+
42+
variable "worker_metadata" {
43+
description = "additional metadata for instances running the k8s worker nodes"
44+
type = map(string)
45+
default = {}
46+
}
47+
3648
variable "availability_zone" {
3749
description = "availability zone for openstack resources"
3850
type = string
@@ -191,6 +203,12 @@ variable "anti_affinity" {
191203
default = true
192204
}
193205

206+
variable "soft_anti_affinity_controller" {
207+
description = "allow the use of soft-anti-affinity for the control plane"
208+
type = bool
209+
default = false
210+
}
211+
194212
variable "dns_nameservers" {
195213
description = "array of nameservers to be set for subnets, prefer local DNS servers if available"
196214
type = list(string)
