Commit 1ad5dfb

locmai, humblebundledore, and nschad authored
fix: change headless service to gRPC and expose 9094 TCP (#494)
* change alertmanager-svc-headless from http to grpc port
  Signed-off-by: AlexandreRoux <[email protected]>
* edit CHANGELOG.md
  Signed-off-by: AlexandreRoux <[email protected]>
* expose 9094 TCP UDP for gossip cluster
  Signed-off-by: AlexandreRoux <[email protected]>
* support configuration of gossip cluster port
  Signed-off-by: AlexandreRoux <[email protected]>
* WIP: configure alertmanager HA cluster mode for sts
  Signed-off-by: Niclas Schad <[email protected]>
* configure alertmanager cluster peers as comma separated list
  Signed-off-by: AlexandreRoux <[email protected]>
* clarify values.yaml about cluster enabled by default
  Signed-off-by: AlexandreRoux <[email protected]>
* replicaset should not set -alertmanager-cluster-peers when cluster is disabled
  Signed-off-by: AlexandreRoux <[email protected]>
* fix wrong name for grpc targetPort
  Signed-off-by: AlexandreRoux <[email protected]>
* re-introduce alertmanager-dep.yaml
  Signed-off-by: AlexandreRoux <[email protected]>
* docs: update README.md
  Signed-off-by: Loc Mai <[email protected]>
* update CHANGELOG.md
  Signed-off-by: Loc Mai <[email protected]>
* update CHANGELOG.md
  Signed-off-by: Loc Mai <[email protected]>
* update CHANGELOG.md
  Signed-off-by: Loc Mai <[email protected]>
* update CHANGELOG.md
  Signed-off-by: Loc Mai <[email protected]>
* update CHANGELOG.md
  Signed-off-by: Loc Mai <[email protected]>
* update CHANGELOG.md
  Signed-off-by: Loc Mai <[email protected]>
* add Alertmanager scope
  Signed-off-by: Loc Mai <[email protected]>
* add http-metrics back
  Signed-off-by: Loc Mai <[email protected]>
* update CHANGELOG.md
  Signed-off-by: Loc Mai <[email protected]>

---------

Signed-off-by: AlexandreRoux <[email protected]>
Signed-off-by: Niclas Schad <[email protected]>
Signed-off-by: Loc Mai <[email protected]>
Co-authored-by: AlexandreRoux <[email protected]>
Co-authored-by: Niclas Schad <[email protected]>
1 parent a572813 commit 1ad5dfb

7 files changed (+43 -4 lines changed)

CHANGELOG.md

+4
@@ -2,6 +2,10 @@
 
 ## master / unreleased
 
+* [ENHANCEMENT] Alertmanager: Add `grpc` port #494
+* [ENHANCEMENT] Alertmanager: Expose 9094 TCP and UDP for gossip cluster #494
+  * If the AlertManager headless service existed prior to applying the change, it will have only one port set, which is a known issue. See [kubernetes/kubernetes#39188](https://github.com/kubernetes/kubernetes/issues/39188). Re-creating the headless service can resolve this issue
+
 ## 2.2.0 / 2024-01-16
 
 * [CHANGE] Removed `config.storage.engine` and any reference of it #488

README.md

+1
@@ -216,6 +216,7 @@ Kubernetes: `^1.19.0-0`
 | compactor.&ZeroWidthSpace;terminationGracePeriodSeconds | int | `240` | |
 | compactor.&ZeroWidthSpace;tolerations | list | `[]` | |
 | compactor.&ZeroWidthSpace;topologySpreadConstraints | list | `[]` | |
+| config.&ZeroWidthSpace;alertmanager.&ZeroWidthSpace;cluster | object | `{"listen_address":"0.0.0.0:9094"}` | Disable alertmanager gossip cluster by setting empty listen_address to empty string |
 | config.&ZeroWidthSpace;alertmanager.&ZeroWidthSpace;enable_api | bool | `false` | Enable the experimental alertmanager config api. |
 | config.&ZeroWidthSpace;alertmanager.&ZeroWidthSpace;external_url | string | `"/api/prom/alertmanager"` | |
 | config.&ZeroWidthSpace;api.&ZeroWidthSpace;prometheus_http_prefix | string | `"/prometheus"` | |

ci/test-deployment-values.yaml

+1-1
@@ -125,7 +125,7 @@ runtimeconfigmap:
   annotations:
     foo: bar
 alertmanager:
-  replicas: 1
+  replicas: 3
   statefulSet:
     enabled: false
   extraVolumes:

ci/test-sts-values.yaml

+1-1
@@ -116,7 +116,7 @@ runtimeconfigmap:
   annotations:
     foo: bar
 alertmanager:
-  replicas: 1
+  replicas: 3
   statefulSet:
     enabled: true
   extraVolumes:

templates/alertmanager/alertmanager-statefulset.yaml

+20
@@ -1,3 +1,5 @@
+{{- $svcClusterAddress := ((.Values.config.alertmanager.cluster).listen_address) | default "0.0.0.0:9094" }}
+{{- $svcClusterPort := (split ":" $svcClusterAddress)._1 }}
 {{- if .Values.alertmanager.enabled -}}
 {{- if .Values.alertmanager.statefulSet.enabled -}}
 apiVersion: apps/v1
@@ -152,6 +154,15 @@ spec:
           args:
             - "-target=alertmanager"
             - "-config.file=/etc/cortex/cortex.yaml"
+            {{- if and (gt (int .Values.alertmanager.replicas) 1) (ne .Values.config.alertmanager.cluster.listen_address "") }}
+            {{- $fullName := include "cortex.alertmanagerFullname" . }}
+            {{- $peers := list }}
+            {{- range $i := until (int .Values.alertmanager.replicas) }}
+            {{- $peer := printf "%s-%d.%s-headless.%s.svc.cluster.local:%s" $fullName $i $fullName $.Release.Namespace $svcClusterPort }}
+            {{- $peers = append $peers $peer }}
+            {{- end }}
+            - "-alertmanager.cluster.peers={{ join "," $peers }}"
+            {{- end }}
             {{- range $key, $value := .Values.alertmanager.extraArgs }}
             - "-{{ $key }}={{ $value }}"
             {{- end }}
@@ -175,6 +186,15 @@ spec:
             - name: gossip
               containerPort: {{ .Values.config.memberlist.bind_port }}
               protocol: TCP
+            - name: grpc
+              containerPort: {{ .Values.config.server.grpc_listen_port }}
+              protocol: TCP
+            - containerPort: {{ $svcClusterPort }}
+              name: alert-clu-tcp
+              protocol: TCP
+            - containerPort: {{ $svcClusterPort }}
+              name: alert-clu-udp
+              protocol: UDP
           startupProbe:
             {{- toYaml .Values.alertmanager.startupProbe | nindent 12 }}
           livenessProbe:
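
For illustration, here is a sketch of what the peers loop in the hunk above would render to with `alertmanager.replicas: 3` and the default `listen_address` of `0.0.0.0:9094`. The full name `cortex-alertmanager` and the namespace `cortex` are hypothetical values chosen for this example; in a real release they come from `cortex.alertmanagerFullname` and `.Release.Namespace`.

```yaml
# Hypothetical rendered output (names and namespace are assumptions).
# `until 3` yields the indices 0, 1, 2, so one peer is emitted per replica,
# and the port is the part of listen_address after the colon.
args:
  - "-target=alertmanager"
  - "-config.file=/etc/cortex/cortex.yaml"
  - "-alertmanager.cluster.peers=cortex-alertmanager-0.cortex-alertmanager-headless.cortex.svc.cluster.local:9094,cortex-alertmanager-1.cortex-alertmanager-headless.cortex.svc.cluster.local:9094,cortex-alertmanager-2.cortex-alertmanager-headless.cortex.svc.cluster.local:9094"
```

Per the template condition, the `-alertmanager.cluster.peers` argument is only rendered when `replicas` is greater than 1 and `listen_address` is not an empty string.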

templates/alertmanager/alertmanager-svc-headless.yaml

+12-2
@@ -1,5 +1,4 @@
 {{- if .Values.alertmanager.enabled -}}
-{{- if .Values.alertmanager.statefulSet.enabled -}}
 apiVersion: v1
 kind: Service
 metadata:
@@ -21,7 +20,18 @@ spec:
       protocol: TCP
       name: http-metrics
       targetPort: http-metrics
+    - port: {{ .Values.config.server.grpc_listen_port }}
+      protocol: TCP
+      name: grpc
+      targetPort: grpc
+    - port: 9094
+      protocol: UDP
+      name: alert-clu-udp
+      targetPort: alert-clu-udp
+    - port: 9094
+      protocol: TCP
+      name: alert-clu-tcp
+      targetPort: alert-clu-tcp
   selector:
     {{- include "cortex.alertmanagerSelectorLabels" . | nindent 4 }}
 {{- end -}}
-{{- end -}}
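
As a rough sketch, the three port entries added to the headless Service would render as follows. The gRPC port value `9095` is an assumption made for this example; the actual value comes from `config.server.grpc_listen_port`.

```yaml
# Hypothetical rendering of the added Service ports.
# 9095 is an assumed value for config.server.grpc_listen_port.
ports:
  - port: 9095
    protocol: TCP
    name: grpc
    targetPort: grpc
  - port: 9094          # literal in the template, not derived from listen_address
    protocol: UDP
    name: alert-clu-udp
    targetPort: alert-clu-udp
  - port: 9094          # literal in the template, not derived from listen_address
    protocol: TCP
    name: alert-clu-tcp
    targetPort: alert-clu-tcp
```

Note that the Service uses the literal port `9094`, while the StatefulSet container ports follow the port in `listen_address`. If `listen_address` is changed to another port, the Service port would likely no longer match, although the named `targetPort` values would still resolve to the container ports.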

values.yaml

+4
@@ -126,6 +126,10 @@ config:
   runtime_config:
     file: /etc/cortex-runtime-config/runtime_config.yaml
   alertmanager:
+    # -- Enable alertmanager gossip cluster
+    # -- Disable alertmanager gossip cluster by setting empty listen_address to empty string
+    cluster:
+      listen_address: '0.0.0.0:9094'
     # -- Enable the experimental alertmanager config api.
     enable_api: false
     external_url: '/api/prom/alertmanager'
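
A minimal sketch of a values override that disables the gossip cluster, based on the comment added in this hunk. The file name `custom-values.yaml` is hypothetical.

```yaml
# custom-values.yaml (hypothetical file name)
config:
  alertmanager:
    cluster:
      # An empty string disables the gossip cluster. With this value the
      # StatefulSet template in this commit skips the
      # -alertmanager.cluster.peers argument, even when replicas > 1.
      listen_address: ''
```

Leaving the key at its default of `'0.0.0.0:9094'` keeps clustering enabled.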
