@@ -299,7 +305,17 @@ _This section must be completed when targeting beta graduation to a release._
 No impact to running workloads, logs will indicate the problem.
 
 ###### What specific metrics should inform a rollback?
-To be determined.
+
+* This KEP is following the [opentelemetry-go issue #2547](https://github.com/open-telemetry/opentelemetry-go/issues/2547).
+
+```
+...using the OTLP trace exporter, it isn't currently possible to monitor (with metrics) whether or not spans are being successfully collected and exported.
+For example, if my SDK cannot connect to an opentelemetry collector, and isn't able to send traces, I would like to be able to measure how many traces are collected,
+vs how many are not sent. I would like to be able to set up SLOs to measure successful trace delivery from my applications.
+```
+
+* Pod Lifecycle and Kubelet [SLOs](https://github.com/kubernetes/community/tree/master/sig-scalability/slos) are the signals that should guide a rollback. In particular, the `kubelet_pod_start_duration_seconds_count`, `kubelet_runtime_operations_errors_total`, and `kubelet_pleg_relist_interval_seconds_bucket` metrics would surface issues affecting kubelet performance.
+
 
 ###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
 Upgrades and rollbacks will be tested while feature-gate is experimental
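
The rollback signals named in the hunk above could be checked mechanically by an operator. The following is a minimal sketch, not part of the KEP: it assumes two scrapes of the kubelet `/metrics` endpoint in the Prometheus text format, taken before and after enabling the feature gate, and it compares the growth of `kubelet_runtime_operations_errors_total`. The function names and the `max_new_errors` threshold are hypothetical and chosen only for illustration.

```python
import re

# Matches "name{labels} value" or "name value" in the Prometheus text format.
_SAMPLE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{.*\})?\s+(\S+)')


def counter_total(exposition: str, metric: str) -> float:
    """Sum every sample of `metric` across all of its label sets."""
    total = 0.0
    for line in exposition.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = _SAMPLE.match(line)
        if match and match.group(1) == metric:
            total += float(match.group(3))
    return total


def should_roll_back(before: str, after: str, max_new_errors: float = 10.0) -> bool:
    """Return True when runtime operation errors grew by more than the threshold.

    `max_new_errors` is a hypothetical threshold, not a value from the KEP.
    """
    metric = "kubelet_runtime_operations_errors_total"
    increase = counter_total(after, metric) - counter_total(before, metric)
    if increase < 0:
        # A counter that went down means the kubelet restarted; count from zero.
        increase = counter_total(after, metric)
    return increase > max_new_errors
```

In practice the same comparison would more likely be expressed as an alerting rule over a time window; the sketch only illustrates how one of the named counters can feed a rollback decision.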