Skip to content

feat(otel): allow to configure the faas.trigger attribute of the span #903

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jun 30, 2024
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 9 additions & 4 deletions examples/opentelemetry-tracing/src/main.rs
Original file line number Diff line number Diff line change
Expand Up @@ -24,10 +24,15 @@ async fn main() -> Result<(), BoxError> {
.init();

// Initialize the Lambda runtime and add OpenTelemetry tracing
let runtime = Runtime::new(service_fn(echo)).layer(OtelLayer::new(|| {
// Make sure that the trace is exported before the Lambda runtime is frozen
tracer_provider.force_flush();
}));
let runtime = Runtime::new(service_fn(echo)).layer(
// Create a tracing span for each Lambda invocation
OtelLayer::new(|| {
// Make sure that the trace is exported before the Lambda runtime is frozen
tracer_provider.force_flush();
})
// Set the "faas.trigger" attribute of the span (defaults to "http" otherwise)
.with_trigger("datasource"),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Capture d’écran 2024-06-28 à 15 37 34

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated to use "pubsub" since "datasource" is the new default value
Capture d’écran 2024-06-30 à 15 25 35

);
runtime.run().await?;
Ok(())
}
23 changes: 21 additions & 2 deletions lambda-runtime/src/layers/otel.rs
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@ use tracing::{instrument::Instrumented, Instrument};
/// a function to flush OpenTelemetry after the end of the invocation.
pub struct OpenTelemetryLayer<F> {
flush_fn: F,
otel_attribute_trigger: Option<String>,
}

impl<F> OpenTelemetryLayer<F>
Expand All @@ -18,7 +19,20 @@ where
{
/// Create a new [OpenTelemetryLayer] with the provided flush function.
pub fn new(flush_fn: F) -> Self {
Self { flush_fn }
Self {
flush_fn,
otel_attribute_trigger: None,
}
}

/// Configure the `faas.trigger` attribute of the OpenTelemetry span.
/// Defaults to `http` if not set.
/// See https://opentelemetry.io/docs/specs/semconv/attributes-registry/faas/ for the list of possible triggers.
pub fn with_trigger<T: Into<String>>(self, trigger: T) -> Self {
Self {
otel_attribute_trigger: Some(trigger.into()),
..self
}
}
}

Expand All @@ -33,6 +47,10 @@ where
inner,
flush_fn: self.flush_fn.clone(),
coldstart: true,
otel_attribute_trigger: self
.otel_attribute_trigger
.clone()
.unwrap_or_else(|| "http".to_string()),
Copy link
Contributor Author

@Oliboy50 Oliboy50 Jun 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a doubt here, I don't know if the OpenTelemetryLayer::layer() is executed each time the lambda is invoked... or if it's call only on coldstarts during the Runtime construction

because if it's called for each lambda invocation, maybe we should do best than that to avoid performance issues?

For example, we could defaults to http directly in the OpenTelemetryLayer::new() function... but it would allocate a String for nothing when users will use with_trigger right after (i.e. OpenTelemetryLayer::new().with_trigger("datasource") would allocate http for nothing)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

layer() is only executed when the runtime starts, so this would be invoked only once. It makes me wonder if datasource should be the default, it's more generic than http. I'm ok with one allocation to initialize the service.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking at the spec, maybe using an enum for the datasource makes more sense since there are only a few trigger types defined.

Copy link
Contributor Author

@Oliboy50 Oliboy50 Jun 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes me wonder if datasource should be the default, it's more generic than http

Yes datasource or even other, but this would be a breaking change, so let me know if you really want me to change the current default value 🙏

looking at the spec, maybe using an enum for the datasource makes more sense since there are only a few trigger types defined

The OpenTelemetry specs do not seem to move fast, but if they do, don't you think that this could be an issue for the users to wait for the aws-lambda-runtime releases to keep up with the OpenTelemetry specs evolution?
Also note that the current specs for faas.trigger are still experimental... and even the opentelemetry_semantic_conventions does not expose such an enum yet (don't know why though) 🤷
That being said, let me know if we should expose an enum here (should we name it OpenTelemetryFaasTrigger?) 🙏

OpenTelemetry docs for reference

Capture d’écran 2024-06-28 à 21 17 23

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes datasource or even other, but this would be a breaking change, so let me know if you really want me to change the current default value 🙏

This is fine. I don't think anyone is really using this yet.

That being said, let me know if we should expose an enum here (should we name it OpenTelemetryFaasTrigger?) 🙏

That name is fine by me. I understand that's experimental, but if someone wants to use it, it's hard to know what's supported on a string. People can still use other if at some point there are more types and they need to wait for a release.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done 👍

}
}
}
Expand All @@ -42,6 +60,7 @@ pub struct OpenTelemetryService<S, F> {
inner: S,
flush_fn: F,
coldstart: bool,
otel_attribute_trigger: String,
}

impl<S, F> Service<LambdaInvocation> for OpenTelemetryService<S, F>
Expand All @@ -61,7 +80,7 @@ where
let span = tracing::info_span!(
"Lambda function invocation",
"otel.name" = req.context.env_config.function_name,
{ traceconv::FAAS_TRIGGER } = "http",
{ traceconv::FAAS_TRIGGER } = &self.otel_attribute_trigger,
{ traceconv::FAAS_INVOCATION_ID } = req.context.request_id,
{ traceconv::FAAS_COLDSTART } = self.coldstart
);
Expand Down
Loading