Nesting log_metrics only results in one set of metrics being logged #1668
Thanks for opening your first issue here! We'll come back to you as soon as we can.
Here's a standalone example with the logging turned up to 11 showing that only one set of metrics gets output:

```python
import logging
import sys

from aws_lambda_powertools.metrics import Metrics, MetricUnit

# API handler metrics with dimensions: Stage, service
api_metrics = Metrics()
api_metrics.set_default_dimensions(Stage='Test')

# API handler metrics with dimensions: Method, Resource, Stage, service
detailed_api_metrics = Metrics()
detailed_api_metrics.set_default_dimensions(Stage='Test')


class MockLambdaContext:
    function_name: str = 'FUNCTION_NAME'


def handler(event, context):
    detailed_api_metrics.add_dimension(name='Method', value='GET')
    detailed_api_metrics.add_dimension(name='Resource', value='/some/path')

    api_metrics.add_metric(name='Count', unit=MetricUnit.Count, value=1)
    detailed_api_metrics.add_metric(name='Count', unit=MetricUnit.Count, value=1)


# Add metrics last to properly flush metrics.
handler = api_metrics.log_metrics(handler, capture_cold_start_metric=True)
handler = detailed_api_metrics.log_metrics(handler)


def test_handler(caplog):
    loggers = [logging.getLogger(name) for name in logging.root.manager.loggerDict]
    for logger in loggers:
        stream = logging.StreamHandler(sys.stdout)
        logger.setLevel(logging.DEBUG)
        logger.addHandler(stream)

    handler({}, MockLambdaContext())
    raise ValueError()  # Force pytest to fail and emit logs.
```

Run with pytest.
The issue seems to be these 4 class attributes in `Metrics`. This causes sharing across instances. Just making these instance attributes fixes the issue. That being said, I don't know what we lose. Presumably if people create …
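As a rough, hypothetical sketch of the underlying Python behaviour (names are made up, this is not Powertools code): class-level mutable attributes are shared by every instance, whereas instance attributes are not.

```python
# Hypothetical sketch: why class-level mutable attributes leak across instances.


class SharedMetricSet:
    metric_set: dict = {}  # class attribute: one dict shared by all instances

    def add_metric(self, name: str, value: float) -> None:
        self.metric_set[name] = value  # mutates the shared class-level dict


class IsolatedMetricSet:
    def __init__(self) -> None:
        self.metric_set: dict = {}  # instance attribute: one dict per instance

    def add_metric(self, name: str, value: float) -> None:
        self.metric_set[name] = value


a, b = SharedMetricSet(), SharedMetricSet()
a.add_metric("Count", 1)
print(b.metric_set)  # {'Count': 1} -- data leaked into the other instance

c, d = IsolatedMetricSet(), IsolatedMetricSet()
c.add_metric("Count", 1)
print(d.metric_set)  # {} -- instances stay isolated
```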
We briefly discussed on Discord why this is expected behaviour, and I'll provide a proper answer as soon as I can. I'll rephrase my last comment in that Discord thread as a proper response here and gather your ideas on what UX would be good to unlock this and other ISV use cases (e.g., multiple namespaces).
Confirming that we'll be working on this for this Friday's release. In that thread (Discord), we weren't able to find a better class name. Based on what we know, the most suitable candidate would be a flag in the existing `Metrics` class.

At first glance, the only "side effect" is that … If you do have a better naming idea for a separate class altogether, please shout out!
Just merged the new class `EphemeralMetrics`.
As promised, here's the full answer to your original comments: why a new class is needed, why not making `Metrics` behave this way by default, etc.

Staged docs: https://awslabs.github.io/aws-lambda-powertools-python/develop/core/metrics/#metrics-isolation

Metrics isolation

You can use `EphemeralMetrics` when you need metrics kept isolated across instances.

NOTE: "This is a typical use case for multi-tenant, or emitting the same metrics for distinct applications."

```python
from aws_lambda_powertools.metrics import EphemeralMetrics, MetricUnit
from aws_lambda_powertools.utilities.typing import LambdaContext

metrics = EphemeralMetrics()


@metrics.log_metrics
def lambda_handler(event: dict, context: LambdaContext):
    metrics.add_metric(name="SuccessfulBooking", unit=MetricUnit.Count, value=1)
```

Differences between `Metrics` and `EphemeralMetrics`
| Feature | Metrics | EphemeralMetrics |
| --- | --- | --- |
| Share data across instances (metrics, dimensions, metadata, etc.) | Yes | - |
| Default dimensions that persist across Lambda invocations (metric flush) | Yes | - |
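A rough sketch of what the table implies, assuming Powertools v2.2.0+ (where `EphemeralMetrics` exists); the `namespace`/`service` constructor arguments mirror `Metrics` and could instead be set via the `POWERTOOLS_METRICS_NAMESPACE`/`POWERTOOLS_SERVICE_NAME` environment variables:

```python
from aws_lambda_powertools.metrics import EphemeralMetrics, Metrics, MetricUnit

# Metrics: storage is shared, so two instances see the same metric set and
# both log_metrics decorators end up serializing the same (merged) data.
shared_a = Metrics(namespace="Demo", service="booking")
shared_b = Metrics(namespace="Demo", service="booking")
shared_a.add_metric(name="SuccessfulBooking", unit=MetricUnit.Count, value=1)
# shared_b now holds SuccessfulBooking as well.

# EphemeralMetrics: each instance keeps its own metric set, so each
# log_metrics decorator flushes its own, independent EMF blob.
isolated_a = EphemeralMetrics(namespace="Demo", service="booking")
isolated_b = EphemeralMetrics(namespace="Demo", service="booking")
isolated_a.add_metric(name="SuccessfulBooking", unit=MetricUnit.Count, value=1)
# isolated_b is unaffected by the line above.
```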
"Why not changing the default
Metrics
behaviour to not share data across instances?"
This is an intentional design to prevent accidental data deduplication or data loss issues due to the CloudWatch EMF metric dimension constraint.

In CloudWatch, there are two metric ingestion mechanisms: EMF (async) and the `PutMetricData` API (sync). The former creates metrics asynchronously via CloudWatch Logs, and the latter uses a synchronous and more flexible ingestion API.
Pause for a key concept

CloudWatch considers a metric unique by a combination of metric name, metric namespace, and zero or more metric dimensions.

With EMF, metric dimensions are shared with any metrics you define. With the `PutMetricData` API, you can set a list defining one or more metrics with distinct dimensions.
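To make that concrete, here's a hedged sketch: a hand-written EMF blob (shape per the EMF spec, not actual Powertools output) where every listed metric shares the declared dimension set, versus a `boto3` `put_metric_data` call where each metric carries its own dimensions:

```python
import json

import boto3

# EMF (async, via CloudWatch Logs): all metrics in the blob share "Dimensions".
emf_blob = {
    "_aws": {
        "Timestamp": 1668470400000,
        "CloudWatchMetrics": [
            {
                "Namespace": "booking",
                "Dimensions": [["service", "tenant_id"]],
                "Metrics": [
                    {"Name": "SuccessfulBooking", "Unit": "Count"},
                    {"Name": "ColdStart", "Unit": "Count"},  # also gets tenant_id
                ],
            }
        ],
    },
    "service": "booking",
    "tenant_id": "sample",
    "SuccessfulBooking": 1,
    "ColdStart": 1,
}
print(json.dumps(emf_blob))  # printing the blob to stdout is the ingestion step

# PutMetricData (sync API): each MetricData entry has its own dimensions.
boto3.client("cloudwatch").put_metric_data(
    Namespace="booking",
    MetricData=[
        {
            "MetricName": "SuccessfulBooking",
            "Dimensions": [
                {"Name": "service", "Value": "booking"},
                {"Name": "tenant_id", "Value": "sample"},
            ],
            "Unit": "Count",
            "Value": 1,
        },
        {
            "MetricName": "ColdStart",
            "Dimensions": [
                {"Name": "service", "Value": "booking"},
                {"Name": "function_name", "Value": "sample"},
            ],
            "Unit": "Count",
            "Value": 1,
        },
    ],
)
```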
This is a subtle yet important distinction. Imagine you had the following metrics to emit:
| Metric Name | Dimension | Intent |
| --- | --- | --- |
| SuccessfulBooking | service="booking", tenant_id="sample" | Application metric |
| IntegrationLatency | service="booking", function_name="sample" | Operational metric |
| ColdStart | service="booking", function_name="sample" | Operational metric |
The `tenant_id` dimension could vary, leading to two common issues:

1. The `ColdStart` metric will be created multiple times (N * number of unique `tenant_id` dimension values), despite the `function_name` being the same.
2. The `IntegrationLatency` metric will also be created multiple times due to `tenant_id` as well as `function_name` (which may or may not be intentional).

These issues are exacerbated when you create (A) metric dimensions conditionally, or (B) multiple `Metrics` instances throughout your code instead of reusing them (globals). Subsequent instances will have (or lack) different metric dimensions, resulting in different metrics and data points with the same name.
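A small hedged sketch of scenario (A), a conditionally-added dimension on a shared instance (tenant IDs and event shape are made up):

```python
from aws_lambda_powertools.metrics import Metrics, MetricUnit

metrics = Metrics(namespace="booking", service="booking")


@metrics.log_metrics
def handler(event: dict, context):
    # (A) Dimension added conditionally: only some invocations carry tenant_id.
    if event.get("tenant_id"):
        metrics.add_dimension(name="tenant_id", value=event["tenant_id"])

    # Every metric flushed from this shared set inherits tenant_id when present,
    # so the same metric name fans out into one CloudWatch metric per unique
    # (name, namespace, dimensions) combination.
    metrics.add_metric(name="SuccessfulBooking", unit=MetricUnit.Count, value=1)
    metrics.add_metric(name="IntegrationLatency", unit=MetricUnit.Milliseconds, value=42)
```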
Intentional design to address these scenarios
On 1, when you enable the `capture_cold_start_metric` feature, we transparently create and flush an additional EMF JSON blob that is independent from your application metrics. This prevents data pollution.

On 2, you can use `EphemeralMetrics` to create an additional EMF JSON blob from your application metric (`SuccessfulBooking`). This ensures that `IntegrationLatency` operational metric data points aren't tied to any dynamic dimension values like `tenant_id`.

That is why `Metrics` shares data across instances by default, as that covers 80% of use cases and the different personas using Powertools. This allows them to instantiate `Metrics` in multiple places throughout their code - be it a separate file, a middleware, or an abstraction that sets default dimensions.
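Putting points 1 and 2 together, a rough sketch (assuming v2.2.0+; names and values are illustrative) of keeping tenant-scoped application metrics separate from operational metrics:

```python
from aws_lambda_powertools.metrics import EphemeralMetrics, Metrics, MetricUnit

# Operational metrics: shared instance, stable dimensions; the cold start
# metric is flushed as its own independent EMF blob.
ops_metrics = Metrics(namespace="booking", service="booking")

# Application metrics: isolated instance, safe to attach dynamic dimensions.
app_metrics = EphemeralMetrics(namespace="booking", service="booking")


@ops_metrics.log_metrics(capture_cold_start_metric=True)
@app_metrics.log_metrics
def lambda_handler(event: dict, context):
    app_metrics.add_dimension(name="tenant_id", value=event.get("tenant_id", "unknown"))
    app_metrics.add_metric(name="SuccessfulBooking", unit=MetricUnit.Count, value=1)

    ops_metrics.add_metric(name="IntegrationLatency", unit=MetricUnit.Milliseconds, value=100)
    return {"statusCode": 200}
```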
This is now released under version 2.2.0!
@tibbe this is now available as part of the v2.2.0 release (Lambda Layer v13): https://github.com/awslabs/aws-lambda-powertools-python/releases/tag/v2.2.0

Let me know if that doesn't address your original ask.
@heitorlessa thanks! I've now integrated 2.2.0 in our backend and things seem to be working well so far!
Expected Behaviour

When using two `Metrics` objects and two nested calls to `log_metrics`, I expected both sets of metrics to be serialized. The reason for two `Metrics` objects is that every metric is output twice, with a different set of dimensions. Using `single_metric` for everything works but seems sub-optimal from a performance standpoint.

Example:
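A condensed version of the nesting pattern, taken from the standalone reproduction earlier in this thread:

```python
from aws_lambda_powertools.metrics import Metrics, MetricUnit

api_metrics = Metrics()
detailed_api_metrics = Metrics()


def handler(event, context):
    api_metrics.add_metric(name="Count", unit=MetricUnit.Count, value=1)
    detailed_api_metrics.add_metric(name="Count", unit=MetricUnit.Count, value=1)


# Nested log_metrics: only one EMF blob ends up in the CloudWatch logs.
handler = api_metrics.log_metrics(handler, capture_cold_start_metric=True)
handler = detailed_api_metrics.log_metrics(handler)
```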
Current Behaviour

Looking in the CloudWatch logs, only the JSON for one set of metrics (`api_metrics` above) is output.

Code snippet

Possible Solution

No response

Steps to Reproduce

Lambda.

AWS Lambda Powertools for Python version

1.25.10

AWS Lambda function runtime

3.9

Packaging format used

PyPi

Debugging logs

Note that the two `Metrics` objects are used unconditionally right after each other, so it would be weird for one to have metrics and the other to have none.