
Feature request: Add support for default dimensions on ColdStart metric captured with log_metrics decorator #5237

Closed
2 tasks done
angelcabo opened this issue Sep 25, 2024 · 6 comments
@angelcabo

angelcabo commented Sep 25, 2024

Use case

In my environment, I have two serverless applications: one built using Python and the other using TypeScript/Node.js. Both utilize AWS Powertools. However, I have noticed a discrepancy between the two in how they handle CloudWatch cold start metrics by default.

In the Node.js Powertools, the middleware automatically includes default dimensions such as service name, function name, and additional custom dimensions.

    .use(logMetrics(metrics, {
        throwOnEmptyMetrics: false,
        captureColdStartMetric: true,
        defaultDimensions: {'Environment': process.env.ENV!}
    }))

On the other hand, in the Python Powertools, the cold start metric only includes the function and service name by default, with no apparent way to extend or override this behavior using custom dimensions without completely managing the cold start metric manually.

@metrics.log_metrics(capture_cold_start_metric=True, default_dimensions={"Environment": ENVIRONMENT})
def lambda_handler(event: dict, context: LambdaContext):
    ...

This creates an inconsistency between the two libraries and forces us to write additional boilerplate code in Python if we want additional dimensions to be recorded along with the ColdStart metric, while the Node.js implementation provides a more seamless experience.

FWIW:

The docs are accurate in both cases:
Python's page says:

If it's a cold start invocation, this feature will:

  • Create a separate EMF blob solely containing a metric named ColdStart
  • Add function_name and service dimensions

and TypeScript's page says:

If it's a cold start invocation, this feature will:

  • Create a separate EMF blob solely containing a metric named ColdStart
  • Add the function_name, service and default dimensions

Solution/User Experience

Ideally, the default_dimensions provided to the log_metrics decorator would be used when recording the cold start metric in Python.

Alternative solutions

As mentioned, an alternative is to continue to have users manually handle the cold start metrics themselves to support adding custom dimensions to the ColdStart metric in Python.
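
For reference, the manual alternative could look roughly like the sketch below. It uses only the standard library and writes the EMF blob by hand instead of going through Powertools. The helper name build_cold_start_emf, the module-level flag, and the "MyApp" and "prod" values are illustrative assumptions, not part of either library.

```python
import json
import time

# Module scope survives warm invocations, so this is True only for the
# first invocation in a new execution environment.
_is_cold_start = True


def build_cold_start_emf(namespace, function_name, extra_dimensions):
    """Build an EMF blob for ColdStart with function_name plus extra dimensions.

    Hypothetical helper for illustration; not a Powertools API.
    """
    dimensions = {"function_name": function_name, **extra_dimensions}
    return {
        "_aws": {
            "Timestamp": int(time.time() * 1000),
            "CloudWatchMetrics": [
                {
                    "Namespace": namespace,
                    "Dimensions": [list(dimensions)],
                    "Metrics": [{"Name": "ColdStart", "Unit": "Count"}],
                }
            ],
        },
        **dimensions,
        "ColdStart": 1,
    }


def lambda_handler(event, context):
    global _is_cold_start
    if _is_cold_start:
        _is_cold_start = False
        blob = build_cold_start_emf(
            "MyApp", context.function_name, {"Environment": "prod"}
        )
        # Lambda forwards stdout to CloudWatch Logs, where EMF is parsed.
        print(json.dumps(blob))
    return {"statusCode": 200, "body": "Hello World"}
```

This is the boilerplate the request is trying to avoid: every function has to carry its own cold start flag and dimension handling.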

Acknowledgment

@angelcabo angelcabo added feature-request feature request triage Pending triage from maintainers labels Sep 25, 2024

boring-cyborg bot commented Sep 25, 2024

Thanks for opening your first issue here! We'll come back to you as soon as we can.
In the meantime, check out the #python channel on our Powertools for AWS Lambda Discord: Invite link

@dreamorosi
Contributor

dreamorosi commented Sep 26, 2024

Hi, thank you for opening this issue.

Just for added context, I have gone ahead and created a sample project to help with contextualizing the request and better see the differences between the two versions.

Given these two functions:

TypeScript:

import { Metrics } from '@aws-lambda-powertools/metrics';
import { logMetrics } from '@aws-lambda-powertools/metrics/middleware';
import middy from '@middy/core';

const metrics = new Metrics({
  namespace: 'MyApp',
});

export const handler = middy(async () => {
  return {
    statusCode: 200,
    body: JSON.stringify('Hello, World!'),
  };
}).use(
  logMetrics(metrics, {
    throwOnEmptyMetrics: false,
    captureColdStartMetric: true,
    defaultDimensions: { Environment: 'prod' },
  })
);

Python:

from aws_lambda_powertools import Metrics
from aws_lambda_powertools.metrics import MetricUnit
from aws_lambda_powertools.utilities.typing import LambdaContext

metrics = Metrics(namespace="MyApp")

@metrics.log_metrics(raise_on_empty_metrics=False, capture_cold_start_metric=True, default_dimensions={"Environment": "prod"})
def lambda_handler(event: dict, context: LambdaContext):
    return {"statusCode": 200, "body": "Hello World"}

The EMF blobs emitted by the two are similar but differ in three respects:

  • The EMF metric emitted by TypeScript includes default dimensions (in this case Environment), and Python's doesn't
  • The EMF metric emitted by TypeScript includes the service dimension by default, and Python's doesn't
  • The value recorded for ColdStart in TypeScript is an integer (1), while in Python it is a float (1.0) wrapped in an array/list

Below are the full metric logs:

TypeScript:

{
  "_aws": {
    "Timestamp": 1727339583244,
    "CloudWatchMetrics": [
      {
        "Namespace": "MyApp",
        "Dimensions": [
          [
            "service",
            "Environment",
            "function_name"
          ]
        ],
        "Metrics": [
          {
            "Name": "ColdStart",
            "Unit": "Count"
          }
        ]
      }
    ]
  },
  "service": "service_undefined",
  "Environment": "prod",
  "function_name": "DimensionsFn",
  "ColdStart": 1
}

Python:

{
  "_aws": {
    "Timestamp": 1727339552523,
    "CloudWatchMetrics": [
      {
        "Namespace": "MyApp",
        "Dimensions": [
          [
            "function_name"
          ]
        ],
        "Metrics": [
          {
            "Name": "ColdStart",
            "Unit": "Count"
          }
        ]
      }
    ]
  },
  "function_name": "DimensionsFnPython",
  "ColdStart": [
    1.0
  ]
}
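
The dimension difference between the two blobs can also be checked programmatically. The sketch below is a minimal example that trims the two logs above to their CloudWatchMetrics directive and compares the declared dimension names. The helper emf_dimension_names is a hypothetical name introduced here.

```python
def emf_dimension_names(blob):
    """Return the set of dimension names declared in an EMF blob."""
    names = set()
    for directive in blob["_aws"]["CloudWatchMetrics"]:
        for dimension_set in directive["Dimensions"]:
            names.update(dimension_set)
    return names


# Trimmed copies of the two logs above (timestamps and values omitted).
typescript_blob = {
    "_aws": {
        "CloudWatchMetrics": [
            {
                "Namespace": "MyApp",
                "Dimensions": [["service", "Environment", "function_name"]],
                "Metrics": [{"Name": "ColdStart", "Unit": "Count"}],
            }
        ]
    }
}
python_blob = {
    "_aws": {
        "CloudWatchMetrics": [
            {
                "Namespace": "MyApp",
                "Dimensions": [["function_name"]],
                "Metrics": [{"Name": "ColdStart", "Unit": "Count"}],
            }
        ]
    }
}

only_in_typescript = emf_dimension_names(typescript_blob) - emf_dimension_names(python_blob)
print(sorted(only_in_typescript))  # ['Environment', 'service']
```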

For the purpose of this issue, I think we are focusing on the first bullet point, the default dimensions being included in the cold start metric.

I see value in aligning the behavior between the two versions, although I don't have a strong opinion on which one should be the preferred one.

What I can say for sure is that making this change in either of the two libraries would be a breaking change, so we'll need to address this in a backwards compatible way if we decide to do so.

@leandrodamascena
Contributor

Hi, thank you for opening this issue.

Just for added context, I have gone ahead and created a sample project to help with contextualizing the request and better see the differences between the two versions. You can find the repo, which you can deploy using CDK, here:

Hi @dreamorosi, thank you very much for this investigation using both runtimes! This will be very useful in helping us make a decision. I have a few considerations below.

The EMF blobs emitted by the two are similar but differ in three respects:

  • The EMF metric emitted by TypeScript includes default dimensions (in this case Environment), and Python doesn't

Yes, we don't include the default dimensions in Python, which I think is the wrong behavior because default dimensions should apply to all metrics, including ColdStart. In this case, TypeScript is doing the right thing and Python is missing this information.

  • The EMF metric emitted by TypeScript includes the service dimension by default, and Python doesn't

We include the service dimension in Python too. In this case, you're not seeing it because you didn't define the service through the constructor parameter or the environment variable.

  • The value recorded for ColdStart in TypeScript is an integer (1), while in Python it is a float (1.0) wrapped in an array/list

In this case, there is no difference between using an int/float and a list of int/float values. We do this to optimize the blob that we emit to EMF as much as possible, so in cases where you have two metrics, we define both in the Metrics key and pass the values in that array.
The resulting data in CloudWatch will be the same.
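
To illustrate the equivalence: as far as I know, the EMF specification accepts either a number or an array of numbers as a metric value. The sketch below is a hypothetical normalizer (emf_values is not a Powertools function) showing that the TypeScript value 1 and the Python value [1.0] describe the same single datapoint.

```python
def emf_values(raw):
    """Normalize an EMF metric value (a number or a list of numbers) to a list of floats."""
    if isinstance(raw, (int, float)):
        return [float(raw)]
    return [float(value) for value in raw]


typescript_value = 1    # "ColdStart": 1
python_value = [1.0]    # "ColdStart": [1.0]

print(emf_values(typescript_value) == emf_values(python_value))  # True
```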

I see value in aligning the behavior between the two versions, although I don't have a strong opinion on which one should be the preferred one.

I also see value in aligning behavior, and I think including default dimensions in the ColdStart metric makes more sense because as a customer I would expect all my metrics —no matter which ones— to inherit default dimensions if I set them intentionally.

What I can say for sure is that making this change in either of the two libraries would be a breaking change, so we'll need to address this in a backwards compatible way if we decide to do so.

This is the most problematic point in this discussion. If we decide to include default dimensions in the ColdStart metric, we cannot make this the default behavior. This is a potential breaking change for customers who have dashboards looking at ColdStart without dimensions. CloudWatch costs can also be an issue for customers because each combination of metric_name x namespace x dimension added to CloudWatch is billed as a separate metric, so customers may see higher bills if we make this the default behavior. There are likely other situations that I can't predict.
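
A toy model of both concerns, assuming CloudWatch identifies a metric by its namespace, name, and exact set of dimensions. The in-memory store and the put_metric and get_datapoints helpers are hypothetical stand-ins, not AWS APIs, and the "checkout" function name is made up.

```python
from collections import defaultdict

# Keyed by (namespace, metric name, exact dimension set).
store = defaultdict(list)


def put_metric(namespace, name, dimensions, value):
    store[(namespace, name, frozenset(dimensions.items()))].append(value)


def get_datapoints(namespace, name, dimensions):
    # Exact match on the dimension set; no aggregation across dimension sets.
    return store.get((namespace, name, frozenset(dimensions.items())), [])


# Current Python behavior: ColdStart carries only function_name.
put_metric("MyApp", "ColdStart", {"function_name": "checkout"}, 1)
put_metric("MyApp", "ColdStart", {"function_name": "checkout"}, 1)
# With default dimensions included, each Environment value is a new metric.
put_metric("MyApp", "ColdStart", {"function_name": "checkout", "Environment": "prod"}, 1)
put_metric("MyApp", "ColdStart", {"function_name": "checkout", "Environment": "staging"}, 1)

print(len(store))  # 3 distinct metrics instead of 1
print(get_datapoints("MyApp", "ColdStart", {"function_name": "checkout"}))  # [1, 1]
```

A dashboard that queries ColdStart by function_name alone only sees the first two datapoints, which is the breakage described above.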

I was thinking about creating a new parameter in @metrics.log_metrics() to allow customers to choose whether to include the default dimensions, and then making this the default behavior in Powertools V4. The experience could be something like this:

@metrics.log_metrics(capture_cold_start_metric=True, include_default_dimensions_cold_start=True)
def lambda_handler(event: dict, context: LambdaContext):
    ...

I don't like the name of this parameter, but it's just for illustration.

Please let me know what you think, @angelcabo and @dreamorosi.

@dreamorosi
Contributor

Thanks for clarifying the int/float difference. For the service name, I think the Python behavior is better for cost savings, but I can see why it was implemented this way in TS. From now on I'll focus the discussion only on the default dimensions being included in the cold start metric.

Seeing how we all agree that default dimensions should be included in that metric, I think we have two options forward. One is the one you suggested, making the behavior opt-in in the current major version and eventually default in the next one.

The other option would be to consider this a bug and fix it right away. I acknowledge that the line here is quite blurry. However, while the log_metrics decorator doesn't explicitly mention the default dimensions being included, the description of default_dimensions in the API reference reads as follows:
[image: screenshot of the default_dimensions description in the API reference]

With this in mind, I think there's room to interpret their exclusion as unintended behavior and thus treat this as a bug fix. There are also precedents for this type of change being treated as a bug and fixed within a minor release.

Either way, I think there's a potential for data loss. Customers might be trying to set up dashboards that query the cold start metric with certain dimensions expecting to find values because they set default_dimensions and they don't find them.


If we go the route you suggested, I'd like us to consider using an environment variable instead of adding another parameter. I think in the long term this would make the v3 to v4 switch easier, and the API less verbose in the code.

For example, having an env variable like POWERTOOLS_METRICS_DEFAULT_DIMENSIONS_IN_COLDSTART_METRIC that defaults to False when not present.
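
A minimal sketch of how such a flag could be read, assuming the variable name proposed above. It is only a proposal from this thread, not an existing Powertools setting, and the accepted truthy strings and both helper names are my own assumptions.

```python
import os

# Proposed in this thread; not an existing Powertools environment variable.
ENV_VAR = "POWERTOOLS_METRICS_DEFAULT_DIMENSIONS_IN_COLDSTART_METRIC"


def default_dimensions_in_cold_start(env=None):
    """Return True only when the flag is explicitly enabled; absent means False."""
    env = os.environ if env is None else env
    # Accepted truthy strings are an assumption made for this sketch.
    return env.get(ENV_VAR, "false").strip().lower() in {"1", "true", "yes"}


def cold_start_dimensions(function_name, default_dimensions, env=None):
    """Pick the dimensions for ColdStart, opting in to defaults via the flag."""
    dimensions = {"function_name": function_name}
    if default_dimensions_in_cold_start(env):
        dimensions.update(default_dimensions)
    return dimensions


print(cold_start_dimensions("checkout", {"Environment": "prod"}, env={}))
# {'function_name': 'checkout'}
print(cold_start_dimensions("checkout", {"Environment": "prod"}, env={ENV_VAR: "true"}))
# {'function_name': 'checkout', 'Environment': 'prod'}
```

With the flag absent the current behavior is preserved, and a later major version would only need to flip the default.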

@dreamorosi dreamorosi added need-more-information Pending information to continue and removed triage Pending triage from maintainers labels Sep 26, 2024
@dreamorosi dreamorosi moved this from Triage to Ideas in Powertools for AWS Lambda (Python) Sep 26, 2024
@anafalcao anafalcao moved this from Ideas to Backlog in Powertools for AWS Lambda (Python) Jan 29, 2025
@anafalcao anafalcao added metrics and removed need-more-information Pending information to continue labels Jan 29, 2025
@leandrodamascena
Contributor

leandrodamascena commented Jan 29, 2025

I'm coming here with some news after some testing with our codebase and published metrics for CloudWatch Metrics.

1 - Changing our codebase to include a new parameter like add_custom_dimensions_to_cold_start will add more complexity to our API and some complexity to the logic.

2 - Changing this behavior to be the default and including dimensions in ColdStart will have an undesirable impact on customers who already use standard dimensions in this metric: they will pay more for this metric, and they may lose data when querying the ColdStart metric with only the function_name dimension. I recognize that the current behavior is not what customers would expect, but unfortunately we can't make this breaking change, at least not right now.

3 - We build Powertools to be as modular as possible, so in this case you can subclass the EMF provider and override the method that publishes the ColdStart metric. You can do this with the following code:

from aws_lambda_powertools.metrics import MetricUnit, single_metric
from aws_lambda_powertools.utilities.typing import LambdaContext
from aws_lambda_powertools.metrics.provider.cloudwatch_emf.cloudwatch import AmazonCloudWatchEMFProvider


class MetricsCustom(AmazonCloudWatchEMFProvider):
    def add_cold_start_metric(self, context: LambdaContext) -> None:
        with single_metric(name="ColdStart", unit=MetricUnit.Count, value=1, namespace=self.namespace) as metric:
            metric.add_dimension(name="function_name", value=context.function_name)
            if self.service:
                metric.add_dimension(name="service", value=str(self.service))

            # Including default dimensions
            for key, value in self.dimension_set.items():
                metric.add_dimension(name=key, value=str(value)) 


metrics = MetricsCustom(namespace="MyNameSpace")
metrics.add_dimension(name="dimension1", value="dimension1")

@metrics.log_metrics(capture_cold_start_metric=True)
def lambda_handler(event: dict, context: LambdaContext):
    metrics.add_metric(name="metric1", unit=MetricUnit.Bytes, value=1)
    return {"my": "metric"}

4 - I'm adding this feature request to our discussion that will collect items for Powertools v4. As soon as we have enough items to justify a new major version, we'll start working on it.

I'm closing this issue as not planned now, but we'll keep tracking it in our discussion.

Thanks everyone for your input and ideas.

@leandrodamascena leandrodamascena closed this as not planned Jan 29, 2025
@github-project-automation github-project-automation bot moved this from Backlog to Coming soon in Powertools for AWS Lambda (Python) Jan 29, 2025
Contributor

⚠️COMMENT VISIBILITY WARNING⚠️

This issue is now closed. Please be mindful that future comments are hard for our team to see.

If you need more assistance, please either tag a team member or open a new issue that references this one.

If you wish to keep having a conversation with other community members under this issue feel free to do so.

@leandrodamascena leandrodamascena moved this from Coming soon to Shipped in Powertools for AWS Lambda (Python) Feb 11, 2025
Projects
Status: Shipped
Development

No branches or pull requests

4 participants