Skip to content

Bug: Custom CloudWatch metrics not always appearing for split route(s) #1805

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
pawosty opened this issue Jan 4, 2023 · 8 comments · Fixed by #1827
Closed

Bug: Custom CloudWatch metrics not always appearing for split route(s) #1805

pawosty opened this issue Jan 4, 2023 · 8 comments · Fixed by #1827

Comments

@pawosty
Copy link
Contributor

pawosty commented Jan 4, 2023

Expected Behaviour

When uploading a custom metric using the aws_lambda_powertools.metrics package, the metrics should appear in CloudWatch Metrics.

Current Behaviour

Metrics appear intermittently and appear to be tied to whether a lambda was cold started (and also had a cold start metric uploaded).

Code snippet

Split route handler:

metrics: Metrics = Metrics()

@router.delete("/v1/asset", cors=True)
@tracer.capture_method
def delete_asset() -> dict:
    metrics.add_metric(name="AssetDeleted", unit=MetricUnit.Count, value=1)
    return {"message": "Asset deletion successful"}

Lambda handler:

@metrics.log_metrics(capture_cold_start_metric=True)
@logger.inject_lambda_context(correlation_id_path=correlation_paths.API_GATEWAY_REST, log_event=True)
@tracer.capture_lambda_handler
@auth.enforce_allow_list
@auth.restrict_api_key
def lambda_handler(event: Dict, context: LambdaContext):
    return app.resolve(event, context)

Possible Solution

When examining the cloudwatch logs, the unsuccessful metrics appear in the same log line as the tracer log, for example:

{
    "resource": "/v1/asset",
    "path": "/v1/asset",
    "httpMethod": "DELETE",
...more data...
}{
    "_aws": {
        "Timestamp": 1672800187055,
        "CloudWatchMetrics": [
            {
                "Namespace": "MyNamespace",
                "Dimensions": [
                    [
                        "service"
                    ]
                ],
                "Metrics": [
                    {
                        "Name": "AssetDeleted",
                        "Unit": "Count"
                    }
                ]
            }
        ]
    },
    "service": "MyLambda",
    "AssetDeleted": [
        1
    ]
}

When successful metrics appear (such as when they are uploaded as part of a cold start), they appear in their own log line in CloudWatch and are presumably parsed.

Steps to Reproduce

  • Create a split route lambda handler, log cold starts
  • In the split route(s) attempt to create/upload custom metrics
  • Observe not all metrics appear in CloudWatch metrics as expected

AWS Lambda Powertools for Python version

latest

AWS Lambda function runtime

3.6

Packaging format used

PyPi

Debugging logs

No response

@pawosty pawosty added bug Something isn't working triage Pending triage from maintainers labels Jan 4, 2023
@boring-cyborg
Copy link

boring-cyborg bot commented Jan 4, 2023

Thanks for opening your first issue here! We'll come back to you as soon as we can.
In the meantime, check out the #python channel on our AWS Lambda Powertools Discord: Invite link

@leandrodamascena leandrodamascena self-assigned this Jan 4, 2023
@leandrodamascena
Copy link
Contributor

Hi @pawosty! Thank you for opening this issue and letting us know about a possible issue.

I inspected the code and tried to simulate this problem here and I couldn't. I don't know if I'm missing something specific you're doing, but I tried to create an example similar to what you did and I got the expected behavior: all metrics were sent to CloudWatch.

I created a project using sam cli and pushed it to a github repository (https://github.com/leandrodamascena/custom-metrics-example). You can look at the code I made to try to reproduce the problem and see if there is something different in your environment that I should adjust here to reproduce the problem.

Below are screenshots of the tests I performed.

image
image

Thank you.

@leandrodamascena leandrodamascena added researching metrics cant-reproduce Any issues that cannot be reproduced and removed bug Something isn't working triage Pending triage from maintainers labels Jan 4, 2023
@pawosty
Copy link
Contributor Author

pawosty commented Jan 5, 2023

Thanks for looking into this! I have noticed that I have lambda context data appearing twice, once as a log item as you do, and then a second time clobbered with the metrics data, so I think there might be a dependency interfering. I'll try to get to the bottom of this and confirm.

@pawosty pawosty closed this as completed Jan 5, 2023
@pawosty pawosty reopened this Jan 5, 2023
@github-actions
Copy link
Contributor

github-actions bot commented Jan 5, 2023

⚠️COMMENT VISIBILITY WARNING⚠️

This issue is now closed. Please be mindful that future comments are hard for our team to see.

If you need more assistance, please either tag a team member or open a new issue that references this one.

If you wish to keep having a conversation with other community members under this issue feel free to do so.

@leandrodamascena
Copy link
Contributor

leandrodamascena commented Jan 5, 2023

Thanks for looking into this! I have noticed that I have lambda context data appearing twice, once as a log item as you do, and then a second time clobbered with the metrics data, so I think there might be a dependency interfering. I'll try to get to the bottom of this and confirm.

Thanks for reply.

Let me know if I can help in any way to detect if this is indeed a bug in Powertools. If so, we are fully interested in fixing it.

@pawosty
Copy link
Contributor Author

pawosty commented Jan 5, 2023

Ok, I believe I found the issue. The extra JSON appearing in the logs was because I had "debug" enabled on the APIGatewayResolver object, causing this line to be executed:

if self._debug: print(self._json_dump(event), end="")

This gets combined with the metrics output and then ignored by CloudWatch metrics. A member on our team was able to confirm setting debug mode to FALSE results in metrics appearing as expected.

I'm not sure why this specific print behaves this way, maybe because a newline is being omitted? Regardless I think it's an unexpected interaction.

@github-actions
Copy link
Contributor

⚠️COMMENT VISIBILITY WARNING⚠️

This issue is now closed. Please be mindful that future comments are hard for our team to see.

If you need more assistance, please either tag a team member or open a new issue that references this one.

If you wish to keep having a conversation with other community members under this issue feel free to do so.

@leandrodamascena leandrodamascena added pending-release Fix or implementation already in dev waiting to be released event_handlers and removed researching cant-reproduce Any issues that cannot be reproduced labels Jan 11, 2023
@github-actions
Copy link
Contributor

This is now released under 2.6.0 version!

@github-actions github-actions bot removed the pending-release Fix or implementation already in dev waiting to be released label Jan 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants