Skip to content

Feature request: HTTP request/response annotations #1179

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
2 tasks done
revmischa opened this issue Nov 18, 2022 · 15 comments
Open
2 tasks done

Feature request: HTTP request/response annotations #1179

revmischa opened this issue Nov 18, 2022 · 15 comments
Labels
discussing The issue needs to be discussed, elaborated, or refined feature-request This item refers to a feature request for an existing or new utility need-customer-feedback Requires more customers feedback before making or revisiting a decision need-more-information Requires more information before making any calls tracer This item relates to the Tracer Utility

Comments

@revmischa
Copy link

revmischa commented Nov 18, 2022

Use case

I have a nextjs CDK construct with a simple APIGW lambda wrapper around nextjs. I want to annotate my traces with HTTP request and response metadata, e.g. "URL" and "method" and "status code"

I would like to use it here: https://github.com/jetbridge/cdk-nextjs/blob/main/assets/lambda/NextJsHandler.ts#L45

I'd like to see the status code, path, method, etc in xray:
image

Solution/User Experience

Perhaps something like traceRequestResponseCycle in https://docs.aws.amazon.com/xray-sdk-for-nodejs/latest/reference/AWSXRay.html

Alternative solutions

Discussed on discord: https://discord.com/channels/1006478942305263677/1006527385409179678/1042309163818168380

Acknowledgment

@revmischa revmischa added the triage This item has not been triaged by a maintainer, please wait label Nov 18, 2022
@dreamorosi dreamorosi added tracer This item relates to the Tracer Utility feature-request This item refers to a feature request for an existing or new utility discussing The issue needs to be discussed, elaborated, or refined need-more-information Requires more information before making any calls and removed triage This item has not been triaged by a maintainer, please wait labels Nov 18, 2022
@dreamorosi
Copy link
Contributor

Hi @revmischa thank you for taking the time to open a feature request.

As I have briefly mentioned on Discord: at the moment we don't use any of the middleware features from aws-xray-sdk so I'm unsure on how it will behave with Tracer. With that said, a first cursory look at traceRequestResponseCycle's implementation suggests that the utility provides some syntactic sugar around segment manipulation/annotation, which bodes well for Powertools compatibility.

Before considering the feature and putting it into the backlog, I would like some investigation to take place to make a small PoC and understand the impact and compatibility of this feature with Powertools.

Like with all features, contributions are more than welcome, and in case this is left to the maintainers, having 👍 on the feature request will help increasing its priority in our backlog.

If anyone is interested in picking up this issue, please leave a comment below to express your interest and agree on next steps. The maintainers will be happy to support you with clearing doubts or questions on the scope and codebase.

@dreamorosi dreamorosi added the help-wanted We would really appreciate some support from community for this one label Nov 18, 2022
@revmischa
Copy link
Author

For my purposes a simpler version of this would be to let me manually set the HTTP path, method, status code directly on the segment. But I have no idea how to do this and couldn't find any documentation on how to manually set those attributes.

@kennethwkz-mm
Copy link

this is https://www.npmjs.com/package/aws-xray-sdk-express feature, this is very useful when we are tracing some ip address or api URL in a easy way

@dreamorosi
Copy link
Contributor

FYI @revmischa, I have been looking at the aws-sdk-for-nodejs implementation to see if I could find a way to make it work for Tracer without changes but didn't have much luck.

I'll keep looking, also using the example that was just posted above.

@github-actions

This comment was marked as outdated.

@github-actions github-actions bot added the pending-close-response-required This issue will be closed soon unless the discussion moves forward label Feb 28, 2023
@revmischa
Copy link
Author

Please keep open

@dreamorosi dreamorosi added need-customer-feedback Requires more customers feedback before making or revisiting a decision and removed pending-close-response-required This issue will be closed soon unless the discussion moves forward need-more-information Requires more information before making any calls labels Feb 28, 2023
@github-actions

This comment was marked as off-topic.

@github-actions github-actions bot added the pending-close-response-required This issue will be closed soon unless the discussion moves forward label Mar 25, 2023
@dreamorosi dreamorosi added need-more-information Requires more information before making any calls and removed pending-close-response-required This issue will be closed soon unless the discussion moves forward need-customer-feedback Requires more customers feedback before making or revisiting a decision labels Mar 25, 2023
@dreamorosi dreamorosi added this to the Version 2.0 milestone Aug 2, 2023
@dreamorosi dreamorosi removed this from the Version 2.0 milestone Jan 30, 2024
@dreamorosi dreamorosi added confirmed The scope is clear, ready for implementation and removed discussing The issue needs to be discussed, elaborated, or refined need-more-information Requires more information before making any calls labels Jan 28, 2025
@dreamorosi dreamorosi moved this from Ideas to Backlog in Powertools for AWS Lambda (TypeScript) Jan 28, 2025
@dreamorosi
Copy link
Contributor

With the upcoming Event Handler feature incoming, we'll need a way to add these annotations to segments generated by Tracer.

I've been looking at the implementation more closely and I think that we should implement a version of this method so that the http field of the X-Ray segment data is populated as described in the original post.

I'm adding this to the backlog so we can work on it at some point in the coming months.

The issue is open for contributions. If you’re interested, please leave a comment before starting a PR. Let’s discuss and agree on a design for the new feature before implementing it. I don’t want to expose the method as-is from the X-Ray SDK.

@VatsalGoel3
Copy link
Contributor

Hi @dreamorosi,

I’d love to take on this feature request and implement HTTP metadata annotation for X-Ray segments in the Tracer utility.

@revmischa
Copy link
Author

It would also be nice to have as middy middleware for APIGWv2 handlers

@dreamorosi dreamorosi added discussing The issue needs to be discussed, elaborated, or refined need-customer-feedback Requires more customers feedback before making or revisiting a decision need-more-information Requires more information before making any calls and removed help-wanted We would really appreciate some support from community for this one confirmed The scope is clear, ready for implementation labels Feb 14, 2025
@dreamorosi
Copy link
Contributor

Hi @dreamorosi,

I’d love to take on this feature request and implement HTTP metadata annotation for X-Ray segments in the Tracer utility.

Hi @VatsalGoel3 sorry for the delay and confusion, I started doing a PoC for this feature and I realized that we need to first figure out a couple details before we can start implementing.

I'm changing the labels and status of the issue back, and I will share more details in a few minutes.

@dreamorosi
Copy link
Contributor

dreamorosi commented Feb 14, 2025

As I mentioned in the comment above, I've been looking into this and I am unsure about how to proceed.

I've first tested the X-Ray SDK instrumentation of Express.js without Powertools for AWS, to see how they are created and I found out that the sdk creates brand new segments for each request. This is kind of unusual, at least related to how our Tracer works, because we never create new unrelated segments but rather subsegments of the function segment.

To explain it more visually, this is the tree that X-Ray creates when you instrument:

  • AWS Lambda
    • function segment
  • request segment

When doing this, the CloudWatch Traces page populates the HTTP Method & URL Address fields.

Image

When working with Tracer from Powertools instead, the segment you get has this structure:

  • AWS Lambda
    • function segment
      • index.handler subsegment

This is because in Lambda you cannot mutate the function segment but can only either: 1/ create subsegments below it, or 2/ create entirely unrelated segments - which we don't normally do.

This is a bit of a problem in terms of this feature request.

At least as far as I could test, the CloudWatch Traces UI only populates the HTTP Method and URL Address fields when that data is part of the top-most segment, but doesn't do it for subsegments.

For example, even if I attach the http data to the subsegment created by Powertools, it's not shown in the main page:

Image

But it's shown in both the "Raw data" and single trace UIs:

Image

Image

I can't know for sure, but this type of representation seems to have been thought primarily for other compute environments other than Lambda. I say this because in Lambda you still want to maintain the lineage/relationship between your application segment and the function one, which is lost in this case.

Important

Because of this, I'm not inclined to follow the same implementation as the X-Ray SDK and instead I'm inclined towards adding the http data to the subsegment, even if it means not having it displayed in the main page.

I think overall having the data in the subsegment, and having that subsegment maintain the lineage with the request is superior than just having the UI show the field populated.

Especially because the SDK approach also completely breaks the trace map, which now looks like below, making it seem almost as if there were two separate clients and requests, rather than one:

Image

So to sum up, by adding the request/response data to the subsegment, the only decision point we have left is whether we want to this as a direct subsegment or as a sibling of the Powertools-generate done.

Option 1 (child segment)

The first option would be to just create a subsegment under the one that is already being created by the Powertools for AWS Tracer whenever you wrap your handler with a class method decorator or Middy.js middleware.

The structure of the trace data would look like this:

  • AWS Lambda
    • function segment
      • index.handler subsegment
        • request segment (i.e. GET /test)
        • request segment (i.e. GET /foo)

And would generate these screens on the CloudWatch UI:

Trace data

Image

Image

PoC code for this option
import { Tracer } from '@aws-lambda-powertools/tracer';
import { captureLambdaHandler } from '@aws-lambda-powertools/tracer/middleware';
import middy from '@middy/core';
import { parser } from '@aws-lambda-powertools/parser/middleware';
import { APIGatewayProxyEventV2Schema } from '@aws-lambda-powertools/parser/schemas/api-gatewayv2';
import { Segment } from 'aws-xray-sdk-core';

const tracer = new Tracer();

export const handler = middy()
  .use(captureLambdaHandler(tracer))
  .use(parser({ schema: APIGatewayProxyEventV2Schema }))
  .handler(async (event) => {
    const indexSegment = tracer.getSegment()!; // # index.handler
    const subsegment = new Segment(
      `## ${event.requestContext.http.method} ${event.rawPath}`,
      tracer.getRootXrayTraceId(),
      indexSegment.id
    );
    tracer.setSegment(subsegment);

    const method = event.requestContext.http.method;
    const userAgent = event.requestContext.http.userAgent;

    // @ts-expect-error: we are in untyped land
    subsegment.http = {
      request: {
        method,
        user_agent: userAgent,
        client_ip: event.requestContext.http.sourceIp,
        x_forwarded_for: true,
        url: `https://${event.requestContext.domainName}${event.rawPath}`,
      },
    };

    const responseStatusCode = 200;
    const responseBody = JSON.stringify({
      message: 'Hello, World!',
    });
    const contentLength = Buffer.byteLength(responseBody, 'utf8');
    const endTime = Date.now();

    // @ts-expect-error: we are in untyped land again
    subsegment.http.response = {
      status: responseStatusCode,
      content_length: contentLength,
    };

    subsegment.close();
    tracer.setSegment(indexSegment);

    return {
      statusCode: responseStatusCode,
      body: responseBody,
      headers: {
        'Content-Type': 'application/json',
        'Content-Length': contentLength,
      },
    };
  });

Option 2 (sibling segment)

The second option would be to create a subsegment that is a sibling of the Powertools one. Depending on what you're doing in your function, i.e. if you're doing a router and not wrapping the handler with captureLambdaHandler() you can even skip the index.handler segment entirely.

  • AWS Lambda
    • function segment
      • index.handler subsegment
      • request segment (i.e. GET /test)
      • request segment (i.e. GET /foo)

With this approach, the trace data would be represented like this in the UI:

Image

The neat thing about this approach is that you get a clear view in the map of your application route, and if you have more than one in the same function, you also get multiple nodes.

Same screen but other route

Image

And more importantly, you also get them as separate nodes in the Trace Map, which you don't get in the option 1.

Image

PoC code for this option
import { Tracer } from '@aws-lambda-powertools/tracer';
import { captureLambdaHandler } from '@aws-lambda-powertools/tracer/middleware';
import middy from '@middy/core';
import { parser } from '@aws-lambda-powertools/parser/middleware';
import { APIGatewayProxyEventV2Schema } from '@aws-lambda-powertools/parser/schemas/api-gatewayv2';

const tracer = new Tracer();

export const handler = middy()
  .use(captureLambdaHandler(tracer))
  .use(parser({ schema: APIGatewayProxyEventV2Schema }))
  .handler(async (event) => {
    const indexSegment = tracer.getSegment()!; // # index.handler
    const subsegment = indexSegment.addNewSubsegment(
      `## ${event.requestContext.http.method} ${event.rawPath}`,
    ); 
    tracer.setSegment(subsegment);

    const method = event.requestContext.http.method;
    const userAgent = event.requestContext.http.userAgent;

    // @ts-expect-error: we are in untyped land
    subsegment.http = {
      request: {
        method,
        user_agent: userAgent,
        client_ip: event.requestContext.http.sourceIp,
        x_forwarded_for: true,
        url: `https://${event.requestContext.domainName}${event.rawPath}`,
      },
    };

    const responseStatusCode = 200;
    const responseBody = JSON.stringify({
      message: 'Hello, World!',
    });
    const contentLength = Buffer.byteLength(responseBody, 'utf8');
    const endTime = Date.now();

    // @ts-expect-error: we are in untyped land again
    subsegment.http.response = {
      status: responseStatusCode,
      content_length: contentLength,
    };

    subsegment.close();
    tracer.setSegment(indexSegment);

    return {
      statusCode: responseStatusCode,
      body: responseBody,
      headers: {
        'Content-Type': 'application/json',
        'Content-Length': contentLength,
      },
    };
  });

Overall, between the two options I'm kind of leaning towards the second one.

While option 1 is more consistent with the current model and gives you direct lineage of the request segment, option 2 still maintains a full trace all the way from the client to the route while also creating separate route nodes for each one of your request/response flows.

What do we think?

Would also like to hear @leandrodamascena's opinion here, since I think this is a new behavior also for Python.

@leandrodamascena
Copy link
Contributor

leandrodamascena commented Feb 16, 2025

Hey, this is a super cool discussion about user experience and what we can do better for our customers, I really like it. I'll skip the part about XRAY Segments, because @dreamorosi already explained very well all the stuff about not being able to mutate the root segment, not being able to index/display http.url/http.response_code outside the root segment and others. BTW, It was all explained with rich details.

Creating new subsegments

When Event Handler comes out - this is also a reality in Python -, we will have routes resolving to functions, so, this is not what we are doing with standalone decorator and customers already can do that? The difference I see here are:

1/ the name of the segment/subsegment
2/ adding the URL in the HTTP fields and making it a HTTP integration, what will create a new subsegment/node. But this is not an external HTTP integration, not indexable and customer won't be able to search for this.

Why not we try to explore to improve the documentation with some information about adding annotations and making it searchable? Or even changing the standalone method to allow custom name? I don't know, just guessing here.

Also, I look at the two images and I'm trying to think like an operations person who's on call and just got an alert that Lambda function XYZ is running slow (10 seconds). When I check the traces, I imagine this Lambda function making GET and POST requests to somewhere/foo, because its looks like this. When I click on those, I see it looks like an external call to a URL in API Gateway. Then, looking at that API Gateway, I notice its GET and POST methods are actually integrating to the same Lambda function that's alerting. So, my immediate thought is: "Hmm, why is this function calling an endpoint that points right back to itself? Could this be a loop?" At that point, I'd probably reach out to the developer to get their input and have them take a closer look.

Image
Image

I know that this isn't exactly what's happening. I know that you're aiming to enhance the user experience by adding more specific data to SPAN subsegments or even new nodes. But in X-Ray, the concept of having a node for "HTTP requests" typically indicates that one of our functions is calling an external endpoint. However, the way this is currently presented could potentially confuse our customers because this is actually the function route itself.

In an APM environment, for example, where we could dive deeper into the function's code and analyze code-related operations, it might make sense to encapsulate this in an HTTP node. But with our current approach, I'm concerned that we might be addressing a very specific case while making the overall user experience more confusing.

I'm not entirely sure if we should take action here and develop code for very specific scenarios that might not fully address the issue. Also, if the customer wants to do this by creating segments on its own and making it HTTP requests, that's already possible, you've demonstrated that on your examples.

Given X-Ray's design constraints, it seems we can't fully meet the main request of this issue, so, perhaps a better approach would be to focus on providing more comprehensive guidance (documentation, real use case, examples, others) to our customers on how to handle situations like this.

I'd like to hear what you think, and please feel free to point out anything I might be misunderstanding.

@dreamorosi
Copy link
Contributor

dreamorosi commented Feb 17, 2025

Yes, I agree with you - I didn't think of that. Now that you mention it, the view above does look like my function is making two outbound calls, which is confusing.

I agree also that we shouldn't develop a specific workaround for this and we're perhaps better off documenting how to find these segments.

Before making a decision, let me do another test and see if the data within the http field of the segment is searchable/indexed or not. Based on that we can either document how to search it, or recommend that customers use annotations instead.

Thanks for the thoughtful answer, appreciate the discussion as well!

@leandrodamascena
Copy link
Contributor

Before making a decision, let me do another test and see if the data within the http field of the segment is searchable/indexed or not. Based on that we can either document how to search it, or recommend that customers use annotations instead.

Just a correction to my previous explanation. This is indexable because you can see it in the tracer document, but not searchable (because this is not in the root segment) using filter-parameters in the X-Ray Console nor using API for that. What customers can do if they create a subsegment with http fields is to use jq to parse the batch log and search for the value they want, but this is much more complex than using filter-parameters with annotations.

Something like this (I didn't test the script):

aws xray batch-get-traces --trace-ids "MY_TRACE_ID"  --query 'Traces[*].Segments[*].Document'  --output json | jq -r '.[][].http.........'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discussing The issue needs to be discussed, elaborated, or refined feature-request This item refers to a feature request for an existing or new utility need-customer-feedback Requires more customers feedback before making or revisiting a decision need-more-information Requires more information before making any calls tracer This item relates to the Tracer Utility
Projects
Development

No branches or pull requests

5 participants