-
Notifications
You must be signed in to change notification settings - Fork 153
Feature request: HTTP request/response annotations #1179
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Hi @revmischa thank you for taking the time to open a feature request. As I have briefly mentioned on Discord: at the moment we don't use any of the Before considering the feature and putting it into the backlog, I would like some investigation to take place to make a small PoC and understand the impact and compatibility of this feature with Powertools. Like with all features, contributions are more than welcome, and in case this is left to the maintainers, having 👍 on the feature request will help increasing its priority in our backlog. If anyone is interested in picking up this issue, please leave a comment below to express your interest and agree on next steps. The maintainers will be happy to support you with clearing doubts or questions on the scope and codebase. |
For my purposes a simpler version of this would be to let me manually set the HTTP path, method, status code directly on the segment. But I have no idea how to do this and couldn't find any documentation on how to manually set those attributes. |
this is https://www.npmjs.com/package/aws-xray-sdk-express feature, this is very useful when we are tracing some ip address or api URL in a easy way |
FYI @revmischa, I have been looking at the I'll keep looking, also using the example that was just posted above. |
This comment was marked as outdated.
This comment was marked as outdated.
Please keep open |
This comment was marked as off-topic.
This comment was marked as off-topic.
With the upcoming Event Handler feature incoming, we'll need a way to add these annotations to segments generated by Tracer. I've been looking at the implementation more closely and I think that we should implement a version of this method so that the I'm adding this to the backlog so we can work on it at some point in the coming months. The issue is open for contributions. If you’re interested, please leave a comment before starting a PR. Let’s discuss and agree on a design for the new feature before implementing it. I don’t want to expose the method as-is from the X-Ray SDK. |
Hi @dreamorosi, I’d love to take on this feature request and implement HTTP metadata annotation for X-Ray segments in the Tracer utility. |
It would also be nice to have as middy middleware for APIGWv2 handlers |
Hi @VatsalGoel3 sorry for the delay and confusion, I started doing a PoC for this feature and I realized that we need to first figure out a couple details before we can start implementing. I'm changing the labels and status of the issue back, and I will share more details in a few minutes. |
As I mentioned in the comment above, I've been looking into this and I am unsure about how to proceed. I've first tested the X-Ray SDK instrumentation of Express.js without Powertools for AWS, to see how they are created and I found out that the sdk creates brand new segments for each request. This is kind of unusual, at least related to how our Tracer works, because we never create new unrelated segments but rather subsegments of the function segment. To explain it more visually, this is the tree that X-Ray creates when you instrument:
When doing this, the CloudWatch Traces page populates the HTTP Method & URL Address fields. When working with Tracer from Powertools instead, the segment you get has this structure:
This is because in Lambda you cannot mutate the function segment but can only either: 1/ create subsegments below it, or 2/ create entirely unrelated segments - which we don't normally do. This is a bit of a problem in terms of this feature request. At least as far as I could test, the CloudWatch Traces UI only populates the HTTP Method and URL Address fields when that data is part of the top-most segment, but doesn't do it for subsegments. For example, even if I attach the But it's shown in both the "Raw data" and single trace UIs: I can't know for sure, but this type of representation seems to have been thought primarily for other compute environments other than Lambda. I say this because in Lambda you still want to maintain the lineage/relationship between your application segment and the function one, which is lost in this case. Important Because of this, I'm not inclined to follow the same implementation as the X-Ray SDK and instead I'm inclined towards adding the I think overall having the data in the subsegment, and having that subsegment maintain the lineage with the request is superior than just having the UI show the field populated. Especially because the SDK approach also completely breaks the trace map, which now looks like below, making it seem almost as if there were two separate clients and requests, rather than one: So to sum up, by adding the request/response data to the subsegment, the only decision point we have left is whether we want to this as a direct subsegment or as a sibling of the Powertools-generate done. Option 1 (child segment) The first option would be to just create a subsegment under the one that is already being created by the Powertools for AWS Tracer whenever you wrap your handler with a class method decorator or Middy.js middleware. The structure of the trace data would look like this:
And would generate these screens on the CloudWatch UI: PoC code for this optionimport { Tracer } from '@aws-lambda-powertools/tracer';
import { captureLambdaHandler } from '@aws-lambda-powertools/tracer/middleware';
import middy from '@middy/core';
import { parser } from '@aws-lambda-powertools/parser/middleware';
import { APIGatewayProxyEventV2Schema } from '@aws-lambda-powertools/parser/schemas/api-gatewayv2';
import { Segment } from 'aws-xray-sdk-core';
const tracer = new Tracer();
export const handler = middy()
.use(captureLambdaHandler(tracer))
.use(parser({ schema: APIGatewayProxyEventV2Schema }))
.handler(async (event) => {
const indexSegment = tracer.getSegment()!; // # index.handler
const subsegment = new Segment(
`## ${event.requestContext.http.method} ${event.rawPath}`,
tracer.getRootXrayTraceId(),
indexSegment.id
);
tracer.setSegment(subsegment);
const method = event.requestContext.http.method;
const userAgent = event.requestContext.http.userAgent;
// @ts-expect-error: we are in untyped land
subsegment.http = {
request: {
method,
user_agent: userAgent,
client_ip: event.requestContext.http.sourceIp,
x_forwarded_for: true,
url: `https://${event.requestContext.domainName}${event.rawPath}`,
},
};
const responseStatusCode = 200;
const responseBody = JSON.stringify({
message: 'Hello, World!',
});
const contentLength = Buffer.byteLength(responseBody, 'utf8');
const endTime = Date.now();
// @ts-expect-error: we are in untyped land again
subsegment.http.response = {
status: responseStatusCode,
content_length: contentLength,
};
subsegment.close();
tracer.setSegment(indexSegment);
return {
statusCode: responseStatusCode,
body: responseBody,
headers: {
'Content-Type': 'application/json',
'Content-Length': contentLength,
},
};
}); Option 2 (sibling segment) The second option would be to create a subsegment that is a sibling of the Powertools one. Depending on what you're doing in your function, i.e. if you're doing a router and not wrapping the handler with
With this approach, the trace data would be represented like this in the UI: The neat thing about this approach is that you get a clear view in the map of your application route, and if you have more than one in the same function, you also get multiple nodes. And more importantly, you also get them as separate nodes in the Trace Map, which you don't get in the option 1. PoC code for this optionimport { Tracer } from '@aws-lambda-powertools/tracer';
import { captureLambdaHandler } from '@aws-lambda-powertools/tracer/middleware';
import middy from '@middy/core';
import { parser } from '@aws-lambda-powertools/parser/middleware';
import { APIGatewayProxyEventV2Schema } from '@aws-lambda-powertools/parser/schemas/api-gatewayv2';
const tracer = new Tracer();
export const handler = middy()
.use(captureLambdaHandler(tracer))
.use(parser({ schema: APIGatewayProxyEventV2Schema }))
.handler(async (event) => {
const indexSegment = tracer.getSegment()!; // # index.handler
const subsegment = indexSegment.addNewSubsegment(
`## ${event.requestContext.http.method} ${event.rawPath}`,
);
tracer.setSegment(subsegment);
const method = event.requestContext.http.method;
const userAgent = event.requestContext.http.userAgent;
// @ts-expect-error: we are in untyped land
subsegment.http = {
request: {
method,
user_agent: userAgent,
client_ip: event.requestContext.http.sourceIp,
x_forwarded_for: true,
url: `https://${event.requestContext.domainName}${event.rawPath}`,
},
};
const responseStatusCode = 200;
const responseBody = JSON.stringify({
message: 'Hello, World!',
});
const contentLength = Buffer.byteLength(responseBody, 'utf8');
const endTime = Date.now();
// @ts-expect-error: we are in untyped land again
subsegment.http.response = {
status: responseStatusCode,
content_length: contentLength,
};
subsegment.close();
tracer.setSegment(indexSegment);
return {
statusCode: responseStatusCode,
body: responseBody,
headers: {
'Content-Type': 'application/json',
'Content-Length': contentLength,
},
};
}); Overall, between the two options I'm kind of leaning towards the second one. While option 1 is more consistent with the current model and gives you direct lineage of the request segment, option 2 still maintains a full trace all the way from the client to the route while also creating separate route nodes for each one of your request/response flows. What do we think? Would also like to hear @leandrodamascena's opinion here, since I think this is a new behavior also for Python. |
Hey, this is a super cool discussion about user experience and what we can do better for our customers, I really like it. I'll skip the part about XRAY Segments, because @dreamorosi already explained very well all the stuff about not being able to mutate the root segment, not being able to index/display http.url/http.response_code outside the root segment and others. BTW, It was all explained with rich details. Creating new subsegmentsWhen Event Handler comes out - this is also a reality in Python -, we will have routes resolving to functions, so, this is not what we are doing with standalone decorator and customers already can do that? The difference I see here are: 1/ the name of the segment/subsegment Why not we try to explore to improve the documentation with some information about adding annotations and making it searchable? Or even changing the standalone method to allow custom name? I don't know, just guessing here. Also, I look at the two images and I'm trying to think like an operations person who's on call and just got an alert that Lambda function I know that this isn't exactly what's happening. I know that you're aiming to enhance the user experience by adding more specific data to SPAN subsegments or even new nodes. But in X-Ray, the concept of having a node for "HTTP requests" typically indicates that one of our functions is calling an external endpoint. However, the way this is currently presented could potentially confuse our customers because this is actually the function route itself. In an APM environment, for example, where we could dive deeper into the function's code and analyze code-related operations, it might make sense to encapsulate this in an HTTP node. But with our current approach, I'm concerned that we might be addressing a very specific case while making the overall user experience more confusing. I'm not entirely sure if we should take action here and develop code for very specific scenarios that might not fully address the issue. Also, if the customer wants to do this by creating segments on its own and making it HTTP requests, that's already possible, you've demonstrated that on your examples. Given X-Ray's design constraints, it seems we can't fully meet the main request of this issue, so, perhaps a better approach would be to focus on providing more comprehensive guidance (documentation, real use case, examples, others) to our customers on how to handle situations like this. I'd like to hear what you think, and please feel free to point out anything I might be misunderstanding. |
Yes, I agree with you - I didn't think of that. Now that you mention it, the view above does look like my function is making two outbound calls, which is confusing. I agree also that we shouldn't develop a specific workaround for this and we're perhaps better off documenting how to find these segments. Before making a decision, let me do another test and see if the data within the Thanks for the thoughtful answer, appreciate the discussion as well! |
Just a correction to my previous explanation. This is indexable because you can see it in the tracer document, but not searchable (because this is not in the root segment) using Something like this (I didn't test the script): aws xray batch-get-traces --trace-ids "MY_TRACE_ID" --query 'Traces[*].Segments[*].Document' --output json | jq -r '.[][].http.........' |
Use case
I have a nextjs CDK construct with a simple APIGW lambda wrapper around nextjs. I want to annotate my traces with HTTP request and response metadata, e.g. "URL" and "method" and "status code"
I would like to use it here: https://github.com/jetbridge/cdk-nextjs/blob/main/assets/lambda/NextJsHandler.ts#L45
I'd like to see the status code, path, method, etc in xray:

Solution/User Experience
Perhaps something like
traceRequestResponseCycle
in https://docs.aws.amazon.com/xray-sdk-for-nodejs/latest/reference/AWSXRay.htmlAlternative solutions
Discussed on discord: https://discord.com/channels/1006478942305263677/1006527385409179678/1042309163818168380
Acknowledgment
The text was updated successfully, but these errors were encountered: