
Considering the usage of strip_url to name a node as a non-perfect solution #64





Closed
pfreixes opened this issue May 30, 2018 · 14 comments

@pfreixes
Contributor

Right now most of the extensions that implement patches or machinery to instrument HTTP clients such as aiohttp, requests or httplib use strip_url by default, either as a fallback when no name is given [1] or even as the primary way to name a subsegment [2][3].

By relying on the output of this function when there is no alternative name, as many nodes are created in the AWS X-Ray console as different URLs have been generated. So the following URLs will each end up creating a different node, while common sense says that all of them refer to the same service:

>>> strip_url("http://localhost/resource/1?debug=true")
'http://localhost/resource/1'
>>> strip_url("http://localhost/resource/2")
'http://localhost/resource/2'
>>> strip_url("http://localhost/resource/1,2,3")
'http://localhost/resource/1,2,3'

So, IMO we must address this situation, or at least give the user a way of inferring the segment name from the URL host, or any other strategy that allows the user to group the calls that belong to the same node. Taking the previous set of URLs as an example, all of them would be named http://localhost.
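
Just to illustrate, a minimal sketch of the grouping I have in mind, using urllib.parse; nothing like this exists in the SDK and the helper name is made up:

from urllib.parse import urlparse

def host_only_name(url):
    # Keep only the scheme and network location, dropping path and query string.
    parsed = urlparse(url)
    return '{}://{}'.format(parsed.scheme, parsed.netloc)

>>> host_only_name("http://localhost/resource/1?debug=true")
'http://localhost'
>>> host_only_name("http://localhost/resource/2")
'http://localhost'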

I guess there is no silver bullet for this, but before I start digging into a proper solution I would like to gather your feedback on the issue itself, and of course if you have any proposals I would like to hear them.

[1] https://github.com/aws/aws-xray-sdk-python/blob/master/aws_xray_sdk/ext/aiohttp/client.py#L25
[2] https://github.com/aws/aws-xray-sdk-python/blob/master/aws_xray_sdk/ext/requests/patch.py#L29
[3] https://github.com/aws/aws-xray-sdk-python/blob/master/aws_xray_sdk/ext/httplib/patch.py#L54

@haotianw465
Contributor

Hi,

Thank you for taking the time to write this up. I agree with you. With the presence of path-based routing, the SDK doesn't have enough information to do the grouping correctly. In your example it is very likely that resource/* is served by a single service. But there are also counterexamples: data/1 is routed to a control-plane fleet while data/2 is routed to a data-plane fleet.

I think it really depends on who owns those services. If you own the services that serve resource/* and they are also integrated with X-Ray, they will actively emit segments themselves. The X-Ray service graph will respect the segment emitted from the server side rather than the caller. So if resource/1 and resource/2 are both served by a Django app which emits a segment named my_service, then in the service graph you will only see a node called my_service.

If you don't own the service serving those requests, then the question comes down to whether you want to aggregate them or not, and up to which level of the URL schema. Generally I would recommend not aggregating anything, because you don't know what is behind the scenes and it can be difficult to identify issues if all URLs are aggregated. But I would definitely like to know more about your use case here. The SDK will certainly provide a better customer experience by offering more flexibility in user configuration.

Please let me know your thoughts.

@pfreixes
Contributor Author

pfreixes commented Jun 4, 2018

Thanks for your answer @haotianw465, a few comments on your points:

I think it really depends on who owns those services. If you own the services that serve resource/* and they are also integrated with X-Ray, they will actively emit segments themselves. The X-Ray service graph will respect the segment emitted from the server side rather than the caller. So if resource/1 and resource/2 are both served by a Django app which emits a segment named my_service, then in the service graph you will only see a node called my_service.

Well, in an organization that is starting to adopt tracing, where only a few services have it in place, this is exactly the scenario where you can't trust the caller to name the node with the proper service name. True, at least for the services that you own, once tracing is fully adopted all of the nodes will eventually be named properly. But it will take some time.

In any case, there is still the case where you are calling an external service over which you don't have any kind of control. There, the name of the node that persists will always be the URL, with all of the issues that I've mentioned.

Going back to your example of two different URLs that share the same hostname but can end up at different services, data/1 and data/2: this scenario is quite similar to one of the use cases in our infrastructure. There is an HTTP middleware that implements some commonalities such as auth and rate limiting and routes the traffic to downstream services, using part of the URI to identify each one univocally. In that case, I still think the best option is to use the hostname to identify the node: a single node in the AWS X-Ray console that identifies all of the calls to this intermediate layer, while the downstream services behind each URL path still get the chance to be identified in the AWS X-Ray console with their proper names.

And last but not least, take into account that the full URL is always recorded as a tag value.

I still think that using the first part of the URL, the hostname, is the least bad option.

@haotianw465
Contributor

haotianw465 commented Jun 4, 2018

Thank you for the explanation. I totally understand your concerns, and this use case is very reasonable and common. I'm open to discussing a way of configuring custom URL naming. Please let me know if you have any suggestions on how you would like to conveniently configure URL renaming.

We have a dynamic naming configuration for the middleware that names the segment based on the host header of the incoming request. See https://docs.aws.amazon.com/xray/latest/devguide/xray-sdk-python-middleware.html#xray-sdk-python-middleware-naming. Please let me know if you would like to see something similar for subsegment naming on outgoing HTTP requests.
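
For reference, the middleware-side dynamic naming looks roughly like this; the pattern value below is only an example:

from aws_xray_sdk.core import xray_recorder

# Name the segment after the incoming Host header when it matches the pattern,
# otherwise fall back to the default service name.
xray_recorder.configure(service='my_app', dynamic_naming='*.example.com')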

@TheSkorm

TheSkorm commented May 9, 2019

[Screenshot of the resulting service graph]

I just recently ran into this issue with a backend that uses url parameters for calling an external service.

What I was hoping for was either a config where I could specify something like

http://hostname/path/*/blah/*

where the parts matched by * get replaced with a placeholder in the uploaded segments.

Or alternatively, being able to specify a custom name for that segment prior to making the requests call.
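
Something along the lines of this rough sketch is what I mean; nothing like it exists in the SDK and the helper is purely hypothetical:

import re

def collapse(url, pattern):
    # Each '*' in the pattern matches a single path segment and stays as a
    # literal '*' placeholder in the aggregated name.
    regex = '^' + re.escape(pattern).replace(r'\*', '[^/]+') + '$'
    return pattern if re.match(regex, url) else url

>>> collapse("http://hostname/path/123/blah/456", "http://hostname/path/*/blah/*")
'http://hostname/path/*/blah/*'
>>> collapse("http://hostname/other/123", "http://hostname/path/*/blah/*")
'http://hostname/other/123'

Making each * span only a single path segment is what would keep URLs with different path depths apart.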

@chanchiem
Contributor

Hey, thanks for the feedback! We'll definitely take this into consideration for the UX of this feature request. Do you have any recommendations for how custom names for downstream requests should be specified in the SDK? Would something similar to what was mentioned above by @haotianw465 be a good experience for naming these traces? https://docs.aws.amazon.com/xray/latest/devguide/xray-sdk-python-middleware.html#xray-sdk-python-middleware-naming

@TheSkorm

It's pretty close to what I desire and would certainly make a good first pass at the feature. The only enhancement I would consider is making it path aware, as http://test/0123/blah and http://test/0123/foo/blah would both be classified as http://test/*/blah under a basic wildcard approach, when really they should be two entries.

@metaskills

metaskills commented Sep 6, 2019

We've seen this issue in some of our Python work at Custom Ink. Subscribed to the issue to see where we land on a solution. Thanks ahead of time!

In the meantime, we have used wrapt to freedom-patch record_subsegment, and that seems to have been working well.
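
Roughly along these lines; this is heavily simplified and assumes the extensions pass name and namespace as keyword arguments to record_subsegment, which may not hold for every patcher or SDK version:

from urllib.parse import urlparse

import wrapt
from aws_xray_sdk.core import xray_recorder

def _rename_by_host(wrapped, instance, args, kwargs):
    # For remote subsegments, rewrite the name down to scheme://host.
    name = kwargs.get('name')
    if name and kwargs.get('namespace') == 'remote':
        parsed = urlparse(name)
        if parsed.netloc:
            kwargs['name'] = '{}://{}'.format(parsed.scheme, parsed.netloc)
    return wrapped(*args, **kwargs)

wrapt.wrap_function_wrapper(xray_recorder, 'record_subsegment', _rename_by_host)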

@sreid

sreid commented Nov 13, 2019

I have also ended up with a graph much like @TheSkorm's. As an intermediate solution, would it be possible to add a configuration option to use only the domain name? (a boolean config vs. custom path matching)

@chanchiem
Contributor

Can you elaborate on what you mean?
As an example, let's say we have a key AggregateDomainName and it's a boolean flag.

Setting it to true would mean that any subsegments sharing the same domain name would be aggregated into a single node, by giving them all the same name consisting of just the domain. For example, https://amazon.com/get/food and https://amazon.com/get/some/books would both aggregate as https://amazon.com/.

Is this an accurate depiction of what you're requesting?

@sreid

sreid commented Nov 22, 2019

Yes, that's along the lines of what I was thinking. It's not as configurable as the wildcard style matching mentioned previously but perhaps easier to implement?

Right now the requests tracing is not helpful in my particular situation, as there is no aggregation of response times/codes for HTTP requests (my graph looks similar to @TheSkorm's).

@chanchiem
Contributor

Thanks for the response. We will consider both cases and mark this accordingly as a feature request.

The first case would be to allow customers to use wildcards in URL names and aggregate them based on the expression.
For example, http://hostname/path/*/blah/* would aggregate the following as the same node:
http://hostname/path/somepath/blah/unknown
http://hostname/path/diffpath/blah/somestring

The second case would be to aggregate whole nodes based on their host names:
This would mean that https://amazon.com/get/food and https://amazon.com/get/some/books would both aggregate as https://amazon.com/.

In either case, we would need a centralized system to keep track of downstream calls. It would probably be better to utilize the existing centralized sampling rules to aggregate known domain names together, and have the SDKs name the downstream calls accordingly.

@chanchiem
Contributor

It turns out that the subsegment name should only contain the hostname of the endpoint being targeted, and this behavior has been consistent across all our other SDKs except this one.

With PR #192, does this fix the issue you were having? All the nodes were intended to be aggregated rather than differentiated into separate nodes for each unique path.

@sreid

sreid commented Dec 20, 2019

Can confirm that the 2.4.3 release that includes #192 fixes the issue enough for me to make requests tracing usable. Thanks!

@srprash
Contributor

srprash commented Dec 30, 2019

Hi @sreid
That's great! Hope this fixes the issue for everyone. In case the issue still persists, feel free to reopen this or create a new one.

@srprash srprash closed this as completed Dec 30, 2019