Skip to content

httplib support #19

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Feb 23, 2018
Merged

httplib support #19

merged 8 commits into from
Feb 23, 2018

Conversation

thehesiod
Copy link
Contributor

@thehesiod thehesiod commented Feb 13, 2018

This is what we're going to use in production. It's important for tracing things like the old google API, which uses httplib2 which uses httplib.

NOTE: perhaps this should replace the requests patch as requests uses httplib

fixes: #20

@thehesiod
Copy link
Contributor Author

oo, should get the unittests setup via Travis/etc

@thehesiod thehesiod changed the title [WIP] initial httplib patch httplib patch Feb 13, 2018
@thehesiod thehesiod changed the title httplib patch httplib support Feb 13, 2018
@thehesiod
Copy link
Contributor Author

one question for this PR is it has three segments, I'm not sure I can coalesce it to two given redirects

Copy link
Contributor

@haotianw465 haotianw465 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much for the PR and we really appreciate your time for adding extra support. I have left a few comments. No major issue so I expect this can be merged soon.

@@ -10,6 +10,7 @@
'requests',
'sqlite3',
'mysql',
'httplib', # TODO: perhaps requests should map to this below
Copy link
Contributor

@haotianw465 haotianw465 Feb 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would keep the requests patch for now. If a user's intention is to only patch requests but has other libraries depend on httplib, then patching at a lower level may result in unexpected behavior. IMO the SDK should only patch httplib when users explicit do so.

A small paragraph on https://github.com/aws/aws-xray-sdk-python/blob/master/docs/thirdparty.rst explaining httplib/httplib2/requests dependency and which to patch could be very helpful.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed comment, and added some docs



# ? is not a valid entity, and we don't want things after the ? for the segment name
def _strip_url(url: str):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you move this helper method to aws_xray_sdk.ext.util? You use it on requests patch as well. It can be further used to strip any url string on any library patcher.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW as a helper method a None check could be useful in case the caller forgets to do the check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved, didn't put the None check as that's a fairly low-level API and you'll get an exception of operating on a 'None' already...thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest make it no-op if input is None. This becomes a general helper under a utility module so it could be used on logging, other patching code etc by other developers. It is bad if the SDK throws a None type error from this part.

Also a nitpick could you remove the prefix _ since its usage is not private?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

subsegment.put_http_meta(http.METHOD, xray_data.method)
subsegment.put_http_meta(http.URL, xray_data.url)
subsegment.put_http_meta(http.STATUS, instance.status)
subsegment.apply_status_code(instance.status)
Copy link
Contributor

@haotianw465 haotianw465 Feb 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't have to manually call apply_status_code. put_http_meta has logic to adjust fault/error/throttle flags if a status code is added.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ahh, probably copied from an old version of the AWS SDK, removed

xray_data = _XRay_Data(method, instance.host, xray_url)
setattr(instance, _XRAY_PROP, xray_data)

return xray_recorder.record_subsegment(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason to record this local computation as a subsegment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC if the connect fails, getresponse will not be called.


wrapt.wrap_function_wrapper(
'http.client',
'HTTPConnection.getresponse',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are patching both HTTPConnection.getresponse and HTTPResponse.read. Is the latter some duplicate work since the former already set url, method and status?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure I follow, on the read you're reading the body, which you may not do on 204s, and you want all the information from the requests to be marked on the subsegment of the read in case the response segment is sampled out.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood.

url = 'http://{}/status/{}?foo=bar&baz=foo'.format(BASE_URL, status_code)
requests.get(url)
subsegment = xray_recorder.current_segment().subsegments[1]
assert subsegment.name == url
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this assertion pass when query string is stripped?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, bug

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on your patching code you use the string before "?" as the subsegment name.
So to verify the http subsegment is named correctly, should the test asserts name == http://{}/status/{} and pass?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

xray_recorder.clear_trace_entities()


def test_ok():
Copy link
Contributor

@haotianw465 haotianw465 Feb 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure your intention on using requests syntax to test httplib patch. This unit test is valid given the knowledge of requests depends on httplib. But I feel like this is a little bit anti-pattern since the scope is beyond httplib and involves requests implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rewrote to use httplib

@@ -28,7 +28,7 @@ def construct_ctx():

def test_ok():
status_code = 200
url = 'http://{}/status/{}'.format(BASE_URL, status_code)
url = 'http://{}/status/{}?foo=bar'.format(BASE_URL, status_code)
requests.get(url)
subsegment = xray_recorder.current_segment().subsegments[0]
assert subsegment.name == url
Copy link
Contributor

@haotianw465 haotianw465 Feb 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question about the segment name assertion

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@thehesiod
Copy link
Contributor Author

is it possible to quickly add travis or circle ci support to this project? It's pretty simple and free and I think will help a lot, here's an example: https://github.com/aio-libs/aiobotocore/blob/master/.travis.yml

@haotianw465
Copy link
Contributor

The PR looks good on most part. But I would like to point out some concerns on the structure of the subsegments this patching code is generating.

In a steady state one http outbound call is broken into two subsegments. One is for pre_req and one is for getresponse. These two are siblings with the same name (the url without query string). The service graph statistics will be impacted on both call rate and avg latency on that node.

Ideally in a successfully case there should be only one subsegment representing one complete round trip. In a failure case there is one subsegment representing the failed connection.

I understand the ideal case could be a challenge given this is a low level library and a logical operation is broken into several functions. But on the X-Ray service graph level one endpoint should be abstracted to one node with correct statistics.

Another question is that you are saying one http call will generate three subsegments? From the PR I can only see two.

@thehesiod
Copy link
Contributor Author

thehesiod commented Feb 16, 2018

ya I agree, I'm not sure the best way to proceed. Perhaps the _prep_request segment can be cancelled if _xray_traced_http_client is reached, thoughts? Further, we need to make sure each redirect is captured...I think http_request_processor + http_response_processor takes care of this. I'll add a unittest after we decide what to do.

the third segment will come in if you call read() on the body of the response.

@haotianw465
Copy link
Contributor

I agree on only keep the subsegment for actual internet round trip on a successful case. Also read() is probably not necessary here as it is purely cache read? Otherwise your avg latency will be cut almost in half due to those read() operations.

@thehesiod
Copy link
Contributor Author

ok cool, will update ASAP. Read should read the body of the response, normally the headers come first, followed by optionally reading the body (remember the body can be megabytes in length). So 204s will have no body and no subsegment, whereas something like downloading a 2MB file will.

@thehesiod
Copy link
Contributor Author

another reason is that you may perform logic between getting the headers and the read and don't want to reflect that processing time in your read segment.

@thehesiod
Copy link
Contributor Author

thehesiod commented Feb 21, 2018

furthermore you may have multiple reads (it can get complicated quickly) :) If you have a fixed-size connection pool now imagine you may want even more segments:

  1. wait for connector to be available from pool
  2. connect time
  3. wait for headers
    3.1) if redirect, goto 2
  4. (optional) read(s) depending on your chunk size and content length

@haotianw465
Copy link
Contributor

Agreed. It really depends on the use cases and having the SDK to be able to provide more insights on a single logical operation is always better. But this could be a good first step.

The main thing to keep in mind is the granularity on the segments structure. You can take a look at X-Ray Go SDK as an example: https://github.com/aws/aws-xray-sdk-go. In the screenshot shown at landing page you can see the DynamoDB call as a top level parent and its children subsegments provides insights on the timing on DNS lookup and SSL handshake and retries etc. But you only drill down to this level of details when you see a "slow DynamoDB operation".

As long as the top level segment capture provides necessary information so service back-end can aggregate them correctly, more children subsegments are always useful.

@thehesiod
Copy link
Contributor Author

cool, will get to the changes soon hopefully, working with two broken arms at the moment due to VB accident :)

@thehesiod
Copy link
Contributor Author

thehesiod commented Feb 22, 2018

@haotianw465 so I take it back, after investigating this a bit more, I realized that each subsegment is a separate operation, note the following in the test:

    conn.request(method, path)  # sends headers
    resp = conn.getresponse()  # waits for response

this could then further have a resp.read()

so my current subsegment layout I believe is correct, where you have one subsegment per operation, which can be separated by user code.

@haotianw465
Copy link
Contributor

OK cool. In that case user will need to use @xray_recorder.capture() annotation to group httplib operations if necessary.

@thehesiod
Copy link
Contributor Author

ya I just realized I can't create a segment because I won't know when it should end beyond when the instance gets GC'd ;)

@haotianw465
Copy link
Contributor

You will have to do the following per README.md about function capture:

@xray_recorder.capture('example.com')
def myfunc():
    conn.request(method, path)  # sends headers
    resp = conn.getresponse()
    resp.read()

Then you will have one subsegment named "example.com" which has three children representing those three operations, if this is what you expect.

@thehesiod
Copy link
Contributor Author

cool, so anything left?

@haotianw465
Copy link
Contributor

No it looks good. I will approve and merge this PR.

@haotianw465 haotianw465 merged commit 0a9ce14 into aws:master Feb 23, 2018
@haotianw465
Copy link
Contributor

The httplib_test failed under Python2.7. Did you run it successfully on your side?

@thehesiod
Copy link
Contributor Author

will try now

@thehesiod
Copy link
Contributor Author

fixes in #23, let me know if you want in a clean branch, or squashing is ok on your side

@haotianw465
Copy link
Contributor

No worries. Already merged.

@thehesiod
Copy link
Contributor Author

@haotianw465 looks like I forgot to update changelist.rst and add this as a new unrelease feature

@haotianw465
Copy link
Contributor

You can have a PR adding this feature to "unreleased".

The other thing I noticed is that since httplib_test monkey patches httplib during test run, it breaks other two tests: requests_test under py2 and pynamodb_test under py2/py3. This is because before adding httplib support, all the SDK patched libraries are independent. But httplib is a dependency of requests and pynamodb.

The fix is to have some unpatch logic on httplib patcher so that upon httplib_test finish it reverts the module in the runtime to the original one so other tests are not impacted. I'd really appreciate if you have extra time to this fix.

The broken tests also show us how you can deliberately patch httlib (even you don't directly use it) to get more insights for some libraries depend on it. But also this would be dangerous as per https://github.com/aws/aws-xray-sdk-python/blob/master/aws_xray_sdk/ext/httplib/patch.py#L57 this line throws an error if _prep_request is not called. If an upstream library (in this case requests) use its own prepare_request method, this will break. It would be better if xray_data = getattr(instance, _XRAY_PROP) also handles key doesn't exist so that all three methods patched in httplib can be called independently.

These are some useful information I think I could share with you since you mentioned on patching httplib for X-Ray usage.

@thehesiod
Copy link
Contributor Author

ya, this module is missing unpatch logic from all the modules, will open a PR with fixes

@thehesiod
Copy link
Contributor Author

ok opened #24, however I wasn't able to reproduce your second issue in python3, is this a python2 only issue, or specific requests issue?

@haotianw465
Copy link
Contributor

I saw the following on tox run at py2.7

.tox/py27/lib/python2.7/site-packages/requests/models.py:745: in generate
    for chunk in self.raw.stream(chunk_size, decode_content=True):
.tox/py27/lib/python2.7/site-packages/urllib3/response.py:436: in stream
    data = self.read(amt=amt, decode_content=decode_content)
.tox/py27/lib/python2.7/site-packages/urllib3/response.py:384: in read
    data = self._fp.read(amt)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

wrapped = <bound method HTTPResponse.read of <httplib.HTTPResponse instance at 0x10c1d0440>>, instance = <httplib.HTTPResponse instance at 0x10c1d0440>, args = (10240,)
kwargs = {}

    def _xray_traced_http_client_read(wrapped, instance, args, kwargs):
>       xray_data = getattr(instance, _XRAY_PROP)
E       AttributeError: HTTPResponse instance has no attribute '_xray_prop'

This might be a py2 specific issue and I don't have time to fully look at requests source code but this just reminds me of patched httplib requires the upstream library to call HTTPConnection._send_request so _XRAY_PROP can be set.

I'm not sure if this is the one you was unable to reproduce.

@thehesiod
Copy link
Contributor Author

seems to be fixed with 5816eab in my new PR

@haotianw465
Copy link
Contributor

For unit tests unpatch should solve all the problems. My point is that the following patch code

def patch():
    wrapt.wrap_function_wrapper(
        httplib_client_module,
        'HTTPConnection._send_request',
        _send_request
    )

    wrapt.wrap_function_wrapper(
        httplib_client_module,
        'HTTPConnection.getresponse',
        _xray_traced_http_getresponse
    )

    wrapt.wrap_function_wrapper(
        httplib_client_module,
        'HTTPResponse.read',
        _xray_traced_http_client_read
    )

In order to capture the 2nd and 3rd operation correctly, 1st must be called, as 2nd and 3rd operation assumes an attribute _XRAY_PROP set by 1st. This is not an issue for using httplib itself since its standard usage is to call the 1st at the beginning. Just putting a note here, not a blocking issue.

@thehesiod
Copy link
Contributor Author

ya I also tested this with requests and reproduced the issue you saw, and fixed with the above commit. From the impl of HTTPConnection I think it'll always get called or else you'd be rewriting major portions of the underlying class

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

subsegment name special characters
2 participants