Conversation
Force-pushed from ee58269 to 94ec32b
Codecov Report
@@            Coverage Diff             @@
##           master      #36      +/-   ##
==========================================
+ Coverage   93.36%   93.59%   +0.23%
==========================================
  Files          11       11
  Lines         829      859      +30
==========================================
+ Hits          774      804      +30
  Misses         55       55
Continue to review full report at Codecov.
Force-pushed from 48223bf to 4f815d4
watch/watch.py (outdated)
resp.close()
resp.release_conn()

if not keep or self._stop:
This will result in all cached events being delivered again. If we want to reconnect, we should capture the last resourceVersion and pass it to the watch call to make sure we continue receiving new events rather than old cached ones. It would also be nice if we could fix this at the root (find the reason for the timeout in the http client and fix that). I think the server keeps the connection alive and it is the client that has a pre-defined timeout somewhere.
Thanks!
Let me do a few more checks.
For what it's worth, I did get this same kind of issue solved using @mbohlool's recommendation and this kind of snippet:

DOMAIN = "kool.karmalabs.local"
....
resource_version = None
while True:
    if resource_version is None:
        stream = watch.Watch().stream(crds.list_cluster_custom_object, DOMAIN, "v1", "guitars")
    else:
        stream = watch.Watch().stream(crds.list_cluster_custom_object, DOMAIN, "v1", "guitars", resource_version=resource_version)
    for event in stream:
        obj = event["object"]
        metadata = obj.get("metadata")
        resource_version = metadata['resourceVersion']
        ....
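(Note: custom objects come back from list_cluster_custom_object as plain dicts, hence the metadata['resourceVersion'] lookup; carrying resource_version across reconnects is what prevents already-seen cached events from being replayed.)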
Some updates on the timeout setting:
There are two inputs in the client that can set timeouts:
a) timeout_seconds: This value is not used by the client itself but is included in the URL:
https://localhost:6443/api/v1/namespaces/default/pods?labelSelector=app%3Ddemo&timeoutSeconds=100&watch=True.
So this setting is actually used by the server.
b) _request_timeout
This accepts 2 types of input, an int or a tuple of length 2, checked in
kubernetes.client.rest.RESTClientObject.request()
The value is eventually set on the socket used for the connection.
b.1) if the input is a tuple, the first value is actually ignored.
b.2) when the timeout happens, an exception is raised, for example:
urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='localhost', port=6443): Read timed out.
b.3) when the value is not set, the default is None and the socket has no timeout:
https://docs.python.org/2/library/socket.html#socket.getdefaulttimeout
I haven't figured out where/how the server sets its timeout, but I believe that when we leave both timeout settings empty, the client keeps the connection alive.
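For reference, a minimal sketch of how the two settings are passed and what the client-side timeout looks like when it fires (assuming a standard CoreV1Api pod watch; the namespace and values are illustrative):

import urllib3
from kubernetes import client, config, watch

config.load_kube_config()
v1 = client.CoreV1Api()

try:
    # timeout_seconds goes to the server in the URL (?timeoutSeconds=100);
    # _request_timeout is applied client-side to the underlying socket.
    for event in watch.Watch().stream(v1.list_namespaced_pod,
                                      namespace="default",
                                      timeout_seconds=100,
                                      _request_timeout=60):
        print(event["type"], event["object"].metadata.name)
except urllib3.exceptions.ReadTimeoutError:
    # Raised by urllib3 when _request_timeout elapses with no data on the socket.
    print("client-side read timeout")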
Nice update on the timeout. timeout_seconds is what the server expects for watch calls, and if you don't set it, the server won't time out the connection. There is a hidden timeout on the request that fortunately the generated client doesn't know about and cannot set.
Even without setting the timeout, the connection or the network can fail, so it is a good idea to have this flag if we make it smart enough to continue the watch using resource_version.
Could you explain a little more about "There is a hidden timeout on the request that fortunately the generated client doesn't know about and cannot set"? Really curious about this...
Updated to continue the watch using resource_version.
Also, I made a small change: since we already have "timeout_seconds" to let the user set the timeout, an empty value can simply mean watch forever, so no extra flag is needed.
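A rough sketch of the reconnect logic being described (illustrative only, not the actual diff; stream_with_reconnect is a hypothetical name, and it assumes events expose the raw dict under "raw_object" as the client's unmarshal_event does):

import kubernetes.watch

def stream_with_reconnect(func, *args, **kwargs):
    # Keep watching, resuming from the last seen resourceVersion,
    # unless the caller asked for an explicit timeout.
    timeout_set = "timeout_seconds" in kwargs
    resource_version = kwargs.pop("resource_version", None)
    w = kubernetes.watch.Watch()
    while True:
        if resource_version is not None:
            kwargs["resource_version"] = resource_version
        for event in w.stream(func, *args, **kwargs):
            # Remember where we are so a reconnect resumes here
            # instead of replaying cached events.
            resource_version = event["raw_object"]["metadata"]["resourceVersion"]
            yield event
        if timeout_set:
            # The user asked for a bounded watch; stop when the
            # connection ends instead of reconnecting forever.
            return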
Force-pushed from 14847ad to ff02e98
Force-pushed from ff02e98 to aec1c52
resp.close()
resp.release_conn()

if timeouts or self._stop:
What if the user set a timeout but it has not passed yet?
I'm still trying to find proof that the exit we saw at the beginning is caused by the client side.
I did find the flag 'min-request-timeout' in the kube-apiserver docs, and it says the apiserver actually has a timeout setting of its own:
--min-request-timeout=1800: An optional field indicating the minimum number of seconds a handler must keep a request open before timing it out. Currently only honored by the watch request handler, which picks a randomized value above this number as the connection timeout, to spread out load.
And if it is not set, the default value is 1800:
https://github.com/kubernetes/apiserver/blob/master/pkg/server/config.go#L248
MinRequestTimeout: 1800,
So, based on what I know now, I don't think a server-side timeout will happen before the user's timeout setting.
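One way to observe that server-side behavior, as a sketch (assumes a reachable cluster with default apiserver flags; with neither timeout_seconds nor _request_timeout set, the stream should only end when the apiserver closes the connection, i.e. at some randomized value at or above --min-request-timeout):

import time
from kubernetes import client, config, watch

config.load_kube_config()
v1 = client.CoreV1Api()

start = time.time()
# No client or server timeout requested: the generator ends only when
# the apiserver closes the watch connection.
for _ in watch.Watch().stream(v1.list_namespaced_pod, namespace="default"):
    pass
print("watch closed by server after %.0f seconds" % (time.time() - start))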
@lichen2013 This looks very good. Only one comment, otherwise good to go.
@lichen2013 The only remaining item here is to reconnect when the user-specified timeout has not passed yet (in addition to the case of no timeout being set, which always reconnects the watch). I would like to include this in the 4.0 client, but that needs to be done soon.
Fixes issue: kubernetes-client/python#124