-
-
Notifications
You must be signed in to change notification settings - Fork 3.6k
Retry on API failure when connecting from builders #4902
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
f80fb1e
to
d9a9af1
Compare
connect=3, | ||
status=3, | ||
backoff_factor=0.5, # 0.5, 1, 2 seconds | ||
method_whitelist=('GET', 'PUT', 'PATCH', 'POST'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
POST
could be a little dangerous here, but considering that we are using it to create a new BuildCommand I think it's necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the docs
By default, we only retry on methods which are considered to be idempotent (multiple requests with the same parameters end with the same state). See Retry.DEFAULT_METHOD_WHITELIST.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd be a little concerned about the POST retry as well, but these are the calls that we're currently failing on, so I'm not sure if there is anything we want to do differently here.
d9a9af1
to
eee2c36
Compare
Lint is failing with
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the fix looks good! I don't have a strong opinion on POST support, it seems to be a requirement that we retry POST requests.
In the worst case, retrying on POST will give us duplicated BuildCommand; which is not good at all but at least it won't break anything. We could merge and investigate after deploy or maybe build something to de-duplicate these entries :/ |
Codecov Report
@@ Coverage Diff @@
## master #4902 +/- ##
==========================================
+ Coverage 76.65% 76.65% +<.01%
==========================================
Files 158 158
Lines 10057 10059 +2
Branches 1269 1269
==========================================
+ Hits 7709 7711 +2
Misses 2007 2007
Partials 341 341
|
I used 3 retries for now at 0.5, 1 and 2 seconds.
We can test with these values and we can increase later if we consider necessary.
Closes #2258