-
Notifications
You must be signed in to change notification settings - Fork 114
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: adrianludwin The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/assign @rjbez17 Also CC @erikgb , please feel free to leave comments but I'd like Ryan to approve since he has a bit more context here. The HRQ implementation I allude to in the commit description is at https://github.com/adrianludwin/hierarchical-namespaces/tree/hrq. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A nit and a question, otherwise lgtm
Instead of having the HC reconciler directly enqueue other reconcilers, it now has a level of indirection through the forest. This allows the anchor and object reconcilers - and, soon, the Hierarchical Resource Quota reconcilers - to be notified of changes to the hierarchy, without the HC reconciler having to know about them. (This change was prompted by my work on HRQs, which I felt would overly complicate the HC reconciler). At the same time, I've removed the enqueues of the HNC Config reconciler. Instead of reconciling whenever there was a change, with a rate-limited maximum, it's now just on a periodic ticker. Finally, I used the SetupWithManager functions as constructors to take the initialization of certain fields out of internal/setup and put them into the per-feature packages. Tested: all e2e tests still work (I didn't try the ones that require repair but those should be unaffected). Also tested this with my HRQ changes and HRQ works correctly and fits in with minimal changes to the rest of HNC.
Oooh, that's a good idea. I'll add that to #182.
…On Tue, Apr 5, 2022 at 10:31 AM Ryan Bezdicek ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In internal/objects/reconciler.go
<#180 (comment)>
:
> + // since this is in a goroutine, we can't return it. Therefore, we need to handle the error
+ // ourselves.
+ attempts := 0
+ for {
+ attempts++
+ err := r.enqueueExistingObjects(context.Background(), log, nsnm)
+ if err == nil {
+ break
+ }
+ log.Error(err, "Couldn't list objects")
+ time.Sleep(10 * time.Second)
+ if attempts == 3 {
+ // We can get into a permanent failure loop if, for example, the GVK has been deleted from
+ // the server (i.e. it was a CRD). Rather than looping forever, let's restart and hope that
+ // fixes it.
+ log.Info("Couldn't list objects for 30s. Something's gone very wrong; hopefully a restart will fix it.")
Yeah that makes sense. We could also just add some logging here.
Alternatively, we could just update the healthz endpoint in this case to
tell the control plane we aren't healthy. We could then loop forever in the
code and lets k8s restart us based on liveness probes. If the loop recovers
we could update the endpoint to be healthy again. It would also give a
better 'audit trail' of HNC health for monitoring tools as it'll show as
unhealthy in k8s instead of random restarts.
However, that seems like its out of scope of this specific PR.
—
Reply to this email directly, view it on GitHub
<#180 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AE43PZGDCOQA7E3OYDC5AALVDRFCHANCNFSM5SQLCWUA>
.
You are receiving this because you were mentioned.Message ID:
***@***.***
com>
|
/lgtm |
Instead of having the HC reconciler directly enqueue other reconcilers,
it now has a level of indirection through the forest. This allows the
anchor and object reconcilers - and, soon, the Hierarchical Resource
Quota reconcilers - to be notified of changes to the hierarchy, without
the HC reconciler having to know about them. (This change was prompted
by my work on HRQs, which I felt would overly complicate the HC
reconciler).
At the same time, I've removed the enqueues of the HNC Config
reconciler. Instead of reconciling whenever there was a change, with a
rate-limited maximum, it's now just on a periodic ticker.
Finally, I used the SetupWithManager functions as constructors to take
the initialization of certain fields out of internal/setup and put them
into the per-feature packages.
Tested: all e2e tests still work (I didn't try the ones that require
repair but those should be unaffected). Also tested this with my HRQ
changes and HRQ works correctly and fits in with minimal changes to the
rest of HNC.