Fix bucket* controller framework #25

wlan0 · 2021-03-08T08:10:11Z

backoff exponentially on errors
ensure correct behavior when multiple operations on the same object pile on top of one another

k8s-ci-robot · 2021-03-08T08:10:19Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: wlan0

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [wlan0]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

jeffvance · 2021-03-09T02:09:01Z

controller/controller.go


 	// Ensure that multiple operations on different versions of the same object
 	// do not happen in parallel
-	c.OpLock(op)
-	defer c.OpUnlock(op)
+	c.OpLock(uuid)


Sorry, I am missing something here. How does tracking uuid prevent operating in parallel on different versions of the same object? Wouldn't this serialize on all versions of the object?

In our controller, we run a single shared informer between multiple CRD resource event handlers. The single shared informer reports changes in bucket* objects as they occur. The events are processed by first encoding them in operation structs and appending them to the tail of a workqueue. The workqueue implementation is what this PR mostly touches on.

The workqueue was previously holding operation objects (addOp, updateOp, deleteOp). On any event, these operation objects were put into the queue, and waited to be processed by the next available thread.

These objects were not comparable with one another, and that led to a few problems:

Given an object, if an update occurs while an add is waiting in the queue, it should result in only one event: an add with the updated fields.

Given an object, if an update occurs while another update is waiting in the queue, then the second update is the only one that should be processed.

The previous implementation, albeit technically eventually consistent, could not ensure the above dynamics. This was simply because it was not possible to tell whether a new event referred to an object that was already present in the queue.
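The two rules above can be sketched as a small merge function. This is only an illustration: `OpKind`, `Op`, and `coalesce` are hypothetical names, not the types in this PR, and the fallback for combinations the thread does not mention (for example a delete arriving after an add) is an assumption.

```go
package main

import "fmt"

// OpKind is a hypothetical label for the three event types named in the thread.
type OpKind int

const (
	AddOp OpKind = iota
	UpdateOp
	DeleteOp
)

// Op is a hypothetical stand-in for the controller's operation structs.
type Op struct {
	Kind   OpKind
	Object string // stands in for the latest observed state of the object
}

// coalesce merges a newly observed event into one that is still waiting
// in the queue for the same object.
func coalesce(pending, incoming Op) Op {
	switch {
	case pending.Kind == AddOp && incoming.Kind == UpdateOp:
		// add followed by update: still a single add, carrying the updated fields.
		return Op{Kind: AddOp, Object: incoming.Object}
	case pending.Kind == UpdateOp && incoming.Kind == UpdateOp:
		// update followed by update: only the second update is processed.
		return incoming
	default:
		// Assumption: combinations not covered in the discussion let the
		// newest event win. The real controller may handle them differently.
		return incoming
	}
}

func main() {
	merged := coalesce(Op{Kind: AddOp, Object: "v1"}, Op{Kind: UpdateOp, Object: "v2"})
	fmt.Println(merged.Kind == AddOp, merged.Object)

	merged = coalesce(Op{Kind: UpdateOp, Object: "v2"}, Op{Kind: UpdateOp, Object: "v3"})
	fmt.Println(merged.Kind == UpdateOp, merged.Object)
}
```

A merge like this only becomes possible once two events can be matched to the same object, which is the gap described above.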

Now, for the uuid. This is a string field that is easily comparable (==), and by using the Kubernetes UID as the unique value in this field, it is possible to compare two events occurring on the same object.

This change also allowed me to improve the exponential backoff characteristics. If new events occur while an operation on the same object is in back-off, the new event will not be processed before the back-off period is complete, i.e. the uuid's position in the queue is not changed. Instead, the value of opMap[uuid] is set to the new operation structure denoting the event. By decoupling the operation itself from the queuing order, we gain a lot of flexibility.

Now to answer your direct question,

> Sorry, I am missing something here. How does tracking uuid prevent operating in parallel on different versions of the same object? Wouldn't this serialize on all versions of the object?

By using uuid, we do not serialize the steps. We either replace old events with new ones if both are updates or handle them differently based on the type of the events.

jeffvance · 2021-03-09T02:12:26Z

controller/controller.go

+			o.Indexer.Delete(o.Object)
+		} else {
+			glog.Errorf("Error deleting %s %s: %v", ns, name, err)
+		}


in default block can we insert the operator that was not handled, or is that logged earlier?

jeffvance · 2021-03-09T02:15:11Z

controller/controller.go

+func (c *ObjectStorageController) GetOpLock(op types.UID) *sync.Mutex {
+	lockKey := op
+	c.lockerLock.Lock()
+	defer c.lockerLock.Unlock()


~~should the defer be after L326 to handle the case of c.locker == nil?~~

Even though the docs say that deferred funcs can be called anytime after the defer statement itself, my understanding is that a deferred call always runs only when the function containing the defer statement returns, i.e. it is tied to the surrounding function rather than to the block in which the defer appears.
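To make the defer question concrete, here is a hedged sketch of a per-UID lock lookup in the shape of `GetOpLock`. The struct `lockTable` and the lazy map initialization are assumptions for illustration, and a plain `string` key is used in place of `types.UID` so the example has no Kubernetes dependency.

```go
package main

import (
	"fmt"
	"sync"
)

// lockTable is a hypothetical stand-in for the controller fields that
// GetOpLock touches.
type lockTable struct {
	lockerLock sync.Mutex             // guards the locker map
	locker     map[string]*sync.Mutex // one mutex per object UID, created lazily
}

// GetOpLock returns the mutex for uid, creating it on first use.
func (t *lockTable) GetOpLock(uid string) *sync.Mutex {
	t.lockerLock.Lock()
	// The deferred Unlock runs when GetOpLock returns, so it covers every
	// path below, including the branch that initializes a nil map.
	defer t.lockerLock.Unlock()

	if t.locker == nil {
		t.locker = make(map[string]*sync.Mutex)
	}
	if _, ok := t.locker[uid]; !ok {
		t.locker[uid] = &sync.Mutex{}
	}
	return t.locker[uid]
}

func main() {
	var t lockTable
	a1 := t.GetOpLock("uid-a")
	a2 := t.GetOpLock("uid-a") // would deadlock if the first call had not unlocked
	b := t.GetOpLock("uid-b")

	fmt.Println(a1 == a2) // true: same UID maps to the same mutex
	fmt.Println(a1 == b)  // false: different UIDs get different mutexes
}
```

Operations on the same UID then contend for one mutex, while operations on different objects proceed independently.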

brahmaroutu · 2021-03-10T20:21:37Z

/lgtm

k8s-ci-robot requested review from brahmaroutu and msau42 March 8, 2021 08:10

k8s-ci-robot added the approved, cncf-cla: yes, and size/L labels Mar 8, 2021

jeffvance reviewed Mar 9, 2021

Fix bucket* controller framework

64fea45

- backoff exponentially on errors
- ensure correct behavior when multiple operations on the same object pile on top of one another

wlan0 force-pushed the master branch from 5d13f90 to 64fea45 Compare March 9, 2021 07:59

wlan0 mentioned this pull request Mar 10, 2021

Removing the processing of Sync events #22

Closed

k8s-ci-robot assigned brahmaroutu Mar 10, 2021

k8s-ci-robot added the lgtm label Mar 10, 2021

k8s-ci-robot merged commit 05acb5e into kubernetes-retired:master Mar 10, 2021

shanduur pushed a commit to shanduur/container-object-storage-interface-api that referenced this pull request Jun 6, 2024

Merge pull request kubernetes-retired#25 from rrati/context-fix

9c77f71

Fix context definition

BlaineEXE pushed a commit to BlaineEXE/cosi-api that referenced this pull request Jun 14, 2024

Merge pull request kubernetes-retired#25 from rrati/context-fix

52d17f0

Fix context definition

BlaineEXE pushed a commit to BlaineEXE/cosi-api that referenced this pull request Jun 14, 2024

Merge pull request kubernetes-retired#25 from wlan0/master

3bbacbb

remove opaque parameters from delete request and return bucket id o…
