Skip to content
This repository was archived by the owner on Dec 3, 2024. It is now read-only.

Adding logic to process BR and BAR deletes. #29

Conversation

brahmaroutu
Copy link
Contributor

This PR will address BR and BAR deletion using Finalizers that will protect BR and BAR objects until the corresponding backend Bucket and BucketAccess objects are deleted. This will make sure that the clean up is done correctly without leaving any dangling objects.

/assign @wlan0 @rrati
/ping @jeffvance

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jan 4, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: brahmaroutu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot requested review from rrati and wlan0 January 4, 2021 22:16
@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 4, 2021
@brahmaroutu brahmaroutu force-pushed the controller_code_finalizer branch 2 times, most recently from 63a2838 to e62441a Compare January 4, 2021 22:22
@brahmaroutu
Copy link
Contributor Author

Fixes #12
Fixes #13
Fixes #14

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 13, 2021
@@ -57,13 +58,24 @@ func (b *bucketAccessRequestListener) Add(ctx context.Context, obj *v1alpha1.Buc
return nil
}

// Update is called in response to a change to BucketAccessRequesst. At this point

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: "Requesst"

Copy link

@jeffvance jeffvance Mar 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same typo still present

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@@ -57,13 +58,24 @@ func (b *bucketAccessRequestListener) Add(ctx context.Context, obj *v1alpha1.Buc
return nil
}

// Update is called in response to a change to BucketAccessRequesst. At this point
Copy link

@jeffvance jeffvance Jan 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For L48 do we want to validate that the existing BA is for this BAR?
Update: I see this is not necessary.

func (b *bucketAccessRequestListener) Update(ctx context.Context, old, new *v1alpha1.BucketAccessRequest) error {
glog.V(1).Infof("Update called for BucketAccessRequest %v", old.Name)
if (old.ObjectMeta.DeletionTimestamp == nil) &&
Copy link

@jeffvance jeffvance Jan 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this condition guaranteed by k8s to exist upon delete of the BAR? Is it ever possible to miss the initial delete event but catch the following BAR update and now both timestamps != nil?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, we should just check if new.DeletionTimestamp != nil

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

true, Fixed

// Update is called in response to a change to BucketAccessRequesst. At this point
// BucketAccess cannot be changed once created as the Provisioner might have already acted upon the create BucketAccess and created the backend Bucket Credentials
// Changes to Protocol, Provisioner, BucketInstanceName, BucketRequest cannot be allowed. Best is to delete and recreate a new BucketAccessRequest
// Changes to ServiceAccount, PolicyActionsConfigMapData and Parameters should be considered in lieu with sidecar implementation
func (b *bucketAccessRequestListener) Update(ctx context.Context, old, new *v1alpha1.BucketAccessRequest) error {
glog.V(1).Infof("Update called for BucketAccessRequest %v", old.Name)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many of these logs only show the BAR name but omit its namespace. Thus the logged name may not be unique.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see no change to reflect ns in log msgs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed it by using ns/name format.

// Delete is in response to user deleting a BucketAccessRequest. The call here will respond by deleting a BucketAccess Object.
func (b *bucketAccessRequestListener) Delete(ctx context.Context, bucketAccessRequest *v1alpha1.BucketAccessRequest) error {
glog.V(1).Infof("Delete called for BucketAccessRequest %v", bucketAccessRequest.Name)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In L92 below do we want to check that the existing BA belongs to this BAR?

Copy link

@jeffvance jeffvance Jan 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In L97 below should err "IsMissing" be checked. IOW, a non-nil error could be an intermittent API failure and COSI should retry. How it is coded COSI won't retry.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In L103 below I think we need more robust code here. If we cannot get the BR we don't necessarily have an "invalidBucketRequest". It could be an intermittent API failure and a retry will fix that. It could be that the BR was deleted and we have a problem since the BAR requires a BR. In either case I think we need a more accurate error msg.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In L112 below is there a max on how long we wait? Is this built into some max value the controller is created with?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In L92 below do we want to check that the existing BA belongs to this BAR? - Yes
In L97 below should err "IsMissing" be checked. IOW, a non-nil error could be an intermittent API failure and COSI should retry. How it is coded COSI won't retry. - Any error would trigger retry and this error will cause retry
In L103 below - BAR is not present we simply retry until we backoff?
L112 - retry is configurable.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: L99 comment should say BA not found rather than bucket not found.

@@ -119,7 +131,7 @@ func (b *bucketAccessRequestListener) provisionBucketAccess(ctx context.Context,
UID: sa.ObjectMeta.UID}
// bucketaccess.Spec.MintedSecretName - set by the driver
bucketaccess.Spec.PolicyActionsConfigMapData, err = util.ReadConfigData(b.kubeClient, bucketAccessClass.PolicyActionsConfigMap)
if err != nil {
if err != nil && err != util.ErrNilConfigMap {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the CM now optional in the BAC? The KEP says it is required.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure what you meant, should I not check if CM is empty? Nothing guarantees that the CM is populated, we never catch that any other place.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My point is that if the CM ref pointer is nil then we don't return an error here, which, to me, means that BAC.CM is optional. If a BAC admission controller is ensuring the BAC.CM ref field is filled in then we should should not have to check for a nil CM ref.

@@ -119,7 +131,7 @@ func (b *bucketAccessRequestListener) provisionBucketAccess(ctx context.Context,
UID: sa.ObjectMeta.UID}
// bucketaccess.Spec.MintedSecretName - set by the driver
bucketaccess.Spec.PolicyActionsConfigMapData, err = util.ReadConfigData(b.kubeClient, bucketAccessClass.PolicyActionsConfigMap)
if err != nil {
if err != nil && err != util.ErrNilConfigMap {
return err
}
// bucketaccess.Spec.Principal - set by the driver

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this and the other set by driver comment -- it's really set by the sidecar, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correct.

@@ -131,6 +143,10 @@ func (b *bucketAccessRequestListener) provisionBucketAccess(ctx context.Context,
return err
}

if !util.CheckFinalizer(bucketAccessRequest, util.BARDeleteFinalizer) {
Copy link

@jeffvance jeffvance Jan 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In L159 above what about the case of the BA already existing? This would be true in brownfield access.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

brownfield case is not coded yet but my understanding is that brownfield case Bucket exist but BA/BAR works the same.

Copy link

@jeffvance jeffvance Mar 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The BA was just created so why do we need to check for the delete finalizer here? CheckFInalizer doesn't re-Get the object so we are dealing with the in-memory BA that was just created.
Nevermind. I mis-read the code and I see now that you're checking the BAR finalizer.

@@ -131,6 +143,10 @@ func (b *bucketAccessRequestListener) provisionBucketAccess(ctx context.Context,
return err
}

if !util.CheckFinalizer(bucketAccessRequest, util.BARDeleteFinalizer) {
bucketAccessRequest.ObjectMeta.Finalizers = append(bucketAccessRequest.ObjectMeta.Finalizers, util.BARDeleteFinalizer)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this append work if metadata.Finalizers is nil (empty)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be fine as it is empty array.

@@ -131,6 +143,10 @@ func (b *bucketAccessRequestListener) provisionBucketAccess(ctx context.Context,
return err
}

if !util.CheckFinalizer(bucketAccessRequest, util.BARDeleteFinalizer) {
Copy link

@jeffvance jeffvance Jan 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't the finalizer be added to the bucketAccess.Metadata above, before the Create call? If this works, then you don't need to retry finalizer update block below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finalizers to BA are addressed in the Provisioner as discussed.

bucketaccess := b.FindBucketAccess(ctx, bucketAccessRequest)
if bucketaccess == nil {
// bucketaccess for this BucketAccessRequest is not found
return util.ErrBucketAccessDoesNotExist
Copy link

@jeffvance jeffvance Jan 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this an error? The BA is gone. Or do we want to retry?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a know error that we need not retry.

@brahmaroutu brahmaroutu force-pushed the controller_code_finalizer branch from e62441a to b104ba1 Compare January 18, 2021 22:57
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 18, 2021
@brahmaroutu brahmaroutu force-pushed the controller_code_finalizer branch 2 times, most recently from 6700e47 to 7b00219 Compare March 13, 2021 05:59
@wlan0
Copy link
Contributor

wlan0 commented Mar 15, 2021

@brahmaroutu pls fix build errors

@brahmaroutu brahmaroutu force-pushed the controller_code_finalizer branch from 7b00219 to 3b21a48 Compare March 16, 2021 02:53
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 6, 2021
@brahmaroutu brahmaroutu force-pushed the controller_code_finalizer branch from 3b21a48 to 70a3891 Compare April 20, 2021 04:26
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 20, 2021
@brahmaroutu
Copy link
Contributor Author

@wlan0 Can you please review

@k8s-ci-robot
Copy link
Contributor

@brahmaroutu: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 23, 2021
@fejta-bot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 22, 2021
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Aug 21, 2021
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot
Copy link
Contributor

@k8s-triage-robot: Closed this PR.

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
7 participants