This repository was archived by the owner on Oct 28, 2024. It is now read-only.

🐛 Webhook caBundle issues for Virtual Cluster #125

Closed
vincent-pli opened this issue Jun 11, 2021 · 15 comments · Fixed by #145
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@vincent-pli
Contributor

I haven't dug into the code much, but in my environment the webhook for VirtualCluster does not work.
I get this error when I try to create a VirtualCluster:

Error from server (InternalError): error when creating "virtualcluster_1_nodeport.yaml": Internal error occurred: failed calling webhook "virtualcluster.validating.webhook": Post "https://virtualcluster-webhook-service.vc-manager.svc:9443/validate-tenancy-x-k8s-io-v1alpha1-virtualcluster?timeout=30s": x509: certificate signed by unknown authority

Then I checked the ValidatingWebhookConfiguration, and there is no caBundle at all in virtualcluster-validating-webhook-configuration.

After I modified virtualcluster-validating-webhook-configuration and set the caBundle to the cluster's CA, everything worked as expected.
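
For reference, a minimal sketch of building such a manual fix as a JSON Patch, written in Go with only the standard library. It assumes the webhook to patch is the first entry (index 0) in the configuration and that the CA is available as PEM bytes. The helper name `caBundlePatch` is made up for illustration and is not part of this repository.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// jsonPatchOp is a single RFC 6902 JSON Patch operation.
type jsonPatchOp struct {
	Op    string `json:"op"`
	Path  string `json:"path"`
	Value string `json:"value"`
}

// caBundlePatch (hypothetical helper) builds a JSON Patch that sets
// clientConfig.caBundle on the webhook at webhookIndex. In JSON the field
// carries the PEM bundle base64-encoded.
func caBundlePatch(webhookIndex int, caPEM []byte) (string, error) {
	ops := []jsonPatchOp{{
		Op:    "add",
		Path:  fmt.Sprintf("/webhooks/%d/clientConfig/caBundle", webhookIndex),
		Value: base64.StdEncoding.EncodeToString(caPEM),
	}}
	out, err := json.Marshal(ops)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	// Placeholder bytes; in practice this would be the cluster CA certificate in PEM form.
	patch, err := caBundlePatch(0, []byte("placeholder CA PEM"))
	if err != nil {
		panic(err)
	}
	fmt.Println(patch)
}
```

The printed patch could then be applied with something like `kubectl patch validatingwebhookconfiguration virtualcluster-validating-webhook-configuration --type=json -p '<patch>'`.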

It seems we do not set the caBundle. Am I missing something?

@christopherhein
Contributor

/kind bug

@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Jun 11, 2021
@christopherhein
Contributor

/assign @charleszheng44

@charleszheng44
Contributor

@vincent-pli are you using openshift or native kubernetes?

@vincent-pli
Contributor Author

@charleszheng44
Native kubernetes, actually kind.

@christopherhein
Contributor

/retitle 🐛 Webhook caBundle issues for Virtual Cluster

@k8s-ci-robot k8s-ci-robot changed the title Webhook for Virtual Cluster do not work 🐛 Webhook caBundle issues for Virtual Cluster Jun 15, 2021
@charleszheng44
Contributor

@vincent-pli May I know which version of kind you are using?

@vincent-pli
Contributor Author

Sure, so you can reproduce the problem I hit? @charleszheng44

root@rentz1:~# kind version
kind v0.10.0 go1.15.2 linux/amd64

@charleszheng44
Contributor

@vincent-pli sorry for the late reply. I ran into the same issue when trying to set up VC on a Kind cluster; it looks like the certificate assigned to the webhook does not work properly.

However, I can successfully set up the VC framework and create a VC on Minikube, so this may be Kind-specific. I will try to find the cause. In the meantime, could you try Minikube or another test environment?

The Minikube version I used is 1.20.0, which uses the same version (v1.20) of Kubernetes as kind v0.10.0.

@vincent-pli
Contributor Author

That's weird. I checked some code, here:
https://github.com/kubernetes-sigs/cluster-api-provider-nested/blob/2e2add9bba1ec0c5104df0f64ce3c560f625bef8/virtualcluster/pkg/webhook/virtualcluster/virtualcluster_webhook.go#L155-L162

We do not set the caBundle when creating the configuration, and we do not update it after creation either.
So there is no chance for a caBundle to be added to the ValidatingWebhookConfiguration.

For Minikube, I will give it a try, but I guess there must be some certificate injection feature, like cert-manager or something similar, that injects the caBundle.
@charleszheng44

@charleszheng44
Contributor

charleszheng44 commented Jun 17, 2021

@vincent-pli This is intentional. If the caBundle is not specified, the system trusted CAs will be used. The details can be found in the definition of WebhookClientConfig. I guess the system-trusted CAs on the Kind cluster are somehow different from those on a Kubernetes cluster with physical nodes.
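
A rough sketch of that rule in Go, using only the standard library. The function names `rootsFor` and `clientTLSConfig` are made up for illustration, and this is not the API server's actual code; it only shows how an empty caBundle could map to the system trust roots while a non-empty one replaces them.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
)

// rootsFor (hypothetical helper) mirrors the documented caBundle semantics:
// an empty bundle means "use the system trust roots", a non-empty one
// replaces them with the certificates it contains.
func rootsFor(caBundle []byte) (*x509.CertPool, error) {
	if len(caBundle) == 0 {
		// A nil pool makes crypto/tls fall back to the host's root CA set.
		return nil, nil
	}
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caBundle) {
		return nil, errors.New("caBundle contains no usable PEM certificates")
	}
	return pool, nil
}

// clientTLSConfig (hypothetical helper) builds TLS settings that a webhook
// caller could use when connecting to serverName.
func clientTLSConfig(caBundle []byte, serverName string) (*tls.Config, error) {
	roots, err := rootsFor(caBundle)
	if err != nil {
		return nil, err
	}
	return &tls.Config{
		RootCAs:    roots,
		ServerName: serverName,
		MinVersion: tls.VersionTLS12,
	}, nil
}

func main() {
	cfg, err := clientTLSConfig(nil, "virtualcluster-webhook-service.vc-manager.svc")
	if err != nil {
		panic(err)
	}
	fmt.Println("using system trust roots:", cfg.RootCAs == nil)
}
```

With no caBundle, whether the connection succeeds then depends entirely on what the host's trust store contains.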

@vincent-pli
Contributor Author

If unspecified, system trust roots on the apiserver are used.

This means that if caBundle is unspecified, the system trust roots will be used to validate the webhook's certificate.

But our webhook's certificate is signed through the CSR API, and the CA that signs CSR API certificates is not a system trust root.
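
To make the point concrete, here is a small Go sketch (standard library only) that reproduces the failure mode outside a cluster. A throwaway CA stands in for the cluster's CSR signer, and an empty pool stands in for a trust store that does not contain it; all helper names are hypothetical.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"errors"
	"fmt"
	"math/big"
	"time"
)

// webhookHost is the service DNS name from the error message above.
const webhookHost = "virtualcluster-webhook-service.vc-manager.svc"

// issue creates a certificate from tmpl; a nil parent means self-signed.
func issue(tmpl, parent *x509.Certificate, parentKey *ecdsa.PrivateKey) (*x509.Certificate, *ecdsa.PrivateKey, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl.NotBefore = time.Now().Add(-time.Hour)
	tmpl.NotAfter = time.Now().Add(24 * time.Hour)
	if parent == nil {
		parent, parentKey = tmpl, key
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, parent, &key.PublicKey, parentKey)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	return cert, key, err
}

// newCA returns a throwaway CA standing in for the cluster's CSR signer.
func newCA() (*x509.Certificate, *ecdsa.PrivateKey, error) {
	return issue(&x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: "example-csr-signer"},
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign | x509.KeyUsageDigitalSignature,
	}, nil, nil)
}

// newServingCert returns a webhook serving certificate signed by the CA.
func newServingCert(ca *x509.Certificate, caKey *ecdsa.PrivateKey) (*x509.Certificate, error) {
	cert, _, err := issue(&x509.Certificate{
		SerialNumber: big.NewInt(2),
		Subject:      pkix.Name{CommonName: webhookHost},
		DNSNames:     []string{webhookHost},
		KeyUsage:     x509.KeyUsageDigitalSignature,
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
	}, ca, caKey)
	return cert, err
}

// poolWith builds a trust store holding exactly the given certificates.
func poolWith(certs ...*x509.Certificate) *x509.CertPool {
	pool := x509.NewCertPool()
	for _, c := range certs {
		pool.AddCert(c)
	}
	return pool
}

// verify checks leaf against roots for webhookHost, as a TLS client would.
func verify(leaf *x509.Certificate, roots *x509.CertPool) error {
	_, err := leaf.Verify(x509.VerifyOptions{Roots: roots, DNSName: webhookHost})
	return err
}

// isUnknownAuthority reports whether err is x509's "unknown authority" error.
func isUnknownAuthority(err error) bool {
	var ua x509.UnknownAuthorityError
	return errors.As(err, &ua)
}

func main() {
	ca, caKey, err := newCA()
	if err != nil {
		panic(err)
	}
	leaf, err := newServingCert(ca, caKey)
	if err != nil {
		panic(err)
	}
	// Trust store without the signer: verification is expected to fail.
	fmt.Println("unknown authority:", isUnknownAuthority(verify(leaf, poolWith())))
	// Trust store with the signer, which is what caBundle provides.
	fmt.Println("verified with CA:", verify(leaf, poolWith(ca)) == nil)
}
```

The same serving certificate is rejected or accepted depending only on whether the signing CA is among the roots the client trusts.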

Please help to check if the caBundle field is injected in minikube env, thanks
@charleszheng44

@charleszheng44
Contributor

Please correct me if I am wrong. My understanding is that the WebhookClientConfig here is used by the APIServer to set up a connection between itself and the webhook. When talking to the webhook, the APIServer acts as a client, and the caBundle is used to authenticate the response sent back from the webhook. The system trust roots on the APIServer are the CAs loaded by the APIServer at startup, and the CA used to sign the CSR is one of them.

Please help to check if the caBundle field is injected in minikube env, thanks

Did you mean the caBundle is injected in the APIServer pod or the node running APIServer?

@vincent-pli
Contributor Author

Thanks @charleszheng44
I think everything you said is correct except one thing:

the CA used to sign the CSR is one of them

I'm not an expert in this area, but I guess the system trust probably means operating system trust. I mean, for Ubuntu these CAs are located under the path /usr/local/share/ca-certificates/, and I noticed that the kube-apiserver pod mounts that path as a hostPath volume from the node.

I want to say again, I'm not an expert, but I'm happy to figure it out. Thanks @charleszheng44

I also found an OpenShift issue where they seem to be discussing the same thing as us; please take a look:
https://bugzilla.redhat.com/show_bug.cgi?id=1960936

@charleszheng44
Contributor

charleszheng44 commented Jun 19, 2021

@vincent-pli Thanks for pointing me to the OpenShift issue. Looks like the CA used to sign the CSR is not one of the system trust roots (my fault 😅).

There are two options to resolve this issue: we can either leverage an external component, like cert-manager, or run the webhook server pod with an init container that generates a self-signed certificate and later stores the CA in the caBundle of the WebhookConfiguration.

cert-manager itself is a large application that includes many CRDs. In our case there is only one webhook, so I may go with the second option. I will try to implement it next week. In the meantime, could you temporarily use Minikube for testing, or hack the code by adding the serviceaccount CA to the caBundle (I tried this before on Kind and it worked)?
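
As an illustration of the second option, the certificate generation step might look roughly like the Go sketch below (standard library only). This is not the implementation that was later merged; the type `webhookCerts` and the functions `generateWebhookCerts` and `check` are hypothetical names, and writing the CA into the WebhookConfiguration is left out.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"errors"
	"fmt"
	"math/big"
	"time"
)

// webhookCerts holds PEM material: CABundle would go into the webhook
// configuration's caBundle, ServerCert and ServerKey into the webhook server.
type webhookCerts struct {
	CABundle, ServerCert, ServerKey []byte
}

// pemBlock PEM-encodes DER bytes under the given block type.
func pemBlock(kind string, der []byte) []byte {
	return pem.EncodeToMemory(&pem.Block{Type: kind, Bytes: der})
}

// generateWebhookCerts creates a self-signed CA and a serving certificate for host.
func generateWebhookCerts(host string) (*webhookCerts, error) {
	notBefore, notAfter := time.Now().Add(-time.Hour), time.Now().AddDate(1, 0, 0)
	caKey, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	caTmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1), Subject: pkix.Name{CommonName: "webhook-ca"},
		NotBefore: notBefore, NotAfter: notAfter,
		IsCA: true, BasicConstraintsValid: true, KeyUsage: x509.KeyUsageCertSign,
	}
	caDER, err := x509.CreateCertificate(rand.Reader, caTmpl, caTmpl, &caKey.PublicKey, caKey)
	if err != nil {
		return nil, err
	}
	caCert, err := x509.ParseCertificate(caDER)
	if err != nil {
		return nil, err
	}
	srvKey, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	srvTmpl := &x509.Certificate{
		SerialNumber: big.NewInt(2), Subject: pkix.Name{CommonName: host},
		DNSNames: []string{host}, NotBefore: notBefore, NotAfter: notAfter,
		KeyUsage:    x509.KeyUsageDigitalSignature,
		ExtKeyUsage: []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
	}
	srvDER, err := x509.CreateCertificate(rand.Reader, srvTmpl, caCert, &srvKey.PublicKey, caKey)
	if err != nil {
		return nil, err
	}
	keyDER, err := x509.MarshalECPrivateKey(srvKey)
	if err != nil {
		return nil, err
	}
	return &webhookCerts{
		CABundle:   pemBlock("CERTIFICATE", caDER),
		ServerCert: pemBlock("CERTIFICATE", srvDER),
		ServerKey:  pemBlock("EC PRIVATE KEY", keyDER),
	}, nil
}

// check confirms the key pair loads and chains to CABundle for host.
func check(c *webhookCerts, host string) error {
	pair, err := tls.X509KeyPair(c.ServerCert, c.ServerKey)
	if err != nil {
		return err
	}
	leaf, err := x509.ParseCertificate(pair.Certificate[0])
	if err != nil {
		return err
	}
	roots := x509.NewCertPool()
	if !roots.AppendCertsFromPEM(c.CABundle) {
		return errors.New("CABundle holds no PEM certificates")
	}
	_, err = leaf.Verify(x509.VerifyOptions{Roots: roots, DNSName: host})
	return err
}

func main() {
	host := "virtualcluster-webhook-service.vc-manager.svc"
	certs, err := generateWebhookCerts(host)
	if err != nil {
		panic(err)
	}
	fmt.Println("chain verifies against generated CA:", check(certs, host) == nil)
}
```

Because the CA is generated alongside the serving certificate, the caBundle no longer depends on what the API server's host happens to trust.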

@vincent-pli
Contributor Author

Thanks @charleszheng44
Looking forward to your implementation : )

4 participants