| 1 | +# VirtualCluster - Enabling Kubernetes Hard Multi-tenancy |
| 2 | + |
| 3 | +VirtualCluster represents a new architecture to address various Kubernetes control plane isolation challenges. |
| 4 | +It extends the existing namespace-based Kubernetes multi-tenancy model by providing each tenant with a cluster view. |
| 5 | +VirtualCluster fully leverages Kubernetes extensibility and preserves full API compatibility. |
| 6 | +In particular, the core Kubernetes components are not modified in VirtualCluster. |
| 7 | + |
| 8 | +With VirtualCluster, each tenant is assigned a dedicated tenant control plane, which is an upstream Kubernetes distribution. |
| 9 | +Tenants can create cluster-scoped resources such as namespaces and CRDs in the tenant control plane without affecting others. |
| 10 | +As a result, most of the isolation problems due to sharing one apiserver disappear. |
| 11 | +The Kubernetes cluster that manages the actual physical nodes is called a super cluster, which now |
| 12 | +becomes a Pod resource provider. VirtualCluster is composed of the following components: |
| 13 | + |
| 14 | +- **vc-manager**: A new CRD [VirtualCluster](pkg/apis/tenancy/v1alpha1/virtualcluster_types.go) is introduced |
| 15 | +to model the tenant control plane. `vc-manager` manages the lifecycle of each `VirtualCluster` custom resource. |
| 16 | +Based on the specification, it either creates CAPN control plane Pods in the local K8s cluster, |
| 17 | +or imports an existing cluster if a valid `kubeconfig` is provided. |
| 18 | + |
| 19 | +- **syncer**: A centralized controller that populates the API objects needed for Pod provisioning from every tenant control plane |
| 20 | +to the super cluster, and bidirectionally syncs the object statuses. It also periodically scans the synced objects to ensure |
| 21 | +that the states of each tenant control plane and the super cluster are consistent. |
| 22 | + |
| 23 | +- **vn-agent**: A node daemon that proxies all tenant kubelet API requests to the kubelet process running |
| 24 | +on the node. It ensures each tenant can only access its own Pods on the node. |
| 25 | + |
| 26 | +With all of the above, from the tenant's perspective, each tenant control plane behaves like an intact Kubernetes cluster with nearly full API capabilities. |
| 27 | +For more technical details, please check our [ICDCS 2021 paper](./doc/vc-icdcs.pdf). |
| 28 | + |
| 29 | +## Live Demos/Presentations |
| 30 | + |
| 31 | +KubeCon EU 2020 talk (~25 mins) | WG meeting demo (~50 mins) |
| 32 | +--- | --- |
| 33 | +[vc-kubecon-eu-2020](https://www.youtube.com/watch?v=5RgF_dYyvEY "vc-kubecon-eu-2020") | [vc-demo-long](http://www.youtube.com/watch?v=Kow00IEUbAA "vc-demo-long") |
| 34 | + |
| 35 | +## Quick Start |
| 36 | + |
| 37 | +Please follow the [instructions](./doc/demo.md) to install VirtualCluster in your local K8s cluster. |
| 38 | + |
| 39 | +## Abstraction |
| 40 | + |
| 41 | +In VirtualCluster, the tenant control plane is the source of truth for the specs of all synced objects. |
| 42 | +The exceptions are persistent volume, storage class, and priority class resources, whose source of truth is the super cluster. |
| 43 | +The syncer updates the synced object's status in each tenant control plane, |
| 44 | +acting like a regular resource controller. This abstraction model relies on the following assumptions: |
| 45 | +- The synced object spec _SHOULD NOT_ be altered by any arbitrary controller in the super cluster. |
| 46 | +- The tenant control plane owns the lifecycle management for the synced object. The synced objects _SHOULD NOT_ be |
| 47 | + managed by any controllers (e.g., StatefulSet) in the super cluster. |
| 48 | + |
| 49 | +If any of the above assumptions is violated, VirtualCluster may not work as expected. Note that this |
| 50 | +does not mean that a cluster administrator cannot install webhooks, for example, a sidecar webhook, |
| 51 | +in the super cluster. Those webhooks will still work, but the changes they make will be |
| 52 | +hidden from tenants. Alternatively, those webhooks can be installed in tenant control planes so that |
| 53 | +tenants will be aware of all changes. |
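The ownership rule described in this section can be summarized in a minimal Python sketch. The function names `spec_owner` and `sync_direction` and the kind strings are hypothetical, chosen for illustration only; this is not the syncer's actual code, and the propagation direction for super-cluster-owned kinds is an inference from the text.

```python
# Hypothetical illustration of the source-of-truth rule; not the syncer's code.
# Kinds whose spec is owned by the super cluster, per the text above.
SUPER_CLUSTER_OWNED = {"PersistentVolume", "StorageClass", "PriorityClass"}


def spec_owner(kind: str) -> str:
    """Return which side holds the source of truth for a synced object's spec."""
    if kind in SUPER_CLUSTER_OWNED:
        return "super cluster"
    return "tenant control plane"


def sync_direction(kind: str) -> str:
    """Describe the assumed direction in which the spec is propagated."""
    if spec_owner(kind) == "super cluster":
        return "super cluster -> tenant control plane"
    return "tenant control plane -> super cluster"
```

For example, `spec_owner("Pod")` falls on the tenant side, while `spec_owner("StorageClass")` falls on the super cluster side.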
| 54 | + |
| 55 | +## Limitations |
| 56 | + |
| 57 | +Ideally, tenants should not be aware of the existence of the super cluster in most cases. |
| 58 | +However, there are still some noticeable differences between a tenant control plane and a normal Kubernetes cluster. |
| 59 | + |
| 60 | +- In the tenant control plane, node objects only show up after tenant Pods are created. The super cluster |
| 61 | + node topology is not fully exposed in the tenant control plane. This means that VirtualCluster does not support |
| 62 | + `DaemonSet`-like workloads in the tenant control plane. Currently, the syncer controller rejects a newly |
| 63 | + created tenant Pod if its `nodeName` has been set in the spec. |
| 64 | + |
| 65 | +- The syncer controller manages the lifecycle of the node objects in the tenant control plane, but |
| 66 | + it does not update the node lease objects in order to reduce network traffic. As a result, |
| 67 | + it is recommended to increase the tenant control plane node controller `--node-monitor-grace-period` |
| 68 | + parameter to a larger value (>60 seconds; this is already done in the sample clusterversion |
| 69 | + [yaml](config/sampleswithspec/clusterversion_v1_nodeport.yaml)). |
| 70 | + |
| 71 | +- CoreDNS is not tenant-aware. Hence, tenants should install CoreDNS in the tenant control plane if DNS is required. |
| 72 | +The DNS service should be created in the `kube-system` namespace using the name `kube-dns`. The syncer controller can then |
| 73 | +recognize the DNS service's cluster IP in the super cluster and inject it into any Pod `spec.dnsConfig`. |
| 74 | + |
| 75 | +- The cluster IP field in the tenant service spec is a bogus value. If any tenant controller requires the |
| 76 | +actual cluster IP that takes effect on the super cluster nodes, special handling is required. |
| 77 | +The syncer back-populates the cluster IP used in the super cluster into the |
| 78 | +annotations of the tenant service object using `transparency.tenancy.x-k8s.io/clusterIP` as the key. |
| 79 | +The workaround is then usually a simple code change in the controller. |
| 80 | +This [document](./doc/tenant-dns.md) shows an example for CoreDNS. |
| 81 | + |
| 82 | +- VirtualCluster does not support tenant PersistentVolumes. All PVs and StorageClasses are provided by the super cluster. |
| 83 | + |
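As a rough illustration of the cluster IP workaround in the list above, the following Python sketch shows how a tenant controller might resolve the effective cluster IP from a Service object. The function name `effective_cluster_ip` and the plain-dict representation of the Service are assumptions for illustration; only the annotation key comes from the text.

```python
from typing import Optional

# Annotation key taken from the text above.
CLUSTER_IP_ANNOTATION = "transparency.tenancy.x-k8s.io/clusterIP"


def effective_cluster_ip(service: dict) -> Optional[str]:
    """Prefer the super cluster IP back-populated by the syncer.

    Falls back to spec.clusterIP when the annotation is absent.
    `service` is assumed to be a Service manifest parsed into a dict.
    """
    annotations = service.get("metadata", {}).get("annotations") or {}
    ip = annotations.get(CLUSTER_IP_ANNOTATION)
    if ip:
        return ip
    return service.get("spec", {}).get("clusterIP")
```

A controller that currently reads `spec.clusterIP` directly could call a helper like this instead, which keeps the change small and leaves behavior unchanged on a normal cluster where the annotation is not set.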
| 84 | +VirtualCluster passes most of the Kubernetes conformance tests. One failing test requires support for |
| 85 | +`subdomain`, which cannot be easily provided in VirtualCluster. |
| 86 | + |
| 87 | +## FAQ |
| 88 | + |
| 89 | +### Q: What is the difference between VirtualCluster and a multi-cluster solution? |
| 90 | + |
| 91 | +One of the primary design goals of VirtualCluster is to improve the overall resource utilization |
| 92 | +of a super cluster by allowing multiple tenants to share the node resources while keeping their control planes isolated. |
| 93 | +A multi-cluster solution can achieve the same isolation goal, but resources are not shared, so |
| 94 | +nodes have lower utilization. |
| 95 | + |
| 96 | +### Q: Can the tenant control plane run its own scheduler? |
| 97 | + |
| 98 | +VirtualCluster was primarily designed for serverless use cases where users normally do not have |
| 99 | +scheduling preferences. Using the super cluster scheduler makes it much easier to |
| 100 | +achieve good overall resource utilization. For these reasons, |
| 101 | +VirtualCluster does not support a tenant scheduler. It is technically possible |
| 102 | +to support a tenant scheduler by exposing some of the super cluster nodes directly in the |
| 103 | +tenant control plane. Those nodes have to be dedicated to the tenant to avoid any scheduling |
| 104 | +conflicts. This type of tenant should be exceptional. |
| 105 | + |
| 106 | +### Q: What is the difference between Syncer and Virtual Kubelet? |
| 107 | + |
| 108 | +They have similarities. In some sense, the syncer controller can be viewed as a replacement for a virtual |
| 109 | +kubelet in cases where the resource provider of the virtual kubelet is a Kubernetes cluster. The syncer |
| 110 | +maintains a one-to-one mapping between a virtual node in the tenant control plane and a real node |
| 111 | +in the super cluster. It preserves Kubernetes API compatibility as closely as possible. Additionally, |
| 112 | +it provides fair queuing to mitigate tenant contention. |
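To give an intuition for why fair queuing mitigates tenant contention, here is a minimal round-robin queue in Python. This is a generic technique shown under assumptions, not the syncer's actual algorithm; the class name `RoundRobinQueue` and its methods are hypothetical.

```python
from collections import OrderedDict, deque


class RoundRobinQueue:
    """Per-tenant FIFO queues served in round-robin order (illustrative only)."""

    def __init__(self):
        # Maps tenant name -> deque of pending work items.
        self._queues = OrderedDict()

    def add(self, tenant, item):
        """Enqueue an item for a tenant, creating its queue on first use."""
        self._queues.setdefault(tenant, deque()).append(item)

    def get(self):
        """Pop the next item and rotate the served tenant to the back."""
        if not self._queues:
            return None
        tenant, queue = next(iter(self._queues.items()))
        item = queue.popleft()
        del self._queues[tenant]
        if queue:
            # Re-insert at the back so other tenants are served first.
            self._queues[tenant] = queue
        return tenant, item
```

With this scheme, a tenant that enqueues many items at once cannot starve a tenant with a single pending item, because the busy tenant is moved to the back after each item it is served.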
| 113 | + |
| 114 | +## Release |
| 115 | + |
| 116 | +The first release is coming soon. |
| 117 | + |
| 118 | +## Community |
| 119 | +VirtualCluster is a SIG cluster-api-provider-nested (CAPN) supporting project. |
| 120 | +If you have any questions or want to contribute, you are welcome to file issues or pull requests. |
| 121 | + |
| 122 | +You can also directly contact VirtualCluster maintainers via the WG [slack channel](https://kubernetes.slack.com/messages/wg-multitenancy). |
| 123 | + |
| 124 | +Lead developer: @Fei-Guo( [email protected]) |