
Commit 4269e61

Merge pull request #54 from klueska/add-custom-config

Add support for opaque configs with examples

2 parents: 0c883fd + dfc6542

18 files changed: +650 -486 lines

README.md (+48 -18)
````diff
@@ -225,10 +225,10 @@ metadata:
 ```
 
 Next, deploy four example apps that demonstrate how `ResourceClaim`s,
-`ResourceClaimTemplate`s, and custom `ClaimParameter` objects can be used to
-request access to resources in various ways:
+`ResourceClaimTemplate`s, and custom `GpuConfig` objects can be used to
+select and configure resources in various ways:
 ```bash
-kubectl apply --filename=demo/gpu-test{1,2,3,4}.yaml
+kubectl apply --filename=demo/gpu-test{1,2,3,4,5}.yaml
 ```
 
 And verify that they are coming up successfully:
@@ -242,10 +242,11 @@ gpu-test2 pod0 0/2 Pending 0 2s
 gpu-test3 pod0 0/1 ContainerCreating 0 2s
 gpu-test3 pod1 0/1 ContainerCreating 0 2s
 gpu-test4 pod0 0/1 Pending 0 2s
+gpu-test5 pod0 0/4 Pending 0 2s
 ...
 ```
 
-Use your favorite editor to look through each of the `gpu-test{1,2,3,4}.yaml`
+Use your favorite editor to look through each of the `gpu-test{1,2,3,4,5}.yaml`
 files and see what they are doing. The semantics of each match the figure
 below:
 
@@ -254,12 +255,16 @@ below:
 Then dump the logs of each app to verify that GPUs were allocated to them
 according to these semantics:
 ```bash
-for example in $(seq 1 4); do \
+for example in $(seq 1 5); do \
   echo "gpu-test${example}:"
   for pod in $(kubectl get pod -n gpu-test${example} --output=jsonpath='{.items[*].metadata.name}'); do \
     for ctr in $(kubectl get pod -n gpu-test${example} ${pod} -o jsonpath='{.spec.containers[*].name}'); do \
       echo "${pod} ${ctr}:"
-      kubectl logs -n gpu-test${example} ${pod} -c ${ctr}| grep GPU_DEVICE
+      if [ "${example}" -lt 3 ]; then
+        kubectl logs -n gpu-test${example} ${pod} -c ${ctr}| grep -E "GPU_DEVICE_[0-9]+="
+      else
+        kubectl logs -n gpu-test${example} ${pod} -c ${ctr}| grep -E "GPU_DEVICE_[0-9]+"
+      fi
     done
   done
   echo ""
@@ -270,43 +275,67 @@ This should produce output similar to the following:
 ```bash
 gpu-test1:
 pod0 ctr0:
-declare -x GPU_DEVICE_0="gpu-e7b42cb1-4fd8-91b2-bc77-352a0c1f5747"
+declare -x GPU_DEVICE_0="gpu-ee3e4b55-fcda-44b8-0605-64b7a9967744"
 pod1 ctr0:
-declare -x GPU_DEVICE_0="gpu-f11773a1-5bfb-e48b-3d98-1beb5baaf08e"
+declare -x GPU_DEVICE_0="gpu-9ede7e32-5825-a11b-fa3d-bab6d47e0243"
 
 gpu-test2:
 pod0 ctr0:
+declare -x GPU_DEVICE_0="gpu-e7b42cb1-4fd8-91b2-bc77-352a0c1f5747"
+declare -x GPU_DEVICE_1="gpu-f11773a1-5bfb-e48b-3d98-1beb5baaf08e"
+
+gpu-test3:
+pod0 ctr0:
 declare -x GPU_DEVICE_0="gpu-0159f35e-99ee-b2b5-74f1-9d18df3f22ac"
+declare -x GPU_DEVICE_0_SHARING_STRATEGY="TimeSlicing"
+declare -x GPU_DEVICE_0_TIMESLICE_INTERVAL="Default"
 pod0 ctr1:
 declare -x GPU_DEVICE_0="gpu-0159f35e-99ee-b2b5-74f1-9d18df3f22ac"
+declare -x GPU_DEVICE_0_SHARING_STRATEGY="TimeSlicing"
+declare -x GPU_DEVICE_0_TIMESLICE_INTERVAL="Default"
 
-gpu-test3:
+gpu-test4:
 pod0 ctr0:
 declare -x GPU_DEVICE_0="gpu-657bd2e7-f5c2-a7f2-fbaa-0d1cdc32f81b"
+declare -x GPU_DEVICE_0_SHARING_STRATEGY="TimeSlicing"
+declare -x GPU_DEVICE_0_TIMESLICE_INTERVAL="Default"
 pod1 ctr0:
 declare -x GPU_DEVICE_0="gpu-657bd2e7-f5c2-a7f2-fbaa-0d1cdc32f81b"
+declare -x GPU_DEVICE_0_SHARING_STRATEGY="TimeSlicing"
+declare -x GPU_DEVICE_0_TIMESLICE_INTERVAL="Default"
 
-gpu-test4:
-pod0 ctr0:
+gpu-test5:
+pod0 ts-ctr0:
 declare -x GPU_DEVICE_0="gpu-18db0e85-99e9-c746-8531-ffeb86328b39"
+declare -x GPU_DEVICE_0_SHARING_STRATEGY="TimeSlicing"
+declare -x GPU_DEVICE_0_TIMESLICE_INTERVAL="Long"
+pod0 ts-ctr1:
+declare -x GPU_DEVICE_0="gpu-18db0e85-99e9-c746-8531-ffeb86328b39"
+declare -x GPU_DEVICE_0_SHARING_STRATEGY="TimeSlicing"
+declare -x GPU_DEVICE_0_TIMESLICE_INTERVAL="Long"
+pod0 sp-ctr0:
+declare -x GPU_DEVICE_1="gpu-93d37703-997c-c46f-a531-755e3e0dc2ac"
+declare -x GPU_DEVICE_1_PARTITION_COUNT="10"
+declare -x GPU_DEVICE_1_SHARING_STRATEGY="SpacePartitioning"
+pod0 sp-ctr1:
 declare -x GPU_DEVICE_1="gpu-93d37703-997c-c46f-a531-755e3e0dc2ac"
-declare -x GPU_DEVICE_2="gpu-ee3e4b55-fcda-44b8-0605-64b7a9967744"
-declare -x GPU_DEVICE_3="gpu-9ede7e32-5825-a11b-fa3d-bab6d47e0243"
+declare -x GPU_DEVICE_1_PARTITION_COUNT="10"
+declare -x GPU_DEVICE_1_SHARING_STRATEGY="SpacePartitioning"
 ```
 
 In this example resource driver, no "actual" GPUs are made available to any
 containers. Instead, a set of environment variables are set in each container
 to indicate which GPUs *would* have been injected into them by a real resource
-driver.
+driver and how they *would* have been configured.
 
-You can use the UUIDs of the GPUs set in these environment variables to verify
-that they were handed out in a way consistent with the semantics shown in the
-figure above.
+You can use the UUIDs of the GPUs as well as the GPU sharing settings set in
+these environment variables to verify that they were handed out in a way
+consistent with the semantics shown in the figure above.
 
 Once you have verified everything is running correctly, delete all of the
 example apps:
 ```bash
-kubectl delete --wait=false --filename=demo/gpu-test{1,2,3,4}.yaml
+kubectl delete --wait=false --filename=demo/gpu-test{1,2,3,4,5}.yaml
 ```
 
 And wait for them to terminate:
@@ -320,6 +349,7 @@ gpu-test2 pod0 2/2 Terminating 0 31m
 gpu-test3 pod0 1/1 Terminating 0 31m
 gpu-test3 pod1 1/1 Terminating 0 31m
 gpu-test4 pod0 1/1 Terminating 0 31m
+gpu-test5 pod0 4/4 Terminating 0 31m
 ...
 ```
 
````
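The README's grep loop pulls the `GPU_DEVICE_*` variables out of the container logs. The same check can be done from inside a container; here is a minimal Go sketch of that idea. The variable names come from the sample output above, but the program itself is illustrative and not part of this PR:

```go
package main

import (
	"fmt"
	"os"
	"sort"
	"strings"
)

// Print the GPU_DEVICE_* environment variables that the example driver
// injects, mirroring what the grep loop in the README extracts from logs.
func main() {
	var gpuVars []string
	for _, kv := range os.Environ() {
		if strings.HasPrefix(kv, "GPU_DEVICE_") {
			gpuVars = append(gpuVars, kv)
		}
	}
	sort.Strings(gpuVars) // stable output: device vars, then their settings
	for _, kv := range gpuVars {
		fmt.Println(kv)
	}
}
```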
api/example.com/resource/gpu/v1alpha1/api.go (+76 -14)
```diff
@@ -17,32 +17,94 @@
 package v1alpha1
 
 import (
-    "k8s.io/utils/ptr"
+    "fmt"
+
+    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
+    "k8s.io/apimachinery/pkg/runtime"
+    "k8s.io/apimachinery/pkg/runtime/schema"
+    "k8s.io/apimachinery/pkg/runtime/serializer/json"
 )
 
 const (
     GroupName = "gpu.resource.example.com"
     Version   = "v1alpha1"
 
-    GpuDeviceType     = "gpu"
-    UnknownDeviceType = "unknown"
-
-    GpuClaimParametersKind = "GpuClaimParameters"
+    GpuConfigKind = "GpuConfig"
 )
 
-func DefaultDeviceClassParametersSpec() *DeviceClassParametersSpec {
-    return &DeviceClassParametersSpec{
-        DeviceSelector: []DeviceSelector{
-            {
-                Type: GpuDeviceType,
-                Name: "*",
+// Decoder implements a decoder for objects in this API group.
+var Decoder runtime.Decoder
+
+// +genclient
+// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
+
+// GpuConfig holds the set of parameters for configuring a GPU.
+type GpuConfig struct {
+    metav1.TypeMeta `json:",inline"`
+    Sharing         *GpuSharing `json:"sharing,omitempty"`
+}
+
+// DefaultGpuConfig provides the default GPU configuration.
+func DefaultGpuConfig() *GpuConfig {
+    return &GpuConfig{
+        TypeMeta: metav1.TypeMeta{
+            APIVersion: GroupName + "/" + Version,
+            Kind:       GpuConfigKind,
+        },
+        Sharing: &GpuSharing{
+            Strategy: TimeSlicingStrategy,
+            TimeSlicingConfig: &TimeSlicingConfig{
+                Interval: "Default",
             },
         },
     }
 }
 
-func DefaultGpuClaimParametersSpec() *GpuClaimParametersSpec {
-    return &GpuClaimParametersSpec{
-        Count: ptr.To(1),
+// Normalize updates a GpuConfig config with implied default values based on other settings.
+func (c *GpuConfig) Normalize() error {
+    if c == nil {
+        return fmt.Errorf("config is 'nil'")
+    }
+    if c.Sharing == nil {
+        c.Sharing = &GpuSharing{
+            Strategy: TimeSlicingStrategy,
+        }
     }
+    if c.Sharing.Strategy == TimeSlicingStrategy && c.Sharing.TimeSlicingConfig == nil {
+        c.Sharing.TimeSlicingConfig = &TimeSlicingConfig{
+            Interval: "Default",
+        }
+    }
+    if c.Sharing.Strategy == SpacePartitioningStrategy && c.Sharing.SpacePartitioningConfig == nil {
+        c.Sharing.SpacePartitioningConfig = &SpacePartitioningConfig{
+            PartitionCount: 1,
+        }
+    }
+    return nil
+}
+
+func init() {
+    // Create a new scheme and add our types to it. If at some point in the
+    // future a new version of the configuration API becomes necessary, then
+    // conversion functions can be generated and registered to continue
+    // supporting older versions.
+    scheme := runtime.NewScheme()
+    schemeGroupVersion := schema.GroupVersion{
+        Group:   GroupName,
+        Version: Version,
+    }
+    scheme.AddKnownTypes(schemeGroupVersion,
+        &GpuConfig{},
+    )
+    metav1.AddToGroupVersion(scheme, schemeGroupVersion)
+
+    // Set up a json serializer to decode our types.
+    Decoder = json.NewSerializerWithOptions(
+        json.DefaultMetaFactory,
+        scheme,
+        scheme,
+        json.SerializerOptions{
+            Pretty: true, Strict: true,
+        },
+    )
 }
```
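The new `Normalize()` method references sharing types (`GpuSharing`, `TimeSlicingConfig`, `SpacePartitioningConfig`) that are defined in other files of this PR and not shown here. As a reading aid, here is a hypothetical reconstruction of those types, inferred from the `Normalize()` logic and the env-var output in the README; the JSON tags and exact field types are guesses:

```go
package v1alpha1

// GpuSharingStrategy names a way of sharing a GPU between containers.
type GpuSharingStrategy string

const (
	// These values match the GPU_DEVICE_*_SHARING_STRATEGY env vars
	// shown in the README output above.
	TimeSlicingStrategy       GpuSharingStrategy = "TimeSlicing"
	SpacePartitioningStrategy GpuSharingStrategy = "SpacePartitioning"
)

// GpuSharing selects a strategy plus its strategy-specific settings.
type GpuSharing struct {
	Strategy                GpuSharingStrategy       `json:"strategy"`
	TimeSlicingConfig       *TimeSlicingConfig       `json:"timeSlicingConfig,omitempty"`
	SpacePartitioningConfig *SpacePartitioningConfig `json:"spacePartitioningConfig,omitempty"`
}

// TimeSlicingConfig surfaces as GPU_DEVICE_*_TIMESLICE_INTERVAL
// ("Default", "Long", ...).
type TimeSlicingConfig struct {
	Interval string `json:"interval,omitempty"`
}

// SpacePartitioningConfig surfaces as GPU_DEVICE_*_PARTITION_COUNT.
type SpacePartitioningConfig struct {
	PartitionCount int `json:"partitionCount,omitempty"`
}
```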

api/example.com/resource/gpu/v1alpha1/deviceclass.go (-56)

This file was deleted.

api/example.com/resource/gpu/v1alpha1/gpuclaim.go (-49)

This file was deleted.
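Putting the api.go pieces together, a consumer of this package might decode an opaque config and fill in its implied defaults roughly as follows. This is a sketch: the module import path is assumed, the raw JSON is purely illustrative, and it relies on the generated deepcopy code (per the `+k8s:deepcopy-gen` marker) making `GpuConfig` a `runtime.Object`:

```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/runtime"

	// Import path assumed for illustration.
	gpucrd "sigs.k8s.io/dra-example-driver/api/example.com/resource/gpu/v1alpha1"
)

func main() {
	// An opaque GpuConfig as it might arrive embedded in a claim's
	// parameters. The strict serializer rejects unknown fields.
	raw := []byte(`{
	  "apiVersion": "gpu.resource.example.com/v1alpha1",
	  "kind": "GpuConfig",
	  "sharing": {"strategy": "SpacePartitioning"}
	}`)

	obj, err := runtime.Decode(gpucrd.Decoder, raw)
	if err != nil {
		panic(err)
	}
	config, ok := obj.(*gpucrd.GpuConfig)
	if !ok {
		panic("decoded object is not a GpuConfig")
	}

	// Fill in implied defaults: with no explicit config given, a
	// SpacePartitioning strategy gets a PartitionCount of 1.
	if err := config.Normalize(); err != nil {
		panic(err)
	}
	fmt.Println(config.Sharing.Strategy,
		config.Sharing.SpacePartitioningConfig.PartitionCount)
}
```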
