Commit e72400d

chore(README.md): add hacking and troubleshooting sections
Signed-off-by: Cian Johnston <[email protected]>
1 parent b39fa4a commit e72400d

2 files changed: 36 additions, 2 deletions

README.md

Lines changed: 34 additions & 0 deletions
@@ -86,3 +86,37 @@ env {
 > }
 > }
 > ```
+
+## GPUs
+
+When passing through GPUs to the inner container, you may end up using associated tooling such as the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/index.html) or the [NVIDIA GPU Operator](https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/index.html). These will inject required utilities and libraries inside the inner container. You can verify this by directly running (without Envbox) a barebones image like `debian:bookworm` and running `mount` or `nvidia-smi` inside the container.
+
+Envbox will detect these mounts and pass them inside the inner container it creates, so that GPU-aware tools run inside the inner container can still utilize these libraries.
+
+## Hacking
+
+Here's a simple one-liner to run the `codercom/enterprise-minimal:ubuntu` image in Envbox using Docker:
+
+```
+docker run -it --rm \
+    -v /tmp/envbox/docker:/var/lib/coder/docker \
+    -v /tmp/envbox/containers:/var/lib/coder/containers \
+    -v /tmp/envbox/sysbox:/var/lib/sysbox \
+    -v /tmp/envbox/docker:/var/lib/docker \
+    -v /usr/src:/usr/src:ro \
+    -v /lib/modules:/lib/modules:ro \
+    --privileged \
+    -e CODER_INNER_IMAGE=codercom/enterprise-minimal:ubuntu \
+    -e CODER_INNER_USERNAME=coder \
+    envbox:latest /envbox docker
+```
+
+This will store persistent data under `/tmp/envbox`.
+
+## Troubleshooting
+
+### `failed to write <number> to cgroup.procs: write /sys/fs/cgroup/docker/<id>/init.scope/cgroup.procs: operation not supported: unknown`
+
+This issue occurs in Docker if you have `cgroupns-mode` set to `private`. To validate, add `--cgroupns=host` to your `docker run` invocation and re-run.
+
+To permanently set this as the default in your Docker daemon, add `"default-cgroupns-mode": "host"` to your `/etc/docker/daemon.json` and restart Docker.
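
The permanent fix described in the Troubleshooting text can be checked before editing the daemon configuration. The following is a minimal shell sketch and is not part of this commit: `has_host_cgroupns` is a hypothetical helper name, and the `grep` pattern assumes the key and its value sit on a single line of `daemon.json`, so an unusually formatted file could be missed.

```shell
#!/bin/sh
# Hypothetical helper (not from the commit): succeeds when the given
# daemon.json sets "default-cgroupns-mode" to "host".
has_host_cgroupns() {
    # $1: path to daemon.json; defaults to the location named in the README.
    file="${1:-/etc/docker/daemon.json}"
    # A missing file means the daemon is running with built-in defaults.
    [ -f "$file" ] || return 1
    # Assumes key and value are on one line, separated only by whitespace.
    grep -Eq '"default-cgroupns-mode"[[:space:]]*:[[:space:]]*"host"' "$file"
}

if has_host_cgroupns; then
    echo "default-cgroupns-mode is already set to host"
else
    echo "add \"default-cgroupns-mode\": \"host\" to /etc/docker/daemon.json, then restart Docker"
fi
```

If the key is missing, add it as the README text describes and restart Docker. On hosts that manage Docker with systemd this is typically `sudo systemctl restart docker`, which is an assumption about the host's init system rather than something the commit states.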

xunix/gpu_test.go

Lines changed: 2 additions & 2 deletions
@@ -56,13 +56,13 @@ func TestGPUs(t *testing.T) {
 		expectedUsrLibFiles = []string{
 			filepath.Join(usrLibMountpoint, "nvidia", "libglxserver_nvidia.so"),
 			filepath.Join(usrLibMountpoint, "libnvidia-ml.so"),
+			filepath.Join(usrLibMountpoint, "nvidia", "libglxserver_nvidia.so.1"),
 		}

 		// fakeUsrLibFiles are files that should be written to the "mounted"
 		// /usr/lib directory. It includes files that shouldn't be returned.
 		fakeUsrLibFiles = append([]string{
 			filepath.Join(usrLibMountpoint, "libcurl-gnutls.so"),
-			filepath.Join(usrLibMountpoint, "nvidia", "libglxserver_nvidia.so.1"),
 		}, expectedUsrLibFiles...)
 	)

@@ -98,7 +98,7 @@ func TestGPUs(t *testing.T) {
 		devices, binds, err := xunix.GPUs(ctx, log, usrLibMountpoint)
 		require.NoError(t, err)
 		require.Len(t, devices, 2, "unexpected 2 nvidia devices")
-		require.Len(t, binds, 3, "expected 4 nvidia binds")
+		require.Len(t, binds, 4, "expected 4 nvidia binds")
 		require.Contains(t, binds, mount.MountPoint{
 			Device: "/dev/sda1",
 			Path: "/usr/local/nvidia",
