Docker Rootless on diskless compute nodes Slirp4netns Issue #47803

Open
tbf3 opened this issue May 7, 2024 · 7 comments
Labels
area/rootless Rootless mode kind/bug Bugs are bugs. The cause may or may not be known at triage time so debugging may be needed. status/0-triage version/25.0

Comments

@tbf3

tbf3 commented May 7, 2024

Description

Hello,
I am attempting to run docker rootless on compute nodes that are running RHEL 8.8 in memory. The OS is not installed to a physical disk and is running in RAM on the physical server.

See the steps to reproduce below for details of the issue.

Reproduce

  1. Run dockerd-rootless-setuptool.sh
$ dockerd-rootless-setuptool.sh check 

[INFO] Requirements are satisfied
  1. $ dockerd-rootless-setuptool.sh install
[INFO] Creating /home/user/.config/systemd/user/docker.service
[INFO] starting systemd service docker.service
+ systemctl --user start docker.service
Job for docker.service failed because the control process exited with error code.
See "systemctl --user status docker.service" and "journalctl --user -xe" for details.
+ set +x
[ERROR] Failed to start docker.service. Run `journalctl -n 20 --no-pager --user --unit docker.service` to show the error log.
[ERROR] Before retrying installation, you might need to uninstall the current setup: `/usr/bin/dockerd-rootless-setuptool.sh uninstall -f ; /usr/bin/rootlesskit rm -rf /home/local-user/.local/share/docker`

If I switch (via su) to the same user I am already logged in as, the command completes successfully, with the following output:

  1. $ dockerd-rootless-setuptool.sh install
[INFO] systemd not detected, dockerd-rootless.sh needs to be started manually:

PATH=/usr/bin:/sbin:/usr/sbin:$PATH dockerd-rootless.sh

[INFO] Creating CLI context "rootless"
Successfully created context "rootless"
[INFO] Using CLI context "rootless"
Current context is now "rootless"

[INFO] Make sure the following environment variable(s) are set (or add them to ~/.bashrc):
# WARNING: systemd not found. You have to remove XDG_RUNTIME_DIR manually on every logout.
export XDG_RUNTIME_DIR=/home/local-user/.docker/run
export PATH=/usr/bin:$PATH

[INFO] Some applications may require the following environment variable too:
export DOCKER_HOST=unix:///home/local-user/.docker/run/docker.sock

  1. When I then launch dockerd-rootless.sh manually, after exporting the environment variables above, I get the following:

$ PATH=/usr/bin:/sbin:/usr/sbin:$PATH dockerd-rootless.sh
+ case "$1" in
+ '[' -w /home/local-user/.docker/run ']'
+ '[' -d /home/local-user ']'
+ rootlesskit=
+ for f in docker-rootlesskit rootlesskit
+ command -v docker-rootlesskit
+ for f in docker-rootlesskit rootlesskit
+ command -v rootlesskit
+ rootlesskit=rootlesskit
+ break
+ '[' -z rootlesskit ']'
+ : /home/local-user/.docker/run/dockerd-rootless
+ : ''
+ : ''
+ : builtin
+ : auto
+ : auto
+ net=
+ mtu=
+ '[' -z '' ']'
+ command -v slirp4netns
+ slirp4netns --help
+ grep -qw -- --netns-type
+ net=slirp4netns
+ '[' -z '' ']'
+ mtu=65520
+ '[' -z slirp4netns ']'
+ '[' -z 65520 ']'
+ dockerd=dockerd
+ '[' -z '' ']'
+ _DOCKERD_ROOTLESS_CHILD=1
+ export _DOCKERD_ROOTLESS_CHILD
++ id -u
+ '[' 1000 = 0 ']'
+ command -v selinuxenabled
+ selinuxenabled
+ exec rootlesskit --state-dir=/home/local-user/.docker/run/dockerd-rootless --net=slirp4netns --mtu=65520 --slirp4netns-sandbox=auto --slirp4netns-seccomp=auto --disable-host-loopback --port-driver=builtin --copy-up=/etc --copy-up=/run --propagation=rslave /usr/bin/dockerd-rootless.sh
[rootlesskit:parent] error: failed to setup network &{logWriter:0xc000252ae0 binary:slirp4netns mtu:65520 ipnet:<nil> disableHostLoopback:true apiSocketPath: enableSandbox:true enableSeccomp:true enableIPv6:false ifname:tap0 infoMu:{w:{state:0 sema:0} writerSem:0 readerSem:0 readerCount:{_:{} v:0} readerWait:{_:{} v:0}} info:<nil>}: waiting for ready fd (/usr/bin/slirp4netns --mtu 65520 -r 3 --disable-host-loopback --enable-sandbox --enable-seccomp 159795 tap0): slirp4netns failed
[rootlesskit:child ] error: EOF

Expected behavior

dockerd-rootless-setuptool.sh should create a systemd service for the user account and run dockerd-rootless.sh without errors.

docker version

Client: Docker Engine - Community
 Version:           25.0.3
 API version:       1.44
 Go version:        go1.21.6
 Git commit:        4debf41
 Built:             Tue Feb  6 21:15:16 2024
 OS/Arch:           linux/amd64
 Context:           rootless
Cannot connect to the Docker daemon at unix:///home/local-user/.docker/run/docker.sock. Is the docker daemon running?

docker info

Client: Docker Engine - Community
 Version:    25.0.3
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.12.1
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.24.5
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Additional Info

Packages installed for docker:

 docker-ce                    x86_64    3:25.0.3-1.el8                               
 container-selinux            noarch    2:2.205.0-2.module+el8.8.0+18438+15d3aa65    
 containerd.io                x86_64    1.6.28-3.1.el8                               
 docker-ce-cli                x86_64    1:25.0.3-1.el8                               
 fuse-common                  x86_64    3.3.0-16.el8                                 
 fuse-overlayfs               x86_64    1.10-1.module+el8.8.0+18060+3f21f2cc         
 fuse3                        x86_64    3.3.0-16.el8                                 
 fuse3-libs                   x86_64    3.3.0-16.el8                                 
 libcgroup                    x86_64    0.41-19.el8                                  
 libslirp                     x86_64    4.4.0-1.module+el8.8.0+18060+3f21f2cc        
 slirp4netns                  x86_64    1.2.0-2.module+el8.8.0+18060+3f21f2cc        
 docker-buildx-plugin         x86_64    0.12.1-1.el8                                 
 docker-ce-rootless-extras    x86_64    25.0.3-1.el8                                 
 docker-compose-plugin        x86_64    2.24.5-1.el8

Logs show the following when attempting to execute "dockerd-rootless.sh":

localhost dockerd-rootless.sh[159644]: + exec rootlesskit --state-dir=/run/user/1000/dockerd-rootless --net=slirp4netns --mtu=65520 --slirp4netns-sandbox=auto --slirp4netns-seccomp=auto --disable-host-loopback --port-driver=builtin --copy-up=/etc --copy-up=/run --propagation=rslave /usr/bin/dockerd-rootless.sh
May  8 03:14:15 localhost kernel: IPv6: ADDRCONF(NETDEV_UP): tap0: link is not ready
May  8 03:14:15 localhost kernel: IPv6: ADDRCONF(NETDEV_CHANGE): tap0: link becomes ready
May  8 03:14:16 localhost dockerd-rootless.sh[159644]: [rootlesskit:parent] error: failed to setup network &{logWriter:0xc000250ae0 binary:slirp4netns mtu:65520 ipnet:<nil> disableHostLoopback:true apiSocketPath: enableSandbox:true enableSeccomp:true enableIPv6:false ifname:tap0 infoMu:{w:{state:0 sema:0} writerSem:0 readerSem:0 readerCount:{_:{} v:0} readerWait:{_:{} v:0}} info:<nil>}: waiting for ready fd (/usr/bin/slirp4netns --mtu 65520 -r 3 --disable-host-loopback --enable-sandbox --enable-seccomp 159656 tap0): slirp4netns failed
May  8 03:14:16 localhost dockerd-rootless.sh[159656]: [rootlesskit:child ] error: EOF
May  8 03:14:16 localhost systemd[121379]: docker.service: Main process exited, code=exited, status=1/FAILURE
May  8 03:14:16 localhost systemd[121379]: docker.service: Killing process 159656 (exe) with signal SIGKILL.
May  8 03:14:16 localhost systemd[121379]: docker.service: Failed with result 'exit-code'.
May  8 03:14:16 localhost systemd[121379]: Failed to start Docker Application Container Engine (Rootless).
May  8 03:14:18 localhost systemd[121379]: docker.service: Service RestartSec=2s expired, scheduling restart.
May  8 03:14:18 localhost systemd[121379]: docker.service: Scheduled restart job, restart counter is at 3.
May  8 03:14:18 localhost systemd[121379]: Stopped Docker Application Container Engine (Rootless).
May  8 03:14:18 localhost systemd[121379]: docker.service: Start request repeated too quickly.
May  8 03:14:18 localhost systemd[121379]: docker.service: Failed with result 'exit-code'.
May  8 03:14:18 localhost systemd[121379]: Failed to start Docker Application Container Engine (Rootless).
May  8 03:14:43 localhost su[159679]: (to local-user) local-user on pts/2
May  8 03:16:38 localhost kernel: IPv6: ADDRCONF(NETDEV_UP): tap0: link is not ready
May  8 03:16:38 localhost kernel: IPv6: ADDRCONF(NETDEV_CHANGE): tap0: link becomes ready


While investigating, I noticed this line:

localhost dockerd-rootless.sh[159644]: [rootlesskit:parent] error: failed to setup network &{logWriter:0xc000250ae0 binary:slirp4netns mtu:65520 ipnet:<nil> disableHostLoopback:true apiSocketPath: enableSandbox:true enableSeccomp:true enableIPv6:false ifname:tap0 infoMu:{w:{state:0 sema:0} writerSem:0 readerSem:0 readerCount:{_:{} v:0} readerWait:{_:{} v:0}} info:<nil>}: waiting for ready fd (/usr/bin/slirp4netns --mtu 65520 -r 3 --disable-host-loopback --enable-sandbox --enable-seccomp 159656 tap0): slirp4netns failed

I suspect the PID (159656) being passed to slirp4netns might be the problem, but I am not sure. I am stumped.
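
One way to see what that PID refers to is to compare its namespaces with those of the calling shell. The following is a minimal sketch, assuming a Linux /proc layout; `ns_id` and `same_ns` are hypothetical helper names (not part of Docker or RootlessKit), and the optional PROC_ROOT argument exists only so the helpers can be exercised against a fake directory tree.

```shell
#!/bin/sh
# ns_id PID KIND [PROC_ROOT]
# Print the namespace identifier of a process, e.g. "net:[4026531840]".
# PROC_ROOT defaults to /proc (hypothetical parameter, for testing only).
ns_id() {
    readlink "${3:-/proc}/$1/ns/$2"
}

# same_ns PID_A PID_B KIND [PROC_ROOT]
# Return 0 if both processes share the namespace, 1 if they differ,
# and 2 if either namespace link cannot be read.
same_ns() {
    a=$(ns_id "$1" "$3" "$4") || return 2
    b=$(ns_id "$2" "$3" "$4") || return 2
    [ "$a" = "$b" ]
}
```

For example, `same_ns $$ 159656 net` would compare the current shell with the PID from the log while that process is still alive. As far as I understand, the RootlessKit child is expected to live in its own user and network namespaces, so a differing identifier would be normal rather than a sign of a wrong PID.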

When I follow these exact same steps on a VM with a normal disk-based install, there are no issues at all.

Any help or troubleshooting advice is appreciated. I would use a different network driver, but I have a fairly hard requirement on slirp4netns. I am also aware that other software is better suited to rootless container execution.

@tbf3 tbf3 added kind/bug Bugs are bugs. The cause may or may not be known at triage time so debugging may be needed. status/0-triage labels May 7, 2024
@AkihiroSuda
Member

Could you try running slirp4netns without Docker? https://github.com/rootless-containers/slirp4netns?tab=readme-ov-file#usage

It may show more detailed error messages

@AkihiroSuda
Member

AkihiroSuda commented May 9, 2024

# * DOCKERD_ROOTLESS_ROOTLESSKIT_SLIRP4NETNS_SANDBOX=(auto|true|false): whether to protect slirp4netns with a dedicated mount namespace. Defaults to "auto".
# * DOCKERD_ROOTLESS_ROOTLESSKIT_SLIRP4NETNS_SECCOMP=(auto|true|false): whether to protect slirp4netns with seccomp. Defaults to "auto".
# * DOCKERD_ROOTLESS_ROOTLESSKIT_DISABLE_HOST_LOOPBACK=(true|false): prohibit connections to 127.0.0.1 on the host (including via 10.0.2.2, in the case of slirp4netns). Defaults to "true".
# To apply an environment variable via systemd, create ~/.config/systemd/user/docker.service.d/override.conf as follows,
# and run `systemctl --user daemon-reload && systemctl --user restart docker`:
# --- BEGIN ---
# [Service]
# Environment="DOCKERD_ROOTLESS_ROOTLESSKIT_NET=pasta"
# Environment="DOCKERD_ROOTLESS_ROOTLESSKIT_PORT_DRIVER=implicit"
# --- END ---

DOCKERD_ROOTLESS_ROOTLESSKIT_SLIRP4NETNS_SANDBOX=false may work?
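
Following the quoted instructions, the suggested setting could be applied to the systemd user unit roughly as follows. This is only a sketch: `write_override` and its directory argument are hypothetical, while the variable name and the override path are taken from the comment text above.

```shell
#!/bin/sh
# write_override DIR
# Create DIR/override.conf so that dockerd-rootless.sh is started with the
# slirp4netns mount-namespace sandbox disabled.
write_override() {
    mkdir -p "$1" &&
    cat > "$1/override.conf" <<'EOF'
[Service]
Environment="DOCKERD_ROOTLESS_ROOTLESSKIT_SLIRP4NETNS_SANDBOX=false"
EOF
}

# Typical invocation (not run automatically here):
#   write_override "$HOME/.config/systemd/user/docker.service.d"
#   systemctl --user daemon-reload && systemctl --user restart docker
```

When dockerd-rootless.sh is launched by hand instead of through systemd, the same variable can be set on the command line, for example `DOCKERD_ROOTLESS_ROOTLESSKIT_SLIRP4NETNS_SANDBOX=false dockerd-rootless.sh`.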

@tbf3
Author

tbf3 commented May 9, 2024

> DOCKERD_ROOTLESS_ROOTLESSKIT_SLIRP4NETNS_SANDBOX=false may work?

No luck, same error.

@tbf3
Author

tbf3 commented May 9, 2024

> Could you try running slirp4netns without Docker? https://github.com/rootless-containers/slirp4netns?tab=readme-ov-file#usage
> It may show more detailed error messages

That gives a little more of a hint:

[local-use@compute_node~]$ echo $$ > /tmp/pid
[local-use@compute_node~]$ slirp4netns --configure --mtu=65520 --disable-host-loopback $(cat /tmp/pid) tap0
setns(CLONE_NEWNET): Operation not permitted
child failed(1)

I am not sure exactly what dockerd-rootless.sh does in the background when it obtains the PID, though, since this error is different.

@AkihiroSuda
Member

AkihiroSuda commented May 9, 2024

> [local-use@compute_node~]$ echo $$ > /tmp/pid
> [local-use@compute_node~]$ slirp4netns --configure --mtu=65520 --disable-host-loopback $(cat /tmp/pid) tap0
> setns(CLONE_NEWNET): Operation not permitted
> child failed(1)

`echo $$ > /tmp/pid` has to be executed inside `unshare --user --map-root-user --net --mount`.
The slirp4netns command has to be executed outside that unshare shell.
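
The two-shell procedure described here can be sketched as a pair of helpers. This is an illustration under assumptions: `hold_netns` and `slirp_cmd` are hypothetical names, the PID file location is arbitrary, the one-hour sleep is just a placeholder process, and the slirp4netns flags are the ones already used in this thread.

```shell
#!/bin/sh
# hold_netns PIDFILE
# Start a placeholder process inside new user, network and mount namespaces
# and record its PID in PIDFILE. `exec sleep` keeps the recorded PID alive.
hold_netns() {
    unshare --user --map-root-user --net --mount \
        sh -c 'echo $$ > "$1"; exec sleep 3600' sh "$1" &
}

# slirp_cmd PID
# Print the slirp4netns command line that should be run from OUTSIDE the
# unshare shell, targeting the namespaces of PID.
slirp_cmd() {
    printf 'slirp4netns --configure --mtu=65520 --disable-host-loopback %s tap0\n' "$1"
}

# Typical sequence (not run automatically here):
#   hold_netns /tmp/pid
#   sleep 1
#   $(slirp_cmd "$(cat /tmp/pid)")
```

Keeping the namespace holder and the slirp4netns call in separate processes mirrors the instruction: the PID is captured inside the new namespaces, while slirp4netns itself is started from the original shell.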

@tbf3
Author

tbf3 commented May 9, 2024

> echo $$ > /tmp/pid has to be executed inside unshare --user --map-root-user --net --mount. The slirp4netns command has to be executed outside the unshare shell above

Doing that, it executes successfully:

$ unshare --user --map-root-user --net --mount
[root@node01 ~]# echo $$ > /tmp/pid
[root@node01 ~]#  slirp4netns --configure --mtu=65520 --disable-host-loopback $(cat /tmp/pid) tap0
sent tapfd=5 for tap0
received tapfd=5
Starting slirp
* MTU:             65520
* Network:         10.0.2.0
* Netmask:         255.255.255.0
* Gateway:         10.0.2.2
* DNS:             10.0.2.3
* DHCP begin:      10.0.2.15
* DHCP end:        10.0.2.30
* Recommended IP:  10.0.2.100

So I am unsure what I can configure to allow dockerd-rootless.sh to execute successfully.

@tbf3
Author

tbf3 commented May 9, 2024

Also, when I run the exact slirp4netns command that dockerd-rootless.sh tries to execute from inside the unshare shell, passing in that shell's PID, I get this:

+ exec rootlesskit --state-dir=/home/local-user/.docker/run/dockerd-rootless --net=slirp4netns --mtu=65520 --slirp4netns-sandbox=auto --slirp4netns-seccomp=auto --disable-host-loopback --port-driver=builtin --copy-up=/etc --copy-up=/run --propagation=rslave /usr/bin/dockerd-rootless.sh
[rootlesskit:parent] error: failed to setup network &{logWriter:0xc00025cb00 binary:slirp4netns mtu:65520 ipnet:<nil> disableHostLoopback:true apiSocketPath: enableSandbox:true enableSeccomp:true enableIPv6:false ifname:tap0 infoMu:{w:{state:0 sema:0} writerSem:0 readerSem:0 readerCount:{_:{} v:0} readerWait:{_:{} v:0}} info:<nil>}: waiting for ready fd (/usr/bin/slirp4netns --mtu 65520 -r 3 --disable-host-loopback --enable-sandbox --enable-seccomp 189360 tap0): slirp4netns failed
[rootlesskit:child ] error: EOF
[local-user@compute-node~]$ unshare --user --map-root-user --net --mount
[root@node01 ~]# /usr/bin/slirp4netns --mtu 65520 -r 3 --disable-host-loopback --enable-sandbox --enable-seccomp $(echo $$) tap0
WARNING: Support for seccomp is experimental
sent tapfd=5 for tap0
received tapfd=5
Starting slirp
* MTU:             65520
* Network:         10.0.2.0
* Netmask:         255.255.255.0
* Gateway:         10.0.2.2
* DNS:             10.0.2.3
* DHCP begin:      10.0.2.15
* DHCP end:        10.0.2.30
* Recommended IP:  10.0.2.100
cannot pivot_root to /tmp
create_sandbox failed
do_slirp is exiting
do_slirp failed
parent failed
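
The `cannot pivot_root to /tmp` line suggests that the sandbox step is what fails on this node. One possible factor, not confirmed anywhere in this thread, is the type of the root filesystem: the pivot_root(2) manual page notes that the initial ramfs (rootfs) cannot be pivot_root()ed, and a diskless node running from RAM may have exactly that as its root. A small sketch for checking this follows; `root_fstype` is a hypothetical helper, and it assumes a mounts table in /proc/mounts format.

```shell
#!/bin/sh
# root_fstype [MOUNTS_FILE]
# Print the filesystem type of the mount currently visible at "/".
# Later entries over-mount earlier ones, so the last match wins.
# MOUNTS_FILE defaults to /proc/mounts.
root_fstype() {
    awk '$2 == "/" { t = $3 } END { if (t != "") print t }' "${1:-/proc/mounts}"
}

# Typical invocation (not run automatically here):
#   root_fstype
```

If this prints `rootfs`, it may be worth comparing runs with and without `--enable-sandbox`, since the earlier run in this thread without that flag started slirp successfully.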
