I am tinkering with Linux namespaces to better understand how Docker (or rather, runc) interacts with them.
By default, Docker does not create a user namespace for a container, meaning that UID 0 inside the container is also UID 0 on the host.
One way to mitigate this potential security issue is to change the UID under which the processes inside a container run, using either the USER instruction in a Dockerfile or the --user flag of the Docker CLI.
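The default mapping described above can be inspected through /proc/<pid>/uid_map, which lists how UIDs inside a process's user namespace map to UIDs outside it (three fields per line: inside ID, outside ID, range length). The following is a minimal sketch, assuming a Linux /proc filesystem; the helper names parse_uid_map and is_identity_map are hypothetical and only serve the illustration.

```python
from pathlib import Path


def parse_uid_map(text):
    """Parse the contents of /proc/<pid>/uid_map into (inside, outside, length) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            entries.append(tuple(int(f) for f in fields))
    return entries


def is_identity_map(entries):
    """True when the map is the single full-range entry reported by the initial user namespace."""
    return entries == [(0, 0, 4294967295)]


if __name__ == "__main__":
    # Assumption: Linux. "self" can be replaced by the PID of a container process.
    uid_map = Path("/proc/self/uid_map")
    if uid_map.exists():
        entries = parse_uid_map(uid_map.read_text())
        print(entries, "identity mapping:", is_identity_map(entries))
```

If a container process reports the same single full-range entry as a host process, that suggests no separate UID mapping is in effect for it, which would be consistent with UID 0 in the container being UID 0 on the host.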
From what I understood, the UID assigned to the container (e.g. via USER inside a Dockerfile) must match an existing UID on the host (i.e. the UID must exist in /etc/passwd and the GID in /etc/group).
So I ran an experiment and created a Dockerfile with the following content:
FROM ubuntu
RUN groupadd -r -g 1001 appuser
RUN useradd -r -u 1001 -g appuser appuser
USER appuser
ENTRYPOINT ["sleep", "infinity"]
On the host machine, there is no user with a UID of 1001 nor a group with a GID of 1001. Yet to my surprise, when I ran the container and then ran ps aux | grep sleep on the host, the sleep infinity process from the Docker container showed up, and the associated username was shown not as a name but as the corresponding UID, 1001.
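The numeric display can be reproduced outside of ps. The sketch below is only an illustration of what I assume a tool like ps does: look the UID up in the passwd database and fall back to printing the number when no entry exists. The function name display_name is hypothetical; pwd.getpwuid is the standard-library lookup and raises KeyError when the UID has no entry.

```python
import pwd


def display_name(uid):
    """Return the user name for uid, or the number as a string if no passwd entry exists."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        # No passwd entry: the UID is still a valid number, it just has no name.
        return str(uid)


if __name__ == "__main__":
    # Assumption: UID 1001 has no entry on this host, as in the experiment above.
    for uid in (0, 1001):
        print(uid, "->", display_name(uid))
```

On a host without a passwd entry for 1001, the second lookup falls back to the string "1001", which matches what I saw in the ps output.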
So my questions are: how is it possible to ask for a container's processes to run under a UID that the host has no knowledge of? And how does the kernel check permissions if it has no knowledge of that UID (e.g. does it create a temporary unprivileged user for that purpose)?