
I'm using unshare to create per-process mounts, which works perfectly fine with

unshare -m --map-root-user

However, after having created my bind-mounts by

mount --bind src dst

I want to change the UID to my original user, so that whoami (and others) echoes my username like echo $USER does.

I have already tried the answer to "Simulate chroot with unshare".

However, doing su - user1 after chroot /, I get

su: Authentication failure
(Ignored)
setgid: Invalid argument

I have tested this on Ubuntu 18.04 Beta, Debian stretch, openSUSE-Leap-42.3. It's all the same. I guess something has changed in the kernel since this answer was working.

What is a working and correct way to do that (of course without being real root)?

spawn

2 Answers


The unshare(1) command can't do it:

-r, --map-root-user
[...] As a mere convenience feature, it does not support more sophisticated use cases, such as mapping multiple ranges of UIDs and GIDs.

Supplementary groups if any (video, ...) will be lost anyway (or mapped to nogroup).

By changing again into a 2nd new user namespace, it's possible to revert back the mapping. This requires a custom program, since unshare(1) won't do it. Here's a very minimalistic C program as proof of concept (one user only: uid/gid 1000/1000, zero failure check). Let's call it revertuid.c:

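#define _GNU_SOURCE
#include <sched.h>

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

#include <unistd.h>

int main(int argc, char *argv[]) {
    int fd;

    /* enter a second, nested user namespace */
    unshare(CLONE_NEWUSER);
    /* deny setgroups(2) in the new namespace, as unshare -r does */
    fd=open("/proc/self/setgroups",O_WRONLY);
    write(fd,"deny",4);
    close(fd);
    /* map uid 0 of the parent namespace back to uid 1000 */
    fd=open("/proc/self/uid_map",O_WRONLY);
    write(fd,"1000 0 1",8);
    close(fd);
    /* same for the gid */
    fd=open("/proc/self/gid_map",O_WRONLY);
    write(fd,"1000 0 1",8);
    close(fd);
    /* run the requested command under the reverted IDs */
    execvp(argv[1],argv+1);
    return 1; /* only reached if execvp() failed */
}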

It's just doing the reverse mapping of the mapping done by unshare -r -m, which was unavoidable, to be able to be root and use mount, as seen with:

$ strace unshare -r -m /bin/sleep 1 2>&1 |sed -n '/^unshare/,/^execve/p'
unshare(CLONE_NEWNS|CLONE_NEWUSER)      = 0
open("/proc/self/setgroups", O_WRONLY)  = 3
write(3, "deny", 4)                     = 4
close(3)                                = 0
open("/proc/self/uid_map", O_WRONLY)    = 3
write(3, "0 1000 1", 8)                 = 8
close(3)                                = 0
open("/proc/self/gid_map", O_WRONLY)    = 3
write(3, "0 1000 1", 8)                 = 8
close(3)                                = 0
execve("/bin/sleep", ["/bin/sleep", "1"], [/* 18 vars */]) = 0
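The two maps differ only in the order of their first two fields: each line reads as the ID inside the new namespace, the ID in the parent namespace, and the length of the range. As a rough illustration (reverse_map_line is a hypothetical helper, not part of util-linux), swapping those fields in the line written by unshare -r yields the line that revertuid writes:

```shell
# reverse_map_line is a hypothetical helper for illustration only.
# Input:  one uid_map/gid_map line, "INSIDE OUTSIDE COUNT".
# Output: the same range with INSIDE and OUTSIDE swapped.
reverse_map_line() {
    # Deliberately unquoted: split the line into its three fields.
    set -- $1
    printf '%s %s %s\n' "$2" "$1" "$3"
}

reverse_map_line "0 1000 1"    # prints: 1000 0 1
```

This only manipulates the text of a map line; it does not write to /proc or create any namespace.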

Compiling and running revertuid gives:

user@stretch-amd64:~$ gcc -o revertuid revertuid.c
user@stretch-amd64:~$ mkdir -p /tmp/src /tmp/dst
user@stretch-amd64:~$ touch /tmp/src/file
user@stretch-amd64:~$ ls /tmp/dst
user@stretch-amd64:~$ id
uid=1000(user) gid=1000(user) groups=1000(user)
user@stretch-amd64:~$ unshare -r -m
root@stretch-amd64:~# mount --bind /tmp/src /tmp/dst
root@stretch-amd64:~# ls /tmp/dst
file
root@stretch-amd64:~# exec ./revertuid bash
user@stretch-amd64:~$ ls /tmp/dst
file
user@stretch-amd64:~$ id
uid=1000(user) gid=1000(user) groups=1000(user)

Or shorter:

user@stretch-amd64:~$ unshare -r -m sh -c 'mount --bind /tmp/src /tmp/dst; exec ./revertuid bash'
user@stretch-amd64:~$ ls /tmp/dst
file

The behaviour probably changed after kernel 3.19 as seen in user_namespaces(7):

The /proc/[pid]/setgroups file was added in Linux 3.19, but was backported to many earlier stable kernel series, because it addresses a security issue. The issue concerned files with permissions such as "rwx---rwx".

A.B
  • does this work for more complicated cases? I ask because I created a namespace with unshare and used newuidmap to map the user to root with all the other common ids mapped to the ids in /etc/subuid. I then used revertuid in this namespace and it only seems to work for the user id ("0 1000 1") and not for any of the subuids ("1000 0 1\n0 1 999\n"). – Compholio Apr 21 '22 at 00:21
  • @Compholio newuidmap is a setuid root helper tool. This answer was written without accounting for the use of newuidmap or any privileged tool. Without external help from a privileged process, a non privileged (in the host namespace) user can only map between itself (in the current namespace) and one other user (in the new namespace). Only 1000 or 0 are of interest. Today the unshare command can do what's in this answer with --map-user= and --map-group=. – A.B Apr 21 '22 at 04:04
  • @Compholio Also user mapping is hierarchical: it's inherited from parent, and can only shrink in size. There's no way to "map back" anything that wasn't mapped in the parent. The initial user namespace has 2^32 uids. If the new created (with the help of newuidmap) has uid 1000 + typically uids 65536-131071 (or 100000-165535) mapped, there's no way to ever map back for example hosts' uid 1001, with or without help of any privileged process (which could grant more than newuidmap but can't anyway). – A.B Apr 21 '22 at 05:40
  • I'm not trying to map back to the host ids, I understand that is impossible. What I would like to be able to do is "drop root" but still retain all the other ids. So, step 1: make new namespace with user=root, ids 1-1000 from subuid pool; step 2: make root=user, but keep ids 1-1000 rather than map them all to nobody. (ideally I would also like to map, say, 1001 to root) – Compholio Apr 21 '22 at 14:45
  • I feel like this needs a picture, hopefully this helps: https://www.dropbox.com/s/l2kjg00zi8td7q5/example.png?dl=0 – Compholio Apr 21 '22 at 14:54
  • Interestingly, I can get this to work like I expect from the parent process by echoing the desired mapping to /proc/$ID/uid_map instead of using /proc/self/uid_map in the child process or doing the same with the custom "revertuid". – Compholio Apr 27 '22 at 22:52
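To illustrate the point made in the comments that a mapping can only shrink: an ID can be mapped in a nested namespace only if some line of the parent's map already covers it. The sketch below is a simplified model working on map text only; uid_is_mapped is a hypothetical helper, and the example map (uid 1000 plus a subuid block starting at 100000) is an assumed, typical layout rather than output from a real system.

```shell
# uid_is_mapped is a hypothetical helper for illustration only.
# It reads map lines ("INSIDE OUTSIDE COUNT") on stdin and succeeds if the
# given ID lies in the OUTSIDE range of at least one line.
uid_is_mapped() {
    awk -v id="$1" '
        NF == 3 && id + 0 >= $2 + 0 && id + 0 < $2 + $3 { found = 1 }
        END { exit(found ? 0 : 1) }
    '
}

# Assumed example map: the user's own uid plus one subuid block.
example_map='0 1000 1
1 100000 65536'

for id in 1000 1001 100005; do
    if printf '%s\n' "$example_map" | uid_is_mapped "$id"; then
        echo "$id: mapped"
    else
        echo "$id: not mapped"
    fi
done
```

With this map, host uid 1001 is reported as not mapped, so no nested namespace could ever map it back.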

Note: since bwrap is currently available in more places than util-linux > 2.39.1, this answer using bwrap is probably worth considering on systems older than Ubuntu 23.10. Unlike bwrap, however, this unshare-based solution properly propagates ^C for interactive CLI invocations.

You can't change the UID, but with modern util-linux you can start off as non-root and then use nsenter to get root and make the mount.

It is necessary to start the namespace with unshare, wait until it is ready, perform the mounts as fake root, and, once the mounts are ready, let the namespace run its tasks.

This takes two synchronization steps: one to wait until the namespace is ready before performing the mounts, and one to wait until the mounts are ready before running the process.
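The two synchronization points can be tried out without any namespaces. The following is only a sketch under simplifying assumptions: handshake_demo is a hypothetical name, named FIFOs stand in for the coproc pipes, and an echo stands in for the nsenter/mount step.

```shell
# handshake_demo is a hypothetical stand-in: FIFOs replace the coproc pipes
# and an echo replaces the nsenter/mount step.
handshake_demo() (
    token=$1
    dir=$(mktemp -d) || exit 1
    mkfifo "$dir/ready" "$dir/mounted" || exit 1

    # Helper: wait until the worker says it is ready, do the setup,
    # then report back.
    (
        read -r who < "$dir/ready" || exit 1
        # A real helper would run nsenter ... mount ... here.
        echo "setup for $who" > "$dir/mounted"
    ) &

    # Worker: sync point 1 (announce readiness), then sync point 2
    # (block until the helper reports that setup is finished).
    echo "$token" > "$dir/ready"
    read -r reply < "$dir/mounted"
    status=$?
    wait
    rm -rf "$dir"
    [ "$status" -eq 0 ] || exit "$status"
    echo "got: $reply"
)

handshake_demo 4242    # prints: got: setup for 4242
```

The worker cannot get past its read until the helper has finished, which is the ordering the real script needs between the mounts and the command.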

Use as:

in-unshare [bind-mount] ... [--] command [arg] ...

e.g.

in-unshare /build-dir=$PWD/build -- make build

Note: this avoids an implicit exec by running "$@" ; exit $?, which is necessary for proper signal propagation on ^C.

Who would have thought an extra waiting process layer would be so useful?

in-unshare() (
  local session session_pid mounts
  while test $# != 0
  do case "$1" in
       *=*) mounts+=("$1") ; shift ;;
       --) shift ; break ;;
       *) break ;;
     esac
  done

  coproc mounter {
    # session_pid will be the same as PPID, but we need to wait until the namespaces are set up
    read -r session_pid || return $?
    exec 0<&-

    # mount as "root" via nsenter
    for mount in "${mounts[@]}"
    do # quit on error without writing to stdout
       nsenter -U -m --target $session_pid mount "${mount#*=}" "${mount%=*}" -o rbind || return $?
    done
    # signal that the mounts are set up
    echo $?
  }

  exec {out}>>/dev/fd/${mounter[1]} {in}</dev/fd/${mounter[0]}

  # note: avoiding an implicit exec with "$@" ; exit $? is necessary for proper signal shutdown on ^C
  # who would have thought an extra waiting process layer would be so useful
  exec unshare --mount --user --map-user=$(id -u) --map-group=$(id -g) \
       --map-users=auto --map-groups=auto --keep-caps --setgroups allow \
       /bin/bash --noprofile --norc -c \
       "echo \$\$ >&${out} && exec ${out}>&- || exit \$? ; read -u ${in} && exec ${in}<&- && \"\$@\" ; exit \$?" \
       unshare "$@"
  exit $?
)
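Each bind-mount argument is a target=source pair, split by the two parameter expansions passed to mount in the script. A small sketch of just that splitting step follows; split_bind_spec is a hypothetical helper and /home/user/build is a made-up path, and the function only prints the mount command instead of running it.

```shell
# split_bind_spec is a hypothetical helper that mirrors the two parameter
# expansions used by in-unshare; it prints the command rather than running it.
split_bind_spec() {
    spec=$1
    source_dir=${spec#*=}    # text after the first "="
    target_dir=${spec%=*}    # text before the last "="
    printf 'mount %s %s -o rbind\n' "$source_dir" "$target_dir"
}

split_bind_spec "/build-dir=/home/user/build"
# prints: mount /home/user/build /build-dir -o rbind
```

So in the usage example, $PWD/build is the directory that appears at /build-dir inside the namespace.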