Commit Graph

7 Commits

Author SHA1 Message Date
Aleksa Sarai
8e8b136c49 tree-wide: use /proc/thread-self for thread-local state
With the idmap work, we will have a tainted Go thread in our
thread-group that has a different mount namespace to the other threads.
It seems that (due to some bad luck) the Go scheduler tends to make this
thread the thread-group leader in our tests, which results in very
baffling failures where /proc/self/mountinfo produces gibberish results.

In order to avoid this, switch to using /proc/thread-self for everything
that is thread-local. This primarily includes switching all file
descriptor paths (CLONE_FS), all of the places that check the current
cgroup (technically we never will run a single runc thread in a separate
cgroup, but better to be safe than sorry), and the aforementioned
mountinfo code. We don't need to do anything for the following because
the results we need aren't thread-local:

 * Checks that certain namespaces are supported by stat(2)ing
   /proc/self/ns/...

 * /proc/self/exe and /proc/self/cmdline are not thread-local.

 * While threads can be in different cgroups, we do not do this for the
   runc binary (or libcontainer) and thus we do not need to switch to
   the thread-local version of /proc/self/cgroup.

 * All of the CLONE_NEWUSER files are not thread-local because you
   cannot set the user namespace of a single thread (setns(CLONE_NEWUSER)
   is blocked for multi-threaded programs).

Note that we have to use runtime.LockOSThread when we have an open
handle to a tid-specific procfs file that we are operating on multiple
times. Go can reschedule us such that we are running on a different
thread and then kill the original thread (causing -ENOENT or similarly
confusing errors). This is not strictly necessary for most usages of
/proc/thread-self (such as using /proc/thread-self/fd/$n directly) since
only operating on the actual inodes associated with the tid requires
this locking, but because of the pre-3.17 fallback for CentOS, we have
to do this in most cases.

In addition, CentOS's kernel is too old for /proc/thread-self, which
requires us to emulate it -- however in rootfs_linux.go, we are in the
container pid namespace but /proc is the host's procfs. This leads to
the incredibly frustrating situation where there is no way (on pre-4.1
Linux) to figure out which /proc/self/task/... entry refers to the
current tid. We can just use /proc/self in this case.

Yes, this is all pretty ugly. I also wish it wasn't necessary.

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2023-12-14 11:36:41 +11:00
Kir Kolyshkin
5516294172 Remove io/ioutil use
See https://golang.org/doc/go1.16#ioutil
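
As a rough illustration of what such a migration looks like (the function
name roundTrip and the file names are made up for this sketch, not taken
from the commit), the Go 1.16 replacements for the common io/ioutil calls
are:

```go
package main

import (
	"io"
	"os"
	"path/filepath"
	"strings"
)

// roundTrip is a hypothetical demo function: it writes data to a temporary
// file and reads it back using the Go 1.16+ replacements for io/ioutil.
func roundTrip(data string) (string, error) {
	dir, err := os.MkdirTemp("", "ioutil-demo") // was ioutil.TempDir
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	name := filepath.Join(dir, "data.txt")
	// was ioutil.WriteFile
	if err := os.WriteFile(name, []byte(data), 0o600); err != nil {
		return "", err
	}

	out, err := os.ReadFile(name) // was ioutil.ReadFile
	if err != nil {
		return "", err
	}

	// was ioutil.ReadAll
	again, err := io.ReadAll(strings.NewReader(string(out)))
	if err != nil {
		return "", err
	}
	return string(again), nil
}
```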

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-10-14 13:46:02 -07:00
Kir Kolyshkin
7be93a66b9 *: fmt.Errorf: use %w when appropriate
This should result in no change when the error is printed, but make the
errors returned unwrappable, meaning errors.As and errors.Is will work.
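
A minimal sketch of the effect, using hypothetical helper names
(readConfig, isNotExist, failedPath) that are not part of this commit:

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// readConfig is a hypothetical function that wraps the underlying error
// with %w, so the original error stays reachable through the chain.
func readConfig(path string) ([]byte, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, fmt.Errorf("read config %q: %w", path, err)
	}
	return data, nil
}

// isNotExist reports whether any error in the chain is fs.ErrNotExist.
func isNotExist(err error) bool {
	return errors.Is(err, fs.ErrNotExist)
}

// failedPath extracts the path from a wrapped *fs.PathError, if present.
func failedPath(err error) (string, bool) {
	var pe *fs.PathError
	if errors.As(err, &pe) {
		return pe.Path, true
	}
	return "", false
}
```

With %v instead of %w the message would look the same, but both helpers
would report false because the chain would be cut at the fmt.Errorf call.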

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-06-22 16:09:47 -07:00
Sebastiaan van Stijn
c064304692 libcontainer/apparmor: split api (exported) from implementation
This prevents having to maintain GoDoc for the stub implementations,
and makes sure that the "stub" implementations have the same signature
as the "non-stub" versions.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-06-02 17:35:55 +02:00
Sebastiaan van Stijn
a608b7e725 libcontainer/apparmor: use sync.Once for AppArmor detection
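
The general pattern could look like the sketch below. The names
(isEnabled, probeCount) and the sysfs paths checked are assumptions made
for illustration. They are not quoted from this commit.

```go
package main

import (
	"os"
	"sync"
)

var (
	appArmorOnce    sync.Once
	appArmorEnabled bool
	probeCount      int // counts how often the detection actually ran
)

// isEnabled is a hypothetical detection function. The filesystem probe
// runs at most once. Later calls return the cached result.
func isEnabled() bool {
	appArmorOnce.Do(func() {
		probeCount++
		// Assumed detection logic: the securityfs directory exists and
		// the module parameter reads "Y".
		if _, err := os.Stat("/sys/kernel/security/apparmor"); err != nil {
			return
		}
		buf, err := os.ReadFile("/sys/module/apparmor/parameters/enabled")
		appArmorEnabled = err == nil && len(buf) > 0 && buf[0] == 'Y'
	})
	return appArmorEnabled
}
```

sync.Once also makes the cached value safe to read from multiple
goroutines without an explicit mutex.
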
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-03-16 15:12:55 +01:00
Akihiro Suda
f3f563bc0f apparmor: try attr/apparmor/exec before attr/exec
Fix issue 2801

Tested on Arch Linux with the following configuration
```
[root@archlinux ~]# pacman -Q runc containerd docker linux
runc 1.0.0rc93-1
containerd 1.4.3-1
docker 1:20.10.3-1
linux 5.10.14.arch1-1

[root@archlinux ~]# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-linux root=UUID=bd8aaf58-8735-4fd5-a0c1-804a998f8d57 rw net.ifnames=0 rootflags=compress-force=zstd apparmor=1 lsm=capability,lockdown,yama,bpf,apparmor

[root@archlinux ~]# cat /etc/docker/daemon.json
{
  "runtimes": {
    "runc-tmp": {
      "path": "/tmp/runc"
    }
  }
}

[root@archlinux ~]# docker run -it --rm alpine
docker: Error response from daemon: OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: apply apparmor profile: apparmor failed to apply profile: write /proc/self/attr/exec: invalid argument: unknown.

[root@archlinux ~]# docker run -it --rm --runtime=runc-tmp alpine
/ # cat /proc/self/attr/apparmor/current
docker-default (enforce)
/ # cat /proc/self/attr/current
cat: read error: Invalid argument
```

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-02-10 16:27:13 +09:00
Akihiro Suda
552a1c7bb1 remove "apparmor" build tag (Always compile AppArmor support)
The apparmor tag was introduced in a01ed80 (2014) to make cgo dependency
on libapparmor optional.

However, the cgo dependency was removed in db093f6 (2017), so it is no
longer meaningful to keep apparmor build tag.

Close #2704

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-12-16 17:39:48 +09:00