zishuo/runc

mirror of https://github.com/opencontainers/runc.git synced 2025-11-03 09:51:06 +08:00

Author	SHA1	Message	Date
Kir Kolyshkin	771903608c	libct/cg: write unified resources line by line It has been pointed out that some controllers can not accept multiple lines of output at once. In particular, io.max can only set one device at a time. Practically, the only multi-line resource values we can get come from unified.* -- let's write those line by line. Add a test case. Reported-by: Tao Shen <shentaoskyking@gmail.com> Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2024-06-09 14:01:45 -07:00
Aleksa Sarai	89c93ddf28	cgroup: plug leaks of /sys/fs/cgroup handle We auto-close this file descriptor in the final exec step, but it's probably a good idea to not possibly leak the file descriptor to "runc init" (we've had issues like this in the past) especially since it is a directory handle from the host mount namespace. In practice, on runc 1.1 this does leak to "runc init" but on main the handle has a low enough file descriptor that it gets clobbered by the ForkExec of "runc init". OPEN_TREE_CLONE would let us protect this handle even further, but the performance impact of creating an anonymous mount namespace is probably not worth it. Also, switch to using an *os.File for the handle so if it goes out of scope during setup (i.e. an error occurs during setup) it will get cleaned up by the GC. Fixes: GHSA-xr7r-f8xq-vfvv CVE-2024-21626 Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>	2024-01-24 00:20:58 +11:00
Aleksa Sarai	8e8b136c49	tree-wide: use /proc/thread-self for thread-local state With the idmap work, we will have a tainted Go thread in our thread-group that has a different mount namespace to the other threads. It seems that (due to some bad luck) the Go scheduler tends to make this thread the thread-group leader in our tests, which results in very baffling failures where /proc/self/mountinfo produces gibberish results. In order to avoid this, switch to using /proc/thread-self for everything that is thread-local. This primarily includes switching all file descriptor paths (CLONE_FS), all of the places that check the current cgroup (technically we never will run a single runc thread in a separate cgroup, but better to be safe than sorry), and the aforementioned mountinfo code. We don't need to do anything for the following because the results we need aren't thread-local: * Checks that certain namespaces are supported by stat(2)ing /proc/self/ns/... * /proc/self/exe and /proc/self/cmdline are not thread-local. * While threads can be in different cgroups, we do not do this for the runc binary (or libcontainer) and thus we do not need to switch to the thread-local version of /proc/self/cgroups. * All of the CLONE_NEWUSER files are not thread-local because you cannot set the usernamespace of a single thread (setns(CLONE_NEWUSER) is blocked for multi-threaded programs). Note that we have to use runtime.LockOSThread when we have an open handle to a tid-specific procfs file that we are operating on multiple times. Go can reschedule us such that we are running on a different thread and then kill the original thread (causing -ENOENT or similarly confusing errors). This is not strictly necessary for most usages of /proc/thread-self (such as using /proc/thread-self/fd/$n directly) since only operating on the actual inodes associated with the tid requires this locking, but because of the pre-3.17 fallback for CentOS, we have to do this in most cases. 
In addition, CentOS's kernel is too old for /proc/thread-self, which requires us to emulate it -- however in rootfs_linux.go, we are in the container pid namespace but /proc is the host's procfs. This leads to the incredibly frustrating situation where there is no way (on pre-4.1 Linux) to figure out which /proc/self/task/... entry refers to the current tid. We can just use /proc/self in this case. Yes this is all pretty ugly. I also wish it wasn't necessary. Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>	2023-12-14 11:36:41 +11:00
Kir Kolyshkin	2c9598c886	libct/cgroups.OpenFile: clean "file" argument This prevents potential exploit of using "../" in cgroups.OpenFile (as well as other methods that use OpenFile) to read or write to other cgroups. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2023-10-27 10:46:32 -07:00
Kir Kolyshkin	9cd5d6cddf	libct/cg: remove retry on EINTR in Commit `f34eb2c00` introduced a workaround to retry on EINTR due to changes in Go 1.14. It was fixed in Go 1.15 [1], meaning a custom retry loop is no longer necessary. Keep the test case to avoid future regressions. [1] https://github.com/golang/go/issues/38033 Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2023-10-20 10:08:54 -07:00
Kir Kolyshkin	f62f0bdfbf	Remove nolint annotations for unix errno comparisons golangci-lint v1.54.2 comes with errorlint v1.4.4, which contains the fix [1] whitelisting all errno comparisons for errors coming from x/sys/unix. Thus, these annotations are no longer necessary. Hooray! [1] https://github.com/polyfloyd/go-errorlint/pull/47 Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2023-08-24 17:28:10 -07:00
Kir Kolyshkin	bd50e7c420	libct/cg/OpenFile: check cgroupFd on error opencontainers/runc issue 3026 describes a scenario in which OpenFile failed to open a legitimate existing cgroupfs file. Added debug (similar to what this commit does) showed that cgroupFd is no longer opened to "/sys/fs/cgroup", but to "/" (it's not clear what caused it, and the source code is not available, but they might be using the same process on both sides of the container/chroot/pivot_root/mntns boundary, or remounting /sys/fs/cgroup). Consider such use incorrect, but give a helpful hint as to what is going on by wrapping the error in a more useful message. NB: this can potentially be fixed by reopening the cgroupFd once we detected that it's screwed, and retrying openat2. Alas I do not have a test case for this, so left this as a TODO suggestion. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-08-11 17:00:05 -07:00
Kir Kolyshkin	c2d9668cc5	libct/cg/OpenFile: fix openat2 vs top cgroup dir Fix reading cgroup files from the top cgroup directory, i.e. /sys/fs/cgroup. The code was working for any subdirectory of /sys/fs/cgroup, but for dir="/sys/fs/cgroup" a fallback (open and fstatfs) was used, because of the way the function worked with the dir argument. Fix those cases, and add unit tests to make sure they work. While at it, make the rules for dir and name components more relaxed, and add test cases for this, too. While at it, improve OpenFile documentation, and remove a duplicated doc comment for openFile. Without these fixes, the unit test fails the following cases: file_test.go:67: case {path:/sys/fs/cgroup name:cgroup.controllers}: fallback file_test.go:67: case {path:/sys/fs/cgroup/ name:cgroup.controllers}: openat2 /sys/fs/cgroup//cgroup.controllers: invalid cross-device link file_test.go:67: case {path:/sys/fs/cgroup/ name:/cgroup.controllers}: openat2 /sys/fs/cgroup///cgroup.controllers: invalid cross-device link file_test.go:67: case {path:/ name:/sys/fs/cgroup/cgroup.controllers}: fallback file_test.go:67: case {path:/ name:sys/fs/cgroup/cgroup.controllers}: fallback file_test.go:67: case {path:/sys/fs/cgroup/cgroup.controllers name:}: openat2 /sys/fs/cgroup/cgroup.controllers/: not a directory Here "fallback" means openat2-based implementation fails, and the fallback code is used (and works). Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-08-09 11:17:05 -07:00
Kir Kolyshkin	b60e2edf75	libct/cg: stop using pkg/errors Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-06-22 16:09:47 -07:00
Kir Kolyshkin	56e478046a	: ignore errorlint warnings about unix. errors Errors from unix.* are always bare and thus can be used directly. Add //nolint:errorlint annotation to ignore errors such as these: libcontainer/system/xattrs_linux.go:18:7: comparing with == will fail on wrapped errors. Use errors.Is to check for a specific error (errorlint) case errno == unix.ERANGE: ^ libcontainer/container_linux.go:1259:9: comparing with != will fail on wrapped errors. Use errors.Is to check for a specific error (errorlint) if e != unix.EINVAL { ^ libcontainer/rootfs_linux.go:919:7: comparing with != will fail on wrapped errors. Use errors.Is to check for a specific error (errorlint) if err != unix.EINVAL && err != unix.EPERM { ^ libcontainer/rootfs_linux.go:1002:4: switch on an error will fail on wrapped errors. Use errors.Is to check for specific errors (errorlint) switch err { ^ Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-06-22 16:09:47 -07:00
Kir Kolyshkin	8f1b4d4a6f	libct/cg: mv fscommon.{Open,Read,Write}File to cgroups This is a better place as cgroups itself is using these. Should help with moving more stuff common in between fs and fs2 to fscommon. Looks big, but this is just moving the code around: fscommon/{fscommon,open}.go -> cgroups/file.go fscommon/fscommon_test.go -> cgroups/file_test.go and fixes for TestMode moved to a different package. There's no functional change. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-06-13 12:38:21 -07:00
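
The 2c9598c886 entry above describes cleaning the "file" argument so that "../" cannot be used to reach other cgroups. A minimal sketch of that idea follows. It is not runc's actual code: cleanName is a hypothetical helper, and the directory and file names are illustrative assumptions. The name is resolved as if rooted at "/", so any leading ".." components are dropped before it is joined to the directory.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// cleanName is a hypothetical helper (an assumption for illustration, not
// runc's implementation). It cleans the name as if it were rooted at "/",
// so ".." components cannot climb above the caller's directory, and then
// returns the result as a relative path.
func cleanName(name string) string {
	cleaned := path.Clean("/" + name)
	return strings.TrimPrefix(cleaned, "/")
}

func main() {
	dir := "/sys/fs/cgroup/mygroup" // illustrative directory
	names := []string{"memory.max", "../other/memory.max", "a/../../b"}
	for _, name := range names {
		// Every joined path stays underneath dir.
		fmt.Println(path.Join(dir, cleanName(name)))
	}
}
```

Rooting the name before cleaning relies on path.Clean eliminating ".." elements that begin a rooted path, which is why no explicit prefix check is needed in this sketch.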

11 Commits
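
The most recent entry (771903608c) notes that some controller files, such as io.max, accept only one line per write. A minimal sketch of writing a multi-line value line by line is shown here. It is an illustration under assumptions, not the code from that commit: writeLineByLine and recorder are hypothetical names, and the device numbers and limits are made up. The recorder type stands in for a cgroup file so the separate Write calls can be observed.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// recorder is a stand-in for a cgroup file (hypothetical): it records the
// payload of each Write call separately.
type recorder struct{ writes []string }

func (r *recorder) Write(p []byte) (int, error) {
	r.writes = append(r.writes, string(p))
	return len(p), nil
}

// writeLineByLine (hypothetical) splits value on newlines and issues one
// Write per non-empty line, instead of a single Write with the whole value.
func writeLineByLine(w io.Writer, value string) error {
	for _, line := range strings.Split(value, "\n") {
		if line == "" {
			continue // skip empty lines, e.g. after a trailing newline
		}
		if _, err := io.WriteString(w, line); err != nil {
			return fmt.Errorf("writing %q: %w", line, err)
		}
	}
	return nil
}

func main() {
	rec := &recorder{}
	// Illustrative io.max-style value covering two devices.
	value := "8:0 rbps=1048576\n8:16 wbps=2097152\n"
	if err := writeLineByLine(rec, value); err != nil {
		panic(err)
	}
	for i, w := range rec.writes {
		fmt.Printf("write %d: %s\n", i+1, w)
	}
}
```

Stopping at the first failed line keeps the error tied to the specific entry that the controller rejected.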