zishuo/runc

mirror of https://github.com/opencontainers/runc.git synced 2025-12-24 11:50:58 +08:00

Author	SHA1	Message	Date
Mrunal Patel	8772c4dd2f	Merge pull request #3109 from kolyshkin/seccomp seccomp: skip redundant rules	2021-08-03 23:09:00 -04:00
Kir Kolyshkin	60e02b4b25	runc exec: fail with exit code of 255 Currently there's no way to distinguish between the two cases: - runc exec failed; - the command executed returned 1. This was possible before commit `8477638aab`, as runc exec exited with the code of 255 if exec itself has failed. The code of 255 is the same convention as used by e.g. ssh. Re-introduce the feature, document it, and add some tests so it won't be broken again. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-07-28 13:46:13 -07:00
Kir Kolyshkin	5dd92fd9b4	libct/seccomp: skip redundant rules This fixes using runc with podman on my system (Fedora 34). > $ podman --runtime `pwd`/runc run --rm --memory 4M fedora echo it works > Error: unable to start container process: error adding seccomp filter rule for syscall bdflush: permission denied: OCI permission denied The problem is, libseccomp returns EPERM when a redundant rule (i.e. the rule with the same action as the default one) is added, and podman (on my machine) sets the following rules in config.json: <....> "seccomp": { "defaultAction": "SCMP_ACT_ERRNO", "architectures": [ "SCMP_ARCH_X86_64", "SCMP_ARCH_X86", "SCMP_ARCH_X32" ], "syscalls": [ { "names": [ "bdflush", "io_pgetevents", <....> ], "action": "SCMP_ACT_ERRNO", "errnoRet": 1 }, <....> (Note that defaultErrnoRet is not set, but it defaults to 1). With this commit, it works: > $ podman --runtime `pwd`/runc run --memory 4M fedora echo it works > it works Add an integration test (that fails without the fix). Similar crun commit: * https://github.com/containers/crun/commit/08229f3fb904c5ea19a7d9 Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-07-27 00:04:59 -07:00
Adrian Reber	9f656dbb11	Do not use Vagrant for CentOS 7/8 As Cirrus CI does not provide a real terminal this uses the same 'ssh -tt' workaround as the Vagrant setup. This sets up the CentOS 7 and 8 to allow SSH as root to localhost so that we can run all the tests via 'ssh -tt'. Not going through vagrant reduces CI times for CentOS 7 and 8 from 6 minutes to 4 minutes. Signed-off-by: Adrian Reber <areber@redhat.com>	2021-07-23 09:23:23 +02:00
Kir Kolyshkin	d448016486	tests/rootless.sh: fixup for "update rt" test Without this, the test case fails with > Writing 1000000 to /sys/fs/cgroup/cpu,cpuacct/runc-cgroups-integration-test/cpu.rt_period_us > /tmp/bats-run-106836/bats.116418.src: line 548: /sys/fs/cgroup/cpu,cpuacct/runc-cgroups-integration-test/cpu.rt_period_us: Permission denied Since we do not currently have a setup to test this, this went unnoticed (can be seen in RHEL8 though). Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com> Signed-off-by: Adrian Reber <areber@redhat.com>	2021-07-23 09:23:23 +02:00
Kir Kolyshkin	86af524866	tests/int: fix "update rt period and runtime" for rootless Since commit `f09a3e1b8d`, the value passed on to read starts with a slash, resulting in the first element of the array to be empty. As a result, the test tries to write to the top-level cgroup, which fails when rootless: > # Writing 1000000 to /sys/fs/cgroup/cpu,cpuacct//cpu.rt_period_us > # /tmp/bats-run-106184/bats.115768.src: line 548: /sys/fs/cgroup/cpu,cpuacct//cpu.rt_period_us: Permission denied To fix, remove the leading slash. An alternative fix would be to do "for ((i = 1;" instead of "i = 0", but that seems less readable. Fixes: `f09a3e1b8d` Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-07-23 09:23:23 +02:00
Aleksa Sarai	2c01cec8ac	merge branch 'pr-3067' Odin Ugedal (2): libct/cg/sd: Don't freeze cgroup on cgroup v2 Set Update device update tests LGTMs: kolyshkin mrunalp cyphar Closes #3067	2021-07-13 12:59:29 +10:00
Odin Ugedal	d41a273dae	Update device update tests Run device update tests on cgroup v2, and add a test verifying that we don't allow access to devices when we don't intend to. Signed-off-by: Odin Ugedal <odin@uged.al>	2021-07-07 22:44:08 +02:00
Odin Ugedal	6be088d69d	tests/int/dev: add CAP_SYSLOG to /dev/kmsg tests Add CAP_SYSLOG to ensure that /dev/kmsg can be accesses on systems where the sysctl kernel.dmesg_restrict = 1. Signed-off-by: Odin Ugedal <odin@uged.al>	2021-07-07 15:44:16 +02:00
Akihiro Suda	5547b5774f	Merge pull request #3033 from kolyshkin/rm-own-errors libcontainer: rm own error system	2021-07-01 13:47:27 +09:00
Kir Kolyshkin	1bbeadae72	tests/int/no_pivot: fix for new kernels The test is failing like this: not ok 70 runc run --no-pivot must not expose bare /proc # (in test file tests/integration/no_pivot.bats, line 20) # `[[ "$output" == "mount: permission denied" ]]' failed # runc spec (status=0): # # runc run --no-pivot test_no_pivot (status=1): # unshare: write error: Operation not permitted Apparently, a recent kernel commit db2e718a47984b9d prevents root from doing unshare -r unless it has CAP_SETFPCAP. Add the capability for this specific test. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-06-29 13:31:54 -07:00
Kir Kolyshkin	70fdc0573d	Revert "checkpoint: resolve symlink for external bind mount" This reverts commit `da22625f69` (PR 2902). Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-06-24 20:52:43 -07:00
Mrunal Patel	1079288bef	Merge pull request #2902 from liusdu/checkpoint checkpoint: resolve symlink for external bind mount	2021-06-24 22:52:02 -04:00
Kir Kolyshkin	e918d02139	libcontainer: rm own error system This removes libcontainer's own error wrapping system, consisting of a few types and functions, aimed at typization, wrapping and unwrapping of errors, as well as saving error stack traces. Since Go 1.13 now provides its own error wrapping mechanism and a few related functions, it makes sense to switch to it. While doing that, improve some error messages so that they start with "error", "unable to", or "can't". A few things that are worth mentioning: 1. We lose stack traces (which were never shown anyway). 2. Users of libcontainer that relied on particular errors (like ContainerNotExists) need to switch to using errors.Is with the new errors defined in error.go. 3. encoding/json is unable to unmarshal the built-in error type, so we have to introduce initError and wrap the errors into it (basically passing the error as a string). This is the same as it was before, just a tad simpler (actually the initError is a type that got removed in commit afa844311; also suddenly ierr variable name makes sense now). Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-06-24 10:21:04 -07:00
Kir Kolyshkin	2916817278	tests/int/cgroups: add test for bfq per-device weight This works for both cgroup v1 and v2. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-06-16 04:51:40 -07:00
Mrunal Patel	a3813ca249	Merge pull request #3000 from kolyshkin/test-dev-update libct/int: add device update test	2021-06-11 14:36:49 -04:00
Kir Kolyshkin	bd8e070109	libct/cg/sd: fix "SkipDevices" handling 1. The meaning of SkipDevices is what it is -- do not set any device-related options. 2. Reverts the part of commit `108ee85b82` which skipped the freeze when the SkipDevices is set. Apparently, the freeze is needed on update even if no Device* properties are being set. 3. Add "runc update" to "runc run [device cgroup deny]" test. Fixes: `752e7a8249` Fixes: `108ee85b82` Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-06-10 12:01:21 -07:00
Kir Kolyshkin	508f5bf665	libct/int: add device update test ... and remove the one from tests/integration. The idea is similar to the one for the test case being removed -- try updating device rules many times to make sure we are not leaking eBPF programs after every update/Set(). This is better though as we can really change the device rules every time (which "runc update" can't) and check that the rule is applied. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-06-05 16:11:29 -07:00
Kir Kolyshkin	f5a2c9cced	tests/int/dev: only call lsblk once The "runc run [device cgroup allow rm block device]" test calls lsblk three times to get device name, minor and major number. This creates a potential problem when the devices are changed between the calls. Simplify the code by using bash read together with IFS (as there's no way to have lsblk output MAJOR:MINOR pair without a semicolon). Note that head -n 1 is not needed as read already reads a single line. [v2: don't use PATH as CentOS7's lsblk does not support it.] Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-05-25 12:21:42 -07:00
Aleksa Sarai	00119c85a9	integration: add repeated "runc update" test This is to ensure that we aren't leaking eBPF programs after "runc update". Unfortunately we cannot directly test the behaviour of cgroup program updates in an integration test because "runc update" doesn't support that behaviour at the moment. So instead we rely on the fact that each "runc update" implicitly triggers the devices rules to be updated. Without the previous patches applied, this new test will fail with errors (on cgroupv2 systems). Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>	2021-05-23 17:55:24 +10:00
Giuseppe Scrivano	c61f606254	libcontainer: honor seccomp defaultErrnoRet https://github.com/opencontainers/runtime-spec/pull/1087 added support for defaultErrnoRet to the OCI runtime specs. If a defaultErrnoRet is specified, disable patching the generated libseccomp cBPF. Closes: https://github.com/opencontainers/runc/issues/2943 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2021-05-17 09:23:32 +02:00
Aleksa Sarai	62250c13fe	merge branch 'pr-2944' Kir Kolyshkin (8): tests/int/helpers: rm old code libct/cg/sd/v2: call initPath from Path tests/int/update.bats: don't set cpuset in setup tests/int/helpers: generalize require cgroups_freezer tests/int: enable/use requires cgroups_<ctrl> tests/int/cgroups: don't check for hugetlb tests/int: run rootless_cgroup tests for v2+systemd Vagrantfile.fedora: set Delegate=yes LGTMs: AkihiroSuda cyphar Closes #2944	2021-05-10 11:16:11 +10:00
Kir Kolyshkin	ac70a9a1b2	tests/int: run rootless_cgroup tests for v2+systemd Before this commit, "require rootless_cgroup" feature check required "cgroup" to be present in ROOTLESS_FEATURES. The idea of the requirement, though, is to ensure that rootless runc can manage cgroups. In case of systemd + cgroup v2, rootless runc can manage cgroups, thanks to systemd delegation, so modify the feature check accordingly. Next, convert (simplify) some of the existing users to the modified check. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-05-07 22:07:27 -07:00
Kir Kolyshkin	601cf5825f	tests/int/cgroups: don't check for hugetlb Systemd is not able to delegate hugetlb controller, and it is needed for cgroup v2 + systemd + rootless case (which is currently skipped because of "requires rootless_cgroup", but will be enabled by a later commit). The failure being fixed looks like this: > not ok 4 runc create (limits + cgrouppath + permission on the cgroup dir) succeeds > # (from function `check_cgroup_value' in file /vagrant/tests/integration/helpers.bash, line 188, > # in test file /vagrant/tests/integration/cgroups.bats, line 53) > # `check_cgroup_value "cgroup.controllers" "$(cat /sys/fs/cgroup/user.slice/user-$(id -u).slice/cgroup.controllers)"' failed > # <....> > # /sys/fs/cgroup/user.slice/user-2000.slice/user@2000.service/machine.slice/runc-cgroups-integration-test-20341.scope/cgroup.controllers > # current cpuset cpu io memory pids !? cpuset cpu io memory hugetlb pids Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-05-07 22:07:25 -07:00
Kir Kolyshkin	40b9791916	tests/int: enable/use requires cgroups_<ctrl> 1. In case of cgroup v2 + systemd + rootless, get the list of controllers from the own cgroup, rather than from the root one (as systemd cgroup delegation is required in this case, and it might not be set or fully working). 2. Use "requires cgroups_<controller>" in tests that need those controllers. Tested on Fedora 31 (which does not support delegation of cpuset controller). Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-05-07 22:07:08 -07:00
Kir Kolyshkin	44fcbfd68f	tests/int/helpers: generalize require cgroups_freezer With this, something like cgroups_pids can be used (to be added by the next commit). Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-05-07 22:06:13 -07:00
Kir Kolyshkin	353f2ad193	tests/int/update.bats: don't set cpuset in setup Setting cpuset.cpus requires cpuset controller, which might not be available in case of cgroup v2 + systemd < 244. Rather than require cpuset support from every test case, do not set the initial cpuset value from setup(), and do not check it. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-05-07 22:06:08 -07:00
Kir Kolyshkin	0ed1f8022d	tests/int/helpers: rm old code This was added by commit `ca4f427af`, together with set_resources_limit, and was needed for the latter, since sed was used to update resources. Now when we switched to jq in commit `79fe41d3c1`, this kludge is no longer needed. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-05-07 22:05:49 -07:00
Qiang Huang	fc8a0d97a9	Merge pull request #2913 from liusdu/interative tiny fix iterative checkpoint test case	2021-05-08 10:16:30 +08:00
Liu Hua	da22625f69	checkpoint: resolve symlink for external bind mount runc resolves symlink before doing bind mount. So we should save original path while formatting CriuReq for checkpoint. Signed-off-by: Liu Hua <weldonliu@tencent.com>	2021-04-25 09:50:00 +08:00
Liu Hua	23e3794d9c	checkpoint: validate parent path `--parent-path` needs to be relative path of `--image-path`, and points to previous checkpoint directory. Signed-off-by: Liu Hua <weldonliu@tencent.com>	2021-04-23 13:20:08 +08:00
Mrunal Patel	42a18e7f02	Merge pull request #2818 from kolyshkin/rootless-cgroup2-mount Fix cgroup2 mount for rootless case	2021-04-22 17:04:19 -07:00
Mrunal Patel	b69bd53648	Merge pull request #2869 from dqminh/fix-oss-fuzz Fix oss-fuzz build	2021-04-20 19:07:52 -07:00
Kir Kolyshkin	0216716c18	tests/int: add a case for cgroupv2 mount This checks that in-container view of /sys/fs/cgroup does not contain any extra cgroups (which was the case for rootless before the previous commit). Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-04-20 12:37:54 -07:00
Kir Kolyshkin	5ffcc5683e	tests/int: use bfq test with rootless As the rootless cgroup2 mount is now fixed, we can enable this test for rootless as well. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-04-20 12:37:39 -07:00
Liu Hua	aa62272382	tiny fix iterative checkpoint test case Criu creates symlink named "parent" under image-path while doing interactive checkpoint, without checking wether symlink points to right place. This patch corrects related test case. Signed-off-by: Liu Hua <weldonliu@tencent.com>	2021-04-19 20:04:52 +08:00
Kir Kolyshkin	31dd1e499b	tests/int: add rootless + host pidns test case For the fix, see previous commit. Without the fix, this test case fails: > container_linux.go:380: starting container process caused: > process_linux.go:545: container init caused: readonly path /proc/bus: > operation not permitted Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-04-14 17:35:15 -07:00
Qiang Huang	2d38476c96	Merge pull request #2840 from kolyshkin/ignore-kmem Ignore kernel memory settings	2021-04-13 09:44:14 +08:00
Kir Kolyshkin	52390d6804	Ignore kernel memory settings This is somewhat radical approach to deal with kernel memory. Per-cgroup kernel memory limiting was always problematic. A few examples: - older kernels had bugs and were even oopsing sometimes (best example is RHEL7 kernel); - kernel is unable to reclaim the kernel memory so once the limit is hit a cgroup is toasted; - some kernel memory allocations don't allow failing. In addition to that, - users don't have a clue about how to set kernel memory limits (as the concept is much more complicated than e.g. [user] memory); - different kernels might have different kernel memory usage, which is sort of unexpected; - cgroup v2 do not have a [dedicated] kmem limit knob, and thus runc silently ignores kernel memory limits for v2; - kernel v5.4 made cgroup v1 kmem.limit obsoleted (see https://github.com/torvalds/linux/commit/0158115f702b). In view of all this, and as the runtime-spec lists memory.kernel and memory.kernelTCP as OPTIONAL, let's ignore kernel memory limits (for cgroup v1, same as we're already doing for v2). This should result in less bugs and better user experience. The only bad side effect from it might be that stat can show kernel memory usage as 0 (since the accounting is not enabled). [v2: add a warning in specconv that limits are ignored] Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-04-12 12:18:11 -07:00
Aleksa Sarai	64bb59f592	nsenter: improve debug logging In order to make 'runc --debug' actually useful for debugging nsexec bugs, provide information about all the internal operations when in debug mode. [@kolyshkin: rebasing; fix formatting via indent for make validate to pass] Signed-off-by: Aleksa Sarai <asarai@suse.de> Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-04-08 09:54:43 -07:00
Peter Hunt	6ce2d63a5d	libct/init_linux: retry chdir to fix EPERM Alas, the EPERM on chdir saga continues... Unfortunately, the there were two releases between when `5e0e67d76c` was released and when the workaround https://github.com/opencontainers/runc/pull/2712 was added. Between this, folks started relying on the ability to have a workdir that the container user doesn't have access to. Since this case was previously valid, we should continue support for it. Now, we retry the chdir: Once at the top of the function (to catch cases where the runc user has access, but container user does not) and once after we setup user (to catch cases where the container user has access, and the runc user does not) Add a test case for this as well. Signed-off-by: Peter Hunt <pehunt@redhat.com>	2021-04-06 14:51:43 -04:00
Akihiro Suda	c453f1a523	Merge pull request #2881 from kolyshkin/test-rand-cg tests/int: some refactoring, fix a flake	2021-04-06 13:27:19 +09:00
Shengjing Zhu	c5029c001d	tests: fix hello-world tarball name in testdata for arm64 Signed-off-by: Shengjing Zhu <zhsj@debian.org>	2021-04-05 23:55:27 +08:00
Kir Kolyshkin	36fe3cc28c	tests/int/cpt: fix lazy-pages flakiness "checkpoint --lazy-pages and restore" test sometimes fails on restore in our CI on Fedora 33 when systemd cgroup driver is used: > (00.076104) Error (compel/src/lib/infect.c:1513): Task 48521 is in unexpected state: f7f > (00.076122) Error (compel/src/lib/infect.c:1520): Task stopped with 15: Terminated > ... > (00.078246) Error (criu/cr-restore.c:2483): Restoring FAILED. I think what happens is 1. The test runs runc checkpoint in lazy-pages mode in background. 2. The test runs criu lazy-pages in background. 3. The test runs runc restore. Now, all three are working in together: criu restore restores, criu lazy-pages listens for page faults on a uffd and fetch missing pages from runc checkpoint, who serves those pages. At some point criu lazy-pages decides to fetch the rest of the pages, and once it's done it exits, and runc checkpoint, as there are no more pages to serve, exits too. At the end of runc checkpoint the container is removed (see "defer destroy(container)" in checkpoint.go. This involves a call to cgroupManager.Destroy, which, in case systemd manager is used, calls stopUnit, which makes systemd to not just remove the unit, but also send SIGTERM to its processes, if there are any. As the container is being restored into the same systemd unit, sometimes this results in sending SIGTERM to a process which criu restores, and thus restoring fails. The remedy here is to change the name of systemd unit to which the container is restored. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-04-01 12:57:53 -07:00
Kir Kolyshkin	0e0890027c	tests/int/checkpoint: close lazy_r fd Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-04-01 12:57:32 -07:00
Kir Kolyshkin	b09030a5f4	tests/int/checkpoint: close fds in check_pipes In check_pipes, make sure we - close all fds we opened in setup_pipes; - check that runc stderr is empty (and fail if it's not). Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-04-01 12:57:27 -07:00
Kir Kolyshkin	e63df1e6f0	tests/int: really randomize cgroup/unit names Commit `41670e21f0` added some randomization to cgroup paths and (if systemd cgroup driver is used) systemd unit names, but the randomization was per bats instance, not per test. Fix this by refactoring init_cgroups_path/set_cgroups_path (moving variable/random part to set_cgroups_path). NOTE though that the randomization is only performed for those tests that explicitly call set_cgroups_path. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-04-01 12:19:22 -07:00
Kir Kolyshkin	6e4c5b6e85	tests/int/cgroups: don't use BUSYBOX_BUNDLE Commit `41670e21f` removed BUSYBOX_BUNDLE env var, but `c3ffd2ef81` was developed before `41670e21f` was merged. Everything still works because now BUSYBOX_BUNDLE has no value. Nevertheless, let's remove it to avoid confusion. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-04-01 11:44:53 -07:00
Kir Kolyshkin	bed4d89f57	Merge pull request #2807 from kolyshkin/google-golang-protobuf go.mod, libct: switch to google.golang.org/protobuf	2021-03-31 20:34:16 -07:00
Kir Kolyshkin	f09a3e1b8d	tests/int: don't set/use CGROUP_XXX variables Helper function init_cgroup_paths sets two sets of cgroup path variables for cgroup v1 case (below XXX is cgroup controller name, e.g. MEMORY): 1. CGROUP_XXX_BASE_PATH -- path to XXX controller mount point (e.g. CGROUP_MEMORY_BASE_PATH=/sys/fs/cgroup/memory); 2. CGROUP_XXX -- path to the particular container XXX controller cgroup (e.g. CGROUP_MEMORY=/sys/fs/cgroup/memory/runc-cgroups-integration-test/test-cgroup). The second set of variables is mostly used by check_cgroup_value(), with only two exceptions: - CGROUP_CPU in @test "update rt period and runtime"; - few CGROUP_XXX in @test "runc delete --force in cgroupv1 with subcgroups". Remove these variables, as their values are not used much and are easy to get (as can be seen in modified test cases). While at it, mark some variables as local. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-03-31 20:24:46 -07:00

1 2 3 4 5 ...

376 Commits