Commit Graph

7205 Commits

Author SHA1 Message Date
Aleksa Sarai
4774df3877 VERSION: release v1.2.7
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
v1.2.7
2025-09-04 20:21:36 +10:00
lfbzhm
daa7a74d9d Merge pull request #4872 from kolyshkin/1.2-4765
[1.2] Refactor/improve prepareCriuRestoreMounts
2025-08-28 12:47:39 +08:00
Kir Kolyshkin
66888fb75c criu: simplify isOnTmpfs check in prepareCriuRestoreMounts
Instead of generating a list of tmpfs mounts and having a special function
to check whether the path is in the list, let's go over the list of
mounts directly. This simplifies the code and improves readability.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit ce3cd4234c)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-08-28 12:29:05 +08:00
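
A minimal sketch of the idea in 66888fb75c, not the actual runc code: the `mount` type, its fields, and the `isOnTmpfs` signature below are assumptions made for illustration. It checks a path against the mount list directly instead of first collecting the tmpfs entries into a separate list.

```go
package main

import (
	"fmt"
	"strings"
)

// mount is a hypothetical, trimmed-down stand-in for a container mount entry.
type mount struct {
	Destination string
	Device      string
}

// isOnTmpfs reports whether path lies strictly below any tmpfs mount in mounts.
// It walks the mount list directly; no intermediate tmpfs-only list is built.
func isOnTmpfs(path string, mounts []mount) bool {
	for _, m := range mounts {
		if m.Device != "tmpfs" {
			continue
		}
		// Compare on a path-component boundary so that "/runner"
		// is not treated as a child of "/run".
		prefix := strings.TrimSuffix(m.Destination, "/") + "/"
		if strings.HasPrefix(path, prefix) {
			return true
		}
	}
	return false
}

func main() {
	mounts := []mount{
		{Destination: "/proc", Device: "proc"},
		{Destination: "/run", Device: "tmpfs"},
	}
	fmt.Println(isOnTmpfs("/run/lock", mounts))
	fmt.Println(isOnTmpfs("/runner/x", mounts))
}
```

Whether the mount point itself should count as "on tmpfs" is a design choice; this sketch only matches paths below it.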
Kir Kolyshkin
8559d3e527 criu: inline makeCriuRestoreMountpoints
Since its code is now trivial, and it is only called from a single
place, it does not make sense to have it as a separate function.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit f91fbd34d9)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-08-28 12:29:05 +08:00
Kir Kolyshkin
83a3755693 criu: ignore cgroup early in prepareCriuRestoreMounts
It makes sense to ignore cgroup mounts much earlier in the code,
saving some time on unnecessary operations.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit b8aa5481db)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-08-28 12:29:05 +08:00
Kir Kolyshkin
881a781eed criu: improve prepareCriuRestoreMounts
1. Replace the big "if !" block with the if block and continue,
   simplifying the code flow.

2. Move comments closer to the code, improving readability.

This commit is best reviewed with --ignore-all-space or similar.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 0c93d41c65)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-08-28 12:29:05 +08:00
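
The control-flow change in item 1 of 881a781eed can be pictured roughly as follows. This is only a sketch: the `mount` type, the `restoreTargets` helper, and the choice of "cgroup" as the skipped device are all hypothetical and were picked just to show the guard-and-continue shape.

```go
package main

import "fmt"

// mount is a hypothetical stand-in for a container mount entry.
type mount struct {
	Destination string
	Device      string
}

// restoreTargets returns the destinations that still need handling.
// Unwanted entries are rejected up front with "continue", so the main
// logic stays at a single indentation level instead of sitting inside
// one large "if !skip { ... }" block.
func restoreTargets(mounts []mount) []string {
	var out []string
	for _, m := range mounts {
		if m.Device == "cgroup" {
			continue
		}
		out = append(out, m.Destination)
	}
	return out
}

func main() {
	mounts := []mount{
		{Destination: "/sys/fs/cgroup", Device: "cgroup"},
		{Destination: "/dev/shm", Device: "tmpfs"},
	}
	fmt.Println(restoreTargets(mounts))
}
```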
lfbzhm
2f9d7aecf9 Merge pull request #4869 from cyphar/1.2-reset-cpu-affinity
[1.2] libct: reset CPU affinity by default
2025-08-28 12:27:08 +08:00
Aleksa Sarai
6983cc7ac1 [1.2] libct: reset CPU affinity by default
In certain deployments, it's possible for runc to be spawned by a
process with a restrictive cpumask (such as from a systemd unit with
CPUAffinity=... configured) which will be inherited by runc and thus the
container process by default.

The cpuset cgroup used to reconfigure the cpumask automatically for
joining processes, but kernel commit da019032819a ("sched: Enforce user
requested affinity") changed this behaviour in Linux 6.2.

The solution is to try to emulate the expected behaviour by resetting
our cpumask to correspond with the configured cpuset (in the case of
"runc exec", if the user did not configure an alternative one). Normally
we would have to parse /proc/stat and /sys/fs/cgroup, but luckily
sched_setaffinity(2) will transparently convert an all-set cpumask (even
if it has more entries than the number of CPUs on the system) to the
correct value for our usecase.

For some reason, in our CI it seems that rootless --systemd-cgroup
results in the cpuset (presumably temporarily?) being configured such
that sched_setaffinity(2) will allow the full set of CPUs. For this
particular case, all we care about is that it is different to the
original set, so include some special-casing (but we should probably
investigate this further...).

Reported-by: ningmingxiao <ning.mingxiao@zte.com.cn>
Reported-by: Martin Sivak <msivak@redhat.com>
Reported-by: Peter Hunt <pehunt@redhat.com>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
(Cherry-pick of commit 121192ade6c55f949d32ba486219e2b1d86898b2.)
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2025-08-28 11:02:54 +10:00
Aleksa Sarai
a06ff08ea2 [1.2] tests: add RUNC_CMDLINE for tests incompatible with functions
Sometimes we need to run runc through some wrapper (like nohup), but
because "__runc" and "runc" are bash functions in our test suite this
doesn't work trivially -- and you cannot just pass "$RUNC" because you
need to set --root for rootless tests.

So create a setup_runc_cmdline helper which sets $RUNC_CMDLINE to the
beginning cmdline used by __runc (and switch __runc to use that).

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
(Cherry-pick of commit d1f6acfab06e6f5eb15b7edfaa704f50907907b1.)
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2025-08-28 11:02:53 +10:00
Aleksa Sarai
197c7fcd91 [1.2] tests: add sane_run helper
"runc" was a special wrapper around bats's "run" which output some very
useful diagnostic information to the bats log, but this was not usable
for other commands. So let's make it a more generic helper that we can
use for other commands.

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
(Cherry-pick of commit ea385de40c9a006737399bc72918a19e5d038736.)
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2025-08-28 11:02:50 +10:00
lfbzhm
d5350ff31b Merge pull request #4867 from cyphar/1.2-gha-arm
[1.2] CI: switch to GHA for arm
2025-08-27 18:24:59 +08:00
Kir Kolyshkin
376961e830 [1.3] Switch to packaged criu on arm
The issue on arm [1] is now fixed, so let's get back to using the
packaged criu version for most of the CI matrix.

This reverts commit 105674844e
("ci: use criu built from source on gha arm").

[1]: https://github.com/checkpoint-restore/criu/issues/2709

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(Cherry-picked from commit 96f4a90a6b1ca9e3f2011ebaeffb7dc52db2ca32.)
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2025-08-27 14:31:51 +10:00
Kir Kolyshkin
553c74de8e [1.3] ci: use criu built from source on gha arm
Currently, criu package from opensuse build farm times out on GHA arm,
so let's only use criu-dev (i.e. compiled from source on CI machine).

Once this is fixed, this patch can be reverted.

Related to criu issue 2709.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(Cherry-picked from commit 105674844eaaf24bf14135ef0c64703e511882ab.)
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2025-08-27 14:31:51 +10:00
Kir Kolyshkin
bf2eb5f2ac [1.3] CI: switch to GHA for arm
Since GHA now provides ARM, we can switch away from actuated.

Many thanks to @alexellis (@self-actuated) for being the sponsor of this
project.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(Cherry-picked from commit 1cf096803abb770c414ce0a1e2e0be283b09001d.)
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2025-08-27 14:31:50 +10:00
Kir Kolyshkin
8d86671577 Merge pull request #4834 from marquiz/release-1.2
[release-1.2] runc update: don't lose intelRdt state
2025-08-04 16:53:37 -07:00
Markus Lehtonen
addad95c3b runc update: don't lose intelRdt state
Prevent --l3-cache-schema from clearing the intel_rdt.memBwSchema state
and --mem-bw-schema clearing l3_cache_schema, respectively.

Signed-off-by: Markus Lehtonen <markus.lehtonen@intel.com>
(cherry picked from commit 57b6a317bb)
2025-08-01 16:59:57 +03:00
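
One way to picture the fix in addad95c3b is an update that touches only the fields the user actually passed. The sketch below is an assumption-laden illustration, not the runc implementation: the `intelRdt` struct, the `applyUpdate` helper, and the convention that an empty string means "flag not given" are all hypothetical.

```go
package main

import "fmt"

// intelRdt is a hypothetical stand-in for the container's saved Intel RDT settings.
type intelRdt struct {
	L3CacheSchema string
	MemBwSchema   string
}

// applyUpdate overwrites only the fields for which a new value was given.
// An empty argument is treated as "flag not passed" and leaves the stored
// value untouched, so updating one schema does not clear the other.
func applyUpdate(cur intelRdt, l3, memBw string) intelRdt {
	if l3 != "" {
		cur.L3CacheSchema = l3
	}
	if memBw != "" {
		cur.MemBwSchema = memBw
	}
	return cur
}

func main() {
	state := intelRdt{L3CacheSchema: "L3:0=ffff", MemBwSchema: "MB:0=70"}
	state = applyUpdate(state, "L3:0=00ff", "")
	fmt.Printf("%+v\n", state)
}
```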
Kir Kolyshkin
0220cf47aa Merge pull request #4820 from cyphar/1.2-proc-net-dev-overmount
[1.2] rootfs: remove /proc/net/dev from allowed overmount list
2025-07-25 20:44:46 -07:00
Aleksa Sarai
ae93828c27 [1.2] rootfs: remove /proc/net/dev from allowed overmount list
This was added in 2ee9cbbd12 ("It's /proc/stat, not /proc/stats") with
no actual justification, and doesn't really make much sense on further
inspection:

 * /proc/net is a symlink to "self/net", which means that /proc/net/dev
   is a per-process file, and so overmounting it would only affect pid1.
   Any other program that cares about /proc/net/dev would see their own
   process's configuration, and unprivileged processes wouldn't be able
   to see /proc/1/... data anyway.

   In addition, the fact that this is a symlink means that runc will
   deny the overmount because /proc/1/net/dev is not in the proc
   overmount allowlist. This means that this has not worked for many
   years, and probably never worked in the first place.

 * /proc/self/net is already namespaced with network namespaces, so the
   primary argument for allowing /proc overmounts (lxcfs-like masking of
   procfs files to emulate namespacing for files that are not properly
   namespaced for containers -- such as /proc/cpuinfo) is moot.

   It goes without saying that lxcfs has never overmounted
   /proc/self/net/... files, so the general "because lxcfs"
   justification doesn't hold water either.

 * The kernel has slowly been moving towards blocking overmounts in
   /proc/self/. Linux 6.12 blocked overmounts for fd, fdinfo, and
   map_files; future Linux versions will probably end up blocking
   everything under /proc/self/.

Fixes: 2ee9cbbd12 ("It's /proc/stat, not /proc/stats")
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
(cherry-picked from commit 3620185d06b79da836559b75161027c6273fff7b.)
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2025-07-25 19:19:15 +10:00
Rodrigo Campos
e6e6f64334 Merge pull request #4791 from rata/rootfs-propagation-12
[1.2] fix rootfs propagation mode to shared / unbindable
2025-07-16 07:27:32 -03:00
Yusuke Sakurai
2667d7365b fix rootfs propagation mode
Signed-off-by: Yusuke Sakurai <yusuke.sakurai@3-shake.com>
(cherry picked from commit 04be81b6a3)
2025-07-16 12:03:47 +02:00
Rodrigo Campos
87d0b17079 Merge pull request #4811 from kolyshkin/1.2-4806
[1.2] tests/int/cgroups.bats: exclude dmem controller
2025-07-16 06:50:42 -03:00
Kir Kolyshkin
fea9456578 tests/int/cgroups.bats: exclude dmem controller
The dmem controller was added in kernel v6.13 and is now enabled in
Fedora 42 kernels. Yet, systemd is not aware of dmem.

This fixes the test case failure on Fedora.

For the initial test case, see commit 27515719.

For earlier commits similar to this one, see
commits 601cf582, 05272718, e83ca519.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit b3432118ed)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-07-15 11:14:30 -07:00
lfbzhm
00b3a7fd96 Merge pull request #4800 from astrawind/seccomp-agent-conn-leak-12
[1.2] libcontainer: close seccomp agent connection to prevent resource leaks
2025-07-04 22:53:57 +08:00
Pavel Liubimov
3e32756ef7 libcontainer: close seccomp agent connection to prevent resource leaks
Add missing defer conn.Close().

Signed-off-by: Pavel Liubimov <prlyubimov@gmail.com>
(cherry picked from commit aa0e7989c4)
Signed-off-by: Pavel Liubimov <prlyubimov@gmail.com>
2025-07-04 00:11:13 +03:00
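
The pattern behind 3e32756ef7 is the usual Go idiom of deferring Close right after a connection is obtained. The sketch below illustrates that idiom only: `fakeConn` and `notifyAgent` are hypothetical names, and the real seccomp agent code is not reproduced here.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// fakeConn is a hypothetical stand-in for the agent connection;
// it only records whether Close was called.
type fakeConn struct{ closed bool }

func (c *fakeConn) Write(p []byte) (int, error) { return len(p), nil }
func (c *fakeConn) Close() error                { c.closed = true; return nil }

// notifyAgent sends msg and always releases the connection,
// whichever return path is taken.
func notifyAgent(conn io.WriteCloser, msg []byte) error {
	defer conn.Close()
	if len(msg) == 0 {
		return errors.New("empty message")
	}
	_, err := conn.Write(msg)
	return err
}

func main() {
	c := &fakeConn{}
	err := notifyAgent(c, nil) // takes the early-return path
	fmt.Println(err, c.closed)
}
```

Without the defer, the early-return path would leave the connection open, which is the kind of leak the commit title describes.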
lfbzhm
98eb876379 Merge pull request #4740 from cyphar/1.2-mount-errors
[1.2] rootfs: improve mount-related errors
2025-04-23 13:12:50 +08:00
Aleksa Sarai
3954766557 [1.2] rootfs: improve error messages for bind-mount vfs flag setting
While debugging an issue involving failing mounts, I discovered that
just returning the plain mount error message when we are in the fallback
code for handling locked mounts leads to unnecessary confusion.

It also doesn't help that podman currently forcefully sets "rw" on
mounts, which means that rootless containers are likely to hit the
locked mounts issue fairly often.

So we should improve our error messages to explain why the mount is
failing in the locked flags case.

Fixes: 7c71a22705 ("rootfs: remove --no-mount-fallback and finally fix MS_REMOUNT")
(cherry picked from commit 58c3ab77b0)
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2025-04-23 14:35:34 +10:00
Aleksa Sarai
dc0c6d0163 [1.2] mount: add string representation of mount flags
When reading mount errors, it is quite hard to make sense of mount flags
in their hex form. As this is the error path, the minor performance
impact of constructing a string is probably not worth hyper-optimising.

(cherry pick from commit 30302a2850)
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2025-04-23 14:35:34 +10:00
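
To show why a string form helps when reading mount errors, here is a small sketch of rendering a flag word as names. It is not the code from dc0c6d0163: the `flagsString` helper and its output format are assumptions, and only a handful of mount(2) flag values are copied in so the example needs no external packages.

```go
package main

import (
	"fmt"
	"strings"
)

// A few mount(2) flag values from the Linux headers.
const (
	msRdonly = 0x1
	msNosuid = 0x2
	msNodev  = 0x4
	msNoexec = 0x8
	msBind   = 0x1000
	msRec    = 0x4000
)

// flagNames lists the known flags in the order they are printed.
var flagNames = []struct {
	bit  uintptr
	name string
}{
	{msRdonly, "MS_RDONLY"},
	{msNosuid, "MS_NOSUID"},
	{msNodev, "MS_NODEV"},
	{msNoexec, "MS_NOEXEC"},
	{msBind, "MS_BIND"},
	{msRec, "MS_REC"},
}

// flagsString renders flags as "NAME|NAME|...". Bits that are not in
// flagNames are kept as a trailing hex value so nothing is silently dropped.
func flagsString(flags uintptr) string {
	var names []string
	for _, f := range flagNames {
		if flags&f.bit != 0 {
			names = append(names, f.name)
			flags &^= f.bit
		}
	}
	if flags != 0 {
		names = append(names, fmt.Sprintf("0x%x", flags))
	}
	if len(names) == 0 {
		return "0"
	}
	return strings.Join(names, "|")
}

func main() {
	fmt.Println(flagsString(msBind | msRdonly | msNosuid))
}
```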
lfbzhm
948cefeb5b Merge pull request #4742 from kolyshkin/1.2-4670
[1.2] ci fixes (ssh-keygen and criu version bumps for almalinux 8 and fedora)
2025-04-23 12:28:16 +08:00
Kir Kolyshkin
21e70de6d8 ci: upgrade to criu-4.1-2 in Fedora
Package criu-4.1-1 has a known bug [1] which is fixed in criu-4.1-2 [2],
which is currently only available in updates-testing. Add a kludge to
install newer criu if necessary to fix CI.

This will not be needed in ~2 weeks once the new package is promoted to
updates.

[1]: https://github.com/checkpoint-restore/criu/issues/2650
[2]: https://bodhi.fedoraproject.org/updates/FEDORA-2025-d374d8ce17

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 3e3e04824d)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-04-22 19:45:57 -07:00
Kir Kolyshkin
a703687437 Unify and fix rootless key setup
For some reason, ssh-keygen is unable to write to /root even as root on
AlmaLinux 8:

	# id
	uid=0(root) gid=0(root) groups=0(root) context=system_u:system_r:initrc_t:s0
	# id -Z
	ls -ld /root
	# ssh-keygen -t ecdsa -N "" -f /root/rootless.key || cat /var/log/audit/audit.log
	Saving key "/root/rootless.key" failed: Permission denied

The audit.log shows:

> type=AVC msg=audit(1744834995.352:546): avc:  denied  { dac_override } for  pid=13471 comm="ssh-keygen" capability=1  scontext=system_u:system_r:ssh_keygen_t:s0 tcontext=system_u:system_r:ssh_keygen_t:s0 tclass=capability permissive=0
> type=SYSCALL msg=audit(1744834995.352:546): arch=c000003e syscall=257 success=no exit=-13 a0=ffffff9c a1=5641c7587520 a2=241 a3=180 items=0 ppid=4978 pid=13471 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="ssh-keygen" exe="/usr/bin/ssh-keygen" subj=system_u:system_r:ssh_keygen_t:s0 key=(null)␝ARCH=x86_64 SYSCALL=openat AUID="unset" UID="root" GID="root" EUID="root" SUID="root" FSUID="root" EGID="root" SGID="root" FSGID="root"

A workaround is to use /root/.ssh directory instead of just /root.

While at it, let's unify rootless user and key setup into a single place.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 87ae2f8466)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-04-22 19:28:25 -07:00
Kir Kolyshkin
6b108eb624 ci: install newer criu for almalinux-8
We are seeing a ton of flakes on the almalinux-8 CI job, all caused by criu's
inability to freeze a cgroup. This was worked around in criu [1], but
obviously we can't rely on a distro vendor to update the package.

Let's use a copr (thanks to Adrian Reber!)

[1]: https://github.com/checkpoint-restore/criu/pull/2545

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit b520f750ef)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-04-22 19:28:25 -07:00
Kir Kolyshkin
5082191d70 script/setup_host_fedora.sh: use bash arrays
This makes the code more robust and allows removing the
"shellcheck disable=SC2086" annotation.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 8e653e40c6)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-04-22 19:26:20 -07:00
Kir Kolyshkin
fd3bc5d0e2 script/setup_host_fedora.sh: remove -p from mkdir
1. There is no need to have -p option in mkdir here, since
   /home/rootless was already created by useradd above.

2. When there is no -p, there is no need to suppress the shellcheck
   warning (which looked like this):

> In script/setup_host_fedora.sh line 21:
> mkdir -m 0700 -p /home/rootless/.ssh
>       ^-- SC2174 (warning): When used with -p, -m only applies to the deepest directory.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit a76a1361b4)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-04-22 19:26:10 -07:00
Kir Kolyshkin
7d1df7b66b tests/int: rm some "shellcheck disable" annotations
Those are no longer needed with shellcheck v0.10.0 (possibly with an
earlier version, too, but I am too lazy to check that).

While at it, fix a typo in the comment.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit af386d1df1)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-04-22 19:26:10 -07:00
Kir Kolyshkin
4d1f79ca45 ci: bump shellcheck to v0.10.0
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit b48dd65114)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-04-22 19:26:02 -07:00
Kir Kolyshkin
e0f47b73dd Makefile: bump shfmt to v3.11.0
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 6e5ffb7cbc)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-04-22 19:26:02 -07:00
lfbzhm
244d3c5d19 Merge pull request #4720 from kolyshkin/1.2-4709
[1.2] runc pause/unpause/ps: get rid of excessive warning
2025-04-10 07:05:05 +08:00
Kir Kolyshkin
f6026143a4 runc pause/unpause/ps: get rid of excessive warning
This issue was originally reported in podman PR 25792.

When calling runc pause/unpause for an ordinary user, podman does not
provide the --systemd-cgroup option, and shouldUseRootlessCgroupManager
returns true. This results in a warning:

	$ podman pause sleeper
	WARN[0000] runc pause may fail if you don't have the full access to cgroups
	sleeper

Actually, it does not make sense to call shouldUseRootlessCgroupManager
at this point, because we already know if we're rootless or not, from
the container state.json (same for systemd).

Also, busctl binary is not available either in this context, so
shouldUseRootlessCgroupManager would not work properly.

Finally, it doesn't really matter if we use systemd or not, because we
use fs/fs2 manager to freeze/unfreeze, and it will return something like
EPERM (or tell that cgroups is not configured, for a true rootless
container).

So, let's only print the warning after pause/unpause failed,
if the error returned looks like a permission error.

Same applies to "runc ps".

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit c5ab4b6e30)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-04-09 09:32:38 -07:00
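
The "warn only after a failure that looks like a permission error" approach from f6026143a4 might look roughly like this. It is a sketch under assumptions: `permissionHint`, `errFreezeDenied`, and the hint wording are hypothetical, and the check uses the standard `errors.Is(err, os.ErrPermission)` test rather than whatever runc does internally.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// errFreezeDenied is a made-up error used only for this illustration.
var errFreezeDenied = fmt.Errorf("freeze: %w", os.ErrPermission)

// permissionHint returns a hint to print after a failed operation,
// or "" when the error does not look permission related.
func permissionHint(err error) string {
	if err != nil && errors.Is(err, os.ErrPermission) {
		return "the operation may have failed because you lack full access to cgroups"
	}
	return ""
}

func main() {
	// No hint is produced up front; it is derived from the error
	// only once the operation has actually failed.
	if hint := permissionHint(errFreezeDenied); hint != "" {
		fmt.Println("WARN:", hint)
	}
}
```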
Kir Kolyshkin
8fe7f17ad4 pause: refactor
This is to simplify code review for the next commit.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit fda034c9ec)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-04-09 09:32:38 -07:00
Akihiro Suda
1aa51988e1 Merge pull request #4714 from kolyshkin/1.2-4696
[1.2] criu: Add time namespace to container config after checkpoint/restore
2025-04-08 18:02:35 +09:00
Andrei Vagin
c248c8ecdf criu: Add time namespace to container config after checkpoint/restore
Since v3.14, CRIU always restores processes into a time namespace to
prevent backward jumps of monotonic and boottime clocks. This change
updates the container configuration to ensure that `runc exec` launches
new processes within the container's time namespace.

Fixes #2610

Signed-off-by: Andrei Vagin <avagin@gmail.com>
(cherry picked from commit b68cbdff34)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-04-07 17:09:08 -07:00
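
The config update described in c248c8ecdf amounts to making sure a time namespace entry is present after restore. The following is an illustrative sketch only: the `namespace` type and `ensureTimeNamespace` helper are hypothetical, and how runc actually records the namespace (for example, with or without a path) is not shown in the commit message.

```go
package main

import "fmt"

// namespace is a hypothetical stand-in for a container config namespace entry.
type namespace struct {
	Type string
	Path string
}

// ensureTimeNamespace returns nss with a "time" entry appended if none exists,
// so that later operations see the namespace in the container configuration.
func ensureTimeNamespace(nss []namespace) []namespace {
	for _, ns := range nss {
		if ns.Type == "time" {
			return nss
		}
	}
	return append(nss, namespace{Type: "time"})
}

func main() {
	cfg := []namespace{{Type: "pid"}, {Type: "mount"}}
	cfg = ensureTimeNamespace(cfg)
	fmt.Println(len(cfg), cfg[len(cfg)-1].Type)
}
```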
Akihiro Suda
1c50804572 Merge pull request #4683 from kolyshkin/1.2-ch-nits
[1.2] CHANGELOG: v1.2.6 formatting fixes
2025-03-21 09:14:52 +09:00
Kir Kolyshkin
6345aa2910 CHANGELOG: v1.2.6 formatting fixes
Fix headers

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-03-17 14:13:24 -07:00
Kir Kolyshkin
b44b3b6b0e Merge pull request #4674 from rata/release-1.2
Release 1.2.6
2025-03-17 12:27:37 -07:00
Rodrigo Campos
048d07adec VERSION: Switch back to dev
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2025-03-14 16:32:15 +01:00
Rodrigo Campos
e89a29929c VERSION: Release 1.2.6
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
v1.2.6
2025-03-14 11:54:49 +01:00
Akihiro Suda
b5ec91d1f8 Merge pull request #4678 from kolyshkin/1.2-4671
[1.2] .cirrus.yml: install less dependencies
2025-03-14 09:08:44 +09:00
Kir Kolyshkin
b582187ce9 .cirrus.yml: install less dependencies
In a nutshell:
 - use git-core instead of git;
 - do not install weak deps;
 - do not install docs.

This results in fewer packages to install:
 - 25 instead of 72 for almalinux-8
 - 24 instead of 90 for almalinux-9

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 1d9bea5378)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-03-13 11:19:27 -07:00
lfbzhm
9d0c86a72d Merge pull request #4668 from AkihiroSuda/cherrypick-4664-1.2
[1.2] CI: migrate Vagrant + Cirrus to Lima + GHA
2025-03-10 19:28:07 +08:00
Akihiro Suda
96f68384bb CI: migrate Vagrant + Cirrus to Lima + GHA
- Unlike proprietary Vagrant, Lima remains an open source project
- GHA now natively supports nested virt on Linux runners

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
(cherry picked from commit 135552e5e4)
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2025-03-09 02:49:55 +09:00