Now that Go 1.25 is out, let's switch to go 1.24.0 as the minimum
supported version, drop Go 1.23, and add Go 1.25 to the CI matrix.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Since GHA now provides ARM, we can switch away from actuated.
Many thanks to @alexellis (@self-actuated) for being the sponsor of this
project.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This allows building a 17% smaller runc binary by not compiling in
checkpoint/restore support.
It turns out that the google.golang.org/protobuf package, used by go-criu,
is quite big, and the Go linker can't drop unused code if reflection is
used anywhere in the code.
Currently there's no alternative to using protobuf in go-criu, and since
not all users use c/r, let's provide them an option for a smaller
binary.
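As a rough illustration of how such an opt-out can be wired up with a Go
build constraint, here is a minimal sketch. The file names and the
checkpointSupported helper are hypothetical and are not taken from the runc
source; only the `runc_nocriu` tag name comes from this change.

```go
//go:build !runc_nocriu

// File: checkpoint_criu.go (hypothetical name). Compiled by default and
// skipped when building with: go build -tags runc_nocriu
package main

import "fmt"

// checkpointSupported reports whether checkpoint/restore code was compiled in.
// A sibling file guarded by "//go:build runc_nocriu" (hypothetical) would
// define the same function returning false and would not import go-criu,
// so protobuf would not be linked into the binary at all.
func checkpointSupported() bool { return true }

func main() {
	fmt.Println("checkpoint/restore compiled in:", checkpointSupported())
}
```

Because the constraint is evaluated per file, the code that pulls in the big
dependency is never seen by the compiler when the tag is set, so the linker
does not have to prove that it is unused.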
For reference, here are the 10 biggest vendored packages, as reported
by gsa[1]:
$ gsa runc | grep vendor | head
│ 8.59% │ google.golang.org/protobuf │ 1.3 MB │ vendor │
│ 5.76% │ github.com/opencontainers/runc │ 865 kB │ vendor │
│ 4.05% │ github.com/cilium/ebpf │ 608 kB │ vendor │
│ 2.86% │ github.com/godbus/dbus/v5 │ 429 kB │ vendor │
│ 1.25% │ github.com/urfave/cli │ 188 kB │ vendor │
│ 0.90% │ github.com/vishvananda/netlink │ 135 kB │ vendor │
│ 0.59% │ github.com/sirupsen/logrus │ 89 kB │ vendor │
│ 0.56% │ github.com/checkpoint-restore/go-criu/v6 │ 84 kB │ vendor │
│ 0.51% │ golang.org/x/sys │ 76 kB │ vendor │
│ 0.47% │ github.com/seccomp/libseccomp-golang │ 71 kB │ vendor │
And here is the total binary size saving when `runc_nocriu` is used.
For non-stripped binaries:
$ gsa runc-cr runc-nocr | tail -3
│ -17.04% │ runc-cr │ 15 MB │ 12 MB │ -2.6 MB │
│ │ runc-nocr │ │ │ │
└─────────┴──────────────────────────────────────────┴──────────┴──────────┴─────────┘
And for stripped binaries:
│ -17.01% │ runc-cr-stripped │ 11 MB │ 8.8 MB │ -1.8 MB │
│ │ runc-nocr-stripped │ │ │ │
└─────────┴──────────────────────────────────────────┴──────────┴──────────┴─────────┘
[1]: https://github.com/Zxilly/go-size-analyzer
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Because we have the overlay solution, we can drop the runc-dmz binary
solution since it has too many limitations.
Signed-off-by: lifubang <lifubang@acmcoder.com>
Add this new make variable so users can specify build information
without modifying the runc version or the source code.
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
Go 1.20 was released in February 2023 and has been unsupported since
February 2024. Time to move on.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Go 1.23 includes a fix (https://go.dev/cl/587919) so go1.23.x can be
used. This fix is also backported to 1.22.4, so go1.22.x can also be
used (when x >= 4). Finally, for glibc >= 2.32 it doesn't really matter.
Add a note about Go 1.22.x >= 1.22.4 to README as well.
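The version rule above can be sketched as a small check. This is an
illustration only: goVersionHasFix is a hypothetical helper, it does not
handle pre-release strings such as release candidates, and it does not model
the glibc >= 2.32 case.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// goVersionHasFix reports whether a Go release string such as "go1.22.4"
// contains the fix from https://go.dev/cl/587919: any 1.23+ release, or
// 1.22.x with x >= 4. Versions that cannot be parsed are reported as not fixed.
func goVersionHasFix(v string) bool {
	parts := strings.Split(strings.TrimPrefix(v, "go"), ".")
	if len(parts) < 2 {
		return false
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return false
	}
	minor, err := strconv.Atoi(parts[1])
	if err != nil {
		return false
	}
	patch := 0
	if len(parts) > 2 {
		if patch, err = strconv.Atoi(parts[2]); err != nil {
			return false
		}
	}
	switch {
	case major > 1:
		return true
	case major < 1:
		return false
	case minor >= 23:
		return true
	case minor == 22:
		return patch >= 4
	default:
		return false
	}
}

func main() {
	for _, v := range []string{"go1.22.3", "go1.22.4", "go1.23.0"} {
		fmt.Printf("%s has fix: %v\n", v, goVersionHasFix(v))
	}
}
```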
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Now that runc-dmz is opt-in, we no longer need to try to detect whether
SELinux would cause issues for us. We can also remove the
special-purpose build-tag we added.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
If it is compiled in, the user needs to opt in with this env variable to
use it.
While at it, remove RUNC_DMZ=legacy as that is now the
default.
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
Add a workaround for a problem of older container-selinux not allowing
runc to use the dmz feature. If runc sees that SELinux is in enforcing mode
and the container's SELinux label is set, it disables dmz.
Add a build tag, runc_dmz_selinux_nocompat, which disables the workaround.
Newer distros that ship container-selinux >= 2.224.0 (currently CentOS
Stream 8 and 9, RHEL 8 and 9, and Fedora 38+) may build runc with this
build tag set to benefit from dmz working with SELinux.
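The decision described above boils down to the following sketch. The
function name and parameters are hypothetical; in particular, nocompat
stands in for the runc_dmz_selinux_nocompat build tag, which in a real build
is a compile-time choice rather than a runtime argument.

```go
package main

import "fmt"

// dmzAllowedBySELinux mirrors the workaround: dmz is disabled when SELinux
// is enforcing and the container has an SELinux label, unless the workaround
// itself has been disabled (nocompat, modelling the build tag).
func dmzAllowedBySELinux(enforcing bool, containerLabel string, nocompat bool) bool {
	if nocompat {
		return true
	}
	return !(enforcing && containerLabel != "")
}

func main() {
	label := "system_u:system_r:container_t:s0"
	fmt.Println("workaround active:", dmzAllowedBySELinux(true, label, false))
	fmt.Println("built with nocompat:", dmzAllowedBySELinux(true, label, true))
}
```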
Document the build tag in the top-level and libct/dmz READMEs.
Use the build tag in our CI builds for CentOS Stream 9 and Fedora 38,
as they already have container-selinux 2.224.0 available in updates.
Add a TODO to use the build tag for CentOS Stream 8 once it has
container-selinux updated.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This really isn't ideal but it can be used to avoid the largest issues
with the memfd-based runc binary protection. There are several caveats
with using this tool, see the help page for the new binary for details.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
The idea is to remove the need for cloning the entire runc binary by
replacing the final execve() call of the container process with an
execve() call to a clone of a small C binary which just does an execve()
of its arguments.
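To show how little such a trampoline needs to do, here is the same idea
rendered in Go. This is only an illustration: the real runc-dmz is a small C
program, and the execTarget helper below is a hypothetical name introduced
for this sketch.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"syscall"
)

// execTarget splits process arguments (shaped like os.Args) into the
// program to execute and the argv that program should see.
func execTarget(args []string) (string, []string, error) {
	if len(args) < 2 {
		return "", nil, errors.New("usage: dmz <program> [args...]")
	}
	return args[1], args[1:], nil
}

func main() {
	path, argv, err := execTarget(os.Args)
	if err != nil {
		// A real wrapper would exit non-zero here.
		fmt.Fprintln(os.Stderr, err)
		return
	}
	// syscall.Exec wraps execve(2). It does not search PATH, and on
	// success it never returns because the process image is replaced.
	if err := syscall.Exec(path, argv, os.Environ()); err != nil {
		fmt.Fprintln(os.Stderr, "exec:", err)
		os.Exit(1)
	}
}
```

Since the wrapper is replaced by the target program right away, only the tiny
wrapper binary has to be cloned, not the whole runc binary.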
This provides similar protection against CVE-2019-5736 but without
requiring a >10MB binary copy for each "runc init". When compiled with
musl, runc-dmz is 13kB (though unfortunately with glibc, it is 1.1MB
which is still quite large).
It should be noted that there is still a window where the container
processes could get access to the host runc binary, but because we set
ourselves as non-dumpable the container would need CAP_SYS_PTRACE (which
is not enabled by default in Docker) in order to get around the
proc_fd_access_allowed() checks. In addition, since Linux 4.10[1] the
kernel blocks access entirely for user namespaced containers in this
scenario. For those cases we cannot use runc-dmz, but most containers
won't have this issue.
This new runc-dmz binary can be opted out of at compile time by setting
the "runc_nodmz" build tag, and at runtime by setting the RUNC_DMZ=legacy
environment variable. In both cases, runc will fall back to the classic
/proc/self/exe-based cloning trick. If /proc/self/exe is already a
sealed memfd (namely if the user is using contrib/cmd/memfd-bind to
create a persistent sealed memfd for runc), neither runc-dmz nor
/proc/self/exe cloning will be used because they are not necessary.
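The fallback order described in the paragraph above can be summarised as
follows. This is a sketch under assumptions: the type, constant, and
function names are hypothetical, the sealed-memfd check is passed in as a
boolean rather than detected, and the privilege-based exclusions are not
modelled.

```go
package main

import (
	"fmt"
	"os"
)

type strategy string

const (
	useAsIs    strategy = "no clone needed (already a sealed memfd)"
	useDmz     strategy = "clone runc-dmz"
	useSelfExe strategy = "clone /proc/self/exe"
)

// pickStrategy chooses how the runc binary is protected.
// dmzCompiledIn is false when built with the runc_nodmz build tag;
// runcDmzEnv is the value of the RUNC_DMZ environment variable.
func pickStrategy(selfExeSealed, dmzCompiledIn bool, runcDmzEnv string) strategy {
	switch {
	case selfExeSealed:
		return useAsIs
	case !dmzCompiledIn, runcDmzEnv == "legacy":
		return useSelfExe
	default:
		return useDmz
	}
}

func main() {
	fmt.Println(pickStrategy(false, true, os.Getenv("RUNC_DMZ")))
}
```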
[1]: bfedb58925
Co-authored-by: lifubang <lifubang@acmcoder.com>
Signed-off-by: lifubang <lifubang@acmcoder.com>
[cyphar: address various review nits]
[cyphar: fix runc-dmz cross-compilation]
[cyphar: embed runc-dmz into runc binary and clone in Go code]
[cyphar: make runc-dmz optional, with fallback to /proc/self/exe cloning]
[cyphar: do not use runc-dmz when the container has certain privs]
Co-authored-by: Aleksa Sarai <cyphar@cyphar.com>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
In order to allow any of the maintainers to cut releases for runc,
create a keyring file that distributions can use to verify that releases
are signed by one of the maintainers.
The format matches the gpg-offline format used by openSUSE packaging,
but it can be easily imported with "gpg --import" so any distribution
should be able to handle this keyring format without issues.
Each key includes the GitHub handle of the associated user. There isn't
any way for this information to be automatically verified (outside of
using something like keybase.io) but since all changes of this file need
to be approved by maintainers this is okay for now.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Require go 1.17 from now on, since go 1.16 is no longer supported.
Drop go1.16 compatibility.
NOTE we also have to install go 1.18 from Vagrantfile, because
Fedora 35 comes with Go 1.16.x which can't be used.
Note the changes to go.mod and vendor are due to
https://go.dev/doc/go1.17#tools
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
What used to be godoc.org is now pkg.go.dev, and while the old URLs
still work, they might be broken in the future.
Updated badges are generated via https://pkg.go.dev/badge/
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Go 1.15 has not been supported since the Go 1.17 release (16 Aug 2021), and some
packages that we use already require Go 1.16+ (notably,
github.com/cilium/ebpf v0.7.0).
Let's require Go 1.16+.
Remove Go version requirement from README when describing dependencies,
since it is no longer needed:
$ GO=go1.15.15 make vendor
go1.15.15 mod tidy
go mod tidy: go.mod file indicates go 1.16, but maximum supported version is 1.15
make: *** [Makefile:141: vendor] Error 1
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This was added by commit 5aa82c950 back in the day when we thought
runc was going to be cross-platform. It's very clear now that it's a
Linux-only package.
While at it, further clarify in README that we're Linux-only.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
"`runc` X.Y.Z should implement the X.Y version of the specification." is no longer correct.
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
This reverts commit d0cbef576f.
Docker/Moby still builds runc with Go 1.13, so we should still support
Go 1.13.
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
This is a somewhat radical approach to deal with kernel memory.
Per-cgroup kernel memory limiting was always problematic. A few
examples:
- older kernels had bugs and were even oopsing sometimes (best example
is RHEL7 kernel);
- kernel is unable to reclaim the kernel memory so once the limit is
hit a cgroup is toasted;
- some kernel memory allocations don't allow failing.
In addition to that,
- users don't have a clue about how to set kernel memory limits
(as the concept is much more complicated than e.g. [user] memory);
- different kernels might have different kernel memory usage,
which is sort of unexpected;
- cgroup v2 does not have a [dedicated] kmem limit knob, and thus
runc silently ignores kernel memory limits for v2;
- kernel v5.4 made cgroup v1 kmem.limit obsolete (see
https://github.com/torvalds/linux/commit/0158115f702b).
In view of all this, and as the runtime-spec lists memory.kernel
and memory.kernelTCP as OPTIONAL, let's ignore kernel memory
limits (for cgroup v1, same as we're already doing for v2).
This should result in fewer bugs and a better user experience.
The only bad side effect from it might be that stat can show kernel
memory usage as 0 (since the accounting is not enabled).
[v2: add a warning in specconv that limits are ignored]
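A minimal sketch of the "ignore and warn" behaviour follows. The
memoryLimits struct is a hypothetical, trimmed-down stand-in for the memory
section of a runtime-spec config, and both the dropKernelLimits helper and
its warning text are made up for illustration; they are not the specconv
code.

```go
package main

import "fmt"

// memoryLimits is a hypothetical stand-in for the memory section of a
// runtime-spec config; only the fields relevant here are shown.
type memoryLimits struct {
	Limit     *int64 // regular (user) memory limit, left untouched
	Kernel    *int64 // memory.kernel, OPTIONAL in the runtime-spec
	KernelTCP *int64 // memory.kernelTCP, OPTIONAL in the runtime-spec
}

// dropKernelLimits clears any kernel memory limits and returns a warning
// to log, or an empty string if there was nothing to ignore.
func dropKernelLimits(m *memoryLimits) string {
	if m == nil || (m.Kernel == nil && m.KernelTCP == nil) {
		return ""
	}
	m.Kernel, m.KernelTCP = nil, nil
	return "kernel memory limits are ignored"
}

func main() {
	kmem := int64(64 << 20)
	m := &memoryLimits{Kernel: &kmem}
	if w := dropKernelLimits(m); w != "" {
		fmt.Println("warning:", w)
	}
}
```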
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
1. Move docs/systemd-properties.md to docs/systemd.md
2. Document the cgroupsPath to systemd unit name and slice conversion
rules, as well as mapping of OCI runtime spec resource limits to
systemd unit properties.
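For illustration, the cgroupsPath conversion can be sketched like this,
assuming the "slice:prefix:name" form used with the systemd cgroup driver: a
name ending in ".slice" is used as the unit name as-is, anything else
becomes "prefix-name.scope". The function name is hypothetical and the code
is not copied from runc; defaults for an empty cgroupsPath are not covered.

```go
package main

import (
	"fmt"
	"strings"
)

// unitFromCgroupsPath converts "slice:prefix:name" into the parent slice
// and the systemd unit name, under the assumptions stated above.
func unitFromCgroupsPath(p string) (string, string, error) {
	parts := strings.Split(p, ":")
	if len(parts) != 3 {
		return "", "", fmt.Errorf("expected slice:prefix:name, got %q", p)
	}
	slice, prefix, name := parts[0], parts[1], parts[2]
	if strings.HasSuffix(name, ".slice") {
		// The caller explicitly asked for a slice unit.
		return slice, name, nil
	}
	// Otherwise a scope unit is created.
	return slice, prefix + "-" + name + ".scope", nil
}

func main() {
	slice, unit, err := unitFromCgroupsPath("machine.slice:runc:mycontainer")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("slice:", slice, "unit:", unit)
}
```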
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Since go 1.14, mod=vendor is used automatically. Go 1.16 is now
released, and the minimum supported go version is 1.15.
As per commit fbeed5228, remove the go 1.13 workaround.
Fix README to require go 1.14.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
The apparmor tag was introduced in a01ed80 (2014) to make the cgo dependency
on libapparmor optional.
However, the cgo dependency was removed in db093f6 (2017), so it is no
longer meaningful to keep the apparmor build tag.
Close #2704
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
After a lot of refactoring, our cgroup v1 and v2 drivers now have the same level of implementation quality,
so we can move the v2 driver out of experimental.
Close #2663
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>