zishuo/runc

mirror of https://github.com/opencontainers/runc.git synced 2025-10-28 10:01:28 +08:00

Author	SHA1	Message	Date
Kir Kolyshkin	e8cf8783d1	libct/criuApplyCgroups: add a TODO I don't want to implement it now, because this might result in some new issues, but this is definitely something that is worth implementing. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2022-12-15 15:37:42 -08:00
Kir Kolyshkin	3438ef30b2	restore: fix --manage-cgroups-mode ignore on cgroup v2 When manage-cgroups-mode: ignore is used, criu still needs to know the cgroup path to work properly (see [1]). Revert "libct/criuApplyCgroups: don't set cgroup paths for v2" This reverts commit `d5c57dcea6`. [1]: https://github.com/checkpoint-restore/criu/issues/1793#issuecomment-1086675168 Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2022-12-15 15:37:42 -08:00
Radostin Stoyanov	fbce47a6b6	deps: bump github.com/checkpoint-restore/go-criu to 6.3.0 Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2022-11-01 10:08:14 +00:00
Prajwal S N	746f45807d	deps: bump go-criu to v6 The v6.0.0 release of go-criu has deprecated the `rpc` package in favour of the `crit` package. This commit provides the changes required to use this version in runc. Signed-off-by: Prajwal S N <prajwalnadig21@gmail.com>	2022-09-06 11:55:17 +05:30
Kir Kolyshkin	102b8abd26	libct: rm BaseContainer and Container interfaces The only implementation of these is linuxContainer. It does not make sense to have an interface with a single implementation, and we do not foresee other types of containers being added to runc. Remove BaseContainer and Container interfaces, moving their methods documentation to linuxContainer. Rename linuxContainer to Container. Adopt users from using interface to using struct. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2022-03-23 11:04:12 -07:00
Kir Kolyshkin	7cec81e060	libct: suppress strings.Title deprecation warning Function strings.Title is deprecated as of Go 1.18, because it does not handle some corner cases well enough. In this case, though, it is perfectly fine to use it since we have a single ASCII word as an argument, and strings.Title won't be removed until at least Go 2.0. Suppress the deprecation warning. The alternative is to not capitalize the namespace string; this will break restoring of a container checkpointed by an earlier version of runc. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2022-03-22 12:22:10 -07:00
lifubang	01f00e1fd5	ensure the path is a sub-cgroup path Signed-off-by: lifubang <lifubang@acmcoder.com>	2022-02-19 09:45:09 +08:00
Sebastiaan van Stijn	949111237a	Merge pull request #3303 from kolyshkin/labels libcontainer: optimize utils.SearchLabels	2022-02-16 16:37:00 +01:00
Kir Kolyshkin	dbd990d555	libct: rm intelrdt.Manager interface, NewIntelRdtManager Remove intelrdt.Manager interface, since we only have a single implementation, and do not expect another one. Rename intelRdtManager to Manager, and modify its users accordingly. Remove NewIntelRdtManager from factory. Remove IntelRdtfs. Instead, make intelrdt.NewManager return nil if the feature is not available. Remove TestFactoryNewIntelRdt as it is now identical to TestFactoryNew. Add internal function newManager to be used for tests (to make sure some testing is done even when the feature is not available in kernel/hardware). Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2022-02-03 17:33:03 -08:00
Kir Kolyshkin	9258eac072	libct/start: use execabs for newuidmap lookup Since we are looking up the path to newuidmap/newgidmap in one context, and executing those in another (libct/nsenter), it might make sense to use stricter rules for looking up the path to those binaries. Practically it means that if someone wants to use custom newuidmap and newgidmap binaries from $PATH, it would be impossible to use these from the current directory by means of PATH=.:$PATH; instead one would have to do something like PATH=$(pwd):$PATH. See https://go.dev/blog/path-security for background. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2022-02-03 17:33:00 -08:00
Kir Kolyshkin	39bd7b7217	libct: Container, Factory: rm newuidmap/newgidmap These were introduced in commit `d8b669400` back in 2017, with a TODO of "make binary names configurable". Apparently, everyone is happy with the hardcoded names. In fact, they are configurable (by prepending the PATH with a directory containing own version of newuidmap/newgidmap). Now, these binaries are only needed in a few specific cases (when rootless is set etc.), so let's look them up only when needed. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2022-02-03 11:40:29 -08:00
Kir Kolyshkin	630c0d7e8c	libct: Container, Factory: rm InitPath, InitArgs Those are always /proc/self/exe init, and it does not make sense to ever change these. More to say, if InitArgs option func (removed by this commit) is used to change these parameters, it will break things, since "init" is hardcoded elsewhere. Remove this. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2022-02-03 11:40:29 -08:00
Sebastiaan van Stijn	e4e2a9dda4	Merge pull request #3360 from danishprakash/remove-pausing libcontainer: remove "pausing" state	2022-02-01 23:40:31 +01:00
Akihiro Suda	e9190d3ae1	Merge pull request #3353 from kolyshkin/rm-criu-opt runc: remove --criu option	2022-02-01 08:28:14 +09:00
danishprakash	7346dda332	libcontainer: remove "pausing" state Signed-off-by: danishprakash <grafitykoncept@gmail.com>	2022-01-29 14:27:11 +05:30
Kir Kolyshkin	6e1d476aad	runc: remove --criu option This was introduced in an initial commit, back in the day when criu was a highly experimental thing. Today it's not; most users who need it have it packaged by their distro vendor. The usual way to run a binary is to look it up in directories listed in $PATH. This is flexible enough and allows for multiple scenarios (custom binaries, extra binaries, etc.). This is the way criu should be run. Make --criu a hidden option (thus removing it from help). Remove the option from man pages, integration tests, etc. Remove all traces of CriuPath from data structures. Add a warning that --criu is ignored and will be removed. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2022-01-26 20:25:56 -08:00
Kir Kolyshkin	bb6a838876	libct: initContainer: rename Id -> ID Since the next commit is going to touch this structure, our CI (lint-extra) is about to complain about improperly named field: > Warning: var-naming: struct field ContainerId should be ContainerID (revive) Make it happy. Brought to you by gopls rename. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2022-01-26 18:59:47 -08:00
Kir Kolyshkin	dffb8db7e1	libct: handleCriuConfigurationFile: use utils.SearchLabels The utils.Annotations was used here before only because it made it possible to distinguish between "key not found" and "empty value" cases. With the previous commit, utils.SearchLabels can do that, and so it makes sense to use it. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2022-01-26 14:01:11 -08:00
Aleksa Sarai	d72d057ba7	runc init: avoid netlink message length overflows When writing netlink messages, it is possible to have a byte array larger than UINT16_MAX which would result in the length field overflowing and allowing user-controlled data to be parsed as control characters (such as creating custom mount points, changing which set of namespaces to allow, and so on). Co-authored-by: Kir Kolyshkin <kolyshkin@gmail.com> Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com> Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>	2021-12-03 16:07:40 +11:00
Aleksa Sarai	dde509df4e	specconv: do not permit null bytes in mount fields Using null bytes as control characters for sending strings via netlink opens us up to a user explicitly putting a null byte in a mount string (which JSON will happily let you do) and then causing us to open a mount path different to the one expected. In practice this is more of an issue in an environment such as Kubernetes where you may have path-based access control policies (which are more susceptible to these kinds of flaws). Found by Google Project Zero. Fixes: `9c444070ec` ("Open bind mount sources from the host userns") Reported-by: Felix Wilhelm <fwilhelm@google.com> Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>	2021-11-19 11:41:05 +11:00
Akihiro Suda	4d17654479	Merge pull request #2576 from kinvolk/alban/userns-2484-take2 Open bind mount sources from the host userns	2021-10-28 14:50:33 +09:00
Kir Kolyshkin	5516294172	Remove io/ioutil use See https://golang.org/doc/go1.16#ioutil Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-10-14 13:46:02 -07:00
Kir Kolyshkin	6a4f4a6a37	libcontainer/ignoreTerminateErrors: simplify for Go 1.16+ One less TODO in the code, yay! Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-10-14 13:46:02 -07:00
Alban Crequy	9c444070ec	Open bind mount sources from the host userns The source of the bind mount might not be accessible in a different user namespace because a component of the source path might not be traversed under the users and groups mapped inside the user namespace. This caused errors such as the following: # time="2020-06-22T13:48:26Z" level=error msg="container_linux.go:367: starting container process caused: process_linux.go:459: container init caused: rootfs_linux.go:58: mounting \"/tmp/busyboxtest/source-inaccessible/dir\" to rootfs at \"/tmp/inaccessible\" caused: stat /tmp/busyboxtest/source-inaccessible/dir: permission denied" To solve this problem, this patch performs the following: 1. in nsexec.c, it opens the source path in the host userns (so we have the right permissions to open it) but in the container mntns (so the kernel cross mntns mount check lets us mount it later: https://github.com/torvalds/linux/blob/v5.8/fs/namespace.c#L2312). 2. in nsexec.c, it passes the file descriptors of the source to the child process with SCM_RIGHTS. 3. In runc-init in Golang, it finishes the mounts while inside the userns even without access to some components of the source paths. Passing the fds with SCM_RIGHTS is necessary because once the child process is in the container mntns, it is already in the container userns so it cannot temporarily join the host mntns. This patch uses the existing mechanism with _LIBCONTAINER_* environment variables to pass the file descriptors from runc to runc init. This patch uses the existing mechanism with the Netlink-style bootstrap to pass information about the list of source mounts to nsexec.c. Rootless containers don't use this bind mount sources fdpassing mechanism because we can't setns() to the target mntns in a rootless container (we don't have the privileges when we are in the host userns). This patch takes care of using O_CLOEXEC on mount fds, and closes them early. Fixes: #2484. Signed-off-by: Alban Crequy <alban@kinvolk.io> Signed-off-by: Rodrigo Campos <rodrigo@kinvolk.io> Co-authored-by: Rodrigo Campos <rodrigo@kinvolk.io>	2021-10-12 15:13:45 +02:00
Kir Kolyshkin	0202c398ff	runc exec: implement --cgroup In some setups, multiple cgroups are used inside a container, and sometimes there is a need to execute a process in a particular sub-cgroup (in case of cgroup v1, for a particular controller). This is what this commit implements. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-09-27 10:25:42 -07:00
Kir Kolyshkin	03244ef2cf	Merge pull request #3217 from kolyshkin/delete-paused runc delete -f: fix for cg v1 + paused container	2021-09-20 10:51:40 -07:00
Adrian Reber	43b36dc4ac	Support changing of lsm mount context on restore Wire through CRIU's support to change the mount context on restore. This is especially useful if restoring a container in a different pod. Single container restore uses the same SELinux process label and same mount context as during checkpointing. If a container is being restored into an existing pod the process label and the mount context needs to be changed to the context of the pod. Changing process label on restore is already supported by runc. This patch adds the possibility to change the mount context. Signed-off-by: Adrian Reber <areber@redhat.com>	2021-09-20 10:01:16 +02:00
Kir Kolyshkin	6806b2c1c4	runc delete -f: fix for cg v1 + paused container runc delete -f is not working for a paused container, since in cgroup v1 SIGKILL does nothing if a process is frozen (unlike cgroup v2, in which you can kill a frozen process with a fatal signal). Theoretically, we only need this for v1, but doing it for v2 as well is OK. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-09-15 14:55:14 -07:00
Aleksa Sarai	8bf032602a	merge branch 'pr-3047' Liu Hua (1): checkpoint: resolve symlink for external bind mount(fix ci broken) LGTMs: kolyshkin cyphar	2021-09-09 14:24:26 +10:00
Akihiro Suda	bd75bc2dc6	Merge pull request #3176 from kolyshkin/rm-config-error-alt libct/error.go: rm ConfigError (alt)	2021-09-02 14:34:32 +09:00
Kir Kolyshkin	9ff64c3d97	*: rm redundant linux build tag For files that end with _linux.go or _linux_test.go, there is no need to specify linux build tag, as it is assumed from the file name. In addition, rename libcontainer/notify_linux_v2.go -> libcontainer/notify_v2_linux.go for the file name to make sense. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-08-30 20:15:00 -07:00
Kir Kolyshkin	62ec6dc973	Merge pull request #2920 from marquiz/devel/rdt libcontainer/intelrdt: support ClosID parameter	2021-08-30 19:36:03 -07:00
Kir Kolyshkin	538ba846dd	libct/error.go: rm ConfigError ConfigError was added by commit `e918d02139`, while removing runc own error system, to preserve a way for a libcontainer user to distinguish between a configuration error and something else. The way ConfigError is implemented requires a different type of check (compared to all other errors defined by error.go). An attempt was made to rectify this, but the resulting code became even more complicated. As no one is using this functionality (of differentiating a "bad config" type of error from other errors), let's just drop the ConfigError type. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-08-23 18:56:08 -07:00
Liu Hua	74ae9e0fc9	checkpoint: resolve symlink for external bind mount(fix ci broken) runc resolves symlink before doing bind mount. So we should save original path while formatting CriuReq for dump and restore. "checkpoint: resolve symlink for external bind mount" was previously merged as da22625f6986f0ef196eaa1f8bb6adce098f0fb7 (PR 2902), and reverted in commit 70fdc0573dced3464e9c31d674559f77c1de3973 (PR 3043) due to behavior changes caused by commit 0ca91f44f1664da834bc61115a849b56d22f595f (Fixes: CVE-2021-30465). Signed-off-by: Liu Hua <weldonliu@tencent.com>	2021-08-19 18:42:42 +08:00
Kir Kolyshkin	75761bccf7	Fix codespell warnings, add codespell to ci The two exceptions I had to add to codespellrc are: - CLOS (used by intelrdt); - creat (syscall name used in tests/integration/testdata/seccomp_*.json). Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-08-17 16:12:35 -07:00
Markus Lehtonen	7296dc1712	libcontainer/intelrdt: refactor clos path handling Simplify the code and make path a property of the container (via intelRdtManager). Signed-off-by: Markus Lehtonen <markus.lehtonen@intel.com>	2021-08-09 15:58:03 +03:00
Kir Kolyshkin	be1d5f83c0	ci: enable unconvert linter, fix its warnings Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-07-07 10:42:48 -07:00
Akihiro Suda	5547b5774f	Merge pull request #3033 from kolyshkin/rm-own-errors libcontainer: rm own error system	2021-07-01 13:47:27 +09:00
Kir Kolyshkin	70fdc0573d	Revert "checkpoint: resolve symlink for external bind mount" This reverts commit `da22625f69` (PR 2902). Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-06-24 20:52:43 -07:00
Mrunal Patel	1079288bef	Merge pull request #2902 from liusdu/checkpoint checkpoint: resolve symlink for external bind mount	2021-06-24 22:52:02 -04:00
Mrunal Patel	245fe2b678	Merge pull request #3029 from liusdu/work checkpoint: set default work-dir to image-path	2021-06-24 22:44:48 -04:00
Kir Kolyshkin	e918d02139	libcontainer: rm own error system This removes libcontainer's own error wrapping system, consisting of a few types and functions, aimed at typization, wrapping and unwrapping of errors, as well as saving error stack traces. Since Go 1.13 now provides its own error wrapping mechanism and a few related functions, it makes sense to switch to it. While doing that, improve some error messages so that they start with "error", "unable to", or "can't". A few things that are worth mentioning: 1. We lose stack traces (which were never shown anyway). 2. Users of libcontainer that relied on particular errors (like ContainerNotExists) need to switch to using errors.Is with the new errors defined in error.go. 3. encoding/json is unable to unmarshal the built-in error type, so we have to introduce initError and wrap the errors into it (basically passing the error as a string). This is the same as it was before, just a tad simpler (actually the initError is a type that got removed in commit afa844311; also suddenly ierr variable name makes sense now). Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-06-24 10:21:04 -07:00
Kir Kolyshkin	a7cfb23b88	*: stop using pkg/errors Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-06-22 16:09:47 -07:00
Kir Kolyshkin	56e478046a	*: ignore errorlint warnings about unix.* errors Errors from unix.* are always bare and thus can be used directly. Add //nolint:errorlint annotation to ignore errors such as these: libcontainer/system/xattrs_linux.go:18:7: comparing with == will fail on wrapped errors. Use errors.Is to check for a specific error (errorlint) case errno == unix.ERANGE: ^ libcontainer/container_linux.go:1259:9: comparing with != will fail on wrapped errors. Use errors.Is to check for a specific error (errorlint) if e != unix.EINVAL { ^ libcontainer/rootfs_linux.go:919:7: comparing with != will fail on wrapped errors. Use errors.Is to check for a specific error (errorlint) if err != unix.EINVAL && err != unix.EPERM { ^ libcontainer/rootfs_linux.go:1002:4: switch on an error will fail on wrapped errors. Use errors.Is to check for specific errors (errorlint) switch err { ^ Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-06-22 16:09:47 -07:00
Kir Kolyshkin	7be93a66b9	*: fmt.Errorf: use %w when appropriate This should result in no change when the error is printed, but make the errors returned unwrappable, meaning errors.As and errors.Is will work. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-06-22 16:09:47 -07:00
Kir Kolyshkin	36aefad45d	libct: wrap unix.Mount/Unmount errors Errors returned by unix are bare. In some cases it's impossible to find out what went wrong because there is not enough context. Add a mountError type (mostly copy-pasted from github.com/moby/sys/mount), and mount/unmount helpers. Use these where appropriate, and convert error checks to use errors.Is. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-06-22 16:09:37 -07:00
Kir Kolyshkin	627a06ad92	Replace fmt.Errorf w/o %-style to errors.New Using fmt.Errorf for errors that do not have %-style formatting directives is an overkill. Switch to errors.New. Found by git grep fmt.Errorf \| grep -v ^vendor \| grep -v '%' Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-06-22 11:42:07 -07:00
Liu Hua	85aabe233e	C/R: let criu use its default if --work-path is not set Now runc puts dump/restore logs in c.root by default, which will be deleted when the container exits. So if checkpointing/restoring fails, we cannot get these logs and analyze why. This patch lets criu use its default if --work-path is not set: - Use WorkDirectory found in criu's configfile. - Use ImageDirectory. Signed-off-by: Liu Hua <weldonliu@tencent.com>	2021-06-16 20:47:25 +08:00
Adrian Reber	535f25c44f	Allow restoring with a different LSM profile Restoring an SELinux enabled container with Podman will result in a container with exactly the same SELinux process labels as during checkpointing. CRIU takes care of all the process labels. Restoring multiple copies of a checkpointed container will result in all containers having the same SELinux process labels, which might be undesired. When looking at Pods, all containers in a Pod share the process label of the infrastructure container. To restore a container into an existing Pod it is necessary to tell CRIU to restore the container with the infrastructure container process label. CRIU has supported setting different process labels using --lsm-profile for a long time and this just passes the process label information from runc to CRIU. Unfortunately CRIU has a bug as no one was using the --lsm-profile option so this change requires the upcoming CRIU version 3.16. Signed-off-by: Adrian Reber <areber@redhat.com>	2021-06-07 18:05:24 +02:00
Kir Kolyshkin	e6048715e4	Use gofumpt to format code gofumpt (mvdan.cc/gofumpt) is a fork of gofmt with stricter rules. Brought to you by git ls-files \*.go \| grep -v ^vendor/ \| xargs gofumpt -s -w Looking at the diff, all these changes make sense. Also, replace gofmt with gofumpt in golangci.yml. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-06-01 12:17:27 -07:00


374 Commits