zishuo/runc

mirror of https://github.com/opencontainers/runc.git synced 2025-11-02 20:04:01 +08:00

Author	SHA1	Message	Date
Sebastiaan van Stijn	9b60a93cf3	libcontainer/userns: migrate to github.com/moby/sys/userns The userns package was moved to the moby/sys/userns module at commit `3778ae603c`. This patch deprecates the old location, and adds it as an alias for the moby/sys/userns package. Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2024-10-09 22:20:25 +08:00
Aleksa Sarai	7c71a22705	rootfs: remove --no-mount-fallback and finally fix MS_REMOUNT The original reasoning for this option was to avoid having mount options be overwritten by runc. However, adding command-line arguments has historically been a bad idea because it forces strict-runc-compatible OCI runtimes to copy out-of-spec features directly from runc and these flags are usually quite difficult for users to enable when using runc through several layers of engines and orchestrators. A far more preferable solution is to have a heuristic which detects whether copying the original mount's mount options would override an explicit mount option specified by the user. In this case, we should return an error. You only end up in this path in the userns case, if you have a bind-mount source with locked flags. During the course of writing this patch, I discovered that several aspects of our handling of flags for bind-mounts left much to be desired. We have completely botched the handling of explicitly cleared flags since commit `97f5ee4e6a` ("Only remount if requested flags differ from current"), with our behaviour only becoming increasingly weird with `50105de1d8` ("Fix failure with rw bind mount of a ro fuse") and `da780e4d27` ("Fix bind mounts of filesystems with certain options set"). In short, we would only clear flags explicitly requested by the user purely by chance, in ways that really should've been reported to us by now. The most egregious is that mounts explicitly marked "rw" were actually mounted "ro" if the bind-mount source was "ro" and no other special flags were included. In addition, our handling of atime was completely broken -- mostly due to how subtle the semantics of atime are on Linux. Unfortunately, while the runtime-spec requires us to implement mount(8)'s behaviour, several aspects of the util-linux mount(8)'s behaviour are broken and thus copying them makes little sense. 
Since the runtime-spec behaviour for this case (should mount options for a "bind" mount use the "mount --bind -o ..." or "mount --bind -o remount,..." semantics? Is the fallback code we have for userns actually spec-compliant?) and the mount(8) behaviour (see [1]) are not well-defined, this commit simply fixes the most obvious aspects of the behaviour that are broken while keeping the current spirit of the implementation. NOTE: The handling of atime in the base case is left for a future PR to deal with. This means that the atime of the source mount will be silently left alone unless the fallback path needs to be taken, and any flags not explicitly set will be cleared in the base case. Whether we should always be operating as "mount --bind -o remount,..." (where we default to the original mount source flags) is a topic for a separate PR and (probably) associated runtime-spec PR. So, to resolve this: * We store which flags were explicitly requested to be cleared by the user, so that we can detect whether the userns fallback path would end up setting a flag the user explicitly wished to clear. If so, we return an error because we couldn't fulfil the configuration settings. * Revert `97f5ee4e6a` ("Only remount if requested flags differ from current"), as missing flags do not mean we can skip MS_REMOUNT (in fact, missing flags are how you indicate a flag needs to be cleared with mount(2)). The original purpose of the patch was to fix the userns issue, but as mentioned above the correct mechanism is to do a fallback mount that copies the lockable flags from statfs(2). * Improve handling of atime in the fallback case by: - Correctly handling the returned flags in statfs(2). - Implement the MNT_LOCK_ATIME checks in our code to ensure we produce errors rather than silently producing incorrect atime mounts. 
* Improve the tests so we correctly detect all of these contingencies, including a general "bind-mount atime handling" test to ensure that the behaviour described here is accurate. This change also inlines the remount() function -- it was only ever used for the bind-mount remount case, and its behaviour is very bind-mount specific. [1]: https://github.com/util-linux/util-linux/issues/2433 Reverts: `97f5ee4e6a` ("Only remount if requested flags differ from current") Fixes: `50105de1d8` ("Fix failure with rw bind mount of a ro fuse") Fixes: `da780e4d27` ("Fix bind mounts of filesystems with certain options set") Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>	2023-10-24 17:28:25 +11:00
Ruediger Pluem	da780e4d27	Fix bind mounts of filesystems with certain options set Currently bind mounts of filesystems with nodev, nosuid, noexec, noatime, relatime, strictatime, nodiratime options set fail in rootless mode if the same options are not set for the bind mount. For ro filesystems this was resolved by #2570 by remounting again with ro set. Follow the same approach for nodev, nosuid, noexec, noatime, relatime, strictatime, nodiratime but allow to revert back to the old behaviour via the new `--no-mount-fallback` command line option. Add a testcase to verify that bind mounts of filesystems with nodev, nosuid, noexec, noatime options set work in rootless mode. Add a testcase that mounts a nodev, nosuid, noexec, noatime filesystem with a ro flag. Add two further testcases that ensure that the above testcases would fail if the `--no-mount-fallback` command line option is set. * contrib/completions/bash/runc: Add `--no-mount-fallback` command line option for bash completion. * create.go: Add `--no-mount-fallback` command line option. * restore.go: Add `--no-mount-fallback` command line option. * run.go: Add `--no-mount-fallback` command line option. * libcontainer/configs/config.go: Add `NoMountFallback` field to the `Config` struct to store the command line option value. * libcontainer/specconv/spec_linux.go: Add `NoMountFallback` field to the `CreateOpts` struct to store the command line option value and store it in the libcontainer config. * utils_linux.go: Store the command line option value in the `CreateOpts` struct. * libcontainer/rootfs_linux.go: In case that `--no-mount-fallback` is not set try to remount the bind filesystem again with the options nodev, nosuid, noexec, noatime, relatime, strictatime or nodiratime if they are set on the source filesystem. * tests/integration/mounts_sshfs.bats: Add testcases and rework sshfs setup to allow specifying different mount options depending on the test case. 
Signed-off-by: Ruediger Pluem <ruediger.pluem@vodafone.com>	2023-07-28 16:32:02 -07:00
Kir Kolyshkin	212d25e853	checkpoint/restore: add --manage-cgroups-mode ignore - add the new mode and document it; - slightly improve the --help output; - slightly simplify the parsing code. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2022-12-15 15:37:42 -08:00
Kir Kolyshkin	ff3b4f3bb4	restore: fix ignoring --manage-cgroups-mode Merge the logic of setPageServer, setManageCgroupsMode, and setEmptyNsMask into criuOptions. This does three things: 1. Fixes ignoring --manage-cgroups-mode on restore; 2. Simplifies the code in checkpoint.go and restore.go; 3. Ensures issues like 1 won't happen again. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2022-12-15 15:37:39 -08:00
Kir Kolyshkin	eb2f08dc4e	checkpoint,restore,list: don't call fatal There is a mix of styles when handling CLI commands. In most cases we return an error, which is handled from app.Run in main.go (it calls fatal if there is an error). In a few cases, though, we call fatal(err) from random places. Let's be consistent and always return an error. The only exception is runc exec, which needs to exit with a particular exit code. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2022-02-18 16:05:29 -08:00
Adrian Reber	43b36dc4ac	Support changing of lsm mount context on restore Wire through CRIU's support to change the mount context on restore. This is especially useful if restoring a container in a different pod. Single container restore uses the same SELinux process label and same mount context as during checkpointing. If a container is being restored into an existing pod the process label and the mount context needs to be changed to the context of the pod. Changing process label on restore is already supported by runc. This patch adds the possibility to change the mount context. Signed-off-by: Adrian Reber <areber@redhat.com>	2021-09-20 10:01:16 +02:00
Kir Kolyshkin	9ba2f65d6b	startContainer: minor refactor All three callers* of startContainer call revisePidFile and createSpec before calling it, so it makes sense to move those calls to inside of the startContainer, and drop the spec argument. * -- in fact restore does not call revisePidFile, but it should. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-09-14 10:53:11 -07:00
Kir Kolyshkin	c5b0be78e8	Rm build tags from main pkg This was added by commit `5aa82c950` back in the day when we thought runc was going to be cross-platform. It's very clear now that it's a Linux-only package. While at it, further clarify in the README that we're Linux only. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-08-30 20:15:01 -07:00
Adrian Reber	535f25c44f	Allow restoring with a different LSM profile Restoring an SELinux enabled container with Podman will result in a container with exactly the same SELinux process labels as during checkpointing. CRIU takes care of all the process labels. Restoring multiple copies of a checkpointed container will result in all containers having the same SELinux process labels, which might be undesired. When looking at Pods, all containers in a Pod share the process label of the infrastructure container. To restore a container into an existing Pod it is necessary to tell CRIU to restore the container with the infrastructure container process label. CRIU has supported setting different process labels using --lsm-profile for a long time and this just passes the process label information from runc to CRIU. Unfortunately CRIU has a bug, as no one was using the --lsm-profile option, so this change requires the upcoming CRIU version 3.16. Signed-off-by: Adrian Reber <areber@redhat.com>	2021-06-07 18:05:24 +02:00
Liu Hua	23e3794d9c	checkpoint: validate parent path `--parent-path` needs to be a path relative to `--image-path` that points to the previous checkpoint directory. Signed-off-by: Liu Hua <weldonliu@tencent.com>	2021-04-23 13:20:08 +08:00
Sebastiaan van Stijn	4316df8b53	libcontainer/system: move userns utilities to separate package Moving these utilities to a separate package, so that consumers of this package don't have to pull in the whole "system" package. Looking at uses of these utilities (outside of runc itself); `RunningInUserNS()` is used by [various external consumers][1], so adding a "Deprecated" alias for this. [1]: https://grep.app/search?current=2&q=.RunningInUserNS Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2021-04-04 22:42:03 +02:00
Kir Kolyshkin	ca1d135bd4	runc checkpoint: fix --status-fd to accept fd 1. The command `runc checkpoint --lazy-server --status-fd $FD` actually accepts a file name as an $FD. Make it accept a file descriptor, like its name implies and the documentation states. In addition, since runc itself does not use the result of CRIU status fd, remove the code which relays it, and pass the FD directly to CRIU. Note 1: runc should close this file descriptor itself after passing it to criu, otherwise whoever waits on it might wait forever. Note 2: due to the way criu swrk consumes the fd (it reopens /proc/$SENDER_PID/fd/$FD), runc can't close it as soon as criu swrk has started. There is no good way to know when criu swrk has reopened the fd, so we assume that as soon as we have received something back, the fd is already reopened. 2. Since the meaning of --status-fd has changed, the test case using it needs to be fixed as well. Modify the lazy migration test to remove "sleep 2", actually waiting for the lazy page server to be ready. While at it, - remove the double fork (using shell's background process is sufficient here); - check the exit code for "runc checkpoint" and "criu lazy-pages"; - remove the check for no errors in dump.log after restore, as we are already checking its exit code. [v2: properly close status fd after spawning criu] [v3: move close status fd to after the first read] Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-05-11 15:36:50 -07:00
Lifubang	472fe623a7	criu image path permission error in rootless checkpoint Signed-off-by: Lifubang <lifubang@acmcoder.com>	2019-03-11 23:49:52 +08:00
Akihiro Suda	06f789cf26	Disable rootless mode except RootlessCgMgr when executed as the root in userns This PR decomposes `libcontainer/configs.Config.Rootless bool` into `RootlessEUID bool` and `RootlessCgroups bool`, so as to make "runc-in-userns" more compatible with "rootful" runc. `RootlessEUID` denotes that runc is being executed as a non-root user (euid != 0) in the current user namespace. `RootlessEUID` is almost identical to the former `Rootless` except cgroups stuff. `RootlessCgroups` denotes that runc is unlikely to have full access to cgroups. `RootlessCgroups` is set to false if runc is executed as the root (euid == 0) in the initial namespace. Otherwise `RootlessCgroups` is set to true. (Hint: if `RootlessEUID` is true, `RootlessCgroups` becomes true as well) When runc is executed as the root (euid == 0) in a user namespace (e.g. by Docker-in-LXD, Podman, Usernetes), `RootlessEUID` is set to false but `RootlessCgroups` is set to true. So, "runc-in-userns" behaves almost the same as "rootful" runc except that cgroups errors are ignored. This PR does not have any impact on CLI flags and `state.json`. Note about CLI: * Now `runc --rootless=(auto|true|false)` CLI flag is only used for setting `RootlessCgroups`. * Now `runc spec --rootless` is only required when `RootlessEUID` is set to true. For runc-in-userns, `runc spec` without `--rootless` should work, when sufficient numbers of UID/GID are mapped. Note about `$XDG_RUNTIME_DIR` (e.g. `/run/user/1000`): * `$XDG_RUNTIME_DIR` is ignored if runc is being executed as the root (euid == 0) in the initial namespace, for backward compatibility. (`/run/runc` is used) * If runc is executed as the root (euid == 0) in a user namespace, `$XDG_RUNTIME_DIR` is honored if `$USER != "" && $USER != "root"`. This allows unprivileged users to execute runc as the root in userns, without mounting writable `/run/runc`. 
Note about `state.json`: * `rootless` is set to true when `RootlessEUID == true && RootlessCgroups == true`. Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>	2018-09-07 15:05:03 +09:00
Ace-Tang	4803faf00e	cr: don't restore net namespace by default Since runc doesn't manage net devices and their configuration, and checkpoint also doesn't dump the net namespace by default, set 'nsmask = unix.CLONE_NEWNET' by default in restore. Otherwise, if the user does not pass 'empty-ns network', criu will spend extra time in restore. Signed-off-by: Ace-Tang <aceapril@126.com>	2018-08-17 16:03:21 +08:00
Akihiro Suda	f103de57ec	main: support rootless mode in userns Running rootless containers in userns is useful for mounting filesystems (e.g. overlay) with mapped euid 0, but without actual root privilege. Usage: (Note that `unshare --mount` requires `--map-root-user`) user$ mkdir lower upper work rootfs user$ curl http://dl-cdn.alpinelinux.org/alpine/v3.7/releases/x86_64/alpine-minirootfs-3.7.0-x86_64.tar.gz | tar Cxz ./lower || ( true; echo "mknod errors were ignored" ) user$ unshare --mount --map-root-user mappedroot# runc spec --rootless mappedroot# sed -i 's/"readonly": true/"readonly": false/g' config.json mappedroot# mount -t overlay -o lowerdir=./lower,upperdir=./upper,workdir=./work overlayfs ./rootfs mappedroot# runc run foo Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>	2018-05-10 12:16:43 +09:00
Adrian Reber	60ae7091de	checkpoint: support lazy migration With the help of userfaultfd CRIU supports lazy migration. Lazy migration means that memory pages are only transferred from the migration source to the migration destination on page fault. This makes it possible to reduce the downtime during process or container migration to a minimum as the memory does not need to be transferred during migration. Lazy migration currently depends on userfaultfd being available on the current Linux kernel and on the CRIU version in use supporting lazy migration. Both dependencies can be checked by querying CRIU via RPC if the lazy migration feature is available. Using feature checking instead of version comparison enables runC to use CRIU features from the criu-dev branch. This way the user can decide if lazy migration should be available by choosing the right kernel and CRIU branch. To use lazy migration the CRIU process during dump needs to dump everything besides the memory pages and then it opens a network port waiting for remote page fault requests: # runc checkpoint httpd --lazy-pages --page-server 0.0.0.0:27 \ --status-fd /tmp/postcopy-pipe In this example CRIU will hang/wait once it has opened the network port and wait for network connection. As runC waits for CRIU to finish it will also hang until the lazy migration has finished. To know when the restore on the destination side can start the '--status-fd' parameter is used: # runc checkpoint --help | grep status --status-fd value criu writes \0 to this FD once lazy-pages is ready The parameter '--status-fd' is directly from CRIU and this way the process outside of runC which controls the migration knows exactly when to transfer the checkpoint (without memory pages) to the destination and that the restore can be started. 
On the destination side it is necessary to start CRIU in 'lazy-pages' mode like this: # criu lazy-pages --page-server --address 192.168.122.3 --port 27 \ -D checkpoint and tell runC to do a lazy restore: # runc restore -d --image-path checkpoint --work-path checkpoint \ --lazy-pages httpd If both processes on the restore side have the same working directory 'criu lazy-pages' creates a unix domain socket where it waits for requests from the actual restore. runC starts CRIU restore in lazy restore mode and talks to 'criu lazy-pages' that it wants to restore memory pages on demand. CRIU continues to restore the process and once the process is running and accesses the first non-existing memory page the 'criu lazy-pages' server will request the page from the source system. Thus all pages from the source system will be transferred to the destination system. Once all pages have been transferred runC on the source system will end and the container will have finished migration. This can also be combined with CRIU's pre-copy support. The combination of pre-copy and post-copy (lazy migration) provides the possibility to migrate containers with minimal downtimes. Some additional background about post-copy migration can be found in these articles: https://lisas.de/~adrian/?p=1253 https://lisas.de/~adrian/?p=1183 Signed-off-by: Adrian Reber <areber@redhat.com>	2017-09-06 12:35:38 +00:00
Nikolas Sepos	3f234b15d0	Add auto-dedup flag for checkpoint/restore When doing incremental dumps it is useful to use auto deduplication of memory images to save space. Signed-off-by: Nikolas Sepos <nikolas.sepos@gmail.com>	2017-08-18 16:19:21 +02:00
Andrei Vagin	1c43d091a1	checkpoint: add support for containers with terminals CRIU was extended to report about orphaned master pty-s via RPC. Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-05-02 04:48:47 +03:00
Andrei Vagin	a4fcbfb704	Prepare startContainer() to have more action Currently startContainer() is used to create and to run a container. In the next patch it will be used to restore a container. Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-05-01 21:55:57 +03:00
Tim Potter	9458b39ca9	Fix misspelling of "properties" in various places Signed-off-by: Tim Potter <tpot@hpe.com>	2017-04-21 13:29:58 +10:00
Aleksa Sarai	d2f49696b0	runc: add support for rootless containers This enables the support for the rootless container mode. There are many restrictions on what rootless containers can do, so many different runC commands have been disabled: * runc checkpoint * runc events * runc pause * runc ps * runc restore * runc resume * runc update The following commands work: * runc create * runc delete * runc exec * runc kill * runc list * runc run * runc spec * runc state In addition, any specification options that imply joining cgroups have also been disabled. This is due to support for unprivileged subtree management not being available from Linux upstream. Signed-off-by: Aleksa Sarai <asarai@suse.de>	2017-03-23 20:45:24 +11:00
Michael Crosby	00a0ecf554	Add separate console socket Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2017-03-16 10:23:59 -07:00
Mrunal Patel	899b0748f0	Merge pull request #1308 from giuseppe/fix-systemd-notify fix systemd-notify when using a different PID namespace	2017-02-24 11:05:21 -08:00
Giuseppe Scrivano	d5026f0e43	signals: support detach and notify socket together let runc run until READY= is received and then proceed with detaching the process. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2017-02-22 22:28:03 +01:00
Giuseppe Scrivano	892f2ded6f	fix systemd-notify when using a different PID namespace The current support of systemd-notify has a race condition as the message sent to the systemd notify socket might be dropped if the sender process is not running by the time systemd checks for the sender of the datagram. A proper fix of this in systemd would require changes to the kernel to maintain the cgroup of the sender process when it is dead (but that is probably not going to happen...) Generally, the solution to this issue is to specify the PID in the message itself so that systemd does not have to guess the sender, but this wouldn't work when running in a PID namespace as the container will pass the PID known in its namespace (something like PID=1,2,3..) and systemd running on the host is not able to map it to the runc service. The proposed solution is to have a proxy in runc that forwards the messages to the host systemd. Example of this issue: https://github.com/projectatomic/atomic-system-containers/pull/24 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2017-02-22 22:27:59 +01:00
Deng Guangxing	98f004182b	add pre-dump and parent-path to checkpoint CRIU uses pre-dump to complete iterative migration. pre-dump saves process memory info only, and it needs parent-path to specify the former memory files. This patch adds pre-dump and parent-path arguments to runc checkpoint Signed-off-by: Deng Guangxing <dengguangxing@huawei.com> Signed-off-by: Adrian Reber <areber@redhat.com>	2017-02-14 19:45:07 +08:00
Mrunal Patel	c54f1495e3	Fix error shadow and error check warnings Signed-off-by: Mrunal Patel <mrunalp@gmail.com>	2017-01-06 16:21:23 -08:00
Aleksa Sarai	c6d8a2f26f	merge branch 'pr-1158' Closes #1158 LGTMs: @hqhq @cyphar	2016-12-26 13:59:47 +11:00
Aleksa Sarai	244c9fc426	*: console rewrite This implements {createTTY, detach} and all of the combinations and negations of the two that were previously implemented. There are some valid questions about out-of-OCI-scope topics like !createTTY and how things should be handled (why do we dup the current stdio to the process, and how is that not a security issue). However, these will be dealt with in a separate patchset. In order to allow for late console setup, split setupRootfs into the "preparation" section where all of the mounts are created and the "finalize" section where we pivot_root and set things as ro. In between the two we can set up all of the console mountpoints and symlinks we need. We use two-stage synchronisation to ensure that when the syscalls are reordered in a suboptimal way, an out-of-place read() on the parentPipe will not gobble the ancillary information. This patch is part of the console rewrite patchset. Signed-off-by: Aleksa Sarai <asarai@suse.de>	2016-12-01 15:49:36 +11:00
Zhang Wei	b517076907	Check args numbers before application start Add a general args number validator for all client commands. Signed-off-by: Zhang Wei <zhangwei555@huawei.com>	2016-11-29 11:18:51 +08:00
xiekeyang	55e783b57a	remove unused returned variables name The names of the returned variables are unused and can be removed. Signed-off-by: xiekeyang <xiekeyang@huawei.com>	2016-06-15 17:41:57 +08:00
Andrew Vagin	acef7461a4	restore: add the empty-ns option For example: ./runc restore --empty-ns network CTID In this case criu creates a network namespace, but doesn't restore it. We are going to use this option to restore docker containers and Docker sets a hook to restore a network namespace. https://github.com/xemul/criu/issues/165 Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>	2016-06-07 20:24:59 +03:00
Mrunal Patel	a753b06645	Replace github.com/codegangsta/cli by github.com/urfave/cli The package got moved to a different repository Signed-off-by: Mrunal Patel <mrunalp@gmail.com>	2016-06-06 11:47:20 -07:00
Qiang Huang	2503fca35d	Update man pages to reflect the latest cli change The major change is the description of options: change it to match what the latest cli help message shows, which specifies a "value" after an option if it takes a value, and adds (default: xxx) if the option has a default value. This also includes some other minor consistency fixes. Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>	2016-05-28 13:33:57 +08:00
Aleksa Sarai	1a913c7b89	*: correctly chown() consoles In user namespaces, we need to make sure we don't chown() the console to unmapped users. This means we need to get both the UID and GID of the root user in the container when changing the owner. Signed-off-by: Aleksa Sarai <asarai@suse.de>	2016-05-22 22:37:13 +10:00
Qiang Huang	8477638aab	Update cli package The old one has bug when showing help message for IntFlags. Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>	2016-05-10 13:58:09 +08:00
Michael Crosby	f417e993d0	Update spec to v0.5.0 Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2016-04-12 14:11:40 -07:00
Michael Crosby	12bd4cffd0	Add --no-pivot option for containers on ramdisk This adds a `--no-pivot` cli flag to runc so that a container's rootfs can be located ontop of ramdisk/tmpfs and not fail because you cannot pivot root. This should be a cli flag and not part of the spec because this is a detail of the host/runtime environment and not an attribute of a container. Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2016-03-30 12:02:17 -07:00
Ido Yariv	28b21a5988	Export CreateLibcontainerConfig Users of libcontainer other than runc may also require parsing and converting specification configuration files. Since runc cannot be imported, move the relevant functions and definitions to a separate package, libcontainer/specconv. Signed-off-by: Ido Yariv <ido@wizery.com>	2016-03-25 12:19:18 -04:00
Mrunal Patel	7e91a96605	Add support for systemd cgroups in runc Signed-off-by: Mrunal Patel <mrunalp@gmail.com>	2016-03-22 17:08:07 -07:00
Michael Crosby	fdb100d247	Destroy container along with processes before stdio We need to make sure the container is destroyed before closing the stdio for the container. This becomes a big issue when running in the host's pid namespace because the other processes could have inherited the stdio of the initial process. The call to close will just block as they still have the io open. Calling destroy before closing io, especially in the host pid namespace will cause all additional processes to be killed in the container's cgroup. This will allow the io to be closed successfully. This change makes sure the order for destroy and close is correct as well as ensuring that any errors encountered during start or exec will be handled by terminating the process and destroying the container. We cannot use defers here because we need to enforce the correct ordering on destroy. This also sets the subreaper setting for runc so that when running in pid host, runc can wait on the additional processes launched by the container, useful on destroy, but also good for reaping the additional processes that were launched. Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2016-03-15 13:17:11 -07:00
Michael Crosby	47eaa08f5a	Update runc usage for new specs changes Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2016-03-10 14:18:39 -08:00
Michael Crosby	044e298507	Improve error handling in runc The error handling on the runc cli is currently pretty messy because messages to the user are split between regular stderr format and logrus message format. This changes all the error reporting to the cli to only output on stderr and exit(1) for consumers of the api. By default logrus logs to /dev/null so that it is not seen by the user. If the user wants extra and/or structured logging/errors from runc they can use the `--log` flag to provide a path to the file where they want this information. This allows a consistent behavior on the cli but extra power and information when debugging with logs. This also includes a change to enable the same logging information inside the container's init by adding an init cli command that can share the existing flags for all other runc commands. Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2016-03-09 11:08:30 -08:00
Michael Crosby	8d0a05b8dd	Wait for pipes to write all data before exit Add a waitgroup to wait for the io.Copy of stdout/err to finish before exiting runc. The problem happens more in exec because it is really fast and the pipe has data buffered but not yet read after the process has already exited. Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2016-02-26 12:14:47 -08:00
Mrunal Patel	90472aeb9e	Merge pull request #546 from mikebrow/usage-updates updating usage for runc, and all runc commands that now use <container id> as the first argument	2016-02-17 21:13:22 +05:30
Mike Brown	f4e37ab63e	updating usage for runc and runc commands Signed-off-by: Mike Brown <brownwm@us.ibm.com>	2016-02-17 09:00:39 -06:00
Michael Crosby	ce72f86a2b	Merge pull request #558 from rajasec/tty-panic panic during start of failed detached container	2016-02-16 16:01:08 -08:00
Julian Friedman	5fbdf6c3fc	Register signal handlers earlier to avoid zombies newSignalHandler needs to be called before the process is started, otherwise when the process exits quickly the SIGCHLD is received (and ignored) before the handler is set up. When this happens the reaper never runs, the process becomes a zombie, and the exit code isn't returned to the user. Signed-off-by: Julian Friedman <julz.friedman@uk.ibm.com>	2016-02-16 18:38:54 +00:00
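
The split of `Rootless` into `RootlessEUID` and `RootlessCgroups` described in commit `06f789cf26` amounts to a small decision table. The following is a minimal Python sketch of that table as the commit message states it, not runc's actual Go code. The function name `rootless_flags` and its parameters are hypothetical, and the `--rootless` CLI override for `RootlessCgroups` is deliberately not modeled.

```python
def rootless_flags(euid: int, in_initial_userns: bool) -> dict:
    """Derive the two flags from the effective UID and the user namespace.

    Hypothetical helper for illustration only; runc derives these values
    in Go and also honors the --rootless CLI flag, which is ignored here.
    """
    # RootlessEUID: runc runs as a non-root user in the current userns.
    rootless_euid = euid != 0
    # RootlessCgroups: false only for root (euid == 0) in the initial
    # namespace; true in every other case.
    rootless_cgroups = not (euid == 0 and in_initial_userns)
    return {
        "RootlessEUID": rootless_euid,
        "RootlessCgroups": rootless_cgroups,
        # state.json `rootless` is true only when both flags are true.
        "state_json_rootless": rootless_euid and rootless_cgroups,
    }


if __name__ == "__main__":
    cases = [
        ("root, initial namespace", 0, True),
        ("root inside a user namespace", 0, False),
        ("non-root user", 1000, True),
    ]
    for label, euid, initial in cases:
        print(label, rootless_flags(euid, initial))
```

The middle case is the "runc-in-userns" situation from the commit message: `RootlessEUID` stays false, so runc behaves like "rootful" runc, while `RootlessCgroups` is true, so cgroups errors are ignored.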


74 Commits
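
Commit `7c71a22705` describes a heuristic for the userns fallback path: remember which flags the user explicitly asked to clear, and return an error if copying the bind-mount source's locked flags would set one of them again. The sketch below illustrates only that check, in Python rather than runc's Go. The function name `fallback_remount_flags` and its parameters are hypothetical, the `MS_*` values are the standard Linux mount flag constants, and atime handling is left out entirely.

```python
# Standard Linux mount flag values (see <linux/mount.h>).
MS_RDONLY = 0x01
MS_NOSUID = 0x02
MS_NODEV = 0x04
MS_NOEXEC = 0x08


def fallback_remount_flags(requested: int, cleared: int, source_locked: int) -> int:
    """Return the flags for a fallback remount, or raise on a conflict.

    requested:     flags the user explicitly set (e.g. "nosuid")
    cleared:       flags the user explicitly cleared (e.g. "rw" clears MS_RDONLY)
    source_locked: locked flags found on the bind-mount source

    Hypothetical illustration of the heuristic, not runc's implementation.
    """
    conflict = cleared & source_locked
    if conflict:
        # The fallback would silently re-set a flag the user asked to clear,
        # so the configuration cannot be fulfilled.
        raise ValueError(f"cannot clear locked mount flags {conflict:#x}")
    # No conflict: keep the user's flags and add the locked source flags.
    return requested | source_locked


if __name__ == "__main__":
    # "nosuid" requested on a read-only source: the fallback adds MS_RDONLY.
    print(hex(fallback_remount_flags(MS_NOSUID, 0, MS_RDONLY)))
    # Explicit "rw" on a read-only, locked source: reported as an error.
    try:
        fallback_remount_flags(0, MS_RDONLY, MS_RDONLY)
    except ValueError as err:
        print("error:", err)
```

The second call mirrors the case the commit message calls the most egregious: a mount explicitly marked "rw" whose source is "ro". Under this heuristic it produces an error instead of a silently read-only mount.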