gofumpt (mvdan.cc/gofumpt) is a fork of gofmt with stricter rules.
Brought to you by
git ls-files \*.go | grep -v ^vendor/ | xargs gofumpt -s -w
Looking at the diff, all these changes make sense.
Also, replace gofmt with gofumpt in golangci.yml.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Commit 108ee85b82 adds SkipDevices flag, which is used by kubernetes
to create cgroups for pods.
Unfortunately the above commit falls short, and systemd DevicePolicy and
DeviceAllow properties are still set, which requires kubernetes to set
an "allow everything" rule.
This commit fixes this: if the SkipDevices flag is set, the Device*
properties we return allow all devices.
Fixes: 108ee85b82
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Historically, we never returned an error from failed startUnit
or stopUnit. The startUnit case was fixed by commit 3844789.
It is time to fix stopUnit, too. The reasons are:
1. Ignoring an error from stopUnit means an unexpected trouble down the
road, for example a failure to create a container with the same name:
> time="2021-05-07T19:51:27Z" level=error msg="container_linux.go:380: starting container process caused: process_linux.go:385: applying cgroup configuration for process caused: Unit runc-test_busybox.scope already exists."
2. A somewhat short timeout of 1 second means the cgroup might
actually be removed a few seconds later, but we might have a
race between removing the cgroup and creating another one
with the same name, resulting in the same error as above.
So, return an error if removal failed, and increase the timeout.
Now, modify the systemd cgroup v1 manager to not mask the error from
stopUnit (stopErr) with the subsequent one from cgroups.RemovePath,
as stopErr is most probably the reason why RemovePath failed.
Note that for v1 we do want to remove the paths even in case of a
failure from stopUnit, as some were not created by systemd.
There's no need to do that for v2, thanks to unified hierarchy,
so no changes there.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
A cgroup manager's Set method sets cgroup resources, but historically
it was accepting configs.Cgroups.
Refactor it to accept resources only. This is an improvement from the
API point of view, as the method cannot change the cgroup configuration
(such as path to the cgroup etc), it can only set (modify) its
resources/limits.
This also lays the foundation for complicated resource updates, as now
Set has two sets of resources -- the one that was previously specified
during cgroup manager creation (or the previous Set), and the one passed
in the argument, so it could deduce the difference between these. This
is a long term goal though.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Instead of reconnecting to dbus after some failed operations, and
returning an error (so a caller has to retry), reconnect AND retry
in place for all such operations.
This should fix issues caused by a stale dbus connection after e.g.
a dbus daemon restart.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
[@kolyshkin: doc nits, use dbus.ErrClosed and isDbusError]
Signed-off-by: Shiming Zhang <wzshiming@foxmail.com>
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This is a somewhat radical approach to dealing with kernel memory.
Per-cgroup kernel memory limiting was always problematic. A few
examples:
- older kernels had bugs and were even oopsing sometimes (best example
is RHEL7 kernel);
- kernel is unable to reclaim the kernel memory so once the limit is
hit a cgroup is toasted;
- some kernel memory allocations don't allow failing.
In addition to that,
- users don't have a clue about how to set kernel memory limits
(as the concept is much more complicated than e.g. [user] memory);
- different kernels might have different kernel memory usage,
which is sort of unexpected;
- cgroup v2 does not have a [dedicated] kmem limit knob, and thus
runc silently ignores kernel memory limits for v2;
- kernel v5.4 made cgroup v1 kmem.limit obsolete (see
https://github.com/torvalds/linux/commit/0158115f702b).
In view of all this, and as the runtime-spec lists memory.kernel
and memory.kernelTCP as OPTIONAL, let's ignore kernel memory
limits (for cgroup v1, same as we're already doing for v2).
This should result in fewer bugs and a better user experience.
The only bad side effect from it might be that stat can show kernel
memory usage as 0 (since the accounting is not enabled).
[v2: add a warning in specconv that limits are ignored]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This unconditionally enables TasksAccounting for systemd unified (v2)
cgroup driver, making it work the same way as the legacy (v1) driver.
Practically, it is probably a no-op since DefaultTasksAccounting is
usually true.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
In some cases, container init fails to start because it is killed by
the kernel OOM killer. The errors returned by runc in such cases are
semi-random and rather cryptic. Below are a few examples.
On cgroup v1 + systemd cgroup driver:
> process_linux.go:348: copying bootstrap data to pipe caused: write init-p: broken pipe
> process_linux.go:352: getting the final child's pid from pipe caused: EOF
On cgroup v2:
> process_linux.go:495: container init caused: read init-p: connection reset by peer
> process_linux.go:484: writing syncT 'resume' caused: write init-p: broken pipe
This commit adds the OOM method to cgroup managers, which tells whether
the container was OOM-killed. In case that has happened, the original error
is discarded (unless --debug is set), and the new OOM error is reported
instead:
> ERRO[0000] container_linux.go:367: starting container process caused: container init was OOM-killed (memory limit too low?)
Also, fix the rootless test cases that are failing because they expect
an error in the first line, and we have an additional warning now:
> unable to get oom kill count" error="no directory specified for memory.oom_control
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
All cgroup managers have Apply() and Set() methods:
- Apply is used to create a cgroup (and, in case of systemd,
a systemd unit) and/or put a PID into the cgroup (and unit);
- Set is used to set various cgroup resources and limits.
The fs/fs2 cgroup manager implements the functionality as described above.
The systemd v1/v2 managers deviate -- they set *most* of the cgroup limits
(those that can be projected to systemd unit properties) in Apply(),
and then again *all* cgroup limits in Set (first indirectly via systemd
properties -- same as in Apply, then via cgroupfs).
This commit removes setting the cgroup limits from Apply,
so now the systemd manager behaves the same way as the fs manager.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Support for systemd properties AllowedCPUs and AllowedMemoryNodes
was added by commit 13afa58d0e, but only for unified resources
of systemd v2 driver.
This adds support for Cpu.Cpus and Cpu.Mems resources to
both systemd v1 and v2 cgroup drivers.
An integration test is added to check that the settings work.
[v2: check for systemd version]
[v3: same in the test]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Commit 27d3dd3df3 ("don't fail when subsystem not mounted") added
ignoring "not found" error to enableKmem, and as a result the function
now tries to call Mkdir with an empty path, which results in a weird
error message. For example, this is a failure from a
libcontainer/integration test:
> === RUN TestRunWithKernelMemorySystemd
> exec_test.go:704: runContainer failed with kernel memory limit: container_linux.go:370: starting container process caused: process_linux.go:327: applying cgroup configuration for process caused: mkdir : no such file or directory
I am not entirely sure if it is a good idea to silently ignore set
limits, but at least let's fix the error handling.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Commit a1d5398afa ("Respect container's cgroup path") added a
cgroupPath argument to FindCgroupMountpoint to make runc/libcontainer
work in a custom multitenant environment with multiple cgroup mount
points.
It also added passing c.Path as an argument to FindCgroupMountpoint
for systemd (v1) controller. This is wrong, because
1. the systemd controller does not use c.Path at all (and c.Path is never set
by specconv) -- instead, it uses Name and Parent.
2. c.Path, if set, is not absolute -- it is relative to /sys/fs/cgroup
-- but it is used as an absolute path here.
Since c.Path is never set, the change did not result in any breakage, so
this code sat quietly for some time and the issue might not have been
discovered -- until we started running libcontainer/integration tests
in a CentOS 7 VM, which resulted in the following weird error:
> FAIL: TestPidsSystemd: utils_test.go:55: exec_test.go:630: unexpected error: container_linux.go:353: starting container process caused: process_linux.go:326: applying cgroup configuration for process caused: mountpoint for devices not found
The error was "fixed" in commit f57bb2fe3d by changing the tests'
cgroups Path to be "/sys/fs/cgroup/". This actually resulted in
creation of cgroup directories like /sys/fs/cgroup/memory/sys/fs/cgroup,
/sys/fs/cgroup/devices/sys/fs/cgroup and so on.
The proper fix to the test case is implemented in the previous commit,
which sets c.Name and c.Parent.
This commit just removes the invalid use of c.Path, and tells the whole
story.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
In all these cases, getSubsystemPath() was already called, and its
result stored in m.paths map. It makes no sense to not reuse it.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
We call joinCgroups() from Apply, and in there we iterate through the
list of subsystems, calling getSubsystemPath() for each. This is
expensive, since every getSubsystemPath() involves parsing mountinfo.
At the end of Apply(), we iterate through the list of subsystems to fill
the m.paths, again calling getSubsystemPath() for every subsystem.
As a result, we parse mountinfo about 20 times here.
Let's find the paths first and reuse m.paths in joinCgroups().
While at it, since join() is just two calls now, inline it.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
When paths are set, we only need to place the PID into proper
cgroups, and we do know all the paths already.
Both fs/d.path() and systemd/v1/getSubsystemPath() parse
/proc/self/mountinfo, and the only reason they are used
here is to check whether the subsystem is available.
Use a much simpler/faster check instead.
Frankly, I am not sure why the check is needed at all. Maybe it should
be dropped.
Also, for fs driver, since d is no longer used in this code path,
move its initialization to after it.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Instead of iterating over m.paths, iterate over subsystems and look up
the path for each. This is faster since a map lookup is faster than
iterating over the names in Get. A quick benchmark shows that the new
way is 2.5x faster than the old one.
Note though that this is not done to make things faster, as savings are
negligible, but to make things simpler by removing some code.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
RemovePaths() deletes elements from the paths map for paths that have
been successfully removed.
It does not, however, empty the map itself (which is needed because,
AFAIK, the Go garbage collector does not shrink maps), but all its callers do.
Move this operation from callers to RemovePaths.
No functional change, except the old map should be garbage collected now.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Some hierarchies were created directly by .Apply() on top of
systemd-managed cgroups. systemd doesn't manage these and as a result we leak
these cgroups.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Iterating over the list of subsystems and comparing their names to get an
instance of fs.cgroupFreezer is useless and a waste of time, since it is
a shallow type (i.e. does not have any data/state) and we can create an
instance in place.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
The kubelet uses libct/cgroups code to set up cgroups. It creates a
parent cgroup (kubepods) to put the containers into.
The problem (for cgroup v2, which uses eBPF for device configuration) is
that the hard requirement to have the devices cgroup configured results
in leaking an eBPF program upon every kubelet restart. If the kubelet
is restarted 64+ times, the cgroup can't be configured anymore.
Work around this by adding a SkipDevices flag to Resources.
A check was added so that if SkipDevices is set, such a "container"
can't be started (to make sure it is only used for non-containers).
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
For some reason, runc systemd drivers (both v1 and v2) never set
systemd unit property named `CPUQuotaPeriod` (known as
`CPUQuotaPeriodUSec` on dbus and in `systemctl show` output).
Set it, and add a check to all the integration tests. The check is less
than trivial because, when not set, the value is shown as "infinity" but
when set to the same (default) value, shown as "100ms", so in case we
expect 100ms (period = 100000 us), we have to _also_ check for
"infinity".
[v2: add systemd version checks since CPUQuotaPeriod requires v242+]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This fixes a few cases of accessing m.paths map directly without holding
the mutex lock.
Fixes: 9087f2e82
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
The code that adds CpuQuotaPerSecUSec is the same in v1 and v2
systemd cgroup driver. Move it to common.
No functional change.
Note that the comment saying we always set this property
contradicts the current code, and is therefore removed.
[v2: drop cgroupv1-specific comment]
[v3: drop returning error as it's not used]
[v4: remove an obsoleted comment]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Use r instead of c.Resources for readability. No functional change.
This commit has been brought to you by '<,'>s/c\.Resources\./r./g
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
When we use cgroups with the systemd driver, the cgroup path will be
automatically removed by systemd once all the processes have exited. So
we should check that the cgroup path exists when we access it, for
example in `kill/ps`, or else we will get an error.
Signed-off-by: lifubang <lifubang@acmcoder.com>
This is a regression from commit 1d4ccc8e0. We only need to enable
kernel memory accounting once, from (*legacyManager).Apply(),
and there is no need to do it in (*legacyManager).Set().
While at it, rename the method to better reflect what it's doing.
This saves 1 call to mountinfo parser.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Commit 4e65e0e90a added a check for cpu shares. Apparently, the
kernel allows setting a value higher than max or lower than min without
an error, but the value read back is always within the limits.
The check (which was later moved out to a separate CheckCpushares()
function) is always performed after setting the cpu shares, so let's
move it to the very place where it is set.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This is a quick-n-dirty fix for the regression introduced by commit
06d7c1d, which made it impossible to set only CpuQuota
(without CpuPeriod). It partially reverts the above commit,
and adds a test case.
The proper fix will follow.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Currently, both systemd cgroup drivers (v1 and v2) only set
"TasksMax" unit property if the value > 0, so there is no
way to update the limit to -1 / unlimited / infinity / max.
Since systemd driver is backed by fs driver, and both fs and fs2
set the limit of -1 properly, it works, but systemd still has
the old value:
# runc --systemd-cgroup update $CT --pids-limit 42
# systemctl show runc-$CT.scope | grep TasksMax
TasksMax=42
# cat /sys/fs/cgroup/system.slice/runc-$CT.scope/pids.max
42
# ./runc --systemd-cgroup update $CT --pids-limit -1
# systemctl show runc-$CT.scope | grep TasksMax=
TasksMax=42
# cat /sys/fs/cgroup/system.slice/runc-xx77.scope/pids.max
max
Fix by changing the condition to allow -1 as a valid value.
NOTE other negative values are still being ignored by systemd drivers
(as was done before). I am not sure whether this is correct, or
whether we should return an error.
A test case is added.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
1. do not allow setting quota without period or period without quota, as we
won't be able to calculate new value for CPUQuotaPerSecUSec otherwise.
2. do not ignore setting quota to -1 when a period is not set.
3. update the test case accordingly.
Note that systemd value checks will be added in the next commit.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
It seems we missed that systemd added support for the devices cgroup, as
a result systemd would actually *write an allow-all rule each time you
did 'runc update'* if you used the systemd cgroup driver. This is
obviously ... bad and was a clear security bug. Luckily the commits which
introduced this were never in an actual runc release.
So we simply generate the cgroupv1-style rules (which is what systemd's
DeviceAllow wants) and default to a deny-all ruleset. Unfortunately it
turns out that systemd is susceptible to the same spurious error
failure that we were, so that problem is out of our hands for systemd
cgroup users.
However, systemd has a similar bug to the one fixed in [1]. It will
happily write a disruptive deny-all rule when it is not necessary.
Unfortunately, we cannot even use devices.Emulator to generate a minimal
set of transition rules because the DBus API is limited (you can only
clear or append to the DeviceAllow= list -- so we are forced to always
clear it). To work around this, we simply freeze the container during
SetUnitProperties.
[1]: afe83489d4 ("cgroupv1: devices: use minimal transition rules with devices.Emulator")
Fixes: 1d4ccc8e0c ("fix data inconsistent when runc update in systemd driven cgroup v1")
Fixes: 7682a2b2a5 ("fix data inconsistent when runc update in systemd driven cgroup v2")
Signed-off-by: Aleksa Sarai <asarai@suse.de>
This is effectively a nicer implementation of the container.isPaused()
helper, but to be used within the cgroup code for handling some fun
issues we have to fix with the systemd cgroup driver.
Signed-off-by: Aleksa Sarai <asarai@suse.de>
This unties the Gordian Knot of using GetPaths in cgroupv2 code.
The problem is, the current code uses GetPaths for three kinds of things:
1. Get all the paths to cgroup v1 controllers to save its state (see
(*linuxContainer).currentState(), (*LinuxFactory).loadState()
methods).
2. Get all the paths to cgroup v1 controllers to have the setns process
enter the proper cgroups in `(*setnsProcess).start()`.
3. Get the path to a specific controller (for example,
`m.GetPaths()["devices"]`).
Now, for cgroup v2 instead of a set of per-controller paths, we have only
one single unified path, and a dedicated function `GetUnifiedPath()` to get it.
This discrepancy between v1 and v2 cgroupManager API leads to the
following problems with the code:
- multiple if/else code blocks that have to treat v1 and v2 separately;
- backward-compatible GetPaths() methods in v2 controllers;
- repeated writing of the PID into the same cgroup for v2;
Overall, it's hard to write the right code with all this, and the code
that is written is kinda hard to follow.
The solution is to slightly change the API to do the 3 things outlined
above in the same manner for v1 and v2:
1. Use `GetPaths()` for state saving and setns process cgroups entering.
2. Introduce and use Path(subsys string) to obtain a path to a
subsystem. For v2, the argument is ignored and the unified path is
returned.
This commit converts all the controllers to the new API, and modifies
all the users to use it.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>