Commit Graph

27 Commits

Author SHA1 Message Date
Kir Kolyshkin
e6048715e4 Use gofumpt to format code
gofumpt (mvdan.cc/gofumpt) is a fork of gofmt with stricter rules.

Brought to you by

	git ls-files \*.go | grep -v ^vendor/ | xargs gofumpt -s -w

Looking at the diff, all these changes make sense.

Also, replace gofmt with gofumpt in golangci.yml.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-06-01 12:17:27 -07:00
Kir Kolyshkin
3f65946756 libct/cg: make Set accept configs.Resources
A cgroup manager's Set method sets cgroup resources, but historically
it was accepting configs.Cgroups.

Refactor it to accept resources only. This is an improvement from the
API point of view, as the method can not change cgroup configuration
(such as path to the cgroup etc), it can only set (modify) its
resources/limits.

This also lays the foundation for complicated resource updates, as now
Set has two sets of resources -- the one that was previously specified
during cgroup manager creation (or the previous Set), and the one passed
in the argument, so it could deduce the difference between these. This
is a long term goal though.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-04-29 15:24:19 -07:00
Kir Kolyshkin
ef9922c26c libct/cg: don't return OOMKillCount error when rootless
Commit 5d0ffbf9c8 added OOM kill count checking and better container
start/run/exec error reporting in case we hit OOM.

It also introduced warnings like these:

> level=warning msg="unable to get oom kill count" error="openat2
> /sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/test_hello/memory.events:
> no such file or directory"

In case of rootless containers, unless cgroup is delegated or systemd is
used, runc can not create a cgroup and thus it fails to get OOM kill
count. This is expected, and the warning should not be shown in this
case.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-04-14 17:57:00 -07:00
Kir Kolyshkin
5cdd9022a9 libct/cg/fs[2]: fix comments about m.rootless
For fs, commit fc620fdf81 made rootless field private,
and for fs2, it was always private, and yet comments in both
mention it as m.Rootless.

Fix it.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-04-14 17:44:22 -07:00
Kir Kolyshkin
5d0ffbf9c8 runc start/run: report OOM
In some cases, container init fails to start because it is killed by
the kernel OOM killer. The errors returned by runc in such cases are
semi-random and rather cryptic. Below are a few examples.

On cgroup v1 + systemd cgroup driver:

> process_linux.go:348: copying bootstrap data to pipe caused: write init-p: broken pipe

> process_linux.go:352: getting the final child's pid from pipe caused: EOF

On cgroup v2:

> process_linux.go:495: container init caused: read init-p: connection reset by peer

> process_linux.go:484: writing syncT 'resume' caused: write init-p: broken pipe

This commits adds the OOM method to cgroup managers, which tells whether
the container was OOM-killed. In case that has happened, the original error
is discarded (unless --debug is set), and the new OOM error is reported
instead:

> ERRO[0000] container_linux.go:367: starting container process caused: container init was OOM-killed (memory limit too low?)

Also, fix the rootless test cases that are failing because they expect
an error in the first line, and we have an additional warning now:

> unable to get oom kill count" error="no directory specified for memory.oom_control

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-02-23 16:15:33 -08:00
Kir Kolyshkin
c0e14b8b5a libct/cg/fs.getCgroupRoot: reuse (cached) cgroup mountinfo
Drop the custom mountinfo parser, reuse the cgroups.GetCgroupMounts()
to get the cgroup root. Faster, and much less code.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-01-06 14:54:22 -08:00
Kenta Tada
b76652fbeb libcontainer: remove removePath from cgroups
`removePath` is unused as below
11fb94965c

Signed-off-by: Kenta Tada <Kenta.Tada@sony.com>
2020-10-01 12:22:12 +09:00
Kir Kolyshkin
b006f4a180 libct/cgroups: support Cgroups.Resources.Unified
Add support for unified resource map (as per [1]), and add some test
cases for the new functionality.

[1] https://github.com/opencontainers/runtime-spec/pull/1040

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-09-24 15:29:35 -07:00
Kir Kolyshkin
fad92bbffa cgroupv1/Apply: do not overuse d.path/getSubsystemPath
When paths are set, we only need to place the PID into proper
cgroups, and we do know all the paths already.

Both fs/d.path() and systemd/v1/getSubsystemPath() parse
/proc/self/mountinfo, and the only reason they are used
here is to check whether the subsystem is available.

Use a much simpler/faster check instead.

Frankly, I am not sure why the check is needed at all. Maybe it should
be dropped.

Also, for fs driver, since d is no longer used in this code path,
move its initialization to after it.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-08-25 11:40:58 -07:00
Mrunal Patel
a5847db387 Merge pull request #2506 from kolyshkin/cgroup-fixes
cgroupv1 removal nits
2020-08-17 21:13:31 -07:00
Akihiro Suda
d6f5641c20 Merge pull request #2507 from kolyshkin/alt-to-2497
libct/cgroups/GetCgroupRoot: make it faster
2020-07-31 11:43:38 +09:00
Mrunal Patel
46243fcea1 Merge pull request #2500 from kolyshkin/fs-apply
libct/cgroups/fs: rework Apply()
2020-07-30 16:39:53 -07:00
Kir Kolyshkin
e0c0b0cf32 libct/cgroups/GetCgroupRoot: make it faster
...by checking the default path first.

Quick benchmark shows it's about 5x faster on an idle system, and the
gain should be much more on a system doing mounts etc.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-07-30 13:45:21 -07:00
Mrunal Patel
cf1273abf4 Merge pull request #2498 from kolyshkin/v1-code-cleanups
libct/cgroups/fs: code cleanups
2020-07-09 15:58:06 -07:00
Kir Kolyshkin
c1adc99a20 cgroup/fs: rework Apply()
In manager.Apply() method, a path to each subsystem is obtained by
calling d.path(sys.Name()), and the sys.Apply() is called that does
the same call to d.path() again.

d.path() is an expensive call, so rather than to call it twice, let's
reuse the result.

This results the number of times we parse mountinfo during container
start from 62 to 34 on my setup.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-07-07 10:58:37 -07:00
Aleksa Sarai
819fcc687e merge branch 'pr-2495'
Kir Kolyshkin (1):
  cgroups/fs/path: optimize

LGTMs: @mrunalp @cyphar
Closes #2495
2020-07-07 11:51:06 +10:00
Kir Kolyshkin
2a322e91ec cgroupv1: remove subsystemSet.Get()
Instead of iterating over m.paths, iterate over subsystems and look up
the path for each. This is faster since a map lookup is faster than
iterating over the names in Get. A quick benchmark shows that the new
way is 2.5x faster than the old one.

Note though that this is not done to make things faster, as savings are
negligible, but to make things simpler by removing some code.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-07-06 18:31:46 -07:00
Kir Kolyshkin
daf30cb7ca cgroups/fs: rm getSubsystems
It does not add any value.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-07-06 18:29:14 -07:00
Kir Kolyshkin
2e22579946 libct/cgroups/fs.GetStats: drop PathExists check
Half of controllers' GetStats just return nil, and most of the others
ignore ENOENT on files, so it will be cheaper to not check that the
path exists in the main GetStats method, offloading that to the
controllers.

Drop PathExists check from GetStats, add it to those controllers'
GetStats where it was missing.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-07-06 18:02:17 -07:00
Kir Kolyshkin
11fb94965c cgroups/fs: rm Remove method from controllers
To my surprise, those are not used anywhere in the code.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-07-06 18:02:17 -07:00
Kir Kolyshkin
254d23b964 libc/cgroups: empty map in RemovePaths
RemovePaths() deletes elements from the paths map for paths that has
been successfully removed.

Although, it does not empty the map itself (which is needed that AFAIK
Go garbage collector does not shrink the map), but all its callers do.

Move this operation from callers to RemovePaths.

No functional change, except the old map should be garbage collected now.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-07-06 17:54:44 -07:00
Mrunal Patel
3f81131845 Merge pull request #2490 from kolyshkin/dev-opt
libct/cgroups: add SkipDevices to Resources
2020-07-06 14:28:30 -07:00
Kir Kolyshkin
62a30709d2 cgroups/fs/path: optimize
The result of cgroupv1.FindCgroupMountpoint() call (which is relatively
expensive) is only used in case raw.innerPath is absolute, so it only
makes sense to call it in that case.

This drastically reduces the number of calls to FindCgroupMountpoint
during container start (from 116 to 62 in my setup).

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-07-03 14:07:27 -07:00
Kir Kolyshkin
46b26bc05d cgroups/fs/Freeze: simplify
In here, defer looks like an overkill, since the code is very simple and
we already have an error path.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-07-03 14:02:57 -07:00
Kir Kolyshkin
cd479f9d14 cgroupv1/freezer: don't use subsystemSet.Get()
Iterating over the list of subsystems and comparing their names to get an
instance of fs.cgroupFreezer is useless and a waste of time, since it is
a shallow type (i.e. does not have any data/state) and we can create an
instance in place.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-07-03 14:00:44 -07:00
Kir Kolyshkin
108ee85b82 libct/cgroups: add SkipDevices to Resources
The kubelet uses libct/cgroups code to set up cgroups. It creates a
parent cgroup (kubepods) to put the containers into.

The problem (for cgroupv2 that uses eBPF for device configuration) is
the hard requirement to have devices cgroup configured results in
leaking an eBPF program upon every kubelet restart.  program. If kubelet
is restarted 64+ times, the cgroup can't be configured anymore.

Work around this by adding a SkipDevices flag to Resources.

A check was added so that if SkipDevices is set, such a "container"
can't be started (to make sure it is only used for non-containers).

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-07-02 15:19:31 -07:00
Kir Kolyshkin
8c5a19f79b libct/cgroups/fs: rename some files
no changes, just a few git renames

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-06-16 12:45:54 -07:00