runc start/run: report OOM

In some cases, container init fails to start because it is killed by
the kernel OOM killer. The errors returned by runc in such cases are
semi-random and rather cryptic. Below are a few examples.

On cgroup v1 + systemd cgroup driver:

> process_linux.go:348: copying bootstrap data to pipe caused: write init-p: broken pipe

> process_linux.go:352: getting the final child's pid from pipe caused: EOF

On cgroup v2:

> process_linux.go:495: container init caused: read init-p: connection reset by peer

> process_linux.go:484: writing syncT 'resume' caused: write init-p: broken pipe

This commits adds the OOM method to cgroup managers, which tells whether
the container was OOM-killed. In case that has happened, the original error
is discarded (unless --debug is set), and the new OOM error is reported
instead:

> ERRO[0000] container_linux.go:367: starting container process caused: container init was OOM-killed (memory limit too low?)

Also, fix the rootless test cases that are failing because they expect
an error in the first line, and we have an additional warning now:

> unable to get oom kill count" error="no directory specified for memory.oom_control

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This commit is contained in:
Kir Kolyshkin
2021-02-19 02:53:27 -08:00
parent 7e137b9044
commit 5d0ffbf9c8
8 changed files with 53 additions and 2 deletions

View File

@@ -55,6 +55,10 @@ func (m *mockCgroupManager) Exists() bool {
return err == nil
}
func (m *mockCgroupManager) OOMKillCount() (uint64, error) {
return 0, nil
}
func (m *mockCgroupManager) GetPaths() map[string]string {
return m.paths
}