In case early stage of runc init (nsenter) fails for some reason, it
logs error(s) with FATAL log level, via bail().
The runc init log is read by a parent (runc create/run/exec) and is
logged via normal logrus mechanism, which is all fine and dandy, except
when `runc init` fails, we return the error from the parent (which is
usually not too helpful, for example):
runc run failed: unable to start container process: can't get final child's PID from pipe: EOF
Now, the actual underlying error is from runc init and it was logged
earlier; here's how full runc output looks like:
FATA[0000] nsexec-1[3247792]: failed to unshare remaining namespaces: No space left on device
FATA[0000] nsexec-0[3247790]: failed to sync with stage-1: next state
ERRO[0000] runc run failed: unable to start container process: can't get final child's PID from pipe: EOF
The problem is, upper level runtimes tend to ignore everything except
the last line from runc, and thus error reported by e.g. docker is not
very helpful.
This patch tries to improve the situation by collecting FATAL errors
from runc init and appending those to the error returned (instead of
logging). With it, the above error will look like this:
ERRO[0000] runc run failed: unable to start container process: can't get final child's PID from pipe: EOF; runc init error(s): nsexec-1[141549]: failed to unshare remaining namespaces: No space left on device; nsexec-0[141547]: failed to sync with stage-1: next state
Yes, it is long and ugly, but at least the upper level runtime will
report it.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Commit b2f8a74d "clothed" the naked return as inflicted by gofumpt
v0.9.0. Since gofumpt v0.9.2 this rule was moved to "extra" category,
not enabled by default. The only other "extra" rule is to group adjacent
parameters with the same type, which also makes sense.
Enable gofumpt "extra" rules, and reformat the code accordingly.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Commit 9f3d7534ea enabled logrus to show information about log
caller, if --debug is set. It is helpful in many scenarios, but does
not work very well when we are debugging runc init, for example:
# runc --debug run -d xx4557
DEBU[0000]libcontainer/logs/logs.go:45 github.com/opencontainers/runc/libcontainer/logs.processEntry() nsexec[279687]: logging set up
DEBU[0000]libcontainer/logs/logs.go:45 github.com/opencontainers/runc/libcontainer/logs.processEntry() nsexec[279687]: logging set up
DEBU[0000]libcontainer/logs/logs.go:45 github.com/opencontainers/runc/libcontainer/logs.processEntry() nsexec[279687]: => nsexec container setup
DEBU[0000]libcontainer/logs/logs.go:45 github.com/opencontainers/runc/libcontainer/logs.processEntry() nsexec[279687]: update /proc/self/oom_score_adj to '30'
As we're merely forwarding the logs here, printing out filename:line and
function is useless and clutters the logs a log.
To fix, create and use a copy of the standard logger with caller info
turned off.
With this in place, nsexec logs are sane again:
# runc --debug --log-format=text run -d xe34
DEBU[0000] nsexec[293595]: logging set up
DEBU[0000] nsexec[293595]: logging set up
DEBU[0000] nsexec[293595]: => nsexec container setup
DEBU[0000] nsexec[293595]: update /proc/self/oom_score_adj to '30'
This patch also changes Logf to Log in processEntry, as this is what it
should be.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
There's no need to call logrus.ParseLevel as logrus.Level already
implements encoding.TextUnmarshaler interface.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
When playing with the log forwarder, I broke it, but all the units tests
were still passing. This happened because test cases were merely looking
for a word (like "kitten") in the log output, which also happened to be
there in case of an error (as a part of an error message produced by log
forwarder).
Make the test a bit more robust by
- looking for a complete log message, not just part of it;
- also checking that log file does NOT contain errors.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Previous commits removed all its users -- the only one left is package's
own unit tests.
Modify those unit tests to configure logrus directly, and remove
ConfigureLogging for good. The world is better without it.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
gofumpt (mvdan.cc/gofumpt) is a fork of gofmt with stricter rules.
Brought to you by
git ls-files \*.go | grep -v ^vendor/ | xargs gofumpt -s -w
Looking at the diff, all these changes make sense.
Also, replace gofmt with gofumpt in golangci.yml.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
t.Cleanup is not present in Go 1.13.
Dockre/Moby still builds runc with Go 1.13, so we should still support
Go 1.13.
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
This helps a lot to find out where the errors come from.
Before:
> level=warning msg="lstat /sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/test_hello: no such file or directory"
After:
> level=warning msg="lstat /sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/test_hello: no such file or directory" func=github.com/opencontainers/runc/libcontainer.destroy file="github.com/opencontainers/runc/libcontainer/state_linux.go:44"
Presumably this comes with an overhead, but I guess no one is using
--debug by default anyway.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Sometimes debug.bats test cases are failing like this:
> not ok 27 global --debug to --log --log-format 'json'
> # (in test file tests/integration/debug.bats, line 77)
> # `[[ "${output}" == *"child process in init()"* ]]' failed
It happens more when writing to disk.
This issue is caused by the fact that runc spawns log forwarding goroutine
(ForwardLogs) but does not wait for it to finish, resulting in missing
debug lines from nsexec.
ForwardLogs itself, though, never finishes, because it reads from a
reading side of a pipe which writing side is not closed. This is
especially true in case of runc create, which spawns runc init and
exits; meanwhile runc init waits on exec fifo for arbitrarily long
time before doing execve.
So, to fix the failure described above, we need to:
1. Make runc create/run/exec wait for ForwardLogs to finish;
2. Make runc init close its log pipe file descriptor (i.e.
the one which value is passed in _LIBCONTAINER_LOGPIPE
environment variable).
This is exactly what this commit does:
1. Amend ForwardLogs to return a channel, and wait for it in start().
2. In runc init, save the log fd and close it as late as possible.
PS I have to admit I still do not understand why an explicit close of
log pipe fd is required in e.g. (*linuxSetnsInit).Init, right before
the execve which (thanks to CLOEXEC) closes the fd anyway.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
- add check, checkWait, and finish helpers;
- move test cleanup to runLogForwarding;
- introduce and use log struct.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
I have noticed that ConfigureLogs do not return an error in case logging
was already configured -- instead it just warns about it. So I went
ahead and changed the warning to the actual error...
... only to discover I broke things badly, because in case of runc init
logging is configured twice. The fix is to not configure logging in case
we are init.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Error handling is slightly cleaner this way.
While at it, do minor refactoring and fix error logging
in processEntry.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Refactor configuring logging into a reusable component
so that it can be nicely used in both main() and init process init()
Co-authored-by: Georgi Sabev <georgethebeatle@gmail.com>
Co-authored-by: Giuseppe Capizzi <gcapizzi@pivotal.io>
Co-authored-by: Claudia Beresford <cberesford@pivotal.io>
Signed-off-by: Danail Branekov <danailster@gmail.com>