Files
runc/libcontainer/apparmor/apparmor_linux.go
Aleksa Sarai 8e8b136c49 tree-wide: use /proc/thread-self for thread-local state
With the idmap work, we will have a tainted Go thread in our
thread-group that has a different mount namespace to the other threads.
It seems that (due to some bad luck) the Go scheduler tends to make this
thread the thread-group leader in our tests, which results in very
baffling failures where /proc/self/mountinfo produces gibberish results.

In order to avoid this, switch to using /proc/thread-self for everything
that is thread-local. This primarily includes switching all file
descriptor paths (CLONE_FS), all of the places that check the current
cgroup (technically we never will run a single runc thread in a separate
cgroup, but better to be safe than sorry), and the aforementioned
mountinfo code. We don't need to do anything for the following because
the results we need aren't thread-local:

 * Checks that certain namespaces are supported by stat(2)ing
   /proc/self/ns/...

 * /proc/self/exe and /proc/self/cmdline are not thread-local.

 * While threads can be in different cgroups, we do not do this for the
   runc binary (or libcontainer) and thus we do not need to switch to
   the thread-local version of /proc/self/cgroups.

 * All of the CLONE_NEWUSER files are not thread-local because you
   cannot set the usernamespace of a single thread (setns(CLONE_NEWUSER)
   is blocked for multi-threaded programs).

Note that we have to use runtime.LockOSThread when we have an open
handle to a tid-specific procfs file that we are operating on multiple
times. Go can reschedule us such that we are running on a different
thread and then kill the original thread (causing -ENOENT or similarly
confusing errors). This is not strictly necessary for most usages of
/proc/thread-self (such as using /proc/thread-self/fd/$n directly) since
only operating on the actual inodes associated with the tid requires
this locking, but because of the pre-3.17 fallback for CentOS, we have
to do this in most cases.

In addition, CentOS's kernel is too old for /proc/thread-self, which
requires us to emulate it -- however in rootfs_linux.go, we are in the
container pid namespace but /proc is the host's procfs. This leads to
the incredibly frustrating situation where there is no way (on pre-4.1
Linux) to figure out which /proc/self/task/... entry refers to the
current tid. We can just use /proc/self in this case.

Yes this is all pretty ugly. I also wish it wasn't necessary.

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2023-12-14 11:36:41 +11:00

74 lines
1.7 KiB
Go

package apparmor
import (
"errors"
"fmt"
"os"
"sync"
"github.com/opencontainers/runc/libcontainer/utils"
)
var (
appArmorEnabled bool
checkAppArmor sync.Once
)
// isEnabled returns true if apparmor is enabled for the host.
func isEnabled() bool {
checkAppArmor.Do(func() {
if _, err := os.Stat("/sys/kernel/security/apparmor"); err == nil {
buf, err := os.ReadFile("/sys/module/apparmor/parameters/enabled")
appArmorEnabled = err == nil && len(buf) > 1 && buf[0] == 'Y'
}
})
return appArmorEnabled
}
func setProcAttr(attr, value string) error {
attr = utils.CleanPath(attr)
attrSubPath := "attr/apparmor/" + attr
if _, err := os.Stat("/proc/self/" + attrSubPath); errors.Is(err, os.ErrNotExist) {
// fall back to the old convention
attrSubPath = "attr/" + attr
}
// Under AppArmor you can only change your own attr, so there's no reason
// to not use /proc/thread-self/ (instead of /proc/<tid>/, like libapparmor
// does).
attrPath, closer := utils.ProcThreadSelf(attrSubPath)
defer closer()
f, err := os.OpenFile(attrPath, os.O_WRONLY, 0)
if err != nil {
return err
}
defer f.Close()
if err := utils.EnsureProcHandle(f); err != nil {
return err
}
_, err = f.WriteString(value)
return err
}
// changeOnExec reimplements aa_change_onexec from libapparmor in Go
func changeOnExec(name string) error {
if err := setProcAttr("exec", "exec "+name); err != nil {
return fmt.Errorf("apparmor failed to apply profile: %w", err)
}
return nil
}
// applyProfile will apply the profile with the specified name to the process after
// the next exec. It is only supported on Linux and produces an error on other
// platforms.
func applyProfile(name string) error {
if name == "" {
return nil
}
return changeOnExec(name)
}