Implement support for the linux.intelRdt.schemata field of the spec.
This allows management of the "schemata" file in the resctrl group in a
generic way.
Signed-off-by: Markus Lehtonen <markus.lehtonen@intel.com>
Also, use GetPath() in Apply to get the resctrl group path, similar to
other methods of intelRdtManager.
Signed-off-by: Markus Lehtonen <markus.lehtonen@intel.com>
If intelRdt is specified in the spec, check that the resctrl fs is
actually mounted. Fixes e.g. the case where "intelRdt.closID" is
specified but runc silently ignores this if resctrl is not mounted.
Signed-off-by: Markus Lehtonen <markus.lehtonen@intel.com>
> libcontainer/intelrdt/cmt.go:5:1: ST1020: comment on exported function IsCMTEnabled should be of the form "IsCMTEnabled ..." (staticcheck)
> // Check if Intel RDT/CMT is enabled.
> ^
> libcontainer/intelrdt/intelrdt.go:419:1: ST1020: comment on exported function IsCATEnabled should be of the form "IsCATEnabled ..." (staticcheck)
> // Check if Intel RDT/CAT is enabled
> ^
> libcontainer/intelrdt/intelrdt.go:425:1: ST1020: comment on exported function IsMBAEnabled should be of the form "IsMBAEnabled ..." (staticcheck)
> // Check if Intel RDT/MBA is enabled
> ^
> libcontainer/intelrdt/intelrdt.go:446:1: ST1020: comment on exported method Apply should be of the form "Apply ..." (staticcheck)
> // Applies Intel RDT configuration to the process with the specified pid
> ^
> libcontainer/intelrdt/intelrdt.go:481:1: ST1020: comment on exported method Destroy should be of the form "Destroy ..." (staticcheck)
> // Destroys the Intel RDT container-specific 'container_id' group
> ^
> libcontainer/intelrdt/intelrdt.go:497:1: ST1020: comment on exported method GetPath should be of the form "GetPath ..." (staticcheck)
> // Returns Intel RDT path to save in a state file and to be able to
> ^
> libcontainer/intelrdt/intelrdt.go:506:1: ST1020: comment on exported method GetStats should be of the form "GetStats ..." (staticcheck)
> // Returns statistics for Intel RDT
> ^
> libcontainer/intelrdt/mbm.go:6:1: ST1020: comment on exported function IsMBMEnabled should be of the form "IsMBMEnabled ..." (staticcheck)
> // Check if Intel RDT/MBM is enabled.
> ^
> 8 issues:
> * staticcheck: 8
While at it, add missing periods.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This removes libcontainer/cgroups packages and starts
using those from github.com/opencontainers/cgroups repo.
Mostly generated by:
git rm -f libcontainer/cgroups
find . -type f -name "*.go" -exec sed -i \
's|github.com/opencontainers/runc/libcontainer/cgroups|github.com/opencontainers/cgroups|g' \
{} +
go get github.com/opencontainers/cgroups@v0.0.1
make vendor
gofumpt -w .
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Unless the container's runtime config has intelRdt configuration set,
any checks for whether Intel RDT is supported or the resctrl filesystem
is mounted are a waste of time as, per the OCI Runtime Spec, "the
runtime MUST NOT manipulate any resctrl pseudo-filesystems." And in the
likely case where Intel RDT is supported by both the hardware and
kernel but the resctrl filesystem is not mounted, these checks can get
expensive as the intelrdt package needs to parse mountinfo to check
whether the filesystem has been mounted to a non-standard path.
Optimize for the common case of containers with no intelRdt
configuration by only performing the checks when the container has opted
in.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The OCI runtime spec mandates "[i]f intelRdt is not set, the runtime
MUST NOT manipulate any resctrl pseudo-filesystems." Attempting to
delete files counts as manipulating, so stop doing that when the
container's RDT configuration is nil.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The intelrdt package only needs to parse mountinfo to find the mount
point of the resctrl filesystem. Users are generally going to mount the
resctrl filesystem to the pre-created /sys/fs/resctrl directory, so
there is a common case where mountinfo parsing is not required. Optimize
for the common case with a fast path which checks both for the existence
of the /sys/fs/resctrl directory and whether the resctrl filesystem was
mounted to that path using a single statfs syscall.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Reading /proc/cpuinfo is a surprisingly expensive operation. Since
kernel version 4.12 [1], opening /proc/cpuinfo on an x86 system can
block for around 20 milliseconds while the kernel samples the current
CPU frequency. There is a very recent patch [2] which gets rid of the
delay, but has yet to make it into the mainline kenel. Regardless,
kernels for which opening /proc/cpuinfo takes 20ms will continue to be
run in production for years to come. libcontainer only opens
/proc/cpuinfo to read the processor feature flags so all the delays to
get an accurate snapshot of the CPU frequency are just wasted time.
If we wanted to, we could interrogate the CPU features directly from
userspace using the `CPUID` instruction. However, Intel and AMD CPUs
have flags in different positions for their analogous sub-features and
there are CPU quirks [3] which would need to be accounted for. Some
Haswell server CPUs support RDT/CAT but are missing the `CPUID` flags
advertising their support; the kernel checks for support on that
processor family by probing the the hardware using privileged
RDMSR/WRMSR instructions [4]. This sort of probing could not be
implemented in userspace so it would not be possible to check for RDT
feature support in userspace without false negatives on some hardware
configurations.
It looks like libcontainer reads the CPU feature flags as a kind of
optimization so that it can skip checking whether the kernel supports an
RDT sub-feature if the hardware support is missing. As the kernel only
exposes subtrees in the `resctrl` filesystem for RDT sub-features with
hardware and kernel support, checking the CPU feature flags is redundant
from a correctness point of view. Remove the /proc/cpuinfo check as it
is an optimization which actually hurts performance.
[1]: https://unix.stackexchange.com/a/526679
[2]: https://lore.kernel.org/all/20220415161206.875029458@linutronix.de/
[3]: 7cf6a8a17f/arch/x86/kernel/cpu/resctrl/core.c (L834-L851)
[4]: a6b450573b/arch/x86/kernel/cpu/resctrl/core.c (L111-L153)
Signed-off-by: Cory Snider <csnider@mirantis.com>
Remove intelrtd.Manager interface, since we only have a single
implementation, and do not expect another one.
Rename intelRdtManager to Manager, and modify its users accordingly.
Remove NewIntelRdtManager from factory.
Remove IntelRdtfs. Instead, make intelrdt.NewManager return nil if the
feature is not available.
Remove TestFactoryNewIntelRdt as it is now identical to TestFactoryNew.
Add internal function newManager to be used for tests (to make sure
some testing is done even when the feature is not available in
kernel/hardware).
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
For the Nth time I wanted to replace parsing mountinfo with
statfs and the check for superblock magic, but it is not possible
since some code relies of mount options check which can only
be obtained via mountinfo.
Add a note about it.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
In a (quite common) case RDT is not supported by the kernel/hardware,
it does not make sense to parse /proc/cpuinfo and /proc/self/mountinfo,
and yet the current code does it (on every runc exec, for example).
Fortunately, there is a quick way to check whether RDT is available --
if so, kernel creates /sys/fs/resctrl directory. Check its existence,
and skip all the other initialization if it's not present.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This test was written back in the day when findIntelRdtMountpointDir
was using its own mountinfo parser. Commit f1c1fdf911 changed that,
and thus this test is actually testing moby/sys/mountinfo parser, which
does not make much sense.
Remove the test, and drop the io.Reader argument since we no longer need
to parse a custom file.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
In case resctrl filesystem can not be found in /proc/self/mountinfo
(which is pretty common on non-server or non-x86 hardware), subsequent
calls to Root() will result in parsing it again and again.
Use sync.Once to avoid it. Make unit tests call it so that Root() won't
actually parse mountinfo in tests.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Since commit 7296dc1712, type intelRdtData is only used by tests,
and since commit 79d292b9f, its only member is config.
Change the test to use config directly, and remove the type.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Export getIntelRdtRoot function as Root.
This is needed by google/cadvisor, which is (ab)using GetIntelRdtPath,
removed by commit 7296dc1712.
While at it, do some minimal refactoring to always use Root()
internally, not relying on variable value. Other than that it's just
some renaming.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Only some libcontainer packages can be built on non-linux platforms
(not that it make sense, but at least go build succeeds). Let's call
these "good" packages.
For all other packages (i.e. ones that fail to build with GOOS other
than linux), it does not make sense to have linux build tag (as they
are broken already, and thus are not and can not be used on anything
other than Linux).
Remove linux build tag for all non-"good" packages.
This was mostly done by the following script, with just a few manual
fixes on top.
function list_good_pkgs() {
for pkg in $(find . -type d -print); do
GOOS=freebsd go build $pkg 2>/dev/null \
&& GOOS=solaris go build $pkg 2>/dev/null \
&& echo $pkg
done | sed -e 's|^./||' | tr '\n' '|' | sed -e 's/|$//'
}
function remove_tag() {
sed -i -e '\|^// +build linux$|d' $1
go fmt $1
}
SKIP="^("$(list_good_pkgs)")"
for f in $(git ls-files . | grep .go$); do
if echo $f | grep -qE "$SKIP"; then
echo skip $f
continue
fi
echo proc $f
remove_tag $f
done
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Use the term "clos group" instead of "container_id group" as the group
that a container belongs to is not necessarily tied to its container id.
Signed-off-by: Markus Lehtonen <markus.lehtonen@intel.com>
Check that the ClosID directory pre-exists if no L3 or MB schema has
been specified. Conform with the following line from runtime-spec
(config-linux):
If closID is set, and neither of l3CacheSchema and memBwSchema are
set, runtime MUST check if corresponding pre-configured directory
closID is present in mounted resctrl. If such pre-configured directory
closID exists, runtime MUST assign container to this closID and
generate an error if directory does not exist.
Add a TODO note for verifying existing schemata against L3/MB
parameters.
Signed-off-by: Markus Lehtonen <markus.lehtonen@intel.com>
Handle ClosID parameter of IntelRdt. Makes it possible to use
pre-configured classes/ClosIDs and avoid running out of available IDs
which easily happens with per-container classes.
Remove validator checks for empty L3CacheSchema and MemBwSchema fields
in order to be able to leave them empty, and only specify ClosID for
a pre-configured class.
Signed-off-by: Markus Lehtonen <markus.lehtonen@intel.com>
These are not used anywhere outside of the package
(I have also checked the only external user of the package
(github.com/google/cadvisor).
No changes other than changing the case. The following
identifiers are now private:
* IntelRdtTasks
* NewLastCmdError
* NewStats
Brought to you by gorename.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
... the stack, so every caller will automatically benefit from it.
The only change that it causes is the user in
libcontainer/process_linux.go will get a better error message.
[v2: typo fix]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
For errors that only have a string and an underlying error, using
fmt.Errorf with %w to wrap an error is sufficient.
In this particular case, the code is simplified, and now we have
unwrappable errors as a bonus (same could be achieved by adding
(*LastCmdError).Unwrap() method, but that's adding more code).
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Initially, this was copied over from libcontainer/cgroups, where it made
sense as for cgroup v1 we have multiple controllers and mount points.
Here, we only have a single mount, so there's no need for the whole
type.
Replace all that with a simple error (which is currently internal since
the only user is our own test case).
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
In case getIntelRdtData() returns an error, d is set to nil.
In case the error returned is of NotFoundError type (which happens
if resctlr mount is not found in /proc/self/mountinfo), the function
proceeds to call d.join(), resulting in a nil deref and a panic.
In practice, this never happens in runc because of the checks in
intelrdt() function in libcontainer/configs/validate, which raises
an error in case any of the parameters are set in config but
the IntelRTD itself is not available (that includes checking
that the mount point is there).
Nevertheless, the code is wrong, and can result in nil dereference
if some external users uses Apply on a system without resctrl mount.
Fix this by removing the exclusion.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Do this for all errors except one from unix.*.
This fixes a bunch of errorlint warnings, like these
libcontainer/generic_error.go:25:15: type assertion on error will fail on wrapped errors. Use errors.As to check for specific errors (errorlint)
if le, ok := err.(Error); ok {
^
libcontainer/factory_linux_test.go:145:14: type assertion on error will fail on wrapped errors. Use errors.As to check for specific errors (errorlint)
lerr, ok := err.(Error)
^
libcontainer/state_linux_test.go:28:11: type assertion on error will fail on wrapped errors. Use errors.As to check for specific errors (errorlint)
_, ok := err.(*stateTransitionError)
^
libcontainer/seccomp/patchbpf/enosys_linux.go:88:4: switch on an error will fail on wrapped errors. Use errors.Is to check for specific errors (errorlint)
switch err {
^
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This one is tough as errorlint insists on using errors.Is, and the
latter is known to not work for Go 1.13 which we still support.
So, add a nolint annotation to suppress the warning, and a TODO to
address it later.
For intelrdt, we can do the same, but it is easier to reuse the very
same function from fscommon (note we can't use fscommon for other stuff
as it expects cgroupfs).
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This should result in no change when the error is printed, but make the
errors returned unwrappable, meaning errors.As and errors.Is will work.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
An errror from ioutil.WriteFile already contains file name, so there is
no need to duplicate that information.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This commit adjusts the file mode to use the latest golang style
and also changes the file mode value in accordance with default.
Signed-off-by: Kenta Tada <Kenta.Tada@sony.com>
Use sync.Once to init Intel RDT when needed for a small speedup to
operations which do not require Intel RDT.
Simplify IntelRdtManager initialization in LinuxFactory.
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
Intel RDT sub-features can be selectively disabled or enabled by kernel
command line. See "rdt=" option details in kernel document:
https://www.kernel.org/doc/Documentation/admin-guide/kernel-parameters.txt
But Cache Monitoring Technology (CMT) feature is not correctly checked
in init() and getCMTNumaNodeStats() now. If CMT is disabled by kernel
command line (e.g., rdt=!cmt,mbmtotal,mbmlocal,l3cat,mba) while hardware
supports CMT, we may get following error when getting Intel RDT stats:
runc run c1
runc events c1
ERRO[0005] container_linux.go:200: getting container's Intel RDT stats
caused: open /sys/fs/resctrl/c1/mon_data/mon_L3_00/llc_occupancy: no
such file or directory
Fix CMT feature check in init() and GetStats() call paths.
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
It might be a tad slower but it surely more correct and well maintained,
so it's better to use it than rely on a custom implementation which is
kind of hard to get entirely right.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>