| Commit message (Collapse) | Author | Age |
|
|
|
|
| |
Similar to MemoryMax=, MemorySwapMax= limits swap usage. This controls the
"memory.swap.max" attribute in the unified cgroup hierarchy.
|
| |
|
|
|
|
|
|
|
| |
https://github.com/elogind/elogind/pull/3685 introduced
/run/elogind/inaccessible/{chr,blk} to map inaccessible devices.
This patch allows elogind running inside an nspawn container to create
/run/elogind/inaccessible/{chr,blk}.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Let's log at LOG_NOTICE about any processes that we are going to
SIGKILL/SIGABRT because clean termination of them didn't work.
This turns the various boolean flag parameters to cg_kill(), cg_migrate() and
related calls into a single binary flags parameter, simply because these functions
have now gained even more parameters and the parameter list shouldn't get too
long.
Logging for killing processes is done either when the kill signal is SIGABRT or
SIGKILL, or on explicit request if KILL_TERMINATE_AND_LOG instead of KILL_TERMINATE
is passed. This isn't used yet in this patch, but is made use of in a later
patch.
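
The change described above can be sketched as follows. This is only an illustration of folding several boolean parameters into one flags value and of the logging rule; the names KillFlags, should_log_kill and the individual flag names are hypothetical and are not taken from the patch.

```python
import signal
from enum import IntFlag


class KillFlags(IntFlag):
    """Hypothetical flag set standing in for several boolean parameters."""
    NONE = 0
    SIGCONT = 1 << 0       # also send SIGCONT after the kill signal
    IGNORE_SELF = 1 << 1   # skip the calling process
    REMOVE = 1 << 2        # remove the cgroup once it is empty


def should_log_kill(sig, log_requested):
    """Log when the signal is SIGKILL or SIGABRT, or when logging was requested."""
    return sig in (signal.SIGKILL, signal.SIGABRT) or log_requested


# One flags argument replaces three positional booleans.
flags = KillFlags.SIGCONT | KillFlags.REMOVE
```

A caller tests individual bits with `KillFlags.REMOVE in flags`, so adding another option later does not change the function signature.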
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Jun 16 05:12:08 elogind[1]: Controller 'io' supported: yes
Jun 16 05:12:08 elogind[1]: Controller 'memory' supported: yes
Jun 16 05:12:08 elogind[1]: Controller 'pids' supported: yes
instead of
Jun 16 04:06:50 elogind[1]: Controller 'memory' supported: yes
Jun 16 04:06:50 elogind[1]: Controller 'devices' supported: yes
Jun 16 04:06:50 elogind[1]: Controller 'pids' supported: yes
|
|
|
|
|
|
|
|
|
|
|
|
| |
cgroup_context_apply() and friends take CGroupContext and cgroup path as input
and have no way of getting back to the associated Unit, and thus use the raw cgroup
path for logging. This makes the log messages difficult to track down.
There's no reason to avoid passing in Unit into these functions. Pass in Unit
and use log_unit*() instead.
While at it, make cgroup_context_apply(), which has no outside users, static.
Also, drop cgroup path from log messages where the path itself isn't too
interesting and can be easily obtained from the unit.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On the unified hierarchy, the memory controller implements three control knobs
(low, high and max), which enable more usable and versatile control over memory
usage. This patch implements support for the three control knobs.
* MemoryLow, MemoryHigh and MemoryMax are added for memory.low, memory.high and
memory.max, respectively.
* As all absolute limits on the unified hierarchy use "max" for no limit, make
memory limit parse functions accept "max" in addition to "infinity" and
document "max" for the new knobs.
* Implement compatibility translation between MemoryMax and MemoryLimit.
v2:
- Fixed missing else's in config_parse_memory_limit().
- Fixed missing newline when writing out drop-ins.
- Coding style updates to use "val > 0" instead of "val".
- Minor updates to documentation.
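
As a rough illustration of the parsing rule described above, the sketch below accepts "max" as a synonym for "infinity". It is not the code from the patch: parse_memory_limit, format_memory_limit, the CGROUP_LIMIT_MAX sentinel and the suffix table are assumptions made for this example.

```python
CGROUP_LIMIT_MAX = 2**64 - 1  # assumed sentinel meaning "no limit"

_SUFFIXES = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_memory_limit(value):
    """Parse a byte count; "max" and "infinity" both mean no limit."""
    text = value.strip()
    if text in ("max", "infinity"):
        return CGROUP_LIMIT_MAX
    multiplier = 1
    if text and text[-1].upper() in _SUFFIXES:
        multiplier = _SUFFIXES[text[-1].upper()]
        text = text[:-1]
    if not text.isdigit():
        raise ValueError("invalid memory limit: %r" % value)
    return int(text) * multiplier


def format_memory_limit(limit):
    """Render a limit the way the unified hierarchy expects it ("max" for none)."""
    return "max" if limit == CGROUP_LIMIT_MAX else str(limit)
```

With this sketch, `parse_memory_limit("512M")` yields a byte count, while both `"max"` and `"infinity"` map to the same sentinel and are written back as `max`.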
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
dbus-daemon currently uses a backlog of 30 on its D-Bus system bus socket. On
overloaded systems this means that only 30 connections may be queued without
dbus-daemon processing them before further connection attempts fail. Our
cgroups-agent binary so far used D-Bus for its messaging, and hitting this
limit hence may result in us losing cgroup empty messages.
This patch adds a separate cgroup agent socket of type AF_UNIX/SOCK_DGRAM.
Since sockets of this type need no connection set up, no listen() backlog
applies. Our cgroup-agent binary will hence simply block as long as it can't
enqueue its datagram message, so that we are much less likely to lose cgroup
empty messages.
This also rearranges the ordering of the processing of SIGCHLD signals, service
notification messages (sd_notify()...) and the two types of cgroup
notifications (inotify for the unified hierarchy support, and agent for the
classic hierarchy support). We now always process events for these in the
following order:
1. service notification messages (SD_EVENT_PRIORITY_NORMAL-7)
2. SIGCHLD signals (SD_EVENT_PRIORITY_NORMAL-6)
3. cgroup inotify and cgroup agent (SD_EVENT_PRIORITY_NORMAL-5)
This is because when receiving SIGCHLD we invalidate PID information, which we
need to process the service notification messages which are bound to PIDs.
Hence the order between the first two items. And we want to process SIGCHLD
events before cgroup notifications when deciding whether a service is gone,
since the former carry more useful metadata.
Related to this:
https://bugs.freedesktop.org/show_bug.cgi?id=95264
https://github.com/elogind/elogind/issues/1961
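
The connectionless behaviour described above can be tried out with a small sketch. It only demonstrates that an AF_UNIX/SOCK_DGRAM socket accepts messages without connect() or listen(); the socket path, the message format and the helper names are assumptions for this example and do not reflect the actual agent protocol.

```python
import os
import socket
import tempfile


def send_cgroup_empty(agent_path, cgroup):
    """Send one datagram naming the emptied cgroup; no handshake is involved."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sender:
        # Blocks (in the default blocking mode) while the receive queue is full.
        sender.sendto(cgroup.encode(), agent_path)


def demo():
    """Bind a receiver, send two notifications, and read them back in order."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "cgroups-agent")  # hypothetical socket path
        with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as receiver:
            receiver.bind(path)
            send_cgroup_empty(path, "/system.slice/foo.service")
            send_cgroup_empty(path, "/system.slice/bar.service")
            return [receiver.recv(4096).decode() for _ in range(2)]
```

Each sender is a short-lived socket, which mirrors how an agent binary would be spawned once per notification.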
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
unit_has_mask_realized() determines whether the specified unit has its cgroups
set up properly given the desired target_mask; however, on the unified
hierarchy, controllers need to be enabled explicitly for children and the mask
of enabled controllers can deviate from target_mask. Only considering
target_mask in unit_has_mask_realized() can lead to false positives and
skipping enabling the requested controllers.
This patch adds unit->cgroup_enabled_mask to track which controllers are
enabled and updates unit_has_mask_realized() to also consider enable_mask.
Signed-off-by: Tejun Heo <htejun@fb.com>
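
A minimal sketch of the check described above, assuming a simplified Unit record; the field and function names mirror the commit text, but the structure itself is hypothetical.

```python
from dataclasses import dataclass
from enum import IntFlag


class CGroupMask(IntFlag):
    NONE = 0
    CPU = 1 << 0
    IO = 1 << 1
    MEMORY = 1 << 2
    PIDS = 1 << 3


@dataclass
class Unit:
    """Hypothetical, heavily reduced stand-in for a unit's cgroup state."""
    cgroup_realized: bool = False
    cgroup_realized_mask: CGroupMask = CGroupMask.NONE
    cgroup_enabled_mask: CGroupMask = CGroupMask.NONE


def unit_has_mask_realized(unit, target_mask, enable_mask):
    """True only if both the realized and the enabled controller sets match."""
    return (unit.cgroup_realized
            and unit.cgroup_realized_mask == target_mask
            and unit.cgroup_enabled_mask == enable_mask)
```

Comparing only the realized mask would report success for a unit whose children still lack a requested controller, which is the false positive the commit message describes.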
|
|
|
|
|
|
|
| |
Earlier during the development of unified hierarchy, the populated event was
reported through the dedicated "cgroup.populated" file; however, the
interface was updated so that it's reported through the "populated" field of
"cgroup.events" file. Update populated event handling logic accordingly.
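
Reading the "populated" field can be sketched as below. The helper names are made up for this example, and the parser assumes the flat "key value" line format that "cgroup.events" uses.

```python
def parse_cgroup_events(text):
    """Turn the "key value" lines of a cgroup.events file into a dict."""
    events = {}
    for line in text.splitlines():
        key, _, value = line.partition(" ")
        if key:
            events[key] = value
    return events


def is_populated(text):
    """Report whether the "populated" field is set to 1."""
    return parse_cgroup_events(text).get("populated") == "1"
```

In practice the text would come from reading the "cgroup.events" file of the cgroup in question.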
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Support for net_cls.class_id through the NetClass= configuration directive
was added in v227 in preparation for a per-unit packet filter mechanism.
However, it turns out the kernel people have decided to deprecate the net_cls
and net_prio controllers in v2. Tejun provides a comprehensive justification
for this in his commit, which has landed during the merge window for kernel
v4.5:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=bd1060a1d671
As we're aiming for full support for the v2 cgroup hierarchy, we can no
longer support this feature. Userspace tools such as nftables are moving over
to setting rules that are specific to the full cgroup path of a task, which
obsoletes these controllers anyway.
This commit removes support for tweaking details in the net_cls controller,
but keeps the NetClass= directive around for legacy compatibility reasons.
|
| |
|
| |
|
| |
|
|
|
|
|
| |
Apply remaining fixes and the performed move of utility functions
into their own foo-util.[hc] files on the rest of elogind.
|
|
|
|
|
|
| |
--user due to EACCES
After all, in the classic hierarchy that's pretty much the default case.
|
| |
|
|
|
|
|
| |
Although it is nice to have it read ELOGIND instead of SYSTEMD, all
diffs just show too many irrelevant (false) positives.
|
|
|
|
|
| |
This is done for systems whose init systems are not cgroup
controllers. One example is runit on Void Linux.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The previous variant was nice and sleek. But unfortunately, there are
constructs like:
#if 0
(... old code ...)
#else
(... alternative code for elogind ...)
#endif // 0
These fragments couldn't be handled by the old code, but can by the
new one.
To make this work, the preprocessor directives must be written as shown above.
Apart from that, all lines like:
/// Any doxygen one-line-comments with elogind in it are removed
are removed, too. Please note the three slashes.
And finally, all commented out #include directives are removed as well.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- src/shared/install.h - removed
- src/basic/unit-name.[hc] - cleaned
- src/core/cgroup.[hc] - cleaned
- src/libelogind/libelogind.sym - cleaned
- src/libelogind/sd-daemon/sd-daemon.c - cleaned
- src/shared/acl-util.[hc] - cleaned
- src/shared/bus-util.[hc] - cleaned
- src/shared/output-mode.h - cleaned
- src/shared/path-lookup.h - cleaned
- src/systemd/sd-daemon.h - cleaned
|
|
|
|
|
|
|
| |
a) Add some debugging messages to track what's going on with elogind's
cgroup handling.
b) Do not create a cgroup path "/elogind" if our cgroup root is
already "/elogind".
|
| |
|
|
|
|
|
| |
Add a highly reduced src/core/cgroup.[hc] to enable elogind to set up
cgroups for proper usage.
|
| |
|
| |
|
|
|
|
|
|
| |
This adds support for showing the accumulated consumed CPU time per-unit
in the "systemctl status" output. The property is also readable via the
bus.
|
|
|
|
| |
https://github.com/docker/docker/issues/10280
|
| |
|
| |
|
| |
|
|
|
|
| |
systemd[1]: Failed to set memory.limit_in_bytes on : Invalid argument
|
|
|
|
| |
mounted read-only
|
| |
|
|
|
|
|
|
|
| |
for leaf units
Otherwise a slice or delegation unit might move PIDs around ignoring the
fact that it is attached to a subcgroup.
|
|
|
|
| |
it's not quite as destructive as it sounds nowadays
|
|
|
|
|
|
|
|
|
| |
If a cgroup fails to be destroyed (most likely because there are still
processes running as part of a service after the main pid exits), don't
free and remove the cgroup unit from the manager. This fixes a
regression introduced by the cgroup rework in v205 where systemd would
forget about processes still running after the unit becomes inactive.
(This can happen when the main pid exits and KillMode=process or none).
|
|
|
|
|
| |
Using the same scripts as in f647962d64e "treewide: yet more log_*_errno
+ return simplifications".
|
|
|
|
|
|
|
|
|
|
|
| |
If the format string contains %m, clearly errno must have a meaningful
value, so we might as well use log_*_errno to have ERRNO= logged.
Using:
find . -name '*.[ch]' | xargs sed -r -i -e \
's/log_(debug|info|notice|warning|error|emergency)\((".*%m.*")/log_\1_errno(errno, \2/'
Plus some whitespace, linewrap, and indent adjustments.
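
To see what the sed expression above does to a single line, the same substitution can be replayed with Python's re module. The function name and the sample input are made up for this illustration; only the pattern and replacement come from the commit message.

```python
import re

# Same pattern and replacement as the sed command, in Python syntax.
PERCENT_M_CALL = re.compile(
    r'log_(debug|info|notice|warning|error|emergency)\((".*%m.*")')


def add_errno_argument(line):
    """Rewrite log_foo("...%m...") into log_foo_errno(errno, "...%m...")."""
    return PERCENT_M_CALL.sub(r'log_\1_errno(errno, \2', line)


before = 'log_error("Failed to open file: %m");'
after = add_errno_argument(before)
```

Lines whose format string lacks %m do not match and are returned unchanged.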
|
| |
|
|
|
|
| |
It correctly handles both positive and negative errno values.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As a followup to 086891e5c1 "log: add an "error" parameter to all
low-level logging calls and introduce log_error_errno() as log calls
that take error numbers", use sed to convert the simple cases to use
the new macros:
find . -name '*.[ch]' | xargs sed -r -i -e \
's/log_(debug|info|notice|warning|error|emergency)\("(.*)%s"(.*), strerror\(-([a-zA-Z_]+)\)\);/log_\1_errno(-\4, "\2%m"\3);/'
Multi-line log_*() invocations are not covered.
And we also should add log_unit_*_errno().
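
The effect of this conversion on a single-line call can be checked by replaying the sed expression with Python's re module. The function name and the sample lines are made up for this illustration; the pattern and replacement are the ones quoted above.

```python
import re

# Same pattern and replacement as the sed command, in Python syntax.
STRERROR_CALL = re.compile(
    r'log_(debug|info|notice|warning|error|emergency)'
    r'\("(.*)%s"(.*), strerror\(-([a-zA-Z_]+)\)\);')


def convert_strerror_call(line):
    """Replace a trailing "%s" + strerror(-r) pair with "%m" and an errno argument."""
    return STRERROR_CALL.sub(r'log_\1_errno(-\4, "\2%m"\3);', line)


simple = convert_strerror_call('log_error("Failed to fork: %s", strerror(-r));')
with_args = convert_strerror_call(
    'log_warning("Failed to stat %s: %s", path, strerror(-r));')
```

Because the pattern must match within one line, a call that is wrapped across several lines is left untouched, which matches the limitation noted in the commit message.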
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
subhierarchies
For privileged units this resource control property ensures that the
processes have all controllers systemd manages enabled.
For unprivileged services (those with User= set) this ensures that
access rights to the service cgroup are granted to the user in question,
to create further subgroups. Note that this only applies to the
name=systemd hierarchy though, as access to other controllers is not
safe for unprivileged processes.
Delegate=yes should be set for container scopes where a systemd instance
inside the container shall manage the hierarchies below its own cgroup
and have access to all controllers.
Delegate=yes should also be set for user@.service, so that systemd
--user can run, controlling its own cgroup tree.
This commit changes machined, systemd-nspawn@.service and user@.service
to set this boolean, in order to ensure that container management will
just work, and the user systemd instance can run fine.
|
|
|
|
|
|
|
|
|
|
|
| |
systemctl would print 'CPUQuotaPerSecUSec=(null)' for no limit. This
does not look right.
Since USEC_INFINITY is one of the valid values, format_timespan()
could return NULL, and we should wrap every use of it in strna() or
similar. But most callers didn't do that, and it seems more robust to
return a string ("infinity") that makes sense most of the time, even
if in some places the result will not be grammatically correct.
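
The idea of returning a printable string instead of NULL can be sketched as follows. This is a simplified stand-in, not systemd's format_timespan(): the unit table and output format are assumptions chosen for brevity, and only the handling of USEC_INFINITY mirrors the commit message.

```python
USEC_INFINITY = 2**64 - 1  # sentinel meaning "no limit"

_UNITS = [("h", 3_600_000_000), ("min", 60_000_000),
          ("s", 1_000_000), ("ms", 1_000), ("us", 1)]


def format_timespan(usec):
    """Format microseconds as text; never returns None, even for infinity."""
    if usec == USEC_INFINITY:
        return "infinity"
    if usec == 0:
        return "0"
    parts = []
    for suffix, size in _UNITS:
        count, usec = divmod(usec, size)
        if count:
            parts.append("%d%s" % (count, suffix))
    return " ".join(parts)
```

Since every input yields a string, callers no longer need a strna()-style wrapper before printing the result.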
|
|
|
|
|
|
|
|
| |
We'll stay in "initializing" until basic.target has been reached, at which
point we will enter "starting".
This is preparation so that we can change the startup timeout to only
apply to the first phase of startup, not the full procedure.
|
| |
|
| |
|