path: root/src/core/mount-setup.c
Commit message (Author, Age)
* tree-wide: remove Lennart's copyright lines (Lennart Poettering, 2018-08-24)
  These lines are generally out-of-date, incomplete and unnecessary. With SPDX and the git repository, much more accurate and fine-grained information about licensing and authorship is available, hence let's drop the per-file copyright notice.
  Of course, removing copyright lines of others is problematic, hence this commit only removes my own lines and leaves all others untouched. It might be nicer if sooner or later those could go away too, making git the only and accurate source of authorship information.
* tree-wide: drop 'This file is part of systemd' blurb (Lennart Poettering, 2018-08-24)
  This part of the copyright blurb stems from the GPL use recommendations: https://www.gnu.org/licenses/gpl-howto.en.html
  The concept appears to originate in times when version control was per file, instead of per tree, and was a way to glue the files together. Ultimately, we don't live in that world anymore, and this information is entirely useless anyway, as people are very welcome to copy these files into any projects they like, and they shouldn't have to change bits that are part of our copyright header for that.
  Hence, let's just get rid of this old cruft, and shorten our codebase a bit.
* tree-wide: unify how we define bit mask enums (Lennart Poettering, 2018-08-24)
  Let's always write "1 << 0", "1 << 1" and so on, except where we need more than 31 flag bits, where we write "UINT64(1) << 0" and so on to force 64-bit values.
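The convention from the commit above can be sketched as follows. The flag names are hypothetical and chosen only for illustration; the commit message writes `UINT64(1)`, while this sketch uses the standard `UINT64_C()` macro from `<stdint.h>` so that it compiles on its own.

```c
#include <stdint.h>

/* Hypothetical flag names, for illustration only. Up to 31 flag bits
 * fit into plain int constants, written as "1 << n". */
typedef enum ExampleFlags {
        EXAMPLE_FLAG_A = 1 << 0,
        EXAMPLE_FLAG_B = 1 << 1,
        EXAMPLE_FLAG_C = 1 << 2,
} ExampleFlags;

/* Beyond 31 bits the constant must be forced to 64 bits before shifting,
 * otherwise the shift would overflow an int. */
#define EXAMPLE_WIDE_FLAG_LOW  (UINT64_C(1) << 0)
#define EXAMPLE_WIDE_FLAG_HIGH (UINT64_C(1) << 40)

/* Returns non-zero if every bit of 'flag' is set in 'set'. */
static inline int example_has_flag(uint64_t set, uint64_t flag) {
        return (set & flag) == flag;
}
```

Writing every value as a shift makes the bit position visible at a glance and keeps narrow and wide flag sets looking alike.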
* core/mount-setup: remove part of check which is always true (Zbigniew Jędrzejewski-Szmek, 2018-08-24)
  f1470e424b2b5337e3c383d68dc5a26af1ff4ce6 removed one check, but missed a similar one a few lines down.
  CID #1390949.
* core/mount-setup: remove part of check which is always true (Zbigniew Jędrzejewski-Szmek, 2018-08-24)
  k was set to join_controllers and has only been incremented since, so it cannot be null at this point.
  CID #1390949.
* mount-setup: add a comment that the character/block device nodes are "optional" (#8893) (Lennart Poettering, 2018-08-24)
  If we lack privs to create device nodes that's fine, and creating /run/systemd/inaccessible/chr or /run/systemd/inaccessible/blk won't work then. Document this in longer comments.
  Fixes: #4484
* tree-wide: drop license boilerplate (Zbigniew Jędrzejewski-Szmek, 2018-08-24)
  Files which are installed as-is (any .service and other unit files, .conf files, .policy files, etc.) are left as is. My assumption is that SPDX identifiers are not yet that well known, so it's better to retain the extended header to avoid any doubt.
  I also kept any copyright lines. We can probably remove them, but it'd be nice to obtain explicit acks from all involved authors before doing that.
* machine-image,mount-setup: minor coding style fixes (Lennart Poettering, 2018-08-24)
* core: don't remount /sys/fs/cgroup for relabel if not needed (#8595) (Krzysztof Nowicki, 2018-08-24)
  The initial fix for relabelling the cgroup filesystem for SELinux delivered in commit 8739f23e3 was based on the assumption that the cgroup filesystem is already populated once mount_setup() is executed, which was true for my system. What I wasn't aware of is that this is the case only when another instance of systemd was running before this one, which can happen if systemd is used in the initrd (e.g. by dracut).
  In case of a clean systemd start-up the cgroup filesystem is actually populated after mount_setup() and does not need relabelling, as at that moment the SELinux policy is already loaded. Since, however, the root cgroup filesystem was remounted read-only in the meantime, this operation will now fail.
  To fix this, check the filesystem mount flags before relabelling and only remount ro->rw->ro if necessary, leaving the filesystem read-write otherwise.
  Fixes #7901.
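The "check the mount flags first" step might look roughly like the sketch below. It is not the code from the commit: the helper names are hypothetical, and the use of `statvfs()` with `ST_RDONLY` is an assumption about how the read-only state could be queried.

```c
#include <stdbool.h>
#include <sys/statvfs.h>

/* Hypothetical helper: given the f_flag bits reported for a mounted
 * filesystem, decide whether it has to be remounted read-write before
 * relabelling (and read-only again afterwards). */
static bool needs_rw_remount(unsigned long f_flag) {
        return (f_flag & ST_RDONLY) != 0;
}

/* Hypothetical helper: query the mount flags of the filesystem that
 * contains 'path'. Returns 1 if a ro->rw->ro cycle is needed, 0 if the
 * filesystem is already writable, -1 if the flags cannot be read. */
static int path_needs_rw_remount(const char *path) {
        struct statvfs buf;

        if (statvfs(path, &buf) < 0)
                return -1;

        return needs_rw_remount(buf.f_flag) ? 1 : 0;
}
```

A caller would relabel directly when the result is 0 and wrap the relabel in the two remounts only when it is 1, which matches the behaviour the commit message describes.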
* label: rework label_fix() implementations (#8583) (Lennart Poettering, 2018-08-24)
  This reworks the SELinux and SMACK label fixing calls in a number of ways:
  1. The two separate boolean arguments of these functions are converted into a flags type LabelFixFlags.
  2. The operations are now implemented based on O_PATH. This should resolve TOCTTOU races between determining the label for the file system object and applying it, as it allows pinning the object while we are operating on it.
  3. When changing a label fails we'll query the label previously set, and if it matches what we want to set anyway we'll suppress the error.
  Also, all calls to label_fix() are now (void)ified when we ignore the return values.
  Fixes: #8566
* macro: introduce TAKE_PTR() macro (Lennart Poettering, 2018-08-24)
  This macro will read a pointer of any type, return it, and set the pointer to NULL. This is useful as an explicit concept of passing ownership of a memory area between pointers.
  This takes inspiration from Rust: https://doc.rust-lang.org/std/option/enum.Option.html#method.take and was suggested by Alan Jenkins (@sourcejedi).
  It drops ~160 lines of code from our codebase, which makes me like it. Also, I think it clarifies passing of ownership, and thus helps readability a bit (at least for the initiated who know the new macro).
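A minimal sketch of such a macro and one way it could be used is shown below. The definition assumes GCC or Clang (it relies on statement expressions and `__typeof__`) and is meant to illustrate the idea rather than reproduce the project's header; `make_greeting()` is a hypothetical helper.

```c
#include <stdlib.h>
#include <string.h>

/* Sketch of a take-ownership macro: evaluates to the current value of
 * 'ptr' and resets 'ptr' to NULL. Needs GNU C extensions. */
#define TAKE_PTR(ptr)                           \
        ({                                      \
                __typeof__(ptr) _ptr_ = (ptr);  \
                (ptr) = NULL;                   \
                _ptr_;                          \
        })

/* Hypothetical helper: builds a string and hands ownership to the caller. */
static int make_greeting(char **ret) {
        char *s = malloc(6);
        if (!s)
                return -1;
        memcpy(s, "hello", 6);      /* copies the terminating NUL as well */

        *ret = TAKE_PTR(s);         /* ownership moves to *ret, s becomes NULL */
        free(s);                    /* free(NULL) is a no-op, so cleanup stays unconditional */
        return 0;
}
```

The point of the pattern is that the local variable can still be released unconditionally on every exit path, because after the hand-over it no longer refers to the memory.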
* Fix foul typo in f281944 (Sven Eden, 2018-06-29)
* Fix cgroup directory mounting: (Sven Eden, 2018-06-29)
  A little misunderstanding has been fixed, and elogind now mounts the following directories if (and only if) it has to act as its own cgroups controller:
  * -Ddefault-hierarchy=legacy: /sys/fs/cgroup as tmpfs, /sys/fs/cgroup/elogind as cgroup
  * -Ddefault-hierarchy=hybrid: the same as with 'legacy', plus /sys/fs/cgroup/unified as cgroup2
  * -Ddefault-hierarchy=unified: /sys/fs/cgroup2 as cgroup2
* core: don't remount /sys/fs/cgroup for relabel if not needed (#8595) (Krzysztof Nowicki, 2018-06-28)
  The initial fix for relabelling the cgroup filesystem for SELinux delivered in commit 8739f23e3 was based on the assumption that the cgroup filesystem is already populated once mount_setup() is executed, which was true for my system. What I wasn't aware of is that this is the case only when another instance of systemd was running before this one, which can happen if systemd is used in the initrd (e.g. by dracut).
  In case of a clean systemd start-up the cgroup filesystem is actually populated after mount_setup() and does not need relabelling, as at that moment the SELinux policy is already loaded. Since, however, the root cgroup filesystem was remounted read-only in the meantime, this operation will now fail.
  To fix this, check the filesystem mount flags before relabelling and only remount ro->rw->ro if necessary, leaving the filesystem read-write otherwise.
  Fixes #7901.
  (cherry picked from commit 6f7729c1767998110c4460c85c94435c5782a613)
  Also https://bugzilla.redhat.com/show_bug.cgi?id=1576240.
* core: do not free heap-allocated strings (#8391) (Yu Watanabe, 2018-06-28)
  Fixes #8387.
  (cherry picked from commit 5cbaad2f6795088db56063d20695c6444595822f)
* mount-setup: change bpf mount mode to 0700 (#8334) (Lennart Poettering, 2018-05-30)
  After discussing with the kernel folks, we agreed to default to 0700 for this. Better safe than sorry.
* mount-setup: always use the same source as fstype for the API VFS we mount (Lennart Poettering, 2018-05-30)
  So far, for all our API VFS mounts we used the fstype also as mount source; let's do that for the cgroupsv2 mounts too. The kernel doesn't really care about the source for API VFS, but it's visible to the user, hence let's clean this up and follow the rule we otherwise follow.
* bpf: mount bpffs by default on boot (Lennart Poettering, 2018-05-30)
  We make heavy use of BPF functionality these days, hence expose the BPF file system too by default now. (Note, however, that we don't actually make use of BPF file system objects yet, but we might later on.)
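The three commits above (bpffs mounted by default, mode 0700, mount source equal to fstype) can be pictured as entries in a small mount table. This is a sketch only: the struct, the table, the mount point `/sys/fs/bpf`, the `/sys/fs/cgroup` path, and the chosen mount flags are assumptions made for illustration and are not taken from the commits.

```c
#include <stddef.h>
#include <string.h>
#include <sys/mount.h>

/* Hypothetical table entry describing one API VFS mount. */
typedef struct ExampleMountPoint {
        const char *what;      /* mount source, kept identical to the fstype */
        const char *where;     /* mount point (assumed paths) */
        const char *type;      /* filesystem type */
        const char *options;   /* filesystem-specific options, may be NULL */
        unsigned long flags;   /* generic mount flags (assumed values) */
} ExampleMountPoint;

static const ExampleMountPoint example_table[] = {
        { "cgroup2", "/sys/fs/cgroup", "cgroup2", NULL,        MS_NOSUID|MS_NOEXEC|MS_NODEV },
        { "bpf",     "/sys/fs/bpf",    "bpf",     "mode=0700", MS_NOSUID|MS_NOEXEC|MS_NODEV },
};

/* Checks the "source equals fstype" rule for one entry. */
static int source_matches_type(const ExampleMountPoint *p) {
        return strcmp(p->what, p->type) == 0;
}

/* Performs the mount for one entry; requires suitable privileges. */
static int example_mount(const ExampleMountPoint *p) {
        return mount(p->what, p->where, p->type, p->flags, p->options);
}
```

Keeping source and type identical has no effect on the kernel side, as the commit message notes, but it makes the mount list shown to users uniform.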
* pid1: do not initialize join_controllers by default (Zbigniew Jędrzejewski-Szmek, 2018-05-30)
  We're moving towards the unified cgroup hierarchy, where this is not necessary. This makes main.c a bit simpler.
* mount-setup: fix MNT_CHECK_WRITABLE error handling, and log about the issue (Lennart Poettering, 2018-05-30)
  Let's correct the error handling (the error is in errno, not r), and let's add logging like the rest of the function has it.
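The class of bug described above, reading the error from the return value when it actually lives in `errno`, can be illustrated with a small sketch. `check_writable()` is a hypothetical helper and `access()` is an assumption about which call performs the writability check.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if 'path' is writable, or a negative
 * errno-style code otherwise. */
static int check_writable(const char *path) {
        if (access(path, W_OK) < 0) {
                /* access() only returns -1; the reason is in errno. Save it
                 * before logging, since later calls may overwrite errno. */
                int r = -errno;

                fprintf(stderr, "%s is not writable: %s\n", path, strerror(-r));
                return r;
        }

        return 0;
}
```

Saving `errno` into a local first is the important detail, because the logging call itself is allowed to change it.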
* Prep v236: Add missing SPDX-License-Identifier (3/9) src/core (Sven Eden, 2018-03-26)
* Fix SELinux labels in cgroup filesystem root directory (#7496) (Krzysztof Nowicki, 2017-11-30)
  When using SELinux with legacy cgroups the tmpfs on /sys/fs/cgroup is by default labelled as tmpfs_t. This label is also inherited by the "cpu" and "cpuacct" symbolic links. Unfortunately the policy expects them to be labelled as cgroup_t, which is used for all the actual cgroup filesystems. Failure to do so results in a stream of denials.
  This state cannot be fixed reliably when the cgroup filesystem structure is set up, as the SELinux policy is not yet loaded at that moment. It also cannot be fixed later, as the root of the cgroup filesystem is remounted read-only. In order to fix it, the root of the cgroup filesystem needs to be temporarily remounted read-write, relabelled and remounted back read-only.
* Prep 235: Make cgroups2 available, hybrid mode already works. (Sven Eden, 2018-01-10)
* build-sys: s/HAVE_SMACK/ENABLE_SMACK/ (Zbigniew Jędrzejewski-Szmek, 2017-12-08)
  Same justification as for HAVE_UTMP.
* build-sys: use #if Y instead of #ifdef Y everywhere (Zbigniew Jędrzejewski-Szmek, 2017-11-23)
  The advantage is that if the name is misspelt, cpp will warn us.
  $ git grep -Ee "conf.set\('(HAVE|ENABLE)_" -l|xargs sed -r -i "s/conf.set\('(HAVE|ENABLE)_/conf.set10('\1_/"
  $ git grep -Ee '#ifn?def (HAVE|ENABLE)' -l|xargs sed -r -i 's/#ifdef (HAVE|ENABLE)/#if \1/; s/#ifndef (HAVE|ENABLE)/#if ! \1/;'
  $ git grep -Ee 'if.*defined\(HAVE' -l|xargs sed -i -r 's/defined\((HAVE_[A-Z0-9_]*)\)/\1/g'
  $ git grep -Ee 'if.*defined\(ENABLE' -l|xargs sed -i -r 's/defined\((ENABLE_[A-Z0-9_]*)\)/\1/g'
  + manual changes to meson.build
  squash! build-sys: use #if Y instead of #ifdef Y everywhere
  v2:
  - fix incorrect setting of HAVE_LIBIDN2
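What the switch from `#ifdef` to `#if` buys can be seen in a short sketch. The two macro definitions stand in for a generated config header, where `conf.set10()` always defines each option as 0 or 1; the function names are hypothetical.

```c
/* Stand-ins for the generated config header: with conf.set10() every
 * option is always defined, either as 1 or as 0. */
#define ENABLE_SMACK 1
#define HAVE_SELINUX 0

/* Hypothetical helpers reporting the compile-time configuration. */
static int smack_enabled(void) {
#if ENABLE_SMACK
        return 1;
#else
        return 0;
#endif
}

static int selinux_enabled(void) {
        /* With "#ifdef HAVE_SELINUX" this would wrongly return 1, because
         * the macro is defined (as 0). "#if" tests the value instead. */
#if HAVE_SELINUX
        return 1;
#else
        return 0;
#endif
}
```

Because every option is always defined, a misspelt name in an `#if` is an undefined identifier, which GCC and Clang report when `-Wundef` is enabled; a misspelt name in an `#ifdef` would silently evaluate to false.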
* Prep v235: Apply pending upstream updates in src/core [2/4] (Sven Eden, 2017-08-30)
* Prep v235: Apply upstream fixes (4/10) [src/core] (Sven Eden, 2017-08-14)
* Prep v234: Eventually fix the cgroup stuff. elogind is not init. (Sven Eden, 2017-07-27)
* Prep v233.3: Unmask various functions for future coverage tests. (Sven Eden, 2017-07-19)
  These functions, although not used by elogind itself, are mostly tiny and crucial for important tests to work.
* Prep v233: Add missing updates from upstream in src/core (Sven Eden, 2017-07-17)
* core/mount-setup: if unified hierarchy is not supported, fall back to legacy (Zbigniew Jędrzejewski-Szmek, 2017-07-17)
  We need this to gracefully support older or strangely configured kernels.
  v2:
  - do not install a callback handler, just embed the right conditions into cg_is_*_wanted()
  v3:
  - fix bug in cg_is_legacy_wanted()
* Rename cg_is_unified_elogind_controller_wanted to cg_is_hybrid_wanted (Zbigniew Jędrzejewski-Szmek, 2017-07-17)
  Less typing and doesn't make the table so incredibly wide.
* core: add comment why we don't bother with MS_SHARED remounting of / in containers (Lennart Poettering, 2017-07-17)
* core: make hybrid cgroup unified mode keep compat /sys/fs/cgroup/elogind hierarchy (Tejun Heo, 2017-07-17)
  Currently the hybrid mode mounts cgroup v2 on /sys/fs/cgroup instead of the v1 name=elogind hierarchy. While this works fine for elogind itself, it breaks tools which expect cgroup v1 hierarchy on /sys/fs/cgroup/elogind.
  This patch updates the hybrid mode so that it mounts v2 hierarchy on /sys/fs/cgroup/unified and keeps v1 "name=elogind" hierarchy on /sys/fs/cgroup/elogind for compatibility. elogind itself doesn't depend on the "name=elogind" hierarchy at all. All operations take place on the v2 hierarchy as before but the v1 hierarchy is kept in sync so that any tools which expect it to be there can keep doing so. This allows elogind to take advantage of cgroup v2 process management without requiring other tools to be aware of the hybrid mode.
  The hybrid mode is implemented by mapping the special elogind controller to /sys/fs/cgroup/unified and making the basic cgroup utility operations - cg_attach(), cg_create(), cg_rmdir() and cg_trim() - also operate on the /sys/fs/cgroup/elogind hierarchy whenever the cgroup2 hierarchy is updated.
  While a bit messy, this will allow dropping complications from using cgroup v1 for process management a lot sooner than otherwise possible, which should make it a net gain in terms of maintainability.
  v2: Fixed !cgns breakage reported by @evverx and renamed the unified mount point to /sys/fs/cgroup/unified as suggested by @brauner.
  v3: chown the compat hierarchy too on delegation. Suggested by @evverx.
  v4: [zj]
  - drop the change to default, full "legacy" is still the default.
* tree-wide: stop using canonicalize_file_name(), use chase_symlinks() instead (Lennart Poettering, 2017-07-17)
  Let's use chase_symlinks() everywhere, and stop using GNU canonicalize_file_name() everywhere. For most cases this should not change behaviour, but it increases the exposure of our function so that it gets better tested. Most importantly, in a few cases (most notably nspawn) it can take the correct root directory into account when chasing symlinks.
* core: use the unified hierarchy for the elogind cgroup controller hierarchy (Tejun Heo, 2017-07-05)
  Currently, elogind uses either the legacy hierarchies or the unified hierarchy. When the legacy hierarchies are used, elogind uses a named legacy hierarchy mounted on /sys/fs/cgroup/elogind without any kernel controllers for process management. Due to the shortcomings in the legacy hierarchy, this involves a lot of workarounds and complexities.
  Because the unified hierarchy can be mounted and used in parallel to legacy hierarchies, there's no reason for elogind to use a legacy hierarchy for management even if the kernel resource controllers need to be mounted on legacy hierarchies. It can simply mount the unified hierarchy under /sys/fs/cgroup/elogind and use it without affecting other legacy hierarchies. This disables a significant amount of fragile workaround logic and would allow using features which depend on the unified hierarchy membership, such as the bpf cgroup v2 membership test. In time, this would also allow deleting the said complexities.
  This patch updates elogind so that it prefers the unified hierarchy for the elogind cgroup controller hierarchy when legacy hierarchies are used for kernel resource controllers.
  * cg_unified(@controller) is introduced, which tests whether the specific controller is on the unified hierarchy and is used to choose the unified hierarchy code path for process and service management when available. Kernel controller specific operations remain gated by cg_all_unified().
  * "elogind.legacy_elogind_cgroup_controller" kernel argument can be used to force the use of legacy hierarchy for elogind cgroup controller.
  * nspawn: By default nspawn uses the same hierarchies as the host. If UNIFIED_CGROUP_HIERARCHY is set to 1, unified hierarchy is used for all. If 0, legacy for all.
  * nspawn: arg_unified_cgroup_hierarchy is made an enum and now encodes one of three options - legacy, only elogind controller on unified, and unified. The value is passed into mount setup functions and controls cgroup configuration.
  * nspawn: Interpretation of SYSTEMD_CGROUP_CONTROLLER to the actual mount option is moved to mount_legacy_cgroup_hierarchy() so that it can take an appropriate action depending on the configuration of the host.
  v2:
  - CGroupUnified enum replaces open coded integer values to indicate the cgroup operation mode.
  - Various style updates.
  v3: Fixed a bug in detect_unified_cgroup_hierarchy() introduced during v2.
  v4: Restored legacy container on unified host support and fixed another bug in detect_unified_cgroup_hierarchy().
* Prep v231: Apply missing fixes from upstream (2/6) src/core (Sven Eden, 2017-06-16)
* Prep v230: Apply missing upstream fixes and updates (4/8) src/core. (Sven Eden, 2017-06-16)
* core/mount-setup.c: also relabel /dev/shm for selinux (#3039) (Harald Hoyer, 2017-06-16)
  Daemons which wish to transition state from the initramfs to the real root might use /dev/shm for their state. As /dev is not relabelled across mount points, /dev/shm has to be relabelled explicitly.
* Prep v229: Add missing fixes from upstream [2/6] src/core (Sven Eden, 2017-05-17)
* core: log about path_is_mount_point() errors (Lennart Poettering, 2017-05-17)
  We really shouldn't fail silently, but print a log message about these errors. Also make sure to attach error codes to all log messages where that makes sense.
  (While we are at it, add a couple of (void) casts to functions where we knowingly ignore return values.)
* mount-setup.c: fix handling of symlink Smack labelling in cgroup setup (Patrick Ohly, 2017-05-17)
  The code introduced in f8c1a81c51 (= elogind 227) failed for me with:
    Failed to copy smack label from net_cls to /sys/fs/cgroup/net_cls: No such file or directory
  There is no need for a symlink in this case because source and target are identical. The symlink() call is allowed to fail when the target already exists. When that happens, copying the Smack label must be skipped.
  But the code also failed when there is a symlink, like "cpu -> cpu,cpuacct", because mac_smack_copy() got called with src="cpu,cpuacct", which fails to find the entry because the current directory is not inside /sys/fs/cgroup. The absolute path to the existing entry must be used instead.
* Prep v228: Condense elogind source masks (5/5) (Sven Eden, 2017-04-26)
* Prep v228: Condense elogind source masks (4/5) (Sven Eden, 2017-04-26)
* Prep v228: Add remaining updates from upstream (3/3) (Sven Eden, 2017-04-26)
  Apply the remaining fixes, and carry the move of utility functions into their own foo-util.[hc] files over to the rest of elogind.
* [3/5] Apply missing fixes from upstream (Sven Eden, 2017-03-29)
* mount: propagate error codes correctly (David Herrmann, 2017-03-29)
  Make sure to propagate error codes from mount-loops correctly. Right now, we return the return-code of the first mount that did _something_. This is not what we want. Make sure we return an error if _any_ mount fails (and then make sure to return the first error to not hide proper errors due to consequential errors like -ENOTDIR).
  Reported by cee1 <fykcee1@gmail.com>.
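The propagation rule described above, fail if any step fails but report the first failure, can be sketched as a simple loop. The step functions and `run_all()` are hypothetical stand-ins for the real mount calls.

```c
#include <errno.h>
#include <stddef.h>

/* A mount step returns 0 (or a positive value) on success and a
 * negative errno-style code on failure. */
typedef int (*mount_step)(void);

/* Hypothetical steps, standing in for real mount attempts. */
static int step_ok(void)      { return 0; }
static int step_enoent(void)  { return -ENOENT; }
static int step_enotdir(void) { return -ENOTDIR; }   /* a follow-up error */

/* Runs every step, even after a failure, and returns the first error
 * seen, or 0 if all steps succeeded. */
static int run_all(const mount_step *steps, size_t n) {
        int r = 0;

        for (size_t i = 0; i < n; i++) {
                int q = steps[i]();

                if (q < 0 && r >= 0)
                        r = q;   /* keep only the first error */
        }

        return r;
}
```

Returning the first error rather than the last keeps the root cause visible when later steps fail only as a consequence, which is the -ENOTDIR case mentioned in the commit message.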
* smack: bugfix the smack label of symlink when '--with-smack-run-label' is set (Sangjung Woo, 2017-03-29)
  Even when elogind has its own smack label because the '--with-smack-run-label' configuration is set, the smack label of each CGROUP root directory should be the star (i.e. *) label. This is mainly because the current Linux kernel sets the label in this way. (Refer to smack_d_instantiate() in security/smack/smack_lsm.c.)
  However, if elogind has its own smack label and arg_join_controllers is explicitly set or initialized by the initialize_join_controllers() function, current elogind creates the symlink in the CGROUP root directory with its own smack label, as below.
    lrwxrwxrwx. 1 root root System 11 Dec 31 16:00 cpu -> cpu,cpuacct
    dr-xr-xr-x. 4 root root * 0 Dec 31 16:01 cpu,cpuacct
    lrwxrwxrwx. 1 root root System 11 Dec 31 16:00 cpuacct -> cpu,cpuacct
  This patch fixes that bug by copying the smack label from the origin.
* Add mounting of a name=elogind cgroup if no init controller is found. (Sven Eden, 2017-03-14)
  This is done for systems whose init systems are not cgroup controllers. One example is runit on Void Linux.
* Remove src/core (Andy Wingo, 2015-04-19)