Diffstat (limited to 'Documentation/btrfs-quota.asciidoc')
 Documentation/btrfs-quota.asciidoc | 208 +++-
 1 file changed, 201 insertions(+), 7 deletions(-)
diff --git a/Documentation/btrfs-quota.asciidoc b/Documentation/btrfs-quota.asciidoc
index 00e09c61..33c3bfd7 100644
--- a/Documentation/btrfs-quota.asciidoc
+++ b/Documentation/btrfs-quota.asciidoc
@@ -3,7 +3,7 @@ btrfs-quota(8)
NAME
----
-btrfs-quota - control the quota of a btrfs filesystem
+btrfs-quota - control the global quota status of a btrfs filesystem
SYNOPSIS
--------
@@ -11,14 +11,208 @@ SYNOPSIS
DESCRIPTION
-----------
-*btrfs quota* is used to enable/disable or rescan subvolume quota of a btrfs
-filesystem.
+The commands under *btrfs quota* are used to affect the global status of quotas
+of a btrfs filesystem. The quota groups (qgroups) themselves are managed by the
+`btrfs-qgroup`(8) subcommand.
-For setting quota or other quota operations on a btrfs filesystem, please see
-`btrfs-qgroup`(8) for details.
+NOTE: the qgroups are different from the traditional user quotas and are
+designed to track shared and exclusive data per subvolume. Please refer to the
+section 'HIERARCHICAL QUOTA GROUP CONCEPTS' for a detailed description.
-WARNING: Quota and qgroup in btrfs filesystem is not stable and impacts
-performance in mainline kernel yet(v3.14 so far).
+PERFORMANCE IMPLICATIONS
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+When quotas are turned on, they affect all extent processing, which incurs a
+performance hit. Turning on qgroups is not recommended unless the user intends
+to actually use them.
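+
+Quotas can be enabled or disabled at any time on a mounted filesystem, eg.
+(with /mnt as an example mount point):
+
+----
+# btrfs quota enable /mnt
+# btrfs quota disable /mnt
+----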
+
+STABILITY STATUS
+~~~~~~~~~~~~~~~~
+
+The qgroup implementation has turned out to be quite difficult, as it affects
+the core of filesystem operation. Users have hit various corner cases over
+time, eg. wrong accounting or system instability. The situation is gradually
+improving, but as of 4.7 issues are still being found and fixed.
+
+HIERARCHICAL QUOTA GROUP CONCEPTS
+---------------------------------
+
+The concept of quota has a long-standing tradition in the Unix world. Ever
+since computers have allowed multiple users to work simultaneously on one
+filesystem, there has been the need to prevent one user from using up the
+entire space. Every user should get a fair share of the available resources.
+
+In the case of files, the solution is quite straightforward. Each file has an
+'owner' recorded along with it, and it has a size. Traditional quota just
+restricts the total size of all files that are owned by a user. The concept is
+quite flexible: if a user hits his quota limit, the administrator can raise it
+on the fly.
+
+On the other hand, the traditional approach offers only a poor solution for
+restricting directories.
+At installation time, the hard disk can be partitioned so that every directory
+(eg. /usr, /var, ...) that needs a limit gets its own partition. The obvious
+problem is that those limits cannot be changed without a reinstallation. The
+btrfs subvolume feature builds a bridge. Subvolumes correspond in many ways to
+partitions, as every subvolume looks like its own filesystem. With subvolume
+quota, it is now possible to restrict each subvolume like a partition, but keep
+the flexibility of quota. The space for each subvolume can be expanded or
+restricted on the fly.
+
+As subvolumes are the basis for snapshots, interesting questions arise as to
+how to account used space in the presence of snapshots. If you have a file
+shared between a subvolume and a snapshot, to whom should the file be
+accounted? The creator? Both? What if the file gets modified in the snapshot,
+should only these changes be accounted to it? But wait, both the snapshot and
+the subvolume belong to the same user home. I just want to limit the total
+space used by both! But somebody else might not want to charge the snapshots
+to the users.
+
+Btrfs subvolume quota solves these problems by introducing groups of subvolumes
+and letting the user put limits on them. It is even possible to have groups of
+groups. In the following, we refer to them as 'qgroups'.
+
+Each qgroup primarily tracks two numbers, the amount of total referenced
+space and the amount of exclusively referenced space.
+
+referenced::
+the amount of data that can be reached from any of the subvolumes contained in
+the qgroup
+exclusive::
+the amount of data where all references to this data can be reached from
+within this qgroup
+
+SUBVOLUME QUOTA GROUPS
+~~~~~~~~~~~~~~~~~~~~~~
+
+The basic notion of the Subvolume Quota feature is the quota group, or qgroup
+for short. Qgroups are notated as 'level/id', eg. the qgroup 3/2 is a qgroup of
+level 3. For level 0, the leading '0/' can be omitted.
+Qgroups of level 0 get created automatically when a subvolume/snapshot gets
+created. The ID of the qgroup corresponds to the ID of the subvolume, so 0/5
+is the qgroup for the root subvolume.
+For the *btrfs qgroup* command, the path to the subvolume can also be used
+instead of '0/ID'. For all higher levels, the ID can be chosen freely.
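+
+As an illustration (with /mnt as an example mount point), a higher-level qgroup
+can be created explicitly, and the existing qgroups can be listed with *btrfs
+qgroup show*:
+
+----
+# btrfs qgroup create 1/100 /mnt
+# btrfs qgroup show /mnt
+----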
+
+Each qgroup can contain a set of lower level qgroups, thus creating a hierarchy
+of qgroups. Figure 1 shows an example qgroup tree.
+
+// TODO: insert Figure 1
+
+At the bottom, some extents are depicted, showing which qgroups reference which
+extents. It is important to understand the notion of 'referenced' vs
+'exclusive'. In the example, qgroup 0/2 references extents 2 and 3, 1/2
+references extents 2-4, and 2/1 references all extents.
+
+On the other hand, extent 1 is exclusive to 0/1, extent 2 is exclusive to 0/2,
+while extent 3 is neither exclusive to 0/2 nor to 0/3. But because both
+references can be reached from 1/2, extent 3 is exclusive to 1/2. All extents
+are exclusive to 2/1.
+
+So exclusive does not mean there is no other way to reach the extent, but it
+does mean that if you delete all subvolumes contained in a qgroup, the extent
+will get deleted.
+
+The exclusive count of a qgroup thus conveys the useful information of how
+much space will be freed in case all subvolumes of the qgroup get deleted.
+
+All data extents are accounted this way. Metadata that belongs to a specific
+subvolume (i.e. its filesystem tree) is also accounted. Checksums and extent
+allocation information are not accounted.
+
+In turn, the referenced count of a qgroup can be limited. All writes beyond
+this limit will lead to a 'Quota Exceeded' error.
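+
+For example, the limit of a qgroup can be set or reset like this (the path,
+the qgroup ID 0/257 and the 10G size are just examples):
+
+----
+# btrfs qgroup limit 10G 0/257 /mnt
+# btrfs qgroup limit none 0/257 /mnt
+----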
+
+INHERITANCE
+~~~~~~~~~~~
+
+Things get a bit more complicated when new subvolumes or snapshots are created.
+The case of (empty) subvolumes is still quite easy. If a subvolume should be
+part of a qgroup, it has to be added to the qgroup at creation time. To add it
+at a later time, it would be necessary to at least rescan the full subvolume
+for a proper accounting.
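+
+A sketch of both variants (the qgroup 1/1, the qgroup ID 0/258 and the paths
+are examples): a subvolume can be put into a qgroup at creation time with the
+'-i' option, or assigned later, followed by a rescan:
+
+----
+# btrfs subvolume create -i 1/1 /mnt/subvol
+
+# btrfs qgroup assign 0/258 1/1 /mnt
+# btrfs quota rescan /mnt
+----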
+
+Creation of a snapshot is the hard case. Obviously, the snapshot will
+reference the exact amount of space as its source, and both source and
+destination now have an exclusive count of 0 (the filesystem nodesize to be
+precise, as the roots of the trees are not shared). But what about qgroups of
+higher levels? If the qgroup contains both the source and the destination,
+nothing changes. If the qgroup contains only the source, it might lose some of
+its exclusive count.
+
+But how much? The tempting answer is to subtract the exclusive count of the
+source from the qgroup, but that is wrong, or at least not enough. There could
+have been an extent that is referenced from both the source and another
+subvolume in that qgroup. This extent would have been exclusive to the qgroup,
+but not to the source subvolume. With the creation of the snapshot, the qgroup
+would also lose this extent from its exclusive set.
+
+So how can this problem be solved? At the instant the snapshot gets created, we
+already have to know the correct exclusive count. We need a second qgroup that
+contains the same subvolumes as the first qgroup, except the subvolume we want
+to snapshot. The moment we create the snapshot, the exclusive count from the
+second qgroup needs to be copied to the first qgroup, as it represents the
+correct value. The second qgroup is called a tracking qgroup. It is only there
+in case a snapshot is needed.
+
+USE CASES
+~~~~~~~~~
+
+Below are some use cases; they are not meant to be exhaustive. You can find
+your own way to integrate qgroups.
+
+==== SINGLE-USER MACHINE ====
+
+`Replacement for partitions`
+
+The simplest use case is to use qgroups as a simple replacement for partitions.
+Btrfs takes the disk as a whole, and /, /usr, /var, etc. are created as
+subvolumes. As each subvolume automatically gets its own qgroup, they can
+simply be restricted. No hierarchy is needed for that.
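+
+A minimal sketch (the paths and sizes are examples, with /mnt as the mount
+point of the filesystem):
+
+----
+# btrfs quota enable /mnt
+# btrfs subvolume create /mnt/usr
+# btrfs subvolume create /mnt/var
+# btrfs qgroup limit 10G /mnt/usr
+# btrfs qgroup limit 5G /mnt/var
+----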
+
+`Track usage of snapshots`
+
+When a snapshot is taken, a qgroup for it will automatically be created with
+the correct values. 'Referenced' will show how much is in it, possibly shared
+with other subvolumes. 'Exclusive' will be the amount of space that gets freed
+when the snapshot is deleted.
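+
+Both numbers can be inspected with *btrfs qgroup show* (the '-p' option, which
+also prints the parent qgroups, is optional):
+
+----
+# btrfs qgroup show -p /mnt
+----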
+
+==== MULTI-USER MACHINE ====
+
+`Restricting homes`
+
+When you have several users on a machine, with home directories probably under
+/home, you might want to restrict /home as a whole, while also restricting
+every user to an individual limit. This is easily accomplished by creating a
+qgroup for /home, eg. 1/1, and assigning all user subvolumes to it.
+Restricting this qgroup will limit /home, while every user subvolume can get
+its own (lower) limit.
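+
+A sketch of this setup, assuming /home is the mount point of the filesystem
+and 0/257 and 0/258 are the IDs of two user subvolumes (all IDs, paths and
+sizes are examples):
+
+----
+# btrfs qgroup create 1/1 /home
+# btrfs qgroup assign 0/257 1/1 /home
+# btrfs qgroup assign 0/258 1/1 /home
+# btrfs qgroup limit 100G 1/1 /home
+# btrfs qgroup limit 20G 0/257 /home
+----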
+
+`Accounting snapshots to the user`
+
+Let's say the user is allowed to create snapshots via some mechanism. It would
+only be fair to account the space used by the snapshots to the user. This does
+not mean the user doubles his usage as soon as he takes a snapshot. Of course,
+files that are present in his home and in the snapshot should only be accounted
+once. This can be accomplished by creating a qgroup for each user, say
+'1/UID', and assigning the user home and all snapshots to this qgroup.
+Limiting it will extend the limit to all snapshots, counting files only once.
+To limit /home as a whole, a higher level group 2/1 replacing 1/1 from the
+previous example is needed, with all user qgroups assigned to it.
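+
+A sketch for one user with UID 1000 (the qgroup ID 1/1000, the subvolume ID
+0/257 and the paths are examples; /home is assumed to be the mount point):
+
+----
+# btrfs qgroup create 1/1000 /home
+# btrfs qgroup assign 0/257 1/1000 /home
+# btrfs subvolume snapshot -i 1/1000 /home/alice /home/alice-snap
+# btrfs qgroup limit 50G 1/1000 /home
+----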
+
+`Do not account snapshots`
+
+On the other hand, when the snapshots get created automatically, the user has
+no chance to control them, so the space used by them should not be accounted to
+him. This is already the case when creating snapshots in the example from
+the previous section.
+
+`Snapshots for backup purposes`
+
+This scenario is a mixture of the previous two. The user can create snapshots,
+but some snapshots for backup purposes are created by the system. The user's
+snapshots should be accounted to the user, not the system. The solution is
+similar to the one from the section 'Accounting snapshots to the user', but do
+not assign the system snapshots to the user's qgroup.
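+
+A sketch of the distinction (names are examples): the user's snapshot is put
+into the user's qgroup via '-i', while the system-created snapshot is not:
+
+----
+# btrfs subvolume snapshot -i 1/1000 /home/alice /home/alice-snap
+# btrfs subvolume snapshot /home/alice /home/backup-alice
+----
+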
SUBCOMMAND
----------