summaryrefslogtreecommitdiff
path: root/Documentation/btrfs-balance.asciidoc
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/btrfs-balance.asciidoc')
-rw-r--r--Documentation/btrfs-balance.asciidoc150
1 files changed, 147 insertions, 3 deletions
diff --git a/Documentation/btrfs-balance.asciidoc b/Documentation/btrfs-balance.asciidoc
index 7df40b9c..fff8a321 100644
--- a/Documentation/btrfs-balance.asciidoc
+++ b/Documentation/btrfs-balance.asciidoc
@@ -51,17 +51,32 @@ NOTE: A short syntax *btrfs balance <path>* works due to backward compatibility
but is deprecated and should not be used anymore. Use *btrfs balance start*
command instead.
+PERFORMANCE IMPLICATIONS
+------------------------
+
+Balance operation is intense namely in the IO respect, but can be also CPU
+intense. It affects other actions on the filesystem. There are typically lots
+of data being copied from one location to another, and lots of metadata get
+updated.
+
+Depending on the actual block group layout, it can be also seek-heavy. The
+performance on rotational devices is noticeably worse than on SSDs or fast
+arrays.
+
SUBCOMMAND
----------
*cancel* <path>::
-cancel running or paused balance
+cancel running or paused balance, the command will block and wait until the
+actually processed blockgroup is finished
*pause* <path>::
pause running balance operation, this will store the state of the balance
progress and used filters to the filesystem
*resume* <path>::
-resume interrupted balance
+resume interrupted balance, the balance status must be stored on the filesystem
+from previous run, eg. after it was forcibly interrupted and mounted again with
+'skip_balance'
*start* [options] <path>::
start the balance operation according to the specified filters, no filters
@@ -73,6 +88,10 @@ filesystem size. To prevent starting a full balance by accident, the user is
warned and has a few seconds to cancel the operation before it starts. The
warning and delay can be skipped with '--full-balance' option.
+
+Please note that the filters must be written together with the '-d', '-m' and
+'-s' options, because they're optional and bare '-d' etc alwo work and mean no
+filters.
++
`Options`
+
-d[<filters>]::::
@@ -94,7 +113,7 @@ If '-v' option is given, output will be verbose.
FILTERS
-------
From kernel 3.3 onwards, btrfs balance can limit its action to a subset of the
-full filesystem, and can be used to change the replication configuration (e.g.
+whole filesystem, and can be used to change the replication configuration (e.g.
moving data from single to RAID1). This functionality is accessed through the
'-d', '-m' or '-s' options to btrfs balance start, which filter on data,
metadata and system blocks respectively.
@@ -140,6 +159,9 @@ parameters.
+
NOTE: starting with kernel 4.5, the 'data' chunks can be converted to/from the
'DUP' profile on a single device.
++
+NOTE: starting with kernel 4.6, all profiles can be converted to/from 'DUP' on
+multi-device filesystems.
*limit=<number>*::
*limit=<range>*::
@@ -206,6 +228,128 @@ Conversion to profiles based on striping (RAID0, RAID5/6) require the work
space on each device. An interrupted balance may leave partially filled block
groups that might consume the work space.
+EXAMPLES
+--------
+
+A more comprehensive example when going from one to multiple devices, and back,
+can be found in section 'TYPICAL USECASES' of `btrfs-device`(8).
+
+MAKING BLOCK GROUP LAYOUT MORE COMPACT
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The layout of block groups is not normally visible, most tools report only
+summarized numbers of free or used space, but there are still some hints
+provided.
+
+Let's use the following real life example and start with the output:
+
+--------------------
+$ btrfs fi df /path
+Data, single: total=75.81GiB, used=64.44GiB
+System, RAID1: total=32.00MiB, used=20.00KiB
+Metadata, RAID1: total=15.87GiB, used=8.84GiB
+GlobalReserve, single: total=512.00MiB, used=0.00B
+--------------------
+
+Roughly calculating for data, '75G - 64G = 11G', the used/total ratio is
+about '85%'. How can we can interpret that:
+
+* chunks are filled by 85% on average, ie. the 'usage' filter with anything
+ smaller than 85 will likely not affect anything
+* in a more realistic scenario, the space is distributed unevenly, we can
+ assume there are completely used chunks and the remaining are partially filled
+
+Compacting the layout could be used on both. In the former case it would spread
+data of a given chunk to the others and removing it. Here we can estimate that
+roughly 850 MiB of data have to be moved (85% of a 1 GiB chunk).
+
+In the latter case, targeting the partially used chunks will have to move less
+data and thus will be faster. A typical filter command would look like:
+
+--------------------
+# btrfs balance start -dusage=50 /path
+Done, had to relocate 2 out of 97 chunks
+
+$ btrfs fi df /path
+Data, single: total=74.03GiB, used=64.43GiB
+System, RAID1: total=32.00MiB, used=20.00KiB
+Metadata, RAID1: total=15.87GiB, used=8.84GiB
+GlobalReserve, single: total=512.00MiB, used=0.00B
+--------------------
+
+As you can see, the 'total' amount of data is decreased by just 1 GiB, which is
+an expected result. Let's see what will happen when we increase the estimated
+usage filter.
+
+--------------------
+# btrfs balance start -dusage=85 /path
+Done, had to relocate 13 out of 95 chunks
+
+$ btrfs fi df /path
+Data, single: total=68.03GiB, used=64.43GiB
+System, RAID1: total=32.00MiB, used=20.00KiB
+Metadata, RAID1: total=15.87GiB, used=8.85GiB
+GlobalReserve, single: total=512.00MiB, used=0.00B
+--------------------
+
+Now the used/total ratio is about 94% and we moved about '74G - 68G = 6G' of
+data to the remaining blockgroups, ie. the 6GiB are now free of filesystem
+structures, and can be reused for new data or metadata block groups.
+
+We can do a similar exercise with the metadata block groups, but this should
+not be typically necessary, unless the used/total ration is really off. Here
+the ratio is roughly 50% but the difference as an absolute number is "a few
+gigabytes", which can be considered normal for a workload with snapshots or
+reflinks updated frequently.
+
+--------------------
+# btrfs balance start -musage=50 /path
+Done, had to relocate 4 out of 89 chunks
+
+$ btrfs fi df /path
+Data, single: total=68.03GiB, used=64.43GiB
+System, RAID1: total=32.00MiB, used=20.00KiB
+Metadata, RAID1: total=14.87GiB, used=8.85GiB
+GlobalReserve, single: total=512.00MiB, used=0.00B
+--------------------
+
+Just 1 GiB decrease, which possibly means there are block groups with good
+utilization. Making the metadata layout more compact would in turn require
+updating more metadata structures, ie. lots of IO. As running out of metadata
+space is a more severe problem, it's not necessary to keep the utilization
+ratio too high. For the purpose of this example, let's see the effects of
+further compaction:
+
+--------------------
+# btrfs balance start -musage=70 /path
+Done, had to relocate 13 out of 88 chunks
+
+$ btrfs fi df .
+Data, single: total=68.03GiB, used=64.43GiB
+System, RAID1: total=32.00MiB, used=20.00KiB
+Metadata, RAID1: total=11.97GiB, used=8.83GiB
+GlobalReserve, single: total=512.00MiB, used=0.00B
+--------------------
+
+GETTING RID OF COMPLETELY UNUSED BLOCK GROUPS
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Normally the balance operation needs a work space, to temporarily move the
+data before the old block groups gets removed. If there's no work space, it
+ends with 'no space left'.
+
+There's a special case when the block groups are completely unused, possibly
+left after removing lots of files or deleting snapshots. Removing empty block
+groups is automatic since 3.18. The same can be achieved manually with a
+notable exception that this operation does not require the work space. Thus it
+can be used to reclaim unused block groups to make it available.
+
+--------------------
+# btrfs balance start -dusage=0 /path
+--------------------
+
+This should lead to decrease in the 'total' numbers in the *btrfs fi df* output.
+
EXIT STATUS
-----------
*btrfs balance* returns a zero exit status if it succeeds. Non zero is