path: root/Documentation/btrfs-man5.asciidoc
diff options
Diffstat (limited to 'Documentation/btrfs-man5.asciidoc')
1 files changed, 465 insertions, 0 deletions
diff --git a/Documentation/btrfs-man5.asciidoc b/Documentation/btrfs-man5.asciidoc
new file mode 100644
index 00000000..7dd323fa
--- /dev/null
+++ b/Documentation/btrfs-man5.asciidoc
@@ -0,0 +1,465 @@
+btrfs-man5 - topics about the BTRFS filesystem (mount options, supported file attributes and other)
+This document describes topics related to BTRFS that are not specific to the
+tools. Currently covers:
+1. mount options
+2. file attributes
+This section describes mount options specific to BTRFS. For the generic mount
+options please refer to `mount`(8) manpage. The options are sorted alphabetically
+(discarding the 'no' prefix).
+(default: on)
+Enable/disable support for Posix Access Control Lists (ACLs). See the
+`acl`(5) manual page for more information about ACLs.
+The support for ACL is build-time configurable (BTRFS_FS_POSIX_ACL) and
+mount fails if 'acl' is requested but the feature is not compiled in.
+(default: 1M, minimum: 1M)
+Debugging option to force all block allocations above a certain
+byte threshold on each block device. The value is specified in
+bytes, optionally with a K, M, or G suffix (case insensitive).
+This option was used for testing and has no practical use, it's slated to be
+removed in the future.
+(since: 3.0, default: off)
+Enable automatic file defragmentation.
+When enabled, small random writes into files (in a range of tens of kilobytes,
+currently it's 64K) are detected and queued up for the defragmentation process.
+Not well suited for large database workloads.
+The read latency may increase due to reading the adjacent blocks that make up the
+range for defragmentation, successive write will merge the blocks in the new
+WARNING: Defragmenting with Linux kernel versions < 3.9 or ≥ 3.14-rc2 as
+well as with Linux stable kernel versions ≥ 3.10.31, ≥ 3.12.12 or
+≥ 3.13.4 will break up the ref-links of CoW data (for example files
+copied with `cp --reflink`, snapshots or de-duplicated data).
+This may cause considerable increase of space usage depending on the
+broken up ref-links.
+(default: on)
+Ensure that all IO write operations make it through the device cache and are stored
+permanently when the filesystem is at it's consistency checkpoint. This
+typically means that a flush command is sent to the device that will
+synchronize all pending data and ordinary metadata blocks, then writes the
+superblock and issues another flush.
+The write flushes incur a slight hit and also prevent the IO block
+scheduler to reorder requests in a more effective way. Disabling barriers gets
+rid of that penalty but will most certainly lead to a corrupted filesystem in
+case of a crash or power loss. The ordinary metadata blocks could be yet
+unwritten at the time the new superblock is stored permanently, expecting that
+the block pointers to metadata were stored permanently before.
+On a device with a volatile battery-backed write-back cache, the 'nobarrier'
+option will not lead to filesystem corruption as the pending blocks are
+supposed to make it to the permanent storage.
+(since: 3.0, default: off)
+These debugging options control the behavior of the integrity checking
+module (the BTRFS_FS_CHECK_INTEGRITY config option required). +
+`check_int` enables the integrity checker module, which examines all
+block write requests to ensure on-disk consistency, at a large
+memory and CPU cost. +
+`check_int_data` includes extent data in the integrity checks, and
+implies the check_int option. +
+`check_int_print_mask` takes a bitmask of BTRFSIC_PRINT_MASK_* values
+as defined in 'fs/btrfs/check-integrity.c', to control the integrity
+checker module behavior. +
+See comments at the top of 'fs/btrfs/check-integrity.c'
+for more info.
+Force clearing and rebuilding of the disk space cache if something
+has gone wrong. See also: 'space_cache'.
+(since: 3.12, default: 30)
+Set the interval of periodic commit. Higher
+values defer data being synced to permanent storage with obvious
+consequences when the system crashes. The upper bound is not forced,
+but a warning is printed if it's more than 300 seconds (5 minutes).
+(default: off)
+Control BTRFS file data compression. Type may be specified as 'zlib',
+'lzo' or 'no' (for no compression, used for remounting). If no type
+is specified, 'zlib' is used. If 'compress-force' is specified,
+all files will be compressed, whether or not they compress well. Otherwise
+some simple heuristics are applied to detect an incompressible file. If the
+first blocks written to a file are not compressible, the whole file is
+permanently marked to skip compression.
+NOTE: If compression is enabled, 'nodatacow' and 'nodatasum' are disabled.
+(default: on)
+Enable data copy-on-write for newly created files.
+'Nodatacow' implies 'nodatasum', and disables 'compression'. All files created
+under 'nodatacow' are also set the NOCOW file attribute (see `chattr`(1)).
+NOTE: If 'nodatacow' or 'nodatasum' are enabled, compression is disabled.
+(default: on)
+Enable data checksumming for newly created files.
+'Datasum' implies 'datacow', ie. the normal mode of operation. All files created
+under 'nodatasum' inherit the "no checksums" property, however there's no
+corresponding file attribute (see `chattr`(1)).
+NOTE: If 'nodatacow' or 'nodatasum' are enabled, compression is disabled.
+(default: off)
+Allow mounts with less devices than the raid profile constraints
+require. A read-write mount (or remount) may fail with too many devices
+missing, for example if a stripe member is completely missing from RAID0.
+Specify a path to a device that will be scanned for BTRFS filesystem during
+mount. This is usually done automatically by a device manager (like udev) or
+using the *btrfs device scan* command (eg. run from the initial ramdisk). In
+cases where this is not possible the 'device' mount option can help.
+NOTE: booting eg. a RAID1 system may fail even if all filesystem's 'device'
+paths are provided as the actual device nodes may not be discovered by the
+system at that point.
+(default: off)
+Enable discarding of freed file blocks using TRIM operation. This is useful
+for SSD devices, thinly provisioned LUNs or virtual machine images where the
+backing device understands the operation. Depending on support of the
+underlying device, the operation may severely hurt performance in case the TRIM
+operation is synchronous (eg. with SATA devices up to revision 3.0).
+If discarding is not necessary to be done at the block freeing time, there's
+`fstrim` tool that lets the filesystem discard all free blocks in a batch,
+possibly not much interfering with other operations. Also, the the device may
+ignore the TRIM command if the range is too small, so running the batch discard
+can actually discard the blocks.
+(default: off)
+Enable verbose output for some ENOSPC conditions. It's safe to use but can
+be noisy if the system reaches near-full state.
+(since: 3.4, default: bug)
+Action to take when encountering a fatal error.
+'BUG()' on a fatal error, the system will stay in the crashed state and may be
+still partially usable, but reboot is required for full operation
+'panic()' on a fatal error, depending on other system configuration, this may
+be followed by a reboot. Please refer to the documentation of kernel boot
+parameters, eg. 'panic', 'oops' or 'crashkernel'.
+(default: on)
+This option forces any data dirtied by a write in a prior transaction to commit
+as part of the current commit. This makes the committed state a fully
+consistent view of the file system from the application's perspective (i.e., it
+includes all completed file system operations). This was previously the
+behavior only when a snapshot was created.
+Disabling flushing may improve performance but is not crash-safe.
+(depends on compile-time option BTRFS_DEBUG, since: 4.4, default: off)
+A debugging helper to intentionally fragment given 'type' of block groups. The
+type can be 'data', 'metadata' or 'all'. This mount option should not be used
+outside of debugging environments and is not recognized if the kernel config
+option 'BTRFS_DEBUG' is not enabled.
+(since: 3.0, default: off)
+Enable free inode number caching. Not recommended to use unless files on your
+filesystem get assigned inode numbers that are approaching 2^64^. Normally, new
+files in each subvolume get assigned incrementally (plus one from the last
+time) and are not reused. The mount option turns on caching of the existing
+inode numbers and reuse of inode numbers of deleted files.
+This option may slow down your system at first run, or after mounting without
+the option.
+NOTE: Defaults to off due to a potential overflow problem when the free space
+checksums don't fit inside a single page.
+(default: on, even read-only)
+Enable/disable log replay at mount time. See also 'treelog'.
+WARNING: currently, the tree log is replayed even with a read-only mount! To
+disable that behaviour, mount also with 'nologreplay'.
+(default: min(2048, page size) )
+Specify the maximum amount of space, in bytes, that can be inlined in
+a metadata B-tree leaf. The value is specified in bytes, optionally
+with a K suffix (case insensitive). In practice, this value
+is limited by the filesystem block size (named 'sectorsize' at mkfs time),
+and memory page size of the system. In case of sectorsize limit, there's
+some space unavailable due to leaf headers. For example, a 4k sectorsize,
+maximum size of inline data is about 3900 bytes.
+Inlining can be completely turned off by specifying 0. This will increase data
+block slack if file sizes are much smaller than block size but will reduce
+metadata consumption in return.
+NOTE: the default value has changed to 2048 in kernel 4.6.
+(default: 0, internal logic)
+Specifies that 1 metadata chunk should be allocated after every 'value' data
+chunks. Default behaviour depends on internal logic, some percent of unused
+metadata space is attempted to be maintained but is not always possible if
+there's not enough space left for chunk allocation. The option could be useful to
+override the internal logic in favor of the metadata allocation if the expected
+workload is supposed to be metadata intense (snapshots, reflinks, xattrs,
+inlined files).
+(since: 3.2, default: off, deprecated since: 4.5)
+NOTE: this option has been replaced by 'usebackuproot' and should not be used
+but will work on 4.5+ kernels.
+(since: 4.5, default: off)
+Do not attempt any data recovery at mount time. This will disable 'logreplay'
+and avoids other write operations.
+NOTE: The opposite option 'recovery' used to have different meaning but was
+changed for consistency with other filesystems, where 'norecovery' is used for
+skipping log replay. BTRFS does the same and in general will try to avoid any
+write operations.
+(since: 3.12, default: off)
+Force check and rebuild procedure of the UUID tree. This should not
+normally be needed.
+(since: 3.3, default: off)
+Skip automatic resume of interrupted balance operation after mount.
+May be resumed with *btrfs balance resume* or the paused state can be removed
+by *btrfs balance cancel*. The default behaviour is to start interrutpd balance.
+('nospace_cache' since: 3.2, 'space_cache=v2' since 4.5, default: on)
+Options to control the free space cache. This affects performance as searching
+for new free blocks could take longer if the space cache is not enabled. On the
+other hand, managing the space cache consumes some resources. It can be
+disabled without clearing at mount time.
+There are two implementations of how the space is tracked. The safe default is
+'v1'. On large filesystems (many-terabytes) and certain workloads the 'v1'
+performance may degrade. This problem is addressed by 'v2', that is based on
+b-trees, sometimes referred to as 'free-space-tree'.
+'Compatibility notes:'
+* the 'v2' has to be enabled manually at mount time, once
+* kernel without 'v2' support will be able to mount the filesystem in read-only mode
+* 'v2' can be removed by mounting with 'clear_cache'
+(default: SSD autodetected)
+Options to control SSD allocation schemes. By default, BTRFS will
+enable or disable SSD allocation heuristics depending on whether a
+rotational or non-rotational disk is in use (contents of
+'/sys/block/DEV/queue/rotational'). The 'ssd' and 'nossd' options
+can override this autodetection.
+The 'ssd_spread' mount option attempts to allocate into bigger and aligned
+chunks of unused space, and may perform better on low-end SSDs. 'ssd_spread'
+implies 'ssd', enabling all other SSD heuristics as well.
+Mount subvolume from 'path' rather than the toplevel subvolume. The
+'path' is absolute (ie. starts at the toplevel subvolume).
+This mount option overrides the default subvolume set for the given filesystem.
+Mount subvolume specified by a 'subvolid' number rather than the toplevel
+subvolume. You can use *btrfs subvolume list* to see subvolume ID numbers.
+This mount option overrides the default subvolume set for the given filesystem.
+NOTE: if both 'subvolid' and 'subvol' are specified, they must point at the
+same subvolume, otherwise mount will fail.
+(irrelevant since: 3.2, formally deprecated since: 3.10)
+A workaround option from times (pre 3.2) when it was not possible to mount a
+subvolume that did not reside directly under the toplevel subvolume.
+(default: min(NRCPUS + 2, 8) )
+The number of worker threads to allocate. NRCPUS is number of on-line CPUs
+detected at the time of mount. Small number leads to less parallelism in
+processing data and metadata, higher numbers could lead to a performance hit
+due to increased locking contention, cache-line bouncing or costly data
+transfers between local CPU memories.
+(default: on)
+Enable the tree logging used for 'fsync' and 'O_SYNC' writes. The tree log
+stores changes without the need of a full filesystem sync. The log operations
+are flushed at sync and transaction commit. If the system crashes between two
+such syncs, the pending tree log operations are replayed during mount.
+WARNING: currently, the tree log is replayed even with a read-only mount! To
+disable that behaviour, mount also with 'nologreplay'.
+The tree log could contain new files/directories, these would not exist on
+a mounted filesystem if the log is not replayed.
+Enable autorecovery attempts if a bad tree root is found at mount time.
+Currently this scans a backup list of several previous tree roots and tries to
+use the first readable. This can be used with read-only mounts as well.
+NOTE: This option has replaced 'recovery'.
+(default: off)
+Allow subvolumes to be deleted by their respective owner. Otherwise, only the
+root user can do that.
+The btrfs filesystem supports setting the following file attributes using the
+`chattr`(1) utility:
+'append only', new writes are always written at the end of the file
+'no atime updates'
+'compress data', all data written after this attribute is set will be compressed.
+Please note that compression is also affected by the mount options or the parent
+directory attributes.
+When set on a directory, all newly created files will inherit this attribute.
+'no copy-on-write', file modifications are done in-place
+When set on a directory, all newly created files will inherit this attribute.
+NOTE: due to implementation limitations, this flag can be set/unset only on
+empty files.
+'no dump', makes sense with 3rd party tools like `dump`(8), on BTRFS the
+attribute can be set/unset on no other special handling is done
+'synchronous directory updates', for more details search `open`(2) for 'O_SYNC'
+and 'O_DSYNC'
+'immutable', no file data and metadata changes allowed even to the root user as
+long as this attribute is set (obviously the exception is unsetting the attribute)
+'synchronous updates', for more details search `open`(2) for 'O_SYNC' and
+'no compression', permanently turn off compression on the given file, other
+compression mount options will not affect that
+When set on a directory, all newly created files will inherit this attribute.
+No other attributes are supported. For the complete list please refer to the
+`chattr`(1) manual page.