btrfs-progs - Debian dgit repo for package btrfs-progs

	Commit message	Author	Age
...
*	Btrfs-progs: check, ability to detect and fix outdated snapshot root items	Filipe Manana	2014-10-17
	This change adds code to detect and fix the issue introduced in the kernel release 3.17, where creation of read-only snapshots led to a corrupted filesystem if they were created at a moment when the source subvolume/snapshot had orphan items. The issue was that the on-disk root items became incorrect, referring to the pre orphan cleanup root node instead of the post orphan cleanup root node.
	A test filesystem can be generated with the test case recently submitted for xfstests/fstests, which is essentially the following (bash script):
	    workout()
	    {
	        ops=$1
	        procs=$2
	        num_snapshots=$3
	        _scratch_mkfs >> $seqres.full 2>&1
	        _scratch_mount
	        snapshot_cmd="$BTRFS_UTIL_PROG subvolume snapshot -r $SCRATCH_MNT"
	        snapshot_cmd="$snapshot_cmd $SCRATCH_MNT/snap_\`date +'%H_%M_%S_%N'\`"
	        run_check $FSSTRESS_PROG -p $procs \
	            -x "$snapshot_cmd" -X $num_snapshots -d $SCRATCH_MNT -n $ops
	    }
	    ops=10000
	    procs=4
	    snapshots=500
	    workout $ops $procs $snapshots
	Example of btrfsck's (btrfs check) behaviour against such a filesystem:
	    $ btrfsck /dev/loop0
	    root item for root 311, current bytenr 44630016, current gen 60, current level 1, new bytenr 44957696, new gen 61, new level 1
	    root item for root 1480, current bytenr 1003569152, current gen 1271, current level 1, new bytenr 1004175360, new gen 1272, new level 1
	    root item for root 1509, current bytenr 1037434880, current gen 1300, current level 1, new bytenr 1038467072, new gen 1301, new level 1
	    root item for root 1562, current bytenr 33636352, current gen 1354, current level 1, new bytenr 34455552, new gen 1355, new level 1
	    root item for root 3094, current bytenr 1011712000, current gen 2935, current level 1, new bytenr 1008484352, new gen 2936, new level 1
	    root item for root 3716, current bytenr 80805888, current gen 3578, current level 1, new bytenr 73515008, new gen 3579, new level 1
	    root item for root 4085, current bytenr 714031104, current gen 3958, current level 1, new bytenr 716816384, new gen 3959, new level 1
	    Found 7 roots with an outdated root item.
	    Please run a filesystem check with the option --repair to fix them.
	    $ echo $?
	    1
	    $ btrfsck --repair /dev/loop0
	    enabling repair mode
	    fixing root item for root 311, current bytenr 44630016, current gen 60, current level 1, new bytenr 44957696, new gen 61, new level 1
	    fixing root item for root 1480, current bytenr 1003569152, current gen 1271, current level 1, new bytenr 1004175360, new gen 1272, new level 1
	    fixing root item for root 1509, current bytenr 1037434880, current gen 1300, current level 1, new bytenr 1038467072, new gen 1301, new level 1
	    fixing root item for root 1562, current bytenr 33636352, current gen 1354, current level 1, new bytenr 34455552, new gen 1355, new level 1
	    fixing root item for root 3094, current bytenr 1011712000, current gen 2935, current level 1, new bytenr 1008484352, new gen 2936, new level 1
	    fixing root item for root 3716, current bytenr 80805888, current gen 3578, current level 1, new bytenr 73515008, new gen 3579, new level 1
	    fixing root item for root 4085, current bytenr 714031104, current gen 3958, current level 1, new bytenr 716816384, new gen 3959, new level 1
	    Fixed 7 roots.
	    Checking filesystem on /dev/loop0
	    UUID: 2186e9b9-c977-4a35-9c7b-69c6609d4620
	    checking extents
	    checking free space cache
	    cache and super generation don't match, space cache will be invalidated
	    checking fs roots
	    checking csums
	    checking root refs
	    found 618537000 bytes used err is 0
	    total csum bytes: 130824
	    total tree bytes: 601620480
	    total fs tree bytes: 580288512
	    total extent tree bytes: 18464768
	    btree space waste bytes: 136939144
	    file data blocks allocated: 34150318080
	     referenced 27815415808
	    Btrfs v3.17-rc3-2-gbbe1dd8
	    $ echo $?
	    0
	Signed-off-by: Filipe Manana <fdmanana@suse.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
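The detection step described in this commit can be pictured roughly as follows. This is only a sketch in Python, not the btrfs-progs C implementation: it assumes the checker already knows, for each root id, the (bytenr, gen, level) recorded in the root item and the values of the newest root node found on disk. The names `RootInfo` and `find_outdated` are hypothetical, and the entry for root 5 is made up for contrast; the values for root 311 come from the sample output in the commit message.

```python
from typing import NamedTuple


class RootInfo(NamedTuple):
    bytenr: int
    gen: int
    level: int


def find_outdated(root_items, newest_nodes):
    """Return {root_id: (current, new)} for roots whose root item is stale."""
    outdated = {}
    for root_id, current in root_items.items():
        new = newest_nodes.get(root_id)
        # A root item is outdated when it does not describe the newest node.
        if new is not None and new != current:
            outdated[root_id] = (current, new)
    return outdated


# Root 311 mirrors the sample output; root 5 is a hypothetical healthy root.
root_items = {311: RootInfo(44630016, 60, 1), 5: RootInfo(29360128, 7, 0)}
newest_nodes = {311: RootInfo(44957696, 61, 1), 5: RootInfo(29360128, 7, 0)}

stale = find_outdated(root_items, newest_nodes)
# Mirror the exit status shown in the commit message: non-zero when
# outdated root items are found and no repair was requested.
exit_status = 1 if stale else 0
```

In repair mode the same comparison would presumably be followed by rewriting each stale root item with the new values, which is what the "fixing root item" lines in the output report.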
*	Btrfs-progs: check, fix return value check of is_child_root()	Filipe Manana	2014-10-16
	The following commit:
	    "btrfs-progs: fsck: remove unfriendly BUG_ON() for searching tree failure"
	    f495a2ac66116f0a1b15e73380c8cbca6e0a4ca0
	introduced a regression, detected through xfstests/btrfs/054. Previously a negative return value (-1) meant that a particular root didn't have any parent root; after that change, a negative value is also used to mean that an error happened. That change also made the only caller of is_child_root() interpret any negative return value as an error, so the caller incorrectly left with an error instead of continuing. This affects only the 3.17 release candidates (3.16 and older releases don't have this issue).
	Signed-off-by: Filipe Manana <fdmanana@suse.com>
	Reviewed-by: Wang Shilong <wangshilong1991@gmail.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
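The regression above is a case of one negative return value carrying two meanings. The following Python sketch shows one way to keep "no parent" and "lookup failed" apart. It is not the actual fix, whose details are not given in the message: `find_parent`, `check_roots`, and the use of -ENOENT and -EIO are assumptions made for illustration.

```python
import errno


def find_parent(parent_of, root_id):
    """Return the parent id, -ENOENT if the root has no parent,
    or -EIO if the lookup itself failed (simulated by a missing entry)."""
    if root_id not in parent_of:
        return -errno.EIO
    parent = parent_of[root_id]
    return parent if parent is not None else -errno.ENOENT


def check_roots(parent_of, root_ids):
    """Walk the roots, treating 'no parent' as normal and other negatives as errors."""
    checked = []
    for rid in root_ids:
        ret = find_parent(parent_of, rid)
        if ret == -errno.ENOENT:
            checked.append((rid, None))  # not an error: keep going
            continue
        if ret < 0:
            return ret, checked  # a real failure: leave with the error
        checked.append((rid, ret))
    return 0, checked


parents = {256: None, 257: 256}
ok = check_roots(parents, [256, 257])
failed = check_roots(parents, [999])
```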
*	Btrfs-progs: lookup all roots that point to a corrupt block	Josef Bacik	2014-10-14
	If we have a corrupt block that multiple snapshots point to, we will only fix the root that originally pointed to the block, and then simply loop forever because we keep finding the same bad block. So instead look up all roots that point to this block, and then search down to the block for each root and fix the block in all snapshots. Thanks,
	Signed-off-by: Josef Bacik <jbacik@fb.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
*	btrfs-progs: make fsck deal with bogus items	Josef Bacik	2014-10-14
	We can deal with corrupt items by deleting them in a few cases. Fsck can easily recover from a missing extent item or a dir index item. So if we notice an item is completely bogus and it is of a key that we know we can repair, then just delete it and carry on. Thanks,
	Signed-off-by: Josef Bacik <jbacik@fb.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
*	btrfs-progs: check blocks when checking fs roots	Josef Bacik	2014-10-14
	Usually if we find a bad block during the extent tree stuff we will error out, but if the bad block is in an fs tree and doesn't have extents in it then fsck may still pass even though the block was complete garbage. So add the check block logic to the fs root checking so we actually error out of fsck if there is a bad block. Thanks,
	Signed-off-by: Josef Bacik <jbacik@fb.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
*	btrfs-progs: add the ability to fix shifted item offsets	Josef Bacik	2014-10-14
	A user had a corrupted fs where the items had been shifted improperly. This patch adds the ability to fix this sort of problem within fsck. We will simply shift the item over to the proper offset and update the offsets to make sure they are correct. I tested this with a hand crafted fs that was broken in the same way as the user's, and I've included the file as a new test. Thanks,
	Signed-off-by: Josef Bacik <jbacik@fb.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
*	Btrfs-progs: deal with mismatch index between dir index and inode ref	Josef Bacik	2014-10-14
	Sometimes we have a dir index and an inode ref that don't agree on the index. In this case just assume that the inode ref is the ultimate authority on the subject and delete the dir index. This means we have to not reset index if we find a mismatched inode ref to make sure we delete the right dir index. Thanks,
	Signed-off-by: Josef Bacik <jbacik@fb.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
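The rule "the inode ref wins" can be sketched as below. This is an illustrative Python model, not the fsck code: `reconcile` is a hypothetical helper, and dir indexes are reduced to bare index numbers.

```python
def reconcile(inode_ref_index, dir_index_entries):
    """Split dir index entries into (keep, delete), trusting the inode ref."""
    keep = [idx for idx in dir_index_entries if idx == inode_ref_index]
    delete = [idx for idx in dir_index_entries if idx != inode_ref_index]
    return keep, delete


# The inode ref says index 3; a stray dir index claims 7 and gets deleted.
keep, delete = reconcile(3, [3, 7])
```

The mismatched index has to be remembered rather than reset, otherwise the wrong dir index would be deleted, which is the point the message makes about not resetting the index.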
*	Btrfs-progs: add a dummy backref if our location is wrong	Josef Bacik	2014-10-14
	If our location is bogus in our dir item we were just skipping the thing. However in this case we want to just delete the dir index, so create a dummy inode rec using BTRFS_MULTIPLE_OBJECTIDS and just add every backref we find to the list so we know to straight up delete all of these items. Thanks,
	Signed-off-by: Josef Bacik <jbacik@fb.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
*	Btrfs-progs: delete bogus dir indexes	Josef Bacik	2014-10-14
	We may run across dir indexes that are corrupt in such a way that it makes them useless, such as having a bad location key or a bad name. In this case we can just delete dir indexes that don't show up properly and then re-create what we need. When we delete dir indexes however we need to restart scanning the fs tree as we could have created bogus inode recs if the location key was bad, so set it up so that if we had to delete a dir index we go ahead and free up our inode recs and return -EAGAIN to check_fs_roots so it knows to restart the loop. Thanks,
	Signed-off-by: Josef Bacik <jbacik@fb.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
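The restart protocol described here (return -EAGAIN, drop cached state, rescan) might look roughly like this in outline. The Python below is a sketch under assumptions: `check_fs_roots` only borrows the name from the message, and `check_one` is a hypothetical per-root callback that returns -EAGAIN after it deletes something.

```python
import errno


def check_fs_roots(roots, check_one):
    """Scan all roots, restarting from the first whenever a check modified state."""
    restarts = 0
    while True:
        for root in roots:
            ret = check_one(root)
            if ret == -errno.EAGAIN:
                restarts += 1
                break  # cached records are stale: rescan from the start
            if ret < 0:
                return ret, restarts
        else:
            # The for loop finished without a restart request.
            return 0, restarts


bogus = {"b"}


def check_one(root):
    # Simulate deleting a bogus dir index the first time root "b" is seen.
    if root in bogus:
        bogus.discard(root)
        return -errno.EAGAIN
    return 0


result = check_fs_roots(["a", "b", "c"], check_one)
```

The loop terminates because each restart is triggered by a deletion, and a deleted item cannot trigger another one.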
*	Btrfs-progs: re-search tree root if it changes	Josef Bacik	2014-10-14
	If we change something while scanning fs-roots we need to redo our search so that we get valid root items and have valid root cache. Thanks,
	Signed-off-by: Josef Bacik <jbacik@fb.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
*	Btrfs-progs: reset chunk state if we restart check	Josef Bacik	2014-10-14
	If we hit a corrupt block that we fix and we restart the fsck loop you will get lots of noise about duplicate block groups and such. This is because we don't clear the block group and chunk cache when we do this restart. This patch fixes that, which is a little tricky since the structs are linked together with various linked lists, but this passed with a user who was hitting this problem. Thanks,
	Signed-off-by: Josef Bacik <jbacik@fb.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
*	Btrfs-progs: break out rbtree util functions	Josef Bacik	2014-10-14
	These were added to deal with duplicated functionality within btrfs-progs, but we specifically copied rbtree.c from the kernel, so move these functions out into their own file. This will make it easier to keep rbtree.c in sync. Thanks,
	Signed-off-by: Josef Bacik <jbacik@fb.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
*	Btrfs-progs: repair missing dir index	Josef Bacik	2014-10-13
	If we have an inode backref entry then we know enough to add back a missing dir index. When messing with the inode backrefs we need to do all of that first before we process the inode recs themselves as we may clear errors on the inode recs as we fix the directory indexes. This adds the framework for fixing backref errors and fixes missing dir index issues. Thanks,
	Signed-off-by: Josef Bacik <jbacik@fb.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
*	btrfs-progs: check: do not dereference tree_refs as data_refs	Alexandre Oliva	2014-10-10
	In a filesystem corrupted by a faulty memory module, btrfsck would get very confused attempting to access backrefs that weren't data backrefs as if they were. Besides invoking undefined behavior for accessing potentially-uninitialized data past the end of objects, or with dynamic types unrelated with the static types held in the corresponding memory, it used offsets and lengths from such fields that did not correspond to anything in the filesystem proper. Moving the test for full backrefs and checking that they're data backrefs earlier avoided the crash I was running into, but that was not enough to make the filesystem complete a successful repair.
	Signed-off-by: Alexandre Oliva <oliva@gnu.org>
	Signed-off-by: David Sterba <dsterba@suse.cz>
*	btrfs-progs: repair: remove recowed entry from the to-recow list	Alexandre Oliva	2014-10-10
	If we attempt to repair a filesystem with metadata blocks that need recowing, we'll get into an infinite loop repeatedly recowing the first entry in the list, without ever removing it from the list. Oops. Fixed.
	Signed-off-by: Alexandre Oliva <oliva@gnu.org>
	Signed-off-by: David Sterba <dsterba@suse.cz>
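The bug class here is a work-list loop that never shrinks. As a minimal Python sketch (the real code uses C linked lists; `recow_all` and the `recow` callback are hypothetical names), the fix amounts to removing each entry as it is processed:

```python
from collections import deque


def recow_all(to_recow, recow):
    """Recow every queued block, removing each entry once it is handled."""
    done = []
    while to_recow:
        # Popping the entry is what guarantees the loop terminates; peeking
        # at to_recow[0] without removing it would spin on the first entry.
        block = to_recow.popleft()
        recow(block)
        done.append(block)
    return done


queue = deque([4096, 8192, 12288])
recowed = []
handled = recow_all(queue, recowed.append)
```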
*	Btrfs-progs: fsck: deal with corrupted csum root	Wang Shilong	2014-10-10
	If the checksum root is corrupted, fsck will segfault. This is because if we fail to load the checksum root, the root's node is NULL, which causes NULL pointer dereferences later. To fix this problem, we do something like extent tree rebuilding: allocate a new one and clear the uptodate flag. We will do a sanity check before fsck goes on.
	Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
*	Btrfs-progs: fsck: only allow partial opening under repair mode	Wang Shilong	2014-10-10
	The reason that we allow partial opening is that sometimes we may have some corrupted trees (for example the extent tree); in the fsck repair case, the broken tree may be rebuilt later. So if users only want to check but not repair anything, this patch makes fsck return failure as soon as possible and tell users that some critical roots have been corrupted.
	Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
*	btrfs-progs: Check the consistency between the parent node and child node/leaf.	Qu Wenruo	2014-10-10
	When btrfs-progs walks down the tree, it does not check whether the child node/leaf is valid. In fact, there are some corrupted images whose csums are all valid but whose parent node points to an invalid leaf. In my case, the parent node in the fs tree points to an invalid leaf (gen 11), whose generation (15) and first key (EXTENT_TREE ROOT_ITEM 0) are completely invalid, and will cause a BUG_ON in process_inode_item(). Unfortunately, we are unable to fix this when it happens. So we can only output a meaningful error message and avoid the insane node/leaf, which is still much better than the original BUG_ON().
	Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
*	btrfs-progs: rebuild the crc tree with --init-csum-tree	Josef Bacik	2014-10-10
	We have --init-csum-tree, which just empties the csum tree. I'm not sure why we would ever need this, but we definitely need to be able to rebuild the csum tree in some cases. This patch adds the ability to completely rebuild the crc tree by reading all of the data and adding csum entries for them. This patch doesn't pay attention to NODATASUM inodes, it'll happily add csums for everything. Thanks,
	Signed-off-by: Josef Bacik <jbacik@fb.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
*	btrfs-progs: check, fix csum check in the presence of non-inlined refs	Filipe David Borba Manana	2014-10-10
	When we have non-inlined extent references, we were failing to find the corresponding extent item for an existing csum item in the csum tree.
	Reproducer:
	    mkfs.btrfs -f /dev/sdd
	    mount /dev/sdd /mnt
	    xfs_io -f -c "falloc 780366 135302" /mnt/foo
	    xfs_io -c "falloc 327680 151552" /mnt/foo
	    xfs_io -c "pwrite -S 0xff -b 131072 0 131072" /mnt/foo
	    sync
	    for i in `seq 1 40`; do btrfs subvolume snapshot /mnt /mnt/snap$i ; done
	    umount /mnt
	    btrfs check /dev/sdd
	The check command exited with status 1 and the following output:
	    Checking filesystem on /dev/sdd
	    UUID: 2416ab5f-9d71-457e-bb13-a27d4f6b399a
	    checking extents
	    checking free space cache
	    checking fs roots
	    checking csums
	    There are no extents for csum range 12980224-12984320
	    Csum exists for 12980224-12984320 but there is no extent record
	    found 1388544 bytes used err is 1
	    total csum bytes: 132
	    total tree bytes: 704512
	    total fs tree bytes: 573440
	    total extent tree bytes: 16384
	    btree space waste bytes: 564479
	    file data blocks allocated: 19341312
	     referenced 14606336
	    Btrfs v3.14.1-94-g80597e7
	After this change it no longer erroneously reports a missing extent for the csum item and exits with a status of 0. Also added missing btrfs_prev_leaf() return value checks, as we were ignoring errors and non-existence of left siblings completely.
	Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
*	btrfs-progs: fsck: add ability to check reloc roots	Wang Shilong	2014-10-10
	When encountering a system crash or balance ENOSPC errors, there may still be some reloc roots left. The way we store a reloc root is different from an fs root:
	    reloc root's root key (BTRFS_RELOC_TREE_OBJECTID, ROOT_ITEM, objectid)
	    fs root's root key (objectid, ROOT_ITEM, -1)
	    reloc data's root key (BTRFS_DATA_RELOC_TREE_OBJECTID, ROOT_ITEM, 0)
	So this patch uses the right key to search for the corresponding root node, and avoids using the normal fs root cache for reloc roots.
	Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
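The three key layouts listed in this commit can be restated as a small lookup helper. This Python sketch keeps the objectid constants symbolic, exactly as named in the message, rather than guessing their numeric values; `root_search_key` and the `kind` strings are hypothetical.

```python
ROOT_ITEM = "ROOT_ITEM"
RELOC_TREE = "BTRFS_RELOC_TREE_OBJECTID"
DATA_RELOC_TREE = "BTRFS_DATA_RELOC_TREE_OBJECTID"


def root_search_key(kind, objectid=None):
    """Build the (objectid, type, offset) key used to look up a root."""
    if kind == "reloc":
        # Reloc roots carry the owning root's objectid in the offset field.
        return (RELOC_TREE, ROOT_ITEM, objectid)
    if kind == "data_reloc":
        return (DATA_RELOC_TREE, ROOT_ITEM, 0)
    if kind == "fs":
        # -1 stands for the largest offset, as in the commit message.
        return (objectid, ROOT_ITEM, -1)
    raise ValueError(f"unknown root kind: {kind}")


reloc_key = root_search_key("reloc", 257)
fs_key = root_search_key("fs", 257)
```

The two keys for root 257 differ in which field holds the objectid, which is presumably why a cache keyed the fs-root way cannot serve reloc roots.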
*	btrfs-progs: fsck: finish transaction commit if repair error out	Wang Shilong	2014-10-10
	If btrfsck fails to repair, we hit something like the following:
	    Check tree block failed, want=29442048, have=0
	    Check tree block failed, want=29442048, have=0
	    Check tree block failed, want=29442048, have=0
	    Check tree block failed, want=29442048, have=0
	    Check tree block failed, want=29442048, have=0
	    read block failed check_tree_block
	    found 98304 bytes used err is 1
	    total csum bytes: 0
	    total tree bytes: 0
	    total fs tree bytes: 0
	    total extent tree bytes: 0
	    btree space waste bytes: 0
	    file data blocks allocated: 0
	     referenced 0
	    Btrfs v3.14.2-rc2-63-g3944f15
	    btrfs: transaction.h:38: btrfs_start_transaction: Assertion `!(root->commit_root)' failed.
	    Aborted (core dumped)
	This is because under repair mode we start a transaction, and if we error out, we don't finish this transaction. So close_ctree() will try to start and commit a transaction, which causes the above assertion failure.
	Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
*	btrfs-progs: fsck: remove unfriendly BUG_ON() for searching tree failure	Wang Shilong	2014-10-10
	Now btrfsck would hit an assertion failure for some tree search failures. It is true that a filesystem may get some metadata blocks corrupted, and btrfsck cannot deal with these corruptions. But users really don't want a BUG_ON() here; instead, just return errors to the caller.
	Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
*	btrfs-progs: fsck: clear out log tree in repair mode	Wang Shilong	2014-10-10
	Repair mode will commit a transaction, which will make us fail to load the log tree anymore. Give a warning to common users; if they really want to continue, we will clear out the log tree.
	Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
*	btrfs-progs: fsck: avoid pinning same block several times	Wang Shilong	2014-10-10
	This can not only give some speedups but also avoid an infinite loop with a really broken filesystem.
	Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
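Avoiding repeated pins is a plain "remember what was already handled" pattern. A minimal Python sketch, assuming blocks are identified by their byte number; `pin_blocks` is a hypothetical helper, not the btrfs-progs function:

```python
def pin_blocks(bytenrs):
    """Pin each distinct block once; return the pinned set and the pin count."""
    pinned = set()
    pin_calls = 0
    for bytenr in bytenrs:
        if bytenr in pinned:
            continue  # already pinned: skip the redundant work
        pinned.add(bytenr)
        pin_calls += 1
    return pinned, pin_calls


# A broken tree may reference the same block many times.
pinned, pin_calls = pin_blocks([4096, 8192, 4096, 4096])
```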
*	btrfs-progs: Check the csum tree node before go through the csum tree	Qu Wenruo	2014-10-01
	[BUG] Some fsfuzzed btrfs images will cause btrfsck to segfault.
	[REPRODUCER] Run btrfsck on an image with a corrupted csum tree block.
	[REASON] The check_csums() function calls btrfs_search_slot() on csum_tree but doesn't check whether the csum_tree contains a valid extent_buffer, which causes the segfault.
	[FIX] Check the csum_root->node before any search.
	Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
*	btrfs-progs: add root to dirty list when fixing bad keys	Josef Bacik	2014-10-01
	A user reported a WARN_ON() when trying to run btrfsck --repair on his fs with bad key ordering. This was because the root that was broken wasn't part of the transaction yet. We do this open coded thing in a few other places in fsck, so just make it a helper function and make sure all the places that need to call it do call it. With this patch he was able to run repair without it dying. Thanks,
	Signed-off-by: Josef Bacik <jbacik@fb.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
*	btrfs-progs: kill BUG_ON in readahead_tree_block()	Zach Brown	2014-09-14
	David sent a quick patch that removed a BUG_ON(). I took a peek and found that the function was already leaking an eb ref and only returned 0. So this fixes the leak and makes the function void and fixes up the callers.
	Accidentally-motivated-by: David Sterba <dsterba@suse.cz>
	Signed-off-by: Zach Brown <zab@zabbo.net>
	Signed-off-by: David Sterba <dsterba@suse.cz>
*	btrfs-progs: check: do not require argument for --subvol-extents	David Sterba	2014-08-22
	    $ btrfs check --subvol-extents /dev/sdx
	    ERROR: /dev/sdx is not a valid numeric value.
	Signed-off-by: David Sterba <dsterba@suse.cz>
*	btrfs-progs: use check_argc_* to check arg number for all tools	Gui Hecheng	2014-08-22
	Since this patch:
	    btrfs-progs: move the check_argc_* functions into utils.c
	all tools, including the independent tools (e.g. btrfs-image, btrfs-convert), can share the convenience of the check_argc_* functions, so this patch adopts the argc check functions globally.
	Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
*	btrfs-progs: show extent state for a subvolume	Mark Fasheh	2014-08-22
	The qgroup verification code can trivially be extended to provide extended information on the extents which a subvolume root references. Along with qgroup-verify, I have found this tool to be invaluable when tracking down extent references.
	The patch adds a switch to the check subcommand, '--subvol-extents', which takes a single subvolume id as its argument. When run with the switch, we'll print out each extent that the subvolume references. The extent printout gives standard extent info you would expect along with information on which other roots reference it.
	Sample output follows - this is a few lines from a run on a subvolume I've been testing qgroup changes on:
	    Print extent state for subvolume 281 on /dev/vdb2
	    UUID: 8203ca66-9858-4e3f-b447-5bbaacf79c02
	    Offset      Len    Root Refs  Roots
	    12582912    20480  12         257 279 280 281 282 283 284 285 286 287 288 289
	    12603392    8192   12         257 279 280 281 282 283 284 285 286 287 288 289
	    12611584    12288  12         257 279 280 281 282 283 284 285 286 287 288 289
	    <snip a bunch of extents to show some variety>
	    124583936   16384  4          281 282 283 280
	    125075456   16384  4          280 281 282 283
	    126255104   16384  11         257 280 281 282 283 284 285 286 287 288 289
	    4763508736  4096   3          279 280 281
	In case it wasn't clear, this applies on top of my qgroup verify patch:
	    "btrfs-progs: add quota group verify code"
	A branch with all this can be found on github:
	    https://github.com/markfasheh/btrfs-progs-patches/tree/qgroup-verify
	Please apply,
	Signed-off-by: Mark Fasheh <mfasheh@suse.de>
	Signed-off-by: David Sterba <dsterba@suse.cz>
*	Btrfs-progs: fsck: reduce memory usage for extents check	Wang Shilong	2014-08-22
	Steps to reproduce:
	    # mkfs.btrfs -f /dev/sda9 -b 2g
	    # mount /dev/sda9 /mnt
	    # dd if=/dev/zero of=/mnt/data bs=4k oflag=direct
	    # btrfs file df /mnt
	    Data, single: total=1.66GiB, used=1.66GiB
	    System, single: total=4.00MiB, used=16.00KiB
	    Metadata, single: total=200.00MiB, used=67.88MiB
	For a filesystem without snapshots and about 70M of metadata, extent checking uses up to about 110M of memory, which is a nightmare for systems with low memory. It is very likely that an extent record can be freed quickly for a filesystem without snapshots, so improve this by checking whether the record can be freed after adding data/tree backrefs. This patch reduces the peak memory cost of extent checking from 110M to 40M for the above case.
	Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
*	btrfs-progs: check: Fix wrong level access	Hugo Mills	2014-08-22
	There's no reason to assume that the bad key order is in a leaf block, so accessing level 0 of the path is going to be an error if it's actually a node block that's bad.
	Reported-by: Chris Mason <clm@fb.com>
	Signed-off-by: Hugo Mills <hugo@carfax.org.uk>
	Signed-off-by: David Sterba <dsterba@suse.cz>
*	Btrfs-progs: fsck: switch to is_fstree()	Wang Shilong	2014-08-22
	Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
*	Btrfs-progs: fsck: add an option to check data csums	Wang Shilong	2014-08-22
	This patch adds an option '--check-data-csum' to verify data checksums. fsck won't check data csums unless users specify this option explicitly.
	Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
*	btrfs-progs: add quota group verify code	Mark Fasheh	2014-08-22
	This patch adds functionality (in qgroup-verify.c) to compute bytecounts in subvolume quota groups. The original groups are read in and stored in memory so that after we compute our own bytecounts, we can compare them with those on disk. A print function is provided to do this comparison and show the results on the console.
	A 'qgroup check' pass is added to btrfsck. If any subvolume quota groups differ from what we compute, the differences for them are printed. We also provide an option '--qgroup-report' which will run only the quota check code and print a report on all quota groups.
	Other than making it possible to verify that our qgroup changes work correctly, this mode can also be used in xfstests for automated checking after qgroup tests.
	This patch does not address the following:
	- compressed counts are identical to non compressed, because kernel doesn't make the distinction yet. Adding the code to verify compressed counts shouldn't be hard at all though once kernel can do this.
	- It is only concerned with subvolume quota groups (like most of btrfs-progs).
	Signed-off-by: Mark Fasheh <mfasheh@suse.de>
	Signed-off-by: David Sterba <dsterba@suse.cz>
*	Btrfs-progs: make smatch checker happy (trivial fixes)	Rakesh Pandit	2014-05-02
	It complains errno never gets assigned to zero in find-root and since errno anyway is zero at program startup, let's remove it. The check "copy is less than zero" isn't possible because strtoull, used by arg_strtou64, wouldn't return a negative number. Trivial space fixes.
	Signed-off-by: Rakesh Pandit <rakesh@tuxera.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
*	Btrfs-progs: fsck: while checking root refs print readable errors	Rakesh Pandit	2014-05-02
	Let's use "errors" instead of "error" because more than one ref error is possible. Also print error messages for unresolved refs in check_root_refs.
	Signed-off-by: Rakesh Pandit <rakesh@tuxera.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
*	Btrfs-progs: update btrfs_file_extent_inline_len to match kernel version	Filipe David Borba Manana	2014-04-11
	The following kernel commit changed the definition of the inline function btrfs_file_extent_inline_len():
	    commit 514ac8ad8793a097c0c9d89202c642479d6dfa34
	    Author: Chris Mason <clm@fb.com>
	    Date: Fri Jan 3 21:07:00 2014 -0800
	    Btrfs: don't use ram_bytes for uncompressed inline items
	    If we truncate an uncompressed inline item, ram_bytes isn't updated to reflect the new size. The fix uses the size directly from the item header when reading uncompressed inlines, and also fixes truncate to update the size as it goes.
	Not having this new definition implies that the restore tool might misbehave when restoring files with an inline extent that got truncated on a kernel older than release 3.14.
	Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
*	Btrfs-progs: fsck: fix wrong index in pick_next_pending()	Wang Shilong	2014-04-11
	Though all tree blocks have the same size, we'd better use the right index here.
	Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
*	Btrfs-progs: fsck: reduce memory usage of extent record struct	Wang Shilong	2014-04-11
	Two changes:
	1. Use a bit field for @found_rec.
	2. u32 is enough to count the number of duplicate extents.
	Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
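The effect of these two changes on struct size can be illustrated with Python's ctypes, which follows the platform C layout rules. The two structs below are hypothetical two-field stand-ins, not the real extent record: only the field name `found_rec` comes from the message, and `num_duplicates` and the "before" field types are assumptions.

```python
import ctypes


class WideRecord(ctypes.Structure):
    # Assumed "before" layout: 64-bit counter plus a full int used as a flag.
    _fields_ = [
        ("num_duplicates", ctypes.c_uint64),
        ("found_rec", ctypes.c_int),
    ]


class CompactRecord(ctypes.Structure):
    # "After" layout: 32-bit counter plus a one-bit flag.
    _fields_ = [
        ("num_duplicates", ctypes.c_uint32),
        ("found_rec", ctypes.c_uint32, 1),
    ]


wide_size = ctypes.sizeof(WideRecord)
compact_size = ctypes.sizeof(CompactRecord)

rec = CompactRecord(num_duplicates=3, found_rec=1)
```

Exact sizes depend on the platform's alignment rules, so the sketch only relies on the compact layout being smaller. The saving matters because fsck keeps one record per extent in memory.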
*	Btrfs-progs: fsck: fix possible memory leaks in run_next_block()	Wang Shilong	2014-04-11
	We still need to free the allocated cache memory in case an error happens.
	Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
*	Btrfs-progs: fsck: don't free @seen cache until we finish searching	Wang Shilong	2014-04-11
	The @seen cache is used to avoid iterating over the same block more than once, and we cannot free it until we have finished searching.
	Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
*	Btrfs-progs: fsck: fix memory leak and unnecessary call to free	Rakesh Pandit	2014-03-21
	Free already allocated memory to item1_data if malloc fails for item2_data in swap_values. Seems to be a typo from commit 70749a77.
	Signed-off-by: Rakesh Pandit <rakesh@tuxera.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
	Signed-off-by: Chris Mason <clm@fb.com>
*	Btrfs-progs: fsck: handle case that we can not lookup extent info	Wang Shilong	2014-03-21
	Previously, --init-extent-tree worked just because btrfs_lookup_extent_info() blindly returned 0, and this made it work if there was no FULL BACKREF mode in the broken filesystem. It is just a coincidence that the --init-extent-tree option works, so let's do it the right way first. For now, we do not support rebuilding the extent tree if there is any FULL BACKREF mode, which means that if the broken filesystem has snapshots, avoid using the --init-extent-tree option for now.
	Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
	Signed-off-by: Chris Mason <clm@fb.com>
*	Btrfs-progs: fsck: force to update tree root for some cases	Wang Shilong	2014-03-21
	Committing roots won't update the root item in the tree root if it finds the updated root's bytenr is the same as before. However, this is not right for fsck; we need to update the tree root in the following cases:
	1. We overwrite the previous root node.
	2. We reinit the reloc data tree; this is because we skipped pinning the reloc data tree before, which means we can allocate the same block as before.
	Fix this by updating the tree root ourselves for the above cases.
	Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
	Signed-off-by: Chris Mason <clm@fb.com>
*	Btrfs-progs: fsck: insert root dir into reloc data tree when reiniting it	Wang Shilong	2014-03-21
	There are two bugs when resetting balance:
	1. We skip reinitializing the reloc data tree if no reloc root is found; however, this is not right because we don't pin the reloc data tree before.
	2. We should insert the root dir into the reloc data tree, otherwise fsck will fail.
	Fix the problems by forcibly reinitializing the reloc data root and inserting the root dir.
	Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
	Signed-off-by: Chris Mason <clm@fb.com>
*	Btrfs-progs: fsck: reset balance after reiniting extent root	Wang Shilong	2014-03-21
	Resetting balance needs to cow a block, which will insert an extent item into the extent tree. If we do this before reinitializing the extent root, we may encounter EEXIST.
	Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
	Signed-off-by: Chris Mason <clm@fb.com>
*	Btrfs-progs: fsck: deal with really corrupted extent tree	Wang Shilong	2014-03-21
	To reinit the extent root, we need to find a free extent; however, we may have a really corrupted extent tree, so we can't rely on the existing extent tree to cache block groups any more. During testing, we failed to reinit the extent tree because we could not find a free extent, so let's build the block group cache ourselves first.
	Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
	Signed-off-by: Chris Mason <clm@fb.com>
*	Btrfs-progs: record generation for tree blocks in fsck	Josef Bacik	2014-03-21
	When working with a user who had a broken file system I noticed that we were reading a bad copy of a block when the other copy was perfectly fine. This is because we don't keep track of the parent generation for tree blocks, so we just read whichever copy we damned well please with no regards for which is best. This fixes this problem by recording the parent generation of the tree block so we can be sure to read the most correct copy before we check it, which will give us a better chance of fixing really broken filesystems. Thanks,
	Signed-off-by: Josef Bacik <jbacik@fb.com>
	Signed-off-by: David Sterba <dsterba@suse.cz>
	Signed-off-by: Chris Mason <clm@fb.com>
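Choosing between mirrored copies by generation could be sketched as follows. This Python model is an assumption about the idea, not the btrfs-progs read path: `pick_copy` is a hypothetical helper, and each copy is reduced to a (mirror number, generation) pair.

```python
def pick_copy(copies, parent_gen):
    """Return the mirror whose generation matches what the parent recorded,
    or None if no copy matches."""
    for mirror, gen in copies:
        if gen == parent_gen:
            return mirror
    return None


# Mirror 1 holds a stale copy (gen 40); the parent expects gen 42.
chosen = pick_copy([(1, 40), (2, 42)], parent_gen=42)
```

Without a recorded parent generation there is nothing to compare against, so the first readable copy gets used even when it is the bad one, which is the situation the message describes.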