| Commit message | Author | Age |
|
|
|
|
|
|
|
|
|
| |
The normal back reference counting doesn't care about the extent referred
to by the extent data in the shared leaf. The check_extent_data_backref
function needs to skip any leaf whose owner does not match the root_id.
Reported-by: Marc MERLIN <marc@merlins.org>
Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
| |
extent interrupt
Make lowmem mode output more detailed information about file extent
interrupt.
Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
super sizes
Along with the rescue introduced, also introduce check and repair for them.
Unlike normal check functions, some of the checks are optional, and even if
the image fails an optional check, the kernel can still run fine
(but may emit noisy kernel warnings).
So some checks, mainly for alignment, will not cause btrfs check to fail,
but only output a warning and instructions on how to fix it.
For repair, it just calls the same repair function in rescue, and is
included in 'btrfs check --repair'.
But 'btrfs rescue' is still the preferred method, since it can be used
independent of all the 'check' passes, if we know what's the exact
problem to fix.
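The split between mandatory and optional checks can be sketched as below.
This is only an illustration: the names check_size_alignment(),
overall_check_failed() and the enum are hypothetical and are not taken
from btrfs-progs.

```c
#include <stdint.h>

/* Hypothetical result of an optional alignment check. */
enum size_check_result { SIZE_ALIGNED = 0, SIZE_UNALIGNED_WARN = 1 };

/* Optional check: an unaligned size is reported as a warning only. */
enum size_check_result check_size_alignment(uint64_t bytes, uint32_t sectorsize)
{
	if (sectorsize == 0 || bytes % sectorsize != 0)
		return SIZE_UNALIGNED_WARN;
	return SIZE_ALIGNED;
}

/* Only mandatory errors make the overall check fail. */
int overall_check_failed(int mandatory_errors, enum size_check_result optional)
{
	(void)optional;	/* warnings never change the final status */
	return mandatory_errors != 0;
}
```

With this shape, an unaligned size produces a warning for the user while
the check as a whole can still pass.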
Signed-off-by: Qu Wenruo <quwenruo.btrfs@gmx.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Lowmem check does not skip invalid type in extent_inline_ref and then
calls btrfs_extent_inline_ref_size(type) which causes a crash.
Error:
$ btrfs check --mode=lowmem /tmp/data_small
Checking filesystem on /tmp/data_small
UUID: ee205d69-8724-4aa2-a4f5-bc8558a62169
checking extents
ERROR: extent[20971520 16384] backref type mismatch, missing bit: 2
ERROR: extent[20971520 16384] backref generation mismatch,
wanted: 7, have: 0
ERROR: extent[20971520 16384] is referred by other roots than 3
ctree.h:1754: btrfs_extent_inline_ref_size: BUG_ON `1` triggered,
value 1
btrfs(+0x543db)[0x55fabc2ab3db]
btrfs(+0x587f7)[0x55fabc2af7f7]
btrfs(+0x5fa44)[0x55fabc2b6a44]
btrfs(cmd_check+0x194a)[0x55fabc2bd717]
btrfs(main+0x88)[0x55fabc2682e0]
/usr/lib/libc.so.6(__libc_start_main+0xea)[0x7f021c3824ca]
btrfs(_start+0x2a)[0x55fabc267e7a]
[1] 5188 abort (core dumped) btrfs check --mode=lowmem /tmp/data_small
Fix it by introducing check_extent_inline_ref() to check the type.
If the checker returns a non-zero value, we should not try to check the
corrupted extent item anymore.
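A minimal sketch of such a type check follows. The function name
check_inline_ref_type() and the constant names are hypothetical; the
numeric values are believed to match the btrfs key types for the four
inline backref kinds and should be treated as illustrative.

```c
#include <stdint.h>

/* Assumed key type values of the four inline backref kinds. */
#define TREE_BLOCK_REF_KEY	176
#define EXTENT_DATA_REF_KEY	178
#define SHARED_BLOCK_REF_KEY	182
#define SHARED_DATA_REF_KEY	184

/*
 * Hypothetical validator: returns 0 for a known inline backref type and
 * -1 otherwise, so the caller can stop before asking for the size of an
 * unknown type.
 */
int check_inline_ref_type(uint8_t type)
{
	switch (type) {
	case TREE_BLOCK_REF_KEY:
	case EXTENT_DATA_REF_KEY:
	case SHARED_BLOCK_REF_KEY:
	case SHARED_DATA_REF_KEY:
		return 0;
	default:
		return -1;
	}
}
```

A caller would test the return value first and skip the rest of the
extent item when it is non-zero.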
Suggested-by: Qu Wenruo <quwenruo.btrfs@gmx.com>
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Return value of repair_root_items():
<0 on error
=0 does nothing
>0 if repair is enabled, N roots are repaired;
else N roots are corrupted.
In the repair mode, there should be no error if the return value is
bigger than 0. This fixes the test fsck/006 again.
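How a caller might interpret that return value can be sketched as
follows. The helper root_items_result_is_error() is hypothetical and only
restates the convention listed above.

```c
/*
 * Hypothetical helper: map the return value of a repair_root_items()-like
 * function to "is this an error?". Returns 1 for error, 0 otherwise.
 */
int root_items_result_is_error(int ret, int repair)
{
	if (ret < 0)
		return 1;	/* hard error */
	if (ret == 0)
		return 0;	/* nothing was done */
	/* ret > 0: N roots repaired in repair mode, else N roots corrupted */
	return repair ? 0 : 1;
}
```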
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The annotation of repair_root_items says:
"This must be run before any other repair code - not doing it so,
makes other repair code delete or modify backrefs in the extent tree
for example, which will result in an inconsistent fs after repairing
the root items."
However, the rule was broken by commit 1f728b1a514f ("Btrfs-progs,
fsck: move root items repair after root rebuilding").
The commit intends to fix failure of test-fsck/013 so it moves
repair_root_items() after check_extents_and_chunks().
The correct way is to skip calling repair_root_items() when
init_extent_tree is non-zero.
Now put repair_root_items() before do_check_chunks_and_extents() and
do not call repair_root_items() if init_extent_tree is set.
Then test-fsck/013 works well.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In original check mode (without option --repair), check_extent_refs()
always returns 0.
Add a variable @err to record status while checking extents. At the end
of check_extent_refs(), let it return -EIO if @err is non-zero.
The test fsck/006-bad-root-items will fail after this patch and is fixed by
the following patches.
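The pattern of recording a status while scanning and returning -EIO at
the end can be sketched as below. The struct and function names are
hypothetical stand-ins, not the real check_extent_refs() code.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified extent record. */
struct extent_rec_sketch {
	int broken;
};

/* Scan all records, remember whether any was bad, report once at the end. */
int check_extent_refs_sketch(const struct extent_rec_sketch *recs, size_t n)
{
	int err = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (recs[i].broken)
			err = 1;	/* keep scanning so every problem is seen */
	}
	return err ? -EIO : 0;
}
```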
Example:
$ btrfs check bad-extent-inline-ref-type.raw
Checking filesystem on bad-extent-inline-ref-type.raw
UUID: 1942d6fe-617b-4499-9982-cc8ffae5447f
checking extents
corrupt extent record: key 29360128 169 16384
ref mismatch on [29360128 16384] extent item 0, found 1
Backref 29360128 parent 5 root 5 not found in extent tree
backpointer mismatch on [29360128 16384]
bad extent [29360128, 29376512), type mismatch with chunk
checking free space cache
checking fs roots
checking csums
checking root refs
found 114688 bytes used, no error found
total csum bytes: 0
total tree bytes: 114688
total fs tree bytes: 32768
total extent tree bytes: 16384
btree space waste bytes: 109471
file data blocks allocated: 0
referenced 0
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
[ add note about the failing test, rename variable to err ]
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Add a macro named BG_ACCOUNT_ERROR meaning that block group used size
does not equal the total.
After extent-tree repair, BG_ACCOUNT_ERROR should be fixed up.
Clean bits at end of check_chunks_and_extents_v2().
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The only thing repair_extent_data_item() does is add a backref of the
tree_block, just like the original mode does.
It first searches for the corresponding extent item.
1. If the extent item exists but the backref is missing, add one backref to the
extent.
2. If nothing is found, add an extent item and one backref.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The only thing repair_tree_block_ref() does is add a backref of the
tree_block, just like the original repair does.
It first searches for the corresponding extent item, then:
1. If the extent item exists but the backref is missing, add one backref to the
extent.
2. If nothing is found, add an extent item and one backref.
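The two cases can be sketched with a toy model. The struct and the
function repair_ref_sketch() are hypothetical; real code operates on
extent tree items, not on an in-memory struct.

```c
/* Hypothetical in-memory stand-in for an extent item. */
struct extent_item_sketch {
	int exists;	/* was the extent item found by the search? */
	int backrefs;	/* number of backrefs attached to it */
};

/* Called when a backref is known to be missing. */
void repair_ref_sketch(struct extent_item_sketch *item)
{
	if (!item->exists) {
		/* case 2: nothing found, create the extent item first */
		item->exists = 1;
		item->backrefs = 0;
	}
	/* cases 1 and 2: add the one missing backref */
	item->backrefs++;
}
```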
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Because this patchset concentrates on repair of extent tree,
repair_chunk_item() now only inserts missed chunk group item into
extent tree.
There are some things left TODO, for example dev_item fix.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce delete_extent_tree_item() and repair_extent_item() to do
only deletion.
While checking the extent tree, just delete the wrong item. For extent
item, free wrong backref. Otherwise, delete. So the remaining items in
extent tree should be correct.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch is a preparation for extent-tree repair in lowmem mode.
In the lowmem mode, checking tree blocks of various trees is recursive.
But during repair, adding or deleting item(s) may modify upper nodes
which will cause the repair to be complicated and dangerous.
Before this patch:
One problem of lowmem check is that it only checks the lowest node's
backref in check_tree_block_ref.
This ensures the checked tree blocks are valid and avoids traversing
all trees, for performance reasons.
However, it has one shortcoming: it cannot detect a backref error
when an extent has owner == offset but lacks the other backref(s).
In check, correctness is more important than speed.
If errors can not be detected, repair is impossible.
Changes in the patch:
check_chunks_and_extents now has to check *ALL* trees so lowmem check
will behave like original mode.
Changing the traversal to match that of the fs tree, which calls
walk_down_tree_v2() and walk_up_tree_v2(), makes further repair
easier.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
[ heavy coding style fixes ]
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Since repair functions will search path again, if the last item
was checked, the location where the path points is invalid.
Fix it by saving the last valid key if err contains LAST_ITEM,
and call btrfs_next_item() before return of check_inode_item().
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
While checking file extents, there are two errors that may occur:
1) There is one hole between the last extent end and beginning of the
current extent but no-holes is disabled.
2) No-holes is disabled, one file's nbytes equals 0 but isize is not 0.
Both mean the file may have lost some extents.
To avoid btrfsck's error message, fix it by introducing function
'punch_extent_hole' to punch holes.
For case 1, punch a hole extent whose length is
(current extent begin - last extent end)
while checking one extent.
For case 2, punch a hole extent whose length is
(file isize - actual file size)
after traversing one entire file.
Then repair_inode_nbytes will set the nbytes to isize.
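The two hole lengths can be written out as a small sketch. The function
names are hypothetical; the guards against underflow are an addition of
this sketch and are not described in the commit.

```c
#include <stdint.h>

/* Case 1: gap between the end of the last extent and the current one. */
uint64_t hole_len_between_extents(uint64_t last_end, uint64_t cur_begin)
{
	return cur_begin > last_end ? cur_begin - last_end : 0;
}

/* Case 2: gap between the recorded isize and the actual file size. */
uint64_t hole_len_at_tail(uint64_t isize, uint64_t actual_size)
{
	return isize > actual_size ? isize - actual_size : 0;
}
```

For example, with the last extent ending at 4096 and the current extent
beginning at 12288, the hole to punch is 8192 bytes long.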
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
| |
New function repair_inode_nlinks_lowmem() sets nlink of the inode to refs.
If refs equals 0, move the inode to lost+found and set refs to 1
initially.
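The rule can be sketched as follows. The struct and function names are
hypothetical, and moving an inode to lost+found is reduced to a flag
here.

```c
#include <stdint.h>

/* Hypothetical, simplified inode state. */
struct inode_sketch {
	uint32_t nlink;
	int in_lost_found;
};

/* Set nlink to refs; an inode without refs goes to lost+found first. */
void repair_nlinks_sketch(struct inode_sketch *inode, uint32_t refs)
{
	if (refs == 0) {
		inode->in_lost_found = 1;	/* stands in for the real move */
		refs = 1;			/* the new lost+found link */
	}
	inode->nlink = refs;
}
```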
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
repair_ternary_lowmem() may delete dir_item(s), so later traversal can
compute a wrong isize for the directory inode.
Introduce count_dir_isize() to count directory isize if any
dir_item(s) in the directory has been repaired.
check_dir_item() now returns DIR_COUNT_AGAIN, which means the isize of
the inode should be counted again.
It is unnecessary to recount after check_inode_ref(), since
inode_ref is irrelevant to isize.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce repair_ternary_lowmem() to repair dir_item, dir_index
and inode_ref.
If two of the three are missing or mismatched, call btrfs_unlink() to
delete the existing one.
If one of three is missing or mismatched, call btrfs_add_link() to
add the missing one.
repair_dir_item() inserts an inode item corresponding to location in the
dir item if error contains INODE_ITEM_MISSING.
Also, it calls repair_ternary_lowmem() to repair relationship of
dir_item, dir_index and inode_ref.
check_inode_ref() calls repair_ternary_lowmem() to fix up errors.
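The decision between unlinking and adding a link can be sketched as
below. The enum and ternary_decide() are hypothetical names; the case
where all three items are bad is not described in the commit message and
is only marked as unspecified here.

```c
/* Hypothetical outcome of inspecting dir_item, dir_index and inode_ref. */
enum ternary_action {
	TERNARY_OK,		/* all three present and matching */
	TERNARY_ADD_LINK,	/* one is bad: add the missing one */
	TERNARY_UNLINK,		/* two are bad: delete the existing one */
	TERNARY_UNSPECIFIED	/* all three bad: not covered by the text */
};

enum ternary_action ternary_decide(int dir_item_bad, int dir_index_bad,
				   int inode_ref_bad)
{
	int bad = !!dir_item_bad + !!dir_index_bad + !!inode_ref_bad;

	switch (bad) {
	case 0:
		return TERNARY_OK;
	case 1:
		return TERNARY_ADD_LINK;
	case 2:
		return TERNARY_UNLINK;
	default:
		return TERNARY_UNSPECIFIED;
	}
}
```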
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
| |
Introduce 'repair_fs_first_inode' to repair first inode errors
(ref missing and inode item missing).
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
| |
Introduce __create_inode_item() to create a new inode item.
It is called by create_inode_item() and create_inode_item_lowmem().
Function repair_inode_item_missing() just adds a new inode item.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
| |
For code reuse, btrfs_insert_dir_item() now calls
inserts_with_overflow() even if the dir_item existed.
Add a parameter @ignore_existed to btrfs_add_link().
If @ignore_existed is not zero, btrfs_add_link() continues to do link.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
| |
check_dir_item() now checks relative dir_item/dir_index.
Introduce print_dir_item_err() to print error msg while
checking dir_item/dir_index.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce print_inode_ref() to print error msg while checking inode ref.
Add args @name_ret and @namelen_ret to check_inode_ref().
Name is essential if the inode item is to be put into lost+found
while doing nlinks repair.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The changes in the patch are for further repair:
1. Introduce find_dir_index() to get the index by traversing items.
2. We should distinguish dir_index errors from dir_item errors.
However, there are only DIR_ITEM_MISSING and DIR_ITEM_MISMATCH.
Introduce macros DIR_INDEX_MISSING and DIR_INDEX_MISMATCH
to represent index missing/mismatch.
3. find_dir_item() currently prints a message as soon as it detects any
error.
Remove the message output now; the next patches will introduce functions
to print error messages.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Modify check_fs_first_inode to check the inode ref in the first inode.
The root dir inode differs from other inodes in that its inode_ref
points to "..".
So we just handle this special case and treat it as a normal
inode in the rest of the check.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
| |
For further lowmem repair, change the type of @index from u64 to u64 *
in function find_inode_ref(),
so the caller can get the index of the ref.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce repair_inode_orphan_item_lowmem() to add an orphan
item if the inode refs and nlink are both zero.
repair_inode_orphan_item_lowmem() is just a wrapper function
that calls btrfs_add_orphan_item().
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
| |
After traversal of whole directory, we should get the actual isize.
Like original mode, function repair_dir_isize_lowmem() sets isize of the
directory inode item to actual size.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
| |
After checking one entire inode item, we should get the actual
nbytes of the inode item.
Like original mode, repair_inode_nbytes_lowmem() sets nbytes in
struct btrfs_inode_item to the actual nbytes.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
| |
Turn on the option --repair with --mode=lowmem in btrfs check.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
[ use warning() and adjust wording ]
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The check opens the given device in exclusive mode by default. In the forced
mode we want to access a device in use, so we have to drop the
exclusivity bit.
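Dropping the exclusivity bit can be sketched as below. This assumes the
exclusivity is ultimately expressed as O_EXCL in the open(2) flags, which
on Linux makes opening a block device that is in use fail; the function
name check_open_flags() is hypothetical.

```c
#include <fcntl.h>

/* Build open(2) flags for the checker; force != 0 allows a device in use. */
int check_open_flags(int force)
{
	int flags = O_RDONLY | O_EXCL;	/* exclusive by default */

	if (force)
		flags &= ~O_EXCL;	/* forced mode: drop exclusivity */
	return flags;
}
```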
This works for block devices but not for files, that could be mounted
via a loop device. In that respect test check/007 is broken and will be
fixed.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
| |
We now have two data structures that can be used to iterate the same data
set, and there may be quite a few of them in memory. Eliminating the
list_head member will reduce memory consumption while iterating over
the extent backrefs.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For the pathological case, like xfstests generic/297 that creates a
large file consisting of one, repeating reflinked extent, fsck can
take hours. The root cause is that calling find_data_backref while
iterating the extent records is an O(n^2) algorithm. For my
example test run, n was 2*2^20 and fsck was at 8 hours and counting.
This patch supplements the list with an rbtree and drops the runtime
of that testcase to about 20 seconds.
A previous version of this patch introduced a regression that would
have corrupted file systems during repair. It was traced to the
compare algorithm honoring ->bytes regardless of whether the
reference had been found and a failure to reinsert nodes after
the target reference was found.
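The effect of replacing a linear list search with an ordered lookup can
be sketched as below. A sorted array with qsort() and bsearch() stands in
for the rbtree used by the patch, and the struct and function names are
hypothetical. Each lookup then costs O(log n) instead of O(n).

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified data backref key. */
struct backref_sketch {
	uint64_t owner;
	uint64_t offset;
};

/* Total order on (owner, offset), usable by qsort() and bsearch(). */
static int backref_cmp(const void *a, const void *b)
{
	const struct backref_sketch *x = a;
	const struct backref_sketch *y = b;

	if (x->owner != y->owner)
		return x->owner < y->owner ? -1 : 1;
	if (x->offset != y->offset)
		return x->offset < y->offset ? -1 : 1;
	return 0;
}

/* Sort once so that later lookups can use binary search. */
void backrefs_sort(struct backref_sketch *refs, size_t n)
{
	qsort(refs, n, sizeof(*refs), backref_cmp);
}

/* Binary search in the sorted array; returns NULL when not found. */
struct backref_sketch *backrefs_find(struct backref_sketch *refs, size_t n,
				     uint64_t owner, uint64_t offset)
{
	struct backref_sketch key = { owner, offset };

	return bsearch(&key, refs, n, sizeof(*refs), backref_cmp);
}
```

The comparison function must give a consistent total order. The
regression described above came from a compare function in the patch
that depended on state which changed after insertion.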
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Sometimes it's needed to do a check on a mounted filesystem. This should
work fine on a quiescent filesystem or a read-only mount. Changes on the
block device done by kernel might confuse the userspace checker and it
might crash when it reads some stale data.
Repair without mount checks is not supported right now.
Signed-off-by: David Sterba <dsterba@suse.cz>
|
|
|
|
|
|
| |
Simplify main a bit.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
| |
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
| |
The root pointer is not used anyway, will be cleaned up next.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
| |
Simplify main a bit.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
| |
The root pointer is set to fs_root as was originally.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
| |
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
| |
The root pointer is set to fs_root as was originally.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
| |
The pointers to critical roots must be valid before we start using them,
e.g. in the space clearing code.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
| |
Move the code out of main.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
| |
Code added in 2009 (95d3f20b51e9b) for a very short-lived change in
the format is no concern to us nowadays.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
| |
As btrfs_update_block_group fails when the block group is not found in
cache, we can exit btrfs_free_block_group, not much to rollback. The
caller will also exit in turn.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
| |
Tree blocks are always nodesize. As readahead is only an optimization,
exact size is not required and is only advisory.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
| |
Nodesize is same for all levels, besides it's been only set and not
used, in root_item_record.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
| |
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
| |
Prep work so we can drop the blocksize argument from several functions.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
| |
I found that some btrfs command options are not working because of an
inappropriate getopt_long() setting.
This fixes "btrfs check -Q/-E"
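The commit message does not spell out the faulty setting. One common way
a short option stops working is that its letter is missing from the
optstring passed to getopt_long(). The sketch below shows a consistent
setup under that assumption; the long option names "report" and
"extents", the function name and the OPT_* bits are hypothetical.

```c
#include <getopt.h>
#include <stddef.h>

#define OPT_Q	1
#define OPT_E	2

/* Returns a bitmask of the options seen, or -1 on an unknown option. */
int parse_opts_sketch(int argc, char **argv)
{
	static const struct option long_options[] = {
		{ "report",  no_argument,       NULL, 'Q' },
		{ "extents", required_argument, NULL, 'E' },
		{ NULL, 0, NULL, 0 }
	};
	int seen = 0;
	int c;

	optind = 1;	/* start scanning at the first argument */
	/* "QE:" must list every short letter; ':' marks a required argument */
	while ((c = getopt_long(argc, argv, "QE:", long_options, NULL)) != -1) {
		switch (c) {
		case 'Q':
			seen |= OPT_Q;
			break;
		case 'E':
			seen |= OPT_E;	/* optarg holds the argument here */
			break;
		default:
			return -1;
		}
	}
	return seen;
}
```

If 'Q' or 'E' were left out of the optstring, getopt_long() would report
the short form as an unknown option even though the long form is listed
in the table.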
Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|