| Commit message (Collapse) | Author | Age |
|
|
|
|
| |
Signed-off-by: Nicholas D Steeves <nsteeves@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
| |
Signed-off-by: Domagoj Tršan <domagoj.trsan@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
| |
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
| |
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
| |
Cleanup the rb_tree.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
| |
Switch the messages that do not come from the actual image checking,
more like the parameter verification.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
| |
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
| |
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
| |
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
| |
All callers handle errors.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
| |
Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 31d8235410985e0b64487354c9ba67d40c4bdfe3.
False report of backref mismatches, lots of messages similar to:
Incorrect local backref count on 12713984 root 5 owner 257 offset 12845056 found 1 wanted 0 back 0x7b3ed0
backpointer mismatch on [12713984 131072]
Repairing will make things worse. A fix has been proposed, but is not
finalized so we go with a revert.
Reported-by: Chris Murphy <bugzilla@colorremedies.com>
References: https://bugzilla.kernel.org/show_bug.cgi?id=155791
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
| |
This reverts commit bbebe814c0e335745cfa773df966418e754b50e3.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add_tree_backref() can cause BUG_ON() and abort() in quite a lot of
cases, from the ENOMEM to existing tree backref records.
Change all these BUG_ON() and abort() to return proper values.
And modify all callers to handle such problems.
Reported-by: Lukas Lueg <lukas.lueg@gmail.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
| |
Check bytenr alignment for extent item to filter invalid items early.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Exposed by fuzzed image from Lukas, which contains invalid drop level
(16), causing segfault when accessing path->nodes[drop_level].
This patch will check drop level against fs tree level and
BTRFS_MAX_LEVEL to avoid such problem.
Reported-by: Lukas Lueg <lukas.lueg@gmail.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Current we only do chunk validation check at mount time.
It's good for most case, but for fuzzed or manually crafted images, we
can insert a CHUNK_ITEM key into root tree.
Since mount time check will only check chunk tree, it will not check
CHUNK_ITEM in root tree.
Even with previous key type check against leaf owner, it is still
possible to modify the leaf owner to by-pass it.
So we still need to check chunk validation before processing it.
Reported-by: Lukas Lueg <lukas.lueg@gmail.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Btrfs tree implies a lot of restriction on which key types are allowed
in specific roots.
Like CHUNK_ITEM keys are only valid in chunk root.
This patch will add such check at run_next_block() for original mode.
Reported-by: Lukas Lueg <lukas.lueg@gmail.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
| |
As we're passing a set of flags, the enum type is not appropriate.
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
| |
Change the single-purpose option --low-memory to a generic option that
takes the mode. Currently supported are the original mode and the
low-memory in the same way.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce a new fsck mode: low memory mode.
Old btrfsck is working efficiently but uses some memory for each extent
item. This method will ensure extents are only iterated once at
extent/chunk tree check process.
But since it uses some memory for each extent item, for a large fs with
several TB metadata, this can easily eat up memory and cause OOM.
To handle such limitation and improve scalability, the new low-memory
mode will not use any heap memory to record which extent is checked.
Instead it will use extent backref to avoid most of uneeded checks on
shared fs/subvolume tree blocks.
And with the use forward and backward reference cross check, we can also
ensure every tree block is at least checked once.
Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
| |
Introduce a new function traverse_tree_block() to do pre-order
traversal, to co-operate with new fs/subvolume tree skip function.
Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce function should_check() to reduced duplicated tree block check
for fs/subvolume tree.
The idea is, we only check the fs/subvolue tree block if we have the
lowest referencer rootid, according to extent tree.
In that case, we can skip a lot of fs/subvolume tree block checks if
there are a lot of snapshots.
Although we will do a lot of extent tree searches for it.
Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce an entry function, check_leaf_items() to check all
known/valuable items and update related accounting like total_bytes and
csum_bytes.
Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce function check_chunk_item() to check a chunk item.
It will check all chunk stripes with dev extents and the corresponding
block group item.
Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce function check_block_group_item() to check a block group item.
It will check the referencer chunk and the used space accounting with
extent tree.
Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
| |
Introduce function check_dev_item() to check used space with dev extent
items.
Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
| |
Introduce function check_dev_extent_item() to find its referencer chunk.
Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce function check_extent_item() using previously introduced
functions.
With previous function to check referencer and backref, this function
can be quite easy.
Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
| |
Introduce the function check_shared_data_backref() to check the
referencer of a given shared data backref.
Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
| |
Introduce new function check_extent_data_backref() to search referencer
for a given data backref.
Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
| |
Introduce function check_shared_block_backref() to check shared block
ref.
Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
| |
Introduce a new function check_tree_block_backref() to check if a
backref points to correct referencer.
Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce function query_tree_block_level() to resolve tree block level
by the following method:
1) tree block backref level
2) tree block header level
And only when header level == backref level, and transid matches, it will
return the tree block level.
Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
| |
Introduce a new function check_data_extent_item() to check if the
corresponding data backref exists in extent tree.
Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
tree
Introduce function check_tree_block_ref() to check whether a tree block
has correct backref in extent tree.
Unlike old extent tree check method, we only use search_slot() to search
reference, no extra structure will be allocated in heap to record what we
have checked.
This method may cause a little more IO, but should work for super large
fs without triggering OOM.
Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
walk_down_tree()
In walk_down_tree(), we may call btrfs_lookup_extent_info() for same tree
block many times, obviously unnecessary. Here we define a simple struct to
record whether we already have gotten tree block's refs:
struct node_refs {
u64 bytenr[BTRFS_MAX_LEVEL];
u64 refs[BTRFS_MAX_LEVEL];
};
I fill a disk partition with linux kernel source codes and use below
test script to have performance test.
#!/bin/bash
echo 3 > /proc/sys/vm/drop_caches
for ((i = 0; i < 20; i++)); do
time ./btrfsck /dev/sdc5
done 2>&1 | grep real | awk -F "[ms]" '{run_time += $2} END{print run_time / 20}'
Before this patch, it averagely took 0.8447s for every btrfsck execution,
and with this patch, it averagely took 0.7807s.
Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that we can verify all qgroups, we can write the corrected qgroups out
to disk when '--repair' is specified. The qgroup status item is also updated
to clear any out-of-date state. The repair_ functions were modeled after the
inode repair code in cmds-check.c.
I also renamed the 'scan' member of qgroup_status_item to 'rescan' in order
to keep consistency with the kernel.
Testing this was easy, I just reproduced qgroup inconsistencies via the
usual routes and had btrfsck fix them.
Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
| |
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
| |
We now have two data structures that can be used to iterate the same data
set, and there may be quite a few of them in memory. Eliminating the
list_head member will reduce memory consumption while iterating over
the extent backrefs.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For the pathlogical case, like xfstests generic/297 that creates a
large file consisting of one, repeating reflinked extent, fsck can
take hours. The root cause is that calling find_data_backref while
iterating the extent records is an O(n^2) algorithm. For my
example test run, n was 2*2^20 and fsck was at 8 hours and counting.
This patch supplements the list with an rbtree and drops the runtime
of that testcase to about 20 seconds.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
| |
We either open code list_entry calls or outright cast between types. The
compiler will do the right thing if we use static inlines to do
typesafe conversions.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
| |
Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
| |
Signed-off-by: Nicholas D Steeves <nsteeves@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the new add_extent_rec_nolookup() function, we add bytes_used to
update found bytes accounting.
However there is a typo that we used tmpl->nr, which should be rec->nr.
This will make us to add 1 for data backref, instead the correct size.
Reported-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
| |
The extent record's max_size might be 0 and the stripe crossing check
will report a false positive, should use the filesyste nodesize. There's
a global fs_info available.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
| |
Similar to add_extent_rec_nolookup, pass the arguments via a temporary
structure. In case the extent is found, some of the values are not
assigned directly so the semantics is preserved.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
| |
There are just 3 values of flag_block_full_backref, we can utilize a
bitfield and save 8 bytes (192 now).
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
| |
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
| |
Reduce number of parameters that just fill the extent_record from a
temporary template that's supposed to be zeroed and filled by the
callers.
Signed-off-by: David Sterba <dsterba@suse.com>
|