Commits · ff023aac31198e88507d626825379b28ea481d4d · openeuler / Kernel

13 Dec, 2012 · 10 commits

Btrfs: add code to scrub to copy read data to another disk · ff023aac

Committed by Stefan Behrens on Nov 06, 2012

The device replace procedure makes use of the scrub code. The scrub
code is the most efficient code to read the allocated data of a disk,
i.e. it reads sequentially in order to avoid disk head movements, it
skips unallocated blocks, it uses read ahead mechanisms, and it
contains all the code to detect and repair defects.

This commit adds code to scrub to allow the scrub code to copy read
data to another disk.

One goal is to be able to perform as fast as possible. Therefore the
write requests are collected until huge bios are built, and the
write process is decoupled from the read process with some kind of
flow control, of course, in order to limit the allocated memory.

The best performance on spinning disks could be reached when the
head movements are avoided as much as possible. Therefore a single
worker is used to interface the read process with the write process.

The regular scrub operation works as fast as before, it is not
negatively influenced and actually it is more or less unchanged.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: disallow some operations on the device replace target device · 63a212ab

Committed by Stefan Behrens on Nov 05, 2012

This patch adds some code to disallow operations on the device that
is used as the target for the device replace operation.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: pass fs_info instead of root · aa1b8cd4

Committed by Stefan Behrens on Nov 05, 2012

A small number of functions that are used in a device replace
procedure when the operation is resumed at mount time are unable
to pass the same root pointer that would be used in the regular
(ioctl) context. And since the root pointer is not required, only
the fs_info is, the root pointer argument is replaced with the
fs_info pointer argument.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: pass fs_info to btrfs_map_block() instead of mapping_tree · 3ec706c8

Committed by Stefan Behrens on Nov 05, 2012

This is required for the device replace procedure in a later step.
Two calling functions also had to be changed to have the fs_info
pointer: repair_io_failure() and scrub_setup_recheck_block().
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: cleanup scrub bio and worker wait code · b6bfebc1

Committed by Stefan Behrens on Nov 02, 2012

Just move some code into functions to make everything more readable.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: in scrub repair code, simplify alloc error handling · 34f5c8e9

Committed by Stefan Behrens on Nov 02, 2012

In the scrub repair code, the code is changed to handle memory
allocation errors a little bit smarter. The change is to handle
it just like a read error. This simplifies the code and removes
a couple of lines of code, since the code to handle read errors
is there anyway.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: in scrub repair code, optimize the reading of mirrors · cb2ced73

Committed by Stefan Behrens on Nov 02, 2012

In case that disk blocks need to be repaired (rewritten), the
current code at first (for simplicity reasons) reads all alternate
mirrors in the first step, afterwards selects the best one in a
second step. This is now changed to read one alternate mirror
after the other and to leave the loop early when a perfect mirror
is found.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
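
The early-exit loop described in this commit message can be sketched in user-space C. This is only an illustration under stated assumptions, not the scrub code itself: `find_good_mirror`, `read_mirror_fn`, `read_from_table`, and the `mirror_state` values are hypothetical names introduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result of reading one mirror. */
enum mirror_state { MIRROR_IO_ERROR, MIRROR_CSUM_ERROR, MIRROR_PERFECT };

/* Hypothetical reader callback: reads mirror `index` and reports its state. */
typedef enum mirror_state (*read_mirror_fn)(int index, void *ctx);

/*
 * Read alternate mirrors one at a time and stop at the first perfect one.
 * Returns the index of the perfect mirror, or -1 if none was found.
 * *reads_done receives the number of mirrors actually read.
 */
static int find_good_mirror(int num_mirrors, int failed_mirror,
                            read_mirror_fn read_mirror, void *ctx,
                            int *reads_done)
{
    assert(read_mirror != NULL && reads_done != NULL);
    *reads_done = 0;
    for (int i = 0; i < num_mirrors; i++) {
        if (i == failed_mirror)
            continue;                 /* skip the mirror known to be bad */
        (*reads_done)++;
        if (read_mirror(i, ctx) == MIRROR_PERFECT)
            return i;                 /* leave the loop early */
    }
    return -1;
}

/* Demo reader: ctx points to an array holding one state per mirror. */
static enum mirror_state read_from_table(int index, void *ctx)
{
    const enum mirror_state *states = ctx;
    return states[index];
}
```

With four mirrors where mirror 0 is the failed one and mirror 2 is the first perfect copy, the loop performs two reads instead of three.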

Btrfs: make the scrub page array dynamically allocated · 7a9e9987

Committed by Stefan Behrens on Nov 02, 2012

With the modified design (in order to support the device replace
procedure) it is necessary to alloc the page array dynamically.
The reason is that pages are reused. At first a page is used for
the bio to read the data from the filesystem, then the same page
is reused for the bio that writes the data to the target disk.
Since the read process and the write process are completely
decoupled, this requires a new concept of refcounts and get/put
functions for pages, and it requires to use newly created pages
for each read bio which are freed after the write operation
is finished.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
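
The get/put idea mentioned in this commit message can be illustrated with a small user-space sketch. Everything here is hypothetical (`demo_page`, `demo_page_get`, `demo_page_put`), and the counter is a plain `int` for brevity, whereas kernel code would need an atomic counter.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a page shared by a read bio and a write bio. */
struct demo_page {
    int refs;            /* number of holders; not atomic in this sketch */
    int *freed_flag;     /* set to 1 when the page is released (demo only) */
    unsigned char data[4096];
};

static struct demo_page *demo_page_alloc(int *freed_flag)
{
    struct demo_page *p = calloc(1, sizeof(*p));
    if (!p)
        return NULL;
    p->refs = 1;         /* the allocating read side holds one reference */
    p->freed_flag = freed_flag;
    return p;
}

/* Take an additional reference, e.g. before handing the page to the writer. */
static void demo_page_get(struct demo_page *p)
{
    assert(p->refs > 0);
    p->refs++;
}

/* Drop a reference; the last holder frees the page. */
static void demo_page_put(struct demo_page *p)
{
    assert(p->refs > 0);
    if (--p->refs == 0) {
        if (p->freed_flag)
            *p->freed_flag = 1;
        free(p);
    }
}
```

Because the reader and the writer each drop their own reference, neither side needs to know which of them finishes last.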

Btrfs: remove the block device pointer from the scrub context struct · a36cf8b8

Committed by Stefan Behrens on Nov 02, 2012

The block device is removed from the scrub context state structure.
The scrub code as it is used for the device replace procedure reads
the source data from wherever it is optimal. The source device might
even be gone (disconnected, for instance due to a hardware failure).
Or the drive can be so faulty that the device replace procedure
tries to avoid access to the faulty source drive as much as possible,
and only if all other mirrors are damaged, as a last resort, the
source disk is accessed.

The modified scrub code operates as if it would handle the source
drive and thereby generates an exact copy of the source disk on the
target disk, even if the source disk is not present at all. Therefore
the block device pointer to the source disk is removed in the scrub
context struct and moved into the lower level scope of scrub_bio,
fixup and page structures where the block device context is known.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: rename the scrub context structure · d9d181c1

Committed by Stefan Behrens on Nov 02, 2012

The device replace procedure makes use of the scrub code. The scrub
code is the most efficient code to read the allocated data of a disk,
i.e. it reads sequentially in order to avoid disk head movements, it
skips unallocated blocks, it uses read ahead mechanisms, and it
contains all the code to detect and repair defects.

This commit is a first preparation step to adapt the scrub code to
be shareable for the device replace procedure.

The block device will be removed from the scrub context state
structure in a later step. It used to be the source block device.
The scrub code as it is used for the device replace procedure reads
the source data from wherever it is optimal. The source device might
even be gone (disconnected, for instance due to a hardware failure).
Or the drive can be so faulty that the device replace procedure
tries to avoid access to the faulty source drive as much as possible,
and only if all other mirrors are damaged, as a last resort, the
source disk is accessed.

The modified scrub code operates as if it would handle the source
drive and thereby generates an exact copy of the source disk on the
target disk, even if the source disk is not present at all. Therefore
the block device pointer to the source disk is removed in a later
patch, and therefore the context structure is renamed (this is the
goal of the current patch) to reflect that no source block device
scope is there anymore.

Summary:
This first preparation step consists of a textual substitution of the
term "dev" to the term "ctx" wherever the scrub context is used.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

02 Oct, 2012 · 3 commits

btrfs: Kill some bi_idx references · be3940c0

Committed by Kent Overstreet on Sep 11, 2012

For immutable bio vecs, I've been auditing and removing bi_idx
references. These were harmless, but removing them will make auditing
easier.

scrub_bio_end_io_worker() was open coding a bio_reset() - but this
doesn't appear to have been needed for anything as right after it does a
bio_put(), and perusing the code it doesn't appear anything else was
holding a reference to the bio.

The other use end_bio_extent_readpage() was just for a pr_debug() -
changed it to something that might be a bit more useful.
Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Chris Mason <chris.mason@oracle.com>
CC: Stefan Behrens <sbehrens@giantdisaster.de>

Btrfs: fix a bug in parsing return value in logical resolve · 69917e43

Committed by Liu Bo on Sep 07, 2012

In logical resolve, we parse extent_from_logical()'s 'ret' as a kind of flag.

It is possible to lose our errors because
(-EXXXX & BTRFS_EXTENT_FLAG_TREE_BLOCK) is true.

I'm not sure if it is on purpose, it just looks too hacky if it is.
I'd rather use a real flag and a 'ret' to catch errors.
Acked-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Signed-off-by: Liu Bo <liub.liubo@gmail.com>
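
The hazard described in this commit message can be reproduced in a few lines of user-space C. This is a sketch under assumptions: `DEMO_FLAG_TREE_BLOCK` uses an assumed bit position (bit 1) purely for illustration, and `old_style_is_tree_block` and `lookup_extent` are hypothetical helpers. With Linux errno values, `-EIO` is -5, whose two's-complement representation has bit 1 set; many, though not all, negative error codes behave this way.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Assumed flag value for illustration: bit 1 marks a tree block. */
#define DEMO_FLAG_TREE_BLOCK (1ULL << 1)

/* Old style: the return value doubles as error code and flag word. */
static bool old_style_is_tree_block(int ret)
{
    /* A negative errno converts to a value with most high bits set. */
    return (ret & DEMO_FLAG_TREE_BLOCK) != 0;
}

/* New style: errors travel in the return value, flags in *flags. */
static int lookup_extent(int simulated_error, unsigned long long *flags)
{
    if (simulated_error)
        return -simulated_error;      /* *flags is left untouched */
    *flags = DEMO_FLAG_TREE_BLOCK;
    return 0;
}
```

Keeping the flag word separate means a caller can test `ret < 0` first and only then look at the flags, so an error can no longer masquerade as "this is a tree block".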

Btrfs: fix possible memory leak in scrub_setup_recheck_block() · cf93dcce

Committed by Wei Yongjun on Sep 02, 2012

bbio has been malloced in btrfs_map_block() and should be
freed before leaving from the error handling cases.

spatch with a semantic match was used to find this problem.
(http://coccinelle.lip6.fr/)
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
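
A common way to avoid this class of leak is a single cleanup label that every exit path passes through. The sketch below shows the pattern in user-space C; `demo_map_block`, `demo_free_bbio`, `demo_setup_recheck`, and the `live_allocations` counter are hypothetical stand-ins, not the btrfs functions named in the commit message.

```c
#include <assert.h>
#include <stdlib.h>

static int live_allocations;    /* demo-only leak counter */

struct demo_bbio { unsigned long long mapped_length; };

/* Hypothetical mapper: allocates a descriptor that the caller must free. */
static int demo_map_block(unsigned long long want, unsigned long long avail,
                          struct demo_bbio **out)
{
    struct demo_bbio *b = malloc(sizeof(*b));
    if (!b)
        return -1;
    live_allocations++;
    b->mapped_length = want < avail ? want : avail;
    *out = b;
    return 0;
}

static void demo_free_bbio(struct demo_bbio *b)
{
    if (b) {
        live_allocations--;
        free(b);
    }
}

/* Returns 0 on success, -1 on error; frees the descriptor on every path. */
static int demo_setup_recheck(unsigned long long want, unsigned long long avail)
{
    struct demo_bbio *bbio = NULL;
    int ret = demo_map_block(want, avail, &bbio);

    if (ret)
        goto out;
    if (bbio->mapped_length < want) {   /* short mapping: error path */
        ret = -1;
        goto out;
    }
    /* ... the real work would use bbio here ... */
out:
    demo_free_bbio(bbio);               /* safe when bbio is NULL */
    return ret;
}
```

After one successful call and one call that takes the short-mapping error path, the demo counter returns to zero, which is the property the fix restores.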

15 Jun, 2012 · 1 commit

Btrfs: use rcu to protect device->name · 606686ee

Committed by Josef Bacik on Jun 04, 2012

Al pointed out that we can just toss out the old name on a device and add a
new one arbitrarily, so anybody who uses device->name in printk could
possibly use free'd memory. Instead of adding locking around all of this he
suggested doing it with RCU, so I've introduced a struct rcu_string that
does just that and have gone through and protected all accesses to
device->name that aren't under the uuid_mutex with rcu_read_lock(). This
protects us and I will use it for dealing with removing the device that we
used to mount the file system in a later patch. Thanks,
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <josef@redhat.com>

30 May, 2012 · 1 commit

Btrfs: add device counters for detected IO and checksum errors · 442a4f63

Committed by Stefan Behrens on May 25, 2012

The goal is to detect when drives start to get an increased error rate,
when drives should be replaced soon. Therefore statistic counters are
added that count IO errors (read, write and flush). Additionally, the
software detected errors like checksum errors and corrupted blocks are
counted.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
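
Per-device error counters of the kind described here can be modelled as a small array indexed by error type. The following is a minimal sketch with hypothetical names (`demo_dev_stat`, `demo_device`, `demo_stat_inc`, `demo_stat_total`); it does not reproduce the kernel's data structures or locking.

```c
#include <assert.h>

/* Hypothetical counter indices; the names are illustrative only. */
enum demo_dev_stat {
    DEMO_STAT_WRITE_ERRS,
    DEMO_STAT_READ_ERRS,
    DEMO_STAT_FLUSH_ERRS,
    DEMO_STAT_CORRUPTION_ERRS,   /* e.g. a checksum mismatch */
    DEMO_STAT_MAX
};

struct demo_device {
    unsigned long stats[DEMO_STAT_MAX];
};

/* Bump one counter when an error of that type is detected. */
static void demo_stat_inc(struct demo_device *dev, enum demo_dev_stat which)
{
    assert(which < DEMO_STAT_MAX);
    dev->stats[which]++;
}

/* Sum of all counters: a crude signal that a drive may be degrading. */
static unsigned long demo_stat_total(const struct demo_device *dev)
{
    unsigned long total = 0;
    for (int i = 0; i < DEMO_STAT_MAX; i++)
        total += dev->stats[i];
    return total;
}
```

Keeping separate counters per error type lets an administrator distinguish hardware-reported I/O failures from corruption that only the filesystem's own checks can detect.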

05 May, 2012 · 1 commit

Btrfs: fix crash in scrub repair code when device is missing · ea9947b4

Committed by Stefan Behrens on May 04, 2012

Fix that when scrub tries to repair an I/O or checksum error and one of
the devices containing the mirror is missing, it crashes in bio_add_page
because the bdev is a NULL pointer for missing devices.
Reported-by: Marco L. Crociani <marco.crociani@gmail.com>
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

19 Apr, 2012 · 1 commit

Btrfs: don't count CRC or header errors twice while scrubbing · 5c84fc3c

Committed by Stefan Behrens on Mar 30, 2012

Each CRC or header error was counted twice; this is now fixed.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>

13 Apr, 2012 · 1 commit

Btrfs: check return value of bio_alloc() properly · e627ee7b

Committed by Tsutomu Itoh on Apr 12, 2012

bio_alloc() has the possibility of returning NULL.
So, it is necessary to check the return value.
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

28 Mar, 2012 · 2 commits

Btrfs: change scrub to support big blocks · b5d67f64

Committed by Stefan Behrens on Mar 27, 2012

Scrub used to be coded for nodesize == leafsize == sectorsize == PAGE_SIZE.
This is now changed to support sizes for nodesize and leafsize which are
N * PAGE_SIZE.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: minor cleanup in scrub · 1623edeb

Committed by Stefan Behrens on Mar 27, 2012

Just a minor cleanup commit in preparation for the big block changes.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

27 Mar, 2012 · 1 commit

Btrfs: fix regression in scrub path resolving · 7a3ae2f8

Committed by Jan Schmidt on Mar 23, 2012

In commit 4692cf58 we introduced new backref walking code for btrfs. This
assumes we're searching live roots, which requires a transaction context.
While scrubbing, however, we must not join a transaction because this could
deadlock with the commit path. Additionally, what scrub really wants to do
is resolve a logical address in the commit root it's currently checking.

This patch adds support for logical to path resolving on commit roots and
makes scrub use that.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

22 Mar, 2012 · 3 commits

btrfs: replace many BUG_ONs with proper error handling · 79787eaa

Committed by Jeff Mahoney on Mar 12, 2012

btrfs currently handles most errors with BUG_ON. This patch is a
work-in-progress but aims to handle most errors other than internal
logic errors and ENOMEM more gracefully.

This iteration prevents most crashes but can run into lockups with
the page lock on occasion when the timing "works out."
Signed-off-by: Jeff Mahoney <jeffm@suse.com>

btrfs: enhance transaction abort infrastructure · 49b25e05

Committed by Jeff Mahoney on Mar 01, 2012

Signed-off-by: Jeff Mahoney <jeffm@suse.com>

btrfs: return void in functions without error conditions · 143bede5

Committed by Jeff Mahoney on Mar 01, 2012

Signed-off-by: Jeff Mahoney <jeffm@suse.com>

20 Mar, 2012 · 1 commit

btrfs: remove the second argument of k[un]map_atomic() · 7ac687d9

Committed by Cong Wang on Nov 25, 2011

Signed-off-by: Cong Wang <amwang@redhat.com>

15 Feb, 2012 · 1 commit

btrfs: don't check DUP chunks twice · 859acaf1

Committed by Arne Jansen on Feb 09, 2012

Because scrub enumerates the dev extent tree to find the chunks to scrub,
it currently finds each DUP chunk twice and also scrubs it twice. This
patch makes sure that scrub_chunk only checks that part of the chunk the
dev extent has been found for. This only changes the behaviour for DUP
chunks.
Reported-and-tested-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Arne Jansen <sensille@gmx.net>

05 Jan, 2012 · 1 commit

Btrfs: new backref walking code · 4692cf58

Committed by Jan Schmidt on Dec 02, 2011

The old backref iteration code could only safely be used on commit roots.
Besides this limitation, it had bugs in finding the roots for these
references. This commit replaces large parts of it by btrfs_find_all_roots()
which a) really finds all roots and the correct roots, b) works correctly
under heavy file system load, c) considers delayed refs.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

22 Dec, 2011 · 1 commit

Btrfs: integrate integrity check module into btrfs · 21adbd5c

Committed by Stefan Behrens on Nov 09, 2011

This is the last part of the patch series. It modifies the btrfs
code to use the integrity check module if configured to do so
with the define BTRFS_FS_CHECK_INTEGRITY. If this define is not set,
the only effective change is that code is added that handles the
mount option to activate the integrity check. If the mount option is
set and the define BTRFS_FS_CHECK_INTEGRITY is not set, that code
complains in the log and the mount fails with EINVAL.

Add the mount option to activate the usage of the integrity check
code.
Add invocation of btrfs integrity check code init and cleanup
function on mount and umount, respectively.
Add hook to call btrfs integrity check code version of
submit_bh/submit_bio.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
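
The behaviour described here, where a mount option is always parsed but only honoured when a compile-time define is set, can be sketched as follows. The macro `DEMO_CHECK_INTEGRITY`, the option string `"check_int"`, and the function `demo_parse_option` are assumptions made for this illustration; they are not taken from the commit.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Build with -DDEMO_CHECK_INTEGRITY to compile the (imaginary) checker in. */

/* Hypothetical option parser; "check_int" is an assumed option name. */
static int demo_parse_option(const char *opt, int *check_enabled)
{
    if (strcmp(opt, "check_int") != 0)
        return 0;                     /* not our option, nothing to do */
#ifdef DEMO_CHECK_INTEGRITY
    *check_enabled = 1;               /* feature compiled in: activate it */
    return 0;
#else
    /* Feature not compiled in: complain in the log and fail the mount. */
    fprintf(stderr, "support for check_int is not compiled in\n");
    (void)check_enabled;
    return -EINVAL;
#endif
}
```

Parsing the option even in builds without the feature gives the user an explicit error instead of silently ignoring the request.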

16 Dec, 2011 · 1 commit

Btrfs: fix num_workers_starting bug and other bugs in async thread · 0dc3b84a

Committed by Josef Bacik on Nov 18, 2011

Al pointed out we have some random problems with the way we account for
num_workers_starting in the async thread stuff.  First of all we need to make
sure to decrement num_workers_starting if we fail to start the worker, so make
__btrfs_start_workers do this.  Also fix __btrfs_start_workers so that it
doesn't call btrfs_stop_workers(), there is no point in stopping everybody if we
failed to create a worker.  Also check_pending_worker_creates needs to call
__btrfs_start_work in its work function since it already increments
num_workers_starting.

People only start one worker at a time, so get rid of the num_workers argument
everywhere, and make btrfs_queue_worker a void since it will always succeed.
Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

01 Dec, 2011 · 1 commit

btrfs scrub: handle -ENOMEM from init_ipath() · 26bdef54

Committed by Dan Carpenter on Nov 16, 2011

init_ipath() can return an ERR_PTR(-ENOMEM).
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>

20 Nov, 2011 · 1 commit

btrfs: Fix up 32/64-bit compatibility for new ioctls · 745c4d8e

Committed by Jeff Mahoney on Nov 20, 2011

This patch casts to unsigned long before casting to a pointer and fixes
the following warnings:
fs/btrfs/extent_io.c:2289:20: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
fs/btrfs/ioctl.c:2933:37: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
fs/btrfs/ioctl.c:2937:21: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
fs/btrfs/ioctl.c:3020:21: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
fs/btrfs/scrub.c:275:4: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
fs/btrfs/backref.c:686:27: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
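
The warnings listed above come from converting directly between a pointer and a 64-bit integer on a 32-bit build. A user-space sketch of the two-step cast follows; it uses `uintptr_t` as the portable equivalent of the `unsigned long` cast named in the commit message (on Linux targets `unsigned long` has pointer width). `demo_args`, `demo_set_buf`, and `demo_get_buf` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit field as it might appear in an ioctl argument struct. */
struct demo_args { uint64_t buf; };

/* Store a pointer: pointer -> pointer-width integer -> 64-bit field. */
static void demo_set_buf(struct demo_args *a, void *p)
{
    a->buf = (uint64_t)(uintptr_t)p;
}

/* Recover it: 64-bit field -> pointer-width integer -> pointer. */
static void *demo_get_buf(const struct demo_args *a)
{
    return (void *)(uintptr_t)a->buf;
}
```

Going through a pointer-width integer makes the widening or narrowing step an explicit integer conversion, which is what silences the size-mismatch warnings.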

11 Nov, 2011 · 1 commit

Btrfs: handle bio_add_page failure gracefully in scrub · 69f4cb52

Committed by Arne Jansen on Nov 11, 2011

Currently scrub fails with ENOMEM when bio_add_page fails. Unfortunately
dm based targets accept only one page per bio, thus making scrub always
fail. This patch just submits the current bio when an error is encountered
and starts a new one.
Signed-off-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
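
The "submit what we have and start a new one" fallback can be sketched with a toy batch type. This is an illustration only: `demo_bio`, `demo_bio_add_page`, `demo_submit`, and `demo_queue_page` are hypothetical and merely mimic the shape of the approach described in the commit message.

```c
#include <assert.h>

/* Hypothetical batch of pages, standing in for a bio. */
struct demo_bio {
    int max_pages;    /* device limit, e.g. 1 for some stacked targets */
    int nr_pages;     /* pages currently in the batch */
};

static int submitted_bios;    /* demo-only counters */
static int submitted_pages;

/* Returns 1 if the page was added, 0 if the batch is full. */
static int demo_bio_add_page(struct demo_bio *bio)
{
    if (bio->nr_pages >= bio->max_pages)
        return 0;
    bio->nr_pages++;
    return 1;
}

/* Submit the current batch (if any) and start a fresh one. */
static void demo_submit(struct demo_bio *bio)
{
    if (bio->nr_pages == 0)
        return;
    submitted_bios++;
    submitted_pages += bio->nr_pages;
    bio->nr_pages = 0;
}

/* Queue one page; when the batch is full, submit it and retry once. */
static int demo_queue_page(struct demo_bio *bio)
{
    if (demo_bio_add_page(bio))
        return 0;
    demo_submit(bio);
    return demo_bio_add_page(bio) ? 0 : -1;   /* -1 only if max_pages < 1 */
}
```

With a limit of one page per batch, queuing three pages and then flushing results in three submissions rather than an out-of-memory style failure.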

06 Nov, 2011 · 3 commits

Btrfs: fix a potential btrfs_bio leak on scrub fixups · 56d2a48f

Committed by Ilya Dryomov on Nov 04, 2011

In case we were able to map less than we wanted (length < PAGE_SIZE
clause is true) btrfs_bio is still allocated and we have to free it.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: fix the new inspection ioctls for 32 bit compat · 740c3d22

Committed by Chris Mason on Nov 02, 2011

The new ioctls to follow backrefs are not clean for 32/64 bit
compat.  This reworks them for u64s everywhere.  They are brand new, so
there are no problems with changing the interface now.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

btrfs: separate superblock items out of fs_info · 6c41761f

Committed by David Sterba on Apr 13, 2011

fs_info has now ~9kb, more than fits into one page. This will cause
mount failure when memory is too fragmented. Top space consumers are
super block structures super_copy and super_for_commit, ~2.8kb each.
Allocate them dynamically. fs_info will be ~3.5kb. (measured on x86_64)

Add a wrapper for freeing fs_info and all of its dynamically allocated
members.
Signed-off-by: David Sterba <dsterba@suse.cz>
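
The size reduction described here comes from replacing large embedded members with pointers. The sketch below uses illustrative sizes chosen to resemble the figures in the commit message; the struct and function names (`demo_super`, `demo_info_embedded`, `demo_info`, `demo_info_alloc`, `demo_info_free`) are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Roughly 2.8 KB, similar to the figure quoted in the commit message. */
struct demo_super { unsigned char raw[2800]; };

/* Before: both copies embedded, so the container exceeds one 4 KB page. */
struct demo_info_embedded {
    struct demo_super super_copy;
    struct demo_super super_for_commit;
    unsigned char other[3500];
};

/* After: only pointers are embedded; the copies are allocated separately. */
struct demo_info {
    struct demo_super *super_copy;
    struct demo_super *super_for_commit;
    unsigned char other[3500];
};

static struct demo_info *demo_info_alloc(void)
{
    struct demo_info *info = calloc(1, sizeof(*info));
    if (!info)
        return NULL;
    info->super_copy = calloc(1, sizeof(*info->super_copy));
    info->super_for_commit = calloc(1, sizeof(*info->super_for_commit));
    if (!info->super_copy || !info->super_for_commit) {
        free(info->super_copy);          /* free(NULL) is a no-op */
        free(info->super_for_commit);
        free(info);
        return NULL;
    }
    return info;
}

/* Wrapper that frees the container and all dynamically allocated members. */
static void demo_info_free(struct demo_info *info)
{
    if (!info)
        return;
    free(info->super_copy);
    free(info->super_for_commit);
    free(info);
}
```

Each allocation now fits within a single page, so no multi-page contiguous allocation is needed, and the free wrapper keeps the release logic in one place.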

02 Oct, 2011 · 1 commit

btrfs: use readahead API for scrub · 7a26285e

Committed by Arne Jansen on Jun 10, 2011

Scrub uses a simple tree-enumeration to bring the relevant portions
of the extent- and csum-tree into the page cache before starting the
scrub-I/O. This is now replaced by using the new readahead-API.
During readahead the scrub is being accounted as paused, so it won't
hold off transaction commits.

This change raises the average disk bandwidth utilisation on my test
volume from 70% to 90%. On another volume, the time for a test run
went down from 89s to 43s.

Changes v5:
 - reada1/2 are now of type struct reada_control *
Signed-off-by: Arne Jansen <sensille@gmx.net>

29 Sep, 2011 · 4 commits

btrfs: integrating raid-repair and scrub-fixup-nodatasum · 5da6fcbc

Committed by Jan Schmidt on Aug 04, 2011

This ties nodatasum fixup in scrub together with raid repair patches. While
both series are working fine alone, scrub will report uncorrectable errors
if they occur in a nodatasum extent *and* the page is in the page cache.

Previously, we would have triggered readpage to find good data and do the
repair. However, readpage wouldn't read anything in the case where the page
is up to date in the cache. So, we simply take that good data we have and
call repair_io_failure directly (unless the page in the cache is dirty).
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

btrfs: btrfs_multi_bio replaced with btrfs_bio · a1d3c478

Committed by Jan Schmidt on Aug 04, 2011

btrfs_bio is a bio abstraction able to split and not complete after the last
bio has returned (like the old btrfs_multi_bio). Additionally, btrfs_bio
tracks the mirror_num used to read data which can be used for error
correction purposes.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

btrfs scrub: add fixup code for errors on nodatasum files · 0ef8e451

Committed by Jan Schmidt on Jun 13, 2011

This removes a FIXME comment and introduces the first part of nodatasum
fixup: It gets the corresponding inode for a logical address and triggers a
regular readpage for the corrupted sector.

Once we have on-the-fly error correction our error will be automatically
corrected. The correction code is expected to clear the newly introduced
EXTENT_DAMAGED flag, making scrub report that error as "corrected" instead
of "uncorrectable" eventually.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

btrfs scrub: use int for mirror_num, not u64 · e12fa9cd

Committed by Jan Schmidt on Jun 17, 2011

The rest of the code uses int mirror_num, and so should scrub.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

openeuler / Kernel · Last synced successfully 11 months ago