Commits · 00e4e6b33a0f78aab4b788d6d31c884fd8bf88da · openeuler / raspberrypi-kernel

25 Sep, 2008 · 21 commits

Committed by Yan Zheng on Aug 04, 2008

This trivial patch contains two locking fixes and an off-by-one fix.

---
Signed-off-by: Chris Mason <chris.mason@oracle.com>

b48652c1

Btrfs: Fix the defragmention code and the block relocation code for data=ordered · 3eaa2885

Committed by Chris Mason on Jul 24, 2008

Before setting an extent to delalloc, the code needs to wait for
pending ordered extents.

Also, the relocation code needs to wait for ordered IO before scanning
the block group again.  This is because the extents are not removed
until the IO for the new extents is finished.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

3eaa2885

Btrfs: Search data ordered extents first for checksums on read · 89642229

Committed by Chris Mason on Jul 24, 2008

Checksum items are not inserted into the tree until all of the io from a
given extent is complete. This means one dirty page from an extent may
be written, freed, and then read again before the entire extent is on disk
and the checksum item is inserted.

The checksums themselves are stored in the ordered extent so they can
be inserted in bulk when IO is complete. On read, if a checksum item isn't
found, the ordered extents were being searched for a checksum record.

This all worked most of the time, but the checksum insertion code tries
to reduce the number of tree operations by pre-inserting checksum items
based on i_size and a few other factors. This means the read code might
find a checksum item that hasn't yet really been filled in.

This commit changes things to check the ordered extents first and only
dive into the btree if nothing was found. This removes the need for
extra locking and is more reliable.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

89642229
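The lookup order described in this commit can be illustrated with a minimal sketch. It is not the kernel code: the flat arrays stand in for the per-inode ordered extents and the checksum btree, and every name here (`csum_entry`, `find_csum`, `lookup_csum`) is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: the checksum for the block at a file offset. */
struct csum_entry {
    uint64_t offset;
    uint32_t csum;
};

/* Linear search of one table; returns true and fills *out on a hit. */
bool find_csum(const struct csum_entry *tbl, size_t n,
               uint64_t offset, uint32_t *out)
{
    for (size_t i = 0; i < n; i++) {
        if (tbl[i].offset == offset) {
            *out = tbl[i].csum;
            return true;
        }
    }
    return false;
}

/*
 * Check the pending ordered extents first and fall back to the tree only
 * when nothing is found there. A pre-inserted tree item that has not been
 * filled in yet is then never preferred over the pending checksum.
 */
bool lookup_csum(const struct csum_entry *ordered, size_t n_ordered,
                 const struct csum_entry *tree, size_t n_tree,
                 uint64_t offset, uint32_t *out)
{
    if (find_csum(ordered, n_ordered, offset, out))
        return true;
    return find_csum(tree, n_tree, offset, out);
}
```

In this sketch, a placeholder tree entry for an offset is shadowed by the pending entry for the same offset, which is the ordering the commit message describes.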

Btrfs: Take the csum mutex while reading checksums · ed98b56a

Committed by Chris Mason on Jul 22, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

ed98b56a

Btrfs: Fix some data=ordered related data corruptions · f421950f

Committed by Chris Mason on Jul 22, 2008

Stress testing was showing data checksum errors, most of which were caused
by a lookup bug in the extent_map tree.  The tree was caching the last
pointer returned, and searches would check the last pointer first.

But, search callers also expect the search to return the very first
matching extent in the range, which wasn't always true with the last
pointer usage.

For now, the code to cache the last return value is just removed.  It is
easy to fix, but I think lookups are rare enough that it isn't required anymore.

This commit also replaces do_sync_mapping_range with a local copy of the
related functions.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

f421950f
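The lookup bug described in this commit can be reproduced in miniature. The sketch below is an illustration under stated assumptions, not the extent_map code: extents live in a sorted array instead of a tree, and `extent_sketch`, `lookup_first`, and `lookup_cached` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified extent record; not the kernel's struct extent_map. */
struct extent_sketch {
    uint64_t start;
    uint64_t len;
};

/* Nonzero when the extent overlaps the byte range [start, start + len). */
int overlaps(const struct extent_sketch *e, uint64_t start, uint64_t len)
{
    return e->start < start + len && start < e->start + e->len;
}

/*
 * Expected behaviour: scan extents sorted by start and return the index
 * of the first one that overlaps the range, or -1 if none does.
 */
int lookup_first(const struct extent_sketch *ext, size_t n,
                 uint64_t start, uint64_t len)
{
    for (size_t i = 0; i < n; i++) {
        if (overlaps(&ext[i], start, len))
            return (int)i;
    }
    return -1;
}

/*
 * Buggy shortcut: trust a cached index whenever it overlaps the range,
 * even though an earlier extent may overlap the range as well.
 */
int lookup_cached(const struct extent_sketch *ext, size_t n, int cached,
                  uint64_t start, uint64_t len)
{
    if (cached >= 0 && (size_t)cached < n && overlaps(&ext[cached], start, len))
        return cached;
    return lookup_first(ext, n, start, len);
}
```

With two adjacent 4096-byte extents and the second one cached, a search over both returns the second extent from `lookup_cached` but the first from `lookup_first`, which is the mismatch that callers were not prepared for.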

Btrfs: Data ordered fixes · 4a096752

Committed by Chris Mason on Jul 21, 2008

* In btrfs_delete_inode, wait for ordered extents after calling
truncate_inode_pages.  This is much faster and more correct.

* Properly clear out the PageChecked bit everywhere we redirty the page.

* Change the writepage fixup handler to lock the page range and check to
see if an ordered extent had been inserted since the improperly dirtied
page was discovered.

* Wait for ordered extents outside the transaction.  This isn't required
for locking rules but does improve transaction latencies.

* Reduce contention on the alloc_mutex by dropping it while incrementing
refs on a node/leaf and while dropping refs on a leaf.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

4a096752

Fix btrfs_wait_ordered_extent_range to properly wait · e5a2217e

Committed by Chris Mason on Jul 18, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

e5a2217e

Btrfs: Handle data checksumming on bios that span multiple ordered extents · 3edf7d33

Committed by Chris Mason on Jul 18, 2008

Data checksumming is done right before the bio is sent down the IO stack,
which means a single bio might span more than one ordered extent. In
this case, the checksumming data is split between two ordered extents.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

3edf7d33
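To make the split concrete, the following sketch computes how many bytes of a bio's file range land in a given ordered extent. It is only an illustration of the range arithmetic; `bytes_in_extent` is a hypothetical helper, not a function from the Btrfs source.

```c
#include <stdint.h>

/*
 * Number of bytes of the range [start, start + len) that fall inside the
 * extent [ext_start, ext_start + ext_len). Returns 0 when they are disjoint.
 */
uint64_t bytes_in_extent(uint64_t ext_start, uint64_t ext_len,
                         uint64_t start, uint64_t len)
{
    uint64_t end = start + len;
    uint64_t ext_end = ext_start + ext_len;
    uint64_t lo = start > ext_start ? start : ext_start;
    uint64_t hi = end < ext_end ? end : ext_end;

    return hi > lo ? hi - lo : 0;
}
```

For a bio covering bytes 4096 to 12287 and two ordered extents of 8192 bytes each starting at 0 and 8192, the helper attributes 4096 bytes to each extent, so each extent would receive the checksums for its own half.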

Btrfs: Cleanup and comment ordered-data.c · eb84ae03

Committed by Chris Mason on Jul 17, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

eb84ae03

Btrfs: Don't pin pages in ram until the entire ordered extent is on disk. · ba1da2f4

Committed by Chris Mason on Jul 17, 2008

Checksum items are not inserted until the entire ordered extent is on disk,
but individual pages might be clean and available for reclaim long before
the whole extent is on disk.

In order to allow those pages to be freed, we need to be able to search
the list of ordered extents to find the checksum that is going to be inserted
in the tree.  This way if the page needs to be read back in before
the checksums are in the btree, we'll be able to verify the checksum on
the page.

This commit adds the ability to search the pending ordered extents for
a given offset in the file, and changes btrfs_releasepage to allow
ordered pages to be freed.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

ba1da2f4

Btrfs: Update on disk i_size only after pending ordered extents are done · dbe674a9

Committed by Chris Mason on Jul 17, 2008

This changes the ordered data code to update i_size after the extent
is on disk.  An on disk i_size is maintained in the in-memory btrfs inode
structures, and this is updated as extents finish.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

dbe674a9

Btrfs: New data=ordered implementation · e6dcd2dc

Committed by Chris Mason on Jul 17, 2008

The old data=ordered code would force commit to wait until
all the data extents from the transaction were fully on disk.  This
introduced large latencies into the commit and stalled new writers
in the transaction for a long time.

The new code changes the way data allocations and extents work:

* When delayed allocation is filled, data extents are reserved, and
  the extent bit EXTENT_ORDERED is set on the entire range of the extent.
  A struct btrfs_ordered_extent is allocated and inserted into a per-inode
  rbtree to track the pending extents.

* As each page is written, EXTENT_ORDERED is cleared on the bytes corresponding
  to that page.

* When all of the bytes corresponding to a single struct btrfs_ordered_extent
  are written, the previously reserved extent is inserted into the FS
  btree and into the extent allocation trees.  The checksums for the file
  data are also updated.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

e6dcd2dc
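The three steps in this commit message amount to per-extent completion tracking. The sketch below shows the idea with a plain byte counter, which is an assumption made for brevity: the commit describes clearing the EXTENT_ORDERED bit on byte ranges, not decrementing a counter, and `ordered_extent_sketch` and its helpers are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct btrfs_ordered_extent. */
struct ordered_extent_sketch {
    uint64_t file_offset; /* first byte of the extent in the file */
    uint64_t len;         /* total bytes covered by the extent */
    uint64_t bytes_left;  /* bytes whose writeback has not finished */
};

/* Step 1: the extent is reserved and every byte starts out pending. */
void ordered_extent_init(struct ordered_extent_sketch *oe,
                         uint64_t file_offset, uint64_t len)
{
    oe->file_offset = file_offset;
    oe->len = len;
    oe->bytes_left = len;
}

/*
 * Step 2: record that [start, start + len) has been written.
 * Step 3: returns true once no bytes remain pending, the point at which
 * the caller would insert the extent and its checksums into the trees.
 */
bool ordered_extent_dec_pending(struct ordered_extent_sketch *oe,
                                uint64_t start, uint64_t len)
{
    assert(start >= oe->file_offset);
    assert(start + len <= oe->file_offset + oe->len);
    assert(len <= oe->bytes_left);

    oe->bytes_left -= len;
    return oe->bytes_left == 0;
}
```

For an 8192-byte extent written as two 4096-byte pages, the first call returns false and the second returns true, so the metadata update happens once per extent and the commit no longer has to wait on all data IO.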

Btrfs: Add a per-inode csum mutex to avoid races creating csum items · 1b1e2135

Committed by Chris Mason on Jun 25, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

1b1e2135

Fix btrfs_del_ordered_inode to allow forcing the drop during unlinks · 594a24eb

Committed by Chris Mason on Jun 25, 2008

This allows us to delete an unlinked inode with dirty pages from the list
instead of forcing commit to write these out before deleting the inode.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

594a24eb

btrfs delete ordered inode handling fix · e1b81e67

Committed by Mingming on May 27, 2008

Use btrfs_release_file instead of a put_inode call.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

e1b81e67

Btrfs: Fixes for 2.6.18 enterprise kernels · d6bfde87

Committed by Chris Mason on Apr 30, 2008

2.6.18 seems to get caught in an infinite loop when
cancel_rearming_delayed_workqueue is called more than once, so this switches
to cancel_delayed_work, which is arguably more correct.

Also, balance_dirty_pages can run into problems with 2.6.18 based kernels
because it doesn't have the per-bdi dirty limits. This avoids calling
balance_dirty_pages on the btree inode unless there is actually something
to balance, which is a good optimization in general.

Finally, there's a compile fix for ordered-data.h.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

d6bfde87

Btrfs: Throttle file_write when data=ordered is flushing the inode · 81d7ed29

Committed by Chris Mason on Apr 25, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

81d7ed29

Btrfs: Fix data=ordered vs wait_on_inode deadlock on older kernels · 4d5e74bc

Committed by Chris Mason on Jan 16, 2008

Using ilookup5 during data=ordered writeback could deadlock on I_LOCK.  This
saves a pointer to the inode instead.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

4d5e74bc

Btrfs: Run igrab on data=ordered inodes to prevent deadlocks during writeout · 2da98f00

Committed by Chris Mason on Jan 16, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

2da98f00

Rework btrfs_drop_inode to avoid scheduling · cee36a03

Committed by Chris Mason on Jan 15, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

cee36a03

Btrfs: Add data=ordered support · dc17ff8f

Committed by Chris Mason on Jan 08, 2008

This forces file data extents down to the disk along with the metadata that
references them. The current implementation is fairly simple, and just
writes out all of the dirty pages in an inode before the commit.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

dc17ff8f