Commits · 37d1aeee3990385e9bb436c50c2f7e120a668df6 · openanolis / cloud-kernel

25 Sep, 2008 40 commits

Committed by Chris Mason on Jul 31, 2008

This avoids waiting for transactions with pages locked by breaking out
the code to wait for the current transaction to close into a function
called by btrfs_throttle.

It also lowers the limits for where we start throttling.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

37d1aeee

Btrfs: Add missing hunk from Yan Zheng's cache reclaim patch · 47ac14fa
Committed by Chris Mason on Jul 31, 2008
```
Signed-off-by: Chris Mason <chris.mason@oracle.com>
```
47ac14fa

Btrfs: Add compatibility for kernels >= 2.6.27-rc1 · 0ee0fda0

Committed by Sven Wegener on Jul 30, 2008

Add a couple of #if's to follow API changes.
Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

0ee0fda0

Btrfs: implement memory reclaim for leaf reference cache · bcc63abb

Committed by Yan on Jul 30, 2008

The memory reclaiming issue happens when a snapshot exists. In that
case, some cache entries may not be used while an old snapshot is being dropped,
so they will remain in the cache until umount.

The patch adds a field to struct btrfs_leaf_ref to record create time. Besides,
the patch links all dead roots of a given snapshot together in order of
create time. After an old snapshot has been completely dropped, we check the dead
root list and remove all cache entries created before the oldest dead root in
the list.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

bcc63abb
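
The reclaim rule described in bcc63abb can be illustrated with a small userspace sketch. It is not the kernel code: `struct leaf_ref_sketch`, its `created` counter, and `reclaim_older_than()` are hypothetical names, and the cache is modeled as a plain array rather than whatever structure the patch actually uses. The only idea taken from the commit message is that entries created before the oldest remaining dead root can be dropped.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a cached leaf reference. */
struct leaf_ref_sketch {
    unsigned long bytenr;   /* disk block the entry describes */
    unsigned long created;  /* assumed monotonically increasing create time */
};

/*
 * Drop every entry created before the oldest dead root still on the list.
 * Survivors are compacted to the front of the array; the new count is returned.
 */
size_t reclaim_older_than(struct leaf_ref_sketch *refs, size_t nr,
                          unsigned long oldest_dead_root)
{
    size_t kept = 0;

    assert(refs != NULL || nr == 0);
    for (size_t i = 0; i < nr; i++) {
        if (refs[i].created >= oldest_dead_root)
            refs[kept++] = refs[i];
    }
    return kept;
}
```

With entries created at times 5, 9, 12 and 7 and an oldest dead root created at time 8, only the entries from times 9 and 12 remain.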

Btrfs: Fix verify_parent_transid · 33958dc6

Committed by Chris Mason on Jul 30, 2008

It was incorrectly clearing the up to date flag on the buffer even
when the buffer properly verified.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

33958dc6

Btrfs: Update and fix mount -o nodatacow · f321e491

Committed by Yan Zheng on Jul 30, 2008

To check whether a given file extent is referenced by multiple snapshots, the
checker walks down the fs tree through the dead root and checks all tree blocks in
the path.

We can easily detect whether a given tree block is directly referenced by another
snapshot. We can also detect any indirect reference from another snapshot by
checking the reference's generation. The checker can always detect multiple
references, but can't reliably detect cases of a single reference. So btrfs may
do file data cow even when there is only one reference.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

f321e491

Btrfs: async-thread: fix possible memory leak · 3bf10418

Committed by Li Zefan on Jul 30, 2008

When kthread_run() returns failure, this worker hasn't been
added to the list, so btrfs_stop_workers() won't free it.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

3bf10418
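
The leak pattern fixed in 3bf10418 can be shown with a userspace sketch. Everything here is hypothetical: `start_thread_sketch()` stands in for kthread_run() (with a flag to simulate failure), and `add_worker()` / `stop_workers()` only mimic the shape of the real functions. The point is that an object which never reached the list must be freed on the error path, because the teardown routine only walks the list.

```c
#include <assert.h>
#include <stdlib.h>

struct worker_sketch {
    struct worker_sketch *next;
    int started;
};

struct pool_sketch {
    struct worker_sketch *head;
    int nr;
};

/* Hypothetical thread starter; should_fail simulates a kthread_run() error. */
static int start_thread_sketch(struct worker_sketch *w, int should_fail)
{
    if (should_fail)
        return -1;
    w->started = 1;
    return 0;
}

int add_worker(struct pool_sketch *p, int should_fail)
{
    struct worker_sketch *w;

    assert(p != NULL);
    w = calloc(1, sizeof(*w));
    if (!w)
        return -1;
    if (start_thread_sketch(w, should_fail) != 0) {
        /* Not on the list yet, so stop_workers() would never see it. */
        free(w);
        return -1;
    }
    w->next = p->head;
    p->head = w;
    p->nr++;
    return 0;
}

/* Frees only what is on the list; returns how many workers were freed. */
int stop_workers(struct pool_sketch *p)
{
    int freed = 0;

    assert(p != NULL);
    while (p->head) {
        struct worker_sketch *w = p->head;
        p->head = w->next;
        free(w);
        freed++;
    }
    p->nr = 0;
    return freed;
}
```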

Btrfs: Throttle operations if the reference cache gets too large · ab78c84d

Committed by Chris Mason on Jul 29, 2008

A large reference cache is directly related to a lot of work pending
for the cleaner thread.  This throttles back new operations based on
the size of the reference cache so the cleaner thread will be able to keep
up.

Overall, this actually makes the FS faster because the cleaner thread will
be more likely to find things in cache.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

ab78c84d

Btrfs: Fix version.sh when used outside of an hg repo · 1a3f5d04
Committed by Chris Mason on Jul 29, 2008
```
Signed-off-by: Chris Mason <chris.mason@oracle.com>
```
1a3f5d04

Btrfs: Leaf reference cache update · 017e5369

Committed by Chris Mason on Jul 28, 2008

This changes the reference cache to make a single cache per root
instead of one cache per transaction, and to key by the byte number
of the disk block instead of the keys inside.

This makes it much less likely to have cache misses if a snapshot
or something has an extra reference on a higher node or a leaf while
the first transaction that added the leaf into the cache is dropping.

Some throttling is added to functions that free blocks heavily so they
wait for old transactions to drop.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

017e5369

Btrfs: Add a leaf reference cache · 31153d81

Committed by Yan Zheng on Jul 28, 2008

Much of the IO done while dropping snapshots is done looking up
leaves in the filesystem trees to see if they point to any extents and
to drop the references on any extents found.

This creates a cache so that IO isn't required.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

31153d81

Btrfs: Rev the disk format magic · 3a115f52
Committed by Chris Mason on Jul 24, 2008
```
Signed-off-by: Chris Mason <chris.mason@oracle.com>
```
3a115f52

Btrfs: Null terminate strings passed in from userspace · 5516e595

Committed by Mark Fasheh on Jul 24, 2008

The 'char name[BTRFS_PATH_NAME_MAX]' member of struct btrfs_ioctl_vol_args
is passed directly to strlen() after being copied from user. I haven't
verified this, but in theory a userspace program could pass in an
unterminated string and cause a kernel crash as strlen walks off the end of
the array.

This patch terminates the ->name string in all btrfs ioctl functions which
currently use a 'struct btrfs_ioctl_vol_args'. Since the string is now
properly terminated, its length will never be longer than
BTRFS_PATH_NAME_MAX so that error check has been removed.

By the way, it might be better overall to just have the ioctl pass an
unterminated string + length structure but I didn't bother with that since
it'd change the kernel/user interface.
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

5516e595
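
The fix in 5516e595 amounts to forcing a terminator into a fixed-size buffer before any string function touches it. The sketch below shows that idea in userspace. `PATH_NAME_MAX_SKETCH`, `struct vol_args_sketch`, and `terminate_name()` are hypothetical; in particular, the array size of `PATH_NAME_MAX_SKETCH + 1` is an assumption made so that a full-length name still has room for the terminator, and it is not taken from the real struct btrfs_ioctl_vol_args.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for BTRFS_PATH_NAME_MAX; the real value is not used here. */
#define PATH_NAME_MAX_SKETCH 16

/* Hypothetical stand-in for the ioctl argument struct. */
struct vol_args_sketch {
    long fd;
    char name[PATH_NAME_MAX_SKETCH + 1];
};

/*
 * Terminate the name in place, then measure it. After this call strlen()
 * can never walk past the end of the array, whatever the caller passed in.
 */
size_t terminate_name(struct vol_args_sketch *args)
{
    assert(args != NULL);
    args->name[PATH_NAME_MAX_SKETCH] = '\0';
    return strlen(args->name);
}
```

Because the last byte is always overwritten, a separate "name too long" check becomes unnecessary, which matches the removal described in the commit message.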

Fix path slots selection in btrfs_search_forward · 9652480b

Committed by Yan on Jul 24, 2008

We should decrease the found slot by one, as btrfs_search_slot does,
when bin_search returns 1 and node level > 0.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

9652480b
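
The slot adjustment described in 9652480b can be sketched as follows. This is an illustration under assumptions, not the kernel routine: keys are plain integers, `bin_search_sketch()` is a hypothetical binary search that returns 0 on an exact match and 1 otherwise (leaving the slot at the insertion position), and each key in an interior node is assumed to be the lowest key of the child it points to.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Returns 0 and sets *slot on an exact match. Otherwise returns 1 and sets
 * *slot to the position where key would be inserted.
 */
static int bin_search_sketch(const unsigned long *keys, int nr,
                             unsigned long key, int *slot)
{
    int low = 0;
    int high = nr;

    while (low < high) {
        int mid = low + (high - low) / 2;

        if (keys[mid] < key) {
            low = mid + 1;
        } else if (keys[mid] > key) {
            high = mid;
        } else {
            *slot = mid;
            return 0;
        }
    }
    *slot = low;
    return 1;
}

/*
 * Pick the slot to follow. In an interior node (level > 0) a miss lands one
 * past the child whose range covers the key, so step back by one.
 */
int pick_slot(const unsigned long *keys, int nr, unsigned long key, int level)
{
    int slot;
    int ret;

    assert(keys != NULL && nr > 0);
    ret = bin_search_sketch(keys, nr, key, &slot);
    if (level > 0 && ret == 1 && slot > 0)
        slot -= 1;
    return slot;
}
```

For node keys 10, 20, 30 and a search for 25, the raw search stops at slot 2, while the child that can contain 25 is the one at slot 1.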

Btrfs: Fix .. lookup corner case · 445dceb7

Committed by Yan on Jul 24, 2008

Inode ref item can be in the next leaf when we find "path->slots[0] ==
btrfs_header_nritems(...)".
Signed-off-by: Chris Mason <chris.mason@oracle.com>

445dceb7

Btrfs: Properly release lock in pin_down_bytes · 974e35a8

Committed by Yan on Jul 24, 2008

When the buffer isn't uptodate, pin_down_bytes may leave the tree locked
after it returns.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

974e35a8

Btrfs: Remove unused variable in fixup_tree_root_location · 45467261

Committed by Balaji Rao on Jul 24, 2008

Remove an unused variable 'path' in fixup_tree_root_location.
Signed-off-by: Balaji Rao <balajirrao@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

45467261

Btrfs: Fix a few functions that exit without stopping their transaction · 8e8a1e31
Committed by Josef Bacik on Jul 24, 2008
```
Signed-off-by: Chris Mason <chris.mason@oracle.com>
```
8e8a1e31

Btrfs: Create orphan inode records to prevent lost files after a crash · 7b128766
Committed by Josef Bacik on Jul 24, 2008
```
Signed-off-by: Chris Mason <chris.mason@oracle.com>
```
7b128766

Btrfs: Add ACL support · 33268eaf
Committed by Josef Bacik on Jul 24, 2008
```
Signed-off-by: Chris Mason <chris.mason@oracle.com>
```
33268eaf

Btrfs: Remove unused xattr code · 6099afe8
Committed by Josef Bacik on Jul 24, 2008
```
Signed-off-by: Chris Mason <chris.mason@oracle.com>
```
6099afe8

Btrfs: Implement new dir index format · aec7477b
Committed by Josef Bacik on Jul 24, 2008
```
Signed-off-by: Chris Mason <chris.mason@oracle.com>
```
aec7477b

Btrfs: Fix the defragmention code and the block relocation code for data=ordered · 3eaa2885

Committed by Chris Mason on Jul 24, 2008

Before setting an extent to delalloc, the code needs to wait for
pending ordered extents.

Also, the relocation code needs to wait for ordered IO before scanning
the block group again.  This is because the extents are not removed
until the IO for the new extents is finished.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

3eaa2885

Btrfs: Use assert_spin_locked instead of spin_trylock · 64f26f74
Committed by David Woodhouse on Jul 24, 2008
```
On UP systems spin_trylock always succeeds.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
```
64f26f74

Btrfs: Add version strings on module load · b3c3da71
Committed by Chris Mason on Jul 23, 2008
```
Signed-off-by: Chris Mason <chris.mason@oracle.com>
```
b3c3da71

Btrfs: Fix some build problems on 2.6.18 based enterprise kernels · 4881ee5a
Committed by Chris Mason on Jul 24, 2008
```
Signed-off-by: Chris Mason <chris.mason@oracle.com>
```
4881ee5a

Btrfs: Search data ordered extents first for checksums on read · 89642229

Committed by Chris Mason on Jul 24, 2008

Checksum items are not inserted into the tree until all of the io from a
given extent is complete. This means one dirty page from an extent may
be written, freed, and then read again before the entire extent is on disk
and the checksum item is inserted.

The checksums themselves are stored in the ordered extent so they can
be inserted in bulk when IO is complete. On read, if a checksum item isn't
found, the ordered extents were being searched for a checksum record.

This all worked most of the time, but the checksum insertion code tries
to reduce the number of tree operations by pre-inserting checksum items
based on i_size and a few other factors. This means the read code might
find a checksum item that hasn't yet really been filled in.

This commit changes things to check the ordered extents first and only
dive into the btree if nothing was found. This removes the need for
extra locking and is more reliable.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

89642229
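
The lookup order introduced by 89642229 can be sketched in userspace. The types and names below (`struct csum_entry_sketch`, `find_csum()`, `lookup_csum()`) are hypothetical, and both the ordered-extent records and the btree are modeled as flat arrays. A tree entry with a zero checksum stands in for a pre-inserted item that has not been filled in yet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct csum_entry_sketch {
    uint64_t offset;  /* file offset the checksum covers */
    uint32_t csum;
};

static const struct csum_entry_sketch *
find_csum(const struct csum_entry_sketch *tbl, size_t nr, uint64_t offset)
{
    for (size_t i = 0; i < nr; i++) {
        if (tbl[i].offset == offset)
            return &tbl[i];
    }
    return NULL;
}

/*
 * Check the pending ordered extents first and fall back to the tree only
 * when nothing is pending. Returns 1 and stores the checksum on success,
 * 0 when no record exists anywhere.
 */
int lookup_csum(const struct csum_entry_sketch *ordered, size_t nr_ordered,
                const struct csum_entry_sketch *tree, size_t nr_tree,
                uint64_t offset, uint32_t *out)
{
    const struct csum_entry_sketch *e;

    assert(out != NULL);
    e = find_csum(ordered, nr_ordered, offset);
    if (!e)
        e = find_csum(tree, nr_tree, offset);
    if (!e)
        return 0;
    *out = e->csum;
    return 1;
}
```

Searching the tree first would return the unfilled placeholder for an offset whose IO is still in flight; searching the ordered records first returns the real checksum.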

Btrfs: Fix 32 bit compiles by using an unsigned long byte count in the ordered extent · 9ba4611a
Committed by Chris Mason on Jul 23, 2008
```
The ordered extents have to fit in memory, so an unsigned long is sufficient.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
```
9ba4611a

Btrfs: Take the csum mutex while reading checksums · ed98b56a
Committed by Chris Mason on Jul 22, 2008
```
Signed-off-by: Chris Mason <chris.mason@oracle.com>
```
ed98b56a

Btrfs: alloc_mutex latency reduction · c286ac48

Committed by Chris Mason on Jul 22, 2008

This releases the alloc_mutex in a few places that hold it over long
operations.  btrfs_lookup_block_group is changed so that it doesn't need
the mutex at all.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

c286ac48

Btrfs: Add some conditional schedules near the alloc_mutex · e34a5b4f

Committed by Chris Mason on Jul 22, 2008

This helps prevent stalls, especially while the snapshot cleaner is
running hard.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

e34a5b4f

Btrfs: Use mutex_lock_nested for tree locking · 6dddcbeb

Committed by Chris Mason on Jul 22, 2008

Lockdep has the notion of locking subclasses so that you can identify
locks you expect to be taken after other locks of the same class. This
changes the per-extent buffer btree locking routines to use a subclass based
on the level in the tree.

Unfortunately, lockdep can only handle 8 total subclasses, and the btrfs
max level is also 8. So when lockdep is on, use a lower max level.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

6dddcbeb

Btrfs: Fix some data=ordered related data corruptions · f421950f

Committed by Chris Mason on Jul 22, 2008

Stress testing was showing data checksum errors, most of which were caused
by a lookup bug in the extent_map tree.  The tree was caching the last
pointer returned, and searches would check the last pointer first.

But, search callers also expect the search to return the very first
matching extent in the range, which wasn't always true with the last
pointer usage.

For now, the code to cache the last return value is just removed.  It is
easy to fix, but I think lookups are rare enough that it isn't required anymore.

This commit also replaces do_sync_mapping_range with a local copy of the
related functions.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

f421950f
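
The lookup bug described in f421950f can be reproduced in miniature. This sketch replaces the extent_map tree with a sorted array and uses hypothetical names throughout (`struct extent_sketch`, `lookup_first()`, `lookup_cached()`). It only demonstrates the failure mode named in the commit message: a cached "last hit" that overlaps the requested range is returned even when an earlier extent also matches.

```c
#include <assert.h>
#include <stddef.h>

struct extent_sketch {
    unsigned long start;
    unsigned long len;
};

static int overlaps(const struct extent_sketch *e,
                    unsigned long start, unsigned long len)
{
    return e->start < start + len && start < e->start + e->len;
}

/* Correct behaviour: index of the first extent overlapping the range, or -1. */
int lookup_first(const struct extent_sketch *map, int nr,
                 unsigned long start, unsigned long len)
{
    for (int i = 0; i < nr; i++) {
        if (overlaps(&map[i], start, len))
            return i;
    }
    return -1;
}

/* Buggy variant: trusts the cached index whenever it overlaps the range. */
int lookup_cached(const struct extent_sketch *map, int nr, int *last,
                  unsigned long start, unsigned long len)
{
    assert(last != NULL);
    if (*last >= 0 && *last < nr && overlaps(&map[*last], start, len))
        return *last;
    *last = lookup_first(map, nr, start, len);
    return *last;
}
```

With three adjacent 4096-byte extents, a lookup of the third extent followed by a lookup of the whole 12288-byte range makes the cached variant return index 2, while callers expect index 0.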

Btrfs: Use a mutex in the extent buffer for tree block locking · a61e6f29

Committed by Chris Mason on Jul 22, 2008

This replaces the use of the page cache lock bit for locking, which wasn't
suitable for block size < page size and couldn't be used recursively.

The mutexes alone don't fix either problem, but they are the first step.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

a61e6f29

Btrfs: Index extent buffers in an rbtree · 6af118ce

Committed by Chris Mason on Jul 22, 2008

Before, extent buffers were a temporary object, meant to map a number of pages
at once and collect operations on them.

But a few extra fields have crept in, and they are also the best place to
store a per-tree block lock field.  This commit puts the extent
buffers into an rbtree, and ensures a single extent buffer for each
tree block.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

6af118ce

Btrfs: Data ordered fixes · 4a096752

Committed by Chris Mason on Jul 21, 2008

* In btrfs_delete_inode, wait for ordered extents after calling
truncate_inode_pages.  This is much faster, and more correct.

* Properly clear out the PageChecked bit everywhere we redirty the page.

* Change the writepage fixup handler to lock the page range and check to
see if an ordered extent had been inserted since the improperly dirtied
page was discovered.

* Wait for ordered extents outside the transaction.  This isn't required
for locking rules but does improve transaction latencies.

* Reduce contention on the alloc_mutex by dropping it while incrementing
refs on a node/leaf and while dropping refs on a leaf.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

4a096752

Fix btrfs_wait_ordered_extent_range to properly wait · e5a2217e
Committed by Chris Mason on Jul 18, 2008
```
Signed-off-by: Chris Mason <chris.mason@oracle.com>
```
e5a2217e

Btrfs: Keep extent mappings in ram until pending ordered extents are done · 7f3c74fb

Committed by Chris Mason on Jul 18, 2008

It was possible for stale mappings from disk to be used instead of the
new pending ordered extent. This adds a flag to the extent map struct
to keep it pinned until the pending ordered extent is actually on disk.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

7f3c74fb

Btrfs: Don't allow releasepage to succeed if EXTENT_ORDERED is set · 211f90e6
Committed by Chris Mason on Jul 18, 2008
```
Signed-off-by: Chris Mason <chris.mason@oracle.com>
```
211f90e6

Btrfs: Handle data checksumming on bios that span multiple ordered extents · 3edf7d33

Committed by Chris Mason on Jul 18, 2008

Data checksumming is done right before the bio is sent down the IO stack,
which means a single bio might span more than one ordered extent. In
this case, the checksumming data is split between two ordered extents.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

3edf7d33
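
The situation handled by 3edf7d33 can be pictured with a small sketch. The names are hypothetical (`struct ordered_extent_sketch`, `add_bio_csums()`), a bio is reduced to a byte range, and one checksum per block is assumed. Each block's checksum is attributed to whichever ordered extent contains that block, so a bio that crosses an extent boundary contributes to more than one extent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct ordered_extent_sketch {
    uint64_t start;
    uint64_t len;
    unsigned nr_csums;  /* checksums recorded against this extent */
};

/*
 * Walk the bio one block at a time and credit each block to the ordered
 * extent that contains it. Returns the number of blocks that fell outside
 * every extent.
 */
unsigned add_bio_csums(struct ordered_extent_sketch *oe, size_t nr,
                       uint64_t bio_start, uint64_t bio_len,
                       uint64_t block_size)
{
    unsigned missed = 0;

    assert(block_size > 0);
    for (uint64_t off = bio_start; off < bio_start + bio_len; off += block_size) {
        size_t i;

        for (i = 0; i < nr; i++) {
            if (off >= oe[i].start && off < oe[i].start + oe[i].len) {
                oe[i].nr_csums++;
                break;
            }
        }
        if (i == nr)
            missed++;
    }
    return missed;
}
```

For two adjacent 8192-byte extents and an 8192-byte bio starting at offset 4096, one 4096-byte block lands in each extent.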

openanolis / cloud-kernel synced successfully about 1 year ago
