Commits · 247e743cbe6e655768c3679f84821e03c1577902 · openeuler / Kernel

25 Sep, 2008 · 40 commits

Btrfs: Use async helpers to deal with pages that have been improperly dirtied · 247e743c

Committed by Chris Mason 16 years ago

Higher layers sometimes call set_page_dirty without asking the filesystem
to help. This causes many problems for the data=ordered and cow code.
This commit detects pages that haven't been properly set up for IO and
kicks off an async helper to deal with them.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: New data=ordered implementation · e6dcd2dc

Committed by Chris Mason 16 years ago

The old data=ordered code would force commit to wait until
all the data extents from the transaction were fully on disk.  This
introduced large latencies into the commit and stalled new writers
in the transaction for a long time.

The new code changes the way data allocations and extents work:

* When delayed allocation is filled, data extents are reserved, and
  the extent bit EXTENT_ORDERED is set on the entire range of the extent.
  A struct btrfs_ordered_extent is allocated and inserted into a per-inode
  rbtree to track the pending extents.

* As each page is written EXTENT_ORDERED is cleared on the bytes corresponding
  to that page.

* When all of the bytes corresponding to a single struct btrfs_ordered_extent
  are written, the previously reserved extent is inserted into the FS
  btree and into the extent allocation trees.  The checksums for the file
  data are also updated.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
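
The three steps of the new data=ordered scheme can be sketched in user space. This is a minimal illustration, not the kernel code: `struct ordered_extent_sketch`, its fields, and both helpers are hypothetical names, a 64-bit mask stands in for the EXTENT_ORDERED bits, and a 4096-byte page size is assumed.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size, not taken from the commit */

/* Hypothetical, simplified stand-in for struct btrfs_ordered_extent. */
struct ordered_extent_sketch {
    uint64_t file_offset;   /* page-aligned start of the reserved range */
    unsigned int nr_pages;  /* pages covered by the extent, at most 64 */
    uint64_t ordered_bits;  /* bit i set: page i has not been written yet */
};

/* Step 1: the extent is reserved and every page in it is marked ordered. */
static int ordered_extent_init(struct ordered_extent_sketch *oe,
                               uint64_t file_offset, unsigned int nr_pages)
{
    if (nr_pages == 0 || nr_pages > 64)
        return -1;
    oe->file_offset = file_offset;
    oe->nr_pages = nr_pages;
    oe->ordered_bits = nr_pages == 64 ? UINT64_MAX
                                      : (UINT64_C(1) << nr_pages) - 1;
    return 0;
}

/*
 * Step 2: a page finished writing, so its ordered bit is cleared.
 * Step 3: returns true once no bits remain, which is the point where
 * the caller would insert the extent into the btree and update checksums.
 * Pages outside the extent are ignored and return false.
 */
static bool ordered_extent_page_written(struct ordered_extent_sketch *oe,
                                        uint64_t page_offset)
{
    uint64_t idx;

    if (page_offset < oe->file_offset)
        return false;
    idx = (page_offset - oe->file_offset) / SKETCH_PAGE_SIZE;
    if (idx >= oe->nr_pages)
        return false;
    oe->ordered_bits &= ~(UINT64_C(1) << idx);
    return oe->ordered_bits == 0;
}
```

The point of the design is that commit no longer waits for data: metadata for an extent is only inserted after the last page of that extent reports completion.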

Btrfs: Drop some verbose printks · 77a41afb

Committed by Chris Mason 16 years ago

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Add locking around volume management (device add/remove/balance) · 7d9eb12c

Committed by Chris Mason 16 years ago

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Fix deadlock while searching for dead roots on mount · a7a16fd7

Committed by Chris Mason 16 years ago

btrfs_find_dead_roots called btrfs_read_fs_root_no_radix, which
means we end up calling btrfs_search_slot with a path already held.

The fix is to remember the key inside btrfs_find_dead_roots and drop
the path.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Reduce contention on the root node · f9efa9c7

Committed by Chris Mason 16 years ago

This calls unlock_up sooner in btrfs_search_slot in order to decrease the
amount of work done with the higher level tree locks held.

Also, it changes btrfs_tree_lock to spin for a bit against the page lock
before scheduling. This makes a big difference in context switch rate under
highly contended workloads.

Longer term, a better locking structure is needed than the page lock.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
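
The "spin for a bit before scheduling" idea can be illustrated in user space. This is a minimal sketch under stated assumptions, not the btrfs_tree_lock implementation: it uses a C11 `atomic_flag` instead of the page lock, `sched_yield()` as a stand-in for scheduling away, and the `sy_*` names and `SY_SPIN_TRIES` value are hypothetical.

```c
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SY_SPIN_TRIES 512  /* hypothetical tuning value */

/* Hypothetical lock type; initialize the flag with ATOMIC_FLAG_INIT. */
struct sy_lock {
    atomic_flag held;
};

/* One non-blocking attempt; returns true if the lock was taken. */
static bool sy_trylock(struct sy_lock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/*
 * Spin for a bounded number of attempts first. Only when that fails
 * is the CPU given up, which is what keeps the context switch rate
 * down when the lock is held for short periods.
 */
static void sy_lock_acquire(struct sy_lock *l)
{
    for (;;) {
        for (int i = 0; i < SY_SPIN_TRIES; i++) {
            if (sy_trylock(l))
                return;
        }
        sched_yield();
    }
}

static void sy_lock_release(struct sy_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

Spinning only pays off when hold times are short relative to the cost of a context switch, which matches the commit's remark about highly contended workloads.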

Btrfs: Online btree defragmentation fixes · 3f157a2f

Committed by Chris Mason 16 years ago

The btree defragger wasn't making forward progress because the new key wasn't
being saved by the btrfs_search_forward function.

This also disables the automatic btree defrag because it wasn't scaling well to
huge filesystems. The auto-defrag needs to be done differently.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Add a per-inode csum mutex to avoid races creating csum items · 1b1e2135

Committed by Chris Mason 16 years ago

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Change find_extent_buffer to use TestSetPageLocked · 079899c2

Committed by Chris Mason 16 years ago

This makes it possible for callers to check for extent_buffers in cache
without deadlocking against any btree locks held.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Add btree locking to the tree defragmentation code · e7a84565

Committed by Chris Mason 16 years ago

The online btree defragger is simplified and rewritten to use
standard btree searches instead of a walk up / down mechanism.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Replace the transaction work queue with kthreads · a74a4b97

Committed by Chris Mason 16 years ago

This creates one kthread for commits and one kthread for
deleting old snapshots.  All the work queues are removed.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Add btrfs_end_transaction_throttle to force writers to wait for pending commits · 89ce8a63

Committed by Chris Mason 16 years ago

The existing throttle mechanism was often not sufficient to prevent
new writers from coming in and making a given transaction run forever.
This adds an explicit wait at the end of most operations so they will
allow the current transaction to close.

There is no wait inside file_write, inode updates, or cow filling, all of which
have different deadlock possibilities.

This is a temporary measure until better asynchronous commit support is
added.  This code leads to stalls as it waits for data=ordered
writeback, and it really needs to be fixed.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Fix snapshot deletion to release the alloc_mutex much more often. · 333db94c

Committed by Chris Mason 16 years ago

This lowers the impact of snapshot deletion on the rest of the FS.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Add a skip_locking parameter to struct path, and make various funcs honor it · 5cd57b2c

Committed by Chris Mason 16 years ago

Allocations may need to read in block groups from the extent allocation tree,
which will require a tree search and take locks on the extent allocation
tree.  But, those locks might already be held in other places, leading
to deadlocks.

Since the alloc_mutex serializes everything right now, it is safe to
skip the btree locking while caching block groups.  A better fix will be
to either create a recursive lock or find a way to back off existing
locks while caching block groups.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Fix btrfs_next_leaf to check for new items after dropping locks · 168fd7d2

Committed by Chris Mason 16 years ago

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Fix btrfs_del_ordered_inode to allow forcing the drop during unlinks · 594a24eb

Committed by Chris Mason 16 years ago

This allows us to delete an unlinked inode with dirty pages from the list
instead of forcing commit to write these out before deleting the inode.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Drop locks in btrfs_search_slot when reading a tree block. · 051e1b9f

Committed by Chris Mason 16 years ago

One lock per btree block can make for significant congestion if everyone
has to wait for IO at the high levels of the btree. This drops
locks held by a path when doing reads during a tree search.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Replace the big fs_mutex with a collection of other locks · a2135011

Committed by Chris Mason 16 years ago

Extent allocations are still protected by a large alloc_mutex.
Objectid allocations are covered by an objectid mutex.
Other btree operations are protected by a lock on individual btree nodes.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Start btree concurrency work. · 925baedd

Committed by Chris Mason 16 years ago

The allocation trees and the chunk trees are serialized via their own
dedicated mutexes.  This means allocation location is still not very
fine grained.

The main FS btree is protected by locks on each block in the btree.  Locks
are taken top / down, and as processing finishes on a given level of the
tree, the lock is released after locking the lower level.

The end result of a search is now a path where only the lowest level
is locked.  Releasing or freeing the path drops any locks held.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
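
The top-down locking described in 925baedd (lock the lower level, then release the level above) is commonly called lock coupling. A minimal user-space sketch, assuming POSIX mutexes in place of the kernel's per-block locks: `struct block_sketch`, `struct path_sketch`, and both functions are hypothetical, and the tree is reduced to a single chain of blocks so that no key comparison is needed.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, simplified tree block: one lock per block. */
struct block_sketch {
    pthread_mutex_t lock;
    struct block_sketch *child;  /* next lower level, NULL at the leaf */
};

/* Hypothetical path: records the one block that is still locked. */
struct path_sketch {
    struct block_sketch *locked;
};

/*
 * Walk from the root to the leaf. At each step the lower level is
 * locked before the level above is released, so no other walker can
 * slip in between the two blocks. On return only the leaf is locked.
 */
static void search_down(struct block_sketch *root, struct path_sketch *p)
{
    struct block_sketch *cur = root;

    pthread_mutex_lock(&cur->lock);
    while (cur->child != NULL) {
        struct block_sketch *next = cur->child;

        pthread_mutex_lock(&next->lock);
        pthread_mutex_unlock(&cur->lock);
        cur = next;
    }
    p->locked = cur;
}

/* Releasing the path drops whatever lock it still holds. */
static void release_path(struct path_sketch *p)
{
    if (p->locked != NULL) {
        pthread_mutex_unlock(&p->locked->lock);
        p->locked = NULL;
    }
}
```

Because upper levels are released as the search descends, writers working in different subtrees only contend briefly near the root.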

Btrfs: Add a thread pool just for submit_bio · 1cc127b5

Committed by Chris Mason 16 years ago

If a bio submission is queued on the work queue behind a lock holder that
is waiting for that bio, it is possible to deadlock.  Move the bios
into their own pool.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

BTRFS_IOC_TRANS_START should be privileged · df5b5520

Committed by Christoph Hellwig 16 years ago

As mentioned in the comment next to it, btrfs_ioctl_trans_start can
do bad damage to filesystems and thus should be limited to privileged
users.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: split out ioctl.c · f46b5a66

Committed by Christoph Hellwig 16 years ago

Split the ioctl handling out of inode.c into a file of its own.
Also fix up checkpatch.pl warnings for the moved code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: kerneldoc comments for extent_map.c · 9d2423c5

Committed by Christoph Hellwig 16 years ago

Add kerneldoc comments for all exported functions.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Add a mount option to control worker thread pool size · 4543df7e

Committed by Chris Mason 16 years ago

mount -o thread_pool_size changes the default, which is
min(num_cpus + 2, 8).  Larger thread pools would make more sense on
very large disk arrays.

This mount option controls the max size of each thread pool.  There
are multiple thread pools, so the total worker count will be larger
than the mount option.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
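
The default quoted in 4543df7e, min(num_cpus + 2, 8), works out as follows. This is a small sketch with hypothetical helper names; treating a mount option value of 0 as "not set" is an assumption made for the illustration, not something the commit states.

```c
/* Default per-pool limit as described in the commit: min(num_cpus + 2, 8). */
static int default_thread_pool_size(int num_cpus)
{
    int size = num_cpus + 2;

    return size < 8 ? size : 8;
}

/*
 * Hypothetical helper: a thread_pool_size mount option overrides the
 * default. A value of 0 is assumed here to mean "option not given".
 */
static int effective_thread_pool_size(int num_cpus, int mount_opt)
{
    return mount_opt > 0 ? mount_opt : default_thread_pool_size(num_cpus);
}
```

For example, a 4-CPU machine gets a default of 6 workers per pool, and any machine with 6 or more CPUs is capped at 8 unless the mount option raises the limit.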

Btrfs: Worker thread optimizations · 35d8ba66

Committed by Chris Mason 16 years ago

This changes the worker thread pool to maintain a list of idle threads,
avoiding a complex search for a good thread to wake up.

Threads have two states:

idle - we try to reuse the last thread used in hopes of improving the batching
ratios

busy - each time a new work item is added to a busy task, the task is
rotated to the end of the line.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Add backport for the kthread work on kernels older than 2.6.20 · d05e5a4d

Committed by Chris Mason 16 years ago

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Fix mount -o max_inline=0 · 15ada040

Committed by Chris Mason 16 years ago

max_inline=0 used to force the max_inline size to one sector instead.  Now
it properly disables inline data items, while still being able to read
any that happen to exist on disk.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Add async worker threads for pre and post IO checksumming · 8b712842

Committed by Chris Mason 16 years ago

Btrfs has been using workqueues to spread the checksumming load across
other CPUs in the system.  But, workqueues only schedule work on the
same CPU that queued the work, giving them a limited benefit for systems with
higher CPU counts.

This code adds a generic facility to schedule work with pools of kthreads,
and changes the bio submission code to queue bios up.  The queueing is
important to make sure large numbers of procs on the system don't
turn streaming workloads into random workloads by sending IO down
concurrently.

The end result of all of this is much higher performance (and CPU usage) when
doing checksumming on large machines.  Two worker pools are created,
one for writes and one for endio processing.  The two could deadlock if
we tried to service both from a single pool.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

btrfs: allow scanning multiple devices during mount · 43e570b0

Committed by Christoph Hellwig 16 years ago

Allows specifying one or more device=/dev/foo options during mount
so that ioctls on the control device can be avoided.  Especially useful
when trying to mount a multi-device setup as root.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

btrfs: sanity mount option parsing and early mount code · edf24abe

Committed by Christoph Hellwig 16 years ago

Also adds lots of comments to describe what's going on here.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

btrfs: fix strange indentation in lookup_extent_mapping · 306929f3

Committed by Christoph Hellwig 16 years ago

Signed-off-by: Chris Mason <chris.mason@oracle.com>

btrfs: tiny makefile cleanup · 95c9eb17

Committed by Christoph Hellwig 16 years ago

Use normal kbuild syntax to build acl.o conditionally and remove
commented-out lines.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: transaction ioctls · 6bf13c0c

Committed by Sage Weil 16 years ago

These ioctls let a user application hold a transaction open while it
performs a series of operations.  A final ioctl does a sync on the fs
(closing the current transaction).  This is the main requirement for
Ceph's OSD to be able to keep the data it's storing in a btrfs volume
consistent, and AFAICS it works just fine.  The application would do
something like

	fd = ::open("some/file", O_RDONLY);
	::ioctl(fd, BTRFS_IOC_TRANS_START);
	/* do a bunch of stuff */
	::ioctl(fd, BTRFS_IOC_TRANS_END);
or just
	::close(fd);

And to ensure it commits to disk,

	::ioctl(fd, BTRFS_IOC_SYNC);

When a transaction is held open, the trans_handle is attached to the
struct file (via private_data) so that it will get cleaned up if the
process dies unexpectedly.  A held transaction is also ended on fsync() to
avoid a deadlock.

A misbehaving application could also deliberately hold a transaction open,
effectively locking up the FS, so it may make sense to restrict something
like this to root or something.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Disable acl xattr handlers · eba12c7b

Committed by Yan 16 years ago

The acl code is not yet complete, and the xattr handlers are causing
problems for cp -p on some distros.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: bdi_init and bdi_destroy come with 2.6.23 · 51ebc0d3

Committed by Jan Engelhardt 16 years ago

Signed-off-by: Chris Mason <chris.mason@oracle.com>

btrfsctl -A error code fixup · f819d837

Committed by Linda Knippers 16 years ago

Send the error back to userland if the ioctl fails.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Invalidate dcache entry after creating snapshot and · 3b96362c

Committed by Sven Wegener 16 years ago

We need to invalidate an existing dcache entry after creating a new
snapshot or subvolume, because a negative dcache entry will stop us from
accessing the new snapshot or subvolume.

---
  ctree.h       |   23 +++++++++++++++++++++++
  inode.c       |    4 ++++
  transaction.c |    4 ++++
  3 files changed, 31 insertions(+)
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Fix race in running_transaction checks · 48ec2cf8

Committed by Chris Mason 16 years ago

When a new transaction was started, the code would incorrectly
set the pointer in fs_info before all the data structures were set up.
fsync-heavy workloads hit races on the setup of the ordered inode spinlock.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

btrfs delete ordered inode handling fix · e1b81e67

Committed by Mingming 16 years ago

Use btrfs_release_file instead of a put_inode call.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Always use the async submission queue for checksummed writes · da496f2a

Committed by Chris Mason 16 years ago

This avoids IO stalls and poorly ordered IO from inline writers mixing in
with the async submission queue.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

openeuler / Kernel · synced successfully more than 1 year ago