Commits · 8b7128429235d9bd72cfd5ed20c77c4f3118f744 · openanolis / cloud-kernel

25 Sep, 2008 40 commits

Btrfs: Add async worker threads for pre and post IO checksumming · 8b712842

Committed by Chris Mason on Jun 11, 2008

Btrfs has been using workqueues to spread the checksumming load across
other CPUs in the system.  But, workqueues only schedule work on the
same CPU that queued the work, giving them a limited benefit for systems with
higher CPU counts.

This code adds a generic facility to schedule work with pools of kthreads,
and changes the bio submission code to queue bios up.  The queueing is
important to make sure large numbers of procs on the system don't
turn streaming workloads into random workloads by sending IO down
concurrently.

The end result of all of this is much higher performance (and CPU usage) when
doing checksumming on large machines.  Two worker pools are created,
one for writes and one for endio processing.  The two could deadlock if
we tried to service both from a single pool.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

8b712842

btrfs: sanity mount option parsing and early mount code · edf24abe

Committed by Christoph Hellwig on Jun 10, 2008

Also adds lots of comments to describe what's going on here.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

edf24abe

Btrfs: transaction ioctls · 6bf13c0c

Committed by Sage Weil on Jun 10, 2008

These ioctls let a user application hold a transaction open while it
performs a series of operations.  A final ioctl does a sync on the fs
(closing the current transaction).  This is the main requirement for
Ceph's OSD to be able to keep the data it's storing in a btrfs volume
consistent, and AFAICS it works just fine.  The application would do
something like

	fd = ::open("some/file", O_RDONLY);
	::ioctl(fd, BTRFS_IOC_TRANS_START);
	/* do a bunch of stuff */
	::ioctl(fd, BTRFS_IOC_TRANS_END);
or just
	::close(fd);

And to ensure it commits to disk,

	::ioctl(fd, BTRFS_IOC_SYNC);

When a transaction is held open, the trans_handle is attached to the
struct file (via private_data) so that it will get cleaned up if the
process dies unexpectedly.  A held transaction is also ended on fsync() to
avoid a deadlock.

A misbehaving application could also deliberately hold a transaction open,
effectively locking up the FS, so it may make sense to restrict something
like this to root or something.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

6bf13c0c

Btrfs: Invalidate dcache entry after creating snapshot and · 3b96362c

Committed by Sven Wegener on Jun 09, 2008

We need to invalidate an existing dcache entry after creating a new
snapshot or subvolume, because a negative dcache entry will stop us from
accessing the new snapshot or subvolume.

---
  ctree.h       |   23 +++++++++++++++++++++++
  inode.c       |    4 ++++
  transaction.c |    4 ++++
  3 files changed, 31 insertions(+)
Signed-off-by: Chris Mason <chris.mason@oracle.com>

3b96362c

Btrfs: Allocator fix variety pack · 0ef3e66b

Committed by Chris Mason on May 24, 2008

* Force chunk allocation when find_free_extent has to do a full scan
* Record the max key at the start of defrag so it doesn't run forever
* Block groups might not be contiguous, make a forward search for the
  next block group in extent-tree.c
* Get rid of extra checks for total fs size
* Fix relocate_one_reference to avoid relocating the same file data block
  twice when referenced by an older transaction
* Use the open device count when allocating chunks so that we don't
  try to allocate from devices that don't exist
Signed-off-by: Chris Mason <chris.mason@oracle.com>

0ef3e66b

Btrfs: Change the congestion functions to meter the number of async submits as well · cb03c743

Committed by Chris Mason on May 15, 2008

The async submit workqueue was absorbing too many requests, which led to
long stalls in the async submitters.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

cb03c743

Btrfs: Add mount -o degraded to allow mounts to continue with missing devices · dfe25020

Committed by Chris Mason on May 13, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

dfe25020

Btrfs: Update nodatacow mode to support cloned single files and resizing · a68d5933

Committed by Chris Mason on May 08, 2008

Before, nodatacow only checked to make sure multiple roots didn't have
references on a single extent.  This check makes sure that multiple
inodes don't have references.

nodatacow needed an extra check to see if the block group was currently
readonly.  This way cows forced by the chunk relocation code are honored.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

a68d5933

Btrfs: Properly find the root for snapshotted blocks during chunk relocation · bf4ef679

Committed by Chris Mason on May 08, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

bf4ef679

Btrfs: Add support for online device removal · a061fc8d

Committed by Chris Mason on May 07, 2008

This required a few structural changes to the code that manages bdev pointers:

The VFS super block now gets an anon-bdev instead of a pointer to the
lowest bdev.  This allows us to avoid swapping the super block bdev pointer
around at run time.

The code to read in the super block no longer goes through the extent
buffer interface.  Things got ugly keeping the mapping constant.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

a061fc8d

Btrfs: Clone file data ioctl · f2eb0a24

Committed by Sage Weil on May 02, 2008

Add a new ioctl to clone file data
Signed-off-by: Chris Mason <chris.mason@oracle.com>

f2eb0a24

Btrfs: Add balance ioctl to restripe the chunks · ec44a35c

Committed by Chris Mason on Apr 28, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

ec44a35c

Btrfs: Add new ioctl to add devices · 788f20eb

Committed by Chris Mason on Apr 28, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

788f20eb

Btrfs: Make the resizer work based on shrinking and growing devices · 8f18cf13

Committed by Chris Mason on Apr 25, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

8f18cf13

Btrfs: Add support for labels in the super block · 7ae9c09d

Committed by Chris Mason on Apr 18, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

7ae9c09d

Btrfs: Check device uuids along with devids · a443755f

Committed by Chris Mason on Apr 18, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

a443755f

Btrfs: Write bio checksumming outside the FS mutex · e015640f

Committed by Chris Mason on Apr 16, 2008

This significantly improves streaming write performance by allowing
concurrency in the data checksumming.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

e015640f

Btrfs: Create a work queue for bio writes · 44b8bd7e

Committed by Chris Mason on Apr 16, 2008

This allows checksumming to happen in parallel among many cpus, and
keeps us from bogging down pdflush with the checksumming code.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

44b8bd7e

Btrfs: Add RAID10 support · 321aecc6

Committed by Chris Mason on Apr 16, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

321aecc6

Btrfs: Add chunk uuids and update multi-device back references · e17cade2

Committed by Chris Mason on Apr 15, 2008

Block headers now store the chunk tree uuid

Chunk items record the device uuid for each stripe

Device extent items record better back refs to the chunk tree

Block groups record better back refs to the chunk tree

The chunk tree format has also changed.  The objectid of BTRFS_CHUNK_ITEM_KEY
used to be the logical offset of the chunk.  Now it is a chunk tree id,
with the logical offset being stored in the offset field of the key.

This allows a single chunk tree to record multiple logical address spaces,
upping the number of bytes indexed by a chunk tree from 2^64 to
2^128.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

e17cade2
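
The key change described in e17cade2 can be illustrated with a small sketch. The struct name, the function name, and the numeric type value below are assumptions made for illustration and are not taken from the commit. It only shows why moving the logical offset into the key's offset field lets one tree index several address spaces: with a 64-bit objectid selecting the address space and a 64-bit offset inside it, the tree can index 2^64 * 2^64 = 2^128 bytes.

```c
#include <stdint.h>

/* Placeholder type value for illustration only. */
#define CHUNK_ITEM_KEY_SKETCH 228

/* Hypothetical key layout following the commit text. */
struct chunk_key_sketch {
	uint64_t objectid; /* chunk tree id (previously the logical offset) */
	uint8_t  type;     /* item type, e.g. a chunk item */
	uint64_t offset;   /* logical offset of the chunk */
};

/* Order keys by objectid, then type, then offset. */
static int chunk_key_cmp_sketch(const struct chunk_key_sketch *a,
				const struct chunk_key_sketch *b)
{
	if (a->objectid != b->objectid)
		return a->objectid < b->objectid ? -1 : 1;
	if (a->type != b->type)
		return a->type < b->type ? -1 : 1;
	if (a->offset != b->offset)
		return a->offset < b->offset ? -1 : 1;
	return 0;
}
```

Two chunks at the same logical offset but in different address spaces compare as distinct keys, which the old format (objectid = logical offset) could not express.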

Add a min size parameter to btrfs_alloc_extent · 98d20f67

Committed by Chris Mason on Apr 14, 2008

On huge machines, delayed allocation may try to allocate massive extents.
This change allows btrfs_alloc_extent to return something smaller than
the caller asked for, and the data allocation routines will loop over
the allocations until they fill the whole delayed alloc.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

98d20f67
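
The looping behaviour described in 98d20f67 might look roughly like the following. This is a sketch under stated assumptions, not the btrfs_alloc_extent code: both function names and the max_grant parameter (a stand-in for how much contiguous space the allocator can find) are hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical allocator stand-in: grants at most max_grant bytes per
 * call and returns 0 if it cannot provide at least min_size bytes.
 */
static uint64_t alloc_extent_sketch(uint64_t want, uint64_t min_size,
				    uint64_t max_grant)
{
	uint64_t got = want < max_grant ? want : max_grant;

	return got >= min_size ? got : 0;
}

/*
 * Keep allocating until the whole delayed-allocation range is covered.
 * Returns the number of extents used, or -1 if an allocation fails.
 */
static int fill_delalloc_sketch(uint64_t total, uint64_t min_size,
				uint64_t max_grant)
{
	int extents = 0;

	while (total > 0) {
		/* Never demand more than what is left to allocate. */
		uint64_t min = total < min_size ? total : min_size;
		uint64_t got = alloc_extent_sketch(total, min, max_grant);

		if (got == 0)
			return -1;
		total -= got;
		extents++;
	}
	return extents;
}
```

With a 10 MiB request and an allocator that can hand out at most 4 MiB at a time, the loop ends up using three extents (4 MiB, 4 MiB, 2 MiB) instead of failing outright.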

Btrfs: Do metadata checksums for reads via a workqueue · ce9adaa5

Committed by Chris Mason on Apr 09, 2008

Before, metadata checksumming was done by the callers of read_tree_block,
which would set EXTENT_CSUM bits in the extent tree to show that a given
range of pages was already checksummed and didn't need to be verified
again.

But, those bits could go away via try_to_releasepage, and the end
result was bogus checksum failures on pages that never left the cache.

The new code validates checksums when the page is read.  It is a little
tricky because metadata blocks can span pages and a single read may
end up going via multiple bios.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

ce9adaa5

Btrfs: Fix allocation profile init · d18a2c44

Committed by Chris Mason on Apr 04, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

d18a2c44

Btrfs: Add support for duplicate blocks on a single spindle · 611f0e00

Committed by Chris Mason on Apr 03, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

611f0e00

Btrfs: Add support for mirroring across drives · 8790d502

Committed by Chris Mason on Apr 03, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

8790d502

Reorder the flags field in struct btrfs_header and record a flag on writeout · 63b10fc4

Committed by Chris Mason on Apr 01, 2008

This allows detection of blocks that have already been written in the
running transaction so they can be recowed instead of modified again.
It is step one in trusting the transid field of the block pointers.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

63b10fc4

Create a btrfs backing dev info · 04160088

Committed by Chris Mason on Mar 26, 2008

This allows intelligent versions of unplug and congestion functions
Signed-off-by: Chris Mason <chris.mason@oracle.com>

04160088

Btrfs: Implement raid0 when multiple devices are present · 593060d7

Committed by Chris Mason on Mar 25, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

593060d7

Btrfs: Add support for device scanning and detection ioctls · 8a4b83cc

Committed by Chris Mason on Mar 24, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

8a4b83cc

Btrfs: Bring back mount -o ssd optimizations · 239b14b3

Committed by Chris Mason on Mar 24, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

239b14b3

Btrfs: Move device information into the super block so it can be scanned · 0d81ba5d

Committed by Chris Mason on Mar 24, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

0d81ba5d

Btrfs: Make the FS tree the last objectid in the tree of tree roots · e085def2

Committed by Chris Mason on Mar 24, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

e085def2

Btrfs: Dynamic chunk and block group allocation · 6324fbf3

Committed by Chris Mason on Mar 24, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

6324fbf3

Btrfs: Add support for multiple devices per filesystem · 0b86a832

Committed by Chris Mason on Mar 24, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

0b86a832

Btrfs: checksum file data at bio submission time instead of during writepage · 065631f6

Committed by Chris Mason on Feb 20, 2008

When we checksum file data during writepage, the checksumming is done one
page at a time, making it difficult to do bulk metadata modifications
to insert checksums for large ranges of the file at once.

This patch changes btrfs to checksum on a per-bio basis instead. The
bios are checksummed before they are handed off to the block layer, so
each bio is contiguous and only has pages from the same inode.

Checksumming on a bio basis allows us to insert and modify the file
checksum items in large groups. It also allows the checksumming to
be done more easily by async worker threads.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

065631f6
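
The per-bio approach in 065631f6 can be sketched in user space as follows. Everything here is illustrative: the function names are hypothetical, a plain byte buffer stands in for a bio, and CRC-32C is assumed as the checksum because the commit text does not name the algorithm. The point is only that one contiguous buffer yields a batch of checksums that can be inserted together rather than one page at a time.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78. */
static uint32_t crc32c_sketch(const void *data, size_t len)
{
	const uint8_t *p = data;
	uint32_t crc = 0xFFFFFFFFu;

	while (len--) {
		crc ^= *p++;
		for (int i = 0; i < 8; i++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return ~crc;
}

/*
 * Checksum every full block of one contiguous buffer in a single pass.
 * Results go into sums[], which must have room for len / blocksize
 * entries. A trailing partial block is ignored in this sketch.
 * Returns the number of checksums written.
 */
static size_t csum_bio_sketch(const uint8_t *buf, size_t len,
			      size_t blocksize, uint32_t *sums)
{
	size_t n = 0;

	for (size_t off = 0; off + blocksize <= len; off += blocksize)
		sums[n++] = crc32c_sketch(buf + off, blocksize);
	return n;
}
```

Because the whole batch is available at once, the caller can update the checksum items for a large file range in one metadata operation, and the pure computation in csum_bio_sketch is easy to hand to a worker thread.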

Btrfs: unaligned access fixes · df68b8a7

Committed by David Miller on Feb 15, 2008

Btrfs set/get macros lose type information needed to avoid
unaligned accesses on sparc64.
Here is a patch for the kernel bits which fixes most of the
unaligned accesses on sparc64.

btrfs_name_hash is modified to return the hash value instead
of getting a return location via a (potentially unaligned)
pointer.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

df68b8a7
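
The two ideas in df68b8a7 can be shown with a short user-space sketch. Neither function is the btrfs code: the names are hypothetical, and the hash body is FNV-1a, used purely as a placeholder so the example is self-contained. The first function reads a 64-bit little-endian value one byte at a time, which is safe at any alignment; the second returns its hash by value so the caller never passes a potentially unaligned destination pointer.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian u64 from a possibly unaligned address. */
static uint64_t get_le64_sketch(const void *p)
{
	const uint8_t *b = p;
	uint64_t v = 0;

	/* Assemble from the most significant byte down; no wide loads. */
	for (int i = 7; i >= 0; i--)
		v = (v << 8) | b[i];
	return v;
}

/*
 * Return the hash by value instead of storing it through a pointer.
 * FNV-1a is a placeholder here, not the hash btrfs uses.
 */
static uint64_t name_hash_sketch(const char *name, size_t len)
{
	uint64_t h = 14695981039346656037ull; /* FNV-1a offset basis */

	for (size_t i = 0; i < len; i++) {
		h ^= (uint8_t)name[i];
		h *= 1099511628211ull; /* FNV-1a prime */
	}
	return h;
}
```

A caller that needs the hash inside a packed on-disk structure can then copy the returned value into place with a byte-wise store, keeping every access alignment-safe.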

Btrfs: Fix i_blocks accounting · 9069218d

Committed by Chris Mason on Feb 08, 2008

Now that delayed allocation accounting works, i_blocks accounting is changed
to only modify i_blocks when extents are inserted or removed.

The fillattr call is changed to include the delayed allocation byte count
in the i_blocks result.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

9069218d

Btrfs: Update magic · 47b0c4f8

Committed by Chris Mason on Feb 04, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

47b0c4f8

Btrfs: Add data block hints to SSD mode too · 4529ba49

Committed by Chris Mason on Jan 31, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

4529ba49

Btrfs: mount -o max_inline=size to control the maximum inline extent size · 6f568d35

Committed by Chris Mason on Jan 29, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

6f568d35

openanolis / cloud-kernel · synced successfully more than 1 year ago
