Commits · 7d9eb12c8739e7dc80c78c6b3596f912ecd8f941 · openanolis / cloud-kernel

25 Sep, 2008 40 commits

Btrfs: Add locking around volume management (device add/remove/balance) · 7d9eb12c

Committed by Chris Mason on Jul 08, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Online btree defragmentation fixes · 3f157a2f

Committed by Chris Mason on Jun 25, 2008

The btree defragger wasn't making forward progress because the new key wasn't
being saved by the btrfs_search_forward function.

This also disables the automatic btree defrag because it wasn't scaling well to
huge filesystems. The auto-defrag needs to be done differently.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Change find_extent_buffer to use TestSetPageLocked · 079899c2

Committed by Chris Mason on Jun 25, 2008

This makes it possible for callers to check for extent_buffers in cache
without deadlocking against any btree locks held.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Add btree locking to the tree defragmentation code · e7a84565

Committed by Chris Mason on Jun 25, 2008

The online btree defragger is simplified and rewritten to use
standard btree searches instead of a walk up / down mechanism.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Replace the transaction work queue with kthreads · a74a4b97

Committed by Chris Mason on Jun 25, 2008

This creates one kthread for commits and one kthread for
deleting old snapshots.  All the work queues are removed.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Fix snapshot deletion to release the alloc_mutex much more often. · 333db94c

Committed by Chris Mason on Jun 25, 2008

This lowers the impact of snapshot deletion on the rest of the FS.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Add a skip_locking parameter to struct path, and make various funcs honor it · 5cd57b2c

Committed by Chris Mason on Jun 25, 2008

Allocations may need to read in block groups from the extent allocation tree,
which will require a tree search and take locks on the extent allocation
tree.  But those locks might already be held in other places, leading
to deadlocks.

Since the alloc_mutex serializes everything right now, it is safe to
skip the btree locking while caching block groups.  A better fix will be
to either create a recursive lock or find a way to back off existing
locks while caching block groups.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
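
The effect of a skip_locking flag on a path can be illustrated with a small Python sketch. This is not the kernel code: `Path`, `lock_block`, and `release_path` are hypothetical names, and `threading.Lock` stands in for a per-block btree lock. As the commit notes, skipping the lock is only safe while some outer mutex already serializes the callers.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class Path:
    """Hypothetical stand-in for a search path; not the kernel's struct."""
    skip_locking: bool = False
    held: list = field(default_factory=list)  # locks this path has taken


def lock_block(path, block_lock):
    """Take block_lock unless the path asks to skip locking.

    Returns True if the lock was taken, False if it was skipped.
    """
    if path.skip_locking:
        # Assumption: the caller holds an outer mutex that makes this safe.
        return False
    block_lock.acquire()
    path.held.append(block_lock)
    return True


def release_path(path):
    """Drop every lock the path took."""
    while path.held:
        path.held.pop().release()
```

With a non-recursive lock that is already held elsewhere, a path created with `skip_locking=True` walks past it instead of blocking, which is the deadlock the commit avoids.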

Drop locks in btrfs_search_slot when reading a tree block. · 051e1b9f

Committed by Chris Mason on Jun 25, 2008

One lock per btree block can make for significant congestion if everyone
has to wait for IO at the high levels of the btree. This drops
locks held by a path when doing reads during a tree search.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Replace the big fs_mutex with a collection of other locks · a2135011

Committed by Chris Mason on Jun 25, 2008

Extent allocations are still protected by a large alloc_mutex.
Objectid allocations are covered by an objectid mutex.
Other btree operations are protected by a lock on individual btree nodes.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Start btree concurrency work. · 925baedd

Committed by Chris Mason on Jun 25, 2008

The allocation trees and the chunk trees are serialized via their own
dedicated mutexes.  This means allocation locking is still not very
fine grained.

The main FS btree is protected by locks on each block in the btree.  Locks
are taken top / down, and as processing finishes on a given level of the
tree, the lock is released after locking the lower level.

The end result of a search is now a path where only the lowest level
is locked.  Releasing or freeing the path drops any locks held.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
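
The top-down lock ordering described in this commit can be sketched in Python. This is an illustration under stated assumptions, not the kernel implementation: `Node`, `Path`, `search`, and `release_path` are hypothetical names, and `threading.Lock` stands in for the per-block lock.

```python
import threading


class Node:
    """Hypothetical btree block: internal nodes hold separator keys."""
    def __init__(self, keys, children=None):
        self.keys = keys                  # sorted keys
        self.children = children or []    # empty list for a leaf
        self.lock = threading.Lock()


class Path:
    """Nodes visited by a search, plus which of their locks we still hold."""
    def __init__(self):
        self.nodes = []
        self.held = []


def search(root, key):
    """Walk down from the root, locking each child before unlocking its parent."""
    path = Path()
    root.lock.acquire()
    path.nodes.append(root)
    path.held.append(True)
    node = root
    while node.children:
        # Pick the child whose key range covers `key`.
        idx = 0
        while idx < len(node.keys) and key >= node.keys[idx]:
            idx += 1
        child = node.children[idx]
        child.lock.acquire()      # lock the lower level first
        path.nodes.append(child)
        path.held.append(True)
        node.lock.release()       # then release the level above
        path.held[-2] = False
        node = child
    return path


def release_path(path):
    """Drop any lock the path still holds."""
    for i, node in enumerate(path.nodes):
        if path.held[i]:
            node.lock.release()
            path.held[i] = False
```

After `search` returns, only the last node in `path.nodes` is still locked, and `release_path` drops that remaining lock.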

Btrfs: Allocator fix variety pack · 0ef3e66b

Committed by Chris Mason on May 24, 2008

* Force chunk allocation when find_free_extent has to do a full scan
* Record the max key at the start of defrag so it doesn't run forever
* Block groups might not be contiguous, make a forward search for the
  next block group in extent-tree.c
* Get rid of extra checks for total fs size
* Fix relocate_one_reference to avoid relocating the same file data block
  twice when referenced by an older transaction
* Use the open device count when allocating chunks so that we don't
  try to allocate from devices that don't exist

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Handle write errors on raid1 and raid10 · 1259ab75

Committed by Chris Mason on May 12, 2008

When duplicate copies exist, writes are allowed to fail to one of those
copies.  This changeset includes a few changes that allow the FS to
continue even when some IOs fail.

It also adds verification of the parent generation number for btree blocks.
This generation is stored in the pointer to a block, and it ensures
that missed writes are detected.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
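
The parent generation check can be pictured with a short Python sketch. The names `BlockHeader`, `BlockPointer`, `verify_parent_generation`, and `first_valid_mirror` are hypothetical, and falling back to another mirror on a mismatch is an assumption made for illustration rather than something this commit message states.

```python
from dataclasses import dataclass


@dataclass
class BlockHeader:
    bytenr: int
    generation: int   # generation stamped in the block when it was written


@dataclass
class BlockPointer:
    bytenr: int
    generation: int   # generation the parent expects the child to carry


def verify_parent_generation(ptr, header):
    """True if the block found on disk is the version the parent points to."""
    return header.bytenr == ptr.bytenr and header.generation == ptr.generation


def first_valid_mirror(ptr, copies):
    """Return the index of the first copy that verifies, or None.

    `copies` holds one BlockHeader per mirrored copy (assumed layout).
    """
    for mirror, header in enumerate(copies):
        if verify_parent_generation(ptr, header):
            return mirror
    return None
```

A copy that missed its write still carries the older generation, so it fails the comparison even if its checksum is intact.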

Btrfs: Pass down the expected generation number when reading tree blocks · ca7a79ad

Committed by Chris Mason on May 12, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Chunk relocation fine tuning, and add a few printks to show progress · 323da79c

Committed by Chris Mason on May 09, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: A number of nodatacow fixes · bbaf549e

Committed by Chris Mason on May 08, 2008

Once part of a delalloc request fails the cow checks, just cow the
entire range.

It is possible for the back references to all be from the same root,
but still have snapshots against an extent.  The checks are now more strict,
forcing cow any time there are multiple refs against the data extent.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Update nodatacow mode to support cloned single files and resizing · a68d5933

Committed by Chris Mason on May 08, 2008

Before, nodatacow only checked to make sure multiple roots didn't have
references on a single extent.  This check makes sure that multiple
inodes don't have references.

nodatacow needed an extra check to see if the block group was currently
readonly.  This way cows forced by the chunk relocation code are honored.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
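
The nodatacow conditions listed in this commit can be summarized as a single predicate. The Python below is a sketch only: `must_cow` and its parameters are hypothetical, and the real checks walk back references rather than receiving ready-made lists.

```python
def must_cow(ref_roots, ref_inodes, block_group_readonly):
    """Decide whether a write to an extent has to be copy-on-write.

    ref_roots / ref_inodes: ids of the roots and inodes that reference
    the extent (assumed to be already collected by the caller).
    """
    if block_group_readonly:
        # Chunk relocation marks the block group read-only to force a cow.
        return True
    if len(set(ref_roots)) > 1:
        return True   # shared between roots (the older check)
    if len(set(ref_inodes)) > 1:
        return True   # shared between inodes, e.g. a cloned file
    return False
```

Only an extent with a single referencing root and inode, in a writable block group, is overwritten in place.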

Btrfs: Properly find the root for snapshotted blocks during chunk relocation · bf4ef679

Committed by Chris Mason on May 08, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Add support for online device removal · a061fc8d

Committed by Chris Mason on May 07, 2008

This required a few structural changes to the code that manages bdev pointers:

The VFS super block now gets an anon-bdev instead of a pointer to the
lowest bdev.  This allows us to avoid swapping the super block bdev pointer
around at run time.

The code to read in the super block no longer goes through the extent
buffer interface.  Things got ugly keeping the mapping constant.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Deal with failed writes in mirrored configurations · a236aed1

Committed by Chris Mason on Apr 29, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Add balance ioctl to restripe the chunks · ec44a35c

Committed by Chris Mason on Apr 28, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Do more optimal file RA during shrinking and defrag · 8e7bf94f

Committed by Chris Mason on Apr 28, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Avoid recursive chunk allocations · 3bf3d9e9

Committed by Chris Mason on Apr 26, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Make the resizer work based on shrinking and growing devices · 8f18cf13

Committed by Chris Mason on Apr 25, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Fix balance_level to free the middle block if there is room in the left one · bce4eae9

Committed by Chris Mason on Apr 24, 2008

balance_level starts by trying to empty the middle block, and then
pushes from the right to the middle.  This might empty the right block
and leave a small number of pointers in the middle.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Simplify device selection for mirrored reads · 3c12ac72

Committed by Chris Mason on Apr 21, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Use the extent map cache to find the logical disk block during data retries · 3b951516

Committed by Chris Mason on Apr 17, 2008

The data read retry code needs to find the logical disk block before it
can resubmit new bios. But finding this block isn't allowed to take
the fs_mutex because that will deadlock with a number of different callers.

This changes the retry code to use the extent map cache instead, but
that requires the extent map cache to have the extent we're looking for.
This is a problem because btrfs_drop_extent_cache just drops the entire
extent instead of the little tiny part it is invalidating.

The bulk of the code in this patch changes btrfs_drop_extent_cache to
invalidate only a portion of the extent cache, and changes btrfs_get_extent
to deal with the results.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
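
Invalidating only part of a cached extent amounts to splitting it around the dropped range. The Python sketch below shows the arithmetic; `drop_range` is a hypothetical helper working on plain `(start, length)` tuples, not the kernel's extent map structures.

```python
def drop_range(extent, drop_start, drop_len):
    """Remove [drop_start, drop_start + drop_len) from a cached extent.

    extent is a (start, length) tuple. Returns the pieces that stay
    cached: zero, one, or two (start, length) tuples.
    """
    start, length = extent
    end = start + length
    drop_end = drop_start + drop_len
    if drop_end <= start or drop_start >= end:
        return [extent]                      # no overlap, keep it whole
    remaining = []
    if drop_start > start:
        remaining.append((start, drop_start - start))   # front remnant
    if drop_end < end:
        remaining.append((drop_end, end - drop_end))    # back remnant
    return remaining
```

Dropping a small range from the middle of a large extent leaves both remnants in the cache, so a later retry lookup can still find the block it needs.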

Btrfs: Don't wait on tree block writeback before freeing them anymore · 699122f5

Committed by Chris Mason on Apr 16, 2008

This isn't required anymore because we don't reallocate blocks that
have already been written in this transaction.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Add RAID10 support · 321aecc6

Committed by Chris Mason on Apr 16, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Add chunk uuids and update multi-device back references · e17cade2

Committed by Chris Mason on Apr 15, 2008

Block headers now store the chunk tree uuid

Chunk items record the device uuid for each stripe

Device extent items record better back refs to the chunk tree

Block groups record better back refs to the chunk tree

The chunk tree format has also changed.  The objectid of BTRFS_CHUNK_ITEM_KEY
used to be the logical offset of the chunk.  Now it is a chunk tree id,
with the logical offset being stored in the offset field of the key.

This allows a single chunk tree to record multiple logical address spaces,
upping the number of bytes indexed by a chunk tree from 2^64 to
2^128.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
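
The jump from 2^64 to 2^128 follows from the key now carrying two independent 64-bit values. A small Python sketch of the new key layout, where `Key`, `chunk_key`, and the `CHUNK_ITEM_KEY` placeholder are illustrative names rather than the on-disk definitions:

```python
from collections import namedtuple

# A btrfs key is an (objectid, type, offset) triple; objectid and offset
# are 64-bit fields.
Key = namedtuple("Key", ["objectid", "type", "offset"])

CHUNK_ITEM_KEY = "CHUNK_ITEM"   # placeholder, not the real numeric type
U64 = 2 ** 64


def chunk_key(chunk_tree_id, logical_offset):
    """Build a chunk item key in the new format described above."""
    if not (0 <= chunk_tree_id < U64 and 0 <= logical_offset < U64):
        raise ValueError("both fields must fit in 64 bits")
    return Key(chunk_tree_id, CHUNK_ITEM_KEY, logical_offset)


# Old format: the objectid alone named the logical offset.
bytes_indexed_before = U64
# New format: 2^64 chunk tree ids, each with its own 2^64-byte space.
bytes_indexed_after = U64 * U64
```

Because keys compare field by field, all chunks of one logical address space sort together, ordered by their logical offset.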

Add a min size parameter to btrfs_alloc_extent · 98d20f67

Committed by Chris Mason on Apr 14, 2008

On huge machines, delayed allocation may try to allocate massive extents.
This change allows btrfs_alloc_extent to return something smaller than
the caller asked for, and the data allocation routines will loop over
the allocations until they fill the whole delayed alloc.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
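
The caller-side loop this commit describes can be sketched in Python. `alloc_extent` and `fill_delalloc` are hypothetical helpers operating on a plain list of free `(start, length)` ranges; they only illustrate the contract that an allocation may come back shorter than requested but never shorter than the minimum size.

```python
import errno


def alloc_extent(free_extents, num_bytes, min_size):
    """Return (start, length) with min_size <= length <= num_bytes, or None.

    Takes the first free range that can satisfy min_size (a simplistic
    policy chosen for the sketch).
    """
    for i, (start, length) in enumerate(free_extents):
        if length >= min_size:
            got = min(length, num_bytes)
            if got == length:
                free_extents.pop(i)
            else:
                free_extents[i] = (start + got, length - got)
            return (start, got)
    return None


def fill_delalloc(free_extents, total_bytes, min_size):
    """Keep allocating until the whole delayed-allocation range is covered."""
    allocated = []
    remaining = total_bytes
    while remaining > 0:
        ext = alloc_extent(free_extents, remaining, min(min_size, remaining))
        if ext is None:
            raise OSError(errno.ENOSPC, "no free extent large enough")
        allocated.append(ext)
        remaining -= ext[1]
    return allocated
```

A 20 KiB request against fragmented free space comes back as several smaller extents instead of failing outright.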

Btrfs: Endianess bug fix for v0.13 with kernels · a5eb62e3

Committed by Miguel on Apr 11, 2008

Fix for an endianness bug when using btrfs v0.13 with kernels older than 2.6.23.

Problem:

As of v0.13, btrfs-progs uses a crc32c.c equivalent to the one found in
linux-2.6.23/lib/libcrc32c.c. Since crc32c_le() changed in linux-2.6.23, when
running btrfs v0.13 with older kernels we have a mismatch between the versions
of crc32c_le() from btrfs-progs and libcrc32c in the kernel.  This mismatch
causes a bug when using btrfs on big endian machines.

Solution:
A btrfs_crc32c() macro that, when compiling for kernels older than 2.6.23,
applies endianness conversion to the parameters and return value of crc32c().
This endianness conversion nullifies the differences in implementation
of crc32c_le().
On kernel 2.6.23 or newer, it calls crc32c() directly.
Signed-off-by: Miguel Sousa Filipe <miguel.filipe@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
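
The shape of such a wrapper can be modeled in Python. This is a loose sketch, not the macro from the patch: `crc32c` here is a plain bitwise CRC32C update with no initial or final inversion, `swab32` is a 32-bit byte swap, and `btrfs_crc32c_sketch` with its `old_kernel` and `big_endian_host` flags is a hypothetical stand-in for the compile-time version and byte-order checks.

```python
import struct

CRC32C_POLY_REFLECTED = 0x82F63B78   # Castagnoli polynomial, bit-reflected


def crc32c(seed, data):
    """Bitwise CRC32C update over `data`, starting from `seed`."""
    crc = seed & 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (CRC32C_POLY_REFLECTED if crc & 1 else 0)
    return crc


def swab32(value):
    """Swap the byte order of a 32-bit value."""
    return struct.unpack("<I", struct.pack(">I", value & 0xFFFFFFFF))[0]


def btrfs_crc32c_sketch(seed, data, old_kernel, big_endian_host):
    """Convert the seed and the result only on the old-kernel path."""
    if not old_kernel:
        return crc32c(seed, data)
    # On a little endian host the conversion is the identity.
    convert = swab32 if big_endian_host else (lambda v: v)
    return convert(crc32c(convert(seed), data))
```

On a little endian host both paths give the same value, which matches the commit's statement that the bug shows up only on big endian machines.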

Btrfs: Do metadata checksums for reads via a workqueue · ce9adaa5

Committed by Chris Mason on Apr 09, 2008

Before, metadata checksumming was done by the callers of read_tree_block,
which would set EXTENT_CSUM bits in the extent tree to show that a given
range of pages was already checksummed and didn't need to be verified
again.

But those bits could go away via try_to_releasepage, and the end
result was bogus checksum failures on pages that never left the cache.

The new code validates checksums when the page is read.  It is a little
tricky because metadata blocks can span pages and a single read may
end up going via multiple bios.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Fix allocation profile init · d18a2c44

Committed by Chris Mason on Apr 04, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Don't allow written blocks from this transaction to be reallocated · 6bc34676

Committed by Chris Mason on Apr 04, 2008

When a block is freed, it can be immediately reused if it is from
the current transaction. But an extra check is required to make sure
the block has not been written yet. If it were reused after being written,
the transid in the block header could match the transid recorded the
next time the block was allocated.

The parent node records the transaction ID of the block it is pointing to,
and this is used as part of validating the block on reads. So, there
can only be one version of a block per transaction.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
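
The reuse rule reads naturally as a two-part condition. A minimal Python sketch, with `Block` and `can_reuse_immediately` as hypothetical names:

```python
from dataclasses import dataclass


@dataclass
class Block:
    transid: int    # transaction that allocated this version of the block
    written: bool   # True once this version has been written to disk


def can_reuse_immediately(block, current_transid):
    """A freed block may be handed out again right away only if it was
    allocated in the running transaction and has never reached disk."""
    return block.transid == current_transid and not block.written
```

Rejecting written blocks keeps a single on-disk version per transaction id, which is what lets the parent's recorded transaction id validate a block on read.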

Btrfs: Add support for duplicate blocks on a single spindle · 611f0e00

Committed by Chris Mason on Apr 03, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Add support for mirroring across drives · 8790d502

Committed by Chris Mason on Apr 03, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Verify checksums on tree blocks found without read_tree_block · 0999df54

Committed by Chris Mason on Apr 01, 2008

Checksums were only verified by btrfs_read_tree_block, which meant the
functions to probe the page cache for blocks were not validating checksums.
Normally this is fine because the buffers will only be in cache if they
have already been validated.

But there is a window while the buffer is being read from disk where
it could be up to date in the cache but not yet verified.  This patch
makes sure all buffers go through checksum verification before they
are used.

This is safer, and it prevents modification of buffers before they go
through the csum code.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Keep fs_mutex during reads done by snapshot deletion · ecbe2402

Committed by Chris Mason on Apr 01, 2008

There was an optimization to drop the fs_mutex when doing snapshot deletion
reads, but this can lead to false positives on checksumming errors.  Keep
the lock for now.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Implement raid0 when multiple devices are present · 593060d7

Committed by Chris Mason on Mar 25, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Bring back mount -o ssd optimizations · 239b14b3

Committed by Chris Mason on Mar 24, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

openanolis / cloud-kernel: synced successfully about 1 year ago