Commits · cd02dca56442e1504fd6bc5b96f7f1870162b266 · openeuler / Kernel

14 Dec, 2010 1 commit

Btrfs: account for missing devices in RAID allocation profiles · cd02dca5

Committed by Chris Mason on Dec 13, 2010

When we mount in RAID degraded mode without adding a new device to
replace the failed one, we can end up using the wrong RAID flags for
allocations.

This results in strange combinations of block groups (raid1 in a raid10
filesystem) and corruptions when we try to allocate blocks from single
spindle chunks on drives that are actually missing.

The first device has two small 4MB chunks in it that mkfs creates and
these are usually unused in a raid1 or raid10 setup.  But, in -o degraded,
the allocator will fall back to these because the mask of desired raid groups
isn't correct.

The fix here is to count the missing devices as we build up the list
of devices in the system.  This count is used when picking the
raid level to make sure we continue using the same levels that were
in place before we lost a drive.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

cd02dca5
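
The counting step described in cd02dca5 can be sketched roughly as follows. This is a minimal illustration, not the kernel code: the flag values, the `device_counts` struct, the function name, and the device-count thresholds (1 and 4) are all assumptions made for the example.

```c
#include <stdint.h>

/* Hypothetical profile bits for this sketch; not the kernel's real values. */
#define PROFILE_RAID0  (1ULL << 0)
#define PROFILE_RAID1  (1ULL << 1)
#define PROFILE_RAID10 (1ULL << 2)

/* Hypothetical counters maintained while the device list is built. */
struct device_counts {
    uint64_t rw_devices;      /* devices that are present and writable */
    uint64_t missing_devices; /* devices the filesystem knows about but cannot find */
};

/*
 * Drop RAID profiles that the filesystem cannot support with its device
 * count. Missing devices are included in the count, so a degraded mount
 * keeps the RAID levels that were in place before the drive was lost.
 * The thresholds below are assumptions of this sketch.
 */
uint64_t reduce_alloc_profile(const struct device_counts *c, uint64_t flags)
{
    uint64_t num_devices = c->rw_devices + c->missing_devices;

    if (num_devices == 1)
        flags &= ~(PROFILE_RAID0 | PROFILE_RAID1);
    if (num_devices < 4)
        flags &= ~PROFILE_RAID10;
    return flags;
}
```

With three writable devices and one missing device, the raid10 bit survives; if the missing device were not counted, the same call would clear it and the allocator would fall back to another profile.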

22 Sep, 2009 1 commit

Btrfs: make balance code choose more wisely when relocating · ba1bf481

Committed by Josef Bacik on Sep 11, 2009

Currently, we can panic the box if the first block group we go to move is of a
type where there is no space left to move those extents.  For example, if we
fill the disk up with data, and then we try to balance and we have no room to
move the data nor room to allocate new chunks, we will panic.  Change this by
checking to see if we have room to move this chunk around, and if not, return
-ENOSPC and move on to the next chunk.  This will make sure we remove block
groups that are moveable, like if we have a lot of empty metadata block groups,
and then that way we make room to be able to balance our data chunks as well.
Tested this with an fs that would panic on btrfs-vol -b normally, but no longer
panics with this patch.

V1->V2:
-actually search for a free extent on the device to make sure we can allocate a
chunk if need be.

-fix btrfs_shrink_device to make sure we actually try to relocate all the
chunks, and then if we can't return -ENOSPC so if we are doing a btrfs-vol -r
we don't remove the device with data still on it.

-check to make sure the block group we are going to relocate isn't the last one
in that particular space

-fix a bug in btrfs_shrink_device where we would change the device's size and
not fix it if we fail to do our relocate
Signed-off-by: Josef Bacik <jbacik@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

ba1bf481
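
The "check for room, return -ENOSPC, move on" behaviour from ba1bf481 might look like the toy model below. The `chunk` struct, the function names, and the space accounting are hypothetical simplifications; the real code searches for free extents on the devices rather than comparing two byte counts.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified chunk; not the kernel's data structure. */
struct chunk {
    uint64_t length;    /* bytes the chunk occupies on disk */
    uint64_t used;      /* bytes of live extents inside it */
    int      relocated; /* set once the chunk has been moved */
};

/* Return 0 if the live data fits in the remaining free space, else -ENOSPC. */
static int can_relocate(const struct chunk *c, uint64_t free_bytes)
{
    return c->used <= free_bytes ? 0 : -ENOSPC;
}

/*
 * Walk every chunk once. A chunk that cannot be moved is skipped instead
 * of aborting the whole pass. Returns the number of chunks skipped.
 */
size_t balance_pass(struct chunk *chunks, size_t n, uint64_t *free_bytes)
{
    size_t skipped = 0;

    for (size_t i = 0; i < n; i++) {
        if (can_relocate(&chunks[i], *free_bytes) == -ENOSPC) {
            skipped++;
            continue;
        }
        /* Toy accounting: live data moves into free space, old chunk is freed. */
        *free_bytes = *free_bytes - chunks[i].used + chunks[i].length;
        chunks[i].relocated = 1;
    }
    return skipped;
}
```

In this model a full data chunk at the head of the list is skipped, an empty chunk behind it is still relocated, and the space it frees is available to a later pass.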

11 Jun, 2009 1 commit

Btrfs: avoid races between super writeout and device list updates · e5e9a520

Committed by Chris Mason on Jun 10, 2009

On multi-device filesystems, btrfs writes supers to all of the devices
before considering a sync complete. There wasn't any additional
locking between super writeout and the device list management code
because device management was done inside a transaction and
super writeout only happened with no transaction writers running.

With the btrfs fsync log and other async transaction updates, this
has been racy for some time. This adds a mutex to protect
the device list. The existing volume mutex could not be reused due to
transaction lock ordering requirements.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

e5e9a520

10 Jun, 2009 1 commit

Btrfs: autodetect SSD devices · c289811c

Committed by Chris Mason on Jun 10, 2009

During mount, btrfs will check the queue nonrot flag
for all the devices found in the FS.  If they are all
non-rotating, SSD mode is enabled by default.

If the FS was mounted with -o nossd, the non-rotating
flag is ignored.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

c289811c
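
The mount-time decision described in c289811c reduces to an "all devices non-rotating, unless nossd" check. The sketch below assumes a hypothetical `device_info` struct and function name, and it treats an empty device list as "no SSD mode", which the commit message does not specify.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device record; nonrot mirrors the queue's non-rotating flag. */
struct device_info {
    bool nonrot;
};

/*
 * Decide whether SSD mode should be enabled by default.
 * nossd models the "-o nossd" mount option, which makes the
 * non-rotating flag irrelevant.
 */
bool autodetect_ssd(const struct device_info *devs, size_t n, bool nossd)
{
    if (nossd || n == 0) /* the n == 0 case is an assumption of this sketch */
        return false;

    for (size_t i = 0; i < n; i++) {
        if (!devs[i].nonrot)
            return false; /* one rotating device is enough to disable it */
    }
    return true;
}
```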

27 Apr, 2009 1 commit

Btrfs: When shrinking, only update disk size on success · d6397bae

Committed by Chris Ball on Apr 27, 2009

Previously, we updated a device's size prior to attempting a shrink
operation. This patch moves the device resizing logic to only happen if
the shrink completes successfully. In the process, it introduces a new
field to btrfs_device -- disk_total_bytes -- to track the on-disk size.
Signed-off-by: Chris Ball <cjb@laptop.org>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

d6397bae
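
The ordering change in d6397bae can be pictured with the small sketch below. Only the field name `disk_total_bytes` comes from the commit message; the `device` struct, the `unmovable_end` field, and the failure condition are stand-ins invented for the example, since the real relocation logic is far more involved.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical, simplified device record. */
struct device {
    uint64_t disk_total_bytes; /* size recorded on disk */
    uint64_t unmovable_end;    /* stand-in: end offset of data that cannot be relocated */
};

/*
 * Try to shrink the device to new_size. The on-disk size is only
 * updated after the (simulated) relocation has succeeded, so a failed
 * shrink leaves disk_total_bytes untouched.
 */
int shrink_device(struct device *dev, uint64_t new_size)
{
    if (new_size >= dev->disk_total_bytes)
        return -EINVAL;
    if (new_size < dev->unmovable_end)
        return -ENOSPC; /* stand-in for a failed relocation */

    dev->disk_total_bytes = new_size;
    return 0;
}
```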

21 Apr, 2009 1 commit

Btrfs: use WRITE_SYNC for synchronous writes · ffbd517d

Committed by Chris Mason on Apr 20, 2009

Part of reducing fsync/O_SYNC/O_DIRECT latencies is using WRITE_SYNC for
writes we plan on waiting on in the near future.  This patch
mirrors recent changes in other filesystems and the generic code to
use WRITE_SYNC when WB_SYNC_ALL is passed and to use WRITE_SYNC for
other latency critical writes.

Btrfs uses async worker threads for checksumming before the write is done,
and then again to actually submit the bios.  The bio submission code just
runs a per-device list of bios that need to be sent down the pipe.

This list is split into low priority and high priority lists so the
WRITE_SYNC IO happens first.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

ffbd517d
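
The split into high and low priority lists from ffbd517d can be sketched as two FIFO lists per device, with the sync list drained first. All type and function names here are hypothetical, and locking is left out entirely.

```c
#include <stddef.h>

/* Hypothetical, minimal bio: just an id, a sync flag and a list link. */
struct bio {
    int         id;
    int         sync; /* non-zero for WRITE_SYNC style IO */
    struct bio *next;
};

struct bio_list {
    struct bio *head;
    struct bio *tail;
};

/* Per-device pending IO, split by priority. */
struct dev_queue {
    struct bio_list sync_list;   /* high priority */
    struct bio_list normal_list; /* low priority */
};

static void list_add_tail(struct bio_list *l, struct bio *b)
{
    b->next = NULL;
    if (l->tail)
        l->tail->next = b;
    else
        l->head = b;
    l->tail = b;
}

static struct bio *list_pop(struct bio_list *l)
{
    struct bio *b = l->head;

    if (!b)
        return NULL;
    l->head = b->next;
    if (!l->head)
        l->tail = NULL;
    b->next = NULL;
    return b;
}

/* Queue a bio on the list that matches its priority. */
void queue_bio(struct dev_queue *q, struct bio *b)
{
    list_add_tail(b->sync ? &q->sync_list : &q->normal_list, b);
}

/* Return the next bio to submit: sync IO first, then everything else. */
struct bio *next_bio(struct dev_queue *q)
{
    struct bio *b = list_pop(&q->sync_list);

    return b ? b : list_pop(&q->normal_list);
}
```

Queueing a normal bio, then a sync bio, then another normal bio yields the sync bio first, followed by the two normal bios in their original order.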

03 Apr, 2009 1 commit

Btrfs: fix typos in comments · d4a78947

Committed by Wu Fengguang on Apr 02, 2009

Signed-off-by: Chris Mason <chris.mason@oracle.com>

d4a78947

12 Dec, 2008 1 commit

Btrfs: shared seed device · e4404d6e

Committed by Yan Zheng on Dec 12, 2008

This patch makes it possible for a seed device to be shared by
multiple mounted file systems. The sharing is achieved
by cloning the seed device's btrfs_fs_devices structure.
Thank you,
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>

e4404d6e

09 Dec, 2008 1 commit

Btrfs: superblock duplication · a512bbf8

Committed by Yan Zheng on Dec 08, 2008

This patch implements superblock duplication. Superblocks
are stored at offsets 16K, 64M and 256G on every device.
Space used by superblocks is preserved by the allocator,
which uses a reverse mapping function to find the logical
addresses that correspond to superblocks. Thank you,
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>

a512bbf8
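
The three offsets listed in a512bbf8 are spaced by a factor of 4096, so they can be generated with a single shift. The sketch below simply reproduces the numbers from the commit message; the macro and function names are hypothetical, and the kernel's own helper may differ in detail.

```c
#include <stdint.h>

#define SB_COPIES       3  /* number of superblock copies in this sketch */
#define SB_MIRROR_SHIFT 12 /* each copy sits 4096 times further out */

/*
 * Byte offset of superblock copy `mirror` (0, 1 or 2) on a device,
 * following the offsets named in the commit message:
 * 16K, 64M and 256G.
 */
uint64_t sb_offset(int mirror)
{
    return (16ULL * 1024) << (SB_MIRROR_SHIFT * mirror);
}
```

Here 16K is 2^14 bytes, so the copies land at 2^14, 2^26 and 2^38 bytes. A device smaller than one of these offsets presumably cannot hold that copy.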

02 Dec, 2008 1 commit

Btrfs: corret fmode_t annotations · 97288f2c

Committed by Christoph Hellwig on Dec 02, 2008

Make sure to propagate fmode_t properly and use the right constants for
it.
Signed-off-by: Christoph Hellwig <hch@lst.de>

97288f2c

20 Nov, 2008 1 commit

Btrfs: Fixes for 2.6.28-rc API changes · 15916de8

Committed by Chris Mason on Nov 19, 2008

* open/close_bdev_excl -> open/close_bdev_exclusive
* blkdev_issue_discard takes a GFP mask now
* Fix blkdev_issue_discard usage now that it is enabled
Signed-off-by: Chris Mason <chris.mason@oracle.com>

15916de8

18 Nov, 2008 1 commit

Btrfs: Seed device support · 2b82032c

Committed by Yan Zheng on Nov 17, 2008

A seed device is a special btrfs filesystem with the SEEDING super flag
set; it can only be mounted in read-only mode. Seed
devices allow people to create a new btrfs on top of them.

The new FS contains the same contents as the seed device,
but it can be mounted in read-write mode.

This patch does the following:

1) split code in btrfs_alloc_chunk into two parts. The first part makes
the newly allocated chunk usable, but does not do any operation that modifies
the chunk tree. The second part does the chunk tree modifications. This
division is for the bootstrap step of adding storage to the seed device.

2) Update device management code to handle seed device.
The basic idea is: For an FS grown from seed devices, its
seed devices are put into a list. Seed devices are
opened on demand at mounting time. If any seed device is
missing or has been changed, btrfs kernel module will
refuse to mount the FS.

3) make btrfs_find_block_group not return NULL when all
block groups are read-only.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>

2b82032c

25 Sep, 2008 21 commits

Btrfs: Fix the multi-bio code to save the original bio for completion · 7d2b4daa

Committed by Chris Mason on Aug 05, 2008

The multi-bio code is responsible for duplicating blocks in raid1 and
single spindle duplication.  It has counters to make sure all of
the locations for a given extent are properly written before io completion
is returned to the higher layers.

But it didn't always complete the same bio it was given; sometimes a
clone was completed instead.  This led to problems with the async
work queues because they saved a pointer to the bio in a struct off
bi_private.

The fix is to remember the original bio and only complete that one.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

7d2b4daa
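
The idea behind 7d2b4daa, a counter of outstanding copies plus a saved pointer to the original bio, might be sketched like this. The structs and function names are hypothetical, the counter is not atomic, and the error handling is deliberately simplistic (the first error is reported, with no tolerance for a failed mirror).

```c
/* Hypothetical, minimal bio for this sketch. */
struct bio {
    int done;  /* set when completion is reported to the higher layers */
    int error; /* 0 on success */
};

/* Tracks one logical write that was fanned out to several locations. */
struct multi_bio {
    struct bio *orig_bio; /* the bio the caller submitted */
    int         pending;  /* copies still in flight */
    int         error;    /* first error seen, 0 if none */
};

void multi_bio_init(struct multi_bio *m, struct bio *orig, int copies)
{
    m->orig_bio = orig;
    m->pending = copies;
    m->error = 0;
}

/*
 * Called once for every finished copy, whether it was the original
 * bio or a clone. Only the saved original bio is ever completed.
 */
void multi_bio_end_io(struct multi_bio *m, int err)
{
    if (err && !m->error)
        m->error = err;

    if (--m->pending == 0) {
        m->orig_bio->error = m->error;
        m->orig_bio->done = 1;
    }
}
```

Because completion always goes through `orig_bio`, anything that stored a pointer to the submitted bio sees the same object when the IO finishes.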

Btrfs: Add async worker threads for pre and post IO checksumming · 8b712842

Committed by Chris Mason on Jun 11, 2008

Btrfs has been using workqueues to spread the checksumming load across
other CPUs in the system.  But, workqueues only schedule work on the
same CPU that queued the work, giving them a limited benefit for systems with
higher CPU counts.

This code adds a generic facility to schedule work with pools of kthreads,
and changes the bio submission code to queue bios up.  The queueing is
important to make sure large numbers of procs on the system don't
turn streaming workloads into random workloads by sending IO down
concurrently.

The end result of all of this is much higher performance (and CPU usage) when
doing checksumming on large machines.  Two worker pools are created,
one for writes and one for endio processing.  The two could deadlock if
we tried to service both from a single pool.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

8b712842

Fix btrfs_open_devices to deal with changes since the scan ioctls · a0af469b

Committed by Chris Mason on May 13, 2008

Devices can change after the scan ioctls are done, and btrfs_open_devices
needs to be able to verify them as they are opened and used by the FS.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

a0af469b

Btrfs: Add mount -o degraded to allow mounts to continue with missing devices · dfe25020

Committed by Chris Mason on May 13, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

dfe25020

Btrfs: Add support for online device removal · a061fc8d

Committed by Chris Mason on May 07, 2008

This required a few structural changes to the code that manages bdev pointers:

The VFS super block now gets an anon-bdev instead of a pointer to the
lowest bdev.  This allows us to avoid swapping the super block bdev pointer
around at run time.

The code to read in the super block no longer goes through the extent
buffer interface.  Things got ugly keeping the mapping constant.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

a061fc8d

Btrfs: Deal with failed writes in mirrored configurations · a236aed1

Committed by Chris Mason on Apr 29, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

a236aed1

Btrfs: Add balance ioctl to restripe the chunks · ec44a35c

Committed by Chris Mason on Apr 28, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

ec44a35c

Btrfs: Add new ioctl to add devices · 788f20eb

Committed by Chris Mason on Apr 28, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

788f20eb

Btrfs: Make the resizer work based on shrinking and growing devices · 8f18cf13

Committed by Chris Mason on Apr 25, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

8f18cf13

Btrfs: Add a special device list for chunk allocations · b3075717

Committed by Chris Mason on Apr 22, 2008

This allows other code that needs to walk every device in the FS to do so
without locking against allocations.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

b3075717

Btrfs: Make an unplug function that doesn't unplug every spindle · f2d8d74d

Committed by Chris Mason on Apr 21, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

f2d8d74d

Btrfs: Add chunk uuids and update multi-device back references · e17cade2

Committed by Chris Mason on Apr 15, 2008

Block headers now store the chunk tree uuid

Chunk items record the device uuid for each stripe

Device extent items record better back refs to the chunk tree

Block groups record better back refs to the chunk tree

The chunk tree format has also changed.  The objectid of BTRFS_CHUNK_ITEM_KEY
used to be the logical offset of the chunk.  Now it is a chunk tree id,
with the logical offset being stored in the offset field of the key.

This allows a single chunk tree to record multiple logical address spaces,
upping the number of bytes indexed by a chunk tree from 2^64 to
2^128.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

e17cade2
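
The key layout change in e17cade2 can be illustrated with a small sketch: the objectid carries a chunk tree id and the offset carries the logical address, so two 64-bit fields together index 2^64 x 2^64 = 2^128 bytes. The struct, the type constant, and both function names are placeholders for the example, not the kernel's definitions.

```c
#include <stdint.h>

#define CHUNK_ITEM_KEY_TYPE 1 /* placeholder value, not the kernel's constant */

/* Simplified three-part key: (objectid, type, offset). */
struct key {
    uint64_t objectid;
    uint8_t  type;
    uint64_t offset;
};

/* New format: objectid is the chunk tree id, offset is the logical offset. */
struct key make_chunk_key(uint64_t chunk_tree_id, uint64_t logical)
{
    struct key k = { chunk_tree_id, CHUNK_ITEM_KEY_TYPE, logical };
    return k;
}

/* Order keys by objectid, then type, then offset. */
int key_cmp(const struct key *a, const struct key *b)
{
    if (a->objectid != b->objectid)
        return a->objectid < b->objectid ? -1 : 1;
    if (a->type != b->type)
        return a->type < b->type ? -1 : 1;
    if (a->offset != b->offset)
        return a->offset < b->offset ? -1 : 1;
    return 0;
}
```

With this ordering, all chunks of one logical address space sort together by offset, and a second address space with a different tree id sorts entirely before or after them.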

Btrfs: Write out all super blocks on commit, and bring back proper barrier support · f2984462

Committed by Chris Mason on Apr 10, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

f2984462

Btrfs: Retry metadata reads in the face of checksum failures · f188591e

Committed by Chris Mason on Apr 09, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

f188591e

Change btrfs_map_block to return a structure with mappings for all stripes · cea9e445

Committed by Chris Mason on Apr 09, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

cea9e445

Btrfs: Add support for mirroring across drives · 8790d502

Committed by Chris Mason on Apr 03, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

8790d502

Btrfs: Add support for device scanning and detection ioctls · 8a4b83cc

Committed by Chris Mason on Mar 24, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

8a4b83cc

Btrfs: Bring back mount -o ssd optimizations · 239b14b3

Committed by Chris Mason on Mar 24, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

239b14b3

Btrfs: Move device information into the super block so it can be scanned · 0d81ba5d

Committed by Chris Mason on Mar 24, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

0d81ba5d

Btrfs: Dynamic chunk and block group allocation · 6324fbf3

Committed by Chris Mason on Mar 24, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

6324fbf3

Btrfs: Add support for multiple devices per filesystem · 0b86a832

Committed by Chris Mason on Mar 24, 2008

Signed-off-by: Chris Mason <chris.mason@oracle.com>

0b86a832

openeuler / Kernel: synced successfully about 1 year ago
