提交 · 80f6c29d8a758650d5c9eac05074b4b3e8c266df · openeuler / Kernel

08 8月, 2010 1 次提交

block: unify flags for struct bio and struct request · 7b6d91da

由 Christoph Hellwig 提交于 8月 07, 2010

Remove the current bio flags and reuse the request flags for the bio, too.
This allows to more easily trace the type of I/O from the filesystem
down to the block driver. There were two flags in the bio that were
missing in the requests: BIO_RW_UNPLUG and BIO_RW_AHEAD. Also I've
renamed two request flags that had a superflous RW in them.

Note that the flags are in bio.h despite having the REQ_ name - as
blkdev.h includes bio.h that is the only way to go for now.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <jaxboe@fusionio.com>

7b6d91da

12 6月, 2010 2 次提交

Btrfs: btrfs_read_fs_root_no_name() returns ERR_PTRs · 3140c9a3

由 Dan Carpenter 提交于 5月 29, 2010

btrfs_read_fs_root_no_name() returns ERR_PTRs on error so I added a
check for that.  It's not clear to me if it can also return NULL
pointers or not so I left the original NULL pointer check as is.
Signed-off-by: NDan Carpenter <error27@gmail.com>
Signed-off-by: NChris Mason <chris.mason@oracle.com>

3140c9a3

Btrfs: handle kzalloc() failure in open_ctree() · 676e4c86

由 Dan Carpenter 提交于 5月 29, 2010

Unwind and return -ENOMEM if the allocation fails here.
Signed-off-by: NDan Carpenter <error27@gmail.com>
Signed-off-by: NChris Mason <chris.mason@oracle.com>

676e4c86

25 5月, 2010 7 次提交

Btrfs: use async helpers for DIO write checksumming · eaf25d93

由 Chris Mason 提交于 5月 25, 2010

The async helper threads offload crc work onto all the
CPUs, and make streaming writes much faster.  This
changes the O_DIRECT write code to use them.  The only
small complication was that we need to pass in the
logical offset in the file for each bio, because we can't
find it in the bio's pages.
Signed-off-by: NChris Mason <chris.mason@oracle.com>

eaf25d93

Btrfs: Metadata ENOSPC handling for tree log · 4a500fd1

由 Yan, Zheng 提交于 5月 16, 2010

Previous patches make the allocater return -ENOSPC if there is no
unreserved free metadata space. This patch updates tree log code
and various other places to propagate/handle the ENOSPC error.
Signed-off-by: NYan Zheng <zheng.yan@oracle.com>
Signed-off-by: NChris Mason <chris.mason@oracle.com>

4a500fd1

Btrfs: Metadata reservation for orphan inodes · d68fc57b

由 Yan, Zheng 提交于 5月 16, 2010

reserve metadata space for handling orphan inodes
Signed-off-by: NYan Zheng <zheng.yan@oracle.com>
Signed-off-by: NChris Mason <chris.mason@oracle.com>

d68fc57b

Btrfs: Introduce global metadata reservation · 8929ecfa

由 Yan, Zheng 提交于 5月 16, 2010

Reserve metadata space for extent tree, checksum tree and root tree
Signed-off-by: NYan Zheng <zheng.yan@oracle.com>
Signed-off-by: NChris Mason <chris.mason@oracle.com>

8929ecfa

Btrfs: Integrate metadata reservation with start_transaction · a22285a6

由 Yan, Zheng 提交于 5月 16, 2010

Besides simplify the code, this change makes sure all metadata
reservation for normal metadata operations are released after
committing transaction.

Changes since V1:

Add code that check if unlink and rmdir will free space.

Add ENOSPC handling for clone ioctl.
Signed-off-by: NYan Zheng <zheng.yan@oracle.com>
Signed-off-by: NChris Mason <chris.mason@oracle.com>

a22285a6

Btrfs: Introduce contexts for metadata reservation · f0486c68

由 Yan, Zheng 提交于 5月 16, 2010

Introducing metadata reseravtion contexts has two major advantages.
First, it makes metadata reseravtion more traceable. Second, it can
reclaim freed space and re-add them to the itself after transaction
committed.

Besides add btrfs_block_rsv structure and related helper functions,
This patch contains following changes:

Move code that decides if freed tree block should be pinned into
btrfs_free_tree_block().

Make space accounting more accurate, mainly for handling read only
block groups.
Signed-off-by: NChris Mason <chris.mason@oracle.com>

f0486c68

Btrfs: Shrink delay allocated space in a synchronized · 5da9d01b

由 Yan, Zheng 提交于 5月 16, 2010

Shrink delayed allocation space in a synchronized manner is more
controllable than flushing all delay allocated space in an async
thread.
Signed-off-by: NYan Zheng <zheng.yan@oracle.com>
Signed-off-by: NChris Mason <chris.mason@oracle.com>

5da9d01b

26 4月, 2010 1 次提交

btrfs: convert to using bdi_setup_and_register() · e6d086d8

由 Jens Axboe 提交于 4月 26, 2010

It's now a provided helper, so get rid of the internal setup
and btrfs atomic_t bdi enumerator.
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

e6d086d8

31 3月, 2010 2 次提交

Btrfs: kill max_extent mount option · 287a0ab9

由 Josef Bacik 提交于 3月 19, 2010

As Yan pointed out, theres not much reason for all this complicated math to
account for file extents being split up into max_extent chunks, since they are
likely to all end up in the same leaf anyway.  Since there isn't much reason to
use max_extent, just remove the option altogether so we have one less thing we
need to test.
Signed-off-by: NJosef Bacik <josef@redhat.com>
Signed-off-by: NChris Mason <chris.mason@oracle.com>

287a0ab9

Btrfs: fail to mount if we have problems reading the block groups · 1b1d1f66

由 Josef Bacik 提交于 3月 19, 2010

We don't actually check the return value of btrfs_read_block_groups, so we can
possibly succeed to mount, but then fail to say read the superblock xattr for
selinux which will cause the vfs code to deactivate the super.

This is a problem because in find_free_extent we just assume that we
will find the right space_info for the allocation we want. But if we
failed to read the block groups, we won't have setup any space_info's,
and we'll hit a NULL pointer deref in find_free_extent.

This patch fixes that problem by checking the return value of
btrfs_read_block_groups, and failing out properly. I've also added a
check in find_free_extent so if for some reason we don't find an
appropriate space_info, we just return -ENOSPC.
Signed-off-by: NJosef Bacik <josef@redhat.com>
Signed-off-by: NChris Mason <chris.mason@oracle.com>

1b1d1f66

30 3月, 2010 1 次提交

include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6

由 Tejun Heo 提交于 3月 24, 2010

include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: NTejun Heo <tj@kernel.org>
Guess-its-ok-by: NChristoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

5a0e3ad6

15 3月, 2010 1 次提交

Btrfs: cache the extent state everywhere we possibly can V2 · 2ac55d41

由 Josef Bacik 提交于 2月 03, 2010

This patch just goes through and fixes everybody that does

lock_extent()
blah
unlock_extent()

to use

lock_extent_bits()
blah
unlock_extent_cached()

and pass around a extent_state so we only have to do the searches once per
function.  This gives me about a 3 mb/s boots on my random write test.  I have
not converted some things, like the relocation and ioctl's, since they aren't
heavily used and the relocation stuff is in the middle of being re-written.  I
also changed the clear_extent_bit() to only unset the cached state if we are
clearing EXTENT_LOCKED and related stuff, so we can do things like this

lock_extent_bits()
clear delalloc bits
unlock_extent_cached()

without losing our cached state.  I tested this thoroughly and turned on
LEAK_DEBUG to make sure we weren't leaking extent states, everything worked out
fine.
Signed-off-by: NJosef Bacik <josef@redhat.com>
Signed-off-by: NChris Mason <chris.mason@oracle.com>

2ac55d41

09 3月, 2010 1 次提交

Btrfs: use RB_ROOT to intialize rb_trees instead of setting rb_node to NULL · 6bef4d31

由 Eric Paris 提交于 2月 23, 2010

btrfs inialize rb trees in quite a number of places by settin rb_node =
NULL;  The problem with this is that 17d9ddc7 in the
linux-next tree adds a new field to that struct which needs to be NULL for
the new rbtree library code to work properly.  This patch uses RB_ROOT as
the intializer so all of the relevant fields will be NULL'd.  Without the
patch I get a panic.
Signed-off-by: NEric Paris <eparis@redhat.com>
Acked-by: NVenkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: NChris Mason <chris.mason@oracle.com>

6bef4d31

05 2月, 2010 1 次提交

Btrfs: remove BUG_ON() due to mounting bad filesystem · d7ce5843

由 Miao Xie 提交于 2月 02, 2010

Mounting a bad filesystem caused a BUG_ON(). The following is steps to
reproduce it.
 # mkfs.btrfs /dev/sda2
 # mount /dev/sda2 /mnt
 # mkfs.btrfs /dev/sda1 /dev/sda2
 (the program says that /dev/sda2 was mounted, and then exits. )
 # umount /mnt
 # mount /dev/sda1 /mnt

At the third step, mkfs.btrfs exited in the way of make filesystem. So the
initialization of the filesystem didn't finish. So the filesystem was bad, and
it caused BUG_ON() when mounting it. But BUG_ON() should be called by the wrong
code, not user's operation, so I think it is a bug of btrfs.

This patch fixes it.
Signed-off-by: NMiao Xie <miaox@cn.fujitsu.com>
Signed-off-by: NChris Mason <chris.mason@oracle.com>

d7ce5843

29 1月, 2010 1 次提交

Btrfs: run orphan cleanup on default fs root · e3acc2a6

由 Josef Bacik 提交于 1月 26, 2010

This patch revert's commit

6c090a11

Since it introduces this problem where we can run orphan cleanup on a
volume that can have orphan entries re-added.  Instead of my original
fix, Yan Zheng pointed out that we can just revert my original fix and
then run the orphan cleanup in open_ctree after we look up the fs_root.
I have tested this with all the tests that gave me problems and this
patch fixes both problems.  Thanks,
Signed-off-by: NJosef Bacik <josef@redhat.com>
Signed-off-by: NChris Mason <chris.mason@oracle.com>

e3acc2a6

18 12月, 2009 2 次提交

Btrfs: Add delayed iput · 24bbcf04

由 Yan, Zheng 提交于 11月 12, 2009

iput() can trigger new transactions if we are dropping the
final reference, so calling it in btrfs_commit_transaction
may end up deadlock. This patch adds delayed iput to avoid
the issue.
Signed-off-by: NYan Zheng <zheng.yan@oracle.com>
Signed-off-by: NChris Mason <chris.mason@oracle.com>

24bbcf04

Btrfs: Avoid orphan inodes cleanup while replaying log · c71bf099

由 Yan, Zheng 提交于 11月 12, 2009

We do log replay in a single transaction, so it's not good to do unbound
operations. This patch cleans up orphan inodes cleanup after replaying
the log. It also avoids doing other unbound operations such as truncating
a file during replaying log. These unbound operations are postponed to
the orphan inode cleanup stage.
Signed-off-by: NYan Zheng <zheng.yan@oracle.com>
Signed-off-by: NChris Mason <chris.mason@oracle.com>

c71bf099

16 12月, 2009 1 次提交

Btrfs: Avoid superfluous tree-log writeout · 8cef4e16

由 Yan, Zheng 提交于 11月 12, 2009

We allow two log transactions at a time, but use same flag
to mark dirty tree-log btree blocks. So we may flush dirty
blocks belonging to newer log transaction when committing a
log transaction. This patch fixes the issue by using two
flags to mark dirty tree-log btree blocks.
Signed-off-by: NYan Zheng <zheng.yan@oracle.com>
Signed-off-by: NChris Mason <chris.mason@oracle.com>

8cef4e16

14 10月, 2009 1 次提交

Btrfs: avoid tree log commit when there are no changes · 257c62e1

由 Chris Mason 提交于 10月 13, 2009

rpm has a habit of running fdatasync when the file hasn't
changed.  We already detect if a file hasn't been changed
in the current transaction but it might have been sent to
the tree-log in this transaction and not changed since
the last call to fsync.

In this case, we want to avoid a tree log sync, which includes
a number of synchronous writes and barriers.  This commit
extends the existing tracking of the last transaction to change
a file to also track the last sub-transaction.

The end result is that rpm -ivh and -Uvh are roughly twice as fast,
and on par with ext3.
Signed-off-by: NChris Mason <chris.mason@oracle.com>

257c62e1

09 10月, 2009 1 次提交

Btrfs: async delalloc flushing under space pressure · e3ccfa98

由 Josef Bacik 提交于 10月 07, 2009

This patch moves the delalloc flushing that occurs when we are under space
pressure off to a async thread pool. This helps since we only free up
metadata space when we actually insert the extent item, which means it takes
quite a while for space to be free'ed up if we wait on all ordered extents.
However, if space is freed up due to inline extents being inserted, we can
wake people who are waiting up early, and they can finish their work.
Signed-off-by: NJosef Bacik <jbacik@redhat.com>
Signed-off-by: NChris Mason <chris.mason@oracle.com>

e3ccfa98

05 10月, 2009 1 次提交

Btrfs: fix deadlock on async thread startup · 61d92c32

由 Chris Mason 提交于 10月 02, 2009

The btrfs async worker threads are used for a wide variety of things,
including processing bio end_io functions.  This means that when
the endio threads aren't running, the rest of the FS isn't
able to do the final processing required to clear PageWriteback.

The endio threads also try to exit as they become idle and
start more as the work piles up.  The problem is that starting more
threads means kthreadd may need to allocate ram, and that allocation
may wait until the global number of writeback pages on the system is
below a certain limit.

The result of that throttling is that end IO threads wait on
kthreadd, who is waiting on IO to end, which will never happen.

This commit fixes the deadlock by handing off thread startup to a
dedicated thread.  It also fixes a bug where the on-demand thread
creation was creating far too many threads because it didn't take into
account threads being started by other procs.
Signed-off-by: NChris Mason <chris.mason@oracle.com>

61d92c32

02 10月, 2009 1 次提交

Btrfs: remove duplicates of filemap_ helpers · 8aa38c31

由 Christoph Hellwig 提交于 10月 01, 2009

Use filemap_fdatawrite_range and filemap_fdatawait_range instead of
local copies of the functions.  For filemap_fdatawait_range that
also means replacing the awkward old wait_on_page_writeback_range
calling convention with the regular filemap byte offsets.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NChris Mason <chris.mason@oracle.com>

8aa38c31

01 10月, 2009 1 次提交

Btrfs: fix arguments to btrfs_wait_on_page_writeback_range · 35d62a94

由 Christoph Hellwig 提交于 9月 30, 2009

wait_on_page_writeback_range/btrfs_wait_on_page_writeback_range takes
a pagecache offset, not a byte offset into the file.  Shift the arguments
around to wait for the correct range
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NChris Mason <chris.mason@oracle.com>

35d62a94

29 9月, 2009 1 次提交

Btrfs: proper -ENOSPC handling · 9ed74f2d

由 Josef Bacik 提交于 9月 11, 2009

At the start of a transaction we do a btrfs_reserve_metadata_space() and
specify how many items we plan on modifying. Then once we've done our
modifications and such, just call btrfs_unreserve_metadata_space() for
the same number of items we reserved.

For keeping track of metadata needed for data I've had to add an extent_io op
for when we merge extents. This lets us track space properly when we are doing
sequential writes, so we don't end up reserving way more metadata space than
what we need.

The only place where the metadata space accounting is not done is in the
relocation code. This is because Yan is going to be reworking that code in the
near future, so running btrfs-vol -b could still possibly result in a ENOSPC
related panic. This patch also turns off the metadata_ratio stuff in order to
allow users to more efficiently use their disk space.

This patch makes it so we track how much metadata we need for an inode's
delayed allocation extents by tracking how many extents are currently
waiting for allocation. It introduces two new callbacks for the
extent_io tree's, merge_extent_hook and split_extent_hook. These help
us keep track of when we merge delalloc extents together and split them
up. Reservations are handled prior to any actually dirty'ing occurs,
and then we unreserve after we dirty.

btrfs_unreserve_metadata_for_delalloc() will make the appropriate
unreservations as needed based on the number of reservations we
currently have and the number of extents we currently have. Doing the
reservation outside of doing any of the actual dirty'ing lets us do
things like filemap_flush() the inode to try and force delalloc to
happen, or as a last resort actually start allocation on all delalloc
inodes in the fs. This has survived dbench, fs_mark and an fsx torture
test.
Signed-off-by: NJosef Bacik <jbacik@redhat.com>
Signed-off-by: NChris Mason <chris.mason@oracle.com>

9ed74f2d

24 9月, 2009 1 次提交

Btrfs: hash the btree inode during fill_super · c65ddb52

由 Yan Zheng 提交于 9月 24, 2009

The snapshot deletion  patches dropped this line, but the inode
needs to be hashed.
Signed-off-by: NChris Mason <chris.mason@oracle.com>

c65ddb52

22 9月, 2009 4 次提交

const: mark remaining address_space_operations const · 7f09410b

由 Alexey Dobriyan 提交于 9月 21, 2009

Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

7f09410b

Btrfs: add snapshot/subvolume destroy ioctl · 76dda93c

由 Yan, Zheng 提交于 9月 21, 2009

This patch adds snapshot/subvolume destroy ioctl. A subvolume that isn't being
used and doesn't contains links to other subvolumes can be destroyed.
Signed-off-by: NYan Zheng <zheng.yan@oracle.com>
Signed-off-by: NChris Mason <chris.mason@oracle.com>

76dda93c

Btrfs: change how subvolumes are organized · 4df27c4d

由 Yan, Zheng 提交于 9月 21, 2009

btrfs allows subvolumes and snapshots anywhere in the directory tree.
If we snapshot a subvolume that contains a link to other subvolume
called subvolA, subvolA can be accessed through both the original
subvolume and the snapshot. This is similar to creating hard link to
directory, and has the very similar problems.

The aim of this patch is enforcing there is only one access point to
each subvolume. Only the first directory entry (the one added when
the subvolume/snapshot was created) is treated as valid access point.
The first directory entry is distinguished by checking root forward
reference. If the corresponding root forward reference is missing,
we know the entry is not the first one.

This patch also adds snapshot/subvolume rename support, the code
allows rename subvolume link across subvolumes.
Signed-off-by: NYan Zheng <zheng.yan@oracle.com>
Signed-off-by: NChris Mason <chris.mason@oracle.com>

4df27c4d

Btrfs: do not reuse objectid of deleted snapshot/subvol · 13a8a7c8

由 Yan, Zheng 提交于 9月 21, 2009

The new back reference format does not allow reusing objectid of
deleted snapshot/subvol. So we use ++highest_objectid to allocate
objectid for new snapshot/subvol.

Now we use ++highest_objectid to allocate objectid for both new inode
and new snapshot/subvolume, so this patch removes 'find hole' code in
btrfs_find_free_objectid.
Signed-off-by: NYan Zheng <zheng.yan@oracle.com>
Signed-off-by: NChris Mason <chris.mason@oracle.com>

13a8a7c8

18 9月, 2009 1 次提交

Btrfs: improve async block group caching · 11833d66

由 Yan Zheng 提交于 9月 11, 2009

This patch gets rid of two limitations of async block group caching.
The old code delays handling pinned extents when block group is in
caching. To allocate logged file extents, the old code need wait
until block group is fully cached. To get rid of the limitations,
This patch introduces a data structure to track the progress of
caching. Base on the caching progress, we know which extents should
be added to the free space cache when handling the pinned extents.
The logged file extents are also handled in a similar way.

This patch also changes how pinned extents are tracked. The old
code uses one tree to track pinned extents, and copy the pinned
extents tree at transaction commit time. This patch makes it use
two trees to track pinned extents. One tree for extents that are
pinned in the running transaction, one tree for extents that can
be unpinned. At transaction commit time, we swap the two trees.
Signed-off-by: NYan Zheng <zheng.yan@oracle.com>
Signed-off-by: NChris Mason <chris.mason@oracle.com>

11833d66

16 9月, 2009 1 次提交

fs: Assign bdi in super_block · 32a88aa1

由 Jens Axboe 提交于 9月 16, 2009

We do this automatically in get_sb_bdev() from the set_bdev_super()
callback. Filesystems that have their own private backing_dev_info
must assign that in ->fill_super().

Note that ->s_bdi assignment is required for proper writeback!
Acked-by: NChristoph Hellwig <hch@infradead.org>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

32a88aa1

12 9月, 2009 2 次提交

Btrfs: switch extent_map to a rw lock · 890871be

由 Chris Mason 提交于 9月 02, 2009

There are two main users of the extent_map tree.  The
first is regular file inodes, where it is evenly spread
between readers and writers.

The second is the chunk allocation tree, which maps blocks from
logical addresses to phyiscal ones, and it is 99.99% reads.

The mapping tree is a point of lock contention during heavy IO
workloads, so this commit switches things to a rw lock.
Signed-off-by: NChris Mason <chris.mason@oracle.com>

890871be

Btrfs: Allow worker threads to exit when idle · 9042846b

由 Chris Mason 提交于 8月 04, 2009

The Btrfs worker threads don't currently die off after they have
been idle for a while, leading to a lot of threads sitting around
doing nothing for each mount.

Also, they are unable to start atomically (from end_io hanlders).

This commit reworks the worker threads so they can be started
from end_io handlers (just setting a flag that asks for a thread
to be added at a later date) and so they can exit if they
have been idle for a long time.
Signed-off-by: NChris Mason <chris.mason@oracle.com>

9042846b

11 9月, 2009 1 次提交

writeback: add name to backing_dev_info · d993831f

由 Jens Axboe 提交于 6月 12, 2009

This enables us to track who does what and print info. Its main use
is catching dirty inodes on the default_backing_dev_info, so we can
fix that up.
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

d993831f

30 7月, 2009 1 次提交

Btrfs: preserve commit_root for async caching · 276e680d

由 Yan Zheng 提交于 7月 30, 2009

The async block group caching code uses the commit_root pointer
to get a stable version of the extent allocation tree for scanning.
This copy of the tree root isn't going to change and it significantly
reduces the complexity of the scanning code.

During a commit, we have a loop where we update the extent allocation
tree root.  We need to loop because updating the root pointer in
the tree of tree roots may allocate blocks which may change the
extent allocation tree.

Right now the commit_root pointer is changed inside this loop.  It
is more correct to change the commit_root pointer only after all the
looping is done.
Signed-off-by: NYan Zheng <zheng.yan@oracle.com>
Signed-off-by: NChris Mason <chris.mason@oracle.com>

276e680d

28 7月, 2009 2 次提交

Btrfs: Fix async caching interaction with unmount · f25784b3

由 Yan Zheng 提交于 7月 28, 2009

- don't stop the caching thread until btrfs_commit_super return.

- if caching is interrupted by umount, set last to (u64)-1.
  otherwise the un-scanned range of block group will be considered
  as free extent.
Signed-off-by: NChris Mason <chris.mason@oracle.com>

f25784b3

Btrfs: change how we unpin extents · 68b38550

由 Josef Bacik 提交于 7月 27, 2009

We are racy with async block caching and unpinning extents. This patch makes
things much less complicated by only unpinning the extent if the block group is
cached. We check the block_group->cached var under the block_group->lock spin
lock. If it is set to BTRFS_CACHE_FINISHED then we update the pinned counters,
and unpin the extent and add the free space back. If it is not set to this, we
start the caching of the block group so the next time we unpin extents we can
unpin the extent. This keeps us from racing with the async caching threads,
lets us kill the fs wide async thread counter, and keeps us from having to set
DELALLOC bits for every extent we hit if there are caching kthreads going.

One thing that needed to be changed was btrfs_free_super_mirror_extents. Now
instead of just looking for LOCKED extents, we also look for DIRTY extents,
since we could have left some extents pinned in the previous transaction that
will never get freed now that we are unmounting, which would cause us to leak
memory. So btrfs_free_super_mirror_extents has been changed to
btrfs_free_pinned_extents, and it will clear the extents locked for the super
mirror, and any remaining pinned extents that may be present. Thank you,
Signed-off-by: NJosef Bacik <jbacik@redhat.com>
Signed-off-by: NChris Mason <chris.mason@oracle.com>

68b38550

openeuler / Kernel 大约 1 年 前同步成功

openeuler / Kernel
大约 1 年前同步成功