Commits · 4da6f1a332f6c16b6594c7892f13c31459b9b1c8 · openanolis / cloud-kernel

11 Jan, 2012 2 commits

Btrfs: reserve metadata space in btrfs_ioctl_setflags() · 4da6f1a3

Committed by Li Zefan on Dec 29, 2011

Check and reserve space for btrfs_update_inode().
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>

Btrfs: remove BUG_ON()s in btrfs_ioctl_setflags() · f062abf0

Committed by Li Zefan on Dec 29, 2011

We can recover from errors and return -errno to user space.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>

16 Dec, 2011 1 commit

Btrfs: fix how we do delalloc reservations and how we free reservations on error · 660d3f6c

Committed by Josef Bacik on Dec 09, 2011

Running xfstests 269 with some tracing my scripts kept spitting out errors about
releasing bytes that we didn't actually have reserved. This took me down a huge
rabbit hole and it turns out the way we deal with reserved_extents is wrong,
we need to only be setting it if the reservation succeeds, otherwise the free()
method will come in and unreserve space that isn't actually reserved yet, which
can lead to other warnings and such. The math was all working out right in the
end, but it caused all sorts of other issues in addition to making my scripts
yell and scream and generally make it impossible for me to track down the
original issue I was looking for.

The other problem is with our error handling in the reservation code. There
are two cases that we need to deal with:

1) We raced with free. In this case free won't free anything because csum_bytes
is modified before we drop the lock in our reservation path, so free rightly
doesn't release any space because the reservation code may be depending on that
reservation. However if we fail, we need the reservation side to do the free at
that point since that space is no longer in use. So as it stands the code was
doing this fine and it worked out, except in case #2

2) We don't race with free. Nobody comes in and changes anything, and our
reservation fails. In this case we didn't reserve anything anyway and we just
need to clean up csum_bytes but not free anything. So we keep track of
csum_bytes before we drop the lock and if it hasn't changed we know we can just
decrement csum_bytes and carry on.

Because of the case where we can race with free()'s since we have to drop our
spin_lock to do the reservation, I'm going to serialize all reservations with
the i_mutex. We already get this for free in the heavy use paths, truncate and
file write all hold the i_mutex, just needed to add it to page_mkwrite and
various ioctl/balance things. With this patch my space leak scripts no longer
scream bloody murder. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

15 Dec, 2011 1 commit

Btrfs: fix ctime update of on-disk inode · 306424cc

Committed by Li Zefan on Dec 14, 2011

To reproduce the bug:

    # touch /mnt/tmp
    # stat /mnt/tmp | grep Change
    Change: 2011-12-09 09:32:23.412105981 +0800
    # chattr +i /mnt/tmp
    # stat /mnt/tmp | grep Change
    Change: 2011-12-09 09:32:43.198105295 +0800
    # umount /mnt
    # mount /dev/loop1 /mnt
    # stat /mnt/tmp | grep Change
    Change: 2011-12-09 09:32:23.412105981 +0800

We should update ctime of in-memory inode before calling
btrfs_update_inode().
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

01 Dec, 2011 1 commit

Btrfs: Don't error on resizing FS to same size · ece7d20e

Committed by Mike Fleetwood on Nov 18, 2011

It seems overly harsh to fail a resize of a btrfs file system to the
same size when a shrink or grow would succeed. User app GParted trips
over this error. Allow it by bypassing the shrink or grow operation.
Signed-off-by: Mike Fleetwood <mike.fleetwood@googlemail.com>
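
The behaviour described in this commit can be sketched in user-space C as follows. This is only an illustration of the decision, not the kernel code: `pick_resize_action` and the `resize_action` enum are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical outcome of a resize request. */
enum resize_action { RESIZE_NONE, RESIZE_GROW, RESIZE_SHRINK };

/*
 * Decide what a resize request should do. A request for the current
 * size is treated as a successful no-op instead of an error.
 */
enum resize_action pick_resize_action(uint64_t old_size, uint64_t new_size)
{
    if (new_size > old_size)
        return RESIZE_GROW;
    if (new_size < old_size)
        return RESIZE_SHRINK;
    return RESIZE_NONE;   /* same size: bypass both grow and shrink */
}
```

A caller such as a partitioning tool can then issue the resize unconditionally and treat `RESIZE_NONE` as success.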

20 Nov, 2011 2 commits

Btrfs: prefix resize related printks with btrfs: · 5bb14682

Committed by Arnd Hannemann on Nov 20, 2011

For the user it is confusing to find something like:
[10197.627710] new size for /dev/mapper/vg0-usr_share is 3221225472
in kernel log, because it doesn't point directly to btrfs.

This patch prefixes those messages with "btrfs:" like other btrfs
related printks.
Signed-off-by: Arnd Hannemann <arnd@arndnet.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

btrfs: Fix up 32/64-bit compatibility for new ioctls · 745c4d8e

Committed by Jeff Mahoney on Nov 20, 2011

This patch casts to unsigned long before casting to a pointer and fixes
the following warnings:
fs/btrfs/extent_io.c:2289:20: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
fs/btrfs/ioctl.c:2933:37: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
fs/btrfs/ioctl.c:2937:21: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
fs/btrfs/ioctl.c:3020:21: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
fs/btrfs/scrub.c:275:4: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
fs/btrfs/backref.c:686:27: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
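
A minimal user-space sketch of the casting pattern this commit describes. The kernel patch goes through `unsigned long`, which is pointer-sized there; this sketch assumes `uintptr_t` as the portable user-space equivalent. The helper names are hypothetical.

```c
#include <stdint.h>

/* Carry a pointer in a fixed-width 64-bit field, as ioctl structs do. */
uint64_t ptr_to_u64(const void *p)
{
    /* pointer -> pointer-sized integer -> u64: sizes match at each step */
    return (uint64_t)(uintptr_t)p;
}

/* Recover the pointer from the 64-bit field. */
void *u64_to_ptr(uint64_t v)
{
    /* u64 -> pointer-sized integer -> pointer: avoids -Wint-to-pointer-cast */
    return (void *)(uintptr_t)v;
}
```

Casting directly between a 64-bit integer and a pointer is what triggers the warnings on 32-bit builds, because the two types differ in size there.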

06 Nov, 2011 2 commits

Btrfs: fix the new inspection ioctls for 32 bit compat · 740c3d22

Committed by Chris Mason on Nov 02, 2011

The new ioctls to follow backrefs are not clean for 32/64 bit
compat.  This reworks them for u64s everywhere.  They are brand new, so
there are no problems with changing the interface now.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

btrfs: separate superblock items out of fs_info · 6c41761f

Committed by David Sterba on Apr 13, 2011

fs_info is now ~9kb, more than fits into one page. This will cause
mount failure when memory is too fragmented. Top space consumers are
super block structures super_copy and super_for_commit, ~2.8kb each.
Allocate them dynamically. fs_info will be ~3.5kb. (measured on x86_64)

Add a wrapper for freeing fs_info and all of its dynamically allocated
members.
Signed-off-by: David Sterba <dsterba@suse.cz>
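
The idea of moving the large members out of the container and freeing everything through one wrapper might look like this in user-space C. All `sk_` names and the 2800-byte size are assumptions for illustration; the real structures are different.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a super block copy (~2.8kb as quoted above). */
struct sk_super {
    unsigned char bytes[2800];
};

/* The large members are pointers, so the container itself stays small. */
struct sk_fs_info {
    struct sk_super *super_copy;
    struct sk_super *super_for_commit;
};

/* Free the container and every dynamically allocated member. */
void sk_free_fs_info(struct sk_fs_info *fs_info)
{
    if (!fs_info)
        return;
    free(fs_info->super_copy);          /* free(NULL) is a no-op */
    free(fs_info->super_for_commit);
    free(fs_info);
}

/* Allocate the container and its members; returns NULL on failure. */
struct sk_fs_info *sk_alloc_fs_info(void)
{
    struct sk_fs_info *fs_info = calloc(1, sizeof(*fs_info));

    if (!fs_info)
        return NULL;
    fs_info->super_copy = calloc(1, sizeof(*fs_info->super_copy));
    fs_info->super_for_commit = calloc(1, sizeof(*fs_info->super_for_commit));
    if (!fs_info->super_copy || !fs_info->super_for_commit) {
        sk_free_fs_info(fs_info);       /* one cleanup path for partial failure */
        return NULL;
    }
    return fs_info;
}
```

Having a single free wrapper means error paths do not need to know which members were already allocated.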

21 Oct, 2011 6 commits

btrfs: return EINVAL if start > total_bytes in fitrim ioctl · f4c697e6

Committed by Lukas Czerner on Sep 05, 2011

We should return EINVAL if the start is beyond the end of the file
system in btrfs_ioctl_fitrim(). Fix that by adding the appropriate
check for it.

Also, in btrfs_trim_fs() it is possible that len+start might overflow
if big values are passed. Fix it by decrementing the len so that start+len
is equal to the file system size in the worst case.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
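
Both checks from this commit message can be sketched in one small helper. This is a user-space illustration under the assumption that the range is given as a start offset and a length in bytes; `clamp_trim_range` is a hypothetical name, not a kernel function.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Validate and clamp a trim range against a filesystem of total_bytes.
 * Returns 0 on success, or -EINVAL if start lies past the end.
 */
int clamp_trim_range(uint64_t start, uint64_t *len, uint64_t total_bytes)
{
    if (start > total_bytes)
        return -EINVAL;
    /*
     * Compare len with the remaining space instead of computing
     * start + len, so the check itself cannot overflow.
     */
    if (*len > total_bytes - start)
        *len = total_bytes - start;
    return 0;
}
```

After a successful call, `start + *len` is at most `total_bytes`, which matches the worst case described above.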

Btrfs: honor extent thresh during defragmentation · 008873ea

Committed by Li Zefan on Sep 02, 2011

We won't defrag an extent, if it's bigger than the threshold we
specified and there's no small extent before it, but actually
the code doesn't work this way.

There are three bugs:

- When should_defrag_range() decides we should keep on defragmenting
  an extent, last_len is not incremented. (old bug)

- The length passed to should_defrag_range() is not the length
  we're going to defrag. (new bug)

- We always defrag 256K bytes data, and a big extent can be part of
  this range. (new bug)

For a file with 4 extents:

        | 4K | 4K | 256K | 256K |

The result of defrag with (the default) 256K extent thresh should be:

        | 264K | 256K |

but with those bugs, we'll get:

        | 520K |
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
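
The intended rule can be checked with a simplified model. The sketch below is not the kernel's should_defrag_range(); `defrag_model` is a hypothetical function that assumes a big extent is skipped unless the run merged just before it is still below the threshold.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Walk extent lengths in file order and merge every extent that should
 * be defragmented into the current run. Writes the resulting extent
 * lengths to out[] (capacity >= n) and returns how many there are.
 */
size_t defrag_model(const uint64_t *ext, size_t n, uint64_t thresh,
                    uint64_t *out)
{
    size_t count = 0;
    uint64_t run = 0;   /* bytes merged so far in the current run */

    for (size_t i = 0; i < n; i++) {
        /* Skip a big extent unless a still-small run precedes it. */
        int skip = ext[i] >= thresh && (run == 0 || run >= thresh);

        if (skip) {
            if (run)
                out[count++] = run;   /* close the current run */
            out[count++] = ext[i];    /* big extent stays as it is */
            run = 0;
        } else {
            run += ext[i];            /* merge into the run */
        }
    }
    if (run)
        out[count++] = run;
    return count;
}
```

For the four-extent file above with a 256K threshold, the model yields two extents of 264K and 256K, matching the expected result.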

Btrfs: fix wrong max_to_defrag in btrfs_defrag_file() · 5ca49660

Committed by Li Zefan on Sep 02, 2011

It's off-by-one, and thus we may skip the last page while defragmenting.

An example case:

  # create /mnt/file with 2 4K file extents
  # btrfs fi defrag /mnt/file
  # sync
  # filefrag /mnt/file
  /mnt/file: 2 extents found

So it's not defragmented.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>

Btrfs: use i_size_read() in btrfs_defrag_file() · 151a31b2

Committed by Li Zefan on Sep 02, 2011

Don't use inode->i_size directly, since we're not holding i_mutex.

This also fixes another bug, that i_size can change after it's checked
against 0 and then (i_size - 1) can be negative.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
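
The second bug is a check-then-use problem: the size is tested once and read again later. A minimal sketch of the safer pattern, assuming the caller takes one snapshot of the size (in the kernel that snapshot would come from i_size_read()) and assuming 4K pages. `last_page_index` is a hypothetical helper.

```c
#include <stdint.h>

#define SK_PAGE_SHIFT 12   /* assumption: 4K pages */

/*
 * Compute the index of the last page from a single snapshot of the
 * file size. Returns 0 and stores the index, or -1 for an empty file.
 */
int last_page_index(int64_t size_snapshot, uint64_t *last_index)
{
    if (size_snapshot <= 0)
        return -1;   /* nothing to defragment */
    /* The value checked above and the value used here are the same copy. */
    *last_index = (uint64_t)(size_snapshot - 1) >> SK_PAGE_SHIFT;
    return 0;
}
```

Because the check and the subtraction operate on one local copy, a concurrent size change cannot make `size - 1` negative between the two.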

Btrfs: fix defragmentation regression · cbcc8326

Committed by Li Zefan on Sep 02, 2011

There's an off-by-one bug:

  # create a file with lots of 4K file extents
  # btrfs fi defrag /mnt/file
  # sync
  # filefrag -v /mnt/file
  Filesystem type is: 9123683e
  File size of /mnt/file is 1228800 (300 blocks, blocksize 4096)
   ext logical physical expected length flags
     0       0     3372              64
     1      64     3136     3435      1
     2      65     3436     3136     64
     3     129     3201     3499      1
     4     130     3500     3201     64
     5     194     3266     3563      1
     6     195     3564     3266     64
     7     259     3331     3627      1
     8     260     3628     3331     40 eof

After this patch:

  ...
  # filefrag -v /mnt/file
  Filesystem type is: 9123683e
  File size of /mnt/file is 1228800 (300 blocks, blocksize 4096)
   ext logical physical expected length flags
     0       0     3372             300 eof
  /mnt/file: 1 extent found
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>

btrfs: fix memory leak in btrfs_defrag_file · 60ccf82f

Committed by Diego Calleja on Sep 01, 2011

kmemleak found this:
unreferenced object 0xffff8801b64af968 (size 512):
  comm "btrfs-cleaner", pid 3317, jiffies 4306810886 (age 903.272s)
  hex dump (first 32 bytes):
    00 82 01 07 00 ea ff ff c0 83 01 07 00 ea ff ff  ................
    80 82 01 07 00 ea ff ff c0 87 01 07 00 ea ff ff  ................
  backtrace:
    [<ffffffff816875cc>] kmemleak_alloc+0x5c/0xc0
    [<ffffffff8114aec3>] kmem_cache_alloc_trace+0x163/0x240
    [<ffffffff8127a290>] btrfs_defrag_file+0xf0/0xb20
    [<ffffffff8125d9a5>] btrfs_run_defrag_inodes+0x165/0x210
    [<ffffffff812479d7>] cleaner_kthread+0x177/0x190
    [<ffffffff81075c7d>] kthread+0x8d/0xa0
    [<ffffffff816af5f4>] kernel_thread_helper+0x4/0x10
    [<ffffffffffffffff>] 0xffffffffffffffff

"pages" is not always freed. Fix it by removing the unnecessary additional return.
Signed-off-by: Diego Calleja <diegocg@gmail.com>

20 Oct, 2011 2 commits

Btrfs: only inherit btrfs specific flags when creating files · e27425d6

Committed by Josef Bacik on Sep 27, 2011

Xfstests 79 was failing because we were inheriting the S_APPEND flag when we
weren't supposed to. There isn't any specific documentation on this so I'm
taking the test as the standard of how things work, and having S_APPEND set on a
directory doesn't mean that S_APPEND gets inherited by its children according to
this test. So only inherit btrfs specific things. This will let us set
compress/nocompress on specific directories and everything in the directories
will inherit this flag, same with nodatacow. With this patch test 79 passes.
Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
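
Restricting inheritance to filesystem-specific flags amounts to masking the parent's flags. The sketch below shows the idea only; every `SK_` name and bit value is an assumption and does not match the real flag definitions.

```c
#include <stdint.h>

/* Illustrative flag bits; names and values are assumptions. */
#define SK_FLAG_APPEND     (1u << 0)   /* generic flag, not inherited */
#define SK_FLAG_COMPRESS   (1u << 1)
#define SK_FLAG_NOCOMPRESS (1u << 2)
#define SK_FLAG_NODATACOW  (1u << 3)

/* Only these filesystem-specific flags pass from directory to child. */
#define SK_INHERIT_MASK \
    (SK_FLAG_COMPRESS | SK_FLAG_NOCOMPRESS | SK_FLAG_NODATACOW)

/* Compute the flags a newly created file gets from its parent. */
uint32_t inherited_flags(uint32_t parent_flags)
{
    return parent_flags & SK_INHERIT_MASK;
}
```

With this mask, a directory marked append-only and compressed produces children that are compressed but not append-only.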

Btrfs: use the inode's mapping mask for allocating pages · 3b16a4e3

Committed by Josef Bacik on Sep 21, 2011

Johannes pointed out we were allocating only kernel pages for doing writes,
which is kind of a big deal if you are on 32bit and have more than a gig of ram.
So fix our allocations to use the mapping's gfp but still clear __GFP_FS so we
don't re-enter.  Thanks,
Reported-by: Johannes Weiner <jweiner@redhat.com>
Signed-off-by: Josef Bacik <josef@redhat.com>

11 Oct, 2011 2 commits

Btrfs: make sure not to defrag extents past i_size · f7f43cc8

Committed by Chris Mason on Oct 11, 2011

The btrfs file defrag code will loop through the extents and
force COW on them.  But if there is a concurrent truncate in the middle of
the defrag, it might end up defragging the same range over and over
again.

The problem is that writepage won't go through and do anything on pages
past i_size, so the cow won't happen, so the file will appear to still
be fragmented.  defrag will end up hitting the same extents again and
again.

In the worst case, the truncate can actually live lock with the defrag
because the defrag keeps creating new ordered extents which the truncate
code keeps waiting on.

The fix here is to make defrag check for i_size inside the main loop,
instead of just once before the looping starts.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: fix recursive auto-defrag · 2a0f7f57

Committed by Li Zefan on Oct 10, 2011

Follow these steps:

  # mount -o autodefrag /dev/sda7 /mnt
  # dd if=/dev/urandom of=/mnt/tmp bs=200K count=1
  # sync
  # dd if=/dev/urandom of=/mnt/tmp bs=8K count=1 conv=notrunc

and then it'll go into a loop: writeback -> defrag -> writeback ...

It's because writeback writes [8K, 200K] and then writes [0, 8K].

I tried to make writeback know if the pages are dirtied by defrag,
but the patch was a bit intrusive. Here I simply set writeback_index
when we defrag a file.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

29 Sep, 2011 1 commit

btrfs: new ioctls to do logical->inode and inode->path resolving · d7728c96

Committed by Jan Schmidt on Jul 07, 2011

These ioctls make use of the new functions initially added for scrub. They
return all inodes belonging to a logical address (BTRFS_IOC_LOGICAL_INO) and
all paths belonging to an inode (BTRFS_IOC_INO_PATHS).
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

21 Sep, 2011 1 commit

Btrfs: reserve sufficient space for ioctl clone · b6f3409b

Committed by Sage Weil on Sep 20, 2011

Fix a crash/BUG_ON in the clone ioctl due to insufficient reservation. We
need to reserve space for:

 - adjusting the old extent (possibly splitting it)
 - adding the new extent
 - updating the inode
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

18 Sep, 2011 3 commits

Btrfs: don't change inode flag of the dest clone file · dde820fb

Committed by Li Zefan on Sep 18, 2011

The dst file will have the same inode flags as the src file after
file clone, and I think that's unexpected.

For example, the dst file will suddenly become immutable after
getting some share of data with src file, if the src is immutable.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: don't make a file partly checksummed through file clone · 0e7b824c

Committed by Li Zefan on Sep 18, 2011

To reproduce the bug:

  # mount /dev/sda7 /mnt
  # dd if=/dev/zero of=/mnt/src bs=4K count=1
  # umount /mnt

  # mount -o nodatasum /dev/sda7 /mnt
  # dd if=/dev/zero of=/mnt/dst bs=4K count=1
  # clone_range -s 4K -l 4K /mnt/src /mnt/dst

  # echo 3 > /proc/sys/vm/drop_caches
  # cat /mnt/dst
  # dmesg
  ...
  btrfs no csum found for inode 258 start 0
  btrfs csum failed ino 258 off 0 csum 2566472073 private 0

It's because part of the file is checksummed and the other part is not,
and then btrfs will complain checksum is not found when we read the file.

Disallow file clone if src and dst file have different checksum flag,
so we ensure a file is completely checksummed or unchecksummed.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
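
The check described in the last paragraph can be sketched as a comparison of one flag bit on both files. This is a user-space illustration; `SK_INODE_NODATASUM`, its value, and `check_clone_csum_flags` are assumptions, not the kernel definitions.

```c
#include <errno.h>
#include <stdint.h>

/* Illustrative "no data checksum" flag; the value is an assumption. */
#define SK_INODE_NODATASUM (1u << 0)

/*
 * Refuse a clone when source and destination disagree on checksumming,
 * so that a file never ends up partly checksummed.
 */
int check_clone_csum_flags(uint32_t src_flags, uint32_t dst_flags)
{
    if ((src_flags & SK_INODE_NODATASUM) !=
        (dst_flags & SK_INODE_NODATASUM))
        return -EINVAL;
    return 0;
}
```

Other flag bits are ignored by the comparison, so two files that differ only in unrelated flags can still be cloned.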

Btrfs: fix pages truncation in btrfs_ioctl_clone() · 71ef0786

Committed by Li Zefan on Sep 18, 2011

It's a bug in commit f81c9cdc
(Btrfs: truncate pages from clone ioctl target range)

We should pass the dest range to the truncate function, but not the
src range.

Also move the function before locking extent state.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

11 Sep, 2011 2 commits

Btrfs: add dummy extent if dst offset exceeds file end in · d525e8ab

Committed by Li Zefan on Sep 11, 2011

You can see there's no file extent with range [0, 4096]. Check this by
btrfsck:

 # btrfsck /dev/sda7
 root 5 inode 258 errors 100
 ...
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: calc file extent num_bytes correctly in file clone · d72c0842

Committed by Li Zefan on Sep 11, 2011

num_bytes should be 4096 not 12288.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

17 Aug, 2011 1 commit

Btrfs: truncate pages from clone ioctl target range · f81c9cdc

Committed by Sage Weil on Aug 10, 2011

We need to truncate page cache pages for the clone ioctl target range or
else we'll confuse ourselves to no end.  If the old data was cached, we
used to still see it (until remount).  If the page was partially updated
we used to get a mix of old and new data.
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

02 Aug, 2011 1 commit

Btrfs: copy string correctly in INO_LOOKUP ioctl · 77906a50

Committed by Li Zefan on Jul 14, 2011

Memory areas [ptr, ptr+total_len] and [name, name+total_len]
may overlap, so it's wrong to use memcpy().
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
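
The general rule behind this fix is that memcpy() has undefined behaviour when source and destination overlap, while memmove() handles overlap correctly. A minimal sketch, with `shift_to_front` as a hypothetical helper:

```c
#include <string.h>

/*
 * Shift the len bytes that start at buf + offset down to the front of
 * buf. Source and destination overlap whenever offset < len, so
 * memmove() is required here.
 */
void shift_to_front(char *buf, size_t offset, size_t len)
{
    memmove(buf, buf + offset, len);
}
```

For example, shifting the six bytes of "hello" plus its terminator in the buffer "xxhello" by an offset of 2 leaves the string "hello" at the start of the buffer.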

28 Jul, 2011 2 commits

Btrfs: fix enospc problems with delalloc · 9e0baf60

Committed by Josef Bacik on Jul 15, 2011

So I had this brilliant idea to use atomic counters for outstanding and reserved
extents, but this turned out to be a bad idea.  Consider this where we have 1
outstanding extent and 1 reserved extent:

    Reserver                                  Releaser
                                              atomic_dec(outstanding) now 0
    atomic_read(outstanding)+1 get 1
    atomic_read(reserved) get 1
    don't actually reserve anything because
    they are the same
                                              atomic_cmpxchg(reserved, 1, 0)
    atomic_inc(outstanding)
    atomic_add(0, reserved)
                                              free reserved space for 1 extent

Then the reserver now has no actual space reserved for it, and when it goes to
finish the ordered IO it won't have enough space to do its allocation and you
get those lovely warnings.
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: use find_or_create_page instead of grab_cache_page · a94733d0

Committed by Josef Bacik on Jul 11, 2011

grab_cache_page will use mapping_gfp_mask(), which for all inodes is set to
GFP_HIGHUSER_MOVABLE. So instead use find_or_create_page in all cases where we
need GFP_NOFS so we don't deadlock. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

21 Jul, 2011 1 commit

get rid of useless dget_parent() in fs/btrfs/ioctl.c · 2fbe8c8a

Committed by Al Viro on Jul 16, 2011

both callers there have dentry->d_parent stabilized by the fact that
their caller had obtained dentry from lookup_one_len() and had not
dropped ->i_mutex on parent since then.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

16 Jun, 2011 1 commit

Btrfs: protect the pending_snapshots list with trans_lock · 8351583e

Committed by Josef Bacik on Jun 14, 2011

Currently there is nothing protecting the pending_snapshots list on the
transaction. We only hold the directory mutex that we are snapshotting and a
read lock on the subvol_sem, so we could race with somebody else creating a
snapshot in a different directory and end up with list corruption. So protect
this list with the trans_lock. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

11 Jun, 2011 1 commit

Btrfs: avoid stack bloat in btrfs_ioctl_fs_info() · 027ed2f0

Committed by Li Zefan on Jun 08, 2011

The size of struct btrfs_ioctl_fs_info_args is as big as 1KB, so
don't declare the variable on stack.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Reviewed-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

04 Jun, 2011 1 commit

btrfs: use btrfs_ino to access inode number · a4689d2b

Committed by David Sterba on May 31, 2011

commit 4cb5300b ("Btrfs: add mount -o auto_defrag") accesses inode
number directly while it should use the helper with the new inode
number allocator.
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

27 May, 2011 1 commit

Btrfs: add mount -o auto_defrag · 4cb5300b

Committed by Chris Mason on May 24, 2011

This will detect small random writes into files and
queue them up for an auto defrag process.  It isn't well suited to
database workloads yet, but works for smaller files such as rpm, sqlite
or bdb databases.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

24 May, 2011 5 commits

Btrfs: using rcu lock in the reader side of devices list · 1f78160c

Committed by Xiao Guangrong on Apr 20, 2011

fs_devices->devices is only updated on remove and add device paths, so we can
use rcu to protect it on the reader side.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

btrfs: Ensure the tree search ioctl returns the right number of records · e2156867

Committed by Hugo Mills on May 14, 2011

Btrfs's tree search ioctl has a field to indicate that no more than a
given number of records should be returned. The ioctl doesn't honour
this, as the tested value is not incremented until the end of the
copy_to_sk function. This patch removes an unnecessary local variable,
and updates the num_found counter as each key is found in the tree.
Signed-off-by: Hugo Mills <hugo@carfax.org.uk>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
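
The counting pattern this commit describes, where the counter is updated as each record is found so the limit check always sees a current value, can be sketched as follows. `copy_found_keys` and its integer keys are hypothetical simplifications of the real search items.

```c
#include <stddef.h>

/*
 * Copy keys that are >= min_key from in[] to out[], never returning
 * more than max_items. num_found is bumped as each key is found, so
 * the limit is tested against an up-to-date count on every iteration.
 */
size_t copy_found_keys(const int *in, size_t n, int min_key,
                       int *out, size_t max_items)
{
    size_t num_found = 0;

    for (size_t i = 0; i < n; i++) {
        if (in[i] < min_key)
            continue;              /* not a match */
        if (num_found >= max_items)
            break;                 /* honour the caller's limit */
        out[num_found++] = in[i];
    }
    return num_found;
}
```

If the counter were only updated after the loop, the limit test inside the loop would never trigger and the caller could receive more records than requested.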

Btrfs: kill BTRFS_I(inode)->block_group · d82a6f1d

Committed by Josef Bacik on May 11, 2011

Originally this was going to be used as a way to give hints to the allocator,
but frankly we can get much better hints elsewhere and it's not even used at all
for anything useful. In addition to being completely useless, when we initialize
an inode we try and find a freeish block group to set as the inode's block group,
and with a completely full 40gb fs this takes _forever_, so I imagine with say
1tb fs this is just unbearable. So just axe the thing altogether, we don't
need it and it saves us 8 bytes in the inode and saves us 500 microseconds per
inode lookup in my testcase. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

Btrfs: kill trans_mutex · a4abeea4

Committed by Josef Bacik on Apr 11, 2011

We use trans_mutex for lots of things, here's a basic list

1) To serialize trans_handles joining the currently running transaction
2) To make sure that no new trans handles are started while we are committing
3) To protect the dead_roots list and the transaction lists

Really the serializing trans_handles joining is not too hard, and can really get
bogged down in acquiring a reference to the transaction. So replace the
trans_mutex with a trans_lock spinlock and use it to do the following

1) Protect fs_info->running_transaction. All trans handles have to do is check
this, and then take a reference of the transaction and keep on going.
2) Protect the fs_info->trans_list. This doesn't get used too much, basically
it just holds the current transactions, which will usually just be the currently
committing transaction and the currently running transaction at most.
3) Protect the dead roots list. This is only ever processed by splicing the
list so this is relatively simple.
4) Protect the fs_info->reloc_ctl stuff. This is very lightweight and was using
the trans_mutex before, so this is a pretty straightforward change.
5) Protect fs_info->no_trans_join. Because we don't hold the trans_lock over
the entirety of the commit we need to have a way to block new people from
creating a new transaction while we're doing our work. So we set no_trans_join
and in join_transaction we test to see if that is set, and if it is we do a
wait_on_commit.
6) Make the transaction use count atomic so we don't need to take locks to
modify it when we're dropping references.
7) Add a commit_lock to the transaction to make sure multiple people trying to
commit the same transaction don't race and commit at the same time.
8) Make open_ioctl_trans an atomic so we don't have to take any locks for ioctl
trans.

I have tested this with xfstests, but obviously it is a pretty hairy change so
lots of testing is greatly appreciated. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

Btrfs: take away the num_items argument from btrfs_join_transaction · 7a7eaa40

Committed by Josef Bacik on Apr 13, 2011

I keep forgetting that btrfs_join_transaction() just ignores the num_items
argument, which leads me to sending pointless patches and looking stupid :). So
just kill the num_items argument from btrfs_join_transaction and
btrfs_start_ioctl_transaction, since neither of them use it. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
