Commits · af60bed24eb0e3b6d93eaa6bb395a5721e6c09a8 · openeuler / raspberrypi-kernel

24 May, 2011 6 commits

Btrfs: set range_start to the right start in count_range_bits · af60bed2

Committed by Josef Bacik on May 04, 2011

In count_range_bits we are adjusting total_bytes based on the range we are
searching for, but we don't adjust the range start according to the range we are
searching for, which makes for weird results.  For example, if the range

[0-8192]

is set DELALLOC, but I search for 4096-8192, I will get back 4096 for the number
of bytes found, but the range_start will be 0, which makes it look like the
range is [0-4096].  So instead set range_start = max(cur_start, state->start).
This makes everything come out right.  Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

af60bed2
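
The adjustment described in this commit can be illustrated with a small standalone sketch. It is not the kernel's `count_range_bits`: `struct byte_range` and `count_overlap` are hypothetical names, and range ends are assumed to be inclusive byte offsets, so the commit's example range is written as 0..8191.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an extent state: an inclusive byte range. */
struct byte_range {
	uint64_t start;
	uint64_t end;
};

static uint64_t max_u64(uint64_t a, uint64_t b) { return a > b ? a : b; }
static uint64_t min_u64(uint64_t a, uint64_t b) { return a < b ? a : b; }

/*
 * Count the bytes of `state` that fall inside [search_start, search_end]
 * and report where the counted bytes begin. Returns 0 and leaves
 * *range_start untouched when the ranges do not overlap.
 */
static uint64_t count_overlap(const struct byte_range *state,
			      uint64_t search_start, uint64_t search_end,
			      uint64_t *range_start)
{
	assert(search_start <= search_end);

	if (state->end < search_start || state->start > search_end)
		return 0;

	/* Clamp the reported start to the searched range, as the fix does. */
	*range_start = max_u64(search_start, state->start);
	return min_u64(search_end, state->end) + 1 - *range_start;
}
```

With a state covering 0..8191 and a search for 4096..8191, this returns 4096 bytes and reports a start of 4096 rather than 0.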

Btrfs: fix how we do space reservation for truncate · fcb80c2a

Committed by Josef Bacik on May 03, 2011

The ceph guys keep running into problems where we have space reserved in our
orphan block rsv when freeing it up. This is because they tend to do snapshots
a lot, so their truncates tend to use a bunch of space, so when we go to do
things like update the inode we have to steal reservation space in order to make
the reservation happen. This happens because truncate can use as much space as
it freaking feels like, but we still have to hold space for removing the orphan
item and updating the inode, which will definitely always happen. So in order
to fix this we need to split all of the reservation stuff up. So with this patch
we have

1) The orphan block reserve which only holds the space for deleting our orphan
item when everything is over.

2) The truncate block reserve which gets allocated and used specifically for the
space that the truncate will use on a per truncate basis.

3) The transaction will always have 1 item's worth of data reserved so we can
update the inode normally.

Hopefully this will make the ceph problem go away. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

fcb80c2a

Btrfs: kill trans_mutex · a4abeea4

Committed by Josef Bacik on Apr 11, 2011

We use trans_mutex for lots of things, here's a basic list

1) To serialize trans_handles joining the currently running transaction
2) To make sure that no new trans handles are started while we are committing
3) To protect the dead_roots list and the transaction lists

Really the serializing trans_handles joining is not too hard, and can really get
bogged down in acquiring a reference to the transaction. So replace the
trans_mutex with a trans_lock spinlock and use it to do the following

1) Protect fs_info->running_transaction. All trans handles have to do is check
this, and then take a reference of the transaction and keep on going.
2) Protect the fs_info->trans_list. This doesn't get used too much, basically
it just holds the current transactions, which will usually just be the currently
committing transaction and the currently running transaction at most.
3) Protect the dead roots list. This is only ever processed by splicing the
list so this is relatively simple.
4) Protect the fs_info->reloc_ctl stuff. This is very lightweight and was using
the trans_mutex before, so this is a pretty straightforward change.
5) Protect fs_info->no_trans_join. Because we don't hold the trans_lock over
the entirety of the commit we need to have a way to block new people from
creating a new transaction while we're doing our work. So we set no_trans_join
and in join_transaction we test to see if that is set, and if it is we do a
wait_on_commit.
6) Make the transaction use count atomic so we don't need to take locks to
modify it when we're dropping references.
7) Add a commit_lock to the transaction to make sure multiple people trying to
commit the same transaction don't race and commit at the same time.
8) Make open_ioctl_trans an atomic so we don't have to take any locks for ioctl
trans.

I have tested this with xfstests, but obviously it is a pretty hairy change so
lots of testing is greatly appreciated. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

a4abeea4
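
Point 6 of the list above (an atomic use count) can be sketched in userspace with C11 atomics. This is only an illustration of the idea, not the btrfs code: `struct txn`, `txn_get`, and `txn_put` are hypothetical names, and the kernel uses its own `atomic_t` helpers rather than `<stdatomic.h>`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical transaction object with a lock-free reference count. */
struct txn {
	atomic_int use_count;
};

/* Take an extra reference; the caller must already hold one. */
static void txn_get(struct txn *t)
{
	atomic_fetch_add(&t->use_count, 1);
}

/*
 * Drop a reference without taking any lock. Returns true when this call
 * released the last reference, so the caller may free the object.
 */
static bool txn_put(struct txn *t)
{
	int old = atomic_fetch_sub(&t->use_count, 1);

	assert(old > 0);
	return old == 1;
}
```

Because the decrement is a single atomic operation, dropping a reference does not need the lock that protects the transaction list.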

Btrfs: if we've already started a trans handle, use that one · 2a1eb461

Committed by Josef Bacik on Apr 13, 2011

We currently track trans handles in current->journal_info, but we don't actually
use it. This patch fixes it. This will cover the case where we have multiple
people starting transactions down the call chain. This keeps us from having to
allocate a new handle and all of that, we just increase the use count of the
current handle, save the old block_rsv, and return. I tested this with xfstests
and it worked out fine. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

2a1eb461

Btrfs: take away the num_items argument from btrfs_join_transaction · 7a7eaa40

Committed by Josef Bacik on Apr 13, 2011

I keep forgetting that btrfs_join_transaction() just ignores the num_items
argument, which leads me to sending pointless patches and looking stupid :). So
just kill the num_items argument from btrfs_join_transaction and
btrfs_start_ioctl_transaction, since neither of them uses it. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

7a7eaa40

Btrfs: make sure to use the delalloc reserve when filling delalloc · 74b21075

Committed by Josef Bacik on Apr 13, 2011

In the prealloc filling code and compressed code we don't set trans->block_rsv
to the delalloc block reserve properly, which is going to make us use metadata
from the wrong pool. This patch fixes that. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

74b21075

15 May, 2011 5 commits

Btrfs: fix FS_IOC_SETFLAGS ioctl · ebcb904d

Committed by Li Zefan on Apr 15, 2011

Steps to reproduce the bug:

  - Call FS_IOC_SETFLAGS ioctl with flags=FS_COMPR_FL
  - Call FS_IOC_SETFLAGS ioctl with flags=0
  - Call FS_IOC_GETFLAGS ioctl, and you'll see FS_COMPR_FL is still set!
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

ebcb904d
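
The reproduction steps imply that a request with the compression bit cleared must also clear the stored bit. A minimal sketch of that rule follows; it is not the btrfs ioctl code, and `REQ_COMPR`, `INODE_COMPRESS`, and `apply_compr_request` are hypothetical names with made-up bit values.

```c
#include <assert.h>
#include <stdint.h>

#define REQ_COMPR      0x4u    /* hypothetical user-visible "compress" bit */
#define INODE_COMPRESS 0x800u  /* hypothetical on-inode "compress" bit */

/*
 * Translate a SETFLAGS-style request into inode flags. The stored bit is
 * set when requested and cleared when not requested, so flags=0 really
 * switches compression off again.
 */
static uint32_t apply_compr_request(uint32_t inode_flags, uint32_t requested)
{
	/* This sketch only models the single compression flag. */
	assert((requested & ~REQ_COMPR) == 0);

	if (requested & REQ_COMPR)
		inode_flags |= INODE_COMPRESS;
	else
		inode_flags &= ~INODE_COMPRESS;
	return inode_flags;
}
```

Replaying the three steps against this function, the flag is set after the first call and gone after the second, which is the behaviour the report expects from FS_IOC_GETFLAGS.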

Btrfs: fix FS_IOC_GETFLAGS ioctl · d0092bdd

Committed by Li Zefan on Apr 15, 2011

As we've added per file compression/cow support.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

d0092bdd

fs: remove FS_COW_FL · e1e8fb6a

Committed by Li Zefan on Apr 15, 2011

FS_COW_FL and FS_NOCOW_FL were newly introduced to control per file
COW in btrfs, but FS_NOCOW_FL is sufficient.

The fact is we don't have corresponding BTRFS_INODE_COW flag.

COW is default, and FS_NOCOW_FL can be used to switch off COW for
a single file.

If we mount btrfs with nodatacow, a newly created file will be set with
the FS_NOCOW_FL flag. So to turn on COW for it, we can just clear the
FS_NOCOW_FL flag.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

e1e8fb6a

Btrfs: fix easily get into ENOSPC in mixed case · 1aba86d6

Committed by liubo on Apr 08, 2011

When a btrfs disk is created with the mixed data & metadata option, it will have no
pure data or pure metadata space info.

In btrfs's for-linus branch, commit 78b1ea13838039cd88afdd62519b40b344d6c920
(Btrfs: fix OOPS of empty filesystem after balance) initializes space infos at
the very beginning.  The problem is that this initialization does not take the mixed
case into account, which causes btrfs to easily get into ENOSPC in the mixed
case.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

1aba86d6

Prevent oopsing in posix_acl_valid() · f5de9391

Committed by Daniel J Blueman on May 03, 2011

If posix_acl_from_xattr() returns an error code, a negative address is
dereferenced causing an oops; fix by checking for error code first.
Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com>
Reviewed-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

f5de9391
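
The pattern behind this fix (test for an encoded error before dereferencing the pointer) can be shown in a userspace sketch. The helpers `err_ptr`, `ptr_err`, and `is_err` only mimic the kernel's ERR_PTR, PTR_ERR, and IS_ERR, and `struct acl`, `acl_from_buffer`, `acl_valid`, and `set_acl_from_buffer` are hypothetical stand-ins rather than the real ACL functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace imitations of the kernel's error-pointer helpers. */
static inline void *err_ptr(long error) { return (void *)(intptr_t)error; }
static inline long ptr_err(const void *ptr) { return (long)(intptr_t)ptr; }
static inline bool is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct acl { int count; };

/* Hypothetical parser: returns an ACL, NULL for "no ACL", or an encoded error. */
static struct acl *acl_from_buffer(const void *buf, size_t len,
				   struct acl *storage)
{
	assert(storage != NULL);
	if (!buf)
		return NULL;
	if (len == 0)
		return err_ptr(-EINVAL);
	storage->count = (int)len;
	return storage;
}

/* Hypothetical validity check; it dereferences its argument. */
static int acl_valid(const struct acl *acl)
{
	return acl->count > 0 ? 0 : -EINVAL;
}

static int set_acl_from_buffer(const void *buf, size_t len)
{
	struct acl storage;
	struct acl *acl = acl_from_buffer(buf, len, &storage);

	if (is_err(acl))		/* check first, never dereference an error */
		return (int)ptr_err(acl);
	if (acl)
		return acl_valid(acl);
	return 0;
}
```

Swapping the two checks in `set_acl_from_buffer` would pass the encoded error into `acl_valid`, which is the kind of dereference the commit describes.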

27 Apr, 2011 1 commit

Revert wrong fixes for common misspellings · e9c54999

Committed by Lucas De Marchi on Apr 26, 2011

These changes were incorrectly fixed by codespell. They have now been
manually corrected.
Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>

e9c54999

26 Apr, 2011 8 commits

Btrfs: cleanup error handling in inode.c · 7cf96da3

Committed by Tsutomu Itoh on Apr 25, 2011

Error handling in several places is changed so that the error number is
set only when an error occurs.
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

7cf96da3

Btrfs: put the right bio if we have an error · 64728bbb

Committed by Josef Bacik on Apr 25, 2011

In btrfs_submit_direct_hook if the first btrfs_map_block fails we need to put
the orig_bio, not bio.
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

64728bbb

Btrfs: free bitmaps properly when evicting the cache · a4f0162f

Committed by Josef Bacik on Apr 25, 2011

If our space cache is wrong, we do the right thing and free up everything that
we loaded, however we don't reset the total_bitmaps counter or the thresholds or
anything. So in btrfs_remove_free_space_cache make sure to call free_bitmap()
if it's a bitmap. This will keep us from panicking when we check to make sure we
don't have too many bitmaps. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

a4f0162f

Btrfs: Free free_space item properly in btrfs_trim_block_group() · f789b684

Committed by Li Zefan on Apr 25, 2011

Since commit dc89e982, we've changed
to use a specific slab for allocation of free_space items.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

f789b684

btrfs: add missing spin_unlock to a rare exit path · cfece4db

Committed by David Sterba on Apr 25, 2011

Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

cfece4db

Btrfs: check return value of kmalloc() · 8d413713

Committed by Tsutomu Itoh on Apr 25, 2011

Checks on the return value of kmalloc() are added in several places.
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

8d413713

btrfs: fix wrong allocating flag when reading page · 43e817a1

Committed by Itaru Kitayama on Apr 25, 2011

The space cache uses extent_readpages() to read free space information,
so we cannot use the GFP_KERNEL flag to allocate memory, or it may lead
to deadlock.
Signed-off-by: Itaru Kitayama <kitayama@cl.bb4u.ne.jp>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

43e817a1

Btrfs: fix missing mutex_unlock in btrfs_del_dir_entries_in_log() · a62f44a5

Committed by Tsutomu Itoh on Apr 25, 2011

The mutex must be unlocked before returning an error when
btrfs_alloc_path() fails.
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

a62f44a5

20 Apr, 2011 1 commit

Btrfs: do some plugging in the submit_bio threads · 211588ad

Committed by Chris Mason on Apr 19, 2011

The Btrfs submit bio threads have a small number of
threads responsible for pushing down bios we've collected
for a large number of devices.

Since we do all the bios for a single device at once,
we want to make sure we unplug and send down the bios
for each device as we're done processing them.

The new plugging API removed the btrfs code to
unplug while processing bios. This adds it back with
the new API.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

211588ad

18 Apr, 2011 1 commit

Btrfs: fix free space cache leak · f65647c2

Committed by Chris Mason on Apr 18, 2011

The free space caching code was recently reworked to
cache all the pages it needed instead of using find_get_page everywhere.

One loop was missed though, so it ended up leaking pages.  This fixes
it to use our page array instead of find_get_page.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

f65647c2

16 Apr, 2011 3 commits

Btrfs: avoid taking the chunk_mutex in do_chunk_alloc · 6d74119f

Committed by Josef Bacik on Apr 11, 2011

Every time we try to allocate disk space we try and see if we can pre-emptively
allocate a chunk, but in the common case we don't allocate anything, so there is
no sense in taking the chunk_mutex at all. So instead if we are allocating a
chunk, mark it in the space_info so we don't get two people trying to allocate
at the same time. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Reviewed-by: Liu Bo <liubo2009@cn.fujitsu.com>

6d74119f

Btrfs end_bio_extent_readpage should look for locked bits · 0d399205

Committed by Chris Mason on Apr 16, 2011

A recent commit caches the extent state in end_bio_extent_readpage,
but the search it does should look for locked extents.  This
fixes things to make it more effective.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

0d399205

Btrfs: don't force chunk allocation in find_free_extent · 0e4f8f88

Committed by Chris Mason on Apr 15, 2011

find_free_extent likes to allocate in contiguous clusters,
which makes writeback faster, especially on SSD storage.  As
the FS fragments, these clusters become harder to find and we have
to decide between allocating a new chunk to make more clusters
or giving up on the cluster to allocate from the free space
we have.

Right now it creates too many chunks, and you can end up with
a whole FS that is mostly empty metadata chunks.  This commit
changes the allocation code to be more strict and only
allocate new chunks when we've made good use of the chunks we
already have.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

0e4f8f88

13 Apr, 2011 5 commits

Btrfs: Check validity before setting an acl · 329c5056

Committed by Miao Xie on Apr 13, 2011

Call posix_acl_valid() to check if an acl is valid or not.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>

329c5056

Btrfs: Fix incorrect inode nlink in btrfs_link() · 3153495d

Committed by Miao Xie on Apr 13, 2011

Link count of the inode is not decreased if btrfs_set_inode_index()
fails.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>

3153495d

Btrfs: Check if btrfs_next_leaf() returns error in btrfs_real_readdir() · b9e03af0

Committed by Li Zefan on Mar 23, 2011

btrfs_next_leaf() can return -errno, and we should propagate
it to userspace.

This also simplifies how we walk the btree path.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>

b9e03af0

Btrfs: Check if btrfs_next_leaf() returns error in btrfs_listxattr() · 2e6a0035

Committed by Li Zefan on Mar 17, 2011

btrfs_next_leaf() can return -errno, and we should propagate
it to userspace.

This also simplifies how we walk the btree path.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>

2e6a0035

Btrfs: make uncache_state unconditional · 109b36a2

Committed by Chris Mason on Apr 12, 2011

The extent_io code can take cached pointers into the extent state trees,
and these can make lookups much faster in common operations.  The
caching only happens when specific bits are set that prevent merging
and splitting of the extent state.

A helper function was added to uncache the state, and it was testing
the same set of conditionals.  This can leak in very strange corner
cases where the lock bit goes away unexpectedly.

The uncaching should be unconditional.  Once we have a ref on the
extent we should always give it up.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

109b36a2

12 Apr, 2011 7 commits

btrfs: using cached extent_state in set/unlock combinations · 507903b8

Committed by Arne Jansen on Apr 06, 2011

In several places the sequence (set_extent_uptodate, unlock_extent) is used.
This leads to a duplicate lookup of the extent state. This patch lets
set_extent_uptodate return a cached extent_state which can be passed to
unlock_extent_cached.
The occurrences of the above sequence are updated to use the cache. In addition,
end_bio_extent_readpage now first gets a cached state, passes it to
readpage_end_io_hook as that hook's prototype requires, and later reuses it
for set/unlock.
Signed-off-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

507903b8

Btrfs: avoid taking the trans_mutex in btrfs_end_transaction · 13c5a93e

Committed by Josef Bacik on Apr 11, 2011

I've been working on making our O_DIRECT latency not suck and I noticed we were
taking the trans_mutex in btrfs_end_transaction. So to do this we convert
num_writers and use_count to atomic_t's and just decrement them in
btrfs_end_transaction. Instead of deleting the transaction from the trans list
in put_transaction we do that in btrfs_commit_transaction() since that's the
only time it actually needs to be removed from the list. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

13c5a93e

Btrfs: fix subvolume mount by name problem when default mount subvolume is set · e15d0542

Committed by Xin Zhong on Apr 06, 2011

We created two subvolumes (meego_root and meego_home) in
the btrfs root directory and set meego_root as the default mount
subvolume. After remounting btrfs, meego_root is mounted
at the top directory by default. When we then try to mount
meego_home (subvol=meego_home) on a subdirectory, it fails.
The problem is that when the default mount subvolume is set to
meego_root, we search for meego_home in meego_root but cannot find
it. So the solution is to add a new mount option (subvolrootid)
to specify the subvol id of the root and search for the subvol name in it. For
our case, we can now use "-o subvolrootid=0,subvol=meego_home"
to mount meego_home.

Detailed information can be found in the MeeGo bugzilla:
https://bugs.meego.com/show_bug.cgi?id=15055

Signed-off-by: Zhong, Xin <xin.zhong@intel.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

e15d0542

fix user annotation in ioctl.c · 13f2696f

Committed by Daniel J Blueman on Apr 11, 2011

Fix the address space annotation in ioctl.c.
Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com>

 		       BTRFS_BLOCK_GROUP_SYSTEM,
@@ -2387,7 +2387,7 @@ long btrfs_ioctl_space_info(struct btrfs_root *root, void __user *arg)
 		up_read(&info->groups_sem);
 	}

-	user_dest = (struct btrfs_ioctl_space_info *)
+	user_dest = (struct btrfs_ioctl_space_info __user *)
 		(arg + sizeof(struct btrfs_ioctl_space_args));

 	if (copy_to_user(user_dest, dest_orig, alloc_size))
Reviewed-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

13f2696f

Btrfs: check for duplicate iov_base's when doing dio reads · a1b75f7d

Committed by Josef Bacik on Apr 08, 2011

Apparently it is ok to submit a read to an IDE device with the same target page
for different offsets. This is what Windows does under qemu. The problem is
under DIO we expect them to be different buffers for checksumming reasons, and
so this sort of thing will result in checksum errors, when in reality the file
is fine. So when reading, check to make sure that all iov bases are different,
and if they aren't fall back to buffered mode, since that will work out right.
Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

a1b75f7d
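
The check described above can be sketched as a plain scan over the iovec array. This is an illustration, not the btrfs direct I/O code: `iov_bases_are_distinct` is a hypothetical name and the quadratic comparison is simply the most direct way to express the test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/uio.h>

/* Return true when no two segments share the same iov_base. */
static bool iov_bases_are_distinct(const struct iovec *iov, size_t nr_segs)
{
	assert(iov != NULL || nr_segs == 0);

	for (size_t i = 0; i < nr_segs; i++) {
		for (size_t j = i + 1; j < nr_segs; j++) {
			if (iov[i].iov_base == iov[j].iov_base)
				return false;
		}
	}
	return true;
}
```

A reader would call this before starting a direct read and fall back to buffered I/O when it returns false, as the commit message describes.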

btrfs: properly handle overlapping areas in memmove_extent_buffer · 3387206f

Committed by Sergei Trofimovich on Apr 11, 2011

Fix data corruption caused by memcpy() usage on overlapping data.
I first observed it when I found that User-mode Linux crashes on btrfs.

The call chain is the following:
------------[ cut here ]------------
WARNING: at /home/slyfox/linux-2.6/fs/btrfs/extent_io.c:3900 memcpy_extent_buffer+0x1a5/0x219()
Call Trace:
6fa39a58:  [<601b495e>] _raw_spin_unlock_irqrestore+0x18/0x1c
6fa39a68:  [<60029ad9>] warn_slowpath_common+0x59/0x70
6fa39aa8:  [<60029b05>] warn_slowpath_null+0x15/0x17
6fa39ab8:  [<600efc97>] memcpy_extent_buffer+0x1a5/0x219
6fa39b48:  [<600efd9f>] memmove_extent_buffer+0x94/0x208
6fa39bc8:  [<600becbf>] btrfs_del_items+0x214/0x473
6fa39c78:  [<600ce1b0>] btrfs_delete_one_dir_name+0x7c/0xda
6fa39cc8:  [<600dad6b>] __btrfs_unlink_inode+0xad/0x25d
6fa39d08:  [<600d7864>] btrfs_start_transaction+0xe/0x10
6fa39d48:  [<600dc9ff>] btrfs_unlink_inode+0x1b/0x3b
6fa39d78:  [<600e04bc>] btrfs_unlink+0x70/0xef
6fa39dc8:  [<6007f0d0>] vfs_unlink+0x58/0xa3
6fa39df8:  [<60080278>] do_unlinkat+0xd4/0x162
6fa39e48:  [<600517db>] call_rcu_sched+0xe/0x10
6fa39e58:  [<600452a8>] __put_cred+0x58/0x5a
6fa39e78:  [<6007446c>] sys_faccessat+0x154/0x166
6fa39ed8:  [<60080317>] sys_unlink+0x11/0x13
6fa39ee8:  [<60016b80>] handle_syscall+0x58/0x70
6fa39f08:  [<60021377>] userspace+0x2d4/0x381
6fa39fc8:  [<60014507>] fork_handler+0x62/0x69
---[ end trace 70b0ca2ef0266b93 ]---

http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg09302.html

Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
Reviewed-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

3387206f
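
The underlying C rule is that memcpy() is undefined when source and destination overlap, while memmove() handles overlap correctly. A minimal userspace sketch follows; `shift_within` is a hypothetical helper and does not reproduce the page-by-page logic of memmove_extent_buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Move `len` bytes inside one buffer from src_off to dst_off. */
static void shift_within(char *buf, size_t buf_len,
			 size_t dst_off, size_t src_off, size_t len)
{
	/* Both ranges must lie inside the buffer. */
	assert(len <= buf_len);
	assert(dst_off <= buf_len - len && src_off <= buf_len - len);

	/* The ranges may overlap, so memcpy() would be undefined here. */
	memmove(buf + dst_off, buf + src_off, len);
}
```

Shifting the first four bytes of "abcdef" forward by two positions yields "ababcd", and shifting the last four back to offset 0 yields "cdefef"; both cases overlap.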

Btrfs: fix memory leaks in btrfs_new_inode() · 8fb27640

Committed by Yoshinori Sano on Apr 09, 2011

This patch fixes memory leaks in btrfs_new_inode().
Signed-off-by: Yoshinori Sano <yoshinori.sano@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

8fb27640

09 Apr, 2011 3 commits

Btrfs: check for duplicate iov_base's when doing dio reads · 93a54bc4

Committed by Josef Bacik on Apr 06, 2011

93a54bc4

Btrfs: reuse the extent_map we found when calling btrfs_get_extent · 16d299ac

Committed by Josef Bacik on Apr 06, 2011

In btrfs_get_block_direct we call btrfs_get_extent to lookup the extent for the
range that we are looking for. If we don't find an extent, btrfs_get_extent
will insert an extent_map for that area and mark it as a hole. So it does the
job of allocating a new extent map and inserting it into the io tree. But if
we're creating a new extent we free it up and redo all of that work. So instead
pass the em to btrfs_new_extent_direct(), and if it will work just allocate the
disk space and set it up properly and bypass the freeing/allocating of a new
extent map and the expensive operation of inserting the thing into the io_tree.
Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

16d299ac

Btrfs: do not use async submit for small DIO io's · 1ae39938

Committed by Josef Bacik on Apr 06, 2011

When looking at our DIO performance Chris said that for small IO's doing the
async submit stuff tends to be more overhead than it's worth. With this on top
of my other fixes I get about a 17-20% speedup doing a sequential dd with 4k
IO's. Basically if we don't have to split the bio for the map length it's small
enough to be directly submitted, otherwise go back to the async submit. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

1ae39938