Commits · ead432756ab2c76b1f1de742a1c8a06992cb98eb · openanolis / cloud-kernel

23 Jun, 2014 2 commits

f2fs: recover fallocated data and its i_size together · ead43275

Authored by Jaegeuk Kim on Jun 13, 2014

This patch arranges the f2fs_locks to cover the fallocated data and its i_size.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix to report newly allocate region as extent · ccfb3000

Authored by Jaegeuk Kim on Jun 13, 2014

The previous get_block in f2fs didn't report a newly allocated region that has
NEW_ADDR.
For readers it should not be reported, but fiemap needs it.
So this patch introduces two get_block functions that share a core function.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

20 Jun, 2014 9 commits

Btrfs: fix wrong error handle when the device is missing or is not writeable · 8408c716

Authored by Miao Xie on Jun 19, 2014

The original bio might be submitted, so we should increase bi_remaining to
account for it when we handle the error that the device is missing or
is not writeable; otherwise we would skip the endio handler.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>

Btrfs: fix deadlock when mounting a degraded fs · c55f1396

Authored by Miao Xie on Jun 19, 2014

The deadlock happened when mounting a degraded filesystem. The steps to
reproduce are as follows:
 # mkfs.btrfs -f -m raid1 -d raid1 <dev0> <dev1>
 # echo 1 > /sys/block/`basename <dev0>`/device/delete
 # mount -o degraded <dev1> <mnt>

The reason was that the counter -- bi_remaining was wrong. If the missing
or unwriteable device was the last device in the mapping array, we would
not submit the original bio, so we shouldn't increase bi_remaining of it
in btrfs_end_bio(), or we would skip the final endio handle.

Fix this problem by adding a flag to the btrfs bio structure. If we submit
the original bio, we set the flag and increase the bi_remaining counter;
otherwise we do neither.

There is another way to fix it, namely decreasing the bi_remaining counter of
the original bio once we are sure the original bio is not submitted, but that
method needs more checks and is error-prone.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>

Btrfs: use bio_endio_nodec instead of open code · e990f167

Authored by Miao Xie on Jun 19, 2014

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>

Btrfs: fix NULL pointer crash when running balance and scrub concurrently · 298a8f9c

Authored by Wang Shilong on Jun 19, 2014

While running balance, scrub, fsstress concurrently we hit the
following kernel crash:

[56561.448845] BTRFS info (device sde): relocating block group 11005853696 flags 132
[56561.524077] BUG: unable to handle kernel NULL pointer dereference at 0000000000000078
[56561.524237] IP: [<ffffffffa038956d>] scrub_chunk.isra.12+0xdd/0x130 [btrfs]
[56561.524297] PGD 9be28067 PUD 7f3dd067 PMD 0
[56561.524325] Oops: 0000 [#1] SMP
[....]
[56561.527237] Call Trace:
[56561.527309]  [<ffffffffa038980e>] scrub_enumerate_chunks+0x24e/0x490 [btrfs]
[56561.527392]  [<ffffffff810abe00>] ? abort_exclusive_wait+0x50/0xb0
[56561.527476]  [<ffffffffa038add4>] btrfs_scrub_dev+0x1a4/0x530 [btrfs]
[56561.527561]  [<ffffffffa0368107>] btrfs_ioctl+0x13f7/0x2a90 [btrfs]
[56561.527639]  [<ffffffff811c82f0>] do_vfs_ioctl+0x2e0/0x4c0
[56561.527712]  [<ffffffff8109c384>] ? vtime_account_user+0x54/0x60
[56561.527788]  [<ffffffff810f768c>] ? __audit_syscall_entry+0x9c/0xf0
[56561.527870]  [<ffffffff811c8551>] SyS_ioctl+0x81/0xa0
[56561.527941]  [<ffffffff815707f7>] tracesys+0xdd/0xe2
[...]
[56561.528304] RIP  [<ffffffffa038956d>] scrub_chunk.isra.12+0xdd/0x130 [btrfs]
[56561.528395]  RSP <ffff88004c0f5be8>
[56561.528454] CR2: 0000000000000078

This is because in btrfs_relocate_chunk() we free @bdev directly while
scrub may still hold the extent mapping, so scrub may access freed memory.

Fix this problem by moving the freeing of @bdev into free_extent_map(), which
is reference counted.
Reported-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>

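The reference-counting idea behind this fix can be illustrated with a small userspace model. This is only a sketch in Python, not the btrfs code: the class `ExtentMap`, its `payload` field, and the `get`/`put` method names are hypothetical stand-ins. The point it tries to show is that the owned data is released by the last `put()`, so a holder that still has a reference never sees freed data.

```python
class ExtentMap:
    """Toy stand-in for a reference-counted mapping that owns a payload."""

    def __init__(self, payload):
        self.refs = 1            # the creator holds the first reference
        self.payload = payload   # stands in for the data scrub dereferences
        self.freed = False

    def get(self):
        """Take an extra reference."""
        self.refs += 1
        return self

    def put(self):
        """Drop one reference; release the payload only when none remain."""
        assert self.refs > 0, "put() without a matching get()"
        self.refs -= 1
        if self.refs == 0:
            self.payload = None
            self.freed = True


em = ExtentMap(payload="device info")
scrub_ref = em.get()     # scrub takes its own reference
em.put()                 # relocation drops its reference
still_alive = (not em.freed) and scrub_ref.payload == "device info"
scrub_ref.put()          # scrub is done; only now is the payload released
```

In this model, freeing the payload directly in the relocation path would correspond to clearing `payload` before `scrub_ref.put()`, which is the situation the commit message describes.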

btrfs: Skip scrubbing removed chunks to avoid -ENOENT. · ced96edc

Authored by Qu Wenruo on Jun 19, 2014

When running scrub together with balance, -ENOENT is sometimes returned,
because scrub_enumerate_chunks() searches for the dev_extent in *COMMIT_ROOT*,
while btrfs_lookup_block_group() searches for the block group in *MEMORY*. So
if a chunk is removed but the removal is not yet committed, -ENOENT is returned.

However, there is no need to stop scrubbing since other chunks may be
scrubbed without problem.

So this patch changes the behavior to skip removed chunks and continue
to scrub the rest.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>

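The skip-and-continue behaviour described here can be sketched as a simple loop. This is a hedged Python model, not the btrfs implementation: `scrub_chunks`, `committed_chunks`, `in_memory_groups`, and `scrub_one` are hypothetical names, with a list standing in for the committed view and a set for the in-memory view.

```python
def scrub_chunks(committed_chunks, in_memory_groups, scrub_one):
    """Scrub every chunk from the committed view, skipping removed ones.

    A chunk that is present in the committed view but missing from the
    in-memory view models "removed but not yet committed".
    """
    scrubbed, skipped = [], []
    for chunk in committed_chunks:
        if chunk not in in_memory_groups:
            # Previously this lookup failure ended the whole scrub with -ENOENT.
            skipped.append(chunk)
            continue
        scrub_one(chunk)
        scrubbed.append(chunk)
    return scrubbed, skipped


visited = []
scrubbed, skipped = scrub_chunks([1, 2, 3], {1, 3}, visited.append)
```

Here chunk 2 is skipped, while chunks 1 and 3 are still scrubbed.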

Btrfs: fix broken free space cache after the system crashed · e570fd27

Authored by Miao Xie on Jun 19, 2014

When we mounted the filesystem after the crash, we got the following
message:
  BTRFS error (device xxx): block group xxxx has wrong amount of free space
  BTRFS error (device xxx): failed to load free space cache for block group xxx

This is because we didn't update the metadata of the allocated space (in the
extent tree) until the file data was written to disk. During this time, there
was no information about the allocated space in either the extent tree or the
free space cache. When we wrote out the free space cache at this time (commit
transaction), that space was lost. In fact, only the free space that is
used to store file data had this problem; the rest didn't, because
its metadata is updated in the same transaction context.

There are several ways to fix the above problem:
- track the allocated space, and write it out when we write out the free
  space cache
- account for the size of the allocated space that is used to store the file
  data; if the size is not zero, don't write out the free space cache.

The first one is complex and may hurt performance.
This patch chooses the second method: we use a per-block-group variable to
account for the size of that allocated space. Besides that, we also introduce
a per-block-group read-write semaphore to avoid the race between
allocation and the free space cache write out.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>

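The second method (a per-block-group counter that vetoes the cache write out) can be modelled in a few lines. This is a sketch under stated assumptions rather than the actual patch: `BlockGroup`, `pending_data_bytes`, and the method names are hypothetical, and the read-write semaphore that guards against the allocation race is left out for brevity.

```python
class BlockGroup:
    """Toy block group that tracks data space whose metadata is still pending."""

    def __init__(self):
        self.pending_data_bytes = 0

    def allocate_for_data(self, nbytes):
        # Space handed out for file data; the extent tree is not updated yet.
        self.pending_data_bytes += nbytes

    def data_written(self, nbytes):
        # File data reached disk and the extent tree now records the space.
        self.pending_data_bytes -= nbytes

    def may_write_free_space_cache(self):
        # Writing the cache while space is pending would lose track of it.
        return self.pending_data_bytes == 0


bg = BlockGroup()
bg.allocate_for_data(4096)
blocked_while_pending = not bg.may_write_free_space_cache()
bg.data_written(4096)
allowed_afterwards = bg.may_write_free_space_cache()
```

The design trades a possibly stale (skipped) cache for correctness: a block group with pending data allocations simply does not get its cache written in that transaction.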

Btrfs: make free space cache write out functions more readable · 5349d6c3

Authored by Miao Xie on Jun 19, 2014

This patch makes the free space cache write out functions more readable,
and besides that, it also reduces the stack space that the function
__btrfs_write_out_cache uses from 194 bytes to 144 bytes.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>

Btrfs: remove unused wait queue in struct extent_buffer · 46fefe41

Authored by Filipe Manana on Jun 16, 2014

The lock_wq wait queue is not used anywhere, therefore just remove it.
On an x86_64 system, this reduced sizeof(struct extent_buffer) from 320
bytes down to 296 bytes, which means a 4 KiB page can now be used for
13 extent buffers instead of 12.
Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>

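The 12-versus-13 figure follows from integer division of the page size by the structure size. A minimal check in Python, assuming the objects are packed back to back with no per-object allocator overhead (the helper name `buffers_per_page` is made up for this example):

```python
PAGE_SIZE = 4096  # bytes in a 4 KiB page


def buffers_per_page(struct_size, page_size=PAGE_SIZE):
    """How many whole structures of struct_size bytes fit in one page."""
    return page_size // struct_size


before = buffers_per_page(320)   # 12 * 320 = 3840 <= 4096 < 13 * 320 = 4160
after = buffers_per_page(296)    # 13 * 296 = 3848 <= 4096 < 14 * 296 = 4144
```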

Btrfs: fix deadlocks with trylock on tree nodes · ea4ebde0

Authored by Chris Mason on Jun 19, 2014

The Btrfs tree trylock function is poorly named.  It always takes
the spinlock and backs off if the blocking lock is held.  This
can lead to surprising lockups because people expect it to really be a
trylock.

This commit makes it a pure trylock, both for the spinlock and the
blocking lock.  It also reworks the nested lock handling slightly to
avoid taking the read lock while a spinning write lock might be held.
Signed-off-by: Chris Mason <clm@fb.com>

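What "a pure trylock" means can be shown with a small userspace analogy. The sketch below is not the btrfs locking code: `TreeLock`, `blocking_writers`, and the method names are hypothetical, and Python's `threading.Lock` stands in for the spinlock. The property being illustrated is that the try operation never waits on either level; it either returns with the lock held or fails immediately.

```python
import threading


class TreeLock:
    """Toy two-level lock: a short-lived lock plus a 'blocking lock held' count."""

    def __init__(self):
        self._spin = threading.Lock()
        self.blocking_writers = 0   # nonzero while a blocking lock is held

    def try_write_lock(self):
        """Return True with the lock held, or False without ever waiting."""
        if self.blocking_writers:
            return False
        if not self._spin.acquire(blocking=False):
            return False            # someone else holds it: do not spin
        if self.blocking_writers:
            self._spin.release()    # a blocking holder appeared: back off
            return False
        return True

    def write_unlock(self):
        self._spin.release()


lock = TreeLock()
first = lock.try_write_lock()    # succeeds, lock is now held
second = lock.try_write_lock()   # fails at once instead of waiting
lock.write_unlock()
```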

18 Jun, 2014 2 commits

NFSD: fix bug for readdir of pseudofs · f41c5ad2

Authored by Kinglong Mee on Jun 13, 2014

Commit 561f0ed4 (nfsd4: allow large readdirs) introduced a bug in
readdir of the pseudofs root.

Call xdr_truncate_encode() to revert the encoded name when skipping an entry.
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

NFSD: Don't hand out delegations for 30 seconds after recalling them. · 6282cd56

Authored by NeilBrown on Jun 04, 2014

If nfsd needs to recall a delegation for some reason it implies that there is
contention on the file, so further delegations should not be handed out.

The current code fails to do so, and the result is effectively a
live-lock under some workloads: a client attempting a conflicting
operation on a read-delegated file receives NFS4ERR_DELAY and retries
the operation, but by the time it retries the server may already have
given out another delegation.

We could simply avoid delegations for (say) 30 seconds after any recall, but
this is probably too heavy handed.

We could keep a list of inodes (or inode numbers or filehandles) for recalled
delegations, but that requires memory allocation and searching.

The approach taken here is to use a bloom filter to record the filehandles
which are currently blocked from delegation, and to accept the cost of a few
false positives.

We have 2 bloom filters, each of which is valid for 30 seconds.   When a
delegation is recalled the filehandle is added to one filter and will remain
disabled for between 30 and 60 seconds.

We keep a count of the number of filehandles that have been added, so when
that count is zero we can bypass all other tests.

The bloom filters have 256 bits and 3 hash functions.  This should allow a
couple of dozen blocked  filehandles with minimal false positives.  If many
more filehandles are all blocked at once, behaviour will degrade towards
rejecting all delegations for between 30 and 60 seconds, then resetting and
allowing new delegations.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

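The two-filter scheme described above can be sketched as follows. This is an illustrative Python model, not the nfsd code: the class and method names are hypothetical, SHA-256 is an assumed stand-in for whatever hash the kernel uses, and rotation is driven by the caller-supplied `now` value instead of a kernel clock. The 256 bits, 3 hash functions, and 30-second period come from the commit message.

```python
import hashlib

FILTER_BITS = 256   # bits per filter, as stated in the commit message
NUM_HASHES = 3      # hash functions per filter, as stated in the commit message
PERIOD = 30         # seconds each filter stays valid


class BlockedDelegations:
    """Two rotating Bloom filters recording recently recalled filehandles."""

    def __init__(self):
        self.filters = [0, 0]   # each filter is a 256-bit integer
        self.counts = [0, 0]    # filehandles added to each filter
        self.current = 0        # filter that receives new entries
        self.swap_time = 0      # time of the last rotation

    def _mask(self, filehandle):
        # Assumption: derive three 8-bit indexes from a SHA-256 digest.
        digest = hashlib.sha256(filehandle).digest()
        mask = 0
        for i in range(NUM_HASHES):
            mask |= 1 << (digest[i] % FILTER_BITS)
        return mask

    def _rotate(self, now):
        elapsed = now - self.swap_time
        if elapsed < PERIOD:
            return
        # One period elapsed: retire the older filter. Two or more: retire both.
        for _ in range(2 if elapsed >= 2 * PERIOD else 1):
            older = 1 - self.current
            self.filters[older] = 0
            self.counts[older] = 0
            self.current = older
        self.swap_time = now

    def block(self, filehandle, now):
        if sum(self.counts) == 0:
            self.swap_time = now   # first entry starts a fresh period
        self._rotate(now)
        self.filters[self.current] |= self._mask(filehandle)
        self.counts[self.current] += 1

    def is_blocked(self, filehandle, now):
        if sum(self.counts) == 0:
            return False           # nothing recorded: bypass all other tests
        self._rotate(now)
        mask = self._mask(filehandle)
        return any((f & mask) == mask for f in self.filters)
```

In this model a filehandle blocked at time 5 is still reported as blocked at time 36 (it sits in the older filter after one rotation) and is forgotten once that filter is retired in a later rotation. A different filehandle may occasionally be reported as blocked too; that is the false-positive cost the commit message accepts.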

17 Jun, 2014 1 commit

epoll: fix use-after-free in eventpoll_release_file · ebe06187

Authored by Konstantin Khlebnikov on Jun 17, 2014

This fixes use-after-free of epi->fllink.next inside list loop macro.
This loop actually releases elements in the body.  The list is
rcu-protected but here we cannot hold rcu_read_lock because we need to
lock mutex inside.

The obvious solution is to use list_for_each_entry_safe().  RCU-ness
isn't essential because nobody can change this list under us, it's final
fput for this file.

The bug was introduced by ae10b2b4 ("epoll: optimize EPOLL_CTL_DEL
using rcu")
Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Reported-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Stable <stable@vger.kernel.org> # 3.13+
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Jason Baron <jbaron@akamai.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

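The difference between a plain list walk and a "safe" walk that releases elements in the loop body can be shown with a toy linked list. This Python sketch only models the pattern behind list_for_each_entry_safe(); all names in it are hypothetical. In C, reading the link of a released element is a use-after-free; in this model the release clears the link, so the unsafe walk visibly stops early instead.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None


def build(values):
    """Build a singly linked list and return its head (or None)."""
    head = None
    for value in reversed(values):
        node = Node(value)
        node.next = head
        head = node
    return head


def release(node, log):
    """Model freeing a node: record it, then clobber its link."""
    log.append(node.value)
    node.next = None


def release_all_safe(head):
    log = []
    node = head
    while node is not None:
        successor = node.next   # saved before the body releases the node
        release(node, log)
        node = successor
    return log


def release_all_unsafe(head):
    log = []
    node = head
    while node is not None:
        release(node, log)
        node = node.next        # reads the link after release: walk goes wrong
    return log


safe_result = release_all_safe(build([1, 2, 3]))
unsafe_result = release_all_unsafe(build([1, 2, 3]))
```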

14 Jun, 2014 7 commits

btrfs: fix error handling in create_pending_snapshot · 47a306a7

Authored by Eric Sandeen on Jun 12, 2014

fcebe456 cut and pasted some code to a later point
in create_pending_snapshot(), but didn't switch
to the appropriate error handling for this stage
of the function.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Chris Mason <clm@fb.com>

btrfs: fix use of uninit "ret" in end_extent_writepage() · 3e2426bd

Authored by Eric Sandeen on Jun 12, 2014

If this condition in end_extent_writepage() is false:

	if (tree->ops && tree->ops->writepage_end_io_hook)

we will then test an uninitialized "ret" at:

	ret = ret < 0 ? ret : -EIO;

The test for ret is for the case where ->writepage_end_io_hook
failed, and we'd choose that ret as the error; but if
there is no ->writepage_end_io_hook, nothing sets ret.

Initializing ret to 0 should be sufficient; if
writepage_end_io_hook wasn't set, (!uptodate) means
non-zero err was passed in, so we choose -EIO in that case.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Chris Mason <clm@fb.com>

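The error-selection logic discussed in this message can be modelled in Python to show why starting from `ret = 0` is enough. This is a simplified sketch, not the kernel function: `end_writepage_error` and its `hook` parameter are hypothetical, and the model returns the chosen error code so it can be inspected.

```python
import errno


def end_writepage_error(err, hook=None):
    """Return the error chosen for a page, or 0 when the write succeeded."""
    uptodate = (err == 0)
    ret = 0                      # the fix: start from a defined value
    if hook is not None:
        ret = hook(uptodate)     # stands in for ->writepage_end_io_hook
        if ret:
            uptodate = False
    if not uptodate:
        # Prefer the hook's negative error; otherwise fall back to -EIO.
        ret = ret if ret < 0 else -errno.EIO
    return ret


no_hook_failure = end_writepage_error(-errno.EIO)
hook_failure = end_writepage_error(0, hook=lambda uptodate: -errno.ENOSPC)
clean_write = end_writepage_error(0, hook=lambda uptodate: 0)
```

With no hook and a non-zero `err`, `ret` is still 0 when it is tested, so the fallback -EIO is selected, which matches the reasoning in the message.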

btrfs: free ulist in qgroup_shared_accounting() error path · d7372780

Authored by Eric Sandeen on Jun 12, 2014

If tmp = ulist_alloc(GFP_NOFS) fails, we return without
freeing the previously allocated qgroups = ulist_alloc(GFP_NOFS)
and cause a memory leak.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Chris Mason <clm@fb.com>

Btrfs: fix qgroups sanity test crash or hang · b050f9f6

Authored by Filipe Manana on Jun 12, 2014

Often when running the qgroups sanity test, a crash or a hang happened.
This is because the extent buffer the test uses for the root node doesn't
have a header level explicitly set, making it have a random level value.
This is a problem when it's not zero for the btrfs_search_slot() calls
the test ends up doing, resulting in crashes or hangs such as the following:

[ 6454.127192] Btrfs loaded, debug=on, assert=on, integrity-checker=on
(...)
[ 6454.127760] BTRFS: selftest: Running qgroup tests
[ 6454.127964] BTRFS: selftest: Running test_test_no_shared_qgroup
[ 6454.127966] BTRFS: selftest: Qgroup basic add
[ 6480.152005] BUG: soft lockup - CPU#0 stuck for 23s! [modprobe:5383]
[ 6480.152005] Modules linked in: btrfs(+) xor raid6_pq binfmt_misc nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc i2c_piix4 i2c_core pcspkr evbug psmouse serio_raw e1000 [last unloaded: btrfs]
[ 6480.152005] irq event stamp: 188448
[ 6480.152005] hardirqs last  enabled at (188447): [<ffffffff8168ef5c>] restore_args+0x0/0x30
[ 6480.152005] hardirqs last disabled at (188448): [<ffffffff81698e6a>] apic_timer_interrupt+0x6a/0x80
[ 6480.152005] softirqs last  enabled at (188446): [<ffffffff810516cf>] __do_softirq+0x1cf/0x450
[ 6480.152005] softirqs last disabled at (188441): [<ffffffff81051c25>] irq_exit+0xb5/0xc0
[ 6480.152005] CPU: 0 PID: 5383 Comm: modprobe Not tainted 3.15.0-rc8-fdm-btrfs-next-33+ #4
[ 6480.152005] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 6480.152005] task: ffff8802146125a0 ti: ffff8800d0d00000 task.ti: ffff8800d0d00000
[ 6480.152005] RIP: 0010:[<ffffffff81349a63>]  [<ffffffff81349a63>] __write_lock_failed+0x13/0x20
[ 6480.152005] RSP: 0018:ffff8800d0d038e8  EFLAGS: 00000287
[ 6480.152005] RAX: 0000000000000000 RBX: ffffffff8168ef5c RCX: 000005deb8525852
[ 6480.152005] RDX: 0000000000000000 RSI: 0000000000001d45 RDI: ffff8802105000b8
[ 6480.152005] RBP: ffff8800d0d038e8 R08: fffffe12710f63db R09: ffffffffa03196fb
[ 6480.152005] R10: ffff8802146125a0 R11: ffff880214612e28 R12: ffff8800d0d03858
[ 6480.152005] R13: 0000000000000000 R14: ffff8800d0d00000 R15: ffff8802146125a0
[ 6480.152005] FS:  00007f14ff804700(0000) GS:ffff880215e00000(0000) knlGS:0000000000000000
[ 6480.152005] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 6480.152005] CR2: 00007fff4df0dac8 CR3: 00000000d1796000 CR4: 00000000000006f0
[ 6480.152005] Stack:
[ 6480.152005]  ffff8800d0d03908 ffffffff810ae967 0000000000000001 ffff8802105000b8
[ 6480.152005]  ffff8800d0d03938 ffffffff8168e57e ffffffffa0319c16 0000000000000007
[ 6480.152005]  ffff880210500000 ffff880210500100 ffff8800d0d039b8 ffffffffa0319c16
[ 6480.152005] Call Trace:
[ 6480.152005]  [<ffffffff810ae967>] do_raw_write_lock+0x47/0xa0
[ 6480.152005]  [<ffffffff8168e57e>] _raw_write_lock+0x5e/0x80
[ 6480.152005]  [<ffffffffa0319c16>] ? btrfs_tree_lock+0x116/0x270 [btrfs]
[ 6480.152005]  [<ffffffffa0319c16>] btrfs_tree_lock+0x116/0x270 [btrfs]
[ 6480.152005]  [<ffffffffa02b2acb>] btrfs_lock_root_node+0x3b/0x50 [btrfs]
[ 6480.152005]  [<ffffffffa02b81a6>] btrfs_search_slot+0x916/0xa20 [btrfs]
[ 6480.152005]  [<ffffffff811a727f>] ? create_object+0x23f/0x300
[ 6480.152005]  [<ffffffffa02b9958>] btrfs_insert_empty_items+0x78/0xd0 [btrfs]
[ 6480.152005]  [<ffffffffa036041a>] insert_normal_tree_ref.constprop.4+0xa2/0x19a [btrfs]
[ 6480.152005]  [<ffffffffa03605c3>] test_no_shared_qgroup+0xb1/0x1ca [btrfs]
[ 6480.152005]  [<ffffffff8108cad6>] ? local_clock+0x16/0x30
[ 6480.152005]  [<ffffffffa035ef8e>] btrfs_test_qgroups+0x1ae/0x1d7 [btrfs]
[ 6480.152005]  [<ffffffffa03a69d2>] ? ftrace_define_fields_btrfs_space_reservation+0xfd/0xfd [btrfs]
[ 6480.152005]  [<ffffffffa03a6a86>] init_btrfs_fs+0xb4/0x153 [btrfs]
[ 6480.152005]  [<ffffffff81000352>] do_one_initcall+0x102/0x150
[ 6480.152005]  [<ffffffff8103d223>] ? set_memory_nx+0x43/0x50
[ 6480.152005]  [<ffffffff81682668>] ? set_section_ro_nx+0x6d/0x74
[ 6480.152005]  [<ffffffff810d91cc>] load_module+0x1cdc/0x2630
(...)

Therefore initialize the extent buffer as an empty leaf (level 0).

The issue is easy to reproduce when btrfs is built as a module, via:

    $ for ((i = 1; i <= 1000000; i++)); do rmmod btrfs; modprobe btrfs; done
Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>

btrfs: prevent RCU warning when dereferencing radix tree slot · f1e3c289

Authored by Sasha Levin on Jun 11, 2014

Mark the dereference as protected by lock. Not doing so triggers
an RCU warning since the radix tree assumed that RCU is in use.
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>

Btrfs: fix unfinished readahead thread for raid5/6 degraded mounting · 5fbc7c59

Authored by Wang Shilong on Jun 11, 2014

Steps to reproduce:

 # mkfs.btrfs -f /dev/sd[b-f] -m raid5 -d raid5
 # mkfs.ext4 /dev/sdc --->corrupt one of btrfs device
 # mount /dev/sdb /mnt -o degraded
 # btrfs scrub start -BRd /mnt

This is because readahead would skip a missing device, which is not correct
for RAID5/6, because REQ_GET_READ_MIRRORS returns 1 for RAID5/6 block
mapping. If the expected data is located on the missing device, the readahead
thread would not call __readahead_hook(), which makes the wait for the event
@rc->elems=0 last forever.

Fix this problem by checking the return value of btrfs_map_block(); we
can only skip a missing device safely if there are several mirrors.
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>

btrfs: new ioctl TREE_SEARCH_V2 · cc68a8a5

Authored by Gerhard Heift on Jan 30, 2014

This new ioctl call allows the user to supply a buffer of varying size in which
a tree search can store its results. This is much more flexible if you want to
receive items which are larger than the current fixed buffer of 3992 bytes or
if you want to fetch more items at once. For example, some items of type
EXTENT_CSUM are larger than this buffer.
Signed-off-by: Gerhard Heift <Gerhard@Heift.Name>
Signed-off-by: Chris Mason <clm@fb.com>
Acked-by: David Sterba <dsterba@suse.cz>

13 Jun, 2014 6 commits

btrfs: tree_search, search_ioctl: direct copy to userspace · ba346b35

Authored by Gerhard Heift on Jan 30, 2014

By copying each found item separately to userspace, we do not need an extra
buffer in the kernel.
Signed-off-by: Gerhard Heift <Gerhard@Heift.Name>
Signed-off-by: Chris Mason <clm@fb.com>
Acked-by: David Sterba <dsterba@suse.cz>

btrfs: new function read_extent_buffer_to_user · 550ac1d8

Authored by Gerhard Heift on Jan 30, 2014

This new function reads the content of an extent directly to user memory.
Signed-off-by: Gerhard Heift <Gerhard@Heift.Name>
Signed-off-by: Chris Mason <clm@fb.com>
Acked-by: David Sterba <dsterba@suse.cz>

btrfs: tree_search, copy_to_sk: return needed size on EOVERFLOW · 9b6e817d

Authored by Gerhard Heift on Jan 30, 2014

If an item in tree_search is too large to be stored in the given buffer, return
the needed size (including the header).
Signed-off-by: Gerhard Heift <Gerhard@Heift.Name>
Signed-off-by: Chris Mason <clm@fb.com>
Acked-by: David Sterba <dsterba@suse.cz>

btrfs: tree_search, copy_to_sk: return EOVERFLOW for too small buffer · 8f5f6178

Authored by Gerhard Heift on Jan 30, 2014

In copy_to_sk, if an item is too large for the given buffer, it now returns
-EOVERFLOW instead of copying a search_header with len = 0. For backward
compatibility, for the first item it still copies such a header to the buffer,
but not any following items, which could have fitted.

tree_search changes -EOVERFLOW back to 0 to behave similarly to the way it
behaved before this patch.
Signed-off-by: Gerhard Heift <Gerhard@Heift.Name>
Signed-off-by: Chris Mason <clm@fb.com>
Acked-by: David Sterba <dsterba@suse.cz>

btrfs: tree_search, search_ioctl: accept varying buffer · 12544442

Authored by Gerhard Heift on Jan 30, 2014

Rewrite search_ioctl to accept a buffer of varying size.
Signed-off-by: Gerhard Heift <Gerhard@Heift.Name>
Signed-off-by: Chris Mason <clm@fb.com>
Acked-by: David Sterba <dsterba@suse.cz>

btrfs: tree_search: eliminate redundant nr_items check · 25c9bc2e

Authored by Gerhard Heift on Jan 30, 2014

If the number of items reached the given limit of nr_items, we can leave
copy_to_sk without updating the key. Also, by returning 1 we leave the loop in
search_ioctl without rechecking whether we reached the given limit.
Signed-off-by: Gerhard Heift <Gerhard@Heift.Name>
Signed-off-by: Chris Mason <clm@fb.com>
Acked-by: David Sterba <dsterba@suse.cz>

12 Jun, 2014 8 commits

dlm: keep listening connection alive with sctp mode · 883854c5

Authored by Lidong Zhong on Jun 12, 2014

The connection struct with nodeid 0 is the listening socket,
not a connection to another node.  The sctp resend function
was not checking that the nodeid was valid (non-zero), so it
would mistakenly get and resend on the listening connection
when nodeid was zero.
Signed-off-by: Lidong Zhong <lzhong@suse.com>
Signed-off-by: David Teigland <teigland@redhat.com>

lock_parent: don't step on stale ->d_parent of all-but-freed one · c2338f2d

Authored by Al Viro on Jun 12, 2014

Dentry that had been through (or into) __dentry_kill() might be seen
by shrink_dentry_list(); that's normal, it'll be taken off the shrink
list and freed if __dentry_kill() has already finished.  The problem
is, its ->d_parent might be pointing to already freed dentry, so
lock_parent() needs to be careful.

We need to check that dentry hasn't already gone into __dentry_kill()
*and* grab rcu_read_lock() before dropping ->d_lock - the latter makes
sure that whatever we see in ->d_parent after dropping ->d_lock it
won't be freed until we drop rcu_read_lock().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

kill generic_file_splice_write() · 5f073850

Authored by Al Viro on Apr 05, 2014

no callers left
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ceph: switch to iter_file_splice_write() · 3551dd79

Authored by Al Viro on Apr 05, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

nfs: switch to iter_splice_write_file() · 4da54c21

Authored by Al Viro on Apr 05, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

fs/splice.c: remove unneeded exports · 96f9bc8f

Authored by Al Viro on Apr 05, 2014

ocfs2 was using a bunch of splice.c guts...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ocfs2: switch to iter_file_splice_write() · 6dc8bc0f

Authored by Al Viro on Apr 05, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

->splice_write() via ->write_iter() · 8d020765

Authored by Al Viro on Apr 05, 2014

iter_file_splice_write() - a ->splice_write() instance that gathers the
pipe buffers, builds a bio_vec-based iov_iter covering those and feeds
it to ->write_iter().  A bunch of simple cases converted to that...

[AV: fixed the braino spotted by Cyrill]
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

11 Jun, 2014 5 commits

reiserfs: Fix compilation breakage with CONFIG_REISERFS_CHECK · 19ef1229

Authored by Jan Kara on Jun 11, 2014

There was a bug in debug printout when CONFIG_REISERFS_CHECK was
enabled so one of the assertions in do_balan.c didn't compile. Fix it.

Fixes: 0080e9f9
Signed-off-by: Jan Kara <jack@suse.cz>

ocfs2/o2net: incorrect to terminate accepting connections loop upon rejecting an invalid one · 79deb3c1

Authored by Tariq Saeed on Jun 10, 2014

When o2net_accept_one() rejects an illegal connection, it terminates the
loop picking up the remaining queued connections.  This fix will
continue accepting connections till the queue is empty.

Addresses Orabug 17489469.
Signed-off-by: Tariq Saeed <tariq.x.saeed@oracle.com>
Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

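The behavioural change (keep draining the queue after a rejection) can be sketched in userspace. This Python model is not the o2net code: `accept_one`, `accept_all`, and the dictionary layout of a pending connection are assumptions made for the example, and a Python exception stands in for the error return of a rejected connection.

```python
from collections import deque


def accept_one(conn):
    """Accept a single pending connection or raise if it is not allowed."""
    if not conn["known_node"]:
        raise ConnectionRefusedError(conn["addr"])
    return conn["addr"]


def accept_all(pending):
    """Drain the queue; a rejected connection no longer ends the loop."""
    accepted = []
    while pending:
        conn = pending.popleft()
        try:
            accepted.append(accept_one(conn))
        except ConnectionRefusedError:
            continue   # reject this one, keep picking up the rest
    return accepted


queue = deque([
    {"addr": "node-a", "known_node": True},
    {"addr": "intruder", "known_node": False},
    {"addr": "node-c", "known_node": True},
])
accepted = accept_all(queue)
```

Before the fix, the equivalent loop would have stopped at the rejected entry and left `node-c` waiting in the queue.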

fs,userns: Change inode_capable to capable_wrt_inode_uidgid · 23adbe12

Authored by Andy Lutomirski on Jun 10, 2014

The kernel has no concept of capabilities with respect to inodes; inodes
exist independently of namespaces.  For example, inode_capable(inode,
CAP_LINUX_IMMUTABLE) would be nonsense.

This patch changes inode_capable to check for uid and gid mappings and
renames it to capable_wrt_inode_uidgid, which should make it more
obvious what it does.

Fixes CVE-2014-4014.

Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Serge Hallyn <serge.hallyn@ubuntu.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: stable@vger.kernel.org
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

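The check that the renamed helper performs can be modelled as "the capability is held in the namespace, and both the inode's uid and gid are mapped into that namespace". The following Python sketch is an assumption-laden model, not kernel code: `UserNamespace`, `Inode`, the use of `range` objects for id mappings, and capability names as strings are all illustrative choices.

```python
from dataclasses import dataclass, field


@dataclass
class UserNamespace:
    uid_map: range                      # kernel uids mapped into this namespace
    gid_map: range                      # kernel gids mapped into this namespace
    caps: set = field(default_factory=set)


@dataclass
class Inode:
    uid: int
    gid: int


def capable_wrt_inode_uidgid(ns, inode, cap):
    """A capability applies only if the inode's owner ids are mapped into ns."""
    return (
        cap in ns.caps
        and inode.uid in ns.uid_map
        and inode.gid in ns.gid_map
    )


ns = UserNamespace(range(100000, 165536), range(100000, 165536), {"CAP_FOWNER"})
own_file = capable_wrt_inode_uidgid(ns, Inode(100001, 100001), "CAP_FOWNER")
unmapped_uid = capable_wrt_inode_uidgid(ns, Inode(0, 100001), "CAP_FOWNER")
unmapped_gid = capable_wrt_inode_uidgid(ns, Inode(100001, 0), "CAP_FOWNER")
```

The third case is the one the gid check adds: the uid is mapped, but the inode's group lies outside the namespace, so the capability does not apply.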

Btrfs: convert smp_mb__{before,after}_clear_bit · c7548af6

Authored by Chris Mason on Jun 10, 2014

The new call is smp_mb__{before,after}_atomic.  The __ gives us extra
protection from the atomic rays.
Signed-off-by: Chris Mason <clm@fb.com>

locks: set fl_owner for leases back to current->files · 0c273629

Authored by Jeff Layton on Jun 10, 2014

This fixes a regression due to commit 130d1f95 (locks: ensure that
fl_owner is always initialized properly in flock and lease codepaths). I
had mistakenly thought that the fl_owner wasn't used in the lease code,
but I missed the place in __break_lease that does use it.

The i_have_this_lease check in generic_add_lease uses it. While I'm not
sure that check is terribly helpful [1], reset it back to using
current->files in order to ensure that there's no behavior change here.

[1]: leases are owned by the file description. It's possible that this
     is a threaded program, and the lease breaker and the task that
     would handle the signal are different, even if they have the same
     file table. So, there is the potential for false positives with
     this check.

Fixes: 130d1f95 (locks: ensure that fl_owner is always initialized properly in flock and lease codepaths)
Signed-off-by: Jeff Layton <jlayton@primarydata.com>

openanolis / cloud-kernel · synced successfully more than 1 year ago