提交 · 80a0e63751b4d872e8ebe0f14f89fdd3f5c8989b · gsplhtlxg / clone-Linux

01 8月, 2014 2 次提交

cifs: Split lanman auth from CIFS_SessSetup() · 80a0e637

由 Sachin Prabhu 提交于 6月 16, 2014

In preparation for splitting CIFS_SessSetup() into smaller more
manageable chunks, we first add helper functions.

We then proceed to split out lanman auth out of CIFS_SessSetup()
Signed-off-by: NSachin Prabhu <sprabhu@redhat.com>
Reviewed-by: NShirish Pargaonkar <spargaonkar@suse.com>
Signed-off-by: NSteve French <smfrench@gmail.com>

80a0e637

cifs: replace code with free_rsp_buf() · 6d81ed1e

由 Sachin Prabhu 提交于 6月 16, 2014

The functionality provided by free_rsp_buf() is duplicated in a number
of places. Replace these instances with a call to free_rsp_buf().
Signed-off-by: NSachin Prabhu <sprabhu@redhat.com>
Reviewed-by: NShirish Pargaonkar <spargaonkar@suse.com>
Signed-off-by: NSteve French <smfrench@gmail.com>

6d81ed1e

30 7月, 2014 1 次提交

AFS: Correctly assemble the client UUID · 0ef13515

由 David Howells 提交于 7月 29, 2014

Correctly assemble the client UUID by OR'ing in the flags rather than
assigning them over the other components.
Reported-by: NHimangi Saraogi <himangi774@gmail.com>
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

0ef13515

24 7月, 2014 4 次提交

fs: umount on symlink leaks mnt count · 295dc39d

由 Vasily Averin 提交于 7月 21, 2014

Currently umount on symlink blocks following umount:

/vz is separate mount

# ls /vz/ -al | grep test
drwxr-xr-x.  2 root root       4096 Jul 19 01:14 testdir
lrwxrwxrwx.  1 root root         11 Jul 19 01:16 testlink -> /vz/testdir
# umount -l /vz/testlink
umount: /vz/testlink: not mounted (expected)

# lsof /vz
# umount /vz
umount: /vz: device is busy. (unexpected)

In this case mountpoint_last() gets an extra refcount on path->mnt
Signed-off-by: NVasily Averin <vvs@openvz.org>
Acked-by: NIan Kent <raven@themaw.net>
Acked-by: NJeff Layton <jlayton@primarydata.com>
Cc: stable@vger.kernel.org
Signed-off-by: NChristoph Hellwig <hch@lst.de>

295dc39d

direct-io: fix uninitialized warning in do_direct_IO() · 6fcc5420

由 Boaz Harrosh 提交于 7月 20, 2014

The following warnings:

  fs/direct-io.c: In function ‘__blockdev_direct_IO’:
  fs/direct-io.c:1011:12: warning: ‘to’ may be used uninitialized in this function [-Wmaybe-uninitialized]
  fs/direct-io.c:913:16: note: ‘to’ was declared here
  fs/direct-io.c:1011:12: warning: ‘from’ may be used uninitialized in this function [-Wmaybe-uninitialized]
  fs/direct-io.c:913:10: note: ‘from’ was declared here

are false positive because dio_get_page() either fails, or sets both
'from' and 'to'.

Paul Bolle said ...
Maybe it's better to move initializing "to" and "from" out of
dio_get_page(). That _might_ make it easier for both the the reader and
the compiler to understand what's going on. Something like this:

Christoph Hellwig said ...
The fix of moving the code definitively looks nicer, while I think
uninitialized_var is horrible wart that won't get anywhere near my code.

Boaz Harrosh: I agree with Christoph and Paul
Signed-off-by: NBoaz Harrosh <boaz@plexistor.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

6fcc5420

simple_xattr: permit 0-size extended attributes · 4e66d445

由 Hugh Dickins 提交于 7月 23, 2014

If a filesystem uses simple_xattr to support user extended attributes,
LTP setxattr01 and xfstests generic/062 fail with "Cannot allocate
memory": simple_xattr_alloc()'s wrap-around test mistakenly excludes
values of zero size.  Fix that off-by-one (but apparently no filesystem
needs them yet).
Signed-off-by: NHugh Dickins <hughd@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jeff Layton <jlayton@poochiereds.net>
Cc: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

4e66d445

coredump: fix the setting of PF_DUMPCORE · aed8adb7

由 Silesh C V 提交于 7月 23, 2014

Commit 079148b9 ("coredump: factor out the setting of PF_DUMPCORE")
cleaned up the setting of PF_DUMPCORE by removing it from all the
linux_binfmt->core_dump() and moving it to zap_threads().But this ended
up clearing all the previously set flags.  This causes issues during
core generation when tsk->flags is checked again (eg.  for PF_USED_MATH
to dump floating point registers).  Fix this.
Signed-off-by: NSilesh C V <svellattu@mvista.com>
Acked-by: NOleg Nesterov <oleg@redhat.com>
Cc: Mandeep Singh Baines <msb@chromium.org>
Cc: <stable@vger.kernel.org>	[3.10+]
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

aed8adb7

23 7月, 2014 1 次提交

NFSD: Fix crash encoding lock reply on 32-bit · f98bac5a

由 Kinglong Mee 提交于 7月 07, 2014

Commit 8c7424cf "nfsd4: don't try to encode conflicting owner if low
on space" forgot to free conf->data in nfsd4_encode_lockt and before
sign conf->data to NULL in nfsd4_encode_lock_denied, causing a leak.

Worse, kfree() can be called on an uninitialized pointer in the case of
a succesful lock (or one that fails for a reason other than a conflict).

(Note that lock->lk_denied.ld_owner.data appears it should be zero here,
until you notice that it's one arm of a union the other arm of which is
written to in the succesful case by the

	memcpy(&lock->lk_resp_stateid, &lock_stp->st_stid.sc_stateid,
	                                sizeof(stateid_t));

in nfsd4_lock().  In the 32-bit case this overwrites ld_owner.data.)
Signed-off-by: NKinglong Mee <kinglongmee@gmail.com>
Fixes: 8c7424cf ""nfsd4: don't try to encode conflicting owner if low on space"
Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>

f98bac5a

22 7月, 2014 2 次提交

fuse: add FUSE_NO_OPEN_SUPPORT flag to INIT · d7afaec0

由 Andrew Gallagher 提交于 7月 22, 2014

Here some additional changes to set a capability flag so that clients can
detect when it's appropriate to return -ENOSYS from open.

This amends the following commit introduced in 3.14:

  7678ac50  fuse: support clients that don't implement 'open'

However we can only add the flag to 3.15 and later since there was no
protocol version update in 3.14.
Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
Cc: <stable@vger.kernel.org> # v3.15+

d7afaec0

fuse: s_time_gran fix · a800bad3

由 Miklos Szeredi 提交于 7月 22, 2014

Default s_time_gran is 1, don't overwrite that if userspace didn't
explicitly specify one.
Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
Cc: <stable@vger.kernel.org> # v3.15+

a800bad3

20 7月, 2014 2 次提交

btrfs: test for valid bdev before kobj removal in btrfs_rm_device · 0bfaa9c5

由 Eric Sandeen 提交于 7月 07, 2014

commit 99994cde btrfs: dev delete should remove sysfs entry
added a btrfs_kobj_rm_device, which dereferences device->bdev...
right after we check whether device->bdev might be NULL.

I don't honestly know if it's possible to have a NULL device->bdev
here, but assuming that it is (given the test), we need to move
the kobject removal to be under that test.

(Coverity spotted this)
Signed-off-by: NEric Sandeen <sandeen@redhat.com>
Signed-off-by: NChris Mason <clm@fb.com>

0bfaa9c5

Btrfs: fix abnormal long waiting in fsync · 98ce2ded

由 Liu Bo 提交于 7月 17, 2014

xfstests generic/127 detected this problem.

With commit 7fc34a62, now fsync will only flush
data within the passed range.  This is the cause of the above problem,
-- btrfs's fsync has a stage called 'sync log' which will wait for all the
ordered extents it've recorded to finish.

In xfstests/generic/127, with mixed operations such as truncate, fallocate,
punch hole, and mapwrite, we get some pre-allocated extents, and mapwrite will
mmap, and then msync.  And I find that msync will wait for quite a long time
(about 20s in my case), thanks to ftrace, it turns out that the previous
fallocate calls 'btrfs_wait_ordered_range()' to flush dirty pages, but as the
range of dirty pages may be larger than 'btrfs_wait_ordered_range()' wants,
there can be some ordered extents created but not getting corresponding pages
flushed, then they're left in memory until we fsync which runs into the
stage 'sync log', and fsync will just wait for the system writeback thread
to flush those pages and get ordered extents finished, so the latency is
inevitable.

This adds a flush similar to btrfs_start_ordered_extent() in
btrfs_wait_logged_extents() to fix that.
Reviewed-by: NMiao Xie <miaox@cn.fujitsu.com>
Signed-off-by: NLiu Bo <bo.li.liu@oracle.com>
Signed-off-by: NChris Mason <clm@fb.com>

98ce2ded

18 7月, 2014 8 次提交

GFS2: fs/gfs2/rgrp.c: kernel-doc warning fixes · 27ff6a0f

由 Fabian Frederick 提交于 7月 02, 2014

Cc: cluster-devel@redhat.com
Signed-off-by: NFabian Frederick <fabf@skynet.be>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

27ff6a0f

GFS2: memcontrol: Spelling s/invlidate/invalidate/ · 6b49d1d9

由 Geert Uytterhoeven 提交于 6月 29, 2014

Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org>
Cc: cluster-devel@redhat.com
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

6b49d1d9

GFS2: Allow caching of glocks for flock · 97a4f1d7

由 Bob Peterson 提交于 6月 26, 2014

This patch removes the GLF_NOCACHE flag from the glocks associated with
flocks. There should be no good reason not to cache glocks for flocks:
they only force the glock to be demoted before they can be reacquired,
which can slow down performance and even cause glock hangs, especially
in cases where the flocks are held in Shared (SH) mode.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

97a4f1d7

GFS2: Allow flocks to use normal glock dq rather than dq_wait · 5bef3e7c

由 Bob Peterson 提交于 6月 26, 2014

This patch allows flock glocks to use a non-blocking dequeue rather
than dq_wait. It also reverts the previous patch I had posted regarding
dq_wait. The reverted patch isn't necessarily a bad idea, but I decided
this might avoid unforeseen side effects, and was therefore safer.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

5bef3e7c

GFS2: replace count*size kzalloc by kcalloc · 6ec43b18

由 Fabian Frederick 提交于 6月 25, 2014

kcalloc manages count*sizeof overflow.

Cc: cluster-devel@redhat.com
Signed-off-by: NFabian Frederick <fabf@skynet.be>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

6ec43b18

GFS2: Use GFP_NOFS when allocating glocks · fe0bbd29

由 Steven Whitehouse 提交于 6月 23, 2014

Normally GFP_KERNEL is ok here, but there is now a rarely used code path
relating to deallocation of unlinked inodes (in certain corner cases)
which if hit at times of memory shortage can cause recursion while
trying to free memory.

One solution would be to try and move the gfs2_glock_get() call so
that it is no longer called while another glock is held, but that
doesn't look at all easy, so GFP_NOFS is the best solution for the
time being.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

fe0bbd29

GFS2: Fix race in glock lru glock disposal · 94a09a39

由 Steven Whitehouse 提交于 6月 23, 2014

We must not leave items on the LRU list with GLF_LOCK set, since
they can be removed if the glock is brought back into use, which
may then potentially result in a hang, waiting for GLF_LOCK to
clear.

It doesn't happen very often, since it requires a glock that has
not been used for a long time to be brought back into use at the
same moment that the shrinker is part way through disposing of
glocks.

The fix is to set GLF_LOCK at a later time, when we already know
that the other locks can be obtained. Also, we now only release
the lru_lock in case a resched is needed, rather than on every
iteration.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

94a09a39

GFS2: Only wait for demote when last holder is dequeued · 79272b35

由 Bob Peterson 提交于 6月 20, 2014

Function gfs2_glock_dq_wait is supposed to dequeue a glock and then
wait for the lock to be demoted. The problem is, if this is a shared
lock, its demote will depend on the other holders, which means you
might end up waiting forever because the other process is blocked.
This problem is especially apparent when dealing with nested flocks.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

79272b35

16 7月, 2014 1 次提交

quota: missing lock in dqcache_shrink_scan() · d68aab6b

由 Niu Yawei 提交于 6月 04, 2014

Commit 1ab6c499 (fs: convert fs shrinkers to new scan/count API)
accidentally removed locking from quota shrinker. Fix it -
dqcache_shrink_scan() should use dq_list_lock to protect the
scan on free_dquots list.

CC: stable@vger.kernel.org
Fixes: 1ab6c499Signed-off-by: NNiu Yawei <yawei.niu@intel.com>
Signed-off-by: NJan Kara <jack@suse.cz>

d68aab6b

15 7月, 2014 4 次提交

xfs: null unused quota inodes when quota is on · 03e01349

由 Dave Chinner 提交于 7月 15, 2014

When quota is on, it is expected that unused quota inodes have a
value of NULLFSINO. The changes to support a separate project quota
in 3.12 broken this rule for non-project quota inode enabled
filesystem, as the code now refuses to write the group quota inode
if neither group or project quotas are enabled. This regression was
introduced by commit d892d586 ("xfs: Start using pquotaino from the
superblock").

In this case, we should be writing NULLFSINO rather than nothing to
ensure that we leave the group quota inode in a valid state while
quotas are enabled.

Failure to do so doesn't cause a current kernel to break - the
separate project quota inodes introduced translation code to always
treat a zero inode as NULLFSINO. This was introduced by commit
01026297 ("xfs: Initialize all quota inodes to be NULLFSINO") with is
also in 3.12 but older kernels do not do this and hence taking a
filesystem back to an older kernel can result in quotas failing
initialisation at mount time. When that happens, we see this in
dmesg:

[ 1649.215390] XFS (sdb): Mounting Filesystem
[ 1649.316894] XFS (sdb): Failed to initialize disk quotas.
[ 1649.316902] XFS (sdb): Ending clean mount

By ensuring that we write NULLFSINO to quota inodes that aren't
active, we avoid this problem. We have to be really careful when
determining if the quota inodes are active or not, because we don't
want to write a NULLFSINO if the quota inodes are active and we
simply aren't updating them.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>
Signed-off-by: NDave Chinner <david@fromorbit.com>

03e01349

xfs: refine the allocation stack switch · cf11da9c

由 Dave Chinner 提交于 7月 15, 2014

The allocation stack switch at xfs_bmapi_allocate() has served it's
purpose, but is no longer a sufficient solution to the stack usage
problem we have in the XFS allocation path.

Whilst the kernel stack size is now 16k, that is not a valid reason
for undoing all our "keep stack usage down" modifications. What it
does allow us to do is have the freedom to refine and perfect the
modifications knowing that if we get it wrong it won't blow up in
our faces - we have a safety net now.

This is important because we still have the issue of older kernels
having smaller stacks and that they are still supported and are
demonstrating a wide range of different stack overflows.  Red Hat
has several open bugs for allocation based stack overflows from
directory modifications and direct IO block allocation and these
problems still need to be solved. If we can solve them upstream,
then distro's won't need to bake their own unique solutions.

To that end, I've observed that every allocation based stack
overflow report has had a specific characteristic - it has happened
during or directly after a bmap btree block split. That event
requires a new block to be allocated to the tree, and so we
effectively stack one allocation stack on top of another, and that's
when we get into trouble.

A further observation is that bmap btree block splits are much rarer
than writeback allocation - over a range of different workloads I've
observed the ratio of bmap btree inserts to splits ranges from 100:1
(xfstests run) to 10000:1 (local VM image server with sparse files
that range in the hundreds of thousands to millions of extents).
Either way, bmap btree split events are much, much rarer than
allocation events.

Finally, we have to move the kswapd state to the allocation workqueue
work when allocation is done on behalf of kswapd. This is proving to
cause significant perturbation in performance under memory pressure
and appears to be generating allocation deadlock warnings under some
workloads, so avoiding the use of a workqueue for the majority of
kswapd writeback allocation will minimise the impact of such
behaviour.

Hence it makes sense to move the stack switch to xfs_btree_split()
and only do it for bmap btree splits. Stack switches during
allocation will be much rarer, so there won't be significant
performacne overhead caused by switching stacks. The worse case
stack from all allocation paths will be split, not just writeback.
And the majority of memory allocations will be done in the correct
context (e.g. kswapd) without causing additional latency, and so we
simplify the memory reclaim interactions between processes,
workqueues and kswapd.

The worst stack I've been able to generate with this patch in place
is 5600 bytes deep. It's very revealing because we exit XFS at:

37)     1768      64   kmem_cache_alloc+0x13b/0x170

about 1800 bytes of stack consumed, and the remaining 3800 bytes
(and 36 functions) is memory reclaim, swap and the IO stack. And
this occurs in the inode allocation from an open(O_CREAT) syscall,
not writeback.

The amount of stack being used is much less than I've previously be
able to generate - fs_mark testing has been able to generate stack
usage of around 7k without too much trouble; with this patch it's
only just getting to 5.5k. This is primarily because the metadata
allocation paths (e.g. directory blocks) are no longer causing
double splits on the same stack, and hence now stack tracing is
showing swapping being the worst stack consumer rather than XFS.

Performance of fs_mark inode create workloads is unchanged.
Performance of fs_mark async fsync workloads is consistently good
with context switches reduced by around 150,000/s (30%).
Performance of dbench, streaming IO and postmark is unchanged.
Allocation deadlock warnings have not been seen on the workloads
that generated them since adding this patch.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>
Signed-off-by: NDave Chinner <david@fromorbit.com>

cf11da9c

Revert "xfs: block allocation work needs to be kswapd aware" · aa182e64

由 Dave Chinner 提交于 7月 15, 2014

This reverts commit 1f6d6482.

This commit resulted in regressions in performance in low
memory situations where kswapd was doing writeback of delayed
allocation blocks. It resulted in significant parallelism of the
kswapd work and with the special kswapd flags meant that hundreds of
active allocation could dip into kswapd specific memory reserves and
avoid being throttled. This cause a large amount of performance
variation, as well as random OOM-killer invocations that didn't
previously exist.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>
Signed-off-by: NDave Chinner <david@fromorbit.com>

aa182e64

aio: protect reqs_available updates from changes in interrupt handlers · 263782c1

由 Benjamin LaHaise 提交于 7月 14, 2014

As of commit f8567a38 it is now possible to
have put_reqs_available() called from irq context.  While put_reqs_available()
is per cpu, it did not protect itself from interrupts on the same CPU.  This
lead to aio_complete() corrupting the available io requests count when run
under a heavy O_DIRECT workloads as reported by Robert Elliott.  Fix this by
disabling irq updates around the per cpu batch updates of reqs_available.

Many thanks to Robert and folks for testing and tracking this down.
Reported-by: NRobert Elliot <Elliott@hp.com>
Tested-by: NRobert Elliot <Elliott@hp.com>
Signed-off-by: NBenjamin LaHaise <bcrl@kvack.org>
Cc: Jens Axboe <axboe@kernel.dk>, Christoph Hellwig <hch@infradead.org>
Cc: stable@vger.kenel.org

263782c1

14 7月, 2014 3 次提交

fuse: replace count*size kzalloc by kcalloc · f2b3455e

由 Fabian Frederick 提交于 6月 23, 2014

kcalloc manages count*sizeof overflow.
Signed-off-by: NFabian Frederick <fabf@skynet.be>
Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>

f2b3455e

fuse: release temporary page if fuse_writepage_locked() failed · 27f1b363

由 Maxim Patlasov 提交于 7月 10, 2014

tmp_page to be freed if fuse_write_file_get() returns NULL.
Signed-off-by: NMaxim Patlasov <mpatlasov@parallels.com>
Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>

27f1b363

NFS: Don't reset pg_moreio in __nfs_pageio_add_request · f563b89b

由 Trond Myklebust 提交于 7月 13, 2014

Once we've started sending unstable NFS writes, we do not want to
clear pg_moreio, or we may end up sending the very last request as
a stable write if the commit lists are still empty.

Do, however, reset pg_moreio in the case where we end up having to
recoalesce the write if an attempt to use pNFS failed.
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

f563b89b

13 7月, 2014 8 次提交

NFS: Remove 2 unused variables · aafe3750

由 Trond Myklebust 提交于 7月 12, 2014

Cc: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

aafe3750

nfs: handle multiple reqs in nfs_wb_page_cancel · 3e217045

由 Weston Andros Adamson 提交于 7月 11, 2014

Use nfs_lock_and_join_requests to merge all subrequests into the head request -
this cancels and dereferences all subrequests.
Signed-off-by: NWeston Andros Adamson <dros@primarydata.com>
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

3e217045

nfs: handle multiple reqs in nfs_page_async_flush · d4581383

由 Weston Andros Adamson 提交于 7月 11, 2014

Change nfs_find_and_lock_request so nfs_page_async_flush can handle multiple
requests in a page. There is only one request for a page the first time
nfs_page_async_flush is called, but if a write or commit fails, async_flush
is called again and there may be multiple requests associated with the page.
The solution is to merge all the requests in a page group into a single
request before calling nfs_pageio_add_request.

Rename nfs_find_and_lock_request to nfs_lock_and_join_requests and
change it to first lock all requests for the page, then cancel and merge
all subrequests into the head request.
Signed-off-by: NWeston Andros Adamson <dros@primarydata.com>
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

d4581383

nfs: change find_request to find_head_request · 84d3a9a9

由 Weston Andros Adamson 提交于 7月 11, 2014

nfs_page_find_request_locked* should find the head request for that page.
Rename the functions and add comments to make this clear, and fix a bug
that could return a subrequest when page_private isn't set on the page.
Signed-off-by: NWeston Andros Adamson <dros@primarydata.com>
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

84d3a9a9

nfs: nfs_page should take a ref on the head req · 85710a83

由 Weston Andros Adamson 提交于 7月 11, 2014

nfs_pages that aren't the the head of a group must take a reference on the
head as long as ->wb_head is set to it. This stops the head from hitting
a refcount of 0 while there is still an active nfs_page for the page group.

This avoids kref warnings in the writeback code when the page group head
is found and referenced.
Signed-off-by: NWeston Andros Adamson <dros@primarydata.com>
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

85710a83

nfs: mark nfs_page reqs with flag for extra ref · 17089a29

由 Weston Andros Adamson 提交于 7月 11, 2014

Change the use of PG_INODE_REF - set it when taking extra reference on
subrequests and take care to only release once for each request.
Signed-off-by: NWeston Andros Adamson <dros@primarydata.com>
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

17089a29

ext4: fix potential null pointer dereference in ext4_free_inode · bf40c926

由 Namjae Jeon 提交于 7月 12, 2014

Fix potential null pointer dereferencing problem caused by e43bb4e6
("ext4: decrement free clusters/inodes counters when block group declared bad")
Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NNamjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: NAshish Sangwan <a.sangwan@samsung.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Reviewed-by: NLukas Czerner <lczerner@redhat.com>

bf40c926

ext4: fix a potential deadlock in __ext4_es_shrink() · 3f1f9b85

由 Theodore Ts'o 提交于 7月 12, 2014

This fixes the following lockdep complaint:

[ INFO: possible circular locking dependency detected ]
3.16.0-rc2-mm1+ #7 Tainted: G           O  
-------------------------------------------------------
kworker/u24:0/4356 is trying to acquire lock:
 (&(&sbi->s_es_lru_lock)->rlock){+.+.-.}, at: [<ffffffff81285fff>] __ext4_es_shrink+0x4f/0x2e0

but task is already holding lock:
 (&ei->i_es_lock){++++-.}, at: [<ffffffff81286961>] ext4_es_insert_extent+0x71/0x180

which lock already depends on the new lock.

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&ei->i_es_lock);
                               lock(&(&sbi->s_es_lru_lock)->rlock);
                               lock(&ei->i_es_lock);
  lock(&(&sbi->s_es_lru_lock)->rlock);

 *** DEADLOCK ***

6 locks held by kworker/u24:0/4356:
 #0:  ("writeback"){.+.+.+}, at: [<ffffffff81071d00>] process_one_work+0x180/0x560
 #1:  ((&(&wb->dwork)->work)){+.+.+.}, at: [<ffffffff81071d00>] process_one_work+0x180/0x560
 #2:  (&type->s_umount_key#22){++++++}, at: [<ffffffff811a9c74>] grab_super_passive+0x44/0x90
 #3:  (jbd2_handle){+.+...}, at: [<ffffffff812979f9>] start_this_handle+0x189/0x5f0
 #4:  (&ei->i_data_sem){++++..}, at: [<ffffffff81247062>] ext4_map_blocks+0x132/0x550
 #5:  (&ei->i_es_lock){++++-.}, at: [<ffffffff81286961>] ext4_es_insert_extent+0x71/0x180

stack backtrace:
CPU: 0 PID: 4356 Comm: kworker/u24:0 Tainted: G           O   3.16.0-rc2-mm1+ #7
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Workqueue: writeback bdi_writeback_workfn (flush-253:0)
 ffffffff8213dce0 ffff880014b07538 ffffffff815df0bb 0000000000000007
 ffffffff8213e040 ffff880014b07588 ffffffff815db3dd ffff880014b07568
 ffff880014b07610 ffff88003b868930 ffff88003b868908 ffff88003b868930
Call Trace:
 [<ffffffff815df0bb>] dump_stack+0x4e/0x68
 [<ffffffff815db3dd>] print_circular_bug+0x1fb/0x20c
 [<ffffffff810a7a3e>] __lock_acquire+0x163e/0x1d00
 [<ffffffff815e89dc>] ? retint_restore_args+0xe/0xe
 [<ffffffff815ddc7b>] ? __slab_alloc+0x4a8/0x4ce
 [<ffffffff81285fff>] ? __ext4_es_shrink+0x4f/0x2e0
 [<ffffffff810a8707>] lock_acquire+0x87/0x120
 [<ffffffff81285fff>] ? __ext4_es_shrink+0x4f/0x2e0
 [<ffffffff8128592d>] ? ext4_es_free_extent+0x5d/0x70
 [<ffffffff815e6f09>] _raw_spin_lock+0x39/0x50
 [<ffffffff81285fff>] ? __ext4_es_shrink+0x4f/0x2e0
 [<ffffffff8119760b>] ? kmem_cache_alloc+0x18b/0x1a0
 [<ffffffff81285fff>] __ext4_es_shrink+0x4f/0x2e0
 [<ffffffff812869b8>] ext4_es_insert_extent+0xc8/0x180
 [<ffffffff812470f4>] ext4_map_blocks+0x1c4/0x550
 [<ffffffff8124c4c4>] ext4_writepages+0x6d4/0xd00
	...
Reported-by: NMinchan Kim <minchan@kernel.org>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Reported-by: NMinchan Kim <minchan@kernel.org>
Cc: stable@vger.kernel.org
Cc: Zheng Liu <gnehzuil.liu@gmail.com>

3f1f9b85

12 7月, 2014 1 次提交

ext4: revert commit which was causing fs corruption after journal replays · f9ae9cf5

由 Theodore Ts'o 提交于 7月 11, 2014

Commit 00764937 ("ext4: initialize multi-block allocator before
checking block descriptors") causes the block group descriptor's count
of the number of free blocks to become inconsistent with the number of
free blocks in the allocation bitmap.  This is a harmless form of fs
corruption, but it causes the kernel to potentially remount the file
system read-only, or to panic, depending on the file systems's error
behavior.

Thanks to Eric Whitney for his tireless work to reproduce and to find
the guilty commit.

Fixes: 00764937 ("ext4: initialize multi-block allocator before checking block descriptors"

Cc: stable@vger.kernel.org  # 3.15
Reported-by: NDavid Jander <david@protonic.nl>
Reported-by: NMatteo Croce <technoboy85@gmail.com>
Tested-by: NEric Whitney <enwlinux@gmail.com>
Suggested-by: NEric Whitney <enwlinux@gmail.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>

f9ae9cf5

10 7月, 2014 1 次提交

fuse: restructure ->rename2() · 4237ba43

由 Miklos Szeredi 提交于 7月 10, 2014

Make ->rename2() universal, i.e. able to handle zero flags.  This is to
make future change of the API easier.
Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>

4237ba43

09 7月, 2014 2 次提交

f2fs: avoid to access NULL pointer in issue_flush_thread · 50e1f8d2

由 Chao Yu 提交于 7月 07, 2014

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=75861

Denis 2014-05-10 11:28:59 UTC reported:
"F2FS-fs (mmcblk0p28): mounting..
 Unable to handle kernel NULL pointer dereference at virtual address 00000018
 ...
 [<c0a2f678>] (_raw_spin_lock+0x3c/0x70) from [<c03a0330>] (issue_flush_thread+0x50/0x17c)
 [<c03a0330>] (issue_flush_thread+0x50/0x17c) from [<c01b4064>] (kthread+0x98/0xa4)
 [<c01b4064>] (kthread+0x98/0xa4) from [<c0108060>] (kernel_thread_exit+0x0/0x8)"

This patch assign cmd_control_info in sm_info before issue_flush_thread is being
created, so this make sure that issue flush thread will have no chance to access
invalid info in fcc.
Signed-off-by: NChao Yu <chao2.yu@samsung.com>
Reviewed-by: NGu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

50e1f8d2

f2fs: check bdi->dirty_exceeded when trying to skip data writes · 2743f865

由 Jaegeuk Kim 提交于 6月 28, 2014

If we don't check the current backing device status, balance_dirty_pages can
fall into infinite pausing routine.

This can be occurred when a lot of directories make a small number of dirty
dentry pages including files.
Reported-by: NBrian Chadwick <brianchad@westnet.com.au>
Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>

2743f865