Commits · 8ef9b0b9e1c02879c9a41246437a23f513e4378b · openanolis / cloud-kernel

21 Apr, 2017 36 commits

NFS: move nfs_pgarray_set() to open code · 8ef9b0b9

Authored by Benjamin Coddington on Apr 19, 2017

Since commit 00bfa30a ("NFS: Create a common pgio_alloc and
pgio_release function"), nfs_pgarray_set() has only a single caller.  Let's
open code it.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Use GFP_NOIO for two allocations in writeback · ae97aa52

Authored by Benjamin Coddington on Apr 19, 2017

Prevent a deadlock that can occur if we wait on allocations
that try to write back our pages.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Fixes: 00bfa30a ("NFS: Create a common pgio_alloc and pgio_release...")
Cc: stable@vger.kernel.org # 3.16+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Fix use after free in write error path · 1f84ccdf

Authored by Fred Isaman on Apr 14, 2017

Signed-off-by: Fred Isaman <fred.isaman@gmail.com>
Fixes: 0bcbf039 ("nfs: handle request add failure properly")
Cc: stable@vger.kernel.org # v4.5+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Fix missing pg_cleanup after nfs_pageio_cond_complete() · 43b7d964

Authored by Benjamin Coddington on Apr 14, 2017

Commit a7d42ddb ("nfs: add mirroring support to pgio layer") moved
pg_cleanup out of the path when there was non-sequential I/O that needed
to be flushed.  The result is that for layouts that have more than one
layout segment per file, the pg_lseg is not cleared, so we can end up
hitting the WARN_ON_ONCE(req_start >= seg_end) in pnfs_generic_pg_test
since the pg_lseg will be pointing to that previously-flushed layout
segment.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Fixes: a7d42ddb ("nfs: add mirroring support to pgio layer")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: fix usage of mempools. · 518662e0

Authored by NeilBrown on Apr 10, 2017

When passed GFP flags that allow sleeping (such as
GFP_NOIO), mempool_alloc() will never return NULL; it will
wait until memory is available.

This means that we don't need to handle failure, but that we
do need to ensure one thread doesn't call mempool_alloc()
twice on the one pool without queuing or freeing the first
allocation.  If multiple threads did this during times of
high memory pressure, the pool could be exhausted and a
deadlock could result.

pnfs_generic_alloc_ds_commits() attempts to allocate from
the nfs_commit_mempool while already holding an allocation
from that pool.  This is not safe.  So change
nfs_commitdata_alloc() to take a flag that indicates whether
failure is acceptable.

In pnfs_generic_alloc_ds_commits(), accept failure and
handle it as we currently do.  Elsewhere, do not accept
failure, and do not handle it.

Even when failure is acceptable, we want to succeed if
possible.  That means both
 - using an entry from the pool if there is one
 - waiting for direct reclaim if there isn't.

We call mempool_alloc(GFP_NOWAIT) to achieve the first, then
kmem_cache_alloc(GFP_NOIO|__GFP_NORETRY) to achieve the
second.  Each of these can fail, but together they do the
best they can without blocking indefinitely.

The objects returned by kmem_cache_alloc() will still be freed
by mempool_free().  This is safe as mempool_alloc() uses
exactly the same function to allocate objects (since the mempool
was created with mempool_create_slab_pool()).  The objects returned
by mempool_alloc() and kmem_cache_alloc() are indistinguishable,
so mempool_free() will handle both identically, either adding to the
pool or calling kmem_cache_free().
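
The two-step "succeed if possible, but never block" allocation described above can be sketched in user space. This is only an illustrative analogue, not the kernel patch: `toy_pool`, `POOL_MIN`, and every `toy_*` function are hypothetical names, `malloc()` stands in for `kmem_cache_alloc(GFP_NOIO|__GFP_NORETRY)`, and the reserve-only `toy_pool_alloc_nowait()` is a simplification of what `mempool_alloc(GFP_NOWAIT)` does.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define POOL_MIN 2              /* hypothetical number of reserved objects */

/* Hypothetical stand-in for a slab-backed mempool. */
struct toy_pool {
    void  *reserve[POOL_MIN];   /* pre-allocated objects */
    int    nr;                  /* how many reserved objects are available */
    size_t obj_size;
};

/* Fill the reserve up front, as a mempool does at creation time. */
bool toy_pool_init(struct toy_pool *p, size_t obj_size)
{
    p->nr = 0;
    p->obj_size = obj_size;
    while (p->nr < POOL_MIN) {
        void *obj = malloc(obj_size);
        if (!obj)
            return false;
        p->reserve[p->nr++] = obj;
    }
    return true;
}

/* Simplified analogue of mempool_alloc(GFP_NOWAIT): never waits. */
void *toy_pool_alloc_nowait(struct toy_pool *p)
{
    return p->nr > 0 ? p->reserve[--p->nr] : NULL;
}

/* Step 1: use a pool entry if there is one.
 * Step 2: otherwise ask the backing allocator once.
 * Either step may fail, so the caller must accept NULL. */
void *toy_alloc_may_fail(struct toy_pool *p)
{
    void *obj = toy_pool_alloc_nowait(p);

    if (!obj)
        obj = malloc(p->obj_size);
    return obj;
}

/* Analogue of mempool_free(): objects from either source look the same,
 * so refill the reserve first and release to the allocator otherwise. */
void toy_pool_free(struct toy_pool *p, void *obj)
{
    if (p->nr < POOL_MIN)
        p->reserve[p->nr++] = obj;
    else
        free(obj);
}
```

The free path is the part that matters most here: because both sources use the same underlying allocator, the caller does not need to remember where an object came from.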

Also, don't test for failure when allocating from
nfs_wdata_mempool.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Clean up nfs4_proc_get_lease_time() · f6148713

Authored by Anna Schumaker on Apr 07, 2017

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Clean up _nfs4_proc_exchange_id() · e917f0d1

Authored by Anna Schumaker on Apr 07, 2017

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Clean up nfs4_proc_bind_one_conn_to_session() · c7ae7639

Authored by Anna Schumaker on Apr 07, 2017

Returning errors directly even lets us remove the goto.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Remove extra dprintk()s from nfs4namespace.c · 3183783b

Authored by Anna Schumaker on Apr 07, 2017

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Clean up nfs4_get_rootfh() · 539fd1d1

Authored by Anna Schumaker on Apr 07, 2017

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Remove extra dprintk()s from nfs4client.c · 4fe6b366

Authored by Anna Schumaker on Apr 07, 2017

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Clean up nfs4_init_server() · 1073d9b4

Authored by Anna Schumaker on Apr 07, 2017

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Clean up nfs4_set_client() · 2dc42c0d

Authored by Anna Schumaker on Apr 07, 2017

If we cut out the dprintk()s, then we can return error codes directly
and cut out the goto.
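
The shape of this kind of cleanup can be sketched as a before/after pair. The code below is a hypothetical illustration, not the body of nfs4_set_client(): `toy_client` and both functions are invented names, and the commented-out trace line stands in for the dprintk() that was the main reason for funnelling every exit through one label.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical client object, for illustration only. */
struct toy_client {
    int ready;
};

/* Before: every exit goes through "out" so the trace line can log the status. */
int toy_set_client_goto(struct toy_client *clp)
{
    int err;

    if (clp == NULL) {
        err = -EINVAL;
        goto out;
    }
    if (!clp->ready) {
        err = -EAGAIN;
        goto out;
    }
    err = 0;
out:
    /* dprintk("<-- %s = %d\n", __func__, err); */
    return err;
}

/* After: with the trace line gone, each error is returned where it is found. */
int toy_set_client_direct(struct toy_client *clp)
{
    if (clp == NULL)
        return -EINVAL;
    if (!clp->ready)
        return -EAGAIN;
    return 0;
}
```

Both versions return the same values for the same inputs. The second simply has no shared exit path left to maintain.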

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Clean up nfs4_check_server_scope() · 8da0f934

Authored by Anna Schumaker on Apr 07, 2017

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Clean up nfs4_check_serverowner_major_id() · ddfa0d48

Authored by Anna Schumaker on Apr 07, 2017

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Create a common nfs4_match_client() function · 14d1bbb0

Authored by Anna Schumaker on Apr 07, 2017

This puts all the common code in a single place for the
walk_client_list() functions.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Clean up nfs4_check_serverowner_minor_id() · 5b6d3ff6

Authored by Anna Schumaker on Apr 07, 2017

Once again, we can remove the function and compare integer values
directly.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Clean up nfs4_match_clientids() · f251fd9e

Authored by Anna Schumaker on Apr 07, 2017

If we cut out the dprintk()s, then we don't even need this to be a
separate function.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Clean up nfs42_layoutstat_done() · 5be1810a

Authored by Anna Schumaker on Apr 07, 2017

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Remove extra dprintk()s from namespace.c · e36d48e9

Authored by Anna Schumaker on Apr 07, 2017

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Clean up nfs_direct_commit_complete() · fe4f844d

Authored by Anna Schumaker on Apr 07, 2017

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Remove nfs_direct_readpage_release() · beeb5338

Authored by Anna Schumaker on Apr 07, 2017

Just remove the function and have the caller use nfs_release_request()
instead.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Clean up extra dprintk()s in client.c · 4cbb9768

Authored by Anna Schumaker on Apr 07, 2017

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Clean up nfs_init_client() · 2844b6ae

Authored by Anna Schumaker on Apr 07, 2017

We always call nfs_mark_client_ready() even if nfs_create_rpc_client()
returns an error, so we can rearrange nfs_init_client() to mark the
client ready from a single place.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Remove extra dprintk()s from callback_xdr.c · 36718a66

Authored by Anna Schumaker on Apr 07, 2017

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Clean up encode_cb_sequence_res() · 3d0bfaa6

Authored by Anna Schumaker on Apr 07, 2017

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Clean up decode_notify_lock_args() · 535ece2b

Authored by Anna Schumaker on Apr 07, 2017

Let's cut out the goto and return any errors immediately.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Clean up decode_cb_sequence_args() · 1796549a

Authored by Anna Schumaker on Apr 07, 2017

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Clean up decode_layoutrecall_args() · c79d56d2

Authored by Anna Schumaker on Apr 07, 2017

Additionally, this change lets us cut out the goto by returning errors
immediately.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Clean up decode_recall_args() · 135a4ea0

Authored by Anna Schumaker on Apr 07, 2017

Removing the dprintk() lets us simplify the function by returning status
codes directly, rather than using a goto.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Clean up decode_getattr_args() · 56938bb7

Authored by Anna Schumaker on Apr 07, 2017

Removing the dprintk() lets us return the status value directly, rather
than jumping to a label if an error occurs.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Remove extra dprintk()s from callback_proc.c · be55f1bc

Authored by Anna Schumaker on Apr 07, 2017

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Clean up nfs4_callback_layoutrecall() · 5694a4f8

Authored by Anna Schumaker on Apr 07, 2017

In addition to removing the dprintk(), this patch also initializes "res"
to the default return value instead of doing this through an else
condition.
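
A minimal sketch of initializing the result to its default instead of assigning it in an else branch. Everything here is hypothetical (`toy_client`, `toy_do_recall()`, and the `TOY_ERR_NO_CLIENT` value are invented for illustration) and is not taken from nfs4_callback_layoutrecall() itself.

```c
#include <stddef.h>

#define TOY_ERR_NO_CLIENT 1     /* hypothetical default status */

/* Hypothetical client object, for illustration only. */
struct toy_client {
    int id;
};

/* Hypothetical worker that always succeeds in this sketch. */
static int toy_do_recall(const struct toy_client *clp)
{
    (void)clp;
    return 0;
}

/* Before: the default status is assigned in an else branch. */
int toy_recall_with_else(const struct toy_client *clp)
{
    int res;

    if (clp)
        res = toy_do_recall(clp);
    else
        res = TOY_ERR_NO_CLIENT;
    return res;
}

/* After: "res" starts out as the default and is overwritten on the normal path. */
int toy_recall_with_default(const struct toy_client *clp)
{
    int res = TOY_ERR_NO_CLIENT;

    if (clp)
        res = toy_do_recall(clp);
    return res;
}
```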

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Clean up do_callback_layoutrecall() · 1a916ce0

Authored by Anna Schumaker on Apr 07, 2017

Removing the dprintk()s lets us simplify the function by removing the
else condition entirely and returning the status of
initiate_{file,bulk}_draining() directly.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

nfs: flexfilelayout: remove v3-only data server limitation · a7878ca1

Authored by Tigran Mkrtchyan on Apr 04, 2017

Flexfilelayout supports data servers which talk NFS v3 and v4.{0,1,2}.
However, this code path is disabled and only v3 servers are accepted.
This change removes this limitation.

Signed-off-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: switch back to ->iterate() · b044f645

Authored by Benjamin Coddington on Mar 10, 2017

NFS has some optimizations for readdir to choose between using READDIR or
READDIRPLUS based on workload, and which NFS operation to use is determined
by subsequent interactions with lookup, d_revalidate, and getattr.

Concurrent use of nfs_readdir() via ->iterate_shared() can cause those
optimizations to repeatedly invalidate the pagecache used to store
directory entries during readdir(), which causes some very bad performance
for directories with many entries (more than about 10000).

There are a couple of ways to fix this in NFS, but no fix would be as
simple as going back to ->iterate() to serialize nfs_readdir(), and neither
fix I tested performed as well as going back to ->iterate().

The first required taking the directory's i_lock for each entry, with the
result of terrible contention.

The second way adds another flag to the nfs_inode, and so keeps the
optimizations working for large directories. The difference from using
->iterate() here is that much more memory is consumed for a given workload
without any performance gain.

The workings of nfs_readdir() are such that concurrent users are serialized
within read_cache_page() waiting to retrieve pages of entries from the
server. By serializing this work in iterate_dir() instead, contention for
cache pages is reduced. Waiting processes can have an uncontended pass at
the entirety of the directory's pagecache once previous processes have
completed filling it.

v2 - Keep the bits needed for parallel lookup

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

16 Apr, 2017 2 commits

orangefs: free superblock when mount fails · 1ec1688c

Authored by Martin Brandenburg on Apr 14, 2017

Otherwise lockdep says:

[ 1337.483798] ================================================
[ 1337.483999] [ BUG: lock held when returning to user space! ]
[ 1337.484252] 4.11.0-rc6 #19 Not tainted
[ 1337.484423] ------------------------------------------------
[ 1337.484626] mount/14766 is leaving the kernel with locks still held!
[ 1337.484841] 1 lock held by mount/14766:
[ 1337.485017]  #0:  (&type->s_umount_key#33/1){+.+.+.}, at: [<ffffffff8124171f>] sget_userns+0x2af/0x520

Caught by xfstests generic/413 which tried to mount with the unsupported
mount option dax.  Then xfstests generic/422 ran sync which deadlocks.

Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Acked-by: Mike Marshall <hubcap@omnibond.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

vfs: don't do RCU lookup of empty pathnames · c0eb027e

Authored by Linus Torvalds on Apr 02, 2017

Normal pathname lookup doesn't allow empty pathnames, but using
AT_EMPTY_PATH (with name_to_handle_at() or fstatat(), for example) you
can trigger an empty pathname lookup.
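
For readers unfamiliar with AT_EMPTY_PATH, the sketch below shows how user space reaches an empty-pathname lookup through fstatat(). It is an illustration of the API only, not a reproducer for the bug. The helper name `stat_via_empty_path()` is hypothetical, and the fallback definition of AT_EMPTY_PATH assumes Linux, where its value is 0x1000.

```c
#define _GNU_SOURCE             /* glibc exposes AT_EMPTY_PATH under _GNU_SOURCE */
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

#ifndef AT_EMPTY_PATH
#define AT_EMPTY_PATH 0x1000    /* Linux value, used if the header did not expose it */
#endif

/*
 * Open a directory, then stat it twice: once with fstat() and once with
 * fstatat() using an empty pathname plus AT_EMPTY_PATH. With that flag the
 * empty string refers to the object behind the file descriptor itself.
 * Returns 0 when both calls succeed and report the same inode, -1 otherwise.
 */
int stat_via_empty_path(const char *dir)
{
    struct stat by_fd, by_empty;
    int ret = -1;
    int fd = open(dir, O_RDONLY | O_DIRECTORY);

    if (fd < 0)
        return -1;

    if (fstat(fd, &by_fd) == 0 &&
        fstatat(fd, "", &by_empty, AT_EMPTY_PATH) == 0 &&
        by_fd.st_dev == by_empty.st_dev &&
        by_fd.st_ino == by_empty.st_ino)
        ret = 0;

    close(fd);
    return ret;
}
```

Without the AT_EMPTY_PATH flag, the same fstatat() call with an empty string fails with ENOENT, which is the "normal pathname lookup" behaviour the message refers to.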

And not only is the RCU lookup in that case entirely unnecessary
(because we'll obviously immediately finalize the end result), it is
actively wrong.

Why? An empty path is a special case that will return the original
'dirfd' dentry - and that dentry may not actually be RCU-free'd,
resulting in a potential use-after-free if we were to initialize the
path lazily under the RCU read lock and depend on complete_walk()
finalizing the dentry.

Found by syzkaller and KASAN.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: Vegard Nossum <vegard.nossum@gmail.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

14 Apr, 2017 2 commits

hugetlbfs: fix offset overflow in hugetlbfs mmap · 045c7a3f

Authored by Mike Kravetz on Apr 13, 2017

If mmap() maps a file, it can be passed an offset into the file at which
the mapping is to start.  Offset could be a negative value when
represented as a loff_t.  The offset plus length will be used to update
the file size (i_size) which is also a loff_t.

Validate the value of offset and offset + length to make sure they do
not overflow and appear as negative.
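
One way to express such a check is sketched below in user-space C. It is not the code from the kernel patch: `toy_loff_t` and `toy_mmap_range_ok()` are hypothetical names, with `toy_loff_t` assumed to be a signed 64-bit type like loff_t, and the comparison is arranged so that the sum is never actually computed in signed arithmetic.

```c
#include <stdbool.h>
#include <stdint.h>

typedef int64_t toy_loff_t;     /* hypothetical stand-in for the kernel's loff_t */

/*
 * Return true only when both "offset" and "offset + len" can be represented
 * as non-negative toy_loff_t values. The addition is never performed on
 * signed operands, so the check itself cannot overflow.
 */
bool toy_mmap_range_ok(toy_loff_t offset, uint64_t len)
{
    if (offset < 0)
        return false;           /* offset already appears negative */
    if (len > (uint64_t)INT64_MAX)
        return false;           /* length alone does not fit */
    if ((uint64_t)offset > (uint64_t)INT64_MAX - len)
        return false;           /* offset + len would wrap to a negative value */
    return true;
}
```

For example, an offset of 0x8000000000000000, as in the reproducer quoted in this message, is negative once stored in a signed 64-bit type, so the first test rejects it.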

Found by syzkaller with commit ff8c0c53 ("mm/hugetlb.c: don't call
region_abort if region_chg fails") applied.  Prior to this commit, the
overflow would still occur but we would luckily return ENOMEM.

To reproduce:

   mmap(0, 0x2000, 0, 0x40021, 0xffffffffffffffffULL, 0x8000000000000000ULL);

This resulted in:

  kernel BUG at mm/hugetlb.c:742!
  Call Trace:
   hugetlbfs_evict_inode+0x80/0xa0
   evict+0x24a/0x620
   iput+0x48f/0x8c0
   dentry_unlink_inode+0x31f/0x4d0
   __dentry_kill+0x292/0x5e0
   dput+0x730/0x830
   __fput+0x438/0x720
   ____fput+0x1a/0x20
   task_work_run+0xfe/0x180
   exit_to_usermode_loop+0x133/0x150
   syscall_return_slowpath+0x184/0x1c0
   entry_SYSCALL_64_fastpath+0xab/0xad

Fixes: ff8c0c53 ("mm/hugetlb.c: don't call region_abort if region_chg fails")
Link: http://lkml.kernel.org/r/1491951118-30678-1-git-send-email-mike.kravetz@oracle.com
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

thp: fix MADV_DONTNEED vs clear soft dirty race · 5b7abeae

Authored by Kirill A. Shutemov on Apr 13, 2017

Yet another instance of the same race.

Fix is identical to change_huge_pmd().

See "thp: fix MADV_DONTNEED vs.  numa balancing race" for more details.

Link: http://lkml.kernel.org/r/20170302151034.27829-5-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
