Commits · 2864486bd0fdd14431058650c91fcb9fba605d43 · openanolis / cloud-kernel

09 Feb, 2017 2 commits

nfs: no PG_private waiters remain, remove waker · 600424e3

Authored by Nicholas Piggin on Jan 04, 2017

Since commit 4f52b6bb ("NFS: Don't call COMMIT in ->releasepage()"),
no tasks wait on PagePrivate, so the wake introduced in commit 95905446
("NFS: avoid deadlocks with loop-back mounted NFS filesystems.") can
be removed.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFS: nfs_rename() handle -ERESTARTSYS dentry left behind · 920b4530

Authored by Benjamin Coddington on Feb 01, 2017

An interrupted rename will leave the old dentry behind if the rename
succeeds. Fix this by moving the final local work of the rename to
rpc_call_done so that the results of the RENAME can always be handled,
even if the original process has already returned with -ERESTARTSYS.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

31 Jan, 2017 25 commits

NFSv4: Fix warning for using 0 as NULL · 68e33bd6

Authored by Wei Yongjun on Jan 12, 2017

Fixes the following sparse warning:

fs/nfs/nfs4state.c:862:60: warning: Using plain integer as NULL pointer

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

pNFS/flexfiles: Make local symbol layoutreturn_ops static · 2e54b9b1

Authored by Wei Yongjun on Jan 12, 2017

Fixes the following sparse warning:

fs/nfs/flexfilelayout/flexfilelayout.c:2114:34: warning:
 symbol 'layoutreturn_ops' was not declared. Should it be static?

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFS: Return the comparison result directly in nfs41_match_stateid() · 045c5519

Authored by Anna Schumaker on Jan 11, 2017

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFS: Clean up nfs41_same_server_scope() · 49ad0145

Authored by Anna Schumaker on Jan 11, 2017

The function is cleaner this way, since we can use the result of
memcmp() directly.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

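The pattern this commit message describes can be sketched in userspace C. This is not the kernel function: `struct server_scope` and `same_server_scope()` are hypothetical names chosen for illustration, and the 64-byte buffer size is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a server scope: a length plus opaque bytes. */
struct server_scope {
    size_t len;
    unsigned char data[64];
};

/* Return the comparison result directly instead of branching on it. */
static bool same_server_scope(const struct server_scope *a,
                              const struct server_scope *b)
{
    if (a->len != b->len)
        return false;
    return memcmp(a->data, b->data, a->len) == 0;
}
```

The final `return` replaces an `if (memcmp(...) == 0) return true; return false;` sequence with a single expression.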

NFS: No need to set and return status in nfs41_lock_expired() · 81b68de4

Authored by Anna Schumaker on Jan 11, 2017

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFS: Remove unnecessary goto in nfs4_lookup_root_sec() · 9df1336c

Authored by Anna Schumaker on Jan 11, 2017

Once again, it's easier and cleaner just to return the error directly.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFS: Remove nfs4_recover_expired_lease() · 334f87dd

Authored by Anna Schumaker on Jan 11, 2017

This function doesn't add much, since all it does is access the server's
nfs_client variable.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFS: Remove an extra if in _nfs4_recover_proc_open() · d7e98258

Authored by Anna Schumaker on Jan 11, 2017

It's simpler just to return the status unconditionally.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFS: Return errors directly in _nfs4_opendata_reclaim_to_nfs4_state() · 37a8484a

Authored by Anna Schumaker on Jan 11, 2017

There is no need for a goto just to return an error code without any
cleanup. Returning the error directly helps to clean up the code.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

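A small sketch of the "return the error directly" style, assuming a made-up function with no resources to release on its early exits. `reclaim_state()` and its parameters are hypothetical and do not mirror the kernel function's signature.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical lookup that can fail before anything needs cleaning up. */
static int reclaim_state(const int *state, int can_open, int *out)
{
    if (state == NULL)
        return -EINVAL;   /* no goto needed: nothing to undo yet */
    if (!can_open)
        return -EAGAIN;   /* same here: return the error as-is */

    *out = *state;
    return 0;
}
```

A `goto err;` label still earns its place once there is something to unwind, such as a held lock or an allocated buffer.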

NFS: Remove nfs4_wait_for_completion_rpc_task() · 820bf85c

Authored by Anna Schumaker on Jan 11, 2017

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFS: Clean up _nfs4_is_integrity_protected() · eeea5361

Authored by Anna Schumaker on Jan 11, 2017

We can cut out the if statement and return the results of the comparison
directly.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFS: Fix inconsistent indentation in nfs4proc.c · d9b67e1e

Authored by Anna Schumaker on Jan 11, 2017

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFS: Make trace_nfs4_setup_sequence() available to NFS v4.0 · ad05cc0f

Authored by Anna Schumaker on Jan 11, 2017

This tracepoint displays information about the slot that was chosen for
the RPC, in addition to session information.  This could be useful
information for debugging, and we can set the session id hash to 0 to
indicate that there is no session.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFS: Merge the remaining setup_sequence functions · 3d35808b

Authored by Anna Schumaker on Jan 11, 2017

This creates a single place for all the work to happen, using the
presence of a session to determine if extra values need to be set.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFS: Check if the slot table is draining from nfs4_setup_sequence() · 76ee0354

Authored by Anna Schumaker on Jan 10, 2017

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFS: Handle setup sequence task rescheduling in a single place · 0dcee8bb

Authored by Anna Schumaker on Jan 10, 2017

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFS: Lock the slot table from a single place during setup sequence · 6994cdd7

Authored by Anna Schumaker on Jan 10, 2017

Rather than implementing this twice for NFS v4.0 and v4.1.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFS: Move slot-already-allocated check into nfs_setup_sequence() · 9dd9107f

Authored by Anna Schumaker on Jan 10, 2017

This puts the check in a single place, rather than needing to implement
it twice for v4.0 and v4.1.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFS: Create a single nfs4_setup_sequence() function · 7981c8a6

Authored by Anna Schumaker on Jan 10, 2017

The inline ifdef lets us put everything in a single place, rather than
having two (very similar) versions of this function.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFS: Use nfs4_setup_sequence() everywhere · 6de7e12f

Authored by Anna Schumaker on Jan 09, 2017

This does the right thing depending on whether we have a session, rather
than needing to handle this manually in multiple places.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFS: Change nfs4_setup_sequence() to take an nfs_client structure · 42e1cca7

Authored by Anna Schumaker on Jan 09, 2017

I want to have all callers use this function, rather than calling the
NFS v4.0 and v4.1 versions directly. This includes pNFS, which only has
access to the nfs_client structure in some places.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFS: Change nfs4_get_session() to take an nfs_client structure · 172d9de1

Authored by Anna Schumaker on May 14, 2015

pNFS only has access to the nfs_client structure, and not the
nfs_server, so we need to make this change so the function can be used
by pNFS as well.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFS: Move nfs4_get_session() into nfs4_session.h · efc6f4aa

Authored by Anna Schumaker on Jan 09, 2017

This puts session related functions together in the same space.  I only
keep one version of this function, since this variable will always be
NULL when using NFS v4.0.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFS: tidy up nfs_show_mountd_netid · 6f6e3c09

Authored by NeilBrown on Jan 13, 2017

This function is a bit clumsy, incorrectly producing
",mountproto=" if mountd_protocol is 0 and !showdefaults,
and duplicating the code for reporting "auto".

Tidy it up so that it only makes a single seq_printf() call,
and more obviously does the right thing.

Fixes: ee671b01 ("NFS: convert proto= option to use netids rather than a protoname")
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

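The "single formatting call" shape described in this message might look roughly like the userspace sketch below. It uses `snprintf()` in place of the kernel's `seq_printf()`; `enum proto` and `show_mountd_netid()` are hypothetical names, and the real function also considers the address family.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical protocol selector; PROTO_NONE models mountd_protocol == 0. */
enum proto { PROTO_NONE = 0, PROTO_UDP, PROTO_TCP };

/*
 * Write ",mountproto=<netid>" into buf, or leave buf empty when the
 * protocol is unknown and defaults are hidden. Returns the length that
 * snprintf() reports, or 0 when nothing is emitted.
 */
static int show_mountd_netid(char *buf, size_t len, enum proto p,
                             int showdefaults)
{
    const char *netid = NULL;

    switch (p) {
    case PROTO_UDP: netid = "udp"; break;
    case PROTO_TCP: netid = "tcp"; break;
    default: break;
    }

    if (len > 0)
        buf[0] = '\0';
    if (netid == NULL && !showdefaults)
        return 0;                      /* never print a bare ",mountproto=" */

    /* One formatting call covers both the known and the "auto" case. */
    return snprintf(buf, len, ",mountproto=%s", netid ? netid : "auto");
}
```

Choosing the string first and printing once removes the duplicated "auto" branch that the message complains about.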

sunrpc & nfs: Add and use dprintk_cont macros · ddeaa637

Authored by Joe Perches on Oct 15, 2016

Allow line continuations to work properly with KERN_CONT.

Signed-off-by: Joe Perches <joe@perches.com>
[Anna: Add fallback dprintk_cont() for when CONFIG_SUNRPC_DEBUG=n]
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

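As a loose userspace analogy for a debug macro pair with a no-op fallback, consider the sketch below. `DEBUG_LOG`, `dbg()`, and `dbg_cont()` are hypothetical; they stand in for `CONFIG_SUNRPC_DEBUG`, `dprintk()`, and `dprintk_cont()` and do not reproduce the kernel definitions.

```c
#include <stdio.h>

/* DEBUG_LOG is a hypothetical switch standing in for CONFIG_SUNRPC_DEBUG. */
#ifdef DEBUG_LOG
/* dbg() starts a line with a prefix; dbg_cont() continues it without one. */
# define dbg(...) \
    do { fputs("debug: ", stderr); fprintf(stderr, __VA_ARGS__); } while (0)
# define dbg_cont(...) \
    do { fprintf(stderr, __VA_ARGS__); } while (0)
#else
/* Fallbacks so callers still compile when debugging is switched off. */
# define dbg(...)      do { } while (0)
# define dbg_cont(...) do { } while (0)
#endif

static int describe(int a, int b)
{
    dbg("values:");
    dbg_cont(" %d", a);
    dbg_cont(" %d\n", b);
    return a + b;
}
```

Without the fallback branch, any caller of `dbg_cont()` would fail to build in a non-debug configuration, which is the gap the bracketed note in the message addresses.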

28 Jan, 2017 1 commit

xfs: prevent quotacheck from overloading inode lru · e0d76fa4

Authored by Brian Foster on Jan 26, 2017

Quotacheck runs at mount time in situations where quota accounting must
be recalculated. In doing so, it uses bulkstat to visit every inode in
the filesystem. Historically, every inode processed during quotacheck
was released and immediately tagged for reclaim because quotacheck runs
before the superblock is marked active by the VFS. In other words,
the final iput() led to an immediate ->destroy_inode() call, which
allowed the XFS background reclaim worker to start reclaiming inodes.

Commit 17c12bcd ("xfs: when replaying bmap operations, don't let
unlinked inodes get reaped") marks the XFS superblock active sooner as
part of the mount process to support caching inodes processed during log
recovery. This occurs before quotacheck and thus means all inodes
processed by quotacheck are inserted to the LRU on release.  The
s_umount lock is held until the mount has completed and thus prevents
the shrinkers from operating on the sb. This means that quotacheck can
excessively populate the inode LRU and lead to OOM conditions on systems
without sufficient RAM.

Update the quotacheck bulkstat handler to set XFS_IGET_DONTCACHE on
inodes processed by quotacheck. This causes ->drop_inode() to return 1
and in turn causes iput_final() to evict the inode. This preserves the
original quotacheck behavior and prevents it from overloading the LRU
and running out of memory.

CC: stable@vger.kernel.org # v4.9
Reported-by: Martin Svec <martin.svec@zoner.cz>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

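The effect of a "don't cache" flag on the final release can be illustrated with a toy model. Nothing here is kernel code: `toy_inode`, `toy_cache`, `toy_drop_inode()`, and `toy_iput_final()` are hypothetical names that only mimic the decision described above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model: an inode either asks to be evicted on release or not. */
struct toy_inode { bool dontcache; };

/* Counters standing in for the LRU length and the eviction count. */
struct toy_cache { size_t lru_len; size_t evicted; };

/* Mirrors the idea of ->drop_inode(): true means "evict now". */
static bool toy_drop_inode(const struct toy_inode *ip)
{
    return ip->dontcache;
}

/* Final release: evict immediately, or park the inode on the LRU. */
static void toy_iput_final(struct toy_cache *c, const struct toy_inode *ip)
{
    if (toy_drop_inode(ip))
        c->evicted++;
    else
        c->lru_len++;
}
```

With the flag set on every inode a scan visits, the LRU counter stays at zero however many inodes are released, which is the property the fix relies on while shrinkers are blocked.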

27 Jan, 2017 6 commits

Btrfs: remove ->{get, set}_acl() from btrfs_dir_ro_inode_operations · 57b59ed2

Authored by Omar Sandoval on Jan 25, 2017

Subvolume directory inodes can't have ACLs.

Cc: <stable@vger.kernel.org> # 4.9.x
Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>

Btrfs: disable xattr operations on subvolume directories · 1fdf4194

Authored by Omar Sandoval on Jan 25, 2017

When you snapshot a subvolume containing a subvolume, you get a
placeholder directory where the subvolume would be. These directory
inodes have ->i_ops set to btrfs_dir_ro_inode_operations. Previously,
these i_ops didn't include the xattr operation callbacks. The conversion
to xattr_handlers missed this case, leading to bogus attempts to set
xattrs on these inodes. This manifested itself as failures when running
delayed inodes.

To fix this, clear IOP_XATTR in ->i_opflags on these inodes.

Fixes: 6c6ef9f2 ("xattr: Stop calling {get,set,remove}xattr inode operations")
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Reported-by: Chris Murphy <lists@colorremedies.com>
Tested-by: Chris Murphy <lists@colorremedies.com>
Cc: <stable@vger.kernel.org> # 4.9.x
Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>

Btrfs: remove old tree_root case in btrfs_read_locked_inode() · 67ade058

Authored by Omar Sandoval on Jan 25, 2017

As Jeff explained in c2951f32 ("btrfs: remove old tree_root dirent
processing in btrfs_real_readdir()"), supporting this old format is no
longer necessary since the Btrfs magic number has been updated since we
changed to the current format. There are other places where we still
handle this old format, but since this is part of a fix that is going to
stable, I'm only removing this one for now.

Cc: <stable@vger.kernel.org> # 4.9.x
Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>

pNFS: Fix a reference leak in _pnfs_return_layout · ee6625a9

Authored by Trond Myklebust on Jan 26, 2017

If NFS_LAYOUT_RETURN_REQUESTED is not set, then we currently exit
without freeing the list of invalidated layout segments, leading
to a reference leak.

Reported-by: Olga Kornievskaia <aglo@umich.edu>
Fixes: 24408f52 ("pNFS: Fix bugs in _pnfs_return_layout")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

nfs: Fix "Don't increment lock sequence ID after NFS4ERR_MOVED" · 406dab84

Authored by Chuck Lever on Jan 26, 2017

Lock sequence IDs are bumped in decode_lock by calling
nfs_increment_seqid(). nfs_increment_seqid() does not use the
seqid_mutating_err() function fixed in commit 059aa734 ("Don't
increment lock sequence ID after NFS4ERR_MOVED").

Fixes: 059aa734 ("Don't increment lock sequence ID after ...")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Xuan Qi <xuan.qi@oracle.com>
Cc: stable@vger.kernel.org # v3.7+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

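The rule that some statuses must not advance the lock sequence ID can be sketched as follows. The numeric status codes are the NFSv4 protocol values, but the helper names `status_bumps_seqid()` and `next_seqid()` are hypothetical, and the list of non-mutating statuses is deliberately a subset rather than the kernel's full switch.

```c
#include <stdbool.h>
#include <stdint.h>

/* A few NFSv4 status codes; this is an illustrative subset only. */
enum {
    NFS4_OK             = 0,
    NFS4ERR_DENIED      = 10010,
    NFS4ERR_MOVED       = 10019,
    NFS4ERR_BAD_STATEID = 10025,
    NFS4ERR_BADXDR      = 10036
};

/* Hypothetical helper: does this status advance the lock sequence ID? */
static bool status_bumps_seqid(int status)
{
    switch (status) {
    case NFS4ERR_MOVED:        /* the case this fix is about */
    case NFS4ERR_BAD_STATEID:
    case NFS4ERR_BADXDR:
        return false;
    default:
        return true;
    }
}

/* Compute the sequence ID to use for the next request. */
static uint32_t next_seqid(uint32_t seqid, int status)
{
    return status_bumps_seqid(status) ? seqid + 1 : seqid;
}
```

The bug class described above arises when two code paths each keep their own copy of such a list and only one copy gets updated.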

xfs: fix bmv_count confusion w/ shared extents · c364b6d0

Authored by Darrick J. Wong on Jan 26, 2017

In a bmapx call, bmv_count is the total size of the array, including the
zeroth element that userspace uses to supply the search key. The output
array starts at offset 1 so that we can set up the user for the next
invocation. Since we now can split an extent into multiple bmap records
due to shared/unshared status, we have to be careful that we don't
overflow the output array.

In the original patch f86f4037 ("xfs: teach get_bmapx about shared
extents and the CoW fork") I used cur_ext (the output index) to check
for overflows, albeit with an off-by-one error. Since nexleft no longer
describes the number of unfilled slots in the output, we can rip all
that out and use cur_ext for the overflow check directly.

Failure to do this causes heap corruption in bmapx callers such as
xfs_io and xfs_scrub. xfs/328 can reproduce this problem.

Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

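The array layout described in this message, with a key in slot 0 and output starting at slot 1, suggests a bounds check along these lines. This is a simplified sketch: `struct rec` and `fill_records()` are hypothetical, and `count` plays the role that bmv_count plays in the text.

```c
#include <stddef.h>

/* Hypothetical record type; slot 0 of the array carries the search key. */
struct rec { long offset; long length; };

/*
 * Copy records from src into out[1..count-1]. `count` is the total array
 * size including the key slot. Returns the number of records written.
 */
static size_t fill_records(struct rec *out, size_t count,
                           const struct rec *src, size_t n_src)
{
    size_t cur = 0;              /* records written so far */

    for (size_t i = 0; i < n_src; i++) {
        if (cur + 1 >= count)    /* the next write would land at out[cur + 1] */
            break;
        out[cur + 1] = src[i];
        cur++;
    }
    return cur;
}
```

Checking the output index itself, rather than a separate "slots left" counter, keeps the test correct even when one input extent expands into several output records.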

26 Jan, 2017 2 commits

xfs: clear _XBF_PAGES from buffers when readahead page · 2aa6ba7b

Authored by Darrick J. Wong on Jan 25, 2017

If we try to allocate memory pages to back an xfs_buf that we're trying
to read, it's possible that we'll be so short on memory that the page
allocation fails.  For a blocking read we'll just wait, but for
readahead we simply dump all the pages we've collected so far.

Unfortunately, after dumping the pages we neglect to clear the
_XBF_PAGES state, which means that the subsequent call to xfs_buf_free
thinks that b_pages still points to pages we own.  It then double-frees
the b_pages pages.

This results in screaming about negative page refcounts from the memory
manager, which xfs oughtn't be triggering.  To reproduce this case,
mount a filesystem where the size of the inodes far outweighs the
available memory (a ~500M inode filesystem on a VM with 300MB memory
did the trick here) and run bulkstat in parallel with other memory
eating processes to put a huge load on the system.  The "check summary"
phase of xfs_scrub also works for this purpose.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>

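The double-free mechanism described here, a stale ownership flag surviving after the pages are gone, can be shown with a toy buffer in userspace. `BUF_OWNS_PAGES`, `struct toy_buf`, and `toy_buf_free_pages()` are hypothetical names in the spirit of `_XBF_PAGES` and are not the XFS buffer code.

```c
#include <stdlib.h>

#define BUF_OWNS_PAGES 0x1u   /* hypothetical ownership flag */

struct toy_buf {
    unsigned int flags;
    void **pages;
    size_t page_count;
};

/* Release whatever pages the buffer owns and clear the ownership flag. */
static void toy_buf_free_pages(struct toy_buf *bp)
{
    if (!(bp->flags & BUF_OWNS_PAGES))
        return;                       /* nothing owned: safe to call twice */

    for (size_t i = 0; i < bp->page_count; i++)
        free(bp->pages[i]);
    free(bp->pages);

    bp->pages = NULL;
    bp->page_count = 0;
    bp->flags &= ~BUF_OWNS_PAGES;     /* the step whose absence causes a double free */
}
```

Because the flag is cleared in the same place the pages are released, a later teardown call sees no ownership and returns early.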

xfs: extsize hints are not unlikely in xfs_bmap_btalloc · 493611eb

Authored by Christoph Hellwig on Jan 25, 2017

With COW files they are the hotpath, just like for files with the
extent size hint attribute.  We really shouldn't micro-manage anything
but failure cases with unlikely.

Additionally Arnd Bergmann recently reported that one of these two
unlikely annotations causes link failures together with an upcoming
kernel instrumentation patch, so let's get rid of it ASAP.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

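The guideline of reserving `unlikely` for failure paths can be sketched in userspace with the GCC/Clang builtin that the kernel macro wraps. `align_request()` and its parameters are hypothetical and unrelated to the real `xfs_bmap_btalloc()` logic.

```c
/* Userspace equivalent of the kernel's unlikely(); needs GCC or Clang. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical request rounding: only the error path is annotated. */
static long align_request(long len, long extsize_hint)
{
    if (unlikely(len <= 0))
        return -1;                 /* failure case: reasonable to mark cold */

    if (extsize_hint > 0)          /* may be the hot path, so no annotation */
        len = (len + extsize_hint - 1) / extsize_hint * extsize_hint;

    return len;
}
```

The second branch is left unannotated on purpose, since whether a hint is present depends on the workload rather than on an error condition.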

25 Jan, 2017 4 commits

xfs: remove racy hasattr check from attr ops · 5a93790d

Authored by Brian Foster on Jan 25, 2017

xfs_attr_[get|remove]() have unlocked attribute fork checks to optimize
away a lock cycle in cases where the fork does not exist or is otherwise
empty. This check is not safe, however, because an attribute fork short
form to extent format conversion includes a transient state that causes
the xfs_inode_hasattr() check to fail. Specifically,
xfs_attr_shortform_to_leaf() creates an empty extent format attribute
fork and then adds the existing shortform attributes to it.

This means that lookup of an existing xattr can spuriously return
-ENOATTR when racing against a setxattr that causes the associated
format conversion. This was originally reproduced by an untar on a
particularly configured glusterfs volume, but can also be reproduced on
demand with properly crafted xattr requests.

The format conversion occurs under the exclusive ilock. xfs_attr_get()
and xfs_attr_remove() already have the proper locking and checks further
down in the functions to handle this situation correctly. Drop the
unlocked checks to avoid the spurious failure and rely on the existing
logic.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

xfs: use per-AG reservations for the finobt · 76d771b4

Authored by Christoph Hellwig on Jan 25, 2017

Currently we try to rely on the global reserved block pool for block
allocations for the free inode btree, but I have customer reports
(fairly complex workload, need to find an easier reproducer) where that
is not enough as the AG where we free an inode that requires a new
finobt block is entirely full.  This causes us to cancel a dirty
transaction and thus a file system shutdown.

I think the right way to guard against this is to treat the finobt the same
way as the refcount btree and have per-AG reservations for the possible
worst case size of it, and the patch below implements that.

Note that this could increase mount times with large finobt trees.  In
an ideal world we would have added a field for the number of finobt
blocks to the AGI, similar to what we did for the refcount blocks.
We should add it next time we rev the AGI or AGF format by adding
new fields.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

xfs: only update mount/resv fields on success in __xfs_ag_resv_init · 4dfa2b84

Authored by Christoph Hellwig on Jan 25, 2017

Try to reserve the blocks first and only then update the fields in
or hanging off the mount structure.  This way we can call __xfs_ag_resv_init
again after a previous failure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

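The ordering this message argues for, where the fallible step comes first and the bookkeeping follows, can be sketched as below. `struct toy_pool`, `struct toy_resv`, and `toy_resv_init()` are hypothetical and do not reflect the XFS data structures.

```c
#include <errno.h>

/* Hypothetical free-space pool and reservation bookkeeping. */
struct toy_pool { long free_blocks; };
struct toy_resv { long asked; long reserved; };

/* Reserve `ask` blocks; on failure, leave pool and resv untouched. */
static int toy_resv_init(struct toy_pool *pool, struct toy_resv *resv,
                         long ask)
{
    if (pool->free_blocks < ask)
        return -ENOSPC;          /* fail before touching any state */

    pool->free_blocks -= ask;    /* the reservation succeeded ... */
    resv->asked = ask;           /* ... so only now record it */
    resv->reserved = ask;
    return 0;
}
```

Since a failed call changes nothing, the caller can retry later with the same or a smaller request without first undoing partial updates.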

romfs: use different way to generate fsid for BLOCK or MTD · f598f82e

Authored by Coly Li on Jan 24, 2017

Commit 8a59f5d2 ("fs/romfs: return f_fsid for statfs(2)") generates
a 64bit id from sb->s_bdev->bd_dev.  This is only correct when romfs is
defined with CONFIG_ROMFS_ON_BLOCK.  If romfs is only defined with
CONFIG_ROMFS_ON_MTD, sb->s_bdev is NULL, and referencing
sb->s_bdev->bd_dev will trigger an oops.

Richard Weinberger points out that when CONFIG_ROMFS_BACKED_BY_BOTH=y,
both CONFIG_ROMFS_ON_BLOCK and CONFIG_ROMFS_ON_MTD are defined.
Therefore when calling huge_encode_dev() to generate a 64bit id, I use
the following order to choose the parameter:

- CONFIG_ROMFS_ON_BLOCK defined
  use sb->s_bdev->bd_dev
- CONFIG_ROMFS_ON_BLOCK undefined and CONFIG_ROMFS_ON_MTD defined
  use sb->s_dev
- both CONFIG_ROMFS_ON_BLOCK and CONFIG_ROMFS_ON_MTD undefined
  leave id as 0

When CONFIG_ROMFS_ON_MTD is defined and sb->s_mtd is not NULL, sb->s_dev
is set to a device ID generated by MTD_BLOCK_MAJOR and mtd index,
otherwise sb->s_dev is 0.

This is a best-effort attempt to generate a unique file system ID; if none
of the above conditions are met, f_fsid of this romfs instance will be 0.
Generally only one romfs can be built on a single MTD block device, so this
method is enough to identify multiple romfs instances in a computer.

Link: http://lkml.kernel.org/r/1482928596-115155-1-git-send-email-colyli@suse.de
Signed-off-by: Coly Li <colyli@suse.de>
Reported-by: Nong Li <nongli1031@gmail.com>
Tested-by: Nong Li <nongli1031@gmail.com>
Cc: Richard Weinberger <richard.weinberger@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

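The selection order listed in this message can be sketched with the preprocessor in userspace. This is an illustration under assumptions, not the actual patch: `struct toy_sb`, `toy_fsid()`, and the `TOY_ROMFS_ON_*` macros are hypothetical, and the encoding step via `huge_encode_dev()` is omitted.

```c
#include <stdint.h>

/* Hypothetical stand-in for a superblock; fields echo the commit text. */
struct toy_sb {
    uint32_t bdev_dev;   /* models sb->s_bdev->bd_dev */
    uint32_t s_dev;      /* models sb->s_dev */
};

/* Define TOY_ROMFS_ON_BLOCK and/or TOY_ROMFS_ON_MTD to pick a branch. */
static uint64_t toy_fsid(const struct toy_sb *sb)
{
    uint64_t id = 0;

#if defined(TOY_ROMFS_ON_BLOCK)
    id = sb->bdev_dev;   /* block backend takes priority */
#elif defined(TOY_ROMFS_ON_MTD)
    id = sb->s_dev;      /* MTD only: fall back to s_dev */
#else
    (void)sb;            /* neither backend: leave id as 0 */
#endif
    return id;
}
```

Compiling with `-DTOY_ROMFS_ON_MTD` selects the second branch; with neither macro defined the function returns 0, matching the last bullet in the list.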

openanolis / cloud-kernel: synced successfully more than 1 year ago
