Commits · eed7c4143dd993ef380340af64b8f2e7a79102c8 · openanolis / cloud-kernel

23 Sep, 2016 3 commits

nfs: use safe, interruptible sleeps when waiting to retry LOCK · 66f570ab

Authored by Jeff Layton on Sep 17, 2016

We actually want to use TASK_INTERRUPTIBLE sleeps when we're in the
process of polling for a NFSv4 lock. If there is a signal pending when
the task wakes up, then we'll be returning an error anyway. So, we might
as well wake up immediately for non-fatal signals as well. That allows
us to return to userland more quickly in that case, but won't change the
error that userland sees.

Also, there is no need to use the *_unsafe sleep variants here, as no
vfs-layer locks should be held at this point.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

nfs: eliminate pointless and confusing do_vfs_lock wrappers · 75575ddf

Authored by Jeff Layton on Sep 17, 2016

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

nfs: the length argument to read_buf should be unsigned · b60475c9

Authored by Jeff Layton on Sep 17, 2016

Since it gets passed through to xdr_inline_decode, we might as well
have read_buf expect what it expects -- a size_t.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

20 Sep, 2016 15 commits

nfs: cover ->migratepage with CONFIG_MIGRATION · f844cd0d

Authored by Chao Yu on Sep 20, 2016

It is cleaner to use CONFIG_MIGRATION to cover nfs' private
.migratepage in nfs_file_aops, as we do in other parts of the nfs
operations.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

pnfs: add a new mechanism to select a layout driver according to an ordered list · ca440c38

Authored by Jeff Layton on Sep 15, 2016

Currently, the layout driver selection code always chooses the first one
from the list. That's not really ideal however, as the server can send
the list of layout types in any order that it likes. It's up to the
client to select the best one for its needs.

This patch adds an ordered list of preferred driver types and has the
selection code sort the list of available layout drivers according to it.
Any unrecognized layout type is sorted to the end of the list.

For now, the order of preference is hardcoded, but it should be possible
to make this configurable in the future.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

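The preference-ordered selection described in ca440c38 can be sketched in userspace C as follows. This is only an illustration under stated assumptions: the `pref_order` array, its contents, and the helper names are hypothetical and are not taken from the patch. What it shows is the rule from the commit message: known layout types sort by their position in the preference list, and any unrecognized type sorts to the end.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical preference list: earlier entries are preferred.
 * The numeric ids and their order are illustrative only. */
static const uint32_t pref_order[] = { 5, 3, 2, 4, 1 };
#define N_PREFS (sizeof(pref_order) / sizeof(pref_order[0]))

/* Rank of a layout type: its index in pref_order, or N_PREFS if the
 * type is not in the list (so it sorts after every known type). */
static size_t pref_rank(uint32_t type)
{
	for (size_t i = 0; i < N_PREFS; i++)
		if (pref_order[i] == type)
			return i;
	return N_PREFS;
}

/* qsort() comparator ordering layout types by preference rank. */
static int cmp_layout_type(const void *a, const void *b)
{
	size_t ra = pref_rank(*(const uint32_t *)a);
	size_t rb = pref_rank(*(const uint32_t *)b);

	return (ra > rb) - (ra < rb);
}

/* Sort the server-provided list in place; the caller would then try
 * the entries from the front until a usable driver is found. */
void sort_layout_types(uint32_t *types, size_t n)
{
	qsort(types, n, sizeof(*types), cmp_layout_type);
}
```

Note that qsort() is not a stable sort, so the relative order of several unrecognized types is unspecified in this sketch.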

NFS pnfs data server multipath session trunking · 04fa2c6b

Authored by Andy Adamson on Sep 09, 2016

Try all multipath addresses for a data server. The first address that
successfully connects and creates a session is the DS mount address.
All subsequent addresses are tested for session trunking and
added as aliases.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFS test session trunking with exchange id · ad0849a7

Authored by Andy Adamson on Sep 09, 2016

Use an async exchange id call to test for session trunking.

To conform with RFC 5661 section 18.35.4, the Non-Update on
Existing Clientid case, save the exchange id verifier in
cl_confirm and use it for the session trunking exchange id test.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFS add xprt switch addrs test to match client · 04ea1b3e

Authored by Andy Adamson on Sep 09, 2016

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFS detect session trunking · ba84db96

Authored by Andy Adamson on Sep 09, 2016

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFS refactor nfs4_check_serverowner_major_id · e7b7cbf6

Authored by Andy Adamson on Sep 09, 2016

Needed for session trunking, to compare nfs41_exchange_id_res with an
existing nfs_client.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFS refactor nfs4_match_clientids · 8e548edb

Authored by Andy Adamson on Sep 09, 2016

Needed for session trunking, to compare nfs41_exchange_id_res with an
existing nfs_client.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFS setup async exchange_id · 8d89bd70

Authored by Andy Adamson on Sep 09, 2016

Testing an rpc_xprt for session trunking should not delay application
progress over already established transports.
Set up exchange_id so that it can be issued as an async call to test an
rpc_xprt for session trunking use.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFSv4.x: Add kernel parameter to control the callback server · 5405fc44

Authored by Trond Myklebust on Aug 29, 2016

Add support for the kernel parameter nfs.callback_nr_threads to set
the number of threads that will be assigned to the callback channel.

Add support for the kernel parameter nfs.max_session_cb_slots
to set the maximum size of the callback channel slot table.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFSv4.x: Switch to using svc_set_num_threads() to manage the callback threads · bb6aeba7

Authored by Trond Myklebust on Aug 29, 2016

This will allow us to bump the number of callback threads at will.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFSv4.x: Fix up the global tracking of the callback server · 3b01c11e

Authored by Trond Myklebust on Aug 29, 2016

Ensure that the nfs_callback_info[] array correctly tracks the
struct svc_serv.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

SUNRPC: Initialise struct svc_serv backchannel fields during __svc_create() · d0025268

Authored by Trond Myklebust on Aug 29, 2016

Clean up.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFSv4.x: Set up struct svc_serv_ops for the callback channel · f4b52bb0

Authored by Trond Myklebust on Aug 29, 2016

In order to manage the threads using svc_set_num_threads, we need to
fill in a few extra fields.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

pnfs: track multiple layout types in fsinfo structure · 3132e49e

Authored by Jeff Layton on Aug 10, 2016

The current NFSv4.1/pNFS client assumes that the MDS supports only one
layout type. While that is true for most existing servers, this may
change in the near future.

For now, this patch just plumbs in the ability to track a list of
layouts in the fsinfo structure. The existing behavior of the client
is preserved, by having it just select the first entry in the list.

Signed-off-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de>
Signed-off-by: Jeff Layton <jlayton@poochiereds.net>
Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

16 Sep, 2016 3 commits

aio: mark AIO pseudo-fs noexec · 22f6b4d3

Authored by Jann Horn on Sep 16, 2016

This ensures that do_mmap() won't implicitly make AIO memory mappings
executable if the READ_IMPLIES_EXEC personality flag is set.  Such
behavior is problematic because the security_mmap_file LSM hook doesn't
catch this case, potentially permitting an attacker to bypass a W^X
policy enforced by SELinux.

I have tested the patch on my machine.

To test the behavior, compile and run this:

    #define _GNU_SOURCE
    #include <unistd.h>
    #include <sys/personality.h>
    #include <linux/aio_abi.h>
    #include <err.h>
    #include <stdlib.h>
    #include <stdio.h>
    #include <sys/syscall.h>

    int main(void) {
        personality(READ_IMPLIES_EXEC);
        aio_context_t ctx = 0;
        if (syscall(__NR_io_setup, 1, &ctx))
            err(1, "io_setup");

        char cmd[1000];
        sprintf(cmd, "cat /proc/%d/maps | grep -F '/[aio]'",
            (int)getpid());
        system(cmd);
        return 0;
    }

In the output, "rw-s" is good, "rwxs" is bad.

Signed-off-by: Jann Horn <jann@thejh.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

vfs: cap dedupe request structure size at PAGE_SIZE · b71dbf10

Authored by Darrick J. Wong on Sep 14, 2016

Kirill A Shutemov reports that the kernel doesn't try to cap dest_count
in any way, and uses the number to allocate kernel memory.  This causes
high order allocation warnings in the kernel log if someone passes in a
big enough value.  We should clamp the allocation at PAGE_SIZE to avoid
stressing the VM.

The two existing users of the dedupe ioctl never send more than 120
requests, so we can safely clamp dest_range at PAGE_SIZE, because with
4k pages we can handle up to 127 dedupe candidates.  Given the max
extent length of 16MB, we can end up doing 2GB of IO which is plenty.

[ Note: the "offsetof()" can't overflow, because 'count' is just a
  16-bit integer.  That's not obvious in the limited context of the
  patch, so I'm noting it here because it made me go look.  - Linus ]

Reported-by: "Kirill A. Shutemov" <kirill@shutemov.name>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

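The "127 dedupe candidates" and "2GB of IO" figures in b71dbf10 can be checked with a short worked example. The two structs below are local mirrors written for this sketch, assuming a 24-byte request header followed by 32-byte per-destination entries; the struct and function names are hypothetical and the real definitions live in the kernel's UAPI headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed per-destination entry: 8 + 8 + 8 + 4 + 4 = 32 bytes. */
struct dedupe_info_sketch {
	int64_t  dest_fd;
	uint64_t dest_offset;
	uint64_t bytes_deduped;
	int32_t  status;
	uint32_t reserved;
};

/* Assumed request header: 8 + 8 + 2 + 2 + 4 = 24 bytes, followed by
 * a variable number of entries. */
struct dedupe_range_sketch {
	uint64_t src_offset;
	uint64_t src_length;
	uint16_t dest_count;
	uint16_t reserved1;
	uint32_t reserved2;
	struct dedupe_info_sketch info[];
};

/* How many entries fit when the whole request is capped at page_size. */
size_t max_dedupe_candidates(size_t page_size)
{
	size_t header = offsetof(struct dedupe_range_sketch, info);

	if (page_size < header)
		return 0;
	return (page_size - header) / sizeof(struct dedupe_info_sketch);
}

/* Upper bound on the IO one capped request can trigger, given the
 * maximum extent length per candidate. */
uint64_t max_dedupe_io_bytes(size_t page_size, uint64_t max_extent)
{
	return (uint64_t)max_dedupe_candidates(page_size) * max_extent;
}
```

With a 4096-byte page this gives (4096 - 24) / 32 = 127 entries, and 127 candidates of 16MB each come to 2032MB, which is the "2GB" the commit message rounds to.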

vfs: fix return type of ioctl_file_dedupe_range · 5297e0f0

Authored by Darrick J. Wong on Sep 14, 2016

All the VFS functions in the dedupe ioctl path return int status, so
the ioctl handler ought to as well.

Found by Coverity, CID 1350952.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

12 Sep, 2016 1 commit

NFSv4.1: Fix the CREATE_SESSION slot number accounting · b519d408

Authored by Trond Myklebust on Sep 11, 2016

Ensure that we conform to the algorithm described in RFC5661, section
18.36.4 for when to bump the sequence id. In essence we do it for all
cases except when the RPC call timed out, or in case of the server returning
NFS4ERR_DELAY or NFS4ERR_STALE_CLIENTID.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: stable@vger.kernel.org

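The rule stated in b519d408 (bump the sequence id in every case except an RPC timeout, NFS4ERR_DELAY, or NFS4ERR_STALE_CLIENTID) can be written as a small predicate. This is a sketch, not the patch itself: the function name is hypothetical, and it assumes the kernel-style convention that a status is 0 on success and a negated errno or negated NFSv4 status code on failure.

```c
#include <errno.h>
#include <stdbool.h>

/* NFSv4 status codes as defined by the protocol. */
#define NFS4ERR_DELAY          10008
#define NFS4ERR_STALE_CLIENTID 10022

/* Returns true when the CREATE_SESSION sequence id should be bumped
 * after a call completed with the given status (assumed convention:
 * 0 on success, negative error code otherwise). */
bool should_bump_create_session_seqid(int status)
{
	switch (status) {
	case -ETIMEDOUT:              /* RPC call timed out */
	case -NFS4ERR_DELAY:
	case -NFS4ERR_STALE_CLIENTID:
		return false;
	default:
		return true;          /* success and all other errors */
	}
}
```

A caller would increment its stored sequence id only when this predicate returns true.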

10 Sep, 2016 7 commits

fscrypto: require write access to mount to set encryption policy · ba63f23d

Authored by Eric Biggers on Sep 08, 2016

Since setting an encryption policy requires writing metadata to the
filesystem, it should be guarded by mnt_want_write/mnt_drop_write.
Otherwise, a user could cause a write to a frozen or readonly
filesystem.  This was handled correctly by f2fs but not by ext4.  Make
fscrypt_process_policy() handle it rather than relying on the filesystem
to get it right.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Cc: stable@vger.kernel.org # 4.1+; check fs/{ext4,f2fs}
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Acked-by: Jaegeuk Kim <jaegeuk@kernel.org>

Move check for prefix path to within cifs_get_root() · 348c1bfa

Authored by Sachin Prabhu on Jul 29, 2016

Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Tested-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <smfrench@gmail.com>

Compare prepaths when comparing superblocks · c1d8b24d

Authored by Sachin Prabhu on Jul 29, 2016

The patch
fs/cifs: make share unaccessible at root level mountable
makes use of prepaths when any component of the underlying path is
inaccessible.

When mounting two separate shares that have different prepaths but are
otherwise similar, we end up sharing superblocks when we
shouldn't be doing so.

Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Tested-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <smfrench@gmail.com>

Fix memory leaks in cifs_do_mount() · 4214ebf4

Authored by Sachin Prabhu on Jul 29, 2016

Fix memory leaks introduced by the patch
fs/cifs: make share unaccessible at root level mountable

Also move allocation of cifs_sb->prepath to cifs_setup_cifs_sb().

Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Tested-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <smfrench@gmail.com>

fscrypto: only allow setting encryption policy on directories · 002ced4b

Authored by Eric Biggers on Sep 08, 2016

The FS_IOC_SET_ENCRYPTION_POLICY ioctl allowed setting an encryption
policy on nondirectory files.  This was unintentional, and in the case
of nonempty regular files did not behave as expected because existing
data was not actually encrypted by the ioctl.

In the case of ext4, the user could also trigger filesystem errors in
->empty_dir(), e.g. due to mismatched "directory" checksums when the
kernel incorrectly tried to interpret a regular file as a directory.

This bug affected ext4 with kernels v4.8-rc1 or later and f2fs with
kernels v4.6 and later.  It appears that older kernels only permitted
directories and that the check was accidentally lost during the
refactoring to share the file encryption code between ext4 and f2fs.

This patch restores the !S_ISDIR() check that was present in older
kernels.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

fscrypto: add authorization check for setting encryption policy · 163ae1c6

Authored by Eric Biggers on Sep 08, 2016

On an ext4 or f2fs filesystem with file encryption supported, a user
could set an encryption policy on any empty directory(*) to which they
had readonly access.  This is obviously problematic, since such a
directory might be owned by another user and the new encryption policy
would prevent that other user from creating files in their own directory
(for example).

Fix this by requiring inode_owner_or_capable() permission to set an
encryption policy.  This means that either the caller must own the file,
or the caller must have the capability CAP_FOWNER.

(*) Or also on any regular file, for f2fs v4.6 and later and ext4
    v4.8-rc1 and later; a separate bug fix is coming for that.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Cc: stable@vger.kernel.org # 4.1+; check fs/{ext4,f2fs}
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

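The authorization rule added by 163ae1c6 (the caller must own the file or hold CAP_FOWNER) can be modelled in userspace as below. This is a simplified sketch: the structs, field names, and the choice of -EACCES as the error value are assumptions for illustration, and the kernel's inode_owner_or_capable() is not reproduced here.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the calling process. */
struct caller_sketch {
	unsigned int fsuid;      /* filesystem uid of the caller */
	bool has_cap_fowner;     /* whether CAP_FOWNER is held */
};

/* Hypothetical model of the target inode. */
struct inode_sketch {
	unsigned int owner_uid;
};

/* Owner-or-capable test: true for the owner or a CAP_FOWNER holder. */
static bool owner_or_capable(const struct caller_sketch *c,
			     const struct inode_sketch *i)
{
	return c->fsuid == i->owner_uid || c->has_cap_fowner;
}

/* Returns 0 if the caller may set an encryption policy on the inode,
 * or a negative error code otherwise (-EACCES assumed here). */
int may_set_encryption_policy(const struct caller_sketch *c,
			      const struct inode_sketch *i)
{
	if (!owner_or_capable(c, i))
		return -EACCES;
	return 0;
}
```

Under this model, a user with only read access to someone else's empty directory is rejected before any policy is written.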

mm: fix show_smap() for zone_device-pmd ranges · ca120cf6

Authored by Dan Williams on Sep 03, 2016

Attempting to dump /proc/<pid>/smaps for a process with pmd dax mappings
currently results in the following VM_BUG_ONs:

 kernel BUG at mm/huge_memory.c:1105!
 task: ffff88045f16b140 task.stack: ffff88045be14000
 RIP: 0010:[<ffffffff81268f9b>]  [<ffffffff81268f9b>] follow_trans_huge_pmd+0x2cb/0x340
 [..]
 Call Trace:
  [<ffffffff81306030>] smaps_pte_range+0xa0/0x4b0
  [<ffffffff814c2755>] ? vsnprintf+0x255/0x4c0
  [<ffffffff8123c46e>] __walk_page_range+0x1fe/0x4d0
  [<ffffffff8123c8a2>] walk_page_vma+0x62/0x80
  [<ffffffff81307656>] show_smap+0xa6/0x2b0

 kernel BUG at fs/proc/task_mmu.c:585!
 RIP: 0010:[<ffffffff81306469>]  [<ffffffff81306469>] smaps_pte_range+0x499/0x4b0
 Call Trace:
  [<ffffffff814c2795>] ? vsnprintf+0x255/0x4c0
  [<ffffffff8123c46e>] __walk_page_range+0x1fe/0x4d0
  [<ffffffff8123c8a2>] walk_page_vma+0x62/0x80
  [<ffffffff81307696>] show_smap+0xa6/0x2b0

These locations are sanity checking page flags that must be set for an
anonymous transparent huge page, but are not set for the zone_device
pages associated with dax mappings.

Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>

06 Sep, 2016 2 commits

btrfs: introduce tickets_id to determine whether asynchronous metadata reclaim work makes progress · ce129655

Authored by Wang Xiaoguang on Sep 02, 2016

In btrfs_async_reclaim_metadata_space(), we use ticket's address to
determine whether asynchronous metadata reclaim work is making progress.

	ticket = list_first_entry(&space_info->tickets,
				  struct reserve_ticket, list);
	if (last_ticket == ticket) {
		flush_state++;
	} else {
		last_ticket = ticket;
		flush_state = FLUSH_DELAYED_ITEMS_NR;
		if (commit_cycles)
			commit_cycles--;
	}

But this is wrong: we should not rely on a local variable's address for
this check, because the addresses may be the same. In my test environment,
I used dd to write one 168MB file to a 256MB fs and found that, for this
file, the local variable ticket had the same address every time
wait_reserve_ticket() was called.

For the code above, assume a previous ticket's address is addrA, so
last_ticket is addrA. btrfs_async_reclaim_metadata_space() finishes this
ticket and wakes it up, then another ticket is added, but with the same
address addrA. Now last_ticket equals the current ticket, so the current
ticket's flush work starts from the current flush_state rather than the
initial FLUSH_DELAYED_ITEMS_NR, which may result in some ENOSPC issues (I
have seen this on my test machine).

Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>

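The idea behind ce129655, replacing an address comparison with a counter that cannot be accidentally reused, can be sketched as follows. The commit message does not say where the real patch bumps tickets_id, so this sketch assumes it is bumped whenever a ticket is completed; the struct names, `complete_ticket()`, and `FLUSH_FIRST_STATE` are hypothetical stand-ins.

```c
#include <stdint.h>

/* Stand-in for the initial flush state (FLUSH_DELAYED_ITEMS_NR in
 * the commit message); the value is illustrative. */
enum { FLUSH_FIRST_STATE = 1 };

/* Hypothetical slice of the space info: a counter that changes
 * every time the ticket at the head of the list is completed. */
struct space_info_sketch {
	uint64_t tickets_id;
};

/* State kept by the reclaim worker between iterations. */
struct reclaim_state_sketch {
	uint64_t last_tickets_id;
	int flush_state;
	int commit_cycles;
};

/* Assumed bump point: called when a ticket is satisfied and woken. */
void complete_ticket(struct space_info_sketch *si)
{
	si->tickets_id++;
}

/* One iteration of the progress check. Unlike a pointer comparison,
 * the id differs even if a new ticket reuses the old address. */
void reclaim_step(const struct space_info_sketch *si,
		  struct reclaim_state_sketch *rs)
{
	if (rs->last_tickets_id == si->tickets_id) {
		rs->flush_state++;                 /* no progress: escalate */
	} else {
		rs->last_tickets_id = si->tickets_id;
		rs->flush_state = FLUSH_FIRST_STATE;   /* new ticket: restart */
		if (rs->commit_cycles)
			rs->commit_cycles--;
	}
}
```

Because the counter only ever increases, a new ticket at a recycled address still restarts flushing from the first state.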

Btrfs: remove root_log_ctx from ctx list before btrfs_sync_log returns · cbd60aa7

Authored by Chris Mason on Sep 06, 2016

We use a btrfs_log_ctx structure to pass information into the
tree log commit, and get error values out.  It gets added to a per
log-transaction list which we walk when things go bad.

Commit d1433deb added an optimization to skip waiting for the log
commit, but didn't take root_log_ctx out of the list.  This
patch makes sure we remove things before exiting.

Signed-off-by: Chris Mason <clm@fb.com>
Fixes: d1433deb
cc: stable@vger.kernel.org # 3.15+

05 Sep, 2016 4 commits

btrfs: do not decrease bytes_may_use when replaying extents · ed7a6948

Authored by Wang Xiaoguang on Aug 26, 2016

When replaying extents, there is no need to update bytes_may_use
in btrfs_alloc_logged_file_extent(), otherwise it'll trigger a
WARN_ON about bytes_may_use.

Fixes: ("btrfs: update btrfs_space_info's bytes_may_use timely")
Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>

ceph: do not modify fi->frag in need_reset_readdir() · 0f5aa88a

Authored by Nicolas Iooss on Aug 28, 2016

Commit f3c4ebe6 ("ceph: using hash value to compose dentry offset")
modified "if (fpos_frag(new_pos) != fi->frag)" to "if (fi->frag |=
fpos_frag(new_pos))" in need_reset_readdir(), thus replacing a
comparison operator with an assignment one.

This looks like a typo which is reported by clang when building the
kernel with some warning flags:

    fs/ceph/dir.c:600:22: error: using the result of an assignment as a
    condition without parentheses [-Werror,-Wparentheses]
            } else if (fi->frag |= fpos_frag(new_pos)) {
                       ~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~
    fs/ceph/dir.c:600:22: note: place parentheses around the assignment
    to silence this warning
            } else if (fi->frag |= fpos_frag(new_pos)) {
                                ^
                       (                             )
    fs/ceph/dir.c:600:22: note: use '!=' to turn this compound
    assignment into an inequality comparison
            } else if (fi->frag |= fpos_frag(new_pos)) {
                                ^~
                                !=

Fixes: f3c4ebe6 ("ceph: using hash value to compose dentry offset")
Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

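To see why the `|=` typo fixed in 0f5aa88a matters, the two forms can be compared side by side. This is a standalone sketch: the struct and function names are hypothetical and only mimic the shape of the condition quoted in the commit message.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-file readdir state. */
struct file_info_sketch {
	unsigned int frag;
};

/* Buggy form: `|=` ORs new_frag into fi->frag, modifying it, and the
 * condition is true whenever the combined value is non-zero. */
bool frag_changed_buggy(struct file_info_sketch *fi, unsigned int new_frag)
{
	return (fi->frag |= new_frag) != 0;
}

/* Fixed form: a pure comparison that leaves fi->frag untouched. */
bool frag_changed_fixed(const struct file_info_sketch *fi,
			unsigned int new_frag)
{
	return fi->frag != new_frag;
}
```

The buggy form both reports a change when the fragment is unchanged (for any non-zero value) and silently corrupts the stored fragment, which matches the behaviour the commit title says must not happen.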

ovl: fix workdir creation · e1ff3dd1

Authored by Miklos Szeredi on Sep 05, 2016

Workdir creation fails in latest kernel.

Fix by allowing EOPNOTSUPP as a valid return value from
vfs_removexattr(XATTR_NAME_POSIX_ACL_*).  Upper filesystem may not support
ACL and still be perfectly able to support overlayfs.

Reported-by: Martin Ziegler <ziegler@uni-freiburg.de>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: c11b9fdd ("ovl: remove posix_acl_default from workdir")
Cc: <stable@vger.kernel.org>

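The fix in e1ff3dd1 amounts to treating "not supported" as an acceptable outcome when stripping ACL xattrs from the work directory. A minimal sketch of such a check is shown below; the helper name is hypothetical, and whether the real code tolerates further error codes is outside what the commit message states.

```c
#include <errno.h>
#include <stdbool.h>

/* Returns true if the result of removing a POSIX ACL xattr should not
 * abort workdir creation. Assumed convention: 0 on success, negative
 * errno on failure. -EOPNOTSUPP means the upper filesystem has no ACL
 * support, so there was nothing to remove in the first place. */
bool acl_removal_result_ok(int err)
{
	return err == 0 || err == -EOPNOTSUPP;
}
```

A caller would continue setting up the work directory when this returns true and bail out otherwise.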

pNFS: Don't forget the layout stateid if there are outstanding LAYOUTGETs · 334a8f37

Authored by Trond Myklebust on Sep 04, 2016

If there are outstanding LAYOUTGET rpc calls, then we want to ensure that
we keep the layout stateid around so that we don't inadvertently pick up
an old/misordered sequence id.
The race is as follows:

Client				Server
======				======
LAYOUTGET(seqid)
LAYOUTGET(seqid)
				return LAYOUTGET(seqid+1)
				return LAYOUTGET(seqid+2)
process LAYOUTGET(seqid+2)
	forget layout
process LAYOUTGET(seqid+1)

If it forgets the layout stateid before processing seqid+1, then
the client will not check the layout->plh_barrier, and so will set
the stateid with seqid+1.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

04 Sep, 2016 5 commits

devpts: return NULL pts 'priv' entry for non-devpts nodes · 3e423945

Authored by Linus Torvalds on Sep 03, 2016

In commit 8ead9dd5 ("devpts: more pty driver interface cleanups") I
made devpts_get_priv() just return the dentry->fs_data directly.  And
because I thought it wouldn't happen, I added a warning if you ever saw
a pts node that wasn't on devpts.

And no, that warning never triggered under any actual real use, but you
can trigger it by creating nonsensical pts nodes by hand.

So just revert the warning, and make devpts_get_priv() return NULL for
that case like it used to.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: stable@vger.kernel.org # 4.6+
Cc: "Eric W Biederman" <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

pNFS: Clear out all layout segments if the server unsets lrp->res.lrs_present · 52ec7be2

Authored by Trond Myklebust on Sep 03, 2016

If the server fails to set lrp->res.lrs_present in the LAYOUTRETURN reply,
then that means it believes the client holds no more layout state for that
file, and that the layout stateid is now invalid.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

pNFS: Fix pnfs_set_layout_stateid() to clear NFS_LAYOUT_INVALID_STID · 2a59a041

Authored by Trond Myklebust on Sep 03, 2016

If the layout was marked as invalid, we want to ensure that we initialise
the layout header fields correctly.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

pNFS: Ensure LAYOUTGET and LAYOUTRETURN are properly serialised · bf0291dd

Authored by Trond Myklebust on Sep 03, 2016

According to RFC5661, the client is responsible for serialising
LAYOUTGET and LAYOUTRETURN to avoid ambiguity. Consider the case
where we send both in parallel.

Client					Server
======					======
LAYOUTGET(seqid=X)
LAYOUTRETURN(seqid=X)
					LAYOUTGET return seqid=X+1
					LAYOUTRETURN return seqid=X+2
Process LAYOUTRETURN
          Forget layout stateid
Process LAYOUTGET
          Set seqid=X+1

The client processes the layoutget/layoutreturn in the wrong order,
and since the result of the layoutreturn was to clear the only
existing layout segment, the client forgets the layout stateid.

When the LAYOUTGET comes in, it is treated as having a completely
new stateid, and so the client sets the wrong sequence id...

The fix is to check if there are outstanding LAYOUTGET requests
before we send the LAYOUTRETURN (note that LAYOUTGET will already
wait if it sees an outstanding LAYOUTRETURN).

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: stable@vger.kernel.org # v4.5+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Fix error reporting in nfs_file_write() · c49edecd

Authored by Trond Myklebust on Sep 03, 2016

When doing O_DSYNC writes, the actual write errors are reported through
generic_write_sync(), so we must test the result.

Reported-by: J. R. Okajima <hooanon05g@gmail.com>
Fixes: 18290650 ("NFS: Move buffered I/O locking into nfs_file_write()")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

openanolis / cloud-kernel · last synced successfully more than 1 year ago