1. 20 Jul, 2016 6 commits
    • sunrpc: move NO_CRKEY_TIMEOUT to the auth->au_flags · ce52914e
      Scott Mayhew committed
      A generic_cred can be used to look up a unx_cred or a gss_cred, so it's
      not really safe to use the generic_cred->acred->ac_flags to store
      the NO_CRKEY_TIMEOUT flag.  A lookup for a unx_cred triggered while the
      KEY_EXPIRE_SOON flag is already set will cause both NO_CRKEY_TIMEOUT and
      KEY_EXPIRE_SOON to be set in the ac_flags, leaving the user associated
      with the auth_cred to be in a state where they're perpetually doing 4K
      NFS_FILE_SYNC writes.
      
      This can be reproduced as follows:
      
      1. Mount two NFS filesystems, one with sec=krb5 and one with sec=sys.
      They do not need to be the same export, nor do they even need to be from
      the same NFS server.  Also, v3 is fine.
      $ sudo mount -o v3,sec=krb5 server1:/export /mnt/krb5
      $ sudo mount -o v3,sec=sys server2:/export /mnt/sys
      
      2. As the normal user, before accessing the kerberized mount, kinit with
      a short lifetime (but not so short that renewing the ticket would leave
      you within the 4-minute window again by the time the original ticket
      expires), e.g.
      $ kinit -l 10m -r 60m
      
      3. Do some I/O to the kerberized mount and verify that the writes are
      wsize, UNSTABLE:
      $ dd if=/dev/zero of=/mnt/krb5/file bs=1M count=1
      
      4. Wait until you're within 4 minutes of key expiry, then do some more
      I/O to the kerberized mount to ensure that RPC_CRED_KEY_EXPIRE_SOON gets
      set.  Verify that the writes are 4K, FILE_SYNC:
      $ dd if=/dev/zero of=/mnt/krb5/file bs=1M count=1
      
      5. Now do some I/O to the sec=sys mount.  This will cause
      RPC_CRED_NO_CRKEY_TIMEOUT to be set:
      $ dd if=/dev/zero of=/mnt/sys/file bs=1M count=1
      
      6. Writes for that user will now be permanently 4K, FILE_SYNC,
      regardless of which mount is being written to, until you reboot
      the client.  Renewing the kerberos ticket (assuming it hasn't already
      expired) will have no effect.  Grabbing a new kerberos ticket at this
      point will have no effect either.
      
      Move the flag to the auth->au_flags field (which is currently unused)
      and rename it slightly to reflect that it's no longer associated with
      the auth_cred->ac_flags.  Add the rpc_auth to the arg list of
      rpcauth_cred_key_to_expire and check the au_flags there too.  Finally,
      add the inode to the arg list of nfs_ctx_key_to_expire so we can
      determine the rpc_auth to pass to rpcauth_cred_key_to_expire.
      Signed-off-by: Scott Mayhew <smayhew@redhat.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
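The flag collision described in this commit can be illustrated with a small userspace model. This is a sketch only, not the kernel code: the `model_*` structs, the helper functions, and the `AU_NO_CRKEY_TIMEOUT` name are hypothetical stand-ins, and the bit values are arbitrary. Only the two `RPC_CRED_*` flag names are taken from the commit message. The "new" check assumes that a flavor marked as having no key timeout is never reported as about to expire.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag names from the commit message; the bit values are arbitrary. */
#define RPC_CRED_NO_CRKEY_TIMEOUT (1UL << 0)
#define RPC_CRED_KEY_EXPIRE_SOON  (1UL << 1)
/* Hypothetical name for the flag once it lives in the per-auth field. */
#define AU_NO_CRKEY_TIMEOUT       (1UL << 0)

/* Hypothetical stand-ins for the shared auth_cred and the per-flavor rpc_auth. */
struct model_acred { unsigned long ac_flags; };
struct model_auth  { unsigned long au_flags; };

/* Old scheme: a gss lookup near expiry marks the shared auth_cred. */
static void old_gss_lookup_near_expiry(struct model_acred *ac)
{
	ac->ac_flags |= RPC_CRED_KEY_EXPIRE_SOON;
}

/* Old scheme: a unx lookup marks the same shared auth_cred. */
static void old_unx_lookup(struct model_acred *ac)
{
	ac->ac_flags |= RPC_CRED_NO_CRKEY_TIMEOUT;
}

/* True when both flags are set at once, the stuck state from the report. */
static bool old_flags_conflict(const struct model_acred *ac)
{
	unsigned long both = RPC_CRED_NO_CRKEY_TIMEOUT | RPC_CRED_KEY_EXPIRE_SOON;

	return (ac->ac_flags & both) == both;
}

/* New scheme (assumed reading): consult the per-auth flag first. */
static bool new_key_to_expire(const struct model_auth *auth,
			      const struct model_acred *ac)
{
	if (auth->au_flags & AU_NO_CRKEY_TIMEOUT)
		return false;
	return (ac->ac_flags & RPC_CRED_KEY_EXPIRE_SOON) != 0;
}
```

In this model, a sec=sys auth carrying `AU_NO_CRKEY_TIMEOUT` no longer writes anything into the shared `ac_flags`, so it cannot disturb the expiry state that the krb5 mount relies on.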
    • mount: use sec= that was specified on the command line · e68fd7c8
      Steve Dickson committed
      When older servers return RPC_AUTH_NULL, it means the
      rpc creds will be ignored. In that case use the sec=
      that was specified instead of setting sec=null.
      
      Fixes Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1112983
      Signed-off-by: Steve Dickson <steved@redhat.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
    • pNFS: Fix LAYOUTGET handling of NFS4ERR_BAD_STATEID and NFS4ERR_EXPIRED · f7db0b28
      Trond Myklebust committed
      We want to recover the open stateid if there is no layout stateid
      and/or the stateid argument matches an open stateid.
      Otherwise throw out the existing layout and recover from scratch, as
      the layout stateid is bad.
      
      Fixes: 183d9e7b ("pnfs: rework LAYOUTGET retry handling")
      Cc: stable@vger.kernel.org # 4.7
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Reviewed-by: Jeff Layton <jlayton@redhat.com>
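The recovery rule in this commit message can be sketched as a small decision function. This is an illustrative model under stated assumptions, not the kernel implementation: `model_stateid`, `recovery_action`, and `layoutget_stateid_error()` are hypothetical names, and a stateid is reduced to an opaque 16-byte value.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical opaque stateid; not the kernel's nfs4_stateid. */
struct model_stateid { unsigned char data[16]; };

enum recovery_action {
	RECOVER_OPEN_STATEID,	/* recover the open stateid */
	DISCARD_LAYOUT		/* throw out the layout and start from scratch */
};

static bool stateid_equal(const struct model_stateid *a,
			  const struct model_stateid *b)
{
	return memcmp(a->data, b->data, sizeof(a->data)) == 0;
}

/*
 * Decide how to react to NFS4ERR_BAD_STATEID or NFS4ERR_EXPIRED on LAYOUTGET.
 * layout: the current layout stateid, or NULL if there is none
 * used:   the stateid that was sent as the LAYOUTGET argument
 * open:   the current open stateid
 */
static enum recovery_action
layoutget_stateid_error(const struct model_stateid *layout,
			const struct model_stateid *used,
			const struct model_stateid *open)
{
	if (layout == NULL || stateid_equal(used, open))
		return RECOVER_OPEN_STATEID;
	/* Otherwise the layout stateid itself is bad. */
	return DISCARD_LAYOUT;
}
```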
    • pNFS: Handle NFS4ERR_RECALLCONFLICT correctly in LAYOUTGET · 66b53f32
      Trond Myklebust committed
      Instead of giving up altogether and falling back to doing I/O
      through the MDS, which may make the situation worse, wait for
      2 lease periods for the callback to resolve itself, and then
      try destroying the existing layout.
      
      Only if this was an attempt at getting a first layout, do we
      give up altogether, as the server is clearly crazy.
      
      Fixes: 183d9e7b ("pnfs: rework LAYOUTGET retry handling")
      Cc: stable@vger.kernel.org # 4.7
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Reviewed-by: Jeff Layton <jlayton@redhat.com>
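The policy described for NFS4ERR_RECALLCONFLICT can be written down as a minimal sketch. The type and function names here (`conflict_action`, `conflict_plan`, `plan_recallconflict()`) are hypothetical and exist only for illustration. The lease period is passed in as a plain number of seconds, which is an assumption about units rather than something the commit message states.

```c
#include <assert.h>
#include <stdbool.h>

enum conflict_action {
	GIVE_UP,			/* stop trying to get a layout */
	WAIT_THEN_DESTROY_LAYOUT	/* wait, then destroy the existing layout */
};

struct conflict_plan {
	enum conflict_action action;
	unsigned int wait_seconds;	/* how long to wait before acting */
};

/*
 * first_layoutget: true if this LAYOUTGET was the attempt at a first layout
 * lease_seconds:   length of one lease period
 */
static struct conflict_plan
plan_recallconflict(bool first_layoutget, unsigned int lease_seconds)
{
	struct conflict_plan plan = { GIVE_UP, 0 };

	if (!first_layoutget) {
		/* Give the callback two lease periods to resolve itself. */
		plan.action = WAIT_THEN_DESTROY_LAYOUT;
		plan.wait_seconds = 2 * lease_seconds;
	}
	return plan;
}
```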
    • pNFS: Separate handling of NFS4ERR_LAYOUTTRYLATER and RECALLCONFLICT · e85d7ee4
      Trond Myklebust committed
      They are not the same error, and need to be handled differently.
      
      Fixes: 183d9e7b ("pnfs: rework LAYOUTGET retry handling")
      Cc: stable@vger.kernel.org # 4.7
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Reviewed-by: Jeff Layton <jlayton@redhat.com>
    • pNFS: Fix post-layoutget error handling in pnfs_update_layout() · 56b38a1f
      Trond Myklebust committed
      The non-retry error path is currently broken and ends up releasing the
      reference to the layout twice. It also can end up clearing the
      NFS_LAYOUT_FIRST_LAYOUTGET flag twice, causing a race.
      
      In addition, the retry path will fail to decrement the plh_outstanding
      counter.
      
      Fixes: 183d9e7b ("pnfs: rework LAYOUTGET retry handling")
      Cc: stable@vger.kernel.org # 4.7
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Reviewed-by: Jeff Layton <jlayton@redhat.com>
  2. 01 Jul, 2016 3 commits
    • NFSv4: Allow retry of operations that used a returned delegation stateid · 8487c479
      Trond Myklebust committed
      Fix up nfs4_do_handle_exception() so that it can check if the operation
      that received the NFS4ERR_BAD_STATEID was using a defunct delegation.
      Apply that to the case of SETATTR, which will currently return EIO
      in some cases where this happens.
      Reported-by: Olga Kornievskaia <kolga@netapp.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
    • NFS/pnfs: Do not clobber existing pgio_done_cb in nfs4_proc_read_setup · ca857cc1
      Trond Myklebust committed
      If a pNFS client sets hdr->pgio_done_cb, then we should not overwrite that
      in nfs4_proc_read_setup().
      
      Fixes: 75bf47eb ("pNFS/flexfile: Fix erroneous fall back to...")
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
    • NFS: Fix an Oops in the pNFS files and flexfiles connection setup to the DS · 5c6e5b60
      Trond Myklebust committed
      Chris Worley reports:
       RIP: 0010:[<ffffffffa0245f80>]  [<ffffffffa0245f80>] rpc_new_client+0x2a0/0x2e0 [sunrpc]
       RSP: 0018:ffff880158f6f548  EFLAGS: 00010246
       RAX: 0000000000000000 RBX: ffff880234f8bc00 RCX: 000000000000ea60
       RDX: 0000000000074cc0 RSI: 000000000000ea60 RDI: ffff880234f8bcf0
       RBP: ffff880158f6f588 R08: 000000000001ac80 R09: ffff880237003300
       R10: ffff880201171000 R11: ffffea0000d75200 R12: ffffffffa03afc60
       R13: ffff880230c18800 R14: 0000000000000000 R15: ffff880158f6f680
       FS:  00007f0e32673740(0000) GS:ffff88023fc40000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
       CR2: 0000000000000008 CR3: 0000000234886000 CR4: 00000000001406e0
       Stack:
        ffffffffa047a680 0000000000000000 ffff880158f6f598 ffff880158f6f680
        ffff880158f6f680 ffff880234d11d00 ffff88023357f800 ffff880158f6f7d0
        ffff880158f6f5b8 ffffffffa024660a ffff880158f6f5b8 ffffffffa02492ec
       Call Trace:
        [<ffffffffa024660a>] rpc_create_xprt+0x1a/0xb0 [sunrpc]
        [<ffffffffa02492ec>] ? xprt_create_transport+0x13c/0x240 [sunrpc]
        [<ffffffffa0246766>] rpc_create+0xc6/0x1a0 [sunrpc]
        [<ffffffffa038e695>] nfs_create_rpc_client+0xf5/0x140 [nfs]
        [<ffffffffa038f31a>] nfs_init_client+0x3a/0xd0 [nfs]
        [<ffffffffa038f22f>] nfs_get_client+0x25f/0x310 [nfs]
        [<ffffffffa025cef8>] ? rpc_ntop+0xe8/0x100 [sunrpc]
        [<ffffffffa047512c>] nfs3_set_ds_client+0xcc/0x100 [nfsv3]
        [<ffffffffa041fa10>] nfs4_pnfs_ds_connect+0x120/0x400 [nfsv4]
        [<ffffffffa03d41c7>] nfs4_ff_layout_prepare_ds+0xe7/0x330 [nfs_layout_flexfiles]
        [<ffffffffa03d1b1b>] ff_layout_pg_init_write+0xcb/0x280 [nfs_layout_flexfiles]
        [<ffffffffa03a14dc>] __nfs_pageio_add_request+0x12c/0x490 [nfs]
        [<ffffffffa03a1fa2>] nfs_pageio_add_request+0xc2/0x2a0 [nfs]
        [<ffffffffa03a0365>] ? nfs_pageio_init+0x75/0x120 [nfs]
        [<ffffffffa03a5b50>] nfs_do_writepage+0x120/0x270 [nfs]
        [<ffffffffa03a5d31>] nfs_writepage_locked+0x61/0xc0 [nfs]
        [<ffffffff813d4115>] ? __percpu_counter_add+0x55/0x70
        [<ffffffffa03a6a9f>] nfs_wb_single_page+0xef/0x1c0 [nfs]
        [<ffffffff811ca4a3>] ? __dec_zone_page_state+0x33/0x40
        [<ffffffffa0395b21>] nfs_launder_page+0x41/0x90 [nfs]
        [<ffffffff811baba0>] invalidate_inode_pages2_range+0x340/0x3a0
        [<ffffffff811bac17>] invalidate_inode_pages2+0x17/0x20
        [<ffffffffa039960e>] nfs_release+0x9e/0xb0 [nfs]
        [<ffffffffa0399570>] ? nfs_open+0x60/0x60 [nfs]
        [<ffffffffa0394dad>] nfs_file_release+0x3d/0x60 [nfs]
        [<ffffffff81226e6c>] __fput+0xdc/0x1e0
        [<ffffffff81226fbe>] ____fput+0xe/0x10
        [<ffffffff810bf2e4>] task_work_run+0xc4/0xe0
        [<ffffffff810a4188>] do_exit+0x2e8/0xb30
        [<ffffffff8102471c>] ? do_audit_syscall_entry+0x6c/0x70
        [<ffffffff811464e6>] ? __audit_syscall_exit+0x1e6/0x280
        [<ffffffff810a4a5f>] do_group_exit+0x3f/0xa0
        [<ffffffff810a4ad4>] SyS_exit_group+0x14/0x20
        [<ffffffff8179b76e>] system_call_fastpath+0x12/0x71
      
      Which seems to be due to a call to utsname() when in a task exit context
      in order to determine the hostname to set in rpc_new_client().
      
      In reality, what we want here is not the hostname of the current task, but
      the hostname that was used to set up the metadata server.
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
  3. 29 Jun, 2016 1 commit
    • NFS: Fix another OPEN_DOWNGRADE bug · e547f262
      Trond Myklebust committed
      Olga Kornievskaia reports that the following test fails to trigger
      an OPEN_DOWNGRADE on the wire, and only triggers the final CLOSE.
      
      	fd0 = open(foo, RDWR)   -- should be open on the wire for "both"
      	fd1 = open(foo, RDONLY)  -- should be open on the wire for "read"
      	close(fd0) -- should trigger an open_downgrade
      	read(fd1)
      	close(fd1)
      
      The issue is that we're missing a check for whether or not the current
      state transitioned from an O_RDWR state as opposed to having transitioned
      from a combination of O_RDONLY and O_WRONLY.
      Reported-by: Olga Kornievskaia <aglo@umich.edu>
      Fixes: cd9288ff ("NFSv4: Fix another bug in the close/open_downgrade code")
      Cc: stable@vger.kernel.org # 2.6.33+
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
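The test sequence from this report can be written as a small C helper. This is a sketch under stated assumptions: the function name `open_downgrade_sequence()` is hypothetical, the file must already exist, and on a local filesystem the calls simply succeed. Any OPEN_DOWNGRADE would only be visible on the wire when `path` lives on an NFSv4 mount and the traffic is captured separately.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Run the open/close sequence from the report against an existing file.
 * Returns 0 if every call succeeded, -1 otherwise.
 */
static int open_downgrade_sequence(const char *path)
{
	char buf[16];
	int fd0, fd1;
	int ret = -1;

	fd0 = open(path, O_RDWR);	/* open on the wire for "both" */
	if (fd0 < 0)
		return -1;

	fd1 = open(path, O_RDONLY);	/* open on the wire for "read" */
	if (fd1 < 0) {
		close(fd0);
		return -1;
	}

	/* Closing the read-write descriptor should trigger an OPEN_DOWNGRADE. */
	if (close(fd0) < 0)
		goto out;

	if (read(fd1, buf, sizeof(buf)) < 0)
		goto out;

	ret = 0;
out:
	/* The final close should be the only CLOSE on the wire. */
	if (close(fd1) < 0)
		ret = -1;
	return ret;
}
```

Calling `open_downgrade_sequence()` with the path of a file on the NFSv4 mount under test, while capturing traffic, should show whether the downgrade is sent before the final CLOSE.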
  4. 27 Jun, 2016 1 commit
    • make nfs_atomic_open() call d_drop() on all ->open_context() errors. · d20cb71d
      Al Viro committed
      In "NFSv4: Move dentry instantiation into the NFSv4-specific atomic open code"
      the unconditional d_drop() after the ->open_context() call was removed.  That
      was correct for the success cases (there ->open_context() itself does the
      dcache manipulations), but not for the error ones.  Only one of those (ENOENT)
      got a compensatory d_drop() added in that commit, but in fact it should have
      been done for all errors.  As it is, an O_CREAT non-exclusive open
      on a hashed negative dentry racing with e.g. symlink creation from another
      client ends up with ->open_context() getting an error and proceeding to
      call nfs_lookup() on a hashed dentry, which instantly triggers the
      BUG_ON() in d_materialise_unique() (or, these days, its equivalent in
      d_splice_alias()).
      
      Cc: stable@vger.kernel.org # v3.10+
      Tested-by: Oleg Drokin <green@linuxhacker.ru>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
  5. 25 Jun, 2016 21 commits
  6. 24 Jun, 2016 8 commits