1. 02 11月, 2016 4 次提交
    • J
      nfsd: clean up supported attribute handling · 916d2d84
      J. Bruce Fields 提交于
      Minor cleanup, no change in behavior.
      
      Provide helpers for some common attribute bitmap operations.  Drop some
      comments that just echo the code.
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      916d2d84
    • J
      nfsd: fix error handling for clients that fail to return the layout · 851238a2
      Jeff Layton 提交于
      Currently, when the client continually returns NFS4ERR_DELAY on a
      CB_LAYOUTRECALL, we'll give up trying to retransmit after two lease
      periods, but leave the layout in place.
      
      What we really need to do here is fence the client in this case. Have it
      fall through to that code in that case instead of into the
      NFS4ERR_NOMATCHING_LAYOUT case.
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      851238a2
    • J
      nfsd: more robust allocation failure handling in nfsd_reply_cache_init · 8f97514b
      Jeff Layton 提交于
      Currently, we try to allocate the cache as a single, large chunk, which
      can fail if no big chunks of memory are available. We _do_ try to size
      it according to the amount of memory in the box, but if the server is
      started well after boot time, then the allocation can fail due to memory
      fragmentation.
      
      Fall back to doing a vzalloc if the kcalloc fails, and switch the
      shutdown code to do a kvfree to handle freeing correctly.
      Reported-by: NOlaf Hering <olaf@aepfle.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      8f97514b
    • C
      nfsd: Fix general protection fault in release_lock_stateid() · f46c445b
      Chuck Lever 提交于
      When I push NFSv4.1 / RDMA hard, (xfstests generic/089, for example),
      I get this crash on the server:
      
      Oct 28 22:04:30 klimt kernel: general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
      Oct 28 22:04:30 klimt kernel: Modules linked in: cts rpcsec_gss_krb5 iTCO_wdt iTCO_vendor_support sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm btrfs irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd xor pcspkr raid6_pq i2c_i801 i2c_smbus lpc_ich mfd_core sg mei_me mei ioatdma shpchp wmi ipmi_si ipmi_msghandler rpcrdma ib_ipoib rdma_ucm acpi_power_meter acpi_pad ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c mlx4_ib mlx4_en ib_core sr_mod cdrom sd_mod ast drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crc32c_intel igb ahci libahci ptp mlx4_core pps_core dca libata i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod
      Oct 28 22:04:30 klimt kernel: CPU: 7 PID: 1558 Comm: nfsd Not tainted 4.9.0-rc2-00005-g82cd754 #8
      Oct 28 22:04:30 klimt kernel: Hardware name: Supermicro Super Server/X10SRL-F, BIOS 1.0c 09/09/2015
      Oct 28 22:04:30 klimt kernel: task: ffff880835c3a100 task.stack: ffff8808420d8000
      Oct 28 22:04:30 klimt kernel: RIP: 0010:[<ffffffffa05a759f>]  [<ffffffffa05a759f>] release_lock_stateid+0x1f/0x60 [nfsd]
      Oct 28 22:04:30 klimt kernel: RSP: 0018:ffff8808420dbce0  EFLAGS: 00010246
      Oct 28 22:04:30 klimt kernel: RAX: ffff88084e6660f0 RBX: ffff88084e667020 RCX: 0000000000000000
      Oct 28 22:04:30 klimt kernel: RDX: 0000000000000007 RSI: 0000000000000000 RDI: ffff88084e667020
      Oct 28 22:04:30 klimt kernel: RBP: ffff8808420dbcf8 R08: 0000000000000001 R09: 0000000000000000
      Oct 28 22:04:30 klimt kernel: R10: ffff880835c3a100 R11: ffff880835c3aca8 R12: 6b6b6b6b6b6b6b6b
      Oct 28 22:04:30 klimt kernel: R13: ffff88084e6670d8 R14: ffff880835f546f0 R15: ffff880835f1c548
      Oct 28 22:04:30 klimt kernel: FS:  0000000000000000(0000) GS:ffff88087bdc0000(0000) knlGS:0000000000000000
      Oct 28 22:04:30 klimt kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      Oct 28 22:04:30 klimt kernel: CR2: 00007ff020389000 CR3: 0000000001c06000 CR4: 00000000001406e0
      Oct 28 22:04:30 klimt kernel: Stack:
      Oct 28 22:04:30 klimt kernel: ffff88084e667020 0000000000000000 ffff88084e6670d8 ffff8808420dbd20
      Oct 28 22:04:30 klimt kernel: ffffffffa05ac80d ffff880835f54548 ffff88084e640008 ffff880835f545b0
      Oct 28 22:04:30 klimt kernel: ffff8808420dbd70 ffffffffa059803d ffff880835f1c768 0000000000000870
      Oct 28 22:04:30 klimt kernel: Call Trace:
      Oct 28 22:04:30 klimt kernel: [<ffffffffa05ac80d>] nfsd4_free_stateid+0xfd/0x1b0 [nfsd]
      Oct 28 22:04:30 klimt kernel: [<ffffffffa059803d>] nfsd4_proc_compound+0x40d/0x690 [nfsd]
      Oct 28 22:04:30 klimt kernel: [<ffffffffa0583114>] nfsd_dispatch+0xd4/0x1d0 [nfsd]
      Oct 28 22:04:30 klimt kernel: [<ffffffffa047bbf9>] svc_process_common+0x3d9/0x700 [sunrpc]
      Oct 28 22:04:30 klimt kernel: [<ffffffffa047ca64>] svc_process+0xf4/0x330 [sunrpc]
      Oct 28 22:04:30 klimt kernel: [<ffffffffa05827ca>] nfsd+0xfa/0x160 [nfsd]
      Oct 28 22:04:30 klimt kernel: [<ffffffffa05826d0>] ? nfsd_destroy+0x170/0x170 [nfsd]
      Oct 28 22:04:30 klimt kernel: [<ffffffff810b367b>] kthread+0x10b/0x120
      Oct 28 22:04:30 klimt kernel: [<ffffffff810b3570>] ? kthread_stop+0x280/0x280
      Oct 28 22:04:30 klimt kernel: [<ffffffff8174e8ba>] ret_from_fork+0x2a/0x40
      Oct 28 22:04:30 klimt kernel: Code: c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 55 41 54 53 48 8b 87 b0 00 00 00 48 89 fb 4c 8b a0 98 00 00 00 <49> 8b 44 24 20 48 8d b8 80 03 00 00 e8 10 66 1a e1 48 89 df e8
      Oct 28 22:04:30 klimt kernel: RIP  [<ffffffffa05a759f>] release_lock_stateid+0x1f/0x60 [nfsd]
      Oct 28 22:04:30 klimt kernel: RSP <ffff8808420dbce0>
      Oct 28 22:04:30 klimt kernel: ---[ end trace cf5d0b371973e167 ]---
      
      Jeff Layton says:
      > Hm...now that I look though, this is a little suspicious:
      >
      >    struct nfs4_openowner *oo = openowner(stp->st_openstp->st_stateowner);
      >
      > I wonder if it's possible for the openstateid to have already been
      > destroyed at this point.
      >
      > We might be better off doing something like this to get the client pointer:
      >
      >    stp->st_stid.sc_client;
      >
      > ...which should be more direct and less dependent on other stateids
      > staying valid.
      
      With the suggested change, I am no longer able to reproduce the above oops.
      
      v2: Fix unhash_lock_stateid() as well
      Fix-suggested-by: NJeff Layton <jlayton@redhat.com>
      Fixes: 42691398 ('nfsd: Fix race between FREE_STATEID and LOCK')
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Reviewed-by: NJeff Layton <jlayton@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      f46c445b
  2. 25 10月, 2016 1 次提交
  3. 08 10月, 2016 4 次提交
  4. 28 9月, 2016 1 次提交
  5. 27 9月, 2016 7 次提交
    • J
      nfsd4: setclientid_confirm with unmatched verifier should fail · 7d22fc11
      J. Bruce Fields 提交于
      A setclientid_confirm with (clientid, verifier) both matching an
      existing confirmed record is assumed to be a replay, but if the verifier
      doesn't match, it shouldn't be.
      
      This would be a very rare case, except that clients following
      https://tools.ietf.org/html/rfc7931#section-5.8 may depend on the
      failure.
      Reviewed-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      7d22fc11
    • J
      nfsd: randomize SETCLIENTID reply to help distinguish servers · ebd7c72c
      J. Bruce Fields 提交于
      NFSv4.1 has built-in trunking support that allows a client to determine
      whether two connections to two different IP addresses are actually to
      the same server.  NFSv4.0 does not, but RFC 7931 attempts to provide
      clients a means to do this, basically by performing a SETCLIENTID to one
      address and confirming it with a SETCLIENTID_CONFIRM to the other.
      
      Linux clients since 05f4c350 "NFS: Discover NFSv4 server trunking
      when mounting" implement a variation on this suggestion.  It is possible
      that other clients do too.
      
      This depends on the clientid and verifier not being accepted by an
      unrelated server.  Since both are 64-bit values, that would be very
      unlikely if they were random numbers.  But they aren't:
      
      knfsd generates the 64-bit clientid by concatenating the 32-bit boot
      time (in seconds) and a counter.  This makes collisions between
      clientids generated by the same server extremely unlikely.  But
      collisions are very likely between clientids generated by servers that
      boot at the same time, and it's quite common for multiple servers to
      boot at the same time.  The verifier is a concatenation of the
      SETCLIENTID time (in seconds) and a counter, so again collisions between
      different servers are likely if multiple SETCLIENTIDs are done at the
      same time, which is a common case.
      
      Therefore recent NFSv4.0 clients may decide two different servers are
      really the same, and mount a filesystem from the wrong server.
      
      Fortunately the Linux client, since 55b9df93 "nfsv4/v4.1: Verify the
      client owner id during trunking detection", only does this when given
      the non-default "migration" mount option.
      
      The fault is really with RFC 7931, and needs a client fix, but in the
      meantime we can mitigate the chance of these collisions by randomizing
      the starting value of the counters used to generate clientids and
      verifiers.
      Reported-by: NFrank Sorenson <fsorenso@redhat.com>
      Reviewed-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      ebd7c72c
    • J
      nfsd: set the MAY_NOTIFY_LOCK flag in OPEN replies · 19e4c347
      Jeff Layton 提交于
      If we are using v4.1+, then we can send notification when contended
      locks become free. Inform the client of that fact.
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      19e4c347
    • J
      nfsd: add a LRU list for blocked locks · 7919d0a2
      Jeff Layton 提交于
      It's possible for a client to call in on a lock that is blocked for a
      long time, but discontinue polling for it. A malicious client could
      even set a lock on a file, and then spam the server with failing lock
      requests from different lockowners that pile up in a DoS attack.
      
      Add the blocked lock structures to a per-net namespace LRU when hashing
      them, and timestamp them. If the lock request is not revisited after a
      lease period, we'll drop it under the assumption that the client is no
      longer interested.
      
      This also gives us a mechanism to clean up these objects at server
      shutdown time as well.
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      7919d0a2
    • J
      nfsd: have nfsd4_lock use blocking locks for v4.1+ locks · 76d348fa
      Jeff Layton 提交于
      Create a new per-lockowner+per-inode structure that contains a
      file_lock. Have nfsd4_lock add this structure to the lockowner's list
      prior to setting the lock. Then call the vfs and request a blocking lock
      (by setting FL_SLEEP). If we get anything besides FILE_LOCK_DEFERRED
      back, then we dequeue the block structure and free it. When the next
      lock request comes in, we'll look for an existing block for the same
      filehandle and dequeue and reuse it if there is one.
      
      When the lock comes free (a'la an lm_notify call), we dequeue it
      from the lockowner's list and kick off a CB_NOTIFY_LOCK callback to
      inform the client that it should retry the lock request.
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      76d348fa
    • J
      nfsd: plumb in a CB_NOTIFY_LOCK operation · a188620e
      Jeff Layton 提交于
      Add the encoding/decoding for CB_NOTIFY_LOCK operations.
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      a188620e
    • V
      NFSD: fix corruption in notifier registration · 1eca45f8
      Vasily Averin 提交于
      By design notifier can be registered once only, however nfsd registers
      the same inetaddr notifiers per net-namespace.  When this happen it
      corrupts list of notifiers, as result some notifiers can be not called
      on proper event, traverse on list can be cycled forever, and second
      unregister can access already freed memory.
      
      Cc: stable@vger.kernel.org
      fixes: 36684996 ("nfsd: Register callbacks on the inetaddr_chain and inet6addr_chain")
      Signed-off-by: NVasily Averin <vvs@virtuozzo.com>
      Reviewed-by: NJeff Layton <jlayton@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      1eca45f8
  6. 23 9月, 2016 1 次提交
  7. 22 9月, 2016 1 次提交
  8. 17 9月, 2016 2 次提交
    • J
      nfsd: eliminate cb_minorversion field · 89dfdc96
      Jeff Layton 提交于
      We already have that info in the client pointer. No need to pass around
      a copy.
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      89dfdc96
    • J
      nfsd: don't set a FL_LAYOUT lease for flexfiles layouts · 1983a66f
      Jeff Layton 提交于
      We currently can hit a deadlock (of sorts) when trying to use flexfiles
      layouts with XFS. XFS will call break_layout when something wants to
      write to the file. In the case of the (super-simple) flexfiles layout
      driver in knfsd, the MDS and DS are the same machine.
      
      The client can get a layout and then issue a v3 write to do its I/O. XFS
      will then call xfs_break_layouts, which will cause a CB_LAYOUTRECALL to
      be issued to the client. The client however can't return the layout
      until the v3 WRITE completes, but XFS won't allow the write to proceed
      until the layout is returned.
      
      Christoph says:
      
          XFS only cares about block-like layouts where the client has direct
          access to the file blocks.  I'd need to look how to propagate the
          flag into break_layout, but in principle we don't need to do any
          recalls on truncate ever for file and flexfile layouts.
      
      If we're never going to recall the layout, then we don't even need to
      set the lease at all. Just skip doing so on flexfiles layouts by
      adding a new flag to struct nfsd4_layout_ops and skipping the lease
      setting and removal when that flag is true.
      
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      1983a66f
  9. 13 8月, 2016 1 次提交
  10. 12 8月, 2016 1 次提交
    • C
      nfsd: Fix race between FREE_STATEID and LOCK · 42691398
      Chuck Lever 提交于
      When running LTP's nfslock01 test, the Linux client can send a LOCK
      and a FREE_STATEID request at the same time. The outcome is:
      
      Frame 324    R OPEN stateid [2,O]
      
      Frame 115004 C LOCK lockowner_is_new stateid [2,O] offset 672000 len 64
      Frame 115008 R LOCK stateid [1,L]
      Frame 115012 C WRITE stateid [0,L] offset 672000 len 64
      Frame 115016 R WRITE NFS4_OK
      Frame 115019 C LOCKU stateid [1,L] offset 672000 len 64
      Frame 115022 R LOCKU NFS4_OK
      Frame 115025 C FREE_STATEID stateid [2,L]
      Frame 115026 C LOCK lockowner_is_new stateid [2,O] offset 672128 len 64
      Frame 115029 R FREE_STATEID NFS4_OK
      Frame 115030 R LOCK stateid [3,L]
      Frame 115034 C WRITE stateid [0,L] offset 672128 len 64
      Frame 115038 R WRITE NFS4ERR_BAD_STATEID
      
      In other words, the server returns stateid L in a successful LOCK
      reply, but it has already released it. Subsequent uses of stateid L
      fail.
      
      To address this, protect the generation check in nfsd4_free_stateid
      with the st_mutex. This should guarantee that only one of two
      outcomes occurs: either LOCK returns a fresh valid stateid, or
      FREE_STATEID returns NFS4ERR_LOCKS_HELD.
      Reported-by: NAlexey Kodanev <alexey.kodanev@oracle.com>
      Fix-suggested-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Tested-by: NAlexey Kodanev <alexey.kodanev@oracle.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      42691398
  11. 11 8月, 2016 1 次提交
  12. 05 8月, 2016 8 次提交
  13. 16 7月, 2016 4 次提交
  14. 14 7月, 2016 4 次提交