  1. 08 Oct, 2016 2 commits
  2. 27 Sep, 2016 7 commits
    • nfsd4: setclientid_confirm with unmatched verifier should fail · 7d22fc11
      J. Bruce Fields committed
      A setclientid_confirm with (clientid, verifier) both matching an
      existing confirmed record is assumed to be a replay, but if the verifier
      doesn't match, it shouldn't be.
      
      This would be a very rare case, except that clients following
      https://tools.ietf.org/html/rfc7931#section-5.8 may depend on the
      failure.
      Reviewed-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
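
A minimal sketch of the rule described in the commit above, not the nfsd code itself. The names `ConfirmedClient` and `setclientid_confirm` and the string results are hypothetical, chosen only for illustration, and only the case where a confirmed record with the same clientid already exists is modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfirmedClient:
    """Hypothetical stand-in for a confirmed client record."""
    clientid: int
    verifier: int


def setclientid_confirm(confirmed, clientid, verifier):
    """Classify a SETCLIENTID_CONFIRM against a dict of confirmed records.

    `confirmed` maps clientid -> ConfirmedClient.
    """
    record = confirmed.get(clientid)
    if record is None:
        # No confirmed record: other cases are outside this sketch.
        return "no-confirmed-record"
    if record.verifier == verifier:
        # Clientid and verifier both match: assume a replayed request.
        return "replay"
    # Clientid matches but the verifier does not: this must fail.
    return "fail"
```

With a table holding one confirmed record, a confirm carrying the same verifier is classified as a replay, while a confirm carrying a different verifier is rejected rather than silently accepted.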
    • nfsd: randomize SETCLIENTID reply to help distinguish servers · ebd7c72c
      J. Bruce Fields committed
      NFSv4.1 has built-in trunking support that allows a client to determine
      whether two connections to two different IP addresses are actually to
      the same server.  NFSv4.0 does not, but RFC 7931 attempts to provide
      clients a means to do this, basically by performing a SETCLIENTID to one
      address and confirming it with a SETCLIENTID_CONFIRM to the other.
      
      Linux clients since 05f4c350 "NFS: Discover NFSv4 server trunking
      when mounting" implement a variation on this suggestion.  It is possible
      that other clients do too.
      
      This depends on the clientid and verifier not being accepted by an
      unrelated server.  Since both are 64-bit values, that would be very
      unlikely if they were random numbers.  But they aren't:
      
      knfsd generates the 64-bit clientid by concatenating the 32-bit boot
      time (in seconds) and a counter.  This makes collisions between
      clientids generated by the same server extremely unlikely.  But
      collisions are very likely between clientids generated by servers that
      boot at the same time, and it's quite common for multiple servers to
      boot at the same time.  The verifier is a concatenation of the
      SETCLIENTID time (in seconds) and a counter, so again collisions between
      different servers are likely if multiple SETCLIENTIDs are done at the
      same time, which is a common case.
      
      Therefore recent NFSv4.0 clients may decide two different servers are
      really the same, and mount a filesystem from the wrong server.
      
      Fortunately the Linux client, since 55b9df93 "nfsv4/v4.1: Verify the
      client owner id during trunking detection", only does this when given
      the non-default "migration" mount option.
      
      The fault is really with RFC 7931, and needs a client fix, but in the
      meantime we can mitigate the chance of these collisions by randomizing
      the starting value of the counters used to generate clientids and
      verifiers.
      Reported-by: Frank Sorenson <fsorenso@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
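
The collision argument above can be illustrated with a small model. This is a sketch under stated assumptions, not knfsd's implementation: the `Server` class, the placement of the boot time in the high 32 bits, and the fixed starting counter of 0 in the non-randomized case are all hypothetical choices made for illustration.

```python
import random

MASK32 = 0xFFFFFFFF


def make_clientid(boot_time, counter):
    """Concatenate a 32-bit boot time (high half) and a 32-bit counter (low half)."""
    return ((boot_time & MASK32) << 32) | (counter & MASK32)


class Server:
    """Hypothetical model of a server handing out 64-bit clientids."""

    def __init__(self, boot_time, randomize=False, rng=None):
        rng = rng or random.Random()
        self.boot_time = boot_time
        # Assumption: without the mitigation the counter starts at a fixed
        # value. The mitigation starts it at a random 32-bit value instead.
        self.counter = rng.getrandbits(32) if randomize else 0

    def next_clientid(self):
        self.counter = (self.counter + 1) & MASK32
        return make_clientid(self.boot_time, self.counter)
```

In this model, two servers constructed with the same `boot_time` and `randomize=False` hand out identical first clientids, which is the collision the commit describes. With `randomize=True` the low halves start at independent random values, so a match between two such servers becomes unlikely rather than expected.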
    • nfsd: set the MAY_NOTIFY_LOCK flag in OPEN replies · 19e4c347
      Jeff Layton committed
      If we are using v4.1+, then we can send notification when contended
      locks become free. Inform the client of that fact.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    • nfsd: add a LRU list for blocked locks · 7919d0a2
      Jeff Layton committed
      It's possible for a client to call in on a lock that is blocked for a
      long time, but discontinue polling for it. A malicious client could
      even set a lock on a file, and then spam the server with failing lock
      requests from different lockowners that pile up in a DoS attack.
      
      Add the blocked lock structures to a per-net namespace LRU when hashing
      them, and timestamp them. If the lock request is not revisited after a
      lease period, we'll drop it under the assumption that the client is no
      longer interested.
      
      This also gives us a mechanism to clean up these objects at server
      shutdown time.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
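
The timestamped LRU described above could be sketched as follows. This is an illustrative model only: the class `BlockedLockLRU`, its method names, and the choice to expire an entry once its age reaches the lease period are assumptions, and callers are assumed to pass non-decreasing `now` values so that list order matches timestamp order.

```python
from collections import OrderedDict


class BlockedLockLRU:
    """Hypothetical LRU of blocked lock requests, oldest entry first."""

    def __init__(self, lease_seconds):
        self.lease_seconds = lease_seconds
        self._entries = OrderedDict()  # key -> time of last visit

    def touch(self, key, now):
        """Add a blocked lock, or refresh it when the client polls again."""
        self._entries[key] = now
        self._entries.move_to_end(key)

    def remove(self, key):
        """Forget a block, for example once the lock has been granted."""
        self._entries.pop(key, None)

    def expire(self, now):
        """Drop entries not revisited for a lease period; return their keys."""
        dropped = []
        while self._entries:
            key, stamp = next(iter(self._entries.items()))
            if now - stamp < self.lease_seconds:
                break  # every later entry is newer, so stop scanning
            self._entries.popitem(last=False)
            dropped.append(key)
        return dropped

    def drain(self):
        """Drop everything, as would be needed at shutdown."""
        dropped = list(self._entries)
        self._entries.clear()
        return dropped
```

Because entries are kept in visit order, the expiry pass only inspects the head of the list and stops at the first entry that is still fresh, and the same structure gives a single place to release everything at shutdown.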
    • nfsd: have nfsd4_lock use blocking locks for v4.1+ locks · 76d348fa
      Jeff Layton committed
      Create a new per-lockowner+per-inode structure that contains a
      file_lock. Have nfsd4_lock add this structure to the lockowner's list
      prior to setting the lock. Then call the vfs and request a blocking lock
      (by setting FL_SLEEP). If we get anything besides FILE_LOCK_DEFERRED
      back, then we dequeue the block structure and free it. When the next
      lock request comes in, we'll look for an existing block for the same
      filehandle and dequeue and reuse it if there is one.
      
      When the lock comes free (via an lm_notify call), we dequeue it
      from the lockowner's list and kick off a CB_NOTIFY_LOCK callback to
      inform the client that it should retry the lock request.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    • nfsd: plumb in a CB_NOTIFY_LOCK operation · a188620e
      Jeff Layton committed
      Add the encoding/decoding for CB_NOTIFY_LOCK operations.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    • NFSD: fix corruption in notifier registration · 1eca45f8
      Vasily Averin committed
      By design a notifier can be registered only once; however, nfsd
      registers the same inetaddr notifiers once per net namespace.  When this
      happens it corrupts the list of notifiers: as a result some notifiers
      may not be called on the proper event, traversal of the list can loop
      forever, and a second unregister can access already-freed memory.
      
      Cc: stable@vger.kernel.org
      Fixes: 36684996 ("nfsd: Register callbacks on the inetaddr_chain and inet6addr_chain")
      Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
      Reviewed-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
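
Why a double registration is harmful can be seen in a simplified model of a priority-ordered, singly linked notifier chain. This is a sketch, not the kernel's notifier code: `NotifierBlock`, `NotifierChain`, and their methods are hypothetical names, and the insertion rule (walk past entries of equal or higher priority, then link in) is an assumption made for illustration.

```python
class NotifierBlock:
    """Hypothetical notifier entry; the link lives inside the entry itself."""

    def __init__(self, name, priority=0):
        self.name = name
        self.priority = priority
        self.next = None


class NotifierChain:
    """Hypothetical chain that does not check for duplicate registration."""

    def __init__(self):
        self.head = None

    def register(self, block):
        """Insert by descending priority, after entries of equal priority."""
        prev, cur = None, self.head
        while cur is not None and block.priority <= cur.priority:
            prev, cur = cur, cur.next
        block.next = cur
        if prev is None:
            self.head = block
        else:
            prev.next = block

    def has_cycle(self):
        """Floyd's tortoise-and-hare cycle check."""
        slow = fast = self.head
        while fast is not None and fast.next is not None:
            slow, fast = slow.next, fast.next.next
            if slow is fast:
                return True
        return False

    def names(self, limit=16):
        """Names reachable from the head, visiting at most `limit` entries."""
        out, cur = [], self.head
        while cur is not None and len(out) < limit:
            out.append(cur.name)
            cur = cur.next
        return out
```

In this model, registering the only entry a second time makes it point at itself, so a traversal never terminates. Registering the first of two entries again resets its `next` link and leaves the second entry unreachable from the head, so it would no longer be called.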
  3. 23 Sep, 2016 1 commit
  4. 17 Sep, 2016 2 commits
    • nfsd: eliminate cb_minorversion field · 89dfdc96
      Jeff Layton committed
      We already have that info in the client pointer. No need to pass around
      a copy.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    • nfsd: don't set a FL_LAYOUT lease for flexfiles layouts · 1983a66f
      Jeff Layton committed
      We currently can hit a deadlock (of sorts) when trying to use flexfiles
      layouts with XFS. XFS will call break_layout when something wants to
      write to the file. In the case of the (super-simple) flexfiles layout
      driver in knfsd, the MDS and DS are the same machine.
      
      The client can get a layout and then issue a v3 write to do its I/O. XFS
      will then call xfs_break_layouts, which will cause a CB_LAYOUTRECALL to
      be issued to the client. The client however can't return the layout
      until the v3 WRITE completes, but XFS won't allow the write to proceed
      until the layout is returned.
      
      Christoph says:
      
          XFS only cares about block-like layouts where the client has direct
          access to the file blocks.  I'd need to look how to propagate the
          flag into break_layout, but in principle we don't need to do any
          recalls on truncate ever for file and flexfile layouts.
      
      If we're never going to recall the layout, then we don't even need to
      set the lease at all. Just skip doing so on flexfiles layouts by
      adding a new flag to struct nfsd4_layout_ops and skipping the lease
      setting and removal when that flag is true.
      
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
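
The approach in the last paragraph of the commit above, a per-layout-type flag that skips both setting and removing the lease, might look like this in outline. The names `LayoutOps`, `skip_lease`, and `LayoutState` are hypothetical and do not reflect the actual fields of struct nfsd4_layout_ops.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayoutOps:
    """Hypothetical per-layout-type operations table."""
    name: str
    skip_lease: bool = False  # hypothetical flag: layouts are never recalled


class LayoutState:
    """Hypothetical layout state that may hold a lease on the file."""

    def __init__(self, ops):
        self.ops = ops
        self.lease_held = False

    def setup(self):
        # Only take the lease when this layout type can be recalled.
        if not self.ops.skip_lease:
            self.lease_held = True

    def teardown(self):
        # Mirror setup(): nothing to remove if no lease was ever taken.
        if not self.ops.skip_lease:
            self.lease_held = False
```

Keeping the test on the same flag in both paths means a layout type that never takes the lease also never tries to remove one.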
  5. 13 Aug, 2016 1 commit
  6. 12 Aug, 2016 1 commit
    • nfsd: Fix race between FREE_STATEID and LOCK · 42691398
      Chuck Lever committed
      When running LTP's nfslock01 test, the Linux client can send a LOCK
      and a FREE_STATEID request at the same time. The outcome is:
      
      Frame 324    R OPEN stateid [2,O]
      
      Frame 115004 C LOCK lockowner_is_new stateid [2,O] offset 672000 len 64
      Frame 115008 R LOCK stateid [1,L]
      Frame 115012 C WRITE stateid [0,L] offset 672000 len 64
      Frame 115016 R WRITE NFS4_OK
      Frame 115019 C LOCKU stateid [1,L] offset 672000 len 64
      Frame 115022 R LOCKU NFS4_OK
      Frame 115025 C FREE_STATEID stateid [2,L]
      Frame 115026 C LOCK lockowner_is_new stateid [2,O] offset 672128 len 64
      Frame 115029 R FREE_STATEID NFS4_OK
      Frame 115030 R LOCK stateid [3,L]
      Frame 115034 C WRITE stateid [0,L] offset 672128 len 64
      Frame 115038 R WRITE NFS4ERR_BAD_STATEID
      
      In other words, the server returns stateid L in a successful LOCK
      reply, but it has already released it. Subsequent uses of stateid L
      fail.
      
      To address this, protect the generation check in nfsd4_free_stateid
      with the st_mutex. This should guarantee that only one of two
      outcomes occurs: either LOCK returns a fresh valid stateid, or
      FREE_STATEID returns NFS4ERR_LOCKS_HELD.
      Reported-by: Alexey Kodanev <alexey.kodanev@oracle.com>
      Fix-suggested-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Tested-by: Alexey Kodanev <alexey.kodanev@oracle.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
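
The two permitted outcomes named in the commit above can be illustrated with a deliberately simplified model. It is a sketch under several assumptions: `LockStateModel` and its methods are hypothetical, a single `threading.Lock` stands in for st_mutex, one lockowner on one file is modelled, and the stateid generation counter from the trace is left out.

```python
import threading


class LockStateModel:
    """Hypothetical model of one lockowner's lock stateid on one file."""

    def __init__(self):
        self._mutex = threading.Lock()  # stands in for st_mutex
        self._next_id = 0
        self._current = None            # id of the live lock stateid, if any
        self._held = 0                  # byte-range locks held under it

    def lock(self):
        """LOCK: reuse the live stateid or create a fresh one, then add a lock."""
        with self._mutex:
            if self._current is None:
                self._next_id += 1
                self._current = self._next_id
            self._held += 1
            return self._current

    def unlock(self):
        """LOCKU: release one byte-range lock."""
        with self._mutex:
            if self._held > 0:
                self._held -= 1

    def free_stateid(self, stateid):
        """FREE_STATEID: check and release as one step under the mutex."""
        with self._mutex:
            if stateid != self._current:
                return "NFS4ERR_BAD_STATEID"
            if self._held > 0:
                return "NFS4ERR_LOCKS_HELD"
            self._current = None
            return "NFS4_OK"

    def is_valid(self, stateid):
        """True if `stateid` names the live lock stateid."""
        with self._mutex:
            return stateid is not None and stateid == self._current
```

Because the check and the release in `free_stateid` happen while holding the same mutex that `lock` takes, the two operations serialize. If FREE_STATEID runs first, the following LOCK creates a fresh stateid; if LOCK runs first, FREE_STATEID sees the held lock and returns NFS4ERR_LOCKS_HELD. In neither order does LOCK return a stateid that has already been released.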
  7. 11 Aug, 2016 1 commit
  8. 05 Aug, 2016 8 commits
  9. 16 Jul, 2016 4 commits
  10. 14 Jul, 2016 6 commits
  11. 25 Jun, 2016 1 commit
    • nfsd: check permissions when setting ACLs · 99965378
      Ben Hutchings committed
      Use set_posix_acl, which includes proper permission checks, instead of
      calling ->set_acl directly.  Without this anyone may be able to grant
      themselves permissions to a file by setting the ACL.
      
      Lock the inode to make the new checks atomic with respect to set_acl.
      (Also, nfsd was the only caller of set_acl not locking the inode, so I
      suspect this may fix other races.)
      
      This also simplifies the code, and ensures our ACLs are checked by
      posix_acl_valid.
      
      The permission checks and the inode locking were lost with commit
      4ac7249e, which changed nfsd to use the set_acl inode operation directly
      instead of going through xattr handlers.
      Reported-by: David Sinquin <david@sinquin.eu>
      [agreunba@redhat.com: use set_posix_acl]
      Fixes: 4ac7249e
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: stable@vger.kernel.org
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
  12. 24 Jun, 2016 1 commit
    • vfs: Pass data, ns, and ns->userns to mount_ns · d91ee87d
      Eric W. Biederman committed
      Today what is normally called data (the mount options) is not passed
      to fill_super through mount_ns.
      
      Pass the mount options and the namespace separately to mount_ns so
      that filesystems such as proc that have mount options can use
      mount_ns.
      
      Pass the user namespace to mount_ns so that the standard permission
      check that verifies the mounter has permissions over the namespace can
      be performed in mount_ns instead of in each filesystem's .mount method,
      thus removing the duplication between mqueuefs and proc in terms of
      permission checks.  The extra permission check does not currently
      affect the rpc_pipefs filesystem and the nfsd filesystem, as those
      filesystems do not currently allow unprivileged mounts.  Without
      unprivileged mounts it is guaranteed that the caller has already passed
      capable(CAP_SYS_ADMIN), which guarantees the extra permission check will
      pass.
      
      Update rpc_pipefs and the nfsd filesystem to ensure that the network
      namespace reference is always taken in fill_super and always put in kill_sb
      so that the logic is simpler and so that errors originating inside of
      fill_super do not cause a network namespace leak.
      Acked-by: Seth Forshee <seth.forshee@canonical.com>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
  13. 21 Jun, 2016 1 commit
  14. 16 Jun, 2016 3 commits
  15. 15 Jun, 2016 1 commit