1. 18 9月, 2014 2 次提交
    • J
      lockd: add a /proc/fs/lockd/nlm_end_grace file · d68e3c4a
      Jeff Layton 提交于
      Add a new procfile that will allow a (privileged) userland process to
      end the NLM grace period early. The basic idea here will be to have
      sm-notify write to this file, if it sent out no NOTIFY requests when
      it runs. In that situation, we can generally expect that there will be
      no reclaim requests so the grace period can be lifted early.
      Signed-off-by: NJeff Layton <jlayton@primarydata.com>
      d68e3c4a
    • J
      lockd: move lockd's grace period handling into its own module · f7790029
      Jeff Layton 提交于
      Currently, all of the grace period handling is part of lockd. Eventually
      though we'd like to be able to build v4-only servers, at which point
      we'll need to put all of this elsewhere.
      
      Move the code itself into fs/nfs_common and have it build a grace.ko
      module. Then, rejigger the Kconfig options so that both nfsd and lockd
      enable it automatically.
      Signed-off-by: NJeff Layton <jlayton@primarydata.com>
      f7790029
  2. 03 9月, 2014 1 次提交
  3. 18 8月, 2014 1 次提交
  4. 24 7月, 2014 1 次提交
  5. 07 6月, 2014 1 次提交
  6. 07 5月, 2014 3 次提交
  7. 28 3月, 2014 1 次提交
    • J
      lockd: ensure we tear down any live sockets when socket creation fails during lockd_up · 679b033d
      Jeff Layton 提交于
      We had a Fedora ABRT report with a stack trace like this:
      
      kernel BUG at net/sunrpc/svc.c:550!
      invalid opcode: 0000 [#1] SMP
      [...]
      CPU: 2 PID: 913 Comm: rpc.nfsd Not tainted 3.13.6-200.fc20.x86_64 #1
      Hardware name: Hewlett-Packard HP ProBook 4740s/1846, BIOS 68IRR Ver. F.40 01/29/2013
      task: ffff880146b00000 ti: ffff88003f9b8000 task.ti: ffff88003f9b8000
      RIP: 0010:[<ffffffffa0305fa8>]  [<ffffffffa0305fa8>] svc_destroy+0x128/0x130 [sunrpc]
      RSP: 0018:ffff88003f9b9de0  EFLAGS: 00010206
      RAX: ffff88003f829628 RBX: ffff88003f829600 RCX: 00000000000041ee
      RDX: 0000000000000000 RSI: 0000000000000286 RDI: 0000000000000286
      RBP: ffff88003f9b9de8 R08: 0000000000017360 R09: ffff88014fa97360
      R10: ffffffff8114ce57 R11: ffffea00051c9c00 R12: ffff88003f829600
      R13: 00000000ffffff9e R14: ffffffff81cc7cc0 R15: 0000000000000000
      FS:  00007f4fde284840(0000) GS:ffff88014fa80000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f4fdf5192f8 CR3: 00000000a569a000 CR4: 00000000001407e0
      Stack:
       ffff88003f792300 ffff88003f9b9e18 ffffffffa02de02a 0000000000000000
       ffffffff81cc7cc0 ffff88003f9cb000 0000000000000008 ffff88003f9b9e60
       ffffffffa033bb35 ffffffff8131c86c ffff88003f9cb000 ffff8800a5715008
      Call Trace:
       [<ffffffffa02de02a>] lockd_up+0xaa/0x330 [lockd]
       [<ffffffffa033bb35>] nfsd_svc+0x1b5/0x2f0 [nfsd]
       [<ffffffff8131c86c>] ? simple_strtoull+0x2c/0x50
       [<ffffffffa033c630>] ? write_pool_threads+0x280/0x280 [nfsd]
       [<ffffffffa033c6bb>] write_threads+0x8b/0xf0 [nfsd]
       [<ffffffff8114efa4>] ? __get_free_pages+0x14/0x50
       [<ffffffff8114eff6>] ? get_zeroed_page+0x16/0x20
       [<ffffffff811dec51>] ? simple_transaction_get+0xb1/0xd0
       [<ffffffffa033c098>] nfsctl_transaction_write+0x48/0x80 [nfsd]
       [<ffffffff811b8b34>] vfs_write+0xb4/0x1f0
       [<ffffffff811c3f99>] ? putname+0x29/0x40
       [<ffffffff811b9569>] SyS_write+0x49/0xa0
       [<ffffffff810fc2a6>] ? __audit_syscall_exit+0x1f6/0x2a0
       [<ffffffff816962e9>] system_call_fastpath+0x16/0x1b
      Code: 31 c0 e8 82 db 37 e1 e9 2a ff ff ff 48 8b 07 8b 57 14 48 c7 c7 d5 c6 31 a0 48 8b 70 20 31 c0 e8 65 db 37 e1 e9 f4 fe ff ff 0f 0b <0f> 0b 66 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 41 56 41 55
      RIP  [<ffffffffa0305fa8>] svc_destroy+0x128/0x130 [sunrpc]
       RSP <ffff88003f9b9de0>
      
      Evidently, we created some lockd sockets and then failed to create
      others. make_socks then returned an error and we tried to tear down the
      svc, but svc->sv_permsocks was not empty so we ended up tripping over
      the BUG() in svc_destroy().
      
      Fix this by ensuring that we tear down any live sockets we created when
      socket creation is going to return an error.
      
      Fixes: 786185b5 (SUNRPC: move per-net operations from...)
      Reported-by: NRaphos <raphoszap@laposte.net>
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Reviewed-by: NStanislav Kinsbursky <skinsbursky@parallels.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      679b033d
  8. 14 2月, 2014 1 次提交
    • N
      lockd: send correct lock when granting a delayed lock. · 2ec197db
      NeilBrown 提交于
      If an NFS client attempts to get a lock (using NLM) and the lock is
      not available, the server will remember the request and when the lock
      becomes available it will send a GRANT request to the client to
      provide the lock.
      
      If the client already held an adjacent lock, the GRANT callback will
      report the union of the existing and new locks, which can confuse the
      client.
      
      This happens because __posix_lock_file (called by vfs_lock_file)
      updates the passed-in file_lock structure when adjacent or
      over-lapping locks are found.
      
      To avoid this problem we take a copy of the two fields that can
      be changed (fl_start and fl_end) before the call and restore them
      afterwards.
      An alternate would be to allocate a 'struct file_lock', initialise it,
      use locks_copy_lock() to take a copy, then locks_release_private()
      after the vfs_lock_file() call.  But that is a lot more work.
      Reported-by: NOlaf Kirch <okir@suse.com>
      Signed-off-by: NNeilBrown <neilb@suse.de>
      Cc: stable@vger.kernel.org
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      
      --
      v1 had a couple of issues (large on-stack struct and didn't really work properly).
      This version is much better tested.
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      2ec197db
  9. 06 8月, 2013 1 次提交
    • T
      LOCKD: Don't call utsname()->nodename from nlmclnt_setlockargs · 9a1b6bf8
      Trond Myklebust 提交于
      Firstly, nlmclnt_setlockargs can be called from a reclaimer thread, in
      which case we're in entirely the wrong namespace.
      
      Secondly, commit 8aac6270 (move
      exit_task_namespaces() outside of exit_notify()) now means that
      exit_task_work() is called after exit_task_namespaces(), which
      triggers an Oops when we're freeing up the locks.
      
      Fix this by ensuring that we initialise the nlm_host's rpc_client at mount
      time, so that the cl_nodename field is initialised to the value of
      utsname()->nodename that the net namespace uses. Then replace the
      lockd callers of utsname()->nodename.
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      Cc: Toralf Förster <toralf.foerster@gmx.de>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Nix <nix@esperi.org.uk>
      Cc: Jeff Layton <jlayton@redhat.com>
      Cc: stable@vger.kernel.org # 3.10.x
      9a1b6bf8
  10. 12 7月, 2013 1 次提交
    • D
      lockd: protect nlm_blocked access in nlmsvc_retry_blocked · 1c327d96
      David Jeffery 提交于
      In nlmsvc_retry_blocked, the check that the list is non-empty and acquiring
      the pointer of the first entry is unprotected by any lock.  This allows a rare
      race condition when there is only one entry on the list.  A function such as
      nlmsvc_grant_callback() can be called, which will temporarily remove the entry
      from the list.  Between the list_empty() and list_entry(),the list may become
      empty, causing an invalid pointer to be used as an nlm_block, leading to a
      possible crash.
      
      This patch adds the nlm_block_lock around these calls to prevent concurrent
      use of the nlm_blocked list.
      
      This was a regression introduced by
      f904be9c  "lockd: Mostly remove BKL from
      the server".
      
      Cc: Bryan Schumaker <bjschuma@netapp.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NDavid Jeffery <djeffery@redhat.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      1c327d96
  11. 04 7月, 2013 1 次提交
  12. 29 6月, 2013 3 次提交
    • J
      locks: add a new "lm_owner_key" lock operation · 3999e493
      Jeff Layton 提交于
      Currently, the hashing that the locking code uses to add these values
      to the blocked_hash is simply calculated using fl_owner field. That's
      valid in most cases except for server-side lockd, which validates the
      owner of a lock based on fl_owner and fl_pid.
      
      In the case where you have a small number of NFS clients doing a lot
      of locking between different processes, you could end up with all
      the blocked requests sitting in a very small number of hash buckets.
      
      Add a new lm_owner_key operation to the lock_manager_operations that
      will generate an unsigned long to use as the key in the hashtable.
      That function is only implemented for server-side lockd, and simply
      XORs the fl_owner and fl_pid.
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Acked-by: NJ. Bruce Fields <bfields@fieldses.org>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      3999e493
    • J
      locks: protect most of the file_lock handling with i_lock · 1c8c601a
      Jeff Layton 提交于
      Having a global lock that protects all of this code is a clear
      scalability problem. Instead of doing that, move most of the code to be
      protected by the i_lock instead. The exceptions are the global lists
      that the ->fl_link sits on, and the ->fl_block list.
      
      ->fl_link is what connects these structures to the
      global lists, so we must ensure that we hold those locks when iterating
      over or updating these lists.
      
      Furthermore, sound deadlock detection requires that we hold the
      blocked_list state steady while checking for loops. We also must ensure
      that the search and update to the list are atomic.
      
      For the checking and insertion side of the blocked_list, push the
      acquisition of the global lock into __posix_lock_file and ensure that
      checking and update of the  blocked_list is done without dropping the
      lock in between.
      
      On the removal side, when waking up blocked lock waiters, take the
      global lock before walking the blocked list and dequeue the waiters from
      the global list prior to removal from the fl_block list.
      
      With this, deadlock detection should be race free while we minimize
      excessive file_lock_lock thrashing.
      
      Finally, in order to avoid a lock inversion problem when handling
      /proc/locks output we must ensure that manipulations of the fl_block
      list are also protected by the file_lock_lock.
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      1c8c601a
    • J
      f891a29f
  13. 22 4月, 2013 1 次提交
  14. 28 2月, 2013 1 次提交
    • S
      hlist: drop the node parameter from iterators · b67bfe0d
      Sasha Levin 提交于
      I'm not sure why, but the hlist for each entry iterators were conceived
      
              list_for_each_entry(pos, head, member)
      
      The hlist ones were greedy and wanted an extra parameter:
      
              hlist_for_each_entry(tpos, pos, head, member)
      
      Why did they need an extra pos parameter? I'm not quite sure. Not only
      they don't really need it, it also prevents the iterator from looking
      exactly like the list iterator, which is unfortunate.
      
      Besides the semantic patch, there was some manual work required:
      
       - Fix up the actual hlist iterators in linux/list.h
       - Fix up the declaration of other iterators based on the hlist ones.
       - A very small amount of places were using the 'node' parameter, this
       was modified to use 'obj->member' instead.
       - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
       properly, so those had to be fixed up manually.
      
      The semantic patch which is mostly the work of Peter Senna Tschudin is here:
      
      @@
      iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;
      
      type T;
      expression a,c,d,e;
      identifier b;
      statement S;
      @@
      
      -T b;
          <+... when != b
      (
      hlist_for_each_entry(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_continue(a,
      - b,
      c) S
      |
      hlist_for_each_entry_from(a,
      - b,
      c) S
      |
      hlist_for_each_entry_rcu(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_rcu_bh(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_continue_rcu_bh(a,
      - b,
      c) S
      |
      for_each_busy_worker(a, c,
      - b,
      d) S
      |
      ax25_uid_for_each(a,
      - b,
      c) S
      |
      ax25_for_each(a,
      - b,
      c) S
      |
      inet_bind_bucket_for_each(a,
      - b,
      c) S
      |
      sctp_for_each_hentry(a,
      - b,
      c) S
      |
      sk_for_each(a,
      - b,
      c) S
      |
      sk_for_each_rcu(a,
      - b,
      c) S
      |
      sk_for_each_from
      -(a, b)
      +(a)
      S
      + sk_for_each_from(a) S
      |
      sk_for_each_safe(a,
      - b,
      c, d) S
      |
      sk_for_each_bound(a,
      - b,
      c) S
      |
      hlist_for_each_entry_safe(a,
      - b,
      c, d, e) S
      |
      hlist_for_each_entry_continue_rcu(a,
      - b,
      c) S
      |
      nr_neigh_for_each(a,
      - b,
      c) S
      |
      nr_neigh_for_each_safe(a,
      - b,
      c, d) S
      |
      nr_node_for_each(a,
      - b,
      c) S
      |
      nr_node_for_each_safe(a,
      - b,
      c, d) S
      |
      - for_each_gfn_sp(a, c, d, b) S
      + for_each_gfn_sp(a, c, d) S
      |
      - for_each_gfn_indirect_valid_sp(a, c, d, b) S
      + for_each_gfn_indirect_valid_sp(a, c, d) S
      |
      for_each_host(a,
      - b,
      c) S
      |
      for_each_host_safe(a,
      - b,
      c, d) S
      |
      for_each_mesh_entry(a,
      - b,
      c, d) S
      )
          ...+>
      
      [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
      [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
      [akpm@linux-foundation.org: checkpatch fixes]
      [akpm@linux-foundation.org: fix warnings]
      [akpm@linux-foudnation.org: redo intrusive kvm changes]
      Tested-by: NPeter Senna Tschudin <peter.senna@gmail.com>
      Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: NSasha Levin <sasha.levin@oracle.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b67bfe0d
  15. 23 2月, 2013 1 次提交
  16. 20 2月, 2013 1 次提交
  17. 16 2月, 2013 1 次提交
  18. 05 2月, 2013 1 次提交
  19. 05 11月, 2012 4 次提交
  20. 24 10月, 2012 2 次提交
  21. 17 10月, 2012 1 次提交
  22. 02 10月, 2012 3 次提交
  23. 23 9月, 2012 1 次提交
  24. 22 8月, 2012 1 次提交
  25. 30 7月, 2012 2 次提交
  26. 28 7月, 2012 3 次提交