1. 18 11月, 2016 1 次提交
    • A
      netns: make struct pernet_operations::id unsigned int · c7d03a00
      Alexey Dobriyan 提交于
      Make struct pernet_operations::id unsigned.
      
      There are 2 reasons to do so:
      
      1)
      This field is really an index into an zero based array and
      thus is unsigned entity. Using negative value is out-of-bound
      access by definition.
      
      2)
      On x86_64 unsigned 32-bit data which are mixed with pointers
      via array indexing or offsets added or subtracted to pointers
      are preffered to signed 32-bit data.
      
      "int" being used as an array index needs to be sign-extended
      to 64-bit before being used.
      
      	void f(long *p, int i)
      	{
      		g(p[i]);
      	}
      
        roughly translates to
      
      	movsx	rsi, esi
      	mov	rdi, [rsi+...]
      	call 	g
      
      MOVSX is 3 byte instruction which isn't necessary if the variable is
      unsigned because x86_64 is zero extending by default.
      
      Now, there is net_generic() function which, you guessed it right, uses
      "int" as an array index:
      
      	static inline void *net_generic(const struct net *net, int id)
      	{
      		...
      		ptr = ng->ptr[id - 1];
      		...
      	}
      
      And this function is used a lot, so those sign extensions add up.
      
      Patch snipes ~1730 bytes on allyesconfig kernel (without all junk
      messing with code generation):
      
      	add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)
      
      Unfortunately some functions actually grow bigger.
      This is a semmingly random artefact of code generation with register
      allocator being used differently. gcc decides that some variable
      needs to live in new r8+ registers and every access now requires REX
      prefix. Or it is shifted into r12, so [r12+0] addressing mode has to be
      used which is longer than [r8]
      
      However, overall balance is in negative direction:
      
      	add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)
      	function                                     old     new   delta
      	nfsd4_lock                                  3886    3959     +73
      	tipc_link_build_proto_msg                   1096    1140     +44
      	mac80211_hwsim_new_radio                    2776    2808     +32
      	tipc_mon_rcv                                1032    1058     +26
      	svcauth_gss_legacy_init                     1413    1429     +16
      	tipc_bcbase_select_primary                   379     392     +13
      	nfsd4_exchange_id                           1247    1260     +13
      	nfsd4_setclientid_confirm                    782     793     +11
      		...
      	put_client_renew_locked                      494     480     -14
      	ip_set_sockfn_get                            730     716     -14
      	geneve_sock_add                              829     813     -16
      	nfsd4_sequence_done                          721     703     -18
      	nlmclnt_lookup_host                          708     686     -22
      	nfsd4_lockt                                 1085    1063     -22
      	nfs_get_client                              1077    1050     -27
      	tcf_bpf_init                                1106    1076     -30
      	nfsd4_encode_fattr                          5997    5930     -67
      	Total: Before=154856051, After=154854321, chg -0.00%
      Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c7d03a00
  2. 12 10月, 2016 1 次提交
  3. 01 7月, 2016 1 次提交
  4. 30 5月, 2016 1 次提交
  5. 07 1月, 2016 2 次提交
  6. 23 12月, 2015 1 次提交
  7. 24 10月, 2015 1 次提交
    • A
      lockd: get rid of reference-counted NSM RPC clients · 0d0f4aab
      Andrey Ryabinin 提交于
      Currently we have reference-counted per-net NSM RPC client
      which created on the first monitor request and destroyed
      after the last unmonitor request. It's needed because
      RPC client need to know 'utsname()->nodename', but utsname()
      might be NULL when nsm_unmonitor() called.
      
      So instead of holding the rpc client we could just save nodename
      in struct nlm_host and pass it to the rpc_create().
      Thus ther is no need in keeping rpc client until last
      unmonitor request. We could create separate RPC clients
      for each monitor/unmonitor requests.
      Signed-off-by: NAndrey Ryabinin <aryabinin@virtuozzo.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      0d0f4aab
  8. 23 10月, 2015 1 次提交
  9. 13 10月, 2015 1 次提交
    • A
      lockd: create NSM handles per net namespace · 0ad95472
      Andrey Ryabinin 提交于
      Commit cb7323ff ("lockd: create and use per-net NSM
       RPC clients on MON/UNMON requests") introduced per-net
      NSM RPC clients. Unfortunately this doesn't make any sense
      without per-net nsm_handle.
      
      E.g. the following scenario could happen
      Two hosts (X and Y) in different namespaces (A and B) share
      the same nsm struct.
      
      1. nsm_monitor(host_X) called => NSM rpc client created,
      	nsm->sm_monitored bit set.
      2. nsm_mointor(host-Y) called => nsm->sm_monitored already set,
      	we just exit. Thus in namespace B ln->nsm_clnt == NULL.
      3. host X destroyed => nsm->sm_count decremented to 1
      4. host Y destroyed => nsm_unmonitor() => nsm_mon_unmon() => NULL-ptr
      	dereference of *ln->nsm_clnt
      
      So this could be fixed by making per-net nsm_handles list,
      instead of global. Thus different net namespaces will not be able
      share the same nsm_handle.
      Signed-off-by: NAndrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      0ad95472
  10. 13 8月, 2015 1 次提交
  11. 11 8月, 2015 2 次提交
  12. 22 4月, 2015 1 次提交
  13. 04 2月, 2015 1 次提交
  14. 23 1月, 2015 1 次提交
  15. 17 1月, 2015 2 次提交
  16. 16 1月, 2015 1 次提交
  17. 06 1月, 2015 1 次提交
    • T
      LOCKD: Fix a race when initialising nlmsvc_timeout · 06bed7d1
      Trond Myklebust 提交于
      This commit fixes a race whereby nlmclnt_init() first starts the lockd
      daemon, and then calls nlm_bind_host() with the expectation that
      nlmsvc_timeout has already been initialised. Unfortunately, there is no
      no synchronisation between lockd() and lockd_up() to guarantee that this
      is the case.
      
      Fix is to move the initialisation of nlmsvc_timeout into lockd_create_svc
      
      Fixes: 9a1b6bf8 ("LOCKD: Don't call utsname()->nodename...")
      Cc: Bruce Fields <bfields@fieldses.org>
      Cc: stable@vger.kernel.org # 3.10.x
      Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
      06bed7d1
  18. 10 12月, 2014 1 次提交
  19. 25 11月, 2014 1 次提交
  20. 20 11月, 2014 1 次提交
  21. 07 11月, 2014 1 次提交
  22. 25 9月, 2014 1 次提交
  23. 18 9月, 2014 2 次提交
    • J
      lockd: add a /proc/fs/lockd/nlm_end_grace file · d68e3c4a
      Jeff Layton 提交于
      Add a new procfile that will allow a (privileged) userland process to
      end the NLM grace period early. The basic idea here will be to have
      sm-notify write to this file, if it sent out no NOTIFY requests when
      it runs. In that situation, we can generally expect that there will be
      no reclaim requests so the grace period can be lifted early.
      Signed-off-by: NJeff Layton <jlayton@primarydata.com>
      d68e3c4a
    • J
      lockd: move lockd's grace period handling into its own module · f7790029
      Jeff Layton 提交于
      Currently, all of the grace period handling is part of lockd. Eventually
      though we'd like to be able to build v4-only servers, at which point
      we'll need to put all of this elsewhere.
      
      Move the code itself into fs/nfs_common and have it build a grace.ko
      module. Then, rejigger the Kconfig options so that both nfsd and lockd
      enable it automatically.
      Signed-off-by: NJeff Layton <jlayton@primarydata.com>
      f7790029
  24. 10 9月, 2014 3 次提交
    • J
      lockd: rip out deferred lock handling from testlock codepath · 09802fd2
      Jeff Layton 提交于
      As Kinglong points out, the nlm_block->b_fl field is no longer used at
      all. Also, vfs_test_lock in the generic locking code will only return
      FILE_LOCK_DEFERRED if FL_SLEEP is set, and it isn't here.
      
      The only other place that returns that value is the DLM lock code, but
      it only does that in dlm_posix_lock, never in dlm_posix_get.
      
      Remove all of the deferred locking code from the testlock codepath
      since it doesn't appear to ever be used anyway.
      
      I do have a small concern that this might cause a behavior change in the
      case where you have a block already sitting on the list when the
      testlock request comes in, but that looks like it doesn't really work
      properly anyway. I think it's best to just pass that down to
      vfs_test_lock and let the filesystem report that instead of trying to
      infer what's going on with the lock by looking at an existing block.
      
      Cc: cluster-devel@redhat.com
      Signed-off-by: NJeff Layton <jlayton@primarydata.com>
      Reviewed-by: NKinglong Mee <kinglongmee@gmail.com>
      09802fd2
    • K
      locks: Copy fl_lmops information for conflock in locks_copy_conflock() · f328296e
      Kinglong Mee 提交于
      Commit d5b9026a ([PATCH] knfsd: locks: flag NFSv4-owned locks) using
      fl_lmops field in file_lock for checking nfsd4 lockowner.
      
      But, commit 1a747ee0 (locks: don't call ->copy_lock methods on return
      of conflicting locks) causes the fl_lmops of conflock always be NULL.
      
      Also, commit 0996905f (lockd: posix_test_lock() should not call
      locks_copy_lock()) caused the fl_lmops of conflock always be NULL too.
      
      Make sure copy the private information by fl_copy_lock() in struct
      file_lock_operations, merge __locks_copy_lock() to fl_copy_lock().
      
      Jeff advice, "Set fl_lmops on conflocks, but don't set fl_ops.
      fl_ops are superfluous, since they are callbacks into the filesystem.
      There should be no need to bother the filesystem at all with info
      in a conflock. But, lock _ownership_ matters for conflocks and that's
      indicated by the fl_lmops. So you really do want to copy the fl_lmops
      for conflocks I think."
      
      v5: add missing calling of locks_release_private() in nlmsvc_testlock()
      v4: only copy fl_lmops for conflock, don't copy fl_ops
      Signed-off-by: NKinglong Mee <kinglongmee@gmail.com>
      Signed-off-by: NJeff Layton <jlayton@primarydata.com>
      f328296e
    • J
      locks: Remove unused conf argument from lm_grant · d0449b90
      Joe Perches 提交于
      This argument is always NULL so don't pass it around.
      
      [jlayton: remove dependencies on previous patches in series]
      Signed-off-by: NJoe Perches <joe@perches.com>
      Signed-off-by: NJeff Layton <jlayton@primarydata.com>
      d0449b90
  25. 09 9月, 2014 1 次提交
    • J
      lockd: fix rpcbind crash on lockd startup failure · 7c17705e
      J. Bruce Fields 提交于
      Nikita Yuschenko reported that booting a kernel with init=/bin/sh and
      then nfs mounting without portmap or rpcbind running using a busybox
      mount resulted in:
      
        # mount -t nfs 10.30.130.21:/opt /mnt
        svc: failed to register lockdv1 RPC service (errno 111).
        lockd_up: makesock failed, error=-111
        Unable to handle kernel paging request for data at address 0x00000030
        Faulting instruction address: 0xc055e65c
        Oops: Kernel access of bad area, sig: 11 [#1]
        MPC85xx CDS
        Modules linked in:
        CPU: 0 PID: 1338 Comm: mount Not tainted 3.10.44.cge #117
        task: cf29cea0 ti: cf35c000 task.ti: cf35c000
        NIP: c055e65c LR: c0566490 CTR: c055e648
        REGS: cf35dad0 TRAP: 0300   Not tainted  (3.10.44.cge)
        MSR: 00029000 <CE,EE,ME>  CR: 22442488  XER: 20000000
        DEAR: 00000030, ESR: 00000000
      
        GPR00: c05606f4 cf35db80 cf29cea0 cf0ded80 cf0dedb8 00000001 1dec3086
        00000000
        GPR08: 00000000 c07b1640 00000007 1dec3086 22442482 100b9758 00000000
        10090ae8
        GPR16: 00000000 000186a5 00000000 00000000 100c3018 bfa46edc 100b0000
        bfa46ef0
        GPR24: cf386ae0 c07834f0 00000000 c0565f88 00000001 cf0dedb8 00000000
        cf0ded80
        NIP [c055e65c] call_start+0x14/0x34
        LR [c0566490] __rpc_execute+0x70/0x250
        Call Trace:
        [cf35db80] [00000080] 0x80 (unreliable)
        [cf35dbb0] [c05606f4] rpc_run_task+0x9c/0xc4
        [cf35dbc0] [c0560840] rpc_call_sync+0x50/0xb8
        [cf35dbf0] [c056ee90] rpcb_register_call+0x54/0x84
        [cf35dc10] [c056f24c] rpcb_register+0xf8/0x10c
        [cf35dc70] [c0569e18] svc_unregister.isra.23+0x100/0x108
        [cf35dc90] [c0569e38] svc_rpcb_cleanup+0x18/0x30
        [cf35dca0] [c0198c5c] lockd_up+0x1dc/0x2e0
        [cf35dcd0] [c0195348] nlmclnt_init+0x2c/0xc8
        [cf35dcf0] [c015bb5c] nfs_start_lockd+0x98/0xec
        [cf35dd20] [c015ce6c] nfs_create_server+0x1e8/0x3f4
        [cf35dd90] [c0171590] nfs3_create_server+0x10/0x44
        [cf35dda0] [c016528c] nfs_try_mount+0x158/0x1e4
        [cf35de20] [c01670d0] nfs_fs_mount+0x434/0x8c8
        [cf35de70] [c00cd3bc] mount_fs+0x20/0xbc
        [cf35de90] [c00e4f88] vfs_kern_mount+0x50/0x104
        [cf35dec0] [c00e6e0c] do_mount+0x1d0/0x8e0
        [cf35df10] [c00e75ac] SyS_mount+0x90/0xd0
        [cf35df40] [c000ccf4] ret_from_syscall+0x0/0x3c
      
      The addition of svc_shutdown_net() resulted in two calls to
      svc_rpcb_cleanup(); the second is no longer necessary and crashes when
      it calls rpcb_register_call with clnt=NULL.
      Reported-by: NNikita Yushchenko <nyushchenko@dev.rtsoft.ru>
      Fixes: 679b033d "lockd: ensure we tear down any live sockets when socket creation fails during lockd_up"
      Cc: stable@vger.kernel.org
      Acked-by: NJeff Layton <jlayton@primarydata.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      7c17705e
  26. 03 9月, 2014 1 次提交
  27. 18 8月, 2014 1 次提交
  28. 24 7月, 2014 1 次提交
  29. 07 6月, 2014 1 次提交
  30. 07 5月, 2014 3 次提交
  31. 28 3月, 2014 1 次提交
    • J
      lockd: ensure we tear down any live sockets when socket creation fails during lockd_up · 679b033d
      Jeff Layton 提交于
      We had a Fedora ABRT report with a stack trace like this:
      
      kernel BUG at net/sunrpc/svc.c:550!
      invalid opcode: 0000 [#1] SMP
      [...]
      CPU: 2 PID: 913 Comm: rpc.nfsd Not tainted 3.13.6-200.fc20.x86_64 #1
      Hardware name: Hewlett-Packard HP ProBook 4740s/1846, BIOS 68IRR Ver. F.40 01/29/2013
      task: ffff880146b00000 ti: ffff88003f9b8000 task.ti: ffff88003f9b8000
      RIP: 0010:[<ffffffffa0305fa8>]  [<ffffffffa0305fa8>] svc_destroy+0x128/0x130 [sunrpc]
      RSP: 0018:ffff88003f9b9de0  EFLAGS: 00010206
      RAX: ffff88003f829628 RBX: ffff88003f829600 RCX: 00000000000041ee
      RDX: 0000000000000000 RSI: 0000000000000286 RDI: 0000000000000286
      RBP: ffff88003f9b9de8 R08: 0000000000017360 R09: ffff88014fa97360
      R10: ffffffff8114ce57 R11: ffffea00051c9c00 R12: ffff88003f829600
      R13: 00000000ffffff9e R14: ffffffff81cc7cc0 R15: 0000000000000000
      FS:  00007f4fde284840(0000) GS:ffff88014fa80000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f4fdf5192f8 CR3: 00000000a569a000 CR4: 00000000001407e0
      Stack:
       ffff88003f792300 ffff88003f9b9e18 ffffffffa02de02a 0000000000000000
       ffffffff81cc7cc0 ffff88003f9cb000 0000000000000008 ffff88003f9b9e60
       ffffffffa033bb35 ffffffff8131c86c ffff88003f9cb000 ffff8800a5715008
      Call Trace:
       [<ffffffffa02de02a>] lockd_up+0xaa/0x330 [lockd]
       [<ffffffffa033bb35>] nfsd_svc+0x1b5/0x2f0 [nfsd]
       [<ffffffff8131c86c>] ? simple_strtoull+0x2c/0x50
       [<ffffffffa033c630>] ? write_pool_threads+0x280/0x280 [nfsd]
       [<ffffffffa033c6bb>] write_threads+0x8b/0xf0 [nfsd]
       [<ffffffff8114efa4>] ? __get_free_pages+0x14/0x50
       [<ffffffff8114eff6>] ? get_zeroed_page+0x16/0x20
       [<ffffffff811dec51>] ? simple_transaction_get+0xb1/0xd0
       [<ffffffffa033c098>] nfsctl_transaction_write+0x48/0x80 [nfsd]
       [<ffffffff811b8b34>] vfs_write+0xb4/0x1f0
       [<ffffffff811c3f99>] ? putname+0x29/0x40
       [<ffffffff811b9569>] SyS_write+0x49/0xa0
       [<ffffffff810fc2a6>] ? __audit_syscall_exit+0x1f6/0x2a0
       [<ffffffff816962e9>] system_call_fastpath+0x16/0x1b
      Code: 31 c0 e8 82 db 37 e1 e9 2a ff ff ff 48 8b 07 8b 57 14 48 c7 c7 d5 c6 31 a0 48 8b 70 20 31 c0 e8 65 db 37 e1 e9 f4 fe ff ff 0f 0b <0f> 0b 66 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 41 56 41 55
      RIP  [<ffffffffa0305fa8>] svc_destroy+0x128/0x130 [sunrpc]
       RSP <ffff88003f9b9de0>
      
      Evidently, we created some lockd sockets and then failed to create
      others. make_socks then returned an error and we tried to tear down the
      svc, but svc->sv_permsocks was not empty so we ended up tripping over
      the BUG() in svc_destroy().
      
      Fix this by ensuring that we tear down any live sockets we created when
      socket creation is going to return an error.
      
      Fixes: 786185b5 (SUNRPC: move per-net operations from...)
      Reported-by: NRaphos <raphoszap@laposte.net>
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Reviewed-by: NStanislav Kinsbursky <skinsbursky@parallels.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      679b033d
  32. 14 2月, 2014 1 次提交
    • N
      lockd: send correct lock when granting a delayed lock. · 2ec197db
      NeilBrown 提交于
      If an NFS client attempts to get a lock (using NLM) and the lock is
      not available, the server will remember the request and when the lock
      becomes available it will send a GRANT request to the client to
      provide the lock.
      
      If the client already held an adjacent lock, the GRANT callback will
      report the union of the existing and new locks, which can confuse the
      client.
      
      This happens because __posix_lock_file (called by vfs_lock_file)
      updates the passed-in file_lock structure when adjacent or
      over-lapping locks are found.
      
      To avoid this problem we take a copy of the two fields that can
      be changed (fl_start and fl_end) before the call and restore them
      afterwards.
      An alternate would be to allocate a 'struct file_lock', initialise it,
      use locks_copy_lock() to take a copy, then locks_release_private()
      after the vfs_lock_file() call.  But that is a lot more work.
      Reported-by: NOlaf Kirch <okir@suse.com>
      Signed-off-by: NNeilBrown <neilb@suse.de>
      Cc: stable@vger.kernel.org
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      
      --
      v1 had a couple of issues (large on-stack struct and didn't really work properly).
      This version is much better tested.
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      2ec197db