1. 10 Jul, 2013 2 commits
  2. 29 Jun, 2013 2 commits
    • S
      SUNRPC: fix races on PipeFS UMOUNT notifications · adb6fa7f
      Stanislav Kinsbursky committed
      CPU#0                                   CPU#1
      -----------------------------           -----------------------------
      rpc_kill_sb
      sn->pipefs_sb = NULL                    rpc_release_client
      (UMOUNT_EVENT)                          rpc_free_auth
      rpc_pipefs_event
      rpc_get_client_for_event
      !atomic_inc_not_zero(cl_count)
      <skip the client>
                                              atomic_inc(cl_count)
                                              rpc_free_client
                                              rpc_clnt_remove_pipedir
                                              <skip client dir removing>
      
      To fix this, this patch does the following:
      
      1) Calls RPC_PIPEFS_UMOUNT notification with sn->pipefs_sb_lock being held.
      2) Removes the SUNRPC client from the list AFTER its pipes are destroyed.
      3) Doesn't take a reference on the RPC client during notification: if the
      client is in the list, it can't be destroyed while sn->pipefs_sb_lock is
      held by the notification caller.
      Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
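      The three steps of the UMOUNT fix can be sketched in user-space C. This is
      a minimal model, not the kernel code: `struct net_state`, `struct client`,
      and every function name below are hypothetical, and the lock is modeled as
      a flag so that the ordering rules can be checked with assertions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CLIENTS 8

/* Hypothetical stand-in for an RPC client: only tracks its pipe directory. */
struct client {
	bool has_pipedir;
};

/* Hypothetical stand-in for the per-net SUNRPC state. */
struct net_state {
	bool lock_held;                      /* models sn->pipefs_sb_lock */
	bool sb_mounted;                     /* models sn->pipefs_sb != NULL */
	struct client *clients[MAX_CLIENTS];
	size_t nr_clients;
};

static void sb_lock(struct net_state *sn)
{
	assert(!sn->lock_held);
	sn->lock_held = true;
}

static void sb_unlock(struct net_state *sn)
{
	assert(sn->lock_held);
	sn->lock_held = false;
}

/* Step 1: the UMOUNT notification runs with the lock held.
 * Step 3: no reference is taken; list membership keeps the client alive. */
void pipefs_umount(struct net_state *sn)
{
	sb_lock(sn);
	sn->sb_mounted = false;
	for (size_t i = 0; i < sn->nr_clients; i++)
		sn->clients[i]->has_pipedir = false;
	sb_unlock(sn);
}

void client_register(struct net_state *sn, struct client *c)
{
	sb_lock(sn);
	assert(sn->nr_clients < MAX_CLIENTS);
	c->has_pipedir = sn->sb_mounted;
	sn->clients[sn->nr_clients++] = c;
	sb_unlock(sn);
}

/* Step 2: destroy the pipes first, then remove the client from the list. */
void client_release(struct net_state *sn, struct client *c)
{
	sb_lock(sn);
	c->has_pipedir = false;
	for (size_t i = 0; i < sn->nr_clients; i++) {
		if (sn->clients[i] == c) {
			sn->clients[i] = sn->clients[--sn->nr_clients];
			break;
		}
	}
	sb_unlock(sn);
}
```

      Because a client stays on the list until its pipes are gone, an unmount
      that walks the list under the same lock cannot miss a client whose pipe
      directory still exists.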
    • S
      SUNRPC: fix races on PipeFS MOUNT notifications · 38481605
      Stanislav Kinsbursky committed
      Below is a race in which an RPC client can be created without PipeFS dentries:
      
      CPU#0					CPU#1
      -----------------------------		-----------------------------
      rpc_new_client				rpc_fill_super
      rpc_setup_pipedir
      mutex_lock(&sn->pipefs_sb_lock)
      rpc_get_sb_net == NULL
      (no per-net PipeFS superblock)
      					sn->pipefs_sb = sb;
      					notifier_call_chain(MOUNT)
      					(client is not in the list)
      rpc_register_client
      (client without pipes dentries)
      
      To fix this, the patch:
      1) makes the PipeFS mount notification call with pipefs_sb_lock held.
      2) releases pipefs_sb_lock on new SUNRPC client creation only after
      registration.
      Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
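      The MOUNT fix follows the same locking idea and can be sketched as below.
      This is a user-space model under stated assumptions: the types and
      function names are hypothetical, and the lock is a flag checked by
      assertions rather than a real mutex.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CLIENTS 8

/* Hypothetical stand-in for an RPC client. */
struct client {
	bool has_pipedir;
};

/* Hypothetical stand-in for the per-net SUNRPC state. */
struct net_state {
	bool lock_held;                      /* models sn->pipefs_sb_lock */
	bool sb_mounted;                     /* models sn->pipefs_sb != NULL */
	struct client *clients[MAX_CLIENTS];
	size_t nr_clients;
};

static void sb_lock(struct net_state *sn)
{
	assert(!sn->lock_held);
	sn->lock_held = true;
}

static void sb_unlock(struct net_state *sn)
{
	assert(sn->lock_held);
	sn->lock_held = false;
}

/* Fix 1: publish the superblock and notify the listed clients under the lock. */
void pipefs_mount(struct net_state *sn)
{
	sb_lock(sn);
	sn->sb_mounted = true;
	for (size_t i = 0; i < sn->nr_clients; i++)
		sn->clients[i]->has_pipedir = true;
	sb_unlock(sn);
}

/* Fix 2: the lock is held from the superblock check through registration,
 * so a mount cannot slip in between the two steps. */
int client_create(struct net_state *sn, struct client *c)
{
	sb_lock(sn);
	if (sn->nr_clients == MAX_CLIENTS) {
		sb_unlock(sn);
		return -1;
	}
	c->has_pipedir = sn->sb_mounted;     /* pipe dir only if PipeFS is mounted */
	sn->clients[sn->nr_clients++] = c;   /* registered before the lock drops */
	sb_unlock(sn);
	return 0;
}
```

      Either the client sees the superblock and creates its own dentries, or
      it is already on the list when the mount notification runs. In both
      orderings it ends up with a pipe directory.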
  3. 19 Jun, 2013 1 commit
    • J
      rpc_pipefs: only set rpc_dentry_ops if d_op isn't already set · e401452d
      Jeff Layton committed
      We had a report of a reproducible WARNING:
      
      [ 1360.039358] ------------[ cut here ]------------
      [ 1360.043978] WARNING: at fs/dcache.c:1355 d_set_d_op+0x8d/0xc0()
      [ 1360.049880] Hardware name: HP Z200 Workstation
      [ 1360.054308] Modules linked in: nfsv4 nfs dns_resolver fscache nfsd
      auth_rpcgss nfs_acl lockd sunrpc sg acpi_cpufreq mperf coretemp kvm_intel kvm
      snd_hda_codec_realtek snd_hda_intel snd_hda_codec hp_wmi crc32c_intel
      snd_hwdep e1000e snd_seq snd_seq_device snd_pcm snd_page_alloc snd_timer snd
      sparse_keymap rfkill soundcore serio_raw ptp iTCO_wdt pps_core pcspkr
      iTCO_vendor_support mei microcode lpc_ich mfd_core wmi xfs libcrc32c sr_mod
      sd_mod cdrom crc_t10dif radeon i2c_algo_bit drm_kms_helper ttm ahci libahci
      drm i2c_core libata dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
      auth_rpcgss]
      [ 1360.107406] Pid: 8814, comm: mount.nfs4 Tainted: G         I --------------   3.9.0-0.55.el7.x86_64 #1
      [ 1360.116771] Call Trace:
      [ 1360.119219]  [<ffffffff810610c0>] warn_slowpath_common+0x70/0xa0
      [ 1360.125208]  [<ffffffff810611aa>] warn_slowpath_null+0x1a/0x20
      [ 1360.131025]  [<ffffffff811af46d>] d_set_d_op+0x8d/0xc0
      [ 1360.136159]  [<ffffffffa05a7d6f>] __rpc_lookup_create_exclusive+0x4f/0x80 [sunrpc]
      [ 1360.143710]  [<ffffffffa05a8cc6>] rpc_mkpipe_dentry+0x86/0x170 [sunrpc]
      [ 1360.150311]  [<ffffffffa062a7b6>] nfs_idmap_new+0x96/0x130 [nfsv4]
      [ 1360.156475]  [<ffffffffa062e7cd>] nfs4_init_client+0xad/0x2d0 [nfsv4]
      [ 1360.162902]  [<ffffffff812f02df>] ? idr_get_empty_slot+0x16f/0x3c0
      [ 1360.169062]  [<ffffffff812f0582>] ? idr_mark_full+0x52/0x60
      [ 1360.174615]  [<ffffffff812f0699>] ? idr_alloc+0x79/0xe0
      [ 1360.179826]  [<ffffffffa0598081>] ? __rpc_init_priority_wait_queue+0x81/0xc0 [sunrpc]
      [ 1360.187635]  [<ffffffffa05980f3>] ? rpc_init_wait_queue+0x13/0x20 [sunrpc]
      [ 1360.194493]  [<ffffffffa05d05da>] nfs_get_client+0x27a/0x350 [nfs]
      [ 1360.200666]  [<ffffffffa062e438>] nfs4_set_client.isra.8+0x78/0x100 [nfsv4]
      [ 1360.207624]  [<ffffffffa062f2f3>] nfs4_create_server+0xf3/0x3a0 [nfsv4]
      [ 1360.214222]  [<ffffffffa06284be>] nfs4_remote_mount+0x2e/0x60 [nfsv4]
      [ 1360.220644]  [<ffffffff8119ea79>] mount_fs+0x39/0x1b0
      [ 1360.225691]  [<ffffffff81153880>] ? __alloc_percpu+0x10/0x20
      [ 1360.231348]  [<ffffffff811b7ccf>] vfs_kern_mount+0x5f/0xf0
      [ 1360.236822]  [<ffffffffa0628396>] nfs_do_root_mount+0x86/0xc0 [nfsv4]
      [ 1360.243246]  [<ffffffffa06287b4>] nfs4_try_mount+0x44/0xc0 [nfsv4]
      [ 1360.249410]  [<ffffffffa05d1457>] ? get_nfs_version+0x27/0x80 [nfs]
      [ 1360.255659]  [<ffffffffa05db985>] nfs_fs_mount+0x5c5/0xd10 [nfs]
      [ 1360.261650]  [<ffffffffa05dc550>] ? nfs_clone_super+0x140/0x140 [nfs]
      [ 1360.268074]  [<ffffffffa05da8e0>] ? param_set_portnr+0x60/0x60 [nfs]
      [ 1360.274406]  [<ffffffff8119ea79>] mount_fs+0x39/0x1b0
      [ 1360.279443]  [<ffffffff81153880>] ? __alloc_percpu+0x10/0x20
      [ 1360.285088]  [<ffffffff811b7ccf>] vfs_kern_mount+0x5f/0xf0
      [ 1360.290556]  [<ffffffff811b9f5d>] do_mount+0x1fd/0xa00
      [ 1360.295677]  [<ffffffff81137dee>] ? __get_free_pages+0xe/0x50
      [ 1360.301405]  [<ffffffff811b9be6>] ? copy_mount_options+0x36/0x170
      [ 1360.307479]  [<ffffffff811ba7e3>] sys_mount+0x83/0xc0
      [ 1360.312515]  [<ffffffff8160ad59>] system_call_fastpath+0x16/0x1b
      [ 1360.318503] ---[ end trace 8fa1f4cbc36094a7 ]---
      
      The problem is that we're ending up in __rpc_lookup_create_exclusive
      with a negative dentry that already has d_op set. A little debugging
      has shown that when we hit this, the d_ops are already set to
      simple_dentry_operations.
      
      I believe that what's happening is that during a mount, idmapd is racing
      in and doing a lookup of /var/lib/nfs/rpc_pipefs/nfs/clnt???/idmap.
      Before that dentry reference is released, the kernel races in to create
      that file and finds the new negative dentry, which already has the
      d_op set.
      
      This patch just avoids setting the d_op if it's already set.
      simple_dentry_operations and rpc_dentry_operations are functionally
      equivalent so it shouldn't matter which one it's set to.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
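      The guard described above can be sketched in plain C. The structures here
      are reduced, hypothetical stand-ins for the kernel's dentry types, and
      `set_d_op()` asserts where the kernel's d_set_d_op() would emit a WARNING.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced operations table. */
struct dentry_ops {
	int (*d_delete)(const void *dentry);
};

/* Hypothetical, reduced dentry. */
struct dentry {
	const struct dentry_ops *d_op;
};

static int always_delete(const void *dentry)
{
	(void)dentry;
	return 1;
}

/* Two tables with the same behavior, mirroring the claim that the simple
 * and the RPC operations are functionally equivalent. */
const struct dentry_ops simple_ops = { .d_delete = always_delete };
const struct dentry_ops rpc_ops = { .d_delete = always_delete };

/* Stand-in for d_set_d_op(): setting ops twice is treated as a bug. */
void set_d_op(struct dentry *d, const struct dentry_ops *ops)
{
	assert(d->d_op == NULL);
	d->d_op = ops;
}

/* The fix: only install the RPC ops when nothing is installed yet. */
void rpc_prepare_dentry(struct dentry *d)
{
	if (d->d_op == NULL)
		set_d_op(d, &rpc_ops);
}
```

      A dentry that a racing lookup already populated keeps whichever table it
      got first, and the double-set path is never reached.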
  4. 16 May, 2013 2 commits
  5. 13 Mar, 2013 1 commit
    • E
      fs: Readd the fs module aliases. · fa7614dd
      Eric W. Biederman committed
      I had assumed that the only use of module aliases for filesystems
      prior to "fs: Limit sys_mount to only request filesystem modules."
      was in request_module.  It turns out I was wrong.  At least mkinitcpio
      in Arch Linux uses these aliases.

      So readd the preexisting aliases, to keep from breaking userspace.

      Userspace eventually will have to follow and use the same aliases the
      kernel does.  So at some point we may be able to delete these aliases
      without problems.  However that day is not today.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
  6. 04 Mar, 2013 1 commit
    • E
      fs: Limit sys_mount to only request filesystem modules. · 7f78e035
      Eric W. Biederman committed
      Modify the request_module to prefix the file system type with "fs-"
      and add aliases to all of the filesystems that can be built as modules
      to match.
      
      A common practice is to build all of the kernel code and leave code
      that is not commonly needed as modules, with the result that many
      users are exposed to any bug anywhere in the kernel.
      
      Looking for filesystems with a fs- prefix limits the pool of possible
      modules that can be loaded by mount to just filesystems, trivially
      making things safer at no real cost.
      
      Using aliases means user space can control the policy of which
      filesystem modules are auto-loaded by editing /etc/modprobe.d/*.conf
      with blacklist and alias directives.  This allows simple, safe,
      well-understood workarounds to known problematic software.

      This also addresses a rare but unfortunate problem where the filesystem
      name is not the same as its module name and module auto-loading
      would not work.  While writing this patch I saw a handful of such
      cases.  The most significant is autofs, which lives in the module
      autofs4.
      
      This is relevant to user namespaces because we can reach the request
      module in get_fs_type() without having any special permissions, and
      people get uncomfortable when a user specified string (in this case
      the filesystem type) goes all of the way to request_module.
      
      After having looked at this issue I don't think there is any
      particular reason to perform any filtering or permission checks beyond
      making it clear in the module request that we want a filesystem
      module.  The common pattern in the kernel is to call request_module()
      without regard to the user's permissions.  In general all a filesystem
      module does once loaded is call register_filesystem() and go to sleep.
      This means there is not much attack surface exposed by loading a
      filesystem module unless the filesystem is mounted.  In a user
      namespace filesystems are not mounted unless .fs_flags = FS_USERNS_MOUNT,
      which most filesystems do not set today.
      Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
      Acked-by: Kees Cook <keescook@chromium.org>
      Reported-by: Kees Cook <keescook@google.com>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
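      The "fs-" prefixing can be illustrated with a small user-space helper.
      This is a sketch only: `fs_module_alias()` is a hypothetical name, and it
      merely builds the string that a module loader would be asked to resolve.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the "fs-<type>" alias from the first len bytes of fstype.
 * Returns 0 on success, -1 if the result does not fit in buf. */
int fs_module_alias(char *buf, size_t size, const char *fstype, size_t len)
{
	int n = snprintf(buf, size, "fs-%.*s", (int)len, fstype);

	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

      A user-supplied type such as "autofs" becomes "fs-autofs", so only a
      module that declares that alias can match, regardless of what the module
      file itself is called. The length parameter lets a caller pass only a
      prefix of a longer string.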
  7. 23 Feb, 2013 1 commit
  8. 09 Nov, 2012 1 commit
  9. 05 Nov, 2012 1 commit
  10. 02 Oct, 2012 1 commit
  11. 12 Jun, 2012 1 commit
  12. 17 May, 2012 1 commit
  13. 15 May, 2012 1 commit
    • R
      sunrpc: fix kernel-doc warnings · bda14606
      Randy Dunlap committed
      Fix kernel-doc warnings in sunrpc/rpc_pipe.c and
      sunrpc/rpcb_clnt.c:
      
      Warning(net/sunrpc/rpcb_clnt.c:428): No description found for parameter 'net'
      Warning(net/sunrpc/rpcb_clnt.c:567): No description found for parameter 'net'
      
      Warning(net/sunrpc/rpc_pipe.c:133): No description found for parameter 'pipe'
      Warning(net/sunrpc/rpc_pipe.c:133): Excess function parameter 'inode' description in 'rpc_queue_upcall'
      Warning(net/sunrpc/rpc_pipe.c:839): No description found for parameter 'pipe'
      Warning(net/sunrpc/rpc_pipe.c:839): Excess function parameter 'ops' description in 'rpc_mkpipe_dentry'
      Warning(net/sunrpc/rpc_pipe.c:839): Excess function parameter 'flags' description in 'rpc_mkpipe_dentry'
      Warning(net/sunrpc/rpc_pipe.c:949): No description found for parameter 'dentry'
      Warning(net/sunrpc/rpc_pipe.c:949): Excess function parameter 'clnt' description in 'rpc_remove_client_dir'
      Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
  14. 11 May, 2012 1 commit
    • L
      vfs: make it possible to access the dentry hash/len as one 64-bit entry · 26fe5750
      Linus Torvalds committed
      This allows comparing hash and len in one operation on 64-bit
      architectures.  Right now only __d_lookup_rcu() takes advantage of this,
      since that is the case we care most about.
      
      The use of anonymous struct/unions hides the alternate 64-bit approach
      from most users, the exception being a few cases where we initialize a
      'struct qstr' with a static initializer.  This makes the problematic
      cases use a new QSTR_INIT() helper function for that (but initializing
      just the name pointer with a "{ .name = xyzzy }" initializer remains
      valid, as does just copying another qstr structure).
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
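      The anonymous struct/union layout and the initializer helper can be
      sketched as follows. This is a simplified user-space model: it uses
      fixed-width types from <stdint.h>, ignores any endian-dependent field
      ordering, and `qstr_same_hash_len()` is a hypothetical helper added only
      to show the single 64-bit comparison.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified model of a qstr: hash and len overlay one 64-bit word. */
struct qstr {
	union {
		struct {
			uint32_t hash;
			uint32_t len;
		};
		uint64_t hash_len;
	};
	const unsigned char *name;
};

/* Static initializer that hides the nested anonymous members. */
#define QSTR_INIT(n, l) { { { .len = (l) } }, .name = (n) }

/* Compare hash and length in one operation instead of two. */
static inline int qstr_same_hash_len(const struct qstr *a, const struct qstr *b)
{
	return a->hash_len == b->hash_len;
}
```

      Code that only reads `.hash`, `.len`, or `.name` is unaffected by the
      union, which is why only static initializers needed the new helper.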
  15. 28 Apr, 2012 1 commit
  16. 26 Mar, 2012 1 commit
  17. 21 Mar, 2012 1 commit
  18. 12 Mar, 2012 1 commit
    • T
      SUNRPC: Fix a few sparse warnings · 09acfea5
      Trond Myklebust committed
      net/sunrpc/svcsock.c:412:22: warning: incorrect type in assignment
      (different address spaces)
       - svc_partial_recvfrom now takes a struct kvec, so the variable
         save_iovbase needs to be an ordinary (void *)
      
      Make a bunch of variables in net/sunrpc/xprtsock.c static
      
      Fix a couple of "warning: symbol 'foo' was not declared. Should it be
      static?" reports.
      
      Fix a couple of conflicting function declarations.
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
  19. 03 Mar, 2012 2 commits
  20. 28 Feb, 2012 2 commits
    • S
      SUNRPC: move waitq from RPC pipe to RPC inode · 591ad7fe
      Stanislav Kinsbursky committed
      Currently, the wait queue used for polling RPC pipe changes from user space
      is part of the RPC pipe. But the pipe data itself can be released on NFS
      umount before the dentry-inode pair connected to it (in case this pair is
      held open by some process).
      This is not a problem for almost all pipe users, because all PipeFS file
      operations check the pipe reference before using it.
      The exception is eventfd. It registers itself through the "poll" file
      operation and thus holds a reference to the pipe wait queue. This leads to
      oopses when the eventfd is destroyed after NFS umount (as rpc.idmapd does),
      since no pipe data is left by that point.
      The solution is to move the wait queue from the pipe data to the internal
      RPC inode data. This looks more logical, because this wait queue is used
      only by user-space processes, which already hold an inode reference.

      Note: upcalls have to get pipe->dentry prior to dereferencing the wait queue
      to make sure that the mount point won't disappear from underneath us.
      Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
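      The lifetime argument can be modeled in a few lines of user-space C. All
      names here are hypothetical and the wait queue is reduced to a counter;
      the point is only that the queue lives in the inode-side structure and so
      outlives the pipe data.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical wait queue, reduced to a waiter count. */
struct wait_queue {
	int nr_waiters;
};

/* Pipe data: freed on NFS umount, no longer owns the wait queue. */
struct rpc_pipe {
	int pending;
};

/* Inode-side data: stays valid while user space holds the file open. */
struct rpc_inode {
	struct rpc_pipe *pipe;       /* NULL once the pipe data is destroyed */
	struct wait_queue waitq;     /* embedded, same lifetime as the inode */
};

/* A poller (for example an eventfd) attaches to the inode's queue. */
void poll_register(struct rpc_inode *ri)
{
	ri->waitq.nr_waiters++;
}

void poll_unregister(struct rpc_inode *ri)
{
	assert(ri->waitq.nr_waiters > 0);
	ri->waitq.nr_waiters--;
}

/* Models the pipe data going away on umount. */
void pipe_destroy(struct rpc_inode *ri)
{
	free(ri->pipe);
	ri->pipe = NULL;
}
```

      A poller that unregisters after `pipe_destroy()` still touches valid
      memory, because the queue was never part of the freed pipe.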
    • S
      SUNRPC: check RPC inode's pipe reference before dereferencing · 2c9030ee
      Stanislav Kinsbursky committed
      There are 2 tightly bound objects: pipe data (created for kernel needs; it
      holds a reference to the dentry, which depends on PipeFS mount/umount) and
      the PipeFS dentry/inode pair (created on mount for user-space needs). Each
      may or may not hold a valid reference to the other, independently.
      This means that we have to make sure that the pipe->dentry reference is
      valid on upcalls, and that the dentry->pipe reference is valid on downcalls.
      The latter check is absent - my fault.
      IOW, a PipeFS dentry can be opened by some process (rpc.idmapd for example),
      but its pipe data can belong to an NFS mount that was already unmounted, and
      thus the pipe data was destroyed.
      To fix this, the pipe reference has to be set to NULL on rpc_unlink() and
      checked in PipeFS file operations instead of the pipe->dentry check.

      Note: the PipeFS "poll" file operation will be updated in the next patch,
      because its logic is more complicated.
      Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
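      The downcall-side check can be sketched like this. The names are
      hypothetical, the choice of -EPIPE as the error is an assumption made for
      illustration, and the locking that a real file operation would need
      around the check is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical pipe data owned by the kernel-side user (e.g. an NFS mount). */
struct rpc_pipe {
	size_t bytes_written;
};

/* Hypothetical inode-side data kept alive by an open file. */
struct rpc_inode {
	struct rpc_pipe *pipe;
};

/* Models the unlink step: the inode drops its pipe reference. */
void rpc_unlink_sketch(struct rpc_inode *ri)
{
	ri->pipe = NULL;
}

/* Downcall from user space: check the pipe reference before using it. */
long pipe_downcall(struct rpc_inode *ri, size_t len)
{
	struct rpc_pipe *pipe = ri->pipe;

	if (pipe == NULL)
		return -EPIPE;               /* assumed error code */
	pipe->bytes_written += len;
	return (long)len;
}
```

      A process that kept the file open across an unmount gets an error from
      the downcall instead of dereferencing destroyed pipe data.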
  21. 01 Feb, 2012 15 commits