1. 12 Jun, 2016 1 commit
    • autofs races · ea01a184
      Al Viro committed
      * make autofs4_expire_indirect() skip the dentries that are in the
      process of expiry
      * do *not* mess with list_move(); making sure that dentries with
      AUTOFS_INF_EXPIRING are not picked for expiry is enough.
      * do not remove NO_RCU when we set EXPIRING, don't bother with smp_mb()
      there.  Clear it at the same time we clear EXPIRING.  Makes a bunch of
      tests simpler.
      * rename NO_RCU to WANT_EXPIRE, which is what it really is.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  2. 16 Mar, 2016 4 commits
  3. 24 Jun, 2015 1 commit
  4. 16 Apr, 2015 1 commit
  5. 12 Apr, 2015 1 commit
  6. 14 Oct, 2014 2 commits
    • autofs4: avoid taking fs_lock during rcu-walk · 4d885f90
      NeilBrown committed
      ->fs_lock protects AUTOFS_INF_EXPIRING.  We need to be sure that once
      the flag is set, no new references beneath the dentry are taken.  So
      rcu-walk currently needs to take fs_lock before checking the flag.  This
      hurts performance.
      
      Change the expiry to a two-stage process.  First set AUTOFS_INF_NO_RCU
      which forces any path walk into ref-walk mode, then drop the lock and
      call synchronize_rcu().  Once that returns we can be sure no rcu-walk is
      active beneath the dentry and we can check reference counts again.
      
      Now during an RCU-walk we can test AUTOFS_INF_EXPIRING without taking
      the lock as long as we test AUTOFS_INF_NO_RCU too.  If either is set,
      we must abort the RCU-walk.  If neither is set, we know that refcounts
      will be tested again after we finish the RCU-walk so we are safe to
      continue.
      
      ->fs_lock is still taken in d_manage() to check for a non-trap
      directory.  That will be resolved in the next patch.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Reviewed-by: Ian Kent <raven@themaw.net>
      Tested-by: Ian Kent <raven@themaw.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
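The lockless test described in 4d885f90 can be sketched in user-space C. This is only an illustration: the flag bit values and the helper name `rcu_walk_must_abort` are assumptions made for the example, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag bits; the values are assumed, not copied from the kernel. */
#define AUTOFS_INF_EXPIRING	(1u << 0)
#define AUTOFS_INF_NO_RCU	(1u << 1)

static_assert(AUTOFS_INF_EXPIRING != AUTOFS_INF_NO_RCU,
	      "the two flags must be distinguishable");

/*
 * Hypothetical helper: decide whether a path walk in RCU mode has to
 * bail out.  NO_RCU is set in the first stage of expiry and EXPIRING in
 * the second, so seeing either one means the walk must fall back to
 * ref-walk.  If neither is set, the expiry code will re-check the
 * reference counts after the RCU grace period, so continuing is safe.
 */
bool rcu_walk_must_abort(unsigned int flags)
{
	return (flags & (AUTOFS_INF_EXPIRING | AUTOFS_INF_NO_RCU)) != 0;
}
```

The point of testing both bits together is that a single unlocked load of the flags word is enough; no spinlock is needed on the RCU-walk side.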
    • autofs4: allow RCU-walk to walk through autofs4 · 23bfc2a2
      NeilBrown committed
      This series teaches autofs about RCU-walk so that we don't drop straight
      into REF-walk when we hit an autofs directory, and so that we avoid
      spinlocks as much as possible when performing an RCU-walk.
      
      This is needed so that the benefits of the recent NFS support for
      RCU-walk are fully available when NFS filesystems are automounted.
      
      Patches have been carefully reviewed and tested both with test suites
      and in production - thanks a lot to Ian Kent for his support there.
      
      This patch (of 6):
      
      Any attempt to look up a pathname that passes through an autofs4 mount is
      currently forced out of RCU-walk into REF-walk.
      
      This can significantly hurt performance of many-thread work loads on
      many-core systems, especially if the automounted filesystem supports
      RCU-walk but doesn't get to benefit from it.
      
      So if autofs4_d_manage is called with rcu_walk set, only fail with -ECHILD
      if it is necessary to wait longer than a spinlock.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Reviewed-by: Ian Kent <raven@themaw.net>
      Tested-by: Ian Kent <raven@themaw.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  7. 09 Aug, 2014 2 commits
  8. 24 Jan, 2014 1 commit
    • autofs4: allow autofs to work outside the initial PID namespace · 6eaba35b
      Sukadev Bhattiprolu committed
      Enable autofs4 to work in a "container".  oz_pgrp is converted from
      pid_t to struct pid and this is stored at mount time based on the
      "pgrp=" option or if the option is missing then the current pgrp.
      
      The "pgrp=" option is interpreted in the PID namespace of the current
      process.  This option is flawed in that it doesn't carry the namespace
      information, so it should be deprecated.  AFAICS the autofs daemon
      always sends the current pgrp, which is the default anyway.
      
      The oz_pgrp is also set from the AUTOFS_DEV_IOCTL_SETPIPEFD_CMD ioctl.
      This ioctl sets oz_pgrp to the current pgrp.  It is not allowed to
      change the pid namespace.
      
      oz_pgrp is used mainly to determine whether the process traversing the
      autofs mount tree is the autofs daemon itself or not.  This function now
      compares the pid pointers instead of the pid_t values.
      
      One other use of oz_pgrp is in autofs4_show_options.  There it shows the
      virtual pid number (i.e.  the one that is valid inside the PID namespace
      of the calling process).
      
      For debugging printk, convert oz_pgrp to the value in the initial pid
      namespace.
      Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Acked-by: Ian Kent <raven@themaw.net>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  9. 25 Oct, 2013 2 commits
  10. 23 Feb, 2013 1 commit
  11. 15 Nov, 2012 1 commit
    • userns: Support autofs4 interacting with multiple user namespaces · 45634cd8
      Eric W. Biederman committed
      Use kuid_t and kgid_t in struct autofs_info and struct autofs_wait_queue.
      
      When creating directories and symlinks default the uid and gid of
      the mount requester to the global root uid and gid.  autofs4_wait
      will update these fields when a mount is requested.
      
      When generating autofsv5 packets report the uid and gid of the mount
      requester in the user namespace of the process that opened the pipe,
      reporting unmapped uids and gids as overflowuid and overflowgid.
      
      In autofs_dev_ioctl_requester return the uid and gid of the last mount
      requester converted into the calling process's user namespace.  When the
      uid or gid don't map, return overflowuid and overflowgid as appropriate,
      allowing failure to find a mount requester to be distinguished from
      failure to map a mount requester.
      
      The uid and gid mount options specifying the user and group of the
      root autofs inode are converted into kuid and kgid as they are parsed
      defaulting to the current uid and current gid of the process that
      mounts autofs.
      
      Mounting of autofs for the present remains confined to processes in
      the initial user namespace.
      
      Cc: Ian Kent <raven@themaw.net>
      Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
  12. 30 Apr, 2012 1 commit
    • autofs: make the autofsv5 packet file descriptor use a packetized pipe · 64f371bc
      Linus Torvalds committed
      The autofs packet size has had a very unfortunate size problem on x86:
      because the alignment of 'u64' differs in 32-bit and 64-bit modes, and
      because the packet data was not 8-byte aligned, the size of the autofsv5
      packet structure differed between 32-bit and 64-bit modes despite
      looking otherwise identical (300 vs 304 bytes respectively).
      
      We first fixed that up by making the 64-bit compat mode know about this
      problem in commit a32744d4 ("autofs: work around unhappy compat
      problem on x86-64"), and that made a 32-bit 'systemd' work happily on a
      64-bit kernel because everything then worked the same way as on a 32-bit
      kernel.
      
      But it turned out that 'automount' had actually known and worked around
      this problem in user space, so fixing the kernel to do the proper 32-bit
      compatibility handling actually *broke* 32-bit automount on a 64-bit
      kernel, because it knew that the packet sizes were wrong and expected
      those incorrect sizes.
      
      As a result, we ended up reverting that compatibility mode fix, and
      thus breaking systemd again, in commit fcbf94b9.
      
      With both automount and systemd doing a single read() system call, and
      verifying that they get *exactly* the size they expect but using
      different sizes, it seemed that fixing one of them inevitably
      broke the other.  At one point, a patch I seriously considered applying
      from Michael Tokarev did a "strcmp()" to see if it was automount that
      was doing the operation.  Ugly, ugly.
      
      However, a prettier solution exists now thanks to the packetized pipe
      mode.  By marking the communication pipe as being packetized (by simply
      setting the O_DIRECT flag), we can always just write the bigger packet
      size, and if user-space does a smaller read, it will just get that
      partial end result and the extra alignment padding will simply be thrown
      away.
      
      This makes both automount and systemd happy, since they now get the size
      they asked for, and the kernel side of autofs simply no longer needs to
      care - it could pad out the packet arbitrarily.
      
      Of course, if there is some *other* user of autofs (please, please,
      please tell me it ain't so - and we haven't heard of any) that tries to
      read the packets with multiple reads, that other user will now be
      broken - the whole point of the packetized mode is that one system call
      gets exactly one packet, and you cannot read a packet in pieces.
      Tested-by: Michael Tokarev <mjt@tls.msk.ru>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: David Miller <davem@davemloft.net>
      Cc: Ian Kent <raven@themaw.net>
      Cc: Thomas Meyer <thomas@m3y3r.de>
      Cc: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
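The packetized-pipe behaviour that 64f371bc relies on can be observed from user space. The sketch below is an illustration only: the function name and the two size macros are made up for the example, and it assumes a Linux system where `pipe2()` accepts `O_DIRECT`. It writes two 304-byte packets and reads each with a 300-byte buffer; in packet mode every `read()` should consume exactly one packet and drop the unread tail.

```c
#define _GNU_SOURCE		/* for pipe2() and O_DIRECT */
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define KERNEL_PKT_SIZE	304	/* size the writer always uses */
#define USER_READ_SIZE	300	/* size a 32-bit style reader asks for */

static_assert(USER_READ_SIZE < KERNEL_PKT_SIZE, "reader must be the short side");

/*
 * Hypothetical demo helper.  Returns 1 if each short read consumed one
 * whole packet, 0 if leftover bytes leaked into the next read, and -1 if
 * the packetized pipe could not be created or the I/O itself failed.
 */
int packet_pipe_discards_tail(void)
{
	int fds[2];
	char out[KERNEL_PKT_SIZE];
	char in[USER_READ_SIZE];
	ssize_t n;
	int result = -1;

	if (pipe2(fds, O_DIRECT) < 0)
		return -1;

	/* Two packets with distinct contents: all 'A', then all 'B'. */
	memset(out, 'A', sizeof(out));
	if (write(fds[1], out, sizeof(out)) != (ssize_t)sizeof(out))
		goto done;
	memset(out, 'B', sizeof(out));
	if (write(fds[1], out, sizeof(out)) != (ssize_t)sizeof(out))
		goto done;

	/* First short read: expect the first 300 bytes of packet 'A'. */
	n = read(fds[0], in, sizeof(in));
	if (n < 0)
		goto done;
	if (n != (ssize_t)sizeof(in) || in[0] != 'A') {
		result = 0;
		goto done;
	}

	/*
	 * Second short read: in packet mode this starts at packet 'B'.
	 * On an ordinary pipe it would begin with the 4 leftover 'A' bytes.
	 */
	n = read(fds[0], in, sizeof(in));
	if (n < 0)
		goto done;
	result = (n == (ssize_t)sizeof(in) && in[0] == 'B') ? 1 : 0;
done:
	close(fds[0]);
	close(fds[1]);
	return result;
}
```

Calling `packet_pipe_discards_tail()` on a kernel with packetized pipes is expected to return 1; a return of -1 only says that the environment does not offer the feature.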
  13. 28 Apr, 2012 1 commit
    • Revert "autofs: work around unhappy compat problem on x86-64" · fcbf94b9
      Linus Torvalds committed
      This reverts commit a32744d4.
      
      While that commit was technically the right thing to do, and made the
      x86-64 compat mode work identically to native 32-bit mode (and thus
      fixed the problem with a 32-bit systemd install on a 64-bit kernel), it
      turns out that the automount binaries had workarounds for this compat
      problem.
      
      Now, the workarounds are disgusting: doing a "uname()" to find out the
      architecture of the kernel, and then comparing it for the 64-bit cases
      and fixing up the size of the read() in automount for those.  And they
      were confused: it's not actually a generic 64-bit issue at all, it's
      very much tied to just x86-64, which has different alignment for a
      'u64' in 64-bit mode than in 32-bit mode.
      
      But the end result is that fixing the compat layer actually breaks the
      case of a 32-bit automount on an x86-64 kernel.
      
      There are various approaches to fix this (including just doing a
      "strcmp()" on current->comm and comparing it to "automount"), but I
      think that I will do the one that teaches pipes about a special "packet
      mode", which will allow user space to not have to care too deeply about
      the padding at the end of the autofs packet.
      
      That change will make the compat workaround unnecessary, so let's revert
      it first, and get automount working again in compat mode.  The
      packetized pipes will then fix autofs for systemd.
      Reported-and-requested-by: Michael Tokarev <mjt@tls.msk.ru>
      Cc: Ian Kent <raven@themaw.net>
      Cc: stable@kernel.org # for 3.3
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  14. 26 Feb, 2012 1 commit
    • autofs: work around unhappy compat problem on x86-64 · a32744d4
      Ian Kent committed
      When the autofs protocol version 5 packet type was added in commit
      5c0a32fc ("autofs4: add new packet type for v5 communications"), it
      obviously tried quite hard to be word-size agnostic, and uses explicitly
      sized fields that are all correctly aligned.
      
      However, with the final "char name[NAME_MAX+1]" array at the end, the
      actual size of the structure ends up being not very well defined:
      because the struct isn't marked 'packed', doing a "sizeof()" on it will
      align the size of the struct up to the biggest alignment of the members
      it has.
      
      And despite all the members being the same, the alignment of them is
      different: a "__u64" has 4-byte alignment on x86-32, but native 8-byte
      alignment on x86-64.  And while 'NAME_MAX+1' ends up being a nice round
      number (256), the name[] array starts out 4-byte aligned.
      
      End result: the "packed" size of the structure is 300 bytes: 4-byte, but
      not 8-byte aligned.
      
      As a result, despite all the fields being in the same place on all
      architectures, sizeof() will round up that size to 304 bytes on
      architectures that have 8-byte alignment for u64.
      
      Note that this is *not* a problem for 32-bit compat mode on POWER, since
      there __u64 is 8-byte aligned even in 32-bit mode.  But on x86, 32-bit
      and 64-bit alignment is different for 64-bit entities, and as a result
      the structure that has exactly the same layout has different sizes.
      
      So on x86-64, but no other architecture, we will just subtract 4 from
      the size of the structure when running in a compat task.  That way we
      will write the properly sized packet that user mode expects.
      
      Not pretty.  Sadly, this very subtle, and unnecessary, size difference
      has been encoded in user space that wants to read packets of *exactly*
      the right size, and will refuse to touch anything else.
      Reported-and-tested-by: Thomas Meyer <thomas@m3y3r.de>
      Signed-off-by: Ian Kent <raven@themaw.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
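The 300 versus 304 byte discrepancy described in a32744d4 can be reproduced on one machine by forcing the alignment of the 64-bit member. The sketch below is an illustration under assumptions: the struct and typedef names are invented, the field list is only modelled on the description above (explicitly sized fields followed by a 256-byte name array), and it relies on the GCC/Clang `aligned` attribute on typedefs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NAME_BUF_LEN 256	/* NAME_MAX + 1, per the commit message */

/* 64-bit integer with 4-byte alignment, mimicking __u64 on x86-32. */
typedef uint64_t u64_align4 __attribute__((aligned(4)));
/* 64-bit integer with 8-byte alignment, mimicking __u64 on x86-64. */
typedef uint64_t u64_align8 __attribute__((aligned(8)));

/* Assumed field layout; only the sizes and order matter here. */
struct pkt_align4 {
	int32_t    proto_version;
	int32_t    type;
	uint32_t   wait_queue_token;
	uint32_t   dev;
	u64_align4 ino;
	uint32_t   uid, gid, pid, tgid, len;
	char       name[NAME_BUF_LEN];
};

struct pkt_align8 {
	int32_t    proto_version;
	int32_t    type;
	uint32_t   wait_queue_token;
	uint32_t   dev;
	u64_align8 ino;
	uint32_t   uid, gid, pid, tgid, len;
	char       name[NAME_BUF_LEN];
};

/* Every field sits at the same offset in both variants. */
static_assert(offsetof(struct pkt_align4, name) ==
	      offsetof(struct pkt_align8, name), "identical field layout");
/* The "packed" size: the end of the last member is at byte 300. */
static_assert(offsetof(struct pkt_align8, name) + NAME_BUF_LEN == 300,
	      "payload ends at byte 300");

/* sizeof() rounds up to the largest member alignment. */
size_t pkt_size_align4(void) { return sizeof(struct pkt_align4); }
size_t pkt_size_align8(void) { return sizeof(struct pkt_align8); }
```

With these definitions `pkt_size_align4()` gives 300 and `pkt_size_align8()` gives 304: the four extra bytes are pure tail padding, which is why subtracting 4 in a compat task yields the size a 32-bit reader expects.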
  15. 11 Jan, 2012 1 commit
  16. 04 Jan, 2012 1 commit
  17. 09 Aug, 2011 1 commit
  18. 25 Mar, 2011 1 commit
  19. 18 Jan, 2011 6 commits
  20. 16 Jan, 2011 7 commits
    • autofs4: Merge the remaining dentry ops tables · b650c858
      David Howells committed
      Merge the remaining autofs4 dentry ops tables.  It doesn't matter if
      d_automount and d_manage are present on something that's not mountable or
      holdable as these ops are only used if the appropriate flags are set in
      dentry->d_flags.
      
      [AV] switch to ->s_d_op, since now _everything_ on autofs4 is using the
      same dentry_operations.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • autofs4: Clean up dentry operations · 71e469db
      Ian Kent committed
      There are now two distinct uses of dentry operations: one for dentries
      that trigger mounts and one for dentries that do not.
      
      Rationalize the use of these dentry operations and rename them to
      reflect their function.
      Signed-off-by: Ian Kent <raven@themaw.net>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • autofs4: Clean up inode operations · e61da20a
      Ian Kent committed
      Since the use of ->follow_link() has been eliminated there is no
      need to separate the indirect and direct inode operations.
      Signed-off-by: Ian Kent <raven@themaw.net>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • autofs4: Remove unused code · 8c13a676
      Ian Kent committed
      Remove code that is not used due to the use of ->d_automount()
      and ->d_manage().
      Signed-off-by: Ian Kent <raven@themaw.net>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • autofs4: Add d_manage() dentry operation · b5b80177
      Ian Kent committed
      This patch requires a previous patch that adds the ->d_automount()
      dentry operation.
      
      Add a function to use the newly defined ->d_manage() dentry operation
      for blocking during mount and expire.
      
      Whether the VFS calls the dentry operations d_automount() and d_manage()
      is controlled by the DMANAGED_AUTOMOUNT and DMANAGED_TRANSIT flags. autofs
      uses the d_automount() operation to callback to user space to request
      mount operations and the d_manage() operation to block walks into mounts
      that are under construction or destruction.
      
      In order to prevent these functions from being called unnecessarily the
      DMANAGED_* flags are cleared for cases which would cause this. In the
      common case the DMANAGED_AUTOMOUNT and DMANAGED_TRANSIT flags are both
      set for dentries waiting to be mounted. The DMANAGED_TRANSIT flag is
      cleared upon successful mount request completion and set during expire
      runs, both during the dentry expire check, and if selected for expire,
      is left set until a subsequent successful mount request completes.
      
      The exception to this is the so-called rootless multi-mount which has
      no actual mount at its base. In this case the DMANAGED_AUTOMOUNT flag
      is cleared upon successful mount request completion as well and set
      again after a successful expire.
      Signed-off-by: Ian Kent <raven@themaw.net>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • autofs4: Add d_automount() dentry operation · 10584211
      Ian Kent committed
      Add a function to use the newly defined ->d_automount() dentry operation
      for triggering mounts instead of doing the user space callback in ->lookup()
      and ->d_revalidate().
      
      Note, to be useful the subsequent patch to add the ->d_manage() dentry
      operation is also needed so the discussion of functionality is deferred to
      that patch.
      Signed-off-by: Ian Kent <raven@themaw.net>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • Add a dentry op to allow processes to be held during pathwalk transit · cc53ce53
      David Howells committed
      Add a dentry op (d_manage) to permit a filesystem to hold a process and make it
      sleep when it tries to transit away from one of that filesystem's directories
      during a pathwalk.  The operation is keyed off a new dentry flag
      (DCACHE_MANAGE_TRANSIT).
      
      The filesystem is allowed to be selective about which processes it holds and
      which it permits to continue on or prohibits from transiting from each flagged
      directory.  This will allow autofs to hold up client processes whilst letting
      its userspace daemon through to maintain the directory or the stuff behind it
      or mounted upon it.
      
      The ->d_manage() dentry operation:
      
      	int (*d_manage)(struct path *path, bool mounting_here);
      
      takes a pointer to the directory about to be transited away from and a flag
      indicating whether the transit is undertaken by do_add_mount() or
      do_move_mount() skipping through a pile of filesystems mounted on a mountpoint.
      
      It should return 0 if successful, to let the process continue on its way;
      -EISDIR to prohibit the caller from skipping to overmounted filesystems or
      automounting, and to use this directory; or some other error code to return to
      the user.
      
      ->d_manage() is called with namespace_sem writelocked if mounting_here is true
      and no other locks held, so it may sleep.  However, if mounting_here is true,
      it may not initiate or wait for a mount or unmount upon the parameter
      directory, even if the act is actually performed by userspace.
      
      Within fs/namei.c, follow_managed() is extended to check with d_manage() first
      on each managed directory, before transiting away from it or attempting to
      automount upon it.
      
      follow_down() is renamed follow_down_one() and should only be used where the
      filesystem deliberately intends to avoid management steps (e.g. autofs).
      
      A new follow_down() is added that incorporates the loop done by all other
      callers of follow_down() (do_add/move_mount(), autofs and NFSD; whilst AFS, NFS
      and CIFS do use it, their use is removed by converting them to use
      d_automount()).  The new follow_down() calls d_manage() as appropriate.  It
      also takes an extra parameter to indicate if it is being called from mount code
      (with namespace_sem writelocked) which it passes to d_manage().  follow_down()
      ignores automount points so that it can be used to mount on them.
      
      __follow_mount_rcu() is made to abort rcu-walk mode if it hits a directory with
      DCACHE_MANAGE_TRANSIT set on the basis that we're probably going to have to
      sleep.  It would be possible to enter d_manage() in rcu-walk mode too, and have
      that determine whether to abort or not itself.  That would allow the autofs
      daemon to continue on in rcu-walk mode.
      
      Note that DCACHE_MANAGE_TRANSIT on a directory should be cleared when it isn't
      required as every transit from that directory will cause d_manage() to be
      invoked.  It can always be set again when necessary.
      
      ==========================
      WHAT THIS MEANS FOR AUTOFS
      ==========================
      
      Autofs currently uses the lookup() inode op and the d_revalidate() dentry op to
      trigger the automounting of indirect mounts, and both of these can be called
      with i_mutex held.
      
      autofs knows that the i_mutex will be held by the caller in lookup(), and so
      can drop it before invoking the daemon - but this isn't so for d_revalidate(),
      since the lock is only held on _some_ of the code paths that call it.  This
      means that autofs can't risk dropping i_mutex from its d_revalidate() function
      before it calls the daemon.
      
      The bug could manifest itself as, for example, a process that's trying to
      validate an automount dentry that gets made to wait because that dentry is
      expired and needs cleaning up:
      
      	mkdir         S ffffffff8014e05a     0 32580  24956
      	Call Trace:
      	 [<ffffffff885371fd>] :autofs4:autofs4_wait+0x674/0x897
      	 [<ffffffff80127f7d>] avc_has_perm+0x46/0x58
      	 [<ffffffff8009fdcf>] autoremove_wake_function+0x0/0x2e
      	 [<ffffffff88537be6>] :autofs4:autofs4_expire_wait+0x41/0x6b
      	 [<ffffffff88535cfc>] :autofs4:autofs4_revalidate+0x91/0x149
      	 [<ffffffff80036d96>] __lookup_hash+0xa0/0x12f
      	 [<ffffffff80057a2f>] lookup_create+0x46/0x80
      	 [<ffffffff800e6e31>] sys_mkdirat+0x56/0xe4
      
      versus the automount daemon which wants to remove that dentry, but can't
      because the normal process is holding the i_mutex lock:
      
      	automount     D ffffffff8014e05a     0 32581      1              32561
      	Call Trace:
      	 [<ffffffff80063c3f>] __mutex_lock_slowpath+0x60/0x9b
      	 [<ffffffff8000ccf1>] do_path_lookup+0x2ca/0x2f1
      	 [<ffffffff80063c89>] .text.lock.mutex+0xf/0x14
      	 [<ffffffff800e6d55>] do_rmdir+0x77/0xde
      	 [<ffffffff8005d229>] tracesys+0x71/0xe0
      	 [<ffffffff8005d28d>] tracesys+0xd5/0xe0
      
      which means that the system is deadlocked.
      
      This patch allows autofs to hold up normal processes whilst the daemon goes
      ahead and does things to the dentry tree behind the automounter point without
      risking a deadlock as almost no locks are held in d_manage() and none in
      d_automount().
      Signed-off-by: David Howells <dhowells@redhat.com>
      Was-Acked-by: Ian Kent <raven@themaw.net>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  21. 07 Jan, 2011 3 commits
    • fs: dcache remove dcache_lock · b5c84bf6
      Nick Piggin committed
      dcache_lock no longer protects anything. Remove it.
      Signed-off-by: Nick Piggin <npiggin@kernel.dk>
    • fs: dcache scale subdirs · 2fd6b7f5
      Nick Piggin committed
      Protect d_subdirs and d_child with d_lock, except in filesystems that aren't
      using dcache_lock for these anyway (e.g. using i_mutex).
      
      Note: if we change the locking rule in future so that ->d_child protection is
      provided only with ->d_parent->d_lock, it may allow us to reduce some locking.
      But it would be an exception to an otherwise regular locking scheme, so we'd
      have to see some good results. Probably not worthwhile.
      Signed-off-by: Nick Piggin <npiggin@kernel.dk>
    • fs: dcache scale d_unhashed · da502956
      Nick Piggin committed
      Protect d_unhashed(dentry) condition with d_lock. This means keeping
      DCACHE_UNHASHED bit in sync with hash manipulations.
      Signed-off-by: Nick Piggin <npiggin@kernel.dk>