1. 07 Jul, 2022 3 commits
    • step_into(): lose inode argument · a4f5b521
      Committed by Al Viro
      make handle_mounts() always fetch it.  This is just the first step -
      the callers of step_into() will stop trying to calculate the sucker,
      etc.
      
      The passed value should be equal to dentry->d_inode in all cases;
      in RCU mode - fetched after we'd sampled ->d_seq.  Might as well
      fetch it here.  We do need to validate ->d_seq, which duplicates
      the check currently done in lookup_fast(); that duplication will
      go away shortly.
      
      After that change handle_mounts() always ignores the initial value of
      *inode and always sets it on success.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
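The sample-then-fetch order described in this commit can be sketched in userspace C. This is a minimal, single-threaded illustration rather than the kernel code: `toy_dentry`, `toy_inode`, `toy_read_seq()`, `toy_seq_retry()` and `toy_fetch_inode()` are hypothetical names, and the memory barriers that the kernel's seqcount helpers provide are left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's structures. */
struct toy_inode {
    int ino;
};

struct toy_dentry {
    unsigned d_seq;            /* bumped by every modification */
    struct toy_inode *d_inode;
};

/* Sample the sequence count (no barriers in this toy version). */
static unsigned toy_read_seq(const struct toy_dentry *d)
{
    return d->d_seq;
}

/* True when the count has moved since it was sampled. */
static bool toy_seq_retry(const struct toy_dentry *d, unsigned seq)
{
    return d->d_seq != seq;
}

/*
 * Fetch the inode of a dentry whose d_seq the caller sampled earlier.
 * The incoming value of *inode is ignored; the result is only usable
 * when the function returns true, i.e. when d_seq is still unchanged.
 */
static bool toy_fetch_inode(const struct toy_dentry *d, unsigned seq,
                            struct toy_inode **inode)
{
    *inode = d->d_inode;
    return !toy_seq_retry(d, seq);
}
```

A caller would sample with `toy_read_seq()` when it finds the dentry and pass that value on, so no separate inode argument has to travel alongside it.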
    • namei: stash the sampled ->d_seq into nameidata · 03fa86e9
      Committed by Al Viro
      New field: nd->next_seq.  Set to 0 outside of RCU mode, holds the sampled
      value for the next dentry to be considered.  Used instead of an arseload
      of local variables, arguments, etc.
      
      step_into() has lost seq argument; nd->next_seq is used, so dentry passed
      to it must be the one ->next_seq is about.
      
      There are two requirements for RCU pathwalk:
      	1) it should not give a hard failure (other than -ECHILD) unless
      non-RCU pathwalk might fail that way given suitable timings.
      	2) it should not succeed unless non-RCU pathwalk might succeed
      with the same end location given suitable timings.
      
      The use of seq numbers is the way we achieve that.  Invariant we want
      to maintain is:
      	if RCU pathwalk can reach the state with given nd->path, nd->inode
      and nd->seq after having traversed some part of pathname, it must be possible
      for non-RCU pathwalk to reach the same nd->path and nd->inode after having
      traversed the same part of pathname, and observe the nd->path.dentry->d_seq
      equal to what RCU pathwalk has in nd->seq.
      
      	For transition from parent to child, we sample child's ->d_seq
      and verify that parent's ->d_seq remains unchanged.  Anything that
      disrupts parent-child relationship would've bumped ->d_seq on both.
      	For transitions from child to parent we sample parent's ->d_seq
      and verify that child's ->d_seq has not changed.  Same reasoning as
      for the previous case applies.
      	For transition from mountpoint to root of mounted we sample
      the ->d_seq of root and verify that nobody has touched mount_lock since
      the beginning of pathwalk.  That guarantees that mount we'd found had
      been there all along, with these mountpoint and root of the mounted.
      It would be possible for a non-RCU pathwalk to reach the previous state,
      find the same mount and observe its root at the moment we'd sampled
      ->d_seq of that.
      	For transitions from root of mounted to mountpoint we sample
      ->d_seq of mountpoint and verify that mount_lock had not been touched
      since the beginning of pathwalk.  The same reasoning as in the
      previous case applies.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
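The parent-to-child transition rule above can be sketched as follows. This is a simplified, single-threaded model under stated assumptions: `walk_dentry`, `walk_state` and `walk_step_to_child()` are hypothetical names, the fields only mirror the roles of nd->seq and nd->next_seq, and the real lookup, barriers and inode handling are omitted.

```c
#include <errno.h>

/* Hypothetical toy types; only the sequence counters are modelled. */
struct walk_dentry {
    unsigned d_seq;
};

struct walk_state {
    struct walk_dentry *dentry; /* current position (role of nd->path.dentry) */
    unsigned seq;               /* d_seq sampled for the current position */
    unsigned next_seq;          /* d_seq sampled for the next dentry */
};

/*
 * Step from the current dentry to one of its children.
 * Returns 0 on success, or -ECHILD when the parent changed under us
 * and the walk has to be retried outside RCU mode.
 */
static int walk_step_to_child(struct walk_state *nd, struct walk_dentry *child)
{
    /* Sample the child first ... */
    nd->next_seq = child->d_seq;

    /* ... then verify that the parent is still as we saw it. */
    if (nd->dentry->d_seq != nd->seq)
        return -ECHILD;

    /* Only now does the sampled value become the current one. */
    nd->dentry = child;
    nd->seq = nd->next_seq;
    return 0;
}
```

Keeping the sampled value in the walk state is what lets the step function drop its separate seq argument in this model.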
    • namei: move clearing LOOKUP_RCU towards rcu_read_unlock() · 6e180327
      Committed by Al Viro
      try_to_unlazy()/try_to_unlazy_next() drop LOOKUP_RCU in the
      very beginning and do rcu_read_unlock() only at the very end.
      However, nothing done in between even looks at the flag in
      question; might as well clear it at the same time we unlock.
      
      Note that try_to_unlazy_next() used to call legitimize_mnt(),
      which might drop/regain rcu_read_lock() in some cases.  This
      is no longer true, so we really have rcu_read_lock() held
      all along until the end.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  2. 06 Jul, 2022 4 commits
    • switch try_to_unlazy_next() to __legitimize_mnt() · 7e4745a0
      Committed by Al Viro
      The tricky case (__legitimize_mnt() failing after having grabbed
      a reference) can be trivially dealt with by leaving nd->path.mnt
      non-NULL, for terminate_walk() to drop it.
      
      legitimize_mnt() becomes static after that.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • follow_dotdot{,_rcu}(): change calling conventions · 51c6546c
      Committed by Al Viro
      Instead of returning NULL when we are in root, just make it return
      the current position (and set *seqp and *inodep accordingly).
      That collapses the calls of step_into() in handle_dots().
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
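A small sketch of why returning the current position instead of NULL lets the caller use a single step call. The names `dd_dentry`, `dd_follow_dotdot()`, `dd_step_into()` and `dd_handle_dotdot()` are hypothetical, and the *seqp / *inodep outputs mentioned in the commit are left out of this toy version.

```c
#include <stddef.h>

/* Hypothetical toy dentry: only the parent link is modelled. */
struct dd_dentry {
    struct dd_dentry *parent;
};

/* New convention: at the root, return the current position, not NULL. */
static struct dd_dentry *dd_follow_dotdot(struct dd_dentry *cur,
                                          struct dd_dentry *root)
{
    if (cur == root)
        return cur;
    return cur->parent;
}

/* Placeholder for the step; the toy version just accepts the target. */
static struct dd_dentry *dd_step_into(struct dd_dentry *target)
{
    return target;
}

/* The caller needs no "were we at the root?" branch any more. */
static struct dd_dentry *dd_handle_dotdot(struct dd_dentry *cur,
                                          struct dd_dentry *root)
{
    return dd_step_into(dd_follow_dotdot(cur, root));
}
```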
    • namei: get rid of pointless unlikely(read_seqcount_retry(...)) · 82ef0698
      Committed by Al Viro
      read_seqcount_retry() et al. are inlined and there are enough annotations
      for the compiler to figure out that those are unlikely to return non-zero.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • __follow_mount_rcu(): verify that mount_lock remains unchanged · 20aac6c6
      Committed by Al Viro
      Validate mount_lock seqcount as soon as we cross into mount in RCU
      mode.  Sure, ->mnt_root is pinned and will remain so until we
      do rcu_read_unlock() anyway, and we will eventually fail to unlazy if
      the mount_lock had been touched, but we might run into a hard error
      (e.g. -ENOENT) before trying to unlazy.  And it's possible to end
      up with RCU pathwalk racing with rename() and umount() in a way
      that would fail with -ENOENT while non-RCU pathwalk would've
      succeeded with any timings.
      
      Once upon a time we hadn't needed that, but analysis had been subtle,
      brittle and went out of window as soon as RENAME_EXCHANGE had been
      added.
      
      It's narrow, hard to hit and won't get you anything other than
      stray -ENOENT that could be arranged in much easier way with the
      same privileges, but it's a bug all the same.
      
      Cc: stable@kernel.org
      X-sky-is-falling: unlikely
      Fixes: da1ce067 "vfs: add cross-rename"
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
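The idea of validating the mount sequence immediately on crossing, instead of waiting until the unlazy attempt, might look roughly like this. It is a toy model only: `mnt_walk`, `mnt_lock_seq` and `mnt_cross_rcu()` are hypothetical names, and a plain counter stands in for the mount_lock seqcount.

```c
#include <stdbool.h>

/* Hypothetical toy walk state. */
struct mnt_walk {
    unsigned m_seq;   /* mount sequence sampled when the walk began */
    int position;     /* toy stand-in for the current path */
};

/* Stand-in for the mount_lock sequence counter. */
static unsigned mnt_lock_seq;

/*
 * Cross from a mountpoint into the root of what is mounted there.
 * Returns false as soon as the mount sequence is seen to have changed,
 * so that the caller falls back instead of continuing and possibly
 * hitting a hard error later in the walk.
 */
static bool mnt_cross_rcu(struct mnt_walk *nd, int mounted_root)
{
    nd->position = mounted_root;
    if (mnt_lock_seq != nd->m_seq)
        return false;
    return true;
}
```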
  3. 20 May, 2022 3 commits
    • namei: cleanup double word in comment · 30476f7e
      Committed by Tom Rix
      Remove the second 'to'.
      Signed-off-by: Tom Rix <trix@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • get rid of dead code in legitimize_root() · 52dba645
      Committed by Al Viro
      Combination of LOOKUP_IS_SCOPED and NULL nd->root.mnt is impossible
      after successful path_init().  All places where ->root.mnt might
      become NULL do that only if LOOKUP_IS_SCOPED is not there and
      path_init() itself can return success without setting nd->root
      only if ND_ROOT_PRESET had been set (in which case nd->root
      had been set by caller and never changed) or if the name had
      been a relative one *and* none of the bits in LOOKUP_IS_SCOPED
      had been present.
      
      Since all calls of legitimize_root() must be downstream of successful
      path_init(), the check for !nd->root.mnt && (nd->flags & LOOKUP_IS_SCOPED)
      is pure paranoia.
      
      FWIW, it had been discussed (and agreed upon) with Aleksa back when
      scoped lookups had been merged; looks like that had fallen through the
      cracks back then.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • fs/namei.c:reserve_stack(): tidy up the call of try_to_unlazy() · e5ca024e
      Committed by Al Viro
      !foo() != 0 is a strange way to spell !foo(); fallout from
      "fs: make unlazy_walk() error handling consistent"...
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
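That the two spellings are equivalent follows from C operator precedence: unary `!` binds tighter than `!=`, so `!foo() != 0` parses as `(!foo()) != 0`. A minimal check, using hypothetical helper names `old_spelling()` and `new_spelling()` that take the callee's return value as a parameter:

```c
#include <stdbool.h>

/* The old form; parentheses are written out to show how it parses. */
static bool old_spelling(int ret)
{
    return (!ret) != 0;
}

/* The tidied form. */
static bool new_spelling(int ret)
{
    return !ret;
}
```

Both functions return true only when `ret` is zero.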
  4. 14 May, 2022 1 commit
  5. 09 May, 2022 3 commits
  6. 28 Apr, 2022 1 commit
  7. 15 Apr, 2022 1 commit
    • VFS: filename_create(): fix incorrect intent. · b3d4650d
      Committed by NeilBrown
      When asked to create a path ending '/', but which is not to be a
      directory (LOOKUP_DIRECTORY not set), filename_create() will never try
      to create the file.  If it doesn't exist, -ENOENT is reported.
      
      However, it still passes LOOKUP_CREATE|LOOKUP_EXCL to the filesystem's
      ->lookup() function, even though there is no intent to create.  This is
      misleading and can cause incorrect behaviour.
      
      If you try
      
         ln -s foo /path/dir/
      
      where 'dir' is a directory on an NFS filesystem which is not currently
      known in the dcache, this will fail with ENOENT.
      
      But as the name is not in the dcache, nfs_lookup gets called with
      LOOKUP_CREATE|LOOKUP_EXCL and so it returns NULL without performing any
      lookup, with the expectation that a subsequent call to create the target
      will be made, and the lookup can be combined with the creation.  In the
      case with a trailing '/' and no LOOKUP_DIRECTORY, that call is never
      made.  Instead filename_create() sees that the dentry is not (yet)
      positive and returns -ENOENT - even though the directory actually
      exists.
      
      So only set LOOKUP_CREATE|LOOKUP_EXCL if there really is an intent to
      create, and use the absence of these flags to decide if -ENOENT should
      be returned.
      
      Note that filename_parentat() is only interested in LOOKUP_REVAL, so we
      split that out and store it in 'reval_flag'.  __lookup_hash() then gets
      reval_flag combined with whatever create flags were determined to be
      needed.
      Reviewed-by: David Disseldorp <ddiss@suse.de>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      Signed-off-by: NeilBrown <neilb@suse.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
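The flag handling described in this commit can be sketched as below. This is an illustration under assumptions, not the kernel function: the `TOY_` flag values are made up, and `toy_lookup_flags()` and `toy_negative_dentry_result()` are hypothetical helpers that only model the decisions the message describes.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flag values for illustration; the kernel's differ. */
#define TOY_LOOKUP_REVAL     0x1u
#define TOY_LOOKUP_CREATE    0x2u
#define TOY_LOOKUP_EXCL      0x4u
#define TOY_LOOKUP_DIRECTORY 0x8u

/*
 * Compute the flags handed to the filesystem's lookup: the reval bit is
 * kept separately, and the create bits are added only when there really
 * is an intent to create.
 */
static unsigned toy_lookup_flags(unsigned lookup_flags, bool trailing_slash)
{
    unsigned reval_flag = lookup_flags & TOY_LOOKUP_REVAL;
    unsigned create_flags = TOY_LOOKUP_CREATE | TOY_LOOKUP_EXCL;
    bool want_dir = (lookup_flags & TOY_LOOKUP_DIRECTORY) != 0;

    /* "name/" that is not meant to be a directory: nothing gets created. */
    if (trailing_slash && !want_dir)
        create_flags = 0;

    return reval_flag | create_flags;
}

/*
 * Outcome for a dentry that turned out negative: creation may proceed
 * (0) only when the create flags were passed; otherwise -ENOENT.
 */
static int toy_negative_dentry_result(unsigned flags_passed)
{
    return (flags_passed & TOY_LOOKUP_CREATE) ? 0 : -ENOENT;
}
```

With the create bits absent, a filesystem such as NFS has no reason to skip the real lookup, which is the behaviour the `ln -s foo /path/dir/` example depends on.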
  8. 24 Jan, 2022 1 commit
    • fsnotify: invalidate dcache before IN_DELETE event · a37d9a17
      Committed by Amir Goldstein
      Apparently, there are some applications that use the IN_DELETE event as an
      invalidation mechanism and expect that, if they try to open a file with
      the name reported with the delete event, it should not contain the
      content of the deleted file.
      
      Commit 49246466 ("fsnotify: move fsnotify_nameremove() hook out of
      d_delete()") moved the fsnotify delete hook before d_delete() so fsnotify
      will have access to a positive dentry.
      
      This allowed a race where opening the deleted file via cached dentry
      is now possible after receiving the IN_DELETE event.
      
      To fix the regression, create a new hook fsnotify_delete() that takes
      the unlinked inode as an argument and use a helper d_delete_notify() to
      pin the inode, so we can pass it to fsnotify_delete() after d_delete().
      
      Backporting hint: this regression is from v5.3. Although patch will
      apply with only trivial conflicts to v5.4 and v5.10, it won't build,
      because fsnotify_delete() implementation is different in each of those
      versions (see fsnotify_link()).
      
      A follow-up patch will fix the fsnotify_unlink/rmdir() calls in pseudo
      filesystems that do not need to call d_delete().
      
      Link: https://lore.kernel.org/r/20220120215305.282577-1-amir73il@gmail.com
      Reported-by: Ivan Delalande <colona@arista.com>
      Link: https://lore.kernel.org/linux-fsdevel/YeNyzoDM5hP5LtGW@visor/
      Fixes: 49246466 ("fsnotify: move fsnotify_nameremove() hook out of d_delete()")
      Cc: stable@vger.kernel.org # v5.3+
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
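The ordering this fix relies on (pin the inode, invalidate the dentry, then notify, then unpin) can be modelled in a few lines. Everything here is a hypothetical userspace stand-in: `dn_inode`, `dn_dentry`, `dn_d_delete()`, `dn_notify_delete()` and `dn_d_delete_notify()` only imitate the roles named in the commit message, and the two global variables exist solely to record what the notification hook observed.

```c
#include <stddef.h>

/* Hypothetical toy objects. */
struct dn_inode {
    int refcount;
};

struct dn_dentry {
    struct dn_inode *inode;   /* NULL means the dentry is negative */
};

/* Recorded by the toy notification hook for inspection. */
static int dn_saw_negative_dentry;
static int dn_refcount_at_notify;

/* Toy d_delete(): drop the dentry's reference and make it negative. */
static void dn_d_delete(struct dn_dentry *dentry)
{
    if (dentry->inode) {
        dentry->inode->refcount--;
        dentry->inode = NULL;
    }
}

/* Toy delete notification: takes the unlinked inode as an argument. */
static void dn_notify_delete(struct dn_inode *inode, struct dn_dentry *dentry)
{
    dn_saw_negative_dentry = (dentry->inode == NULL);
    dn_refcount_at_notify = inode->refcount;
}

/* Pin the inode so it can still be passed to the hook after deletion. */
static void dn_d_delete_notify(struct dn_dentry *dentry)
{
    struct dn_inode *inode = dentry->inode;

    inode->refcount++;               /* pin */
    dn_d_delete(dentry);             /* cached dentry is invalidated first */
    dn_notify_delete(inode, dentry); /* event is emitted afterwards */
    inode->refcount--;               /* unpin */
}
```

In this model a listener reacting to the event can no longer reach the deleted file through the cached dentry, because the dentry is already negative when the hook runs.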
  9. 22 Jan, 2022 1 commit
    • fs: move namei sysctls to its own file · 9c011be1
      Committed by Luis Chamberlain
      kernel/sysctl.c is a kitchen sink where everyone leaves their dirty
      dishes; this makes it very difficult to maintain.
      
      To help with this maintenance let's start by moving sysctls to places
      where they actually belong.  The proc sysctl maintainers do not want to
      know what sysctl knobs you wish to add for your own piece of code, we
      just care about the core logic.
      
      So move namei's own sysctl knobs to its own file.
      
      Other than the move we also avoid initializing two static variables to 0
      as this is not needed:
      
        * sysctl_protected_symlinks
        * sysctl_protected_hardlinks
      
      Link: https://lkml.kernel.org/r/20211129205548.605569-8-mcgrof@kernel.org
      Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Antti Palosaari <crope@iki.fi>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Iurii Zaikin <yzaikin@google.com>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: Jeff Layton <jlayton@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Lukas Middendorf <kernel@tuxforce.de>
      Cc: Stephen Kitt <steve@sk2.org>
      Cc: Xiaoming Ni <nixiaoming@huawei.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
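The remark about not initializing the two static variables rests on a C language guarantee: objects with static storage duration start out as zero when no initializer is given. A tiny sketch with hypothetical variable names that merely echo the two knobs listed above:

```c
/* No "= 0" needed: static storage is zero-initialized by the language. */
static int toy_protected_symlinks;
static int toy_protected_hardlinks;

/* Helper so the values can be read from outside this snippet. */
static int toy_knob_sum(void)
{
    return toy_protected_symlinks + toy_protected_hardlinks;
}
```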
  10. 07 Jan, 2022 1 commit
  11. 27 Oct, 2021 1 commit
  12. 08 Sep, 2021 5 commits
  13. 04 Sep, 2021 1 commit
  14. 24 Aug, 2021 10 commits
  15. 23 Aug, 2021 2 commits
    • namei: add mapping aware lookup helper · c2fd68b6
      Committed by Christian Brauner
      Various filesystems rely on the lookup_one_len() helper to look up a
      single path component relative to a well-known starting point. Allow
      such filesystems to support idmapped mounts by adding a version of this
      helper to take the idmap into account when calling inode_permission().
      This change is required to let btrfs (and other filesystems) support
      idmapped mounts.
      
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: linux-fsdevel@vger.kernel.org
      Reviewed-by: Josef Bacik <josef@toxicpanda.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • fs: remove mandatory file locking support · f7e33bdb
      Committed by Jeff Layton
      We added CONFIG_MANDATORY_FILE_LOCKING in 2015, and soon after turned it
      off in Fedora and RHEL8. Several other distros have followed suit.
      
      I've heard of one problem in all that time: Someone migrated from an
      older distro that supported "-o mand" to one that didn't, and the host
      had a fstab entry with "mand" in it which broke on reboot. They didn't
      actually _use_ mandatory locking so they just removed the mount option
      and moved on.
      
      This patch rips out mandatory locking support wholesale from the kernel,
      along with the Kconfig option and the Documentation file. It also
      changes the mount code to ignore the "mand" mount option instead of
      erroring out, and to throw a big, ugly warning.
      Signed-off-by: Jeff Layton <jlayton@kernel.org>
  16. 08 Apr, 2021 2 commits
    • namei: make sure nd->depth is always valid · 7962c7d1
      Committed by Al Viro
      Zero it in set_nameidata() rather than in path_init().  That way
      it always matches the number of valid nd->stack[] entries.
      Since terminate_walk() does zero it (after having emptied the
      stack), we don't need to reinitialize it in subsequent path_init().
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • teach set_nameidata() to handle setting the root as well · 06422964
      Committed by Al Viro
      That way we don't need the callers to mess with manually setting any fields
      of nameidata instances.  Old set_nameidata() gets renamed (__set_nameidata()),
      new becomes an inlined helper that takes a struct path pointer and deals
      with setting nd->root and putting ND_ROOT_PRESET in nd->state when new
      argument is non-NULL.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
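The wrapper arrangement described here can be sketched as follows. This is a simplified userspace model, not the kernel definitions: the `sn_` types, the `SN_ROOT_PRESET` value and the function names are hypothetical, and the structure carries only the fields needed to show the root handling, plus the depth zeroing from the neighbouring commit.

```c
#include <stddef.h>

/* Hypothetical value standing in for ND_ROOT_PRESET. */
#define SN_ROOT_PRESET 1u

/* Hypothetical toy types with only the relevant fields. */
struct sn_path {
    int mnt;
    int dentry;
};

struct sn_nameidata {
    struct sn_path root;
    unsigned state;
    int depth;
    int dfd;
    const char *name;
};

/* Role of the renamed old helper: the basic setup, including depth = 0. */
static void sn_set_nameidata_basic(struct sn_nameidata *nd, int dfd,
                                   const char *name)
{
    nd->dfd = dfd;
    nd->name = name;
    nd->depth = 0;
}

/* Role of the new helper: also records a caller-supplied root, if any. */
static void sn_set_nameidata(struct sn_nameidata *nd, int dfd,
                             const char *name, const struct sn_path *root)
{
    sn_set_nameidata_basic(nd, dfd, name);
    nd->state = 0;
    if (root) {
        nd->state = SN_ROOT_PRESET;
        nd->root = *root;
    }
}
```

Callers that have no preset root pass NULL and never touch the fields directly.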