1. 23 Jan, 2020 1 commit
  2. 10 Dec, 2019 1 commit
    • ovl: relax WARN_ON() on rename to self · 6889ee5a
      Committed by Amir Goldstein
      In ovl_rename(), if new upper is hardlinked to old upper underneath
      overlayfs before upper dirs are locked, user will get an ESTALE error
      and a WARN_ON will be printed.
      
      Changes to underlying layers while overlayfs is mounted may result in
      unexpected behavior, but it shouldn't crash the kernel and it shouldn't
      trigger WARN_ON() either, so relax this WARN_ON().
      
      Reported-by: syzbot+bb1836a212e69f8e201a@syzkaller.appspotmail.com
      Fixes: 804032fa ("ovl: don't check rename to self")
      Cc: <stable@vger.kernel.org> # v4.9+
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
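The precondition described in this commit can be reproduced from user space. The following is a minimal sketch in Python, not the kernel code: it hardlinks two names in a scratch directory and shows that they resolve to one inode. The helper `rename_to_self_errno` is hypothetical and only models "return ESTALE quietly instead of warning".

```python
import errno
import os
import tempfile


def same_inode(path_a, path_b):
    """Return True when both paths refer to the same inode on the same device."""
    a, b = os.stat(path_a), os.stat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


def rename_to_self_errno(old_upper, new_upper):
    """Hypothetical model: report ESTALE without emitting a warning."""
    if same_inode(old_upper, new_upper):
        return -errno.ESTALE
    return 0


def demo():
    """Build old/new (hardlinked) and other (unrelated) and check each pair."""
    with tempfile.TemporaryDirectory() as upper:
        old = os.path.join(upper, "old")
        new = os.path.join(upper, "new")
        other = os.path.join(upper, "other")
        open(old, "w").close()
        open(other, "w").close()
        os.link(old, new)  # hardlink created "underneath" the overlay
        return rename_to_self_errno(old, new), rename_to_self_errno(old, other)
```

Calling `demo()` returns one result for the hardlinked pair and one for the unrelated pair.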
  3. 19 Jun, 2019 1 commit
  4. 18 Jun, 2019 1 commit
    • ovl: fix typo in MODULE_PARM_DESC · 253e7483
      Committed by Nicolas Schier
      Change the first argument of each MODULE_PARM_DESC() call so that it
      matches the actual module parameter name.  As a result, the 'parm'
      section of the output of `modinfo overlay` changes from:
      
          parm: ovl_check_copy_up:Obsolete; does nothing
          parm: redirect_max:ushort
          parm: ovl_redirect_max:Maximum length of absolute redirect xattr value
          parm: redirect_dir:bool
          parm: ovl_redirect_dir_def:Default to on or off for the redirect_dir feature
          parm: redirect_always_follow:bool
          parm: ovl_redirect_always_follow:Follow redirects even if redirect_dir feature is turned off
          parm: index:bool
          parm: ovl_index_def:Default to on or off for the inodes index feature
          parm: nfs_export:bool
          parm: ovl_nfs_export_def:Default to on or off for the NFS export feature
          parm: xino_auto:bool
          parm: ovl_xino_auto_def:Auto enable xino feature
          parm: metacopy:bool
          parm: ovl_metacopy_def:Default to on or off for the metadata only copy up feature
      
      into:
      
          parm: check_copy_up:Obsolete; does nothing
          parm: redirect_max:Maximum length of absolute redirect xattr value (ushort)
          parm: redirect_dir:Default to on or off for the redirect_dir feature (bool)
          parm: redirect_always_follow:Follow redirects even if redirect_dir feature is turned off (bool)
          parm: index:Default to on or off for the inodes index feature (bool)
          parm: nfs_export:Default to on or off for the NFS export feature (bool)
          parm: xino_auto:Auto enable xino feature (bool)
          parm: metacopy:Default to on or off for the metadata only copy up feature (bool)
      Signed-off-by: Nicolas Schier <n.schier@avm.de>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
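The change in output above follows from description and type entries being joined by parameter name. The sketch below is a simplified Python model of that joining, not the actual modinfo implementation; the function name and the `(kind, name, value)` input shape are assumptions made for illustration.

```python
def parm_lines(entries):
    """Join description and type entries by name.

    entries: ordered (kind, name, value) tuples, where kind is "parm"
    (a description) or "parmtype" (a type).
    """
    merged = {}  # name -> {"parm": description, "parmtype": type}
    for kind, name, value in entries:
        merged.setdefault(name, {})[kind] = value

    lines = []
    for name, info in merged.items():
        desc, ptype = info.get("parm"), info.get("parmtype")
        if desc is not None and ptype is not None:
            lines.append(f"parm: {name}:{desc} ({ptype})")
        elif desc is not None:
            lines.append(f"parm: {name}:{desc}")
        else:
            lines.append(f"parm: {name}:{ptype}")
    return lines


DESC = "Maximum length of absolute redirect xattr value"
# Mismatched names produce two separate lines.
before = parm_lines([("parmtype", "redirect_max", "ushort"),
                     ("parm", "ovl_redirect_max", DESC)])
# Matching names produce one combined line.
after = parm_lines([("parmtype", "redirect_max", "ushort"),
                    ("parm", "redirect_max", DESC)])
```

With mismatched names the type and the description end up on separate lines; once the names agree they collapse into one line with the type in parentheses.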
  5. 08 May, 2019 1 commit
    • ovl: relax WARN_ON() for overlapping layers use case · acf3062a
      Committed by Amir Goldstein
      This nasty little syzbot repro:
      https://syzkaller.appspot.com/x/repro.syz?x=12c7a94f400000
      
      Creates overlay mounts where the same directory is both in upper and lower
      layers. Simplified example:
      
        mkdir foo work
        mount -t overlay none foo -o"lowerdir=.,upperdir=foo,workdir=work"
      
      The repro runs several threads in parallel that attempt to chdir into foo
      and attempt to symlink/rename/exec/mkdir the file bar.
      
      The repro hits a WARN_ON() I placed in ovl_instantiate(), which suggests
      that an overlay inode already exists in cache and is hashed by the pointer
      of the real upper dentry that ovl_create_real() has just created. At the
      point of the WARN_ON(), both the overlay dir inode lock and the upper dir
      inode lock are held, so at first, I did not see how this was possible.
      
      On a closer look, I see that after ovl_create_real(), because of the
      overlapping upper and lower layers, a lookup by another thread can find the
      file foo/bar that was just created in upper layer, at overlay path
      foo/foo/bar and hash an overlay inode with the new real dentry as lower
      dentry. This is possible because the overlay directory foo/foo is not
      locked and the upper dentry foo/bar is in dcache, so ovl_lookup() can find
      it without taking upper dir inode shared lock.
      
      Overlapping layers is considered a wrong setup which would result in
      unexpected behavior, but it shouldn't crash the kernel and it shouldn't
      trigger WARN_ON() either, so relax this WARN_ON() and leave a pr_warn()
      instead to cover all cases of failure to get an overlay inode.
      
      The error returned from failure to insert new inode to cache with
      inode_insert5() was changed to -EEXIST, to distinguish from the error
      -ENOMEM returned on failure to get/allocate inode with iget5_locked().
      
      Reported-by: syzbot+9c69c282adc4edd2b540@syzkaller.appspotmail.com
      Fixes: 01b39dcc ("ovl: use inode_insert5() to hash a newly...")
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
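What "overlapping layers" means in the simplified mount example can be shown with a purely lexical path check. This is an illustrative Python sketch, not how the kernel detects the condition; `layers_overlap` and its `base` argument are hypothetical, and symlinks and bind mounts are ignored.

```python
import os.path


def layers_overlap(base, dir_a, dir_b):
    """Return True if one directory is the other or one of its ancestors.

    Both directories are interpreted relative to base, mirroring the
    relative paths used in the mount options of the example.
    """
    a = os.path.normpath(os.path.join(base, dir_a))
    b = os.path.normpath(os.path.join(base, dir_b))
    return os.path.commonpath([a, b]) in (a, b)


# lowerdir=. and upperdir=foo from the example: foo lives inside "."
overlapping = layers_overlap("/mnt", ".", "foo")
# Sibling directories do not overlap.
disjoint = layers_overlap("/mnt", "lower", "upper")
```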
  6. 19 Nov, 2018 1 commit
  7. 31 Oct, 2018 1 commit
    • ovl: check whiteout in ovl_create_over_whiteout() · 5e127580
      Committed by Miklos Szeredi
      Kaixuxia reports that it's possible to crash overlayfs by removing the
      whiteout on the upper layer before creating a directory over it.  This is a
      reproducer:
      
       mkdir lower upper work merge
       touch lower/file
       mount -t overlay overlay -olowerdir=lower,upperdir=upper,workdir=work merge
       rm merge/file
       ls -al merge/file
       rm upper/file
       ls -al merge/
       mkdir merge/file
      
      Before commencing with a vfs_rename(..., RENAME_EXCHANGE) verify that the
      lookup of "upper" is positive and is a whiteout, and return ESTALE
      otherwise.
      
      Reported-by: kaixuxia <xiakaixu1987@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      Fixes: e9be9d5e ("overlay filesystem")
      Cc: <stable@vger.kernel.org> # v3.18
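The added verification can be sketched as a small predicate. This is a Python model under stated assumptions rather than the kernel code: an overlayfs whiteout is a character device with device number 0/0, a negative lookup is modelled as `None`, and the name `check_before_exchange` is hypothetical.

```python
import errno
import stat


def is_whiteout(mode, rdev):
    """overlayfs whiteouts are character devices with device number 0/0."""
    return stat.S_ISCHR(mode) and rdev == 0


def check_before_exchange(upper):
    """Model of the check done before the RENAME_EXCHANGE step.

    upper is None for a negative lookup, otherwise a (mode, rdev) pair.
    """
    if upper is None or not is_whiteout(*upper):
        return -errno.ESTALE
    return 0
```

In the reproducer, `rm upper/file` removes the whiteout behind overlayfs's back, which corresponds to the `None` case here.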
  8. 27 Oct, 2018 3 commits
  9. 23 Aug, 2018 1 commit
  10. 20 Jul, 2018 2 commits
    • ovl: Set redirect on upper inode when it is linked · 4120fe64
      Committed by Vivek Goyal
      When we create a hardlink to a metacopy upper file, first set the redirect
      on that inode.  Path based lookup will not work with the newly created link
      and the redirect will solve that issue.
      
      Also use an absolute redirect, as two hardlinks could be in different
      directories and a relative redirect will not work.
      
      I have not put any additional locking around setting redirects while
      introducing redirects for non-dir files.  For now it feels like existing
      locking is sufficient.  If that's not the case, we will have to add more
      locking.  The following is my rationale for why I think the current
      locking seems ok.
      
      The basic problem for non-dir files is that more than one dentry could be
      pointing to same inode and in theory only relying on dentry based locks
      (d->d_lock) did not seem sufficient.
      
      We set redirect upon rename and upon link creation.  In both the paths for
      non-dir file, VFS locks both source and target inodes (->i_rwsem).  That
      means vfs rename and link operations on same source and target can't be
      happening in parallel (Even if there are multiple dentries pointing to same
      inode).  So that probably means that at a time on an inode, only one call
      of ovl_set_redirect() could be working and we don't need additional locking
      in ovl_set_redirect().
      
      ovl_inode->redirect is initialized only when inode is created new.  That
      means it should not race with any other path and setting
      ovl_inode->redirect should be fine.
      
      Reading of ovl_inode->redirect happens in ovl_get_redirect() path.  And
      this is called only in ovl_set_redirect().  And ovl_set_redirect() already
      seemed to be protected using ->i_rwsem.  That means ovl_set_redirect() and
      ovl_get_redirect() on source/target inode should not make progress in
      parallel and are mutually exclusive.  Hence no additional locking required.
      
      Now, only case where ovl_set_redirect() and ovl_get_redirect() could race
      seems to be case of absolute redirects where ovl_get_redirect() has to
      travel up the tree.  In that case we already take d->d_lock and that should
      be sufficient as directories will not have multiple dentries pointing to
      same inode.
      
      So given VFS locking and current usage of redirect, current locking around
      redirect seems to be ok for non-dir as well.  Once we have the logic to
      remove redirect when metacopy file gets copied up, then we probably will
      need additional locking.
      Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
      Reviewed-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
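Why a relative redirect breaks for hardlinks in different directories can be seen with plain path resolution. The sketch below is a simplified Python model, assuming that a redirect with a leading '/' is resolved from the overlay root and any other redirect is resolved against the parent directory; `resolve_redirect` and the paths are hypothetical.

```python
import posixpath


def resolve_redirect(parent_path, redirect):
    """Resolve a redirect value for an entry that lives in parent_path.

    A leading '/' makes the redirect absolute; otherwise it is taken
    relative to the parent directory.
    """
    return posixpath.normpath(posixpath.join(parent_path, redirect))


# The data lives at /a/file in the lower layer.
from_original_dir = resolve_redirect("/a", "file")      # finds /a/file
from_hardlink_dir = resolve_redirect("/b", "file")      # misses: /b/file
absolute_redirect = resolve_redirect("/b", "/a/file")   # finds /a/file
```

The relative value only works while the name stays in its original directory, whereas the absolute value resolves to the same lower path from any parent.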
    • ovl: Set redirect on metacopy files upon rename · 7bb08383
      Committed by Vivek Goyal
      Set redirect on metacopy files upon rename.  This will help find data
      dentry in lower dirs.
      Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
      Reviewed-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
  11. 18 Jul, 2018 1 commit
    • ovl: copy up times · d9854c87
      Committed by Miklos Szeredi
      Copy up mtime and ctime to overlay inode after times in real object are
      modified.  Be careful not to dirty cachelines when not necessary.
      
      This is in preparation for moving overlay functionality out of the VFS.
      
      This patch shouldn't have any observable effect.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
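"Not dirtying cachelines when not necessary" amounts to comparing before storing. The following is a minimal Python sketch of that pattern only; the `Inode` class and its `writes` counter (standing in for dirtied cachelines) are hypothetical and do not mirror kernel structures.

```python
class Inode:
    """Toy inode holding the two timestamps that get copied up."""

    def __init__(self, mtime, ctime):
        self.mtime = mtime
        self.ctime = ctime
        self.writes = 0  # counts stores, standing in for dirtied cachelines


def copy_times(overlay, real):
    """Copy mtime/ctime from the real inode, storing only on change."""
    for field in ("mtime", "ctime"):
        value = getattr(real, field)
        if getattr(overlay, field) != value:
            setattr(overlay, field, value)
            overlay.writes += 1


overlay_inode = Inode(mtime=1, ctime=1)
real_inode = Inode(mtime=2, ctime=1)
copy_times(overlay_inode, real_inode)  # stores mtime only
copy_times(overlay_inode, real_inode)  # nothing differs, no store
```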
  12. 31 May, 2018 8 commits
  13. 24 Jan, 2018 1 commit
  14. 20 Jan, 2018 1 commit
    • ovl: take lower dir inode mutex outside upper sb_writers lock · 6d0a8a90
      Committed by Amir Goldstein
      The functions ovl_lower_positive() and ovl_check_empty_dir() both take
      inode mutex on the real lower dir under ovl_want_write() which takes
      the upper_mnt sb_writers lock.
      
      While this is not a clear locking order or layering violation, it creates
      an undesired lock dependency between two unrelated layers for no good
      reason.
      
      This lock dependency materializes as a false(?) positive lockdep warning
      when calling rmdir() on a nested overlayfs, where the nested and
      underlying overlayfs both use the same fs type as upper layer.
      
      rmdir() on the nested overlayfs creates the lock chain:
        sb_writers of upper_mnt (e.g. tmpfs) in ovl_do_remove()
        ovl_i_mutex_dir_key[] of lower overlay dir in ovl_lower_positive()
      
      rmdir() on the underlying overlayfs creates the lock chain in
      reverse order:
        ovl_i_mutex_dir_key[] of lower overlay dir in vfs_rmdir()
        sb_writers of nested upper_mnt (e.g. tmpfs) in ovl_do_remove()
      
      To get rid of the unneeded locking dependency, move both ovl_lower_positive()
      and ovl_check_empty_dir() to before ovl_want_write() in rmdir() and
      rename() implementation.
      
      This change spreads the pieces of ovl_check_empty_and_clear() directly
      inside the rmdir()/rename() implementations, so the helper is no longer
      needed and is removed.
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
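The two lock chains above form a cycle once locks are compared by class, which is why the nested and underlying tmpfs `sb_writers` look identical to lockdep. The sketch below is a toy Python model of that reasoning, not lockdep itself; `has_inversion` is a hypothetical helper that records "held, then taken" edges and looks for a cycle.

```python
def has_inversion(chains):
    """Return True if the recorded acquisition orders contain a cycle.

    chains: iterable of lock-class sequences in acquisition order.
    """
    edges = {}
    for chain in chains:
        for held, taken in zip(chain, chain[1:]):
            edges.setdefault(held, set()).add(taken)

    def reaches(src, dst, seen):
        """Depth-first search for a path from src to dst."""
        if src == dst:
            return True
        seen.add(src)
        return any(reaches(n, dst, seen)
                   for n in edges.get(src, ()) if n not in seen)

    return any(reaches(taken, held, set())
               for held, targets in edges.items() for taken in targets)


# Before: rmdir() on the nested overlay nests the lower dir mutex inside
# sb_writers, while rmdir() on the underlying overlay nests them the other way.
before = [("sb_writers", "ovl_i_mutex_dir_key"),
          ("ovl_i_mutex_dir_key", "sb_writers")]
# After: the lower dir mutex is taken and released before sb_writers,
# so the nested rmdir() records no nesting between the two classes.
after = [("ovl_i_mutex_dir_key",), ("sb_writers",),
         ("ovl_i_mutex_dir_key", "sb_writers")]
```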
  15. 14 Dec, 2017 1 commit
  16. 09 Nov, 2017 3 commits
    • ovl: update cache version of impure parent on rename · f30536f0
      Committed by Amir Goldstein
      ovl_rename() updates dir cache version for impure old parent if an entry
      with copy up origin is moved into old parent, but it did not update
      cache version if the entry moved out of old parent has a copy up origin.
      
      [SzM] Same for new dir: we updated the version if an entry with origin was
      moved in, but not if an entry with origin was moved out.
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: fix rmdir problem on non-merge dir with origin xattr · 07f6fff1
      Committed by zhangyi (F)
      An "origin && non-merge" upper dir may have leftover whiteouts that
      were created in a past mount. overlayfs does not clear this dir when we
      delete it, which may lead to rmdir failure or a temp file left in workdir.
      
      Simple reproducer:
        mkdir lower upper work merge
        mkdir -p lower/dir
        touch lower/dir/a
        mount -t overlay overlay -olowerdir=lower,upperdir=upper,\
          workdir=work merge
        rm merge/dir/a
        umount merge
        rm -rf lower/*
        touch lower/dir  (*)
        mount -t overlay overlay -olowerdir=lower,upperdir=upper,\
          workdir=work merge
        rm -rf merge/dir
      
      Syslog dump:
        overlayfs: cleanup of 'work/#7' failed (-39)
      
      (*): if we do not create the regular file, the result is different:
        rm: cannot remove "dir/": Directory not empty
      
      This patch adds a check for the case of non-merge dir that may contain
      whiteouts, and calls ovl_check_empty_dir() to check and clear whiteouts
      from upper dir when an empty dir is being deleted.
      
      [amir: split patch from ovl_check_empty_dir() cleanup
             rename ovl_is_origin() to ovl_may_have_whiteouts()
             check OVL_WHITEOUTS flag instead of checking origin xattr]
      Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: simplify ovl_check_empty_and_clear() · 95e598e7
      Committed by zhangyi (F)
      Filter out non-whiteout non-upper entries from list of merge dir entries
      while checking if merge dir is empty in ovl_check_empty_dir().
      The remaining work for ovl_clear_empty() is to clear all entries on the
      list.
      
      [amir: split patch from rmdir bug fix]
      Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
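The filtering described in the last two commits can be sketched as a single pass over the directory entry list. This is a Python model under assumptions, not the kernel implementation: entries are plain dicts, `check_empty_dir` is a stand-in name, and the returned list represents what would remain for the clearing step.

```python
import errno


def check_empty_dir(entries):
    """Check emptiness and collect upper whiteouts that need clearing.

    entries: dicts with 'name', 'is_whiteout' and 'is_upper' keys.
    Returns (0, names_to_clear) or (-ENOTEMPTY, []).
    """
    to_clear = []
    for entry in entries:
        if entry["is_whiteout"]:
            # Only whiteouts in the upper dir have to be removed.
            if entry["is_upper"]:
                to_clear.append(entry["name"])
            continue
        if entry["name"] in (".", ".."):
            continue
        # Any real entry means the directory is not empty.
        return -errno.ENOTEMPTY, []
    return 0, to_clear


leftover = [
    {"name": ".", "is_whiteout": False, "is_upper": True},
    {"name": "..", "is_whiteout": False, "is_upper": True},
    {"name": "a", "is_whiteout": True, "is_upper": True},
]
```

A directory that contains only leftover whiteouts counts as empty, and those whiteouts are what must be cleared before the rmdir can succeed.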
  17. 05 Oct, 2017 1 commit
  18. 14 Sep, 2017 1 commit
    • mm: treewide: remove GFP_TEMPORARY allocation flag · 0ee931c4
      Committed by Michal Hocko
      GFP_TEMPORARY was introduced by commit e12ba74d ("Group short-lived
      and reclaimable kernel allocations") along with __GFP_RECLAIMABLE.  Its
      primary motivation was to allow users to tell that an allocation is
      short lived and so the allocator can try to place such allocations close
      together and prevent long term fragmentation.  As much as this sounds
      like a reasonable semantic it becomes much less clear when to use the
      high-level GFP_TEMPORARY allocation flag.  How long is temporary? Can the
      context holding that memory sleep? Can it take locks? It seems there is
      no good answer for those questions.
      
      The current implementation of GFP_TEMPORARY is basically GFP_KERNEL |
      __GFP_RECLAIMABLE which in itself is tricky because basically none of
      the existing callers provide a way to reclaim the allocated memory.  So
      this is rather misleading and hard to evaluate for any benefits.
      
      I have checked some random users and none of them has added the flag
      with a specific justification.  I suspect most of them just copied from
      other existing users and others just thought it might be a good idea to
      use without any measuring.  This suggests that GFP_TEMPORARY just
      motivates for cargo cult usage without any reasoning.
      
      I believe that our gfp flags are quite complex already and especially
      those with high-level semantics should be clearly defined to prevent
      confusion and abuse.  Therefore I propose dropping GFP_TEMPORARY and
      replacing all existing users to simply use GFP_KERNEL.  Please note that
      SLAB users with shrinkers will still get __GFP_RECLAIMABLE heuristic and
      so they will be placed properly for memory fragmentation prevention.
      
      I can see reasons we might want some gfp flag to reflect short-term
      allocations but I propose starting from a clear semantic definition and
      only then add users with proper justification.
      
      This was brought up before LSF this year by Matthew [1] and it
      turned out that GFP_TEMPORARY really doesn't have a clear semantic.  It
      seems to be a heuristic without any measured advantage for most (if not
      all) its current users.  The follow up discussion has revealed that
      opinions on what might be temporary allocation differ a lot between
      developers.  So rather than trying to tweak existing users into a
      semantic which they haven't expected I propose to simply remove the flag
      and start from scratch if we really need a semantic for short term
      allocations.
      
      [1] http://lkml.kernel.org/r/20170118054945.GD18349@bombadil.infradead.org
      
      [akpm@linux-foundation.org: fix typo]
      [akpm@linux-foundation.org: coding-style fixes]
      [sfr@canb.auug.org.au: drm/i915: fix up]
        Link: http://lkml.kernel.org/r/20170816144703.378d4f4d@canb.auug.org.au
      Link: http://lkml.kernel.org/r/20170728091904.14627-1-mhocko@kernel.org
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Acked-by: Mel Gorman <mgorman@suse.de>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Neil Brown <neilb@suse.de>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  19. 28 Jul, 2017 1 commit
    • ovl: constant d_ino for non-merge dirs · 4edb83bb
      Committed by Miklos Szeredi
      Impure directories are ones which contain objects with origins (i.e. those
      that have been copied up).  These are relevant to readdir operation only
      because of the d_ino field, no other transformation is necessary.  Also a
      directory can become impure between two getdents(2) calls.
      
      This patch creates a cache for impure directories.  Unlike the cache for
      merged directories, this one only contains entries with origin and is not
      refcounted but has its lifetime tied to that of the dentry.
      
      Similarly to the merged cache, the impure cache is invalidated based on a
      version number.  This version number is incremented when an entry with
      origin is added or removed from the directory.
      
      If the cache is empty, then the impure xattr is removed from the directory.
      
      This patch also fixes up handling of d_ino for the ".." entry if the parent
      directory is merged.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
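Version-based invalidation of the impure cache can be sketched as follows. This is an illustrative Python model, not the kernel data structure: the `Dir` class, its method names, and the `rebuilds` counter are hypothetical and only show that a cached snapshot is reused until an entry with origin is added or removed.

```python
class Dir:
    """Toy directory with a version-checked cache of origin entries."""

    def __init__(self):
        self.version = 0
        self.origin_entries = {}  # name -> origin inode number
        self._cache = None        # (version, snapshot) or None
        self.rebuilds = 0

    def add_origin_entry(self, name, ino):
        self.origin_entries[name] = ino
        self.version += 1  # an entry with origin moved in

    def remove_origin_entry(self, name):
        del self.origin_entries[name]
        self.version += 1  # an entry with origin moved out

    def impure_cache(self):
        """Return the cached snapshot, rebuilding it if the version moved."""
        if self._cache is None or self._cache[0] != self.version:
            self._cache = (self.version, dict(self.origin_entries))
            self.rebuilds += 1
        return self._cache[1]
```

Bumping the version on removal as well as on addition is the behaviour that the later fix f30536f0 applies to ovl_rename().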
  20. 14 Jul, 2017 1 commit
  21. 05 Jul, 2017 6 commits
    • ovl: persistent overlay inode nlink for indexed inodes · 5f8415d6
      Committed by Amir Goldstein
      With inodes index enabled, an overlay inode nlink counts the union of upper
      and non-covered lower hardlinks. During the lifetime of a non-pure upper
      inode, the following nlink modifying operations can happen:
      
      1. Lower hardlink copy up
      2. Upper hardlink created, unlinked or renamed over
      3. Lower hardlink whiteout or renamed over
      
      For the first, copy up case, the union nlink does not change, whether the
      operation succeeds or fails, but the upper inode nlink may change.
      Therefore, before copy up, we store the union nlink value relative to the
      lower inode nlink in the index inode xattr trusted.overlay.nlink.
      
      For the second, upper hardlink case, the union nlink should be incremented
      or decremented IFF the operation succeeds, aligned with nlink change of the
      upper inode. Therefore, before link/unlink/rename, we store the union nlink
      value relative to the upper inode nlink in the index inode.
      
      For the last, lower cover up case, we simplify things by preceding the
      whiteout or cover up with copy up. This makes sure that there is an index
      upper inode where the nlink xattr can be stored before the copied up upper
      entry is unlinked.
      
      Return the overlay inode nlinks for indexed upper inodes on stat(2).
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
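Storing the union nlink relative to a real inode's nlink can be sketched with a small encode/decode pair. This is a Python illustration only; the textual form used here (a prefix 'L' or 'U' followed by a signed difference) is an assumption for the example, and the function names are hypothetical.

```python
def encode_nlink(union_nlink, real_nlink, prefix):
    """Encode the union nlink as a difference from a real inode's nlink.

    prefix 'L' means relative to the lower inode, 'U' relative to the upper.
    """
    return f"{prefix}{union_nlink - real_nlink:+d}"


def decode_nlink(value, lower_nlink, upper_nlink):
    """Recover the union nlink from the stored value and current real nlinks."""
    base = lower_nlink if value[0] == "L" else upper_nlink
    return base + int(value[1:])


# Before copy up: union nlink 3, lower nlink 2, stored relative to lower.
stored = encode_nlink(3, 2, "L")
# The upper nlink may change during copy up, yet the union value is recovered.
recovered = decode_nlink(stored, lower_nlink=2, upper_nlink=1)
```

Because the stored value is relative to the inode that the operation does not touch, the union count stays correct whether the operation succeeds or fails.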
    • ovl: add flag for upper in ovl_entry · 55acc661
      Committed by Miklos Szeredi
      For rename, we need to ensure that an upper alias exists for hard links
      before attempting the operation.  Introduce a flag in ovl_entry to track
      the state of the upper alias.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: cleanup bad and stale index entries on mount · 415543d5
      Committed by Amir Goldstein
      Bad index entries are entries whose name does not match the
      origin file handle stored in trusted.overlay.origin xattr.
      Bad index entries could be a result of a system power off in
      the middle of copy up.
      
      Stale index entries are entries whose origin file handle is
      stale. Stale index entries could be a result of copying layers
      or removing lower entries while the overlay is not mounted.
      The case of copying layers should be detected earlier by the
      verification of upper root dir origin and index dir origin.
      
      Both bad and stale index entries are detected and removed
      on mount.
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
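The distinction between bad and stale index entries can be sketched as a classification step. This is a Python model under assumptions, not the kernel code: the index entry name is taken to be the hex string of the origin file handle, and "stale" is modelled as a handle missing from a set of handles that still decode; both function names are hypothetical.

```python
def classify_index_entry(name, origin_fh, decodable_handles):
    """Classify one index entry as 'bad', 'stale' or 'good'.

    name: index entry name; origin_fh: bytes stored in the origin xattr;
    decodable_handles: handles that still resolve to a lower inode.
    """
    if name != origin_fh.hex():
        return "bad"    # name does not match the stored origin handle
    if origin_fh not in decodable_handles:
        return "stale"  # handle no longer resolves to a lower inode
    return "good"


def cleanup_index(index, decodable_handles):
    """Return only the entries that would survive mount-time cleanup."""
    return {name: fh for name, fh in index.items()
            if classify_index_entry(name, fh, decodable_handles) == "good"}


fh_live, fh_gone = b"\x01\x02", b"\x03"
index = {"0102": fh_live, "03": fh_gone, "ff": fh_live}
kept = cleanup_index(index, {fh_live})
```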
    • ovl: move __upperdentry to ovl_inode · 09d8b586
      Committed by Miklos Szeredi
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: compare inodes · 9020df37
      Committed by Miklos Szeredi
      When checking for consistency in directory operations (unlink, rename,
      etc.) match inodes not dentries.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: fix nlink leak in ovl_rename() · f681eb1d
      Committed by Amir Goldstein
      This patch fixes an overlay inode nlink leak in the case where
      ovl_rename() renames over a non-dir.
      
      This is not so critical, because overlay inode doesn't rely on
      nlink dropping to zero for inode deletion.
      Signed-off-by: NAmir Goldstein <amir73il@gmail.com>
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      f681eb1d
  22. 29 5月, 2017 1 次提交
    • A
      ovl: mark upper merge dir with type origin entries "impure" · f3a15685
      Amir Goldstein 提交于
      An upper dir is marked "impure" to let ovl_iterate() know that this
      directory may contain non pure upper entries whose d_ino may need to be
      read from the origin inode.
      
      We already mark a non-merge dir "impure" when moving a non-pure child
      entry inside it, to let ovl_iterate() know not to iterate the non-merge
      dir directly.
      
      Mark also a merge dir "impure" when moving a non-pure child entry inside
      it and when copying up a child entry inside it.
      
      This can be used to optimize ovl_iterate() to perform a "pure merge" of
      upper and lower directories, merging the content of the directories,
      without having to read d_ino from origin inodes.
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
  23. 19 May, 2017 1 commit
    • ovl: mark upper dir with type origin entries "impure" · ee1d6d37
      Committed by Amir Goldstein
      When moving a merge dir or non-dir with copy up origin into a non-merge
      upper dir (a.k.a pure upper dir), we are marking the target parent dir
      "impure". ovl_iterate() iterates pure upper dirs directly, because there is
      no need to filter out whiteouts and merge dir content with lower dir. But
      for the case of an "impure" upper dir, ovl_iterate() will not be able to
      iterate the real upper dir directly, because it will need to lookup the
      origin inode and use it to fill d_ino.
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
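The effect of the "impure" mark on directory iteration can be sketched as a choice of which inode number to report. This is a simplified Python model, not ovl_iterate() itself: entries are `(name, upper_ino, origin_ino)` tuples and both helper names are hypothetical.

```python
def mark_on_move(dir_impure, moved_origin_ino):
    """A dir becomes impure once an entry with a copy up origin moves in."""
    return dir_impure or moved_origin_ino is not None


def iterate_upper_dir(entries, impure):
    """Return (name, d_ino) pairs for an upper dir.

    entries: (name, upper_ino, origin_ino or None) tuples.
    """
    if not impure:
        # Pure upper dir: the real directory can be listed directly.
        return [(name, ino) for name, ino, _origin in entries]
    # Impure dir: entries with an origin report the origin inode number.
    return [(name, origin if origin is not None else ino)
            for name, ino, origin in entries]


entries = [("a", 100, None), ("b", 101, 7)]
impure = mark_on_move(False, moved_origin_ino=7)
listing = iterate_upper_dir(entries, impure)
```

Without the mark, entry "b" would be listed with its upper inode number instead of the origin's, which is the inconsistency the flag exists to prevent.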