1. 15 Nov, 2021 1 commit
  2. 21 Oct, 2021 1 commit
  3. 09 Mar, 2021 1 commit
    • L
      ovl: fix dentry leak in ovl_get_redirect · ad6d51c6
      Committed by Liangyan
      stable inclusion
      from stable-5.10.15
      commit fb8caef7c020267ad30e868a5aeaa5da6ccf0c6e
      bugzilla: 48167
      
      --------------------------------
      
      commit e04527fe upstream.
      
      We need to lock d_parent->d_lock before dget_dlock, or d_lockref may be
      updated concurrently, as in the call trace below, which will cause a
      dentry->d_lockref leak and risk a crash.
      
           CPU 0                                CPU 1
      ovl_set_redirect                       lookup_fast
        ovl_get_redirect                       __d_lookup
          dget_dlock
            //no lock protection here            spin_lock(&dentry->d_lock)
            dentry->d_lockref.count++            dentry->d_lockref.count++
      
      [   49.799059] PGD 800000061fed7067 P4D 800000061fed7067 PUD 61fec5067 PMD 0
      [   49.799689] Oops: 0002 [#1] SMP PTI
      [   49.800019] CPU: 2 PID: 2332 Comm: node Not tainted 4.19.24-7.20.al7.x86_64 #1
      [   49.800678] Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 8a46cfe 04/01/2014
      [   49.801380] RIP: 0010:_raw_spin_lock+0xc/0x20
      [   49.803470] RSP: 0018:ffffac6fc5417e98 EFLAGS: 00010246
      [   49.803949] RAX: 0000000000000000 RBX: ffff93b8da3446c0 RCX: 0000000a00000000
      [   49.804600] RDX: 0000000000000001 RSI: 000000000000000a RDI: 0000000000000088
      [   49.805252] RBP: 0000000000000000 R08: 0000000000000000 R09: ffffffff993cf040
      [   49.805898] R10: ffff93b92292e580 R11: ffffd27f188a4b80 R12: 0000000000000000
      [   49.806548] R13: 00000000ffffff9c R14: 00000000fffffffe R15: ffff93b8da3446c0
      [   49.807200] FS:  00007ffbedffb700(0000) GS:ffff93b927880000(0000) knlGS:0000000000000000
      [   49.807935] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   49.808461] CR2: 0000000000000088 CR3: 00000005e3f74006 CR4: 00000000003606a0
      [   49.809113] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   49.809758] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [   49.810410] Call Trace:
      [   49.810653]  d_delete+0x2c/0xb0
      [   49.810951]  vfs_rmdir+0xfd/0x120
      [   49.811264]  do_rmdir+0x14f/0x1a0
      [   49.811573]  do_syscall_64+0x5b/0x190
      [   49.811917]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [   49.812385] RIP: 0033:0x7ffbf505ffd7
      [   49.814404] RSP: 002b:00007ffbedffada8 EFLAGS: 00000297 ORIG_RAX: 0000000000000054
      [   49.815098] RAX: ffffffffffffffda RBX: 00007ffbedffb640 RCX: 00007ffbf505ffd7
      [   49.815744] RDX: 0000000004449700 RSI: 0000000000000000 RDI: 0000000006c8cd50
      [   49.816394] RBP: 00007ffbedffaea0 R08: 0000000000000000 R09: 0000000000017d0b
      [   49.817038] R10: 0000000000000000 R11: 0000000000000297 R12: 0000000000000012
      [   49.817687] R13: 00000000072823d8 R14: 00007ffbedffb700 R15: 00000000072823d8
      [   49.818338] Modules linked in: pvpanic cirrusfb button qemu_fw_cfg atkbd libps2 i8042
      [   49.819052] CR2: 0000000000000088
      [   49.819368] ---[ end trace 4e652b8aa299aa2d ]---
      [   49.819796] RIP: 0010:_raw_spin_lock+0xc/0x20
      [   49.821880] RSP: 0018:ffffac6fc5417e98 EFLAGS: 00010246
      [   49.822363] RAX: 0000000000000000 RBX: ffff93b8da3446c0 RCX: 0000000a00000000
      [   49.823008] RDX: 0000000000000001 RSI: 000000000000000a RDI: 0000000000000088
      [   49.823658] RBP: 0000000000000000 R08: 0000000000000000 R09: ffffffff993cf040
      [   49.825404] R10: ffff93b92292e580 R11: ffffd27f188a4b80 R12: 0000000000000000
      [   49.827147] R13: 00000000ffffff9c R14: 00000000fffffffe R15: ffff93b8da3446c0
      [   49.828890] FS:  00007ffbedffb700(0000) GS:ffff93b927880000(0000) knlGS:0000000000000000
      [   49.830725] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   49.832359] CR2: 0000000000000088 CR3: 00000005e3f74006 CR4: 00000000003606a0
      [   49.834085] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   49.835792] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      
      Cc: <stable@vger.kernel.org>
      Fixes: a6c60655 ("ovl: redirect on rename-dir")
      Signed-off-by: Liangyan <liangyan.peng@linux.alibaba.com>
      Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
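The locking rule in this message can be illustrated with a small userspace sketch. It is an analogy only, not the kernel patch: `struct node`, `node_lock()` and `node_get_parent()` are hypothetical names, the spinlock is a plain C11 `atomic_flag`, and the sketch assumes the parent pointer is stable (the kernel must also cope with `d_parent` changing).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a dentry: a lock, a count and a parent. */
struct node {
    atomic_flag lock;     /* plays the role of d_lock */
    int count;            /* plays the role of d_lockref.count */
    struct node *parent;  /* plays the role of d_parent */
};

static void node_init(struct node *n, struct node *parent)
{
    atomic_flag_clear(&n->lock);
    n->count = 1;
    n->parent = parent;
}

static void node_lock(struct node *n)
{
    /* Spin until the flag was previously clear, i.e. we own the lock. */
    while (atomic_flag_test_and_set_explicit(&n->lock, memory_order_acquire))
        ;
}

static void node_unlock(struct node *n)
{
    atomic_flag_clear_explicit(&n->lock, memory_order_release);
}

/*
 * Take a reference on the parent while holding the parent's own lock,
 * so that another thread doing a locked increment on the same node
 * cannot have its update lost.
 */
static struct node *node_get_parent(struct node *child)
{
    struct node *parent = child->parent;

    assert(parent != NULL);
    node_lock(parent);
    parent->count++;
    node_unlock(parent);
    return parent;
}
```

The point of the sketch is only that the increment and the lock belong to the same object; incrementing the parent's count while holding just the child's lock leaves the parent's count unprotected.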
  4. 02 Sep, 2020 1 commit
  5. 03 Jun, 2020 1 commit
    • V
      ovl: initialize OVL_UPPERDATA in ovl_lookup() · 28166ab3
      Committed by Vivek Goyal
      Currently ovl_get_inode() initializes OVL_UPPERDATA flag and for that it
      has to call ovl_check_metacopy_xattr() and check if metacopy xattr is
      present or not.
      
      yangerkun reported that the underlying filesystem might sometimes return
      -EIO, and in that case the error handling path does not clean up properly,
      leading to various warnings.
      
      Running generic/461 with ext4 upper/lower layers may sometimes trigger the
      bug as below (linux 4.19):
      
      [  551.001349] overlayfs: failed to get metacopy (-5)
      [  551.003464] overlayfs: failed to get inode (-5)
      [  551.004243] overlayfs: cleanup of 'd44/fd51' failed (-5)
      [  551.004941] overlayfs: failed to get origin (-5)
      [  551.005199] ------------[ cut here ]------------
      [  551.006697] WARNING: CPU: 3 PID: 24674 at fs/inode.c:1528 iput+0x33b/0x400
      ...
      [  551.027219] Call Trace:
      [  551.027623]  ovl_create_object+0x13f/0x170
      [  551.028268]  ovl_create+0x27/0x30
      [  551.028799]  path_openat+0x1a35/0x1ea0
      [  551.029377]  do_filp_open+0xad/0x160
      [  551.029944]  ? vfs_writev+0xe9/0x170
      [  551.030499]  ? page_counter_try_charge+0x77/0x120
      [  551.031245]  ? __alloc_fd+0x160/0x2a0
      [  551.031832]  ? do_sys_open+0x189/0x340
      [  551.032417]  ? get_unused_fd_flags+0x34/0x40
      [  551.033081]  do_sys_open+0x189/0x340
      [  551.033632]  __x64_sys_creat+0x24/0x30
      [  551.034219]  do_syscall_64+0xd5/0x430
      [  551.034800]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      One solution is to improve error handling and call iget_failed() if an
      error is encountered.  Amir thinks that this path is a little intricate and
      there is no real need to check and initialize OVL_UPPERDATA in
      ovl_get_inode().  Instead, the caller of ovl_get_inode() can initialize this
      state.  This also avoids checking the metacopy xattr twice, once in
      ovl_lookup() and once in ovl_get_inode().
      
      OVL_UPPERDATA is an inode flag, so I was a little concerned that
      initializing it outside ovl_get_inode() might have some races.  But this is
      a one-way transition: once a file has been fully copied up, it can't go
      back to being a metacopy file.  That seems to help avoid races, and as of
      now I can't see any races w.r.t. OVL_UPPERDATA being set wrongly.  So move
      the setting of OVL_UPPERDATA into the callers of ovl_get_inode().
      ovl_obtain_alias() already does it, so the only two callers left are
      ovl_lookup() and ovl_instantiate().
      Reported-by: yangerkun <yangerkun@huawei.com>
      Suggested-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
      Reviewed-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
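The one-way transition argument above can be modelled in userspace as a flag bit that is only ever set and never cleared. This is a sketch under assumptions: `MODEL_UPPERDATA`, `struct model_inode_flags` and both helpers are hypothetical names, and the bit position is arbitrary; the kernel's own flag handling is not reproduced here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_UPPERDATA (1u << 0)  /* hypothetical bit position */

struct model_inode_flags {
    atomic_uint flags;
};

/* Set the bit atomically; there is deliberately no "clear" helper. */
static void model_set_upperdata(struct model_inode_flags *i)
{
    assert(i != NULL);
    atomic_fetch_or(&i->flags, MODEL_UPPERDATA);
}

static bool model_has_upperdata(struct model_inode_flags *i)
{
    assert(i != NULL);
    return (atomic_load(&i->flags) & MODEL_UPPERDATA) != 0;
}
```

Because the only state change is "unset to set", two callers that both decide to set the bit end in the same state regardless of ordering, which is the property the message relies on.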
  6. 13 May, 2020 1 commit
  7. 27 Mar, 2020 1 commit
    • M
      ovl: fix WARN_ON nlink drop to zero · 83552eac
      Committed by Miklos Szeredi
      Changes to underlying layers should not cause WARN_ON(), but this repro
      does:
      
       mkdir w l u mnt
       sudo mount -t overlay -o workdir=w,lowerdir=l,upperdir=u overlay mnt
       touch mnt/h
       ln u/h u/k
       rm -rf mnt/k
       rm -rf mnt/h
       dmesg
      
       ------------[ cut here ]------------
       WARNING: CPU: 1 PID: 116244 at fs/inode.c:302 drop_nlink+0x28/0x40
      
      After upper hardlinks were added while overlay is mounted, unlinking all
      overlay hardlinks drops overlay nlink to zero before all upper inodes
      are unlinked.
      
      After unlink/rename prevent i_nlink from going to zero if there are still
      hashed aliases (i.e. cached hard links to the victim) remaining.
      Reported-by: Phasip <phasip@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
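A minimal model of the rule in the last paragraph, assuming a simplified inode with only a link count. `struct model_inode` and `model_drop_nlink()` are hypothetical names, and whether another hashed alias exists is passed in by the caller rather than looked up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_inode {
    unsigned int nlink;  /* simplified stand-in for i_nlink */
};

/*
 * Drop one link, but keep the count at 1 or above while another
 * hashed alias (a cached hard link to the same inode) still exists.
 * With no alias left the count may reach zero, but never wraps below it.
 */
static void model_drop_nlink(struct model_inode *inode, bool other_hashed_alias)
{
    unsigned int floor = other_hashed_alias ? 1u : 0u;

    assert(inode != NULL);
    if (inode->nlink > floor)
        inode->nlink--;
}
```

In the repro above the overlay inode starts with a link count of 1 even though two names exist on the upper layer; removing the first name while the second is still cached leaves the count at 1, and removing the second takes it to 0.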
  8. 17 Mar, 2020 2 commits
  9. 23 Jan, 2020 1 commit
  10. 10 Dec, 2019 1 commit
    • A
      ovl: relax WARN_ON() on rename to self · 6889ee5a
      Committed by Amir Goldstein
      In ovl_rename(), if new upper is hardlinked to old upper underneath
      overlayfs before upper dirs are locked, user will get an ESTALE error
      and a WARN_ON will be printed.
      
      Changes to underlying layers while overlayfs is mounted may result in
      unexpected behavior, but it shouldn't crash the kernel and it shouldn't
      trigger WARN_ON() either, so relax this WARN_ON().
      
      Reported-by: syzbot+bb1836a212e69f8e201a@syzkaller.appspotmail.com
      Fixes: 804032fa ("ovl: don't check rename to self")
      Cc: <stable@vger.kernel.org> # v4.9+
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
  11. 19 Jun, 2019 1 commit
  12. 18 Jun, 2019 1 commit
    • N
      ovl: fix typo in MODULE_PARM_DESC · 253e7483
      Committed by Nicolas Schier
      Change the first argument of each MODULE_PARM_DESC() call so that it
      matches the actual module parameter name.  This changes the 'parm' section
      of the output of `modinfo overlay` from:
      
          parm: ovl_check_copy_up:Obsolete; does nothing
          parm: redirect_max:ushort
          parm: ovl_redirect_max:Maximum length of absolute redirect xattr value
          parm: redirect_dir:bool
          parm: ovl_redirect_dir_def:Default to on or off for the redirect_dir feature
          parm: redirect_always_follow:bool
          parm: ovl_redirect_always_follow:Follow redirects even if redirect_dir feature is turned off
          parm: index:bool
          parm: ovl_index_def:Default to on or off for the inodes index feature
          parm: nfs_export:bool
          parm: ovl_nfs_export_def:Default to on or off for the NFS export feature
          parm: xino_auto:bool
          parm: ovl_xino_auto_def:Auto enable xino feature
          parm: metacopy:bool
          parm: ovl_metacopy_def:Default to on or off for the metadata only copy up feature
      
      into:
      
          parm: check_copy_up:Obsolete; does nothing
          parm: redirect_max:Maximum length of absolute redirect xattr value (ushort)
          parm: redirect_dir:Default to on or off for the redirect_dir feature (bool)
          parm: redirect_always_follow:Follow redirects even if redirect_dir feature is turned off (bool)
          parm: index:Default to on or off for the inodes index feature (bool)
          parm: nfs_export:Default to on or off for the NFS export feature (bool)
          parm: xino_auto:Auto enable xino feature (bool)
          parm: metacopy:Default to on or off for the metadata only copy up feature (bool)
      Signed-off-by: Nicolas Schier <n.schier@avm.de>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
  13. 08 May, 2019 1 commit
    • A
      ovl: relax WARN_ON() for overlapping layers use case · acf3062a
      Committed by Amir Goldstein
      This nasty little syzbot repro:
      https://syzkaller.appspot.com/x/repro.syz?x=12c7a94f400000
      
      Creates overlay mounts where the same directory is both in upper and lower
      layers. Simplified example:
      
        mkdir foo work
        mount -t overlay none foo -o"lowerdir=.,upperdir=foo,workdir=work"
      
      The repro runs several threads in parallel that attempt to chdir into foo
      and attempt to symlink/rename/exec/mkdir the file bar.
      
      The repro hits a WARN_ON() I placed in ovl_instantiate(), which suggests
      that an overlay inode already exists in cache and is hashed by the pointer
      of the real upper dentry that ovl_create_real() has just created. At the
      point of the WARN_ON(), both the overlay dir inode lock and the upper dir
      inode lock are held, so at first I did not see how this was possible.
      
      On a closer look, I see that after ovl_create_real(), because of the
      overlapping upper and lower layers, a lookup by another thread can find the
      file foo/bar that was just created in the upper layer, at overlay path
      foo/foo/bar, and hash an overlay inode with the new real dentry as lower
      dentry. This is possible because the overlay directory foo/foo is not
      locked and the upper dentry foo/bar is in dcache, so ovl_lookup() can find
      it without taking the upper dir inode shared lock.
      
      Overlapping layers is considered a wrong setup which would result in
      unexpected behavior, but it shouldn't crash the kernel and it shouldn't
      trigger WARN_ON() either, so relax this WARN_ON() and leave a pr_warn()
      instead to cover all cases of failure to get an overlay inode.
      
      The error returned from failure to insert new inode to cache with
      inode_insert5() was changed to -EEXIST, to distinguish from the error
      -ENOMEM returned on failure to get/allocate inode with iget5_locked().
      
      Reported-by: syzbot+9c69c282adc4edd2b540@syzkaller.appspotmail.com
      Fixes: 01b39dcc ("ovl: use inode_insert5() to hash a newly...")
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
  14. 19 Nov, 2018 1 commit
  15. 31 Oct, 2018 1 commit
    • M
      ovl: check whiteout in ovl_create_over_whiteout() · 5e127580
      Committed by Miklos Szeredi
      Kaixuxia reports that it's possible to crash overlayfs by removing the
      whiteout on the upper layer before creating a directory over it.  This is a
      reproducer:
      
       mkdir lower upper work merge
       touch lower/file
       mount -t overlay overlay -olowerdir=lower,upperdir=upper,workdir=work merge
       rm merge/file
       ls -al merge/file
       rm upper/file
       ls -al merge/
       mkdir merge/file
      
      Before commencing with a vfs_rename(..., RENAME_EXCHANGE) verify that the
      lookup of "upper" is positive and is a whiteout, and return ESTALE
      otherwise.
      
      Reported-by: kaixuxia <xiakaixu1987@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      Fixes: e9be9d5e ("overlay filesystem")
      Cc: <stable@vger.kernel.org> # v3.18
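The check described in the message (verify that the upper lookup is positive and is a whiteout, otherwise return ESTALE) can be sketched as a small predicate. `struct model_dentry` and `check_upper_is_whiteout()` are hypothetical names for illustration; the real code works on dentries returned by a lookup, which is not modelled here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of what a lookup of the upper entry found. */
struct model_dentry {
    bool positive;  /* an object exists at this name */
    bool whiteout;  /* that object is a whiteout */
};

/*
 * Return 0 if it is safe to go on and exchange the new directory
 * with the whiteout, or -ESTALE if the upper layer was changed
 * underneath us (entry gone, or no longer a whiteout).
 */
static int check_upper_is_whiteout(const struct model_dentry *upper)
{
    assert(upper != NULL);
    if (!upper->positive || !upper->whiteout)
        return -ESTALE;
    return 0;
}
```

In the reproducer, `rm upper/file` removes the whiteout behind overlayfs's back, so the lookup comes back negative and the create fails cleanly instead of proceeding with the exchange.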
  16. 27 Oct, 2018 3 commits
  17. 23 Aug, 2018 1 commit
  18. 20 Jul, 2018 2 commits
    • V
      ovl: Set redirect on upper inode when it is linked · 4120fe64
      Committed by Vivek Goyal
      When we create a hardlink to a metacopy upper file, first set the redirect
      on that inode.  Path-based lookup will not work with the newly created link,
      and the redirect will solve that issue.
      
      Also use an absolute redirect, as the two hardlinks could be in different
      directories and a relative redirect will not work.
      
      I have not put any additional locking around setting redirects while
      introducing redirects for non-dir files.  For now it feels like the existing
      locking is sufficient.  If that's not the case, we will have to add more
      locking.  Following is my rationale for why I think the current locking
      seems ok.
      
      The basic problem for non-dir files is that more than one dentry could be
      pointing to the same inode, and in theory relying only on dentry based locks
      (d->d_lock) did not seem sufficient.
      
      We set redirect upon rename and upon link creation.  In both paths, for a
      non-dir file, VFS locks both source and target inodes (->i_rwsem).  That
      means vfs rename and link operations on the same source and target can't be
      happening in parallel (even if there are multiple dentries pointing to the
      same inode).  So that probably means that at any time, on a given inode,
      only one call of ovl_set_redirect() could be working and we don't need
      additional locking in ovl_set_redirect().
      
      ovl_inode->redirect is initialized only when inode is created new.  That
      means it should not race with any other path and setting
      ovl_inode->redirect should be fine.
      
      Reading of ovl_inode->redirect happens in the ovl_get_redirect() path, and
      this is called only in ovl_set_redirect().  ovl_set_redirect() already
      seems to be protected by ->i_rwsem.  That means ovl_set_redirect() and
      ovl_get_redirect() on the source/target inode should not make progress in
      parallel and are mutually exclusive.  Hence no additional locking is required.
      
      Now, only case where ovl_set_redirect() and ovl_get_redirect() could race
      seems to be case of absolute redirects where ovl_get_redirect() has to
      travel up the tree.  In that case we already take d->d_lock and that should
      be sufficient as directories will not have multiple dentries pointing to
      same inode.
      
      So given VFS locking and current usage of redirect, current locking around
      redirect seems to be ok for non-dir as well.  Once we have the logic to
      remove redirect when metacopy file gets copied up, then we probably will
      need additional locking.
      Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
      Reviewed-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • V
      ovl: Set redirect on metacopy files upon rename · 7bb08383
      Committed by Vivek Goyal
      Set redirect on metacopy files upon rename.  This will help find data
      dentry in lower dirs.
      Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
      Reviewed-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
  19. 18 Jul, 2018 1 commit
    • M
      ovl: copy up times · d9854c87
      Committed by Miklos Szeredi
      Copy up mtime and ctime to overlay inode after times in real object are
      modified.  Be careful not to dirty cachelines when not necessary.
      
      This is in preparation for moving overlay functionality out of the VFS.
      
      This patch shouldn't have any observable effect.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
  20. 31 May, 2018 8 commits
  21. 24 Jan, 2018 1 commit
  22. 20 Jan, 2018 1 commit
    • A
      ovl: take lower dir inode mutex outside upper sb_writers lock · 6d0a8a90
      Committed by Amir Goldstein
      The functions ovl_lower_positive() and ovl_check_empty_dir() both take
      inode mutex on the real lower dir under ovl_want_write() which takes
      the upper_mnt sb_writers lock.
      
      While this is not a clear locking order or layering violation, it creates
      an undesired lock dependency between two unrelated layers for no good
      reason.
      
      This lock dependency materializes as a false(?) positive lockdep warning
      when calling rmdir() on a nested overlayfs, where the nested and the
      underlying overlayfs both use the same fs type as upper layer.
      
      rmdir() on the nested overlayfs creates the lock chain:
        sb_writers of upper_mnt (e.g. tmpfs) in ovl_do_remove()
        ovl_i_mutex_dir_key[] of lower overlay dir in ovl_lower_positive()
      
      rmdir() on the underlying overlayfs creates the lock chain in
      reverse order:
        ovl_i_mutex_dir_key[] of lower overlay dir in vfs_rmdir()
        sb_writers of nested upper_mnt (e.g. tmpfs) in ovl_do_remove()
      
      To get rid of the unneeded locking dependency, move both
      ovl_lower_positive() and ovl_check_empty_dir() to before ovl_want_write()
      in the rmdir() and rename() implementations.
      
      This change spreads the pieces of ovl_check_empty_and_clear() directly
      inside the rmdir()/rename() implementations, so the helper is no longer
      needed and is removed.
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
  23. 14 Dec, 2017 1 commit
  24. 09 Nov, 2017 3 commits
    • A
      ovl: update cache version of impure parent on rename · f30536f0
      Committed by Amir Goldstein
      ovl_rename() updates dir cache version for impure old parent if an entry
      with copy up origin is moved into old parent, but it did not update
      cache version if the entry moved out of old parent has a copy up origin.
      
      [SzM] Same for new dir: we updated the version if an entry with origin was
      moved in, but not if an entry with origin was moved out.
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • Z
      ovl: fix rmdir problem on non-merge dir with origin xattr · 07f6fff1
      Committed by zhangyi (F)
      An "origin && non-merge" upper dir may have leftover whiteouts that
      were created in a past mount. Overlayfs does not clear this dir when we
      delete it, which may lead to rmdir failure or a temp file left in workdir.
      
      Simple reproducer:
        mkdir lower upper work merge
        mkdir -p lower/dir
        touch lower/dir/a
        mount -t overlay overlay -olowerdir=lower,upperdir=upper,\
          workdir=work merge
        rm merge/dir/a
        umount merge
        rm -rf lower/*
        touch lower/dir  (*)
        mount -t overlay overlay -olowerdir=lower,upperdir=upper,\
          workdir=work merge
        rm -rf merge/dir
      
      Syslog dump:
        overlayfs: cleanup of 'work/#7' failed (-39)
      
      (*): if we do not create the regular file, the result is different:
        rm: cannot remove "dir/": Directory not empty
      
      This patch adds a check for the case of non-merge dir that may contain
      whiteouts, and calls ovl_check_empty_dir() to check and clear whiteouts
      from upper dir when an empty dir is being deleted.
      
      [amir: split patch from ovl_check_empty_dir() cleanup
             rename ovl_is_origin() to ovl_may_have_whiteouts()
             check OVL_WHITEOUTS flag instead of checking origin xattr]
      Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • Z
      ovl: simplify ovl_check_empty_and_clear() · 95e598e7
      Committed by zhangyi (F)
      Filter out non-whiteout non-upper entries from list of merge dir entries
      while checking if merge dir is empty in ovl_check_empty_dir().
      The remaining work for ovl_clear_empty() is to clear all entries on the
      list.
      
      [amir: split patch from rmdir bug fix]
      Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
  25. 05 Oct, 2017 1 commit
  26. 14 Sep, 2017 1 commit
    • M
      mm: treewide: remove GFP_TEMPORARY allocation flag · 0ee931c4
      Committed by Michal Hocko
      GFP_TEMPORARY was introduced by commit e12ba74d ("Group short-lived
      and reclaimable kernel allocations") along with __GFP_RECLAIMABLE.  Its
      primary motivation was to allow users to tell that an allocation is
      short lived and so the allocator can try to place such allocations close
      together and prevent long term fragmentation.  As much as this sounds
      like a reasonable semantic it becomes much less clear when to use the
      highlevel GFP_TEMPORARY allocation flag.  How long is temporary? Can the
      context holding that memory sleep? Can it take locks? It seems there is
      no good answer for those questions.
      
      The current implementation of GFP_TEMPORARY is basically GFP_KERNEL |
      __GFP_RECLAIMABLE which in itself is tricky because basically none of
      the existing callers provide a way to reclaim the allocated memory.  So
      this is rather misleading and hard to evaluate for any benefits.
      
      I have checked some random users and none of them has added the flag
      with a specific justification.  I suspect most of them just copied from
      other existing users and others just thought it might be a good idea to
      use without any measuring.  This suggests that GFP_TEMPORARY just
      motivates for cargo cult usage without any reasoning.
      
      I believe that our gfp flags are quite complex already and especially
      those with highlevel semantic should be clearly defined to prevent from
      confusion and abuse.  Therefore I propose dropping GFP_TEMPORARY and
      replace all existing users to simply use GFP_KERNEL.  Please note that
      SLAB users with shrinkers will still get __GFP_RECLAIMABLE heuristic and
      so they will be placed properly for memory fragmentation prevention.
      
      I can see reasons we might want some gfp flag to reflect short term
      allocations but I propose starting from a clear semantic definition and
      only then add users with proper justification.
      
      This was brought up before LSF this year by Matthew [1] and it
      turned out that GFP_TEMPORARY really doesn't have a clear semantic.  It
      seems to be a heuristic without any measured advantage for most (if not
      all) its current users.  The follow up discussion has revealed that
      opinions on what might be temporary allocation differ a lot between
      developers.  So rather than trying to tweak existing users into a
      semantic which they haven't expected I propose to simply remove the flag
      and start from scratch if we really need a semantic for short term
      allocations.
      
      [1] http://lkml.kernel.org/r/20170118054945.GD18349@bombadil.infradead.org
      
      [akpm@linux-foundation.org: fix typo]
      [akpm@linux-foundation.org: coding-style fixes]
      [sfr@canb.auug.org.au: drm/i915: fix up]
        Link: http://lkml.kernel.org/r/20170816144703.378d4f4d@canb.auug.org.au
      Link: http://lkml.kernel.org/r/20170728091904.14627-1-mhocko@kernel.org
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Acked-by: Mel Gorman <mgorman@suse.de>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Neil Brown <neilb@suse.de>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  27. 28 Jul, 2017 1 commit
    • M
      ovl: constant d_ino for non-merge dirs · 4edb83bb
      Committed by Miklos Szeredi
      Impure directories are ones which contain objects with origins (i.e. those
      that have been copied up).  These are relevant to readdir operation only
      because of the d_ino field, no other transformation is necessary.  Also a
      directory can become impure between two getdents(2) calls.
      
      This patch creates a cache for impure directories.  Unlike the cache for
      merged directories, this one only contains entries with origin and is not
      refcounted but has its lifetime tied to that of the dentry.
      
      Similarly to the merged cache, the impure cache is invalidated based on a
      version number.  This version number is incremented when an entry with
      origin is added or removed from the directory.
      
      If the cache is empty, then the impure xattr is removed from the directory.
      
      This patch also fixes up handling of d_ino for the ".." entry if the parent
      directory is merged.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
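The version-based invalidation described above can be sketched in userspace as follows. This is a simplified model, not the overlayfs implementation: `struct model_dir`, `struct model_cache` and the helper names are hypothetical, and the "cache" only stores a count of entries with origin rather than the entries themselves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Directory state: the version is bumped on every relevant change. */
struct model_dir {
    uint64_t version;
    int origin_entries;  /* number of entries that have an origin */
};

/* Cache state: remembers which directory version it was built from. */
struct model_cache {
    bool valid;
    uint64_t version;
    int entries;
    unsigned int rebuilds;  /* counts rebuilds, for demonstration only */
};

static void dir_add_origin_entry(struct model_dir *d)
{
    d->origin_entries++;
    d->version++;
}

static void dir_remove_origin_entry(struct model_dir *d)
{
    assert(d->origin_entries > 0);
    d->origin_entries--;
    d->version++;
}

/* Return the cached entry count, rebuilding if the version is stale. */
static int cache_get_entries(struct model_cache *c, const struct model_dir *d)
{
    if (!c->valid || c->version != d->version) {
        c->entries = d->origin_entries;
        c->version = d->version;
        c->valid = true;
        c->rebuilds++;
    }
    return c->entries;
}
```

Comparing a single version number is cheaper than tracking individual changes, at the cost of rebuilding the whole cache after any add or remove.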