1. 10 6月, 2016 1 次提交
    • A
      much milder d_walk() race · ba65dc5e
      Al Viro 提交于
      d_walk() relies upon the tree not getting rearranged under it without
      rename_lock being touched.  And we do grab rename_lock around the
      places that change the tree topology.  Unfortunately, branch reordering
      is just as bad from d_walk() POV and we have two places that do it
      without touching rename_lock - one in handling of cursors (for ramfs-style
      directories) and another in autofs.  autofs one is a separate story; this
      commit deals with the cursors.
      	* mark cursor dentries explicitly at allocation time
      	* make __dentry_kill() leave ->d_child.next pointing to the next
      non-cursor sibling, making sure that it won't be moved around unnoticed
      before the parent is relocked on ascend-to-parent path in d_walk().
      	* make d_walk() skip cursors explicitly; strictly speaking it's
      not necessary (all callbacks we pass to d_walk() are no-ops on cursors),
      but it makes analysis easier.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      ba65dc5e
  2. 29 5月, 2016 1 次提交
  3. 11 5月, 2016 1 次提交
  4. 03 5月, 2016 3 次提交
    • A
      parallel lookups machinery, part 4 (and last) · d9171b93
      Al Viro 提交于
      If we *do* run into an in-lookup match, we need to wait for it to
      cease being in-lookup.  Fortunately, we do have unused space in
      in-lookup dentries - d_lru is never looked at until it stops being
      in-lookup.
      
      So we can stash a pointer to wait_queue_head from stack frame of
      the caller of ->lookup().  Some precautions are needed while
      waiting, but it's not that hard - we do hold a reference to dentry
      we are waiting for, so it can't go away.  If it's found to be
      in-lookup the wait_queue_head is still alive and will remain so
      at least while ->d_lock is held.  Moreover, the condition we
      are waiting for becomes true at the same point where everything
      on that wq gets woken up, so we can just add ourselves to the
      queue once.
      
      d_alloc_parallel() gets a pointer to wait_queue_head_t from its
      caller; lookup_slow() adjusted, d_add_ci() taught to use
      d_alloc_parallel() if the dentry passed to it happens to be
      in-lookup one (i.e. if it's been called from the parallel lookup).
      
      That's pretty much it - all that remains is to switch ->i_mutex
      to rwsem and have lookup_slow() take it shared.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      d9171b93
    • A
      parallel lookups machinery, part 3 · 94bdd655
      Al Viro 提交于
      We will need to be able to check if there is an in-lookup
      dentry with matching parent/name.  Right now it's impossible,
      but as soon as start locking directories shared such beasts
      will appear.
      
      Add a secondary hash for locating those.  Hash chains go through
      the same space where d_alias will be once it's not in-lookup anymore.
      Search is done under the same bitlock we use for modifications -
      with the primary hash we can rely on d_rehash() into the wrong
      chain being the worst that could happen, but here the pointers are
      buggered once it's removed from the chain.  On the other hand,
      the chains are not going to be long and normally we'll end up
      adding to the chain anyway.  That allows us to avoid bothering with
      ->d_lock when doing the comparisons - everything is stable until
      removed from chain.
      
      New helper: d_alloc_parallel().  Right now it allocates, verifies
      that no hashed and in-lookup matches exist and adds to in-lookup
      hash.
      
      Returns ERR_PTR() for error, hashed match (in the unlikely case it's
      been found) or new dentry.  In-lookup matches trigger BUG() for
      now; that will change in the next commit when we introduce waiting
      for ongoing lookup to finish.  Note that in-lookup matches won't be
      possible until we actually go for shared locking.
      
      lookup_slow() switched to use of d_alloc_parallel().
      
      Again, these commits are separated only for making it easier to
      review.  All this machinery will start doing something useful only
      when we go for shared locking; it's just that the combination is
      too large for my taste.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      94bdd655
    • A
      beginning of transition to parallel lookups - marking in-lookup dentries · 85c7f810
      Al Viro 提交于
      marked as such when (would be) parallel lookup is about to pass them
      to actual ->lookup(); unmarked when
      	* __d_add() is about to make it hashed, positive or not.
      	* __d_move() (from d_splice_alias(), directly or via
      __d_unalias()) puts a preexisting dentry in its place
      	* in caller of ->lookup() if it has escaped all of the
      above.  Bug (WARN_ON, actually) if it reaches the final dput()
      or d_instantiate() while still marked such.
      
      As the result, we are guaranteed that for as long as the flag is
      set, dentry will
      	* remain negative unhashed with positive refcount
      	* never have its ->d_alias looked at
      	* never have its ->d_lru looked at
      	* never have its ->d_parent and ->d_name changed
      
      Right now we have at most one such for any given parent directory.
      With parallel lookups that restriction will weaken to
      	* only exist when parent is locked shared
      	* at most one with given (parent,name) pair (comparison of
      names is according to ->d_compare())
      	* only exist when there's no hashed dentry with the same
      (parent,name)
      
      Transition will take the next several commits; unfortunately, we'll
      only be able to switch to rwsem at the end of this series.  The
      reason for not making it a single patch is to simplify review.
      
      New primitives: d_in_lookup() (a predicate checking if dentry is in
      the in-lookup state) and d_lookup_done() (tells the system that
      we are done with lookup and if it's still marked as in-lookup, it
      should cease to be such).
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      85c7f810
  5. 27 3月, 2016 1 次提交
    • M
      fs: add file_dentry() · d101a125
      Miklos Szeredi 提交于
      This series fixes bugs in nfs and ext4 due to 4bacc9c9 ("overlayfs:
      Make f_path always point to the overlay and f_inode to the underlay").
      
      Regular files opened on overlayfs will result in the file being opened on
      the underlying filesystem, while f_path points to the overlayfs
      mount/dentry.
      
      This confuses filesystems which get the dentry from struct file and assume
      it's theirs.
      
      Add a new helper, file_dentry() [*], to get the filesystem's own dentry
      from the file.  This checks file->f_path.dentry->d_flags against
      DCACHE_OP_REAL, and returns file->f_path.dentry if DCACHE_OP_REAL is not
      set (this is the common, non-overlayfs case).
      
      In the uncommon case it will call into overlayfs's ->d_real() to get the
      underlying dentry, matching file_inode(file).
      
      The reason we need to check against the inode is that if the file is copied
      up while being open, d_real() would return the upper dentry, while the open
      file comes from the lower dentry.
      
      [*] If possible, it's better simply to use file_inode() instead.
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      Tested-by: NGoldwyn Rodrigues <rgoldwyn@suse.com>
      Reviewed-by: NTrond Myklebust <trond.myklebust@primarydata.com>
      Cc: <stable@vger.kernel.org> # v4.2
      Cc: David Howells <dhowells@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Daniel Axtens <dja@axtens.net>
      d101a125
  6. 18 3月, 2016 1 次提交
    • J
      fs crypto: move per-file encryption from f2fs tree to fs/crypto · 0b81d077
      Jaegeuk Kim 提交于
      This patch adds the renamed functions moved from the f2fs crypto files.
      
      1. definitions for per-file encryption used by ext4 and f2fs.
      
      2. crypto.c for encrypt/decrypt functions
       a. IO preparation:
        - fscrypt_get_ctx / fscrypt_release_ctx
       b. before IOs:
        - fscrypt_encrypt_page
        - fscrypt_decrypt_page
        - fscrypt_zeroout_range
       c. after IOs:
        - fscrypt_decrypt_bio_pages
        - fscrypt_pullback_bio_page
        - fscrypt_restore_control_page
      
      3. policy.c supporting context management.
       a. For ioctls:
        - fscrypt_process_policy
        - fscrypt_get_policy
       b. For context permission
        - fscrypt_has_permitted_context
        - fscrypt_inherit_context
      
      4. keyinfo.c to handle permissions
        - fscrypt_get_encryption_info
        - fscrypt_free_encryption_info
      
      5. fname.c to support filename encryption
       a. general wrapper functions
        - fscrypt_fname_disk_to_usr
        - fscrypt_fname_usr_to_disk
        - fscrypt_setup_filename
        - fscrypt_free_filename
      
       b. specific filename handling functions
        - fscrypt_fname_alloc_buffer
        - fscrypt_fname_free_buffer
      
      6. Makefile and Kconfig
      
      Cc: Al Viro <viro@ftp.linux.org.uk>
      Signed-off-by: NMichael Halcrow <mhalcrow@google.com>
      Signed-off-by: NIldar Muslukhov <ildarm@google.com>
      Signed-off-by: NUday Savagaonkar <savagaon@google.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      0b81d077
  7. 14 3月, 2016 2 次提交
  8. 01 3月, 2016 1 次提交
  9. 15 1月, 2016 1 次提交
  10. 18 7月, 2015 1 次提交
    • N
      include, lib: add __printf attributes to several function prototypes · 8db14860
      Nicolas Iooss 提交于
      Using __printf attributes helps to detect several format string issues
      at compile time (even though -Wformat-security is currently disabled in
      Makefile).  For example it can detect when formatting a pointer as a
      number, like the issue fixed in commit a3fa71c4 ("wl18xx: show
      rx_frames_per_rates as an array as it really is"), or when the arguments
      do not match the format string, c.f.  for example commit 5ce1aca8
      ("reiserfs: fix __RASSERT format string").
      
      To prevent similar bugs in the future, add a __printf attribute to every
      function prototype which needs one in include/linux/ and lib/.  These
      functions were mostly found by using gcc's -Wsuggest-attribute=format
      flag.
      Signed-off-by: NNicolas Iooss <nicolas.iooss_linux@m4x.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Felipe Balbi <balbi@ti.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8db14860
  11. 24 6月, 2015 1 次提交
  12. 19 6月, 2015 1 次提交
    • D
      overlayfs: Make f_path always point to the overlay and f_inode to the underlay · 4bacc9c9
      David Howells 提交于
      Make file->f_path always point to the overlay dentry so that the path in
      /proc/pid/fd is correct and to ensure that label-based LSMs have access to the
      overlay as well as the underlay (path-based LSMs probably don't need it).
      
      Using my union testsuite to set things up, before the patch I see:
      
      	[root@andromeda union-testsuite]# bash 5</mnt/a/foo107
      	[root@andromeda union-testsuite]# ls -l /proc/$$/fd/
      	...
      	lr-x------. 1 root root 64 Jun  5 14:38 5 -> /a/foo107
      	[root@andromeda union-testsuite]# stat /mnt/a/foo107
      	...
      	Device: 23h/35d Inode: 13381       Links: 1
      	...
      	[root@andromeda union-testsuite]# stat -L /proc/$$/fd/5
      	...
      	Device: 23h/35d Inode: 13381       Links: 1
      	...
      
      After the patch:
      
      	[root@andromeda union-testsuite]# bash 5</mnt/a/foo107
      	[root@andromeda union-testsuite]# ls -l /proc/$$/fd/
      	...
      	lr-x------. 1 root root 64 Jun  5 14:22 5 -> /mnt/a/foo107
      	[root@andromeda union-testsuite]# stat /mnt/a/foo107
      	...
      	Device: 23h/35d Inode: 40346       Links: 1
      	...
      	[root@andromeda union-testsuite]# stat -L /proc/$$/fd/5
      	...
      	Device: 23h/35d Inode: 40346       Links: 1
      	...
      
      Note the change in where /proc/$$/fd/5 points to in the ls command.  It was
      pointing to /a/foo107 (which doesn't exist) and now points to /mnt/a/foo107
      (which is correct).
      
      The inode accessed, however, is the lower layer.  The union layer is on device
      25h/37d and the upper layer on 24h/36d.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      4bacc9c9
  13. 16 4月, 2015 2 次提交
    • D
      VFS: Impose ordering on accesses of d_inode and d_flags · 4bf46a27
      David Howells 提交于
      Impose ordering on accesses of d_inode and d_flags to avoid the need to do
      this:
      
      	if (!dentry->d_inode || d_is_negative(dentry)) {
      
      when this:
      
      	if (d_is_negative(dentry)) {
      
      should suffice.
      
      This check is especially problematic if a dentry can have its type field set
      to something other than DENTRY_MISS_TYPE when d_inode is NULL (as in
      unionmount).
      
      What we really need to do is stick a write barrier between setting d_inode and
      setting d_flags and a read barrier between reading d_flags and reading
      d_inode.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      4bf46a27
    • D
      VFS: Add owner-filesystem positive/negative dentry checks · 525d27b2
      David Howells 提交于
      Supply two functions to test whether a filesystem's own dentries are positive
      or negative (d_really_is_positive() and d_really_is_negative()).
      
      The problem is that the DCACHE_ENTRY_TYPE field of dentry->d_flags may be
      overridden by the union part of a layered filesystem and isn't thus
      necessarily indicative of the type of dentry.
      
      Normally, this would involve a negative dentry (ie. ->d_inode == NULL) having
      ->d_layer.lower pointed to a lower layer dentry, DCACHE_PINNING_LOWER set and
      the DCACHE_ENTRY_TYPE field set to something other than DCACHE_MISS_TYPE - but
      it could also involve, say, a DCACHE_SPECIAL_TYPE being overridden to
      DCACHE_WHITEOUT_TYPE if a 0,0 chardev is detected in the top layer.
      
      However, inside a filesystem, when that fs is looking at its own dentries, it
      probably wants to know if they are really negative or not - and doesn't care
      about the fallthrough bits used by the union.
      
      To this end, a filesystem should normally use d_really_is_positive/negative()
      when looking at its own dentries rather than d_is_positive/negative() and
      should use d_inode() to get at the inode.
      
      Anyone looking at someone else's dentries (this includes pathwalk) should use
      d_is_xxx() and d_backing_inode().
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      525d27b2
  14. 23 2月, 2015 4 次提交
  15. 26 1月, 2015 1 次提交
  16. 20 11月, 2014 2 次提交
  17. 04 11月, 2014 1 次提交
  18. 13 10月, 2014 2 次提交
  19. 09 10月, 2014 2 次提交
  20. 15 9月, 2014 1 次提交
    • L
      vfs: avoid non-forwarding large load after small store in path lookup · 9226b5b4
      Linus Torvalds 提交于
      The performance regression that Josef Bacik reported in the pathname
      lookup (see commit 99d263d4 "vfs: fix bad hashing of dentries") made
      me look at performance stability of the dcache code, just to verify that
      the problem was actually fixed.  That turned up a few other problems in
      this area.
      
      There are a few cases where we exit RCU lookup mode and go to the slow
      serializing case when we shouldn't, Al has fixed those and they'll come
      in with the next VFS pull.
      
      But my performance verification also shows that link_path_walk() turns
      out to have a very unfortunate 32-bit store of the length and hash of
      the name we look up, followed by a 64-bit read of the combined hash_len
      field.  That screws up the processor store to load forwarding, causing
      an unnecessary hickup in this critical routine.
      
      It's caused by the ugly calling convention for the "hash_name()"
      function, and easily fixed by just making hash_name() fill in the whole
      'struct qstr' rather than passing it a pointer to just the hash value.
      
      With that, the profile for this function looks much smoother.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9226b5b4
  21. 08 8月, 2014 1 次提交
    • J
      dcache: d_obtain_alias callers don't all want DISCONNECTED · 1a0a397e
      J. Bruce Fields 提交于
      There are a few d_obtain_alias callers that are using it to get the
      root of a filesystem which may already have an alias somewhere else.
      
      This is not the same as the filehandle-lookup case, and none of them
      actually need DCACHE_DISCONNECTED set.
      
      It isn't really a serious problem, but it would really be clearer if we
      reserved DCACHE_DISCONNECTED for those cases where it's actually needed.
      
      In the btrfs case this was causing a spurious printk from
      nfsd/nfsfh.c:fh_verify when it found an unexpected DCACHE_DISCONNECTED
      dentry.  Josef worked around this by unsetting DCACHE_DISCONNECTED
      manually in 3a0dfa6a "Btrfs: unset DCACHE_DISCONNECTED when mounting
      default subvol", and this replaces that workaround.
      
      Cc: Josef Bacik <jbacik@fb.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      1a0a397e
  22. 01 5月, 2014 1 次提交
    • A
      dentry_kill(): don't try to remove from shrink list · 41edf278
      Al Viro 提交于
      If the victim in on the shrink list, don't remove it from there.
      If shrink_dentry_list() manages to remove it from the list before
      we are done - fine, we'll just free it as usual.  If not - mark
      it with new flag (DCACHE_MAY_FREE) and leave it there.
      
      Eventually, shrink_dentry_list() will get to it, remove the sucker
      from shrink list and call dentry_kill(dentry, 0).  Which is where
      we'll deal with freeing.
      
      Since now dentry_kill(dentry, 0) may happen after or during
      dentry_kill(dentry, 1), we need to recognize that (by seeing
      DCACHE_DENTRY_KILLED already set), unlock everything
      and either free the sucker (in case DCACHE_MAY_FREE has been
      set) or leave it for ongoing dentry_kill(dentry, 1) to deal with.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      41edf278
  23. 01 4月, 2014 2 次提交
  24. 13 12月, 2013 1 次提交
  25. 09 11月, 2013 1 次提交
    • D
      VFS: Put a small type field into struct dentry::d_flags · b18825a7
      David Howells 提交于
      Put a type field into struct dentry::d_flags to indicate if the dentry is one
      of the following types that relate particularly to pathwalk:
      
      	Miss (negative dentry)
      	Directory
      	"Automount" directory (defective - no i_op->lookup())
      	Symlink
      	Other (regular, socket, fifo, device)
      
      The type field is set to one of the first five types on a dentry by calls to
      __d_instantiate() and d_obtain_alias() from information in the inode (if one is
      given).
      
      The type is cleared by dentry_unlink_inode() when it reconstitutes an existing
      dentry as a negative dentry.
      
      Accessors provided are:
      
      	d_set_type(dentry, type)
      	d_is_directory(dentry)
      	d_is_autodir(dentry)
      	d_is_symlink(dentry)
      	d_is_file(dentry)
      	d_is_negative(dentry)
      	d_is_positive(dentry)
      
      A bunch of checks in pathname resolution switched to those.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      b18825a7
  26. 25 10月, 2013 1 次提交
  27. 11 9月, 2013 2 次提交
    • G
      super: fix calculation of shrinkable objects for small numbers · 55f841ce
      Glauber Costa 提交于
      The sysctl knob sysctl_vfs_cache_pressure is used to determine which
      percentage of the shrinkable objects in our cache we should actively try
      to shrink.
      
      It works great in situations in which we have many objects (at least more
      than 100), because the aproximation errors will be negligible.  But if
      this is not the case, specially when total_objects < 100, we may end up
      concluding that we have no objects at all (total / 100 = 0, if total <
      100).
      
      This is certainly not the biggest killer in the world, but may matter in
      very low kernel memory situations.
      Signed-off-by: NGlauber Costa <glommer@openvz.org>
      Reviewed-by: NCarlos Maiolino <cmaiolino@redhat.com>
      Acked-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: NMel Gorman <mgorman@suse.de>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
      Cc: Arve Hjønnevåg <arve@android.com>
      Cc: Carlos Maiolino <cmaiolino@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Chuck Lever <chuck.lever@oracle.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: J. Bruce Fields <bfields@redhat.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Kent Overstreet <koverstreet@google.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Thomas Hellstrom <thellstrom@vmware.com>
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      55f841ce
    • G
      fs: bump inode and dentry counters to long · 3942c07c
      Glauber Costa 提交于
      This series reworks our current object cache shrinking infrastructure in
      two main ways:
      
       * Noticing that a lot of users copy and paste their own version of LRU
         lists for objects, we put some effort in providing a generic version.
         It is modeled after the filesystem users: dentries, inodes, and xfs
         (for various tasks), but we expect that other users could benefit in
         the near future with little or no modification.  Let us know if you
         have any issues.
      
       * The underlying list_lru being proposed automatically and
         transparently keeps the elements in per-node lists, and is able to
         manipulate the node lists individually.  Given this infrastructure, we
         are able to modify the up-to-now hammer called shrink_slab to proceed
         with node-reclaim instead of always searching memory from all over like
         it has been doing.
      
      Per-node lru lists are also expected to lead to less contention in the lru
      locks on multi-node scans, since we are now no longer fighting for a
      global lock.  The locks usually disappear from the profilers with this
      change.
      
      Although we have no official benchmarks for this version - be our guest to
      independently evaluate this - earlier versions of this series were
      performance tested (details at
      http://permalink.gmane.org/gmane.linux.kernel.mm/100537) yielding no
      visible performance regressions while yielding a better qualitative
      behavior in NUMA machines.
      
      With this infrastructure in place, we can use the list_lru entry point to
      provide memcg isolation and per-memcg targeted reclaim.  Historically,
      those two pieces of work have been posted together.  This version presents
      only the infrastructure work, deferring the memcg work for a later time,
      so we can focus on getting this part tested.  You can see more about the
      history of such work at http://lwn.net/Articles/552769/
      
      Dave Chinner (18):
        dcache: convert dentry_stat.nr_unused to per-cpu counters
        dentry: move to per-sb LRU locks
        dcache: remove dentries from LRU before putting on dispose list
        mm: new shrinker API
        shrinker: convert superblock shrinkers to new API
        list: add a new LRU list type
        inode: convert inode lru list to generic lru list code.
        dcache: convert to use new lru list infrastructure
        list_lru: per-node list infrastructure
        shrinker: add node awareness
        fs: convert inode and dentry shrinking to be node aware
        xfs: convert buftarg LRU to generic code
        xfs: rework buffer dispose list tracking
        xfs: convert dquot cache lru to list_lru
        fs: convert fs shrinkers to new scan/count API
        drivers: convert shrinkers to new count/scan API
        shrinker: convert remaining shrinkers to count/scan API
        shrinker: Kill old ->shrink API.
      
      Glauber Costa (7):
        fs: bump inode and dentry counters to long
        super: fix calculation of shrinkable objects for small numbers
        list_lru: per-node API
        vmscan: per-node deferred work
        i915: bail out earlier when shrinker cannot acquire mutex
        hugepage: convert huge zero page shrinker to new shrinker API
        list_lru: dynamically adjust node arrays
      
      This patch:
      
      There are situations in very large machines in which we can have a large
      quantity of dirty inodes, unused dentries, etc.  This is particularly true
      when umounting a filesystem, where eventually since every live object will
      eventually be discarded.
      
      Dave Chinner reported a problem with this while experimenting with the
      shrinker revamp patchset.  So we believe it is time for a change.  This
      patch just moves int to longs.  Machines where it matters should have a
      big long anyway.
      Signed-off-by: NGlauber Costa <glommer@openvz.org>
      Cc: Dave Chinner <dchinner@redhat.com>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
      Cc: Arve Hjønnevåg <arve@android.com>
      Cc: Carlos Maiolino <cmaiolino@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Chuck Lever <chuck.lever@oracle.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Dave Chinner <dchinner@redhat.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: J. Bruce Fields <bfields@redhat.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Kent Overstreet <koverstreet@google.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Thomas Hellstrom <thellstrom@vmware.com>
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      3942c07c
  28. 09 9月, 2013 1 次提交
    • L
      vfs: reorganize dput() memory accesses · 8aab6a27
      Linus Torvalds 提交于
      This is me being a bit OCD after all the dentry optimization work this
      merge window: profiles end up showing 'dput()' as a rather expensive
      operation, and there were two unrelated bad reasons for that.
      
      The first reason was reading d_lockref.count for debugging purposes,
      which touches the lockref cacheline (for reads) before really need to.
      More importantly, the debugging test in question is _wrong_, and has
      hidden bugs.  It's true that we can only sleep when the count goes down
      to zero, but the test as-is hides the much more subtle bug that happens
      if we race with somebody else deleting the file.
      
      Anyway we _will_ touch that cacheline, but let's do it for a write and
      in the right routine (ie in "lockref_put_or_lock()") which annotates the
      costs better.  So remove the misleading debug code.
      
      The other was an unnecessary access to the cacheline that contains the
      d_lru list, just to check whether we already were on the LRU list or
      not.  This is exactly what we have d_flags for, so that we can avoid
      touching extra cache lines for the common case.  So just add another bit
      for "is this dentry on the LRU".
      
      Finally, mark the tests properly likely/unlikely, so that the common
      fast-paths are dense in the instruction stream.
      
      This makes the profiles look much saner.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8aab6a27