1. 14 Feb 2017 (1 commit)
    • Reimplement IDR and IDA using the radix tree · 0a835c4f
      Matthew Wilcox committed
      The IDR is very similar to the radix tree.  It has some functionality that
      the radix tree did not have (alloc next free, cyclic allocation, a
      callback-based for_each, destroy tree), which is readily implementable on
      top of the radix tree.  A few small changes were needed in order to use a
      tag to represent nodes with free space below them.  More extensive
      changes were needed to support storing NULL as a valid entry in an IDR.
      Plain radix trees still interpret NULL as a not-present entry.
      
      The IDA is reimplemented as a client of the newly enhanced radix tree.  As
      in the current implementation, it uses a bitmap at the last level of the
      tree.
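      As a minimal usage sketch of the API being reimplemented (my_idr and
      ptr are placeholder names; end == 0 requests no upper limit):

      	DEFINE_IDR(my_idr);

      	int id = idr_alloc(&my_idr, ptr, 0, 0, GFP_KERNEL);
      	if (id < 0)
      		return id;			/* -ENOMEM or -ENOSPC */

      	ptr = idr_find(&my_idr, id);		/* NULL if nothing is stored */
      	idr_remove(&my_idr, id);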
      Signed-off-by: Matthew Wilcox <willy@infradead.org>
      Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
      Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Konstantin Khlebnikov <koct9i@gmail.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  2. 13 Dec 2016 (1 commit)
  3. 07 Nov 2015 (1 commit)
    • mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd · d0164adc
      Mel Gorman committed
      
      __GFP_WAIT has been used to identify atomic context in callers that hold
      spinlocks or are in interrupts.  They are expected to be high priority and
      have access to one of two watermarks lower than "min", which can be referred
      to as the "atomic reserve".  __GFP_HIGH users get access to the first
      lower watermark and can be called the "high priority reserve".
      
      Over time, callers had a requirement to not block when fallback options
      were available.  Some have abused __GFP_WAIT leading to a situation where
      an optimistic allocation with a fallback option can access atomic
      reserves.
      
      This patch uses __GFP_ATOMIC to identify callers that are truly atomic,
      cannot sleep and have no alternative.  High priority users continue to use
      __GFP_HIGH.  __GFP_DIRECT_RECLAIM identifies callers that can sleep and
      are willing to enter direct reclaim.  __GFP_KSWAPD_RECLAIM identifies
      callers that want to wake kswapd for background reclaim.  __GFP_WAIT is
      redefined to mean that the caller is willing to enter direct reclaim
      and wake kswapd for background reclaim.
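
      Sketched in terms of these names (illustrative only; the literal header
      definitions carry extra casts and the actual bit values):

      	#define __GFP_WAIT	(__GFP_DIRECT_RECLAIM | __GFP_KSWAPD_RECLAIM)
      	#define GFP_ATOMIC	(__GFP_HIGH | __GFP_ATOMIC | __GFP_KSWAPD_RECLAIM)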
      
      This patch then converts a number of sites:
      
      o __GFP_ATOMIC is used by callers that are high priority and have memory
        pools for those requests. GFP_ATOMIC uses this flag.
      
      o Callers that have a limited mempool to guarantee forward progress clear
        __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall
        into this category where kswapd will still be woken but atomic reserves
        are not used as there is a one-entry mempool to guarantee progress.
      
      o Callers that are checking if they are non-blocking should use the
        helper gfpflags_allow_blocking() where possible (see the sketch after
        this list).  This is because checking for __GFP_WAIT, as was done
        historically, can now trigger false positives.  Some exceptions like
        dm-crypt.c exist where the code intent is clearer if
        __GFP_DIRECT_RECLAIM is used instead of the helper due to flag
        manipulations.
      
      o Callers that built their own GFP flags instead of starting with GFP_KERNEL
        and friends now also need to specify __GFP_KSWAPD_RECLAIM.
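
      A sketch of that helper, assuming it reduces to a test of
      __GFP_DIRECT_RECLAIM:

      	static inline bool gfpflags_allow_blocking(const gfp_t gfp_flags)
      	{
      		/* Only requests that may enter direct reclaim may block. */
      		return !!(gfp_flags & __GFP_DIRECT_RECLAIM);
      	}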
      
      The first key hazard to watch out for is callers that removed __GFP_WAIT
      and were depending on access to atomic reserves for inconspicuous reasons.
      In some cases it may be appropriate for them to use __GFP_HIGH.
      
      The second key hazard is callers that assembled their own combination of
      GFP flags instead of starting with something like GFP_KERNEL.  They may
      now wish to specify __GFP_KSWAPD_RECLAIM.  It's almost certainly harmless
      if it's missed in most cases as other activity will wake kswapd.
      Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Vitaly Wool <vitalywool@gmail.com>
      Cc: Rik van Riel <riel@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. 13 Feb 2015 (1 commit)
  5. 09 Sep 2014 (1 commit)
  6. 09 Aug 2014 (1 commit)
    • lib/idr.c: fix out-of-bounds pointer dereference · 93b7aca3
      Andrey Ryabinin committed
      I'm working on the address sanitizer project for the kernel.  Recently we
      started experiments with stack instrumentation, to detect out-of-bounds
      read/write bugs on the stack.
      
      Just after booting I hit an out-of-bounds read on the stack in idr_for_each
      (and in __idr_remove_all as well):
      
      	struct idr_layer **paa = &pa[0];
      
      	while (id >= 0 && id <= max) {
      		...
      		while (n < fls(id)) {
      			n += IDR_BITS;
      			p = *--paa; <--- here we are reading pa[-1] value.
      		}
      	}
      
      Although we exit the loop right after this dereference and never use p,
      such behaviour is undefined and should be avoided.
      
      Fix this by moving the pointer dereference to the beginning of the loop,
      right before we use it.
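
      The shape of the fix, condensed from the description above (illustrative):

      	while (id >= 0 && id <= max) {
      		p = *paa;		/* dereference moved here, right before use */
      		...
      		while (n < fls(id)) {
      			n += IDR_BITS;
      			--paa;		/* only step the cursor; no pa[-1] read */
      		}
      	}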
      Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Alexey Preobrazhensky <preobr@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Konstantin Khlebnikov <koct9i@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  7. 07 Jun 2014 (6 commits)
  8. 08 Apr 2014 (2 commits)
  9. 17 Feb 2014 (1 commit)
  10. 04 Jul 2013 (1 commit)
  11. 30 Apr 2013 (1 commit)
    • idr: introduce idr_alloc_cyclic() · 3e6628c4
      Jeff Layton committed
      As Tejun points out, there are several users of the IDR facility that
      attempt to use it in a cyclic fashion.  These users are likely to see
      -ENOSPC errors after the counter wraps one or more times, however.
      
      This patchset adds a new idr_alloc_cyclic routine and converts several
      of these users to it.  Many of these users are in obscure parts of the
      kernel, and I don't have a good way to test some of them.  The change is
      pretty straightforward though, so hopefully it won't be an issue.
      
      There is one other cyclic user of idr_alloc that I didn't touch in
      ipc/util.c.  That one is doing some strange stuff that I didn't quite
      understand, but it looks like it should probably be converted later
      somehow.
      
      This patch:
      
      Thus spake Tejun Heo:
      
          Ooh, BTW, the cyclic allocation is broken.  It's prone to -ENOSPC
          after the first wraparound.  There are several cyclic users in the
          kernel and I think it probably would be best to implement cyclic
          support in idr.
      
      This patch does that by adding a new idr_alloc_cyclic() function that such
      users in the kernel can use.  With this, there's no need for a caller to
      keep track of the last value used as that's now tracked internally.  This
      should prevent the ENOSPC problems that can hit when the "last allocated"
      counter exceeds INT_MAX.
      
      Later patches will convert existing cyclic users to the new interface.
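
      A minimal caller-side sketch (my_idr and ptr are placeholders; end == 0
      requests no upper limit):

      	int id = idr_alloc_cyclic(&my_idr, ptr, 0, 0, GFP_KERNEL);
      	if (id < 0)
      		return id;	/* -ENOMEM or -ENOSPC */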
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Reviewed-by: Tejun Heo <tj@kernel.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: Eric Paris <eparis@parisplace.org>
      Cc: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Cc: John McCutchan <john@johnmccutchan.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Or Gerlitz <ogerlitz@mellanox.com>
      Cc: Robert Love <rlove@rlove.org>
      Cc: Roland Dreier <roland@purestorage.com>
      Cc: Sridhar Samudrala <sri@us.ibm.com>
      Cc: Steve Wise <swise@opengridcomputing.com>
      Cc: Tom Tucker <tom@opengridcomputing.com>
      Cc: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  12. 14 Mar 2013 (2 commits)
  13. 13 Mar 2013 (1 commit)
  14. 09 Mar 2013 (1 commit)
    • idr: remove WARN_ON_ONCE() on negative IDs · 2e1c9b28
      Tejun Heo committed
      idr_find(), idr_remove() and idr_replace() used to silently ignore the
      sign bit and perform lookup with the rest of the bits.  The weird behavior
      has been changed such that negative IDs are treated as invalid.  As the
      behavior change was subtle, WARN_ON_ONCE() was added in the hope of
      determining who's calling idr functions with negative IDs so that they can
      be examined for problems.
      
      Up until now, both reported cases involve an ID number coming directly
      from userland and getting fed into idr_find(), and the warnings seem to
      cause more problems than they help.  Drop the WARN_ON_ONCE()s.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: <markus@trippelsdorf.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  15. 28 Feb 2013 (13 commits)
    • idr: explain WARN_ON_ONCE() on negative IDs out-of-range ID · 7175c61c
      Tejun Heo committed
      Until recently, when a negative ID was specified, idr functions used to
      ignore the sign bit and proceed with the operation on the rest of the
      bits, which is bizarre and error-prone.  The behavior recently got changed
      so that negative IDs are treated as invalid but we're triggering
      WARN_ON_ONCE() on negative IDs just in case somebody was depending on the
      sign bit being ignored, so that those can be detected and fixed easily.
      
      We only need this for a while.  Explain why WARN_ON_ONCE()s are there and
      that they can be removed later.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • idr: implement lookup hint · 0ffc2a9c
      Tejun Heo committed
      While idr lookup isn't a particularly heavy operation, it still is too
      substantial to use in hot paths without worrying about the performance
      implications.  With recent changes, each idr_layer covers 256 slots,
      which should be enough to cover most use cases with a single idr_layer,
      making a lookup hint very attractive.
      
      This patch adds idr->hint, which points to the idr_layer that most
      recently allocated an ID, and the fast path lookup becomes
      
      	if (look up target's prefix matches that of the hinted layer)
      		return hint->ary[ID's offset in the leaf layer];
      
      which can be inlined.
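
      Concretely, the inlined fast path can look roughly like this (a sketch
      built from the prefix/ary fields described here; idr_find_slowpath() is
      the out-of-line fallback):

      	static inline void *idr_find(struct idr *idr, int id)
      	{
      		struct idr_layer *hint = rcu_dereference_raw(idr->hint);

      		if (hint && (id & ~IDR_MASK) == hint->prefix)
      			return rcu_dereference_raw(hint->ary[id & IDR_MASK]);
      		return idr_find_slowpath(idr, id);
      	}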
      
      idr->hint is set to the leaf node on idr_fill_slot() and cleared from
      free_layer().
      
      [andriy.shevchenko@linux.intel.com: always do slow path when hint is uninitialized]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • idr: add idr_layer->prefix · 54616283
      Tejun Heo committed
      Add a field which carries the prefix of the IDs the idr_layer covers.
      This will be used to implement the lookup hint.
      
      This patch doesn't make use of the new field and doesn't introduce any
      behavior difference.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • idr: remove length restriction from idr_layer->bitmap · 1d9b2e1e
      Tejun Heo committed
      Currently, idr_layer->bitmap is declared as an unsigned long, which
      restricts the number of bits an idr_layer can contain.  All bitops can
      handle arbitrary positive bit numbers and there's no reason for this
      restriction.
      
      Declare idr_layer->bitmap using DECLARE_BITMAP() instead of a single
      unsigned long.
      
      * idr_layer->bitmap is now an array.  '&' dropped from params to
        bitops.
      
      * Replaced "== IDR_FULL" tests with bitmap_full() and removed
        IDR_FULL.
      
      * Replaced find_next_bit() on ~bitmap with find_next_zero_bit().
      
      * Replaced "bitmap = 0" with bitmap_clear().
      
      This patch doesn't (or at least shouldn't) introduce any behavior
      changes.
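
      The declaration change is essentially the following sketch (other
      fields elided):

      	struct idr_layer {
      		...
      		/* was: unsigned long bitmap; i.e. at most BITS_PER_LONG slots */
      		DECLARE_BITMAP(bitmap, IDR_SIZE);
      		...
      	};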
      
      [akpm@linux-foundation.org: checkpatch fixes]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • idr: remove MAX_IDR_MASK and move left MAX_IDR_* into idr.c · e8c8d1bc
      Tejun Heo committed
      MAX_IDR_MASK is another weirdness in the idr interface.  As idr covers
      the whole positive integer range, it's defined as 0x7fffffff or INT_MAX.
      
      Its usage in idr_find(), idr_replace() and idr_remove() is bizarre.
      They basically mask off the sign bit and operate on the rest, so if
      the caller, by accident, passes in a negative number, the sign bit
      will be masked off and the remaining part will be used as if that was
      the input, which is worse than crashing.
      
      The constant is visible in idr.h and there are several users in the
      kernel.
      
      * drivers/i2c/i2c-core.c:i2c_add_numbered_adapter()
      
        Basically used to test if adap->nr is a negative number which isn't
        -1 and returns -EINVAL if so.  idr_alloc() already has negative
        @start checking (w/ WARN_ON_ONCE), so this can go away.
      
      * drivers/infiniband/core/cm.c:cm_alloc_id()
        drivers/infiniband/hw/mlx4/cm.c:id_map_alloc()
      
        Used to wrap the cyclic @start.  Can be replaced with max(next, 0);
        see the sketch after this list.  Note that this type of cyclic
        allocation using idr is buggy: it is prone to spurious -ENOSPC
        failures after the first wraparound.
      
      * fs/super.c:get_anon_bdev()
      
        The ID allocated from ida is masked off before being tested whether
        it's inside the valid range.  An ida-allocated ID can never be a
        negative number and the masking is unnecessary.
      
      Update idr_*() functions to fail with -EINVAL when negative @id is
      specified and update other MAX_IDR_MASK users as described above.
      
      This leaves MAX_IDR_MASK without any users; remove it and relocate the
      other MAX_IDR_* constants to lib/idr.c.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Jean Delvare <khali@linux-fr.org>
      Cc: Roland Dreier <roland@kernel.org>
      Cc: Sean Hefty <sean.hefty@intel.com>
      Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
      Cc: "Marciniszyn, Mike" <mike.marciniszyn@intel.com>
      Cc: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Cc: Or Gerlitz <ogerlitz@mellanox.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Acked-by: Wolfram Sang <wolfram@the-dreams.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • idr: fix top layer handling · 326cf0f0
      Tejun Heo committed
      Most functions in idr fail to deal with the high bits when the idr
      tree grows to the maximum height.
      
      * idr_get_empty_slot() stops growing the idr tree once the depth reaches
        MAX_IDR_LEVEL - 1, which is one level shallower than necessary to
        cover the whole range.  The function doesn't even notice that it
        didn't grow the tree enough and ends up allocating the wrong ID
        given sufficiently high @starting_id.
      
        For example, on 64-bit, if the starting id is 0x7fffff01,
        idr_get_empty_slot() will grow the tree 5 layers deep, which only
        covers the low 30 bits, and then proceeds to allocate as if bit 30
        weren't specified.  It ends up allocating 0x3fffff01, without bit 30,
        but still returns 0x7fffff01.
      
      * __idr_remove_all() will not remove anything if the tree is fully
        grown.
      
      * idr_find() can't find anything if the tree is fully grown.
      
      * idr_for_each() and idr_get_next() can't iterate anything if the tree
        is fully grown.
      
      Fix it by introducing idr_max(), which returns the maximum possible ID
      given the depth of the tree, and replacing the id limit checks in all
      affected places.
      
      As the idr_layer pointer array pa[] needs to be 1 larger than the
      maximum depth, enlarge pa[] arrays by one.
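
      A sketch of idr_max(), assuming IDR_BITS bits per level with
      MAX_IDR_SHIFT capping IDs to the positive int range:

      	static int idr_max(int layers)
      	{
      		int bits = min_t(int, layers * IDR_BITS, MAX_IDR_SHIFT);

      		return (1 << bits) - 1;	/* highest ID a tree this deep can hold */
      	}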
      
      While this plugs the discovered issues, the whole code base is horrible
      and in desperate need of a rewrite.  It's fragile as hell.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • idr: implement idr_preload[_end]() and idr_alloc() · d5c7409f
      Tejun Heo committed
      The current idr interface is very cumbersome (the old allocation
      pattern is sketched after this list).
      
      * For all allocations, two function calls - idr_pre_get() and
        idr_get_new*() - should be made.
      
      * idr_pre_get() doesn't guarantee that the following idr_get_new*()
        will not fail from memory shortage.  If idr_get_new*() returns
        -EAGAIN, the caller is expected to retry pre_get and allocation.
      
      * idr_get_new*() can't enforce an upper limit.  An upper limit can only
        be enforced by allocating and then freeing if above the limit.
      
      * idr_layer buffer is unnecessarily per-idr.  Each idr ends up keeping
        around MAX_IDR_FREE idr_layers.  The memory consumed per idr is
        under two pages but it makes it difficult to make idr_layer larger.
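
      For reference, the old two-call pattern looks roughly like this (idp,
      ptr and lock are placeholders):

      again:
      	if (!idr_pre_get(idp, GFP_KERNEL))
      		return -ENOMEM;
      	spin_lock(lock);
      	ret = idr_get_new(idp, ptr, &id);
      	spin_unlock(lock);
      	if (ret == -EAGAIN)
      		goto again;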
      
      This patch implements the following new set of allocation functions.
      
      * idr_preload[_end]() - Similar to radix tree preload but doesn't fail.
        The first idr_alloc() inside the preload section can be treated as if
        it were called with the @gfp_mask used for idr_preload().
      
      * idr_alloc() - Allocate an ID w/ lower and upper limits.  Takes
        @gfp_flags and can be used w/o preloading.  When used inside a
        preloaded section, the allocation mask of the preloading can be assumed.
      
      If idr_alloc() can be called from a context which allows sufficiently
      relaxed @gfp_mask, it can be used by itself.  If, for example,
      idr_alloc() is called inside a spinlock-protected region, preloading can
      be used like the following.
      
      	idr_preload(GFP_KERNEL);
      	spin_lock(lock);
      
      	id = idr_alloc(idr, ptr, start, end, GFP_NOWAIT);
      
      	spin_unlock(lock);
      	idr_preload_end();
      	if (id < 0)
      		error;
      
      which is much simpler and less error-prone than the idr_pre_get() and
      idr_get_new*() loop.
      
      The new interface uses per-cpu idr_layer buffers and thus the number of
      idrs in the system doesn't affect the amount of memory used for
      preloading.
      
      idr_layer_alloc() is introduced to handle idr_layer allocations for
      both old and new ID allocation paths.  This is a bit hairy now but the
      new interface is expected to replace the old and the internal
      implementation eventually will become simpler.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • idr: refactor idr_get_new_above() · 3594eb28
      Tejun Heo committed
      Move slot filling to idr_fill_slot() from idr_get_new_above_int() and
      make idr_get_new_above() call it directly.  idr_get_new_above_int() is
      no longer needed and is removed.
      
      This will be used to implement a new ID allocation interface.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • idr: remove _idr_rc_to_errno() hack · 12d1b439
      Tejun Heo committed
      idr uses -1, IDR_NEED_TO_GROW and IDR_NOMORE_SPACE to communicate
      exception conditions internally.  The return value is later translated
      to errno values using _idr_rc_to_errno().
      
      This is confusing.  Drop the custom ones and consistently use -EAGAIN
      for "tree needs to grow", -ENOMEM for "need more memory" and -ENOSPC for
      "ran out of ID space".
      
      Due to the weird memory preloading mechanism, id[ra]_get_new*() return
      -EAGAIN on memory shortage, so we need to substitute -ENOMEM w/
      -EAGAIN in those interface functions.  They'll eventually be cleaned
      up and the translations will go away.
      
      This patch doesn't introduce any functional changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • idr: relocate idr_for_each_entry() and reorganize id[r|a]_get_new() · 49038ef4
      Tejun Heo committed
      * Move idr_for_each_entry() definition next to other idr related
        definitions.
      
      * Make id[r|a]_get_new() inline wrappers of id[r|a]_get_new_above().
      
      This changes the implementation of idr_get_new() but the new
      implementation is trivial.  This patch doesn't introduce any
      functional change.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • idr: deprecate idr_remove_all() · fe6e24ec
      Tejun Heo committed
      There was only one legitimate use of idr_remove_all() and many more
      incorrect uses (or omissions of it).  Now that idr_destroy() implies
      idr_remove_all() and all the in-kernel users have been updated not to
      use it, there's no reason to keep it around.  Mark it deprecated so that
      we can later unexport it.
      
      idr_remove_all() is made an inline function calling __idr_remove_all()
      to avoid triggering the deprecation warning on EXPORT_SYMBOL().
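
      That is, roughly:

      	static inline void __deprecated idr_remove_all(struct idr *idp)
      	{
      		__idr_remove_all(idp);	/* exported worker; warning fires at callers only */
      	}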
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • idr: make idr_destroy() imply idr_remove_all() · 9bb26bc1
      Tejun Heo committed
      idr is silly in quite a few ways, one of which is how it's supposed to
      be destroyed - idr_destroy() doesn't release IDs and doesn't even whine
      if the idr isn't empty.  If the caller forgets idr_remove_all(), it
      simply leaks memory.
      
      Even ida gets this wrong and leaks memory on destruction.  There is
      absolutely no reason not to call idr_remove_all() from idr_destroy().
      Nobody is abusing idr_destroy() to shrink the free layer buffer and then
      continuing to use the idr afterwards, so it's safe to do remove_all
      from destroy.
      
      In the whole kernel, there is only one place where idr_remove_all() is
      legitimately used without a following idr_destroy(), while there are
      quite a few places where the caller forgets either idr_remove_all() or
      idr_destroy(), leaking memory.
      
      This patch makes idr_destroy() call idr_remove_all() and updates the
      function description accordingly.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • idr: fix a subtle bug in idr_get_next() · 6cdae741
      Tejun Heo committed
      The iteration logic of idr_get_next() is borrowed mostly verbatim from
      idr_for_each().  It walks down the tree looking for the slot matching
      the current ID.  If the matching slot is not found, the ID is
      incremented by the distance of single slot at the given level and
      repeats.
      
      The implementation assumes that during the whole iteration id is aligned
      to the layer boundaries of the level closest to the leaf, which is true
      for all iterations starting from zero or an existing element and thus is
      fine for idr_for_each().
      
      However, idr_get_next() may be given any starting point, and if the
      starting id hits the middle of a non-existent layer, the increment to
      the next layer will end up skipping the same offset into it.  For
      example, an IDR with IDs filled between [64, 127] would look like the
      following.
      
                [  0  64 ... ]
             /----/   |
             |        |
            NULL    [ 64 ... 127 ]
      
      If idr_get_next() is called with 63 as the starting point, it will try
      to follow down the pointer from 0.  As it is NULL, it will then try to
      proceed to the next slot in the same level by adding the slot distance
      at that level, which is 64, making the next try 127.  It goes around the
      loop and finds and returns 127, skipping [64, 126].
      
      Note that this bug also triggers in an idr_for_each_entry() loop that
      deletes entries during iteration, as deletions can make layers go away,
      leaving the iteration with an unaligned ID into missing layers.
      
      Fix it by ensuring proceeding to the next slot doesn't carry over the
      unaligned offset - ie.  use round_up(id + 1, slot_distance) instead of
      id += slot_distance.
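
      In the loop this is a one-line change (slot_distance is the slot
      distance at the current level, a power of two):

      	/* was: id += slot_distance;  carries the unaligned offset over */
      	id = round_up(id + 1, slot_distance);

      With the example above, round_up(63 + 1, 64) yields 64, so the
      iteration now visits [64, 127] instead of skipping ahead to 127.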
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: David Teigland <teigland@redhat.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  16. 06 Oct 2012 (1 commit)
  17. 22 Mar 2012 (1 commit)
  18. 08 Mar 2012 (1 commit)
  19. 03 Nov 2011 (1 commit)
  20. 01 Nov 2011 (1 commit)
  21. 04 Aug 2011 (1 commit)