1. 28 2月, 2013 3 次提交
    • T
      idr: deprecate idr_remove_all() · fe6e24ec
      Tejun Heo 提交于
      There was only one legitimate use of idr_remove_all() and a lot more of
      incorrect uses (or lack of it).  Now that idr_destroy() implies
      idr_remove_all() and all the in-kernel users updated not to use it,
      there's no reason to keep it around.  Mark it deprecated so that we can
      later unexport it.
      
      idr_remove_all() is made an inline function calling __idr_remove_all()
      to avoid triggering deprecated warning on EXPORT_SYMBOL().
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      fe6e24ec
    • T
      idr: make idr_destroy() imply idr_remove_all() · 9bb26bc1
      Tejun Heo 提交于
      idr is silly in quite a few ways, one of which is how it's supposed to
      be destroyed - idr_destroy() doesn't release IDs and doesn't even whine
      if the idr isn't empty.  If the caller forgets idr_remove_all(), it
      simply leaks memory.
      
      Even ida gets this wrong and leaks memory on destruction.  There is
      absoltely no reason not to call idr_remove_all() from idr_destroy().
      Nobody is abusing idr_destroy() for shrinking free layer buffer and
      continues to use idr after idr_destroy(), so it's safe to do remove_all
      from destroy.
      
      In the whole kernel, there is only one place where idr_remove_all() is
      legitimiately used without following idr_destroy() while there are quite
      a few places where the caller forgets either idr_remove_all() or
      idr_destroy() leaking memory.
      
      This patch makes idr_destroy() call idr_destroy_all() and updates the
      function description accordingly.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9bb26bc1
    • T
      idr: fix a subtle bug in idr_get_next() · 6cdae741
      Tejun Heo 提交于
      The iteration logic of idr_get_next() is borrowed mostly verbatim from
      idr_for_each().  It walks down the tree looking for the slot matching
      the current ID.  If the matching slot is not found, the ID is
      incremented by the distance of single slot at the given level and
      repeats.
      
      The implementation assumes that during the whole iteration id is aligned
      to the layer boundaries of the level closest to the leaf, which is true
      for all iterations starting from zero or an existing element and thus is
      fine for idr_for_each().
      
      However, idr_get_next() may be given any point and if the starting id
      hits in the middle of a non-existent layer, increment to the next layer
      will end up skipping the same offset into it.  For example, an IDR with
      IDs filled between [64, 127] would look like the following.
      
                [  0  64 ... ]
             /----/   |
             |        |
            NULL    [ 64 ... 127 ]
      
      If idr_get_next() is called with 63 as the starting point, it will try
      to follow down the pointer from 0.  As it is NULL, it will then try to
      proceed to the next slot in the same level by adding the slot distance
      at that level which is 64 - making the next try 127.  It goes around the
      loop and finds and returns 127 skipping [64, 126].
      
      Note that this bug also triggers in idr_for_each_entry() loop which
      deletes during iteration as deletions can make layers go away leaving
      the iteration with unaligned ID into missing layers.
      
      Fix it by ensuring proceeding to the next slot doesn't carry over the
      unaligned offset - ie.  use round_up(id + 1, slot_distance) instead of
      id += slot_distance.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reported-by: NDavid Teigland <teigland@redhat.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6cdae741
  2. 06 10月, 2012 1 次提交
  3. 22 3月, 2012 1 次提交
  4. 08 3月, 2012 1 次提交
  5. 03 11月, 2011 1 次提交
  6. 01 11月, 2011 1 次提交
  7. 04 8月, 2011 2 次提交
  8. 27 10月, 2010 2 次提交
  9. 31 8月, 2010 2 次提交
  10. 23 6月, 2010 1 次提交
  11. 28 5月, 2010 1 次提交
    • I
      idr: fix backtrack logic in idr_remove_all · 2dcb22b3
      Imre Deak 提交于
      Currently idr_remove_all will fail with a use after free error if
      idr::layers is bigger than 2, which on 32 bit systems corresponds to items
      more than 1024.  This is due to stepping back too many levels during
      backtracking.  For simplicity let's assume that IDR_BITS=1 -> we have 2
      nodes at each level below the root node and each leaf node stores two IDs.
       (In reality for 32 bit systems IDR_BITS=5, with 32 nodes at each sub-root
      level and 32 IDs in each leaf node).  The sequence of freeing the nodes at
      the moment is as follows:
      
      layer
      1 ->                       a(7)
      2 ->            b(3)                  c(5)
      3 ->        d(1)   e(2)           f(4)    g(6)
      
      Until step 4 things go fine, but then node c is freed, whereas node g
      should be freed first.  Since node c contains the pointer to node g we'll
      have a use after free error at step 6.
      
      How many levels we step back after visiting the leaf nodes is currently
      determined by the msb of the id we are currently visiting:
      
      Step
      1.          node d with IDs 0,1 is freed, current ID is advanced to 2.
                  msb of the current ID bit 1. This means we need to step back
                  1 level to node b and take the next sibling, node e.
      2-3.        node e with IDs 2,3 is freed, current ID is 4, msb is bit 2.
                  This means we need to step back 2 levels to node a, freeing
                  node b on the way.
      4-5.        node f with IDs 4,5 is freed, current ID is 6, msb is still
                  bit 2. This means we again need to step back 2 levels to node
                  a and free c on the way.
      6.          We should visit node g, but its pointer is not available as
                  node c was freed.
      
      The fix changes how we determine the number of levels to step back.
      Instead of deducting this merely from the msb of the current ID, we should
      really check if advancing the ID causes an overflow to a bit position
      corresponding to a given layer.  In the above example overflow from bit 0
      to bit 1 should mean stepping back 1 level.  Overflow from bit 1 to bit 2
      should mean stepping back 2 levels and so on.
      
      The fix was tested with IDs up to 1 << 20, which corresponds to 4 layers
      on 32 bit systems.
      Signed-off-by: NImre Deak <imre.deak@nokia.com>
      Reviewed-by: NTejun Heo <tj@kernel.org>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: <stable@kernel.org>		[2.6.34.1]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2dcb22b3
  12. 25 2月, 2010 2 次提交
  13. 23 2月, 2010 1 次提交
  14. 05 2月, 2010 1 次提交
  15. 03 2月, 2010 1 次提交
    • T
      idr: fix a critical misallocation bug · 859ddf09
      Tejun Heo 提交于
      Eric Paris located a bug in idr.  With IDR_BITS of 6, it grows to three
      layers when id 4096 is first allocated.  When that happens, idr wraps
      incorrectly and searches the idr array ignoring the high bits.  The
      following test code from Eric demonstrates the bug nicely.
      
      #include <linux/idr.h>
      #include <linux/kernel.h>
      #include <linux/module.h>
      
      static DEFINE_IDR(test_idr);
      
      int init_module(void)
      {
      	int ret, forty95, forty96;
      	void *addr;
      
      	/* add 2 entries both with 4095 as the start address */
      again1:
      	if (!idr_pre_get(&test_idr, GFP_KERNEL))
      		return -ENOMEM;
      	ret = idr_get_new_above(&test_idr, (void *)4095, 4095, &forty95);
      	if (ret) {
      		if (ret == -EAGAIN)
      			goto again1;
      		return ret;
      	}
      	if (forty95 != 4095)
      		printk(KERN_ERR "hmmm, forty95=%d\n", forty95);
      
      again2:
      	if (!idr_pre_get(&test_idr, GFP_KERNEL))
      		return -ENOMEM;
      	ret = idr_get_new_above(&test_idr, (void *)4096, 4095, &forty96);
      	if (ret) {
      		if (ret == -EAGAIN)
      			goto again2;
      		return ret;
      	}
      	if (forty96 != 4096)
      		printk(KERN_ERR "hmmm, forty96=%d\n", forty96);
      
      	/* try to find the 2 entries, noticing that 4096 broke */
      	addr = idr_find(&test_idr, forty95);
      	if ((int)addr != forty95)
      		printk(KERN_ERR "hmmm, after find forty95=%d addr=%d\n", forty95, (int)addr);
      	addr = idr_find(&test_idr, forty96);
      	if ((int)addr != forty96)
      		printk(KERN_ERR "hmmm, after find forty96=%d addr=%d\n", forty96, (int)addr);
      	/* really weird, the entry which should be at 4096 is actually at 0!! */
      	addr = idr_find(&test_idr, 0);
      	if ((int)addr)
      		printk(KERN_ERR "found an entry at id=0 for addr=%d\n", (int)addr);
      
      	idr_remove(&test_idr, forty95);
      	idr_remove(&test_idr, forty96);
      
      	return 0;
      }
      
      void cleanup_module(void)
      {
      }
      
      MODULE_AUTHOR("Eric Paris <eparis@redhat.com>");
      MODULE_DESCRIPTION("Simple idr test");
      MODULE_LICENSE("GPL");
      
      This happens because when sub_alloc() back tracks it doesn't always do it
      step-by-step while the over-the-limit detection assumes step-by-step
      backtracking.  The logic in sub_alloc() looks like the following.
      
        restart:
          clear pa[top level + 1] for end cond detection
          l = top level
          while (true) {
      	search for empty slot at this level
      	if (not found) {
      	    push id to the next possible value
      	    l++
      A:	    if (pa[l] is clear)
      	        failed, return asking caller to grow the tree
      	    if (going up 1 level gives more slots to search)
      	        continue the while loop above with the incremented l
      	    else
      C:	        goto restart
      	}
      	adjust id accordingly to the found slot
      	if (l == 0)
      	    return found id;
      	create lower level if not there yet
      	record pa[l] and l--
          }
      
      Test A is the fail exit condition but this assumes that failure is
      propagated upwared one level at a time but the B optimization path breaks
      the assumption and restarts the whole thing with a start value which is
      above the possible limit with the current layers.  sub_alloc() assumes the
      start id value is inside the limit when called and test A is the only exit
      condition check, so it ends up searching for empty slot while ignoring
      high set bit.
      
      So, for 4095->4096 test, level0 search fails but pa[1] contains a valid
      pointer.  However, going up 1 level wouldn't give any more empty slot so
      it takes C and when the whole thing restarts nobody notices the high bit
      set beyond the top level.
      
      This patch fixes the bug by changing the fail exit condition check to full
      id limit check.
      
      Based-on-patch-from: Eric Paris <eparis@redhat.com>
      Reported-by: NEric Paris <eparis@redhat.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: <stable@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      859ddf09
  16. 04 12月, 2009 1 次提交
  17. 03 4月, 2009 1 次提交
    • K
      cgroup: CSS ID support · 38460b48
      KAMEZAWA Hiroyuki 提交于
      Patch for Per-CSS(Cgroup Subsys State) ID and private hierarchy code.
      
      This patch attaches unique ID to each css and provides following.
      
       - css_lookup(subsys, id)
         returns pointer to struct cgroup_subysys_state of id.
       - css_get_next(subsys, id, rootid, depth, foundid)
         returns the next css under "root" by scanning
      
      When cgroup_subsys->use_id is set, an id for css is maintained.
      
      The cgroup framework only parepares
      	- css_id of root css for subsys
      	- id is automatically attached at creation of css.
      	- id is *not* freed automatically. Because the cgroup framework
      	  don't know lifetime of cgroup_subsys_state.
      	  free_css_id() function is provided. This must be called by subsys.
      
      There are several reasons to develop this.
      	- Saving space .... For example, memcg's swap_cgroup is array of
      	  pointers to cgroup. But it is not necessary to be very fast.
      	  By replacing pointers(8bytes per ent) to ID (2byes per ent), we can
      	  reduce much amount of memory usage.
      
      	- Scanning without lock.
      	  CSS_ID provides "scan id under this ROOT" function. By this, scanning
      	  css under root can be written without locks.
      	  ex)
      	  do {
      		rcu_read_lock();
      		next = cgroup_get_next(subsys, id, root, &found);
      		/* check sanity of next here */
      		css_tryget();
      		rcu_read_unlock();
      		id = found + 1
      	 } while(...)
      
      Characteristics:
      	- Each css has unique ID under subsys.
      	- Lifetime of ID is controlled by subsys.
      	- css ID contains "ID" and "Depth in hierarchy" and stack of hierarchy
      	- Allowed ID is 1-65535, ID 0 is UNUSED ID.
      
      Design Choices:
      	- scan-by-ID v.s. scan-by-tree-walk.
      	  As /proc's pid scan does, scan-by-ID is robust when scanning is done
      	  by following kind of routine.
      	  scan -> rest a while(release a lock) -> conitunue from interrupted
      	  memcg's hierarchical reclaim does this.
      
      	- When subsys->use_id is set, # of css in the system is limited to
      	  65535.
      
      [bharata@linux.vnet.ibm.com: remove rcu_read_lock() from css_get_next()]
      Signed-off-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: NPaul Menage <menage@google.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Balbir Singh <balbir@in.ibm.com>
      Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
      Signed-off-by: NBharata B Rao <bharata@linux.vnet.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      38460b48
  18. 11 3月, 2009 1 次提交
  19. 16 1月, 2009 2 次提交
    • A
      lib/idr.c: use kmem_cache_zalloc() for the idr_layer cache · 5b019e99
      Andrew Morton 提交于
      David points out that the idr_remove_all() function returns unused slabs
      to the kmem cache, but needs to zero them first or else they will be
      uninitialized upon next use.  This causes crashes which have been observed
      in the firewire subsystem.
      
      He fixed this by zeroing the object before freeing it in idr_remove_all().
      
      But we agree that simply removing the constructor and zeroing the object
      at allocation time is simpler than relying upon slab constructor machinery
      and might even be faster.
      
      This problem was introduced by "idr: make idr_remove rcu-safe" (commit
      cf481c20), which was first released in
      2.6.27.
      
      There are no known codesites which trigger this bug in 2.6.27 or 2.6.28.
      The post-2.6.28 firewire changes are the only known triggerer.
      
      There might of course be not-yet-discovered triggerers in 2.6.27 and
      2.6.28, and there might be out-of-tree triggerers which are added to those
      kernel versions.  I'll let the -stable guys decide whether they want to
      backport this fix.
      Reported-by: NDavid Moore <dcm@acm.org>
      Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
      Cc: Nadia Derbey <Nadia.Derbey@bull.net>
      Cc: Paul E. McKenney <paulmck@us.ibm.com>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Cc: Kristian Hgsberg <krh@redhat.com>
      Acked-by: NPekka Enberg <penberg@cs.helsinki.fi>
      Cc: <stable@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5b019e99
    • L
      idr: fix wrong kernel-doc · b098161b
      Li Zefan 提交于
      idr_get_new_above() and ida_get_new_above() return an id in the range of
      @staring_id ... 0x7fffffff, not 0 ... 0x7fffffff.
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b098161b
  20. 11 12月, 2008 1 次提交
  21. 02 12月, 2008 1 次提交
  22. 27 7月, 2008 1 次提交
  23. 26 7月, 2008 6 次提交
  24. 01 5月, 2008 1 次提交
  25. 29 4月, 2008 1 次提交
  26. 17 10月, 2007 1 次提交
  27. 15 10月, 2007 1 次提交
  28. 01 8月, 2007 1 次提交