1. 31 8月, 2010 2 次提交
  2. 23 6月, 2010 1 次提交
  3. 28 5月, 2010 1 次提交
    • I
      idr: fix backtrack logic in idr_remove_all · 2dcb22b3
      Imre Deak 提交于
      Currently idr_remove_all will fail with a use after free error if
      idr::layers is bigger than 2, which on 32 bit systems corresponds to items
      more than 1024.  This is due to stepping back too many levels during
      backtracking.  For simplicity let's assume that IDR_BITS=1 -> we have 2
      nodes at each level below the root node and each leaf node stores two IDs.
       (In reality for 32 bit systems IDR_BITS=5, with 32 nodes at each sub-root
      level and 32 IDs in each leaf node).  The sequence of freeing the nodes at
      the moment is as follows:
      
      layer
      1 ->                       a(7)
      2 ->            b(3)                  c(5)
      3 ->        d(1)   e(2)           f(4)    g(6)
      
      Until step 4 things go fine, but then node c is freed, whereas node g
      should be freed first.  Since node c contains the pointer to node g we'll
      have a use after free error at step 6.
      
      How many levels we step back after visiting the leaf nodes is currently
      determined by the msb of the id we are currently visiting:
      
      Step
      1.          node d with IDs 0,1 is freed, current ID is advanced to 2.
                  msb of the current ID bit 1. This means we need to step back
                  1 level to node b and take the next sibling, node e.
      2-3.        node e with IDs 2,3 is freed, current ID is 4, msb is bit 2.
                  This means we need to step back 2 levels to node a, freeing
                  node b on the way.
      4-5.        node f with IDs 4,5 is freed, current ID is 6, msb is still
                  bit 2. This means we again need to step back 2 levels to node
                  a and free c on the way.
      6.          We should visit node g, but its pointer is not available as
                  node c was freed.
      
      The fix changes how we determine the number of levels to step back.
      Instead of deducting this merely from the msb of the current ID, we should
      really check if advancing the ID causes an overflow to a bit position
      corresponding to a given layer.  In the above example overflow from bit 0
      to bit 1 should mean stepping back 1 level.  Overflow from bit 1 to bit 2
      should mean stepping back 2 levels and so on.
      
      The fix was tested with IDs up to 1 << 20, which corresponds to 4 layers
      on 32 bit systems.
      Signed-off-by: NImre Deak <imre.deak@nokia.com>
      Reviewed-by: NTejun Heo <tj@kernel.org>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: <stable@kernel.org>		[2.6.34.1]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2dcb22b3
  4. 25 2月, 2010 2 次提交
  5. 23 2月, 2010 1 次提交
  6. 05 2月, 2010 1 次提交
  7. 03 2月, 2010 1 次提交
    • T
      idr: fix a critical misallocation bug · 859ddf09
      Tejun Heo 提交于
      Eric Paris located a bug in idr.  With IDR_BITS of 6, it grows to three
      layers when id 4096 is first allocated.  When that happens, idr wraps
      incorrectly and searches the idr array ignoring the high bits.  The
      following test code from Eric demonstrates the bug nicely.
      
      #include <linux/idr.h>
      #include <linux/kernel.h>
      #include <linux/module.h>
      
      static DEFINE_IDR(test_idr);
      
      int init_module(void)
      {
      	int ret, forty95, forty96;
      	void *addr;
      
      	/* add 2 entries both with 4095 as the start address */
      again1:
      	if (!idr_pre_get(&test_idr, GFP_KERNEL))
      		return -ENOMEM;
      	ret = idr_get_new_above(&test_idr, (void *)4095, 4095, &forty95);
      	if (ret) {
      		if (ret == -EAGAIN)
      			goto again1;
      		return ret;
      	}
      	if (forty95 != 4095)
      		printk(KERN_ERR "hmmm, forty95=%d\n", forty95);
      
      again2:
      	if (!idr_pre_get(&test_idr, GFP_KERNEL))
      		return -ENOMEM;
      	ret = idr_get_new_above(&test_idr, (void *)4096, 4095, &forty96);
      	if (ret) {
      		if (ret == -EAGAIN)
      			goto again2;
      		return ret;
      	}
      	if (forty96 != 4096)
      		printk(KERN_ERR "hmmm, forty96=%d\n", forty96);
      
      	/* try to find the 2 entries, noticing that 4096 broke */
      	addr = idr_find(&test_idr, forty95);
      	if ((int)addr != forty95)
      		printk(KERN_ERR "hmmm, after find forty95=%d addr=%d\n", forty95, (int)addr);
      	addr = idr_find(&test_idr, forty96);
      	if ((int)addr != forty96)
      		printk(KERN_ERR "hmmm, after find forty96=%d addr=%d\n", forty96, (int)addr);
      	/* really weird, the entry which should be at 4096 is actually at 0!! */
      	addr = idr_find(&test_idr, 0);
      	if ((int)addr)
      		printk(KERN_ERR "found an entry at id=0 for addr=%d\n", (int)addr);
      
      	idr_remove(&test_idr, forty95);
      	idr_remove(&test_idr, forty96);
      
      	return 0;
      }
      
      void cleanup_module(void)
      {
      }
      
      MODULE_AUTHOR("Eric Paris <eparis@redhat.com>");
      MODULE_DESCRIPTION("Simple idr test");
      MODULE_LICENSE("GPL");
      
      This happens because when sub_alloc() back tracks it doesn't always do it
      step-by-step while the over-the-limit detection assumes step-by-step
      backtracking.  The logic in sub_alloc() looks like the following.
      
        restart:
          clear pa[top level + 1] for end cond detection
          l = top level
          while (true) {
      	search for empty slot at this level
      	if (not found) {
      	    push id to the next possible value
      	    l++
      A:	    if (pa[l] is clear)
      	        failed, return asking caller to grow the tree
      	    if (going up 1 level gives more slots to search)
      	        continue the while loop above with the incremented l
      	    else
      C:	        goto restart
      	}
      	adjust id accordingly to the found slot
      	if (l == 0)
      	    return found id;
      	create lower level if not there yet
      	record pa[l] and l--
          }
      
      Test A is the fail exit condition but this assumes that failure is
      propagated upwared one level at a time but the B optimization path breaks
      the assumption and restarts the whole thing with a start value which is
      above the possible limit with the current layers.  sub_alloc() assumes the
      start id value is inside the limit when called and test A is the only exit
      condition check, so it ends up searching for empty slot while ignoring
      high set bit.
      
      So, for 4095->4096 test, level0 search fails but pa[1] contains a valid
      pointer.  However, going up 1 level wouldn't give any more empty slot so
      it takes C and when the whole thing restarts nobody notices the high bit
      set beyond the top level.
      
      This patch fixes the bug by changing the fail exit condition check to full
      id limit check.
      
      Based-on-patch-from: Eric Paris <eparis@redhat.com>
      Reported-by: NEric Paris <eparis@redhat.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: <stable@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      859ddf09
  8. 04 12月, 2009 1 次提交
  9. 03 4月, 2009 1 次提交
    • K
      cgroup: CSS ID support · 38460b48
      KAMEZAWA Hiroyuki 提交于
      Patch for Per-CSS(Cgroup Subsys State) ID and private hierarchy code.
      
      This patch attaches unique ID to each css and provides following.
      
       - css_lookup(subsys, id)
         returns pointer to struct cgroup_subysys_state of id.
       - css_get_next(subsys, id, rootid, depth, foundid)
         returns the next css under "root" by scanning
      
      When cgroup_subsys->use_id is set, an id for css is maintained.
      
      The cgroup framework only parepares
      	- css_id of root css for subsys
      	- id is automatically attached at creation of css.
      	- id is *not* freed automatically. Because the cgroup framework
      	  don't know lifetime of cgroup_subsys_state.
      	  free_css_id() function is provided. This must be called by subsys.
      
      There are several reasons to develop this.
      	- Saving space .... For example, memcg's swap_cgroup is array of
      	  pointers to cgroup. But it is not necessary to be very fast.
      	  By replacing pointers(8bytes per ent) to ID (2byes per ent), we can
      	  reduce much amount of memory usage.
      
      	- Scanning without lock.
      	  CSS_ID provides "scan id under this ROOT" function. By this, scanning
      	  css under root can be written without locks.
      	  ex)
      	  do {
      		rcu_read_lock();
      		next = cgroup_get_next(subsys, id, root, &found);
      		/* check sanity of next here */
      		css_tryget();
      		rcu_read_unlock();
      		id = found + 1
      	 } while(...)
      
      Characteristics:
      	- Each css has unique ID under subsys.
      	- Lifetime of ID is controlled by subsys.
      	- css ID contains "ID" and "Depth in hierarchy" and stack of hierarchy
      	- Allowed ID is 1-65535, ID 0 is UNUSED ID.
      
      Design Choices:
      	- scan-by-ID v.s. scan-by-tree-walk.
      	  As /proc's pid scan does, scan-by-ID is robust when scanning is done
      	  by following kind of routine.
      	  scan -> rest a while(release a lock) -> conitunue from interrupted
      	  memcg's hierarchical reclaim does this.
      
      	- When subsys->use_id is set, # of css in the system is limited to
      	  65535.
      
      [bharata@linux.vnet.ibm.com: remove rcu_read_lock() from css_get_next()]
      Signed-off-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: NPaul Menage <menage@google.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Balbir Singh <balbir@in.ibm.com>
      Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
      Signed-off-by: NBharata B Rao <bharata@linux.vnet.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      38460b48
  10. 11 3月, 2009 1 次提交
  11. 16 1月, 2009 2 次提交
    • A
      lib/idr.c: use kmem_cache_zalloc() for the idr_layer cache · 5b019e99
      Andrew Morton 提交于
      David points out that the idr_remove_all() function returns unused slabs
      to the kmem cache, but needs to zero them first or else they will be
      uninitialized upon next use.  This causes crashes which have been observed
      in the firewire subsystem.
      
      He fixed this by zeroing the object before freeing it in idr_remove_all().
      
      But we agree that simply removing the constructor and zeroing the object
      at allocation time is simpler than relying upon slab constructor machinery
      and might even be faster.
      
      This problem was introduced by "idr: make idr_remove rcu-safe" (commit
      cf481c20), which was first released in
      2.6.27.
      
      There are no known codesites which trigger this bug in 2.6.27 or 2.6.28.
      The post-2.6.28 firewire changes are the only known triggerer.
      
      There might of course be not-yet-discovered triggerers in 2.6.27 and
      2.6.28, and there might be out-of-tree triggerers which are added to those
      kernel versions.  I'll let the -stable guys decide whether they want to
      backport this fix.
      Reported-by: NDavid Moore <dcm@acm.org>
      Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
      Cc: Nadia Derbey <Nadia.Derbey@bull.net>
      Cc: Paul E. McKenney <paulmck@us.ibm.com>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Cc: Kristian Hgsberg <krh@redhat.com>
      Acked-by: NPekka Enberg <penberg@cs.helsinki.fi>
      Cc: <stable@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5b019e99
    • L
      idr: fix wrong kernel-doc · b098161b
      Li Zefan 提交于
      idr_get_new_above() and ida_get_new_above() return an id in the range of
      @staring_id ... 0x7fffffff, not 0 ... 0x7fffffff.
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b098161b
  12. 11 12月, 2008 1 次提交
  13. 02 12月, 2008 1 次提交
  14. 27 7月, 2008 1 次提交
  15. 26 7月, 2008 6 次提交
  16. 01 5月, 2008 1 次提交
  17. 29 4月, 2008 1 次提交
  18. 17 10月, 2007 1 次提交
  19. 15 10月, 2007 1 次提交
  20. 01 8月, 2007 1 次提交
  21. 20 7月, 2007 1 次提交
    • P
      mm: Remove slab destructors from kmem_cache_create(). · 20c2df83
      Paul Mundt 提交于
      Slab destructors were no longer supported after Christoph's
      c59def9f change. They've been
      BUGs for both slab and slub, and slob never supported them
      either.
      
      This rips out support for the dtor pointer from kmem_cache_create()
      completely and fixes up every single callsite in the kernel (there were
      about 224, not including the slab allocator definitions themselves,
      or the documentation references).
      Signed-off-by: NPaul Mundt <lethal@linux-sh.org>
      20c2df83
  22. 17 7月, 2007 2 次提交
    • K
      lib: add idr_remove_all · 23936cc0
      Kristian Hoegsberg 提交于
      Remove all ids from the given idr tree.  idr_destroy() only frees up
      unused, cached idp_layers, but this function will remove all id mappings
      and leave all idp_layers unused.
      
      A typical clean-up sequence for objects stored in an idr tree, will use
      idr_for_each() to free all objects, if necessay, then idr_remove_all() to
      remove all ids, and idr_destroy() to free up the cached idr_layers.
      Signed-off-by: NKristian Hoegsberg <krh@redhat.com>
      Cc: Tejun Heo <htejun@gmail.com>
      Cc: Dave Airlie <airlied@linux.ie>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      23936cc0
    • K
      lib: add idr_for_each() · 96d7fa42
      Kristian Hoegsberg 提交于
      This patch adds an iterator function for the idr data structure.  Compared
      to just iterating through the idr with an integer and idr_find, this
      iterator is (almost, but not quite) linear in the number of elements, as
      opposed to the number of integers in the range covered by the idr.  This
      makes a difference for sparse idrs, but more importantly, it's a nicer way
      to iterate through the elements.
      
      The drm subsystem is moving to idr for tracking contexts and drawables, and
      with this change, we can use the idr exclusively for tracking these
      resources.
      
      [akpm@linux-foundation.org: fix comment]
      Signed-off-by: NKristian Hoegsberg <krh@redhat.com>
      Cc: Tejun Heo <htejun@gmail.com>
      Cc: Dave Airlie <airlied@linux.ie>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      96d7fa42
  23. 12 7月, 2007 3 次提交
    • T
      ida: implement idr based id allocator · 72dba584
      Tejun Heo 提交于
      Implement idr based id allocator.  ida is used the same way idr is
      used but lacks id -> ptr translation and thus consumes much less
      memory.  struct ida_bitmap is attached as leaf nodes to idr tree which
      is managed by the idr code.  Each ida_bitmap is 128bytes long and
      contains slightly less than a thousand slots.
      
      ida is more aggressive with releasing extra resources acquired using
      ida_pre_get().  After every successful id allocation, ida frees one
      reserved idr_layer if possible.  Reserved ida_bitmap is not freed
      automatically but only one ida_bitmap is reserved and it's almost
      always used right away.  Under most circumstances, ida won't hold on
      to memory for too long which isn't actively used.
      Signed-off-by: NTejun Heo <htejun@gmail.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
      72dba584
    • T
      idr: separate out idr_mark_full() · e33ac8bd
      Tejun Heo 提交于
      Separate out idr_mark_full() from sub_alloc() and make marking the
      allocated slot full the responsibility of idr_get_new_above_int().
      
      Allocation part of idr_get_new_above_int() is renamed to
      idr_get_empty_slot().  New idr_get_new_above_int() allocates a slot
      using the function, install the user pointer and marks it full using
      idr_mark_full().
      
      This change doesn't introduce any behavior change.  This will be
      used by ida.
      Signed-off-by: NTejun Heo <htejun@gmail.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
      e33ac8bd
    • T
      idr: fix obscure bug in allocation path · 7aae6dd8
      Tejun Heo 提交于
      In sub_alloc(), when bitmap search fails, it goes up one level to
      continue search.  This is done by updating the id cursor and searching
      the upper level again.  If the cursor was at the end of the upper
      level, we need to go further than that.
      
      This wasn't implemented and when that happens the part of the cursor
      which indexes into the upper level wraps and sub_alloc() ends up
      searching the wrong bitmap.  It allocates id which doesn't match the
      actual slot.
      
      This patch fixes this by restarting from the top if the search needs
      to go higher than one level.
      Signed-off-by: NTejun Heo <htejun@gmail.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
      
      7aae6dd8
  24. 12 2月, 2007 1 次提交
  25. 08 12月, 2006 1 次提交
  26. 15 7月, 2006 1 次提交
    • R
      [PATCH] Convert idr's internal locking to _irqsave variant · c259cc28
      Roland Dreier 提交于
      Currently, the code in lib/idr.c uses a bare spin_lock(&idp->lock) to do
      internal locking.  This is a nasty trap for code that might call idr
      functions from different contexts; for example, it seems perfectly
      reasonable to call idr_get_new() from process context and idr_remove() from
      interrupt context -- but with the current locking this would lead to a
      potential deadlock.
      
      The simplest fix for this is to just convert the idr locking to use
      spin_lock_irqsave().
      
      In particular, this fixes a very complicated locking issue detected by
      lockdep, involving the ib_ipoib driver's priv->lock and dev->_xmit_lock,
      which get involved with the ib_sa module's query_idr.lock.
      
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Zach Brown <zach.brown@oracle.com>,
      Signed-off-by: NRoland Dreier <rolandd@cisco.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      c259cc28
  27. 27 6月, 2006 1 次提交
  28. 26 6月, 2006 1 次提交
    • S
      [PATCH] fix race in idr code · 1eec0056
      Sonny Rao 提交于
      I ran into a bug where the kernel died in the idr code:
      
      cpu 0x1d: Vector: 300 (Data Access) at [c000000b7096f710]
          pc: c0000000001f8984: .idr_get_new_above_int+0x140/0x330
          lr: c0000000001f89b4: .idr_get_new_above_int+0x170/0x330
          sp: c000000b7096f990
         msr: 800000000000b032
         dar: 0
       dsisr: 40010000
        current = 0xc000000b70d43830
        paca    = 0xc000000000556900
          pid   = 2022, comm = hwup
      1d:mon> t
      [c000000b7096f990] c0000000000d2ad8 .expand_files+0x2e8/0x364 (unreliable)
      [c000000b7096faa0] c0000000001f8bf8 .idr_get_new_above+0x18/0x68
      [c000000b7096fb20] c00000000002a054 .init_new_context+0x5c/0xf0
      [c000000b7096fbc0] c000000000049dc8 .copy_process+0x91c/0x1404
      [c000000b7096fcd0] c00000000004a988 .do_fork+0xd8/0x224
      [c000000b7096fdc0] c00000000000ebdc .sys_clone+0x5c/0x74
      [c000000b7096fe30] c000000000008950 .ppc_clone+0x8/0xc
      1eec0056
  29. 31 10月, 2005 1 次提交