1. 14 4月, 2013 1 次提交
    • L
      kobject: fix kset_find_obj() race with concurrent last kobject_put() · a49b7e82
      Linus Torvalds 提交于
      Anatol Pomozov identified a race condition that hits module unloading
      and re-loading.  To quote Anatol:
      
       "This is a race codition that exists between kset_find_obj() and
        kobject_put().  kset_find_obj() might return kobject that has refcount
        equal to 0 if this kobject is freeing by kobject_put() in other
        thread.
      
        Here is timeline for the crash in case if kset_find_obj() searches for
        an object tht nobody holds and other thread is doing kobject_put() on
        the same kobject:
      
          THREAD A (calls kset_find_obj())     THREAD B (calls kobject_put())
          splin_lock()
                                               atomic_dec_return(kobj->kref), counter gets zero here
                                               ... starts kobject cleanup ....
                                               spin_lock() // WAIT thread A in kobj_kset_leave()
          iterate over kset->list
          atomic_inc(kobj->kref) (counter becomes 1)
          spin_unlock()
                                               spin_lock() // taken
                                               // it does not know that thread A increased counter so it
                                               remove obj from list
                                               spin_unlock()
                                               vfree(module) // frees module object with containing kobj
      
          // kobj points to freed memory area!!
          kobject_put(kobj) // OOPS!!!!
      
        The race above happens because module.c tries to use kset_find_obj()
        when somebody unloads module.  The module.c code was introduced in
        commit 6494a93d"
      
      Anatol supplied a patch specific for module.c that worked around the
      problem by simply not using kset_find_obj() at all, but rather than make
      a local band-aid, this just fixes kset_find_obj() to be thread-safe
      using the proper model of refusing the get a new reference if the
      refcount has already dropped to zero.
      
      See examples of this proper refcount handling not only in the kref
      documentation, but in various other equivalent uses of this pattern by
      grepping for atomic_inc_not_zero().
      
      [ Side note: the module race does indicate that module loading and
        unloading is not properly serialized wrt sysfs information using the
        module mutex.  That may require further thought, but this is the
        correct fix at the kobject layer regardless. ]
      Reported-analyzed-and-tested-by: NAnatol Pomozov <anatol.pomozov@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: stable@vger.kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a49b7e82
  2. 23 3月, 2013 3 次提交
    • A
      dma-debug: update DMA debug API to better handle multiple mappings of a buffer · 96e7d7a1
      Alexander Duyck 提交于
      There were reports of the igb driver unmapping buffers without calling
      dma_mapping_error.  On closer inspection issues were found in the DMA
      debug API and how it handled multiple mappings of the same buffer.
      
      The issue I found is the fact that the debug_dma_mapping_error would
      only set the map_err_type to MAP_ERR_CHECKED in the case that the was
      only one match for device and device address.  However in the case of
      non-IOMMU, multiple addresses existed and as a result it was not setting
      this field once a second mapping was instantiated.  I have resolved this
      by changing the search so that it instead will now set MAP_ERR_CHECKED
      on the first buffer that matches the device and DMA address that is
      currently in the state MAP_ERR_NOT_CHECKED.
      
      A secondary side effect of this patch is that in the case of multiple
      buffers using the same address only the last mapping will have a valid
      map_err_type.  The previous mappings will all end up with map_err_type
      set to MAP_ERR_CHECKED because of the dma_mapping_error call in
      debug_dma_map_page.  However this behavior may be preferable as it means
      you will likely only see one real error per multi-mapped buffer, versus
      the current behavior of multiple false errors mer multi-mapped buffer.
      Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Reviewed-by: NShuah Khan <shuah.khan@hp.com>
      Tested-by: NShuah Khan <shuah.khan@hp.com>
      Cc: Jakub Kicinski <kubakici@wp.pl>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      96e7d7a1
    • A
      dma-debug: fix locking bug in check_unmap() · 8d640a51
      Alexander Duyck 提交于
      In check_unmap() it is possible to get into a dead-locked state if
      dma_mapping_error is called.  The problem is that the bucket is locked in
      check_unmap, and locked again by debug_dma_mapping_error which is called
      by dma_mapping_error.  To resolve that we must release the lock on the
      bucket before making the call to dma_mapping_error.
      
      [akpm@linux-foundation.org: restore 80-col trickery to be consistent with the rest of the file]
      Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Reviewed-by: NShuah Khan <shuah.khan@hp.com>
      Tested-by: NShuah Khan <shuah.khan@hp.com>
      Cc: Jakub Kicinski <kubakici@wp.pl>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8d640a51
    • F
      printk: Provide a wake_up_klogd() off-case · dc72c32e
      Frederic Weisbecker 提交于
      wake_up_klogd() is useless when CONFIG_PRINTK=n because neither printk()
      nor printk_sched() are in use and there are actually no waiter on
      log_wait waitqueue.  It should be a stub in this case for users like
      bust_spinlocks().
      
      Otherwise this results in this warning when CONFIG_PRINTK=n and
      CONFIG_IRQ_WORK=n:
      
      	kernel/built-in.o In function `wake_up_klogd':
      	(.text.wake_up_klogd+0xb4): undefined reference to `irq_work_queue'
      
      To fix this, provide an off-case for wake_up_klogd() when
      CONFIG_PRINTK=n.
      
      There is much more from console_unlock() and other console related code
      in printk.c that should be moved under CONFIG_PRINTK.  But for now,
      focus on a minimal fix as we passed the merged window already.
      
      [akpm@linux-foundation.org: include printk.h in bust_spinlocks.c]
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Reported-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      dc72c32e
  3. 14 3月, 2013 3 次提交
  4. 13 3月, 2013 1 次提交
  5. 09 3月, 2013 1 次提交
    • T
      idr: remove WARN_ON_ONCE() on negative IDs · 2e1c9b28
      Tejun Heo 提交于
      idr_find(), idr_remove() and idr_replace() used to silently ignore the
      sign bit and perform lookup with the rest of the bits.  The weird behavior
      has been changed such that negative IDs are treated as invalid.  As the
      behavior change was subtle, WARN_ON_ONCE() was added in the hope of
      determining who's calling idr functions with negative IDs so that they can
      be examined for problems.
      
      Up until now, all two reported cases are ID number coming directly from
      userland and getting fed into idr_find() and the warnings seem to cause
      more problems than being helpful.  Drop the WARN_ON_ONCE()s.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reported-by: <markus@trippelsdorf.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2e1c9b28
  6. 03 3月, 2013 1 次提交
    • J
      Kconfig.debug: add METAG to dependency lists · 79f83c02
      James Hogan 提交于
      Add [!]METAG to a couple of Kconfig dependencies in lib/Kconfig.debug.
      Don't allow stack utilization instrumentation on metag, and allow
      building with frame pointers.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: "Paul E. McKenney" <paul.mckenney@linaro.org>
      Cc: Akinobu Mita <akinobu.mita@gmail.com>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      79f83c02
  7. 02 3月, 2013 1 次提交
    • R
      Fixed dead ifdef block by adding missing Kconfig option. · 3b0eb71e
      Robert Obermeier 提交于
      Added missing Kconfig option KDB_CONTINUE_CATASTROPHIC which lead to a dead
      ifdef block in kernel/debug/kdb/kdb_main.c:73-75.
      
      The code using KDB_CONTINUE_CATASTROPHIC was originally introduced in
      commit '5d5314d6' by Jason Wessel.
      This patchset ("kdb: core for kgdb back end (1 of 2)")
      added platform independent part of kdb to the linux kernel.
      
      The Kernel option however, even though it had the same options and
      behaviour on all supported architectures, was part of the x86 and
      ia64 patchset of KDB and therefore not pulled into the mainline kernel tree.
      
      I actually took the originally written Kconfig by
      Keith Owens <kaos@sgi.com> (2003-06-20 according to KDB changelog)
      and changed it to reflect the correct behaviour,
      as the KDUMP patchset is not part of the kernel and the expected
      functionality is missing from it.
      Signed-off-by: NRobert Obermeier <obbi89@googlemail.com>
      Signed-off-by: NJason Wessel <jason.wessel@windriver.com>
      3b0eb71e
  8. 28 2月, 2013 19 次提交
    • S
      hlist: drop the node parameter from iterators · b67bfe0d
      Sasha Levin 提交于
      I'm not sure why, but the hlist for each entry iterators were conceived
      
              list_for_each_entry(pos, head, member)
      
      The hlist ones were greedy and wanted an extra parameter:
      
              hlist_for_each_entry(tpos, pos, head, member)
      
      Why did they need an extra pos parameter? I'm not quite sure. Not only
      they don't really need it, it also prevents the iterator from looking
      exactly like the list iterator, which is unfortunate.
      
      Besides the semantic patch, there was some manual work required:
      
       - Fix up the actual hlist iterators in linux/list.h
       - Fix up the declaration of other iterators based on the hlist ones.
       - A very small amount of places were using the 'node' parameter, this
       was modified to use 'obj->member' instead.
       - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
       properly, so those had to be fixed up manually.
      
      The semantic patch which is mostly the work of Peter Senna Tschudin is here:
      
      @@
      iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;
      
      type T;
      expression a,c,d,e;
      identifier b;
      statement S;
      @@
      
      -T b;
          <+... when != b
      (
      hlist_for_each_entry(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_continue(a,
      - b,
      c) S
      |
      hlist_for_each_entry_from(a,
      - b,
      c) S
      |
      hlist_for_each_entry_rcu(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_rcu_bh(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_continue_rcu_bh(a,
      - b,
      c) S
      |
      for_each_busy_worker(a, c,
      - b,
      d) S
      |
      ax25_uid_for_each(a,
      - b,
      c) S
      |
      ax25_for_each(a,
      - b,
      c) S
      |
      inet_bind_bucket_for_each(a,
      - b,
      c) S
      |
      sctp_for_each_hentry(a,
      - b,
      c) S
      |
      sk_for_each(a,
      - b,
      c) S
      |
      sk_for_each_rcu(a,
      - b,
      c) S
      |
      sk_for_each_from
      -(a, b)
      +(a)
      S
      + sk_for_each_from(a) S
      |
      sk_for_each_safe(a,
      - b,
      c, d) S
      |
      sk_for_each_bound(a,
      - b,
      c) S
      |
      hlist_for_each_entry_safe(a,
      - b,
      c, d, e) S
      |
      hlist_for_each_entry_continue_rcu(a,
      - b,
      c) S
      |
      nr_neigh_for_each(a,
      - b,
      c) S
      |
      nr_neigh_for_each_safe(a,
      - b,
      c, d) S
      |
      nr_node_for_each(a,
      - b,
      c) S
      |
      nr_node_for_each_safe(a,
      - b,
      c, d) S
      |
      - for_each_gfn_sp(a, c, d, b) S
      + for_each_gfn_sp(a, c, d) S
      |
      - for_each_gfn_indirect_valid_sp(a, c, d, b) S
      + for_each_gfn_indirect_valid_sp(a, c, d) S
      |
      for_each_host(a,
      - b,
      c) S
      |
      for_each_host_safe(a,
      - b,
      c, d) S
      |
      for_each_mesh_entry(a,
      - b,
      c, d) S
      )
          ...+>
      
      [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
      [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
      [akpm@linux-foundation.org: checkpatch fixes]
      [akpm@linux-foundation.org: fix warnings]
      [akpm@linux-foudnation.org: redo intrusive kvm changes]
      Tested-by: NPeter Senna Tschudin <peter.senna@gmail.com>
      Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: NSasha Levin <sasha.levin@oracle.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b67bfe0d
    • S
      kfifo: fix kfifo_alloc() and kfifo_init() · dfe2a77f
      Stefani Seibold 提交于
      Fix kfifo_alloc() and kfifo_init() to alloc at least the requested number
      of elements.  Since the kfifo operates on power of 2 the request size will
      be rounded up to the next power of two.
      Signed-off-by: NStefani Seibold <stefani@seibold.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      dfe2a77f
    • S
      kfifo: move kfifo.c from kernel/ to lib/ · c759b35e
      Stefani Seibold 提交于
      Move kfifo.c from kernel/ to lib/
      Signed-off-by: NStefani Seibold <stefani@seibold.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c759b35e
    • T
      idr: explain WARN_ON_ONCE() on negative IDs out-of-range ID · 7175c61c
      Tejun Heo 提交于
      Until recently, when an negative ID is specified, idr functions used to
      ignore the sign bit and proceeded with the operation with the rest of
      bits, which is bizarre and error-prone.  The behavior recently got changed
      so that negative IDs are treated as invalid but we're triggering
      WARN_ON_ONCE() on negative IDs just in case somebody was depending on the
      sign bit being ignored, so that those can be detected and fixed easily.
      
      We only need this for a while.  Explain why WARN_ON_ONCE()s are there and
      that they can be removed later.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7175c61c
    • T
      idr: implement lookup hint · 0ffc2a9c
      Tejun Heo 提交于
      While idr lookup isn't a particularly heavy operation, it still is too
      substantial to use in hot paths without worrying about the performance
      implications.  With recent changes, each idr_layer covers 256 slots
      which should be enough to cover most use cases with single idr_layer
      making lookup hint very attractive.
      
      This patch adds idr->hint which points to the idr_layer which
      allocated an ID most recently and the fast path lookup becomes
      
      	if (look up target's prefix matches that of the hinted layer)
      		return hint->ary[ID's offset in the leaf layer];
      
      which can be inlined.
      
      idr->hint is set to the leaf node on idr_fill_slot() and cleared from
      free_layer().
      
      [andriy.shevchenko@linux.intel.com: always do slow path when hint is uninitialized]
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0ffc2a9c
    • T
      idr: add idr_layer->prefix · 54616283
      Tejun Heo 提交于
      Add a field which carries the prefix of ID the idr_layer covers.  This
      will be used to implement lookup hint.
      
      This patch doesn't make use of the new field and doesn't introduce any
      behavior difference.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      54616283
    • T
      idr: remove length restriction from idr_layer->bitmap · 1d9b2e1e
      Tejun Heo 提交于
      Currently, idr->bitmap is declared as an unsigned long which restricts
      the number of bits an idr_layer can contain.  All bitops can handle
      arbitrary positive integer bit number and there's no reason for this
      restriction.
      
      Declare idr_layer->bitmap using DECLARE_BITMAP() instead of a single
      unsigned long.
      
      * idr_layer->bitmap is now an array.  '&' dropped from params to
        bitops.
      
      * Replaced "== IDR_FULL" tests with bitmap_full() and removed
        IDR_FULL.
      
      * Replaced find_next_bit() on ~bitmap with find_next_zero_bit().
      
      * Replaced "bitmap = 0" with bitmap_clear().
      
      This patch doesn't (or at least shouldn't) introduce any behavior
      changes.
      
      [akpm@linux-foundation.org: checkpatch fixes]
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1d9b2e1e
    • T
      idr: remove MAX_IDR_MASK and move left MAX_IDR_* into idr.c · e8c8d1bc
      Tejun Heo 提交于
      MAX_IDR_MASK is another weirdness in the idr interface.  As idr covers
      whole positive integer range, it's defined as 0x7fffffff or INT_MAX.
      
      Its usage in idr_find(), idr_replace() and idr_remove() is bizarre.
      They basically mask off the sign bit and operate on the rest, so if
      the caller, by accident, passes in a negative number, the sign bit
      will be masked off and the remaining part will be used as if that was
      the input, which is worse than crashing.
      
      The constant is visible in idr.h and there are several users in the
      kernel.
      
      * drivers/i2c/i2c-core.c:i2c_add_numbered_adapter()
      
        Basically used to test if adap->nr is a negative number which isn't
        -1 and returns -EINVAL if so.  idr_alloc() already has negative
        @start checking (w/ WARN_ON_ONCE), so this can go away.
      
      * drivers/infiniband/core/cm.c:cm_alloc_id()
        drivers/infiniband/hw/mlx4/cm.c:id_map_alloc()
      
        Used to wrap cyclic @start.  Can be replaced with max(next, 0).
        Note that this type of cyclic allocation using idr is buggy.  These
        are prone to spurious -ENOSPC failure after the first wraparound.
      
      * fs/super.c:get_anon_bdev()
      
        The ID allocated from ida is masked off before being tested whether
        it's inside valid range.  ida allocated ID can never be a negative
        number and the masking is unnecessary.
      
      Update idr_*() functions to fail with -EINVAL when negative @id is
      specified and update other MAX_IDR_MASK users as described above.
      
      This leaves MAX_IDR_MASK without any user, remove it and relocate
      other MAX_IDR_* constants to lib/idr.c.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Jean Delvare <khali@linux-fr.org>
      Cc: Roland Dreier <roland@kernel.org>
      Cc: Sean Hefty <sean.hefty@intel.com>
      Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
      Cc: "Marciniszyn, Mike" <mike.marciniszyn@intel.com>
      Cc: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Cc: Or Gerlitz <ogerlitz@mellanox.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Acked-by: NWolfram Sang <wolfram@the-dreams.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e8c8d1bc
    • T
      idr: fix top layer handling · 326cf0f0
      Tejun Heo 提交于
      Most functions in idr fail to deal with the high bits when the idr
      tree grows to the maximum height.
      
      * idr_get_empty_slot() stops growing idr tree once the depth reaches
        MAX_IDR_LEVEL - 1, which is one depth shallower than necessary to
        cover the whole range.  The function doesn't even notice that it
        didn't grow the tree enough and ends up allocating the wrong ID
        given sufficiently high @starting_id.
      
        For example, on 64 bit, if the starting id is 0x7fffff01,
        idr_get_empty_slot() will grow the tree 5 layer deep, which only
        covers the 30 bits and then proceed to allocate as if the bit 30
        wasn't specified.  It ends up allocating 0x3fffff01 without the bit
        30 but still returns 0x7fffff01.
      
      * __idr_remove_all() will not remove anything if the tree is fully
        grown.
      
      * idr_find() can't find anything if the tree is fully grown.
      
      * idr_for_each() and idr_get_next() can't iterate anything if the tree
        is fully grown.
      
      Fix it by introducing idr_max() which returns the maximum possible ID
      given the depth of tree and replacing the id limit checks in all
      affected places.
      
      As the idr_layer pointer array pa[] needs to be 1 larger than the
      maximum depth, enlarge pa[] arrays by one.
      
      While this plugs the discovered issues, the whole code base is
      horrible and in desparate need of rewrite.  It's fragile like hell,
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      326cf0f0
    • T
      idr: implement idr_preload[_end]() and idr_alloc() · d5c7409f
      Tejun Heo 提交于
      The current idr interface is very cumbersome.
      
      * For all allocations, two function calls - idr_pre_get() and
        idr_get_new*() - should be made.
      
      * idr_pre_get() doesn't guarantee that the following idr_get_new*()
        will not fail from memory shortage.  If idr_get_new*() returns
        -EAGAIN, the caller is expected to retry pre_get and allocation.
      
      * idr_get_new*() can't enforce upper limit.  Upper limit can only be
        enforced by allocating and then freeing if above limit.
      
      * idr_layer buffer is unnecessarily per-idr.  Each idr ends up keeping
        around MAX_IDR_FREE idr_layers.  The memory consumed per idr is
        under two pages but it makes it difficult to make idr_layer larger.
      
      This patch implements the following new set of allocation functions.
      
      * idr_preload[_end]() - Similar to radix preload but doesn't fail.
        The first idr_alloc() inside preload section can be treated as if it
        were called with @gfp_mask used for idr_preload().
      
      * idr_alloc() - Allocate an ID w/ lower and upper limits.  Takes
        @gfp_flags and can be used w/o preloading.  When used inside
        preloaded section, the allocation mask of preloading can be assumed.
      
      If idr_alloc() can be called from a context which allows sufficiently
      relaxed @gfp_mask, it can be used by itself.  If, for example,
      idr_alloc() is called inside spinlock protected region, preloading can
      be used like the following.
      
      	idr_preload(GFP_KERNEL);
      	spin_lock(lock);
      
      	id = idr_alloc(idr, ptr, start, end, GFP_NOWAIT);
      
      	spin_unlock(lock);
      	idr_preload_end();
      	if (id < 0)
      		error;
      
      which is much simpler and less error-prone than idr_pre_get and
      idr_get_new*() loop.
      
      The new interface uses per-pcu idr_layer buffer and thus the number of
      idr's in the system doesn't affect the amount of memory used for
      preloading.
      
      idr_layer_alloc() is introduced to handle idr_layer allocations for
      both old and new ID allocation paths.  This is a bit hairy now but the
      new interface is expected to replace the old and the internal
      implementation eventually will become simpler.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d5c7409f
    • T
      idr: refactor idr_get_new_above() · 3594eb28
      Tejun Heo 提交于
      Move slot filling to idr_fill_slot() from idr_get_new_above_int() and
      make idr_get_new_above() directly call it.  idr_get_new_above_int() is
      no longer needed and removed.
      
      This will be used to implement a new ID allocation interface.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3594eb28
    • T
      idr: remove _idr_rc_to_errno() hack · 12d1b439
      Tejun Heo 提交于
      idr uses -1, IDR_NEED_TO_GROW and IDR_NOMORE_SPACE to communicate
      exception conditions internally.  The return value is later translated
      to errno values using _idr_rc_to_errno().
      
      This is confusing.  Drop the custom ones and consistently use -EAGAIN
      for "tree needs to grow", -ENOMEM for "need more memory" and -ENOSPC for
      "ran out of ID space".
      
      Due to the weird memory preloading mechanism, [ra]_get_new*() return
      -EAGAIN on memory shortage, so we need to substitute -ENOMEM w/
      -EAGAIN on those interface functions.  They'll eventually be cleaned
      up and the translations will go away.
      
      This patch doesn't introduce any functional changes.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      12d1b439
    • T
      idr: relocate idr_for_each_entry() and reorganize id[r|a]_get_new() · 49038ef4
      Tejun Heo 提交于
      * Move idr_for_each_entry() definition next to other idr related
        definitions.
      
      * Make id[r|a]_get_new() inline wrappers of id[r|a]_get_new_above().
      
      This changes the implementation of idr_get_new() but the new
      implementation is trivial.  This patch doesn't introduce any
      functional change.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      49038ef4
    • T
      idr: deprecate idr_remove_all() · fe6e24ec
      Tejun Heo 提交于
      There was only one legitimate use of idr_remove_all() and a lot more of
      incorrect uses (or lack of it).  Now that idr_destroy() implies
      idr_remove_all() and all the in-kernel users updated not to use it,
      there's no reason to keep it around.  Mark it deprecated so that we can
      later unexport it.
      
      idr_remove_all() is made an inline function calling __idr_remove_all()
      to avoid triggering deprecated warning on EXPORT_SYMBOL().
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      fe6e24ec
    • T
      idr: make idr_destroy() imply idr_remove_all() · 9bb26bc1
      Tejun Heo 提交于
      idr is silly in quite a few ways, one of which is how it's supposed to
      be destroyed - idr_destroy() doesn't release IDs and doesn't even whine
      if the idr isn't empty.  If the caller forgets idr_remove_all(), it
      simply leaks memory.
      
      Even ida gets this wrong and leaks memory on destruction.  There is
      absoltely no reason not to call idr_remove_all() from idr_destroy().
      Nobody is abusing idr_destroy() for shrinking free layer buffer and
      continues to use idr after idr_destroy(), so it's safe to do remove_all
      from destroy.
      
      In the whole kernel, there is only one place where idr_remove_all() is
      legitimiately used without following idr_destroy() while there are quite
      a few places where the caller forgets either idr_remove_all() or
      idr_destroy() leaking memory.
      
      This patch makes idr_destroy() call idr_destroy_all() and updates the
      function description accordingly.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9bb26bc1
    • T
      idr: fix a subtle bug in idr_get_next() · 6cdae741
      Tejun Heo 提交于
      The iteration logic of idr_get_next() is borrowed mostly verbatim from
      idr_for_each().  It walks down the tree looking for the slot matching
      the current ID.  If the matching slot is not found, the ID is
      incremented by the distance of single slot at the given level and
      repeats.
      
      The implementation assumes that during the whole iteration id is aligned
      to the layer boundaries of the level closest to the leaf, which is true
      for all iterations starting from zero or an existing element and thus is
      fine for idr_for_each().
      
      However, idr_get_next() may be given any point and if the starting id
      hits in the middle of a non-existent layer, increment to the next layer
      will end up skipping the same offset into it.  For example, an IDR with
      IDs filled between [64, 127] would look like the following.
      
                [  0  64 ... ]
             /----/   |
             |        |
            NULL    [ 64 ... 127 ]
      
      If idr_get_next() is called with 63 as the starting point, it will try
      to follow down the pointer from 0.  As it is NULL, it will then try to
      proceed to the next slot in the same level by adding the slot distance
      at that level which is 64 - making the next try 127.  It goes around the
      loop and finds and returns 127 skipping [64, 126].
      
      Note that this bug also triggers in idr_for_each_entry() loop which
      deletes during iteration as deletions can make layers go away leaving
      the iteration with unaligned ID into missing layers.
      
      Fix it by ensuring proceeding to the next slot doesn't carry over the
      unaligned offset - ie.  use round_up(id + 1, slot_distance) instead of
      id += slot_distance.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reported-by: NDavid Teigland <teigland@redhat.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6cdae741
    • I
      lib/scatterlist: use page iterator in the mapping iterator · 4225fc85
      Imre Deak 提交于
      For better code reuse use the newly added page iterator to iterate
      through the pages.  The offset, length within the page is still
      calculated by the mapping iterator as well as the actual mapping.  Idea
      from Tejun Heo.
      Signed-off-by: NImre Deak <imre.deak@intel.com>
      Cc: Maxim Levitsky <maximlevitsky@gmail.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Stephen Warren <swarren@wwwdotorg.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      4225fc85
    • I
      lib/scatterlist: add simple page iterator · a321e91b
      Imre Deak 提交于
      Add an iterator to walk through a scatter list a page at a time starting
      at a specific page offset.  As opposed to the mapping iterator this is
      meant to be small, performing well even in simple loops like collecting
      all pages on the scatterlist into an array or setting up an iommu table
      based on the pages' DMA address.
      Signed-off-by: NImre Deak <imre.deak@intel.com>
      Cc: Maxim Levitsky <maximlevitsky@gmail.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Tested-by: NStephen Warren <swarren@wwwdotorg.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a321e91b
    • J
      lib/devres.c: fix misplaced #endif · 9ed8a30f
      Jingoo Han 提交于
      A misplaced #endif causes link errors related to pcim_*() functions.
      
      This is because pcim_*() functions are related to CONFIG_PCI option,
      however these are not related to CONFIG_HAS_IOPORT option.  Therefore,
      when CONFIG_PCI is enabled and CONFIG_HAS_IOPORT is not enabled, it makes
      link errors related to pcim_*() functions as below:
      
      drivers/ata/libata-sff.c:3233: undefined reference to `pcim_iomap_regions'
      drivers/ata/libata-sff.c:3238: undefined reference to `pcim_iomap_table'
      drivers/built-in.o: In function `ata_pci_sff_init_host':
      drivers/ata/libata-sff.c:2318: undefined reference to `pcim_iomap_regions'
      drivers/ata/libata-sff.c:2329: undefined reference to `pcim_iomap_table
      Signed-off-by: NJingoo Han <jg1.han@samsung.com>
      Cc: Greg KH <greg@kroah.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9ed8a30f
  9. 22 2月, 2013 6 次提交
  10. 21 2月, 2013 2 次提交
  11. 19 2月, 2013 2 次提交