1. 03 September 2014 (2 commits)
  • percpu: implement asynchronous chunk population · 1a4d7607
    Authored by Tejun Heo
The percpu allocator now supports atomic allocations by only
allocating from already populated areas, but the mechanism to ensure
that there's an adequate amount of populated areas was missing.
      
      This patch expands pcpu_balance_work so that in addition to freeing
      excess free chunks it also populates chunks to maintain an adequate
      level of populated areas.  pcpu_alloc() schedules pcpu_balance_work if
      the amount of free populated areas is too low or after an atomic
      allocation failure.
      
* PERCPU_DYNAMIC_RESERVE is increased by two pages to account for
  PCPU_EMPTY_POP_PAGES_LOW.
      
      * pcpu_async_enabled is added to gate both async jobs -
        chunk->map_extend_work and pcpu_balance_work - so that we don't end
        up scheduling them while the needed subsystems aren't up yet.
Signed-off-by: Tejun Heo <tj@kernel.org>
      1a4d7607
  • percpu: implement [__]alloc_percpu_gfp() · 5835d96e
    Authored by Tejun Heo
      Now that pcpu_alloc_area() can allocate only from populated areas,
      it's easy to add atomic allocation support to [__]alloc_percpu().
      Update pcpu_alloc() so that it accepts @gfp and skips all the blocking
      operations and allocates only from the populated areas if @gfp doesn't
      contain GFP_KERNEL.  New interface functions [__]alloc_percpu_gfp()
      are added.
      
While this means that atomic allocations are possible, this isn't
complete yet as there's no mechanism to ensure that a certain amount of
populated areas is kept available, and atomic allocations may keep
failing under certain conditions.
Signed-off-by: Tejun Heo <tj@kernel.org>
      5835d96e
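For context, a minimal usage sketch of the new interface described above; the struct, variable, and function names are hypothetical, and GFP_NOWAIT is just one example of a mask without GFP_KERNEL:

  #include <linux/percpu.h>
  #include <linux/gfp.h>
  #include <linux/errno.h>

  struct hit_stats {
          unsigned long count;            /* hypothetical payload */
  };

  static struct hit_stats __percpu *stats;

  static int setup_stats_atomic(void)
  {
          /* without GFP_KERNEL, only already-populated percpu areas are
           * used, so this never blocks but can fail more easily */
          stats = alloc_percpu_gfp(struct hit_stats, GFP_NOWAIT);
          if (!stats)
                  return -ENOMEM;
          return 0;
  }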
2. 18 June 2014 (4 commits)
  • percpu: move {raw|this}_cpu_*() definitions to include/linux/percpu-defs.h · a32f8d8e
    Authored by Tejun Heo
We're in the process of moving all percpu accessors and operations to
include/linux/percpu-defs.h so that they're available to arch headers
without having to include the full include/linux/percpu.h, which may
cause a cyclic inclusion dependency.
      
      This patch moves {raw|this}_cpu_*() definitions from
      include/linux/percpu.h to include/linux/percpu-defs.h.  The code is
moved mostly verbatim; however, raw_cpu_*() are placed above
this_cpu_*(), which is more conventional as the raw operations may be
used to define other variants.
      
      This is pure reorganization.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Christoph Lameter <cl@linux.com>
      a32f8d8e
  • percpu: move generic {raw|this}_cpu_*_N() definitions to include/asm-generic/percpu.h · 47b69ad6
    Authored by Tejun Heo
      {raw|this}_cpu_*_N() operations are expected to be provided by archs
      and the generic definitions are provided as fallbacks.  As such, these
      firmly belong to include/asm-generic/percpu.h.
      
      Move the generic definitions to include/asm-generic/percpu.h.  The
code is moved mostly verbatim; however, raw_cpu_*_N() are placed above
this_cpu_*_N(), which is more conventional as the raw operations may be
used to define other variants.
      
      This is pure reorganization.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Christoph Lameter <cl@linux.com>
      47b69ad6
  • percpu: only allow sized arch overrides for {raw|this}_cpu_*() ops · dcba4333
    Authored by Tejun Heo
Currently, percpu allows two separate methods for overriding
{raw|this}_cpu_*() ops - for a given operation, an arch can provide a
whole replacement or sized sub-operations to override specific parts of
it.  e.g. an arch can provide either this_cpu_add() or this_cpu_add_4()
to override only the 4-byte operation.
      
While quite flexible at a glance, the dual-overriding scheme
complicates the code path for no actual gain.  It complicates the
already complex operation definitions, and if an arch wants to override
all sizes, it can easily provide all variants anyway.  In fact, no
arch is actually making use of whole-operation overrides.
      
Another oddity is that __this_cpu_*() operations are defined in the
same way as raw_cpu_*() but ignore full overrides of raw_cpu_*()
and don't allow full operation overrides themselves, so if an arch
provides whole overrides for raw_cpu_*() operations, __this_cpu_*()
ends up using the generic implementations.
      
More importantly, it takes away the layering between arch-specific and
generic parts, making it impossible for the generic part to implement
arch-independent features on top of arch-specific overrides.
      
      This patch removes the support for whole operation overrides.  As no
      arch is using it, this doesn't cause any actual difference.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Christoph Lameter <cl@linux.com>
      dcba4333
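The surviving (sized-only) override scheme works roughly as sketched below; this is a condensed illustration of the #ifndef pattern, not the verbatim kernel macros:

  /* An arch that defines raw_cpu_add_4() keeps its own version; the
   * generic fallback is compiled in only when the arch did not. */
  #ifndef raw_cpu_add_4
  #define raw_cpu_add_4(pcp, val)         raw_cpu_generic_to_op(pcp, val, +=)
  #endif

  #ifndef this_cpu_add_4
  #define this_cpu_add_4(pcp, val)        this_cpu_generic_to_op(pcp, val, +=)
  #endif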
  • percpu: move accessors from include/linux/percpu.h to percpu-defs.h · 9defda18
    Authored by Tejun Heo
include/linux/percpu-defs.h is going to host all accessors and operations
so that arch headers can make use of them too without worrying about
circular dependency through include/linux/percpu.h.
      
      This patch moves the following accessors from include/linux/percpu.h
      to include/linux/percpu-defs.h.
      
      * get/put_cpu_var()
      * get/put_cpu_ptr()
      * per_cpu_ptr()
      
This is pure reorganization.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Christoph Lameter <cl@linux.com>
      9defda18
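For reference, a small usage sketch of the moved accessors; the percpu variable and function here are hypothetical:

  #include <linux/percpu.h>

  static DEFINE_PER_CPU(int, my_hits);    /* hypothetical percpu variable */

  static void bump_my_hits(void)
  {
          int *p = &get_cpu_var(my_hits); /* disables preemption */

          (*p)++;
          put_cpu_var(my_hits);           /* re-enables preemption */
  }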
3. 15 May 2014 (1 commit)
4. 16 April 2014 (1 commit)
5. 08 April 2014 (2 commits)
  • percpu: add preemption checks to __this_cpu ops · 188a8140
    Authored by Christoph Lameter
      We define a check function in order to avoid trouble with the include
      files.  Then the higher level __this_cpu macros are modified to invoke
      the preemption check.
      
      [akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Christoph Lameter <cl@linux.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Tested-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      188a8140
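A condensed sketch of the shape of this change, assuming the upstream helper name __this_cpu_preempt_check(): the __this_cpu ops run the check and then delegate to the unchecked raw_cpu ops (not the verbatim macros):

  extern void __this_cpu_preempt_check(const char *op);

  #define __this_cpu_add(pcp, val)                                \
  do {                                                            \
          __this_cpu_preempt_check("add");                        \
          raw_cpu_add(pcp, val);                                  \
  } while (0)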
  • percpu: add raw_cpu_ops · b3ca1c10
    Authored by Christoph Lameter
      The kernel has never been audited to ensure that this_cpu operations are
      consistently used throughout the kernel.  The code generated in many
      places can be improved through the use of this_cpu operations (which
      uses a segment register for relocation of per cpu offsets instead of
      performing address calculations).
      
      The patch set also addresses various consistency issues in general with
      the per cpu macros.
      
A. The semantics of __this_cpu_ptr() differ from this_cpu_ptr() only
   in that checks are skipped, which is conventionally indicated by a
   raw_ prefix. So this patch set changes the places where __this_cpu_ptr()
   is used to raw_cpu_ptr().
      
      B. There has been the long term wish by some that __this_cpu operations
         would check for preemption. However, there are cases where preemption
         checks need to be skipped. This patch set adds raw_cpu operations that
         do not check for preemption and then adds preemption checks to the
         __this_cpu operations.
      
      C. The use of __get_cpu_var is always a reference to a percpu variable
         that can also be handled via a this_cpu operation. This patch set
         replaces all uses of __get_cpu_var with this_cpu operations.
      
      D. We can then use this_cpu RMW operations in various places replacing
         sequences of instructions by a single one.
      
E. The use of this_cpu operations throughout will allow arches other than
   x86 to implement optimized references and RMW operations to work with
   per cpu local data.
      
      F. The use of this_cpu operations opens up the possibility to
         further optimize code that relies on synchronization through
         per cpu data.
      
      The patch set works in a couple of stages:
      
      I. Patch 1 adds the additional raw_cpu operations and raw_cpu_ptr().
          Also converts the existing __this_cpu_xx_# primitive in the x86
          code to raw_cpu_xx_#.
      
      II. Patch 2-4 use the raw_cpu operations in places that would give
           us false positives once they are enabled.
      
      III. Patch 5 adds preemption checks to __this_cpu operations to allow
          checking if preemption is properly disabled when these functions
          are used.
      
      IV. Patches 6-20 are patches that simply replace uses of __get_cpu_var
         with this_cpu_ptr. They do not depend on any changes to the percpu
         code. No preemption tests are skipped if they are applied.
      
      V. Patches 21-46 are conversion patches that use this_cpu operations
         in various kernel subsystems/drivers or arch code.
      
      VI.  Patches 47/48 (not included in this series) remove no longer used
          functions (__this_cpu_ptr and __get_cpu_var).  These should only be
          applied after all the conversion patches have made it and after we
          have done additional passes through the kernel to ensure that none of
          the uses of these functions remain.
      
      This patch (of 46):
      
      The patches following this one will add preemption checks to __this_cpu
      ops so we need to have an alternative way to use this_cpu operations
      without preemption checks.
      
      raw_cpu_ops will be the basis for all other ops since these will be the
      operations that do not implement any checks.
      
Primitive operations are renamed by this patch from __this_cpu_xxx to
raw_cpu_xxx.
      
      Also change the uses of the x86 percpu primitives in preempt.h.
      These depend directly on asm/percpu.h (header #include nesting issue).
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Christoph Lameter <cl@linux.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Alex Shi <alex.shi@intel.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Bryan Wu <cooloney@gmail.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Cc: David Daney <david.daney@cavium.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Dimitri Sivanich <sivanich@sgi.com>
      Cc: Dipankar Sarma <dipankar@in.ibm.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@linux.intel.com>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Cc: Hedi Berriche <hedi@sgi.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Cc: Mike Travis <travis@sgi.com>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Robert Richter <rric@kernel.org>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Wim Van Sebroeck <wim@iguana.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b3ca1c10
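To summarize the API split that results from this series, a short hedged sketch; the percpu variable is hypothetical:

  #include <linux/percpu.h>
  #include <linux/preempt.h>

  static DEFINE_PER_CPU(unsigned long, evt_count);        /* hypothetical */

  static void count_event(void)
  {
          /* this_cpu_*: safe in any context; the op itself is made
           * atomic w.r.t. preemption/interrupts by the arch */
          this_cpu_inc(evt_count);
  }

  static void count_event_preempt_off(void)
  {
          preempt_disable();
          /* __this_cpu_*: caller guarantees preemption is off;
           * after this series the op verifies that guarantee */
          __this_cpu_inc(evt_count);
          preempt_enable();
  }

  static void count_event_unchecked(void)
  {
          /* raw_cpu_*: no checks at all, for the rare places where a
           * preemption warning would be a false positive */
          raw_cpu_inc(evt_count);
  }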
6. 24 January 2014 (1 commit)
7. 31 October 2013 (1 commit)
  • percpu: fix this_cpu_sub() subtrahend casting for unsigneds · bd09d9a3
    Authored by Greg Thelen
      this_cpu_sub() is implemented as negation and addition.
      
      This patch casts the adjustment to the counter type before negation to
      sign extend the adjustment.  This helps in cases where the counter type
      is wider than an unsigned adjustment.  An alternative to this patch is
      to declare such operations unsupported, but it seemed useful to avoid
      surprises.
      
      This patch specifically helps the following example:
        unsigned int delta = 1
        preempt_disable()
        this_cpu_write(long_counter, 0)
        this_cpu_sub(long_counter, delta)
        preempt_enable()
      
      Before this change long_counter on a 64 bit machine ends with value
      0xffffffff, rather than 0xffffffffffffffff.  This is because
      this_cpu_sub(pcp, delta) boils down to this_cpu_add(pcp, -delta),
      which is basically:
        long_counter = 0 + 0xffffffff
      
      Also apply the same cast to:
        __this_cpu_sub()
        __this_cpu_sub_return()
        this_cpu_sub_return()
      
All of percpu_test.ko passes, in particular the following cases which
previously failed:
      
        l -= ui_one;
        __this_cpu_sub(long_counter, ui_one);
        CHECK(l, long_counter, -1);
      
        l -= ui_one;
        this_cpu_sub(long_counter, ui_one);
        CHECK(l, long_counter, -1);
        CHECK(l, long_counter, 0xffffffffffffffff);
      
        ul -= ui_one;
        __this_cpu_sub(ulong_counter, ui_one);
        CHECK(ul, ulong_counter, -1);
        CHECK(ul, ulong_counter, 0xffffffffffffffff);
      
        ul = this_cpu_sub_return(ulong_counter, ui_one);
        CHECK(ul, ulong_counter, 2);
      
        ul = __this_cpu_sub_return(ulong_counter, ui_one);
        CHECK(ul, ulong_counter, 1);
Signed-off-by: Greg Thelen <gthelen@google.com>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      bd09d9a3
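A condensed sketch of the fix described above: negate only after casting the adjustment to the counter's type, so a 32-bit unsigned delta is widened correctly for a wider counter (simplified, not necessarily the verbatim macro):

  #define this_cpu_sub(pcp, val)  this_cpu_add((pcp), -(typeof(pcp))(val))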
8. 27 October 2013 (1 commit)
9. 06 October 2012 (1 commit)
10. 15 May 2012 (1 commit)
11. 05 March 2012 (1 commit)
12. 22 February 2012 (2 commits)
  • percpu: use raw_local_irq_* in _this_cpu op · e920d597
    Authored by Ming Lei
It doesn't make sense to trace irqs off or do irq-flags lock proving
inside 'this_cpu' operations, so replace local_irq_* with
raw_local_irq_* in the 'this_cpu' ops.
      
The replacement also fixes one lockdep warning [1]; see below:
      
In commit 933393f5 ("percpu: Remove irqsafe_cpu_xxx variants"),
local_irq_save/restore(flags) were added inside the this_cpu_inc
operation, so trace_hardirqs_off_caller will be called by
trace_hardirqs_on_caller directly because __debug_atomic_inc is
implemented as this_cpu_inc.  This may trigger the lockdep warning [1],
for example in the ARM scenario below:
      
      	kernel_thread_helper	/*irq disabled*/
      		->trace_hardirqs_on_caller	/*hardirqs_enabled was set*/
      			->trace_hardirqs_off_caller	/*hardirqs_enabled cleared*/
      				__this_cpu_add(redundant_hardirqs_on)
      			->trace_hardirqs_off_caller	/*irq disabled, so call here*/
      
The 'unannotated irqs-on' warning will be triggered somewhere because
irqs are enabled just after the irq trace in kernel_thread_helper.
      
      [1],
      [    0.162841] ------------[ cut here ]------------
      [    0.167694] WARNING: at kernel/lockdep.c:3493 check_flags+0xc0/0x1d0()
      [    0.174468] Modules linked in:
      [    0.177703] Backtrace:
      [    0.180328] [<c00171f0>] (dump_backtrace+0x0/0x110) from [<c0412320>] (dump_stack+0x18/0x1c)
      [    0.189086]  r6:c051f778 r5:00000da5 r4:00000000 r3:60000093
      [    0.195007] [<c0412308>] (dump_stack+0x0/0x1c) from [<c00410e8>] (warn_slowpath_common+0x54/0x6c)
      [    0.204223] [<c0041094>] (warn_slowpath_common+0x0/0x6c) from [<c0041124>] (warn_slowpath_null+0x24/0x2c)
      [    0.214111]  r8:00000000 r7:00000000 r6:ee069598 r5:60000013 r4:ee082000
      [    0.220825] r3:00000009
      [    0.223693] [<c0041100>] (warn_slowpath_null+0x0/0x2c) from [<c0088f38>] (check_flags+0xc0/0x1d0)
      [    0.232910] [<c0088e78>] (check_flags+0x0/0x1d0) from [<c008d348>] (lock_acquire+0x4c/0x11c)
      [    0.241668] [<c008d2fc>] (lock_acquire+0x0/0x11c) from [<c0415aa4>] (_raw_spin_lock+0x3c/0x74)
      [    0.250610] [<c0415a68>] (_raw_spin_lock+0x0/0x74) from [<c010a844>] (set_task_comm+0x20/0xc0)
      [    0.259521]  r6:ee069588 r5:ee0691c0 r4:ee082000
      [    0.264404] [<c010a824>] (set_task_comm+0x0/0xc0) from [<c0060780>] (kthreadd+0x28/0x108)
      [    0.272857]  r8:00000000 r7:00000013 r6:c0044a08 r5:ee0691c0 r4:ee082000
      [    0.279571] r3:ee083fe0
      [    0.282470] [<c0060758>] (kthreadd+0x0/0x108) from [<c0044a08>] (do_exit+0x0/0x6dc)
      [    0.290405]  r5:c0060758 r4:00000000
      [    0.294189] ---[ end trace 1b75b31a2719ed1c ]---
      [    0.299041] possible reason: unannotated irqs-on.
      [    0.303955] irq event stamp: 5
      [    0.307159] hardirqs last  enabled at (4): [<c001331c>] no_work_pending+0x8/0x2c
      [    0.314880] hardirqs last disabled at (5): [<c0089b08>] trace_hardirqs_on_caller+0x60/0x26c
      [    0.323547] softirqs last  enabled at (0): [<c003f754>] copy_process+0x33c/0xef4
      [    0.331207] softirqs last disabled at (0): [<  (null)>]   (null)
      [    0.337585] CPU0: thread -1, cpu 0, socket 0, mpidr 80000000
Acked-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
      e920d597
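A simplified sketch of what the generic helper looks like after the replacement; raw_local_irq_save/restore skip irq tracing and lockdep irq-state accounting inside the per-cpu op (condensed from the actual macro):

  #define _this_cpu_generic_to_op(pcp, val, op)                   \
  do {                                                            \
          unsigned long flags;                                    \
          raw_local_irq_save(flags);                              \
          *__this_cpu_ptr(&(pcp)) op val;                         \
          raw_local_irq_restore(flags);                           \
  } while (0)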
  • percpu: fix generic definition of __this_cpu_add_and_return() · 7d96b3e5
    Authored by Konstantin Khlebnikov
This patch adds the missing "__" to the function prefix.
Otherwise, on all architectures except x86, it expands to the
irq/preemption-safe variant _this_cpu_generic_add_return(), which does
extra irq-save/irq-restore.  The optimal generic implementation is
__this_cpu_generic_add_return().
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Acked-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
      7d96b3e5
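A condensed sketch of the corrected generic fallback; the unchecked "__" wrapper now expands to the unchecked generic helper rather than the irq-safe one (simplified, not the verbatim hunk):

  #ifndef __this_cpu_add_return
  #define __this_cpu_add_return(pcp, val) __this_cpu_generic_add_return(pcp, val)
  #endif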
13. 23 December 2011 (1 commit)
14. 04 June 2011 (1 commit)
15. 05 May 2011 (1 commit)
  • slub: Fix the lockless code on 32-bit platforms with no 64-bit cmpxchg · 30106b8c
    Authored by Thomas Gleixner
The SLUB allocator's use of the cmpxchg_double logic was wrong: it
actually needs the irq-safe one.
      
      That happens automatically when we use the native unlocked 'cmpxchg8b'
      instruction, but when compiling the kernel for older x86 CPUs that do
      not support that instruction, we fall back to the generic emulation
      code.
      
      And if you don't specify that you want the irq-safe version, the generic
      code ends up just open-coding the cmpxchg8b equivalent without any
      protection against interrupts or preemption.  Which definitely doesn't
      work for SLUB.
      
      This was reported by Werner Landgraf <w.landgraf@ru.ru>, who saw
      instability with his distro-kernel that was compiled to support pretty
      much everything under the sun.  Most big Linux distributions tend to
      compile for PPro and later, and would never have noticed this problem.
      
      This also fixes the prototypes for the irqsafe cmpxchg_double functions
      to use 'bool' like they should.
      
      [ Btw, that whole "generic code defaults to no protection" design just
        sounds stupid - if the code needs no protection, there is no reason to
        use "cmpxchg_double" to begin with.  So we should probably just remove
        the unprotected version entirely as pointless.   - Linus ]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reported-and-tested-by: werner <w.landgraf@ru.ru>
Acked-and-tested-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Tejun Heo <tj@kernel.org>
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1105041539050.3005@ionos
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      30106b8c
16. 28 February 2011 (1 commit)
  • percpu: Generic support for this_cpu_cmpxchg_double() · 7c334339
    Authored by Christoph Lameter
      Introduce this_cpu_cmpxchg_double().  this_cpu_cmpxchg_double() allows
      the comparison between two consecutive words and replaces them if
      there is a match.
      
      	bool this_cpu_cmpxchg_double(pcp1, pcp2,
      		old_word1, old_word2, new_word1, new_word2)
      
      this_cpu_cmpxchg_double does not return the old value (difficult since
      there are two words) but a boolean indicating if the operation was
      successful.
      
      The first percpu variable must be double word aligned!
      
      -tj: Updated to return bool instead of int, converted size check to
           BUILD_BUG_ON() instead of VM_BUG_ON() and other cosmetic changes.
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
      7c334339
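A hedged usage sketch of the new primitive; the struct, variable, and function names are hypothetical. As noted above, the two members must be adjacent and the pair double-word aligned:

  #include <linux/types.h>
  #include <linux/percpu.h>

  struct pcp_pair {
          void *ptr;
          unsigned long seq;
  } __aligned(2 * sizeof(void *));                /* double-word alignment */

  static DEFINE_PER_CPU(struct pcp_pair, my_pair);

  static bool update_pair(void *old_ptr, unsigned long old_seq,
                          void *new_ptr, unsigned long new_seq)
  {
          /* replaces both words only if both still match; returns success */
          return this_cpu_cmpxchg_double(my_pair.ptr, my_pair.seq,
                                         old_ptr, old_seq,
                                         new_ptr, new_seq);
  }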
17. 18 December 2010 (1 commit)
  • percpu: Generic this_cpu_cmpxchg() and this_cpu_xchg support · 2b712442
    Authored by Christoph Lameter
      Generic code to provide new per cpu atomic features
      
      	this_cpu_cmpxchg
      	this_cpu_xchg
      
Fallback is to functions that disable and enable interrupts to ensure
correct per cpu atomicity.
      
Fallback to regular cmpxchg and xchg is not possible since per cpu
atomic semantics include the guarantee that the current cpu's per cpu
data is accessed atomically.  Using regular cmpxchg or xchg would
require determining the address of the per cpu data first, and that
address calculation cannot be included atomically in the xchg or
cmpxchg without a segment override.
      
      tj: - Relocated new ops to conform better to the general organization.
          - This patch contains a trivial comment fix.
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
      2b712442
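A short hedged sketch of a lockless per-cpu state transition using the new ops; the variable and state values are hypothetical:

  #include <linux/types.h>
  #include <linux/percpu.h>

  static DEFINE_PER_CPU(int, worker_state);       /* hypothetical: 0 idle, 1 busy */

  static bool try_claim_this_cpu(void)
  {
          /* succeeds only if this CPU's state was still idle (0) */
          return this_cpu_cmpxchg(worker_state, 0, 1) == 0;
  }

  static int release_this_cpu(void)
  {
          /* unconditionally set idle and return the previous state */
          return this_cpu_xchg(worker_state, 0);
  }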
18. 17 December 2010 (2 commits)
19. 02 October 2010 (2 commits)
  • percpu: use percpu allocator on UP too · 9b8327bb
    Authored by Tejun Heo
      On UP, percpu allocations were redirected to kmalloc.  This has the
      following problems.
      
* For a certain amount of allocations (determined by
  PERCPU_DYNAMIC_EARLY_SLOTS and PERCPU_DYNAMIC_EARLY_SIZE), the percpu
  allocator can be used before the usual kernel memory allocator is
  brought online.  On SMP, this is used to initialize the kernel
  memory allocator.
      
* The percpu allocator honors alignment up to PAGE_SIZE but kmalloc()
  doesn't.  For example, workqueue makes use of larger alignments for
  cpu_workqueues.
      
Currently, users of percpu allocators need to handle UP differently,
which is somewhat fragile and ugly.  Other than a small amount of
memory, there isn't much to lose by enabling the percpu allocator on UP.
      It can simply use kernel memory based chunk allocation which was added
      for SMP archs w/o MMUs.
      
      This patch removes mm/percpu_up.c, builds mm/percpu.c on UP too and
      makes UP build use percpu-km.  As percpu addresses and kernel
      addresses are always identity mapped and static percpu variables don't
      need any special treatment, nothing is arch dependent and mm/percpu.c
      implements generic setup_per_cpu_areas() for UP.
Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      9b8327bb
  • percpu: reduce PCPU_MIN_UNIT_SIZE to 32k · a7b6b77b
    Authored by Tejun Heo
In preparation for enabling the percpu allocator on UP, reduce
PCPU_MIN_UNIT_SIZE to 32k.  On UP, the first chunk doesn't have to
include static percpu variables and the chunk size can be smaller, which
is important as the UP percpu allocator will use contiguous kernel
memory to populate chunks.
      
      PCPU_MIN_UNIT_SIZE also determines the maximum supported allocation
      size but 32k should still be enough.
Signed-off-by: Tejun Heo <tj@kernel.org>
      a7b6b77b
20. 21 September 2010 (1 commit)
21. 08 September 2010 (2 commits)
  • percpu: use percpu allocator on UP too · bbddff05
    Authored by Tejun Heo
      On UP, percpu allocations were redirected to kmalloc.  This has the
      following problems.
      
* For a certain amount of allocations (determined by
  PERCPU_DYNAMIC_EARLY_SLOTS and PERCPU_DYNAMIC_EARLY_SIZE), the percpu
  allocator can be used before the usual kernel memory allocator is
  brought online.  On SMP, this is used to initialize the kernel
  memory allocator.
      
* The percpu allocator honors alignment up to PAGE_SIZE but kmalloc()
  doesn't.  For example, workqueue makes use of larger alignments for
  cpu_workqueues.
      
Currently, users of percpu allocators need to handle UP differently,
which is somewhat fragile and ugly.  Other than a small amount of
memory, there isn't much to lose by enabling the percpu allocator on UP.
      It can simply use kernel memory based chunk allocation which was added
      for SMP archs w/o MMUs.
      
      This patch removes mm/percpu_up.c, builds mm/percpu.c on UP too and
      makes UP build use percpu-km.  As percpu addresses and kernel
      addresses are always identity mapped and static percpu variables don't
      need any special treatment, nothing is arch dependent and mm/percpu.c
      implements generic setup_per_cpu_areas() for UP.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
      bbddff05
  • percpu: reduce PCPU_MIN_UNIT_SIZE to 32k · 6abad5ac
    Authored by Tejun Heo
In preparation for enabling the percpu allocator on UP, reduce
PCPU_MIN_UNIT_SIZE to 32k.  On UP, the first chunk doesn't have to
include static percpu variables and the chunk size can be smaller, which
is important as the UP percpu allocator will use contiguous kernel
memory to populate chunks.
      
      PCPU_MIN_UNIT_SIZE also determines the maximum supported allocation
      size but 32k should still be enough.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Christoph Lameter <cl@linux.com>
      6abad5ac
22. 07 August 2010 (1 commit)
23. 28 June 2010 (2 commits)
  • percpu: allow limited allocation before slab is online · 099a19d9
    Authored by Tejun Heo
This patch updates the percpu allocator such that it can serve a limited
amount of allocations before slab comes online.  This is primarily to
allow slab to depend on a working percpu allocator.
      
      Two parameters, PERCPU_DYNAMIC_EARLY_SIZE and SLOTS, determine how
      much memory space and allocation map slots are reserved.  If this
      reserved area is exhausted, WARN_ON_ONCE() will trigger and allocation
      will fail till slab comes online.
      
      The following changes are made to implement early alloc.
      
      * pcpu_mem_alloc() now checks slab_is_available()
      
      * Chunks are allocated using pcpu_mem_alloc()
      
      * Init paths make sure ai->dyn_size is at least as large as
        PERCPU_DYNAMIC_EARLY_SIZE.
      
      * Initial alloc maps are allocated in __initdata and copied to
        kmalloc'd areas once slab is online.
Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      099a19d9
  • percpu: make @dyn_size always mean min dyn_size in first chunk init functions · 4ba6ce25
    Authored by Tejun Heo
In pcpu_build_alloc_info() and pcpu_embed_first_chunk(), @dyn_size was
ssize_t: -1 meant auto-size, 0 forced 0, and a positive value meant
minimum size.  There's no use case for forcing 0, and the upcoming
early alloc support always requires a non-zero dynamic size.  Make
@dyn_size always mean the minimum dyn_size.
      
While at it, make pcpu_build_alloc_info(), which doesn't have any
external caller, static as suggested by David Rientjes.
Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      4ba6ce25
24. 30 March 2010 (1 commit)
  • percpu: don't implicitly include slab.h from percpu.h · de380b55
    Authored by Tejun Heo
percpu.h has always included slab.h to get k[mz]alloc/free() for the
UP inline implementation.  Since percpu.h is used by very low level
headers including module.h and sched.h, this meant that a lot of files
unintentionally got slab.h inclusion.
      
      Lee Schermerhorn was trying to make topology.h use percpu.h and got
      bitten by this implicit inclusion.  The right thing to do is break
      this ultimately unnecessary dependency.  The previous patch added
      explicit inclusion of either gfp.h or slab.h to the source files using
      them.  This patch updates percpu.h such that slab.h is no longer
      included from percpu.h.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      de380b55
25. 29 March 2010 (1 commit)
26. 02 December 2009 (1 commit)
27. 25 November 2009 (1 commit)
  • percpu: Fix kdump failure if booted with percpu_alloc=page · 3b034b0d
    Authored by Vivek Goyal
o kdump functionality reserves a per cpu area at boot time and exports the
  physical address of that area to user space through a sys interface. This
  area stores some dump related information like cpu register states etc.
  at the time of crash.

o We were assuming that the per cpu area always comes from the linearly
  mapped memory region and using __pa() to determine the physical address.
  With percpu_alloc=page, the per cpu area can come from the vmalloc region
  as well, and __pa() breaks.

o This patch implements a new function to convert a per cpu address to
  a physical address.
      
      Before the patch, crash_notes addresses looked as follows.
      
      cpu0 60fffff49800
      cpu1 60fffff60800
      cpu2 60fffff77800
      
These are bogus physical addresses.
      
After the patch, the addresses are as follows.
      
      cpu0 13eb44000
      cpu1 13eb43000
      cpu2 13eb42000
      cpu3 13eb41000
      
These look fine. I have 4G of memory and /proc/iomem tells me the following.
      
      100000000-13fffffff : System RAM
      
      tj: * added missing asm/io.h include reported by Stephen Rothwell
          * repositioned per_cpu_ptr_phys() in percpu.c and added comment.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      3b034b0d
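A hedged sketch of how kdump-style code can derive the physical address after this change, assuming the per_cpu_ptr_to_phys() helper exported from mm/percpu.c; the struct and variable are hypothetical:

  #include <linux/types.h>
  #include <linux/percpu.h>

  struct crash_note { u32 data[64]; };                    /* hypothetical layout */
  static DEFINE_PER_CPU(struct crash_note, crash_notes);  /* hypothetical */

  static phys_addr_t crash_notes_paddr(int cpu)
  {
          /* valid whether the percpu area is linearly mapped or
           * vmalloc-backed, unlike a bare __pa() on the address */
          return per_cpu_ptr_to_phys(per_cpu_ptr(&crash_notes, cpu));
  }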
28. 29 October 2009 (3 commits)
  • percpu: make accessors check for percpu pointer in sparse · 545695fb
    Authored by Tejun Heo
The previous patch made sparse warn about percpu variables being used
directly without going through percpu accessors.  This patch
implements the other half - checking whether a non-percpu variable is
passed into percpu accessors.
Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      545695fb
  • percpu: add __percpu for sparse. · e0fdb0e0
    Authored by Rusty Russell
      We have to make __kernel "__attribute__((address_space(0)))" so we can
      cast to it.
      
      tj: * put_cpu_var() update.
      
          * Annotations added to dynamic allocator interface.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Tejun Heo <tj@kernel.org>
      e0fdb0e0
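A small sketch of what the annotation enables; the variable and functions are hypothetical. With __percpu in place, sparse can warn when such a pointer is dereferenced directly instead of through an accessor:

  #include <linux/percpu.h>
  #include <linux/errno.h>

  static int __percpu *hits;              /* annotated dynamic percpu pointer */

  static int init_hits(void)
  {
          hits = alloc_percpu(int);       /* returns an __percpu-annotated pointer */
          if (!hits)
                  return -ENOMEM;
          return 0;
  }

  static void bump_hits(void)
  {
          int cpu = get_cpu();

          /* the accessor strips the __percpu address space for sparse;
           * writing "*hits" directly here would be flagged */
          (*per_cpu_ptr(hits, cpu))++;
          put_cpu();
  }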
  • percpu: make access macros universal · f7b64fe8
    Authored by Tejun Heo
Now that the per_cpu__ prefix is gone, there's no distinction between
static and dynamic percpu variables.  Make get_cpu_var() take dynamic
percpu variables, and ensure that all macros have parentheses around
the parameter evaluation and evaluate the variable parameter only once,
so that any expression which evaluates to a percpu address can be used
safely.
Signed-off-by: Tejun Heo <tj@kernel.org>
      f7b64fe8