1. 29 Oct, 2009 5 commits
    • percpu: make percpu symbols in tracer unique · 9705f69e
      Tejun Heo authored
      This patch updates percpu related symbols in kernel tracer such that
      percpu symbols are unique and don't clash with local symbols.  This
      serves two purposes of decreasing the possibility of global percpu
      symbol collision and allowing dropping per_cpu__ prefix from percpu
      symbols.
      
      * kernel/trace/trace.c: s/max_data/max_tr_data/
      * kernel/trace/trace_hw_branches: s/tracer/hwb_tracer/, s/buffer/hwb_buffer/
      
      Partly based on Rusty Russell's "alloc_percpu: rename percpu vars
      which cause name clashes" patch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
    • percpu: make percpu symbols under kernel/ and mm/ unique · 1871e52c
      Tejun Heo authored
      This patch updates percpu related symbols under kernel/ and mm/ such
      that percpu symbols are unique and don't clash with local symbols.
      This serves two purposes of decreasing the possibility of global
      percpu symbol collision and allowing dropping per_cpu__ prefix from
      percpu symbols.
      
      * kernel/lockdep.c: s/lock_stats/cpu_lock_stats/
      
      * kernel/sched.c: s/init_rq_rt/init_rt_rq_var/	(any better idea?)
        		  s/sched_group_cpus/sched_groups/
      
      * kernel/softirq.c: s/ksoftirqd/run_ksoftirqd/a
      
      * kernel/softlockup.c: s/(*)_timestamp/softlockup_\1_ts/
        		       s/watchdog_task/softlockup_watchdog/
      		       s/timestamp/ts/ for local variables
      
      * kernel/time/timer_stats: s/lookup_lock/tstats_lookup_lock/
      
      * mm/slab.c: s/reap_work/slab_reap_work/
        	     s/reap_node/slab_reap_node/
      
      * mm/vmstat.c: local variable changed to avoid collision with vmstat_work
      
      Partly based on Rusty Russell's "alloc_percpu: rename percpu vars
      which cause name clashes" patch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: (slab/vmstat) Christoph Lameter <cl@linux-foundation.org>
      Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Nick Piggin <npiggin@suse.de>
    • percpu: remove some sparse warnings · 0f5e4816
      Tejun Heo authored
      Make the following changes to remove some sparse warnings.
      
      * Make DEFINE_PER_CPU_SECTION() declare __pcpu_unique_* before
        defining it.
      
      * Annotate pcpu_extend_area_map() that it is entered with pcpu_lock
        held, releases it and then reacquires it.
      
      * Make percpu related macros use unique nested variable names.
      
      * While at it, add pcpu prefix to __size_call[_return]() macros as
        to-be-implemented sparse annotations will add percpu specific stuff
        to these macros.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
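The point about unique nested variable names can be illustrated in userspace. The following is a minimal sketch, assuming GNU C statement expressions (gcc or clang); the macro names `ADD_ONE` and `TWICE_ADD_ONE` and their temporaries are hypothetical and are not the kernel's percpu macros. Each macro gives its temporary a macro-specific name, so nesting one inside the other neither shadows nor self-initializes.

```c
#include <assert.h>

/*
 * Hypothetical helper macros (not kernel code). Each one prefixes its
 * temporary with its own name. If both used a shared name such as
 * "tmp__", the nested use below would expand to
 *     int tmp__ = (tmp__);
 * which shadows the outer temporary and reads the new one before it is
 * initialized. Shadowing of this kind is what checkers tend to flag.
 */
#define ADD_ONE(x) \
	({ int add_one_tmp__ = (x); add_one_tmp__ + 1; })

#define TWICE_ADD_ONE(x) \
	({ int twice_tmp__ = (x); \
	   ADD_ONE(twice_tmp__) + ADD_ONE(twice_tmp__); })

/* Thin wrapper so the nested expansion can be called like a function. */
int twice_add_one(int x)
{
	return TWICE_ADD_ONE(x);
}

/* Small usage example: (3 + 1) + (3 + 1) == 8. */
void demo_usage_unique_names(void)
{
	assert(twice_add_one(3) == 8);
}
```

The cost of the convention is slightly longer names inside macros; the benefit is that any macro can safely be passed another macro's temporary as an argument.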
    • percpu: make alloc_percpu() handle array types · 64ef291f
      Tejun Heo authored
      alloc_percpu() couldn't handle array types like "int [100]" due to the
      way the return type was cast.  Fix it by using typeof() instead.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
      Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
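The casting problem described in this commit can be reproduced outside the kernel. Below is a minimal userspace sketch, assuming the GNU `__typeof__` extension; `alloc_zeroed()` is a hypothetical stand-in built on `calloc()` and is not the kernel's `alloc_percpu()`. Writing the cast as `(type *)` would expand to `(int [100] *)` for an array type, which is not valid C, whereas `__typeof__(type) *` yields a proper pointer-to-array type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical allocator macro (userspace stand-in, not kernel code).
 * The cast goes through __typeof__ so that array types such as
 * "int [100]" produce the pointer type "int (*)[100]".
 */
#define alloc_zeroed(type) \
	((__typeof__(type) *)calloc(1, sizeof(type)))

/* Allocate one "int [100]" object and report how many elements it has. */
size_t array_elems_via_typeof(void)
{
	int (*p)[100] = alloc_zeroed(int [100]);
	size_t n = 0;

	if (p) {
		n = sizeof(*p) / sizeof((*p)[0]);
		assert((*p)[99] == 0);	/* calloc() zero-fills the array */
		free(p);
	}
	return n;
}
```

Scalar and struct types keep working with the same macro, since `__typeof__(int) *` is simply `int *`.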
    • vmalloc: fix use of non-existent percpu variable in put_cpu_var() · 3f04ba85
      Tejun Heo authored
      vmalloc used non-existent percpu variable vmap_cpu_blocks instead of
      the intended vmap_block_queue.  This went unnoticed because
      put_cpu_var() didn't evaluate the parameter.  Fix it.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Nick Piggin <npiggin@suse.de>
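Why a wrong variable name could go unnoticed is easy to show with a pair of macros. This is a minimal userspace sketch under stated assumptions: `demo_put_unchecked()`, `demo_put_checked()`, `demo_preempt_depth` and `demo_queue` are hypothetical names, and the "checked" form is only one possible way to make a macro evaluate its argument, not necessarily what the kernel does.

```c
#include <assert.h>

static int demo_preempt_depth;	/* hypothetical stand-in for a preempt count */
static int demo_queue;		/* the variable that really exists */

#define demo_get(var)	(demo_preempt_depth++, &(var))

/* Ignores its argument entirely: any token sequence compiles. */
#define demo_put_unchecked(var) \
	do { demo_preempt_depth--; } while (0)

/* Takes the address of its argument, so the name must exist. */
#define demo_put_checked(var) \
	do { (void)&(var); demo_preempt_depth--; } while (0)

int demo_unchecked(void)
{
	(*demo_get(demo_queue))++;
	/* "no_such_variable" is never declared, yet this still builds. */
	demo_put_unchecked(no_such_variable);
	return demo_preempt_depth;
}

int demo_checked(void)
{
	(*demo_get(demo_queue))++;
	/* Replacing demo_queue with a misspelled name here fails to compile. */
	demo_put_checked(demo_queue);
	assert(demo_preempt_depth == 0);
	return demo_preempt_depth;
}
```

Both functions behave the same at run time; the difference is only that the checked form turns a misspelled name into a build error.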
  2. 13 Oct, 2009 1 commit
  3. 12 Oct, 2009 4 commits
  4. 03 Oct, 2009 11 commits
    • this_cpu: Use this_cpu operations in RCU · e800879d
      Christoph Lameter authored
      RCU does not do dynamic allocations but it increments per cpu variables
      a lot. These instructions result in a move to a register and then back
      to memory. This patch will make it use the inc/dec instructions on x86
      that do not need a register.
      Acked-by: Tejun Heo <tj@kernel.org>
      Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • this_cpu: Use this_cpu ops for VM statistics · 4dac3e98
      Christoph Lameter authored
      Using per cpu atomics for the vm statistics reduces their overhead.
      And in the case of x86 we are guaranteed that they will never race even
      in the lax form used for vm statistics.
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • this_cpu: Use this_cpu_ptr in crypto subsystem · 0b44f486
      Christoph Lameter authored
      Just a slight optimization that removes one array lookup.
      The processor number is needed for other things as well so the
      get/put_cpu cannot be removed.
      Acked-by: Tejun Heo <tj@kernel.org>
      Cc: Huang Ying <ying.huang@intel.com>
      Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • this_cpu: xfs_icsb_modify_counters does not need "cpu" variable · 7a9e02d6
      Christoph Lameter authored
      The xfs_icsb_modify_counters() function no longer needs the cpu variable
      if we use this_cpu_ptr() and we can get rid of get/put_cpu().
      Acked-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Acked-by: Olaf Weber <olaf@sgi.com>
      Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • this_cpu: Eliminate get/put_cpu · e7dcaa47
      Christoph Lameter authored
      There are cases where we can use this_cpu_ptr and as the result
      of using this_cpu_ptr() we no longer need to determine the
      currently executing cpu.
      
      In those places no get/put_cpu combination is needed anymore.
      The local cpu variable can be eliminated.
      
      Preemption still needs to be disabled and enabled since the
      modifications of the per cpu variables are not atomic. There may
      be multiple per cpu variables modified and those must all
      be from the same processor.
      Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
      Acked-by: Dan Williams <dan.j.williams@intel.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      cc: Eric Biederman <ebiederm@aristanetworks.com>
      cc: Stephen Hemminger <shemminger@vyatta.com>
      cc: David L Stevens <dlstevens@us.ibm.com>
      Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • this_cpu: Straight transformations · ca0c9584
      Christoph Lameter authored
      Use this_cpu_ptr and __this_cpu_ptr in locations where straight
      transformations are possible because per_cpu_ptr is used with
      either smp_processor_id() or raw_smp_processor_id().
      
      cc: David Howells <dhowells@redhat.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      cc: Ingo Molnar <mingo@elte.hu>
      cc: Rusty Russell <rusty@rustcorp.com.au>
      cc: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
      Signed-off-by: Tejun Heo <tj@kernel.org>
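The "straight transformation" replaces an array lookup indexed by the current CPU number with a direct use of the current CPU's offset. The following is a minimal userspace simulation, not kernel code: the per-CPU areas, the offset table, and the "current offset" variable (standing in for an arch register or fast memory location) are all hypothetical, as are the `demo_*` names.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NR_CPUS 4

struct demo_stat { long events; };

/* One copy of the data per simulated CPU. */
static struct demo_stat demo_areas[DEMO_NR_CPUS];
/* Byte offset of each simulated CPU's copy, relative to copy 0. */
static size_t demo_offset_table[DEMO_NR_CPUS];
static int demo_current_cpu;		/* simulated smp_processor_id() */
static size_t demo_current_offset;	/* simulated per-CPU base register */

void demo_init(void)
{
	for (int cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
		demo_offset_table[cpu] = (size_t)cpu * sizeof(struct demo_stat);
}

/* Pretend the code is now running on the given CPU. */
void demo_run_on(int cpu)
{
	demo_current_cpu = cpu;
	demo_current_offset = demo_offset_table[cpu];
}

/* Old pattern: look the offset up in a table by CPU number. */
#define demo_per_cpu_ptr(base, cpu) \
	((__typeof__(base))((char *)(base) + demo_offset_table[(cpu)]))

/* New pattern: add the already-available offset of the current CPU. */
#define demo_this_cpu_ptr(base) \
	((__typeof__(base))((char *)(base) + demo_current_offset))

long demo_count_event(void)
{
	struct demo_stat *old_way =
		demo_per_cpu_ptr(&demo_areas[0], demo_current_cpu);
	struct demo_stat *new_way = demo_this_cpu_ptr(&demo_areas[0]);

	assert(old_way == new_way);	/* both name the same per-CPU copy */
	return ++new_way->events;
}
```

In this simulation both forms compute the same address; the second simply skips the table lookup, which is the saving the commit message refers to.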
    • C
    • this_cpu: Use this_cpu operations for NFS statistics · fce22848
      Christoph Lameter authored
      Simplify NFS statistics and allow the use of optimized
      arch instructions.
      Acked-by: Tejun Heo <tj@kernel.org>
      CC: Trond Myklebust <trond.myklebust@fys.uio.no>
      Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • this_cpu: Use this_cpu operations for SNMP statistics · 4eb41d10
      Christoph Lameter authored
      SNMP statistic macros can be significantly simplified.
      This will also reduce code size if the arch supports these operations
      in hardware.
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • this_cpu: Implement X86 optimized this_cpu operations · 30ed1a79
      Christoph Lameter authored
      Basically the existing percpu ops can be used for this_cpu variants that allow
      operations also on dynamically allocated percpu data. However, we do not pass a
      reference to a percpu variable in. Instead a dynamically or statically
      allocated percpu variable is provided.
      
      Preempt, the non preempt and the irqsafe operations generate the same code.
      It will always be possible to have the required per cpu atomicity in a single
      RMW instruction with segment override on x86.
      
      64 bit this_cpu operations are not supported on 32 bit.
      Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • this_cpu: Introduce this_cpu_ptr() and generic this_cpu_* operations · 7340a0b1
      Christoph Lameter authored
      This patch introduces two things: First this_cpu_ptr and then per cpu
      atomic operations.
      
      this_cpu_ptr
      ------------
      
      A common operation when dealing with cpu data is to get the instance of the
      cpu data associated with the currently executing processor. This can be
      optimized by
      
      this_cpu_ptr(xx) = per_cpu_ptr(xx, smp_processor_id()).
      
      The problem with per_cpu_ptr(x, smp_processor_id()) is that it requires
      an array lookup to find the offset for the cpu. Processors typically
      have the offset for the current cpu area in some kind of (arch dependent)
      efficiently accessible register or memory location.
      
      We can use that instead of doing the array lookup to speed up the
      determination of the address of the percpu variable. This is particularly
      significant because these lookups occur in performance critical paths
      of the core kernel. this_cpu_ptr() can avoid memory accesses and
      
      this_cpu_ptr comes in two flavors. The preemption context matters since we
      are referring to the currently executing processor. In many cases we must
      ensure that the processor does not change while a code segment is executed.
      
      __this_cpu_ptr 	-> Do not check for preemption context
      this_cpu_ptr	-> Check preemption context
      
      The parameter to these operations is a per cpu pointer. This can be the
      address of a statically defined per cpu variable (&per_cpu_var(xxx)) or
      the address of a per cpu variable allocated with the per cpu allocator.
      
      per cpu atomic operations: this_cpu_*(var, val)
      -----------------------------------------------
      this_cpu_* operations (like this_cpu_add(struct->y, value)) operate on
      arbitrary scalars that are members of structures allocated with the new
      per cpu allocator. They can also operate on static per_cpu variables
      if they are passed to per_cpu_var() (See patch to use this_cpu_*
      operations for vm statistics).
      
      These operations are guaranteed to be atomic vs preemption when modifying
      the scalar. The calculation of the per cpu offset is also guaranteed to
      be atomic at the same time. This means that a this_cpu_* operation can be
      safely used to modify a per cpu variable in a context where interrupts are
      enabled and preemption is allowed. Many architectures can perform such
      a per cpu atomic operation with a single instruction.
      
      Note that the atomicity here is different from regular atomic operations.
      Atomicity is only guaranteed for data accessed from the currently executing
      processor. Modifications from other processors are still possible. There
      must be other guarantees that the per cpu data is not modified from another
      processor when using these instructions. The per cpu atomicity is created
      by the fact that the processor either executes an instruction or not.
      Embedded in the instruction is the relocation of the per cpu address to
      the area reserved for the current processor and the RMW action. Therefore
      interrupts or preemption cannot occur in the midst of this processing.
      
      Generic fallback functions are used if an arch does not define optimized
      this_cpu operations. The functions also come in the two flavors used
      for this_cpu_ptr().
      
      The first parameter is a scalar that is a member of a structure allocated
      through allocpercpu or a per cpu variable (use per_cpu_var(xxx)). The
      operations are similar to what percpu_add() and friends do.
      
      this_cpu_read(scalar)
      this_cpu_write(scalar, value)
      this_cpu_add(scalar, value)
      this_cpu_sub(scalar, value)
      this_cpu_inc(scalar)
      this_cpu_dec(scalar)
      this_cpu_and(scalar, value)
      this_cpu_or(scalar, value)
      this_cpu_xor(scalar, value)
      
      Arch code can override the generic functions and provide optimized atomic
      per cpu operations. These atomic operations must provide both the relocation
      (x86 does it through a segment override) and the operation on the data in a
      single instruction. Otherwise preempt needs to be disabled and there is no
      gain from providing arch implementations.
      
      A third variant is provided prefixed by irqsafe_. These variants are safe
      against hardware interrupts on the *same* processor (all per cpu atomic
      primitives are *always* *only* providing safety for code running on the
      *same* processor!). The increment needs to be implemented by the hardware
      in such a way that it is a single RMW instruction that is either processed
      before or after an interrupt.
      
      cc: David Howells <dhowells@redhat.com>
      cc: Ingo Molnar <mingo@elte.hu>
      cc: Rusty Russell <rusty@rustcorp.com.au>
      cc: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
      Signed-off-by: Tejun Heo <tj@kernel.org>
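The generic fallback described in this commit message (disable preemption, modify the current CPU's copy, re-enable preemption) can be sketched as a userspace simulation. Everything here is an assumption for illustration: the preempt counter, the `demo_cpu` variable, and the `demo_this_cpu_add()` macro are hypothetical and are not the kernel's implementation. An arch such as x86 would, per the message, replace the three steps with a single read-modify-write instruction.

```c
#include <assert.h>

#define DEMO_CPUS 2

static long demo_hits[DEMO_CPUS];	/* one slot per simulated CPU */
static int demo_cpu;			/* simulated current CPU */
static int demo_preempt_count;		/* simulated preemption counter */

static void demo_preempt_disable(void) { demo_preempt_count++; }
static void demo_preempt_enable(void)  { demo_preempt_count--; }

/*
 * Fallback shape: the CPU must not change between computing the
 * address of the slot and updating it, so the update is bracketed by
 * disable/enable. The assert documents that invariant in the sketch.
 */
#define demo_this_cpu_add(arr, val)			\
	do {						\
		demo_preempt_disable();			\
		assert(demo_preempt_count > 0);		\
		(arr)[demo_cpu] += (val);		\
		demo_preempt_enable();			\
	} while (0)

/* Run "times" increments as if executing on the given simulated CPU. */
long demo_count_on(int cpu, int times)
{
	demo_cpu = cpu;
	for (int i = 0; i < times; i++)
		demo_this_cpu_add(demo_hits, 1);
	return demo_hits[cpu];
}
```

As the message stresses, this kind of operation only protects against preemption on the same processor; nothing in the sketch (or in the real primitives) guards against another CPU writing to the same slot.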
  5. 02 Oct, 2009 19 commits