1. 24 5月, 2008 1 次提交
    • M
      x86: Add performance variants of cpumask operators · 41df0d61
      Mike Travis 提交于
        * Increase performance for systems with large count NR_CPUS by limiting
          the range of the cpumask operators that loop over the bits in a cpumask_t
          variable.  This removes a large amount of wasted cpu cycles.
      
        * Add performance variants of the cpumask operators:
      
          int cpus_weight_nr(mask)	     Same using nr_cpu_ids instead of NR_CPUS
          int first_cpu_nr(mask)	     Number lowest set bit, or nr_cpu_ids
          int next_cpu_nr(cpu, mask)	     Next cpu past 'cpu', or nr_cpu_ids
          for_each_cpu_mask_nr(cpu, mask)  for-loop cpu over mask using nr_cpu_ids
      
        * Modify following to use performance variants:
      
          #define num_online_cpus()	cpus_weight_nr(cpu_online_map)
          #define num_possible_cpus()	cpus_weight_nr(cpu_possible_map)
          #define num_present_cpus()	cpus_weight_nr(cpu_present_map)
      
          #define for_each_possible_cpu(cpu) for_each_cpu_mask_nr((cpu), ...)
          #define for_each_online_cpu(cpu)   for_each_cpu_mask_nr((cpu), ...)
          #define for_each_present_cpu(cpu)  for_each_cpu_mask_nr((cpu), ...)
      
        * Comment added to include/linux/cpumask.h:
      
          Note: The alternate operations with the suffix "_nr" are used
      	  to limit the range of the loop to nr_cpu_ids instead of
      	  NR_CPUS when NR_CPUS > 64 for performance reasons.
      	  If NR_CPUS is <= 64 then most assembler bitmask
      	  operators execute faster with a constant range, so
      	  the operator will continue to use NR_CPUS.
      
      	  Another consideration is that nr_cpu_ids is initialized
      	  to NR_CPUS and isn't lowered until the possible cpus are
      	  discovered (including any disabled cpus).  So early uses
      	  will span the entire range of NR_CPUS.
      
          (The net effect is that for systems with 64 or less CPU's there are no
           functional changes.)
      
      For inclusion into sched-devel/latest tree.
      
      Based on:
      	git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
          +   sched-devel/latest  .../mingo/linux-2.6-sched-devel.git
      
      Cc: Paul Jackson <pj@sgi.com>
      Cc: Christoph Lameter <clameter@sgi.com>
      Reviewed-by: NPaul Jackson <pj@sgi.com>
      Reviewed-by: NChristoph Lameter <clameter@sgi.com>
      Signed-off-by: NMike Travis <travis@sgi.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      41df0d61
  2. 08 5月, 2007 1 次提交
  3. 21 2月, 2007 1 次提交
  4. 21 10月, 2006 1 次提交
    • A
      [PATCH] highest_possible_node_id() linkage fix · 6220ec78
      Andrew Morton 提交于
      Qooting Adrian:
      
      - net/sunrpc/svc.c uses highest_possible_node_id()
      
      - include/linux/nodemask.h says highest_possible_node_id() is
        out-of-line #if MAX_NUMNODES > 1
      
      - the out-of-line highest_possible_node_id() is in lib/cpumask.c
      
      - lib/Makefile: lib-$(CONFIG_SMP) += cpumask.o
        CONFIG_ARCH_DISCONTIGMEM_ENABLE=y, CONFIG_SMP=n, CONFIG_SUNRPC=y
      
      -> highest_possible_node_id() is used in net/sunrpc/svc.c
         CONFIG_NODES_SHIFT defined and > 0
      
      -> include/linux/numa.h: MAX_NUMNODES > 1
      
      -> compile error
      
      The bug is not present on architectures where ARCH_DISCONTIGMEM_ENABLE
      depends on NUMA (but m32r isn't the only affected architecture).
      
      So move the function into page_alloc.c
      
      Cc: Adrian Bunk <bunk@stusta.de>
      Cc: Paul Jackson <pj@sgi.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      6220ec78
  5. 02 10月, 2006 1 次提交
  6. 26 3月, 2006 4 次提交