1. 25 Feb 2014, 4 commits
    • smp: Rename __smp_call_function_single() to smp_call_function_single_async() · c46fff2a
      Authored by Frederic Weisbecker
      The name __smp_call_function_single() doesn't tell much about the
      properties of this function, especially when compared to
      smp_call_function_single().
      
      The comments above the implementation are also misleading. The main
      point of this function is not the ability to embed the csd in an
      object; that is a requirement which follows from the purpose of this
      function, namely raising an IPI asynchronously.
      
      As such, it can be called with interrupts disabled. This flexibility
      comes at a cost for the caller, who must then serialize the IPIs
      issued on this csd.
      
      Let's rename the function and enhance the comments so that they
      reflect these properties.
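      
      For illustration, a minimal usage sketch (the object, handler and helper
      names below are hypothetical; only smp_call_function_single_async() and
      struct call_single_data come from the API described above):
      
          struct my_work {
                  struct call_single_data csd;  /* must stay valid until the IPI ran */
                  int payload;
          };
      
          static void my_ipi_handler(void *info)
          {
                  struct my_work *w = info;     /* runs on the target CPU */
                  pr_info("payload=%d\n", w->payload);
          }
      
          static void kick_cpu(struct my_work *w, int cpu)
          {
                  w->csd.func = my_ipi_handler;
                  w->csd.info = w;
                  /* may be called with irqs disabled; the caller must serialize
                   * successive IPIs on w->csd */
                  smp_call_function_single_async(cpu, &w->csd);
          }
      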
      Suggested-by: Christoph Hellwig <hch@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jens Axboe <axboe@fb.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • smp: Remove wait argument from __smp_call_function_single() · fce8ad15
      Authored by Frederic Weisbecker
      The main point of calling __smp_call_function_single() is to send
      an IPI in a purely asynchronous way. By embedding a csd in an object,
      a caller can send the IPI without waiting for a previous one to
      complete, as smp_call_function_single() for example requires. As such,
      sending this kind of IPI can be safe even when irqs are disabled.
      
      This flexibility comes at the expense of the caller, who then needs to
      synchronize the csd lifecycle itself and make sure that IPIs on a
      single csd are serialized.
      
      This is how __smp_call_function_single() behaves when wait = 0, and
      this use case is the relevant one.
      
      Now there doesn't seem to be any use case with wait = 1 that can't be
      covered by smp_call_function_single() instead, which is safer. Let's
      look at the two possible scenarios:
      
      1) The user calls __smp_call_function_single(wait = 1) on a csd embedded
         in an object. It looks like a nice and convenient pattern at first
         sight because we can then retrieve the object from the IPI handler easily.
      
         But actually it is a waste of memory space in the object since the csd
         can be allocated from the stack by smp_call_function_single(wait = 1)
         and the object can be passed as the IPI argument.
      
         Besides that, embedding the csd in an object is more error prone
         because the caller must take care of the serialization of the IPIs
         for this csd.
      
      2) The user calls __smp_call_function_single(wait = 1) on a csd that
         is allocated on the stack. It's ok, but smp_call_function_single()
         can do it as well and it already takes care of the allocation on the
         stack. Again it's simpler and less error prone.
      
      Therefore, using the underscore-prefixed API version with wait = 1
      is a bad pattern and a sign that the caller can do something safer and
      simpler.
      
      There was a single user of that, which has just been converted.
      So let's remove this option to discourage further users.
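      
      For illustration, the preferred synchronous pattern looks roughly like
      this (object and function names are hypothetical; smp_call_function_single()
      allocates the csd on its own stack when wait = 1):
      
          static void inspect_object(void *info)
          {
                  struct my_object *obj = info;   /* hypothetical object type */
                  /* ... read per-cpu state related to obj on the target CPU ... */
          }
      
          static int query_cpu(struct my_object *obj, int cpu)
          {
                  /* wait = 1: returns once the handler has run on 'cpu' */
                  return smp_call_function_single(cpu, inspect_object, obj, 1);
          }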
      
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jens Axboe <axboe@fb.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • smp: Teach __smp_call_function_single() to check for offline cpus · 08eed44c
      Authored by Jan Kara
      Align __smp_call_function_single() with smp_call_function_single() so
      that it also checks whether the requested cpu is still online.
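      
      The added check presumably amounts to something like the following
      sketch (a hypothetical wrapper, not the literal patch; -ENXIO is assumed
      here because that is what smp_call_function_single() reports for offline
      CPUs):
      
          static int checked_call_single(int cpu, struct call_single_data *csd)
          {
                  if (cpu >= nr_cpu_ids || !cpu_online(cpu))
                          return -ENXIO;        /* CPU went away: refuse the IPI */
                  __smp_call_function_single(cpu, csd, 0);
                  return 0;
          }
      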
      Signed-off-by: Jan Kara <jack@suse.cz>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jens Axboe <axboe@fb.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • smp: Remove unused list_head from csd · 0ebeb79c
      Authored by Jan Kara
      Now that we have gotten rid of all the remaining code which fiddled with
      csd.list, let's remove it.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jens Axboe <axboe@fb.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
  2. 11 Feb 2014, 1 commit
  3. 31 Jan 2014, 1 commit
  4. 15 Nov 2013, 2 commits
  5. 25 Sep 2013, 1 commit
    • watchdog: update watchdog_thresh properly · 9809b18f
      Authored by Michal Hocko
      watchdog_thresh controls how often the NMI perf event counter checks
      the per-cpu hrtimer_interrupts counter and blows up if the counter
      hasn't changed since the last check.  The counter is updated by the
      per-cpu watchdog_hrtimer hrtimer, which is scheduled with a period of
      2/5 of watchdog_thresh, guaranteeing that the hrtimer fires twice per
      main period.  Both the hrtimer and the perf event are started together
      when the watchdog is enabled.
      
      So far so good.  But...
      
      But what happens when watchdog_thresh is updated from the sysctl handler?
      
      proc_dowatchdog will set a new sampling period and the hrtimer callback
      (watchdog_timer_fn) will use the new value in the next round.  The
      problem, however, is that nobody tells the perf event that the sampling
      period has changed, so it keeps ticking with the period configured when
      it was set up.
      
      This might result in an ear-ripping dissonance between the perf and
      hrtimer parts if watchdog_thresh is increased.  And even worse, it might
      lead to KABOOM if the watchdog is configured to panic on such a spurious
      lockup.
      
      This patch fixes the issue by updating both the NMI perf event counter
      and the hrtimers if the threshold value has changed.
      
      The NMI one is disabled and then reinitialized from scratch.  This has
      an unpleasant side effect: the allocation of the new event could
      theoretically fail, in which case the hard lockup detector would be
      disabled for such cpus.  On the other hand, such a memory allocation
      failure is very unlikely because the original event is deallocated
      right before.
      
      It would be much nicer if we could just change the perf event period,
      but there doesn't seem to be any API to do that right now.  It is also
      unfortunate that perf_event_alloc uses GFP_KERNEL allocation
      unconditionally, so we cannot use on_each_cpu() and do the same thing
      from the per-cpu context.  The update from the current CPU should be
      safe because perf_event_disable removes the event atomically before it
      clears the per-cpu watchdog_ev, so it cannot change anything under a
      running handler's feet.
      
      The hrtimer is simply restarted (thanks to Don Zickus who has pointed
      this out) if it is queued, because we cannot rely on it firing and
      adopting the new sampling period before a new NMI event triggers (when
      the threshold is decreased).
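      
      The update path therefore has roughly the following shape (a sketch with
      approximate helper names modelled on kernel/watchdog.c; the real patch
      restarts the hrtimer on the remote CPU and handles error returns):
      
          static void update_timers_all_cpus(void)
          {
                  int cpu;
      
                  get_online_cpus();
                  for_each_online_cpu(cpu) {
                          watchdog_nmi_disable(cpu);     /* release the old perf event   */
                          restart_watchdog_hrtimer(cpu); /* re-queue with the new period */
                          watchdog_nmi_enable(cpu);      /* re-create event, new period  */
                  }
                  put_online_cpus();
          }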
      
      [akpm@linux-foundation.org: the UP version of __smp_call_function_single ended up in the wrong place]
      Signed-off-by: Michal Hocko <mhocko@suse.cz>
      Acked-by: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Fabio Estevam <festevam@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  6. 12 Sep 2013, 3 commits
  7. 04 Jul 2013, 1 commit
  8. 15 Jun 2013, 1 commit
    • smp.h: Use local_irq_{save,restore}() in !SMP version of on_each_cpu(). · f21afc25
      Authored by David Daney
      Thanks to commit f91eb62f ("init: scream bloody murder if interrupts
      are enabled too early"), "bloody murder" is now being screamed.
      
      With a MIPS OCTEON config, we use on_each_cpu() in our
      irq_chip.irq_bus_sync_unlock() function.  This gets called early as a
      result of the time_init() call.  Because the !SMP version of
      on_each_cpu() unconditionally enables irqs, we get:
      
          WARNING: at init/main.c:560 start_kernel+0x250/0x410()
          Interrupts were enabled early
          CPU: 0 PID: 0 Comm: swapper Not tainted 3.10.0-rc5-Cavium-Octeon+ #801
          Call Trace:
            show_stack+0x68/0x80
            warn_slowpath_common+0x78/0xb0
            warn_slowpath_fmt+0x38/0x48
            start_kernel+0x250/0x410
      
      Suggested fix: Do what we already do in the SMP version of
      on_each_cpu(), and use local_irq_save/local_irq_restore.  Because we
      need a flags variable, make it a static inline to avoid namespace
      issues.
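      
      The resulting !SMP helper presumably looks like this sketch (modulo the
      exact definition in include/linux/smp.h):
      
          /* UP version: run func locally with irqs saved/restored, instead of
           * unconditionally enabling them afterwards. */
          static inline int on_each_cpu(smp_call_func_t func, void *info, int wait)
          {
                  unsigned long flags;
      
                  local_irq_save(flags);
                  func(info);
                  local_irq_restore(flags);
                  return 0;
          }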
      
      [ Change from v1: Convert on_each_cpu to a static inline function, add
        #include <linux/irqflags.h> to avoid build breakage on some files.
      
        on_each_cpu_mask() and on_each_cpu_cond() suffer the same problem as
        on_each_cpu(), but they are not causing !SMP bugs for me, so I will
        defer changing them to a less urgent patch. ]
      Signed-off-by: David Daney <david.daney@cavium.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  9. 01 May 2013, 1 commit
  10. 22 Feb 2013, 1 commit
    • smp: make smp_call_function_many() use logic similar to smp_call_function_single() · 9a46ad6d
      Authored by Shaohua Li
      I'm testing a swapout workload on a two-socket Xeon machine.  The
      workload has 10 threads, each thread sequentially accessing a separate
      memory region.  TLB flush overhead is very big in this workload.  For
      each page, page reclaim needs to move it off the active lru list and
      then unmap it.  Both need a TLB flush.  And this is a multithreaded
      workload, so TLB flushes happen on 10 CPUs.  On x86, TLB flush uses the
      generic smp_call_function machinery, so this workload stresses
      smp_call_function_many heavily.
      
      Without patch, perf shows:
      +  24.49%  [k] generic_smp_call_function_interrupt
      -  21.72%  [k] _raw_spin_lock
         - _raw_spin_lock
            + 79.80% __page_check_address
            + 6.42% generic_smp_call_function_interrupt
            + 3.31% get_swap_page
            + 2.37% free_pcppages_bulk
            + 1.75% handle_pte_fault
            + 1.54% put_super
            + 1.41% grab_super_passive
            + 1.36% __swap_duplicate
            + 0.68% blk_flush_plug_list
            + 0.62% swap_info_get
      +   6.55%  [k] flush_tlb_func
      +   6.46%  [k] smp_call_function_many
      +   5.09%  [k] call_function_interrupt
      +   4.75%  [k] default_send_IPI_mask_sequence_phys
      +   2.18%  [k] find_next_bit
      
      swapout throughput is around 1300M/s.
      
      With the patch, perf shows:
      -  27.23%  [k] _raw_spin_lock
         - _raw_spin_lock
            + 80.53% __page_check_address
            + 8.39% generic_smp_call_function_single_interrupt
            + 2.44% get_swap_page
            + 1.76% free_pcppages_bulk
            + 1.40% handle_pte_fault
            + 1.15% __swap_duplicate
            + 1.05% put_super
            + 0.98% grab_super_passive
            + 0.86% blk_flush_plug_list
            + 0.57% swap_info_get
      +   8.25%  [k] default_send_IPI_mask_sequence_phys
      +   7.55%  [k] call_function_interrupt
      +   7.47%  [k] smp_call_function_many
      +   7.25%  [k] flush_tlb_func
      +   3.81%  [k] _raw_spin_lock_irqsave
      +   3.78%  [k] generic_smp_call_function_single_interrupt
      
      swapout throughput is around 1400M/s.  So there is around a 7%
      improvement, and total cpu utilization doesn't change.
      
      Without the patch, cfd_data is shared by all CPUs.
      generic_smp_call_function_interrupt does read/write cfd_data several
      times, which creates a lot of cache ping-pong.  With the patch, the data
      becomes per-cpu.  The ping-pong is avoided.  And from the perf data,
      this doesn't make the call_single_queue lock contended.
      
      Next step is to remove generic_smp_call_function_interrupt() from arch
      code.
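      
      For reference, the per-cpu layout this moves to is roughly the following
      (field names approximate; see kernel/smp.c for the exact structure):
      
          /* one instance per sending CPU, with one csd slot per target CPU,
           * so concurrent senders no longer bounce a shared cacheline */
          struct call_function_data {
                  struct call_single_data __percpu *csd;
                  cpumask_var_t cpumask;
          };
          static DEFINE_PER_CPU_SHARED_ALIGNED(struct call_function_data, cfd_data);
      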
      Signed-off-by: Shaohua Li <shli@fusionio.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  11. 05 Jun 2012, 2 commits
  12. 08 May 2012, 1 commit
  13. 26 Apr 2012, 1 commit
    • smp: Add task_struct argument to __cpu_up() · 8239c25f
      Authored by Thomas Gleixner
      Preparatory patch to make the idle thread allocation for secondary
      cpus generic.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Hirokazu Takata <takata@linux-m32r.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: James E.J. Bottomley <jejb@parisc-linux.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: x86@kernel.org
      Link: http://lkml.kernel.org/r/20120420124556.964170564@linutronix.de
  14. 29 Mar 2012, 2 commits
    • smp: add func to IPI cpus based on parameter func · b3a7e98e
      Authored by Gilad Ben-Yossef
      Add the on_each_cpu_cond() function that wraps on_each_cpu_mask() and
      calculates the cpumask of cpus to IPI by calling a function supplied as a
      parameter in order to determine whether to IPI each specific cpu.
      
      The function works around allocation failure of the cpumask variable in
      CONFIG_CPUMASK_OFFSTACK=y by iterating over the cpus, sending an IPI to
      one cpu at a time via smp_call_function_single().
      
      The function is useful since it allows separating the case-specific
      code that decides whether to IPI a specific cpu for a specific request
      from the common boilerplate code of creating the mask, handling
      failures, etc.
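      
      A usage sketch assuming the signature described above (cond_func, func,
      info, wait, gfp_flags); the per-cpu variable and handlers below are
      hypothetical:
      
          static DEFINE_PER_CPU(int, pending_count);
      
          /* IPI only the CPUs that actually have something queued */
          static bool cpu_has_pending(int cpu, void *info)
          {
                  return per_cpu(pending_count, cpu) != 0;
          }
      
          static void drain_pending(void *info)
          {
                  /* runs on each CPU for which cpu_has_pending() returned true */
          }
      
          static void drain_all_pending(void)
          {
                  on_each_cpu_cond(cpu_has_pending, drain_pending, NULL,
                                   true, GFP_KERNEL);
          }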
      
      [akpm@linux-foundation.org: s/gfpflags/gfp_flags/]
      [akpm@linux-foundation.org: avoid double-evaluation of `info' (per Michal), parenthesise evaluation of `cond_func']
      [akpm@linux-foundation.org: s/CPU/CPUs, use all 80 cols in comment]
      Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Matt Mackall <mpm@selenic.com>
      Cc: Sasha Levin <levinsasha928@gmail.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Avi Kivity <avi@redhat.com>
      Acked-by: Michal Nazarewicz <mina86@mina86.org>
      Cc: Kosaki Motohiro <kosaki.motohiro@gmail.com>
      Cc: Milton Miller <miltonm@bga.com>
      Reviewed-by: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • smp: introduce a generic on_each_cpu_mask() function · 3fc498f1
      Authored by Gilad Ben-Yossef
      We have lots of infrastructure in place to partition multi-core systems
      such that we have a group of CPUs that is dedicated to a specific task:
      cgroups, scheduler and interrupt affinity, and the cpuisol= boot parameter.
      Still, kernel code will at times interrupt all CPUs in the system via IPIs
      for various needs.  These IPIs are useful and cannot be avoided
      altogether, but in certain cases it is possible to interrupt only specific
      CPUs that have useful work to do and not the entire system.
      
      This patch set, inspired by discussions with Peter Zijlstra and Frederic
      Weisbecker when testing the nohz task patch set, is a first stab at trying
      to explore doing this by locating the places where such global IPI calls
      are being made and turning the global IPI into an IPI for a specific group
      of CPUs.  The purpose of the patch set is to get feedback if this is the
      right way to go for dealing with this issue and indeed, if the issue is
      even worth dealing with at all.  Based on the feedback from this patch set
      I plan to offer further patches that address similar issue in other code
      paths.
      
      This patch creates an on_each_cpu_mask() and on_each_cpu_cond()
      infrastructure API (the former derived from existing arch-specific
      versions in Tile and Arm) and uses them to turn several global IPI
      invocations into per-CPU-group invocations.
      
      Core kernel:
      
      on_each_cpu_mask() calls a function on processors specified by cpumask,
      which may or may not include the local processor.
      
      You must not call this function with disabled interrupts or from a
      hardware interrupt handler or from a bottom half handler.
      
      arch/arm:
      
      Note that the generic version is a little different from the Arm one:
      
      1. It has the mask as first parameter
      2. It calls the function on the calling CPU with interrupts disabled,
         but this should be OK since the function is called on the other CPUs
         with interrupts disabled anyway.
      
      arch/tile:
      
      The API is the same as the tile private one, but the generic version
      also calls the function on the calling CPU with interrupts disabled in
      the UP case.
      
      This is OK since the function is called on the other CPUs
      with interrupts disabled.
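      
      A usage sketch (the mask and handler are hypothetical; on_each_cpu_mask()
      takes the mask first, then func, info and wait, per the note above):
      
          static void refresh_state(void *info)
          {
                  /* runs on every CPU in the mask, including the local CPU
                   * if it is part of the mask */
          }
      
          static void refresh_tracked_cpus(const struct cpumask *tracked)
          {
                  /* must not be called with irqs disabled or from irq context */
                  on_each_cpu_mask(tracked, refresh_state, NULL, true);
          }
      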
      Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
      Reviewed-by: Christoph Lameter <cl@linux.com>
      Acked-by: Chris Metcalf <cmetcalf@tilera.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Matt Mackall <mpm@selenic.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Sasha Levin <levinsasha928@gmail.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Avi Kivity <avi@redhat.com>
      Acked-by: Michal Nazarewicz <mina86@mina86.org>
      Cc: Kosaki Motohiro <kosaki.motohiro@gmail.com>
      Cc: Milton Miller <miltonm@bga.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  15. 17 Jun 2011, 1 commit
  16. 26 May 2011, 1 commit
  17. 23 Mar 2011, 2 commits
  18. 28 Oct 2010, 1 commit
  19. 07 Mar 2010, 1 commit
  20. 18 Nov 2009, 1 commit
    • generic-ipi: Add smp_call_function_any() · 2ea6dec4
      Authored by Rusty Russell
      Andrew points out that acpi-cpufreq uses cpumask_any() when it really
      would prefer to use the same CPU if possible (to avoid an IPI).  In
      general, this seems like a good idea to offer.
      
      [ tglx: Documented selection preference and Inlined the UP case to
        	avoid the copy of smp_call_function_single() and the extra
        	EXPORT ]
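      
      A usage sketch in the acpi-cpufreq spirit (function and variable names
      are hypothetical; the point is that the local CPU is preferred when it
      is in the mask, so no IPI is sent in that case):
      
          static void read_freq_register(void *info)
          {
                  /* read the frequency state on whichever CPU was chosen */
          }
      
          static int sample_policy_freq(const struct cpumask *policy_cpus, u64 *out)
          {
                  return smp_call_function_any(policy_cpus, read_freq_register,
                                               out, 1);
          }
      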
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Zhao Yakui <yakui.zhao@intel.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  21. 24 Sep 2009, 1 commit
  22. 17 Jun 2009, 1 commit
    • remove put_cpu_no_resched() · 8b0b1db0
      Authored by Thomas Gleixner
      put_cpu_no_resched() is an optimization of put_cpu() which unfortunately
      can cause high latencies.
      
      The nfs iostats code uses put_cpu_no_resched() in a code sequence where a
      reschedule request caused by an interrupt between the get_cpu() and the
      put_cpu_no_resched() can delay the reschedule for at least HZ.
      
      The other users of put_cpu_no_resched() optimize correctly in interrupt
      code, but there is no real harm in using the put_cpu() function, which
      is an alias for preempt_enable().  The extra check of the preempt count
      is not as critical as the potential of missing a reschedule.
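      
      The pattern in question, reduced to a sketch (the stats structure is
      hypothetical; the nfs iostats code is more involved):
      
          struct my_stats { unsigned long events; };
      
          static void account_event(struct my_stats __percpu *stats)
          {
                  int cpu = get_cpu();          /* disables preemption            */
      
                  per_cpu_ptr(stats, cpu)->events++;
                  put_cpu();                    /* preempt_enable(): honours any
                                                 * reschedule requested meanwhile */
          }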
      
      Debugged in the preempt-rt tree and verified in mainline.
      
      Impact: remove a high latency source
      
      [akpm@linux-foundation.org: build fix]
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  23. 13 Mar 2009, 1 commit
  24. 25 Feb 2009, 1 commit
    • generic-ipi: remove CSD_FLAG_WAIT · 6e275637
      Authored by Peter Zijlstra
      Oleg noticed that we don't strictly need CSD_FLAG_WAIT; rework
      the code so that we can use CSD_FLAG_LOCK for both purposes.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  25. 06 Feb 2009, 2 commits
  26. 11 Jan 2009, 1 commit
  27. 30 Dec 2008, 1 commit
    • cpumask: smp_call_function_many() · 54b11e6d
      Authored by Rusty Russell
      Impact: Implementation change to remove cpumask_t from stack.
      
      Actually change smp_call_function_mask() to smp_call_function_many().
      We avoid cpumasks on the stack in this version.
      
      (S390 has its own version, but that's going away apparently).
      
      We have to do some dancing to figure out if 0 or 1 other cpus are in
      the mask supplied and the online mask without allocating a tmp
      cpumask.  It's still fairly cheap.
      
      We allocate the cpumask at the end of the call_function_data
      structure: if allocation fails we fall back to smp_call_function_single
      rather than using the baroque quiescing code (which needs a cpumask on
      the stack).
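      
      Sketched, the approach described above looks roughly like this
      (illustrative only, not the exact kernel/smp.c code; field names and the
      normal path are simplified):
      
          struct call_function_data {
                  struct call_single_data csd;
                  struct cpumask cpumask;    /* trailing mask, nr_cpu_ids bits used */
          };
      
          void my_call_many(const struct cpumask *mask, void (*func)(void *),
                            void *info, bool wait)
          {
                  struct call_function_data *data;
                  int cpu;
      
                  data = kmalloc(sizeof(*data), GFP_ATOMIC);
                  if (!data) {
                          /* allocation failed: degrade gracefully by IPIing the
                           * requested online CPUs one at a time instead */
                          for_each_cpu_and(cpu, mask, cpu_online_mask)
                                  smp_call_function_single(cpu, func, info, wait);
                          return;
                  }
                  cpumask_and(&data->cpumask, mask, cpu_online_mask);
                  /* ... normal path: queue data->csd for the CPUs in data->cpumask,
                   * freeing data once the last target CPU has finished ... */
          }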
      
      (Thanks to Hiroshi Shimamoto for spotting several bugs in previous versions!)
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Mike Travis <travis@sgi.com>
      Cc: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
      Cc: npiggin@suse.de
      Cc: axboe@kernel.dk
  28. 19 Dec 2008, 1 commit
    • cpumask: add sysfs displays for configured and disabled cpu maps · e057d7ae
      Authored by Mike Travis
      Impact: add new sysfs files.
      
      Add sysfs files "kernel_max" and "offline" to display the max CPU index
      allowed (NR_CPUS-1), and the map of cpus that are offline.
      
      Cpus can be offlined via HOTPLUG or disabled by the BIOS ACPI tables;
      they are also offline if they exceed the number of cpus allowed by the
      NR_CPUS config option or the "maxcpus=NUM" kernel start parameter.
      
      The "possible_cpus=NUM" parameter can also extend the number of possible
      cpus allowed, in which case the cpus not present at startup will be
      in the offline state.  (These cpus can be HOTPLUGGED ON after system
      startup [pending a follow-on patch to provide the capability via the
      /sys/devices/sys/cpu/cpuN/online mechanism to bring them online.])
      
      By design, the "offlined cpus > possible cpus" display will always
      use the following formats:
      
        * all possible cpus online:   "x$"    or "x-y$"
        * some possible cpus offline: ".*,x$" or ".*,x-y$"
      
      where:
        x == number of possible cpus (nr_cpu_ids); and
        y == number of cpus >= NR_CPUS or maxcpus (if y > x).
      
      One use of this feature is for distros to select (or configure) the
      appropriate kernel to install for the resident system.
      
      Notes:
        * cpus offlined <= possible cpus will be printed for all architectures.
        * cpus offlined >  possible cpus will only be printed for arches that
        	set 'total_cpus' [X86 only in this patch].
      
      Based on tip/cpus4096 + .../rusty/linux-2.6-for-ingo.git/master +
      	 x86-only-patches sent 12/15.
      Signed-off-by: Mike Travis <travis@sgi.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
  29. 16 Dec 2008, 1 commit
  30. 06 Nov 2008, 1 commit
    • cpumask: introduce new API, without changing anything · 2d3854a3
      Authored by Rusty Russell
      Impact: introduce new APIs
      
      We want to deprecate cpumasks on the stack, as we are headed for
      ginormous numbers of CPUs.  Eventually, we want to head towards an
      undefined 'struct cpumask' so they can never be declared on the stack.
      
      1) New cpumask functions which take pointers instead of copies.
         (cpus_* -> cpumask_*)
      
      2) Several new helpers to reduce requirements for temporary cpumasks
         (cpumask_first_and, cpumask_next_and, cpumask_any_and)
      
      3) Helpers for declaring cpumasks on or offstack for large NR_CPUS
         (cpumask_var_t, alloc_cpumask_var and free_cpumask_var)
      
      4) 'struct cpumask' for explicitness and to mark new-style code.
      
      5) Make iterator functions stop at nr_cpu_ids (a runtime constant),
         not NR_CPUS for time efficiency and for smaller dynamic allocations
         in future.
      
      6) cpumask_copy() so we can allocate less than a full cpumask eventually
         (for alloc_cpumask_var), and so we can eliminate the 'struct cpumask'
         definition eventually.
      
      7) work_on_cpu() helper for doing a task on a CPU, rather than saving
         the old cpumask for the current thread and manipulating it.
      
      8) smp_call_function_many() which is smp_call_function_mask() except
         taking a cpumask pointer.
      
      Note that this patch simply introduces the new functions and leaves
      the obsolescent ones in place.  This is to simplify the transition
      patches.
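      
      A short sketch of the new off-stack style (an illustrative caller; only
      the cpumask_* helpers named above are the new API):
      
          static int run_on_intersection(const struct cpumask *a,
                                         const struct cpumask *b,
                                         void (*func)(void *), void *info)
          {
                  cpumask_var_t tmp;   /* heap-allocated when CONFIG_CPUMASK_OFFSTACK=y */
                  int cpu;
      
                  if (!alloc_cpumask_var(&tmp, GFP_KERNEL))
                          return -ENOMEM;
      
                  cpumask_and(tmp, a, b);
                  for_each_cpu(cpu, tmp)
                          smp_call_function_single(cpu, func, info, 1);
      
                  free_cpumask_var(tmp);
                  return 0;
          }
      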
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>