1. Mar 10, 2016: 2 commits
  2. Nov 07, 2015: 1 commit
    • mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep... · d0164adc
      Authored by Mel Gorman
      mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd
      
      __GFP_WAIT has been used to identify atomic context in callers that hold
      spinlocks or are in interrupts.  They are expected to be high priority and
      have access to one of two watermarks lower than "min" which can be referred
      to as the "atomic reserve".  __GFP_HIGH users get access to the first
      lower watermark and can be called the "high priority reserve".
      
      Over time, callers had a requirement to not block when fallback options
      were available.  Some have abused __GFP_WAIT leading to a situation where
      an optimistic allocation with a fallback option can access atomic
      reserves.
      
      This patch uses __GFP_ATOMIC to identify callers that are truly atomic,
      cannot sleep and have no alternative.  High priority users continue to use
      __GFP_HIGH.  __GFP_DIRECT_RECLAIM identifies callers that can sleep and
      are willing to enter direct reclaim.  __GFP_KSWAPD_RECLAIM identifies
      callers that want to wake kswapd for background reclaim.  __GFP_WAIT is
      redefined as a caller that is willing to enter direct reclaim and wake
      kswapd for background reclaim.
      
      This patch then converts a number of sites:
      
      o __GFP_ATOMIC is used by callers that are high priority and have memory
        pools for those requests. GFP_ATOMIC uses this flag.
      
      o Callers that have a limited mempool to guarantee forward progress clear
        __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall
        into this category where kswapd will still be woken but atomic reserves
        are not used as there is a one-entry mempool to guarantee progress.
      
      o Callers that are checking if they are non-blocking should use the
        helper gfpflags_allow_blocking() where possible. This is because
        checking for __GFP_WAIT as was done historically now can trigger false
        positives. Some exceptions like dm-crypt.c exist where the code intent
        is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to
        flag manipulations.
      
      o Callers that built their own GFP flags instead of starting with GFP_KERNEL
        and friends now also need to specify __GFP_KSWAPD_RECLAIM.
      
      The first key hazard to watch out for is callers that removed __GFP_WAIT
      and were depending on access to atomic reserves for inconspicuous reasons.
      In some cases it may be appropriate for them to use __GFP_HIGH.
      
      The second key hazard is callers that assembled their own combination of
      GFP flags instead of starting with something like GFP_KERNEL.  They may
      now wish to specify __GFP_KSWAPD_RECLAIM.  It's almost certainly harmless
      if it's missed in most cases as other activity will wake kswapd.
      Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Vitaly Wool <vitalywool@gmail.com>
      Cc: Rik van Riel <riel@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d0164adc
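
      A minimal sketch of the flag-checking guidance above. gfpflags_allow_blocking() and
      GFP_NOWAIT are the real interfaces added/used by this work; the helper name and its
      retry policy are purely illustrative assumptions.

      #include <linux/gfp.h>
      #include <linux/slab.h>

      /* Hypothetical helper: branch on blockability with the new helper
       * instead of testing __GFP_WAIT directly, which can now give
       * false positives. */
      static void *example_alloc(size_t size, gfp_t gfp)
      {
              if (!gfpflags_allow_blocking(gfp)) {
                      /* Caller cannot sleep: no direct reclaim, no waiting. */
                      return kmalloc(size, GFP_NOWAIT);
              }

              /* Caller may sleep: direct reclaim and kswapd wakeup allowed. */
              return kmalloc(size, gfp);
      }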
  3. Apr 20, 2015: 1 commit
  4. Apr 17, 2015: 1 commit
    • smp: Fix smp_call_function_single_async() locking · 8053871d
      Authored by Linus Torvalds
      The current smp_call_function code suffers from a number of problems; most
      notably, smp_call_function_single_async() is broken.
      
      The problem is that flush_smp_call_function_queue() does csd_unlock()
      _after_ calling csd->func(). This means that a caller cannot properly
      synchronize the csd usage as it has to.
      
      Change the code to release the csd before calling ->func() for the
      async case, and put a WARN_ON_ONCE(csd->flags & CSD_FLAG_LOCK) in
      smp_call_function_single_async() to warn us of improper serialization,
      because any waiting there can result in deadlocks when called with
      IRQs disabled.
      
      Rename the (currently) unused WAIT flag to SYNCHRONOUS and (re)use it
      such that we know what to do in flush_smp_call_function_queue().
      
      Rework csd_{,un}lock() to use smp_load_acquire() / smp_store_release()
      to avoid some full barriers while more clearly providing lock
      semantics.
      
      Finally move the csd maintenance out of generic_exec_single() into its
      callers for clearer code.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      [ Added changelog. ]
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Rafael David Tinoco <inaddy@ubuntu.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/CA+55aFz492bzLFhdbKN-Hygjcreup7CjMEYk3nTSfRWjppz-OA@mail.gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      8053871d
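
      A hedged usage sketch of the async API discussed above: the csd is embedded in a
      caller-owned object, and the caller (not the core code) must ensure a new call is only
      issued after the previous callback has run. Everything except the smp_* API and
      struct call_single_data is an illustrative assumption.

      #include <linux/printk.h>
      #include <linux/smp.h>

      struct my_dev {                          /* illustrative container */
              struct call_single_data csd;
              int value;
      };

      static void my_remote_func(void *info)
      {
              struct my_dev *dev = info;

              /* Runs on the target CPU from the IPI path. */
              pr_info("remote value %d\n", dev->value);
      }

      static void my_kick_cpu(struct my_dev *dev, int cpu)
      {
              dev->csd.func = my_remote_func;
              dev->csd.info = dev;

              /* Safe with IRQs disabled, but must not be re-issued until
               * my_remote_func() has completed on @cpu; the fix above makes
               * that handover well defined (csd released before ->func()). */
              smp_call_function_single_async(cpu, &dev->csd);
      }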
  5. Sep 19, 2014: 1 commit
    • smp: Add new wake_up_all_idle_cpus() function · c6f4459f
      Authored by Chuansheng Liu
      Currently kick_all_cpus_sync() can break non-polling idle cpus
      out of idle through IPI interrupts.
      
      But sometimes we need to break the polling idle cpus immediately
      so that they reselect a suitable C-state; and for non-idle cpus,
      trying to wake them up should do nothing.
      
      This adds a new function, wake_up_all_idle_cpus(), to bring all cpus
      out of idle, based on wake_up_if_idle().
      Signed-off-by: Chuansheng Liu <chuansheng.liu@intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: daniel.lezcano@linaro.org
      Cc: rjw@rjwysocki.net
      Cc: linux-pm@vger.kernel.org
      Cc: changcheng.liu@intel.com
      Cc: xiaoming.wang@intel.com
      Cc: souvik.k.chakravarty@intel.com
      Cc: luto@amacapital.net
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Geert Uytterhoeven <geert+renesas@glider.be>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Roman Gushchin <klamm@yandex-team.ru>
      Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/1409815075-4180-2-git-send-email-chuansheng.liu@intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      c6f4459f
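
      A short, hedged example of the new API: a driver that has just tightened a latency
      constraint could nudge every idle CPU (polling or not) so cpuidle re-evaluates its
      C-state, while busy CPUs are left alone. The call site name is an assumption.

      #include <linux/smp.h>

      /* Hypothetical call site: a latency constraint has just changed. */
      static void example_latency_constraint_changed(void)
      {
              /* Breaks all idle CPUs out of idle; does nothing to busy CPUs. */
              wake_up_all_idle_cpus();
      }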
  6. Aug 27, 2014: 1 commit
  7. Aug 07, 2014: 1 commit
  8. Jun 24, 2014: 1 commit
    • CPU hotplug, smp: flush any pending IPI callbacks before CPU offline · 8d056c48
      Authored by Srivatsa S. Bhat
      There is a race between the CPU offline code (within stop-machine) and
      the smp-call-function code, which can lead to getting IPIs on the
      outgoing CPU, *after* it has gone offline.
      
      Specifically, this can happen when using
      smp_call_function_single_async() to send the IPI, since this API allows
      sending asynchronous IPIs from IRQ disabled contexts.  The exact race
      condition is described below.
      
      During CPU offline, in stop-machine, we don't enforce any rule in the
      _DISABLE_IRQ stage, regarding the order in which the outgoing CPU and
      the other CPUs disable their local interrupts.  Due to this, we can
      encounter a situation in which an IPI is sent by one of the other CPUs
      to the outgoing CPU (while it is *still* online), but the outgoing CPU
      ends up noticing it only *after* it has gone offline.
      
                    CPU 1                                         CPU 2
                (Online CPU)                               (CPU going offline)
      
             Enter _PREPARE stage                          Enter _PREPARE stage
      
                                                           Enter _DISABLE_IRQ stage
      
                                                         =
             Got a device interrupt, and                 | Didn't notice the IPI
             the interrupt handler sent an               | since interrupts were
             IPI to CPU 2 using                          | disabled on this CPU.
             smp_call_function_single_async()            |
                                                         =
      
             Enter _DISABLE_IRQ stage
      
             Enter _RUN stage                              Enter _RUN stage
      
                                        =
             Busy loop with interrupts  |                  Invoke take_cpu_down()
             disabled.                  |                  and take CPU 2 offline
                                        =
      
             Enter _EXIT stage                             Enter _EXIT stage
      
             Re-enable interrupts                          Re-enable interrupts
      
                                                           The pending IPI is noted
                                                           immediately, but alas,
                                                           the CPU is offline at
                                                           this point.
      
      This, of course, makes the smp-call-function IPI handler code running on
      CPU 2 unhappy and it complains about "receiving an IPI on an offline
      CPU".
      
      One real example of the scenario on CPU 1 is the block layer's
      complete-request call-path:
      
      	__blk_complete_request() [interrupt-handler]
      	    raise_blk_irq()
      	        smp_call_function_single_async()
      
      However, if we look closely, the block layer does check that the target
      CPU is online before firing the IPI.  So in this case, it is actually
      the unfortunate ordering/timing of events in the stop-machine phase that
      leads to receiving IPIs after the target CPU has gone offline.
      
      In reality, getting a late IPI on an offline CPU is not too bad by
      itself (this can happen even due to hardware latencies in IPI
      send-receive).  It is a bug only if the target CPU really went offline
      without executing all the callbacks queued on its list.  (Note that a
      CPU is free to execute its pending smp-call-function callbacks in a
      batch, without waiting for the corresponding IPIs to arrive for each one
      of those callbacks).
      
      So, fixing this issue can be broken up into two parts:
      
      1. Ensure that a CPU goes offline only after executing all the
         callbacks queued on it.
      
      2. Modify the warning condition in the smp-call-function IPI handler
         code such that it warns only if an offline CPU got an IPI *and* that
         CPU had gone offline with callbacks still pending in its queue.
      
      Achieving part 1 is straightforward - just flush (execute) all the
      queued callbacks on the outgoing CPU in the CPU_DYING stage[1],
      including those callbacks for which the source CPU's IPIs might not have
      been received on the outgoing CPU yet.  Once we do this, an IPI that
      arrives late on the CPU going offline (either due to the race mentioned
      above, or due to hardware latencies) will be completely harmless, since
      the outgoing CPU would have executed all the queued callbacks before
      going offline.
      
      Overall, this fix (parts 1 and 2 put together) additionally guarantees
      that we will see a warning only when the *IPI-sender code* is buggy -
      that is, if it queues the callback _after_ the target CPU has gone
      offline.
      
      [1].  The CPU_DYING part needs a little more explanation: by the time we
      execute the CPU_DYING notifier callbacks, the CPU would have already
      been marked offline.  But we want to flush out the pending callbacks at
      this stage, ignoring the fact that the CPU is offline.  So restructure
      the IPI handler code so that we can by-pass the "is-cpu-offline?" check
      in this particular case.  (Of course, the right solution here is to fix
      CPU hotplug to mark the CPU offline _after_ invoking the CPU_DYING
      notifiers, but this requires a lot of audit to ensure that this change
      doesn't break any existing code; hence let's go with the solution
      proposed above until that is done).
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Suggested-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Mike Galbraith <mgalbraith@suse.de>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Sachin Kamat <sachin.kamat@samsung.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      8d056c48
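
      A rough sketch of the reworked handler logic described above, with illustrative names
      rather than the kernel's exact code: the queue is always flushed, and the warning only
      fires when the CPU is offline *and* work was still pending, i.e. when the sender
      really was buggy.

      #include <linux/cpumask.h>
      #include <linux/llist.h>
      #include <linux/percpu.h>
      #include <linux/smp.h>

      static DEFINE_PER_CPU(struct llist_head, example_call_queue);

      static void example_flush_call_queue(bool warn_cpu_offline)
      {
              struct llist_head *head = this_cpu_ptr(&example_call_queue);
              struct llist_node *entry = llist_del_all(head);
              struct call_single_data *csd, *next;

              /* Warn only for an offline CPU that still had queued callbacks. */
              if (unlikely(warn_cpu_offline && !cpu_online(smp_processor_id())))
                      WARN_ON_ONCE(entry);

              /* Execute everything regardless, so nothing is lost on offline. */
              llist_for_each_entry_safe(csd, next, entry, llist)
                      csd->func(csd->info);
      }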
  9. Jun 16, 2014: 1 commit
    • irq_work: Implement remote queueing · 47885016
      Authored by Frederic Weisbecker
      irq work currently only supports local callbacks.  However, its code
      is mostly ready to run remote callbacks, and we have some potential users.
      
      The full nohz subsystem currently open-codes its own remote irq work
      on top of the scheduler IPI when it wants a CPU to reevaluate its next
      tick.  However, this ad hoc solution bloats the scheduler IPI.
      
      Let's just extend the irq work subsystem to support remote queueing on top
      of the generic SMP IPI to handle this kind of user.  This shouldn't add
      noticeable overhead.
      Suggested-by: Peter Zijlstra <peterz@infradead.org>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Kevin Hilman <khilman@linaro.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      47885016
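
      For reference, a hedged sketch of the remote-queueing API this commit introduces,
      irq_work_queue_on(); the work callback and the call site are illustrative assumptions.

      #include <linux/irq_work.h>
      #include <linux/printk.h>
      #include <linux/smp.h>

      static void example_irq_work_fn(struct irq_work *work)
      {
              /* Runs on the CPU the work was queued on, from irq_work context. */
              pr_info("irq_work ran on CPU %d\n", smp_processor_id());
      }

      static struct irq_work example_work;

      static void example_poke_cpu(int cpu)
      {
              init_irq_work(&example_work, example_irq_work_fn);

              /* Queue on a remote CPU over the generic SMP IPI, e.g. to make
               * it reevaluate its next tick without bloating the scheduler IPI. */
              irq_work_queue_on(&example_work, cpu);
      }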
  10. Jun 07, 2014: 1 commit
    • smp: print more useful debug info upon receiving IPI on an offline CPU · a219ccf4
      Authored by Srivatsa S. Bhat
      There is a longstanding problem related to CPU hotplug which causes IPIs
      to be delivered to offline CPUs, and the smp-call-function IPI handler
      code prints out a warning whenever this is detected.  Every once in a
      while this (usually harmless) warning gets reported on LKML, but so far
      it has not been completely fixed.  Usually the solution involves finding
      out the IPI sender and fixing it by adding appropriate synchronization
      with CPU hotplug.
      
      However, while going through one such internal bug report, I found that
      there is a significant bug in the receiver side itself (more
      specifically, in stop-machine) that can lead to this problem even when
      the sender code is perfectly fine.  This patchset fixes that
      synchronization problem in the CPU hotplug stop-machine code.
      
      Patch 1 adds some additional debug code to the smp-call-function
      framework, to help debug such issues easily.
      
      Patch 2 modifies the stop-machine code to ensure that any IPIs that were
      sent while the target CPU was online, would be noticed and handled by
      that CPU without fail before it goes offline.  Thus, this avoids
      scenarios where IPIs are received on offline CPUs (as long as the sender
      uses proper hotplug synchronization).
      
      In fact, I debugged the problem by using Patch 1, and found that the
      payload of the IPI was always the block layer's trigger_softirq()
      function.  But I was not able to find anything wrong with the block
      layer code.  That's when I started looking at the stop-machine code and
      realized that there is a race-window which makes the IPI _receiver_ the
      culprit, not the sender.  Patch 2 fixes that race and hence this should
      put an end to most of the hard-to-debug IPI-to-offline-CPU issues.
      
      This patch (of 2):
      
      Today the smp-call-function code just prints a warning if we get an IPI
      on an offline CPU.  This info is sufficient to let us know that
      something went wrong, but often it is very hard to debug exactly who
      sent the IPI and why, from this info alone.
      
      In most cases, we get the warning about the IPI to an offline CPU,
      immediately after the CPU going offline comes out of the stop-machine
      phase and reenables interrupts.  Since all online CPUs participate in
      stop-machine, the information regarding the sender of the IPI is already
      lost by the time we exit the stop-machine loop.  So even if we dump the
      stack on each CPU at this point, we won't find anything useful since all
      of them will show the stack-trace of the stopper thread.  So we need a
      better way to figure out who sent the IPI and why.
      
      To achieve this, when we detect an IPI targeted to an offline CPU, loop
      through the call-single-data linked list and print out the payload
      (i.e., the name of the function which was supposed to be executed by the
      target CPU).  This would give us an insight as to who might have sent
      the IPI and help us debug this further.
      
      [akpm@linux-foundation.org: correctly suppress warning output on second and later occurrences]
      Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Mike Galbraith <mgalbraith@suse.de>
      Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a219ccf4
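
      A sketch of the kind of debug output described above (names illustrative, not the
      kernel's exact code): walk the pending call-single-data entries and print each payload
      so the IPI sender can be identified.

      #include <linux/llist.h>
      #include <linux/printk.h>
      #include <linux/smp.h>

      static void example_dump_pending_ipis(struct llist_node *entry)
      {
              struct call_single_data *csd;

              /* Print the function each pending csd was meant to run. */
              llist_for_each_entry(csd, entry, llist)
                      pr_warn("IPI callback %pS queued for an offline CPU\n",
                              csd->func);
      }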
  11. Feb 25, 2014: 6 commits
    • smp: Rename __smp_call_function_single() to smp_call_function_single_async() · c46fff2a
      Authored by Frederic Weisbecker
      The name __smp_call_function_single() doesn't tell much about the
      properties of this function, especially when compared to
      smp_call_function_single().
      
      The comments above the implementation are also misleading.  The main
      point of this function is actually not the ability to embed the csd
      in an object; that is a requirement resulting from the purpose of
      this function, which is to raise an IPI asynchronously.
      
      As such it can be called with interrupts disabled.  And this feature
      comes at the cost of the caller, who then needs to serialize the
      IPIs on this csd.
      
      Let's rename the function and enhance the comments so that they reflect
      these properties.
      Suggested-by: Christoph Hellwig <hch@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jens Axboe <axboe@fb.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      c46fff2a
    • smp: Remove wait argument from __smp_call_function_single() · fce8ad15
      Authored by Frederic Weisbecker
      The main point of calling __smp_call_function_single() is to send
      an IPI in a purely asynchronous way.  By embedding a csd in an object,
      a caller can send the IPI without waiting for a previous one to complete,
      as is required by smp_call_function_single() for example.  As such,
      sending this kind of IPI can be safe even when irqs are disabled.
      
      This flexibility comes at the expense of the caller, who then needs to
      synchronize the csd lifecycle itself and make sure that IPIs on a
      single csd are serialized.
      
      This is how __smp_call_function_single() works when wait = 0, and this
      use case is relevant.
      
      Now there doesn't seem to be any use case with wait = 1 that can't be
      covered by smp_call_function_single() instead, which is safer.  Let's look
      at the two possible scenarios:
      
      1) The user calls __smp_call_function_single(wait = 1) on a csd embedded
         in an object. It looks like a nice and convenient pattern at first
         sight because we can then retrieve the object from the IPI handler easily.
      
         But actually it is a waste of memory space in the object since the csd
         can be allocated from the stack by smp_call_function_single(wait = 1)
         and the object can be passed as the IPI argument.
      
         Besides that, embedding the csd in an object is more error prone
         because the caller must take care of the serialization of the IPIs
         for this csd.
      
      2) The user calls __smp_call_function_single(wait = 1) on a csd that
         is allocated on the stack. It's ok, but smp_call_function_single()
         can do it as well and it already takes care of the allocation on the
         stack. Again it's simpler and less error prone.
      
      Therefore, using the underscore-prefixed API version with wait = 1
      is a bad pattern and a sign that the caller can do something safer and
      simpler.
      
      There was a single user of that, which has just been converted.
      So let's remove this option to discourage further users.
      
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jens Axboe <axboe@fb.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      fce8ad15
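
      The safer pattern the commit argues for, as a small self-contained sketch (function
      names are illustrative): let smp_call_function_single() keep the csd on the stack and
      wait, instead of embedding a csd and passing wait = 1 to the underscore variant.

      #include <linux/smp.h>

      static void example_remote_cpu_id(void *info)
      {
              *(int *)info = smp_processor_id();
      }

      static int example_cpu_id_of(int cpu)
      {
              int id = -1;

              /* wait = 1: the csd lives in this stack frame and the call
               * returns only after the remote function has completed. */
              smp_call_function_single(cpu, example_remote_cpu_id, &id, 1);
              return id;
      }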
    • smp: Move __smp_call_function_single() below its safe version · d7877c03
      Authored by Frederic Weisbecker
      Move this function closer to __smp_call_function_single(). These functions
      have very similar behavior and should be displayed in the same block
      for clarity.
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jens Axboe <axboe@fb.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      d7877c03
    • smp: Consolidate the various smp_call_function_single() declensions · 8b28499a
      Authored by Frederic Weisbecker
      __smp_call_function_single() and smp_call_function_single() share some
      code that can be factored out: execute inline when the target is local,
      check if the target is online, lock the csd, call generic_exec_single().
      
      Let's move the common parts to generic_exec_single().
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jens Axboe <axboe@fb.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      8b28499a
    • smp: Teach __smp_call_function_single() to check for offline cpus · 08eed44c
      Authored by Jan Kara
      Align __smp_call_function_single() with smp_call_function_single() so
      that it also checks whether the requested cpu is still online.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jens Axboe <axboe@fb.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      08eed44c
    • smp: Iterate functions through llist_for_each_entry_safe() · 5fd77595
      Authored by Jan Kara
      The IPI function llist iteration is open-coded.  Let's simplify this
      by using an llist iterator.
      
      Also, we want to keep the iteration safe against possible
      csd.llist->next value reuse from the IPI handler.  At least the block
      subsystem used to do such things, so let's stay careful and use
      llist_for_each_entry_safe().
      Signed-off-by: Jan Kara <jack@suse.cz>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jens Axboe <axboe@fb.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      5fd77595
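
      A hedged sketch of the safe-iteration pattern adopted here (illustrative function
      name): because csd->func() may immediately reuse or requeue the csd, the next pointer
      must be read before each callback runs, which is what llist_for_each_entry_safe()
      guarantees.

      #include <linux/llist.h>
      #include <linux/smp.h>

      static void example_run_call_queue(struct llist_head *head)
      {
              struct llist_node *entry = llist_del_all(head);
              struct call_single_data *csd, *next;

              llist_for_each_entry_safe(csd, next, entry, llist)
                      csd->func(csd->info);   /* may reuse csd->llist.next */
      }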
  12. Jan 31, 2014: 2 commits
  13. Nov 15, 2013: 2 commits
  14. Oct 25, 2013: 2 commits
  15. Oct 01, 2013: 1 commit
    • x86/boot: Further compress CPUs bootup message · a17bce4d
      Authored by Borislav Petkov
      Turn it into (for example):
      
      [    0.073380] x86: Booting SMP configuration:
      [    0.074005] .... node   #0, CPUs:          #1   #2   #3   #4   #5   #6   #7
      [    0.603005] .... node   #1, CPUs:     #8   #9  #10  #11  #12  #13  #14  #15
      [    1.200005] .... node   #2, CPUs:    #16  #17  #18  #19  #20  #21  #22  #23
      [    1.796005] .... node   #3, CPUs:    #24  #25  #26  #27  #28  #29  #30  #31
      [    2.393005] .... node   #4, CPUs:    #32  #33  #34  #35  #36  #37  #38  #39
      [    2.996005] .... node   #5, CPUs:    #40  #41  #42  #43  #44  #45  #46  #47
      [    3.600005] .... node   #6, CPUs:    #48  #49  #50  #51  #52  #53  #54  #55
      [    4.202005] .... node   #7, CPUs:    #56  #57  #58  #59  #60  #61  #62  #63
      [    4.811005] .... node   #8, CPUs:    #64  #65  #66  #67  #68  #69  #70  #71
      [    5.421006] .... node   #9, CPUs:    #72  #73  #74  #75  #76  #77  #78  #79
      [    6.032005] .... node  #10, CPUs:    #80  #81  #82  #83  #84  #85  #86  #87
      [    6.648006] .... node  #11, CPUs:    #88  #89  #90  #91  #92  #93  #94  #95
      [    7.262005] .... node  #12, CPUs:    #96  #97  #98  #99 #100 #101 #102 #103
      [    7.865005] .... node  #13, CPUs:   #104 #105 #106 #107 #108 #109 #110 #111
      [    8.466005] .... node  #14, CPUs:   #112 #113 #114 #115 #116 #117 #118 #119
      [    9.073006] .... node  #15, CPUs:   #120 #121 #122 #123 #124 #125 #126 #127
      [    9.679901] x86: Booted up 16 nodes, 128 CPUs
      
      and drop useless elements.
      
      Change num_digits() to hpa's division-avoiding, cell-phone-typed
      version, which he went to great lengths and pains to submit on a
      Saturday evening.
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: huawei.libin@huawei.com
      Cc: wangyijing@huawei.com
      Cc: fenghua.yu@intel.com
      Cc: guohanjun@huawei.com
      Cc: paul.gortmaker@windriver.com
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20130930095624.GB16383@pd.tnic
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      a17bce4d
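
      A division-free digit counter in the spirit of the num_digits() change mentioned
      above; this is an illustrative reimplementation (assumed to take val >= 0), not the
      exact kernel version.

      #include <linux/kernel.h>

      static int example_num_digits(int val)
      {
              int digits = 1;
              int limit = 10;

              /* Count by widening a comparison threshold: no division needed. */
              while (val >= limit && limit <= INT_MAX / 10) {
                      limit *= 10;
                      digits++;
              }
              return (val >= limit) ? digits + 1 : digits;
      }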
  16. Sep 12, 2013: 2 commits
  17. Aug 19, 2013: 1 commit
  18. Jul 31, 2013: 1 commit
  19. Jul 15, 2013: 1 commit
    • kernel: delete __cpuinit usage from all core kernel files · 0db0628d
      Authored by Paul Gortmaker
      The __cpuinit type of throwaway sections might have made sense
      some time ago when RAM was more constrained, but now the savings
      do not offset the cost and complications.  For example, the fix in
      commit 5e427ec2 ("x86: Fix bit corruption at CPU resume time")
      is a good example of the nasty type of bugs that can be created
      with improper use of the various __init prefixes.
      
      After a discussion on LKML[1] it was decided that cpuinit should go
      the way of devinit and be phased out.  Once all the users are gone,
      we can then finally remove the macros themselves from linux/init.h.
      
      This removes all the uses of the __cpuinit macros from C files in
      the core kernel directories (kernel, init, lib, mm, and include)
      that don't really have a specific maintainer.
      
      [1] https://lkml.org/lkml/2013/5/20/589
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      0db0628d
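
      What the conversion looks like at a call site, using a hypothetical hotplug notifier:
      the __cpuinit / __cpuinitdata annotations are simply dropped, leaving plain functions
      and data.

      #include <linux/cpu.h>
      #include <linux/notifier.h>

      /* Previously: static int __cpuinit example_cpu_notify(...) */
      static int example_cpu_notify(struct notifier_block *nb,
                                    unsigned long action, void *hcpu)
      {
              return NOTIFY_OK;
      }

      /* Previously: static struct notifier_block __cpuinitdata example_cpu_nb */
      static struct notifier_block example_cpu_nb = {
              .notifier_call = example_cpu_notify,
      };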
  20. May 01, 2013: 2 commits
  21. Feb 22, 2013: 1 commit
    • smp: make smp_call_function_many() use logic similar to smp_call_function_single() · 9a46ad6d
      Authored by Shaohua Li
      I'm testing a swapout workload on a two-socket Xeon machine.  The workload
      has 10 threads; each thread sequentially accesses a separate memory
      region.  TLB flush overhead is very big in this workload.  For each page,
      page reclaim needs to move it off the active lru list and then unmap it.
      Both need a TLB flush.  And since this is a multithreaded workload, TLB
      flushes happen on 10 CPUs.  On x86, TLB flush uses the generic
      smp_call_function.  So this workload stresses smp_call_function_many() heavily.
      
      Without patch, perf shows:
      +  24.49%  [k] generic_smp_call_function_interrupt
      -  21.72%  [k] _raw_spin_lock
         - _raw_spin_lock
            + 79.80% __page_check_address
            + 6.42% generic_smp_call_function_interrupt
            + 3.31% get_swap_page
            + 2.37% free_pcppages_bulk
            + 1.75% handle_pte_fault
            + 1.54% put_super
            + 1.41% grab_super_passive
            + 1.36% __swap_duplicate
            + 0.68% blk_flush_plug_list
            + 0.62% swap_info_get
      +   6.55%  [k] flush_tlb_func
      +   6.46%  [k] smp_call_function_many
      +   5.09%  [k] call_function_interrupt
      +   4.75%  [k] default_send_IPI_mask_sequence_phys
      +   2.18%  [k] find_next_bit
      
      swapout throughput is around 1300M/s.
      
      With the patch, perf shows:
      -  27.23%  [k] _raw_spin_lock
         - _raw_spin_lock
            + 80.53% __page_check_address
            + 8.39% generic_smp_call_function_single_interrupt
            + 2.44% get_swap_page
            + 1.76% free_pcppages_bulk
            + 1.40% handle_pte_fault
            + 1.15% __swap_duplicate
            + 1.05% put_super
            + 0.98% grab_super_passive
            + 0.86% blk_flush_plug_list
            + 0.57% swap_info_get
      +   8.25%  [k] default_send_IPI_mask_sequence_phys
      +   7.55%  [k] call_function_interrupt
      +   7.47%  [k] smp_call_function_many
      +   7.25%  [k] flush_tlb_func
      +   3.81%  [k] _raw_spin_lock_irqsave
      +   3.78%  [k] generic_smp_call_function_single_interrupt
      
      swapout throughput is around 1400M/s.  So there is around a 7%
      improvement, and total cpu utilization doesn't change.
      
      Without the patch, cfd_data is shared by all CPUs.
      generic_smp_call_function_interrupt does read/write cfd_data several times,
      which creates a lot of cache ping-pong.  With the patch, the data
      becomes per-cpu.  The ping-pong is avoided.  And from the perf data, this
      doesn't make the call_single_queue lock contended.
      
      The next step is to remove generic_smp_call_function_interrupt() from arch
      code.
      Signed-off-by: Shaohua Li <shli@fusionio.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9a46ad6d
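
      The data-layout change boils down to roughly this shape (illustrative names that
      mirror, but do not reproduce, the kernel's struct): each sending CPU owns its own
      call-function data, so remote handlers no longer ping-pong a single shared cache line.

      #include <linux/cpumask.h>
      #include <linux/percpu.h>
      #include <linux/smp.h>

      struct example_call_function_data {
              struct call_single_data __percpu *csd;  /* one csd per target CPU */
              cpumask_var_t cpumask;                  /* CPUs this sender targets */
      };

      /* One instance per sending CPU instead of one globally shared block. */
      static DEFINE_PER_CPU_SHARED_ALIGNED(struct example_call_function_data,
                                           example_cfd_data);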
  22. Jan 28, 2013: 1 commit
    • smp: Fix SMP function call empty cpu mask race · f44310b9
      Authored by Wang YanQing
      With v3.7 I get the following warning once or twice a day:
      
        [ 2235.186027] WARNING: at /mnt/sda7/kernel/linux/arch/x86/kernel/apic/ipi.c:109 default_send_IPI_mask_logical+0x2f/0xb8()
      
      As explained by Linus as well:
      
       |
       | Once we've done the "list_add_rcu()" to add it to the
       | queue, we can have (another) IPI to the target CPU that can
       | now see it and clear the mask.
       |
       | So by the time we get to actually send the IPI, the mask might
       | have been cleared by another IPI.
       |
      
      This patch also fixes a system hang problem, if the data->cpumask
      gets cleared after passing this point:
      
              if (WARN_ONCE(!mask, "empty IPI mask"))
                      return;
      
      then the problem in commit 83d349f3 ("x86: don't send an IPI to
      the empty set of CPU's") will happen again.
      Signed-off-by: Wang YanQing <udknight@gmail.com>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Acked-by: Jan Beulich <jbeulich@suse.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: peterz@infradead.org
      Cc: mina86@mina86.org
      Cc: srivatsa.bhat@linux.vnet.ibm.com
      Cc: <stable@kernel.org>
      Link: http://lkml.kernel.org/r/20130126075357.GA3205@udknight
      [ Tidied up the changelog and the comment in the code. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      f44310b9
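
      The avoidance pattern, sketched abstractly (this is not the kernel's exact fix, and
      the function name is an assumption): take a private snapshot of the destination mask
      before the request becomes visible to the targets, and send the IPIs from the
      snapshot, since a fast responder may clear its bit in the shared mask at any time
      afterwards.

      #include <linux/cpumask.h>
      #include <linux/smp.h>

      static void example_send_ipis(struct cpumask *shared_mask,
                                    struct cpumask *ipi_snapshot)
      {
              /* Snapshot while this CPU is still the only writer of the mask. */
              cpumask_copy(ipi_snapshot, shared_mask);

              /* ... publish the request here; targets may clear their bits in
               * shared_mask as soon as they notice the queued work ... */

              /* Send from the stable, private copy, never from shared_mask. */
              arch_send_call_function_ipi_mask(ipi_snapshot);
      }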
  23. Jun 05, 2012: 1 commit
  24. May 08, 2012: 1 commit
  25. May 04, 2012: 1 commit
  26. Mar 29, 2012: 2 commits
    • smp: add func to IPI cpus based on parameter func · b3a7e98e
      Authored by Gilad Ben-Yossef
      Add the on_each_cpu_cond() function that wraps on_each_cpu_mask() and
      calculates the cpumask of cpus to IPI by calling a function supplied as a
      parameter in order to determine whether to IPI each specific cpu.
      
      The function works around allocation failure of the cpumask variable in
      the CONFIG_CPUMASK_OFFSTACK=y case by iterating over the cpus and sending
      IPIs one at a time via smp_call_function_single().
      
      The function is useful since it allows separating the specific code that
      decides, in each case, whether to IPI a specific cpu for a specific request
      from the common boilerplate code of creating the mask, handling
      failures, etc.
      
      [akpm@linux-foundation.org: s/gfpflags/gfp_flags/]
      [akpm@linux-foundation.org: avoid double-evaluation of `info' (per Michal), parenthesise evaluation of `cond_func']
      [akpm@linux-foundation.org: s/CPU/CPUs, use all 80 cols in comment]
      Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Matt Mackall <mpm@selenic.com>
      Cc: Sasha Levin <levinsasha928@gmail.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Avi Kivity <avi@redhat.com>
      Acked-by: Michal Nazarewicz <mina86@mina86.org>
      Cc: Kosaki Motohiro <kosaki.motohiro@gmail.com>
      Cc: Milton Miller <miltonm@bga.com>
      Reviewed-by: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b3a7e98e
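
      A hedged usage sketch of the new API (signature as added by this commit; the per-cpu
      flag and callback names are illustrative): only CPUs whose condition function returns
      true receive the IPI.

      #include <linux/gfp.h>
      #include <linux/percpu.h>
      #include <linux/smp.h>

      static DEFINE_PER_CPU(unsigned long, example_pending);

      static bool example_has_work(int cpu, void *info)
      {
              return per_cpu(example_pending, cpu) != 0;
      }

      static void example_do_work(void *info)
      {
              this_cpu_write(example_pending, 0);
      }

      static void example_flush_pending(void)
      {
              /* The gfp flags are only used for the temporary cpumask with
               * CONFIG_CPUMASK_OFFSTACK=y; on allocation failure it falls back
               * to per-cpu smp_call_function_single() calls. */
              on_each_cpu_cond(example_has_work, example_do_work, NULL,
                               true, GFP_KERNEL);
      }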
    • smp: introduce a generic on_each_cpu_mask() function · 3fc498f1
      Authored by Gilad Ben-Yossef
      We have lots of infrastructure in place to partition multi-core systems
      such that we have a group of CPUs that are dedicated to a specific task:
      cgroups, scheduler and interrupt affinity, and the cpuisol= boot parameter.
      Still, kernel code will at times interrupt all CPUs in the system via IPIs
      for various needs.  These IPIs are useful and cannot be avoided
      altogether, but in certain cases it is possible to interrupt only specific
      CPUs that have useful work to do and not the entire system.
      
      This patch set, inspired by discussions with Peter Zijlstra and Frederic
      Weisbecker when testing the nohz task patch set, is a first stab at trying
      to explore doing this by locating the places where such global IPI calls
      are being made and turning the global IPI into an IPI for a specific group
      of CPUs.  The purpose of the patch set is to get feedback if this is the
      right way to go for dealing with this issue and indeed, if the issue is
      even worth dealing with at all.  Based on the feedback from this patch set
      I plan to offer further patches that address similar issue in other code
      paths.
      
      This patch creates an on_each_cpu_mask() and on_each_cpu_cond()
      infrastructure API (the former derived from existing arch specific
      versions in Tile and Arm) and uses them to turn several global IPI
      invocation to per CPU group invocations.
      
      Core kernel:
      
      on_each_cpu_mask() calls a function on processors specified by cpumask,
      which may or may not include the local processor.
      
      You must not call this function with disabled interrupts or from a
      hardware interrupt handler or from a bottom half handler.
      
      arch/arm:
      
      Note that the generic version is a little different from the Arm one:
      
      1. It has the mask as its first parameter
      2. It calls the function on the calling CPU with interrupts disabled,
         but this should be OK since the function is called on the other CPUs
         with interrupts disabled anyway.
      
      arch/tile:
      
      The API is the same as the tile private one, but the generic version
      also calls the function locally with interrupts disabled in the UP case.
      
      This is OK since the function is called on the other CPUs
      with interrupts disabled.
      Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
      Reviewed-by: Christoph Lameter <cl@linux.com>
      Acked-by: Chris Metcalf <cmetcalf@tilera.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Matt Mackall <mpm@selenic.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Sasha Levin <levinsasha928@gmail.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Avi Kivity <avi@redhat.com>
      Acked-by: Michal Nazarewicz <mina86@mina86.org>
      Cc: Kosaki Motohiro <kosaki.motohiro@gmail.com>
      Cc: Milton Miller <miltonm@bga.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3fc498f1
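
      A matching sketch for on_each_cpu_mask() (callback and mask source are illustrative):
      the function runs on exactly the CPUs in the mask, which may or may not include the
      local CPU, and it must not be called with interrupts disabled or from interrupt
      context, per the restrictions quoted above.

      #include <linux/cpumask.h>
      #include <linux/smp.h>

      static void example_drain_local(void *info)
      {
              /* Per-CPU work, e.g. draining a local cache or queue. */
      }

      static void example_drain_cpus(const struct cpumask *mask)
      {
              /* wait = true: return only after all targeted CPUs have run it. */
              on_each_cpu_mask(mask, example_drain_local, NULL, true);
      }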
  27. Oct 31, 2011: 1 commit
  28. Jun 17, 2011: 1 commit