1. 31 Aug 2018, 1 commit
  2. 15 Aug 2018, 1 commit
  3. 13 Aug 2018, 1 commit
    • init: rename and re-order boot_cpu_state_init() · b5b1404d
      Authored by Linus Torvalds
      This is purely a preparatory patch for upcoming changes during the 4.19
      merge window.
      
      We have a function called "boot_cpu_state_init()" that isn't really
      about the bootup cpu state: that is done much earlier by the similarly
      named "boot_cpu_init()" (note lack of "state" in name).
      
      This function initializes some hotplug CPU state, and needs to run after
      the percpu data has been properly initialized.  It even has a comment to
      that effect.
      
      Except it _doesn't_ actually run after the percpu data has been properly
      initialized.  On x86 it happens to do that, but on at least arm and
      arm64, the percpu base pointers are initialized by the arch-specific
      'smp_prepare_boot_cpu()' hook, which ran _after_ boot_cpu_state_init().
      
      This had some unexpected results, and in particular we have a patch
      pending for the merge window that did the obvious cleanup of using
      'this_cpu_write()' in the cpu hotplug init code:
      
        -       per_cpu_ptr(&cpuhp_state, smp_processor_id())->state = CPUHP_ONLINE;
        +       this_cpu_write(cpuhp_state.state, CPUHP_ONLINE);
      
      which is obviously the right thing to do.  Except because of the
      ordering issue, it actually failed miserably and unexpectedly on arm64.
      
      So this just fixes the ordering, and changes the name of the function to
      be 'boot_cpu_hotplug_init()' to make it obvious that it's about cpu
      hotplug state, because the core CPU state was supposed to have already
      been done earlier.
      
      Marked for stable, since the (not yet merged) patch that will show this
      problem is marked for stable.
      Reported-by: Vlastimil Babka <vbabka@suse.cz>
      Reported-by: Mian Yousaf Kaukab <yousaf.kaukab@suse.com>
      Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b5b1404d
  4. 07 Aug 2018, 1 commit
    • cpu/hotplug: Fix SMT supported evaluation · bc2d8d26
      Authored by Thomas Gleixner
      Josh reported that the late SMT evaluation in cpu_smt_state_init() sets
      cpu_smt_control to CPU_SMT_NOT_SUPPORTED when 'nosmt' was supplied on
      the kernel command line, as it cannot differentiate between SMT disabled
      by BIOS and SMT soft-disabled via 'nosmt'. That wrecks the state and
      makes the sysfs interface unusable.
      
      Rework this so that during bringup of the non-boot CPUs the availability of
      SMT is determined in cpu_smt_allowed(). If a newly booted CPU is not a
      'primary' thread, set the local cpu_smt_available marker and evaluate it
      explicitly right after the initial SMP bringup has finished.
      
      SMT evaluation on x86 is a trainwreck, as the firmware has all the
      information _before_ booting the kernel, but there is no interface to
      query it.
      
      Fixes: 73d5e2b4 ("cpu/hotplug: detect SMT disabled by BIOS")
      Reported-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      bc2d8d26
  5. 31 Jul 2018, 1 commit
  6. 26 Jul 2018, 1 commit
  7. 25 Jul 2018, 1 commit
  8. 13 Jul 2018, 2 commits
  9. 09 Jul 2018, 1 commit
  10. 05 Jul 2018, 1 commit
    • x86/KVM: Warn user if KVM is loaded SMT and L1TF CPU bug being present · 26acfb66
      Authored by Konrad Rzeszutek Wilk
      If the L1TF CPU bug is present, we allow the KVM module to be loaded, as the
      majority of users that use Linux and KVM have trusted guests and do not want
      a broken setup.

      Cloud vendors are the ones that are uncomfortable with CVE-2018-3620, and as
      such they are the ones that should set 'nosmt' to one.
      
      Setting 'nosmt' means that the system administrator also needs to disable
      SMT (Hyper-threading) in the BIOS, or via the 'nosmt' command line
      parameter, or via the /sys/devices/system/cpu/smt/control. See commit
      05736e4a ("cpu/hotplug: Provide knobs to control SMT").
      
      Other mitigations are to use task affinity, cpusets, interrupt binding,
      etc. - anything to make sure that _only_ the same guest's vCPUs are running
      on sibling threads.
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      26acfb66
  11. 03 Jul 2018, 1 commit
  12. 02 Jul 2018, 1 commit
  13. 21 Jun 2018, 3 commits
    • cpu/hotplug: Provide knobs to control SMT · 05736e4a
      Authored by Thomas Gleixner
      Provide a command line and a sysfs knob to control SMT.
      
      The command line options are:

       'nosmt':        Enumerate secondary threads, but do not online them.

       'nosmt=force':  Ignore secondary threads completely during enumeration
                       via MP table and ACPI/MADT.

      The sysfs control file has the following states (read/write):

       'on':           SMT is enabled. Secondary threads can be freely onlined.
       'off':          SMT is disabled. Secondary threads, even if enumerated,
                       cannot be onlined.
       'forceoff':     SMT is permanently disabled. Writes to the control
                       file are rejected.
       'notsupported': SMT is not supported by the CPU.
      
      The command line option 'nosmt' sets the sysfs control to 'off'. This
      can be changed to 'on' to reenable SMT during runtime.
      
      The command line option 'nosmt=force' sets the sysfs control to
      'forceoff'. This cannot be changed during runtime.
      
      When SMT is 'on' and the control file is changed to 'off' then all online
      secondary threads are offlined and attempts to online a secondary thread
      later on are rejected.
      
      When SMT is 'off' and the control file is changed to 'on' then secondary
      threads can be onlined again. The 'off' -> 'on' transition does not
      automatically online the secondary threads.
      
      When the control file is set to 'forceoff', the behaviour is the same as
      setting it to 'off', but the operation is irreversible and later writes to
      the control file are rejected.
      
      When the control status is 'notsupported' then writes to the control file
      are rejected.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      05736e4a
    • cpu/hotplug: Split do_cpu_down() · cc1fe215
      Authored by Thomas Gleixner
      Split out the inner workings of do_cpu_down() to allow reuse of that
      function for the upcoming SMT disabling mechanism.
      
      No functional change.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      cc1fe215
    • cpu/hotplug: Make bringup/teardown of smp threads symmetric · c4de6569
      Authored by Thomas Gleixner
      The asymmetry caused a warning to trigger if the bootup was stopped in state
      CPUHP_AP_ONLINE_IDLE. The warning no longer triggers as kthread_park() can
      now be invoked on already or still parked threads. But there is still no
      reason to have this be asymmetric.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      c4de6569
  14. 16 Mar 2018, 1 commit
  15. 14 Mar 2018, 1 commit
  16. 30 Dec 2017, 1 commit
  17. 28 Dec 2017, 1 commit
  18. 07 Dec 2017, 1 commit
  19. 28 Nov 2017, 1 commit
  20. 21 Oct 2017, 1 commit
    • cpu/hotplug: Reset node state after operation · 1f7c70d6
      Authored by Thomas Gleixner
      The recent rework of the cpu hotplug internals changed the usage of the per
      cpu state->node field, but neglected to clean it up after use.
      
      So subsequent hotplug operations use the stale pointer from a previous
      operation and hand it into the callback functions. The callbacks then
      dereference a pointer which either belongs to a different facility or
      points to freed and potentially reused memory. In either case data
      corruption and crashes are the obvious consequence.
      
      Reset the node and the last pointers in the per cpu state to NULL after the
      operation which set them has completed.
      
      Fixes: 96abb968 ("smp/hotplug: Allow external multi-instance rollback")
      Reported-by: Tvrtko Ursulin <tursulin@ursulin.net>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1710211606130.3213@nanos
      1f7c70d6
  21. 26 Sep 2017, 6 commits
    • smp/hotplug: Hotplug state fail injection · 1db49484
      Authored by Peter Zijlstra
      Add a sysfs file to one-time fail a specific state. This can be used
      to test the state rollback code paths.
      
      Something like this (hotplug-up.sh):
      
        #!/bin/bash
      
        echo 0 > /debug/sched_debug
        echo 1 > /debug/tracing/events/cpuhp/enable
      
        ALL_STATES=`cat /sys/devices/system/cpu/hotplug/states | cut -d':' -f1`
        STATES=${1:-$ALL_STATES}
      
        for state in $STATES
        do
      	  echo 0 > /sys/devices/system/cpu/cpu1/online
      	  echo 0 > /debug/tracing/trace
      	  echo Fail state: $state
      	  echo $state > /sys/devices/system/cpu/cpu1/hotplug/fail
      	  cat /sys/devices/system/cpu/cpu1/hotplug/fail
      	  echo 1 > /sys/devices/system/cpu/cpu1/online
      
      	  cat /debug/tracing/trace > hotfail-${state}.trace
      
      	  sleep 1
        done
      
      Can be used to test for all possible rollback (barring multi-instance)
      scenarios on CPU-up, CPU-down is a trivial modification of the above.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: bigeasy@linutronix.de
      Cc: efault@gmx.de
      Cc: rostedt@goodmis.org
      Cc: max.byungchul.park@gmail.com
      Link: https://lkml.kernel.org/r/20170920170546.972581715@infradead.org
      
      1db49484
    • smp/hotplug: Differentiate the AP completion between up and down · 5ebe7742
      Authored by Peter Zijlstra
      With lockdep-crossrelease we get deadlock reports that span cpu-up and
      cpu-down chains. Such deadlocks cannot possibly happen because cpu-up
      and cpu-down are globally serialized.
      
        takedown_cpu()
          irq_lock_sparse()
          wait_for_completion(&st->done)
      
                                      cpuhp_thread_fun
                                        cpuhp_up_callback
                                          cpuhp_invoke_callback
                                            irq_affinity_online_cpu
                                              irq_lock_sparse()
                                              irq_unlock_sparse()
                                        complete(&st->done)
      
      Now that we have consistent AP state, we can trivially separate the
      AP completion between up and down using st->bringup.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: max.byungchul.park@gmail.com
      Cc: bigeasy@linutronix.de
      Cc: efault@gmx.de
      Cc: rostedt@goodmis.org
      Link: https://lkml.kernel.org/r/20170920170546.872472799@infradead.org
      
      5ebe7742
    • smp/hotplug: Differentiate the AP-work lockdep class between up and down · 5f4b55e1
      Authored by Peter Zijlstra
      With lockdep-crossrelease we get deadlock reports that span cpu-up and
      cpu-down chains. Such deadlocks cannot possibly happen because cpu-up
      and cpu-down are globally serialized.
      
        CPU0                  CPU1                    CPU2
        cpuhp_up_callbacks:   takedown_cpu:           cpuhp_thread_fun:
      
        cpuhp_state
                              irq_lock_sparse()
          irq_lock_sparse()
                              wait_for_completion()
                                                      cpuhp_state
                                                      complete()
      
      Now that we have consistent AP state, we can trivially separate the
      AP-work class between up and down using st->bringup.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: max.byungchul.park@gmail.com
      Cc: bigeasy@linutronix.de
      Cc: efault@gmx.de
      Cc: rostedt@goodmis.org
      Link: https://lkml.kernel.org/r/20170920170546.922524234@infradead.org
      
      5f4b55e1
    • smp/hotplug: Callback vs state-machine consistency · 724a8688
      Authored by Peter Zijlstra
      While the generic callback functions have an 'int' return and thus
      appear to be allowed to return an error, this is not true for all states.

      Specifically, what used to be STARTING/DYING are run with IRQs
      disabled from critical parts of CPU bringup/teardown and are not
      allowed to fail. Add WARNs to enforce this rule.
      
      But since some callbacks are indeed allowed to fail, we can end up in a
      situation where a state-machine rollback encounters a failure. In that
      case we are stuck: we can't go forward and we can't go back. Add a WARN
      for that case as well.
      
      AFAICT this is a fundamental 'problem' with no real obvious solution.
      We want the 'prepare' callbacks to allow failure on either up or down.
      Typically on prepare-up this would be things like -ENOMEM from
      resource allocations, and the typical usage in prepare-down would be
      something like -EBUSY to avoid CPUs being taken away.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: bigeasy@linutronix.de
      Cc: efault@gmx.de
      Cc: rostedt@goodmis.org
      Cc: max.byungchul.park@gmail.com
      Link: https://lkml.kernel.org/r/20170920170546.819539119@infradead.org
      
      724a8688
    • smp/hotplug: Rewrite AP state machine core · 4dddfb5f
      Authored by Peter Zijlstra
      There is currently no explicit state change on rollback. That is,
      st->bringup, st->rollback and st->target are not consistent when doing
      the rollback.
      
      Rework the AP state handling to be more coherent. This does mean we
      have to do a second AP kick-and-wait for rollback, but since rollback
      is the slow path of a slowpath, this really should not matter.
      
      Take this opportunity to simplify the AP thread function to only run a
      single callback per invocation. This unifies the three single/up/down
      modes it supports. The looping it used to do for up/down is achieved
      by retaining should_run and relying on the main smpboot_thread_fn()
      loop.
      
      (I have most of a patch that does the same for the BP state handling,
      but that's not critical and gets a little complicated because
      CPUHP_BRINGUP_CPU does the AP handoff from a callback, which gets
      recursive @st usage. I still have to de-fugly that.)
      
      [ tglx: Move cpuhp_down_callbacks() et al. into the HOTPLUG_CPU section to
        	avoid gcc complaining about unused functions. Make the HOTPLUG_CPU
        	one piece instead of having two consecutive ifdef sections of the
        	same type. ]
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: bigeasy@linutronix.de
      Cc: efault@gmx.de
      Cc: rostedt@goodmis.org
      Cc: max.byungchul.park@gmail.com
      Link: https://lkml.kernel.org/r/20170920170546.769658088@infradead.org
      
      4dddfb5f
    • smp/hotplug: Allow external multi-instance rollback · 96abb968
      Authored by Peter Zijlstra
      Currently the rollback of multi-instance states is handled inside
      cpuhp_invoke_callback(). The problem is that when we want to allow an
      explicit state change for rollback, we need to return from the
      function without doing the rollback.
      
      Change cpuhp_invoke_callback() to optionally return the multi-instance
      state, such that rollback can be done from a subsequent call.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: bigeasy@linutronix.de
      Cc: efault@gmx.de
      Cc: rostedt@goodmis.org
      Cc: max.byungchul.park@gmail.com
      Link: https://lkml.kernel.org/r/20170920170546.720361181@infradead.org
      
      96abb968
  22. 14 Sep 2017, 1 commit
    • watchdog/hardlockup/perf: Prevent CPU hotplug deadlock · 941154bd
      Authored by Thomas Gleixner
      The following deadlock is possible in the watchdog hotplug code:
      
        cpus_write_lock()
          ...
            takedown_cpu()
              smpboot_park_threads()
                smpboot_park_thread()
                  kthread_park()
                    ->park() := watchdog_disable()
                      watchdog_nmi_disable()
                        perf_event_release_kernel();
                          put_event()
                            _free_event()
                              ->destroy() := hw_perf_event_destroy()
                                x86_release_hardware()
                                  release_ds_buffers()
                                    get_online_cpus()
      
      when a per cpu watchdog perf event is destroyed which drops the last
      reference to the PMU hardware. The cleanup code there invokes
      get_online_cpus() which instantly deadlocks because the hotplug percpu
      rwsem is write locked.
      
      To solve this add a deferring mechanism:
      
        cpus_write_lock()
      			   kthread_park()
      			    watchdog_nmi_disable(deferred)
      			      perf_event_disable(event);
      			      move_event_to_deferred(event);
      			   ....
        cpus_write_unlock()
        cleanup_deferred_events()
          perf_event_release_kernel()
      
      This is still properly serialized against concurrent hotplug via the
      cpu_add_remove_lock, which is held by the task which initiated the hotplug
      event.
      
      This is also used to handle event destruction when the watchdog threads are
      parked via other mechanisms than CPU hotplug.
      Analyzed-by: Peter Zijlstra <peterz@infradead.org>
      Reported-by: Borislav Petkov <bp@alien8.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Don Zickus <dzickus@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Link: http://lkml.kernel.org/r/20170912194146.884469246@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      941154bd
  23. 26 Jul 2017, 1 commit
    • rcu: Migrate callbacks earlier in the CPU-offline timeline · a58163d8
      Authored by Paul E. McKenney
      RCU callbacks must be migrated away from an outgoing CPU, and this is
      done near the end of the CPU-hotplug operation, after the outgoing CPU is
      long gone.  Unfortunately, this means that other CPU-hotplug callbacks
      can execute while the outgoing CPU's callbacks are still immobilized
      on the long-gone CPU's callback lists.  If any of these CPU-hotplug
      callbacks must wait, either directly or indirectly, for the invocation
      of any of the immobilized RCU callbacks, the system will hang.
      
      This commit avoids such hangs by migrating the callbacks away from the
      outgoing CPU immediately upon its departure, shortly after the return
      from __cpu_die() in takedown_cpu().  Thus, RCU is able to advance these
      callbacks and invoke them, which allows all the after-the-fact CPU-hotplug
      callbacks to wait on these RCU callbacks without risk of a hang.
      
      While in the neighborhood, this commit also moves rcu_send_cbs_to_orphanage()
      and rcu_adopt_orphan_cbs() under a pre-existing #ifdef to avoid including
      dead code on the one hand and to avoid define-without-use warnings on the
      other hand.
      Reported-by: Jeffrey Hugo <jhugo@codeaurora.org>
      Link: http://lkml.kernel.org/r/db9c91f6-1b17-6136-84f0-03c3c2581ab4@codeaurora.org
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Richard Weinberger <richard@nod.at>
      a58163d8
  24. 20 Jul 2017, 1 commit
    • smp/hotplug: Handle removal correctly in cpuhp_store_callbacks() · 0c96b273
      Authored by Ethan Barnes
      If cpuhp_store_callbacks() is called for CPUHP_AP_ONLINE_DYN or
      CPUHP_BP_PREPARE_DYN, which are the indicators for dynamically allocated
      states, then cpuhp_store_callbacks() allocates a new dynamic state. The
      first allocation in each range returns CPUHP_AP_ONLINE_DYN or
      CPUHP_BP_PREPARE_DYN.
      
      If cpuhp_remove_state() is invoked for one of these states, then there is
      no protection against the allocation mechanism. So the removal, which
      should clear the callbacks and the name, gets a new state assigned and
      clears that one.
      
      As a consequence the state which should be cleared stays initialized. A
      consecutive CPU hotplug operation dereferences the state callbacks and
      accesses either freed or reused memory, resulting in crashes.
      
      Add a protection against this by checking the name argument for NULL. If
      it's NULL it's a removal. If not, it's an allocation.
      
      [ tglx: Added a comment and massaged changelog ]
      
      Fixes: 5b7aa87e ("cpu/hotplug: Implement setup/removal interface")
      Signed-off-by: Ethan Barnes <ethan.barnes@sandisk.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: "Srivatsa S. Bhat" <srivatsa@mit.edu>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/DM2PR04MB398242FC7776D603D9F99C894A60@DM2PR04MB398.namprd04.prod.outlook.com
      0c96b273
  25. 12 Jul 2017, 1 commit
  26. 06 Jul 2017, 1 commit
    • smp/hotplug: Move unparking of percpu threads to the control CPU · 9cd4f1a4
      Authored by Thomas Gleixner
      Vikram reported the following backtrace:
      
         BUG: scheduling while atomic: swapper/7/0/0x00000002
         CPU: 7 PID: 0 Comm: swapper/7 Not tainted 4.9.32-perf+ #680
         schedule
         schedule_hrtimeout_range_clock
         schedule_hrtimeout
         wait_task_inactive
         __kthread_bind_mask
         __kthread_bind
         __kthread_unpark
         kthread_unpark
         cpuhp_online_idle
         cpu_startup_entry
         secondary_start_kernel
      
      He analyzed correctly that a parked cpu hotplug thread of an offlined CPU
      was still on the runqueue when the CPU came back online and tried to unpark
      it. This causes the thread which invoked kthread_unpark() to call
      wait_task_inactive() and subsequently schedule() with preemption disabled.
      His proposed workaround was to "make sure" that a parked thread has
      scheduled out when the CPU goes offline, so the situation cannot happen.
      
      But that's still wrong, because the root cause is neither the fact that
      the percpu thread is still on the runqueue nor that preemption is
      disabled; the latter could simply be solved by enabling preemption
      before calling kthread_unpark().

      The real issue is that the calling thread is the idle task of the
      upcoming CPU, which is not supposed to call anything that might sleep.
      The moron who wrote that code completely missed that kthread_unpark()
      might end up in schedule().
      
      The solution is simpler than expected. The thread which controls the
      hotplug operation is waiting for the CPU to call complete() on the hotplug
      state completion. So the idle task of the upcoming CPU can set its state to
      CPUHP_AP_ONLINE_IDLE and invoke complete(). This in turn wakes the control
      task on a different CPU, which then can safely do the unpark and kick the
      now unparked hotplug thread of the upcoming CPU to complete the bringup to
      the final target state.
      
      Control CPU                     AP
      
      bringup_cpu();
        __cpu_up()  ------------>
      				bringup_ap();
        bringup_wait_for_ap()
          wait_for_completion();
                                      cpuhp_online_idle();
                      <------------    complete();
          unpark(AP->stopper);
          unpark(AP->hotplugthread);
                                      while(1)
                                        do_idle();
          kick(AP->hotplugthread);
          wait_for_completion();	hotplug_thread()
      				  run_online_callbacks();
      				  complete();
      
      Fixes: 8df3e07e ("cpu/hotplug: Let upcoming cpu bring itself fully up")
      Reported-by: Vikram Mulukutla <markivx@codeaurora.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Sewior <bigeasy@linutronix.de>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1707042218020.2131@nanos
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      9cd4f1a4
  27. 30 Jun 2017, 1 commit
  28. 23 Jun 2017, 1 commit
    • genirq/cpuhotplug: Handle managed IRQs on CPU hotplug · c5cb83bb
      Authored by Thomas Gleixner
      If a CPU goes offline, interrupts affine to the CPU are moved away. If the
      outgoing CPU is the last CPU in the affinity mask, the migration code
      breaks the affinity and sets it to all online CPUs.
      
      This is a problem for affinity-managed interrupts, as CPU hotplug is often
      used for power management purposes. If the affinity is broken, the
      interrupt is no longer affine to the CPUs to which it was allocated.
      
      The affinity spreading makes it possible to lay out multi-queue devices
      such that they are assigned to a single CPU or a group of CPUs. If the
      last CPU goes offline, the queue is no longer used, so the interrupt can
      be shut down gracefully and parked until one of the assigned CPUs comes
      online again.
      
      Add a graceful shutdown mechanism into the irq affinity breaking code path,
      mark the irq as MANAGED_SHUTDOWN and leave the affinity mask unmodified.
      
      In the online path, scan the active interrupts for managed interrupts and,
      if the interrupt is functional and the newly online CPU is part of the
      affinity mask, restart the interrupt if it is marked MANAGED_SHUTDOWN; if
      the interrupt is already started up, try to add the CPU back to the
      effective affinity mask.
      Originally-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Keith Busch <keith.busch@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20170619235447.273417334@linutronix.de
      c5cb83bb
  29. 13 Jun 2017, 1 commit
    • cpu/hotplug: Remove unused check_for_tasks() function · 57de7212
      Authored by Arnd Bergmann
      clang -Wunused-function found one remaining function that was
      apparently meant to be removed in a recent code cleanup:
      
      kernel/cpu.c:565:20: warning: unused function 'check_for_tasks' [-Wunused-function]
      
      Sebastian explained: the function became unused unintentionally, but there
      is already a failure check when a task cannot be removed from the outgoing
      cpu in the scheduler code, so bringing it back would not really add any
      value.
      
      Fixes: 530e9b76 ("cpu/hotplug: Remove obsolete cpu hotplug register/unregister functions")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Link: http://lkml.kernel.org/r/20170608085544.2257132-1-arnd@arndb.de
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      57de7212
  30. 03 Jun 2017, 1 commit
  31. 26 May 2017, 2 commits