  1. 07 Dec, 2009 1 commit
    • sched: Fix balance vs hotplug race · 6ad4c188
      Peter Zijlstra authored
      Since (e761b772: cpu hotplug, sched: Introduce cpu_active_map and redo
      sched domain managment) we have cpu_active_mask which is supposed to rule
      scheduler migration and load-balancing, except it never (fully) did.
      
      The particular problem being solved here is a crash in try_to_wake_up()
      where select_task_rq() ends up selecting an offline cpu because
      select_task_rq_fair() trusts the sched_domain tree to reflect the
      current state of affairs, similarly select_task_rq_rt() trusts the
      root_domain.
      
      However, the sched_domains are updated from CPU_DEAD, which is after the
      cpu is taken offline and after stop_machine is done. Therefore it can
      race perfectly well with code assuming the domains are right.
      
      Cure this by building the domains from cpu_active_mask on
      CPU_DOWN_PREPARE.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  2. 26 Nov, 2009 1 commit
    • timers, init: Limit the number of per cpu calibration bootup messages · feae3203
      Mike Travis authored
      Limit the number of per cpu calibration messages by only
      printing out results for the first cpu to boot.
      
      Also, don't print "CPUx is down" as this is expected, and we
      don't need 4096 reminders... ;-)
      Signed-off-by: Mike Travis <travis@sgi.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Roland Dreier <rdreier@cisco.com>
      Cc: Randy Dunlap <rdunlap@xenotime.net>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Greg Kroah-Hartman <gregkh@suse.de>
      Cc: Yinghai Lu <yhlu.kernel@gmail.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Cc: Jack Steiner <steiner@sgi.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20091118002219.889552000@alcatraz.americas.sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  3. 02 Sep, 2009 1 commit
  4. 22 Aug, 2009 1 commit
    • x86, pat/mtrr: Rendezvous all the cpus for MTRR/PAT init · d0af9eed
      Suresh Siddha authored
      SDM Vol 3a section titled "MTRR considerations in MP systems" specifies
      the need for synchronizing the logical cpu's while initializing/updating
      MTRR.
      
      Currently the Linux kernel synchronizes all cpus only when a single
      MTRR register is programmed/updated. When an AP comes online
      (during boot/cpu-online/resume) and we initialize all the MTRR/PAT registers,
      we don't follow this synchronization algorithm.

      This can lead to scenarios where, during a dynamic cpu online, that logical cpu
      is initializing MTRR/PAT with cache disabled (cr0.cd=1) etc. while its logical
      HT sibling continues to run (also with cache disabled because of cr0.cd=1
      on its sibling).
      
      Starting from Westmere, VMX transitions with cr0.cd=1 don't work properly
      (because of some VMX performance optimizations) and the above scenario
      (with one logical cpu doing VMX activity and another logical cpu coming online)
      can result in system crash.
      
      Fix the MTRR initialization by doing a rendezvous of all the cpus. During
      boot and resume, we delay the MTRR/PAT init for APs until all the
      logical cpus come online; the rendezvous process at the end of AP bringup
      then initializes the MTRR/PAT for all APs.
      
      For dynamic single cpu online, we synchronize all the logical cpus and
      do the MTRR/PAT init on the AP that is coming online.
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  5. 22 Jul, 2009 1 commit
    • x86, intel_txt: Intel TXT Sx shutdown support · 86886e55
      Joseph Cihula authored
      Support for graceful handling of sleep states (S3/S4/S5) after an Intel(R) TXT launch.
      
      Without this patch, attempting to place the system in one of the ACPI sleep
      states (S3/S4/S5) will cause the TXT hardware to treat this as an attack and
      will cause a system reset, with memory locked.  Not only may the subsequent
      memory scrub take some time, but the platform will be unable to enter the
      requested power state.
      
      This patch calls back into the tboot so that it may properly and securely clean
      up system state and clear the secrets-in-memory flag, after which it will place
      the system into the requested sleep state using ACPI information passed by the kernel.
      
       arch/x86/kernel/smpboot.c     |    2 ++
       drivers/acpi/acpica/hwsleep.c |    3 +++
       kernel/cpu.c                  |    7 ++++++-
       3 files changed, 11 insertions(+), 1 deletion(-)
      Signed-off-by: Joseph Cihula <joseph.cihula@intel.com>
      Signed-off-by: Shane Wang <shane.wang@intel.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  6. 23 Jun, 2009 1 commit
  7. 30 Mar, 2009 1 commit
  8. 08 Jan, 2009 1 commit
    • stop_machine/cpu hotplug: fix disable_nonboot_cpus · a0e280e0
      Heiko Carstens authored
      disable_nonboot_cpus calls _cpu_down. But _cpu_down requires that the
      caller already created the stop_machine workqueue (like cpu_down does).
      Otherwise a call to stop_machine will lead to accesses to random memory
      regions.
      
      When introducing this new interface (9ea09af3
      "stop_machine: introduce stop_machine_create/destroy") I missed the second
      call site of _cpu_down.
      So add the missing stop_machine_create/destroy calls to disable_nonboot_cpus
      as well.
      
      Fixes suspend-to-ram/disk and also this bug:
      
      [  286.547348] BUG: unable to handle kernel paging request at 6b6b6b6b
      [  286.548940] IP: [<c0150ca4>] __stop_machine+0x88/0xe3
      [  286.550598] Oops: 0002 [#1] SMP
      [  286.560580] Pid: 3273, comm: halt Not tainted (2.6.28-06127-g238c6d54
      [  286.560580] EIP: is at __stop_machine+0x88/0xe3
      [  286.560580] Process halt (pid: 3273, ti=f1a28000 task=f4530f30
      [  286.560580] Call Trace:
      [  286.560580]  [<c03d04e4>] ? _cpu_down+0x10f/0x234
      [  286.560580]  [<c012a57e>] ? disable_nonboot_cpus+0x58/0xdc
      [  286.560580]  [<c01360c0>] ? kernel_poweroff+0x22/0x39
      [  286.560580]  [<c0136301>] ? sys_reboot+0xde/0x14c
      [  286.560580]  [<c01331b2>] ? complete_signal+0x179/0x191
      [  286.560580]  [<c0133396>] ? send_signal+0x1cc/0x1e1
      [  286.560580]  [<c03de418>] ? _spin_unlock_irqrestore+0x2d/0x3c
      [  286.560580]  [<c0133b65>] ? group_send_signal_info+0x58/0x61
      [  286.560580]  [<c0133b9e>] ? kill_pid_info+0x30/0x3a
      [  286.560580]  [<c0133d49>] ? sys_kill+0x75/0x13a
      [  286.560580]  [<c01a06cb>] ? mntput_no_expire+0x1f/0x101
      [  286.560580]  [<c019b3b3>] ? dput+0x1e/0x105
      [  286.560580]  [<c018ef87>] ?  __fput+0x150/0x158
      [  286.560580]  [<c0157abf>] ? audit_syscall_entry+0x137/0x159
      [  286.560580]  [<c010329f>] ? sysenter_do_call+0x12/0x34
      Reported-and-tested-by: "Justin P. Mattock" <justinmattock@gmail.com>
      Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Tested-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  9. 05 Jan, 2009 1 commit
    • stop_machine: introduce stop_machine_create/destroy. · 9ea09af3
      Heiko Carstens authored
      Introduce stop_machine_create/destroy. With this interface subsystems
      that need a non-failing stop_machine environment can create the
      stop_machine threads before actually calling stop_machine.
      When the threads aren't needed anymore they can be killed with
      stop_machine_destroy again.
      
      When stop_machine gets called and the threads aren't present they
      will be created and destroyed automatically. This restores the old
      behaviour of stop_machine.
      
      This patch also converts cpu hotplug to the new interface since it
      is special: cpu_down calls __stop_machine instead of stop_machine.
      However the kstop threads will only be created when stop_machine
      gets called.
      
      Changing the code so that the threads would be created automatically
      on __stop_machine is currently not possible: when __stop_machine gets
      called we hold cpu_add_remove_lock, which is the same lock that
      create_rt_workqueue would take. So the workqueue needs to be created
      before the cpu hotplug code locks cpu_add_remove_lock.
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
  10. 01 Jan, 2009 1 commit
    • cpumask: convert kernel/cpu.c · e0b582ec
      Rusty Russell authored
      Impact: Reduce kernel stack and memory usage, use new cpumask API.
      
      Use cpumask_var_t for take_cpu_down() stack var, and frozen_cpus.
      
      Note that notify_cpu_starting() can be called before core_initcall
      allocates frozen_cpus, but the NULL check is optimized out by gcc for
      the CONFIG_CPUMASK_OFFSTACK=n case.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
  11. 30 Dec, 2008 2 commits
  12. 13 Dec, 2008 1 commit
    • cpumask: centralize cpu_online_map and cpu_possible_map · 98a79d6a
      Rusty Russell authored
      Impact: cleanup
      
      Each SMP arch defines these themselves.  Move them to a central
      location.
      
      Twists:
      1) Some archs (m32r, parisc, s390) set possible_map to all 1, so we add a
         CONFIG_INIT_ALL_POSSIBLE for this rather than break them.
      
      2) mips and sparc32 '#define cpu_possible_map phys_cpu_present_map'.
         Those archs simply have phys_cpu_present_map replaced everywhere.
      
      3) Alpha defined cpu_possible_map to cpu_present_map; this is tricky
         so I just manipulate them both in sync.
      
      4) IA64, cris and m32r have gratuitous 'extern cpumask_t cpu_possible_map'
         declarations.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Reviewed-by: Grant Grundler <grundler@parisc-linux.org>
      Tested-by: Tony Luck <tony.luck@intel.com>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Travis <travis@sgi.com>
      Cc: ink@jurassic.park.msu.ru
      Cc: rmk@arm.linux.org.uk
      Cc: starvik@axis.com
      Cc: tony.luck@intel.com
      Cc: takata@linux-m32r.org
      Cc: ralf@linux-mips.org
      Cc: grundler@parisc-linux.org
      Cc: paulus@samba.org
      Cc: schwidefsky@de.ibm.com
      Cc: lethal@linux-sh.org
      Cc: wli@holomorphy.com
      Cc: davem@davemloft.net
      Cc: jdike@addtoit.com
      Cc: mingo@redhat.com
  13. 01 Dec, 2008 1 commit
  14. 06 Nov, 2008 1 commit
    • cpumask: introduce new API, without changing anything · 2d3854a3
      Rusty Russell authored
      Impact: introduce new APIs
      
      We want to deprecate cpumasks on the stack, as we are headed for
      gynormous numbers of CPUs.  Eventually, we want to head towards an
      undefined 'struct cpumask' so they can never be declared on stack.
      
      1) New cpumask functions which take pointers instead of copies.
         (cpus_* -> cpumask_*)
      
      2) Several new helpers to reduce requirements for temporary cpumasks
         (cpumask_first_and, cpumask_next_and, cpumask_any_and)
      
      3) Helpers for declaring cpumasks on or offstack for large NR_CPUS
         (cpumask_var_t, alloc_cpumask_var and free_cpumask_var)
      
      4) 'struct cpumask' for explicitness and to mark new-style code.
      
      5) Make iterator functions stop at nr_cpu_ids (a runtime constant),
         not NR_CPUS for time efficiency and for smaller dynamic allocations
         in future.
      
      6) cpumask_copy() so we can allocate less than a full cpumask eventually
         (for alloc_cpumask_var), and so we can eliminate the 'struct cpumask'
         definition eventually.
      
      7) work_on_cpu() helper for doing task on a CPU, rather than saving old
         cpumask for current thread and manipulating it.
      
      8) smp_call_function_many() which is smp_call_function_mask() except
         taking a cpumask pointer.
      
      Note that this patch simply introduces the new functions and leaves
      the obsolescent ones in place.  This is to simplify the transition
      patches.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  15. 09 Sep, 2008 1 commit
    • kernel/cpu.c: create a CPU_STARTING cpu_chain notifier · e545a614
      Manfred Spraul authored
      Right now, there is no notifier that is called on a new cpu, before the new
      cpu begins processing interrupts/softirqs.
      Various kernel functions need that notification: e.g. kvm works around its
      absence by calling smp_call_function_single(), and rcu polls cpu_online_map.
      
      The patch adds a CPU_STARTING notification. It also adds a helper function
      that sends the message to all cpu_chain handlers.
      
      Tested on x86-64.
      All other archs are untested. Especially on sparc, I'm not sure if I got
      it right.
      Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  16. 07 Sep, 2008 1 commit
    • kernel/cpu.c: Move the CPU_DYING notifiers · 3ba35573
      Manfred Spraul authored
      When a cpu is taken offline, the CPU_DYING notifiers are called on the
      dying cpu. According to <linux/notifiers.h>, the cpu should be "not
      running any task, not handling interrupts, soon dead".
      
      For the current implementation, this is not true:
      - __cpu_disable can fail. If it fails, then the cpu will remain alive
        and happy.
      - At least on x86, __cpu_disable() briefly enables the local interrupts
        to handle any outstanding interrupts.
      
      What about moving CPU_DYING down a few lines, behind the __cpu_disable()
      line?
      There are only two CPU_DYING handlers in the kernel right now: one in
      kvm, one in the scheduler. Both should work with the patch applied
      [and: I'm not sure if either one handles a failing __cpu_disable()]
      
      The patch survives simply offlining a cpu. kvm untested due to lack
      of a test setup.
      Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  17. 13 Aug, 2008 1 commit
  18. 11 Aug, 2008 1 commit
    • sched, cpu hotplug: fix set_cpus_allowed() use in hotplug callbacks · 279ef6bb
      Dmitry Adamushko authored
      Mark Langsdorf reported:
      
      > One of my co-workers noticed that the powernow-k8
      > driver no longer restarts when a CPU core is
      > hot-disabled and then hot-enabled on AMD quad-core
      > systems.
      >
      > The following commands work fine on 2.6.26 and fail
      > on 2.6.27-rc1:
      >
      > echo 0 > /sys/devices/system/cpu/cpu3/online
      > echo 1 > /sys/devices/system/cpu/cpu3/online
      > find /sys -name cpufreq
      >
      > For 2.6.26, the find will return a cpufreq
      > directory for each processor.  In 2.6.27-rc1,
      > the cpu3 directory is missing.
      >
      > After digging through the code, the following
      > logic is failing when the core is hot-enabled
      > at runtime.  The code works during the boot
      > sequence.
      >
      >       cpumask_t = current->cpus_allowed;
      >       set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
      >       if (smp_processor_id() != cpu)
      >               return -ENODEV;
      
      So set the CPU active before calling the CPU_ONLINE notifier chain,
      there are a handful of notifiers that use set_cpus_allowed().
      
      This fix also solves the problem with x86-microcode. I've sent
      alternative patches for microcode, but as this "rely on
      set_cpus_allowed_ptr() being workable in cpu-hotplug(CPU_ONLINE, ...)"
      assumption seems to be more broad than what we thought, perhaps this fix
      should be applied.
      
      With this patch we define that by the moment CPU_ONLINE is being sent,
      a 'cpu' is online and ready for tasks to be migrated onto it.
      Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>
      Reported-by: Mark Langsdorf <mark.langsdorf@amd.com>
      Tested-by: Mark Langsdorf <mark.langsdorf@amd.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  19. 29 Jul, 2008 1 commit
    • cpu masks: optimize and clean up cpumask_of_cpu() · e56b3bc7
      Linus Torvalds authored
      Clean up and optimize cpumask_of_cpu(), by sharing all the zero words.
      
      Instead of stupidly generating all possible i=0...NR_CPUS 2^i patterns
      creating a huge array of constant bitmasks, realize that the zero words
      can be shared.
      
      In other words, on a 64-bit architecture, we only ever need 64 of these
      arrays - with a different bit set in one single word (with enough zero
      words around it so that we can create any bitmask by just offsetting in
      that big array). And then we just put enough zeroes around it that we
      can point every single cpumask to be one of those things.
      
      So when we have 4k CPU's, instead of having 4k arrays (of 4k bits each,
      with one bit set in each array - 2MB memory total), we have exactly 64
      arrays instead, each 8k bits in size (64kB total).
      
      And then we just point cpumask(n) to the right position (which we can
      calculate dynamically). Once we have the right arrays, getting
      "cpumask(n)" ends up being:
      
        static inline const cpumask_t *get_cpu_mask(unsigned int cpu)
        {
                const unsigned long *p = cpu_bit_bitmap[1 + cpu % BITS_PER_LONG];
                p -= cpu / BITS_PER_LONG;
                return (const cpumask_t *)p;
        }
      
      This brings other advantages and simplifications as well:
      
       - we are not wasting memory that is just filled with a single bit in
         various different places
      
       - we don't need all those games to re-create the arrays in some dense
         format, because they're already going to be dense enough.
      
      If we compile a kernel for up to 4k CPU's, "wasting" that 64kB of memory
      is a non-issue (especially since by doing this "overlapping" trick we
      probably get better cache behaviour anyway).
      
      [ mingo@elte.hu:
      
        Converted Linus's mails into a commit. See:
      
           http://lkml.org/lkml/2008/7/27/156
           http://lkml.org/lkml/2008/7/28/320
      
        Also applied a family filter - which also has the side-effect of leaving
        out the bits where Linus calls me an idio... Oh, never mind ;-)
      ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: Mike Travis <travis@sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
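The shared-zero-words idea in the commit above can be tried out in userspace. The following is a minimal sketch, not the kernel implementation: it assumes a flat array filled in at runtime by a hypothetical `init_cpu_bit_bitmap()` (the kernel table is laid out differently and is constant), and `NR_CPUS`, `MASK_LONGS`, `mask_test_cpu()` and `mask_weight()` are names chosen for this illustration only.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define NR_CPUS        4096u  /* hypothetical build-time CPU limit */
#define BITS_PER_LONG  ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))
#define MASK_LONGS     ((NR_CPUS + BITS_PER_LONG - 1) / BITS_PER_LONG)

/*
 * BITS_PER_LONG + 1 rows of MASK_LONGS words, stored as one flat array so
 * that stepping backwards from a row start stays inside a single object.
 * Row 0 is all zeroes; row 1 + b has bit b set in its first word.
 */
static unsigned long cpu_bit_bitmap[(BITS_PER_LONG + 1) * MASK_LONGS];

/* Fill in the single set bit of each row. Call once before get_cpu_mask(). */
static void init_cpu_bit_bitmap(void)
{
    for (unsigned int b = 0; b < BITS_PER_LONG; b++)
        cpu_bit_bitmap[(1 + b) * MASK_LONGS] = 1UL << b;
}

/*
 * Return a MASK_LONGS-word mask in which only `cpu` is set. The row is
 * chosen by the bit position inside a word; backing up by cpu / BITS_PER_LONG
 * words moves the one non-zero word to the right word index of the mask.
 */
static const unsigned long *get_cpu_mask(unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    const unsigned long *p = &cpu_bit_bitmap[(1 + cpu % BITS_PER_LONG) * MASK_LONGS];
    p -= cpu / BITS_PER_LONG;
    return p;
}

/* Test whether `cpu` is set in a mask returned by get_cpu_mask(). */
static int mask_test_cpu(const unsigned long *mask, unsigned int cpu)
{
    return (int)((mask[cpu / BITS_PER_LONG] >> (cpu % BITS_PER_LONG)) & 1UL);
}

/* Count the CPUs set in a mask (slow, for checking only). */
static unsigned int mask_weight(const unsigned long *mask)
{
    unsigned int n = 0;
    for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
        n += (unsigned int)mask_test_cpu(mask, cpu);
    return n;
}
```

After calling `init_cpu_bit_bitmap()`, every `get_cpu_mask(n)` points into the same table, so no per-CPU mask array is ever built.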
  20. 28 Jul, 2008 3 commits
    • stop_machine(): stop_machine_run() changed to use cpu mask · eeec4fad
      Rusty Russell authored
      Instead of a "cpu" arg with magic values NR_CPUS (any cpu) and ~0 (all
      cpus), pass a cpumask_t.  Allow NULL for the common case (where we
      don't care which CPU the function is run on): temporary cpumask_t's
      are usually considered bad for stack space.
      
      This deprecates stop_machine_run, to be removed soon when all the
      callers are dead.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    • Hotplug CPU: don't check cpu_online after take_cpu_down · 04321587
      Rusty Russell authored
      Akinobu points out that if take_cpu_down() succeeds, the cpu must be offline.
      Remove the cpu_online() check, and put a BUG_ON().
      
      Quoting Akinobu Mita:
         Actually the cpu_online() check was necessary before applying this
         stop_machine: simplify patch.
      
         With old __stop_machine_run(), __stop_machine_run() could succeed
         (return !IS_ERR(p) value) even if take_cpu_down() returned non-zero value.
         The return value of take_cpu_down() was obtained through kthread_stop()..
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: "Akinobu Mita" <akinobu.mita@gmail.com>
    • Simplify stop_machine · ffdb5976
      Rusty Russell authored
      stop_machine creates a kthread which creates kernel threads.  We can
      create those threads directly and simplify things a little.  Some care
      must be taken with CPU hotunplug, which has special needs, but that code
      seems more robust than it was in the past.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
  21. 26 Jul, 2008 4 commits
    • cpumask: export cpumask_of_cpu_map · 5a7a201c
      Ingo Molnar authored
      fix:
      
       ERROR: "cpumask_of_cpu_map" [drivers/acpi/processor.ko] undefined!
       ERROR: "cpumask_of_cpu_map" [arch/x86/kernel/microcode.ko] undefined!
       ERROR: "cpumask_of_cpu_map" [arch/x86/kernel/cpu/cpufreq/speedstep-ich.ko] undefined!
       ERROR: "cpumask_of_cpu_map" [arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.ko] undefined!
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • cpumask: put cpumask_of_cpu_map in the initdata section · 6524d938
      Mike Travis authored
        * Create the cpumask_of_cpu_map statically in the init data section
          using NR_CPUS but replace it during boot up with one sized by
          nr_cpu_ids (num possible cpus).
      Signed-off-by: Mike Travis <travis@sgi.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Jack Steiner <steiner@sgi.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • cpumask: make cpumask_of_cpu_map generic · b8d317d1
      Mike Travis authored
      If an arch doesn't define cpumask_of_cpu_map, create a generic
      statically-initialized one for them.  This allows removal of the buggy
      cpumask_of_cpu() macro (&cpumask_of_cpu() gives address of
      out-of-scope var).
      
      An arch with NR_CPUS of 4096 probably wants to allocate this itself
      based on the actual number of CPUs, since otherwise they're using 2MB
      of rodata (1024 cpus means 128k).  That's what
      CONFIG_HAVE_CPUMASK_OF_CPU_MAP is for (only x86/64 does so at the
      moment).
      
      In future as we support more CPUs, we'll need to resort to a
      get_cpu_map()/put_cpu_map() allocation scheme.
      Signed-off-by: Mike Travis <travis@sgi.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Jack Steiner <steiner@sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
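The rodata figures quoted in the commit above (2MB for 4096 CPUs, 128k for 1024) follow from storing one full-width mask per CPU. A small sketch of that arithmetic, assuming the table holds exactly `nr_cpus` masks of `nr_cpus` bits each with no padding; the function name is made up for this illustration:

```c
#include <assert.h>

/*
 * Bytes needed for a statically initialized table with one mask per CPU,
 * where each mask has one bit per CPU: nr_cpus masks * (nr_cpus / 8) bytes.
 */
static unsigned long cpumask_of_cpu_map_bytes(unsigned long nr_cpus)
{
    assert(nr_cpus % 8 == 0);  /* keep the arithmetic exact in this sketch */
    return nr_cpus * (nr_cpus / 8);
}
```

The size grows with the square of the CPU count, which is why the message suggests that large configurations allocate the table from the actual number of CPUs.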
    • workqueues: make get_online_cpus() useable for work->func() · 3da1c84c
      Oleg Nesterov authored
      workqueue_cpu_callback(CPU_DEAD) flushes cwq->thread under
      cpu_maps_update_begin().  This means that the multithreaded workqueues
      can't use get_online_cpus() due to the possible deadlock, very bad and
      very old problem.
      
      Introduce the new state, CPU_POST_DEAD, which is called after
      cpu_hotplug_done() but before cpu_maps_update_done().
      
      Change workqueue_cpu_callback() to use CPU_POST_DEAD instead of CPU_DEAD.
      This means that create/destroy functions can't rely on get_online_cpus()
      any longer and should take cpu_add_remove_lock instead.
      
      [akpm@linux-foundation.org: fix CONFIG_SMP=n]
      Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
      Acked-by: Gautham R Shenoy <ego@in.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Max Krasnyansky <maxk@qualcomm.com>
      Cc: Paul Jackson <pj@sgi.com>
      Cc: Paul Menage <menage@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Vegard Nossum <vegard.nossum@gmail.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  22. 18 Jul, 2008 2 commits
    • cpu hotplug: Make cpu_active_map synchronization dependency clear · 39b0fad7
      Max Krasnyansky authored
      This goes on top of the cpu_active_map (take 2) patch.
      
      Currently we depend on stop_machine to provide the necessary
      synchronization for the cpu_active_map updates.
      As Dmitry Adamushko pointed out, this is fragile and is not much clearer
      than the previous scheme. In other words we do not want to depend on
      the internal stop machine operation here.
      So make the synchronization rules clear by doing synchronize_sched()
      after clearing out cpu active bit.
      
      Tested on quad-Core2 with:
      
         while true; do
            for i in 1 2 3; do
              echo 0 > /sys/devices/system/cpu/cpu$i/online
            done
            for i in 1 2 3; do
              echo 1 > /sys/devices/system/cpu/cpu$i/online
            done
         done
      and
         stress -c 200
      
      No lockdep, preempt or other complaints.
      Signed-off-by: Max Krasnyansky <maxk@qualcomm.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • cpu hotplug, sched: Introduce cpu_active_map and redo sched domain managment (take 2) · e761b772
      Max Krasnyansky authored
      This is based on Linus' idea of creating cpu_active_map that prevents
      scheduler load balancer from migrating tasks to the cpu that is going
      down.
      
      It allows us to simplify domain management code and avoid unnecessary
      domain rebuilds during cpu hotplug event handling.
      
      Please ignore the cpusets part for now. It needs some more work in order
      to avoid crazy lock nesting. I did, however, simplify and unify the domain
      reinitialization logic. We now simply call partition_sched_domains() in
      all the cases. This means that we're using the exact same code paths as in
      the cpusets case and hence the tests below cover cpusets too.
      Cpuset changes to make rebuild_sched_domains() callable from various
      contexts are in the separate patch (right next after this one).
      
      This not only boots but also easily handles
      	while true; do make clean; make -j 8; done
      and
      	while true; do on-off-cpu 1; done
      at the same time.
      (on-off-cpu 1 simply does the echo 0/1 > /sys/.../cpu1/online thing).

      Surprisingly the box (dual-core Core2) is quite usable. In fact I'm typing
      this on it right now in gnome-terminal and things are moving just fine.
      
      Also this is running with most of the debug features enabled (lockdep,
      mutex, etc) no BUG_ONs or lockdep complaints so far.
      
      I believe I addressed all of Dmitry's comments on Linus' original
      version. I changed both the fair and rt balancers to mask out non-active cpus.
      And replaced cpu_is_offline() with !cpu_active() in the main scheduler
      code where it made sense (to me).
      Signed-off-by: Max Krasnyanskiy <maxk@qualcomm.com>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Gregory Haskins <ghaskins@novell.com>
      Cc: dmitry.adamushko@gmail.com
      Cc: pj@sgi.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
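The effect of masking out non-active cpus can be modelled in a few lines of userspace C. This is only a sketch of the idea, not scheduler code: it assumes at most 64 CPUs held in a `uint64_t`, and `clear_active()` and `pick_active_cpu()` are hypothetical helpers that stand in for clearing the active bit on the way down and for a balancer choosing a target.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CPUS 64  /* one bit per CPU in a uint64_t; an assumption of this sketch */

/* Clear a CPU's active bit, as would happen early on the cpu-down path. */
static uint64_t clear_active(uint64_t active, int cpu)
{
    assert(cpu >= 0 && cpu < MAX_CPUS);
    return active & ~(UINT64_C(1) << cpu);
}

/*
 * Lowest-numbered CPU that is both allowed for the task and still active,
 * or -1 if there is none. A CPU that is going down is never returned,
 * even if the task's allowed mask still contains it.
 */
static int pick_active_cpu(uint64_t allowed, uint64_t active)
{
    uint64_t candidates = allowed & active;

    for (int cpu = 0; cpu < MAX_CPUS; cpu++)
        if (candidates & (UINT64_C(1) << cpu))
            return cpu;
    return -1;
}
```

With CPUs 0-3 active and a task allowed on CPUs 1 and 2, clearing CPU 1's active bit makes the model pick CPU 2 instead of the CPU that is on its way down.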
  23. 17 Jul, 2008 1 commit
  24. 06 Jun, 2008 1 commit
  25. 24 May, 2008 1 commit
  26. 30 Apr, 2008 1 commit
  27. 29 Apr, 2008 4 commits
    • simplify cpu_hotplug_begin()/put_online_cpus() · d2ba7e2a
      Oleg Nesterov authored
      cpu_hotplug_begin() must always be called under cpu_add_remove_lock; this
      means that only one process can be cpu_hotplug.active_writer.  So we don't
      need the cpu_hotplug.writer_queue, we can wake up the ->active_writer
      directly.
      
      Also, fix the comment.
      Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: Dipankar Sarma <dipankar@in.ibm.com>
      Acked-by: Gautham R Shenoy <ego@in.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • cpu: fix section mismatch warning in reference to register_cpu_notifier · f7b16c10
      Sam Ravnborg authored
      Fix following warnings:
      WARNING: vmlinux.o(.text+0xc60): Section mismatch in reference from the function kvm_init() to the function .cpuinit.text:register_cpu_notifier()
      WARNING: vmlinux.o(.text+0x33869a): Section mismatch in reference from the function xfs_icsb_init_counters() to the function .cpuinit.text:register_cpu_notifier()
      WARNING: vmlinux.o(.text+0x5556a1): Section mismatch in reference from the function acpi_processor_install_hotplug_notify() to the function .cpuinit.text:register_cpu_notifier()
      WARNING: vmlinux.o(.text+0xfe6b28): Section mismatch in reference from the function cpufreq_register_driver() to the function .cpuinit.text:register_cpu_notifier()
      
      register_cpu_notifier() is only really defined when HOTPLUG_CPU is enabled.
      So references to the function are OK.
      
      Annotate it with __ref so we do not get warnings from callers and do not get
      warnings for the functions/data used by register_cpu_notifier().
      Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
      Cc: Gautham R Shenoy <ego@in.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • cpu: fix section mismatch warnings in *cpu_down · 514a20a5
      Sam Ravnborg authored
      Fix following warnings:
      WARNING: vmlinux.o(.text+0x75c8d): Section mismatch in reference from the function take_cpu_down() to the variable .cpuinit.data:cpu_chain
      WARNING: vmlinux.o(.text+0x75d2a): Section mismatch in reference from the function _cpu_down() to the variable .cpuinit.data:cpu_chain
      WARNING: vmlinux.o(.text+0x75d4d): Section mismatch in reference from the function _cpu_down() to the variable .cpuinit.data:cpu_chain
      WARNING: vmlinux.o(.text+0x75de4): Section mismatch in reference from the function _cpu_down() to the variable .cpuinit.data:cpu_chain
      WARNING: vmlinux.o(.text+0x75e33): Section mismatch in reference from the function _cpu_down() to the variable .cpuinit.data:cpu_chain
      
      cpu_down is only used from code surrounded by HOTPLUG_CPU so any references to
      __cpuinit is OK.
      
      Add a few __ref to teach modpost to ignore the references.
      
      This is just papering over the fact that the cpu hotplug code is fragile with
      respect to use of HOTPLUG_CPU and in many cases rely on __cpuinit to get rid
      of code when HOTPLUG_CPU is not enabled.  For now this is the least invasive
      change.
      Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
      Cc: Gautham R Shenoy <ego@in.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • cpu: fix section mismatch warning in unregister_cpu_notifier · 9647155f
      Sam Ravnborg authored
      Fix following warning:
      WARNING: vmlinux.o(.text+0x75f4e): Section mismatch in reference from the function unregister_cpu_notifier() to the variable .cpuinit.data:cpu_chain
      
      We know that unregister_cpu_notifier is using HOTPLUG_CPU
      stuff - so ignore these references.
      Annotating unregister_cpu_notifier had been another option
      but this caused far more warnings since not all callers were
      annotated __cpuinit.
      Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
      Cc: Gautham R Shenoy <ego@in.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  28. 20 Apr, 2008 1 commit
    • generic: use new set_cpus_allowed_ptr function · f70316da
      Mike Travis authored
        * Use the new set_cpus_allowed_ptr() function added by the previous patch,
          which, instead of taking the "newly allowed cpus" cpumask_t arg
          by value, takes it by pointer:
      
          -int set_cpus_allowed(struct task_struct *p, cpumask_t new_mask)
          +int set_cpus_allowed_ptr(struct task_struct *p, const cpumask_t *new_mask)
      
        * Modify CPU_MASK_ALL
      
      Depends on:
      	[sched-devel]: sched: add new set_cpus_allowed_ptr function
      Signed-off-by: Mike Travis <travis@sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
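Why the by-pointer signature matters becomes visible once the mask is large. The sketch below is a userspace stand-in, assuming a hypothetical `NR_CPUS` of 4096 and a simplified `cpumask_t`; the two helper functions are invented for this illustration and only report how many bytes each calling style hands to the callee.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define NR_CPUS          4096  /* hypothetical large configuration */
#define BITS_PER_LONG    (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Simplified stand-in for the kernel type: one bit per possible CPU. */
typedef struct {
    unsigned long bits[BITS_TO_LONGS(NR_CPUS)];
} cpumask_t;

/* Old style: the whole mask is copied for the callee on every call. */
static size_t arg_bytes_by_value(cpumask_t new_mask)
{
    return sizeof(new_mask);
}

/* New style: only a pointer is passed, and the mask itself is not copied. */
static size_t arg_bytes_by_pointer(const cpumask_t *new_mask)
{
    assert(new_mask != NULL);
    return sizeof(new_mask);
}
```

With 4096 possible CPUs the by-value call copies 512 bytes per call, while the by-pointer call passes a single pointer.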
  29. 09 Feb, 2008 1 commit
  30. 26 Jan, 2008 1 commit