1. 08 7月, 2008 1 次提交
  2. 30 6月, 2008 1 次提交
    • Z
      x86: fix cpu hotplug crash · fcb43042
      Zhang, Yanmin 提交于
      Vegard Nossum reported crashes during cpu hotplug tests:
      
        http://marc.info/?l=linux-kernel&m=121413950227884&w=4
      
      In function _cpu_up, the panic happens when calling
      __raw_notifier_call_chain at the second time. Kernel doesn't panic when
      calling it at the first time. If just say because of nr_cpu_ids, that's
      not right.
      
      By checking the source code, I found that function do_boot_cpu is the culprit.
      Consider below call chain:
       _cpu_up=>__cpu_up=>smp_ops.cpu_up=>native_cpu_up=>do_boot_cpu.
      
      So do_boot_cpu is called in the end. In do_boot_cpu, if
      boot_error==true, cpu_clear(cpu, cpu_possible_map) is executed. So later
      on, when _cpu_up calls __raw_notifier_call_chain at the second time to
      report CPU_UP_CANCELED, because this cpu is already cleared from
      cpu_possible_map, get_cpu_sysdev returns NULL.
      
      Many resources are related to cpu_possible_map, so it's better not to
      change it.
      
      Below patch against 2.6.26-rc7 fixes it by removing the bit clearing in
      cpu_possible_map.
      Signed-off-by: NZhang Yanmin <yanmin_zhang@linux.intel.com>
      Tested-by: NVegard Nossum <vegard.nossum@gmail.com>
      Acked-by: NRusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      fcb43042
  3. 04 6月, 2008 1 次提交
    • I
      x86: disable preemption in native_smp_prepare_cpus · deef3250
      Ingo Molnar 提交于
      Priit Laes reported the following warning:
      
      Call Trace:
       [<ffffffff8022f1e1>] warn_on_slowpath+0x51/0x63
       [<ffffffff80282e48>] sys_ioctl+0x2d/0x5d
       [<ffffffff805185ff>] _spin_lock+0xe/0x24
       [<ffffffff80227459>] task_rq_lock+0x3d/0x73
       [<ffffffff805133c3>] set_cpu_sibling_map+0x336/0x350
       [<ffffffff8021c1b8>] read_apic_id+0x30/0x62
       [<ffffffff806d921d>] verify_local_APIC+0x90/0x138
       [<ffffffff806d84b5>] native_smp_prepare_cpus+0x1f9/0x305
       [<ffffffff806ce7b1>] kernel_init+0x59/0x2d9
       [<ffffffff80518a26>] _spin_unlock_irq+0x11/0x2b
       [<ffffffff8020bf48>] child_rip+0xa/0x12
       [<ffffffff806ce758>] kernel_init+0x0/0x2d9
       [<ffffffff8020bf3e>] child_rip+0x0/0x12
      
      fix this by generally disabling preemption in native_smp_prepare_cpus().
      Reported-and-bisected-by: NPriit Laes <plaes@plaes.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      deef3250
  4. 14 5月, 2008 1 次提交
    • H
      x86: fix app crashes after SMP resume · 61165d7a
      Hugh Dickins 提交于
      After resume on a 2cpu laptop, kernel builds collapse with a sed hang,
      sh or make segfault (often on 20295564), real-time signal to cc1 etc.
      
      Several hurdles to jump, but a manually-assisted bisect led to -rc1's
      d2bcbad5 x86: do not zap_low_mappings
      in __smp_prepare_cpus.  Though the low mappings were removed at bootup,
      they were left behind (with Global flags helping to keep them in TLB)
      after resume or cpu online, causing the crashes seen.
      
      Reinstate zap_low_mappings (with local __flush_tlb_all) for each cpu_up
      on x86_32.  This used to be serialized by smp_commenced_mask: that's now
      gone, but a low_mappings flag will do.  No need for native_smp_cpus_done
      to repeat the zap: let mem_init zap BSP's low mappings just like on UP.
      
      (In passing, fix error code from native_cpu_up: do_boot_cpu returns a
      variety of diagnostic values, Dprintk what it says but convert to -EIO.
      And save_pg_dir separately before zap_low_mappings: doesn't matter now,
      but zapping twice in succession wiped out resume's swsusp_pg_dir.)
      
      That worked well on the duo and one quad, but wouldn't boot 3rd or 4th
      cpu on P4 Xeon, oopsing just after unlock_ipi_call_lock.  The TLB flush
      IPI now being sent reveals a long-standing bug: the booting cpu has its
      APIC readied in smp_callin at the top of start_secondary, but isn't put
      into the cpu_online_map until just before that unlock_ipi_call_lock.
      
      So native_smp_call_function_mask to online cpus would send_IPI_allbutself,
      including the cpu just coming up, though it has been excluded from the
      count to wait for: by the time it handles the IPI, the call data on
      native_smp_call_function_mask's stack may well have been overwritten.
      
      So fall back to send_IPI_mask while cpu_online_map does not match
      cpu_callout_map: perhaps there's a better APICological fix to be
      made at the start_secondary end, but I wouldn't know that.
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      61165d7a
  5. 05 5月, 2008 2 次提交
  6. 29 4月, 2008 1 次提交
  7. 26 4月, 2008 1 次提交
  8. 25 4月, 2008 2 次提交
  9. 20 4月, 2008 1 次提交
  10. 17 4月, 2008 29 次提交