1. 20 8月, 2013 1 次提交
    • Y
      x86/ioapic/kcrash: Prevent crash_kexec() from deadlocking on ioapic_lock · 17405453
      Yoshihiro YUNOMAE 提交于
      Prevent crash_kexec() from deadlocking on ioapic_lock. When
      crash_kexec() is executed on a CPU, the CPU will take ioapic_lock
      in disable_IO_APIC(). So if the cpu gets an NMI while locking
      ioapic_lock, a deadlock will happen.
      
      In this patch, ioapic_lock is zapped/initialized before disable_IO_APIC().
      
      You can reproduce this deadlock the following way:
      
      1. Add mdelay(1000) after raw_spin_lock_irqsave() in
         native_ioapic_set_affinity()@arch/x86/kernel/apic/io_apic.c
      
         Although the deadlock can occur without this modification, it will increase
         the potential of the deadlock problem.
      
      2. Build and install the kernel
      
      3. Set up the OS which will run panic() and kexec when NMI is injected
          # echo "kernel.unknown_nmi_panic=1" >> /etc/sysctl.conf
          # vim /etc/default/grub
            add "nmi_watchdog=0 crashkernel=256M" in GRUB_CMDLINE_LINUX line
          # grub2-mkconfig
      
      4. Reboot the OS
      
      5. Run following command for each vcpu on the guest
          # while true; do echo <CPU num> > /proc/irq/<IO-APIC-edge or IO-APIC-fasteoi>/smp_affinitity; done;
         By running this command, cpus will get ioapic_lock for setting affinity.
      
      6. Inject NMI (push a dump button or execute 'virsh inject-nmi <domain>' if you
         use VM). After injecting NMI, panic() is called in an nmi-handler context.
         Then, kexec will normally run in panic(), but the operation will be stopped
         by deadlock on ioapic_lock in crash_kexec()->machine_crash_shutdown()->
         native_machine_crash_shutdown()->disable_IO_APIC()->clear_IO_APIC()->
         clear_IO_APIC_pin()->ioapic_read_entry().
      Signed-off-by: NYoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
      Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
      Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: yrl.pp-manager.tt@hitachi.com
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Seiji Aguchi <seiji.aguchi@hds.com>
      Link: http://lkml.kernel.org/r/20130820070107.28245.83806.stgit@yunodevelSigned-off-by: NIngo Molnar <mingo@kernel.org>
      17405453
  2. 21 6月, 2013 1 次提交
    • S
      x86, trace: Introduce entering/exiting_irq() · eddc0e92
      Seiji Aguchi 提交于
      When implementing tracepoints in interrupt handers, if the tracepoints are
      simply added in the performance sensitive path of interrupt handers,
      it may cause potential performance problem due to the time penalty.
      
      To solve the problem, an idea is to prepare non-trace/trace irq handers and
      switch their IDTs at the enabling/disabling time.
      
      So, let's introduce entering_irq()/exiting_irq() for pre/post-
      processing of each irq handler.
      
      A way to use them is as follows.
      
      Non-trace irq handler:
      smp_irq_handler()
      {
      	entering_irq();		/* pre-processing of this handler */
      	__smp_irq_handler();	/*
      				 * common logic between non-trace and trace handlers
      				 * in a vector.
      				 */
      	exiting_irq();		/* post-processing of this handler */
      
      }
      
      Trace irq_handler:
      smp_trace_irq_handler()
      {
      	entering_irq();		/* pre-processing of this handler */
      	trace_irq_entry();	/* tracepoint for irq entry */
      	__smp_irq_handler();	/*
      				 * common logic between non-trace and trace handlers
      				 * in a vector.
      				 */
      	trace_irq_exit();	/* tracepoint for irq exit */
      	exiting_irq();		/* post-processing of this handler */
      
      }
      
      If tracepoints can place outside entering_irq()/exiting_irq() as follows,
      it looks cleaner.
      
      smp_trace_irq_handler()
      {
      	trace_irq_entry();
      	smp_irq_handler();
      	trace_irq_exit();
      }
      
      But it doesn't work.
      The problem is with irq_enter/exit() being called. They must be called before
      trace_irq_enter/exit(),  because of the rcu_irq_enter() must be called before
      any tracepoints are used, as tracepoints use  rcu to synchronize.
      
      As a possible alternative, we may be able to call irq_enter() first as follows
      if irq_enter() can nest.
      
      smp_trace_irq_hander()
      {
      	irq_entry();
      	trace_irq_entry();
      	smp_irq_handler();
      	trace_irq_exit();
      	irq_exit();
      }
      
      But it doesn't work, either.
      If irq_enter() is nested, it may have a time penalty because it has to check if it
      was already called or not. The time penalty is not desired in performance sensitive
      paths even if it is tiny.
      Signed-off-by: NSeiji Aguchi <seiji.aguchi@hds.com>
      Link: http://lkml.kernel.org/r/51C3238D.9040706@hds.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      eddc0e92
  3. 06 10月, 2012 1 次提交
  4. 16 7月, 2012 2 次提交
  5. 06 7月, 2012 2 次提交
  6. 04 7月, 2012 1 次提交
  7. 14 6月, 2012 2 次提交
  8. 08 6月, 2012 4 次提交
  9. 06 6月, 2012 2 次提交
  10. 18 5月, 2012 3 次提交
  11. 29 3月, 2012 1 次提交
  12. 23 3月, 2012 1 次提交
    • S
      x86/apic: Add separate apic_id_valid() functions for selected apic drivers · b7157acf
      Steffen Persvold 提交于
      As suggested by Suresh Siddha and Yinghai Lu:
      
      For x2apic pre-enabled systems, apic driver is set already early
      through early_acpi_boot_init()/early_acpi_process_madt()/
      acpi_parse_madt()/default_acpi_madt_oem_check() path so that
      apic_id_valid() checking will be sufficient during MADT and SRAT
      parsing.
      
      For non-x2apic pre-enabled systems, all apic ids should be less
      than 255.
      
      This allows us to substitute the checks in
      arch/x86/kernel/acpi/boot.c::acpi_parse_x2apic() and
      arch/x86/mm/srat.c::acpi_numa_x2apic_affinity_init() with
      apic->apic_id_valid().
      
      In addition we can avoid feigning the x2apic cpu feature in the
      NumaChip apic code.
      
      The following apic drivers have separate apic_id_valid()
      functions which will accept x2apic type IDs :
      
       x2apic_phys
       x2apic_cluster
       x2apic_uv_x
       apic_numachip
      Signed-off-by: NSteffen Persvold <sp@numascale.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Daniel J Blueman <daniel@numascale-asia.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Jack Steiner <steiner@sgi.com>
      Link: http://lkml.kernel.org/r/1331925935-13372-1-git-send-email-sp@numascale.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      b7157acf
  13. 14 3月, 2012 1 次提交
  14. 24 12月, 2011 2 次提交
  15. 18 12月, 2011 1 次提交
  16. 14 12月, 2011 1 次提交
    • F
      x86: Add per-cpu stat counter for APIC ICR read tries · 346b46be
      Fernando Luis Vázquez Cao 提交于
      In the IPI delivery slow path (NMI delivery) we retry the ICR
      read to check for delivery completion a limited number of times.
      
      [ The reason for the limited retries is that some of the places
        where it is used (cpu boot, kdump, etc) IPI delivery might not
        succeed (due to a firmware bug or system crash, for example)
        and in such a case it is better to give up and resume
        execution of other code. ]
      
      This patch adds a new entry to /proc/interrupts, RTR, which
      tells user space the number of times we retried the ICR read in
      the IPI delivery slow path.
      
      This should give some insight into how well the APIC
      message delivery hardware is working - if the counts are way
      too large then we are hitting a (very-) slow path way too
      often.
      Signed-off-by: NFernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
      Cc: Jörn Engel <joern@logfs.org>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Link: http://lkml.kernel.org/n/tip-vzsp20lo2xdzh5f70g0eis2s@git.kernel.org
      [ extended the changelog ]
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      346b46be
  17. 10 11月, 2011 1 次提交
  18. 29 9月, 2011 1 次提交
    • J
      apic, i386/bigsmp: Fix false warnings regarding logical APIC ID mismatches · 838312be
      Jan Beulich 提交于
      These warnings (generally one per CPU) are a result of
      initializing x86_cpu_to_logical_apicid while apic_default is
      still in use, but the check in setup_local_APIC() being done
      when apic_bigsmp was already used as an override in
      default_setup_apic_routing():
      
       Overriding APIC driver with bigsmp
       Enabling APIC mode:  Physflat.  Using 5 I/O APICs
       ------------[ cut here ]------------
       WARNING: at .../arch/x86/kernel/apic/apic.c:1239
       ...
       CPU 1 irqstacks, hard=f1c9a000 soft=f1c9c000
       Booting Node   0, Processors  #1
       smpboot cpu 1: start_ip = 9e000
       Initializing CPU#1
       ------------[ cut here ]------------
       WARNING: at .../arch/x86/kernel/apic/apic.c:1239
       setup_local_APIC+0x137/0x46b() Hardware name: ...
       CPU1 logical APIC ID: 2 != 8
       ...
      
      Fix this (for the time being, i.e. until
      x86_32_early_logical_apicid() will get removed again, as Tejun
      says ought to be possible) by overriding the previously stored
      values at the point where the APIC driver gets overridden.
      
      v2: Move this and the pre-existing override logic into
          arch/x86/kernel/apic/bigsmp_32.c.
      Signed-off-by: NJan Beulich <jbeulich@suse.com>
      Acked-by: NTejun Heo <tj@kernel.org>
      Cc: <stable@kernel.org> (2.6.39 and onwards)
      Link: http://lkml.kernel.org/r/4E835D16020000780005844C@nat28.tlf.novell.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
      838312be
  19. 27 7月, 2011 1 次提交
  20. 22 5月, 2011 3 次提交
  21. 02 5月, 2011 1 次提交
    • T
      x86-32, NUMA: Make apic->x86_32_numa_cpu_node() optional · 84914ed0
      Tejun Heo 提交于
      NUMAQ is the only meaningful user of this callback and
      setup_local_APIC() the only callsite.  Stop torturing everyone else by
      making the callback optional and removing all the boilerplate
      implementations and assignments.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      84914ed0
  22. 29 3月, 2011 1 次提交
    • J
      x86: Stop including <linux/delay.h> in two asm header files · ca444564
      Jean Delvare 提交于
      Stop including <linux/delay.h> in x86 header files which don't
      need it. This will let the compiler complain when this header is
      not included by source files when it should, so that
      contributors can fix the problem before building on other
      architectures starts to fail.
      
      Credits go to Geert for the idea.
      Signed-off-by: NJean Delvare <khali@linux-fr.org>
      Cc: James E.J. Bottomley <James.Bottomley@suse.de>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      LKML-Reference: <20110325152014.297890ec@endymion.delvare>
      [ this also fixes an upstream build bug in drivers/media/rc/ite-cir.c ]
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ca444564
  23. 11 3月, 2011 1 次提交
  24. 25 2月, 2011 1 次提交
    • T
      x86: dt: Cleanup local apic setup · a906fdaa
      Thomas Gleixner 提交于
      Up to now we force enable the local apic in the devicetree setup
      uncoditionally and set smp_found_config unconditionally to 1 when a
      devicetree blob is available. This breaks, when local apic is disabled
      in the Kconfig.
      
      Make it consistent by initializing device tree explicitely before
      smp_get_config() so a non lapic configuration could be used as well.
      To be functional that would require to implement PIT as an interrupt
      host, but the only user of this code until now is ce4100 which
      requires apics to be available. So we leave this up to those who need
      it.
      Tested-by: NSebastian Siewior <bigeasy@linutronix.de>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      a906fdaa
  25. 10 2月, 2011 1 次提交
    • J
      x86: Fix section mismatch in LAPIC initialization · 2fb270f3
      Jan Beulich 提交于
      Additionally doing things conditionally upon smp_processor_id()
      being zero is generally a bad idea, as this means CPU 0 cannot
      be offlined and brought back online later again.
      
      While there may be other places where this is done, I think adding
      more of those should be avoided so that some day SMP can really
      become "symmetrical".
      Signed-off-by: NJan Beulich <jbeulich@novell.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      LKML-Reference: <4D525C7E0200007800030EE1@vpn.id2.novell.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2fb270f3
  26. 28 1月, 2011 3 次提交
    • T
      x86: Replace apic->apicid_to_node() with ->x86_32_numa_cpu_node() · 89e5dc21
      Tejun Heo 提交于
      apic->apicid_to_node() is 32bit specific apic operation which
      determines NUMA node for a CPU.  Depending on the APIC
      implementation, it can be easier to determine NUMA node from
      either physical or logical apicid.  Currently,
      ->apicid_to_node() takes @logical_apicid and calls
      hard_smp_processor_id() if the physical apicid is needed.
      
      This prevents NUMA mapping from being queried from a different
      CPU, which in turn makes it impossible to initialize NUMA
      mapping before SMP bringup.
      
      This patch replaces apic->apicid_to_node() with
      ->x86_32_numa_cpu_node() which takes @cpu, from which both
      logical and physical apicids can easily be determined.  While at
      it, drop duplicate implementations from bigsmp_32 and summit_32,
      and use the default one.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reviewed-by: NPekka Enberg <penberg@kernel.org>
      Cc: eric.dumazet@gmail.com
      Cc: yinghai@kernel.org
      Cc: brgerst@gmail.com
      Cc: gorcunov@gmail.com
      Cc: shaohui.zheng@intel.com
      Cc: rientjes@google.com
      LKML-Reference: <1295789862-25482-13-git-send-email-tj@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      89e5dc21
    • T
      x86: Add apic->x86_32_early_logical_apicid() · acb8bc09
      Tejun Heo 提交于
      On x86_32, the mapping between cpu and logical apic ID differs
      depending on the specific apic implementation in use.  The
      mapping is initialized while bringing up CPUs; however, this
      makes early inits ignore memory topology.
      
      Add a x86_32 specific apic->x86_32_early_logical_apicid() which
      is called early during boot to query the mapping.  The mapping
      is later verified against the result of init_apic_ldr().  The
      method is allowed to return BAD_APICID if it can't be determined
      early.
      
      noop variant which always returns BAD_APICID is implemented and
      added to all x86_32 apic implementations.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: eric.dumazet@gmail.com
      Cc: yinghai@kernel.org
      Cc: brgerst@gmail.com
      Cc: gorcunov@gmail.com
      Cc: penberg@kernel.org
      Cc: shaohui.zheng@intel.com
      Cc: rientjes@google.com
      LKML-Reference: <1295789862-25482-8-git-send-email-tj@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      acb8bc09
    • T
      x86: Kill apic->cpu_to_logical_apicid() · 7632611f
      Tejun Heo 提交于
      After the previous patch, apic->cpu_to_logical_apicid() is no
      longer used.  Kill it.
      
      For apic types with custom cpu_to_logical_apicid() which is also
      used for other purposes, remove the function and modify its
      users to do the mapping directly.
      
      #ifdef's on CONFIG_SMP in es7000_32 and summit_32 are ignored
      during conversion as they are not used for UP kernels.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: eric.dumazet@gmail.com
      Cc: yinghai@kernel.org
      Cc: brgerst@gmail.com
      Cc: gorcunov@gmail.com
      Cc: penberg@kernel.org
      Cc: shaohui.zheng@intel.com
      Cc: rientjes@google.com
      LKML-Reference: <1295789862-25482-7-git-send-email-tj@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      7632611f