1. 21 6月, 2013 3 次提交
    • S
      trace,x86: Move creation of irq tracepoints from apic.c to irq.c · 83ab8514
      Steven Rostedt (Red Hat) 提交于
      Compiling without CONFIG_X86_LOCAL_APIC set, apic.c will not be
      compiled, and the irq tracepoints will not be created via the
      CREATE_TRACE_POINTS macro. When CONFIG_X86_LOCAL_APIC is not set,
      we get the following build error:
      
        LD      init/built-in.o
      arch/x86/built-in.o: In function `trace_x86_platform_ipi_entry':
      linux-test.git/arch/x86/include/asm/trace/irq_vectors.h:66: undefined reference to `__tracepoint_x86_platform_ipi_entry'
      arch/x86/built-in.o: In function `trace_x86_platform_ipi_exit':
      linux-test.git/arch/x86/include/asm/trace/irq_vectors.h:66: undefined reference to `__tracepoint_x86_platform_ipi_exit'
      arch/x86/built-in.o: In function `trace_irq_work_entry':
      linux-test.git/arch/x86/include/asm/trace/irq_vectors.h:72: undefined reference to `__tracepoint_irq_work_entry'
      arch/x86/built-in.o: In function `trace_irq_work_exit':
      linux-test.git/arch/x86/include/asm/trace/irq_vectors.h:72: undefined reference to `__tracepoint_irq_work_exit'
      arch/x86/built-in.o:(__jump_table+0x8): undefined reference to `__tracepoint_x86_platform_ipi_entry'
      arch/x86/built-in.o:(__jump_table+0x14): undefined reference to `__tracepoint_x86_platform_ipi_exit'
      arch/x86/built-in.o:(__jump_table+0x20): undefined reference to `__tracepoint_irq_work_entry'
      arch/x86/built-in.o:(__jump_table+0x2c): undefined reference to `__tracepoint_irq_work_exit'
      make[1]: *** [vmlinux] Error 1
      make: *** [sub-make] Error 2
      
      As irq.c is always compiled for x86, it is a more appropriate location
      to create the irq tracepoints.
      
      Cc: Seiji Aguchi <seiji.aguchi@hds.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      83ab8514
    • S
      x86, trace: Add irq vector tracepoints · cf910e83
      Seiji Aguchi 提交于
      [Purpose of this patch]
      
      As Vaibhav explained in the thread below, tracepoints for irq vectors
      are useful.
      
      http://www.spinics.net/lists/mm-commits/msg85707.html
      
      <snip>
      The current interrupt traces from irq_handler_entry and irq_handler_exit
      provide when an interrupt is handled.  They provide good data about when
      the system has switched to kernel space and how it affects the currently
      running processes.
      
      There are some IRQ vectors which trigger the system into kernel space,
      which are not handled in generic IRQ handlers.  Tracing such events gives
      us the information about IRQ interaction with other system events.
      
      The trace also tells where the system is spending its time.  We want to
      know which cores are handling interrupts and how they are affecting other
      processes in the system.  Also, the trace provides information about when
      the cores are idle and which interrupts are changing that state.
      <snip>
      
      On the other hand, my usecase is tracing just local timer event and
      getting a value of instruction pointer.
      
      I suggested to add an argument local timer event to get instruction pointer before.
      But there is another way to get it with external module like systemtap.
      So, I don't need to add any argument to irq vector tracepoints now.
      
      [Patch Description]
      
      Vaibhav's patch shared a trace point ,irq_vector_entry/irq_vector_exit, in all events.
      But there is an above use case to trace specific irq_vector rather than tracing all events.
      In this case, we are concerned about overhead due to unwanted events.
      
      So, add following tracepoints instead of introducing irq_vector_entry/exit.
      so that we can enable them independently.
         - local_timer_vector
         - reschedule_vector
         - call_function_vector
         - call_function_single_vector
         - irq_work_entry_vector
         - error_apic_vector
         - thermal_apic_vector
         - threshold_apic_vector
         - spurious_apic_vector
         - x86_platform_ipi_vector
      
      Also, introduce a logic switching IDT at enabling/disabling time so that a time penalty
      makes a zero when tracepoints are disabled. Detailed explanations are as follows.
       - Create trace irq handlers with entering_irq()/exiting_irq().
       - Create a new IDT, trace_idt_table, at boot time by adding a logic to
         _set_gate(). It is just a copy of original idt table.
       - Register the new handlers for tracpoints to the new IDT by introducing
         macros to alloc_intr_gate() called at registering time of irq_vector handlers.
       - Add checking, whether irq vector tracing is on/off, into load_current_idt().
         This has to be done below debug checking for these reasons.
         - Switching to debug IDT may be kicked while tracing is enabled.
         - On the other hands, switching to trace IDT is kicked only when debugging
           is disabled.
      
      In addition, the new IDT is created only when CONFIG_TRACING is enabled to avoid being
      used for other purposes.
      Signed-off-by: NSeiji Aguchi <seiji.aguchi@hds.com>
      Link: http://lkml.kernel.org/r/51C323ED.5050708@hds.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      cf910e83
    • S
      x86, trace: Introduce entering/exiting_irq() · eddc0e92
      Seiji Aguchi 提交于
      When implementing tracepoints in interrupt handers, if the tracepoints are
      simply added in the performance sensitive path of interrupt handers,
      it may cause potential performance problem due to the time penalty.
      
      To solve the problem, an idea is to prepare non-trace/trace irq handers and
      switch their IDTs at the enabling/disabling time.
      
      So, let's introduce entering_irq()/exiting_irq() for pre/post-
      processing of each irq handler.
      
      A way to use them is as follows.
      
      Non-trace irq handler:
      smp_irq_handler()
      {
      	entering_irq();		/* pre-processing of this handler */
      	__smp_irq_handler();	/*
      				 * common logic between non-trace and trace handlers
      				 * in a vector.
      				 */
      	exiting_irq();		/* post-processing of this handler */
      
      }
      
      Trace irq_handler:
      smp_trace_irq_handler()
      {
      	entering_irq();		/* pre-processing of this handler */
      	trace_irq_entry();	/* tracepoint for irq entry */
      	__smp_irq_handler();	/*
      				 * common logic between non-trace and trace handlers
      				 * in a vector.
      				 */
      	trace_irq_exit();	/* tracepoint for irq exit */
      	exiting_irq();		/* post-processing of this handler */
      
      }
      
      If tracepoints can place outside entering_irq()/exiting_irq() as follows,
      it looks cleaner.
      
      smp_trace_irq_handler()
      {
      	trace_irq_entry();
      	smp_irq_handler();
      	trace_irq_exit();
      }
      
      But it doesn't work.
      The problem is with irq_enter/exit() being called. They must be called before
      trace_irq_enter/exit(),  because of the rcu_irq_enter() must be called before
      any tracepoints are used, as tracepoints use  rcu to synchronize.
      
      As a possible alternative, we may be able to call irq_enter() first as follows
      if irq_enter() can nest.
      
      smp_trace_irq_hander()
      {
      	irq_entry();
      	trace_irq_entry();
      	smp_irq_handler();
      	trace_irq_exit();
      	irq_exit();
      }
      
      But it doesn't work, either.
      If irq_enter() is nested, it may have a time penalty because it has to check if it
      was already called or not. The time penalty is not desired in performance sensitive
      paths even if it is tiny.
      Signed-off-by: NSeiji Aguchi <seiji.aguchi@hds.com>
      Link: http://lkml.kernel.org/r/51C3238D.9040706@hds.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      eddc0e92
  2. 20 2月, 2013 1 次提交
  3. 12 2月, 2013 1 次提交
  4. 11 2月, 2013 1 次提交
    • S
      x86/apic: Work around boot failure on HP ProLiant DL980 G7 Server systems · cb214ede
      Stoney Wang 提交于
      When a HP ProLiant DL980 G7 Server boots a regular kernel,
      there will be intermittent lost interrupts which could
      result in a hang or (in extreme cases) data loss.
      
      The reason is that this system only supports x2apic physical
      mode, while the kernel boots with a logical-cluster default
      setting.
      
      This bug can be worked around by specifying the "x2apic_phys" or
      "nox2apic" boot option, but we want to handle this system
      without requiring manual workarounds.
      
      The BIOS sets ACPI_FADT_APIC_PHYSICAL in FADT table.
      As all apicids are smaller than 255, BIOS need to pass the
      control to the OS with xapic mode, according to x2apic-spec,
      chapter 2.9.
      
      Current code handle x2apic when BIOS pass with xapic mode
      enabled:
      
      When user specifies x2apic_phys, or FADT indicates PHYSICAL:
      
      1. During madt oem check, apic driver is set with xapic logical
         or xapic phys driver at first.
      
      2. enable_IR_x2apic() will enable x2apic_mode.
      
      3. if user specifies x2apic_phys on the boot line, x2apic_phys_probe()
         will install the correct x2apic phys driver and use x2apic phys mode.
         Otherwise it will skip the driver will let x2apic_cluster_probe to
         take over to install x2apic cluster driver (wrong one) even though FADT
         indicates PHYSICAL, because x2apic_phys_probe does not check
         FADT PHYSICAL.
      
      Add checking x2apic_fadt_phys in x2apic_phys_probe() to fix the
      problem.
      Signed-off-by: NStoney Wang <song-bo.wang@hp.com>
      [ updated the changelog and simplified the code ]
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Cc: stable@kernel.org
      Link: http://lkml.kernel.org/r/1360263182-16226-1-git-send-email-yinghai@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      cb214ede
  5. 28 1月, 2013 16 次提交
  6. 25 1月, 2013 1 次提交
    • A
      x86/MSI: Support multiple MSIs in presense of IRQ remapping · 51906e77
      Alexander Gordeev 提交于
      The MSI specification has several constraints in comparison with
      MSI-X, most notable of them is the inability to configure MSIs
      independently. As a result, it is impossible to dispatch
      interrupts from different queues to different CPUs. This is
      largely devalues the support of multiple MSIs in SMP systems.
      
      Also, a necessity to allocate a contiguous block of vector
      numbers for devices capable of multiple MSIs might cause a
      considerable pressure on x86 interrupt vector allocator and
      could lead to fragmentation of the interrupt vectors space.
      
      This patch overcomes both drawbacks in presense of IRQ remapping
      and lets devices take advantage of multiple queues and per-IRQ
      affinity assignments.
      Signed-off-by: NAlexander Gordeev <agordeev@redhat.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Matthew Wilcox <willy@linux.intel.com>
      Cc: Jeff Garzik <jgarzik@pobox.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/c8bd86ff56b5fc118257436768aaa04489ac0a4c.1353324359.git.agordeev@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      51906e77
  7. 24 1月, 2013 1 次提交
  8. 08 12月, 2012 1 次提交
  9. 27 11月, 2012 1 次提交
    • S
      x86, apic: Cleanup cfg->domain setup for legacy interrupts · 29c574c0
      Suresh Siddha 提交于
      Issues that need to be handled:
      * Handle PIC interrupts on any CPU irrespective of the apic mode
      * In the apic lowest priority logical flat delivery mode, be prepared to
        handle the interrupt on any CPU irrespective of what the IO-APIC RTE says.
      * Because of above, when the IO-APIC starts handling the legacy PIC interrupt,
        use the same vector that is being used by the PIC while programming the
        corresponding IO-APIC RTE.
      
      Start with all the cpu's in the legacy PIC interrupts cfg->domain.
      
      By the time IO-APIC starts taking over the PIC interrupts, apic driver
      model is finalized. So depend on the assign_irq_vector() to update the
      cfg->domain and retain the same vector that was used by PIC before.
      
      For the logical apic flat mode, cfg->domain is updated (during the first
      call to assign_irq_vector()) to contain all the possible online cpu's (0xff).
      Vector used for the legacy PIC interrupt doesn't change when the IO-APIC
      starts handling the interrupt. Any interrupt migration after that
      doesn't change the cfg->domain or the vector used.
      
      For other apic modes like physical mode, cfg->domain is updated
      (during the first call to assign_irq_vector()) to the boot cpu (cpu-0),
      with the same vector that is being used by the PIC. When that interrupt is
      migrated to a different cpu, cfg->domin and the vector assigned will change
      accordingly.
      Tested-by: NBorislav Petkov <bp@alien8.de>
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Link: http://lkml.kernel.org/r/1353970176.21070.51.camel@sbsiddha-desk.sc.intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      29c574c0
  10. 17 11月, 2012 1 次提交
  11. 15 11月, 2012 1 次提交
  12. 03 11月, 2012 1 次提交
  13. 02 11月, 2012 1 次提交
  14. 24 10月, 2012 1 次提交
    • D
      x86/irq/ioapic: Check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt · 94777fc5
      Dimitri Sivanich 提交于
      Posting this patch to fix an issue concerning sparse irq's that
      I raised a while back.  There was discussion about adding
      refcounting to sparse irqs (to fix other potential race
      conditions), but that does not appear to have been addressed
      yet.  This covers the only issue of this type that I've
      encountered in this area.
      
      A NULL pointer dereference can occur in
      smp_irq_move_cleanup_interrupt() if we haven't yet setup the
      irq_cfg pointer in the irq_desc.irq_data.chip_data.
      
      In create_irq_nr() there is a window where we have set
      vector_irq in __assign_irq_vector(), but not yet called
      irq_set_chip_data() to set the irq_cfg pointer.
      
      Should an IRQ_MOVE_CLEANUP_VECTOR hit the cpu in question during
      this time, smp_irq_move_cleanup_interrupt() will attempt to
      process the aforementioned irq, but panic when accessing
      irq_cfg.
      
      Only continue processing the irq if irq_cfg is non-NULL.
      Signed-off-by: NDimitri Sivanich <sivanich@sgi.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Joerg Roedel <joerg.roedel@amd.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Alexander Gordeev <agordeev@redhat.com>
      Link: http://lkml.kernel.org/r/20121016125021.GA22935@sgi.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      94777fc5
  15. 06 10月, 2012 1 次提交
  16. 19 9月, 2012 1 次提交
  17. 15 8月, 2012 1 次提交
    • S
      x86, apic: fix broken legacy interrupts in the logical apic mode · f1c63001
      Suresh Siddha 提交于
      Recent commit 332afa65 cleaned up
      a workaround that updates irq_cfg domain for legacy irq's that
      are handled by the IO-APIC. This was assuming that the recent
      changes in assign_irq_vector() were sufficient to remove the workaround.
      
      But this broke couple of AMD platforms. One of them seems to be
      sending interrupts to the offline cpu's, resulting in spurious
      "No irq handler for vector xx (irq -1)" messages when those cpu's come online.
      And the other platform seems to always send the interrupt to the last logical
      CPU (cpu-7). Recent changes had an unintended side effect of using only logical
      cpu-0 in the IO-APIC RTE (during boot for the legacy interrupts) and this
      broke the legacy interrupts not getting routed to the cpu-7 on the AMD
      platform, resulting in a boot hang.
      
      For now, reintroduce the removed workaround, (essentially not allowing the
      vector to change for legacy irq's when io-apic starts to handle the irq. Which
      also addressed the uninteded sife effect of just specifying cpu-0 in the
      IO-APIC RTE for those irq's during boot).
      Reported-and-tested-by: NRobert Richter <robert.richter@amd.com>
      Reported-and-tested-by: NBorislav Petkov <bp@amd64.org>
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Link: http://lkml.kernel.org/r/1344453412.29170.5.camel@sbsiddha-desk.sc.intel.comSigned-off-by: NH. Peter Anvin <hpa@zytor.com>
      f1c63001
  18. 26 7月, 2012 1 次提交
    • T
      x86/ioapic: Fix NULL pointer dereference on CPU hotplug after disabling irqs · 1d44b30f
      Tomoki Sekiyama 提交于
      In the current kernel, percpu variable `vector_irq' is not always
      cleared when a CPU is offlined. If the CPU that has the disabled
      irqs in vector_irq is hotplugged again, __setup_vector_irq()
      hits invalid irq vector and may crash.
      
      This bug can be reproduced as following;
      
       # echo 0 > /sys/devices/system/cpu/cpu7/online
       # modprobe -r some_driver_using_interrupts     # vector_irq@cpu7 uncleared
       # echo 1 > /sys/devices/system/cpu/cpu7/online # kernel may crash
      
      To fix this problem, this patch clears vector_irq in
      __fixup_irqs() when the CPU is offlined.
      
      This also reverts commit f6175f5b, which partially fixes
      this bug by clearing vector in __clear_irq_vector(). But in
      environments with IOMMU IRQ remapper, it could fail because
      cfg->domain doesn't contain offlined CPUs. With this patch, the
      fix in __clear_irq_vector() can be reverted because every
      vector_irq is already cleared in __fixup_irqs() on offlined CPUs.
      Signed-off-by: NTomoki Sekiyama <tomoki.sekiyama.qu@hitachi.com>
      Acked-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Cc: yrl.pp-manager.tt@hitachi.com
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Alexander Gordeev <agordeev@redhat.com>
      Link: http://lkml.kernel.org/r/20120726104732.2889.19144.stgit@kvmdevSigned-off-by: NIngo Molnar <mingo@kernel.org>
      1d44b30f
  19. 16 7月, 2012 1 次提交
  20. 06 7月, 2012 3 次提交
  21. 15 6月, 2012 1 次提交