1. 08 4月, 2015 1 次提交
    • D
      x86/asm/entry/irq: Simplify interrupt dispatch table (IDT) layout · 3304c9c3
      Denys Vlasenko 提交于
      Interrupt entry points are handled with the following code,
      each 32-byte code block contains seven entry points:
      
      		...
      		[push][jump 22] // 4 bytes
      		[push][jump 18] // 4 bytes
      		[push][jump 14] // 4 bytes
      		[push][jump 10] // 4 bytes
      		[push][jump  6] // 4 bytes
      		[push][jump  2] // 4 bytes
      		[push][jump common_interrupt][padding] // 8 bytes
      
      		[push][jump]
      		[push][jump]
      		[push][jump]
      		[push][jump]
      		[push][jump]
      		[push][jump]
      		[push][jump common_interrupt][padding]
      
      		[padding_2]
      	common_interrupt:
      
      And there is a table which holds pointers to every entry point,
      IOW: to every push.
      
      In cold cache, two jumps are still costlier than one, even
      though we get the benefit of them residing in the same
      cacheline.
      
      This change replaces short jumps with near ones to
      'common_interrupt', and pads every push+jump pair to 8 bytes. This
      way, each interrupt takes only one jump.
      
      This change replaces ".p2align CONFIG_X86_L1_CACHE_SHIFT" before
      dispatch table with ".align 8" - we do not need anything
      stronger than that.
      
      The table of entry addresses (the interrupt[] array) is no
      longer necessary, the address of entries can be easily
      calculated as (irq_entries_start + i*8).
      
         text	   data	    bss	    dec	    hex	filename
        12546	      0	      0	  12546	   3102	entry_64.o.before
        11626	      0	      0	  11626	   2d6a	entry_64.o
      
      The size decrease is because 1656 bytes of .init.rodata are
      gone. That's initdata, though. The resident size does go up a
      bit.
      
      Run-tested (32 and 64 bits).
      Acked-and-Tested-by: NBorislav Petkov <bp@suse.de>
      Signed-off-by: NDenys Vlasenko <dvlasenk@redhat.com>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Will Drewry <wad@chromium.org>
      Link: http://lkml.kernel.org/r/1428090553-7283-1-git-send-email-dvlasenk@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      3304c9c3
  2. 16 12月, 2014 10 次提交
  3. 14 4月, 2014 1 次提交
  4. 28 2月, 2014 1 次提交
  5. 12 1月, 2014 1 次提交
    • P
      x86/irq: Fix do_IRQ() interrupt warning for cpu hotplug retriggered irqs · 9345005f
      Prarit Bhargava 提交于
      During heavy CPU-hotplug operations the following spurious kernel warnings
      can trigger:
      
        do_IRQ: No ... irq handler for vector (irq -1)
      
        [ See: https://bugzilla.kernel.org/show_bug.cgi?id=64831 ]
      
      When downing a cpu it is possible that there are unhandled irqs
      left in the APIC IRR register.  The following code path shows
      how the problem can occur:
      
       1. CPU 5 is to go down.
      
       2. cpu_disable() on CPU 5 executes with interrupt flag cleared
          by local_irq_save() via stop_machine().
      
       3. IRQ 12 asserts on CPU 5, setting IRR but not ISR because
          interrupt flag is cleared (CPU unabled to handle the irq)
      
       4. IRQs are migrated off of CPU 5, and the vectors' irqs are set
          to -1. 5. stop_machine() finishes cpu_disable()
      
       6. cpu_die() for CPU 5 executes in normal context.
      
       7. CPU 5 attempts to handle IRQ 12 because the IRR is set for
          IRQ 12.  The code attempts to find the vector's IRQ and cannot
          because it has been set to -1. 8. do_IRQ() warning displays
          warning about CPU 5 IRQ 12.
      
      I added a debug printk to output which CPU & vector was
      retriggered and discovered that that we are getting bogus
      events.  I see a 100% correlation between this debug printk in
      fixup_irqs() and the do_IRQ() warning.
      
      This patchset resolves this by adding definitions for
      VECTOR_UNDEFINED(-1) and VECTOR_RETRIGGERED(-2) and modifying
      the code to use them.
      
      Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=64831Signed-off-by: NPrarit Bhargava <prarit@redhat.com>
      Reviewed-by: NRui Wang <rui.y.wang@intel.com>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Seiji Aguchi <seiji.aguchi@hds.com>
      Cc: Yang Zhang <yang.z.zhang@Intel.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: janet.morgan@Intel.com
      Cc: tony.luck@Intel.com
      Cc: ruiv.wang@gmail.com
      Link: http://lkml.kernel.org/r/1388938252-16627-1-git-send-email-prarit@redhat.com
      [ Cleaned up the code a bit. ]
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      9345005f
  6. 09 11月, 2013 1 次提交
  7. 07 8月, 2013 1 次提交
  8. 21 6月, 2013 1 次提交
    • S
      x86, trace: Add irq vector tracepoints · cf910e83
      Seiji Aguchi 提交于
      [Purpose of this patch]
      
      As Vaibhav explained in the thread below, tracepoints for irq vectors
      are useful.
      
      http://www.spinics.net/lists/mm-commits/msg85707.html
      
      <snip>
      The current interrupt traces from irq_handler_entry and irq_handler_exit
      provide when an interrupt is handled.  They provide good data about when
      the system has switched to kernel space and how it affects the currently
      running processes.
      
      There are some IRQ vectors which trigger the system into kernel space,
      which are not handled in generic IRQ handlers.  Tracing such events gives
      us the information about IRQ interaction with other system events.
      
      The trace also tells where the system is spending its time.  We want to
      know which cores are handling interrupts and how they are affecting other
      processes in the system.  Also, the trace provides information about when
      the cores are idle and which interrupts are changing that state.
      <snip>
      
      On the other hand, my usecase is tracing just local timer event and
      getting a value of instruction pointer.
      
      I suggested to add an argument local timer event to get instruction pointer before.
      But there is another way to get it with external module like systemtap.
      So, I don't need to add any argument to irq vector tracepoints now.
      
      [Patch Description]
      
      Vaibhav's patch shared a trace point ,irq_vector_entry/irq_vector_exit, in all events.
      But there is an above use case to trace specific irq_vector rather than tracing all events.
      In this case, we are concerned about overhead due to unwanted events.
      
      So, add following tracepoints instead of introducing irq_vector_entry/exit.
      so that we can enable them independently.
         - local_timer_vector
         - reschedule_vector
         - call_function_vector
         - call_function_single_vector
         - irq_work_entry_vector
         - error_apic_vector
         - thermal_apic_vector
         - threshold_apic_vector
         - spurious_apic_vector
         - x86_platform_ipi_vector
      
      Also, introduce a logic switching IDT at enabling/disabling time so that a time penalty
      makes a zero when tracepoints are disabled. Detailed explanations are as follows.
       - Create trace irq handlers with entering_irq()/exiting_irq().
       - Create a new IDT, trace_idt_table, at boot time by adding a logic to
         _set_gate(). It is just a copy of original idt table.
       - Register the new handlers for tracpoints to the new IDT by introducing
         macros to alloc_intr_gate() called at registering time of irq_vector handlers.
       - Add checking, whether irq vector tracing is on/off, into load_current_idt().
         This has to be done below debug checking for these reasons.
         - Switching to debug IDT may be kicked while tracing is enabled.
         - On the other hands, switching to trace IDT is kicked only when debugging
           is disabled.
      
      In addition, the new IDT is created only when CONFIG_TRACING is enabled to avoid being
      used for other purposes.
      Signed-off-by: NSeiji Aguchi <seiji.aguchi@hds.com>
      Link: http://lkml.kernel.org/r/51C323ED.5050708@hds.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      cf910e83
  9. 17 4月, 2013 1 次提交
  10. 28 1月, 2013 2 次提交
  11. 21 9月, 2011 1 次提交
  12. 27 7月, 2011 1 次提交
  13. 16 6月, 2011 1 次提交
  14. 14 2月, 2011 1 次提交
  15. 19 10月, 2010 1 次提交
    • P
      irq_work: Add generic hardirq context callbacks · e360adbe
      Peter Zijlstra 提交于
      Provide a mechanism that allows running code in IRQ context. It is
      most useful for NMI code that needs to interact with the rest of the
      system -- like wakeup a task to drain buffers.
      
      Perf currently has such a mechanism, so extract that and provide it as
      a generic feature, independent of perf so that others may also
      benefit.
      
      The IRQ context callback is generated through self-IPIs where
      possible, or on architectures like powerpc the decrementer (the
      built-in timer facility) is set to generate an interrupt immediately.
      
      Architectures that don't have anything like this get to do with a
      callback from the timer tick. These architectures can call
      irq_work_run() at the tail of any IRQ handlers that might enqueue such
      work (like the perf IRQ handler) to avoid undue latencies in
      processing the work.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: NKyle McMartin <kyle@mcmartin.ca>
      Acked-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      [ various fixes ]
      Signed-off-by: NHuang Ying <ying.huang@intel.com>
      LKML-Reference: <1287036094.7768.291.camel@yhuang-dev>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e360adbe
  16. 12 10月, 2010 3 次提交
  17. 16 3月, 2010 1 次提交
    • S
      x86: Handle legacy PIC interrupts on all the cpu's · 36e9e1ea
      Suresh Siddha 提交于
      Ingo Molnar reported that with the recent changes of not
      statically blocking IRQ0_VECTOR..IRQ15_VECTOR's on all the
      cpu's, broke an AMD platform (with Nvidia chipset) boot when
      "noapic" boot option is used.
      
      On this platform, legacy PIC interrupts are getting delivered to
      all the cpu's instead of just the boot cpu. Thus not
      initializing the vector to irq mapping for the legacy irq's
      resulted in not handling certain interrupts causing boot hang.
      
      Fix this by initializing the vector to irq mapping on all the
      logical cpu's, if the legacy IRQ is handled by the legacy PIC.
      Reported-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      [ -v2: io-apic-enabled improvement ]
      Acked-by: NYinghai Lu <yinghai@kernel.org>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      LKML-Reference: <1268692386.3296.43.camel@sbs-t61.sc.intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      36e9e1ea
  18. 20 2月, 2010 1 次提交
  19. 18 12月, 2009 1 次提交
    • S
      x86, irq: Allow 0xff for /proc/irq/[n]/smp_affinity on an 8-cpu system · 18374d89
      Suresh Siddha 提交于
      John Blackwood reported:
      > on an older Dell PowerEdge 6650 system with 8 cpus (4 are hyper-threaded),
      > and  32 bit (x86) kernel, once you change the irq smp_affinity of an irq
      > to be less than all cpus in the system, you can never change really the
      > irq smp_affinity back to be all cpus in the system (0xff) again,
      > even though no error status is returned on the "/bin/echo ff >
      > /proc/irq/[n]/smp_affinity" operation.
      >
      > This is due to that fact that BAD_APICID has the same value as
      > all cpus (0xff) on 32bit kernels, and thus the value returned from
      > set_desc_affinity() via the cpu_mask_to_apicid_and() function is treated
      > as a failure in set_ioapic_affinity_irq_desc(), and no affinity changes
      > are made.
      
      set_desc_affinity() is already checking if the incoming cpu mask
      intersects with the cpu online mask or not. So there is no need
      for the apic op cpu_mask_to_apicid_and() to check again
      and return BAD_APICID.
      
      Remove the BAD_APICID return value from cpu_mask_to_apicid_and()
      and also fix set_desc_affinity() to return -1 instead of using BAD_APICID
      to represent error conditions (as cpu_mask_to_apicid_and() can return
      logical or physical apicid values and BAD_APICID is really to represent
      bad physical apic id).
      Reported-by: NJohn Blackwood <john.blackwood@ccur.com>
      Root-caused-by: NJohn Blackwood <john.blackwood@ccur.com>
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <1261103386.2535.409.camel@sbs-t61>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      18374d89
  20. 02 11月, 2009 1 次提交
    • S
      x86: Remove move_cleanup_count from irq_cfg · 23359a88
      Suresh Siddha 提交于
      move_cleanup_count for each irq in irq_cfg is keeping track of
      the total number of cpus that need to free the corresponding
      vectors associated with the irq which has now been migrated to
      new destination. As long as this move_cleanup_count is non-zero
      (i.e., as long as we have n't freed the vector allocations on
      the old destinations) we were preventing the irq's further
      migration.
      
      This cleanup count is unnecessary and it is enough to not allow
      the irq migration till we send the cleanup vector to the
      previous irq destination, for which we already have irq_cfg's
      move_in_progress.  All we need to make sure is that we free the
      vector at the old desintation but we don't need to wait till
      that gets freed.
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Acked-by: NGary Hade <garyhade@us.ibm.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      LKML-Reference: <20091026230001.752968906@sbs-t61.sc.intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      23359a88
  21. 15 10月, 2009 1 次提交
  22. 14 10月, 2009 2 次提交
    • I
      x86, apic: Fix prototype in hw_irq.h · 7ec13187
      Ingo Molnar 提交于
      This warning:
      
       In file included from arch/x86/include/asm/ipi.h:23,
                        from arch/x86/kernel/apic/apic_noop.c:27:
       arch/x86/include/asm/hw_irq.h:105: warning: ‘struct irq_desc’ declared inside parameter list
       arch/x86/include/asm/hw_irq.h:105: warning: its scope is only this definition or declaration, which is probably not what you want
      
      triggers because irq_desc is defined after hw_irq.h is included
      in irq.h. Since it's pointer reference only, a forward declaration
      of the type will solve the problem.
      
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      7ec13187
    • D
      x86, apic: Move SGI UV functionality out of generic IO-APIC code · 9338ad6f
      Dimitri Sivanich 提交于
      Move UV specific functionality out of the generic IO-APIC code.
      Signed-off-by: NDimitri Sivanich <sivanich@sgi.com>
      LKML-Reference: <20091013203236.GD20543@sgi.com>
      [ Cleaned up the code some more in their new places. ]
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      9338ad6f
  23. 04 6月, 2009 2 次提交
    • A
      x86: fix panic with interrupts off (needed for MCE) · 4ef702c1
      Andi Kleen 提交于
      For some time each panic() called with interrupts disabled
      triggered the !irqs_disabled() WARN_ON in smp_call_function(),
      producing ugly backtraces and confusing users.
      
      This is a common situation with machine checks for example which
      tend to call panic with interrupts disabled, but will also hit
      in other situations e.g. panic during early boot.  In fact it
      means that panic cannot be called in many circumstances, which
      would be bad.
      
      This all started with the new fancy queued smp_call_function,
      which is then used by the shutdown path to shut down the other
      CPUs.
      
      On closer examination it turned out that the fancy RCU
      smp_call_function() does lots of things not suitable in a panic
      situation anyways, like allocating memory and relying on complex
      system state.
      
      I originally tried to patch this over by checking for panic
      there, but it was quite complicated and the original patch
      was also not very popular.  This also didn't fix some of the
      underlying complexity problems.
      
      The new code in post 2.6.29 tries to patch around this by
      checking for oops_in_progress, but that is not enough to make
      this fully safe and I don't think that's a real solution
      because panic has to be reliable.
      
      So instead use an own vector to reboot.  This makes the reboot
      code extremly straight forward, which is definitely a big plus
      in a panic situation where it is important to avoid relying on
      too much kernel state.  The new simple code is also safe to be
      called from interupts off region because it is very very simple.
      
      There can be situations where it is important that panic
      is reliable.  For example on a fatal machine check the panic
      is needed to get the system up again and running as quickly
      as possible.  So it's important that panic is reliable and
      all function it calls simple.
      
      This is why I came up with this simple vector scheme.
      It's very hard to beat in simplicity.  Vectors are not
      particularly precious anymore since all big systems are
      using per CPU vectors.
      
      Another possibility would have been to use an NMI similar
      to kdump, but there is still the problem that NMIs don't
      work reliably on some systems due to BIOS issues.  NMIs
      would have been able to stop CPUs running with interrupts
      off too.  In the sake of universal reliability I opted for
      using a non NMI vector for now.
      
      I put the reboot vector into the highest priority bucket of
      the APIC vectors and moved the 64bit UV_BAU message down
      instead into the next lower priority.
      
      [ Impact: bug fix, fixes an old regression ]
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      4ef702c1
    • A
      x86, mce: implement bootstrapping for machine check wakeups · ccc3c319
      Andi Kleen 提交于
      Machine checks support waking up the mcelog daemon quickly.
      
      The original wake up code for this was pretty ugly, relying on
      a idle notifier and a special process flag. The reason it did
      it this way is that the machine check handler is not subject
      to normal interrupt locking rules so it's not safe
      to call wake_up().  Instead it set a process flag
      and then either did the wakeup in the syscall return
      or in the idle notifier.
      
      This patch adds a new "bootstraping" method as replacement.
      
      The idea is that the handler checks if it's in a state where
      it is unsafe to call wake_up(). If it's safe it calls it directly.
      When it's not safe -- that is it interrupted in a critical
      section with interrupts disables -- it uses a new "self IPI" to trigger
      an IPI to its own CPU. This can be done safely because IPI
      triggers are atomic with some care. The IPI is raised
      once the interrupts are reenabled and can then safely call
      wake_up().
      
      When APICs are disabled the event is just queued and will be picked up
      eventually by the next polling timer. I think that's a reasonable
      compromise, since it should only happen quite rarely.
      
      Contains fixes from Ying Huang.
      
      [ solve conflict on irqinit, make it work on 32bit (entry_arch.h) - HS ]
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      ccc3c319
  24. 03 6月, 2009 1 次提交
    • Y
      perf_counter/x86: Remove the IRQ (non-NMI) handling bits · a3288106
      Yong Wang 提交于
      Remove the IRQ (non-NMI) handling bits as NMI will be used always.
      Signed-off-by: NYong Wang <yong.y.wang@intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: John Kacur <jkacur@redhat.com>
      LKML-Reference: <20090603051255.GA2791@ywang-moblin2.bj.intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      a3288106
  25. 18 5月, 2009 1 次提交
    • Y
      x86, apic: introduce io_apic_irq_attr · e5198075
      Yinghai Lu 提交于
      according to Ingo, io_apic irq-setup related functions have too many
      parameters with a repetitive signature.
      
      So reduce related funcs to get less params by passing a pointer
      to a newly defined io_apic_irq_attr structure.
      
      v2: io_apic_irq ==> irq_attr
          triggering ==> trigger
      
      v3: add set_io_apic_irq_attr
      
      [ Impact: cleanup ]
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
      Cc: Len Brown <lenb@kernel.org>
      LKML-Reference: <4A08ACD3.2070401@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e5198075
  26. 11 5月, 2009 1 次提交