1. 30 Dec, 2009 1 commit
    • x86: Increase NR_IRQS and nr_irqs · 9959c888
      Yinghai Lu committed
      I have a system with lots of igb and ixgbe devices; when IOV/VFs are
      enabled for them, we hit the limit of 3064.
      
      When a system has 20 PCIe cards installed, each card has 2
      functions, and each function needs 64 MSI-X vectors, it
       may need 20 * 2 * 64 = 2560 for MSI-X.
      
      But if IOV and VFs are enabled, it
       may need 20 * 2 * 64 * 3 = 7680 for MSI-X.
      Assume a system with 5 IO-APICs; nr_irqs_gsi will be 120.
      
      NR_CPUS = 512, and nr_cpu_ids = 128
      will have NR_IRQS = 256 + 512 * 64 = 33024
      will have nr_irqs = 120 + 8 * 128 + 120 * 64 = 8824
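
The arithmetic above can be reproduced with a small C sketch. The function names and parameters are illustrative assumptions, not the kernel's actual macros; the constants (256, 8 per CPU, 64 per GSI) are taken from the figures in this message.

```c
/* Illustrative only: mirrors the arithmetic quoted in the commit message,
 * not the kernel's real NR_IRQS / nr_irqs definitions. */

/* MSI-X demand: cards * functions per card * vectors per function,
 * scaled by a multiplier (3 in the message when IOV/VFs are enabled). */
static unsigned int msix_demand(unsigned int cards, unsigned int funcs,
                                unsigned int vectors, unsigned int multiplier)
{
    return cards * funcs * vectors * multiplier;
}

/* Compile-time bound from the message: 256 + NR_CPUS * 64. */
static unsigned int nr_irqs_static_limit(unsigned int nr_cpus)
{
    return 256u + nr_cpus * 64u;
}

/* Run-time value from the message:
 * nr_irqs_gsi + 8 * nr_cpu_ids + nr_irqs_gsi * 64. */
static unsigned int nr_irqs_runtime(unsigned int nr_irqs_gsi,
                                    unsigned int nr_cpu_ids)
{
    return nr_irqs_gsi + 8u * nr_cpu_ids + nr_irqs_gsi * 64u;
}
```

With the message's inputs, `nr_irqs_runtime(120, 128)` gives 8824, which leaves room for the 7680 MSI-X interrupts computed by `msix_demand(20, 2, 64, 3)`.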
      
      When SPARSE_IRQ is not set, there is no increase with kernel data
      size.
      
      when NR_CPUS=128, and SPARSE_IRQ is set:
         text		   data	    bss		   dec		 hex	filename
      21837444	4216564	12480736	38534744	24bfe58	vmlinux.before
      21837442	4216580	12480736	38534758	24bfe66	vmlinux.after
      when NR_CPUS=4096, and SPARSE_IRQ is set
         text		   data	    bss		   dec		 hex	filename
      21878619	5610244	13415392	40904255	270263f	vmlinux.before
      21878617	5610244	13415392	40904253	270263d	vmlinux.after
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <4B398ECD.1080506@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  2. 13 Dec, 2009 1 commit
  3. 15 Oct, 2009 1 commit
  4. 04 Jun, 2009 3 commits
    • x86, mce: define MCE_VECTOR · 8fa8dd9e
      Andi Kleen committed
      Add MCE_VECTOR for the #MC exception.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
    • x86: fix panic with interrupts off (needed for MCE) · 4ef702c1
      Andi Kleen committed
      For some time each panic() called with interrupts disabled
      triggered the !irqs_disabled() WARN_ON in smp_call_function(),
      producing ugly backtraces and confusing users.
      
      This is a common situation with machine checks for example which
      tend to call panic with interrupts disabled, but will also hit
      in other situations e.g. panic during early boot.  In fact it
      means that panic cannot be called in many circumstances, which
      would be bad.
      
      This all started with the new fancy queued smp_call_function,
      which is then used by the shutdown path to shut down the other
      CPUs.
      
      On closer examination it turned out that the fancy RCU
      smp_call_function() does lots of things not suitable in a panic
      situation anyways, like allocating memory and relying on complex
      system state.
      
      I originally tried to patch this over by checking for panic
      there, but it was quite complicated and the original patch
      was also not very popular.  This also didn't fix some of the
      underlying complexity problems.
      
      The new code in post 2.6.29 tries to patch around this by
      checking for oops_in_progress, but that is not enough to make
      this fully safe and I don't think that's a real solution
      because panic has to be reliable.
      
      So instead use a dedicated vector to reboot.  This makes the reboot
      code extremely straightforward, which is definitely a big plus
      in a panic situation where it is important to avoid relying on
      too much kernel state.  The new simple code is also safe to
      call from an interrupts-off region because it is very simple.
      
      There can be situations where it is important that panic
      is reliable.  For example on a fatal machine check the panic
      is needed to get the system up again and running as quickly
      as possible.  So it's important that panic is reliable and
      all functions it calls are simple.
      
      This is why I came up with this simple vector scheme.
      It's very hard to beat in simplicity.  Vectors are not
      particularly precious anymore since all big systems are
      using per CPU vectors.
      
      Another possibility would have been to use an NMI similar
      to kdump, but there is still the problem that NMIs don't
      work reliably on some systems due to BIOS issues.  NMIs
      would have been able to stop CPUs running with interrupts
      off too.  For the sake of universal reliability I opted for
      using a non-NMI vector for now.
      
      I put the reboot vector into the highest priority bucket of
      the APIC vectors and moved the 64bit UV_BAU message down
      instead into the next lower priority.
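
The "priority bucket" wording reflects how the x86 local APIC groups vectors: the upper four bits of the 8-bit vector number select the priority class, so each class covers 16 consecutive vectors. A minimal sketch follows; the helper names are hypothetical, and no specific vector assignment from this patch is implied.

```c
/* Priority class of a local APIC vector: bits 7:4 of the vector number.
 * Vectors that share a class form one "bucket" of 16. */
static unsigned int apic_priority_class(unsigned char vector)
{
    return (unsigned int)vector >> 4;
}

/* Nonzero if vector a sits in a strictly higher priority bucket than b.
 * (Hypothetical helper, for illustration only.) */
static int in_higher_bucket(unsigned char a, unsigned char b)
{
    return apic_priority_class(a) > apic_priority_class(b);
}
```

Under this grouping, any vector in the range 0xf0 to 0xff belongs to class 15, the highest bucket, and a vector moved down to 0xe0 to 0xef lands in the next lower one.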
      
      [ Impact: bug fix, fixes an old regression ]
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
    • x86, mce: implement bootstrapping for machine check wakeups · ccc3c319
      Andi Kleen committed
      Machine checks support waking up the mcelog daemon quickly.
      
      The original wake-up code for this was pretty ugly, relying on
      an idle notifier and a special process flag. The reason it did
      it this way is that the machine check handler is not subject
      to normal interrupt locking rules so it's not safe
      to call wake_up().  Instead it set a process flag
      and then either did the wakeup in the syscall return
      or in the idle notifier.
      
      This patch adds a new "bootstrapping" method as replacement.
      
      The idea is that the handler checks if it's in a state where
      it is unsafe to call wake_up(). If it's safe it calls it directly.
      When it's not safe -- that is, it interrupted a critical
      section with interrupts disabled -- it uses a new "self IPI" to trigger
      an IPI to its own CPU. This can be done safely because IPI
      triggers are atomic with some care. The IPI is raised
      once the interrupts are reenabled and can then safely call
      wake_up().
      
      When APICs are disabled the event is just queued and will be picked up
      eventually by the next polling timer. I think that's a reasonable
      compromise, since it should only happen quite rarely.
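
The decision described above can be modeled in a few lines of C. This is a userspace sketch under stated assumptions: the enum, the function name, and the two boolean inputs are invented for illustration and do not correspond to identifiers in the patch.

```c
/* Hypothetical model of the wake-up decision; names are illustrative. */
enum wake_action {
    WAKE_DIRECT,    /* safe context: call wake_up() immediately */
    WAKE_SELF_IPI,  /* unsafe context: defer via an IPI to this CPU */
    WAKE_QUEUED     /* APIC disabled: leave it for the polling timer */
};

/* irqs_were_disabled: the handler interrupted a section that had
 *                     interrupts disabled, so wake_up() is unsafe.
 * apic_enabled:       a self IPI can be raised on this CPU. */
static enum wake_action choose_wake_action(int irqs_were_disabled,
                                           int apic_enabled)
{
    if (!irqs_were_disabled)
        return WAKE_DIRECT;
    if (apic_enabled)
        return WAKE_SELF_IPI;
    return WAKE_QUEUED;
}
```

The ordering matters: the direct path is preferred whenever it is safe, and the polling fallback is reached only when neither of the faster options is available.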
      
      Contains fixes from Ying Huang.
      
      [ solve conflict on irqinit, make it work on 32bit (entry_arch.h) - HS ]
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  5. 03 Jun, 2009 1 commit
    • perf_counter/x86: Remove the IRQ (non-NMI) handling bits · a3288106
      Yong Wang committed
      Remove the IRQ (non-NMI) handling bits as NMI will be used always.
      Signed-off-by: Yong Wang <yong.y.wang@intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: John Kacur <jkacur@redhat.com>
      LKML-Reference: <20090603051255.GA2791@ywang-moblin2.bj.intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  6. 29 May, 2009 2 commits
  7. 10 Apr, 2009 1 commit
  8. 07 Apr, 2009 1 commit
  9. 05 Mar, 2009 1 commit
  10. 25 Feb, 2009 1 commit
  11. 16 Feb, 2009 1 commit
  12. 31 Jan, 2009 9 commits
  13. 21 Jan, 2009 2 commits
    • x86: make x86_32 use tlb_64.c · 02cf94c3
      Tejun Heo committed
      Impact: less contention when issuing invalidate IPI, cleanup
      
      Make x86_32 use the same tlb code as 64bit.  The 64bit code uses
      multiple IPI vectors for tlb shootdown to reduce contention.  This
      patch makes x86_32 allocate the same 8 IPIs as x86_64 and share the
      code paths.
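
One way several IPI vectors can reduce contention is to spread sending CPUs across them, so that senders mapped to different vectors do not serialize on the same per-vector state. The sketch below assumes a simple modulo mapping; the macro and function names are hypothetical, and only the count of 8 comes from the message above.

```c
/* The message says x86_32 gets the same 8 invalidate IPIs as x86_64. */
#define NUM_TLB_VECTORS_SKETCH 8u

/* Assumed mapping: pick a shootdown slot from the sending CPU's id.
 * CPUs that map to different slots do not contend with each other. */
static unsigned int tlb_vector_slot(unsigned int sender_cpu)
{
    return sender_cpu % NUM_TLB_VECTORS_SKETCH;
}
```

With this mapping, CPUs 0 and 8 share a slot while CPUs 0 through 7 each get their own.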
      
      Note that the usage of asmlinkage is inconsistent for x86_32 and 64
      and calls for further cleanup.  This has been noted with a FIXME
      comment in tlb_64.c.
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • x86: prepare for tlb merge · 6dd01bed
      Tejun Heo committed
      Impact: clean up, ipi vector number reordering for x86_32
      
      Make the following changes to prepare for tlb merge.
      
      * reorder x86_32 IPI vectors
      
      * adjust tlb_32.c and tlb_64.c such that their logic coincides exactly
        - on spurious invalidate IPI, tlb_32 acks the irq
        - tlb_64 now has proper memory barriers around clearing
          flush_cpumask (no change in generated code)
      
      * unexport flush_tlb_page from tlb_32.c, there's no user
      
      * use unsigned int for cpu id
      
      * drop unnecessary includes from tlb_64.c
      Signed-off-by: Tejun Heo <tj@kernel.org>
  14. 13 Jan, 2009 1 commit
  15. 12 Jan, 2009 1 commit
    • irq: initialize nr_irqs based on nr_cpu_ids · 9332fccd
      Mike Travis committed
      Impact: Reduce memory usage.
      
      This is the second half of the changes to make the irq_desc_ptrs be
      variable sized based on nr_cpu_ids.  This is done by adding a new
      "max_nr_irqs" macro to irq_vectors.h (and a dummy in irqnr.h) to
      return a max NR_IRQS value based on NR_CPUS or nr_cpu_ids.
      
      This necessitated moving the define of MAX_IO_APICS to a separate
      file (asm/apicnum.h) so it could be included without the baggage
      of the other asm/apicdef.h declarations.
      Signed-off-by: Mike Travis <travis@sgi.com>
  16. 08 Dec, 2008 3 commits
    • performance counters: x86 support · 241771ef
      Ingo Molnar committed
      Implement performance counters for x86 Intel CPUs.
      
      It's simplified right now: the PERFMON CPU feature is assumed,
      which is available in Core2 and later Intel CPUs.
      
      The design is flexible to be extended to more CPU types as well.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: use NR_IRQS_LEGACY · 99d093d1
      Yinghai Lu committed
      Impact: cleanup
      
      Introduce NR_IRQS_LEGACY instead of a hard-coded number.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • sparse irq_desc[] array: core kernel and x86 changes · 0b8f1efa
      Yinghai Lu committed
      Impact: new feature
      
      Problem on distro kernels: irq_desc[NR_IRQS] takes megabytes of RAM with
      NR_CPUS set to large values. The goal is to be able to scale up to much
      larger NR_IRQS value without impacting the (important) common case.
      
      To solve this, we generalize irq_desc[NR_IRQS] to an (optional) array of
      irq_desc pointers.
      
      When CONFIG_SPARSE_IRQ=y is used, we use kzalloc_node to get irq_desc,
      this also makes the IRQ descriptors NUMA-local (to the site that calls
      request_irq()).
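
The idea of replacing a static descriptor array with an array of pointers can be sketched in userspace C. Everything here is a hypothetical stand-in: the struct and function names are invented for illustration, and `calloc` takes the place of the kernel's `kzalloc_node`, so NUMA placement is not modeled.

```c
#include <stdlib.h>

/* Hypothetical, userspace-only stand-ins for the kernel structures. */
struct irq_desc_sketch {
    unsigned int irq;
    void *chip_data;                /* per-IRQ private data slot */
};

struct irq_table_sketch {
    struct irq_desc_sketch **ptrs;  /* one pointer slot per possible IRQ */
    unsigned int nr_irqs;
};

/* Allocate only the pointer array; descriptors are created on demand. */
static int irq_table_init(struct irq_table_sketch *t, unsigned int nr_irqs)
{
    t->ptrs = calloc(nr_irqs, sizeof(*t->ptrs));
    t->nr_irqs = t->ptrs ? nr_irqs : 0;
    return t->ptrs ? 0 : -1;
}

/* Return the descriptor for irq, allocating it on first use. */
static struct irq_desc_sketch *irq_desc_get_or_alloc(struct irq_table_sketch *t,
                                                     unsigned int irq)
{
    if (irq >= t->nr_irqs)
        return NULL;
    if (!t->ptrs[irq]) {
        /* The kernel would use kzalloc_node() here for NUMA locality. */
        struct irq_desc_sketch *d = calloc(1, sizeof(*d));
        if (!d)
            return NULL;
        d->irq = irq;
        t->ptrs[irq] = d;
    }
    return t->ptrs[irq];
}

/* Free every descriptor that was allocated, then the pointer array. */
static void irq_table_free(struct irq_table_sketch *t)
{
    for (unsigned int i = 0; i < t->nr_irqs; i++)
        free(t->ptrs[i]);
    free(t->ptrs);
    t->ptrs = NULL;
    t->nr_irqs = 0;
}
```

The memory cost of an unused IRQ drops from a full descriptor to a single pointer, which is what lets the table scale to a large IRQ count without penalizing the common case.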
      
      This gets rid of the irq_cfg[] static array on x86 as well: irq_cfg now
      uses desc->chip_data for x86 to store irq_cfg.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  17. 06 Nov, 2008 2 commits
  18. 23 Oct, 2008 2 commits
  19. 16 Oct, 2008 2 commits
  20. 20 Aug, 2008 1 commit
  21. 11 Aug, 2008 1 commit
    • x86_64: restore the proper NR_IRQS define so larger systems work. · 3c7569b2
      Eric W. Biederman committed
      As pointed out and tracked by Yinghai Lu <yhlu.kernel@gmail.com>:
      
       Dhaval Giani got:
       kernel BUG at arch/x86/kernel/io_apic_64.c:357!
       invalid opcode: 0000 [1] SMP
       CPU 24
       ...
      
      His system (x3950) has 8 IO-APICs, so IRQ numbers go above 256.
      
      This was caused by:
      
             commit 9b7dc567
             Author: Thomas Gleixner <tglx@linutronix.de>
             Date:   Fri May 2 20:10:09 2008 +0200
      
                x86: unify interrupt vector defines
      
                The interrupt vector defines are copied 4 times around with minimal
                differences. Move them all into asm-x86/irq_vectors.h
      
      It appears that Thomas did not notice that x86_64 does something
      completely different when he merged irq_vectors.h.
      
      We can solve this for 2.6.27 by simply reintroducing the old heuristic
      for setting NR_IRQS on x86_64 to a usable value, which trivially removes
      the regression.
      
      Long term it would be nice to harmonize the handling of ioapic interrupts
      of x86_32 and x86_64 so we don't have this kind of confusion.
      
      Dhaval Giani <dhaval@linux.vnet.ibm.com> tested an earlier version of
      this patch by YH which confirms simply increasing NR_IRQS fixes the
      problem.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Acked-by: Yinghai Lu <yhlu.kernel@gmail.com>
      Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com>
      Cc: Mike Travis <travis@sgi.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  22. 23 Jul, 2008 1 commit
    • x86: consolidate header guards · 77ef50a5
      Vegard Nossum committed
      This patch is the result of an automatic script that consolidates the
      format of all the headers in include/asm-x86/.
      
      The format:
      
      1. No leading underscore. Names with leading underscores are reserved.
      2. Pathname components are separated by two underscores. So we can
         distinguish between mm_types.h and mm/types.h.
      3. Everything except letters and numbers is turned into single
         underscores.
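
The three rules can be expressed as a short C function. This is a sketch, not the script the patch used: the function name is hypothetical, upper-casing the letters is an assumption the rules do not state, and each non-alphanumeric character is mapped to exactly one underscore.

```c
#include <ctype.h>
#include <stddef.h>

/* Build a guard name from a header path given relative to the include
 * root. Returns 0 on success, -1 if the output buffer is too small.
 * Assumption: letters are upper-cased (not stated in the rules).
 * Rule 1 is met by not prepending any underscore to the result. */
static int make_header_guard(const char *path, char *out, size_t out_size)
{
    size_t n = 0;

    if (out_size == 0)
        return -1;

    for (const char *p = path; *p; p++) {
        unsigned char c = (unsigned char)*p;

        if (c == '/') {
            /* Rule 2: two underscores between path components. */
            if (n + 2 >= out_size)
                return -1;
            out[n++] = '_';
            out[n++] = '_';
        } else {
            if (n + 1 >= out_size)
                return -1;
            /* Rule 3: anything but a letter or digit becomes '_'. */
            out[n++] = isalnum(c) ? (char)toupper(c) : '_';
        }
    }
    out[n] = '\0';
    return 0;
}
```

Under these assumptions `mm_types.h` becomes `MM_TYPES_H` while `mm/types.h` becomes `MM__TYPES_H`, which is the distinction rule 2 is meant to preserve.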
      Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
  23. 11 Jul, 2008 1 commit