1. 19 1月, 2010 1 次提交
    • S
      x86, irq: Use 0x20 for the IRQ_MOVE_CLEANUP_VECTOR instead of 0x1f · 6579b474
      Suresh Siddha 提交于
      After talking to some more folks inside intel (Peter Anvin, Asit Mallick),
      the safest option (for future compatibility etc) seen was to use vector 0x20
      for IRQ_MOVE_CLEANUP_VECTOR instead of using vector 0x1f (which is documented as
      reserved vector in the Intel IA32 manuals).
      
      Also we don't need to reserve the entire privilege level (all 16 vectors in
      the priority bucket that IRQ_MOVE_CLEANUP_VECTOR falls into), as the
      x86 architecture (section 10.9.3 in SDM Vol3a) specifies that with in the
      priority level, the higher the vector number the higher the priority.
      And hence we don't need to reserve the complete priority level 0x20-0x2f for
      the IRQ migration cleanup logic.
      
      So change the IRQ_MOVE_CLEANUP_VECTOR to 0x20 and  allow 0x21-0x2f to be used
      for device interrupts. 0x30-0x3f will be used for ISA interrupts (these
      also can be migrated in the context of IOAPIC and hence need to be at a higher
      priority level than IRQ_MOVE_CLEANUP_VECTOR).
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <20100114002118.521826763@sbs-t61.sc.intel.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Maciej W. Rozycki <macro@linux-mips.org>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      6579b474
  2. 05 1月, 2010 2 次提交
    • H
      x86, apic: Don't waste a vector to improve vector spread · ea943966
      H. Peter Anvin 提交于
      We want to use a vector-assignment sequence that avoids stumbling onto
      0x80 earlier in the sequence, in order to improve the spread of
      vectors across priority levels on machines with a small number of
      interrupt sources.  Right now, this is done by simply making the first
      vector (0x31 or 0x41) completely unusable.  This is unnecessary; all
      we need is to start assignment at a +1 offset, we don't actually need
      to prohibit the usage of this vector once we have wrapped around.
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      LKML-Reference: <4B426550.6000209@kernel.org>
      ea943966
    • H
      x86, apic: Reclaim IDT vectors 0x20-0x2f · 99d113b1
      H. Peter Anvin 提交于
      Reclaim 16 IDT vectors and make them available for general allocation.
      
      Reclaim vectors 0x20-0x2f by reallocating the IRQ_MOVE_CLEANUP_VECTOR
      to vector 0x1f.  This is in the range of vector numbers that is
      officially reserved for the CPU (for exceptions), however, the use of
      the APIC to generate any vector 0x10 or above is documented, and the
      CPU internally can receive any vector number (the legacy BIOS uses INT
      0x08-0x0f for interrupts, as messed up as that is.)
      
      Since IRQ_MOVE_CLEANUP_VECTOR has to be alone in the lowest-numbered
      priority level (block of 16), this effectively enables us to reclaim
      an otherwise-unusable APIC priority level and put it to use.
      
      Since this is a transient kernel-only allocation we can change it at
      any time, and if/when there is an exception at vector 0x1f this
      assignment needs to be changed as part of OS enabling that new feature.
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      LKML-Reference: <4B4284C6.9030107@kernel.org>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      99d113b1
  3. 30 12月, 2009 1 次提交
    • Y
      x86: Increase NR_IRQS and nr_irqs · 9959c888
      Yinghai Lu 提交于
      I have a system with lots of igb and ixgbe, when iov/vf are
      enabled for them, we hit the limit of 3064.
      
      when system has 20 pcie installed, and one card has 2
      functions, and one function needs 64 msi-x,
       may need 20 * 2 * 64 = 2560 for msi-x
      
      but if iov and vf are enabled
       may need 20 * 2 * 64 * 3 = 7680 for msi-x
      assume system with 5 ioapic, nr_irqs_gsi will be 120.
      
      NR_CPUS = 512, and nr_cpu_ids = 128
      will have NR_IRQS = 256 + 512 * 64 = 33024
      will have nr_irqs = 120 + 8 * 128 + 120 * 64 = 8824
      
      When SPARSE_IRQ is not set, there is no increase with kernel data
      size.
      
      when NR_CPUS=128, and SPARSE_IRQ is set:
         text		   data	    bss		   dec		 hex	filename
      21837444	4216564	12480736	38534744	24bfe58	vmlinux.before
      21837442	4216580	12480736	38534758	24bfe66	vmlinux.after
      when NR_CPUS=4096, and SPARSE_IRQ is set
         text		   data	    bss		   dec		 hex	filename
      21878619	5610244	13415392	40904255	270263f	vmlinux.before
      21878617	5610244	13415392	40904253	270263d	vmlinux.after
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <4B398ECD.1080506@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      9959c888
  4. 22 12月, 2009 2 次提交
    • A
      ACPI: processor: finish unifying arch_acpi_processor_init_pdc() · 6c5807d7
      Alex Chiang 提交于
      The only thing arch-specific about calling _PDC is what bits get
      set in the input obj_list buffer.
      
      There's no need for several levels of indirection to twiddle those
      bits. Additionally, since we're just messing around with a buffer,
      we can simplify the interface; no need to pass around the entire
      struct acpi_processor * just to get at the buffer.
      
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: NAlex Chiang <achiang@hp.com>
      Signed-off-by: NLen Brown <len.brown@intel.com>
      6c5807d7
    • A
      ACPI: processor: introduce arch_has_acpi_pdc · 1d9cb470
      Alex Chiang 提交于
      arch dependent helper function that tells us if we should attempt to
      evaluate _PDC on this machine or not.
      
      The x86 implementation assumes that the CPUs in the machine must be
      homogeneous, and that you cannot mix CPUs of different vendors.
      
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: NAlex Chiang <achiang@hp.com>
      Signed-off-by: NLen Brown <len.brown@intel.com>
      1d9cb470
  5. 18 12月, 2009 1 次提交
    • S
      x86, irq: Allow 0xff for /proc/irq/[n]/smp_affinity on an 8-cpu system · 18374d89
      Suresh Siddha 提交于
      John Blackwood reported:
      > on an older Dell PowerEdge 6650 system with 8 cpus (4 are hyper-threaded),
      > and  32 bit (x86) kernel, once you change the irq smp_affinity of an irq
      > to be less than all cpus in the system, you can never change really the
      > irq smp_affinity back to be all cpus in the system (0xff) again,
      > even though no error status is returned on the "/bin/echo ff >
      > /proc/irq/[n]/smp_affinity" operation.
      >
      > This is due to that fact that BAD_APICID has the same value as
      > all cpus (0xff) on 32bit kernels, and thus the value returned from
      > set_desc_affinity() via the cpu_mask_to_apicid_and() function is treated
      > as a failure in set_ioapic_affinity_irq_desc(), and no affinity changes
      > are made.
      
      set_desc_affinity() is already checking if the incoming cpu mask
      intersects with the cpu online mask or not. So there is no need
      for the apic op cpu_mask_to_apicid_and() to check again
      and return BAD_APICID.
      
      Remove the BAD_APICID return value from cpu_mask_to_apicid_and()
      and also fix set_desc_affinity() to return -1 instead of using BAD_APICID
      to represent error conditions (as cpu_mask_to_apicid_and() can return
      logical or physical apicid values and BAD_APICID is really to represent
      bad physical apic id).
      Reported-by: NJohn Blackwood <john.blackwood@ccur.com>
      Root-caused-by: NJohn Blackwood <john.blackwood@ccur.com>
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <1261103386.2535.409.camel@sbs-t61>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      18374d89
  6. 17 12月, 2009 6 次提交
    • F
      perf events, x86/stacktrace: Fix performance/softlockup by providing a special... · 06d65bda
      Frederic Weisbecker 提交于
      perf events, x86/stacktrace: Fix performance/softlockup by providing a special frame pointer-only stack walker
      
      It's just wasteful for stacktrace users like perf to walk
      through every entries on the stack whereas these only accept
      reliable ones, ie: that the frame pointer validates.
      
      Since perf requires pure reliable stacktraces, it needs a stack
      walker based on frame pointers-only to optimize the stacktrace
      processing.
      
      This might solve some near-lockup scenarios that can be triggered
      by call-graph tracing timer events.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1261024834-5336-2-git-send-regression-fweisbec@gmail.com>
      [ v2: fix for modular builds and small detail tidyup ]
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      06d65bda
    • F
      perf events, x86/stacktrace: Make stack walking optional · 61c1917f
      Frederic Weisbecker 提交于
      The current print_context_stack helper that does the stack
      walking job is good for usual stacktraces as it walks through
      all the stack and reports even addresses that look unreliable,
      which is nice when we don't have frame pointers for example.
      
      But we have users like perf that only require reliable
      stacktraces, and those may want a more adapted stack walker, so
      lets make this function a callback in stacktrace_ops that users
      can tune for their needs.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1261024834-5336-1-git-send-regression-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      61c1917f
    • S
      x86, cpuid: Add "volatile" to asm in native_cpuid() · 45a94d7c
      Suresh Siddha 提交于
      xsave_cntxt_init() does something like:
      
      	cpuid(0xd, ..);	// find out what features FP/SSE/.. etc are supported
      
      	xsetbv();	// enable the features known to OS
      
      	cpuid(0xd, ..);	// find out the size of the context for features enabled
      
      Depending on what features get enabled in xsetbv(), value of the
      cpuid.eax=0xd.ecx=0.ebx changes correspondingly (representing the
      size of the context that is enabled).
      
      As we don't have volatile keyword for native_cpuid(), gcc 4.1.2
      optimizes away the second cpuid and the kernel continues to use
      the cpuid information obtained before xsetbv(), ultimately leading to kernel
      crash on processors supporting more state than the legacy FP/SSE.
      
      Add "volatile" for native_cpuid().
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <1261009542.2745.55.camel@sbs-t61.sc.intel.com>
      Cc: stable@kernel.org
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      45a94d7c
    • B
      x86, msr: msrs_alloc/free for CONFIG_SMP=n · 6ede31e0
      Borislav Petkov 提交于
      Randy Dunlap reported the following build error:
      
      "When CONFIG_SMP=n, CONFIG_X86_MSR=m:
      
      ERROR: "msrs_free" [drivers/edac/amd64_edac_mod.ko] undefined!
      ERROR: "msrs_alloc" [drivers/edac/amd64_edac_mod.ko] undefined!"
      
      This is due to the fact that <arch/x86/lib/msr.c> is conditioned on
      CONFIG_SMP and in the UP case we have only the stubs in the header.
      Fork off SMP functionality into a new file (msr-smp.c) and build
      msrs_{alloc,free} unconditionally.
      Reported-by: NRandy Dunlap <randy.dunlap@oracle.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Signed-off-by: NBorislav Petkov <petkovbb@gmail.com>
      LKML-Reference: <20091216231625.GD27228@liondog.tnic>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      6ede31e0
    • A
      x86, amd: Get multi-node CPU info from NodeId MSR instead of PCI config space · 9d260ebc
      Andreas Herrmann 提交于
      Use NodeId MSR to get NodeId and number of nodes per processor.
      Signed-off-by: NAndreas Herrmann <andreas.herrmann3@amd.com>
      LKML-Reference: <20091216144355.GB28798@alberich.amd.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      9d260ebc
    • A
      sanitize do_pipe_flags() callers in arch · 853b3da1
      Al Viro 提交于
      * hpux_pipe() - no need to take BKL
      * sys32_pipe() in arch/x86/ia32 and xtensa_pipe() in arch/xtensa -
      	no need at all, since both functions are open-coded sys_pipe()
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      853b3da1
  7. 16 12月, 2009 16 次提交
  8. 15 12月, 2009 6 次提交
  9. 13 12月, 2009 1 次提交
  10. 12 12月, 2009 2 次提交
    • S
      kbuild: move asm-offsets.h to include/generated · 559df2e0
      Sam Ravnborg 提交于
      The simplest method was to add an extra asm-offsets.h
      file in arch/$ARCH/include/asm that references the generated file.
      
      We can now migrate the architectures one-by-one to reference
      the generated file direct - and when done we can delete the
      temporary arch/$ARCH/include/asm/asm-offsets.h file.
      Signed-off-by: NSam Ravnborg <sam@ravnborg.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NMichal Marek <mmarek@suse.cz>
      559df2e0
    • B
      x86, msr: Add support for non-contiguous cpumasks · 50542251
      Borislav Petkov 提交于
      The current rd/wrmsr_on_cpus helpers assume that the supplied
      cpumasks are contiguous. However, there are machines out there
      like some K8 multinode Opterons which have a non-contiguous core
      enumeration on each node (e.g. cores 0,2 on node 0 instead of 0,1), see
      http://www.gossamer-threads.com/lists/linux/kernel/1160268.
      
      This patch fixes out-of-bounds writes (see URL above) by adding per-CPU
      msr structs which are used on the respective cores.
      
      Additionally, two helpers, msrs_{alloc,free}, are provided for use by
      the callers of the MSR accessors.
      
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Mauro Carvalho Chehab <mchehab@redhat.com>
      Cc: Aristeu Rozanski <aris@redhat.com>
      Cc: Randy Dunlap <randy.dunlap@oracle.com>
      Cc: Doug Thompson <dougthompson@xmission.com>
      Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
      LKML-Reference: <20091211171440.GD31998@aftab>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      50542251
  11. 11 12月, 2009 2 次提交