1. 17 Dec, 2009 5 commits
    • x86: Increase MAX_EARLY_RES; insufficient on 32-bit NUMA · 6a1e008a
      Yinghai Lu authored
      Due to recent changes to wakeup and mptable, we run out of early
      reservations on 32-bit NUMA.  Thus, increase the number available.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      LKML-Reference: <4B22D754.2020706@kernel.org>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
    • x86: Fix checking of SRAT when node 0 ram is not from 0 · 32996250
      Yinghai Lu authored
      Found one system that boots from socket1 instead of socket0; its SRAT gets rejected:
      
      [    0.000000] SRAT: Node 1 PXM 0 0-a0000
      [    0.000000] SRAT: Node 1 PXM 0 100000-80000000
      [    0.000000] SRAT: Node 1 PXM 0 100000000-2080000000
      [    0.000000] SRAT: Node 0 PXM 1 2080000000-4080000000
      [    0.000000] SRAT: Node 2 PXM 2 4080000000-6080000000
      [    0.000000] SRAT: Node 3 PXM 3 6080000000-8080000000
      [    0.000000] SRAT: Node 4 PXM 4 8080000000-a080000000
      [    0.000000] SRAT: Node 5 PXM 5 a080000000-c080000000
      [    0.000000] SRAT: Node 6 PXM 6 c080000000-e080000000
      [    0.000000] SRAT: Node 7 PXM 7 e080000000-10080000000
      ...
      [    0.000000] NUMA: Allocated memnodemap from 500000 - 701040
      [    0.000000] NUMA: Using 20 for the hash shift.
      [    0.000000] Adding active range (0, 0x2080000, 0x4080000) 0 entries of 3200 used
      [    0.000000] Adding active range (1, 0x0, 0x96) 1 entries of 3200 used
      [    0.000000] Adding active range (1, 0x100, 0x7f750) 2 entries of 3200 used
      [    0.000000] Adding active range (1, 0x100000, 0x2080000) 3 entries of 3200 used
      [    0.000000] Adding active range (2, 0x4080000, 0x6080000) 4 entries of 3200 used
      [    0.000000] Adding active range (3, 0x6080000, 0x8080000) 5 entries of 3200 used
      [    0.000000] Adding active range (4, 0x8080000, 0xa080000) 6 entries of 3200 used
      [    0.000000] Adding active range (5, 0xa080000, 0xc080000) 7 entries of 3200 used
      [    0.000000] Adding active range (6, 0xc080000, 0xe080000) 8 entries of 3200 used
      [    0.000000] Adding active range (7, 0xe080000, 0x10080000) 9 entries of 3200 used
      [    0.000000] SRAT: PXMs only cover 917504MB of your 1048566MB e820 RAM. Not used.
      [    0.000000] SRAT: SRAT not used.
      
      The early_node_map is not sorted, because node 0, with a non-zero start, comes first.
      
      So sort it right after all regions are registered.
      
      This also fixes a regression introduced by 8716273c (x86: Export srat physical topology).
      
      -v2: make it more solid to handle cross node case like node0 [0,4g), [8,12g) and node1 [4g, 8g), [12g, 16g)
      -v3: update comments.
      Reported-and-tested-by: Jens Axboe <jens.axboe@oracle.com>
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      LKML-Reference: <4B2579D2.3010201@kernel.org>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
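The sorting step described in this commit can be sketched in userspace C. This is a minimal illustration, not the kernel's code: `struct node_range`, `sort_ranges()` and `ranges_are_ordered()` are hypothetical names, and the sample values are the first four "Adding active range" entries from the log above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one early_node_map entry. */
struct node_range {
	int nid;
	unsigned long start_pfn;
	unsigned long end_pfn;
};

/* qsort comparator: order ranges by their starting page frame number. */
static int cmp_start_pfn(const void *a, const void *b)
{
	const struct node_range *ra = a, *rb = b;

	if (ra->start_pfn < rb->start_pfn)
		return -1;
	return ra->start_pfn > rb->start_pfn;
}

/* Sort the map once all regions have been registered. */
static void sort_ranges(struct node_range *map, size_t n)
{
	qsort(map, n, sizeof(*map), cmp_start_pfn);
}

/* Returns 1 if every range starts at or after the end of the previous one. */
static int ranges_are_ordered(const struct node_range *map, size_t n)
{
	for (size_t i = 1; i < n; i++)
		if (map[i].start_pfn < map[i - 1].end_pfn)
			return 0;
	return 1;
}
```

With the logged ranges, node 0 (starting at 0x2080000) is registered first, so the map fails the ordering check until `sort_ranges()` has run.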
    • x86, cpuid: Add "volatile" to asm in native_cpuid() · 45a94d7c
      Suresh Siddha authored
      xsave_cntxt_init() does something like:
      
      	cpuid(0xd, ..);	// find out what features FP/SSE/.. etc are supported
      
      	xsetbv();	// enable the features known to OS
      
      	cpuid(0xd, ..);	// find out the size of the context for features enabled
      
      Depending on what features get enabled in xsetbv(), value of the
      cpuid.eax=0xd.ecx=0.ebx changes correspondingly (representing the
      size of the context that is enabled).
      
      As we don't have volatile keyword for native_cpuid(), gcc 4.1.2
      optimizes away the second cpuid and the kernel continues to use
      the cpuid information obtained before xsetbv(), ultimately leading to kernel
      crash on processors supporting more state than the legacy FP/SSE.
      
      Add "volatile" for native_cpuid().
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <1261009542.2745.55.camel@sbs-t61.sc.intel.com>
      Cc: stable@kernel.org
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
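A userspace sketch of the idea follows. It is not the kernel's `native_cpuid()`: `cpuid_count()`, `cpuid_max_leaf()` and `HAVE_X86_CPUID` are hypothetical names, and the non-x86 fallback exists only so the file builds elsewhere. The point is the `volatile` qualifier, which tells the compiler that two calls with identical inputs may return different results and must both be emitted.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#define HAVE_X86_CPUID 1
#else
#define HAVE_X86_CPUID 0
#endif

/*
 * Execute CPUID for the given leaf/subleaf.  Without "volatile" the
 * compiler may treat the asm as a pure function of its inputs and
 * merge two identical calls into one.
 */
static inline void cpuid_count(uint32_t leaf, uint32_t subleaf,
			       uint32_t *eax, uint32_t *ebx,
			       uint32_t *ecx, uint32_t *edx)
{
#if HAVE_X86_CPUID
	__asm__ volatile("cpuid"
			 : "=a"(*eax), "=b"(*ebx), "=c"(*ecx), "=d"(*edx)
			 : "0"(leaf), "2"(subleaf));
#else
	/* Fallback for non-x86 builds: report nothing. */
	(void)leaf;
	(void)subleaf;
	*eax = *ebx = *ecx = *edx = 0;
#endif
}

/* Highest supported basic CPUID leaf (leaf 0, returned in EAX). */
static uint32_t cpuid_max_leaf(void)
{
	uint32_t eax, ebx, ecx, edx;

	cpuid_count(0, 0, &eax, &ebx, &ecx, &edx);
	return eax;
}
```

In the scenario from the commit message, the two `cpuid(0xd, ..)` calls would map to two `cpuid_count(0xd, 0, ...)` calls separated by the state change; the `volatile` keeps the second one from being dropped.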
    • x86, msr: msrs_alloc/free for CONFIG_SMP=n · 6ede31e0
      Borislav Petkov authored
      Randy Dunlap reported the following build error:
      
      "When CONFIG_SMP=n, CONFIG_X86_MSR=m:
      
      ERROR: "msrs_free" [drivers/edac/amd64_edac_mod.ko] undefined!
      ERROR: "msrs_alloc" [drivers/edac/amd64_edac_mod.ko] undefined!"
      
      This is due to the fact that <arch/x86/lib/msr.c> is conditioned on
      CONFIG_SMP and in the UP case we have only the stubs in the header.
      Fork off SMP functionality into a new file (msr-smp.c) and build
      msrs_{alloc,free} unconditionally.
      Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
      LKML-Reference: <20091216231625.GD27228@liondog.tnic>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
    • x86, amd: Get multi-node CPU info from NodeId MSR instead of PCI config space · 9d260ebc
      Andreas Herrmann authored
      Use NodeId MSR to get NodeId and number of nodes per processor.
      Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
      LKML-Reference: <20091216144355.GB28798@alberich.amd.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  2. 16 Dec, 2009 3 commits
    • x86: Add IA32_TSC_AUX MSR and use it · 5df97400
      Sheng Yang authored
      Clean up write_tsc() and write_tscp_aux() by replacing
      hardcoded values.
      
      No change in functionality.
      Signed-off-by: Sheng Yang <sheng@linux.intel.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      LKML-Reference: <1260942485-19156-4-git-send-email-sheng@linux.intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, msr/cpuid: Register enough minors for the MSR and CPUID drivers · 0b962d47
      H. Peter Anvin authored
      register_chrdev() hardcodes registering 256 minors, presumably to
      avoid breaking old drivers.  However, we need to register enough
      minors so that we have all possible CPUs.
      
      checkpatch warns on this patch, but the patch is correct: NR_CPUS here
      is a static *upper bound* on the *maximum CPU index* (not *number of
      CPUs!*) and that is what we want.
      Reported-and-tested-by: Russ Anderson <rja@sgi.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: Takashi Iwai <tiwai@suse.de>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      LKML-Reference: <tip-*@git.kernel.org>
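The distinction between the number of CPUs and the maximum CPU index can be illustrated with a small model. This is only a sketch under assumptions: the set of possible CPUs is represented as a 64-bit mask, and `cpu_count()` and `minors_needed()` are hypothetical helpers, not kernel functions.

```c
#include <assert.h>

/* Number of CPUs present in the mask (population count). */
static int cpu_count(unsigned long long possible)
{
	int n = 0;

	for (; possible; possible &= possible - 1)
		n++;
	return n;
}

/*
 * Number of minors needed so that minor == CPU index works for every
 * possible CPU: the highest set index plus one, not the CPU count.
 */
static int minors_needed(unsigned long long possible)
{
	int highest = -1;

	for (int cpu = 0; cpu < 64; cpu++)
		if (possible & (1ULL << cpu))
			highest = cpu;
	return highest + 1;
}
```

A sparse mask containing only CPUs 0 and 40 has a count of 2 but needs 41 minors, which is why a static upper bound on the index is the right quantity to register.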
    • x86: Fix kprobes build with non-gawk awk · 23637568
      Jonathan Nieder authored
      The instruction attribute table generator fails when run by mawk
      or original-awk:
      
       $ mawk -f arch/x86/tools/gen-insn-attr-x86.awk \
      	arch/x86/lib/x86-opcode-map.txt > /dev/null
       Semantic error at 240: Second IMM error
       $ echo $?
       1
      
      Line 240 contains "c8: ENTER Iw,Ib", which indicates that this
      instruction has two immediate operands, the second of which is
      one byte.  The script loops through the immediate operands using
      a for loop.
      
      Unfortunately, there is no guarantee in awk that a for (variable
      in array) loop will return the indices in increasing order.
      Internally, both original-awk and mawk iterate over a hash table
      for this purpose, and both implementations happen to produce the
      index 2 before 1.  The supposed second immediate operand is more
      than one byte wide, producing the error.
      
      So loop over the indices in increasing order instead.  As a
      side-effect, with mawk this means the silly two-entry hash table
      never has to be built.
      Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
      Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20091213220437.GA27718@progeny.tock>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
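The failure mode can be modelled in C. This is a simplified model of the rule described in the message, not a transcription of the awk script: `check_immediates()` is a hypothetical function, and the visiting order is passed in explicitly to stand in for the unspecified order of an awk `for (variable in array)` loop.

```c
#include <assert.h>

/*
 * width[i] is the size in bytes of immediate operand i; order[] gives
 * the sequence in which the operands are visited.  Any operand visited
 * after the first is treated as "the second immediate" and must be one
 * byte wide.  Returns 0 on success, -1 for a "Second IMM error".
 */
static int check_immediates(const int *width, const int *order, int n)
{
	int seen = 0;

	for (int i = 0; i < n; i++) {
		int w = width[order[i]];

		if (seen > 0 && w != 1)
			return -1;
		seen++;
	}
	return 0;
}
```

For "ENTER Iw,Ib" the widths are 2 and 1. Visiting the operands in increasing index order passes, while visiting index 2 before index 1 (as mawk and original-awk happen to do) makes the two-byte operand look like the second immediate and triggers the error.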
  3. 15 Dec, 2009 3 commits
    • x86: Split swiotlb initialization into two stages · 186a2502
      FUJITA Tomonori authored
      Commit f4780ca0 moved swiotlb initialization before
      dma32_free_bootmem(). It was meant to fix a bug introduced by
      commit 75f1cdf1: we initialized SWIOTLB right after
      dma32_free_bootmem(), so we wrongly stole the memory area
      allocated earlier for GART on machines with a broken BIOS.
      
      However, commit f4780ca0 introduced another problem, which
      likely breaks machines with a huge amount of memory. Such a box
      uses the majority of DMA32_ZONE, so there is no memory left for
      swiotlb.
      
      With this patch, the x86 IOMMU initialization sequence is:
      
      1. We set swiotlb to 1 in the case of (max_pfn > MAX_DMA32_PFN
         && !no_iommu). If swiotlb usage is forced by the boot option,
         we go to the step 3 and finish (we don't try to detect IOMMUs).
      
      2. We call the detection functions of all the IOMMUs. The
         detection function sets x86_init.iommu.iommu_init to the IOMMU
         initialization function (so we can avoid calling the
         initialization functions of all the IOMMUs needlessly).
      
      3. We initialize swiotlb (and set dma_ops to swiotlb_dma_ops) if
         swiotlb is set to 1.
      
      4. If the IOMMU initialization function doesn't need swiotlb
         (e.g. the initialization is successful), it sets swiotlb to zero.
      
      5. If we find that swiotlb is set to zero, we free the swiotlb
         resources.
      Reported-by: Yinghai Lu <yinghai@kernel.org>
      Reported-by: Roland Dreier <rdreier@cisco.com>
      Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      LKML-Reference: <20091215204729A.fujita.tomonori@lab.ntt.co.jp>
      Tested-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
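The five steps above can be traced with a small userspace model. This is a sketch under assumptions, not the kernel's implementation: `struct iommu_env`, `struct iommu_result` and `iommu_init_sketch()` are hypothetical, and hardware detection and initialization are reduced to boolean inputs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical inputs describing the machine and boot options. */
struct iommu_env {
	bool high_memory;	/* max_pfn > MAX_DMA32_PFN */
	bool no_iommu;		/* IOMMU use disabled */
	bool swiotlb_forced;	/* swiotlb forced by boot option */
	bool hw_iommu_detected;	/* a detection function found an IOMMU */
	bool hw_iommu_init_ok;	/* its init function succeeds */
};

struct iommu_result {
	bool swiotlb_in_use;
	bool hw_iommu_in_use;
	bool swiotlb_freed;
};

static struct iommu_result iommu_init_sketch(const struct iommu_env *env)
{
	struct iommu_result res = { false, false, false };
	/* Step 1: decide whether swiotlb may be needed. */
	bool swiotlb = env->high_memory && !env->no_iommu;
	bool detected = false;
	bool allocated = false;

	if (env->swiotlb_forced)
		swiotlb = true;			/* skip detection entirely */
	else
		detected = env->hw_iommu_detected;	/* step 2 */

	/* Step 3: allocate swiotlb early, while low memory is available. */
	if (swiotlb)
		allocated = true;

	/* Step 4: a successful hardware IOMMU init clears swiotlb. */
	if (detected && env->hw_iommu_init_ok) {
		res.hw_iommu_in_use = true;
		swiotlb = false;
	}

	/* Step 5: release the early allocation if it turned out unneeded. */
	if (!swiotlb && allocated)
		res.swiotlb_freed = true;

	res.swiotlb_in_use = swiotlb;
	return res;
}
```

The design point the model captures is that swiotlb memory is reserved before the hardware IOMMU is initialized and handed back afterwards if unused, rather than being allocated late when DMA32 memory may already be exhausted.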
    • x86: Regex support and known-movable symbols for relocs, fix _end · 873b5271
      H. Peter Anvin authored
      This adds a new category of symbols to the relocs program: symbols
      which are known to be relative, even though the linker emits them as
      absolute; this is the case for symbols that live in the linker script,
      which currently applies to _end.
      
      Unfortunately the previous workaround of putting _end in its own empty
      section was defeated by newer binutils, which remove empty sections
      completely.
      
      This patch also changes the symbol matching to use regular expressions
      instead of hardcoded C for specific patterns.
      
      This is a decidedly non-minimal patch: a modified version of the
      relocs program is used as part of the Syslinux build, and this is
      basically a backport to Linux of some of those changes; they have
      thus been well tested.
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      LKML-Reference: <4AF86211.3070103@zytor.com>
      Acked-by: Michal Marek <mmarek@suse.cz>
      Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
    • x86, msr: Remove incorrect, duplicated code in the MSR driver · 494c2ebf
      H. Peter Anvin authored
      The MSR driver would compute the values for cpu and c at declaration,
      and then again in the body of the function.  This isn't merely
      redundant, but unsafe, since cpu might not refer to a valid CPU at
      that point.
      
      Remove the unnecessary and dangerous references in the declarations.
      This code now matches the equivalent code in the CPUID driver.
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  4. 14 Dec, 2009 5 commits
  5. 13 Dec, 2009 1 commit
  6. 12 Dec, 2009 3 commits
    • x86: Limit the number of processor bootup messages · 2eaad1fd
      Mike Travis authored
      When there are a large number of processors in a system, there
      is an excessive amount of messages sent to the system console.
      It's estimated that with 4096 processors in a system, and the
      console baudrate set to 56K, the startup messages will take
      about 84 minutes to clear the serial port.
      
      This set of patches limits the number of repetitious messages
      which contain no additional information.  Much of this information
      is obtainable from /proc and sysfs.  Some of the messages
      are also sent to the kernel log buffer as KERN_DEBUG messages so
      dmesg can be used to examine more closely any details specific to
      a problem.
      
      The new cpu bootup sequence for system_state == SYSTEM_BOOTING:
      
      Booting Node   0, Processors  #1 #2 #3 #4 #5 #6 #7 Ok.
      Booting Node   1, Processors  #8 #9 #10 #11 #12 #13 #14 #15 Ok.
      ...
      Booting Node   3, Processors  #56 #57 #58 #59 #60 #61 #62 #63 Ok.
      Brought up 64 CPUs
      
      After the system is running, a single-line boot message is displayed
      when CPUs are hotplugged on:
      
          Booting Node %d Processor %d APIC 0x%x
      
      Status of the following lines:
      
          CPU: Physical Processor ID:		printed once (for boot cpu)
          CPU: Processor Core ID:		printed once (for boot cpu)
          CPU: Hyper-Threading is disabled	printed once (for boot cpu)
          CPU: Thermal monitoring enabled	printed once (for boot cpu)
          CPU %d/0x%x -> Node %d:		removed
          CPU %d is now offline:		only if system_state == RUNNING
          Initializing CPU#%d:		KERN_DEBUG
      Signed-off-by: Mike Travis <travis@sgi.com>
      LKML-Reference: <4B219E28.8080601@sgi.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
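The 84-minute estimate in this message can be turned around to see how much console output per processor it implies. The sketch below rests on two assumptions that are not in the commit message: "56K" is read as 56000 bits per second, and each character costs 10 bits on the wire (8N1 framing). The function names are hypothetical.

```c
#include <assert.h>

/* Assumption: 8 data bits plus start and stop bit per character. */
#define BITS_PER_CHAR 10.0

/* Seconds needed to push the given number of bytes through the port. */
static double console_seconds(double bytes, double baud)
{
	return bytes * BITS_PER_CHAR / baud;
}

/* Bytes of output per CPU implied by a total drain time in minutes. */
static double implied_bytes_per_cpu(double minutes, double baud, double ncpus)
{
	return minutes * 60.0 * baud / BITS_PER_CHAR / ncpus;
}
```

Under those assumptions, 84 minutes for 4096 processors works out to a little under 7 KB of boot messages per processor, which gives a sense of how much repeated text the patch set removes.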
    • x86: Remove enabling x2apic message for every CPU · 450b1e8d
      Mike Travis authored
      Print only once that the system is supporting x2apic mode.
      Signed-off-by: Mike Travis <travis@sgi.com>
      Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
      LKML-Reference: <4B226E92.5080904@sgi.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
    • x86, msr: Add support for non-contiguous cpumasks · 50542251
      Borislav Petkov authored
      The current rd/wrmsr_on_cpus helpers assume that the supplied
      cpumasks are contiguous. However, there are machines out there
      like some K8 multinode Opterons which have a non-contiguous core
      enumeration on each node (e.g. cores 0,2 on node 0 instead of 0,1), see
      http://www.gossamer-threads.com/lists/linux/kernel/1160268.
      
      This patch fixes out-of-bounds writes (see URL above) by adding per-CPU
      msr structs which are used on the respective cores.
      
      Additionally, two helpers, msrs_{alloc,free}, are provided for use by
      the callers of the MSR accessors.
      
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Mauro Carvalho Chehab <mchehab@redhat.com>
      Cc: Aristeu Rozanski <aris@redhat.com>
      Cc: Randy Dunlap <randy.dunlap@oracle.com>
      Cc: Doug Thompson <dougthompson@xmission.com>
      Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
      LKML-Reference: <20091211171440.GD31998@aftab>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
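Why a contiguity assumption leads to out-of-bounds writes can be shown with a small model. This is a sketch, not the kernel's code: it assumes the result array holds one slot per CPU in the mask and that each CPU writes to slot `cpu - first_cpu`, with the cpumask reduced to a 32-bit word. All function names are hypothetical.

```c
#include <assert.h>

/* Index of the lowest CPU in the mask, or -1 if the mask is empty. */
static int first_cpu(unsigned int mask)
{
	for (int cpu = 0; cpu < 32; cpu++)
		if (mask & (1u << cpu))
			return cpu;
	return -1;
}

/* Number of CPUs in the mask, i.e. the number of slots allocated. */
static int mask_weight(unsigned int mask)
{
	int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

/*
 * Returns 1 if indexing by (cpu - first_cpu) would write past an array
 * sized by mask_weight(), which happens whenever the mask has holes.
 */
static int offset_indexing_overflows(unsigned int mask)
{
	int first = first_cpu(mask);
	int slots = mask_weight(mask);

	for (int cpu = 0; cpu < 32; cpu++)
		if ((mask & (1u << cpu)) && cpu - first >= slots)
			return 1;
	return 0;
}
```

For cores 0 and 2 (mask 0x5) the array has two slots but core 2 would write to slot 2; storage addressed by CPU number, as the per-CPU msr structs in this commit provide, avoids that regardless of holes in the mask.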
  7. 11 Dec, 2009 7 commits
  8. 10 Dec, 2009 13 commits