1. 02 11月, 2008 3 次提交
    • A
      x86: Add a synthetic TSC_RELIABLE feature bit. · eca0cd02
      Alok Kataria 提交于
      Impact: Changes timebase calibration on Vmware.
      
      Use the synthetic TSC_RELIABLE bit to workaround virtualization anomalies.
      
      Virtual TSCs can be kept nearly in sync, but because the virtual TSC
      offset is set by software, it's not perfect.  So, the TSC
      synchronization test can fail. Even then the TSC can be used as a
      clocksource since the VMware platform exports a reliable TSC to the
      guest for timekeeping purposes. Use this bit to check if we need to
      skip the TSC sync checks.
      
      Along with this also set the CONSTANT_TSC bit when on VMware, since we
      still want to use TSC as clocksource on VM running over hardware which
      has unsynchronized TSC's (opteron's), since the hypervisor will take
      care of providing consistent TSC to the guest.
      Signed-off-by: NAlok N Kataria <akataria@vmware.com>
      Signed-off-by: NDan Hecht <dhecht@vmware.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      eca0cd02
    • A
      x86: Hypervisor detection and get tsc_freq from hypervisor · 88b094fb
      Alok Kataria 提交于
      Impact: Changes timebase calibration on Vmware.
      
      v3->v2 : Abstract the hypervisor detection and feature (tsc_freq) request
      	 behind a hypervisor.c file
      v2->v1 : Add a x86_hyper_vendor field to the cpuinfo_x86 structure.
      	 This avoids multiple calls to the hypervisor detection function.
      
      This patch adds function to detect if we are running under VMware.
      The current way to check if we are on VMware is following,
      #  check if "hypervisor present bit" is set, if so read the 0x40000000
         cpuid leaf and check for "VMwareVMware" signature.
      #  if the above fails, check the DMI vendors name for "VMware" string
         if we find one we query the VMware hypervisor port to check if we are
         under VMware.
      
      The DMI + "VMware hypervisor port check" is needed for older VMware products,
      which don't implement the hypervisor signature cpuid leaf.
      Also note that since we are checking for the DMI signature the hypervisor
      port should never be accessed on native hardware.
      
      This patch also adds a hypervisor_get_tsc_freq function, instead of
      calibrating the frequency which can be error prone in virtualized
      environment, we ask the hypervisor for it. We get the frequency from
      the hypervisor by accessing the hypervisor port if we are running on VMware.
      Other hypervisors too can add code to the generic routine to get frequency on
      their platform.
      Signed-off-by: NAlok N Kataria <akataria@vmware.com>
      Signed-off-by: NDan Hecht <dhecht@vmware.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      88b094fb
    • A
      x86: add X86_FEATURE_HYPERVISOR feature bit · 49ab56ac
      Alok Kataria 提交于
      Impact: Number declaration only.
      
      Add X86_FEATURE_HYPERVISOR bit (CPUID level 1, ECX, bit 31).
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      49ab56ac
  2. 01 11月, 2008 1 次提交
  3. 31 10月, 2008 13 次提交
  4. 30 10月, 2008 1 次提交
  5. 29 10月, 2008 5 次提交
    • G
      x86: remove debug code from arch_add_memory() · fe8b868e
      Gary Hade 提交于
      Impact: remove incorrect WARN_ON(1)
      
      Gets rid of dmesg spam created during physical memory hot-add which
      will very likely confuse users.  The change removes what appears to
      be debugging code which I assume was unintentionally included in:
      
        x86: arch/x86/mm/init_64.c printk fixes
        commit 10f22ddeSigned-off-by: NGary Hade <garyhade@us.ibm.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      fe8b868e
    • H
      x86: start annotating early ioremap pointers with __iomem · 1d6cf1fe
      Harvey Harrison 提交于
      Impact: some new sparse warnings in e820.c etc, but no functional change.
      
      As with regular ioremap, iounmap etc, annotate with __iomem.
      
      Fixes the following sparse warnings, will produce some new ones
      elsewhere in arch/x86 that will get worked out over time.
      
      arch/x86/mm/ioremap.c:402:9: warning: cast removes address space of expression
      arch/x86/mm/ioremap.c:406:10: warning: cast adds address space to expression (<asn:2>)
      arch/x86/mm/ioremap.c:782:19: warning: Using plain integer as NULL pointer
      Signed-off-by: NHarvey Harrison <harvey.harrison@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      1d6cf1fe
    • H
      x86: two trivial sparse annotations · 9352f569
      Harvey Harrison 提交于
      Impact: fewer sparse warnings, no functional changes
      
      arch/x86/kernel/vsmp_64.c:87:14: warning: incorrect type in argument 1 (different address spaces)
      arch/x86/kernel/vsmp_64.c:87:14:    expected void const volatile [noderef] <asn:2>*addr
      arch/x86/kernel/vsmp_64.c:87:14:    got void *[assigned] address
      arch/x86/kernel/vsmp_64.c:88:22: warning: incorrect type in argument 1 (different address spaces)
      arch/x86/kernel/vsmp_64.c:88:22:    expected void const volatile [noderef] <asn:2>*addr
      arch/x86/kernel/vsmp_64.c:88:22:    got void *
      arch/x86/kernel/vsmp_64.c:100:23: warning: incorrect type in argument 2 (different address spaces)
      arch/x86/kernel/vsmp_64.c:100:23:    expected void volatile [noderef] <asn:2>*addr
      arch/x86/kernel/vsmp_64.c:100:23:    got void *
      arch/x86/kernel/vsmp_64.c:101:23: warning: incorrect type in argument 1 (different address spaces)
      arch/x86/kernel/vsmp_64.c:101:23:    expected void const volatile [noderef] <asn:2>*addr
      arch/x86/kernel/vsmp_64.c:101:23:    got void *
      arch/x86/mm/gup.c:235:6: warning: incorrect type in argument 1 (different base types)
      arch/x86/mm/gup.c:235:6:    expected void const volatile [noderef] <asn:1>*<noident>
      arch/x86/mm/gup.c:235:6:    got unsigned long [unsigned] [assigned] start
      Signed-off-by: NHarvey Harrison <harvey.harrison@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      9352f569
    • Y
      x86: fix init_memory_mapping for [dc000000 - e0000000) - v2 · f96f57d9
      Yinghai Lu 提交于
      Impact: change over-mapping to precise mapping, fix /proc/meminfo output
      
      v2: fix less than 1G ram system handling
      
      when gart aperture is 0xdc000000 - 0xe0000000
      it return 0xc0000000 - 0xe0000000
      
      that is not right.
      
      this patch fix that will get exact mapping
      
      on 256g sytem with that aperture after patch
      LBSuse:~ # cat /proc/meminfo
      MemTotal:       264742432 kB
      MemFree:        263920628 kB
      Buffers:            1416 kB
      Cached:            24468 kB
      ...
      DirectMap4k:      5760 kB
      DirectMap2M:   3205120 kB
      DirectMap1G:  265289728 kB
      
      it is consistent to
      LBSuse:~ # cat /sys/kernel/debug/kernel_page_tables
      ..
      ---[ Low Kernel Mapping ]---
      0xffff880000000000-0xffff880000200000           2M     RW             GLB x  pte
      0xffff880000200000-0xffff880040000000        1022M     RW         PSE GLB x  pmd
      0xffff880040000000-0xffff8800c0000000           2G     RW         PSE GLB NX pud
      0xffff8800c0000000-0xffff8800d7e00000         382M     RW         PSE GLB NX pmd
      0xffff8800d7e00000-0xffff8800d7fa0000        1664K     RW             GLB NX pte
      0xffff8800d7fa0000-0xffff8800d8000000         384K                           pte
      0xffff8800d8000000-0xffff8800dc000000          64M                           pmd
      0xffff8800dc000000-0xffff8800e0000000          64M     RW         PSE GLB NX pmd
      0xffff8800e0000000-0xffff880100000000         512M                           pmd
      0xffff880100000000-0xffff880800000000          28G     RW         PSE GLB NX pud
      0xffff880800000000-0xffff880824600000         582M     RW         PSE GLB NX pmd
      0xffff880824600000-0xffff8808247f0000        1984K     RW             GLB NX pte
      0xffff8808247f0000-0xffff880824800000          64K     RW     PCD     GLB NX pte
      0xffff880824800000-0xffff880840000000         440M     RW         PSE GLB NX pmd
      0xffff880840000000-0xffff884000000000         223G     RW         PSE GLB NX pud
      0xffff884000000000-0xffff884028000000         640M     RW         PSE GLB NX pmd
      0xffff884028000000-0xffff884040000000         384M                           pmd
      0xffff884040000000-0xffff888000000000         255G                           pud
      0xffff888000000000-0xffffc20000000000       58880G                           pgd
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f96f57d9
    • J
      x86, gart: fix gart detection for Fam11h CPUs · 87c6f401
      Joerg Roedel 提交于
      Impact: fix AMD Family 11h boot hangs / USB device problems
      
      The AMD Fam11h CPUs have a K8 northbridge. This northbridge is different
      from other family's because it lacks GART support (as I just learned).
      
      But the kernel implicitly expects a GART if it finds an AMD northbridge.
      
      Fix this by removing the Fam11h northbridge id from the scan list of K8
      northbridges. This patch also changes the message in the GART driver
      about missing K8 northbridges to tell that the GART is missing which is
      the correct information in this case.
      Reported-by: NJouni Malinen <jkmalinen@gmail.com>
      Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      87c6f401
  6. 28 10月, 2008 10 次提交
  7. 27 10月, 2008 3 次提交
    • S
      ftrace: use a real variable for ftrace_nop in x86 · 8115f3f0
      Steven Rostedt 提交于
      Impact: avoid section mismatch warning, clean up
      
      The dynamic ftrace determines which nop is safe to use at start up.
      When it finds a safe nop for patching, it sets a pointer called ftrace_nop
      to point to the code. All call sites are then patched to this nop.
      
      Later, when tracing is turned on, this ftrace_nop variable is again used
      to compare the location to make sure it is a nop before we update it to
      an mcount call. If this fails just once, a warning is printed and ftrace
      is disabled.
      
      Rakib Mullick noted that the code that sets up the nop is a .init section
      where as the nop itself is in the .text section. This is needed because
      the nop is used later on after boot up. The problem is that the test of the
      nop jumps back to the setup code and causes a "section mismatch" warning.
      
      Rakib first recommended to convert the nop to .init.text, but as stated
      above, this would fail since that text is used later.
      
      The real solution is to extend Rabik's patch, and to make the ftrace_nop
      into an array, and just save the code from the assembly to this array.
      
      Now the section can stay as an init section, and we have a nop to use
      later on.
      Reported-by: NRakib Mullick <rakib.mullick@gmail.com>
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8115f3f0
    • C
      x86/uv: memory allocation at initialization · ef020ab0
      Cliff Wickman 提交于
      Impact: on SGI UV platforms, fix boot crash
      
      UV initialization is currently called too late to call alloc_bootmem_pages().
      The current sequence is:
      
       start_kernel()
         mem_init()
           free_all_bootmem()           <--- discard of bootmem
         rest_init()
           kernel_init()
             smp_prepare_cpus()
             native_smp_prepare_cpus()
               uv_system_init()         <--- uses alloc_bootmem_pages()
      
      It should be calling kmalloc().
      Signed-off-by: NCliff Wickman <cpw@sgi.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ef020ab0
    • C
      xen: fix Xen domU boot with batched mprotect · 9f32d21c
      Chris Lalancette 提交于
      Impact: fix guest kernel boot crash on certain configs
      
      Recent i686 2.6.27 kernels with a certain amount of memory (between
      736 and 855MB) have a problem booting under a hypervisor that supports
      batched mprotect (this includes the RHEL-5 Xen hypervisor as well as
      any 3.3 or later Xen hypervisor).
      
      The problem ends up being that xen_ptep_modify_prot_commit() is using
      virt_to_machine to calculate which pfn to update.  However, this only
      works for pages that are in the p2m list, and the pages coming from
      change_pte_range() in mm/mprotect.c are kmap_atomic pages.  Because of
      this, we can run into the situation where the lookup in the p2m table
      returns an INVALID_MFN, which we then try to pass to the hypervisor,
      which then (correctly) denies the request to a totally bogus pfn.
      
      The right thing to do is to use arbitrary_virt_to_machine, so that we
      can be sure we are modifying the right pfn.  This unfortunately
      introduces a performance penalty because of a full page-table-walk,
      but we can avoid that penalty for pages in the p2m list by checking if
      virt_addr_valid is true, and if so, just doing the lookup in the p2m
      table.
      
      The attached patch implements this, and allows my 2.6.27 i686 based
      guest with 768MB of memory to boot on a RHEL-5 hypervisor again.
      Thanks to Jeremy for the suggestions about how to fix this particular
      issue.
      Signed-off-by: NChris Lalancette <clalance@redhat.com>
      Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: Chris Lalancette <clalance@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      9f32d21c
  8. 25 10月, 2008 1 次提交
  9. 24 10月, 2008 2 次提交
    • F
      x86: restore the old swiotlb alloc_coherent behavior · 03967c52
      FUJITA Tomonori 提交于
      This restores the old swiotlb alloc_coherent behavior (before the
      alloc_coherent rewrite):
      
        http://lkml.org/lkml/2008/8/12/200
      
      The old alloc_coherent avoids GFP_DMA allocation first and if the
      allocated address is not fit for the device's coherent_dma_mask, then
      dma_alloc_coherent does GFP_DMA allocation. If it fails,
      alloc_coherent calls swiotlb_alloc_coherent (in short, we rarely used
      swiotlb_alloc_coherent).
      
      After the alloc_coherent rewrite, dma_alloc_coherent
      (include/asm-x86/dma-mapping.h) directly calls swiotlb_alloc_coherent.
      It means that we possibly can't handle a device having dma_masks >
      24bit < 32bits since swiotlb_alloc_coherent doesn't have the above
      GFP_DMA retry mechanism.
      
      This patch fixes x86's swiotlb alloc_coherent to use the GFP_DMA retry
      mechanism, which dma_generic_alloc_coherent() provides now
      (pci-nommu.c and GART IOMMU driver also use
      dma_generic_alloc_coherent).
      Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      03967c52
    • F
      x86: use GFP_DMA for 24bit coherent_dma_mask · 75bebb7f
      FUJITA Tomonori 提交于
      dma_alloc_coherent (include/asm-x86/dma-mapping.h) avoids GFP_DMA
      allocation first and if the allocated address is not fit for the
      device's coherent_dma_mask, then dma_alloc_coherent does GFP_DMA
      allocation. This is because dma_alloc_coherent avoids precious GFP_DMA
      zone if possible. This is also how the old dma_alloc_coherent
      (arch/x86/kernel/pci-dma.c) works.
      
      However, if the coherent_dma_mask of a device is 24bit, there is no
      point to go into the above GFP_DMA retry mechanism. We had better use
      GFP_DMA in the first place.
      Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Tested-by: NTakashi Iwai <tiwai@suse.de>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      75bebb7f
  10. 23 10月, 2008 1 次提交