1. 24 3月, 2011 1 次提交
    • O
      crash_dump: export is_kdump_kernel to modules, consolidate elfcorehdr_addr,... · 93a72052
      Olaf Hering 提交于
      crash_dump: export is_kdump_kernel to modules, consolidate elfcorehdr_addr, setup_elfcorehdr and saved_max_pfn
      
      The Xen PV drivers in a crashed HVM guest can not connect to the dom0
      backend drivers because both frontend and backend drivers are still in
      connected state.  To run the connection reset function only in case of a
      crashdump, the is_kdump_kernel() function needs to be available for the PV
      driver modules.
      
      Consolidate elfcorehdr_addr, setup_elfcorehdr and saved_max_pfn into
      kernel/crash_dump.c Also export elfcorehdr_addr to make is_kdump_kernel()
      usable for modules.
      
      Leave 'elfcorehdr' as early_param().  This changes powerpc from __setup()
      to early_param().  It adds an address range check from x86 also on ia64
      and powerpc.
      
      [akpm@linux-foundation.org: additional #includes]
      [akpm@linux-foundation.org: remove elfcorehdr_addr export]
      [akpm@linux-foundation.org: fix for Tejun's mm/nobootmem.c changes]
      Signed-off-by: NOlaf Hering <olaf@aepfle.de>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      93a72052
  2. 20 3月, 2011 1 次提交
    • Y
      x86: Cleanup highmap after brk is concluded · e5f15b45
      Yinghai Lu 提交于
      Now cleanup_highmap actually is in two steps: one is early in head64.c
      and only clears above _end; a second one is in init_memory_mapping() and
      tries to clean from _brk_end to _end.
      It should check if those boundaries are PMD_SIZE aligned but currently
      does not.
      Also init_memory_mapping() is called several times for numa or memory
      hotplug, so we really should not handle initial kernel mappings there.
      
      This patch moves cleanup_highmap() down after _brk_end is settled so
      we can do everything in one step.
      Also we honor max_pfn_mapped in the implementation of cleanup_highmap.
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Signed-off-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com>
      LKML-Reference: <alpine.DEB.2.00.1103171739050.3382@kaball-desktop>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      e5f15b45
  3. 04 3月, 2011 1 次提交
    • T
      x86-64, NUMA: Revert NUMA affine page table allocation · f8911250
      Tejun Heo 提交于
      This patch reverts NUMA affine page table allocation added by commit
      1411e0ec (x86-64, numa: Put pgtable to local node memory).
      
      The commit made an undocumented change where the kernel linear mapping
      strictly follows intersection of e820 memory map and NUMA
      configuration.  If the physical memory configuration has holes or NUMA
      nodes are not properly aligned, this leads to using unnecessarily
      smaller mapping size which leads to increased TLB pressure.  For
      details,
      
        http://thread.gmane.org/gmane.linux.kernel/1104672
      
      Patches to fix the problem have been proposed but the underlying code
      needs more cleanup and the approach itself seems a bit heavy handed
      and it has been determined to revert the feature for now and come back
      to it in the next developement cycle.
      
        http://thread.gmane.org/gmane.linux.kernel/1105959
      
      As init_memory_mapping_high() callsites have been consolidated since
      the commit, reverting is done manually.  Also, the RED-PEN comment in
      arch/x86/mm/init.c is not restored as the problem no longer exists
      with memblock based top-down early memory allocation.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      f8911250
  4. 25 2月, 2011 1 次提交
    • T
      x86: dt: Cleanup local apic setup · a906fdaa
      Thomas Gleixner 提交于
      Up to now we force enable the local apic in the devicetree setup
      uncoditionally and set smp_found_config unconditionally to 1 when a
      devicetree blob is available. This breaks, when local apic is disabled
      in the Kconfig.
      
      Make it consistent by initializing device tree explicitely before
      smp_get_config() so a non lapic configuration could be used as well.
      To be functional that would require to implement PIT as an interrupt
      host, but the only user of this code until now is ce4100 which
      requires apics to be available. So we leave this up to those who need
      it.
      Tested-by: NSebastian Siewior <bigeasy@linutronix.de>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      a906fdaa
  5. 24 2月, 2011 2 次提交
  6. 18 2月, 2011 2 次提交
    • H
      x86, trampoline: Use the unified trampoline setup for ACPI wakeup · d1ee4335
      H. Peter Anvin 提交于
      Use the unified trampoline allocation setup to allocate and install
      the ACPI wakeup code in low memory.
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      LKML-Reference: <4D5DFBE4.7090104@intel.com>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: Matthieu Castet <castet.matthieu@free.fr>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      d1ee4335
    • H
      x86, trampoline: Common infrastructure for low memory trampolines · 4822b7fc
      H. Peter Anvin 提交于
      Common infrastructure for low memory trampolines.  This code installs
      the trampolines permanently in low memory very early.  It also permits
      multiple pieces of code to be used for this purpose.
      
      This code also introduces a standard infrastructure for computing
      symbol addresses in the trampoline code.
      
      The only change to the actual SMP trampolines themselves is that the
      64-bit trampoline has been made reusable -- the previous version would
      overwrite the code with a status variable; this moves the status
      variable to a separate location.
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      LKML-Reference: <4D5DFBE4.7090104@intel.com>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: Matthieu Castet <castet.matthieu@free.fr>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      4822b7fc
  7. 16 2月, 2011 4 次提交
    • T
      x86, NUMA: Move *_numa_init() invocations into initmem_init() · d8fc3afc
      Tejun Heo 提交于
      There's no reason for these to live in setup_arch().  Move them inside
      initmem_init().
      
      - v2: x86-32 initmem_init() weren't updated breaking 32bit builds.
        Fixed.  Found by Ankita.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Ankita Garg <ankita@in.ibm.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Shaohui Zheng <shaohui.zheng@intel.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: H. Peter Anvin <hpa@linux.intel.com>
      d8fc3afc
    • T
      x86-64, NUMA: Wrap acpi_numa_init() so that failure can be indicated by return value · a9aec56a
      Tejun Heo 提交于
      Because of the way ACPI tables are parsed, the generic
      acpi_numa_init() couldn't return failure when error was detected by
      arch hooks.  Instead, the failure state was recorded and later arch
      dependent init hook - acpi_scan_nodes() - would fail.
      
      Wrap acpi_numa_init() with x86_acpi_numa_init() so that failure can be
      indicated as return value immediately.  This is in preparation for
      further NUMA init cleanups.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Shaohui Zheng <shaohui.zheng@intel.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: H. Peter Anvin <hpa@linux.intel.com>
      a9aec56a
    • T
      x86-64, NUMA: Unify {acpi|amd}_{numa_init|scan_nodes}() arguments and return values · 940fed2e
      Tejun Heo 提交于
      The functions used during NUMA initialization - *_numa_init() and
      *_scan_nodes() - have different arguments and return values.  Unify
      them such that they all take no argument and return 0 on success and
      -errno on failure.  This is in preparation for further NUMA init
      cleanups.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Shaohui Zheng <shaohui.zheng@intel.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: H. Peter Anvin <hpa@linux.intel.com>
      940fed2e
    • T
      x86, NUMA: Drop @start/last_pfn from initmem_init() · 86ef4dbf
      Tejun Heo 提交于
      initmem_init() extensively accesses and modifies global data
      structures and the parameters aren't even followed depending on which
      path is being used.  Drop @start/last_pfn and let it deal with
      @max_pfn directly.  This is in preparation for further NUMA init
      cleanups.
      
      - v2: x86-32 initmem_init() weren't updated breaking 32bit builds.
        Fixed.  Found by Yinghai.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Shaohui Zheng <shaohui.zheng@intel.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: H. Peter Anvin <hpa@linux.intel.com>
      86ef4dbf
  8. 15 2月, 2011 1 次提交
  9. 28 1月, 2011 1 次提交
    • T
      x86: Unify NUMA initialization between 32 and 64bit · 8db78cc4
      Tejun Heo 提交于
      Now that everything else is unified, NUMA initialization can be
      unified too.
      
      * numa_init_array() and init_cpu_to_node() are moved from
        numa_64 to numa.
      
      * numa_32::initmem_init() is updated to call numa_init_array()
        and setup_arch() to call init_cpu_to_node() on 32bit too.
      
      * x86_cpu_to_node_map is now initialized to NUMA_NO_NODE on
        32bit too. This is safe now as numa_init_array() will initialize
        it early during boot.
      
      This makes NUMA mapping fully initialized before
      setup_per_cpu_areas() on 32bit too and thus makes the first
      percpu chunk which contains all the static variables and some of
      dynamic area allocated with NUMA affinity correctly considered.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: yinghai@kernel.org
      Cc: brgerst@gmail.com
      Cc: gorcunov@gmail.com
      Cc: shaohui.zheng@intel.com
      Cc: rientjes@google.com
      LKML-Reference: <1295789862-25482-17-git-send-email-tj@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Reported-by: NEric Dumazet <eric.dumazet@gmail.com>
      Reviewed-by: NPekka Enberg <penberg@kernel.org>
      8db78cc4
  10. 05 1月, 2011 1 次提交
  11. 30 12月, 2010 2 次提交
    • Y
      x86-64, numa: Put pgtable to local node memory · 1411e0ec
      Yinghai Lu 提交于
      Introduce init_memory_mapping_high(), and use it with 64bit.
      
      It will go with every memory segment above 4g to create page table to the
      memory range itself.
      
      before this patch all page tables was on one node.
      
      with this patch, one RED-PEN is killed
      
      debug out for 8 sockets system after patch
      [    0.000000] initial memory mapped : 0 - 20000000
      [    0.000000] init_memory_mapping: [0x00000000000000-0x0000007f74ffff]
      [    0.000000]  0000000000 - 007f600000 page 2M
      [    0.000000]  007f600000 - 007f750000 page 4k
      [    0.000000] kernel direct mapping tables up to 7f750000 @ [0x7f74c000-0x7f74ffff]
      [    0.000000] RAMDISK: 7bc84000 - 7f745000
      ....
      [    0.000000] Adding active range (0, 0x10, 0x95) 0 entries of 3200 used
      [    0.000000] Adding active range (0, 0x100, 0x7f750) 1 entries of 3200 used
      [    0.000000] Adding active range (0, 0x100000, 0x1080000) 2 entries of 3200 used
      [    0.000000] Adding active range (1, 0x1080000, 0x2080000) 3 entries of 3200 used
      [    0.000000] Adding active range (2, 0x2080000, 0x3080000) 4 entries of 3200 used
      [    0.000000] Adding active range (3, 0x3080000, 0x4080000) 5 entries of 3200 used
      [    0.000000] Adding active range (4, 0x4080000, 0x5080000) 6 entries of 3200 used
      [    0.000000] Adding active range (5, 0x5080000, 0x6080000) 7 entries of 3200 used
      [    0.000000] Adding active range (6, 0x6080000, 0x7080000) 8 entries of 3200 used
      [    0.000000] Adding active range (7, 0x7080000, 0x8080000) 9 entries of 3200 used
      [    0.000000] init_memory_mapping: [0x00000100000000-0x0000107fffffff]
      [    0.000000]  0100000000 - 1080000000 page 2M
      [    0.000000] kernel direct mapping tables up to 1080000000 @ [0x107ffbd000-0x107fffffff]
      [    0.000000]     memblock_x86_reserve_range: [0x107ffc2000-0x107fffffff]          PGTABLE
      [    0.000000] init_memory_mapping: [0x00001080000000-0x0000207fffffff]
      [    0.000000]  1080000000 - 2080000000 page 2M
      [    0.000000] kernel direct mapping tables up to 2080000000 @ [0x207ff7d000-0x207fffffff]
      [    0.000000]     memblock_x86_reserve_range: [0x207ffc0000-0x207fffffff]          PGTABLE
      [    0.000000] init_memory_mapping: [0x00002080000000-0x0000307fffffff]
      [    0.000000]  2080000000 - 3080000000 page 2M
      [    0.000000] kernel direct mapping tables up to 3080000000 @ [0x307ff3d000-0x307fffffff]
      [    0.000000]     memblock_x86_reserve_range: [0x307ffc0000-0x307fffffff]          PGTABLE
      [    0.000000] init_memory_mapping: [0x00003080000000-0x0000407fffffff]
      [    0.000000]  3080000000 - 4080000000 page 2M
      [    0.000000] kernel direct mapping tables up to 4080000000 @ [0x407fefd000-0x407fffffff]
      [    0.000000]     memblock_x86_reserve_range: [0x407ffc0000-0x407fffffff]          PGTABLE
      [    0.000000] init_memory_mapping: [0x00004080000000-0x0000507fffffff]
      [    0.000000]  4080000000 - 5080000000 page 2M
      [    0.000000] kernel direct mapping tables up to 5080000000 @ [0x507febd000-0x507fffffff]
      [    0.000000]     memblock_x86_reserve_range: [0x507ffc0000-0x507fffffff]          PGTABLE
      [    0.000000] init_memory_mapping: [0x00005080000000-0x0000607fffffff]
      [    0.000000]  5080000000 - 6080000000 page 2M
      [    0.000000] kernel direct mapping tables up to 6080000000 @ [0x607fe7d000-0x607fffffff]
      [    0.000000]     memblock_x86_reserve_range: [0x607ffc0000-0x607fffffff]          PGTABLE
      [    0.000000] init_memory_mapping: [0x00006080000000-0x0000707fffffff]
      [    0.000000]  6080000000 - 7080000000 page 2M
      [    0.000000] kernel direct mapping tables up to 7080000000 @ [0x707fe3d000-0x707fffffff]
      [    0.000000]     memblock_x86_reserve_range: [0x707ffc0000-0x707fffffff]          PGTABLE
      [    0.000000] init_memory_mapping: [0x00007080000000-0x0000807fffffff]
      [    0.000000]  7080000000 - 8080000000 page 2M
      [    0.000000] kernel direct mapping tables up to 8080000000 @ [0x807fdfc000-0x807fffffff]
      [    0.000000]     memblock_x86_reserve_range: [0x807ffbf000-0x807fffffff]          PGTABLE
      [    0.000000] Initmem setup node 0 [0000000000000000-000000107fffffff]
      [    0.000000]   NODE_DATA [0x0000107ffbd000-0x0000107ffc1fff]
      [    0.000000] Initmem setup node 1 [0000001080000000-000000207fffffff]
      [    0.000000]   NODE_DATA [0x0000207ffbb000-0x0000207ffbffff]
      [    0.000000] Initmem setup node 2 [0000002080000000-000000307fffffff]
      [    0.000000]   NODE_DATA [0x0000307ffbb000-0x0000307ffbffff]
      [    0.000000] Initmem setup node 3 [0000003080000000-000000407fffffff]
      [    0.000000]   NODE_DATA [0x0000407ffbb000-0x0000407ffbffff]
      [    0.000000] Initmem setup node 4 [0000004080000000-000000507fffffff]
      [    0.000000]   NODE_DATA [0x0000507ffbb000-0x0000507ffbffff]
      [    0.000000] Initmem setup node 5 [0000005080000000-000000607fffffff]
      [    0.000000]   NODE_DATA [0x0000607ffbb000-0x0000607ffbffff]
      [    0.000000] Initmem setup node 6 [0000006080000000-000000707fffffff]
      [    0.000000]   NODE_DATA [0x0000707ffbb000-0x0000707ffbffff]
      [    0.000000] Initmem setup node 7 [0000007080000000-000000807fffffff]
      [    0.000000]   NODE_DATA [0x0000807ffba000-0x0000807ffbefff]
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      LKML-Reference: <4D1933D1.9020609@kernel.org>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      1411e0ec
    • Y
      x86: Change get_max_mapped() to inline · 45635ab5
      Yinghai Lu 提交于
      Move it into head file. to prepare use it in other files.
      
      [ hpa: added missing <linux/types.h> and changed type to phys_addr_t. ]
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      LKML-Reference: <4D1933BA.8000508@kernel.org>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      45635ab5
  12. 18 12月, 2010 2 次提交
  13. 18 11月, 2010 1 次提交
  14. 27 10月, 2010 2 次提交
  15. 23 10月, 2010 1 次提交
  16. 21 10月, 2010 1 次提交
  17. 14 10月, 2010 1 次提交
  18. 06 10月, 2010 1 次提交
    • Y
      x86, memblock: Fix crashkernel allocation · 9f4c1396
      Yinghai Lu 提交于
      Cai Qian found crashkernel is broken with the x86 memblock changes.
      
      1. crashkernel=128M@32M always reported that range is used, even if
         the first kernel is small and does not usethat range
      
      2. we always got following report when using "kexec -p"
      	Could not find a free area of memory of a000 bytes...
      	locate_hole failed
      
      The root cause is that generic memblock_find_in_range() will try to
      allocate from the top of the range, whereas the kexec code was written
      assuming that allocation was always near the bottom and that it could
      blindly extend memory upward.  Unfortunately the kexec code doesn't
      have a system for requesting the range that it really needs, so this
      is subject to probabilistic failures.
      
      This patch hacks around the problem by limiting the target range
      heuristically to below the traditional bzImage max range.  This number
      is arbitrary and not always correct, and a much better result would be
      obtained by having kexec communicate this number based on the kernel
      header information and any appropriate command line options.
      Reported-and-Bisected-by: NCAI Qian <caiqian@redhat.com>
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      LKML-Reference: <4CABAF2A.5090501@kernel.org>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      9f4c1396
  19. 21 9月, 2010 2 次提交
  20. 28 8月, 2010 4 次提交
    • Y
      x86: Remove old bootmem code · 774ea0bc
      Yinghai Lu 提交于
      Requested by Ingo, Thomas and HPA.
      
      The old bootmem code is no longer necessary, and the transition is
      complete.  Remove it.
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      774ea0bc
    • Y
      x86, memblock: Use memblock_memory_size()/memblock_free_memory_size() to get correct dma_reserve · 6f2a7536
      Yinghai Lu 提交于
      memblock_memory_size() will return memory size in memblock.memory.region.
      memblock_free_memory_size() will return free memory size in memblock.memory.region.
      
      So We can get exact reseved size in specified range.
      
      Set the size right after initmem_init(), because later bootmem API will
      get area above 16M. (except some fallback).
      
      Later after we remove the bootmem, We could call that just before paging_init().
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      6f2a7536
    • Y
      x86, memblock: Replace e820_/_early string with memblock_ · a9ce6bc1
      Yinghai Lu 提交于
      1.include linux/memblock.h directly. so later could reduce e820.h reference.
      2 this patch is done by sed scripts mainly
      
      -v2: use MEMBLOCK_ERROR instead of -1ULL or -1UL
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      a9ce6bc1
    • Y
      x86: Use memblock to replace early_res · 72d7c3b3
      Yinghai Lu 提交于
      1. replace find_e820_area with memblock_find_in_range
      2. replace reserve_early with memblock_x86_reserve_range
      3. replace free_early with memblock_x86_free_range.
      4. NO_BOOTMEM will switch to use memblock too.
      5. use _e820, _early wrap in the patch, in following patch, will
         replace them all
      6. because memblock_x86_free_range support partial free, we can remove some special care
      7. Need to make sure that memblock_find_in_range() is called after memblock_x86_fill()
         so adjust some calling later in setup.c::setup_arch()
         -- corruption_check and mptable_update
      
      -v2: Move reserve_brk() early
          Before fill_memblock_area, to avoid overlap between brk and memblock_find_in_range()
          that could happen We have more then 128 RAM entry in E820 tables, and
          memblock_x86_fill() could use memblock_find_in_range() to find a new place for
          memblock.memory.region array.
          and We don't need to use extend_brk() after fill_memblock_area()
          So move reserve_brk() early before fill_memblock_area().
      -v3: Move find_smp_config early
          To make sure memblock_find_in_range not find wrong place, if BIOS doesn't put mptable
          in right place.
      -v4: Treat RESERVED_KERN as RAM in memblock.memory. and they are already in
          memblock.reserved already..
          use __NOT_KEEP_MEMBLOCK to make sure memblock related code could be freed later.
      -v5: Generic version __memblock_find_in_range() is going from high to low, and for 32bit
          active_region for 32bit does include high pages
          need to replace the limit with memblock.default_alloc_limit, aka get_max_mapped()
      -v6: Use current_limit instead
      -v7: check with MEMBLOCK_ERROR instead of -1ULL or -1L
      -v8: Set memblock_can_resize early to handle EFI with more RAM entries
      -v9: update after kmemleak changes in mainline
      Suggested-by: NDavid S. Miller <davem@davemloft.net>
      Suggested-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Suggested-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      72d7c3b3
  21. 26 8月, 2010 1 次提交
  22. 25 8月, 2010 1 次提交
  23. 24 8月, 2010 1 次提交
    • A
      x86, vmware: Remove deprecated VMI kernel support · 9863c90f
      Alok Kataria 提交于
      With the recent innovations in CPU hardware acceleration technologies
      from Intel and AMD, VMware ran a few experiments to compare these
      techniques to guest paravirtualization technique on VMware's platform.
      These hardware assisted virtualization techniques have outperformed the
      performance benefits provided by VMI in most of the workloads. VMware
      expects that these hardware features will be ubiquitous in a couple of
      years, as a result, VMware has started a phased retirement of this
      feature from the hypervisor.
      
      Please note that VMI has always been an optimization and non-VMI kernels
      still work fine on VMware's platform.
      Latest versions of VMware's product which support VMI are,
      Workstation 7.0 and VSphere 4.0 on ESX side, future maintainence
      releases for these products will continue supporting VMI.
      
      For more details about VMI retirement take a look at this,
      http://blogs.vmware.com/guestosguide/2009/09/vmi-retirement.html
      
      This feature removal was scheduled for 2.6.37 back in September 2009.
      Signed-off-by: NAlok N Kataria <akataria@vmware.com>
      LKML-Reference: <1282600151.19396.22.camel@ank32.eng.vmware.com>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      9863c90f
  24. 19 8月, 2010 1 次提交
    • J
      x86-32: Separate 1:1 pagetables from swapper_pg_dir · fd89a137
      Joerg Roedel 提交于
      This patch fixes machine crashes which occur when heavily exercising the
      CPU hotplug codepaths on a 32-bit kernel. These crashes are caused by
      AMD Erratum 383 and result in a fatal machine check exception. Here's
      the scenario:
      
      1. On 32-bit, the swapper_pg_dir page table is used as the initial page
      table for booting a secondary CPU.
      
      2. To make this work, swapper_pg_dir needs a direct mapping of physical
      memory in it (the low mappings). By adding those low, large page (2M)
      mappings (PAE kernel), we create the necessary conditions for Erratum
      383 to occur.
      
      3. Other CPUs which do not participate in the off- and onlining game may
      use swapper_pg_dir while the low mappings are present (when leave_mm is
      called). For all steps below, the CPU referred to is a CPU that is using
      swapper_pg_dir, and not the CPU which is being onlined.
      
      4. The presence of the low mappings in swapper_pg_dir can result
      in TLB entries for addresses below __PAGE_OFFSET to be established
      speculatively. These TLB entries are marked global and large.
      
      5. When the CPU with such TLB entry switches to another page table, this
      TLB entry remains because it is global.
      
      6. The process then generates an access to an address covered by the
      above TLB entry but there is a permission mismatch - the TLB entry
      covers a large global page not accessible to userspace.
      
      7. Due to this permission mismatch a new 4kb, user TLB entry gets
      established. Further, Erratum 383 provides for a small window of time
      where both TLB entries are present. This results in an uncorrectable
      machine check exception signalling a TLB multimatch which panics the
      machine.
      
      There are two ways to fix this issue:
      
              1. Always do a global TLB flush when a new cr3 is loaded and the
              old page table was swapper_pg_dir. I consider this a hack hard
              to understand and with performance implications
      
              2. Do not use swapper_pg_dir to boot secondary CPUs like 64-bit
              does.
      
      This patch implements solution 2. It introduces a trampoline_pg_dir
      which has the same layout as swapper_pg_dir with low_mappings. This page
      table is used as the initial page table of the booting CPU. Later in the
      bringup process, it switches to swapper_pg_dir and does a global TLB
      flush. This fixes the crashes in our test cases.
      
      -v2: switch to swapper_pg_dir right after entering start_secondary() so
      that we are able to access percpu data which might not be mapped in the
      trampoline page table.
      Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com>
      LKML-Reference: <20100816123833.GB28147@aftab>
      Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      fd89a137
  25. 13 8月, 2010 1 次提交
  26. 19 6月, 2010 1 次提交
    • A
      x86, olpc: Add support for calling into OpenFirmware · fd699c76
      Andres Salomon 提交于
      Add support for saving OFW's cif, and later calling into it to run OFW
      commands.  OFW remains resident in memory, living within virtual range
      0xff800000 - 0xffc00000.  A single page directory entry points to the
      pgdir that OFW actually uses, so rather than saving the entire page
      table, we grab and install that one entry permanently in the kernel's
      page table.
      
      This is currently only used by the OLPC XO.  Note that this particular
      calling convention breaks PAE and PAT, and so cannot be used on newer
      x86 hardware.
      Signed-off-by: NAndres Salomon <dilinger@queued.net>
      LKML-Reference: <20100618174653.7755a39a@dev.queued.net>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      fd699c76
  27. 25 5月, 2010 1 次提交
  28. 21 5月, 2010 1 次提交