1. 08 Jul, 2008 4 commits
    • x86: fix CPA self-test for "x86/paravirt: groundwork for 64-bit Xen support" · cd5dce2f
      Jeremy Fitzhardinge committed
      Ingo Molnar wrote:
      > -tip auto-testing found pagetable corruption (CPA self-test failure):
      >
      > [   32.956015] CPA self-test:
      > [   32.958822]  4k 2048 large 508 gb 0 x 2556[ffff880000000000-ffff88003fe00000] miss 0
      > [   32.964000] CPA ffff88001d54e000: bad pte 1d4000e3
      > [   32.968000] CPA ffff88001d54e000: unexpected level 2
      > [   32.972000] CPA ffff880022c5d000: bad pte 22c000e3
      > [   32.976000] CPA ffff880022c5d000: unexpected level 2
      > [   32.980000] CPA ffff8800200ce000: bad pte 200000e3
      > [   32.984000] CPA ffff8800200ce000: unexpected level 2
      > [   32.988000] CPA ffff8800210f0000: bad pte 210000e3
      >
      > config and full log can be found at:
      >
      >  http://redhat.com/~mingo/misc/config-Mon_Jun_30_11_11_51_CEST_2008.bad
      >  http://redhat.com/~mingo/misc/log-Mon_Jun_30_11_11_51_CEST_2008.bad
      
      Phew.  OK, I've worked this out.  Short version is that it's a false
      alarm, and there was no real failure here.  Long version:
      
          * I changed the code to create the physical mapping pagetables to
            reuse any existing mapping rather than replace it.   Specifically,
            reusing a pud pointed to by the pgd caused this symptom to appear.
          * The specific PUD being reused is the one created statically in
            head_64.S, which creates an initial 1GB mapping.
          * That mapping doesn't have _PAGE_GLOBAL set on it, due to the
            inconsistency between __PAGE_* and PAGE_*.
          * The CPA test attempts to clear _PAGE_GLOBAL, and then checks to
            see that the resulting range is 1) shattered into 4k pages, and 2)
            has no _PAGE_GLOBAL.
          * However, since it didn't have _PAGE_GLOBAL on that range to start
            with, change_page_attr_clear() had nothing to do, and didn't
            bother shattering the range,
          * resulting in the reported messages
      
      The simple fix is to set _PAGE_GLOBAL in level2_ident_pgt.
      
      An additional fix would be to make CPA testing more robust by using some other
      pagetable bit (one of the unused available-to-software ones).  This
      would solve spurious CPA test warnings under Xen which uses _PAGE_GLOBAL
      for its own purposes (ie, not under guest control).
      
      Also, we should revisit the use of _PAGE_GLOBAL in asm-x86/pgtable.h,
      and use it consistently, and drop MAKE_GLOBAL.  The first time I
      proposed it it caused breakages in the very early CPA code; with luck
      that's all fixed now.
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Mark McLoughlin <markmc@redhat.com>
      Cc: xen-devel <xen-devel@lists.xensource.com>
      Cc: Eduardo Habkost <ehabkost@redhat.com>
      Cc: Vegard Nossum <vegard.nossum@gmail.com>
      Cc: Stephen Tweedie <sct@redhat.com>
      Cc: Yinghai Lu <yhlu.kernel@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
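
The "bad pte" values in the log quoted above can be decoded by hand, which may make the false alarm easier to see. The sketch below assumes the usual x86 page-table bit positions (present = bit 0, PSE = bit 7, global = bit 8). All `toy_` and `TOY_` names are hypothetical illustrations, not kernel APIs, and `toy_attr_clear()` is only a simplified model of the "nothing to do" case described in the message, not the real change_page_attr_clear().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed x86 page-table entry bits (illustrative names, not kernel macros). */
#define TOY_PAGE_PRESENT (1ULL << 0)
#define TOY_PAGE_PSE     (1ULL << 7)   /* entry maps a large page */
#define TOY_PAGE_GLOBAL  (1ULL << 8)

/* True if every bit in mask is set in the entry. */
bool toy_pte_has(uint64_t pte, uint64_t mask)
{
    return (pte & mask) == mask;
}

/*
 * Toy model of an attribute-clear request. If none of the requested
 * bits are set, there is nothing to change and the entry is left
 * untouched (so a large mapping would not be split). Returns true only
 * when the entry was actually modified.
 */
bool toy_attr_clear(uint64_t *pte, uint64_t mask)
{
    assert(pte);
    if ((*pte & mask) == 0)
        return false;
    *pte &= ~mask;
    return true;
}
```

Feeding in 0x1d4000e3 from the log shows a present large-page entry with the global bit already clear, so a request to clear that bit leaves the entry as it was.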
    • paravirt/x86, 64-bit: move __PAGE_OFFSET to leave a space for hypervisor · a6523748
      Eduardo Habkost committed
      Set __PAGE_OFFSET to the most negative possible address +
      16*PGDIR_SIZE.  The gap is to allow a space for a hypervisor to fit.
      The gap is more or less arbitrary, but it's what Xen needs.
      
      When booting native, kernel/head_64.S has a set of compile-time
      generated pagetables used at boot time.  This patch removes their
      absolutely hard-coded layout, and makes it parameterised on
      __PAGE_OFFSET (and __START_KERNEL_map).
      Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: xen-devel <xen-devel@lists.xensource.com>
      Cc: Stephen Tweedie <sct@redhat.com>
      Cc: Eduardo Habkost <ehabkost@redhat.com>
      Cc: Mark McLoughlin <markmc@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
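
The arithmetic behind "most negative possible address + 16*PGDIR_SIZE" can be worked through as follows. This is only a sketch: it assumes 48-bit virtual addresses with 4-level paging, so that one pgd entry covers 2^39 bytes and the lowest kernel-half address is 0xffff800000000000. The `toy_` and `TOY_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PGDIR_SHIFT   39
#define TOY_PGDIR_SIZE    (1ULL << TOY_PGDIR_SHIFT)  /* 512 GB per pgd entry (assumed) */
#define TOY_MOST_NEGATIVE 0xffff800000000000ULL      /* lowest kernel-half address (assumed) */

/* Start of the direct mapping after leaving gap_entries pgd slots free. */
uint64_t toy_page_offset(unsigned int gap_entries)
{
    /* The kernel half holds 256 pgd entries; stay inside it. */
    assert(gap_entries < 256);
    return TOY_MOST_NEGATIVE + (uint64_t)gap_entries * TOY_PGDIR_SIZE;
}
```

Under these assumptions a gap of 16 entries gives 0xffff880000000000, the same address that starts the range printed in the CPA self-test output quoted in the 08 Jul entry above.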
    • x86: move x86_64 gdt closer to i386 · a939098a
      Glauber Costa committed
      i386 and x86_64 used two different schemes for maintaining the gdt.
      With this patch, x86_64 initial gdt table is defined in a .c file,
      same way as i386 is now. Also, we call it "gdt_page", and the descriptor,
      "early_gdt_descr". This way we achieve common naming, which can allow for
      more code integration.
      Signed-off-by: Glauber Costa <gcosta@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: use stack_start in x86_64 · 9cf4f298
      Glauber Costa committed
      call x86_64's init_rsp stack_start, just as i386 does.
      Put a zeroed stack segment for consistency. With this,
      we can eliminate one ugly ifdef in smpboot.c.
      Signed-off-by: Glauber Costa <gcosta@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  2. 05 Jul, 2008 1 commit
  3. 25 May, 2008 3 commits
  4. 17 Apr, 2008 5 commits
  5. 26 Feb, 2008 2 commits
    • x86: rename KERNEL_TEXT_SIZE => KERNEL_IMAGE_SIZE · d4afe414
      Ingo Molnar committed
      The KERNEL_TEXT_SIZE constant was mis-named, as we not only map the kernel
      text but data, bss and init sections as well.
      
      That name led me on the wrong path with the KERNEL_TEXT_SIZE regression,
      because i knew how big of _text_ my images have and i knew about the 40 MB
      "text" limit so i wrongly thought to be on the safe side of the 40 MB limit
      with my 29 MB of text, while the total image size was slightly above 40 MB.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: fix spontaneous reboot with allyesconfig bzImage · 88f3aec7
      Ingo Molnar committed
      recently the 64-bit allyesconfig bzImage kernel started spontaneously
      rebooting during early bootup.
      
      after a few fun hours spent with early init debugging, it turns out
      that we've got this rather annoying limit on the size of the kernel
      image:
      
            #define KERNEL_TEXT_SIZE  (40*1024*1024)
      
      which limit my vmlinux just happened to pass:
      
             text      data       bss        dec       hex   filename
         29703744   4222751   8646224   42572719   2899baf   vmlinux
      
      40 MB is 41943040 bytes, so my vmlinux was just 1.5% above this limit :-/
      
      So it happily crashed right in head_64.S, which - as we all know - is
      the most debuggable code in the whole architecture ;-)
      
      So increase the limit to allow an up to 128MB kernel image to be mapped.
      (should anyone be that crazy or lazy)
      
      We have a full 4K of pagetable (level2_kernel_pgt) allocated for these
      mappings already, so there's no RAM overhead and the limit was rather
      pointless and arbitrary.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
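
The numbers in this message can be checked with a few lines of arithmetic. The sketch below is illustrative only: the `toy_` helpers are hypothetical, and the 2 MB-per-pmd-entry and 512-entries-per-page figures are assumptions about how level2_kernel_pgt is laid out rather than something the message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_OLD_LIMIT    (40ULL * 1024 * 1024)   /* KERNEL_TEXT_SIZE before the fix */
#define TOY_NEW_LIMIT    (128ULL * 1024 * 1024)  /* limit after the fix */
#define TOY_PMD_SIZE     (2ULL * 1024 * 1024)    /* assumed: each pmd entry maps 2 MB */
#define TOY_PTRS_PER_PMD 512                     /* assumed: 8-byte entries in a 4K page */

/* Mapped image size as reported by size(1): text + data + bss. */
uint64_t toy_image_size(uint64_t text, uint64_t data, uint64_t bss)
{
    return text + data + bss;
}

/* True if the whole image fits under the given mapping limit. */
bool toy_fits(uint64_t image, uint64_t limit)
{
    return image <= limit;
}

/* Number of 2 MB pmd entries needed to map `limit` bytes. */
uint64_t toy_pmds_needed(uint64_t limit)
{
    assert(limit % TOY_PMD_SIZE == 0);
    return limit / TOY_PMD_SIZE;
}
```

With the sizes from the table, the image is 42572719 bytes, which is 629679 bytes (about 1.5%) over the old 41943040-byte limit. Under the stated assumptions, mapping 128 MB takes 64 of the 512 entries in a single pagetable page, which is consistent with the remark that raising the limit costs no extra RAM.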
  6. 19 Feb, 2008 2 commits
    • x86: fix section mismatch in head_64.S:initial_code · da5968ae
      Sam Ravnborg committed
      initial_code is initially used to hold a function pointer
      from __init and later from __cpuinit. This confuses modpost,
      and changing initial_code to REFDATA silences the warning.
      (But now we do not discard the variable anymore).
      Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86: zap invalid and unused pmds in early boot · 31eedd82
      Thomas Gleixner committed
      The early boot code maps KERNEL_TEXT_SIZE (currently 40MB) starting
      from __START_KERNEL_map. The kernel itself only needs _text to _end
      mapped in the high alias. On relocatable kernels the ASM setup code
      adjusts the compile time created high mappings to the relocation. This
      creates invalid pmd entries for negative offsets:
      
      0xffffffff80000000 -> pmd entry: ffffffffff2001e3
      It points outside of the physical address space and is marked present.
      
      This starts at the virtual address __START_KERNEL_map and goes up to
      the point where the first valid physical address (0x0) is mapped.
      
      Zap the mappings before _text and after _end right away in early
      boot. This removes also the invalid entries.
      
      Furthermore it simplifies the range check for high aliases.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: H. Peter Anvin <hpa@zytor.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
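
The "zap everything before _text and after _end" step can be pictured with a toy model of the high-alias pmd page. This is a sketch under assumptions, not the kernel's implementation: each slot is taken to map 2 MB starting at __START_KERNEL_map (0xffffffff80000000, as shown in the message), `end` is treated as an exclusive bound, and the `toy_` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_START_KERNEL_MAP 0xffffffff80000000ULL
#define TOY_PMD_SHIFT        21   /* assumed: one slot maps 2 MB */

/* Index of the pmd slot that covers vaddr in the high kernel mapping. */
size_t toy_pmd_slot(uint64_t vaddr)
{
    assert(vaddr >= TOY_START_KERNEL_MAP);
    return (size_t)((vaddr - TOY_START_KERNEL_MAP) >> TOY_PMD_SHIFT);
}

/*
 * Clear every populated slot that does not cover [text, end).
 * Returns the number of slots that were zapped.
 */
size_t toy_zap_outside(uint64_t *pmd, size_t nslots, uint64_t text, uint64_t end)
{
    assert(text < end);
    size_t first = toy_pmd_slot(text);
    size_t last = toy_pmd_slot(end - 1);
    size_t zapped = 0;

    for (size_t i = 0; i < nslots; i++) {
        if ((i < first || i > last) && pmd[i] != 0) {
            pmd[i] = 0;
            zapped++;
        }
    }
    return zapped;
}
```

For example, with 20 populated slots (the 40 MB mentioned above) and an image spanning 2 MB to 9 MB above __START_KERNEL_map, slots 1 through 4 survive and the other 16 are cleared, including any entries that the relocation adjustment left pointing at bogus physical addresses.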
  7. 07 Feb, 2008 1 commit
  8. 04 Feb, 2008 1 commit
  9. 30 Jan, 2008 3 commits
  10. 11 Oct, 2007 2 commits
  11. 19 Aug, 2007 1 commit
  12. 30 Jul, 2007 1 commit
  13. 25 Jul, 2007 1 commit
  14. 23 Jul, 2007 1 commit
  15. 17 Jul, 2007 1 commit
  16. 03 May, 2007 9 commits
    • [PATCH] x86-64: Remove unused stext symbol · b8716890
      Andi Kleen committed
      suggested by Jan Beulich
      Signed-off-by: Andi Kleen <ak@suse.de>
    • [PATCH] x86: tighten kernel image page access rights · 6fb14755
      Jan Beulich committed
      On x86-64, kernel memory freed after init can be entirely unmapped instead
      of just getting 'poisoned' by overwriting with a debug pattern.
      
      On i386 and x86-64 (under CONFIG_DEBUG_RODATA), kernel text and bug table
      can also be write-protected.
      
      Compared to the first version, this one prevents re-creating deleted
      mappings in the kernel image range on x86-64, if those got removed
      previously. This, together with the original changes, prevents temporarily
      having inconsistent mappings when cacheability attributes are being
      changed on such pages (e.g. from AGP code). While on i386 such duplicate
      mappings don't exist, the same change is done there, too, both for
      consistency and because checking pte_present() before using various other
      pte_XXX functions is a requirement anyway. At once, i386 code gets
      adjusted to use pte_huge() instead of open coding this.
      
      AK: split out cpa() changes
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
    • [PATCH] x86-64: Relocatable Kernel Support · 1ab60e0f
      Vivek Goyal committed
      This patch modifies the x86_64 kernel so that it can be loaded and run
      at any 2M aligned address, below 512G.  The technique used is to
      compile the decompressor with -fPIC and modify it so the decompressor
      is fully relocatable.  For the main kernel the page tables are
      modified so the kernel remains at the same virtual address.  In
      addition a variable phys_base is kept that holds the physical address
      the kernel is loaded at.  __pa_symbol is modified to add that when
      we take the address of a kernel symbol.
      
      When loaded with a normal bootloader the decompressor will decompress
      the kernel to 2M and it will run there.  This both ensures the
      relocation code is always working, and makes it easier to use 2M
      pages for the kernel and the cpu.
      
      AK: changed to not make RELOCATABLE default in Kconfig
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
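
The two constraints in this message (a 2M-aligned load address below 512G, and __pa_symbol adding phys_base) can be sketched as below. This is an illustration under assumptions: the `toy_` helpers are hypothetical, __START_KERNEL_map is taken to be 0xffffffff80000000, and phys_base is modelled simply as the value added after subtracting that base.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_START_KERNEL_MAP 0xffffffff80000000ULL
#define TOY_ALIGN_2M         (2ULL << 20)
#define TOY_LIMIT_512G       (512ULL << 30)

/* True if phys is an acceptable load address: 2M aligned and below 512G. */
bool toy_load_addr_ok(uint64_t phys)
{
    return (phys % TOY_ALIGN_2M) == 0 && phys < TOY_LIMIT_512G;
}

/*
 * Physical address of a kernel symbol: the virtual address stays fixed,
 * so only the phys_base term changes when the kernel is relocated.
 */
uint64_t toy_pa_symbol(uint64_t vaddr, uint64_t phys_base)
{
    assert(vaddr >= TOY_START_KERNEL_MAP);
    return vaddr - TOY_START_KERNEL_MAP + phys_base;
}
```

In this model a symbol at __START_KERNEL_map + 2M resolves to physical 2M when phys_base is 0, and to 18M when phys_base is 16M, without the symbol's virtual address changing.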
    • [PATCH] x86-64: Remove the identity mapping as early as possible · cfd243d4
      Vivek Goyal committed
      With the rewrite of the SMP trampoline and the early page
      allocator there is nothing that needs identity mapped pages,
      once we start executing C code.
      
      So add zap_identity_mappings into head64.c and remove
      zap_low_mappings() from much later in the code.  The functions
      are subtly different, thus the name change.
      
      This also kills boot_level4_pgt which was from an earlier
      attempt to move the identity mappings as early as possible,
      and is now no longer needed.  Essentially I have replaced
      boot_level4_pgt with trampoline_level4_pgt in trampoline.S
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
    • [PATCH] x86-64: 64bit ACPI wakeup trampoline · d8e1baf1
      Vivek Goyal committed
      o Moved wakeup_level4_pgt into the wakeup routine so we can
        run the kernel above 4G.
      
      o Now we first go to 64bit mode and continue to run from trampoline and
        then start accessing kernel symbols and restore processor context.
        This enables us to resume even in relocatable kernel context when
        kernel might not be loaded at physical addr it has been compiled for.
      
      o Removed the need for modifying any existing kernel page table.
      
      o Increased the size of the wakeup routine to 8K. This is required as
        wake page tables are on trampoline itself and they got to be at 4K
        boundary, hence one page is not sufficient.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
    • [PATCH] x86-64: 64bit PIC SMP trampoline · 90b1c208
      Vivek Goyal committed
      This modifies the SMP trampoline and all of the associated code so
      it can jump to a 64bit kernel loaded at an arbitrary address.
      
      The dependencies on having an identity mapped page in the kernel
      page tables for SMP bootup have all been removed.
      
      In addition the trampoline has been modified to verify
      that long mode is supported.  Asking if long mode is implemented is
      downright silly but we have traditionally had some of these checks,
      and they can't hurt anything.  So when the totally ludicrous happens
      we just might handle it correctly.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
    • [PATCH] x86-64: cleanup segments · 30f47289
      Vivek Goyal committed
      Move __KERNEL32_CS up into the unused gdt entry.  __KERNEL32_CS is
      used when entering the kernel so putting it first is useful when
      trying to keep boot gdt sizes to a minimum.
      
      Set the accessed bit on all gdt entries.  We don't care
      so there is no need for the cpu to burn the extra cycles,
      and it potentially allows the pages to be immutable.  Plus
      it is confusing when debugging and your gdt entries mysteriously
      change.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
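
Pre-setting the accessed bit amounts to OR-ing one bit into each 8-byte descriptor. The sketch below assumes the standard x86 segment descriptor layout, in which the accessed flag of a code or data segment is bit 40 of the descriptor. The `toy_` names and the sample descriptor value are illustrative, not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A GDT entry for a code or data segment is one 8-byte descriptor. */
static_assert(sizeof(uint64_t) == 8, "descriptor must be 8 bytes");

#define TOY_DESC_ACCESSED (1ULL << 40)   /* assumed: accessed flag, lowest bit of the type field */

/* Return the descriptor with its accessed flag already set. */
uint64_t toy_set_accessed(uint64_t desc)
{
    return desc | TOY_DESC_ACCESSED;
}

/* True if the descriptor already carries the accessed flag. */
bool toy_is_accessed(uint64_t desc)
{
    return (desc & TOY_DESC_ACCESSED) != 0;
}
```

For an illustrative long-mode code descriptor 0x00af9a000000ffff, the result is 0x00af9b000000ffff. With the flag already set, the CPU has no reason to write to the entry when the segment is loaded, which is what allows the table to stay unchanged.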
    • [PATCH] x86-64: Clean up the early boot page table · 67dcbb6b
      Vivek Goyal committed
      - Merge physmem_pgt and ident_pgt, removing physmem_pgt.  The merge
        is broken as soon as mm/init.c:init_memory_mapping is run.
      - As physmem_pgt is gone don't export it in pgtable.h.
      - Use defines from pgtable.h for page permissions.
      - Fix the physical memory identity mapping so it is at the correct
        address.
      - Remove the physical memory mapping from wakeup_level4_pgt; it
        is at the wrong address so we can't possibly be using it.
      - Simplify NEXT_PAGE.  The work to calculate the phys_ alias
        of the labels was very cool.  Unfortunately it was a brittle
        special purpose hack that makes maintenance more difficult.
        Instead just use label - __START_KERNEL_map like we do
        everywhere else in assembly.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
    • [PATCH] x86-64: Kill temp boot pmds · dafe41ee
      Vivek Goyal committed
      Early in the boot process we need the ability to set
      up temporary mappings, before our normal mechanisms are
      initialized.  Currently this is used to map pages that
      are part of the page tables we are building and pages
      during the dmi scan.
      
      The core problem is that we are using the user portion of
      the page tables to implement this.  Which means that while
      this mechanism is active we cannot catch NULL pointer dereferences
      and we deviate from the normal ways of handling things.
      
      In this patch I modify early_ioremap to map pages into
      the kernel portion of address space, roughly where
      we will later put modules, and I make the discovery of
      which addresses we can use dynamic which removes all
      kinds of static limits and remove the dependencies
      on implementation details between different parts of the code.
      
      Now alloc_low_page() and unmap_low_page() use
      early_ioremap() and early_iounmap() to allocate/map and
      unmap a page.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
  17. 13 Feb, 2007 1 commit
    • [PATCH] x86-64: x86_64 - Fix FS/GS registers for VT execution · ffb60175
      Zachary Amsden committed
      Initialize FS and GS to __KERNEL_DS as well.  The actual value of them is not
      important, but it is important to reload them in protected mode.  At this time,
      they still retain the real mode values from initial boot.  VT disallows
      execution of code under such conditions, which means hardware virtualization
      can not be used to boot the kernel on Intel platforms, making the boot time
      painfully slow.
      
      This requires moving the GS load before the load of GS_BASE, so just move
      all the segments loads there to keep them together in the code.
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
  18. 26 Sep, 2006 1 commit
    • [PATCH] Reload CS when startup_64 is used. · 26374c7b
      Eric W. Biederman committed
      In long mode the %cs is largely a relic.  However there are a few cases
      like iret where it matters that we have a valid value.  Without this
      patch it is possible to enter the kernel in startup_64 without setting
      %cs to a valid value.  With this patch we don't care what %cs value
      we enter the kernel with, so long as the cs shadow register indicates
      it is a privileged code segment.
      
      Thanks to Magnus Damm for finding this problem and posting the
      first workable patch.  I have moved the jump to set %cs down a
      few instructions so we don't need to take an extra jump.  Which
      keeps the code simpler.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Andi Kleen <ak@suse.de>