1. 04 Feb, 2009 1 commit
    • x86: kexec: Use one page table in x86_64 machine_kexec · f5deb796
      Committed by Huang Ying
      Impact: reduce kernel BSS size by 7 pages, improve code readability
      
      The current x86_64 kexec implementation uses two page tables. One
      is used to jump from the kernel virtual address to the identity-mapped
      address; the other is used to map all physical memory. In fact, on x86_64,
      there is no conflict between the kernel virtual address space and the
      physical memory space, so one page table is sufficient. The page table
      pages used to map the control page are now allocated dynamically, which
      saves memory when no kexec image is loaded. The ASM code used to map the
      control page is also replaced by C code.
      Signed-off-by: Huang Ying <ying.huang@intel.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  2. 27 Jul, 2008 1 commit
    • kexec jump · 3ab83521
      Committed by Huang Ying
      This patch provides an enhancement to kexec/kdump.  It implements the
      following features:
      
      - Backup/restore memory used by the original kernel before/after
        kexec.
      
      - Save/restore CPU state before/after kexec.
      
      The features of this patch can be used as a general method to call a program in
      physical mode (with paging turned off).  This can be used to call BIOS code under
      Linux.
      
      kexec-tools needs to be patched to support kexec jump. The patches and
      the precompiled kexec can be downloaded from the following URLs:
      
             source: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-src_git_kh10.tar.bz2
             patches: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-patches_git_kh10.tar.bz2
             binary: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec_git_kh10
      
      Usage example of calling some physical mode code and returning:
      
      1. Compile and install the patched kernel with the following options selected:
      
      CONFIG_X86_32=y
      CONFIG_KEXEC=y
      CONFIG_PM=y
      CONFIG_KEXEC_JUMP=y
      
      2. Build the patched kexec-tools or download the pre-built binary.
      
      3. Build a physical mode executable, named for example "phy_mode".
      
      4. Boot the kernel compiled in step 1.
      
      5. Load the physical mode executable with /sbin/kexec. The shell command
         line can be as follows:
      
         /sbin/kexec --load-preserve-context --args-none phy_mode
      
      6. Call the physical mode executable with the following shell command line:
      
         /sbin/kexec -e
      
      Implementation notes:
      
      To support jumping without reserving memory, one shadow backup page (source
      page) is allocated for each page used by the kexeced code image (destination
      page).  During kexec_load, the image of the kexeced code is loaded into the
      source pages. Before execution, the destination pages and the source pages
      are swapped, so the contents of the destination pages are backed up.  The
      destination pages and the source pages are swapped both before jumping to
      the kexeced code image and after jumping back to the original kernel.
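
The swap described above can be illustrated with a small userspace sketch. This is not the kernel's code: the names `swap_page`, `swap_all_pages`, and `SKETCH_PAGE_SIZE` are hypothetical, and ordinary buffers stand in for physical pages. It only shows that swapping each destination page with its shadow page backs up the original contents, and that a second swap brings them back.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096 /* assumed page size, for illustration only */

/* Swap the contents of one destination page with its shadow (source) page. */
static void swap_page(unsigned char *dest, unsigned char *src)
{
    unsigned char tmp[SKETCH_PAGE_SIZE];

    assert(dest != src);
    memcpy(tmp, dest, SKETCH_PAGE_SIZE);
    memcpy(dest, src, SKETCH_PAGE_SIZE);
    memcpy(src, tmp, SKETCH_PAGE_SIZE);
}

/*
 * Swap every destination/source pair.
 * First call: the loaded image moves into the destination pages and the
 * original kernel data is kept in the source pages.
 * Second call: the original kernel data is moved back.
 */
static void swap_all_pages(unsigned char **dest, unsigned char **src,
                           size_t npages)
{
    size_t i;

    for (i = 0; i < npages; i++)
        swap_page(dest[i], src[i]);
}
```

Because the same routine both installs the image and restores the original data, no memory has to be reserved for the called code ahead of time; the cost is one shadow page per destination page.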
      
      C ABI (calling convention) is used as communication protocol between
      kernel and called code.
      
      A flag named KEXEC_PRESERVE_CONTEXT for sys_kexec_load is added to
      indicate that the loaded kernel image is used for jumping back.
      
      Now, only the i386 architecture is supported.
      Signed-off-by: Huang Ying <ying.huang@intel.com>
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Nigel Cunningham <nigel@nigel.suspend2.net>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 08 Jul, 2008 1 commit
  4. 24 May, 2008 1 commit
  5. 03 Apr, 2008 1 commit
    • vmcoreinfo: add the symbol "phys_base" · 629c8b4c
      Committed by Ken'ichi Ohmichi
      Fix the problem that makedumpfile sometimes fails on x86_64 machines.
      
      This patch adds the symbol "phys_base" to the vmcoreinfo data.  The
      vmcoreinfo data holds only the minimum debugging information needed for
      dump filtering.  makedumpfile (the dump filtering command) reads it to
      identify unnecessary pages and so creates a small dumpfile.
      
      On an x86_64 kernel compiled with CONFIG_PHYSICAL_START=0x0 and
      CONFIG_RELOCATABLE=y, makedumpfile fails as follows:
      
       # makedumpfile -d31 /proc/vmcore dumpfile
       The kernel version is not supported.
       The created dumpfile may be incomplete.
       _exclude_free_page: Can't get next online node.
      
       makedumpfile Failed.
       #
      
      The cause is the lack of the symbol "phys_base" in the vmcoreinfo data.
      If the symbol "phys_base" does not exist, makedumpfile treats an
      x86_64 kernel as non-relocatable.  As a result, makedumpfile
      misjudges the physical address where the kernel is loaded, and it
      cannot translate a kernel virtual address to a physical address correctly.
      
      To fix this problem, this patch adds the symbol "phys_base" to the
      vmcoreinfo data.
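
As a rough illustration of how a dump tool might consume the new entry, the sketch below looks up a `SYMBOL(<name>)=<hex>` line in a vmcoreinfo-style text buffer and falls back to "non-relocatable" when `phys_base` is absent. The function names, the line format shown, and the sample values in the usage note are assumptions made for this example; this is not makedumpfile's actual code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Search a newline-separated text buffer for "SYMBOL(<name>)=<hex>".
 * Returns 1 and stores the parsed value on success, 0 if the symbol
 * is not present. The line format is an assumption of this sketch.
 */
static int vmcoreinfo_symbol(const char *info, const char *name,
                             unsigned long long *value)
{
    char key[128];
    const char *line = info;
    int n;

    assert(info != NULL && name != NULL && value != NULL);
    n = snprintf(key, sizeof(key), "SYMBOL(%s)=", name);
    if (n < 0 || (size_t)n >= sizeof(key))
        return 0;

    while (line != NULL && *line != '\0') {
        /* Only match the key at the start of a line. */
        if (strncmp(line, key, (size_t)n) == 0)
            return sscanf(line + n, "%llx", value) == 1;
        line = strchr(line, '\n');
        if (line != NULL)
            line++;
    }
    return 0;
}

/* Hypothetical consumer: without "phys_base", assume a non-relocatable kernel. */
static int kernel_is_relocatable(const char *info)
{
    unsigned long long unused;

    return vmcoreinfo_symbol(info, "phys_base", &unused);
}
```

With a buffer such as `"SYMBOL(phys_base)=ffffffff8020a000\n"` (a made-up address), `kernel_is_relocatable()` returns 1; with a buffer that lacks the line it returns 0, which corresponds to the failure case described above.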
      Signed-off-by: Ken'ichi Ohmichi <oomichi@mxs.nes.nec.co.jp>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: <stable@kernel.org>
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  6. 08 Feb, 2008 1 commit
  7. 30 Jan, 2008 1 commit
    • x86: 64-bit, make sparsemem vmemmap the only memory model · b263295d
      Committed by Christoph Lameter
      Use sparsemem as the only memory model for UP, SMP and NUMA.  Measurements
      indicate that DISCONTIGMEM has a higher overhead than sparsemem, and
      FLATMEM's benefits are minimal.  So I think it's best to simply standardize
      on sparsemem.
      
      Results of page allocator tests (the tests are available via git from the
      "tests" branch of the slab git tree):
      
      Measurements in cycle counts. 1000 allocations were performed and then the
      average cycle count was calculated.
      
      Order  FlatMem  Discontig  SparseMem
      0          639        665        641
      1          567        647        593
      2          679        774        692
      3          763        967        781
      4          961       1501        962
      5         1356       2344       1392
      6         2224       3982       2336
      7         4869       7225       5074
      8        12500      14048      12732
      9        27926      28223      28165
      10       58578      58714      58682
      
      (Note that FlatMem is an SMP configuration and the rest are NUMA configurations.)
      
      Memory use:
      
      SMP Sparsemem
      -------------
      
      Kernel size:
      
         text    data     bss     dec     hex filename
      3849268  397739 1264856 5511863  541ab7 vmlinux
      
                   total       used       free     shared    buffers     cached
      Mem:       8242252      41164    8201088          0        352      11512
      -/+ buffers/cache:      29300    8212952
      Swap:      9775512          0    9775512
      
      SMP Flatmem
      -----------
      
      Kernel size:
      
         text    data     bss     dec     hex filename
      3844612  397739 1264536 5506887  540747 vmlinux
      
      So 4.5k growth in text size vs. FLATMEM.
      
                   total       used       free     shared    buffers     cached
      Mem:       8244052      40544    8203508          0        352      11484
      -/+ buffers/cache:      28708    8215344
      
      2k growth in overall memory use after boot.
      
      NUMA discontig:
      
         text    data     bss     dec     hex filename
      3888124  470659 1276504 5635287  55fcd7 vmlinux
      
                   total       used       free     shared    buffers     cached
      Mem:       8256256      56908    8199348          0        352      11496
      -/+ buffers/cache:      45060    8211196
      Swap:      9775512          0    9775512
      
      NUMA sparse:
      
         text    data     bss     dec     hex filename
      3896428  470659 1276824 5643911  561e87 vmlinux
      
      8k text growth. Given that we now fully inline virt_to_page and friends,
      that is rather good.
      
                   total       used       free     shared    buffers     cached
      Mem:       8264720      57240    8207480          0        352      11516
      -/+ buffers/cache:      45372    8219348
      Swap:      9775512          0    9775512
      
      The total available memory is increased by 8k.
      
      This patch makes sparsemem the default and removes discontig and
      flatmem support from x86.
      
      [ akpm@linux-foundation.org: allnoconfig build fix ]
      Acked-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  8. 28 Oct, 2007 1 commit
    • x86: Dump filtering supports x86_64 sparsemem · 69243f91
      Committed by Ken'ichi Ohmichi
      This patch adds the symbol "init_level4_pgt" to the vmcoreinfo data so
      that makedumpfile (the dump filtering command) supports the x86_64
      sparsemem kernel of linux-2.6.24.
      
      makedumpfile creates a small dumpfile by excluding unnecessary pages for
      the analysis. It checks attributes in page structures and distinguishes
      necessary pages and unnecessary ones. To check them, makedumpfile gets
      the vmcoreinfo data which has the minimum debugging information only for
      dump filtering.
      
      For older x86_64 kernels (linux-2.6.23 or earlier), makedumpfile translates
      the virtual address of a page structure into a physical address by subtracting
      PAGE_OFFSET from the virtual address. This translation does not work for the
      linux-2.6.24 sparsemem kernel, because its page structures are in the virtual
      memmap area. makedumpfile must translate their virtual addresses by walking
      the 4-level page tables, and for that it needs the symbol "init_level4_pgt".
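
To make the 4-level translation concrete, the sketch below splits an x86_64 virtual address into the four table indices and the page offset that such a walk would use, assuming 4 KiB pages and 9 index bits per level. The struct and function names are hypothetical, and the sketch only computes indices; it does not read any page tables.

```c
#include <assert.h>
#include <stdint.h>

/* Indices used at each level of a 4-level walk, plus the in-page offset. */
struct pt_indices {
    unsigned int pgd;    /* bits 47..39, index into the top-level table */
    unsigned int pud;    /* bits 38..30 */
    unsigned int pmd;    /* bits 29..21 */
    unsigned int pte;    /* bits 20..12 */
    unsigned int offset; /* bits 11..0  */
};

/* Split a canonical 48-bit virtual address (4 KiB pages assumed). */
static struct pt_indices split_vaddr(uint64_t vaddr)
{
    struct pt_indices ix;
    uint64_t top = vaddr >> 47;

    /* Canonical addresses have bits 63..47 all zero or all one. */
    assert(top == 0 || top == 0x1ffff);

    ix.pgd = (unsigned int)((vaddr >> 39) & 0x1ff);
    ix.pud = (unsigned int)((vaddr >> 30) & 0x1ff);
    ix.pmd = (unsigned int)((vaddr >> 21) & 0x1ff);
    ix.pte = (unsigned int)((vaddr >> 12) & 0x1ff);
    ix.offset = (unsigned int)(vaddr & 0xfff);
    return ix;
}
```

A real walk would start at the physical address of the top-level table (which is why the tool needs "init_level4_pgt"), use `pgd` to pick an entry, follow it to the next table, and repeat for each level.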
      Signed-off-by: Ken'ichi Ohmichi <oomichi@mxs.nes.nec.co.jp>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  9. 20 Oct, 2007 1 commit
  10. 17 Oct, 2007 2 commits
  11. 14 Oct, 2007 1 commit
    • Delete filenames in comments. · 835c34a1
      Committed by Dave Jones
      Since the x86 merge, lots of files that referenced their own filenames
      are no longer correct.  Rather than keep them up to date, just delete
      them, as they add no real value.
      
      Additionally:
      - fix up comment formatting in scx200_32.c
      - Remove a credit from myself in setup_64.c from a time when we had no SCM
      - remove longwinded history from tsc_32.c which can be figured out from
        git.
      Signed-off-by: Dave Jones <davej@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  12. 11 Oct, 2007 2 commits
  13. 07 May, 2007 1 commit
    • Revert "[PATCH] x86: __pa and __pa_symbol address space separation" · e3ebadd9
      Committed by Linus Torvalds
      This was broken.  It adds complexity, for no good reason.  Rather than
      separate __pa() and __pa_symbol(), we should deprecate __pa_symbol(),
      and preferably __pa() too - and just use "virt_to_phys()" instead, which
      is more readable and has nicer semantics.
      
      However, right now, just undo the separation, and make __pa_symbol() be
      the exact same as __pa().  That fixes the bugs this patch introduced,
      and we can do the fairly obvious cleanups later.
      
      Do the new __phys_addr() function (which is now the actual workhorse for
      the unified __pa()/__pa_symbol()) as a real external function, that way
      all the potential issues with compile/link-time optimizations of
      constant symbol addresses go away, and we can also, if we choose to, add
      more sanity-checking of the argument.
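
A minimal userspace sketch of the idea, assuming the two x86_64 mappings of that era: a kernel-text mapping starting at `0xffffffff80000000` (shifted by a runtime `phys_base` on a relocated kernel) and a direct mapping of physical memory starting at `0xffff810000000000`. All `sketch_` names and both constants are assumptions made for illustration; this is not the kernel's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout constants, for illustration only. */
#define SKETCH_START_KERNEL_MAP 0xffffffff80000000ULL /* kernel text mapping */
#define SKETCH_PAGE_OFFSET      0xffff810000000000ULL /* direct mapping */

/* Physical load offset of a relocated kernel; 0 if not relocated. */
static uint64_t sketch_phys_base;

/*
 * One out-of-line function handles both kinds of address, so the result
 * cannot be folded at compile or link time, and the argument can be checked.
 */
static uint64_t sketch_phys_addr(uint64_t vaddr)
{
    if (vaddr >= SKETCH_START_KERNEL_MAP)
        return vaddr - SKETCH_START_KERNEL_MAP + sketch_phys_base;

    /* Sanity check: anything else must lie in the direct mapping. */
    assert(vaddr >= SKETCH_PAGE_OFFSET);
    return vaddr - SKETCH_PAGE_OFFSET;
}

/* Both macros resolve to the same workhorse, as the message describes. */
#define sketch_pa(x)        sketch_phys_addr((uint64_t)(x))
#define sketch_pa_symbol(x) sketch_phys_addr((uint64_t)(x))
```

The extra read of `sketch_phys_base` only happens on the kernel-text branch, while direct-mapping addresses still cost a single subtraction.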
      
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Vivek Goyal <vgoyal@in.ibm.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  14. 03 May, 2007 1 commit
    • [PATCH] x86: __pa and __pa_symbol address space separation · 0dbf7028
      Committed by Vivek Goyal
      Currently __pa_symbol is for use with symbols in the kernel address
      map and __pa is for use with pointers into the physical memory map.
      But the code is implemented so you can usually interchange the two.
      
      __pa, which is much more common, can be implemented much more cheaply
      if it doesn't have to worry about any other kernel address
      spaces.  This is especially true with a relocatable kernel, as
      __pa_symbol needs to perform an extra variable read to resolve
      the address.
      
      There is a third macro that is added for the vsyscall data,
      __pa_vsymbol, for finding the physical addresses of vsyscall pages.
      
      Most of this patch is simply sorting through the references to
      __pa or __pa_symbol and using the proper one.  A little of
      it is continuing to use a physical address when we have it
      instead of recalculating it several times.
      
      swapper_pgd is now NULL.  leave_mm now uses init_mm.pgd
      and init_mm.pgd is initialized at boot (instead of compile time)
      to the physmem virtual mapping of init_level4_pgd.  The
      physical address changed.
      
      Except for the EMPTY_ZERO page, all of the remaining references
      to __pa_symbol appear to be during kernel initialization.  So this
      should reduce the cost of __pa in the common case, even on a relocated
      kernel.
      
      As this is technically a semantic change we need to be on the lookout
      for anything I missed.  But it works for me (tm).
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
  15. 26 Sep, 2006 2 commits
    • [PATCH] Avoid overwriting the current pgd (V4, x86_64) · 4bfaaef0
      Committed by Magnus Damm
      kexec: Avoid overwriting the current pgd (V4, x86_64)
      
      This patch upgrades the x86_64-specific kexec code to avoid overwriting the
      current pgd. Overwriting the current pgd is bad when CONFIG_CRASH_DUMP is used
      to start a secondary kernel that dumps the memory of the previous kernel.
      
      The code introduces a new set of page tables. These tables are used to provide
      an executable identity mapping without overwriting the current pgd.
      Signed-off-by: Magnus Damm <magnus@valinux.co.jp>
      Signed-off-by: Andi Kleen <ak@suse.de>
    • [PATCH] Convert x86-64 to early param · 2c8c0e6b
      Committed by Andi Kleen
      Instead of hackish manual parsing
      
      Requires earlier i386 patchkit, but also fixes i386 early_printk again.
      
      I removed some obsolete really early parameters which didn't do anything useful.
      Also made a few parameters that needed it early (mostly oops printing setup)
      
      Also removed one panic check that wasn't visible without
      early console anyways (the early console is now initialized after that
      panic)
      
      This cleans up a lot of code.
      Signed-off-by: Andi Kleen <ak@suse.de>
  16. 01 Aug, 2006 1 commit
  17. 27 Jun, 2006 1 commit
  18. 09 Mar, 2006 1 commit
  19. 30 Jul, 2005 2 commits
  20. 26 Jun, 2005 2 commits