1. 29 4月, 2015 1 次提交
    • J
      x86: introduce kaslr_offset() · 4545c898
      Jiri Kosina 提交于
      Offset that has been chosen for kaslr during kernel decompression can be
      easily computed as a difference between _text and __START_KERNEL. We are
      already making use of this in dump_kernel_offset() notifier and in
      arch_crash_save_vmcoreinfo().
      
      Introduce kaslr_offset() that makes this computation instead of hard-coding
      it, so that other kernel code (such as live patching) can make use of it.
      Also convert existing users to make use of it.
      
      This patch is equivalent transofrmation without any effects on the resulting
      code:
      
      	$ diff -u vmlinux.old.asm vmlinux.new.asm
      	--- vmlinux.old.asm     2015-04-28 17:55:19.520983368 +0200
      	+++ vmlinux.new.asm     2015-04-28 17:55:24.141206072 +0200
      	@@ -1,5 +1,5 @@
      
      	-vmlinux.old:     file format elf64-x86-64
      	+vmlinux.new:     file format elf64-x86-64
      
      	Disassembly of section .text:
      	$
      Acked-by: NBorislav Petkov <bp@suse.de>
      Signed-off-by: NJiri Kosina <jkosina@suse.cz>
      4545c898
  2. 16 12月, 2014 1 次提交
    • J
      x86, irq: Move IOAPIC related declarations from hw_irq.h into io_apic.h · 8643e28d
      Jiang Liu 提交于
      Clean up code by moving IOAPIC related declarations from hw_irq.h into
      io_apic.h.
      Signed-off-by: NJiang Liu <jiang.liu@linux.intel.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@linux.intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Grant Likely <grant.likely@linaro.org>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Matt Fleming <matt.fleming@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
      Cc: Aubrey <aubrey.li@linux.intel.com>
      Cc: Ryan Desfosses <ryan@desfo.org>
      Cc: Quentin Lambert <lambert.quentin@gmail.com>
      Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Link: http://lkml.kernel.org/r/1414397531-28254-14-git-send-email-jiang.liu@linux.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      8643e28d
  3. 30 8月, 2014 1 次提交
    • V
      kexec: create a new config option CONFIG_KEXEC_FILE for new syscall · 74ca317c
      Vivek Goyal 提交于
      Currently new system call kexec_file_load() and all the associated code
      compiles if CONFIG_KEXEC=y.  But new syscall also compiles purgatory
      code which currently uses gcc option -mcmodel=large.  This option seems
      to be available only gcc 4.4 onwards.
      
      Hiding new functionality behind a new config option will not break
      existing users of old gcc.  Those who wish to enable new functionality
      will require new gcc.  Having said that, I am trying to figure out how
      can I move away from using -mcmodel=large but that can take a while.
      
      I think there are other advantages of introducing this new config
      option.  As this option will be enabled only on x86_64, other arches
      don't have to compile generic kexec code which will never be used.  This
      new code selects CRYPTO=y and CRYPTO_SHA256=y.  And all other arches had
      to do this for CONFIG_KEXEC.  Now with introduction of new config
      option, we can remove crypto dependency from other arches.
      
      Now CONFIG_KEXEC_FILE is available only on x86_64.  So whereever I had
      CONFIG_X86_64 defined, I got rid of that.
      
      For CONFIG_KEXEC_FILE, instead of doing select CRYPTO=y, I changed it to
      "depends on CRYPTO=y".  This should be safer as "select" is not
      recursive.
      Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Tested-by: NShaun Ruffell <sruffell@digium.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      74ca317c
  4. 09 8月, 2014 5 次提交
    • V
      kexec: verify the signature of signed PE bzImage · 8e7d8381
      Vivek Goyal 提交于
      This is the final piece of the puzzle of verifying kernel image signature
      during kexec_file_load() syscall.
      
      This patch calls into PE file routines to verify signature of bzImage.  If
      signature are valid, kexec_file_load() succeeds otherwise it fails.
      
      Two new config options have been introduced.  First one is
      CONFIG_KEXEC_VERIFY_SIG.  This option enforces that kernel has to be
      validly signed otherwise kernel load will fail.  If this option is not
      set, no signature verification will be done.  Only exception will be when
      secureboot is enabled.  In that case signature verification should be
      automatically enforced when secureboot is enabled.  But that will happen
      when secureboot patches are merged.
      
      Second config option is CONFIG_KEXEC_BZIMAGE_VERIFY_SIG.  This option
      enables signature verification support on bzImage.  If this option is not
      set and previous one is set, kernel image loading will fail because kernel
      does not have support to verify signature of bzImage.
      
      I tested these patches with both "pesign" and "sbsign" signed bzImages.
      
      I used signing_key.priv key and signing_key.x509 cert for signing as
      generated during kernel build process (if module signing is enabled).
      
      Used following method to sign bzImage.
      
      pesign
      ======
      - Convert DER format cert to PEM format cert
      openssl x509 -in signing_key.x509 -inform DER -out signing_key.x509.PEM -outform
      PEM
      
      - Generate a .p12 file from existing cert and private key file
      openssl pkcs12 -export -out kernel-key.p12 -inkey signing_key.priv -in
      signing_key.x509.PEM
      
      - Import .p12 file into pesign db
      pk12util -i /tmp/kernel-key.p12 -d /etc/pki/pesign
      
      - Sign bzImage
      pesign -i /boot/vmlinuz-3.16.0-rc3+ -o /boot/vmlinuz-3.16.0-rc3+.signed.pesign
      -c "Glacier signing key - Magrathea" -s
      
      sbsign
      ======
      sbsign --key signing_key.priv --cert signing_key.x509.PEM --output
      /boot/vmlinuz-3.16.0-rc3+.signed.sbsign /boot/vmlinuz-3.16.0-rc3+
      
      Patch details:
      
      Well all the hard work is done in previous patches.  Now bzImage loader
      has just call into that code and verify whether bzImage signature are
      valid or not.
      
      Also create two config options.  First one is CONFIG_KEXEC_VERIFY_SIG.
      This option enforces that kernel has to be validly signed otherwise kernel
      load will fail.  If this option is not set, no signature verification will
      be done.  Only exception will be when secureboot is enabled.  In that case
      signature verification should be automatically enforced when secureboot is
      enabled.  But that will happen when secureboot patches are merged.
      
      Second config option is CONFIG_KEXEC_BZIMAGE_VERIFY_SIG.  This option
      enables signature verification support on bzImage.  If this option is not
      set and previous one is set, kernel image loading will fail because kernel
      does not have support to verify signature of bzImage.
      Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Greg Kroah-Hartman <greg@kroah.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: WANG Chao <chaowang@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Matt Fleming <matt@console-pimps.org>
      Cc: David Howells <dhowells@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8e7d8381
    • V
      kexec: support for kexec on panic using new system call · dd5f7260
      Vivek Goyal 提交于
      This patch adds support for loading a kexec on panic (kdump) kernel usning
      new system call.
      
      It prepares ELF headers for memory areas to be dumped and for saved cpu
      registers.  Also prepares the memory map for second kernel and limits its
      boot to reserved areas only.
      Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Greg Kroah-Hartman <greg@kroah.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: WANG Chao <chaowang@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      dd5f7260
    • V
      kexec-bzImage64: support for loading bzImage using 64bit entry · 27f48d3e
      Vivek Goyal 提交于
      This is loader specific code which can load bzImage and set it up for
      64bit entry.  This does not take care of 32bit entry or real mode entry.
      
      32bit mode entry can be implemented if somebody needs it.
      Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Greg Kroah-Hartman <greg@kroah.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: WANG Chao <chaowang@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      27f48d3e
    • V
      kexec: load and relocate purgatory at kernel load time · 12db5562
      Vivek Goyal 提交于
      Load purgatory code in RAM and relocate it based on the location.
      Relocation code has been inspired by module relocation code and purgatory
      relocation code in kexec-tools.
      
      Also compute the checksums of loaded kexec segments and store them in
      purgatory.
      
      Arch independent code provides this functionality so that arch dependent
      bootloaders can make use of it.
      
      Helper functions are provided to get/set symbol values in purgatory which
      are used by bootloaders later to set things like stack and entry point of
      second kernel etc.
      Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Greg Kroah-Hartman <greg@kroah.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: WANG Chao <chaowang@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      12db5562
    • V
      kexec: implementation of new syscall kexec_file_load · cb105258
      Vivek Goyal 提交于
      Previous patch provided the interface definition and this patch prvides
      implementation of new syscall.
      
      Previously segment list was prepared in user space.  Now user space just
      passes kernel fd, initrd fd and command line and kernel will create a
      segment list internally.
      
      This patch contains generic part of the code.  Actual segment preparation
      and loading is done by arch and image specific loader.  Which comes in
      next patch.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Greg Kroah-Hartman <greg@kroah.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: WANG Chao <chaowang@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      cb105258
  5. 26 2月, 2014 1 次提交
  6. 30 1月, 2013 3 次提交
  7. 23 9月, 2010 1 次提交
  8. 30 3月, 2010 1 次提交
    • T
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Tejun Heo 提交于
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the followings.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and try to put the new include such that its order conforms
        to its surrounding.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         wildly available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build test were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable easily on most builds of
      the specific arch.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Guess-its-ok-by: NChristoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      5a0e3ad6
  9. 03 6月, 2009 1 次提交
  10. 08 5月, 2009 1 次提交
  11. 11 3月, 2009 3 次提交
  12. 04 2月, 2009 1 次提交
    • H
      x86: kexec: Use one page table in x86_64 machine_kexec · f5deb796
      Huang Ying 提交于
      Impact: reduce kernel BSS size by 7 pages, improve code readability
      
      Two page tables are used in current x86_64 kexec implementation. One
      is used to jump from kernel virtual address to identity map address,
      the other is used to map all physical memory. In fact, on x86_64,
      there is no conflict between kernel virtual address space and physical
      memory space, so just one page table is sufficient. The page table
      pages used to map control page are dynamically allocated to save
      memory if kexec image is not loaded. ASM code used to map control page
      is replaced by C code too.
      Signed-off-by: NHuang Ying <ying.huang@intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      f5deb796
  13. 27 7月, 2008 1 次提交
    • H
      kexec jump · 3ab83521
      Huang Ying 提交于
      This patch provides an enhancement to kexec/kdump.  It implements the
      following features:
      
      - Backup/restore memory used by the original kernel before/after
        kexec.
      
      - Save/restore CPU state before/after kexec.
      
      The features of this patch can be used as a general method to call program in
      physical mode (paging turning off).  This can be used to call BIOS code under
      Linux.
      
      kexec-tools needs to be patched to support kexec jump. The patches and
      the precompiled kexec can be download from the following URL:
      
             source: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-src_git_kh10.tar.bz2
             patches: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-patches_git_kh10.tar.bz2
             binary: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec_git_kh10
      
      Usage example of calling some physical mode code and return:
      
      1. Compile and install patched kernel with following options selected:
      
      CONFIG_X86_32=y
      CONFIG_KEXEC=y
      CONFIG_PM=y
      CONFIG_KEXEC_JUMP=y
      
      2. Build patched kexec-tool or download the pre-built one.
      
      3. Build some physical mode executable named such as "phy_mode"
      
      4. Boot kernel compiled in step 1.
      
      5. Load physical mode executable with /sbin/kexec. The shell command
         line can be as follow:
      
         /sbin/kexec --load-preserve-context --args-none phy_mode
      
      6. Call physical mode executable with following shell command line:
      
         /sbin/kexec -e
      
      Implementation point:
      
      To support jumping without reserving memory.  One shadow backup page (source
      page) is allocated for each page used by kexeced code image (destination
      page).  When do kexec_load, the image of kexeced code is loaded into source
      pages, and before executing, the destination pages and the source pages are
      swapped, so the contents of destination pages are backupped.  Before jumping
      to the kexeced code image and after jumping back to the original kernel, the
      destination pages and the source pages are swapped too.
      
      C ABI (calling convention) is used as communication protocol between
      kernel and called code.
      
      A flag named KEXEC_PRESERVE_CONTEXT for sys_kexec_load is added to
      indicate that the loaded kernel image is used for jumping back.
      
      Now, only the i386 architecture is supported.
      Signed-off-by: NHuang Ying <ying.huang@intel.com>
      Acked-by: NVivek Goyal <vgoyal@redhat.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Nigel Cunningham <nigel@nigel.suspend2.net>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3ab83521
  14. 08 7月, 2008 1 次提交
  15. 24 5月, 2008 1 次提交
  16. 03 4月, 2008 1 次提交
    • K
      vmcoreinfo: add the symbol "phys_base" · 629c8b4c
      Ken'ichi Ohmichi 提交于
      Fix the problem that makedumpfile sometimes fails on x86_64 machine.
      
      This patch adds the symbol "phys_base" to a vmcoreinfo data.  The
      vmcoreinfo data has the minimum debugging information only for dump
      filtering.  makedumpfile (dump filtering command) gets it to distinguish
      unnecessary pages, and makedumpfile creates a small dumpfile.
      
      On x86_64 kernel which compiled with CONFIG_PHYSICAL_START=0x0 and
      CONFIG_RELOCATABLE=y, makedumpfile fails like the following:
      
       # makedumpfile -d31 /proc/vmcore dumpfile
       The kernel version is not supported.
       The created dumpfile may be incomplete.
       _exclude_free_page: Can't get next online node.
      
       makedumpfile Failed.
       #
      
      The cause is the lack of the symbol "phys_base" in a vmcoreinfo data.
      If the symbol "phys_base" does not exist, makedumpfile considers an
      x86_64 kernel as non relocatable.  As the result, makedumpfile
      misunderstands the physical address where the kernel is loaded, and it
      cannot translate a kernel virtual address to physical address correctly.
      
      To fix this problem, this patch adds the symbol "phys_base" to a
      vmcoreinfo data.
      Signed-off-by: NKen'ichi Ohmichi <oomichi@mxs.nes.nec.co.jp>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: <stable@kernel.org>
      Acked-by: NVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      629c8b4c
  17. 08 2月, 2008 1 次提交
  18. 30 1月, 2008 1 次提交
    • C
      x86: 64-bit, make sparsemem vmemmap the only memory model · b263295d
      Christoph Lameter 提交于
      Use sparsemem as the only memory model for UP, SMP and NUMA.  Measurements
      indicate that DISCONTIGMEM has a higher overhead than sparsemem.  And
      FLATMEMs benefits are minimal.  So I think its best to simply standardize
      on sparsemem.
      
      Results of page allocator tests (test can be had via git from slab git
      tree branch tests)
      
      Measurements in cycle counts. 1000 allocations were performed and then the
      average cycle count was calculated.
      
      Order	FlatMem	Discontig	SparseMem
      0	  639	  665		  641
      1	  567	  647		  593
      2	  679	  774		  692
      3	  763	  967		  781
      4	  961	 1501		  962
      5	 1356	 2344		 1392
      6	 2224	 3982		 2336
      7	 4869	 7225		 5074
      8	12500	14048		12732
      9	27926	28223		28165
      10	58578	58714		58682
      
      (Note that FlatMem is an SMP config and the rest NUMA configurations)
      
      Memory use:
      
      SMP Sparsemem
      -------------
      
      Kernel size:
      
         text    data     bss     dec     hex filename
      3849268  397739 1264856 5511863  541ab7 vmlinux
      
                   total       used       free     shared    buffers     cached
      Mem:       8242252      41164    8201088          0        352      11512
      -/+ buffers/cache:      29300    8212952
      Swap:      9775512          0    9775512
      
      SMP Flatmem
      -----------
      
      Kernel size:
      
         text    data     bss     dec     hex filename
      3844612  397739 1264536 5506887  540747 vmlinux
      
      So 4.5k growth in text size vs. FLATMEM.
      
                   total       used       free     shared    buffers     cached
      Mem:       8244052      40544    8203508          0        352      11484
      -/+ buffers/cache:      28708    8215344
      
      2k growth in overall memory use after boot.
      
      NUMA discontig:
      
         text    data     bss     dec     hex filename
      3888124  470659 1276504 5635287  55fcd7 vmlinux
      
                   total       used       free     shared    buffers     cached
      Mem:       8256256      56908    8199348          0        352      11496
      -/+ buffers/cache:      45060    8211196
      Swap:      9775512          0    9775512
      
      NUMA sparse:
      
         text    data     bss     dec     hex filename
      3896428  470659 1276824 5643911  561e87 vmlinux
      
      8k text growth. Given that we fully inline virt_to_page and friends now
      that is rather good.
      
                   total       used       free     shared    buffers     cached
      Mem:       8264720      57240    8207480          0        352      11516
      -/+ buffers/cache:      45372    8219348
      Swap:      9775512          0    9775512
      
      The total available memory is increased by 8k.
      
      This patch makes sparsemem the default and removes discontig and
      flatmem support from x86.
      
      [ akpm@linux-foundation.org: allnoconfig build fix ]
      Acked-by: NAndi Kleen <ak@suse.de>
      Signed-off-by: NChristoph Lameter <clameter@sgi.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      b263295d
  19. 28 10月, 2007 1 次提交
    • K
      x86: Dump filtering supports x86_64 sparsemem · 69243f91
      Ken'ichi Ohmichi 提交于
      This patch adds the symbol "init_level4_pgt" to the vmcoreinfo data so
      that makedumpfile (dump filtering command) supports x86_64 sparsemem 
      kernel of linux-2.6.24.
      
      makedumpfile creates a small dumpfile by excluding unnecessary pages for
      the analysis. It checks attributes in page structures and distinguishes
      necessary pages and unnecessary ones. To check them, makedumpfile gets
      the vmcoreinfo data which has the minimum debugging information only for
      dump filtering.
      
      For older x86_64 kernel (linux-2.6.23 or before), makedumpfile translates
      the virtual address of page structure into physical address by subtracting
      PAGE_OFFSET from virtual address, but this translation isn't effective for
      linux-2.6.24 sparsemem kernel, because its page structures are in virtual
      memmap area. makedumpfile should translate their virtual address by 4-levels
      paging and it needs the symbol "init_level4_pgt".
      Signed-off-by: NKen'ichi Ohmichi <oomichi@mxs.nes.nec.co.jp>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      69243f91
  20. 20 10月, 2007 1 次提交
  21. 17 10月, 2007 2 次提交
  22. 14 10月, 2007 1 次提交
    • D
      Delete filenames in comments. · 835c34a1
      Dave Jones 提交于
      Since the x86 merge, lots of files that referenced their own filenames
      are no longer correct.  Rather than keep them up to date, just delete
      them, as they add no real value.
      
      Additionally:
      - fix up comment formatting in scx200_32.c
      - Remove a credit from myself in setup_64.c from a time when we had no SCM
      - remove longwinded history from tsc_32.c which can be figured out from
        git.
      Signed-off-by: NDave Jones <davej@redhat.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      835c34a1
  23. 11 10月, 2007 2 次提交
  24. 07 5月, 2007 1 次提交
    • L
      Revert "[PATCH] x86: __pa and __pa_symbol address space separation" · e3ebadd9
      Linus Torvalds 提交于
      This was broken.  It adds complexity, for no good reason.  Rather than
      separate __pa() and __pa_symbol(), we should deprecate __pa_symbol(),
      and preferably __pa() too - and just use "virt_to_phys()" instead, which
      is more readable and has nicer semantics.
      
      However, right now, just undo the separation, and make __pa_symbol() be
      the exact same as __pa().  That fixes the bugs this patch introduced,
      and we can do the fairly obvious cleanups later.
      
      Do the new __phys_addr() function (which is now the actual workhorse for
      the unified __pa()/__pa_symbol()) as a real external function, that way
      all the potential issues with compile/link-time optimizations of
      constant symbol addresses go away, and we can also, if we choose to, add
      more sanity-checking of the argument.
      
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Vivek Goyal <vgoyal@in.ibm.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e3ebadd9
  25. 03 5月, 2007 1 次提交
    • V
      [PATCH] x86: __pa and __pa_symbol address space separation · 0dbf7028
      Vivek Goyal 提交于
      Currently __pa_symbol is for use with symbols in the kernel address
      map and __pa is for use with pointers into the physical memory map.
      But the code is implemented so you can usually interchange the two.
      
      __pa which is much more common can be implemented much more cheaply
      if it is it doesn't have to worry about any other kernel address
      spaces.  This is especially true with a relocatable kernel as
      __pa_symbol needs to peform an extra variable read to resolve
      the address.
      
      There is a third macro that is added for the vsyscall data
      __pa_vsymbol for finding the physical addesses of vsyscall pages.
      
      Most of this patch is simply sorting through the references to
      __pa or __pa_symbol and using the proper one.  A little of
      it is continuing to use a physical address when we have it
      instead of recalculating it several times.
      
      swapper_pgd is now NULL.  leave_mm now uses init_mm.pgd
      and init_mm.pgd is initialized at boot (instead of compile time)
      to the physmem virtual mapping of init_level4_pgd.  The
      physical address changed.
      
      Except for the for EMPTY_ZERO page all of the remaining references
      to __pa_symbol appear to be during kernel initialization.  So this
      should reduce the cost of __pa in the common case, even on a relocated
      kernel.
      
      As this is technically a semantic change we need to be on the lookout
      for anything I missed.  But it works for me (tm).
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NVivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      0dbf7028
  26. 26 9月, 2006 2 次提交
    • M
      [PATCH] Avoid overwriting the current pgd (V4, x86_64) · 4bfaaef0
      Magnus Damm 提交于
      kexec: Avoid overwriting the current pgd (V4, x86_64)
      
      This patch upgrades the x86_64-specific kexec code to avoid overwriting the
      current pgd. Overwriting the current pgd is bad when CONFIG_CRASH_DUMP is used
      to start a secondary kernel that dumps the memory of the previous kernel.
      
      The code introduces a new set of page tables. These tables are used to provide
      an executable identity mapping without overwriting the current pgd.
      Signed-off-by: NMagnus Damm <magnus@valinux.co.jp>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      4bfaaef0
    • A
      [PATCH] Convert x86-64 to early param · 2c8c0e6b
      Andi Kleen 提交于
      Instead of hackish manual parsing
      
      Requires earlier i386 patchkit, but also fixes i386 early_printk again.
      
      I removed some obsolete really early parameters which didn't do anything useful.
      Also made a few parameters that needed it early (mostly oops printing setup)
      
      Also removed one panic check that wasn't visible without
      early console anyways (the early console is now initialized after that
      panic)
      
      This cleans up a lot of code.
      Signed-off-by: NAndi Kleen <ak@suse.de>
      2c8c0e6b
  27. 01 8月, 2006 1 次提交
  28. 27 6月, 2006 1 次提交
  29. 09 3月, 2006 1 次提交