1. 27 7月, 2008 3 次提交
    • N
      x86: lockless get_user_pages_fast() · 8174c430
      Nick Piggin 提交于
      Implement get_user_pages_fast without locking in the fastpath on x86.
      
      Do an optimistic lockless pagetable walk, without taking mmap_sem or any
      page table locks or even mmap_sem.  Page table existence is guaranteed by
      turning interrupts off (combined with the fact that we're always looking
      up the current mm, means we can do the lockless page table walk within the
      constraints of the TLB shootdown design).  Basically we can do this
      lockless pagetable walk in a similar manner to the way the CPU's pagetable
      walker does not have to take any locks to find present ptes.
      
      This patch (combined with the subsequent ones to convert direct IO to use
      it) was found to give about 10% performance improvement on a 2 socket 8
      core Intel Xeon system running an OLTP workload on DB2 v9.5
      
       "To test the effects of the patch, an OLTP workload was run on an IBM
        x3850 M2 server with 2 processors (quad-core Intel Xeon processors at
        2.93 GHz) using IBM DB2 v9.5 running Linux 2.6.24rc7 kernel.  Comparing
        runs with and without the patch resulted in an overall performance
        benefit of ~9.8%.  Correspondingly, oprofiles showed that samples from
        __up_read and __down_read routines that is seen during thread contention
        for system resources was reduced from 2.8% down to .05%.  Monitoring the
        /proc/vmstat output from the patched run showed that the counter for
        fast_gup contained a very high number while the fast_gup_slow value was
        zero."
      
      (fast_gup is the old name for get_user_pages_fast, fast_gup_slow is a
      counter we had for the number of times the slowpath was invoked).
      
      The main reason for the improvement is that DB2 has multiple threads each
      issuing direct-IO.  Direct-IO uses get_user_pages, and thus the threads
      contend the mmap_sem cacheline, and can also contend on page table locks.
      
      I would anticipate larger performance gains on larger systems, however I
      think DB2 uses an adaptive mix of threads and processes, so it could be
      that thread contention remains pretty constant as machine size increases.
      In which case, we stuck with "only" a 10% gain.
      
      The downside of using get_user_pages_fast is that if there is not a pte
      with the correct permissions for the access, we end up falling back to
      get_user_pages and so the get_user_pages_fast is a bit of extra work.
      However this should not be the common case in most performance critical
      code.
      
      [akpm@linux-foundation.org: coding-style fixes]
      [akpm@linux-foundation.org: build fix]
      [akpm@linux-foundation.org: Kconfig fix]
      [akpm@linux-foundation.org: Makefile fix/cleanup]
      [akpm@linux-foundation.org: warning fix]
      Signed-off-by: NNick Piggin <npiggin@suse.de>
      Cc: Dave Kleikamp <shaggy@austin.ibm.com>
      Cc: Andy Whitcroft <apw@shadowen.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Dave Kleikamp <shaggy@austin.ibm.com>
      Cc: Badari Pulavarty <pbadari@us.ibm.com>
      Cc: Zach Brown <zach.brown@oracle.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Reviewed-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8174c430
    • H
      kexec jump: save/restore device state · 89081d17
      Huang Ying 提交于
      This patch implements devices state save/restore before after kexec.
      
      This patch together with features in kexec_jump patch can be used for
      following:
      
      - A simple hibernation implementation without ACPI support.  You can kexec a
        hibernating kernel, save the memory image of original system and shutdown
        the system.  When resuming, you restore the memory image of original system
        via ordinary kexec load then jump back.
      
      - Kernel/system debug through making system snapshot.  You can make system
        snapshot, jump back, do some thing and make another system snapshot.
      
      - Cooperative multi-kernel/system.  With kexec jump, you can switch between
        several kernels/systems quickly without boot process except the first time.
        This appears like swap a whole kernel/system out/in.
      
      - A general method to call program in physical mode (paging turning
        off). This can be used to invoke BIOS code under Linux.
      
      The following user-space tools can be used with kexec jump:
      
      - kexec-tools needs to be patched to support kexec jump. The patches
        and the precompiled kexec can be download from the following URL:
             source: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-src_git_kh10.tar.bz2
             patches: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-patches_git_kh10.tar.bz2
             binary: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec_git_kh10
      
      - makedumpfile with patches are used as memory image saving tool, it
        can exclude free pages from original kernel memory image file. The
        patches and the precompiled makedumpfile can be download from the
        following URL:
             source: http://khibernation.sourceforge.net/download/release_v10/makedumpfile/makedumpfile-src_cvs_kh10.tar.bz2
             patches: http://khibernation.sourceforge.net/download/release_v10/makedumpfile/makedumpfile-patches_cvs_kh10.tar.bz2
             binary: http://khibernation.sourceforge.net/download/release_v10/makedumpfile/makedumpfile_cvs_kh10
      
      - An initramfs image can be used as the root file system of kexeced
        kernel. An initramfs image built with "BuildRoot" can be downloaded
        from the following URL:
             initramfs image: http://khibernation.sourceforge.net/download/release_v10/initramfs/rootfs_cvs_kh10.gz
        All user space tools above are included in the initramfs image.
      
      Usage example of simple hibernation:
      
      1. Compile and install patched kernel with following options selected:
      
      CONFIG_X86_32=y
      CONFIG_RELOCATABLE=y
      CONFIG_KEXEC=y
      CONFIG_CRASH_DUMP=y
      CONFIG_PM=y
      CONFIG_HIBERNATION=y
      CONFIG_KEXEC_JUMP=y
      
      2. Build an initramfs image contains kexec-tool and makedumpfile, or
         download the pre-built initramfs image, called rootfs.gz in
         following text.
      
      3. Prepare a partition to save memory image of original kernel, called
         hibernating partition in following text.
      
      4. Boot kernel compiled in step 1 (kernel A).
      
      5. In the kernel A, load kernel compiled in step 1 (kernel B) with
         /sbin/kexec. The shell command line can be as follow:
      
         /sbin/kexec --load-preserve-context /boot/bzImage --mem-min=0x100000
           --mem-max=0xffffff --initrd=rootfs.gz
      
      6. Boot the kernel B with following shell command line:
      
         /sbin/kexec -e
      
      7. The kernel B will boot as normal kexec. In kernel B the memory
         image of kernel A can be saved into hibernating partition as
         follow:
      
         jump_back_entry=`cat /proc/cmdline | tr ' ' '\n' | grep kexec_jump_back_entry | cut -d '='`
         echo $jump_back_entry > kexec_jump_back_entry
         cp /proc/vmcore dump.elf
      
         Then you can shutdown the machine as normal.
      
      8. Boot kernel compiled in step 1 (kernel C). Use the rootfs.gz as
         root file system.
      
      9. In kernel C, load the memory image of kernel A as follow:
      
         /sbin/kexec -l --args-none --entry=`cat kexec_jump_back_entry` dump.elf
      
      10. Jump back to the kernel A as follow:
      
         /sbin/kexec -e
      
         Then, kernel A is resumed.
      
      Implementation point:
      
      To support jumping between two kernels, before jumping to (executing)
      the new kernel and jumping back to the original kernel, the devices
      are put into quiescent state, and the state of devices and CPU is
      saved. After jumping back from kexeced kernel and jumping to the new
      kernel, the state of devices and CPU are restored accordingly. The
      devices/CPU state save/restore code of software suspend is called to
      implement corresponding function.
      
      Known issues:
      
      - Because the segment number supported by sys_kexec_load is limited,
        hibernation image with many segments may not be load. This is
        planned to be eliminated by adding a new flag to sys_kexec_load to
        make a image can be loaded with multiple sys_kexec_load invoking.
      
      Now, only the i386 architecture is supported.
      Signed-off-by: NHuang Ying <ying.huang@intel.com>
      Acked-by: NVivek Goyal <vgoyal@redhat.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Nigel Cunningham <nigel@nigel.suspend2.net>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      89081d17
    • H
      kexec jump · 3ab83521
      Huang Ying 提交于
      This patch provides an enhancement to kexec/kdump.  It implements the
      following features:
      
      - Backup/restore memory used by the original kernel before/after
        kexec.
      
      - Save/restore CPU state before/after kexec.
      
      The features of this patch can be used as a general method to call program in
      physical mode (paging turning off).  This can be used to call BIOS code under
      Linux.
      
      kexec-tools needs to be patched to support kexec jump. The patches and
      the precompiled kexec can be download from the following URL:
      
             source: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-src_git_kh10.tar.bz2
             patches: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-patches_git_kh10.tar.bz2
             binary: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec_git_kh10
      
      Usage example of calling some physical mode code and return:
      
      1. Compile and install patched kernel with following options selected:
      
      CONFIG_X86_32=y
      CONFIG_KEXEC=y
      CONFIG_PM=y
      CONFIG_KEXEC_JUMP=y
      
      2. Build patched kexec-tool or download the pre-built one.
      
      3. Build some physical mode executable named such as "phy_mode"
      
      4. Boot kernel compiled in step 1.
      
      5. Load physical mode executable with /sbin/kexec. The shell command
         line can be as follow:
      
         /sbin/kexec --load-preserve-context --args-none phy_mode
      
      6. Call physical mode executable with following shell command line:
      
         /sbin/kexec -e
      
      Implementation point:
      
      To support jumping without reserving memory.  One shadow backup page (source
      page) is allocated for each page used by kexeced code image (destination
      page).  When do kexec_load, the image of kexeced code is loaded into source
      pages, and before executing, the destination pages and the source pages are
      swapped, so the contents of destination pages are backupped.  Before jumping
      to the kexeced code image and after jumping back to the original kernel, the
      destination pages and the source pages are swapped too.
      
      C ABI (calling convention) is used as communication protocol between
      kernel and called code.
      
      A flag named KEXEC_PRESERVE_CONTEXT for sys_kexec_load is added to
      indicate that the loaded kernel image is used for jumping back.
      
      Now, only the i386 architecture is supported.
      Signed-off-by: NHuang Ying <ying.huang@intel.com>
      Acked-by: NVivek Goyal <vgoyal@redhat.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Nigel Cunningham <nigel@nigel.suspend2.net>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3ab83521
  2. 26 7月, 2008 3 次提交
    • I
      x86, RDC321x: add to mach-default · 1f972768
      Ingo Molnar 提交于
      first step to add RDC321x support to the default PC architecture.
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      1f972768
    • M
      gpiolib: allow user-selection · 7444a72e
      Michael Buesch 提交于
      This patch adds functionality to the gpio-lib subsystem to make it
      possible to enable the gpio-lib code even if the architecture code didn't
      request to get it built in.
      
      The archtitecture code does still need to implement the gpiolib accessor
      functions in its asm/gpio.h file.  This patch adds the implementations for
      x86 and PPC.
      
      With these changes it is possible to run generic GPIO expansion cards on
      every architecture that implements the trivial wrapper functions.  Support
      for more architectures can easily be added.
      Signed-off-by: NMichael Buesch <mb@bu3sch.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: David Brownell <david-b@pacbell.net>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Jean Delvare <khali@linux-fr.org>
      Cc: Samuel Ortiz <sameo@openedhand.com>
      Cc: Kumar Gala <galak@gate.crashing.org>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Adrian Bunk <bunk@stusta.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7444a72e
    • J
      introduce HAVE_EFFICIENT_UNALIGNED_ACCESS Kconfig symbol · 58340a07
      Johannes Berg 提交于
      In many cases, especially in networking, it can be beneficial to know at
      compile time whether the architecture can do unaligned accesses efficiently.
      This patch introduces a new Kconfig symbol
      
      	HAVE_EFFICIENT_UNALIGNED_ACCESS
      
      for that purpose and adds it to the powerpc and x86 architectures.  Also add
      some documentation about alignment and networking, and especially one intended
      use of this symbol.
      Signed-off-by: NJohannes Berg <johannes@sipsolutions.net>
      Acked-by: NSam Ravnborg <sam@ravnborg.org>
      Acked-by: Ingo Molnar <mingo@elte.hu> [x86 architecture part]
      Cc: <linux-arch@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      58340a07
  3. 25 7月, 2008 1 次提交
  4. 18 7月, 2008 1 次提交
  5. 15 7月, 2008 1 次提交
  6. 11 7月, 2008 7 次提交
  7. 10 7月, 2008 2 次提交
  8. 08 7月, 2008 11 次提交
  9. 04 7月, 2008 1 次提交
  10. 01 7月, 2008 1 次提交
  11. 30 6月, 2008 1 次提交
  12. 27 6月, 2008 3 次提交
    • I
      x86, AMD IOMMU: build fix #3 · 07c40e8a
      Ingo Molnar 提交于
      fix typo causing:
      
      arch/x86/kernel/built-in.o: In function `__unmap_single':
      amd_iommu.c:(.text+0x17771): undefined reference to `iommu_area_free'
      arch/x86/kernel/built-in.o: In function `__map_single':
      amd_iommu.c:(.text+0x1797a): undefined reference to `iommu_area_alloc'
      amd_iommu.c:(.text+0x179a2): undefined reference to `iommu_area_alloc'
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      07c40e8a
    • I
      x86, AMD IOMMU: build fix · 24d2ba0a
      Ingo Molnar 提交于
      fix:
      
       arch/x86/kernel/amd_iommu_init.c:247: warning: 'struct acpi_table_header' declared inside parameter list
       arch/x86/kernel/amd_iommu_init.c:247: warning: its scope is only this definition or declaration, which is probably not what you want
       arch/x86/kernel/amd_iommu_init.c: In function 'find_last_devid_acpi':
       arch/x86/kernel/amd_iommu_init.c:257: error: dereferencing pointer to incomplete type
       arch/x86/kernel/amd_iommu_init.c:265: error: dereferencing pointer to incomplete type
      
      the AMD IOMMU code depends on ACPI facilities.
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      24d2ba0a
    • J
      x86, AMD IOMMU: add Kconfig entry · 2b188723
      Joerg Roedel 提交于
      This patch adds the Kconfig entry for the AMD IOMMU driver.
      Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com>
      Cc: iommu@lists.linux-foundation.org
      Cc: bhavna.sarathy@amd.com
      Cc: Sebastian.Biemueller@amd.com
      Cc: robert.richter@amd.com
      Cc: joro@8bytes.org
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2b188723
  13. 26 6月, 2008 1 次提交
  14. 25 6月, 2008 2 次提交
  15. 10 6月, 2008 1 次提交
  16. 06 6月, 2008 1 次提交
    • A
      PCI/x86: fix up PCI stuff so that PCI_GOANY supports OLPC · 2bdd1b03
      Andres Salomon 提交于
      Previously, one would have to specifically choose CONFIG_OLPC and
      CONFIG_PCI_GOOLPC in order to enable PCI_OLPC.  That doesn't really work
      for distro kernels, so this patch allows one to choose CONFIG_OLPC and
      CONFIG_PCI_GOANY in order to build in OLPC support in a generic kernel (as
      requested by Robert Millan).
      
      This also moves GOOLPC before GOANY in the menuconfig list.
      
      Finally, make pci_access_init return early if we detect OLPC hardware.
      There's no need to continue probing stuff, and pci_pcbios_init
      specifically trashes our settings (we didn't run into that before because
      PCI_GOANY wasn't supported).
      Signed-off-by: NAndres Salomon <dilinger@debian.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      2bdd1b03