1. 27 7月, 2008 8 次提交
    • N
      x86: lockless get_user_pages_fast() · 8174c430
      Nick Piggin 提交于
      Implement get_user_pages_fast without locking in the fastpath on x86.
      
      Do an optimistic lockless pagetable walk, without taking mmap_sem or any
      page table locks or even mmap_sem.  Page table existence is guaranteed by
      turning interrupts off (combined with the fact that we're always looking
      up the current mm, means we can do the lockless page table walk within the
      constraints of the TLB shootdown design).  Basically we can do this
      lockless pagetable walk in a similar manner to the way the CPU's pagetable
      walker does not have to take any locks to find present ptes.
      
      This patch (combined with the subsequent ones to convert direct IO to use
      it) was found to give about 10% performance improvement on a 2 socket 8
      core Intel Xeon system running an OLTP workload on DB2 v9.5
      
       "To test the effects of the patch, an OLTP workload was run on an IBM
        x3850 M2 server with 2 processors (quad-core Intel Xeon processors at
        2.93 GHz) using IBM DB2 v9.5 running Linux 2.6.24rc7 kernel.  Comparing
        runs with and without the patch resulted in an overall performance
        benefit of ~9.8%.  Correspondingly, oprofiles showed that samples from
        __up_read and __down_read routines that is seen during thread contention
        for system resources was reduced from 2.8% down to .05%.  Monitoring the
        /proc/vmstat output from the patched run showed that the counter for
        fast_gup contained a very high number while the fast_gup_slow value was
        zero."
      
      (fast_gup is the old name for get_user_pages_fast, fast_gup_slow is a
      counter we had for the number of times the slowpath was invoked).
      
      The main reason for the improvement is that DB2 has multiple threads each
      issuing direct-IO.  Direct-IO uses get_user_pages, and thus the threads
      contend the mmap_sem cacheline, and can also contend on page table locks.
      
      I would anticipate larger performance gains on larger systems, however I
      think DB2 uses an adaptive mix of threads and processes, so it could be
      that thread contention remains pretty constant as machine size increases.
      In which case, we stuck with "only" a 10% gain.
      
      The downside of using get_user_pages_fast is that if there is not a pte
      with the correct permissions for the access, we end up falling back to
      get_user_pages and so the get_user_pages_fast is a bit of extra work.
      However this should not be the common case in most performance critical
      code.
      
      [akpm@linux-foundation.org: coding-style fixes]
      [akpm@linux-foundation.org: build fix]
      [akpm@linux-foundation.org: Kconfig fix]
      [akpm@linux-foundation.org: Makefile fix/cleanup]
      [akpm@linux-foundation.org: warning fix]
      Signed-off-by: NNick Piggin <npiggin@suse.de>
      Cc: Dave Kleikamp <shaggy@austin.ibm.com>
      Cc: Andy Whitcroft <apw@shadowen.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Dave Kleikamp <shaggy@austin.ibm.com>
      Cc: Badari Pulavarty <pbadari@us.ibm.com>
      Cc: Zach Brown <zach.brown@oracle.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Reviewed-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8174c430
    • H
    • H
    • H
      cris: use the common ascii hex helpers · 42a9a583
      Harvey Harrison 提交于
      Signed-off-by: NHarvey Harrison <harvey.harrison@gmail.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      42a9a583
    • H
      kexec jump: save/restore device state · 89081d17
      Huang Ying 提交于
      This patch implements devices state save/restore before after kexec.
      
      This patch together with features in kexec_jump patch can be used for
      following:
      
      - A simple hibernation implementation without ACPI support.  You can kexec a
        hibernating kernel, save the memory image of original system and shutdown
        the system.  When resuming, you restore the memory image of original system
        via ordinary kexec load then jump back.
      
      - Kernel/system debug through making system snapshot.  You can make system
        snapshot, jump back, do some thing and make another system snapshot.
      
      - Cooperative multi-kernel/system.  With kexec jump, you can switch between
        several kernels/systems quickly without boot process except the first time.
        This appears like swap a whole kernel/system out/in.
      
      - A general method to call program in physical mode (paging turning
        off). This can be used to invoke BIOS code under Linux.
      
      The following user-space tools can be used with kexec jump:
      
      - kexec-tools needs to be patched to support kexec jump. The patches
        and the precompiled kexec can be download from the following URL:
             source: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-src_git_kh10.tar.bz2
             patches: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-patches_git_kh10.tar.bz2
             binary: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec_git_kh10
      
      - makedumpfile with patches are used as memory image saving tool, it
        can exclude free pages from original kernel memory image file. The
        patches and the precompiled makedumpfile can be download from the
        following URL:
             source: http://khibernation.sourceforge.net/download/release_v10/makedumpfile/makedumpfile-src_cvs_kh10.tar.bz2
             patches: http://khibernation.sourceforge.net/download/release_v10/makedumpfile/makedumpfile-patches_cvs_kh10.tar.bz2
             binary: http://khibernation.sourceforge.net/download/release_v10/makedumpfile/makedumpfile_cvs_kh10
      
      - An initramfs image can be used as the root file system of kexeced
        kernel. An initramfs image built with "BuildRoot" can be downloaded
        from the following URL:
             initramfs image: http://khibernation.sourceforge.net/download/release_v10/initramfs/rootfs_cvs_kh10.gz
        All user space tools above are included in the initramfs image.
      
      Usage example of simple hibernation:
      
      1. Compile and install patched kernel with following options selected:
      
      CONFIG_X86_32=y
      CONFIG_RELOCATABLE=y
      CONFIG_KEXEC=y
      CONFIG_CRASH_DUMP=y
      CONFIG_PM=y
      CONFIG_HIBERNATION=y
      CONFIG_KEXEC_JUMP=y
      
      2. Build an initramfs image contains kexec-tool and makedumpfile, or
         download the pre-built initramfs image, called rootfs.gz in
         following text.
      
      3. Prepare a partition to save memory image of original kernel, called
         hibernating partition in following text.
      
      4. Boot kernel compiled in step 1 (kernel A).
      
      5. In the kernel A, load kernel compiled in step 1 (kernel B) with
         /sbin/kexec. The shell command line can be as follow:
      
         /sbin/kexec --load-preserve-context /boot/bzImage --mem-min=0x100000
           --mem-max=0xffffff --initrd=rootfs.gz
      
      6. Boot the kernel B with following shell command line:
      
         /sbin/kexec -e
      
      7. The kernel B will boot as normal kexec. In kernel B the memory
         image of kernel A can be saved into hibernating partition as
         follow:
      
         jump_back_entry=`cat /proc/cmdline | tr ' ' '\n' | grep kexec_jump_back_entry | cut -d '='`
         echo $jump_back_entry > kexec_jump_back_entry
         cp /proc/vmcore dump.elf
      
         Then you can shutdown the machine as normal.
      
      8. Boot kernel compiled in step 1 (kernel C). Use the rootfs.gz as
         root file system.
      
      9. In kernel C, load the memory image of kernel A as follow:
      
         /sbin/kexec -l --args-none --entry=`cat kexec_jump_back_entry` dump.elf
      
      10. Jump back to the kernel A as follow:
      
         /sbin/kexec -e
      
         Then, kernel A is resumed.
      
      Implementation point:
      
      To support jumping between two kernels, before jumping to (executing)
      the new kernel and jumping back to the original kernel, the devices
      are put into quiescent state, and the state of devices and CPU is
      saved. After jumping back from kexeced kernel and jumping to the new
      kernel, the state of devices and CPU are restored accordingly. The
      devices/CPU state save/restore code of software suspend is called to
      implement corresponding function.
      
      Known issues:
      
      - Because the segment number supported by sys_kexec_load is limited,
        hibernation image with many segments may not be load. This is
        planned to be eliminated by adding a new flag to sys_kexec_load to
        make a image can be loaded with multiple sys_kexec_load invoking.
      
      Now, only the i386 architecture is supported.
      Signed-off-by: NHuang Ying <ying.huang@intel.com>
      Acked-by: NVivek Goyal <vgoyal@redhat.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Nigel Cunningham <nigel@nigel.suspend2.net>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      89081d17
    • H
      kexec jump · 3ab83521
      Huang Ying 提交于
      This patch provides an enhancement to kexec/kdump.  It implements the
      following features:
      
      - Backup/restore memory used by the original kernel before/after
        kexec.
      
      - Save/restore CPU state before/after kexec.
      
      The features of this patch can be used as a general method to call program in
      physical mode (paging turning off).  This can be used to call BIOS code under
      Linux.
      
      kexec-tools needs to be patched to support kexec jump. The patches and
      the precompiled kexec can be download from the following URL:
      
             source: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-src_git_kh10.tar.bz2
             patches: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-patches_git_kh10.tar.bz2
             binary: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec_git_kh10
      
      Usage example of calling some physical mode code and return:
      
      1. Compile and install patched kernel with following options selected:
      
      CONFIG_X86_32=y
      CONFIG_KEXEC=y
      CONFIG_PM=y
      CONFIG_KEXEC_JUMP=y
      
      2. Build patched kexec-tool or download the pre-built one.
      
      3. Build some physical mode executable named such as "phy_mode"
      
      4. Boot kernel compiled in step 1.
      
      5. Load physical mode executable with /sbin/kexec. The shell command
         line can be as follow:
      
         /sbin/kexec --load-preserve-context --args-none phy_mode
      
      6. Call physical mode executable with following shell command line:
      
         /sbin/kexec -e
      
      Implementation point:
      
      To support jumping without reserving memory.  One shadow backup page (source
      page) is allocated for each page used by kexeced code image (destination
      page).  When do kexec_load, the image of kexeced code is loaded into source
      pages, and before executing, the destination pages and the source pages are
      swapped, so the contents of destination pages are backupped.  Before jumping
      to the kexeced code image and after jumping back to the original kernel, the
      destination pages and the source pages are swapped too.
      
      C ABI (calling convention) is used as communication protocol between
      kernel and called code.
      
      A flag named KEXEC_PRESERVE_CONTEXT for sys_kexec_load is added to
      indicate that the loaded kernel image is used for jumping back.
      
      Now, only the i386 architecture is supported.
      Signed-off-by: NHuang Ying <ying.huang@intel.com>
      Acked-by: NVivek Goyal <vgoyal@redhat.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Nigel Cunningham <nigel@nigel.suspend2.net>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3ab83521
    • A
      x86 calgary: fix handling of devices that aren't behind the Calgary · 1956a96d
      Alexis Bruemmer 提交于
      The calgary code can give drivers addresses above 4GB which is very bad
      for hardware that is only 32bit DMA addressable.
      
      With this patch, the calgary code sets the global dma_ops to swiotlb or
      nommu properly, and the dma_ops of devices behind the Calgary/CalIOC2
      to calgary_dma_ops.  So the calgary code can handle devices safely that
      aren't behind the Calgary/CalIOC2.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: NAlexis Bruemmer <alexisb@us.ibm.com>
      Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Muli Ben-Yehuda <muli@il.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1956a96d
    • F
      dma-mapping: add the device argument to dma_mapping_error() · 8d8bb39b
      FUJITA Tomonori 提交于
      Add per-device dma_mapping_ops support for CONFIG_X86_64 as POWER
      architecture does:
      
      This enables us to cleanly fix the Calgary IOMMU issue that some devices
      are not behind the IOMMU (http://lkml.org/lkml/2008/5/8/423).
      
      I think that per-device dma_mapping_ops support would be also helpful for
      KVM people to support PCI passthrough but Andi thinks that this makes it
      difficult to support the PCI passthrough (see the above thread).  So I
      CC'ed this to KVM camp.  Comments are appreciated.
      
      A pointer to dma_mapping_ops to struct dev_archdata is added.  If the
      pointer is non NULL, DMA operations in asm/dma-mapping.h use it.  If it's
      NULL, the system-wide dma_ops pointer is used as before.
      
      If it's useful for KVM people, I plan to implement a mechanism to register
      a hook called when a new pci (or dma capable) device is created (it works
      with hot plugging).  It enables IOMMUs to set up an appropriate
      dma_mapping_ops per device.
      
      The major obstacle is that dma_mapping_error doesn't take a pointer to the
      device unlike other DMA operations.  So x86 can't have dma_mapping_ops per
      device.  Note all the POWER IOMMUs use the same dma_mapping_error function
      so this is not a problem for POWER but x86 IOMMUs use different
      dma_mapping_error functions.
      
      The first patch adds the device argument to dma_mapping_error.  The patch
      is trivial but large since it touches lots of drivers and dma-mapping.h in
      all the architecture.
      
      This patch:
      
      dma_mapping_error() doesn't take a pointer to the device unlike other DMA
      operations.  So we can't have dma_mapping_ops per device.
      
      Note that POWER already has dma_mapping_ops per device but all the POWER
      IOMMUs use the same dma_mapping_error function.  x86 IOMMUs use device
      argument.
      
      [akpm@linux-foundation.org: fix sge]
      [akpm@linux-foundation.org: fix svc_rdma]
      [akpm@linux-foundation.org: build fix]
      [akpm@linux-foundation.org: fix bnx2x]
      [akpm@linux-foundation.org: fix s2io]
      [akpm@linux-foundation.org: fix pasemi_mac]
      [akpm@linux-foundation.org: fix sdhci]
      [akpm@linux-foundation.org: build fix]
      [akpm@linux-foundation.org: fix sparc]
      [akpm@linux-foundation.org: fix ibmvscsi]
      Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Muli Ben-Yehuda <muli@il.ibm.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Avi Kivity <avi@qumranet.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8d8bb39b
  2. 26 7月, 2008 13 次提交
    • R
      x86_64: fix ia32 AMD syscall audit fast-path · 024e8ac0
      Roland McGrath 提交于
      The new code in commit 5cbf1565
      has a bug in the version supporting the AMD 'syscall' instruction.
      It clobbers the user's %ecx register value (with the %ebp value).
      
      This change fixes it.
      Signed-off-by: NRoland McGrath <roland@redhat.com>
      024e8ac0
    • N
      powerpc: Fix boot problem due to AT_BASE_PLATFORM change · fc532f81
      Nathan Lynch 提交于
      Commit 9115d134 ("powerpc: Enable
      AT_BASE_PLATFORM aux vector") broke boot on 32-bit powerpc systems; we
      have to use PTRRELOC to initialize powerpc_base_platform this early in
      boot.
      
      Bug reported by Jon Smirl.
      Signed-off-by: NNathan Lynch <ntl@pobox.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      fc532f81
    • D
      sparc: Wire up new system calls. · f1373da8
      David S. Miller 提交于
      This wires up the recently added Wire up signalfd4, eventfd2,
      epoll_create1, dup3, pipe2, and inotify_init1 system calls.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f1373da8
    • A
      pty: remove unused UNIX98_PTY_COUNT options · 7833351b
      Adrian Bunk 提交于
      The h8300 and sparc options somehow survived when the code stopped using
      CONFIG_UNIX98_PTY_COUNT.
      Reviewed-by: NRobert P. J. Day <rpjday@crashcourse.ca>
      Signed-off-by: NAdrian Bunk <bunk@kernel.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7833351b
    • C
      calgary iommu: use the first kernels TCE tables in kdump · 95b68dec
      Chandru 提交于
      kdump kernel fails to boot with calgary iommu and aacraid driver on a x366
      box.  The ongoing dma's of aacraid from the first kernel continue to exist
      until the driver is loaded in the kdump kernel.  Calgary is initialized
      prior to aacraid and creation of new tce tables causes wrong dma's to
      occur.  Here we try to get the tce tables of the first kernel in kdump
      kernel and use them.  While in the kdump kernel we do not allocate new tce
      tables but instead read the base address register contents of calgary
      iommu and use the tables that the registers point to.  With these changes
      the kdump kernel and hence aacraid now boots normally.
      Signed-off-by: NChandru Siddalingappa <chandru@in.ibm.com>
      Acked-by: NMuli Ben-Yehuda <muli@il.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      95b68dec
    • O
      S390 topology: don't use kthread() for arch_reinit_sched_domains() · 69b895fd
      Oleg Nesterov 提交于
      Now that it is safe to use get_online_cpus() we can revert
      
      	[S390] cpu topology: Fix possible deadlock.
      	commit: fd781fa2
      
      and call arch_reinit_sched_domains() directly from topology_work_fn().
      Signed-off-by: NOleg Nesterov <oleg@tv-sign.ru>
      Cc: Gautham R Shenoy <ego@in.ibm.com>
      Tested-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Max Krasnyansky <maxk@qualcomm.com>
      Cc: Paul Jackson <pj@sgi.com>
      Cc: Paul Menage <menage@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Vegard Nossum <vegard.nossum@gmail.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      69b895fd
    • A
      remove unused #include <linux/dirent.h>'s · e8938a62
      Adrian Bunk 提交于
      Remove some unused #include <linux/dirent.h>'s.
      Signed-off-by: NAdrian Bunk <bunk@kernel.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e8938a62
    • M
      gpiolib: allow user-selection · 7444a72e
      Michael Buesch 提交于
      This patch adds functionality to the gpio-lib subsystem to make it
      possible to enable the gpio-lib code even if the architecture code didn't
      request to get it built in.
      
      The archtitecture code does still need to implement the gpiolib accessor
      functions in its asm/gpio.h file.  This patch adds the implementations for
      x86 and PPC.
      
      With these changes it is possible to run generic GPIO expansion cards on
      every architecture that implements the trivial wrapper functions.  Support
      for more architectures can easily be added.
      Signed-off-by: NMichael Buesch <mb@bu3sch.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: David Brownell <david-b@pacbell.net>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Jean Delvare <khali@linux-fr.org>
      Cc: Samuel Ortiz <sameo@openedhand.com>
      Cc: Kumar Gala <galak@gate.crashing.org>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Adrian Bunk <bunk@stusta.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7444a72e
    • D
      gpio: sysfs interface · d8f388d8
      David Brownell 提交于
      This adds a simple sysfs interface for GPIOs.
      
          /sys/class/gpio
          	/export ... asks the kernel to export a GPIO to userspace
          	/unexport ... to return a GPIO to the kernel
              /gpioN ... for each exported GPIO #N
      	    /value ... always readable, writes fail for input GPIOs
      	    /direction ... r/w as: in, out (default low); write high, low
      	/gpiochipN ... for each gpiochip; #N is its first GPIO
      	    /base ... (r/o) same as N
      	    /label ... (r/o) descriptive, not necessarily unique
      	    /ngpio ... (r/o) number of GPIOs; numbered N .. N+(ngpio - 1)
      
      GPIOs claimed by kernel code may be exported by its owner using a new
      gpio_export() call, which should be most useful for driver debugging.
      Such exports may optionally be done without a "direction" attribute.
      
      Userspace may ask to take over a GPIO by writing to a sysfs control file,
      helping to cope with incomplete board support or other "one-off"
      requirements that don't merit full kernel support:
      
        echo 23 > /sys/class/gpio/export
      	... will gpio_request(23, "sysfs") and gpio_export(23);
      	use /sys/class/gpio/gpio-23/direction to (re)configure it,
      	when that GPIO can be used as both input and output.
        echo 23 > /sys/class/gpio/unexport
      	... will gpio_free(23), when it was exported as above
      
      The extra D-space footprint is a few hundred bytes, except for the sysfs
      resources associated with each exported GPIO.  The additional I-space
      footprint is about two thirds of the current size of gpiolib (!).  Since
      no /dev node creation is involved, no "udev" support is needed.
      
      Related changes:
      
        * This adds a device pointer to "struct gpio_chip".  When GPIO
          providers initialize that, sysfs gpio class devices become children of
          that device instead of being "virtual" devices.
      
        * The (few) gpio_chip providers which have such a device node have
          been updated.
      
        * Some gpio_chip drivers also needed to update their module "owner"
          field ...  for which missing kerneldoc was added.
      
        * Some gpio_chips don't support input GPIOs.  Those GPIOs are now
          flagged appropriately when the chip is registered.
      
      Based on previous patches, and discussion both on and off LKML.
      
      A Documentation/ABI/testing/sysfs-gpio update is ready to submit once this
      merges to mainline.
      
      [akpm@linux-foundation.org: a few maintenance build fixes]
      Signed-off-by: NDavid Brownell <dbrownell@users.sourceforge.net>
      Cc: Guennadi Liakhovetski <g.liakhovetski@pengutronix.de>
      Cc: Greg KH <greg@kroah.com>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d8f388d8
    • S
      kprobes: improve kretprobe scalability with hashed locking · ef53d9c5
      Srinivasa D S 提交于
      Currently list of kretprobe instances are stored in kretprobe object (as
      used_instances,free_instances) and in kretprobe hash table.  We have one
      global kretprobe lock to serialise the access to these lists.  This causes
      only one kretprobe handler to execute at a time.  Hence affects system
      performance, particularly on SMP systems and when return probe is set on
      lot of functions (like on all systemcalls).
      
      Solution proposed here gives fine-grain locks that performs better on SMP
      system compared to present kretprobe implementation.
      
      Solution:
      
       1) Instead of having one global lock to protect kretprobe instances
          present in kretprobe object and kretprobe hash table.  We will have
          two locks, one lock for protecting kretprobe hash table and another
          lock for kretporbe object.
      
       2) We hold lock present in kretprobe object while we modify kretprobe
          instance in kretprobe object and we hold per-hash-list lock while
          modifying kretprobe instances present in that hash list.  To prevent
          deadlock, we never grab a per-hash-list lock while holding a kretprobe
          lock.
      
       3) We can remove used_instances from struct kretprobe, as we can
          track used instances of kretprobe instances using kretprobe hash
          table.
      
      Time duration for kernel compilation ("make -j 8") on a 8-way ppc64 system
      with return probes set on all systemcalls looks like this.
      
      cacheline              non-cacheline             Un-patched kernel
      aligned patch 	       aligned patch
      ===============================================================================
      real    9m46.784s       9m54.412s                  10m2.450s
      user    40m5.715s       40m7.142s                  40m4.273s
      sys     2m57.754s       2m58.583s                  3m17.430s
      ===========================================================
      
      Time duration for kernel compilation ("make -j 8) on the same system, when
      kernel is not probed.
      =========================
      real    9m26.389s
      user    40m8.775s
      sys     2m7.283s
      =========================
      Signed-off-by: NSrinivasa DS <srinivasa@in.ibm.com>
      Signed-off-by: NJim Keniston <jkenisto@us.ibm.com>
      Acked-by: NAnanth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ef53d9c5
    • T
      inflate: refactor inflate malloc code · 2d6ffcca
      Thomas Petazzoni 提交于
      Inflate requires some dynamic memory allocation very early in the boot
      process and this is provided with a set of four functions:
      malloc/free/gzip_mark/gzip_release.
      
      The old inflate code used a mark/release strategy rather than implement
      free.  This new version instead keeps a count on the number of outstanding
      allocations and when it hits zero, it resets the malloc arena.
      
      This allows removing all the mark and release implementations and unifying
      all the malloc/free implementations.
      
      The architecture-dependent code must define two addresses:
       - free_mem_ptr, the address of the beginning of the area in which
         allocations should be made
       - free_mem_end_ptr, the address of the end of the area in which
         allocations should be made. If set to 0, then no check is made on
         the number of allocations, it just grows as much as needed
      
      The architecture-dependent code can also provide an arch_decomp_wdog()
      function call.  This function will be called several times during the
      decompression process, and allow to notify the watchdog that the system is
      still running.  If an architecture provides such a call, then it must
      define ARCH_HAS_DECOMP_WDOG so that the generic inflate code calls
      arch_decomp_wdog().
      
      Work initially done by Matt Mackall, updated to a recent version of the
      kernel and improved by me.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: NThomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Cc: Matt Mackall <mpm@selenic.com>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Mikael Starvik <mikael.starvik@axis.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Acked-by: NPaul Mundt <lethal@linux-sh.org>
      Acked-by: NYoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2d6ffcca
    • J
      introduce HAVE_EFFICIENT_UNALIGNED_ACCESS Kconfig symbol · 58340a07
      Johannes Berg 提交于
      In many cases, especially in networking, it can be beneficial to know at
      compile time whether the architecture can do unaligned accesses efficiently.
      This patch introduces a new Kconfig symbol
      
      	HAVE_EFFICIENT_UNALIGNED_ACCESS
      
      for that purpose and adds it to the powerpc and x86 architectures.  Also add
      some documentation about alignment and networking, and especially one intended
      use of this symbol.
      Signed-off-by: NJohannes Berg <johannes@sipsolutions.net>
      Acked-by: NSam Ravnborg <sam@ravnborg.org>
      Acked-by: Ingo Molnar <mingo@elte.hu> [x86 architecture part]
      Cc: <linux-arch@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      58340a07
    • T
      [IA64] Wire up new system calls · 3e4d0cab
      Tony Luck 提交于
      Six new system calls: signalfd4, eventfd2, epoll_create1,
      dup3, pipe2 and inotify_init1.
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      3e4d0cab
  3. 25 7月, 2008 19 次提交