1. 27 Jul, 2008 · 23 commits
  2. 26 Jul, 2008 · 13 commits
    • x86_64: fix ia32 AMD syscall audit fast-path · 024e8ac0
      Roland McGrath authored
      The new code in commit 5cbf1565
      has a bug in the version supporting the AMD 'syscall' instruction.
      It clobbers the user's %ecx register value (with the %ebp value).
      
      This change fixes it.
      Signed-off-by: Roland McGrath <roland@redhat.com>
    • powerpc: Fix boot problem due to AT_BASE_PLATFORM change · fc532f81
      Nathan Lynch authored
      Commit 9115d134 ("powerpc: Enable
      AT_BASE_PLATFORM aux vector") broke boot on 32-bit powerpc systems; we
      have to use PTRRELOC to initialize powerpc_base_platform this early in
      boot.
      
      Bug reported by Jon Smirl.
      Signed-off-by: Nathan Lynch <ntl@pobox.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • sparc: Wire up new system calls. · f1373da8
      David S. Miller authored
      This wires up the recently added signalfd4, eventfd2,
      epoll_create1, dup3, pipe2, and inotify_init1 system calls.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • pty: remove unused UNIX98_PTY_COUNT options · 7833351b
      Adrian Bunk authored
      The h8300 and sparc options somehow survived when the code stopped using
      CONFIG_UNIX98_PTY_COUNT.
      Reviewed-by: Robert P. J. Day <rpjday@crashcourse.ca>
      Signed-off-by: Adrian Bunk <bunk@kernel.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • calgary iommu: use the first kernel's TCE tables in kdump · 95b68dec
      Chandru authored
      The kdump kernel fails to boot with the Calgary IOMMU and the aacraid
      driver on an x366 box.  DMAs started by aacraid in the first kernel
      remain in flight until the driver is loaded in the kdump kernel.
      Calgary is initialized before aacraid, and creating new TCE tables at
      that point causes wrong DMAs to occur.  To avoid this, the kdump kernel
      does not allocate new TCE tables; it reads the base address registers
      of the Calgary IOMMU and reuses the tables they point to, which are the
      first kernel's.  With these changes the kdump kernel, and hence
      aacraid, boots normally.
      Signed-off-by: Chandru Siddalingappa <chandru@in.ibm.com>
      Acked-by: Muli Ben-Yehuda <muli@il.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • S390 topology: don't use kthread() for arch_reinit_sched_domains() · 69b895fd
      Oleg Nesterov authored
      Now that it is safe to use get_online_cpus() we can revert
      
      	[S390] cpu topology: Fix possible deadlock.
      	commit: fd781fa2
      
      and call arch_reinit_sched_domains() directly from topology_work_fn().
      Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: Gautham R Shenoy <ego@in.ibm.com>
      Tested-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Max Krasnyansky <maxk@qualcomm.com>
      Cc: Paul Jackson <pj@sgi.com>
      Cc: Paul Menage <menage@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Vegard Nossum <vegard.nossum@gmail.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • remove unused #include <linux/dirent.h>'s · e8938a62
      Adrian Bunk authored
      Remove some unused #include <linux/dirent.h>'s.
      Signed-off-by: Adrian Bunk <bunk@kernel.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • gpiolib: allow user-selection · 7444a72e
      Michael Buesch authored
      This patch makes it possible to enable the gpiolib code even if the
      architecture code did not request that it be built in.

      The architecture code still needs to implement the gpiolib accessor
      functions in its asm/gpio.h file.  This patch adds the implementations for
      x86 and PPC.
      
      With these changes it is possible to run generic GPIO expansion cards on
      every architecture that implements the trivial wrapper functions.  Support
      for more architectures can easily be added.
      Signed-off-by: Michael Buesch <mb@bu3sch.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: David Brownell <david-b@pacbell.net>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Jean Delvare <khali@linux-fr.org>
      Cc: Samuel Ortiz <sameo@openedhand.com>
      Cc: Kumar Gala <galak@gate.crashing.org>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Adrian Bunk <bunk@stusta.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • gpio: sysfs interface · d8f388d8
      David Brownell authored
      This adds a simple sysfs interface for GPIOs.
      
          /sys/class/gpio
              /export ... asks the kernel to export a GPIO to userspace
              /unexport ... to return a GPIO to the kernel
              /gpioN ... for each exported GPIO #N
                  /value ... always readable, writes fail for input GPIOs
                  /direction ... r/w as: in, out (default low); write high, low
              /gpiochipN ... for each gpiochip; #N is its first GPIO
                  /base ... (r/o) same as N
                  /label ... (r/o) descriptive, not necessarily unique
                  /ngpio ... (r/o) number of GPIOs; numbered N .. N+(ngpio - 1)
      
      GPIOs claimed by kernel code may be exported by its owner using a new
      gpio_export() call, which should be most useful for driver debugging.
      Such exports may optionally be done without a "direction" attribute.
      
      Userspace may ask to take over a GPIO by writing to a sysfs control file,
      helping to cope with incomplete board support or other "one-off"
      requirements that don't merit full kernel support:
      
        echo 23 > /sys/class/gpio/export
            ... will gpio_request(23, "sysfs") and gpio_export(23);
            use /sys/class/gpio/gpio23/direction to (re)configure it,
            when that GPIO can be used as both input and output.
        echo 23 > /sys/class/gpio/unexport
            ... will gpio_free(23), when it was exported as above
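
The same sequence can be driven from a C program. The following is a minimal userspace sketch, assuming the `gpioN` directory naming from the tree above; the helper names (`gpio_attr_path`, `sysfs_write`, `gpio_export_as_output`) are hypothetical and are not part of gpiolib or any libc.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Root of the GPIO class directory described in the commit message. */
#define GPIO_SYSFS_ROOT "/sys/class/gpio"

/*
 * Build the path of one attribute ("value" or "direction") of an exported
 * GPIO.  Returns 0 on success, -1 if the buffer is too small.
 * Hypothetical helper, not a kernel or libc API.
 */
int gpio_attr_path(char *buf, size_t len, unsigned gpio, const char *attr)
{
    assert(buf != NULL && attr != NULL);
    int n = snprintf(buf, len, "%s/gpio%u/%s", GPIO_SYSFS_ROOT, gpio, attr);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Write a short string to a sysfs file; returns 0 on success, -1 on error. */
int sysfs_write(const char *path, const char *text)
{
    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;
    int ok = fputs(text, f) >= 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Export a GPIO to userspace, then configure it as an output (default low). */
int gpio_export_as_output(unsigned gpio)
{
    char num[16], path[128];

    snprintf(num, sizeof(num), "%u", gpio);
    if (sysfs_write(GPIO_SYSFS_ROOT "/export", num) != 0)
        return -1;
    if (gpio_attr_path(path, sizeof(path), gpio, "direction") != 0)
        return -1;
    return sysfs_write(path, "out");
}
```

Calling `gpio_export_as_output(23)` mirrors the `echo 23 > /sys/class/gpio/export` example; it fails with -1 on a system without this interface or without write permission.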
      
      The extra D-space footprint is a few hundred bytes, except for the sysfs
      resources associated with each exported GPIO.  The additional I-space
      footprint is about two thirds of the current size of gpiolib (!).  Since
      no /dev node creation is involved, no "udev" support is needed.
      
      Related changes:
      
        * This adds a device pointer to "struct gpio_chip".  When GPIO
          providers initialize that, sysfs gpio class devices become children of
          that device instead of being "virtual" devices.
      
        * The (few) gpio_chip providers which have such a device node have
          been updated.
      
        * Some gpio_chip drivers also needed to update their module "owner"
          field ...  for which missing kerneldoc was added.
      
        * Some gpio_chips don't support input GPIOs.  Those GPIOs are now
          flagged appropriately when the chip is registered.
      
      Based on previous patches, and discussion both on and off LKML.
      
      A Documentation/ABI/testing/sysfs-gpio update is ready to submit once this
      merges to mainline.
      
      [akpm@linux-foundation.org: a few maintenance build fixes]
      Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
      Cc: Guennadi Liakhovetski <g.liakhovetski@pengutronix.de>
      Cc: Greg KH <greg@kroah.com>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kprobes: improve kretprobe scalability with hashed locking · ef53d9c5
      Srinivasa D S authored
      Currently the list of kretprobe instances is stored in the kretprobe
      object (as used_instances and free_instances) and in the kretprobe hash
      table.  One global kretprobe lock serialises access to these lists, so
      only one kretprobe handler can execute at a time.  This hurts system
      performance, particularly on SMP systems and when return probes are set
      on many functions (for example on all system calls).

      The solution proposed here uses fine-grained locks that perform better
      on SMP systems than the present kretprobe implementation.
      
      Solution:

       1) Instead of one global lock protecting the kretprobe instances in
          both the kretprobe object and the kretprobe hash table, we have
          two locks: one protecting the kretprobe hash table and another
          for the kretprobe object.

       2) We hold the lock in the kretprobe object while we modify the
          kretprobe instances in that object, and we hold the per-hash-list
          lock while modifying the kretprobe instances in that hash list.
          To prevent deadlock, we never grab a per-hash-list lock while
          holding a kretprobe lock.

       3) We can remove used_instances from struct kretprobe, as used
          instances can be tracked through the kretprobe hash table.
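
The per-hash-list locking idea can be illustrated outside the kernel. The following is a minimal userspace sketch, not the kprobes code: the spinlock is built on a C11 `atomic_flag`, and every name in it (`struct bucket`, `instance_add`, `instances_take`, the table size and the hash) is an assumption made for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_BITS 6
#define TABLE_SIZE (1u << TABLE_BITS)

/* Minimal spinlock built on a C11 atomic flag. */
typedef struct { atomic_flag held; } spinlock_t;

void spin_lock(spinlock_t *l)
{
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ; /* busy-wait until the holder clears the flag */
}

void spin_unlock(spinlock_t *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* One in-flight return-probe instance, keyed by the task that hit the probe. */
struct instance {
    struct instance *next;
    uintptr_t task;
};

/* Each hash list carries its own lock instead of sharing one global lock. */
struct bucket {
    spinlock_t lock;
    struct instance *head;
};

struct bucket table[TABLE_SIZE];

void table_init(void)
{
    for (size_t i = 0; i < TABLE_SIZE; i++) {
        atomic_flag_clear(&table[i].lock.held);
        table[i].head = NULL;
    }
}

/* Map a task key to its bucket (multiplicative hash, top bits). */
struct bucket *bucket_for(uintptr_t task)
{
    uint32_t h = (uint32_t)task * 2654435761u;
    return &table[h >> (32 - TABLE_BITS)];
}

/* Add an instance while holding only the lock of the list it hashes to. */
void instance_add(struct instance *ri)
{
    assert(ri != NULL);
    struct bucket *b = bucket_for(ri->task);
    spin_lock(&b->lock);
    ri->next = b->head;
    b->head = ri;
    spin_unlock(&b->lock);
}

/* Detach and return every instance of one task, again under one list lock. */
struct instance *instances_take(uintptr_t task)
{
    struct bucket *b = bucket_for(task);
    struct instance *taken = NULL, **pp;

    spin_lock(&b->lock);
    pp = &b->head;
    while (*pp != NULL) {
        struct instance *ri = *pp;
        if (ri->task == task) {
            *pp = ri->next;      /* unlink from the hash list */
            ri->next = taken;    /* push onto the private result list */
            taken = ri;
        } else {
            pp = &ri->next;
        }
    }
    spin_unlock(&b->lock);
    return taken;
}
```

Two handlers whose tasks hash to different buckets never contend in this sketch, which is the effect the commit message is after.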
      
      Time for a kernel compilation ("make -j 8") on an 8-way ppc64 system
      with return probes set on all system calls:

              cacheline-        non-cacheline-    unpatched
              aligned patch     aligned patch     kernel
      ==========================================================
      real    9m46.784s         9m54.412s         10m2.450s
      user    40m5.715s         40m7.142s         40m4.273s
      sys     2m57.754s         2m58.583s         3m17.430s
      ==========================================================

      Time for a kernel compilation ("make -j 8") on the same system when
      the kernel is not probed:

      =========================
      real    9m26.389s
      user    40m8.775s
      sys     2m7.283s
      =========================
      Signed-off-by: Srinivasa DS <srinivasa@in.ibm.com>
      Signed-off-by: Jim Keniston <jkenisto@us.ibm.com>
      Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • inflate: refactor inflate malloc code · 2d6ffcca
      Thomas Petazzoni authored
      Inflate requires some dynamic memory allocation very early in the boot
      process and this is provided with a set of four functions:
      malloc/free/gzip_mark/gzip_release.
      
      The old inflate code used a mark/release strategy rather than implement
      free.  This new version instead keeps a count on the number of outstanding
      allocations and when it hits zero, it resets the malloc arena.
      
      This allows removing all the mark and release implementations and unifying
      all the malloc/free implementations.
      
      The architecture-dependent code must define two addresses:
       - free_mem_ptr, the address of the beginning of the area in which
         allocations should be made
       - free_mem_end_ptr, the address of the end of the area in which
         allocations should be made. If set to 0, no check is made on
         the number of allocations; the area just grows as much as needed
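
The counting strategy described above can be sketched in a few lines. This is a minimal userspace illustration under stated assumptions, not the kernel's inflate code: `arena_malloc` and `arena_free` are hypothetical names chosen to avoid clashing with libc, and the 4-byte alignment is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bounds of the arena; in the commit these come from architecture code. */
uintptr_t free_mem_ptr;
uintptr_t free_mem_end_ptr;   /* 0 means "do not check the upper bound" */

static uintptr_t malloc_ptr;  /* next free byte; 0 means arena not started */
static int malloc_count;      /* number of outstanding allocations */

/* Bump allocator: hand out the next 4-byte-aligned chunk of the arena. */
void *arena_malloc(size_t size)
{
    uintptr_t p;

    if (malloc_ptr == 0)
        malloc_ptr = free_mem_ptr;
    malloc_ptr = (malloc_ptr + 3) & ~(uintptr_t)3;
    p = malloc_ptr;
    if (free_mem_end_ptr != 0 && p + size > free_mem_end_ptr)
        return NULL;              /* arena exhausted */
    malloc_ptr = p + size;
    malloc_count++;
    return (void *)p;
}

/* Blocks are never reused individually; the arena resets when all are freed. */
void arena_free(void *where)
{
    (void)where;
    assert(malloc_count > 0);
    if (--malloc_count == 0)
        malloc_ptr = free_mem_ptr;
}
```

The design trades memory for simplicity: space is only reclaimed when the outstanding count reaches zero, which suits a decompressor that allocates and releases its tables in bursts.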
      
      The architecture-dependent code can also provide an arch_decomp_wdog()
      function.  It is called several times during the decompression process
      so that the watchdog can be notified that the system is still running.
      An architecture that provides such a call must define
      ARCH_HAS_DECOMP_WDOG so that the generic inflate code calls
      arch_decomp_wdog().
      
      Work initially done by Matt Mackall, updated to a recent version of the
      kernel and improved by me.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Cc: Matt Mackall <mpm@selenic.com>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Mikael Starvik <mikael.starvik@axis.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Acked-by: Paul Mundt <lethal@linux-sh.org>
      Acked-by: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • introduce HAVE_EFFICIENT_UNALIGNED_ACCESS Kconfig symbol · 58340a07
      Johannes Berg authored
      In many cases, especially in networking, it can be beneficial to know at
      compile time whether the architecture can do unaligned accesses efficiently.
      This patch introduces a new Kconfig symbol
      
      	HAVE_EFFICIENT_UNALIGNED_ACCESS
      
      for that purpose and adds it to the powerpc and x86 architectures.  Also add
      some documentation about alignment and networking, and especially one intended
      use of this symbol.
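
One way such a symbol might be used is to pick between a word-wise and a byte-wise comparison at compile time. The following is a userspace sketch only: `mac_equal` is a hypothetical helper, the macro is spelled with the usual `CONFIG_` prefix that Kconfig symbols get in C code, and `memcpy` stands in for the direct unaligned loads that kernel code could use on such architectures.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Compare two 6-byte hardware addresses that may sit at any alignment.
 * Returns 1 if they are equal, 0 otherwise.
 */
int mac_equal(const unsigned char *a, const unsigned char *b)
{
    assert(a != NULL && b != NULL);
#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
    /* Unaligned loads are cheap: compare one 32-bit and one 16-bit word. */
    uint32_t a4, b4;
    uint16_t a2, b2;

    memcpy(&a4, a, sizeof(a4));
    memcpy(&b4, b, sizeof(b4));
    memcpy(&a2, a + 4, sizeof(a2));
    memcpy(&b2, b + 4, sizeof(b2));
    return ((a4 ^ b4) | (uint32_t)(a2 ^ b2)) == 0;
#else
    /* Unaligned loads may trap or be slow: fall back to byte accesses. */
    for (int i = 0; i < 6; i++)
        if (a[i] != b[i])
            return 0;
    return 1;
#endif
}
```

Both branches return the same result; only the access pattern differs, so callers do not need to know which one was compiled in.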
      Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
      Acked-by: Sam Ravnborg <sam@ravnborg.org>
      Acked-by: Ingo Molnar <mingo@elte.hu> [x86 architecture part]
      Cc: <linux-arch@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • [IA64] Wire up new system calls · 3e4d0cab
      Tony Luck authored
      Six new system calls: signalfd4, eventfd2, epoll_create1,
      dup3, pipe2 and inotify_init1.
      Signed-off-by: Tony Luck <tony.luck@intel.com>
  3. 25 Jul, 2008 · 4 commits
    • powerpc/pseries: Remove kmalloc call in handling writes to lparcfg · 16c14b46
      Nathan Fontenot authored
      There are only 4 valid name=value pairs for writes to
      /proc/ppc64/lparcfg.  Current code allocates a buffer to copy
      this information in from the user.  Since the longest name=value
      pair will easily fit into a buffer of 64 characters, simply
      put the buffer on the stack instead of allocating the buffer.
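
The pattern can be sketched in userspace as follows. This is an illustration under assumptions, not the lparcfg code: `parse_pair` is a hypothetical helper, `memcpy` stands in for the copy from user space, and only the 64-byte buffer size is taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LPARCFG_BUF_LEN 64   /* buffer size quoted in the commit message */

/*
 * Copy a "name=value" request into a fixed stack buffer and split it.
 * name and value must each point to LPARCFG_BUF_LEN bytes.
 * Returns 0 on success, -1 if the input is too long or has no '='.
 */
int parse_pair(const char *input, size_t count, char *name, char *value)
{
    char kbuf[LPARCFG_BUF_LEN];   /* on the stack: nothing to allocate or free */
    char *eq;

    assert(input != NULL && name != NULL && value != NULL);
    if (count >= sizeof(kbuf))
        return -1;                /* reject rather than truncate */
    memcpy(kbuf, input, count);
    kbuf[count] = '\0';

    eq = strchr(kbuf, '=');
    if (eq == NULL)
        return -1;
    *eq = '\0';
    strcpy(name, kbuf);           /* safe: source is shorter than the buffer */
    strcpy(value, eq + 1);
    return 0;
}
```

With a bounded stack buffer there is no allocation failure path and no buffer to release on the error returns.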
      Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/pseries: Update arch vector to indicate support for CMO · 8391e42a
      Nathan Fontenot authored
      Update the architecture vector to indicate that Cooperative Memory
      Overcommitment is supported if CONFIG_PPC_SMLPAR is set.
      Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
      Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
      Acked-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/pseries: Verify CMO memory entitlement updates with virtual I/O · 22e1a4dd
      Nathan Fontenot authored
      Verify memory entitlement updates can be handled by vio.
      Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
      Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
      Acked-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/pseries: vio bus support for CMO · a90ab95a
      Robert Jennings authored
      This is a large patch but the normal code path is not affected.  For
      non-pSeries platforms the code is ifdef'ed out and for non-CMO enabled
      pSeries systems this does not affect the normal code path.  Devices that
      do not perform DMA operations do not need modification with this patch.
      The function get_desired_dma was renamed from get_io_entitlement for
      clarity.
      
      Overview
      
      Cooperative Memory Overcommitment (CMO) allows for a set of OS partitions
      to be run with less RAM than the aggregate needs of the group of
      partitions.  The firmware will balance memory between the partitions
      and page in/out memory as needed.  Based on the number and type of IO
      adapters present, each partition is allocated an amount of memory for
      DMA operations and this allocation will be guaranteed to the partition;
      this is referred to as the partition's 'entitlement'.
      
      Partitions running in a CMO environment can only have virtual IO devices
      present.  The VIO bus layer will manage the IO entitlement for the system.
      Accounting, at a system and per-device level, is tracked in the VIO bus
      code and exposed via sysfs.  A set of dma_ops functions are added to
      the bus to allow for this accounting.
      
      Bus initialization
      
      At initialization, the bus will calculate the minimum needs of the system
      based on providing each device present with a standard minimum entitlement
      along with a spare allocation for the bus to handle hotplug events.
      If the minimum needs cannot be met, the system boot will be halted.
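
The start-up check amounts to a small calculation. The sketch below assumes the spare allocation equals one standard minimum entitlement; the function names and the per-device minimum passed in by the caller are hypothetical, since the commit message does not give concrete values.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Minimum entitlement the bus needs: one standard minimum per device
 * present, plus one spare allocation reserved for hotplug events.
 */
size_t cmo_min_needs(size_t ndevices, size_t min_per_dev)
{
    return (ndevices + 1) * min_per_dev;   /* +1 is the spare allocation */
}

/* Returns 1 if the system entitlement covers the minimum needs, else 0. */
int cmo_bus_init_ok(size_t system_entitlement, size_t ndevices,
                    size_t min_per_dev)
{
    assert(min_per_dev > 0);
    return cmo_min_needs(ndevices, min_per_dev) <= system_entitlement;
}
```

A zero return corresponds to the case in which the commit message says boot is halted.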
      
      Device changes
      
      The significant changes for devices while running under CMO are that the
      devices must specify how much dedicated IO entitlement they desire and
      must also handle DMA mapping errors that can occur due to constrained
      IO memory.  The virtual IO drivers are modified to silence errors when
      DMA mappings fail for CMO and handle these failures gracefully.
      
      Each device will be guaranteed a minimum entitlement that can always
      be mapped.  Devices will specify how much entitlement they desire and
      the VIO bus will attempt to provide for this.  Devices can change their
      desired entitlement level at any point in time to address particular needs
      (via vio_cmo_set_dev_desired()), not just at device probe time.
      
      VIO bus changes
      
      The system will have a particular entitlement level available from which
      it can provide memory to the devices.  The bus defines two pools of memory
      within this entitlement, the reserve and excess pools.  Each device is
      provided with its own entitlement, no less than a system defined minimum
      entitlement and no greater than what the device has specified as its
      desired entitlement.  The entitlement provided to devices comes from the
      reserve pool.  The reserve pool can also contain a spare allocation as
      large as the system defined minimum entitlement which is used for device
      hotplug events.  Any entitlement not needed to fulfill the needs of the
      reserve pool is placed in the excess pool.  Each device is guaranteed
      that it can map up to its entitled level; additional mappings are possible
      as long as there is unmapped memory in the excess pool.
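
The mapping rule in this paragraph can be sketched as simple accounting. This is an illustration under assumptions rather than the VIO bus code: the structures, field names and the `cmo_map`/`cmo_unmap` helpers are hypothetical, and sizes are plain byte counts.

```c
#include <assert.h>
#include <stddef.h>

/* Bus-wide excess pool shared by all devices. */
struct cmo_bus {
    size_t excess_size;    /* entitlement not assigned to the reserve pool */
    size_t excess_used;    /* bytes of the excess pool currently mapped */
};

/* Per-device accounting. */
struct cmo_dev {
    size_t entitled;       /* guaranteed amount, taken from the reserve pool */
    size_t allocated;      /* bytes currently mapped by this device */
    size_t allocs_failed;  /* mappings refused for lack of entitlement */
};

/* Account for a mapping of size bytes; returns 0 on success, -1 on failure. */
int cmo_map(struct cmo_bus *bus, struct cmo_dev *dev, size_t size)
{
    size_t reserve_free, from_excess;

    assert(bus != NULL && dev != NULL);
    reserve_free = dev->entitled > dev->allocated
                 ? dev->entitled - dev->allocated : 0;
    from_excess = size > reserve_free ? size - reserve_free : 0;

    if (from_excess > bus->excess_size - bus->excess_used) {
        dev->allocs_failed++;          /* the excess pool cannot cover it */
        return -1;
    }
    bus->excess_used += from_excess;
    dev->allocated += size;
    return 0;
}

/* Undo a mapping, returning anything above entitlement to the excess pool. */
void cmo_unmap(struct cmo_bus *bus, struct cmo_dev *dev, size_t size)
{
    size_t over, to_excess;

    assert(bus != NULL && dev != NULL && size <= dev->allocated);
    over = dev->allocated > dev->entitled
         ? dev->allocated - dev->entitled : 0;
    to_excess = size < over ? size : over;
    bus->excess_used -= to_excess;
    dev->allocated -= size;
}
```

A mapping within the device's entitlement always succeeds here; only the portion above it competes for the shared excess pool.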
      
      Bus probe
      
      As the system starts, each device is given an entitlement equal only
      to the system defined minimum entitlement.  The reserve pool is equal
      to the sum of these entitlements, plus a spare allocation.  The VIO bus
      also tracks the aggregate desired entitlement of all the devices.  If the
      system desired entitlement is greater than the size of the reserve pool,
      when devices unmap IO memory it will be reserved and a balance operation
      will be scheduled for some time in the future.
      
      Entitlement balancing
      
      The balance function tries to fairly distribute entitlement between the
      devices in the system with the goal of providing each device with its
      desired amount of entitlement.  Devices using more than what would be
      ideal will have their entitled set-point adjusted; this will effectively
      set a goal for lower IO memory usage as future mappings can fail and
      deallocations will trigger a balance operation to distribute the newly
      unmapped memory.  A fair distribution of entitlement can take several
      balance operations to achieve.  Entitlement changes and device DLPAR
      events will alter the state of CMO and will trigger balance operations.
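
A single balance pass might look like the sketch below. The commit message does not spell out the distribution policy, so the equal-share loop here is an assumption made for illustration, and `struct dev_ent` and `balance` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

struct dev_ent {
    size_t desired;    /* what the device asked for */
    size_t entitled;   /* what the balance pass grants */
};

/*
 * Give every device min, then hand out the remainder in equal shares to
 * devices still below their desired level, repeating until nothing is left
 * or every device is satisfied.  Returns the entitlement left over, which
 * would go to the excess pool.
 */
size_t balance(struct dev_ent *devs, size_t n, size_t avail, size_t min)
{
    size_t i, need;

    assert(devs != NULL && avail >= n * min);
    for (i = 0; i < n; i++)
        devs[i].entitled = min;
    avail -= n * min;

    for (;;) {
        need = 0;
        for (i = 0; i < n; i++)
            if (devs[i].entitled < devs[i].desired)
                need++;
        if (need == 0 || avail == 0)
            break;

        size_t share = avail / need;
        if (share == 0)
            share = 1;             /* fewer bytes left than needy devices */

        for (i = 0; i < n && avail > 0; i++) {
            size_t want, give;

            if (devs[i].entitled >= devs[i].desired)
                continue;
            want = devs[i].desired - devs[i].entitled;
            give = want < share ? want : share;
            if (give > avail)
                give = avail;
            devs[i].entitled += give;
            avail -= give;
        }
    }
    return avail;
}
```

Every round grants at least one byte while entitlement remains, so the loop terminates; devices that want little are satisfied early and the rest share what is left.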
      
      Hotplug events
      
      The VIO bus allows for changes in system entitlement at run-time via
      'vio_cmo_entitlement_update()'.  When devices are added the hotplug
      device event will be preceded by a system entitlement increase and this
      is reversed when devices are removed.
      
      The following changes are made to the VIO bus layer for CMO:
       * add IO memory accounting per device structure.
       * add IO memory entitlement query function to driver structure.
       * during vio bus probe, if CMO is enabled, check that driver has
         memory entitlement query function defined.  Fail if function not defined.
       * fail to register driver if io entitlement function not defined.
       * create set of dma_ops at vio level for CMO that will track allocations
         and return DMA failures once entitlement is reached.  Entitlement will
         be limited by overall system entitlement.  Devices will have a reserved
         quantity of memory that is guaranteed; the rest can be used as available.
       * expose entitlement, current allocation, desired allocation, and the
         allocation error counter for devices to the user through sysfs
       * provide mechanism for changing a device's desired entitlement at run time
         for devices as an exported function and sysfs tunable
       * track any DMA failures for entitled IO memory for each vio device.
       * check entitlement against available system entitlement on device add
       * track entitlement metrics (high water mark, current usage)
       * provide function to reset high water mark
       * provide minimum and desired entitlement numbers at a bus level
       * provide drivers with a minimum guaranteed entitlement
       * balance available entitlement between devices to satisfy their needs
       * handle system entitlement changes and device hotplug
      Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
      Acked-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>