1. 01 Apr, 2009 40 commits
    • rtc-parisc: remove redundant locking · 05439f1f
      dann frazier committed
      The RTC subsystem provides ops locking, no need to implement our own
      Signed-off-by: dann frazier <dannf@hp.com>
      Cc: Alessandro Zummo <a.zummo@towertech.it>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Grant Grundler <grundler@parisc-linux.org>
      Cc: Matthew Wilcox <matthew@wil.cx>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • rtc-parisc: add a missing include for linux/rtc.h · 93d456d9
      dann frazier committed
      Signed-off-by: dann frazier <dannf@hp.com>
      Cc: Alessandro Zummo <a.zummo@towertech.it>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Grant Grundler <grundler@parisc-linux.org>
      Cc: Matthew Wilcox <matthew@wil.cx>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • rtc: add platform driver for EFI · 5e3fd9e5
      dann frazier committed
      Munge Stephane Eranian's efirtc.c code into an rtc platform driver

      [akpm@linux-foundation.org: use is_leap_year()]
      Signed-off-by: dann frazier <dannf@hp.com>
      Cc: Alessandro Zummo <alessandro.zummo@towertech.it>
      Cc: stephane eranian <eranian@googlemail.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: David Brownell <david-b@pacbell.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • rtc: convert LEAP_YEAR into an inline · 78d89ef4
      Andrew Morton committed
      - the LEAP_YEAR macro is buggy - it references its arg multiple times.
        Fix this by turning it into a C function.

      - give it a more appropriate name
      
      - Move it to rtc.h so that other .c files can use it, instead of copying it.
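
The multiple-evaluation problem described above can be sketched in a few lines of standalone C. This is an illustration under assumptions, not the kernel source: the macro body is a typical function-like form, and `next_year()` and `calls` are hypothetical names added only to count how often the argument is evaluated.

```c
#include <assert.h>
#include <stdbool.h>

/* Function-like macro in the style the commit describes: 'year' appears
 * three times in the expansion, so an argument with side effects can be
 * evaluated more than once. */
#define LEAP_YEAR(year) \
	((!((year) % 4) && ((year) % 100)) || !((year) % 400))

/* Inline replacement: the argument is evaluated exactly once. */
static inline bool is_leap_year(unsigned int year)
{
	return (!(year % 4) && (year % 100)) || !(year % 400);
}

/* Hypothetical helper with a side effect, used only to count evaluations. */
static unsigned int calls;

static unsigned int next_year(void)
{
	calls++;
	return 2009;
}
```

Calling `is_leap_year(next_year())` bumps `calls` once, while `LEAP_YEAR(next_year())` bumps it more than once because the expression is pasted into the expansion repeatedly.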
      
      Cc: dann frazier <dannf@hp.com>
      Acked-by: Alessandro Zummo <alessandro.zummo@towertech.it>
      Cc: stephane eranian <eranian@googlemail.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: David Brownell <david-b@pacbell.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • rtc: convert wm8350 to use new alarm and update operations · 47367a3b
      Mark Brown committed
      These are the only two ioctls so the ioctl() function is also removed.
      Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
      Cc: Acked-by: Alessandro Zummo <a.zummo@towertech.it>
      Cc: David Brownell <david-b@pacbell.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • autofs4: fix kernel includes · 79955898
      Ian Kent committed
      autofs_dev-ioctl.h is included by both the kernel module and user space tools
      and it includes two kernel header files.  Compiles work if the kernel headers
      are installed but fail otherwise.
      Signed-off-by: Ian Kent <raven@themaw.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • autofs4: fix lookup deadlock · 8f63aaa8
      Ian Kent committed
      A deadlock can occur when user space uses a signal (autofs version 4 uses
      SIGCHLD for this) to effect expire completion.
      
      The order of events is:
      
      Expire process completes, but before being able to send SIGCHLD to its parent
      ...
      
      Another process walks onto a different mount point and drops the directory
      inode semaphore prior to sending the request to the daemon as it must ...
      
      A third process does an lstat on the expired mount point causing it to wait
      on expire completion (unfortunately) holding the directory semaphore.
      
      The mount request then arrives at the daemon which does an lstat and,
      deadlock.
      
      For some time I was concerned about releasing the directory semaphore around
      the expire wait in autofs4_lookup as well as for the mount call back.  I
      finally realized that the last round of changes in this function made the
      expiring dentry and the lookup dentry separate and distinct so the check and
      possible wait can be done anywhere prior to the mount call back.  This patch
      moves the check to just before the mount call back and inside the directory
      inode mutex release.
      Signed-off-by: Ian Kent <raven@themaw.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • autofs4: cleanup expire code duplication · 56fcef75
      Ian Kent committed
      A significant portion of the autofs_dev_ioctl_expire() and
      autofs4_expire_multi() functions is duplicated code.  This patch cleans that
      up.
      Signed-off-by: Ian Kent <raven@themaw.net>
      Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ecryptfs: use kzfree() · 00fcf2cb
      Johannes Weiner committed
      Use kzfree() instead of memset() + kfree().
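
As a rough user-space analogue of the wipe-then-free pattern this commit consolidates, consider the sketch below. The names `wipe()` and `zero_and_free()` are hypothetical and not kernel API; unlike the kernel helper, this version assumes the caller passes the buffer length explicitly.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Overwrite len bytes at p with zeros. Writing through a volatile pointer
 * is a common (though not guaranteed) way to discourage the compiler from
 * optimizing the stores away. */
static void wipe(void *p, size_t len)
{
	volatile unsigned char *v = p;

	while (len--)
		*v++ = 0;
}

/* Hypothetical user-space stand-in for kzfree(): wipe, then free.
 * Assumption: the caller knows the allocation size and passes it in. */
static void zero_and_free(void *p, size_t len)
{
	if (!p)
		return;
	wipe(p, len);
	free(p);
}
```

Folding both steps into one helper means a caller holding key material cannot forget the wipe before the free.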
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Acked-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • powerpc/fsl_soc: isolate legacy fsl_spi support to mpc832x_rdb boards · e2801806
      Anton Vorontsov committed
      The advantages of this:
      - Don't encourage legacy support;
      - Less external symbols, less code to compile-in for !MPC832x_RDB
        platforms.
      Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
      Cc: David Brownell <david-b@pacbell.net>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Kumar Gala <galak@gate.crashing.org>
      Cc: Grant Likely <grant.likely@secretlab.ca>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • powerpc/83xx: add mmc-spi support via the device tree for MPC8323E-RDB · 75458285
      Anton Vorontsov committed
      - Add gpio-controller node to manage QE GPIO Bank D;
      - Add mmc-spi node;
      - Modify board file so that it won't use legacy SPI support with the new
        device trees.
      Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
      Cc: David Brownell <david-b@pacbell.net>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Kumar Gala <galak@gate.crashing.org>
      Cc: Grant Likely <grant.likely@secretlab.ca>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • powerpc: add mmc-spi-slot bindings · 3f1c6ebf
      Anton Vorontsov committed
      The bindings describe a case where an MMC/SD/SDIO slot is directly connected
      to a SPI bus.  Such setups are widely used on embedded PowerPC boards.
      
      The patch also adds the mmc-spi-slot entry to the OpenFirmware modalias
      table.
      Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
      Cc: David Brownell <david-b@pacbell.net>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Kumar Gala <galak@gate.crashing.org>
      Cc: Grant Likely <grant.likely@secretlab.ca>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • spi_mpc83xx: add OF platform driver bindings · 35b4b3c0
      Anton Vorontsov committed
      Implement full support for OF SPI bindings.  Now the driver can manage its
      own chip selects without any help from the board files and/or fsl_soc
      constructors.
      
      The "legacy" code is well isolated and could be removed as time goes by.
      Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
      Cc: David Brownell <david-b@pacbell.net>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Kumar Gala <galak@gate.crashing.org>
      Cc: Grant Likely <grant.likely@secretlab.ca>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • spi_mpc83xx: rework chip selects handling · 364fdbc0
      Anton Vorontsov committed
      The main purpose of this patch is to pass 'struct spi_device' to the chip
      select handling routines.  This is needed so that we could implement
      full-fledged OpenFirmware support for this driver.
      
      While at it, also:
      - Replace two {de,activate}_cs routines by a single cs_control().
      - Don't duplicate platform data callbacks in mpc83xx_spi struct.
      Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
      Cc: David Brownell <david-b@pacbell.net>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Kumar Gala <galak@gate.crashing.org>
      Cc: Grant Likely <grant.likely@secretlab.ca>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • spi_mpc83xx: fix sparse warnings · 34c8a20c
      Anton Vorontsov committed
      The patch fixes the following sparse warnings:
      
        CHECK   spi_mpc83xx.c
      spi_mpc83xx.c:145:1: warning: symbol 'mpc83xx_spi_rx_buf_u8' was not declared. Should it be static?
      spi_mpc83xx.c:146:1: warning: symbol 'mpc83xx_spi_rx_buf_u16' was not declared. Should it be static?
      spi_mpc83xx.c:147:1: warning: symbol 'mpc83xx_spi_rx_buf_u32' was not declared. Should it be static?
      spi_mpc83xx.c:148:1: warning: symbol 'mpc83xx_spi_tx_buf_u8' was not declared. Should it be static?
      spi_mpc83xx.c:149:1: warning: symbol 'mpc83xx_spi_tx_buf_u16' was not declared. Should it be static?
      spi_mpc83xx.c:150:1: warning: symbol 'mpc83xx_spi_tx_buf_u32' was not declared. Should it be static?
      spi_mpc83xx.c:175:32: warning: incorrect type in initializer (different address spaces)
      spi_mpc83xx.c:175:32:    expected void *tmp_ptr
      spi_mpc83xx.c:175:32:    got unsigned int [noderef] <asn:2>*<noident>
      spi_mpc83xx.c:183:26: warning: incorrect type in argument 1 (different address spaces)
      spi_mpc83xx.c:183:26:    expected unsigned int [noderef] [usertype] <asn:2>*reg
      spi_mpc83xx.c:183:26:    got void *tmp_ptr
      spi_mpc83xx.c:184:26: warning: incorrect type in argument 1 (different address spaces)
      spi_mpc83xx.c:184:26:    expected unsigned int [noderef] [usertype] <asn:2>*reg
      spi_mpc83xx.c:184:26:    got void *tmp_ptr
      spi_mpc83xx.c:287:31: warning: incorrect type in initializer (different address spaces)
      spi_mpc83xx.c:287:31:    expected void *tmp_ptr
      spi_mpc83xx.c:287:31:    got unsigned int [noderef] <asn:2>*<noident>
      spi_mpc83xx.c:295:25: warning: incorrect type in argument 1 (different address spaces)
      spi_mpc83xx.c:295:25:    expected unsigned int [noderef] [usertype] <asn:2>*reg
      spi_mpc83xx.c:295:25:    got void *tmp_ptr
      spi_mpc83xx.c:296:25: warning: incorrect type in argument 1 (different address spaces)
      spi_mpc83xx.c:296:25:    expected unsigned int [noderef] [usertype] <asn:2>*reg
      spi_mpc83xx.c:296:25:    got void *tmp_ptr
      spi_mpc83xx.c:486:13: warning: symbol 'mpc83xx_spi_irq' was not declared. Should it be static?
      Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
      Cc: David Brownell <david-b@pacbell.net>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Kumar Gala <galak@gate.crashing.org>
      Cc: Grant Likely <grant.likely@secretlab.ca>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ramfs: add support for "mode=" mount option · c3b1b1cb
      Wu Fengguang committed
      Addresses http://bugzilla.kernel.org/show_bug.cgi?id=12843
      
      "I use ramfs instead of tmpfs for /tmp because I don't use swap on my
      laptop.  Some apps need 1777 mode for /tmp directory, but ramfs does not
      support 'mode=' mount option."
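
To make the request concrete, here is a minimal user-space sketch of parsing an octal `mode=` option such as `mode=1777`. It is not the ramfs implementation: `parse_mode_option()` and `MODE_MASK` are hypothetical names, and masking the value to the permission, setuid/setgid and sticky bits is an assumption about what a reasonable parser would do.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed mask: rwx bits for user/group/other plus setuid, setgid, sticky. */
#define MODE_MASK 07777u

/* Hypothetical parser for a single "mode=<octal>" option string.
 * Returns 0 and stores the mode on success, -1 on any malformed input. */
static int parse_mode_option(const char *opt, unsigned int *mode)
{
	static const char prefix[] = "mode=";
	unsigned long val;
	char *end;

	if (strncmp(opt, prefix, sizeof(prefix) - 1) != 0)
		return -1;
	opt += sizeof(prefix) - 1;

	/* Require at least one octal digit; rejects "", signs and whitespace. */
	if (*opt < '0' || *opt > '7')
		return -1;

	val = strtoul(opt, &end, 8);
	if (*end != '\0')
		return -1;	/* trailing garbage such as "78" or "755x" */

	*mode = (unsigned int)(val & MODE_MASK);
	return 0;
}
```

With this sketch, `mode=1777` yields `01777`, the sticky world-writable mode the reporter wants for /tmp.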
      Reported-by: Avan Anishchuk <matimatik@gmail.com>
      Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • lis3: SPI transport layer · bb233fdf
      Daniel Mack committed
      Make use of the new abstraction layer and add a new transport layer for
      spi.  Works fine on a PXA based board.
      Signed-off-by: Daniel Mack <daniel@caiaq.de>
      Acked-by: Pavel Machek <pavel@ucw.cz>
      Acked-by: Eric Piel <eric.piel@tremplin-utc.net>
      Cc: David Brownell <david-b@pacbell.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • lis3: solve dependency between core and ACPI · a38da2ed
      Daniel Mack committed
      This solves the dependency between lis3lv02d.[ch] and ACPI specific
      methods.  It introduces a ->bus_priv pointer to the device struct which is
      cast to 'struct acpi_device' in the ACPI layer.  Changed hp_accel.c
      accordingly.
      Signed-off-by: Daniel Mack <daniel@caiaq.de>
      Acked-by: Pavel Machek <pavel@ucw.cz>
      Acked-by: Eric Piel <eric.piel@tremplin-utc.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • lis3: reorder functions to make forward decl obsolete · ab337a63
      Daniel Mack committed
      Move lis3lv02d_init_device() down so that the forward declaration of
      lis3lv02d_add_fs() becomes unnecessary.
      Signed-off-by: Daniel Mack <daniel@caiaq.de>
      Acked-by: Pavel Machek <pavel@ucw.cz>
      Acked-by: Eric Piel <eric.piel@tremplin-utc.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • hp_accel: axis conversion for hp compaq 8710w · 12a324b6
      Luca Cappa committed
      I have an HP Compaq 8710W laptop and compiled the LIS3LV02DL and HP_ACCEL
      drivers into my kernel.  While loading, the driver cannot recognize the
      laptop model, so I am sending the necessary information to update the
      database of axis orientations.
      
      >When the laptop is horizontal the position reported is about 0 for X and Y
      >and a positive value for Z
      Yes, it is about 0,0,1000, the actual reading says: (-17,-26,1018);
      
      > If the left side is elevated, X increases (becomes positive)
      Yes, X goes toward positive 1000.
      
      >If the front side (where the touchpad is) is elevated, Y decreases (becomes negative)
      No, Y goes toward positive 1000.
      
      >If the laptop is put upside-down, Z becomes negative
      Yes, the laptop on a table Z gives 1000, and if upsidedown the Z reads
      -1000.
      
      So in few words the Y axis is inverted.
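
One way such a per-model axis table can be expressed is sketched below. This is an illustration under assumptions rather than the driver's code: `struct axis_map`, `remap_axis()` and `apply_axis_map()` are hypothetical names, and the convention (1-based axis numbers, with a negative entry meaning "negate that axis") is assumed for the example.

```c
#include <assert.h>

/* Hypothetical axis map: each entry is a 1-based hardware axis number;
 * a negative entry means "read that axis and negate the value". */
struct axis_map {
	signed char x, y, z;
};

/* Pick (and possibly negate) one hardware reading according to the map. */
static int remap_axis(signed char axis, const int hw[3])
{
	assert(axis != 0 && axis >= -3 && axis <= 3);
	return axis > 0 ? hw[axis - 1] : -hw[-axis - 1];
}

/* Convert a raw (x, y, z) reading into the normalized orientation. */
static void apply_axis_map(const struct axis_map *m, const int hw[3],
			   int out[3])
{
	out[0] = remap_axis(m->x, hw);
	out[1] = remap_axis(m->y, hw);
	out[2] = remap_axis(m->z, hw);
}
```

A "Y inverted" model like the one reported here would use the map `{1, -2, 3}`, leaving X and Z untouched and flipping the sign of Y.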
      
      Cc: Eric Piel <eric.piel@tremplin-utc.net>
      Signed-off-by: Pavel Machek <pavel@ucw.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • hp_accel: add two more axis information · 9d7639d3
      Pavel Machek committed
      Add two more laptops to whitelist.
      Signed-off-by: Michal Marek <mmarek@suse.cz>
      Signed-off-by: Pavel Machek <pavel@ucw.cz>
      Cc: Daniel Mack <daniel@caiaq.de>
      Cc: Eric Piel <eric.piel@tremplin-utc.net>
      Cc: Vladimir Botka <vbotka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • hwmon: Add LTC4215 driver · 72f5de92
      Ira Snyder committed
      Add Linux support for the Linear Technology LTC4215 Hot Swap controller
      I2C monitoring interface.
      
      I have tested the driver with my board, and it appears to work fine.  With
      the power supplies disabled, it reads 11.93V input, 1.93V output, no
      current and no power.  With the supplies enabled, it reads 11.93V input,
      11.98V output, no current, no power.  I'm not drawing any current at the
      moment, so this is reasonable.  The value in the sense register never
      reads anything except 0, so I expect to get zero from the current and
      power calculations.
      
      I didn't attempt to support changing any of the chip's settings or
      enabling the FET.  I'm not sure even how to do that and still fit within
      the hwmon framework.  :)
      Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu>
      Cc: Jean Delvare <khali@linux-fr.org>
      Cc: "Mark M. Hoffman" <mhoffman@lightlink.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • hwmon: LM95241 driver · 06160327
      Davide Rizzo committed
      An hwmon driver for the National Semiconductor LM95241 triple temperature
      sensors chip
      Signed-off-by: Davide Rizzo <elpa-rizzo@gmail.com>
      Cc: Jean Delvare <khali@linux-fr.org>
      Cc: "Mark M. Hoffman" <mhoffman@lightlink.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • hp_accel: adev is poor name of exported symbol · be84cfc5
      Pavel Machek committed
      As Andrew noted, adev is a pretty poor name for an exported symbol.
      Rename it to lis3.
      Signed-off-by: Pavel Machek <pavel@ucw.cz>
      Cc: Eric Piel <eric.piel@tremplin-utc.net>
      Cc: Vladimir Botka <vbotka@suse.cz>
      Cc: <Quoc.Pham@hp.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • hp_accel: small documentation updates · 2b872903
      Pavel Machek committed
      Fix English in Documentation, add "how to test" description.
      Signed-off-by: Pavel Machek <pavel@suse.cz>
      Cc: Eric Piel <eric.piel@tremplin-utc.net>
      Cc: Vladimir Botka <vbotka@suse.cz>
      Cc: <Quoc.Pham@hp.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • epoll keyed wakeups: make tty use keyed wakeups · 4b19449d
      Davide Libenzi committed
      Introduce keyed event wakeups inside the TTY code.
      Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: David Miller <davem@davemloft.net>
      Cc: William Lee Irwin III <wli@movementarian.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • epoll keyed wakeups: make eventfd use keyed wakeups · 39510888
      Davide Libenzi committed
      Introduce keyed event wakeups inside the eventfd code.
      Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: David Miller <davem@davemloft.net>
      Cc: William Lee Irwin III <wli@movementarian.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • epoll keyed wakeups: teach epoll about hints coming with the wakeup key · 2dfa4eea
      Davide Libenzi committed
      Use the events hint now sent by some devices, to avoid unnecessary wakeups
      for events that are of no interest for the caller.  This code handles both
      devices that are sending keyed events, and the ones that are not (and
      even the ones that sometimes send events, and sometimes don't).

      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: David Miller <davem@davemloft.net>
      Cc: William Lee Irwin III <wli@movementarian.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • epoll keyed wakeups: make sockets use keyed wakeups · 37e5540b
      Davide Libenzi committed
      Add support for event-aware wakeups to the sockets code.  Events are
      delivered to the wakeup target, so that epoll can avoid spurious wakeups
      for non-interesting events.
      Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
      Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: David Miller <davem@davemloft.net>
      Cc: William Lee Irwin III <wli@movementarian.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • epoll keyed wakeups: introduce new *_poll() wakeup macros · c0da3775
      Davide Libenzi committed
      Introduce new wakeup macros that allow passing an event mask to the wakeup
      targets.  They exactly mimic their non-_poll() counterpart, with the added
      event mask passing capability.  I did add only the ones currently
      requested, avoiding the _nr() and _all() for the moment.
      Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: David Miller <davem@davemloft.net>
      Cc: William Lee Irwin III <wli@movementarian.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • epoll keyed wakeups: add __wake_up_locked_key() and __wake_up_sync_key() · 4ede816a
      Davide Libenzi committed
      This patchset introduces wakeup hints for some of the most popular (from
      epoll POV) devices, so that epoll code can avoid spurious wakeups on its
      waiters.
      
      The problem with epoll is that the callback-based wakeups do not, ATM,
      carry any information about the events the wakeup is related to.  So the
      only choice epoll has (not being able to call f_op->poll() from inside the
      callback), is to add the file* to a ready-list and resolve the real events
      later on, at epoll_wait() (or its own f_op->poll()) time.  This can cause
      spurious wakeups, since the wake_up() itself might be for an event the
      caller is not interested in.

      The rate of these spurious wakeups can be pretty high in case of many
      network sockets being monitored.
      
      By allowing devices to report the events the wakeups refer to (at least
      the two major classes - POLLIN/POLLOUT), we are able to spare useless
      wakeups by proper handling inside the epoll's poll callback.
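
The filtering idea can be sketched in user-space C as follows. This is a simplified model, not the epoll code: `struct waiter` and `wake_waiter()` are hypothetical names, and a key of 0 is assumed to stand for "the device sent no hint", in which case the waiter must be woken as before.

```c
#include <assert.h>
#include <poll.h>

/* Hypothetical waiter: 'events' is the mask the caller registered for,
 * 'wakeups' counts how many times it was actually woken. */
struct waiter {
	unsigned long events;
	unsigned int wakeups;
};

/* Deliver a wakeup carrying an event hint in 'key'.
 * Returns 1 if the waiter was woken, 0 if the wakeup was filtered out.
 * Assumption: key == 0 means the device gave no hint, so we cannot filter. */
static int wake_waiter(struct waiter *w, unsigned long key)
{
	if (key && !(key & w->events))
		return 0;	/* hint present and of no interest: skip */

	w->wakeups++;
	return 1;
}
```

A waiter registered for `POLLOUT` only is skipped when a device reports `POLLIN`, but is still woken for `POLLOUT` and for wakeups from devices that send no hint at all.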
      
      Epoll will have in any case to call f_op->poll() on the file* later on,
      since the change to be done in order to have the full event set sent via
      wakeup, is too invasive for the way our f_op->poll() system works (the
      full event set is calculated inside the poll function - there are too many
      of them to even start thinking the change - also poll/select would need
      change too).
      
      Epoll is changed in a way that both devices which send event hints, and
      the ones that don't, are correctly handled.  The former will gain some
      efficiency though.
      
      As a general rule, devices should add an event mask by using
      key-aware wakeup macros when waking up poll wait queues.  I tested it
      (together with the epoll's poll fix patch Andrew has in -mm) and wakeups
      for the supported devices are correctly filtered.
      
      Test program available here:
      
      http://www.xmailserver.org/epoll_test.c
      
      This patch:
      
      Nothing revolutionary here.  Just using the available "key" that our
      wakeup core already supports.  The __wake_up_locked_key() was a no-brainer,
      since both __wake_up_locked() and __wake_up_locked_key() are thin wrappers
      around __wake_up_common().
      
      The __wake_up_sync() function had a body, so the choice was between
      borrowing the body for __wake_up_sync_key() and calling it from
      __wake_up_sync(), or make an inline and calling it from both.  I chose the
      former since in most archs it all resolves to "mov $0, REG; jmp ADDR".
      Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: David Miller <davem@davemloft.net>
      Cc: William Lee Irwin III <wli@movementarian.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • eventfd: improve support for semaphore-like behavior · bcd0b235
      Davide Libenzi committed
      People started using eventfd in a semaphore-like way where before they
      were using pipes.
      
      That is, counter-based resource access.  Where a "wait()" returns
      immediately by decrementing the counter by one, if counter is greater than
      zero.  Otherwise will wait.  And where a "post(count)" will add count to
      the counter releasing the appropriate number of waiters.  In eventfd the
      "post" (write) part is fine, while the "wait" (read) does not dequeue 1,
      but the whole counter value.
      
      The problem with eventfd is that a read() on the fd returns and wipes the
      whole counter, making the use of it as semaphore a little bit more
      cumbersome.  You can do a read() followed by a write() of COUNTER-1, but
      IMO it's pretty easy and cheap to make this work w/out extra steps.  This
      patch introduces a new eventfd flag that tells eventfd to only dequeue 1
      from the counter, allowing simple read/write to make it behave like a
      semaphore.  Simple test here:
      
      http://www.xmailserver.org/eventfd-sem.c
      
      To be back-compatible with earlier kernels, userspace applications should
      probe for the availability of this feature via
      
      #ifdef EFD_SEMAPHORE
      	fd = eventfd2 (CNT, EFD_SEMAPHORE);
      	if (fd == -1 && errno == EINVAL)
      		<fallback>
      #else
      		<fallback>
      #endif
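
For illustration, the semaphore-like usage described above might look like the sketch below on a system whose C library exposes `eventfd()` and `EFD_SEMAPHORE` through `<sys/eventfd.h>`. The wrapper names `efd_sem_create()`, `efd_sem_post()` and `efd_sem_trywait()` are hypothetical, and `EFD_NONBLOCK` is added here as an assumption so that a wait on an empty counter fails instead of blocking.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a semaphore-style eventfd with an initial count; -1 on failure. */
static int efd_sem_create(unsigned int initial)
{
	return eventfd(initial, EFD_SEMAPHORE | EFD_NONBLOCK);
}

/* post(count): add 'count' to the counter. Returns 0 on success. */
static int efd_sem_post(int fd, uint64_t count)
{
	ssize_t n = write(fd, &count, sizeof(count));

	return n == (ssize_t)sizeof(count) ? 0 : -1;
}

/* wait(): take exactly one unit from the counter.
 * Returns 0 on success, -1 if the counter is zero (EAGAIN) or on error. */
static int efd_sem_trywait(int fd)
{
	uint64_t val;
	ssize_t n = read(fd, &val, sizeof(val));

	if (n != (ssize_t)sizeof(val))
		return -1;
	/* In semaphore mode each successful read reports the value 1. */
	return val == 1 ? 0 : -1;
}
```

After `efd_sem_post(fd, 3)`, three calls to `efd_sem_trywait(fd)` succeed and a fourth fails, whereas without `EFD_SEMAPHORE` the first read would drain the whole counter.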
      Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
      Cc: <linux-api@vger.kernel.org>
      Tested-by: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Ulrich Drepper <drepper@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • epoll: use real type instead of void * · 4f0989db
      Tony Battersby committed
      eventpoll.c uses void * in one place for no obvious reason; change it to
      use the real type instead.
      Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
      Acked-by: Davide Libenzi <davidel@xmailserver.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • epoll: clean up ep_modify · e057e15f
      Tony Battersby committed
      ep_modify() doesn't need to set event.data from within the ep->lock
      spinlock as the comment suggests.  The only place event.data is used is
      ep_send_events_proc(), and this is protected by ep->mtx instead of
      ep->lock.  Also update the comment for mutex_lock() at the top of
      ep_scan_ready_list(), which mentions epoll_ctl(EPOLL_CTL_DEL) but not
      epoll_ctl(EPOLL_CTL_MOD).
      
      ep_modify() can also use spin_lock_irq() instead of spin_lock_irqsave().
      Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
      Acked-by: Davide Libenzi <davidel@xmailserver.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • epoll: remove unnecessary xchg · d1bc90dd
      Tony Battersby committed
      xchg in ep_unregister_pollwait() is unnecessary because it is protected by
      either epmutex or ep->mtx (the same protection as ep_remove()).
      
      If xchg was necessary, it would be insufficient to protect against
      problems: if multiple concurrent calls to ep_unregister_pollwait() were
      possible then a second caller that returns without doing anything because
      nwait == 0 could return before the waitqueues are removed by the first
      caller, which looks like it could lead to problematic races with
      ep_poll_callback().
      
      So remove xchg and add comments about the locking.
      Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
      Acked-by: Davide Libenzi <davidel@xmailserver.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • epoll: remember the event if epoll_wait returns -EFAULT · d0305882
      Tony Battersby committed
      If epoll_wait returns -EFAULT, the event that was being returned when the
      fault was encountered will be forgotten.  This is not a big deal since
      EFAULT will happen only if a buggy userspace program passes in a bad
      address, in which case what happens later usually doesn't matter.
      However, it is easy to remember the event for later, and this patch makes
      a simple change to do that.
      Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
      Acked-by: Davide Libenzi <davidel@xmailserver.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • epoll: don't use current in irq context · abff55ce
      Tony Battersby committed
      ep_call_nested() (formerly ep_poll_safewake()) uses "current" (without
      dereferencing it) to detect callback recursion, but it may be called from
      irq context where the use of current is generally discouraged.  It would
      be better to use get_cpu() and put_cpu() to detect the callback recursion.
      Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
      Acked-by: Davide Libenzi <davidel@xmailserver.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • epoll: remove debugging code · bb57c3ed
      Davide Libenzi committed
      Remove debugging code from epoll.  There's no need for it to be included
      into mainline code.
      Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • epoll: fix epoll's own poll (update) · 296e236e
      Davide Libenzi committed
      Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
      Cc: Pavel Pisa <pisa@cmp.felk.cvut.cz>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      296e236e
    • D
      epoll: fix epoll's own poll · 5071f97e
      Davide Libenzi committed
      Fix a bug in epoll's f_op->poll() code, which returns POLLIN even
      though no monitored fds are actually ready.  The bug shows up if you
      add an epoll fd inside another fd container (poll, select, epoll).
      
      The problem is that the callback-based wake ups used by epoll do not
      carry any information about the events that actually happened (patches
      will follow to fix this).  So the callback code, since it can't call the
      file*'s ->poll() inside the callback, chains the file* into a ready-list.
      
      Suppose you added an fd with EPOLLOUT only and some data shows up on
      the fd: the file* mapped by the fd will be added to the ready-list (via
      the wakeup callback).  During normal epoll_wait() use, this condition is
      sorted out at the time we're actually able to call the file*'s
      f_op->poll().
      
      Inside the old epoll's f_op->poll() though, only a quick check
      !list_empty(ready-list) was performed, and this could have led to
      reporting POLLIN even though no ready fds would show up at a following
      epoll_wait().  In order to correctly report the ready status for an epoll
      fd, the ready-list must be checked to see if any really available fd+event
      would be ready in a following epoll_wait().
      
      This operation (calling f_op->poll() from inside f_op->poll()), like
      wake ups, must be handled with care because epoll fds can be added to
      other epoll fds.
      
      Test code:
      
      /*
       *  epoll_test by Davide Libenzi (Simple code to test epoll internals)
       *  Copyright (C) 2008  Davide Libenzi
       *
       *  This program is free software; you can redistribute it and/or modify
       *  it under the terms of the GNU General Public License as published by
       *  the Free Software Foundation; either version 2 of the License, or
       *  (at your option) any later version.
       *
       *  This program is distributed in the hope that it will be useful,
       *  but WITHOUT ANY WARRANTY; without even the implied warranty of
       *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
       *  GNU General Public License for more details.
       *
       *  You should have received a copy of the GNU General Public License
       *  along with this program; if not, write to the Free Software
       *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
       *
       *  Davide Libenzi <davidel@xmailserver.org>
       *
       */
      
      #include <sys/types.h>
      #include <unistd.h>
      #include <stdio.h>
      #include <stdlib.h>
      #include <string.h>
      #include <errno.h>
      #include <signal.h>
      #include <limits.h>
      #include <poll.h>
      #include <sys/epoll.h>
      #include <sys/wait.h>
      
      #define EPWAIT_TIMEO	(1 * 1000)
      #ifndef POLLRDHUP
      #define POLLRDHUP 0x2000
      #endif
      
      #define EPOLL_MAX_CHAIN	100L
      
      #define EPOLL_TF_LOOP (1 << 0)
      
      struct epoll_test_cfg {
      	long size;
      	long flags;
      };
      
      static int xepoll_create(int n) {
      	int epfd;
      
      	if ((epfd = epoll_create(n)) == -1) {
      		perror("epoll_create");
      		exit(2);
      	}
      
      	return epfd;
      }
      
      static void xepoll_ctl(int epfd, int cmd, int fd, struct epoll_event *evt) {
      	if (epoll_ctl(epfd, cmd, fd, evt) < 0) {
      		perror("epoll_ctl");
      		exit(3);
      	}
      }
      
      static void xpipe(int *fds) {
      	if (pipe(fds)) {
      		perror("pipe");
      		exit(4);
      	}
      }
      
      static pid_t xfork(void) {
      	pid_t pid;
      
      	if ((pid = fork()) == (pid_t) -1) {
      		perror("fork");
      		exit(5);
      	}
      
      	return pid;
      }
      
      static int run_forked_proc(int (*proc)(void *), void *data) {
      	int status;
      	pid_t pid;
      
      	if ((pid = xfork()) == 0)
      		exit((*proc)(data));
      	if (waitpid(pid, &status, 0) != pid) {
      		perror("waitpid");
      		return -1;
      	}
      
      	return WIFEXITED(status) ? WEXITSTATUS(status): -2;
      }
      
      static int check_events(int fd, int timeo) {
      	struct pollfd pfd;
      
      	fprintf(stdout, "Checking events for fd %d\n", fd);
      	memset(&pfd, 0, sizeof(pfd));
      	pfd.fd = fd;
      	pfd.events = POLLIN | POLLOUT;
      	if (poll(&pfd, 1, timeo) < 0) {
      		perror("poll()");
      		return 0;
      	}
      	if (pfd.revents & POLLIN)
      		fprintf(stdout, "\tPOLLIN\n");
      	if (pfd.revents & POLLOUT)
      		fprintf(stdout, "\tPOLLOUT\n");
      	if (pfd.revents & POLLERR)
      		fprintf(stdout, "\tPOLLERR\n");
      	if (pfd.revents & POLLHUP)
      		fprintf(stdout, "\tPOLLHUP\n");
      	if (pfd.revents & POLLRDHUP)
      		fprintf(stdout, "\tPOLLRDHUP\n");
      
      	return pfd.revents;
      }
      
      static int epoll_test_tty(void *data) {
      	int epfd, ifd = fileno(stdin), res;
      	struct epoll_event evt;
      
      	if (check_events(ifd, 0) != POLLOUT) {
      		fprintf(stderr, "Something is cooking on STDIN (%d)\n", ifd);
      		return 1;
      	}
      	epfd = xepoll_create(1);
      	fprintf(stdout, "Created epoll fd (%d)\n", epfd);
      	memset(&evt, 0, sizeof(evt));
      	evt.events = EPOLLIN;
      	xepoll_ctl(epfd, EPOLL_CTL_ADD, ifd, &evt);
      	if (check_events(epfd, 0) & POLLIN) {
      		res = epoll_wait(epfd, &evt, 1, 0);
      		if (res == 0) {
      			fprintf(stderr, "Epoll fd (%d) is ready when it shouldn't!\n",
      				epfd);
      			return 2;
      		}
      	}
      
      	return 0;
      }
      
      static int epoll_wakeup_chain(void *data) {
      	struct epoll_test_cfg *tcfg = data;
      	int i, res, epfd, bfd, nfd, pfds[2];
      	pid_t pid;
      	struct epoll_event evt;
      
      	memset(&evt, 0, sizeof(evt));
      	evt.events = EPOLLIN;
      
      	epfd = bfd = xepoll_create(1);
      
      	for (i = 0; i < tcfg->size; i++) {
      		nfd = xepoll_create(1);
      		xepoll_ctl(bfd, EPOLL_CTL_ADD, nfd, &evt);
      		bfd = nfd;
      	}
      	xpipe(pfds);
      	if (tcfg->flags & EPOLL_TF_LOOP)
      	{
      		xepoll_ctl(bfd, EPOLL_CTL_ADD, epfd, &evt);
      		/*
      		 * If we're testing for a loop, we want the wakeup
      		 * triggered by the write to the pipe done in the child
      		 * process to trigger a fake event. So we add the pipe
      		 * read side with EPOLLOUT events. This will trigger
      		 * an addition to the ready-list, but no real events
      		 * will be there. Then the epoll kernel code will proceed
      		 * to call f_op->poll() of the epfd, triggering the
      		 * loop we want to test.
      		 */
      		evt.events = EPOLLOUT;
      	}
      	xepoll_ctl(bfd, EPOLL_CTL_ADD, pfds[0], &evt);
      
      	/*
      	 * The pipe write must come after the poll(2) call inside
      	 * check_events(). This tests the nested wakeup code in
      	 * fs/eventpoll.c:ep_poll_safewake().
      	 * By having check_events() (hence poll(2)) happen first,
      	 * the poll wait queue is filled up, and the write(2) in the
      	 * child will trigger the wakeup chain.
      	 */
      	if ((pid = xfork()) == 0) {
      		sleep(1);
      		write(pfds[1], "w", 1);
      		exit(0);
      	}
      
      	res = check_events(epfd, 2000) & POLLIN;
      
      	if (waitpid(pid, NULL, 0) != pid) {
      		perror("waitpid");
      		return -1;
      	}
      
      	return res;
      }
      
      static int epoll_poll_chain(void *data) {
      	struct epoll_test_cfg *tcfg = data;
      	int i, res, epfd, bfd, nfd, pfds[2];
      	pid_t pid;
      	struct epoll_event evt;
      
      	memset(&evt, 0, sizeof(evt));
      	evt.events = EPOLLIN;
      
      	epfd = bfd = xepoll_create(1);
      
      	for (i = 0; i < tcfg->size; i++) {
      		nfd = xepoll_create(1);
      		xepoll_ctl(bfd, EPOLL_CTL_ADD, nfd, &evt);
      		bfd = nfd;
      	}
      	xpipe(pfds);
      	if (tcfg->flags & EPOLL_TF_LOOP)
      	{
      		xepoll_ctl(bfd, EPOLL_CTL_ADD, epfd, &evt);
      		/*
      		 * If we're testing for a loop, we want the wakeup
      		 * triggered by the write to the pipe done in the child
      		 * process to trigger a fake event. So we add the pipe
      		 * read side with EPOLLOUT events. This will trigger
      		 * an addition to the ready-list, but no real events
      		 * will be there. Then the epoll kernel code will proceed
      		 * to call f_op->poll() of the epfd, triggering the
      		 * loop we want to test.
      		 */
      		evt.events = EPOLLOUT;
      	}
      	xepoll_ctl(bfd, EPOLL_CTL_ADD, pfds[0], &evt);
      
      	/*
      	 * The pipe write must come before the poll(2) call inside
      	 * check_events(). This tests the nested f_op->poll() call code in
      	 * fs/eventpoll.c:ep_eventpoll_poll().
      	 * By having the pipe write(2) happen first, we make the kernel
      	 * epoll code load the ready lists, and the following poll(2)
      	 * done inside check_events() will test the nested poll code in
      	 * ep_eventpoll_poll().
      	 */
      	if ((pid = xfork()) == 0) {
      		write(pfds[1], "w", 1);
      		exit(0);
      	}
      	sleep(1);
      	res = check_events(epfd, 1000) & POLLIN;
      
      	if (waitpid(pid, NULL, 0) != pid) {
      		perror("waitpid");
      		return -1;
      	}
      
      	return res;
      }
      
      int main(int ac, char **av) {
      	int error;
      	struct epoll_test_cfg tcfg;
      
      	fprintf(stdout, "\n********** Testing TTY events\n");
      	error = run_forked_proc(epoll_test_tty, NULL);
      	fprintf(stdout, error == 0 ?
      		"********** OK\n": "********** FAIL (%d)\n", error);
      
      	tcfg.size = 3;
      	tcfg.flags = 0;
      	fprintf(stdout, "\n********** Testing short wakeup chain\n");
      	error = run_forked_proc(epoll_wakeup_chain, &tcfg);
      	fprintf(stdout, error == POLLIN ?
      		"********** OK\n": "********** FAIL (%d)\n", error);
      
      	tcfg.size = EPOLL_MAX_CHAIN;
      	tcfg.flags = 0;
      	fprintf(stdout, "\n********** Testing long wakeup chain (HOLD ON)\n");
      	error = run_forked_proc(epoll_wakeup_chain, &tcfg);
      	fprintf(stdout, error == 0 ?
      		"********** OK\n": "********** FAIL (%d)\n", error);
      
      	tcfg.size = 3;
      	tcfg.flags = 0;
      	fprintf(stdout, "\n********** Testing short poll chain\n");
      	error = run_forked_proc(epoll_poll_chain, &tcfg);
      	fprintf(stdout, error == POLLIN ?
      		"********** OK\n": "********** FAIL (%d)\n", error);
      
      	tcfg.size = EPOLL_MAX_CHAIN;
      	tcfg.flags = 0;
      	fprintf(stdout, "\n********** Testing long poll chain (HOLD ON)\n");
      	error = run_forked_proc(epoll_poll_chain, &tcfg);
      	fprintf(stdout, error == 0 ?
      		"********** OK\n": "********** FAIL (%d)\n", error);
      
      	tcfg.size = 3;
      	tcfg.flags = EPOLL_TF_LOOP;
      	fprintf(stdout, "\n********** Testing loopy wakeup chain (HOLD ON)\n");
      	error = run_forked_proc(epoll_wakeup_chain, &tcfg);
      	fprintf(stdout, error == 0 ?
      		"********** OK\n": "********** FAIL (%d)\n", error);
      
      	tcfg.size = 3;
      	tcfg.flags = EPOLL_TF_LOOP;
      	fprintf(stdout, "\n********** Testing loopy poll chain (HOLD ON)\n");
      	error = run_forked_proc(epoll_poll_chain, &tcfg);
      	fprintf(stdout, error == 0 ?
      		"********** OK\n": "********** FAIL (%d)\n", error);
      
      	return 0;
      }
      Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
      Cc: Pavel Pisa <pisa@cmp.felk.cvut.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      5071f97e