1. 22 5月, 2013 1 次提交
    • R
      Add include dependencies to <linux/printk.h>. · 154c2670
      Ralf Baechle 提交于
      If <linux/linkage.h> has not been included before <linux/printk.h>,
      a build error like the below one will result:
      
        CC      arch/mips/kernel/idle.o
      In file included from arch/mips/kernel/idle.c:17:0:
      include/linux/printk.h:109:1: error: data definition has no type or storage class [-Werror]
      include/linux/printk.h:109:1: error: type defaults to ‘int’ in declaration of ‘asmlinkage’ [-Werror=implicit-int]
      include/linux/printk.h:110:1: error: ‘format’ attribute only applies to function types [-Werror=attributes]
      include/linux/printk.h:110:1: error: expected ‘,’ or ‘;’ before ‘int’
      include/linux/printk.h:114:1: error: data definition has no type or storage class [-Werror]
      include/linux/printk.h:114:1: error: type defaults to ‘int’ in declaration of ‘asmlinkage’ [-Werror=implicit-int]
      include/linux/printk.h:115:1: error: ‘format’ attribute only applies to function types [-Werror=attributes]
      include/linux/printk.h:115:1: error: expected ‘,’ or ‘;’ before ‘int’
      include/linux/printk.h:117:1: error: data definition has no type or storage class [-Werror]
      include/linux/printk.h:117:1: error: type defaults to ‘int’ in declaration of ‘asmlinkage’ [-Werror=implicit-int]
      include/linux/printk.h:118:1: error: ‘format’ attribute only applies to function types [-Werror=attributes]
      include/linux/printk.h:118:1: error: ‘__cold__’ attribute ignored [-Werror=attributes]
      include/linux/printk.h:118:1: error: expected ‘,’ or ‘;’ before ‘asmlinkage’
      include/linux/printk.h:122:1: error: data definition has no type or storage class [-Werror]
      include/linux/printk.h:122:1: error: type defaults to ‘int’ in declaration of ‘asmlinkage’ [-Werror=implicit-int]
      include/linux/printk.h:123:1: error: ‘format’ attribute only applies to function types [-Werror=attributes]
      include/linux/printk.h:123:1: error: ‘__cold__’ attribute ignored [-Werror=attributes]
      include/linux/printk.h:123:1: error: expected ‘,’ or ‘;’ before ‘int’
      In file included from include/linux/kernel.h:14:0,
                       from include/linux/sched.h:15,
                       from arch/mips/kernel/idle.c:18:
      include/linux/dynamic_debug.h: In function ‘ddebug_dyndbg_module_param_cb’:
      include/linux/dynamic_debug.h:124:3: error: implicit declaration of function ‘printk’ [-Werror=implicit-function-declaration]
      
      Fixed by including <linux/linkage.h>.
      Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      154c2670
  2. 21 5月, 2013 4 次提交
    • F
      phy: add phy_mac_interrupt() to use with PHY_IGNORE_INTERRUPT · 5ea94e76
      Florian Fainelli 提交于
      There is currently no way for an Ethernet MAC driver servicing PHY link
      interrupts to notify this to the PHY state machine without defining its
      own state machine. Since most drivers are not so special, introduce a
      helper: phy_mac_interrupt() which can be called from a link up/down
      interrupt routine to update the PHY state machine. To avoid code
      duplication some refactoring has been done to expose the workqueue and
      its corresponding callback internally.
      Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5ea94e76
    • F
      phy: fix the use of PHY_IGNORE_INTERRUPT · 2c7b4921
      Florian Fainelli 提交于
      When a PHY device is registered with the special IRQ value
      PHY_IGNORE_INTERRUPT (-2) it will not properly be handled by the PHY
      library:
      
      - it continues to poll its register, while we do not want this
        because such PHY link events or register changes are serviced by an
        Ethernet MAC
      - it will still try to configure PHY interrupts at the PHY level, such
        interrupts do not exist at the PHY but at the MAC level
      - the state machine only handles PHY_POLL, but should also handle
        PHY_IGNORE_INTERRUPT similarly
      
      This patch updates the PHY state machine and initialization paths to
      account for the specific PHY_IGNORE_INTERRUPT. Based on an earlier patch
      by Thomas Petazzoni, and reworked to add the missing bits. Add a helper
      phy_interrupt_is_valid() which specifically tests for a PHY interrupt
      not to be PHY_POLL or PHY_IGNORE_INTERRUPT and use it throughout the
      code.
      Signed-off-by: NThomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2c7b4921
    • W
      rps: selective flow shedding during softnet overflow · 99bbc707
      Willem de Bruijn 提交于
      A cpu executing the network receive path sheds packets when its input
      queue grows to netdev_max_backlog. A single high rate flow (such as a
      spoofed source DoS) can exceed a single cpu processing rate and will
      degrade throughput of other flows hashed onto the same cpu.
      
      This patch adds a more fine grained hashtable. If the netdev backlog
      is above a threshold, IRQ cpus track the ratio of total traffic of
      each flow (using 4096 buckets, configurable). The ratio is measured
      by counting the number of packets per flow over the last 256 packets
      from the source cpu. Any flow that occupies a large fraction of this
      (set at 50%) will see packet drop while above the threshold.
      
      Tested:
      Setup is a muli-threaded UDP echo server with network rx IRQ on cpu0,
      kernel receive (RPS) on cpu0 and application threads on cpus 2--7
      each handling 20k req/s. Throughput halves when hit with a 400 kpps
      antagonist storm. With this patch applied, antagonist overload is
      dropped and the server processes its complete load.
      
      The patch is effective when kernel receive processing is the
      bottleneck. The above RPS scenario is a extreme, but the same is
      reached with RFS and sufficient kernel processing (iptables, packet
      socket tap, ..).
      Signed-off-by: NWillem de Bruijn <willemb@google.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      99bbc707
    • P
      tty/vt: Fix vc_deallocate() lock order · 421b40a6
      Peter Hurley 提交于
      Now that the tty port owns the flip buffers and i/o is allowed
      from the driver even when no tty is attached, the destruction
      of the tty port (and the flip buffers) must ensure that no
      outstanding work is pending.
      
      Unfortunately, this creates a lock order problem with the
      console_lock (see attached lockdep report [1] below).
      
      For single console deallocation, drop the console_lock prior
      to port destruction. When multiple console deallocation,
      defer port destruction until the consoles have been
      deallocated.
      
      tty_port_destroy() is not required if the port has not
      been used; remove from vc_allocate() failure path.
      
      [1] lockdep report from Dave Jones <davej@redhat.com>
      
       ======================================================
       [ INFO: possible circular locking dependency detected ]
       3.9.0+ #16 Not tainted
       -------------------------------------------------------
       (agetty)/26163 is trying to acquire lock:
       blocked:  ((&buf->work)){+.+...}, instance: ffff88011c8b0020, at: [<ffffffff81062065>] flush_work+0x5/0x2e0
      
       but task is already holding lock:
       blocked:  (console_lock){+.+.+.}, instance: ffffffff81c2fde0, at: [<ffffffff813bc201>] vt_ioctl+0xb61/0x1230
      
       which lock already depends on the new lock.
      
       the existing dependency chain (in reverse order) is:
      
       -> #1 (console_lock){+.+.+.}:
              [<ffffffff810b3f74>] lock_acquire+0xa4/0x210
              [<ffffffff810416c7>] console_lock+0x77/0x80
              [<ffffffff813c3dcd>] con_flush_chars+0x2d/0x50
              [<ffffffff813b32b2>] n_tty_receive_buf+0x122/0x14d0
              [<ffffffff813b7709>] flush_to_ldisc+0x119/0x170
              [<ffffffff81064381>] process_one_work+0x211/0x700
              [<ffffffff8106498b>] worker_thread+0x11b/0x3a0
              [<ffffffff8106ce5d>] kthread+0xed/0x100
              [<ffffffff81601cac>] ret_from_fork+0x7c/0xb0
      
       -> #0 ((&buf->work)){+.+...}:
              [<ffffffff810b349a>] __lock_acquire+0x193a/0x1c00
              [<ffffffff810b3f74>] lock_acquire+0xa4/0x210
              [<ffffffff810620ae>] flush_work+0x4e/0x2e0
              [<ffffffff81065305>] __cancel_work_timer+0x95/0x130
              [<ffffffff810653b0>] cancel_work_sync+0x10/0x20
              [<ffffffff813b8212>] tty_port_destroy+0x12/0x20
              [<ffffffff813c65e8>] vc_deallocate+0xf8/0x110
              [<ffffffff813bc20c>] vt_ioctl+0xb6c/0x1230
              [<ffffffff813b01a5>] tty_ioctl+0x285/0xd50
              [<ffffffff811ba825>] do_vfs_ioctl+0x305/0x530
              [<ffffffff811baad1>] sys_ioctl+0x81/0xa0
              [<ffffffff81601d59>] system_call_fastpath+0x16/0x1b
      
       other info that might help us debug this:
      
       [ 6760.076175]  Possible unsafe locking scenario:
      
              CPU0                    CPU1
              ----                    ----
         lock(console_lock);
                                      lock((&buf->work));
                                      lock(console_lock);
         lock((&buf->work));
      
        *** DEADLOCK ***
      
       1 lock on stack by (agetty)/26163:
        #0: blocked:  (console_lock){+.+.+.}, instance: ffffffff81c2fde0, at: [<ffffffff813bc201>] vt_ioctl+0xb61/0x1230
       stack backtrace:
       Pid: 26163, comm: (agetty) Not tainted 3.9.0+ #16
       Call Trace:
        [<ffffffff815edb14>] print_circular_bug+0x200/0x20e
        [<ffffffff810b349a>] __lock_acquire+0x193a/0x1c00
        [<ffffffff8100a269>] ? sched_clock+0x9/0x10
        [<ffffffff8100a269>] ? sched_clock+0x9/0x10
        [<ffffffff8100a200>] ? native_sched_clock+0x20/0x80
        [<ffffffff810b3f74>] lock_acquire+0xa4/0x210
        [<ffffffff81062065>] ? flush_work+0x5/0x2e0
        [<ffffffff810620ae>] flush_work+0x4e/0x2e0
        [<ffffffff81062065>] ? flush_work+0x5/0x2e0
        [<ffffffff810b15db>] ? mark_held_locks+0xbb/0x140
        [<ffffffff8113c8a3>] ? __free_pages_ok.part.57+0x93/0xc0
        [<ffffffff810b15db>] ? mark_held_locks+0xbb/0x140
        [<ffffffff810652f2>] ? __cancel_work_timer+0x82/0x130
        [<ffffffff81065305>] __cancel_work_timer+0x95/0x130
        [<ffffffff810653b0>] cancel_work_sync+0x10/0x20
        [<ffffffff813b8212>] tty_port_destroy+0x12/0x20
        [<ffffffff813c65e8>] vc_deallocate+0xf8/0x110
        [<ffffffff813bc20c>] vt_ioctl+0xb6c/0x1230
        [<ffffffff810aec41>] ? lock_release_holdtime.part.30+0xa1/0x170
        [<ffffffff813b01a5>] tty_ioctl+0x285/0xd50
        [<ffffffff812b00f6>] ? inode_has_perm.isra.46.constprop.61+0x56/0x80
        [<ffffffff811ba825>] do_vfs_ioctl+0x305/0x530
        [<ffffffff812b04db>] ? selinux_file_ioctl+0x5b/0x110
        [<ffffffff811baad1>] sys_ioctl+0x81/0xa0
        [<ffffffff81601d59>] system_call_fastpath+0x16/0x1b
      
      Cc: Dave Jones <davej@redhat.com>
      Signed-off-by: NPeter Hurley <peter@hurleysoftware.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      421b40a6
  3. 20 5月, 2013 3 次提交
    • E
      filter: do not output bpf image address for security reason · 16495445
      Eric Dumazet 提交于
      Do not leak starting address of BPF JIT code for non root users,
      as it might help intruders to perform an attack.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Ben Hutchings <bhutchings@solarflare.com>
      Cc: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      16495445
    • Y
      tcp: remove bad timeout logic in fast recovery · 3e59cb0d
      Yuchung Cheng 提交于
      tcp_timeout_skb() was intended to trigger fast recovery on timeout,
      unfortunately in reality it often causes spurious retransmission
      storms during fast recovery. The particular sign is a fast retransmit
      over the highest sacked sequence (SND.FACK).
      
      Currently the RTO timer re-arming (as in RFC6298) offers a nice cushion
      to avoid spurious timeout: when SND.UNA advances the sender re-arms
      RTO and extends the timeout by icsk_rto. The sender does not offset
      the time elapsed since the packet at SND.UNA was sent.
      
      But if the next (DUP)ACK arrives later than ~RTTVAR and triggers
      tcp_fastretrans_alert(), then tcp_timeout_skb() will mark any packet
      sent before the icsk_rto interval lost, including one that's above the
      highest sacked sequence. Most likely a large part of scorebard will be
      marked.
      
      If most packets are not lost then the subsequent DUPACKs with new SACK
      blocks will cause the sender to continue to retransmit packets beyond
      SND.FACK spuriously. Even if only one packet is lost the sender may
      falsely retransmit almost the entire window.
      
      The situation becomes common in the world of bufferbloat: the RTT
      continues to grow as the queue builds up but RTTVAR remains small and
      close to the minimum 200ms. If a data packet is lost and the DUPACK
      triggered by the next data packet is slightly delayed, then a spurious
      retransmission storm forms.
      
      As the original comment on tcp_timeout_skb() suggests: the usefulness
      of this feature is questionable. It also wastes cycles walking the
      sack scoreboard and is actually harmful because of false recovery.
      
      It's time to remove this.
      Signed-off-by: NYuchung Cheng <ycheng@google.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Acked-by: NNeal Cardwell <ncardwell@google.com>
      Acked-by: NNandita Dukkipati <nanditad@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3e59cb0d
    • R
      Hoist memcpy_fromiovec/memcpy_toiovec into lib/ · d2f83e90
      Rusty Russell 提交于
      ERROR: "memcpy_fromiovec" [drivers/vhost/vhost_scsi.ko] undefined!
      
      That function is only present with CONFIG_NET.  Turns out that
      crypto/algif_skcipher.c also uses that outside net, but it actually
      needs sockets anyway.
      
      In addition, commit 6d4f0139 added
      CONFIG_NET dependency to CONFIG_VMCI for memcpy_toiovec, so hoist
      that function and revert that commit too.
      
      socket.h already includes uio.h, so no callers need updating; trying
      only broke things fo x86_64 randconfig (thanks Fengguang!).
      Reported-by: NRandy Dunlap <rdunlap@infradead.org>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Acked-by: NMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      d2f83e90
  4. 18 5月, 2013 2 次提交
  5. 17 5月, 2013 4 次提交
  6. 16 5月, 2013 1 次提交
  7. 15 5月, 2013 3 次提交
  8. 14 5月, 2013 1 次提交
  9. 13 5月, 2013 1 次提交
  10. 12 5月, 2013 1 次提交
  11. 10 5月, 2013 5 次提交
    • A
      dm: document iterate_devices · 058ce5ca
      Alasdair G Kergon 提交于
      Document iterate_devices in device-mapper.h.
      Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
      058ce5ca
    • M
      tracing: Modify soft-mode only if there's no other referrer · 1cf4c073
      Masami Hiramatsu 提交于
      Modify soft-mode flag only if no other soft-mode referrer
      (currently only the ftrace triggers) by using a reference
      counter in each ftrace_event_file.
      
      Without this fix, adding and removing several different
      enable/disable_event triggers on the same event clear
      soft-mode bit from the ftrace_event_file. This also
      happens with a typo of glob on setting triggers.
      
      e.g.
      
       # echo vfs_symlink:enable_event:net:netif_rx > set_ftrace_filter
       # cat events/net/netif_rx/enable
       0*
       # echo typo_func:enable_event:net:netif_rx > set_ftrace_filter
       # cat events/net/netif_rx/enable
       0
       # cat set_ftrace_filter
       #### all functions enabled ####
       vfs_symlink:enable_event:net:netif_rx:unlimited
      
      As above, we still have a trigger, but soft-mode is gone.
      
      Link: http://lkml.kernel.org/r/20130509054429.30398.7464.stgit@mhiramat-M0-7522
      
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: David Sharp <dhsharp@google.com>
      Cc: Hiraku Toyooka <hiraku.toyooka.gu@hitachi.com>
      Cc: Tom Zanussi <tom.zanussi@intel.com>
      Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      1cf4c073
    • M
      ftrace, kprobes: Fix a deadlock on ftrace_regex_lock · f04f24fb
      Masami Hiramatsu 提交于
      Fix a deadlock on ftrace_regex_lock which happens when setting
      an enable_event trigger on dynamic kprobe event as below.
      
      ----
      sh-2.05b# echo p vfs_symlink > kprobe_events
      sh-2.05b# echo vfs_symlink:enable_event:kprobes:p_vfs_symlink_0 > set_ftrace_filter
      
      =============================================
      [ INFO: possible recursive locking detected ]
      3.9.0+ #35 Not tainted
      ---------------------------------------------
      sh/72 is trying to acquire lock:
       (ftrace_regex_lock){+.+.+.}, at: [<ffffffff810ba6c1>] ftrace_set_hash+0x81/0x1f0
      
      but task is already holding lock:
       (ftrace_regex_lock){+.+.+.}, at: [<ffffffff810b7cbd>] ftrace_regex_write.isra.29.part.30+0x3d/0x220
      
      other info that might help us debug this:
       Possible unsafe locking scenario:
      
             CPU0
             ----
        lock(ftrace_regex_lock);
        lock(ftrace_regex_lock);
      
       *** DEADLOCK ***
      ----
      
      To fix that, this introduces a finer regex_lock for each ftrace_ops.
      ftrace_regex_lock is too big of a lock which protects all
      filter/notrace_hash operations, but it doesn't need to be a global
      lock after supporting multiple ftrace_ops because each ftrace_ops
      has its own filter/notrace_hash.
      
      Link: http://lkml.kernel.org/r/20130509054417.30398.84254.stgit@mhiramat-M0-7522
      
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Tom Zanussi <tom.zanussi@intel.com>
      Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      [ Added initialization flag and automate mutex initialization for
        non ftrace.c ftrace_probes. ]
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      f04f24fb
    • F
      ARM: imx: Select GENERIC_ALLOCATOR · 60371952
      Fabio Estevam 提交于
      Since commit 657eee7d (media: coda: use genalloc API) the following build
      error happens with imx_v4_v5_defconfig:
      
      drivers/built-in.o: In function 'coda_remove':
      clk-composite.c:(.text+0x112180): undefined reference to 'gen_pool_free'
      drivers/built-in.o: In function 'coda_probe':
      clk-composite.c:(.text+0x112310): undefined reference to 'of_get_named_gen_pool'
      clk-composite.c:(.text+0x1123f4): undefined reference to 'gen_pool_alloc'
      clk-composite.c:(.text+0x11240c): undefined reference to 'gen_pool_virt_to_phys'
      clk-composite.c:(.text+0x112458): undefined reference to 'dev_get_gen_pool'
      
      Select GENERIC_ALLOCATOR and get rid of the custom IRAM_ALLOC.
      Signed-off-by: NFabio Estevam <fabio.estevam@freescale.com>
      Signed-off-by: NShawn Guo <shawn.guo@linaro.org>
      Signed-off-by: NOlof Johansson <olof@lixom.net>
      60371952
    • A
      91c2e0bc
  12. 09 5月, 2013 1 次提交
    • D
      usbnet: allow status interrupt URB to always be active · 6eecdc5f
      Dan Williams 提交于
      Some drivers (sierra_net) need the status interrupt URB
      active even when the device is closed, because they receive
      custom indications from firmware.  Add functions to refcount
      the status interrupt URB submit/kill operation so that
      sub-drivers and the generic driver don't fight over whether
      the status interrupt URB is active or not.
      
      A sub-driver can call usbnet_status_start() at any time, but
      the URB is only submitted the first time the function is
      called.  Likewise, when the sub-driver is done with the URB,
      it calls usbnet_status_stop() but the URB is only killed when
      all users have stopped it.  The URB is still killed and
      re-submitted for suspend/resume, as before, with the same
      refcount it had at suspend.
      Signed-off-by: NDan Williams <dcbw@redhat.com>
      Acked-by: NOliver Neukum <oliver@neukum.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6eecdc5f
  13. 08 5月, 2013 13 次提交
    • M
      NVMe: Simplify Firmware Activate code slightly · ab3ea5bf
      Matthew Wilcox 提交于
      Add definitions for the three Firmware Activate actions, and change the
      SCSI translation code to construct the command into a temporary variable
      instead of translating the endianness back-and-forth.
      Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>
      Reviewed-by: NVishal Verma <vishal.l.verma@linux.intel.com>
      ab3ea5bf
    • K
      aio: don't include aio.h in sched.h · a27bb332
      Kent Overstreet 提交于
      Faster kernel compiles by way of fewer unnecessary includes.
      
      [akpm@linux-foundation.org: fix fallout]
      [akpm@linux-foundation.org: fix build]
      Signed-off-by: NKent Overstreet <koverstreet@google.com>
      Cc: Zach Brown <zab@redhat.com>
      Cc: Felipe Balbi <balbi@ti.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Asai Thambi S P <asamymuthupa@micron.com>
      Cc: Selvan Mani <smani@micron.com>
      Cc: Sam Bradshaw <sbradshaw@micron.com>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Reviewed-by: N"Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a27bb332
    • K
      aio: kill ki_retry · 41ef4eb8
      Kent Overstreet 提交于
      Thanks to Zach Brown's work to rip out the retry infrastructure, we don't
      need this anymore - ki_retry was only called right after the kiocb was
      initialized.
      
      This also refactors and trims some duplicated code, as well as cleaning up
      the refcounting/error handling a bit.
      
      [akpm@linux-foundation.org: use fmode_t in aio_run_iocb()]
      [akpm@linux-foundation.org: fix file_start_write/file_end_write tests]
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: NKent Overstreet <koverstreet@google.com>
      Cc: Zach Brown <zab@redhat.com>
      Cc: Felipe Balbi <balbi@ti.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Asai Thambi S P <asamymuthupa@micron.com>
      Cc: Selvan Mani <smani@micron.com>
      Cc: Sam Bradshaw <sbradshaw@micron.com>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Reviewed-by: N"Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      41ef4eb8
    • K
      aio: kill ki_key · 8a660890
      Kent Overstreet 提交于
      ki_key wasn't actually used for anything previously - it was always 0.
      Drop it to trim struct kiocb a bit.
      Signed-off-by: NKent Overstreet <koverstreet@google.com>
      Cc: Zach Brown <zab@redhat.com>
      Cc: Felipe Balbi <balbi@ti.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Asai Thambi S P <asamymuthupa@micron.com>
      Cc: Selvan Mani <smani@micron.com>
      Cc: Sam Bradshaw <sbradshaw@micron.com>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Reviewed-by: N"Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8a660890
    • E
      audit: Make testing for a valid loginuid explicit. · 780a7654
      Eric W. Biederman 提交于
      audit rule additions containing "-F auid!=4294967295" were failing
      with EINVAL because of a regression caused by e1760bd5.
      
      Apparently some userland audit rule sets want to know if loginuid uid
      has been set and are using a test for auid != 4294967295 to determine
      that.
      
      In practice that is a horrible way to ask if a value has been set,
      because it relies on subtle implementation details and will break
      every time the uid implementation in the kernel changes.
      
      So add a clean way to test if the audit loginuid has been set, and
      silently convert the old idiom to the cleaner and more comprehensible
      new idiom.
      
      Cc: <stable@vger.kernel.org> # 3.7
      Reported-By: NRichard Guy Briggs <rgb@redhat.com>
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Tested-by: NRichard Guy Briggs <rgb@redhat.com>
      Signed-off-by: NEric Paris <eparis@redhat.com>
      780a7654
    • K
      aio: kill batch allocation · a1c8eae7
      Kent Overstreet 提交于
      Previously, allocating a kiocb required touching quite a few global
      (well, per kioctx) cachelines...  so batching up allocation to amortize
      those was worthwhile.  But we've gotten rid of some of those, and in
      another couple of patches kiocb allocation won't require writing to any
      shared cachelines, so that means we can just rip this code out.
      Signed-off-by: NKent Overstreet <koverstreet@google.com>
      Cc: Zach Brown <zab@redhat.com>
      Cc: Felipe Balbi <balbi@ti.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Asai Thambi S P <asamymuthupa@micron.com>
      Cc: Selvan Mani <smani@micron.com>
      Cc: Sam Bradshaw <sbradshaw@micron.com>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Reviewed-by: N"Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a1c8eae7
    • K
      aio: use cancellation list lazily · 0460fef2
      Kent Overstreet 提交于
      Cancelling kiocbs requires adding them to a per kioctx linked list,
      which is one of the few things we need to take the kioctx lock for in
      the fast path.  But most kiocbs can't be cancelled - so if we just do
      this lazily, we can avoid quite a bit of locking overhead.
      
      While we're at it, instead of using a flag bit switch to using ki_cancel
      itself to indicate that a kiocb has been cancelled/completed.  This lets
      us get rid of ki_flags entirely.
      
      [akpm@linux-foundation.org: remove buggy BUG()]
      Signed-off-by: NKent Overstreet <koverstreet@google.com>
      Cc: Zach Brown <zab@redhat.com>
      Cc: Felipe Balbi <balbi@ti.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Asai Thambi S P <asamymuthupa@micron.com>
      Cc: Selvan Mani <smani@micron.com>
      Cc: Sam Bradshaw <sbradshaw@micron.com>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Reviewed-by: N"Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0460fef2
    • K
      wait: add wait_event_hrtimeout() · 774a08b3
      Kent Overstreet 提交于
      Analagous to wait_event_timeout() and friends, this adds
      wait_event_hrtimeout() and wait_event_interruptible_hrtimeout().
      
      Note that unlike the versions that use regular timers, these don't
      return the amount of time remaining when they return - instead, they
      return 0 or -ETIME if they timed out.  because I was uncomfortable with
      the semantics of doing it the other way (that I could get it right,
      anyways).
      
      If the timer expires, there's no real guarantee that expire_time -
      current_time would be <= 0 - due to timer slack certainly, and I'm not
      sure I want to know the implications of the different clock bases in
      hrtimers.
      
      If the timer does expire and the code calculates that the time remaining
      is nonnegative, that could be even worse if the calling code then reuses
      that timeout.  Probably safer to just return 0 then, but I could imagine
      weird bugs or at least unintended behaviour arising from that too.
      
      I came to the conclusion that if other users end up actually needing the
      amount of time remaining, the sanest thing to do would be to create a
      version that uses absolute timeouts instead of relative.
      
      [akpm@linux-foundation.org: fix description of `timeout' arg]
      Signed-off-by: NKent Overstreet <koverstreet@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Zach Brown <zab@redhat.com>
      Cc: Felipe Balbi <balbi@ti.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Asai Thambi S P <asamymuthupa@micron.com>
      Cc: Selvan Mani <smani@micron.com>
      Cc: Sam Bradshaw <sbradshaw@micron.com>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Reviewed-by: N"Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      774a08b3
    • K
      aio: make aio_put_req() lockless · 11599eba
      Kent Overstreet 提交于
      Freeing a kiocb needed to touch the kioctx for three things:
      
       * Pull it off the reqs_active list
       * Decrementing reqs_active
       * Issuing a wakeup, if the kioctx was in the process of being freed.
      
      This patch moves these to aio_complete(), for a couple reasons:
      
       * aio_complete() already has to issue the wakeup, so if we drop the
         kioctx refcount before aio_complete does its wakeup we don't have to
         do it twice.
       * aio_complete currently has to take the kioctx lock, so it makes sense
         for it to pull the kiocb off the reqs_active list too.
       * A later patch is going to change reqs_active to include unreaped
         completions - this will mean allocating a kiocb doesn't have to look
         at the ringbuffer. So taking the decrement of reqs_active out of
         kiocb_free() is useful prep work for that patch.
      
      This doesn't really affect cancellation, since existing (usb) code that
      implements a cancel function still calls aio_complete() - we just have
      to make sure that aio_complete does the necessary teardown for cancelled
      kiocbs.
      
      It does affect code paths where we free kiocbs that were never
      submitted; they need to decrement reqs_active and pull the kiocb off the
      reqs_active list.  This occurs in two places: kiocb_batch_free(), which
      is going away in a later patch, and the error path in io_submit_one.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: NKent Overstreet <koverstreet@google.com>
      Cc: Zach Brown <zab@redhat.com>
      Cc: Felipe Balbi <balbi@ti.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Asai Thambi S P <asamymuthupa@micron.com>
      Cc: Selvan Mani <smani@micron.com>
      Cc: Sam Bradshaw <sbradshaw@micron.com>
      Acked-by: NJeff Moyer <jmoyer@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Reviewed-by: N"Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      11599eba
    • K
      aio: move private stuff out of aio.h · 4e179bca
      Kent Overstreet 提交于
      Signed-off-by: NKent Overstreet <koverstreet@google.com>
      Cc: Zach Brown <zab@redhat.com>
      Cc: Felipe Balbi <balbi@ti.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Asai Thambi S P <asamymuthupa@micron.com>
      Cc: Selvan Mani <smani@micron.com>
      Cc: Sam Bradshaw <sbradshaw@micron.com>
      Acked-by: NJeff Moyer <jmoyer@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      4e179bca
    • K
      aio: kill return value of aio_complete() · 2d68449e
      Kent Overstreet 提交于
      Nothing used the return value, and it probably wasn't possible to use it
      safely for the locked versions (aio_complete(), aio_put_req()).  Just
      kill it.
      Signed-off-by: NKent Overstreet <koverstreet@google.com>
      Acked-by: NZach Brown <zab@redhat.com>
      Cc: Felipe Balbi <balbi@ti.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Asai Thambi S P <asamymuthupa@micron.com>
      Cc: Selvan Mani <smani@micron.com>
      Cc: Sam Bradshaw <sbradshaw@micron.com>
      Acked-by: NJeff Moyer <jmoyer@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Reviewed-by: N"Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2d68449e
    • Z
      aio: remove retry-based AIO · 41003a7b
      Zach Brown 提交于
      This removes the retry-based AIO infrastructure now that nothing in tree
      is using it.
      
      We want to remove retry-based AIO because it is fundemantally unsafe.
      It retries IO submission from a kernel thread that has only assumed the
      mm of the submitting task.  All other task_struct references in the IO
      submission path will see the kernel thread, not the submitting task.
      This design flaw means that nothing of any meaningful complexity can use
      retry-based AIO.
      
      This removes all the code and data associated with the retry machinery.
      The most significant benefit of this is the removal of the locking
      around the unused run list in the submission path.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: NKent Overstreet <koverstreet@google.com>
      Signed-off-by: NZach Brown <zab@redhat.com>
      Cc: Zach Brown <zab@redhat.com>
      Cc: Felipe Balbi <balbi@ti.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Asai Thambi S P <asamymuthupa@micron.com>
      Cc: Selvan Mani <smani@micron.com>
      Cc: Sam Bradshaw <sbradshaw@micron.com>
      Acked-by: NJeff Moyer <jmoyer@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Reviewed-by: N"Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      41003a7b
    • Z
      aio: remove dead code from aio.h · 4b49bb8a
      Zach Brown 提交于
      Signed-off-by: NZach Brown <zab@redhat.com>
      Signed-off-by: NKent Overstreet <koverstreet@google.com>
      Cc: Felipe Balbi <balbi@ti.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Asai Thambi S P <asamymuthupa@micron.com>
      Cc: Selvan Mani <smani@micron.com>
      Cc: Sam Bradshaw <sbradshaw@micron.com>
      Acked-by: NJeff Moyer <jmoyer@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Reviewed-by: N"Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      4b49bb8a