1. 29 1月, 2016 1 次提交
    • P
      n_tty: Always wake up read()/poll() if new input · 33d71363
      Peter Hurley 提交于
      A read() in non-canonical mode when VMIN > 0 and VTIME == 0 does not
      complete until at least VMIN chars have been read (or the user buffer is
      full). In this infrequent read mode, n_tty_read() attempts to reduce
      wakeups by computing the amount of data still necessary to complete the
      read (minimum_to_wake) and only waking the read()/poll() when that much
      unread data has been processed. This is the only read mode for which
      new data does not necessarily generate a wakeup.
      
      However, this optimization is broken and commonly leads to hung reads
      even though the necessary amount of data has been received. Since the
      optimization is of marginal value anyway, just remove the whole
      thing. This also remedies a race between a concurrent poll() and
      read() in this mode, where the poll() can reset the minimum_to_wake
      of the read() (and vice versa).
      Signed-off-by: NPeter Hurley <peter@hurleysoftware.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      33d71363
  2. 28 1月, 2016 6 次提交
  3. 27 1月, 2016 1 次提交
  4. 14 12月, 2015 4 次提交
  5. 13 12月, 2015 1 次提交
    • P
      n_tty: Fix poll() after buffer-limited eof push read · ac8f3bf8
      Peter Hurley 提交于
      commit 40d5e090 ("n_tty: Fix EOF push handling") fixed EOF push
      for reads. However, that approach still allows a condition mismatch
      between poll() and read(), where poll() returns POLLIN but read()
      blocks. This state can happen when a previous read() returned because
      the user buffer was full and the next character was an EOF not at the
      beginning of the line. While the next read() will properly identify
      the condition and advance the read buffer tail without improperly
      indicating an EOF file condition (ie., read() will not mistakenly
      return 0), poll() will mistakenly indicate POLLIN.
      
      Although a possible solution would be to peek at the input buffer
      in n_tty_poll(), the better solution in this patch is to eat the
      EOF during the previous read() (ie., fix the problem by eliminating
      the condition).
      
      The current canon line buffer copy limits the scan for next end-of-line
      to the smaller of either,
         a. the remaining user buffer size
         b. completed lines in the input buffer
      When the remaining user buffer size is exactly one less than the
      end-of-line marked by EOF push, the EOF is not scanned nor skipped
      but left for subsequent reads. In the example below, the scan
      index 'eol' has stopped at the EOF because it is past the scan
      limit of 5 (not because it has found the next set bit in read_flags)
      
         user buffer [*nr = 5]    _ _ _ _ _
      
         read_flags               0 0 0 0 0   1
         input buffer             h e l l o [EOF]
                                  ^           ^
                                 /           /
                               tail        eol
      
         result: found = 0, tail += 5, *nr += 5
      
      Instead, allow the scan to peek ahead 1 byte (while still limiting the
      scan to completed lines in the input buffer). For the example above,
      
         result: found = 1, tail += 6, *nr += 5
      
      Because the scan limit is now bumped +1 byte, when the scan is
      completed, the tail advance and the user buffer copy limit is
      re-clamped to *nr when EOF is _not_ found.
      
      Fixes: 40d5e090 ("n_tty: Fix EOF push handling")
      Cc: <stable@vger.kernel.org> # 3.12+
      Signed-off-by: NPeter Hurley <peter@hurleysoftware.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ac8f3bf8
  6. 21 11月, 2015 1 次提交
  7. 18 10月, 2015 3 次提交
  8. 05 10月, 2015 1 次提交
    • K
      tty: fix stall caused by missing memory barrier in drivers/tty/n_tty.c · e81107d4
      Kosuke Tatsukawa 提交于
      My colleague ran into a program stall on a x86_64 server, where
      n_tty_read() was waiting for data even if there was data in the buffer
      in the pty.  kernel stack for the stuck process looks like below.
       #0 [ffff88303d107b58] __schedule at ffffffff815c4b20
       #1 [ffff88303d107bd0] schedule at ffffffff815c513e
       #2 [ffff88303d107bf0] schedule_timeout at ffffffff815c7818
       #3 [ffff88303d107ca0] wait_woken at ffffffff81096bd2
       #4 [ffff88303d107ce0] n_tty_read at ffffffff8136fa23
       #5 [ffff88303d107dd0] tty_read at ffffffff81368013
       #6 [ffff88303d107e20] __vfs_read at ffffffff811a3704
       #7 [ffff88303d107ec0] vfs_read at ffffffff811a3a57
       #8 [ffff88303d107f00] sys_read at ffffffff811a4306
       #9 [ffff88303d107f50] entry_SYSCALL_64_fastpath at ffffffff815c86d7
      
      There seems to be two problems causing this issue.
      
      First, in drivers/tty/n_tty.c, __receive_buf() stores the data and
      updates ldata->commit_head using smp_store_release() and then checks
      the wait queue using waitqueue_active().  However, since there is no
      memory barrier, __receive_buf() could return without calling
      wake_up_interactive_poll(), and at the same time, n_tty_read() could
      start to wait in wait_woken() as in the following chart.
      
              __receive_buf()                         n_tty_read()
      ------------------------------------------------------------------------
      if (waitqueue_active(&tty->read_wait))
      /* Memory operations issued after the
         RELEASE may be completed before the
         RELEASE operation has completed */
                                              add_wait_queue(&tty->read_wait, &wait);
                                              ...
                                              if (!input_available_p(tty, 0)) {
      smp_store_release(&ldata->commit_head,
                        ldata->read_head);
                                              ...
                                              timeout = wait_woken(&wait,
                                                TASK_INTERRUPTIBLE, timeout);
      ------------------------------------------------------------------------
      
      The second problem is that n_tty_read() also lacks a memory barrier
      call and could also cause __receive_buf() to return without calling
      wake_up_interactive_poll(), and n_tty_read() to wait in wait_woken()
      as in the chart below.
      
              __receive_buf()                         n_tty_read()
      ------------------------------------------------------------------------
                                              spin_lock_irqsave(&q->lock, flags);
                                              /* from add_wait_queue() */
                                              ...
                                              if (!input_available_p(tty, 0)) {
                                              /* Memory operations issued after the
                                                 RELEASE may be completed before the
                                                 RELEASE operation has completed */
      smp_store_release(&ldata->commit_head,
                        ldata->read_head);
      if (waitqueue_active(&tty->read_wait))
                                              __add_wait_queue(q, wait);
                                              spin_unlock_irqrestore(&q->lock,flags);
                                              /* from add_wait_queue() */
                                              ...
                                              timeout = wait_woken(&wait,
                                                TASK_INTERRUPTIBLE, timeout);
      ------------------------------------------------------------------------
      
      There are also other places in drivers/tty/n_tty.c which have similar
      calls to waitqueue_active(), so instead of adding many memory barrier
      calls, this patch simply removes the call to waitqueue_active(),
      leaving just wake_up*() behind.
      
      This fixes both problems because, even though the memory access before
      or after the spinlocks in both wake_up*() and add_wait_queue() can
      sneak into the critical section, it cannot go past it and the critical
      section assures that they will be serialized (please see "INTER-CPU
      ACQUIRING BARRIER EFFECTS" in Documentation/memory-barriers.txt for a
      better explanation).  Moreover, the resulting code is much simpler.
      
      Latency measurement using a ping-pong test over a pty doesn't show any
      visible performance drop.
      Signed-off-by: NKosuke Tatsukawa <tatsu@ab.jp.nec.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e81107d4
  9. 24 7月, 2015 2 次提交
  10. 01 6月, 2015 1 次提交
  11. 25 5月, 2015 1 次提交
  12. 11 5月, 2015 1 次提交
    • P
      pty: Fix input race when closing · 1a48632f
      Peter Hurley 提交于
      A read() from a pty master may mistakenly indicate EOF (errno == -EIO)
      after the pty slave has closed, even though input data remains to be read.
      For example,
      
             pty slave       |        input worker        |    pty master
                             |                            |
                             |                            |   n_tty_read()
      pty_write()            |                            |     input avail? no
        add data             |                            |     sleep
        schedule worker  --->|                            |     .
                             |---> flush_to_ldisc()       |     .
      pty_close()            |       fill read buffer     |     .
        wait for worker      |       wakeup reader    --->|     .
                             |       read buffer full?    |---> input avail ? yes
                             |<---   yes - exit worker    |     copy 4096 bytes to user
        TTY_OTHER_CLOSED <---|                            |<--- kick worker
                             |                            |
      
      		                **** New read() before worker starts ****
      
                             |                            |   n_tty_read()
                             |                            |     input avail? no
                             |                            |     TTY_OTHER_CLOSED? yes
                             |                            |     return -EIO
      
      Several conditions are required to trigger this race:
      1. the ldisc read buffer must become full so the input worker exits
      2. the read() count parameter must be >= 4096 so the ldisc read buffer
         is empty
      3. the subsequent read() occurs before the kicked worker has processed
         more input
      
      However, the underlying cause of the race is that data is pipelined, while
      tty state is not; ie., data already written by the pty slave end is not
      yet visible to the pty master end, but state changes by the pty slave end
      are visible to the pty master end immediately.
      
      Pipeline the TTY_OTHER_CLOSED state through input worker to the reader.
      1. Introduce TTY_OTHER_DONE which is set by the input worker when
         TTY_OTHER_CLOSED is set and either the input buffers are flushed or
         input processing has completed. Readers/polls are woken when
         TTY_OTHER_DONE is set.
      2. Reader/poll checks TTY_OTHER_DONE instead of TTY_OTHER_CLOSED.
      3. A new input worker is started from pty_close() after setting
         TTY_OTHER_CLOSED, which ensures the TTY_OTHER_DONE state will be
         set if the last input worker is already finished (or just about to
         exit).
      
      Remove tty_flush_to_ldisc(); no in-tree callers.
      
      Fixes: 52bce7f8 ("pty, n_tty: Simplify input processing on final close")
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=96311
      BugLink: http://bugs.launchpad.net/bugs/1429756
      Cc: <stable@vger.kernel.org> # 3.19+
      Reported-by: NAndy Whitcroft <apw@canonical.com>
      Reported-by: NH.J. Lu <hjl.tools@gmail.com>
      Signed-off-by: NPeter Hurley <peter@hurleysoftware.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1a48632f
  13. 07 5月, 2015 1 次提交
  14. 03 2月, 2015 7 次提交
    • P
      n_tty: Fix signal handling flushes · d2b6f447
      Peter Hurley 提交于
      BRKINT and ISIG requires input and output flush when a signal char
      is received. However, the order of operations is significant since
      parallel i/o may be ongoing.
      
      Merge the signal handling for BRKINT with ISIG handling.
      
      Process the signal first. This ensures any ongoing i/o is aborted;
      without this, a waiting writer may continue writing after the flush
      occurs and after the signal char has been echoed.
      
      Write lock the termios_rwsem, which excludes parallel writers from
      pushing new i/o until after the output buffers are flushed; claiming
      the write lock is necessary anyway to exclude parallel readers while
      the read buffer is flushed.
      
      Subclass the termios_rwsem for ptys since the slave pty performing
      the flush may appear to reorder the termios_rwsem->tty buffer lock
      lock order; adding annotation clarifies that
        slave tty_buffer lock-> slave termios_rwsem -> master tty_buffer lock
      is a valid lock order.
      
      Flush the echo buffer. In this context, the echo buffer is 'output'.
      Otherwise, the output will appear discontinuous because the output buffer
      was cleared which contains older output than the echo buffer.
      
      Open-code the read buffer flush since the input worker does not need
      kicking (this is the input worker).
      Signed-off-by: NPeter Hurley <peter@hurleysoftware.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d2b6f447
    • P
      n_tty: Fix read buffer overwrite when no newline · fb5ef9e7
      Peter Hurley 提交于
      In canon mode, the read buffer head will advance over the buffer tail
      if the input > 4095 bytes without receiving a line termination char.
      
      Discard additional input until a line termination is received.
      Before evaluating for overflow, the 'room' value is normalized for
      I_PARMRK and 1 byte is reserved for line termination (even in !icanon
      mode, in case the mode is switched). The following table shows the
      transform:
      
       actual buffer |  'room' value before overflow calc
        space avail  |    !I_PARMRK    |    I_PARMRK
       --------------------------------------------------
            0        |       -1        |       -1
            1        |        0        |        0
            2        |        1        |        0
            3        |        2        |        0
            4+       |        3        |        1
      
      When !icanon or when icanon and the read buffer contains newlines,
      normalized 'room' values of -1 and 0 are clamped to 0, and
      'overflow' is 0, so read_head is not adjusted and the input i/o loop
      exits (setting no_room if called from flush_to_ldisc()). No input
      is discarded since the reader does have input available to read
      which ensures forward progress.
      
      When icanon and the read buffer does not contain newlines and the
      normalized 'room' value is 0, then overflow and room are reset to 1,
      so that the i/o loop will process the next input char normally
      (except for parity errors which are ignored). Thus, erasures, signalling
      chars, 7-bit mode, etc. will continue to be handled properly.
      
      If the input char processed was not a line termination char, then
      the canon_head index will not have advanced, so the normalized 'room'
      value will now be -1 and 'overflow' will be set, which indicates the
      read_head can safely be reset, effectively erasing the last char
      processed.
      
      If the input char processed was a line termination, then the
      canon_head index will have advanced, so 'overflow' is cleared to 0,
      the read_head is not reset, and 'room' is cleared to 0, which exits
      the i/o loop (because the reader now have input available to read
      which ensures forward progress).
      
      Note that it is possible for a line termination to be received, and
      for the reader to copy the line to the user buffer before the
      input i/o loop is ready to process the next input char. This is
      why the i/o loop recomputes the room/overflow state with every
      input char while handling overflow.
      
      Finally, if the input data was processed without receiving
      a line termination (so that overflow is still set), the pty
      driver must receive a write wakeup. A pty writer may be waiting
      to write more data in n_tty_write() but without unthrottling
      here that wakeup will not arrive, and forward progress will halt.
      (Normally, the pty writer is woken when the reader reads data out
      of the buffer and more space become available).
      Signed-off-by: NPeter Hurley <peter@hurleysoftware.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fb5ef9e7
    • P
      n_tty: Fix PARMRK over-throttling · 06c49f9f
      Peter Hurley 提交于
      If PARMRK is enabled, the available read buffer space computation is
      overly-pessimistic, which results in severely throttled i/o, even
      in the absence of parity errors. For example, if the 4k read buffer
      contains 1k processed data, the input worker will compute available
      space of 333 bytes, despite 3k being available. At 1365 chars of
      processed data, 0 space available is computed.
      
      *Divide remaining space* by 3, truncating down (if left == 2, left = 0).
      Reported-by: NChristian Riesch <christian.riesch@omicron.at>
      
      Conflicts:
      	drivers/tty/n_tty.c
      Signed-off-by: NPeter Hurley <peter@hurleysoftware.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      06c49f9f
    • P
      n_tty: Fix unordered accesses to lockless read buffer · 70aca71f
      Peter Hurley 提交于
      Add commit_head buffer index, which the producer-side publishes
      after input processing in non-canon mode. This ensures the consumer-side
      observes correctly-ordered writes in non-canonical mode (ie., the buffer
      data is written before the buffer index is advanced). Fix consumer-side
      uses of read_cnt() to use commit_head instead.
      
      Add required memory barriers to the tail index to guarantee
      the consumer-side has completed the loads before the producer-side
      begins writing new data. Open-code the producer-side receive_room()
      into the i/o loop.
      
      Remove no-longer-referenced receive_room().
      
      Based on work by Christian Riesch <christian.riesch@omicron.at>
      
      Cc: Christian Riesch <christian.riesch@omicron.at>
      Signed-off-by: NPeter Hurley <peter@hurleysoftware.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      70aca71f
    • P
      n_tty: Simplify throttle threshold calculation · 5e28cca1
      Peter Hurley 提交于
      The adjustments performed by receive_room() are to ensure a line
      termination can always be written to the read buffer. However,
      these adjustments are irrelevant to the throttle threshold (because
      the threshold < buffer limit).
      Signed-off-by: NPeter Hurley <peter@hurleysoftware.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5e28cca1
    • P
      n_tty: Fix throttle for canon lines > 3967 chars · a342846f
      Peter Hurley 提交于
      The tty driver will be mistakenly throttled if a line termination
      has not been received, and the line exceeds 3967 chars. Thus, it is
      possible for the driver to stop sending when it has not yet sent
      the newline. This does not apply to the pty driver.
      
      Don't throttle until at least one line termination has been
      received.
      Signed-off-by: NPeter Hurley <peter@hurleysoftware.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a342846f
    • P
      n_tty: Eliminate receive_room() from consumer/exclusive paths · 2c5dc464
      Peter Hurley 提交于
      The input worker never reschedules itself; it only processes input until
      either there is no more input or the read buffer is full. So the reader
      is responsible for restarting the input worker only if the read buffer
      was previously full (no_room == 1) _and_ space is now available to process
      more input because the reader has consumed data from the read buffer.
      
      However, computing the actual space available is not required to determine
      if the reader has consumed data from the read buffer. This condition is
      evaluated in 5 situations, each of which the space avail is already known:
      1. n_tty_flush_buffer() - the read buffer is empty; kick the worker
      2. n_tty_set_termios() - no data has been consumed; do not kick the worker
             (although it may have kicked the reader so data _will be_ consumed)
      3. n_tty_check_unthrottle - avail space > 3968; kick the worker
      4. n_tty_read, before leaving - only kick the worker if the reader has
             moved the tail. This prevents unnecessarily kicking the worker
             when timeout-style reading is used.
      5. n_tty_read, before sleeping - although it is possible for the read
             buffer to be full and input_available_p() to be false, this can
             only happen when the input worker is racing the reader, in which
             case the reader will have been woken and won't sleep.
      
      Rename n_tty_set_room() to n_tty_kick_worker() to reflect what the
      function actually does.
      Signed-off-by: NPeter Hurley <peter@hurleysoftware.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2c5dc464
  15. 10 1月, 2015 1 次提交
  16. 27 11月, 2014 1 次提交
    • C
      n_tty: Fix read_buf race condition, increment read_head after pushing data · 8bfbe2de
      Christian Riesch 提交于
      Commit 19e2ad6a ("n_tty: Remove overflow
      tests from receive_buf() path") moved the increment of read_head into
      the arguments list of read_buf_addr(). Function calls represent a
      sequence point in C. Therefore read_head is incremented before the
      character c is placed in the buffer. Since the circular read buffer is
      a lock-less design since commit 6d76bd26
      ("n_tty: Make N_TTY ldisc receive path lockless"), this creates a race
      condition that leads to communication errors.
      
      This patch modifies the code to increment read_head _after_ the data
      is placed in the buffer and thus fixes the race for non-SMP machines.
      To fix the problem for SMP machines, memory barriers must be added in
      a separate patch.
      Signed-off-by: NChristian Riesch <christian.riesch@omicron.at>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8bfbe2de
  17. 07 11月, 2014 1 次提交
    • F
      tty: Fix pty master poll() after slave closes v2 · c4dc3046
      Francesco Ruggeri 提交于
      Commit f95499c3 ("n_tty: Don't wait for buffer work in read() loop")
      introduces a race window where a pty master can be signalled that the pty
      slave was closed before all the data that the slave wrote is delivered.
      Commit f8747d4a ("tty: Fix pty master read() after slave closes") fixed the
      problem in case of n_tty_read, but the problem still exists for n_tty_poll.
      This can be seen by running 'for ((i=0; i<100;i++));do ./test.py ;done'
      where test.py is:
      
      import os, select, pty
      
      (pid, pty_fd) = pty.fork()
      
      if pid == 0:
         os.write(1, 'This string should be received by parent')
      else:
         poller = select.epoll()
         poller.register( pty_fd, select.EPOLLIN )
         ready = poller.poll( 1 * 1000 )
         for fd, events in ready:
            if not events & select.EPOLLIN:
               print 'missed POLLIN event'
            else:
               print os.read(fd, 100)
         poller.close()
      
      The string from the slave is missed several times.
      This patch takes the same approach as the fix for read and special cases
      this condition for poll.
      Tested on 3.16.
      Signed-off-by: NFrancesco Ruggeri <fruggeri@arista.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c4dc3046
  18. 06 11月, 2014 6 次提交