1. 26 October 2020, 20 commits
  2. 25 September 2020, 1 commit
  3. 11 September 2020, 1 commit
    • epoll: EPOLL_CTL_ADD: close the race in decision to take fast path · fe0a916c
      Committed by Al Viro
      Checking for the lack of epitems referring to the epoll we want to insert into
      is not enough; we might have an insertion of that epoll into another one that
      has already collected the set of files to recheck for excessive reverse paths,
      but hasn't gotten to creating/inserting the epitem for it.
      
      However, any such insertion in progress can be detected - it will update the
      generation count in our epoll when it's done looking through it for files
      to check.  That gets done under ->mtx of our epoll and that allows us to
      detect that safely.
      
      We are *not* holding epmutex here, so the generation count is not stable.
      However, since both the update of ep->gen by loop check and (later)
      insertion into ->f_ep_link are done with ep->mtx held, we are fine -
      the sequence is
      	grab epmutex
      	bump loop_check_gen
      	...
      	grab tep->mtx		// 1
      	tep->gen = loop_check_gen
      	...
      	drop tep->mtx		// 2
      	...
      	grab tep->mtx		// 3
      	...
      	insert into ->f_ep_link
      	...
      	drop tep->mtx		// 4
      	bump loop_check_gen
      	drop epmutex
      and if the fastpath check in another thread happens for that eventpoll,
      it can come
      	* before (1) - in that case the fast path is just fine
      	* after (4) - we'll see a non-empty ->f_ep_link and take the slow path
      	* between (2) and (3) - loop_check_gen is stable, with ->mtx
      	  providing the barriers, and we end up taking the slow path.
      
      Note that the ->f_ep_link emptiness check is slightly racy - we are protected
      against insertions into that list, but removals can happen right under us.
      Not a problem - in the worst case we'll end up taking a slow path for
      no good reason.
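      
      For illustration, here is a minimal sketch of the fastpath decision this
      change produces in do_epoll_ctl() (simplified from fs/eventpoll.c of that
      period; surrounding error handling and locking are elided):
      
      	mutex_lock_nested(&ep->mtx, 0);
      	if (op == EPOLL_CTL_ADD) {
      		/*
      		 * The fast path is only safe if nothing refers to this epoll
      		 * AND no in-progress loop check has walked it: a walker sets
      		 * ep->gen = loop_check_gen under ep->mtx before it gets to
      		 * linking the epitem, so ep->gen == loop_check_gen flags the
      		 * window between (2) and (3) in the sequence above.
      		 */
      		if (!list_empty(&f.file->f_ep_links) ||
      		    ep->gen == loop_check_gen ||
      		    is_file_epoll(tf.file)) {
      			full_check = 1;		/* slow path under epmutex */
      			mutex_unlock(&ep->mtx);
      			/* ... grab epmutex, run ep_loop_check(), relock ... */
      		}
      	}
      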
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  4. 10 September 2020, 2 commits
  5. 02 September 2020, 1 commit
  6. 23 August 2020, 2 commits
  7. 15 May 2020, 1 commit
  8. 08 May 2020, 2 commits
    • epoll: atomically remove wait entry on wake up · 412895f0
      Committed by Roman Penyaev
      This patch does two things:
      
       - fixes a lost wakeup introduced by commit 339ddb53 ("fs/epoll:
         remove unnecessary wakeups of nested epoll")
      
       - improves performance for events delivery.
      
      The problem is the following: if N (>1) threads are waiting on ep->wq
      for new events and M (>1) events come, it is quite likely that more than
      one wakeup hits the same wait queue entry, because there is quite a big
      window between the __add_wait_queue_exclusive() and the following
      __remove_wait_queue() calls in ep_poll(); see the sketch after this
      paragraph.
      
      This can lead to lost wakeups, because the thread that was woken up may
      not handle all the events in ->rdllist.  (The problem is described in
      more detail here: https://lkml.org/lkml/2019/10/7/905)
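      
      As a rough sketch, the pre-patch ep_poll() wait pattern that opens this
      window looks like the following (simplified; the locking around the
      queue operations is elided):
      
      	init_waitqueue_entry(&wait, current);
      	__add_wait_queue_exclusive(&ep->wq, &wait);
      
      	while (!ep_events_available(ep)) {
      		set_current_state(TASK_INTERRUPTIBLE);
      		/* ... schedule until a timeout or a wakeup ... */
      	}
      	/*
      	 * Window: this thread is already runnable, yet the entry is still
      	 * queued as an exclusive waiter, so a second wakeup can hit it
      	 * here and be "used up" instead of waking another thread.
      	 */
      	__remove_wait_queue(&ep->wq, &wait);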
      
      The idea of the current patch is to use init_wait() instead of
      init_waitqueue_entry().
      
      Internally init_wait() sets autoremove_wake_function as the callback,
      which removes the wait entry atomically (under the wq lock) from the
      list, so the next wakeup hits the next wait entry in the wait queue,
      preventing lost wakeups.
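      
      For reference, the autoremove callback is essentially the following
      (as in kernel/sched/wait.c, shown slightly simplified):
      
      	int autoremove_wake_function(struct wait_queue_entry *wq_entry,
      				     unsigned mode, int sync, void *key)
      	{
      		int ret = default_wake_function(wq_entry, mode, sync, key);
      
      		/* unlink under the waitqueue lock on a successful wakeup,
      		 * so the next exclusive wakeup reaches the next waiter */
      		if (ret)
      			list_del_init(&wq_entry->entry);
      		return ret;
      	}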
      
      The problem is reliably reproduced by the epoll60 test case [1].
      
      Removing the wait entry on wakeup also has performance benefits: there
      is no need to take ep->lock and remove the wait entry from the queue
      after a successful wakeup.  Here is the timing output of the epoll60
      test case:
      
        With explicit wakeup from ep_scan_ready_list() (the state of the
        code prior to 339ddb53):
      
          real    0m6.970s
          user    0m49.786s
          sys     0m0.113s
      
       After this patch:
      
         real    0m5.220s
         user    0m36.879s
         sys     0m0.019s
      
      The other testcase is the stress-epoll [2], where one thread consumes
      all the events and other threads produce many events:
      
        With explicit wakeup from ep_scan_ready_list() (the state of the
        code prior to 339ddb53):
      
          threads  events/ms  run-time ms
                8       5427         1474
               16       6163         2596
               32       6824         4689
               64       7060         9064
              128       6991        18309
      
       After this patch:
      
          threads  events/ms  run-time ms
                8       5598         1429
               16       7073         2262
               32       7502         4265
               64       7640         8376
              128       7634        16767
      
       (number of "events/ms" represents event bandwidth, thus higher is
        better; number of "run-time ms" represents overall time spent
        doing the benchmark, thus lower is better)
      
      [1] tools/testing/selftests/filesystems/epoll/epoll_wakeup_test.c
      [2] https://github.com/rouming/test-tools/blob/master/stress-epoll.c
      Signed-off-by: Roman Penyaev <rpenyaev@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Jason Baron <jbaron@akamai.com>
      Cc: Khazhismel Kumykov <khazhy@google.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Heiher <r@hev.cc>
      Cc: <stable@vger.kernel.org>
      Link: http://lkml.kernel.org/r/20200430130326.1368509-2-rpenyaev@suse.de
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • eventpoll: fix missing wakeup for ovflist in ep_poll_callback · 0c54a6a4
      Committed by Khazhismel Kumykov
      In the event that we add to ovflist: before commit 339ddb53
      ("fs/epoll: remove unnecessary wakeups of nested epoll"), waiters would
      be woken up by ep_scan_ready_list, so ep_poll_callback did no wakeup of
      its own.
      
      With that wakeup removed, if we add to ovflist here, we may never wake
      up.  Rather than adding back the ep_scan_ready_list wakeup - which was
      resulting in unnecessary wakeups - trigger a wake-up in ep_poll_callback
      (see the sketch below).
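      
      A condensed sketch of the fixed flow (helper names as in fs/eventpoll.c
      of that period; simplified):
      
      	/* queue on ->ovflist while a scan of the ready list is in
      	 * flight, otherwise on ->rdllist ... */
      	if (READ_ONCE(ep->ovflist) != EP_UNACTIVE_PTR) {
      		if (chain_epi_lockless(epi))
      			ep_pm_stay_awake_rcu(epi);
      	} else if (!ep_is_linked(epi)) {
      		if (list_add_tail_lockless(&epi->rdllink, &ep->rdllist))
      			ep_pm_stay_awake_rcu(epi);
      	}
      	/* ... and, with this fix, wake waiters in BOTH cases instead
      	 * of returning early when the event went to ->ovflist */
      	if (waitqueue_active(&ep->wq))
      		wake_up(&ep->wq);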
      
      We noticed that one of our workloads was missing wakeups starting with
      339ddb53 and upon manual inspection, this wakeup seemed missing to me.
      With this patch added, we no longer see missing wakeups.  I haven't yet
      tried to make a small reproducer, but the existing kselftests in
      filesystems/epoll passed for me with this patch.
      
      [khazhy@google.com: use if/elif instead of goto + cleanup suggested by Roman]
        Link: http://lkml.kernel.org/r/20200424190039.192373-1-khazhy@google.com
      Fixes: 339ddb53 ("fs/epoll: remove unnecessary wakeups of nested epoll")
      Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Roman Penyaev <rpenyaev@suse.de>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Roman Penyaev <rpenyaev@suse.de>
      Cc: Heiher <r@hev.cc>
      Cc: Jason Baron <jbaron@akamai.com>
      Cc: <stable@vger.kernel.org>
      Link: http://lkml.kernel.org/r/20200424025057.118641-1-khazhy@google.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  9. 08 April 2020, 1 commit
  10. 22 March 2020, 1 commit
  11. 30 January 2020, 2 commits
  12. 05 December 2019, 2 commits
  13. 21 August 2019, 1 commit
  14. 19 July 2019, 1 commit
  15. 17 July 2019, 1 commit
  16. 29 June 2019, 1 commit
    • signal: remove the wrong signal_pending() check in restore_user_sigmask() · 97abc889
      Committed by Oleg Nesterov
      This is the minimal fix for stable, I'll send cleanups later.
      
      Commit 854a6ed5 ("signal: Add restore_user_sigmask()") introduced a
      visible change which breaks user-space: a signal temporarily unblocked
      by set_user_sigmask() can be delivered even if the caller returns
      success or a timeout.
      
      Change restore_user_sigmask() to accept an additional "interrupted"
      argument to be used instead of the signal_pending() check, and update
      the callers, as sketched below.
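      
      A simplified sketch of the fixed helper (reconstructed from
      kernel/signal.c after this change; the caller-supplied flag replaces
      the racy signal_pending() check):
      
      	void restore_user_sigmask(const void __user *usigmask,
      				  sigset_t *sigsaved, bool interrupted)
      	{
      		if (!usigmask)
      			return;
      
      		if (interrupted) {
      			/* deliver the unblocked pending signal; the saved
      			 * mask is restored on the signal-delivery path */
      			current->saved_sigmask = *sigsaved;
      			set_restore_sigmask();
      		} else {
      			set_current_blocked(sigsaved);
      		}
      	}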
      
      Eric said:
      
      : For clarity.  I don't think this is required by posix, or fundamentally to
      : remove the races in select.  It is what linux has always done and we have
      : applications who care so I agree this fix is needed.
      :
      : Further in any case where the semantic change that this patch rolls back
      : (aka where allowing a signal to be delivered and the select like call to
      : complete) would be advantage we can do as well if not better by using
      : signalfd.
      :
      : Michael is there any chance we can get this guarantee of the linux
      : implementation of pselect and friends clearly documented.  The guarantee
      : that if the system call completes successfully we are guaranteed that no
      : signal that is unblocked by using sigmask will be delivered?
      
      Link: http://lkml.kernel.org/r/20190604134117.GA29963@redhat.com
      Fixes: 854a6ed5 ("signal: Add restore_user_sigmask()")
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Reported-by: Eric Wong <e@80x24.org>
      Tested-by: Eric Wong <e@80x24.org>
      Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Acked-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Deepa Dinamani <deepa.kernel@gmail.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Jason Baron <jbaron@akamai.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: David Laight <David.Laight@ACULAB.COM>
      Cc: <stable@vger.kernel.org>	[5.0+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>