1. 01 Nov, 2013 1 commit
  2. 23 Oct, 2013 1 commit
  3. 16 Oct, 2013 2 commits
  4. 04 Oct, 2013 16 commits
  5. 22 Aug, 2013 1 commit
    • [SCSI] zfcp: fix lock imbalance by reworking request queue locking · d79ff142
      Authored by Martin Peschke
      This patch adds wait_event_interruptible_lock_irq_timeout(), which is a
      straight-forward descendant of wait_event_interruptible_timeout() and
      wait_event_interruptible_lock_irq().
      
      The zfcp driver used to call wait_event_interruptible_timeout()
      in combination with some intricate and error-prone locking. Using
      wait_event_interruptible_lock_irq_timeout() as a replacement
      nicely cleans up that locking.
      
      This rework removes a situation that resulted in a locking imbalance
      in zfcp_qdio_sbal_get():
      
      BUG: workqueue leaked lock or atomic: events/1/0xffffff00/10
          last function: zfcp_fc_wka_port_offline+0x0/0xa0 [zfcp]
      
      It was introduced by commit c2af7545
      "[SCSI] zfcp: Do not wait for SBALs on stopped queue", which had a new
      code path related to ZFCP_STATUS_ADAPTER_QDIOUP that took an early exit
without a required lock being held. The problem occurred when a
      special, non-SCSI I/O request was being submitted in process context,
      when the adapter's queues had been torn down. In this case the bug
      surfaced when the Fibre Channel port connection for a well-known address
      was closed during a concurrent adapter shut-down procedure, which is a
      rare constellation.
      
      This patch also fixes these warnings from the sparse tool (make C=1):
      
      drivers/s390/scsi/zfcp_qdio.c:224:12: warning: context imbalance in
       'zfcp_qdio_sbal_check' - wrong count at exit
      drivers/s390/scsi/zfcp_qdio.c:244:5: warning: context imbalance in
       'zfcp_qdio_sbal_get' - unexpected unlock
      
      Last but not least, we get rid of that crappy lock-unlock-lock
      sequence at the beginning of the critical section.
      
      It is okay to call zfcp_erp_adapter_reopen() with req_q_lock held.
      Reported-by: Mikulas Patocka <mpatocka@redhat.com>
      Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com>
      Cc: stable@vger.kernel.org #2.6.35+
      Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
      d79ff142
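      A minimal usage sketch of the new macro (mydev, its lock, wait queue and
      ready flag are illustrative names, not taken from the zfcp patch).  The
      caller enters with the spinlock held; the macro drops it around the sleep
      and returns with it held again:
      
      spin_lock_irq(&mydev->lock);
      ret = wait_event_interruptible_lock_irq_timeout(
                      mydev->wq,      /* wait queue head */
                      mydev->ready,   /* condition, evaluated under the lock */
                      mydev->lock,    /* released before sleep, reacquired after */
                      10 * HZ);       /* timeout in jiffies */
      /* ret > 0: condition true, jiffies left; ret == 0: timed out;
       * ret == -ERESTARTSYS: interrupted by a signal.  Lock is held again. */
      spin_unlock_irq(&mydev->lock);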
  6. 25 May, 2013 1 commit
  7. 15 May, 2013 1 commit
    • Add wait_on_atomic_t() and wake_up_atomic_t() · cb65537e
      Authored by David Howells
      Add wait_on_atomic_t() and wake_up_atomic_t() to wait for and signal
      became-zero events on atomic_t counters.  This uses the bit-wake waitqueue
      table.  The key is set to a value outside of the number of bits in a long
      so that wait_on_bit() waiters won't be woken up accidentally.
      
      What I'm using this for is: in a following patch I add a counter to struct
      fscache_cookie to count the number of outstanding operations that need access
      to netfs data.  The way this works is:
      
       (1) When a cookie is allocated, the counter is initialised to 1.
      
       (2) When an operation wants to access netfs data, it calls
           atomic_inc_not_zero() to increment the counter before it does so.  If
           the counter was 0, it isn't incremented, the operation isn't permitted
           to access the netfs data (which might by this point no longer exist)
           and the operation aborts in some appropriate manner.
      
       (3) When an operation finishes with the netfs data, it decrements the counter
           and if it reaches 0, calls wake_up_atomic_t() on it - the assumption being
           that it was the last blocker.
      
       (4) When a cookie is released, the counter is decremented and the releaser
           uses wait_on_atomic_t() to wait for the counter to become 0 - which should
           indicate no one is using the netfs data any longer.  The netfs data can
           then be destroyed.
      
      There are some alternatives that I have thought of and that have been suggested
      by Tejun Heo:
      
       (A) Using wait_on_bit() to wait on a bit in the counter.  This doesn't work
           because if that bit happens to be 0 then the wait won't happen - even if
           the counter is non-zero.
      
       (B) Using wait_on_bit() to wait on a flag elsewhere which is cleared when the
           counter reaches 0.  Such a flag would be redundant and would add
           complexity.
      
       (C) Adding a waitqueue to fscache_cookie - this would expand that struct by
           several words for an event that happens just once in each cookie's
           lifetime.  Further, cookies are generally per-file so there are likely to
           be a lot of them.
      
       (D) Similar to (C), but add a pointer to a waitqueue in the cookie instead of
           a waitqueue.  This would add a single word per cookie and so would be
           less of an expansion - but still an expansion.
      
       (E) Adding a static waitqueue to the fscache module.  Generally this would be
           fine, but under certain circumstances many cookies will all get added at
           the same time (eg. NFS umount, cache withdrawal) thereby presenting
           scaling issues.  Note that the wait may be significant as disk I/O may be
           in progress.
      
      So, I think reusing the wait_on_bit() waitqueue set is reasonable.  I don't
      make much use of the waitqueue I need on a per-cookie basis, but sometimes I
      have a huge flood of cookies to deal with.
      
      I also don't want to add a whole new set of global waitqueue tables
      specifically for the dec-to-0 event if I can reuse the bit tables.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Tested-by: Milosz Tanski <milosz@adfin.com>
      Acked-by: Jeff Layton <jlayton@redhat.com>
      cb65537e
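      A sketch of the counting pattern described above (the cookie and counter
      names are illustrative; the action callback follows the bit-wait
      convention of sleeping and returning 0, or returning nonzero to abort):
      
      static int my_wait_action(atomic_t *p)
      {
              schedule();     /* sleep until woken via the bit-wait table */
              return 0;
      }
      
      /* operation side: drop a reference, wake waiters if we were the last */
      if (atomic_dec_and_test(&cookie->n_active))
              wake_up_atomic_t(&cookie->n_active);
      
      /* releaser side: block until the counter has become zero */
      wait_on_atomic_t(&cookie->n_active, my_wait_action, TASK_UNINTERRUPTIBLE);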
  8. 08 May, 2013 1 commit
    • wait: add wait_event_hrtimeout() · 774a08b3
      Authored by Kent Overstreet
      Analogous to wait_event_timeout() and friends, this adds
      wait_event_hrtimeout() and wait_event_interruptible_hrtimeout().
      
      Note that unlike the versions that use regular timers, these don't
      return the amount of time remaining when they return - instead, they
      return 0 or -ETIME if they timed out, because I was uncomfortable with
      the semantics of doing it the other way (and not confident that I could
      get it right, anyway).
      
      If the timer expires, there's no real guarantee that expire_time -
      current_time would be <= 0 - due to timer slack certainly, and I'm not
      sure I want to know the implications of the different clock bases in
      hrtimers.
      
      If the timer does expire and the code calculates that the time remaining
      is nonnegative, that could be even worse if the calling code then reuses
      that timeout.  Probably safer to just return 0 then, but I could imagine
      weird bugs or at least unintended behaviour arising from that too.
      
      I came to the conclusion that if other users end up actually needing the
      amount of time remaining, the sanest thing to do would be to create a
      version that uses absolute timeouts instead of relative.
      
      [akpm@linux-foundation.org: fix description of `timeout' arg]
      Signed-off-by: Kent Overstreet <koverstreet@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Zach Brown <zab@redhat.com>
      Cc: Felipe Balbi <balbi@ti.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Asai Thambi S P <asamymuthupa@micron.com>
      Cc: Selvan Mani <smani@micron.com>
      Cc: Sam Bradshaw <sbradshaw@micron.com>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      774a08b3
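      A brief sketch (the device, wait queue and condition are illustrative);
      note the ktime_t timeout and the 0/-ETIME return convention described
      above:
      
      ret = wait_event_interruptible_hrtimeout(dev->wq, dev->data_ready,
                      ktime_set(0, 10 * NSEC_PER_MSEC));     /* 10ms, high-res */
      /* ret == 0: condition became true; ret == -ETIME: timed out;
       * ret == -ERESTARTSYS: interrupted by a signal */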
  9. 30 Nov, 2012 1 commit
    • wait: add wait_event_lock_irq() interface · eed8c02e
      Authored by Lukas Czerner
      New wait_event{_interruptible}_lock_irq{_cmd} macros added.  This commit
      moves the private wait_event_lock_irq() macro from MD to the regular wait
      includes, introduces the new macro wait_event_lock_irq_cmd() (instead of
      the old, ugly method of omitting the cmd parameter), and converts MD to
      the new macros.  It also introduces the _interruptible_ variant.
      
      The new interface is for when one has a special lock to protect the data
      structures used in the condition, or when one also needs to invoke "cmd"
      before putting the task to sleep.
      
      All new macros are expected to be called with the lock taken.  The lock
      is released before sleep and reacquired afterwards.  We leave the
      macro with the lock held.
      
      Note to DM: IMO this should also fix a theoretical race on the waitqueue
      when wait_event_lock_irq() and wait_event() are used simultaneously,
      because of the lack of locking around setting the current state and wait
      queue removal.
      Signed-off-by: Lukas Czerner <lczerner@redhat.com>
      Cc: Neil Brown <neilb@suse.de>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      eed8c02e
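      A sketch in the MD style mentioned above (conf, its device_lock and the
      barrier condition are illustrative; flush_pending() is a hypothetical
      helper).  The caller enters with the lock held; the macro drops it around
      the sleep and returns with it held:
      
      spin_lock_irq(&conf->device_lock);
      wait_event_lock_irq(conf->wait_barrier,
                          conf->barrier == 0,         /* checked under the lock */
                          conf->device_lock);         /* dropped while sleeping */
      /* the lock is held again here */
      spin_unlock_irq(&conf->device_lock);
      
      /* _cmd variant: run a statement before each sleep, still under the lock */
      wait_event_lock_irq_cmd(conf->wait_barrier, conf->barrier == 0,
                              conf->device_lock, flush_pending(conf));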
  10. 13 Oct, 2012 1 commit
  11. 29 Mar, 2012 1 commit
  12. 01 Mar, 2012 1 commit
  13. 21 Dec, 2011 1 commit
  14. 06 Oct, 2010 1 commit
    • wait: using uninitialized member of wait queue · 231d0aef
      Authored by Evgeny Kuznetsov
      The "flags" member of "struct wait_queue_t" is used in several places in
      the kernel code without beeing initialized by init_wait().  "flags" is
      used in bitwise operations.
      
      If "flags" not initialized then unexpected behaviour may take place.
      Incorrect flags might used later in code.
      
      Added initialization of "wait_queue_t.flags" with zero value into
      "init_wait".
      Signed-off-by: Evgeny Kuznetsov <EXT-Eugeny.Kuznetsov@nokia.com>
      [ The bit we care about does end up being initialized by both
         prepare_to_wait() and add_wait_queue(), so this doesn't seem to
         cause actual bugs, but is definitely the right thing to do -Linus ]
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      231d0aef
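      For context, a generic open-coded wait loop of the kind init_wait() is
      meant for (the queue and condition are illustrative); the fix matters
      because wake-up code tests bits such as WQ_FLAG_EXCLUSIVE in wait.flags:
      
      wait_queue_t wait;
      
      init_wait(&wait);               /* now also zeroes wait.flags */
      for (;;) {
              prepare_to_wait(&q, &wait, TASK_UNINTERRUPTIBLE);
              if (condition)
                      break;
              schedule();
      }
      finish_wait(&q, &wait);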
  15. 21 May, 2010 1 commit
    • wait_event_interruptible_locked() interface · 22c43c81
      Authored by Michal Nazarewicz
      New wait_event_interruptible{,_exclusive}_locked{,_irq} macros added.
      They work just like the versions without the _locked* suffix but require
      the wait queue's lock to be held.  Also, __wake_up_locked() is now
      exported so as to pair it with the above macros.
      
      The use case for this new facility is when one uses the wait queue's lock
      to protect a data structure.  This may be advantageous if the
      structure needs to be protected by a spinlock anyway.  In particular,
      with an additional spinlock the following code has to be used to wait
      for a condition:
      
      spin_lock(&data.lock);
      ...
      for (ret = 0; !ret && !(condition); ) {
      	spin_unlock(&data.lock);
      	ret = wait_event_interruptible(data.wqh, (condition));
      	spin_lock(&data.lock);
      }
      ...
      spin_unlock(&data.lock);
      
      This looks bizarre; moreover, wait_event_interruptible() takes the wait
      queue's lock anyway, so there is an unlock+lock sequence where it could
      be avoided.
      
      To avoid those problems and benefit from wait queue's lock, a code
      similar to the following should be used:
      
      /* Waiting */
      spin_lock(&data.wqh.lock);
      ...
      ret = wait_event_interruptible_locked(data.wqh, (condition));
      ...
      spin_unlock(&data.wqh.lock);
      
      /* Waiting exclusively */
      spin_lock(&data.wqh.lock);
      ...
      ret = wait_event_interruptible_exclusive_locked(data.wqh, (condition));
      ...
      spin_unlock(&data.wqh.lock);
      
      /* Waking up */
      spin_lock(&data.wqh.lock);
      ...
      wake_up_locked(&data.wqh);
      ...
      spin_unlock(&data.wqh.lock);
      
      When spin_lock_irq() is used, matching versions of the macros need to be
      used (*_locked_irq()).
      Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
      Cc: Kyungmin Park <kyungmin.park@samsung.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Takashi Iwai <tiwai@suse.de>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Mike Galbraith <efault@gmx.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      22c43c81
  16. 11 May, 2010 1 commit
    • sched, wait: Use wrapper functions · a93d2f17
      Authored by Changli Gao
      epoll should not touch flags in wait_queue_t.  This patch introduces a new
      function, __add_wait_queue_exclusive(), for users who use the wait queue
      as a LIFO queue.
      
      __add_wait_queue_tail_exclusive() is introduced too, replacing
      add_wait_queue_exclusive_locked().  remove_wait_queue_locked() is removed,
      as it duplicates __remove_wait_queue(), is disliked by users, and has
      fewer users.
      Signed-off-by: Changli Gao <xiaosuo@gmail.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Paul Menage <menage@google.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Davide Libenzi <davidel@xmailserver.org>
      Cc: <containers@lists.linux-foundation.org>
      LKML-Reference: <1273214006-2979-1-git-send-email-xiaosuo@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      a93d2f17
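      A sketch of the new wrapper in the LIFO style epoll uses (the queue head
      and wait entry names are illustrative); it must be called with the wait
      queue's lock held:
      
      spin_lock_irqsave(&whead->lock, flags);
      /* sets WQ_FLAG_EXCLUSIVE and adds at the head, giving LIFO wakeup order */
      __add_wait_queue_exclusive(whead, &pwq->wait);
      spin_unlock_irqrestore(&whead->lock, flags);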
  17. 15 Sep, 2009 1 commit
  18. 10 Aug, 2009 1 commit
  19. 28 Apr, 2009 1 commit
    • net: Avoid extra wakeups of threads blocked in wait_for_packet() · bf368e4e
      Authored by Eric Dumazet
      In 2.6.25 we added UDP memory accounting.
      
      This unfortunately added a penalty when a frame is transmitted, since
      at TX completion time we have to call sock_wfree() to perform the
      necessary memory accounting.  This calls sock_def_write_space() and
      ultimately the scheduler if any thread is waiting on the socket.
      Threads waiting for an incoming frame were scheduled, then had to sleep
      again, as the event was meaningless to them.
      
      (All threads waiting on a socket use the same sk_sleep anchor.)
      
      This adds a lot of extra wakeups and increases latencies, as noted
      by Christoph Lameter, and slows down the softirq handler.
      
      Reference: http://marc.info/?l=linux-netdev&m=124060437012283&w=2
      
      Fortunately, Davide Libenzi recently added the concept of keyed wakeups
      to the kernel, in particular for sockets (see commit
      37e5540b
      "epoll keyed wakeups: make sockets use keyed wakeups").
      
      Davide's goal was to optimize epoll, but this new wakeup infrastructure
      can help non-epoll users as well, if they care to set up an appropriate
      handler.
      
      This patch introduces a new DEFINE_WAIT_FUNC() helper and uses it
      in wait_for_packet(), so that only a relevant event can wake up a thread
      blocked in this function.
      
      The trace of function calls from the bnx2 TX completion handler
      bnx2_poll_work() is:
      __kfree_skb()
       skb_release_head_state()
        sock_wfree()
         sock_def_write_space()
          __wake_up_sync_key()
           __wake_up_common()
            receiver_wake_function() : stops here, since the thread is waiting for an INPUT event
      Reported-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bf368e4e
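      The wake function below is close to what the commit adds for
      wait_for_packet() (slightly condensed here): keyed wakeups whose event
      mask contains no input-related bits are simply ignored.
      
      static int receiver_wake_function(wait_queue_t *wait, unsigned mode,
                                        int sync, void *key)
      {
              unsigned long bits = (unsigned long)key;
      
              /* avoid a wakeup if the event is not interesting for us */
              if (bits && !(bits & (POLLIN | POLLERR)))
                      return 0;
              return autoremove_wake_function(wait, mode, sync, key);
      }
      
      /* in wait_for_packet(): */
      DEFINE_WAIT_FUNC(wait, receiver_wake_function);
      prepare_to_wait_exclusive(sk->sk_sleep, &wait, TASK_INTERRUPTIBLE);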
  20. 14 Apr, 2009 1 commit
    • wait: don't use __wake_up_common() · 78ddb08f
      Authored by Johannes Weiner
      '777c6c5f wait: prevent exclusive waiter starvation' made
      __wake_up_common() global to be used from abort_exclusive_wait().
      
      It was needed to do a wake-up with the waitqueue lock held while
      passing down a key to the wake-up function.
      
      Since '4ede816a epoll keyed wakeups: add __wake_up_locked_key() and
      __wake_up_sync_key()' there is an appropriate wrapper for this case:
      __wake_up_locked_key().
      
      Use it here and make __wake_up_common() private to the scheduler
      again.
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1239720785-19661-1-git-send-email-hannes@cmpxchg.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      78ddb08f
  21. 01 Apr, 2009 2 commits
    • epoll keyed wakeups: introduce new *_poll() wakeup macros · c0da3775
      Authored by Davide Libenzi
      Introduce new wakeup macros that allow passing an event mask to the wakeup
      targets.  They exactly mimic their non-_poll() counterpart, with the added
      event mask passing capability.  I added only the ones currently
      requested, avoiding the _nr() and _all() variants for the moment.
      Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: David Miller <davem@davemloft.net>
      Cc: William Lee Irwin III <wli@movementarian.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c0da3775
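      The new macros wrap the keyed __wake_up*() entry points; a device-side
      sketch (the wait queue names are illustrative):
      
      /* report that this wakeup is about writability only */
      wake_up_interruptible_sync_poll(sk->sk_sleep,
                                      POLLOUT | POLLWRNORM | POLLWRBAND);
      
      /* the other variants mimic their non-_poll() counterparts */
      wake_up_poll(&q, POLLIN);
      wake_up_locked_poll(&q, POLLHUP);       /* caller holds q.lock */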
    • epoll keyed wakeups: add __wake_up_locked_key() and __wake_up_sync_key() · 4ede816a
      Authored by Davide Libenzi
      This patchset introduces wakeup hints for some of the most popular (from
      epoll POV) devices, so that epoll code can avoid spurious wakeups on its
      waiters.
      
      The problem with epoll is that the callback-based wakeups do not, ATM,
      carry any information about the events the wakeup is related to.  So the
      only choice epoll has (not being able to call f_op->poll() from inside the
      callback) is to add the file* to a ready-list and resolve the real events
      later on, at epoll_wait() (or its own f_op->poll()) time.  This can cause
      spurious wakeups, since the wake_up() itself might be for an event the
      caller is not interested in.
      
      The rate of these spurious wakeups can be pretty high in the case of many
      network sockets being monitored.
      
      By allowing devices to report the events the wakeups refer to (at least
      the two major classes - POLLIN/POLLOUT), we are able to spare useless
      wakeups by proper handling inside the epoll's poll callback.
      
      Epoll will in any case have to call f_op->poll() on the file* later on,
      since the change needed in order to have the full event set sent via the
      wakeup is too invasive for the way our f_op->poll() system works (the
      full event set is calculated inside the poll function - there are too many
      of them to even start considering the change, and poll/select would need
      changes too).
      
      Epoll is changed in a way that both devices which send event hints, and
      the ones that don't, are correctly handled.  The former will gain some
      efficiency though.
      
      As a general rule, devices should add an event mask, by using the
      key-aware wakeup macros, when waking up poll wait queues.  I tested it
      (together with the epoll poll fix patch Andrew has in -mm) and wakeups
      for the supported devices are correctly filtered.
      
      Test program available here:
      
      http://www.xmailserver.org/epoll_test.c
      
      This patch:
      
      Nothing revolutionary here.  Just using the available "key" that our
      wakeup core already supports.  The __wake_up_locked_key() was a
      no-brainer, since both __wake_up_locked() and __wake_up_locked_key() are
      thin wrappers around __wake_up_common().
      
      The __wake_up_sync() function had a body, so the choice was between
      borrowing the body for __wake_up_sync_key() and calling it from
      __wake_up_sync(), or making an inline and calling it from both.  I chose
      the former, since on most archs it all resolves to "mov $0, REG; jmp ADDR".
      Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: David Miller <davem@davemloft.net>
      Cc: William Lee Irwin III <wli@movementarian.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      4ede816a
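      The two new entry points, sketched with an illustrative POLLIN key; the
      _locked variant expects the caller to already hold the wait queue's lock,
      while the _sync variant hints that the waker is about to sleep itself:
      
      /* caller already holds q.lock */
      __wake_up_locked_key(&q, TASK_NORMAL, (void *)POLLIN);
      
      /* synchronous keyed wakeup of one exclusive waiter, avoids preemption */
      __wake_up_sync_key(&q, TASK_INTERRUPTIBLE, 1, (void *)POLLIN);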
  22. 06 Feb, 2009 1 commit
    • wait: prevent exclusive waiter starvation · 777c6c5f
      Authored by Johannes Weiner
      With exclusive waiters, every process woken up through the wait queue must
      ensure that the next waiter down the line is woken when it has finished.
      
      Interruptible waiters don't do that when aborting due to a signal.  And if
      an aborting waiter is concurrently woken up through the waitqueue, no one
      will ever wake up the next waiter.
      
      This has been observed with __wait_on_bit_lock() used by
      lock_page_killable(): the first contender on the queue was aborting when
      the actual lock holder woke it up concurrently.  The aborted contender
      didn't acquire the lock and therefore never did an unlock followed by
      waking up the next waiter.
      
      Add abort_exclusive_wait() which removes the process' wait descriptor from
      the waitqueue, iff still queued, or wakes up the next waiter otherwise.
      It does so under the waitqueue lock.  Racing with a wake up means the
      aborting process is either already woken (removed from the queue) and will
      wake up the next waiter, or it will remove itself from the queue and the
      concurrent wake up will apply to the next waiter after it.
      
      Use abort_exclusive_wait() in __wait_event_interruptible_exclusive() and
      __wait_on_bit_lock() when they were interrupted by other means than a wake
      up through the queue.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Reported-by: Chris Mason <chris.mason@oracle.com>
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Mentored-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Matthew Wilcox <matthew@wil.cx>
      Cc: Chuck Lever <cel@citi.umich.edu>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: <stable@kernel.org>		["after some testing"]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      777c6c5f
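      Roughly, the fixed interruptible-exclusive wait loop looks like this
      (condensed from the macro the patch changes; wq, condition and ret come
      from the surrounding macro context):
      
      DEFINE_WAIT(__wait);
      for (;;) {
              prepare_to_wait_exclusive(&wq, &__wait, TASK_INTERRUPTIBLE);
              if (condition) {
                      finish_wait(&wq, &__wait);
                      break;
              }
              if (!signal_pending(current)) {
                      schedule();
                      continue;
              }
              ret = -ERESTARTSYS;
              /* either dequeues us or, if a concurrent wakeup already removed
               * us from the queue, passes the wakeup on to the next waiter */
              abort_exclusive_wait(&wq, &__wait, TASK_INTERRUPTIBLE, NULL);
              break;
      }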
  23. 17 Oct, 2008 1 commit
    • wait: kill is_sync_wait() · a25d644f
      Authored by Tejun Heo
      is_sync_wait() is used to distinguish between sync and async waits.
      Basically sync waits are the ones initialized with init_waitqueue_entry()
      and async ones with init_waitqueue_func_entry().  The sync/async
      distinction is used only in prepare_to_wait[_exclusive]() and its only
      function is to skip setting the current task state if the wait is async.
      This has a few problems.
      
       * No one uses it.  None of the func_entry users use the prepare_to_wait()
         functions, so the code path never gets executed.
      
       * The distinction is bogus.  Maybe it made sense back when func_entry was
         used only by aio, but it's now also used by epoll and in the future
         possibly by 9p and poll/select.
      
       * Taking @state as an argument and silently ignoring it depending on how
         @wait is initialized is just a bad, error-prone API.
      
       * It prevents func_entry waits from using wait->private for no good
         reason.
      
      This patch kills is_sync_wait() and the associated code paths from
      prepare_to_wait[_exclusive]().  As there was no user of these code paths,
      this patch doesn't cause any behavior difference.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a25d644f