1. 20 June 2017 (6 commits)
    • sched/wait: Split out the wait_bit*() APIs from <linux/wait.h> into <linux/wait_bit.h> · 5dd43ce2
      Committed by Ingo Molnar
      The wait_bit*() types and APIs are mixed into wait.h, but they
      are a pretty orthogonal extension of wait-queues.
      
      Furthermore, only about 50 kernel files use these APIs, while
      over 1000 use the regular wait-queue functionality.
      
      So clean up the main wait.h by moving the wait-bit functionality
      out of it, into a separate .h and .c file:
      
        include/linux/wait_bit.h  for types and APIs
        kernel/sched/wait_bit.c   for the implementation
      
      Update all header dependencies.
      
      This reduces the size of wait.h rather significantly, by about 30%.
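
      For illustration, a minimal sketch of what a wait-bit user looks like
      after the split (the header name is from this commit; the call site
      and flag word are hypothetical examples):

      	#include <linux/wait_bit.h>	/* previously pulled in via <linux/wait.h> */

      	static unsigned long my_flags;	/* hypothetical flag word */
      	#define MY_BIT	0

      	static void wait_for_my_bit(void)
      	{
      		/* Sleeps until another context clears MY_BIT and calls
      		 * wake_up_bit(&my_flags, MY_BIT). */
      		wait_on_bit(&my_flags, MY_BIT, TASK_UNINTERRUPTIBLE);
      	}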
      
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched/wait: Standardize wait_bit_queue naming · 76c85ddc
      Committed by Ingo Molnar
      So wait-bit-queue head variables are often named:
      
      	struct wait_bit_queue *q
      
      ... which is a bit ambiguous and super confusing, because
      they clearly suggest wait-queue head semantics and behavior
      (they rhyme with the old wait_queue_t *q naming), while they
      are extended wait-queue _entries_, not heads!
      
      They are misnomers in two ways:
      
       - the 'wait_bit_queue' leaves open the question of whether
         it's an entry or a head
      
       - the 'q' parameter and local variable naming falsely implies
         that it's a 'queue' - while it's an entry.
      
      This resulted in sometimes confusing cases such as:
      
      	finish_wait(wq, &q->wait);
      
      where the 'q' is not a wait-queue head, but a wait-bit-queue entry.
      
      So improve this all by standardizing wait-bit-queue nomenclature
      similar to wait-queue head naming:
      
      	struct wait_bit_queue   => struct wait_bit_queue_entry
      	q			=> wbq_entry
      
      Which makes it all much clearer:
      
      	struct wait_bit_queue_entry *wbq_entry
      
      ... and turns the former confusing piece of code into:
      
      	finish_wait(wq_head, &wbq_entry->wq_entry);
      
      which IMHO makes it immediately clear what we are doing,
      without having to analyze the context of the code: we are
      adding a wait-queue entry to a regular wait-queue head,
      which entry is embedded in a wait-bit-queue entry.
      
      I'm not a big fan of acronyms, but repeating wait_bit_queue_entry
      in field and local variable names is too long; hopefully it's
      clear enough that 'wq_' prefixes stand for wait-queues, while
      'wbq_' prefixes stand for wait-bit-queues.
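
      A sketch of the resulting type, as the naming above implies (field
      names assumed from the rename tables in this and the next commit):

      	struct wait_bit_queue_entry {
      		struct wait_bit_key	key;
      		struct wait_queue_entry	wq_entry;
      	};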
      
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched/wait: Standardize 'struct wait_bit_queue' wait-queue entry field name · 21417136
      Committed by Ingo Molnar
      Rename 'struct wait_bit_queue::wait' to ::wq_entry, to more clearly
      name it as a wait-queue entry.
      
      Propagate it to a couple of usage sites where the wait-bit-queue internals
      are exposed.
      
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched/wait: Standardize internal naming of wait-queue heads · 9d9d676f
      Committed by Ingo Molnar
      The wait-queue head parameters and variables are named in a
      couple of ways; we currently have the following variants:
      
      	wait_queue_head_t *q
      	wait_queue_head_t *wq
      	wait_queue_head_t *head
      
      In particular the 'wq' naming is ambiguous as to whether it refers to
      a wait-queue head or an entry - as entries were often named 'wait'.
      
      ( Not to mention the confusion of any readers coming over from
        workqueue-land. )
      
      Standardize all this around a single, unambiguous parameter and
      variable name:
      
      	struct wait_queue_head *wq_head
      
      which is easy to grep for and also rhymes nicely with the wait-queue
      entry naming:
      
      	struct wait_queue_entry *wq_entry
      
      Also rename:
      
      	struct __wait_queue_head => struct wait_queue_head
      
      ... and use this struct type to migrate from typedef usage to 'struct'
      usage, which is more in line with existing kernel practices.
      
      Don't touch any external users and preserve the main wait_queue_head_t
      typedef.
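
      A sketch of the resulting declaration (the list field is assumed to
      still be named 'task_list' at this point in the series):

      	struct wait_queue_head {
      		spinlock_t		lock;
      		struct list_head	task_list;
      	};
      	typedef struct wait_queue_head wait_queue_head_t;	/* preserved */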
      
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched/wait: Standardize internal naming of wait-queue entries · 50816c48
      Committed by Ingo Molnar
      So the various wait-queue entry variables in include/linux/wait.h
      and kernel/sched/wait.c are named in a colorfully inconsistent
      way:
      
      	wait_queue_entry_t *wait
      	wait_queue_entry_t *__wait	(even in plain C code!)
      	wait_queue_entry_t *q		(!)
      	wait_queue_entry_t *new		(making anyone who knows C++ cringe)
      	wait_queue_entry_t *old
      
      I think part of the reason for the inconsistency is the constant
      apparent confusion about what a wait queue 'head' versus 'entry' is.
      
      ( Some of the documentation talks about a 'wait descriptor', which is
        the wait-queue entry itself - further adding to the confusion. )
      
      The most common name is 'wait', but that in itself is somewhat
      ambiguous as well, as it does not really make it clear whether
      it's a wait-queue entry or head.
      
      To improve all this, name the wait-queue entry structure parameters
      and variables consistently and push through this naming into all
      the wait.h and wait.c code:
      
      	struct wait_queue_entry *wq_entry
      
      The 'wq_' prefix makes it easy to grep for, and we also use the
      opportunity to move away from the typedef to a plain 'struct' naming:
      in the kernel we typically reserve typedefs for cases where a
      C structure is really small and somewhat opaque - such as pte_t.
      
      wait-queue entries are neither small nor opaque, so use the more
      standard 'struct xxx_entry' list management code nomenclature instead.
      
      ( We don't touch external users, and we preserve the typedef as well
        for actual wait-queue users, to reduce unnecessary churn. )
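
      A sketch of the renamed entry type under the naming described above
      (field names are assumptions based on the long-standing wait.h layout):

      	struct wait_queue_entry {
      		unsigned int		flags;
      		void			*private;
      		wait_queue_func_t	func;
      		struct list_head	task_list;
      	};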
      
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched/wait: Rename wait_queue_t => wait_queue_entry_t · ac6424b9
      Committed by Ingo Molnar
      Rename:
      
      	wait_queue_t		=>	wait_queue_entry_t
      
      'wait_queue_t' was always a slight misnomer: its name implies that it's a "queue",
      but in reality it's a queue *entry*. The 'real' queue is the wait queue head,
      which had to carry the name.
      
      Start sorting this out by renaming it to 'wait_queue_entry_t'.
      
      This also allows the real structure name 'struct __wait_queue' to
      lose its double underscore and become 'struct wait_queue_entry',
      which is the more canonical nomenclature for such data types.
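
      In other words (a sketch of the rename only, not the full definition):

      	/* before: */ typedef struct __wait_queue      wait_queue_t;
      	/* after:  */ typedef struct wait_queue_entry  wait_queue_entry_t;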
      
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  2. 09 March 2017 (1 commit)
    • sched/headers: fix up header file dependency on <linux/sched/signal.h> · bd0f9b35
      Committed by Linus Torvalds
      The scheduler header file split and cleanups ended up exposing a few
      nasty header file dependencies, and in particular it showed how we in
      <linux/wait.h> ended up depending on "signal_pending()", which now comes
      from <linux/sched/signal.h>.
      
      That's a very subtle and annoying dependency, which already caused a
      semantic merge conflict (see commit e58bc927 "Pull overlayfs updates
      from Miklos Szeredi", which added that fixup in the merge commit).
      
      It turns out that we can avoid this dependency _and_ improve code
      generation by moving the guts of the fairly nasty helper #define
      __wait_event_interruptible_locked() to out-of-line code.  The code that
      includes the signal_pending() check is all in the slow-path where we
      actually go to sleep waiting for the event anyway, so using a helper
      function is the right thing to do.
      
      Using a helper function is also what we already did for the non-locked
      versions, see the "__wait_event*()" macros and the "prepare_to_wait*()"
      set of helper functions.
      
      We might want to try to unify all these macro games; we have a _lot_ of
      subtly different wait-event loops.  But this is the minimal patch to fix
      the annoying header dependency.
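
      A sketch of the shape such an out-of-line helper takes, reconstructed
      from the description above (the name do_wait_intr and the exact body
      are assumptions, not verified against the tree):

      	int do_wait_intr(wait_queue_head_t *wq, wait_queue_t *wait)
      	{
      		if (likely(list_empty(&wait->task_list)))
      			__add_wait_queue_tail(wq, wait);

      		set_current_state(TASK_INTERRUPTIBLE);
      		if (signal_pending(current))
      			return -ERESTARTSYS;

      		/* Caller holds wq->lock; drop it across the sleep. */
      		spin_unlock(&wq->lock);
      		schedule();
      		spin_lock(&wq->lock);
      		return 0;
      	}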
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 02 March 2017 (2 commits)
  4. 28 October 2016 (1 commit)
    • mm: remove per-zone hashtable of bitlock waitqueues · 9dcb8b68
      Committed by Linus Torvalds
      The per-zone waitqueues exist because of a scalability issue with the
      page waitqueues on some NUMA machines, but it turns out that they hurt
      normal loads, and now with the vmalloced stacks they also end up
      breaking gfs2 that uses a bit_wait on a stack object:
      
           wait_on_bit(&gh->gh_iflags, HIF_WAIT, TASK_UNINTERRUPTIBLE)
      
      where 'gh' can be a reference to the local variable 'mount_gh' on the
      stack of fill_super().
      
      The reason the per-zone hash table breaks for this case is that there is
      no "zone" for virtual allocations, and trying to look up the physical
      page to get at it will fail (with a BUG_ON()).
      
      It turns out that I actually complained to the mm people about the
      per-zone hash table for another reason just a month ago: the zone lookup
      also hurts the regular use of "unlock_page()" a lot, because the zone
      lookup ends up forcing several unnecessary cache misses and generates
      horrible code.
      
      As part of that earlier discussion, we had a much better solution for
      the NUMA scalability issue - by just making the page lock have a
      separate contention bit, the waitqueue doesn't even have to be looked at
      for the normal case.
      
      Peter Zijlstra already has a patch for that, but let's see if anybody
      even notices.  In the meantime, let's fix the actual gfs2 breakage by
      simplifying the bitlock waitqueues and removing the per-zone issue.
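
      A sketch of the simplified global bitlock waitqueue lookup this change
      moves to (table size and hash details are assumptions based on the
      description):

      	#define WAIT_TABLE_BITS 8
      	#define WAIT_TABLE_SIZE (1 << WAIT_TABLE_BITS)
      	static wait_queue_head_t bit_wait_table[WAIT_TABLE_SIZE] __cacheline_aligned;

      	wait_queue_head_t *bit_waitqueue(void *word, int bit)
      	{
      		const int shift = BITS_PER_LONG == 32 ? 5 : 6;
      		unsigned long val = (unsigned long)word << shift | bit;

      		/* No zone lookup: hash straight into a static table,
      		 * so stack (vmalloc'ed) addresses work too. */
      		return bit_wait_table + hash_long(val, WAIT_TABLE_BITS);
      	}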
      Reported-by: Andreas Gruenbacher <agruenba@redhat.com>
      Tested-by: Bob Peterson <rpeterso@redhat.com>
      Acked-by: Mel Gorman <mgorman@techsingularity.net>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  5. 30 September 2016 (4 commits)
  6. 14 December 2015 (1 commit)
  7. 04 December 2015 (1 commit)
  8. 23 September 2015 (1 commit)
  9. 05 September 2015 (1 commit)
  10. 19 May 2015 (1 commit)
  11. 08 May 2015 (1 commit)
  12. 04 November 2014 (1 commit)
    • sched/wait: Fix a kthread race with wait_woken() · cb6538e7
      Committed by Peter Zijlstra
      There is a race between kthread_stop() and the new wait_woken() that
      can result in a lack of progress.
      
      CPU 0                                    | CPU 1
                                               |
      rfcomm_run()                             | kthread_stop()
        ...                                    |
        if (!test_bit(KTHREAD_SHOULD_STOP))    |
                                               |   set_bit(KTHREAD_SHOULD_STOP)
                                               |   wake_up_process()
          wait_woken()                         |   wait_for_completion()
            set_current_state(INTERRUPTIBLE)   |
            if (!WQ_FLAG_WOKEN)                |
              schedule_timeout()               |
                                               |
      
      After which both tasks will wait.. forever.
      
      Fix this by having wait_woken() check for kthread_should_stop() but
      only for kthreads (obviously).
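
      A sketch of what such a check could look like inside wait_woken()
      (helper name and exact placement are assumptions based on the
      description):

      	static inline bool is_kthread_should_stop(void)
      	{
      		return (current->flags & PF_KTHREAD) && kthread_should_stop();
      	}

      	/* ... in wait_woken(): don't sleep if the kthread must stop */
      	if (!(wait->flags & WQ_FLAG_WOKEN) && !is_kthread_should_stop())
      		timeout = schedule_timeout(timeout);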
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Peter Hurley <peter@hurleysoftware.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  13. 28 October 2014 (1 commit)
  14. 25 September 2014 (1 commit)
    • SCHED: add some "wait..on_bit...timeout()" interfaces. · cbbce822
      Committed by NeilBrown
      In commit c1221321
         sched: Allow wait_on_bit_action() functions to support a timeout
      
      I suggested that a "wait_on_bit_timeout()" interface would not meet my
      need.  This isn't true - I was just over-engineering.
      
      Including a 'private' field in wait_bit_key instead of a focused
      "timeout" field was just premature generalization.  If some other
      use is ever found, it can be generalized or added later.
      
      So this patch renames "private" to "timeout", with the meaning "stop
      waiting when jiffies reaches or passes timeout", and adds two of the
      many possible wait..bit..timeout() interfaces:
      
      wait_on_page_bit_killable_timeout(), which is the one I want to use,
      and out_of_line_wait_on_bit_timeout() which is a reasonably general
      example.  Others can be added as needed.
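
      A sketch of how the general variant plausibly builds on the renamed
      'timeout' field (reconstructed from the description; not verified
      against the tree):

      	int out_of_line_wait_on_bit_timeout(void *word, int bit,
      					    wait_bit_action_f *action,
      					    unsigned mode, unsigned long timeout)
      	{
      		wait_queue_head_t *wq = bit_waitqueue(word, bit);
      		DEFINE_WAIT_BIT(wait, word, bit);

      		/* "stop waiting when jiffies reaches or passes timeout" */
      		wait.key.timeout = jiffies + timeout;
      		return __wait_on_bit(wq, &wait, action, mode);
      	}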
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: NeilBrown <neilb@suse.de>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
  15. 16 July 2014 (2 commits)
    • sched: Allow wait_on_bit_action() functions to support a timeout · c1221321
      Committed by NeilBrown
      It is currently not possible for various wait_on_bit functions
      to implement a timeout.
      
      While the "action" function that is called to do the waiting
      could certainly use schedule_timeout(), there is no way to carry
      forward the remaining timeout after a false wake-up.
      As false wake-ups are clearly possible, at least due to possible
      hash collisions in bit_waitqueue(), this is a real problem.
      
      The 'action' function is currently passed a pointer to the word
      containing the bit being waited on.  No current action functions
      use this pointer.  So changing it to something else will be a
      little noisy but will have no immediate effect.
      
      This patch changes the 'action' function to take a pointer to
      the "struct wait_bit_key", which contains a pointer to the word
      containing the bit so nothing is really lost.
      
      It also adds a 'private' field to "struct wait_bit_key", which
      is initialized to zero.
      
      An action function can now implement a timeout with something
      like
      
      static int timed_out_waiter(struct wait_bit_key *key)
      {
      	unsigned long waited;

      	/* First call: record the start time, avoiding the 0 sentinel. */
      	if (key->private == 0) {
      		key->private = jiffies;
      		if (key->private == 0)
      			key->private -= 1;
      	}
      	waited = jiffies - key->private;
      	if (waited > 10 * HZ)
      		return -EAGAIN;
      	/* Sleep for the time remaining until the 10 second deadline. */
      	schedule_timeout(10 * HZ - waited);
      	return 0;
      }
      
      If any other need for context in a waiter were found it would be
      easy to use ->private for some other purpose, or even extend
      "struct wait_bit_key".
      
      My particular need is to support timeouts in nfs_release_page()
      to avoid deadlocks with loopback mounted NFS.
      
      While wait_on_bit_timeout() would be a cleaner interface, it
      will not meet my need.  I need the timeout to be sensitive to
      the state of the connection with the server, which could change.
      So I need to use an 'action' interface.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Steve French <sfrench@samba.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20140707051604.28027.41257.stgit@notabene.brown
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched: Remove proliferation of wait_on_bit() action functions · 74316201
      Committed by NeilBrown
      The current "wait_on_bit" interface requires an 'action'
      function to be provided which does the actual waiting.
      There are over 20 such functions, many of them identical.
      Most cases can be satisfied by one of just two functions, one
      which uses io_schedule() and one which just uses schedule().
      
      So:
       Rename wait_on_bit and        wait_on_bit_lock to
              wait_on_bit_action and wait_on_bit_lock_action
       to make it explicit that they need an action function.
      
       Introduce new wait_on_bit{,_lock} and wait_on_bit{,_lock}_io
       which are *not* given an action function but implicitly use
       a standard one.
       The decision to error-out if a signal is pending is now made
       based on the 'mode' argument rather than being encoded in the action
       function.
      
       All instances of the old wait_on_bit and wait_on_bit_lock which
       can use the new version have been changed accordingly and their
       action functions have been discarded.
       wait_on_bit{_lock} does not return any specific error code in the
       event of a signal so the caller must check for non-zero and
       interpolate their own error code as appropriate.
      
      The wait_on_bit() call in __fscache_wait_on_invalidate() was
      ambiguous as it specified TASK_UNINTERRUPTIBLE but used
      fscache_wait_bit_interruptible as an action function.
      David Howells confirms this should be uniformly
      "uninterruptible"
      
      The main remaining user of wait_on_bit{,_lock}_action is NFS
      which needs to use a freezer-aware schedule() call.
      
      A comment in fs/gfs2/glock.c notes that having multiple 'action'
      functions is useful as they display differently in the 'wchan'
      field of 'ps'. (and /proc/$PID/wchan).
      As the new bit_wait{,_io} functions are tagged "__sched", they
      will not show up at all, but something higher in the stack.  So
      the distinction will still be visible, only with different
      function names (gfs2_glock_wait versus gfs2_glock_dq_wait in the
      gfs2/glock.c case).
      
      Since the first version of this patch (against 3.15), two new action
      functions have appeared, one in NFS and one in CIFS.  CIFS also now
      uses an action function that makes the same freezer-aware
      schedule call as NFS.
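
      An illustrative before/after for a typical caller ('word', 'bit' and
      my_action are placeholder names):

      	/* old: action function spelled out at every call site */
      	err = wait_on_bit_action(word, bit, my_action, TASK_UNINTERRUPTIBLE);

      	/* new: implicit standard actions */
      	err = wait_on_bit(word, bit, TASK_UNINTERRUPTIBLE);	/* schedule()    */
      	err = wait_on_bit_io(word, bit, TASK_UNINTERRUPTIBLE);	/* io_schedule() */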
      Signed-off-by: NeilBrown <neilb@suse.de>
      Acked-by: David Howells <dhowells@redhat.com> (fscache, keys)
      Acked-by: Steven Whitehouse <swhiteho@redhat.com> (gfs2)
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Steve French <sfrench@samba.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20140707051603.28027.72349.stgit@notabene.brown
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  16. 18 April 2014 (1 commit)
  17. 06 November 2013 (2 commits)
  18. 16 October 2013 (1 commit)
    • sched/wait: Introduce prepare_to_wait_event() · c2d81644
      Committed by Oleg Nesterov
      Add the new helper, prepare_to_wait_event(), which should only be used
      by ___wait_event().

      prepare_to_wait_event() returns -ERESTARTSYS if signal_pending_state()
      is true, otherwise it does prepare_to_wait/exclusive.  This allows us to
      uninline the signal-pending checks in the wait_event*() macros.
      
      Also, it can initialize wait->private/func. We do not care if they were
      already initialized, the values are the same. This also shaves a couple
      of insns from the inlined code.
      
      This obviously makes the prepare_*() path a little bit slower, but we are
      likely going to sleep anyway, so I think it makes sense to shrink .text:
      
                     text    data      bss      dec     hex  filename
                  ===================================================
         before:  5126092 2959248 10117120 18202460 115bf5c   vmlinux
          after:  5124618 2955152 10117120 18196890 115a99a   vmlinux
      
      on my build.
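
      A sketch of the helper as described above (reconstructed from the
      description; details are assumptions, not verified against the tree):

      	long prepare_to_wait_event(wait_queue_head_t *q, wait_queue_t *wait, int state)
      	{
      		unsigned long flags;

      		if (signal_pending_state(state, current))
      			return -ERESTARTSYS;

      		/* Harmless if already initialized: the values are the same. */
      		wait->private = current;
      		wait->func = autoremove_wake_function;

      		spin_lock_irqsave(&q->lock, flags);
      		if (list_empty(&wait->task_list)) {
      			if (wait->flags & WQ_FLAG_EXCLUSIVE)
      				__add_wait_queue_tail(q, wait);
      			else
      				__add_wait_queue(q, wait);
      		}
      		set_current_state(state);
      		spin_unlock_irqrestore(&q->lock, flags);

      		return 0;
      	}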
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20131007161824.GA29757@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  19. 20 August 2013 (1 commit)
  20. 24 July 2013 (1 commit)
  21. 15 May 2013 (1 commit)
    • Add wait_on_atomic_t() and wake_up_atomic_t() · cb65537e
      Committed by David Howells
      Add wait_on_atomic_t() and wake_up_atomic_t() to indicate became-zero events on
      atomic_t types.  This uses the bit-wake waitqueue table.  The key is set to a
      value outside of the number of bits in a long so that wait_on_bit() won't be
      woken up accidentally.
      
      What I'm using this for is: in a following patch I add a counter to struct
      fscache_cookie to count the number of outstanding operations that need access
      to netfs data.  The way this works is:
      
       (1) When a cookie is allocated, the counter is initialised to 1.
      
       (2) When an operation wants to access netfs data, it calls atomic_inc_not_zero()
           to increment the counter before it does so.  If it was 0, then the counter
           isn't incremented, the operation isn't permitted to access the netfs data
           (which might by this point no longer exist) and the operation aborts in
           some appropriate manner.
      
       (3) When an operation finishes with the netfs data, it decrements the counter
           and if it reaches 0, calls wake_up_atomic_t() on it - the assumption being
           that it was the last blocker.
      
       (4) When a cookie is released, the counter is decremented and the releaser
           uses wait_on_atomic_t() to wait for the counter to become 0 - which should
           indicate no one is using the netfs data any longer.  The netfs data can
           then be destroyed.
      
      There are some alternatives that I have thought of and that have been suggested
      by Tejun Heo:
      
       (A) Using wait_on_bit() to wait on a bit in the counter.  This doesn't work
           because if that bit happens to be 0 then the wait won't happen - even if
           the counter is non-zero.
      
       (B) Using wait_on_bit() to wait on a flag elsewhere which is cleared when the
           counter reaches 0.  Such a flag would be redundant and would add
           complexity.
      
       (C) Adding a waitqueue to fscache_cookie - this would expand that struct by
           several words for an event that happens just once in each cookie's
           lifetime.  Further, cookies are generally per-file so there are likely to
           be a lot of them.
      
       (D) Similar to (C), but add a pointer to a waitqueue in the cookie instead of
           a waitqueue.  This would add a single word per cookie and so would be less
           of an expansion - but still an expansion.
      
       (E) Adding a static waitqueue to the fscache module.  Generally this would be
           fine, but under certain circumstances many cookies will all get added at
           the same time (eg. NFS umount, cache withdrawal) thereby presenting
           scaling issues.  Note that the wait may be significant as disk I/O may be
           in progress.
      
      So, I think reusing the wait_on_bit() waitqueue set is reasonable.  I don't
      make much use of the waitqueue I need on a per-cookie basis, but sometimes I
      have a huge flood of the cookies to deal with.
      
      I also don't want to add a whole new set of global waitqueue tables
      specifically for the dec-to-0 event if I can reuse the bit tables.
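
      An illustrative caller, following steps (2)-(4) above (the action
      helper and the n_active field name are hypothetical):

      	static int my_wait_action(atomic_t *p)
      	{
      		schedule();	/* simple sleep until woken */
      		return 0;
      	}

      	/* last user of the netfs data: */
      	if (atomic_dec_and_test(&cookie->n_active))
      		wake_up_atomic_t(&cookie->n_active);

      	/* releaser waits for the counter to become 0: */
      	wait_on_atomic_t(&cookie->n_active, my_wait_action, TASK_UNINTERRUPTIBLE);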
      Signed-off-by: David Howells <dhowells@redhat.com>
      Tested-by: Milosz Tanski <milosz@adfin.com>
      Acked-by: Jeff Layton <jlayton@redhat.com>
  22. 06 December 2012 (1 commit)
  23. 21 December 2011 (1 commit)
  24. 31 October 2011 (1 commit)
  25. 31 March 2011 (1 commit)
  26. 27 October 2010 (1 commit)
  27. 10 August 2009 (1 commit)
  28. 14 April 2009 (1 commit)
    • wait: don't use __wake_up_common() · 78ddb08f
      Committed by Johannes Weiner
      '777c6c5f wait: prevent exclusive waiter starvation' made
      __wake_up_common() global to be used from abort_exclusive_wait().
      
      It was needed to do a wake-up with the waitqueue lock held while
      passing down a key to the wake-up function.
      
      Since '4ede816a epoll keyed wakeups: add __wake_up_locked_key() and
      __wake_up_sync_key()' there is an appropriate wrapper for this case:
      __wake_up_locked_key().
      
      Use it here and make __wake_up_common() private to the scheduler
      again.
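
      The wrapper in question is essentially (a sketch, assuming the
      __wake_up_common() signature of that era):

      	void __wake_up_locked_key(wait_queue_head_t *q, unsigned int mode, void *key)
      	{
      		/* Caller already holds q->lock. */
      		__wake_up_common(q, mode, 1, 0, key);
      	}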
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1239720785-19661-1-git-send-email-hannes@cmpxchg.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  29. 06 February 2009 (1 commit)
    • wait: prevent exclusive waiter starvation · 777c6c5f
      Committed by Johannes Weiner
      With exclusive waiters, every process woken up through the wait queue must
      ensure that the next waiter down the line is woken when it has finished.
      
      Interruptible waiters don't do that when aborting due to a signal.  And if
      an aborting waiter is concurrently woken up through the waitqueue, no one
      will ever wake up the next waiter.
      
      This has been observed with __wait_on_bit_lock() used by
      lock_page_killable(): the first contender on the queue was aborting when
      the actual lock holder woke it up concurrently.  The aborted contender
      didn't acquire the lock and therefore never did an unlock followed by
      waking up the next waiter.
      
      Add abort_exclusive_wait() which removes the process' wait descriptor from
      the waitqueue, iff still queued, or wakes up the next waiter otherwise.
      It does so under the waitqueue lock.  Racing with a wake up means the
      aborting process is either already woken (removed from the queue) and will
      wake up the next waiter, or it will remove itself from the queue and the
      concurrent wake up will apply to the next waiter after it.
      
      Use abort_exclusive_wait() in __wait_event_interruptible_exclusive() and
      __wait_on_bit_lock() when they were interrupted by other means than a wake
      up through the queue.
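
      A sketch of abort_exclusive_wait() as described above (reconstructed
      from the description; details are assumptions, not verified against
      the tree):

      	void abort_exclusive_wait(wait_queue_head_t *q, wait_queue_t *wait,
      				  unsigned int mode, void *key)
      	{
      		unsigned long flags;

      		__set_current_state(TASK_RUNNING);
      		spin_lock_irqsave(&q->lock, flags);
      		if (!list_empty(&wait->task_list))
      			/* Still queued: just remove ourselves. */
      			list_del_init(&wait->task_list);
      		else if (waitqueue_active(q))
      			/* Already woken: pass the wake-up on. */
      			__wake_up_locked_key(q, mode, key);
      		spin_unlock_irqrestore(&q->lock, flags);
      	}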
      
      [akpm@linux-foundation.org: coding-style fixes]
      Reported-by: Chris Mason <chris.mason@oracle.com>
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Mentored-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Matthew Wilcox <matthew@wil.cx>
      Cc: Chuck Lever <cel@citi.umich.edu>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: <stable@kernel.org>		["after some testing"]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>