1. 15 November 2016, 1 commit
  2. 23 August 2016, 1 commit
    • rcu: Drive expedited grace periods from workqueue · 8b355e3b
      Authored by Paul E. McKenney
      The current implementation of expedited grace periods has the user
      task drive the grace period.  This works, but has downsides: (1) The
      user task must awaken tasks piggybacking on this grace period, which
      can result in latencies rivaling that of the grace period itself, and
      (2) User tasks can receive signals, which interfere with RCU CPU stall
      warnings.
      
      This commit therefore uses workqueues to drive the grace periods, so
      that the user task need not do the awakening.  A subsequent commit
      will remove the now-unnecessary code allowing for signals.  (A sketch
      of the workqueue approach follows this entry.)
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      8b355e3b
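      A minimal sketch of the workqueue approach follows; the names
      rcu_exp_work, wait_rcu_exp_gp, and rcu_exp_sel_wait_wake are
      illustrative stand-ins rather than a quote of the kernel code.  The
      requesting task queues a work item and sleeps, and the workqueue
      handler drives the grace period and does the wakeups.

        /* Sketch only: field and helper names are illustrative. */
        struct rcu_exp_work {
                unsigned long rew_s;            /* expedited GP sequence snapshot */
                struct work_struct rew_work;    /* handler drives the grace period */
        };

        static void wait_rcu_exp_gp(struct work_struct *wp)
        {
                struct rcu_exp_work *rewp =
                        container_of(wp, struct rcu_exp_work, rew_work);

                /* Select CPUs, wait for their quiescent states, then awaken
                 * every task piggybacking on this expedited grace period. */
                rcu_exp_sel_wait_wake(rewp->rew_s);
        }

        /* In the requesting task: hand off the driving work, then sleep. */
        struct rcu_exp_work rew = { .rew_s = s };
        INIT_WORK_ONSTACK(&rew.rew_work, wait_rcu_exp_gp);
        schedule_work(&rew.rew_work);

      Because the wakeups now happen in workqueue context, a signal sent to
      the requesting task can no longer disturb the grace-period machinery
      itself.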
  3. 16 June 2016, 1 commit
    • rcu: Correctly handle sparse possible cpus · bc75e999
      Authored by Mark Rutland
      In many cases in the RCU tree code, we iterate over the set of cpus for
      a leaf node described by rcu_node::grplo and rcu_node::grphi, checking
      per-cpu data for each cpu in this range. However, if the set of possible
      cpus is sparse, some cpus described in this range are not possible, and
      thus no per-cpu region will have been allocated (or initialised) for
      them by the generic percpu code.
      
      Erroneous accesses to a per-cpu area for these !possible cpus may fault
      or may hit other data depending on the address generated when the
      erroneous per-cpu offset is applied. In practice, both cases have been
      observed on arm64 hardware (the former being silent, but detectable with
      additional patches).
      
      To avoid issues resulting from this, we must iterate over the set of
      *possible* cpus for a given leaf node. This patch adds a new helper,
      for_each_leaf_node_possible_cpu, to enable this. As iteration is often
      intertwined with rcu_node local bitmask manipulation, a new
      leaf_node_cpu_bit helper is added to make this simpler and more
      consistent. The RCU tree code is made to use both of these where
      appropriate.  (A sketch of the two helpers follows this entry.)
      
      Without this patch, running reboot at a shell can result in an oops
      like:
      
      [ 3369.075979] Unable to handle kernel paging request at virtual address ffffff8008b21b4c
      [ 3369.083881] pgd = ffffffc3ecdda000
      [ 3369.087270] [ffffff8008b21b4c] *pgd=00000083eca48003, *pud=00000083eca48003, *pmd=0000000000000000
      [ 3369.096222] Internal error: Oops: 96000007 [#1] PREEMPT SMP
      [ 3369.101781] Modules linked in:
      [ 3369.104825] CPU: 2 PID: 1817 Comm: NetworkManager Tainted: G        W       4.6.0+ #3
      [ 3369.121239] task: ffffffc0fa13e000 ti: ffffffc3eb940000 task.ti: ffffffc3eb940000
      [ 3369.128708] PC is at sync_rcu_exp_select_cpus+0x188/0x510
      [ 3369.134094] LR is at sync_rcu_exp_select_cpus+0x104/0x510
      [ 3369.139479] pc : [<ffffff80081109a8>] lr : [<ffffff8008110924>] pstate: 200001c5
      [ 3369.146860] sp : ffffffc3eb9435a0
      [ 3369.150162] x29: ffffffc3eb9435a0 x28: ffffff8008be4f88
      [ 3369.155465] x27: ffffff8008b66c80 x26: ffffffc3eceb2600
      [ 3369.160767] x25: 0000000000000001 x24: ffffff8008be4f88
      [ 3369.166070] x23: ffffff8008b51c3c x22: ffffff8008b66c80
      [ 3369.171371] x21: 0000000000000001 x20: ffffff8008b21b40
      [ 3369.176673] x19: ffffff8008b66c80 x18: 0000000000000000
      [ 3369.181975] x17: 0000007fa951a010 x16: ffffff80086a30f0
      [ 3369.187278] x15: 0000007fa9505590 x14: 0000000000000000
      [ 3369.192580] x13: ffffff8008b51000 x12: ffffffc3eb940000
      [ 3369.197882] x11: 0000000000000006 x10: ffffff8008b51b78
      [ 3369.203184] x9 : 0000000000000001 x8 : ffffff8008be4000
      [ 3369.208486] x7 : ffffff8008b21b40 x6 : 0000000000001003
      [ 3369.213788] x5 : 0000000000000000 x4 : ffffff8008b27280
      [ 3369.219090] x3 : ffffff8008b21b4c x2 : 0000000000000001
      [ 3369.224406] x1 : 0000000000000001 x0 : 0000000000000140
      ...
      [ 3369.972257] [<ffffff80081109a8>] sync_rcu_exp_select_cpus+0x188/0x510
      [ 3369.978685] [<ffffff80081128b4>] synchronize_rcu_expedited+0x64/0xa8
      [ 3369.985026] [<ffffff80086b987c>] synchronize_net+0x24/0x30
      [ 3369.990499] [<ffffff80086ddb54>] dev_deactivate_many+0x28c/0x298
      [ 3369.996493] [<ffffff80086b6bb8>] __dev_close_many+0x60/0xd0
      [ 3370.002052] [<ffffff80086b6d48>] __dev_close+0x28/0x40
      [ 3370.007178] [<ffffff80086bf62c>] __dev_change_flags+0x8c/0x158
      [ 3370.012999] [<ffffff80086bf718>] dev_change_flags+0x20/0x60
      [ 3370.018558] [<ffffff80086cf7f0>] do_setlink+0x288/0x918
      [ 3370.023771] [<ffffff80086d0798>] rtnl_newlink+0x398/0x6a8
      [ 3370.029158] [<ffffff80086cee84>] rtnetlink_rcv_msg+0xe4/0x220
      [ 3370.034891] [<ffffff80086e274c>] netlink_rcv_skb+0xc4/0xf8
      [ 3370.040364] [<ffffff80086ced8c>] rtnetlink_rcv+0x2c/0x40
      [ 3370.045663] [<ffffff80086e1fe8>] netlink_unicast+0x160/0x238
      [ 3370.051309] [<ffffff80086e24b8>] netlink_sendmsg+0x2f0/0x358
      [ 3370.056956] [<ffffff80086a0070>] sock_sendmsg+0x18/0x30
      [ 3370.062168] [<ffffff80086a21cc>] ___sys_sendmsg+0x26c/0x280
      [ 3370.067728] [<ffffff80086a30ac>] __sys_sendmsg+0x44/0x88
      [ 3370.073027] [<ffffff80086a3100>] SyS_sendmsg+0x10/0x20
      [ 3370.078153] [<ffffff8008085e70>] el0_svc_naked+0x24/0x28
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reported-by: Dennis Chen <dennis.chen@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Lai Jiangshan <jiangshanlai@gmail.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Steve Capper <steve.capper@arm.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      bc75e999
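      A sketch of the two helpers described above, assuming the usual
      rcu_node ->grplo/->grphi bounds; handle_cpu() is purely illustrative.
      The iteration consults cpu_possible_mask, so CPUs in the leaf's range
      that are not possible are simply skipped.

        /* Visit only *possible* CPUs covered by this leaf rcu_node. */
        #define for_each_leaf_node_possible_cpu(rnp, cpu) \
                for ((cpu) = cpumask_next((rnp)->grplo - 1, cpu_possible_mask); \
                     (cpu) <= (rnp)->grphi; \
                     (cpu) = cpumask_next((cpu), cpu_possible_mask))

        /* Bit for this CPU within its leaf node's CPU bitmasks. */
        #define leaf_node_cpu_bit(rnp, cpu)  (1UL << ((cpu) - (rnp)->grplo))

        /* Typical use: only possible CPUs have per-cpu areas to touch. */
        for_each_leaf_node_possible_cpu(rnp, cpu)
                if (rnp->expmask & leaf_node_cpu_bit(rnp, cpu))
                        handle_cpu(cpu);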
  4. 01 April 2016, 5 commits
    • rcu: Awaken grace-period kthread if too long since FQS · 8c7c4829
      Authored by Paul E. McKenney
      Recent kernels can fail to awaken the grace-period kthread for
      quiescent-state forcing.  This commit is a crude hack that does
      a wakeup if a scheduling-clock interrupt sees that it has been
      too long since force-quiescent-state (FQS) processing.  (A sketch of
      the check follows this entry.)
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      8c7c4829
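      A sketch of the check, with an illustrative field name (jiffies_last_fqs)
      and an arbitrary threshold: the scheduling-clock interrupt path compares
      the current time against the last FQS attempt and, if too long has
      passed, requests FQS and awakens the grace-period kthread.

        /* Sketch only: called from the scheduling-clock interrupt path. */
        static void rcu_check_fqs_timeout(struct rcu_state *rsp)
        {
                unsigned long j = jiffies;

                if (rcu_gp_in_progress(rsp) &&
                    time_after(j, READ_ONCE(rsp->jiffies_last_fqs) + 2 * HZ)) {
                        WRITE_ONCE(rsp->gp_flags,
                                   READ_ONCE(rsp->gp_flags) | RCU_GP_FLAG_FQS);
                        swake_up(&rsp->gp_wq);  /* crude, but it unblocks FQS */
                }
        }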
    • rcu: Overlap wakeups with next expedited grace period · 3b5f668e
      Authored by Paul E. McKenney
      The current expedited grace-period implementation makes subsequent grace
      periods wait on wakeups for the prior grace period.  This does not fit
      the dictionary definition of "expedited", so this commit allows these two
      phases to overlap.  Doing this requires four waitqueues rather than two
      because tasks can now be waiting on the previous, current, and next grace
      periods.  The fourth waitqueue makes the bit masking work out nicely.  (The
      indexing idea is sketched after this entry.)
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      3b5f668e
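      A sketch of the indexing idea (queue placement and names are
      illustrative): two low-order bits of the expedited sequence number pick
      one of four queues, so waiters on the previous, current, and next grace
      periods never share a queue and each completion wakes only its own
      waiters.

        struct swait_queue_head exp_wq[4];      /* indexed by sequence bits */

        /* Waiter: sleep until the grace period snapshotted in s has ended. */
        swait_event(exp_wq[(s >> 1) & 0x3],
                    rcu_seq_done(&rsp->expedited_sequence, s));

        /* Completer: wake only the queue for the just-finished period. */
        swake_up_all(&exp_wq[(rsp->expedited_sequence >> 1) & 0x3]);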
    • rcu: Enforce expedited-GP fairness via funnel wait queue · f6a12f34
      Authored by Paul E. McKenney
      The current mutex-based funnel-locking approach used by expedited grace
      periods is subject to severe unfairness.  The problem arises when a
      few tasks, making a path from leaves to root, all wake up before other
      tasks do.  A new task can then follow this path all the way to the root,
      which needlessly delays tasks whose grace period is done, but who do
      not happen to acquire the lock quickly enough.
      
      This commit avoids this problem by maintaining per-rcu_node wait queues,
      along with a per-rcu_node counter that tracks the latest grace period
      sought by an earlier task to visit this node.  If that grace period
      would satisfy the current task, instead of proceeding up the tree,
      it waits on the current rcu_node structure using a pair of wait queues
      provided for that purpose.  This decouples awakening of old tasks from
      the arrival of new tasks.  (A simplified sketch follows this entry.)
      
      If the wakeups prove to be a bottleneck, additional kthreads can be
      brought to bear for that purpose.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      f6a12f34
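      A simplified sketch of the per-node logic; locking details are elided,
      and exp_work_done(), ->exp_lock, ->exp_seq_rq, and ->exp_wq are
      illustrative names for the mechanism the commit describes.

        for (; rnp != NULL; rnp = rnp->parent) {
                if (exp_work_done(rsp, s))      /* someone finished our GP */
                        return;
                spin_lock(&rnp->exp_lock);
                if (ULONG_CMP_GE(rnp->exp_seq_rq, s)) {
                        /* An earlier task already asked this node for a
                         * sufficient grace period: wait here, don't climb. */
                        spin_unlock(&rnp->exp_lock);
                        wait_event(rnp->exp_wq[(s >> 1) & 0x1],
                                   exp_work_done(rsp, s));
                        return;
                }
                rnp->exp_seq_rq = s;            /* record our request */
                spin_unlock(&rnp->exp_lock);
        }
        /* Reached the root: this task drives the expedited grace period. */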
    • rcu: Shorten expedited_workdone* to exp_workdone* · d40a4f09
      Authored by Paul E. McKenney
      Just a name change to save a few lines and a bit of typing.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      d40a4f09
    • rcu: Remove expedited GP funnel-lock bypass · e2fd9d35
      Authored by Paul E. McKenney
      Commit #cdacbe1f ("rcu: Add fastpath bypassing funnel locking")
      turns out to be a pessimization at high load because it forces a tree
      full of tasks to wait for an expedited grace period that they probably
      do not need.  This commit therefore removes this optimization.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      e2fd9d35
  5. 25 February 2016, 2 commits
    • rcu: Use simple wait queues where possible in rcutree · abedf8e2
      Authored by Paul Gortmaker
      As of commit dae6e64d ("rcu: Introduce proper blocking to no-CBs kthreads
      GP waits") the RCU subsystem started making use of wait queues.
      
      Here we convert all additions of RCU wait queues to use simple wait queues,
      since they don't need the extra overhead of the full wait queue features.
      
      Originally this was done for RT kernels[1], since we would get things like...
      
        BUG: sleeping function called from invalid context at kernel/rtmutex.c:659
        in_atomic(): 1, irqs_disabled(): 1, pid: 8, name: rcu_preempt
        Pid: 8, comm: rcu_preempt Not tainted
        Call Trace:
         [<ffffffff8106c8d0>] __might_sleep+0xd0/0xf0
         [<ffffffff817d77b4>] rt_spin_lock+0x24/0x50
         [<ffffffff8106fcf6>] __wake_up+0x36/0x70
         [<ffffffff810c4542>] rcu_gp_kthread+0x4d2/0x680
         [<ffffffff8105f910>] ? __init_waitqueue_head+0x50/0x50
         [<ffffffff810c4070>] ? rcu_gp_fqs+0x80/0x80
         [<ffffffff8105eabb>] kthread+0xdb/0xe0
         [<ffffffff8106b912>] ? finish_task_switch+0x52/0x100
         [<ffffffff817e0754>] kernel_thread_helper+0x4/0x10
         [<ffffffff8105e9e0>] ? __init_kthread_worker+0x60/0x60
         [<ffffffff817e0750>] ? gs_change+0xb/0xb
      
      ...and hence simple wait queues were deployed on RT out of necessity
      (as simple wait uses a raw lock), but mainline might as well take
      advantage of the more streamlined support as well.  (A minimal usage
      sketch follows this entry.)
      
      [1] This is a carry-forward of work from v3.10-rt; the original conversion
      was by Thomas on an earlier -rt version, and Sebastian extended it to
      additional RCU waiters added after 3.10.  Here I have added a commit log,
      unified the RCU changes into one patch, and updated it to match mainline RCU.
      Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: linux-rt-users@vger.kernel.org
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/1455871601-27484-6-git-send-email-wagi@monom.org
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      abedf8e2
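      A minimal usage sketch of the conversion (gp_wq, gp_swq, and gp_flags
      are illustrative): the simple-wait API mirrors the full one, but the
      queue is protected by a raw lock and omits features such as exclusive
      waiters and custom wake callbacks, which RCU does not need here.

        #include <linux/wait.h>
        #include <linux/swait.h>

        int gp_flags;           /* illustrative wait condition */

        /* Before: full wait queue. */
        wait_queue_head_t gp_wq;
        init_waitqueue_head(&gp_wq);
        wait_event(gp_wq, READ_ONCE(gp_flags));         /* waiter */
        wake_up(&gp_wq);                                /* waker  */

        /* After: simple wait queue (raw-lock based, fine on RT). */
        struct swait_queue_head gp_swq;
        init_swait_queue_head(&gp_swq);
        swait_event(gp_swq, READ_ONCE(gp_flags));       /* waiter */
        swake_up(&gp_swq);                              /* waker  */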
    • rcu: Do not call rcu_nocb_gp_cleanup() while holding rnp->lock · 065bb78c
      Authored by Daniel Wagner
      rcu_nocb_gp_cleanup() is called while holding rnp->lock. Currently,
      this is okay because the wake_up_all() in rcu_nocb_gp_cleanup() does
      not enable IRQs, so lockdep is happy.
      
      By switching over to swait this is no longer true: swake_up_all()
      enables IRQs while processing the waiters. __do_softirq() can then
      run and will eventually call rcu_process_callbacks(), which wants to
      grab rnp->lock.
      
      Let's move the rcu_nocb_gp_cleanup() call outside the lock before we
      switch over to swait (see the sketch after this entry).
      
      If we were to hold rnp->lock and use swait, lockdep reports the
      following:
      
       =================================
       [ INFO: inconsistent lock state ]
       4.2.0-rc5-00025-g9a73ba0 #136 Not tainted
       ---------------------------------
       inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
       rcu_preempt/8 [HC0[0]:SC0[0]:HE1:SE1] takes:
        (rcu_node_1){+.?...}, at: [<ffffffff811387c7>] rcu_gp_kthread+0xb97/0xeb0
       {IN-SOFTIRQ-W} state was registered at:
         [<ffffffff81109b9f>] __lock_acquire+0xd5f/0x21e0
         [<ffffffff8110be0f>] lock_acquire+0xdf/0x2b0
         [<ffffffff81841cc9>] _raw_spin_lock_irqsave+0x59/0xa0
         [<ffffffff81136991>] rcu_process_callbacks+0x141/0x3c0
         [<ffffffff810b1a9d>] __do_softirq+0x14d/0x670
         [<ffffffff810b2214>] irq_exit+0x104/0x110
         [<ffffffff81844e96>] smp_apic_timer_interrupt+0x46/0x60
         [<ffffffff81842e70>] apic_timer_interrupt+0x70/0x80
         [<ffffffff810dba66>] rq_attach_root+0xa6/0x100
         [<ffffffff810dbc2d>] cpu_attach_domain+0x16d/0x650
         [<ffffffff810e4b42>] build_sched_domains+0x942/0xb00
         [<ffffffff821777c2>] sched_init_smp+0x509/0x5c1
         [<ffffffff821551e3>] kernel_init_freeable+0x172/0x28f
         [<ffffffff8182cdce>] kernel_init+0xe/0xe0
         [<ffffffff8184231f>] ret_from_fork+0x3f/0x70
       irq event stamp: 76
       hardirqs last  enabled at (75): [<ffffffff81841330>] _raw_spin_unlock_irq+0x30/0x60
       hardirqs last disabled at (76): [<ffffffff8184116f>] _raw_spin_lock_irq+0x1f/0x90
       softirqs last  enabled at (0): [<ffffffff810a8df2>] copy_process.part.26+0x602/0x1cf0
       softirqs last disabled at (0): [<          (null)>]           (null)
       other info that might help us debug this:
        Possible unsafe locking scenario:
              CPU0
              ----
         lock(rcu_node_1);
         <Interrupt>
           lock(rcu_node_1);
        *** DEADLOCK ***
       1 lock held by rcu_preempt/8:
        #0:  (rcu_node_1){+.?...}, at: [<ffffffff811387c7>] rcu_gp_kthread+0xb97/0xeb0
       stack backtrace:
       CPU: 0 PID: 8 Comm: rcu_preempt Not tainted 4.2.0-rc5-00025-g9a73ba0 #136
       Hardware name: Dell Inc. PowerEdge R820/066N7P, BIOS 2.0.20 01/16/2014
        0000000000000000 000000006d7e67d8 ffff881fb081fbd8 ffffffff818379e0
        0000000000000000 ffff881fb0812a00 ffff881fb081fc38 ffffffff8110813b
        0000000000000000 0000000000000001 ffff881f00000001 ffffffff8102fa4f
       Call Trace:
        [<ffffffff818379e0>] dump_stack+0x4f/0x7b
        [<ffffffff8110813b>] print_usage_bug+0x1db/0x1e0
        [<ffffffff8102fa4f>] ? save_stack_trace+0x2f/0x50
        [<ffffffff811087ad>] mark_lock+0x66d/0x6e0
        [<ffffffff81107790>] ? check_usage_forwards+0x150/0x150
        [<ffffffff81108898>] mark_held_locks+0x78/0xa0
        [<ffffffff81841330>] ? _raw_spin_unlock_irq+0x30/0x60
        [<ffffffff81108a28>] trace_hardirqs_on_caller+0x168/0x220
        [<ffffffff81108aed>] trace_hardirqs_on+0xd/0x10
        [<ffffffff81841330>] _raw_spin_unlock_irq+0x30/0x60
        [<ffffffff810fd1c7>] swake_up_all+0xb7/0xe0
        [<ffffffff811386e1>] rcu_gp_kthread+0xab1/0xeb0
        [<ffffffff811089bf>] ? trace_hardirqs_on_caller+0xff/0x220
        [<ffffffff81841341>] ? _raw_spin_unlock_irq+0x41/0x60
        [<ffffffff81137c30>] ? rcu_barrier+0x20/0x20
        [<ffffffff810d2014>] kthread+0x104/0x120
        [<ffffffff81841330>] ? _raw_spin_unlock_irq+0x30/0x60
        [<ffffffff810d1f10>] ? kthread_create_on_node+0x260/0x260
        [<ffffffff8184231f>] ret_from_fork+0x3f/0x70
        [<ffffffff810d1f10>] ? kthread_create_on_node+0x260/0x260
      Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: linux-rt-users@vger.kernel.org
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/1455871601-27484-5-git-send-email-wagi@monom.org
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      065bb78c
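      A sketch of the reordering; rcu_nocb_gp_get() is the kind of helper the
      change needs (names and signatures here are illustrative).  The wait
      queue to be woken is captured while rnp->lock is still held, the lock is
      dropped, and only then does the cleanup run, so swake_up_all() may
      re-enable IRQs without lockdep complaints.

        struct swait_queue_head *nocb_gp_wq;

        raw_spin_lock_irq(&rnp->lock);
        /* ...grace-period cleanup that genuinely needs the lock... */
        nocb_gp_wq = rcu_nocb_gp_get(rnp);      /* snapshot the queue to wake */
        raw_spin_unlock_irq(&rnp->lock);

        /* Safe here: swake_up_all() may enable IRQs while walking waiters. */
        rcu_nocb_gp_cleanup(nocb_gp_wq);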
  6. 24 February 2016, 1 commit
    • RCU: Privatize rcu_node::lock · 67c583a7
      Authored by Boqun Feng
      In the patch:
      
      "rcu: Add transitivity to remaining rcu_node ->lock acquisitions"
      
      All locking operations on rcu_node::lock are replaced with the wrappers
      because of the need for transitivity, which means we should never
      write code using LOCK primitives alone (i.e., without a proper barrier
      following) on rcu_node::lock outside those wrappers. We can detect
      this kind of misuse of rcu_node::lock in the future by adding the
      __private modifier to rcu_node::lock.
      
      To privatize rcu_node::lock, unlock wrappers are also needed. Replacing
      spinlock unlocks with these wrappers not only privatizes rcu_node::lock
      but also makes it easier to figure out critical sections of rcu_node.
      
      This patch adds the __private modifier to rcu_node::lock and wraps every
      access to it with ACCESS_PRIVATE(). In addition, unlock wrappers are
      added, and raw_spin_unlock(&rnp->lock) and its friends are replaced with
      those wrappers.  (The wrappers are sketched after this entry.)
      Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      67c583a7
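      A sketch of the wrappers, closely following the pattern the commit
      describes: the field is marked __private so that sparse flags any access
      outside ACCESS_PRIVATE(), and the lock side keeps the
      smp_mb__after_unlock_lock() needed for transitivity.

        struct rcu_node {
                raw_spinlock_t __private lock;  /* touch only via the wrappers */
                /* ... */
        };

        #define raw_spin_lock_rcu_node(p)                               \
        do {                                                            \
                raw_spin_lock(&ACCESS_PRIVATE(p, lock));                \
                smp_mb__after_unlock_lock();    /* transitivity */      \
        } while (0)

        #define raw_spin_unlock_rcu_node(p) \
                raw_spin_unlock(&ACCESS_PRIVATE(p, lock))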
  7. 06 December 2015, 1 commit
  8. 05 December 2015, 2 commits
    • rcu: Reduce expedited GP memory contention via per-CPU variables · df5bd514
      Authored by Paul E. McKenney
      Currently, the piggybacked-work checks carried out by sync_exp_work_done()
      atomically increment a small set of variables (the ->expedited_workdone0,
      ->expedited_workdone1, ->expedited_workdone2, ->expedited_workdone3
      fields in the rcu_state structure), which will form a memory-contention
      bottleneck given a sufficiently large number of CPUs concurrently invoking
      either synchronize_rcu_expedited() or synchronize_sched_expedited().
      
      This commit therefore moves these four fields to the per-CPU rcu_data
      structure, eliminating the memory contention.  The show_rcuexp() function
      is also changed to sum up each field across the rcu_data structures.  (A
      sketch of the summation follows this entry.)
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      df5bd514
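      A sketch of the change with a stripped-down rcu_data holding just one of
      the counters (names abbreviated): the fast path increments CPU-local
      memory, and the statistics code sums over all possible CPUs.

        struct rcu_data {
                /* ... */
                unsigned long exp_workdone1;    /* work piggybacked on others */
        };
        DEFINE_PER_CPU(struct rcu_data, rcu_data);

        /* Fast path: strictly CPU-local, no shared cache line to bounce. */
        this_cpu_inc(rcu_data.exp_workdone1);

        /* show_rcuexp(): sum the per-CPU counters for display. */
        unsigned long s1 = 0;
        int cpu;

        for_each_possible_cpu(cpu)
                s1 += per_cpu(rcu_data, cpu).exp_workdone1;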
    • rcu: Clarify role of ->expmaskinitnext · 1de6e56d
      Authored by Paul E. McKenney
      Analogy with the ->qsmaskinitnext field might lead one to believe that
      ->expmaskinitnext tracks online CPUs.  This belief is incorrect: Any CPU
      that has ever been online will have its bit set in the ->expmaskinitnext
      field.  This commit therefore adds a comment to make this clear, at
      least to people who read comments.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      1de6e56d
  9. 24 November 2015, 1 commit
  10. 08 October 2015, 3 commits
    • rcu: Add online/offline info to expedited stall warning message · 74611ecb
      Authored by Paul E. McKenney
      This commit makes the RCU CPU stall warning message print online/offline
      indications immediately after the CPU number.  An "O" indicates globally
      offline and a "." globally online; an "o" indicates that RCU believes the
      CPU is offline for the current grace period ("." otherwise); and an "N"
      indicates that RCU believes the CPU will be offline for the next grace
      period ("." otherwise).  So for CPU 10 you would normally see "10-...:",
      indicating that everything believes the CPU is online.  (A sketch of the
      indicator characters follows this entry.)
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      74611ecb
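      A sketch of how such indicator characters can be emitted (treat the exact
      expression as illustrative; the masks are the rcu_node fields mentioned
      elsewhere in this log): each position indexes a two-character string
      with a boolean.

        pr_cont(" %d-%c%c%c", cpu,
                "O."[!!cpu_online(cpu)],                        /* global   */
                "o."[!!(rdp->grpmask & rnp->qsmaskinit)],       /* this GP  */
                "N."[!!(rdp->grpmask & rnp->qsmaskinitnext)]);  /* next GP  */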
    • rcu: Stop silencing lockdep false positive for expedited grace periods · 83c2c735
      Authored by Paul E. McKenney
      This reverts commit af859bea (rcu: Silence lockdep false positive
      for expedited grace periods).  Because synchronize_rcu_expedited()
      no longer invokes synchronize_sched_expedited(), ->exp_funnel_mutex
      acquisition is no longer nested, so the false positive no longer happens.
      This commit therefore removes the extra lockdep data structures, as they
      are no longer needed.
      83c2c735
    • rcu: Switch synchronize_sched_expedited() to IPI · 6587a23b
      Authored by Paul E. McKenney
      This commit switches synchronize_sched_expedited() from stop_one_cpu_nowait()
      to smp_call_function_single(), thus moving from an IPI and a pair of
      context switches to an IPI and a single pass through the scheduler.
      Of course, if the scheduler actually does decide to switch to a different
      task, there will still be a pair of context switches, but there would
      likely have been a pair of context switches anyway, just a bit later.  (A
      sketch of the IPI-based approach follows this entry.)
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      6587a23b
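      A sketch of the IPI-based scheme; the handler body and the idle check
      are illustrative.  Each target CPU runs a short function in interrupt
      context that either reports an immediate quiescent state or flags that
      one is needed at the next opportunity, with no helper task and no forced
      context switch.

        /* Runs on the target CPU, in IPI (hardirq) context. */
        static void sync_sched_exp_handler(void *info)
        {
                struct rcu_data *rdp = this_cpu_ptr(&rcu_data);

                if (rcu_is_cpu_idle_here())             /* illustrative check */
                        rcu_report_exp_cpu(rdp);        /* already quiescent  */
                else
                        WRITE_ONCE(rdp->cpu_no_qs.b.exp, true); /* QS needed  */
        }

        /* Initiator: one IPI per CPU still blocking the expedited GP. */
        smp_call_function_single(cpu, sync_sched_exp_handler, NULL, 0);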
  11. 07 October 2015, 4 commits
  12. 21 September 2015, 5 commits
    • rcu: Make ->cpu_no_qs be a union for aggregate OR · 5b74c458
      Authored by Paul E. McKenney
      This commit converts the rcu_data structure's ->cpu_no_qs field
      to a union.  The bytewise side of this union allows individual access
      to indications as to whether this CPU needs to find a quiescent state
      for a normal (.norm) and/or expedited (.exp) grace period.  The setwise
      side of the union allows testing whether or not a quiescent state is
      needed at all, for either type of grace period.
      
      For now, only .norm is used.  A later commit will introduce the expedited
      usage.  (The union layout is sketched after this entry.)
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      5b74c458
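      A sketch of the union layout the commit describes: the .b byte fields
      give per-reason access, while .s lets a single load test whether any
      quiescent state is needed at all.

        union rcu_noqs {
                struct {
                        u8 norm;        /* QS still needed for a normal GP?     */
                        u8 exp;         /* QS still needed for an expedited GP? */
                } b;                    /* Bytewise access. */
                u16 s;                  /* Setwise access: aggregate OR of both. */
        };

        rdp->cpu_no_qs.b.norm = false;  /* report the normal-GP part... */
        if (!rdp->cpu_no_qs.s)          /* ...and test both at once. */
                return;                 /* no quiescent state needed at all */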
    • rcu: Invert passed_quiesce and rename to cpu_no_qs · 0d43eb34
      Authored by Paul E. McKenney
      This commit inverts the sense of the rcu_data structure's ->passed_quiesce
      field and renames it to ->cpu_no_qs.  This will allow a later commit to
      use an "aggregate OR" operation to test expedited as well as normal grace
      periods without added overhead.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      0d43eb34
    • rcu: Rename qs_pending to core_needs_qs · 97c668b8
      Authored by Paul E. McKenney
      An upcoming commit needs to invert the sense of the ->passed_quiesce
      rcu_data structure field, so this commit is taking this opportunity
      to clarify things a bit by renaming ->qs_pending to ->core_needs_qs.
      
      So if !rdp->core_needs_qs, then this CPU need not concern itself with
      quiescent states; in particular, it need not acquire its leaf rcu_node
      structure's ->lock to check.  Otherwise, it needs to report the next
      quiescent state.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      97c668b8
    • rcu: Move synchronize_sched_expedited() to combining tree · bce5fa12
      Authored by Paul E. McKenney
      Currently, synchronize_sched_expedited() uses a single global counter
      to track the number of remaining context switches that the current
      expedited grace period must wait on.  This is problematic on large
      systems, where the resulting memory contention can be pathological.
      This commit therefore makes synchronize_sched_expedited() instead use
      the combining tree in the same manner as synchronize_rcu_expedited(),
      keeping memory contention down to a dull roar.
      
      This commit creates a temporary function sync_sched_exp_select_cpus()
      that is very similar to sync_rcu_exp_select_cpus().  A later commit
      will consolidate these two functions, which becomes possible when
      synchronize_sched_expedited() switches from stop_one_cpu_nowait() to
      smp_call_function_single().
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      bce5fa12
    • rcu: Consolidate tree setup for synchronize_rcu_expedited() · b9585e94
      Authored by Paul E. McKenney
      This commit replaces sync_rcu_preempt_exp_init1() and
      sync_rcu_preempt_exp_init2() with sync_exp_reset_tree_hotplug()
      and sync_exp_reset_tree(), which will also be used by
      synchronize_sched_expedited(), and sync_rcu_exp_select_nodes(), which
      contains code specific to synchronize_rcu_expedited().
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      b9585e94
  13. 04 August 2015, 2 commits
    • rcu,locking: Privatize smp_mb__after_unlock_lock() · 12d560f4
      Authored by Paul E. McKenney
      RCU is the only thing that uses smp_mb__after_unlock_lock(), and is
      likely the only thing that ever will use it, so this commit makes this
      macro private to RCU.  (The macro's definition is sketched after this
      entry.)
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: "linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>
      12d560f4
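      For reference, a sketch of the now-RCU-private macro: it expands to a
      full barrier only on architectures (PowerPC) where an unlock followed by
      a lock does not already provide full ordering.

        /* kernel/rcu/tree.h (sketch) */
        #ifdef CONFIG_PPC
        #define smp_mb__after_unlock_lock()     smp_mb()  /* Full ordering for lock. */
        #else
        #define smp_mb__after_unlock_lock()     do { } while (0)
        #endif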
    • rcu: Silence lockdep false positive for expedited grace periods · af859bea
      Authored by Paul E. McKenney
      In a CONFIG_PREEMPT=y kernel, synchronize_rcu_expedited()
      acquires the ->exp_funnel_mutex in rcu_preempt_state, then invokes
      synchronize_sched_expedited(), which acquires the ->exp_funnel_mutex in
      rcu_sched_state.  There can be no deadlock because rcu_preempt_state
      ->exp_funnel_mutex acquisition always precedes that of rcu_sched_state.
      But lockdep does not know that, so it gives false-positive splats.
      
      This commit therefore associates a separate lock_class_key structure
      with the rcu_sched_state structure's ->exp_funnel_mutex, allowing
      lockdep to see the lock ordering, avoiding the false positives.  (A
      sketch follows this entry.)
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      af859bea
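      A sketch of one way to give the rcu_sched instance of the mutex its own
      lockdep class (the key name and the exact hookup point are illustrative):

        static struct lock_class_key rcu_exp_sched_class;

        mutex_init(&rcu_sched_state.exp_funnel_mutex);
        lockdep_set_class_and_name(&rcu_sched_state.exp_funnel_mutex,
                                   &rcu_exp_sched_class,
                                   "rcu_sched_state.exp_funnel_mutex");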
  14. 18 July 2015, 11 commits