  1. 16 May, 2018 2 commits
  2. 24 Feb, 2018 1 commit
    • P
      rcu: Create RCU-specific workqueues with rescuers · ad7c946b
      Paul E. McKenney committed
      RCU's expedited grace periods can participate in out-of-memory deadlocks
      due to all available system_wq kthreads being blocked and there not being
      memory available to create more.  This commit prevents such deadlocks
      by allocating an RCU-specific workqueue_struct at early boot time, and
      providing it with a rescuer to ensure forward progress.  This uses the
      shiny new init_rescuer() function provided by Tejun (but indirectly).
      
      This commit also causes SRCU to use this new RCU-specific
      workqueue_struct.  Note that SRCU's use of workqueues never blocks them
      waiting for readers, so this should be safe from a forward-progress
      viewpoint.  Note that this moves SRCU from system_power_efficient_wq
      to a normal workqueue.  In the unlikely event that this results in
      measurable degradation, a separate power-efficient workqueue will be
      created for SRCU.
      Reported-by: Prateek Sood <prsood@codeaurora.org>
      Reported-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      ad7c946b
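The forward-progress argument in this commit can be illustrated with a toy userspace model. This is only a sketch and not kernel code: `toy_wq`, `toy_create_worker()`, and `toy_queue_work()` are hypothetical names, and "out of memory" is reduced to a boolean. It shows why a queue that reserved a rescuer at creation time can still run work when no new worker can be created.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model (hypothetical): a workqueue that may own a pre-allocated rescuer. */
struct toy_wq {
	int idle_workers;	/* workers currently free to run work */
	bool has_rescuer;	/* rescuer reserved when the queue was created */
	int completed;		/* work items that have been run */
};

/* Stand-in for worker creation, which fails when memory is exhausted. */
static bool toy_create_worker(bool out_of_memory)
{
	return !out_of_memory;
}

/* Returns true if the queued work item can make forward progress. */
static bool toy_queue_work(struct toy_wq *wq, bool out_of_memory)
{
	if (wq->idle_workers > 0) {
		wq->idle_workers--;	/* an idle worker takes the item */
	} else if (toy_create_worker(out_of_memory)) {
		/* a newly created worker takes the item */
	} else if (wq->has_rescuer) {
		/* the pre-allocated rescuer takes the item */
	} else {
		return false;		/* nothing can run it: stuck */
	}
	wq->completed++;
	return true;
}
```

With no idle workers and memory exhausted, a queue without a rescuer reports no progress, while a queue with one still completes the item.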
  3. 21 Feb, 2018 4 commits
  4. 29 Nov, 2017 1 commit
  5. 28 Nov, 2017 2 commits
  6. 10 Oct, 2017 2 commits
  7. 17 Aug, 2017 1 commit
  8. 09 Jun, 2017 14 commits
  9. 19 Apr, 2017 10 commits
    • P
      srcu: Merge ->srcu_state into ->srcu_gp_seq · 80a7956f
      Paul E. McKenney committed
      Updating ->srcu_state and ->srcu_gp_seq will lead to extremely complex
      race conditions given multiple callback queues, so this commit takes
      advantage of the two-bit state now available in rcu_seq counters to
      store the state in the bottom two bits of ->srcu_gp_seq.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      80a7956f
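As a rough userspace sketch of the packing this commit describes, the following keeps a three-valued state in the bottom two bits of a sequence counter. The `RCU_SEQ_CTR_SHIFT`, `RCU_SEQ_STATE_MASK`, and `SRCU_STATE_*` names follow kernel naming, but the helper functions (`seq_ctr()`, `seq_state()`, `seq_set_state()`, `seq_end()`) are simplified, hypothetical stand-ins that omit all memory ordering.

```c
#include <assert.h>

#define RCU_SEQ_CTR_SHIFT	2
#define RCU_SEQ_STATE_MASK	((1UL << RCU_SEQ_CTR_SHIFT) - 1)

/* SRCU's three grace-period states, stored in the low-order bits. */
#define SRCU_STATE_IDLE		0
#define SRCU_STATE_SCAN1	1
#define SRCU_STATE_SCAN2	2

/* Counter portion: everything above the state bits. */
static unsigned long seq_ctr(unsigned long s)
{
	return s >> RCU_SEQ_CTR_SHIFT;
}

/* State portion: the bottom two bits. */
static unsigned long seq_state(unsigned long s)
{
	return s & RCU_SEQ_STATE_MASK;
}

/* Replace the state bits, leaving the counter bits untouched. */
static unsigned long seq_set_state(unsigned long s, unsigned long newstate)
{
	assert(newstate <= RCU_SEQ_STATE_MASK);
	return (s & ~RCU_SEQ_STATE_MASK) | newstate;
}

/* End a grace period: clear the state bits and advance the counter by one. */
static unsigned long seq_end(unsigned long s)
{
	return (s | RCU_SEQ_STATE_MASK) + 1;
}
```

Because state and counter live in one word, a single store updates both, which is the property that avoids the races between two separately updated fields.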
    • P
      srcu: Allow a second bit in rcu_seq for SRCU state · f1ec57a4
      Paul E. McKenney committed
      This commit increases the number of reserved bits at the bottom of an
      rcu_seq grace-period counter from one to two, as will be needed to
      accommodate SRCU's three-state grace periods.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      f1ec57a4
    • P
      srcu: Improve rcu_seq grace-period-counter abstraction · 031aeee0
      Paul E. McKenney committed
      The expedited grace-period code contains several open-coded shifts that
      know the format of an rcu_seq grace-period counter, which is not
      particularly good style.  This commit therefore creates a new
      rcu_seq_ctr() function that extracts the counter portion of the
      counter, and an rcu_seq_state() function that extracts the low-order
      state bit.  This commit prepares for SRCU callback parallelization,
      which will require two state bits.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      031aeee0
    • P
      srcu: Make num_rcu_lvl[] array be external · e95d68d2
      Paul E. McKenney committed
      This commit makes the num_rcu_lvl[] array external so that SRCU can
      make use of it for initializing its upcoming srcu_node tree.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      e95d68d2
    • P
      srcu: Move rcu_node traversal macros to rcu.h · efbe451d
      Paul E. McKenney committed
      This commit moves rcu_for_each_node_breadth_first(),
      rcu_for_each_nonleaf_node_breadth_first(), and
      rcu_for_each_leaf_node() from kernel/rcu/tree.h to
      kernel/rcu/rcu.h so that SRCU can access them.
      This commit is code-movement only.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      efbe451d
    • P
      srcu: Move rcu_init_levelspread() to rcu_tree_node.h · 2b34c43c
      Paul E. McKenney committed
      This commit moves the rcu_init_levelspread() function from
      kernel/rcu/tree.c to kernel/rcu/rcu.h so that SRCU can access it.  This is
      another step towards enabling SRCU to create its own combining tree.
      This commit is code-movement only, give or take knock-on adjustments.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      2b34c43c
    • P
      srcu: Use rcu_segcblist to track SRCU callbacks · 8660b7d8
      Paul E. McKenney committed
      This commit switches SRCU from custom-built callback queues to the new
      rcu_segcblist structure.  This change associates grace-period sequence
      numbers with groups of callbacks, which will be needed for efficient
      processing of per-CPU callbacks.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      8660b7d8
    • D
      rcu: Fix warning in rcu_seq_end() · f010ed82
      Dmitry Vyukov committed
      The rcu_seq_end() function increments seq to signify completion of a
      grace period, then checks that seq is even and wakes
      _synchronize_rcu_expedited().  The _synchronize_rcu_expedited() function
      uses wait_event() to wait for seq to become even.  The problem is that
      wait_event() can return as soon as seq becomes even, without waiting for
      the wakeup.  In that case, the warning in rcu_seq_end() can fire falsely
      if the next expedited grace period starts before the check.
      
      Fix this by checking that seq has a good value before incrementing it.
      Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
      Cc: syzkaller@googlegroups.com
      Cc: linux-kernel@vger.kernel.org
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: josh@joshtriplett.org
      Cc: jiangshanlai@gmail.com
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      
      ---
      
      syzkaller-triggered warning:
      
      WARNING: CPU: 0 PID: 4832 at kernel/rcu/tree.c:3533
      rcu_seq_end+0x110/0x140 kernel/rcu/tree.c:3533
      CPU: 0 PID: 4832 Comm: kworker/0:3 Not tainted 4.10.0+ #276
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
      Workqueue: events wait_rcu_exp_gp
      Call Trace:
       __dump_stack lib/dump_stack.c:15 [inline]
       dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
       panic+0x1fb/0x412 kernel/panic.c:179
       __warn+0x1c4/0x1e0 kernel/panic.c:540
       warn_slowpath_null+0x2c/0x40 kernel/panic.c:583
       rcu_seq_end+0x110/0x140 kernel/rcu/tree.c:3533
       rcu_exp_gp_seq_end kernel/rcu/tree_exp.h:36 [inline]
       rcu_exp_wait_wake+0x8a9/0x1330 kernel/rcu/tree_exp.h:517
       rcu_exp_sel_wait_wake kernel/rcu/tree_exp.h:559 [inline]
       wait_rcu_exp_gp+0x83/0xc0 kernel/rcu/tree_exp.h:570
       process_one_work+0xc06/0x1c20 kernel/workqueue.c:2096
       worker_thread+0x223/0x19c0 kernel/workqueue.c:2230
       kthread+0x326/0x3f0 kernel/kthread.c:227
       ret_from_fork+0x31/0x40 arch/x86/entry/entry_64.S:430
      ---
      f010ed82
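The race described in this commit can be replayed deterministically in userspace. The sketch below is a single-threaded model, not the kernel code: the `racing_start` flag is a hypothetical device that injects the start of the next grace period at the point where another task could run, and both helper names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Begin a grace period: the counter becomes odd. */
static void seq_start(unsigned long *sp)
{
	(*sp)++;
}

/*
 * Old ordering: increment, then check that the counter is even.
 * 'racing_start' models a new expedited grace period beginning between
 * the increment and the check.  Returns true if the warning would fire.
 */
static bool seq_end_check_after(unsigned long *sp, bool racing_start)
{
	(*sp)++;
	if (racing_start)
		seq_start(sp);
	return (*sp & 0x1) != 0;
}

/*
 * New ordering: check that the counter is odd (a grace period is in
 * progress), then increment.  A start that slips in after the increment
 * can no longer influence the check.
 */
static bool seq_end_check_before(unsigned long *sp, bool racing_start)
{
	bool warn = (*sp & 0x1) == 0;

	(*sp)++;
	if (racing_start)
		seq_start(sp);
	return warn;
}
```

In this model the old ordering reports a warning only when the racing start is injected, and the new ordering never does for a counter that was correctly started.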
    • P
      srcu: Move rcu_seq_start() and friends to rcu.h · 2e8c28c2
      Paul E. McKenney committed
      This commit moves rcu_seq_start(), rcu_seq_end(), rcu_seq_snap(),
      and rcu_seq_done() from kernel/rcu/tree.c to kernel/rcu/rcu.h.
      This will allow SRCU to use these functions, which in turn will
      allow SRCU to move from a single global callback queue to a
      per-CPU callback queue.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      2e8c28c2
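For readers unfamiliar with these four helpers, here is a minimal userspace sketch of how a one-state-bit sequence counter of this kind can be used. It assumes the layout in which an odd value means a grace period is in progress, and it leaves out the memory barriers and READ_ONCE()/WRITE_ONCE() accesses that the kernel versions need. The `seq_*` names are shortened stand-ins, not the kernel functions themselves.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Wraparound-safe "a >= b" for unsigned long counters. */
#define ULONG_CMP_GE(a, b)	(ULONG_MAX / 2 >= (a) - (b))

/* Start a grace period: the counter becomes odd. */
static void seq_start(unsigned long *sp)
{
	(*sp)++;
	assert(*sp & 0x1);
}

/* End a grace period: the counter becomes even again. */
static void seq_end(unsigned long *sp)
{
	assert(*sp & 0x1);
	(*sp)++;
}

/*
 * Snapshot: the smallest even value at which a full grace period is
 * guaranteed to have elapsed since the snapshot was taken.
 */
static unsigned long seq_snap(const unsigned long *sp)
{
	return (*sp + 3) & ~0x1UL;
}

/* Has the counter reached the value returned by an earlier seq_snap()? */
static bool seq_done(const unsigned long *sp, unsigned long s)
{
	return ULONG_CMP_GE(*sp, s);
}
```

A caller takes a snapshot, waits, and polls `seq_done()`. If a grace period was already in progress when the snapshot was taken, the snapshot value skips past it, since that grace period may have started too early to cover the caller.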
    • P
      rcu: Semicolon inside RCU_TRACE() for rcu.h · dffd06a7
      Paul E. McKenney committed
      The current use of "RCU_TRACE(statement);" can cause odd bugs, especially
      where "statement" is a local-variable declaration, as it can leave a
      misplaced ";" in the source code.  This commit therefore converts these
      to "RCU_TRACE(statement;)", which avoids the misplaced ";".
      Reported-by: Josh Triplett <josh@joshtriplett.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      dffd06a7
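The effect of moving the semicolon can be seen in a small standalone example. This sketch assumes the tracing-disabled case, where the macro discards its argument. The function `count_items()` and the variable `traced` are hypothetical and exist only to show the two usage sites.

```c
#include <assert.h>

/* Tracing disabled: the macro argument is discarded (simplified stand-in). */
#define RCU_TRACE(stmt)

static unsigned long count_items(const int *items, int n)
{
	/*
	 * The old style, "RCU_TRACE(unsigned long traced = 0);", would leave
	 * a bare ";" here when tracing is off, that is, an empty statement
	 * sitting in front of the declarations below.  With the semicolon
	 * inside the parentheses, the whole line expands to nothing.
	 */
	RCU_TRACE(unsigned long traced = 0;)
	unsigned long total = 0;
	int i;

	for (i = 0; i < n; i++) {
		total += (unsigned long)items[i];
		RCU_TRACE(traced++;)
	}
	return total;
}
```

When tracing is enabled and the macro expands to its argument, the same source yields an ordinary declaration and an ordinary increment, so both configurations stay well-formed.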
  10. 15 Jan, 2017 1 commit
    • P
      rcu: Narrow early boot window of illegal synchronous grace periods · 52d7e48b
      Paul E. McKenney committed
      The current preemptible RCU implementation goes through three phases
      during bootup.  In the first phase, there is only one CPU that is running
      with preemption disabled, so that a no-op is a synchronous grace period.
      In the second mid-boot phase, the scheduler is running, but RCU has
      not yet gotten its kthreads spawned (and, for expedited grace periods,
      workqueues are not yet running).  During this time, any attempt to do
      a synchronous grace period will hang the system (or complain bitterly,
      depending).  In the third and final phase, RCU is fully operational and
      everything works normally.
      
      This has been OK for some time, but there have recently been some
      synchronous grace periods showing up during the second mid-boot phase.
      This code worked "by accident" for a while, but started failing as soon
      as expedited RCU grace periods switched over to workqueues in commit
      8b355e3b ("rcu: Drive expedited grace periods from workqueue").
      Note that the code was buggy even before this commit, as it was subject
      to failure on real-time systems that forced all expedited grace periods
      to run as normal grace periods (for example, using the rcu_normal ksysfs
      parameter).  The callchain from the failure case is as follows:
      
      early_amd_iommu_init()
      |-> acpi_put_table(ivrs_base);
      |-> acpi_tb_put_table(table_desc);
      |-> acpi_tb_invalidate_table(table_desc);
      |-> acpi_tb_release_table(...)
      |-> acpi_os_unmap_memory
      |-> acpi_os_unmap_iomem
      |-> acpi_os_map_cleanup
      |-> synchronize_rcu_expedited
      
      The kernel showing this callchain was built with CONFIG_PREEMPT_RCU=y,
      which caused the code to try using workqueues before they were
      initialized, which did not go well.
      
      This commit therefore reworks RCU to permit synchronous grace periods
      to proceed during this mid-boot phase.  This commit is therefore a
      fix to a regression introduced in v4.9, and is therefore being put
      forward post-merge-window in v4.10.
      
      This commit sets a flag from the existing rcu_scheduler_starting()
      function which causes all synchronous grace periods to take the expedited
      path.  The expedited path now checks this flag, using the requesting task
      to drive the expedited grace period forward during the mid-boot phase.
      Finally, this flag is updated by a core_initcall() function named
      rcu_exp_runtime_mode(), which causes the runtime codepaths to be used.
      
      Note that this arrangement assumes that tasks are not sent POSIX signals
      (or anything similar) from the time that the first task is spawned
      through core_initcall() time.
      
      Fixes: 8b355e3b ("rcu: Drive expedited grace periods from workqueue")
      Reported-by: "Zheng, Lv" <lv.zheng@intel.com>
      Reported-by: Borislav Petkov <bp@alien8.de>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Stan Kain <stan.kain@gmail.com>
      Tested-by: Ivan <waffolz@hotmail.com>
      Tested-by: Emanuel Castelo <emanuel.castelo@gmail.com>
      Tested-by: Bruno Pesavento <bpesavento@infinito.it>
      Tested-by: Borislav Petkov <bp@suse.de>
      Tested-by: Frederic Bezies <fredbezies@gmail.com>
      Cc: <stable@vger.kernel.org> # 4.9.0-
      52d7e48b
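The three boot phases and the flag that selects between them can be sketched as a small state machine. This is a userspace illustration under stated assumptions: the `RCU_SCHEDULER_*` constant names are modeled on the kernel's rcu_scheduler_active values and should be treated as an assumption here, while `gp_path`, `toy_scheduler_starting()`, `toy_exp_runtime_mode()`, and `toy_synchronize()` are hypothetical stand-ins for the functions the commit message names.

```c
#include <assert.h>

/* Boot phases (names assumed, modeled on rcu_scheduler_active values). */
enum { RCU_SCHEDULER_INACTIVE, RCU_SCHEDULER_INIT, RCU_SCHEDULER_RUNNING };

/* Hypothetical labels for how a synchronous grace period is carried out. */
enum gp_path { GP_NOOP, GP_EXPEDITED_BY_REQUESTER, GP_WORKQUEUE };

static int rcu_scheduler_active = RCU_SCHEDULER_INACTIVE;

/* Phase 1 -> 2: the scheduler starts, kthreads and workqueues are not ready. */
static void toy_scheduler_starting(void)
{
	rcu_scheduler_active = RCU_SCHEDULER_INIT;
}

/* Phase 2 -> 3: run at core_initcall() time, switching to runtime paths. */
static void toy_exp_runtime_mode(void)
{
	rcu_scheduler_active = RCU_SCHEDULER_RUNNING;
}

/* Pick the path a synchronous grace period takes in the current phase. */
static enum gp_path toy_synchronize(void)
{
	switch (rcu_scheduler_active) {
	case RCU_SCHEDULER_INACTIVE:
		return GP_NOOP;			/* one CPU, preemption disabled */
	case RCU_SCHEDULER_INIT:
		return GP_EXPEDITED_BY_REQUESTER;	/* requesting task drives it */
	default:
		return GP_WORKQUEUE;		/* normal runtime behavior */
	}
}
```

The middle case is the one this commit adds: rather than hanging while waiting for kthreads or workqueues that do not exist yet, the requesting task drives the expedited grace period itself.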
  11. 07 Jan, 2015 1 commit
    • L
      tiny_rcu: Directly force QS when call_rcu_[bh|sched]() on idle_task · 5f6130fa
      Lai Jiangshan committed
      On uniprocessor (UP) systems, a context switch is a quiescent state (QS)
      and also a grace period (GP), so we can force a context switch whenever
      call_rcu_[bh|sched]() is invoked from the idle task.  Once that is done,
      rcu_idle/irq_enter/exit() no longer have any work to do, so we can simply
      make these functions empty.
      
      More importantly, this change does not alter the logical behavior.
      Note: raise_softirq(RCU_SOFTIRQ)/rcu_sched_qs() in rcu_idle_enter() and
      the outermost rcu_irq_exit() will have to wake up ksoftirqd
      (because in_interrupt() == 0).
      
      Before this patch:                After this patch:
      call_rcu_sched() in idle          call_rcu_sched() in idle
                                          set resched
      do other stuff                    do other stuff
      outmost rcu_irq_exit()            outmost rcu_irq_exit() (empty function)
        (or rcu_idle_enter())             (or rcu_idle_enter(), also empty function)
                                        start to resched (see above)
        rcu_sched_qs()                  rcu_sched_qs()
          QS, and GP and advance cb       QS, and GP and advance cb
          wake up the ksoftirqd           wake up the ksoftirqd
            set resched
      resched to ksoftirqd (or other)   resched to ksoftirqd (or other)
      
      These two code paths are almost the same.
      
      Size change after patching:
      
      size kernel/rcu/tiny-old.o kernel/rcu/tiny-patched.o
         text	   data	    bss	    dec	    hex	filename
         3449	    206	      8	   3663	    e4f	kernel/rcu/tiny-old.o
         2406	    144	      8	   2558	    9fe	kernel/rcu/tiny-patched.o
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      5f6130fa
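The "context switch = QS = GP" reasoning for a uniprocessor can be sketched with a toy callback queue. This is a simplified userspace model and not the tiny RCU implementation: all `toy_*` names are hypothetical, and callbacks are invoked directly from the modeled context switch, whereas the commit message describes them being handled via ksoftirqd.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_cb {
	void (*func)(struct toy_cb *cb);
	struct toy_cb *next;
};

static struct toy_cb *pending;			/* callbacks awaiting a GP */
static struct toy_cb **pending_tail = &pending;	/* keeps FIFO order */
static bool need_resched;
static int invoked;

/* Queue a callback; from the idle task, also force a context switch. */
static void toy_call_rcu(struct toy_cb *cb, void (*func)(struct toy_cb *),
			 bool on_idle_task)
{
	cb->func = func;
	cb->next = NULL;
	*pending_tail = cb;
	pending_tail = &cb->next;
	if (on_idle_task)
		need_resched = true;
}

/* On UP, a context switch is a quiescent state and therefore ends the GP. */
static void toy_context_switch(void)
{
	need_resched = false;
	while (pending) {
		struct toy_cb *cb = pending;

		pending = cb->next;
		cb->func(cb);
	}
	pending_tail = &pending;
}

/* Example callback that only counts invocations. */
static void toy_count(struct toy_cb *cb)
{
	(void)cb;
	invoked++;
}
```

Since the forced reschedule already guarantees that a quiescent state follows a callback queued from the idle task, the idle and irq entry/exit hooks have nothing left to track in this model, which matches the commit's claim that they can be made empty.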
  12. 04 Nov, 2014 1 commit