1. 11 Jun, 2013 (1 commit)
  2. 29 Jan, 2013 (1 commit)
  3. 24 Oct, 2012 (1 commit)
  4. 23 Sep, 2012 (1 commit)
  5. 06 Jul, 2012 (1 commit)
  6. 03 Jul, 2012 (2 commits)
  7. 03 May, 2012 (1 commit)
    • rcu: Make exit_rcu() more precise and consolidate · 9dd8fb16
      Paul E. McKenney authored
      When running preemptible RCU, if a task exits in an RCU read-side
      critical section having blocked within that same RCU read-side critical
      section, the task must be removed from the list of tasks blocking a
      grace period (perhaps the current grace period, perhaps the next grace
      period, depending on timing).  The exit() path invokes exit_rcu() to
      do this cleanup.
      
      However, the current implementation of exit_rcu() needlessly does the
      cleanup even if the task did not block within the current RCU read-side
      critical section, which wastes time and needlessly increases the size
      of the state space.  Fix this by only doing the cleanup if the current
      task is actually on the list of tasks blocking some grace period.
      
      While we are at it, consolidate the two identical exit_rcu() functions
      into a single function.
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Linus Torvalds <torvalds@linux-foundation.org>
      
      Conflicts:
      
      	kernel/rcupdate.c
      9dd8fb16
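The fix above can be illustrated with a minimal user-space sketch; all names here are illustrative stand-ins, not the kernel's actual data structures (the real code checks list_empty() on the task's queue entry):

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative task state: blocked_entry is non-NULL only while the
 * task sits on a blocked-tasks list. */
struct task_sketch {
    void *blocked_entry;        /* NULL => not blocking any grace period */
    int cleanups_done;
};

/* The fix: do the dequeue only when the task is actually queued,
 * rather than unconditionally running the cleanup on every exit(). */
void exit_rcu_sketch(struct task_sketch *t)
{
    if (t->blocked_entry == NULL)
        return;                 /* common case: nothing to clean up */
    t->blocked_entry = NULL;    /* stand-in for the real dequeue */
    t->cleanups_done++;
}
```

The early return is the whole point of the commit: the state space shrinks because tasks that never blocked in their final read-side critical section skip the cleanup entirely.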
  8. 22 Feb, 2012 (6 commits)
    • rcu: Simplify unboosting checks · 1aa03f11
      Paul E. McKenney authored
      This is a port of commit #82e78d80 from TREE_PREEMPT_RCU to
      TINY_PREEMPT_RCU.
      
      This commit uses the fact that current->rcu_boost_mutex is set
      any time that the RCU_READ_UNLOCK_BOOSTED flag is set in the
      current->rcu_read_unlock_special bitmask.  This allows tests of
      the bit to be changed to tests of the pointer, which in turn allows
      the RCU_READ_UNLOCK_BOOSTED flag to be eliminated.
      
      Please note that the check of current->rcu_read_unlock_special need not
      change because any time that RCU_READ_UNLOCK_BOOSTED was set, so was
      RCU_READ_UNLOCK_BLOCKED.  Therefore, __rcu_read_unlock() can continue
      testing current->rcu_read_unlock_special for non-zero, as before.
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      1aa03f11
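The bit-to-pointer change can be sketched as follows; the struct layout and flag value are assumptions for illustration, not the kernel's exact definitions:

```c
#include <assert.h>
#include <stddef.h>

#define RCU_READ_UNLOCK_BLOCKED (1 << 0)
/* RCU_READ_UNLOCK_BOOSTED is the bit this commit eliminates. */

struct task_sketch {
    unsigned int rcu_read_unlock_special;
    void *rcu_boost_mutex;      /* non-NULL exactly while boosted */
};

/* After the change, "was this reader priority-boosted?" is a pointer
 * test on ->rcu_boost_mutex instead of a bit test. */
int reader_was_boosted(const struct task_sketch *t)
{
    return t->rcu_boost_mutex != NULL;
}
```

Because BOOSTED implied BLOCKED, dropping the bit never makes a previously nonzero ->rcu_read_unlock_special read as zero, which is why the fast-path test can stay unchanged.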
    • rcu: Inform RCU of irq_exit() activity · 8762705a
      Paul E. McKenney authored
      This is a port to TINY_RCU of Peter Zijlstra's commit #ec433f0c.
      
      The rcu_read_unlock_special() function relies on in_irq() to exclude
      scheduler activity from interrupt level.  This fails because irq_exit()
      can invoke the scheduler after clearing the preempt_count() bits that
      in_irq() uses to determine that it is at interrupt level.  This situation
      can result in failures as follows:
      
           $task			IRQ		SoftIRQ
      
           rcu_read_lock()
      
           /* do stuff */
      
           <preempt> |= UNLOCK_BLOCKED
      
           rcu_read_unlock()
             --t->rcu_read_lock_nesting
      
          			irq_enter();
          			/* do stuff, don't use RCU */
          			irq_exit();
          			  sub_preempt_count(IRQ_EXIT_OFFSET);
          			  invoke_softirq()
      
          					ttwu();
          					  spin_lock_irq(&pi->lock)
          					  rcu_read_lock();
          					  /* do stuff */
          					  rcu_read_unlock();
          					    rcu_read_unlock_special()
          					      rcu_report_exp_rnp()
          					        ttwu()
          					          spin_lock_irq(&pi->lock) /* deadlock */
      
             rcu_read_unlock_special(t);
      
      This can be triggered 'easily' because invoke_softirq() immediately does
      a ttwu() of ksoftirqd/# instead of doing the in-place softirq stuff first,
      but even without that the above happens.
      
      Cure this by also excluding softirqs from the rcu_read_unlock_special()
      handler and ensuring the force_irqthreads ksoftirqd/# wakeup is done
      from full softirq context.
      
      It is also necessary to delay the ->rcu_read_lock_nesting decrement until
      after rcu_read_unlock_special().  This delay is handled by the commit
      "Protect __rcu_read_unlock() against scheduler-using irq handlers".
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      8762705a
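The cure amounts to widening the context test from "hard interrupt only" to "hard interrupt or softirq". A minimal sketch, with mask values that are illustrative rather than the kernel's exact preempt_count layout:

```c
#include <assert.h>

/* Illustrative preempt_count bit layout (assumed values). */
#define SOFTIRQ_MASK 0x0000ff00u
#define HARDIRQ_MASK 0x000f0000u

/* Before the fix: only hard-interrupt context was excluded, so the
 * irq_exit() -> invoke_softirq() window slipped through. */
int old_in_interrupt(unsigned int preempt_count)
{
    return (preempt_count & HARDIRQ_MASK) != 0;
}

/* After the fix: softirq context is excluded as well. */
int new_in_interrupt(unsigned int preempt_count)
{
    return (preempt_count & (HARDIRQ_MASK | SOFTIRQ_MASK)) != 0;
}
```

With the wider test, the SoftIRQ column of the trace above no longer reaches the scheduler-using cleanup path, so the recursive spin_lock_irq(&pi->lock) never happens.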
    • rcu: Prevent RCU callbacks from executing before scheduler initialized · 768dfffd
      Paul E. McKenney authored
      This is a port of commit #b0d30417 from TREE_RCU to TREE_PREEMPT_RCU.
      
      Under some rare but real combinations of configuration parameters, RCU
      callbacks are posted during early boot that use kernel facilities that are
      not yet initialized.  Therefore, when these callbacks are invoked, hard
      hangs and crashes ensue.  This commit therefore prevents RCU callbacks
      from being invoked until after the scheduler is fully up and running,
      as in after multiple tasks have been spawned.
      
      It might well turn out that a better approach is to identify the specific
      RCU callbacks that are causing this problem, but that discussion will
      wait until such time as someone really needs an RCU callback to be invoked
      (as opposed to merely registered) during early boot.
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      768dfffd
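The gating logic can be modeled in user space: callbacks registered before the scheduler is fully active are queued but not invoked, then released once it is. All names below are illustrative stand-ins:

```c
#include <assert.h>

static int scheduler_active;    /* 0 during early boot */
static int pending_cbs, invoked_cbs;

static void maybe_invoke(void)
{
    if (!scheduler_active)
        return;                 /* too early: leave callbacks queued */
    invoked_cbs += pending_cbs;
    pending_cbs = 0;
}

/* Registration is always legal; invocation is deferred. */
void rcu_register_cb_sketch(void) { pending_cbs++; maybe_invoke(); }

/* Called once multiple tasks can be spawned. */
void scheduler_is_up_sketch(void) { scheduler_active = 1; maybe_invoke(); }
```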
    • rcu: Streamline code produced by __rcu_read_unlock() · afef2054
      Paul E. McKenney authored
      This is a port of commit #be0e1e21 to TINY_PREEMPT_RCU.  This uses
      noinline to prevent rcu_read_unlock_special() from being inlined into
      __rcu_read_unlock().
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      afef2054
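The structure of that change looks roughly like this; the function bodies are placeholders, and only the noinline split is the point:

```c
#include <assert.h>

/* The rarely-taken slow path is marked noinline so the compiler does
 * not fold it into the hot unlock path, keeping the common-case code
 * small. */
static __attribute__((noinline)) int unlock_special_sketch(int special)
{
    return special & ~1;        /* placeholder for the real cleanup */
}

int rcu_read_unlock_sketch(int nesting, int special)
{
    if (--nesting == 0 && special)
        return unlock_special_sketch(special);
    return special;             /* fast path: nothing special pending */
}
```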
    • rcu: Protect __rcu_read_unlock() against scheduler-using irq handlers · 26861faf
      Paul E. McKenney authored
      This commit ports commit #10f39bb1 (rcu: protect __rcu_read_unlock()
      against scheduler-using irq handlers) from TREE_PREEMPT_RCU to
      TINY_PREEMPT_RCU.  The following is a corresponding port of that
      commit message.
      
      The addition of RCU read-side critical sections within runqueue and
      priority-inheritance critical sections introduced some deadlocks,
      for example, involving interrupts from __rcu_read_unlock() where the
      interrupt handlers call wake_up().  This situation can cause the
      instance of __rcu_read_unlock() invoked from interrupt to do some
      of the processing that would otherwise have been carried out by the
      task-level instance of __rcu_read_unlock().  When the interrupt-level
      instance of __rcu_read_unlock() is called with a scheduler lock held from
      interrupt-entry/exit situations where in_irq() returns false, deadlock can
      result.  Of course, in a UP kernel, there are not really any deadlocks,
      but the upper-level critical section can still be fatally confused
      by the lower-level critical section changing things out from under it.
      
      This commit resolves these deadlocks by using negative values of the
      per-task ->rcu_read_lock_nesting counter to indicate that an instance of
      __rcu_read_unlock() is in flight, which in turn prevents instances from
      interrupt handlers from doing any special processing.  Note that nested
      rcu_read_lock()/rcu_read_unlock() pairs are still permitted, but they will
      never see ->rcu_read_lock_nesting go to zero, and will therefore never
      invoke rcu_read_unlock_special(), thus preventing them from seeing the
      RCU_READ_UNLOCK_BLOCKED bit should it be set in ->rcu_read_unlock_special.
      This patch also adds a check for ->rcu_read_unlock_special being negative
      in rcu_check_callbacks(), thus preventing the RCU_READ_UNLOCK_NEED_QS
      bit from being set should a scheduling-clock interrupt occur while
      __rcu_read_unlock() is exiting from an outermost RCU read-side critical
      section.
      
      Of course, __rcu_read_unlock() can be preempted during the time that
      ->rcu_read_lock_nesting is negative.  This could result in the setting
      of the RCU_READ_UNLOCK_BLOCKED bit after __rcu_read_unlock() checks it,
      and would also result in this task being queued on the corresponding
      rcu_node structure's blkd_tasks list.  Therefore, some later RCU read-side
      critical section would enter rcu_read_unlock_special() to clean up --
      which could result in deadlock (OK, OK, fatal confusion) if that RCU
      read-side critical section happened to be in the scheduler where the
      runqueue or priority-inheritance locks were held.
      
      To prevent the possibility of fatal confusion that might result from
      preemption during the time that ->rcu_read_lock_nesting is negative,
      this commit also makes rcu_preempt_note_context_switch() check for
      negative ->rcu_read_lock_nesting, thus refraining from queuing the task
      (and from setting RCU_READ_UNLOCK_BLOCKED) if we are already exiting
      from the outermost RCU read-side critical section (in other words,
      we really are no longer actually in that RCU read-side critical
      section).  In addition, rcu_preempt_note_context_switch() invokes
      rcu_read_unlock_special() to carry out the cleanup in this case, which
      clears out the ->rcu_read_unlock_special bits and dequeues the task
      (if necessary), in turn avoiding needless delay of the current RCU grace
      period and needless RCU priority boosting.
      
      It is still illegal to call rcu_read_unlock() while holding a scheduler
      lock if the prior RCU read-side critical section has ever had both
      preemption and irqs enabled.  However, the common use case is legal,
      namely where the entire RCU read-side critical section executes with
      irqs disabled, for example, when the scheduler lock is held across the
      entire lifetime of the RCU read-side critical section.
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      26861faf
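The negative-nesting trick can be modeled in a few lines of user-space C. Using INT_MIN as the in-flight marker mirrors the real code's use of a large negative value; everything else is an illustrative stand-in:

```c
#include <assert.h>
#include <limits.h>

static int nesting;
static int special_runs;        /* counts outermost-unlock cleanups */

void lock_sketch(void) { nesting++; }

void unlock_sketch(void)
{
    if (nesting != 1) {
        nesting--;              /* nested case: just decrement */
        return;
    }
    nesting = INT_MIN;          /* outermost unlock now in flight */
    /* An interrupt arriving here sees nesting < 0 and refrains from
     * queuing the task or doing any special processing. */
    special_runs++;             /* stand-in for rcu_read_unlock_special() */
    nesting = 0;                /* only now have we really left */
}

int unlock_in_flight(void) { return nesting < 0; }
```

Nested pairs never see the counter reach 1 on unlock, so they take the simple decrement path and never trigger the special processing, exactly as the commit message describes.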
    • rcu: Add lockdep-RCU checks for simple self-deadlock · fe15d706
      Paul E. McKenney authored
      It is illegal to have a grace period within a same-flavor RCU read-side
      critical section, so this commit adds lockdep-RCU checks to splat when
      such abuse is encountered.  This commit does not detect more elaborate
      RCU deadlock situations.  These situations might be a job for lockdep
      enhancements.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      fe15d706
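The essence of the check is that waiting for a grace period while inside a same-flavor read-side critical section can never complete. A minimal sketch using a plain depth counter (the real code asks lockdep, not a counter):

```c
#include <assert.h>

static int rcu_lock_depth;

void read_lock_sketch(void)   { rcu_lock_depth++; }
void read_unlock_sketch(void) { rcu_lock_depth--; }

/* Nonzero when a synchronize_rcu()-style wait from this context would
 * self-deadlock: the grace period cannot end while we hold a
 * same-flavor read lock, so the check would splat here. */
int grace_period_would_deadlock(void)
{
    return rcu_lock_depth > 0;
}
```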
  9. 12 Dec, 2011 (2 commits)
  10. 31 Oct, 2011 (1 commit)
  11. 29 Sep, 2011 (3 commits)
  12. 06 May, 2011 (3 commits)
  13. 05 Mar, 2011 (1 commit)
  14. 30 Nov, 2010 (3 commits)
  15. 18 Nov, 2010 (1 commit)
    • rcu: move TINY_RCU from softirq to kthread · b2c0710c
      Paul E. McKenney authored
      If RCU priority boosting is to be meaningful, callback invocation must
      be boosted in addition to preempted RCU readers.  Otherwise, in the
      presence of CPU real-time threads, the grace period ends, but the
      get invoked.  If the callbacks don't get invoked, the associated memory
      doesn't get freed, so the system is still subject to OOM.
      
      But it is not reasonable to priority-boost RCU_SOFTIRQ, so this commit
      moves the callback invocations to a kthread, which can be boosted easily.
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      b2c0710c
  16. 28 Aug, 2010 (1 commit)
  17. 21 Aug, 2010 (1 commit)
  18. 20 Aug, 2010 (1 commit)
    • rcu: Add a TINY_PREEMPT_RCU · a57eb940
      Paul E. McKenney authored
      Implement a small-memory-footprint uniprocessor-only implementation of
      preemptible RCU.  This implementation uses but a single blocked-tasks
      list rather than the combinatorial number used per leaf rcu_node by
      TREE_PREEMPT_RCU, which reduces memory consumption and greatly simplifies
      processing.  This version also takes advantage of uniprocessor execution
      to accelerate grace periods in the case where there are no readers.
      
      The general design is otherwise broadly similar to that of TREE_PREEMPT_RCU.
      
      This implementation is a step towards having RCU implementation driven
      off of the SMP and PREEMPT kernel configuration variables, which can
      happen once this implementation has accumulated sufficient experience.
      
      Removed ACCESS_ONCE() from __rcu_read_unlock() and added barrier() as
      suggested by Steve Rostedt in order to avoid the compiler-reordering
      issue noted by Mathieu Desnoyers (http://lkml.org/lkml/2010/8/16/183).
      
      As can be seen below, CONFIG_TINY_PREEMPT_RCU represents almost 5Kbyte
      savings compared to CONFIG_TREE_PREEMPT_RCU.  Of course, for non-real-time
      workloads, CONFIG_TINY_RCU is even better.
      
      	CONFIG_TREE_PREEMPT_RCU
      
      	   text	   data	    bss	    dec	   filename
      	     13	      0	      0	     13	   kernel/rcupdate.o
      	   6170	    825	     28	   7023	   kernel/rcutree.o
      				   ----
				   7036    Total
      
      	CONFIG_TINY_PREEMPT_RCU
      
      	   text	   data	    bss	    dec	   filename
      	     13	      0	      0	     13	   kernel/rcupdate.o
      	   2081	     81	      8	   2170	   kernel/rcutiny.o
      				   ----
      				   2183    Total
      
      	CONFIG_TINY_RCU (non-preemptible)
      
      	   text	   data	    bss	    dec	   filename
      	     13	      0	      0	     13	   kernel/rcupdate.o
      	    719	     25	      0	    744	   kernel/rcutiny.o
      				    ---
      				    757    Total
      Requested-by: Loïc Minier <loic.minier@canonical.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      a57eb940
  19. 11 May, 2010 (1 commit)
  20. 27 Aug, 2009 (1 commit)
    • x86: Instruction decoder API · eb13296c
      Masami Hiramatsu authored
      Add x86 instruction decoder to arch-specific libraries. This decoder
      can decode x86 instructions used in kernel into prefix, opcode, modrm,
      sib, displacement and immediates. This can also show the length of
      instructions.
      
      This version introduces instruction attributes for decoding
      instructions.
      The instruction attribute tables are generated from the opcode map file
      (x86-opcode-map.txt) by the generator script (gen-insn-attr-x86.awk).
      
      Currently, the opcode maps are based on the opcode maps in the Intel(R) 64
      and IA-32 Architectures Software Developer's Manual Vol. 2, Appendix A,
      and consist of the two types of opcode tables below.
      
      1-byte/2-byte/3-byte opcode tables, which have 256 elements, are
      written as below:
      
       Table: table-name
       Referrer: escaped-name
       opcode: mnemonic|GrpXXX [operand1[,operand2...]] [(extra1)[,(extra2)...] [| 2nd-mnemonic ...]
        (or)
       opcode: escape # escaped-name
       EndTable
      
      Group opcode tables, which have 8 elements, are written as below:
      
       GrpTable: GrpXXX
       reg:  mnemonic [operand1[,operand2...]] [(extra1)[,(extra2)...] [| 2nd-mnemonic ...]
       EndTable
      
      These opcode maps include a few SSE and FP opcodes (for setup), because
      those opcodes are used in the kernel.
      Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
      Signed-off-by: Jim Keniston <jkenisto@us.ibm.com>
      Acked-by: H. Peter Anvin <hpa@zytor.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Przemysław Pawełczyk <przemyslaw@pawelczyk.it>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Vegard Nossum <vegard.nossum@gmail.com>
      LKML-Reference: <20090813203413.31965.49709.stgit@localhost.localdomain>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      eb13296c
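The generated attribute tables boil down to arrays indexed by opcode byte; each decode step consults them to learn, for example, whether an opcode takes a ModRM byte or an immediate. A sketch with assumed flag values (the real INAT_* flags and tables are far richer):

```c
#include <assert.h>

/* Assumed attribute flags; the real INAT_* definitions differ. */
#define ATTR_MODRM (1 << 0)
#define ATTR_IMM8  (1 << 1)

/* One-byte opcode attribute table, sparsely populated for illustration. */
static const unsigned int attr_table_sketch[256] = {
    [0x88] = ATTR_MODRM,        /* MOV r/m8, r8: has a ModRM byte */
    [0xb0] = ATTR_IMM8,         /* MOV AL, imm8: 8-bit immediate */
};

int opcode_has_modrm(unsigned char op)
{
    return (attr_table_sketch[op] & ATTR_MODRM) != 0;
}

int opcode_has_imm8(unsigned char op)
{
    return (attr_table_sketch[op] & ATTR_IMM8) != 0;
}
```

Summing the opcode length contributions implied by these attributes (prefixes, opcode, ModRM, SIB, displacement, immediate) is what lets the decoder report total instruction length.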
  21. 28 Nov, 2008 (1 commit)
    • [ARM] remove a common set of __virt_to_bus definitions · b5ee9002
      Nicolas Pitre authored
      Let's provide an overridable default instead of having every machine
      class define __virt_to_bus and __bus_to_virt to the same thing.  What
      most platforms are using is bus_addr == phys_addr so such is the default.
      
      One exception is ebsa110, which has no DMA whatsoever, so the actual
      definition matters only for proper compilation.  Also added a comment
      about the special footbridge bus translation.
      
      Let's also remove comments alluding to set_dma_addr, which is not
      (and should not be) commonly used.
      Signed-off-by: Nicolas Pitre <nico@marvell.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      b5ee9002
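The overridable-default pattern looks roughly like this; the PAGE_OFFSET value is illustrative, and the point is only that a machine class defining its own pair suppresses the default:

```c
#include <assert.h>

#define PAGE_OFFSET_SKETCH 0xc0000000ul  /* illustrative kernel base */

/* Overridable default: bus address == physical address.  A machine
 * class with a real bus translation (e.g. footbridge) defines its own
 * __virt_to_bus/__bus_to_virt before this point, so the default never
 * takes effect there. */
#ifndef __virt_to_bus
#define __virt_to_bus(v) ((v) - PAGE_OFFSET_SKETCH)
#define __bus_to_virt(b) ((b) + PAGE_OFFSET_SKETCH)
#endif

/* Sanity helper: translating to bus space and back is the identity. */
unsigned long roundtrip_sketch(unsigned long vaddr)
{
    return __bus_to_virt(__virt_to_bus(vaddr));
}
```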
  22. 07 Aug, 2008 (2 commits)
  23. 08 Feb, 2007 (1 commit)
  24. 29 Jun, 2006 (1 commit)
  25. 21 Jun, 2006 (1 commit)
  26. 10 Jan, 2006 (1 commit)