1. 25 Sep, 2013 2 commits
    •
      sched: Add NEED_RESCHED to the preempt_count · f27dde8d
      Committed by Peter Zijlstra
      In order to combine the preemption and need_resched test we need to
      fold the need_resched information into the preempt_count value.
      
      Since the NEED_RESCHED flag is set across CPUs this needs to be an
      atomic operation, however we very much want to avoid making
      preempt_count atomic, therefore we keep the existing TIF_NEED_RESCHED
      infrastructure in place but at 3 sites test it and fold its value into
      preempt_count; namely:
      
       - resched_task() when setting TIF_NEED_RESCHED on the current task
       - scheduler_ipi() when resched_task() sets TIF_NEED_RESCHED on a
                         remote task it follows it up with a reschedule IPI
                         and we can modify the cpu local preempt_count from
                         there.
       - cpu_idle_loop() for when resched_task() found tsk_is_polling().
      
      We use an inverted bitmask to indicate need_resched so that a 0 means
      both need_resched and !atomic.
      
      Also remove the barrier() in preempt_enable() between
      preempt_enable_no_resched() and preempt_check_resched() to avoid
      having to reload the preemption value and allow the compiler to use
      the flags of the previous decrement. I couldn't come up with any sane
      reason for this barrier() to be there as preempt_enable_no_resched()
      already has a barrier() before doing the decrement.
      Suggested-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/n/tip-7a7m5qqbn5pmwnd4wko9u6da@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
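The inverted-bit idea described in the commit message above can be sketched in user space as follows. This is a minimal model, not the kernel's code: every `demo_`/`DEMO_` name is hypothetical, and a plain global stands in for the CPU-local counter. The point it illustrates is that a single "is the result zero?" test after the decrement answers both questions at once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model; demo_ names are not kernel APIs. */
#define DEMO_NEED_RESCHED 0x80000000u /* inverted: bit SET means "no resched needed" */

/* Starts with nesting depth 0 and no reschedule pending. */
static uint32_t demo_count = DEMO_NEED_RESCHED;

static void demo_preempt_disable(void) { demo_count += 1; }

/* Fold a pending reschedule request into the counter by CLEARING the bit. */
static void demo_set_need_resched(void)   { demo_count &= ~DEMO_NEED_RESCHED; }
static void demo_clear_need_resched(void) { demo_count |= DEMO_NEED_RESCHED; }

/*
 * Decrement and test in one step: the value is 0 only when the nesting
 * depth is 0 AND a reschedule has been requested.
 */
static bool demo_preempt_enable_should_resched(void)
{
	assert((demo_count & ~DEMO_NEED_RESCHED) > 0); /* catch unbalanced enable */
	demo_count -= 1;
	return demo_count == 0;
}
```

Because the bit is inverted, the common case (no reschedule pending) leaves the counter non-zero even at depth 0, so the fast path is one decrement and one flag test.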
    •
      sched: Introduce preempt_count accessor functions · 4a2b4b22
      Committed by Peter Zijlstra
      Replace the single preempt_count() 'function' that's an lvalue with
      two proper functions:
      
       preempt_count() - returns the preempt_count value as rvalue
       preempt_count_set() - Allows setting the preempt-count value
      
      Also provide preempt_count_ptr() as a convenience wrapper to implement
      all modifying operations.
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/n/tip-orxrbycjozopqfhb4dxdkdvb@git.kernel.org
      [ Fixed build failure. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
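The accessor split described above might look roughly like this in a user-space model. The struct and all `demo_` names are hypothetical stand-ins; the sketch only shows the shape of the interface (an rvalue getter, a setter, and a pointer helper through which all modifications go), not the kernel's actual definitions.

```c
/* Hypothetical stand-in for wherever the counter is stored. */
struct demo_thread_info {
	int preempt_count;
};

static struct demo_thread_info demo_ti;

/* Pointer helper: the single place that knows where the counter lives. */
static int *demo_preempt_count_ptr(void)
{
	return &demo_ti.preempt_count;
}

/* Read-only accessor: returns a value, so it cannot be assigned to. */
static int demo_preempt_count(void)
{
	return *demo_preempt_count_ptr();
}

/* Explicit setter replaces the old "preempt_count() = x" lvalue idiom. */
static void demo_preempt_count_set(int pc)
{
	*demo_preempt_count_ptr() = pc;
}

/* Example of a modifying operation built on the pointer helper. */
static void demo_preempt_count_add(int val)
{
	*demo_preempt_count_ptr() += val;
}
```

Routing every write through one helper means the storage location can later change without touching callers.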
  2. 19 Jun, 2013 1 commit
    •
      tracing/context-tracking: Add preempt_schedule_context() for tracing · 29bb9e5a
      Committed by Steven Rostedt
      Dave Jones hit the following bug report:
      
       ===============================
       [ INFO: suspicious RCU usage. ]
       3.10.0-rc2+ #1 Not tainted
       -------------------------------
       include/linux/rcupdate.h:771 rcu_read_lock() used illegally while idle!
       other info that might help us debug this:
       RCU used illegally from idle CPU! rcu_scheduler_active = 1, debug_locks = 0
       RCU used illegally from extended quiescent state!
       2 locks held by cc1/63645:
        #0:  (&rq->lock){-.-.-.}, at: [<ffffffff816b39fd>] __schedule+0xed/0x9b0
        #1:  (rcu_read_lock){.+.+..}, at: [<ffffffff8109d645>] cpuacct_charge+0x5/0x1f0
      
       CPU: 1 PID: 63645 Comm: cc1 Not tainted 3.10.0-rc2+ #1 [loadavg: 40.57 27.55 13.39 25/277 64369]
       Hardware name: Gigabyte Technology Co., Ltd. GA-MA78GM-S2H/GA-MA78GM-S2H, BIOS F12a 04/23/2010
        0000000000000000 ffff88010f78fcf8 ffffffff816ae383 ffff88010f78fd28
        ffffffff810b698d ffff88011c092548 000000000023d073 ffff88011c092500
        0000000000000001 ffff88010f78fd60 ffffffff8109d7c5 ffffffff8109d645
       Call Trace:
        [<ffffffff816ae383>] dump_stack+0x19/0x1b
        [<ffffffff810b698d>] lockdep_rcu_suspicious+0xfd/0x130
        [<ffffffff8109d7c5>] cpuacct_charge+0x185/0x1f0
        [<ffffffff8109d645>] ? cpuacct_charge+0x5/0x1f0
        [<ffffffff8108dffc>] update_curr+0xec/0x240
        [<ffffffff8108f528>] put_prev_task_fair+0x228/0x480
        [<ffffffff816b3a71>] __schedule+0x161/0x9b0
        [<ffffffff816b4721>] preempt_schedule+0x51/0x80
        [<ffffffff816b4800>] ? __cond_resched_softirq+0x60/0x60
        [<ffffffff816b6824>] ? retint_careful+0x12/0x2e
        [<ffffffff810ff3cc>] ftrace_ops_control_func+0x1dc/0x210
        [<ffffffff816be280>] ftrace_call+0x5/0x2f
        [<ffffffff816b681d>] ? retint_careful+0xb/0x2e
        [<ffffffff816b4805>] ? schedule_user+0x5/0x70
        [<ffffffff816b4805>] ? schedule_user+0x5/0x70
        [<ffffffff816b6824>] ? retint_careful+0x12/0x2e
       ------------[ cut here ]------------
      
      What happened was that the function tracer traced the schedule_user() code
      that tells RCU that the system is coming back from userspace, and to
      add the CPU back to the RCU monitoring.
      
      Because the function tracer makes preempt_disable/enable_notrace() calls,
      preempt_enable_notrace() checks the NEED_RESCHED flag. If it is set,
      then preempt_schedule() is called. But this is called before the user_exit()
      function can inform the kernel that the CPU is no longer in user mode and
      needs to be accounted for by RCU.
      
      The fix is to create a new preempt_schedule_context() that checks if
      the kernel is still in user mode and if so to switch it to kernel mode
      before calling schedule. It also switches back to user mode coming back
      from schedule if need be.
      
      The only user of this currently is the preempt_enable_notrace(), which is
      only used by the tracing subsystem.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1369423420.6828.226.camel@gandalf.local.home
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
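The control flow of the fix described above can be modeled in a few lines. This is a user-space sketch under stated assumptions: the `demo_` names, the two-state context enum, and the stub scheduler are all hypothetical, and real context tracking involves more than flipping a variable. It only shows the ordering: leave user context, schedule, then restore.

```c
#include <stdbool.h>

/* Hypothetical model of what context tracking believes about this CPU. */
enum demo_ctx { DEMO_IN_KERNEL, DEMO_IN_USER };

static enum demo_ctx demo_state = DEMO_IN_USER;
static int demo_schedule_calls;
static bool demo_scheduled_in_user; /* set if the bug scenario occurs */

/* Stub scheduler: records whether it ran while still marked as "user". */
static void demo_schedule(void)
{
	demo_schedule_calls++;
	if (demo_state == DEMO_IN_USER)
		demo_scheduled_in_user = true;
}

/* Sketch of the wrapper: switch to kernel context around the schedule. */
static void demo_preempt_schedule_context(void)
{
	enum demo_ctx prev = demo_state;

	if (prev == DEMO_IN_USER)
		demo_state = DEMO_IN_KERNEL; /* RCU may now be used safely */

	demo_schedule();

	if (prev == DEMO_IN_USER)
		demo_state = DEMO_IN_USER;   /* switch back if need be */
}
```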
  3. 10 Apr, 2013 1 commit
    •
      spinlocks and preemption points need to be at least compiler barriers · 386afc91
      Committed by Linus Torvalds
      In UP and non-preempt respectively, the spinlocks and preemption
      disable/enable points are stubbed out entirely, because there is no
      regular code that can ever hit the kind of concurrency they are meant to
      protect against.
      
      However, while there is no regular code that can cause scheduling, we
      _do_ end up having some exceptional (literally!) code that can do so,
      and that we need to make sure does not ever get moved into the critical
      region by the compiler.
      
      In particular, get_user() and put_user() are generally implemented as
      inline asm statements (even if the inline asm may then make a call
      instruction to call out-of-line), and can obviously cause a page fault
      and IO as a result.  If that inline asm has been scheduled into the
      middle of a preemption-safe (or spinlock-protected) code region, we
      obviously lose.
      
      Now, admittedly this is *very* unlikely to actually ever happen, and
      we've not seen examples of actual bugs related to this.  But partly
      exactly because it's so hard to trigger and the resulting bug is so
      subtle, we should be extra careful to get this right.
      
      So make sure that even when preemption is disabled, and we don't have to
      generate any actual *code* to explicitly tell the system that we are in
      a preemption-disabled region, we need to at least tell the compiler not
      to move things around the critical region.
      
      This patch grew out of the same discussion that caused commits
      79e5f05e ("ARC: Add implicit compiler barrier to raw_local_irq*
      functions") and 3e2e0d2c ("tile: comment assumption about
      __insn_mtspr for <asm/irqflags.h>") to come about.
      
      Note for stable: use discretion when/if applying this.  As mentioned,
      this bug may never have actually bitten anybody, and gcc may never have
      done the required code motion for it to possibly ever trigger in
      practice.
      
      Cc: stable@vger.kernel.org
      Cc: Steven Rostedt <srostedt@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
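A compiler-only barrier of the kind discussed above can be written with GCC/Clang extended asm. The sketch below is a user-space illustration with hypothetical `demo_` names: the preemption macros generate no instructions, yet the "memory" clobber tells the compiler not to move memory accesses across them.

```c
/*
 * Compiler barrier (GCC/Clang syntax): emits no machine code, but the
 * "memory" clobber stops the compiler from reordering memory accesses
 * across this point.
 */
#define demo_barrier() __asm__ __volatile__("" ::: "memory")

/* Model of a non-preempt build: no counter, but still a barrier. */
#define demo_preempt_disable() demo_barrier()
#define demo_preempt_enable()  demo_barrier()

static int demo_shared;

/* The accesses between the two macros stay inside the region. */
static int demo_critical_update(int v)
{
	int seen;

	demo_preempt_disable();
	demo_shared = v;
	seen = demo_shared;
	demo_preempt_enable();

	return seen;
}
```

Stubbing the macros out to nothing at all would compile to the same instructions in most cases, which is why the missing barrier was so hard to notice.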
  4. 01 Mar, 2012 1 commit
  5. 10 Jun, 2011 1 commit
  6. 02 Dec, 2009 1 commit
    •
      sched: Revert 498657a4 · 8592e648
      Committed by Tejun Heo
      498657a4 incorrectly assumed
      that preempt wasn't disabled around context_switch() and thus
      was fixing an imaginary problem.  It also broke KVM because it
      depended on ->sched_in() to be called with irq enabled so that
      it can do smp calls from there.
      
      Revert the incorrect commit and add comment describing different
      contexts under which the two callbacks are invoked.
      
      Avi: spotted transposed in/out in the added comment.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Avi Kivity <avi@redhat.com>
      Cc: peterz@infradead.org
      Cc: efault@gmx.de
      Cc: rusty@rustcorp.com.au
      LKML-Reference: <1259726212-30259-2-git-send-email-tj@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  7. 24 May, 2008 2 commits
    •
      ftrace: trace preempt off critical timings · 6cd8a4bb
      Committed by Steven Rostedt
      Add preempt off timings. A lot of kernel core code is taken from the RT patch
      latency trace that was written by Ingo Molnar.
      
      This adds "preemptoff" and "preemptirqsoff" to /debugfs/tracing/available_tracers
      
      Now instead of just tracing irqs off, preemption off can be selected
      to be recorded.
      
      When this is selected, it shares the same files as irqs off timings.
      One can either trace preemption off, irqs off, or one or the other off.
      
      By echoing "preemptoff" into /debugfs/tracing/current_tracer, recording
      of preempt off only is performed. "irqsoff" will only record the time
      irqs are disabled, but "preemptirqsoff" will take the total time irqs
      or preemption are disabled. Runtime switching of these options is now
      supported by simply echoing in the appropriate trace name into
      /debugfs/tracing/current_tracer.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
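Selecting a tracer as described above amounts to writing its name into the current_tracer file. The helper below is a hedged sketch in C: the `demo_` names are hypothetical, the list of accepted names is limited to the three mentioned in the commit message, and the caller is assumed to have opened the file (for example the /debugfs/tracing/current_tracer path quoted above) for writing.

```c
#include <stdio.h>
#include <string.h>

/* Tracer names taken from the commit message; not an exhaustive list. */
static const char *const demo_tracers[] = {
	"irqsoff", "preemptoff", "preemptirqsoff",
};

/*
 * Write a tracer name to an already-open stream.
 * Returns 0 on success, -1 for an unknown name or a write error.
 */
static int demo_select_tracer(FILE *f, const char *name)
{
	size_t n = sizeof(demo_tracers) / sizeof(demo_tracers[0]);

	for (size_t i = 0; i < n; i++) {
		if (strcmp(name, demo_tracers[i]) == 0) {
			if (fprintf(f, "%s\n", name) < 0 || fflush(f) != 0)
				return -1;
			return 0;
		}
	}
	return -1; /* name not in the list above */
}
```

Taking a `FILE *` rather than a path keeps the helper independent of where the tracing directory is mounted.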
    •
      ftrace: add preempt_enable/disable notrace macros · 50282528
      Committed by Steven Rostedt
      The tracer may need to call preempt_enable and disable functions
      for time keeping and such. The trace gets ugly when we see these
      functions show up for all traces. To make the output cleaner
      this patch adds preempt_enable_notrace and preempt_disable_notrace
      to be used by tracer (and debugging) functions.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  8. 09 Feb, 2008 1 commit
  9. 26 Jul, 2007 1 commit
  10. 26 Apr, 2006 1 commit
  11. 23 Dec, 2005 1 commit
    •
      [PATCH] fix race with preempt_enable() · d6f02913
      Committed by Nicolas Pitre
      Currently a simple
      
      	void foo(void) { preempt_enable(); }
      
      produces the following code on ARM:
      
      foo:
      	bic	r3, sp, #8128
      	bic	r3, r3, #63
      	ldr	r2, [r3, #4]
      	ldr	r1, [r3, #0]
      	sub	r2, r2, #1
      	tst	r1, #4
      	str	r2, [r3, #4]
      	blne	preempt_schedule
      	mov	pc, lr
      
      The problem is that the TIF_NEED_RESCHED flag is loaded _before_ the
      preemption count is stored back, hence any interrupt coming within that
      3 instruction window causing TIF_NEED_RESCHED to be set won't be
      seen and scheduling won't happen as it should.
      
      Nothing currently prevents gcc from performing that reordering.  There
      is already a barrier() before the decrement of the preemption count, but
      another one is needed between this and the TIF_NEED_RESCHED flag test
      for proper code ordering.
      Signed-off-by: Nicolas Pitre <nico@cam.org>
      Acked-by: Nick Piggin <nickpiggin@yahoo.com.au>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
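The ordering fix described above, a second barrier between the counter decrement and the flag test, can be sketched as a user-space model. All `demo_` names and the global variables are hypothetical; in the kernel the counter and flag live in per-thread data, and the flag can be set from an interrupt.

```c
#include <stdbool.h>

/* Compiler barrier (GCC/Clang syntax); emits no instructions. */
#define demo_barrier() __asm__ __volatile__("" ::: "memory")

static int demo_preempt_count = 1;  /* one level of preempt_disable() */
static bool demo_need_resched;      /* stands in for TIF_NEED_RESCHED */
static int demo_resched_calls;

static void demo_preempt_schedule(void)
{
	demo_resched_calls++;
	demo_need_resched = false;
}

static void demo_preempt_enable(void)
{
	demo_barrier();          /* existing barrier before the decrement */
	demo_preempt_count--;
	demo_barrier();          /* added: count is stored before the flag is loaded */
	if (demo_need_resched)
		demo_preempt_schedule();
}
```

Without the second barrier the compiler may load the flag first, as in the ARM listing above, which opens the window in which a newly set flag goes unnoticed.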
  12. 14 Nov, 2005 1 commit
  13. 17 Apr, 2005 1 commit
    •
      Linux-2.6.12-rc2 · 1da177e4
      Committed by Linus Torvalds
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!