1. 27 Sep, 2013 1 commit
  2. 14 Aug, 2013 5 commits
    • F
      context_tracking: User/kernel boundary cross trace events · 1b6a259a
      Frederic Weisbecker committed
      This can be useful to track all kernel/user round trips.
      And it's also helpful to debug the context tracking subsystem.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
    • F
      context_tracking: Optimize context switch off case with static keys · 73d424f9
      Frederic Weisbecker committed
      There is no need for the syscall slowpath if no CPU is full dynticks,
      so nop it out in that case.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
    • F
      context_tracking: Optimize guest APIs off case with static key · 48d6a816
      Frederic Weisbecker committed
      Optimize guest entry/exit APIs with static keys. This minimizes
      the overhead for those who enable CONFIG_NO_HZ_FULL without
      always using it. Passing no range to nohz_full= should
      minimize the overhead of the probes.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
    • F
      context_tracking: Optimize main APIs off case with static key · ad65782f
      Frederic Weisbecker committed
      Optimize user and exception entry/exit APIs with static
      keys. This minimizes the overhead for those who enable
      CONFIG_NO_HZ_FULL without always using it. Passing no range
      to nohz_full= should result in the probes being nopped
      (at least we hope so...).

      If this proves not to be enough in the long term, we'll need
      to introduce an exception slow path by re-routing the exception
      handlers.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
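The "off case" idea in the static key commits above can be sketched in user-space C. This is only an illustration under stated assumptions: a plain boolean stands in for the kernel's static key (which patches the branch in the instruction stream instead of loading a flag), and every name below (`tracking_enabled`, `user_enter_sketch`, `tracking_enable`) is hypothetical rather than taken from the kernel source.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a static key. In the kernel the test below
 * would be patched to a nop or a jump; here it is an ordinary flag. */
static bool tracking_enabled;

/* Counts how often the expensive probe body really ran. */
static int slow_path_calls;

/* The out-of-line probe body: only worth paying for when tracking is on. */
static void user_enter_slow(void)
{
    slow_path_calls++;
}

/* Inline wrapper placed at every call site. With tracking off, callers
 * pay only for the (ideally patched-out) test. */
static inline void user_enter_sketch(void)
{
    if (tracking_enabled)
        user_enter_slow();
}

/* Assumed to run once at boot if nohz_full= names at least one CPU. */
static void tracking_enable(void)
{
    tracking_enabled = true;
}
```

Until `tracking_enable()` has been called, `user_enter_sketch()` never reaches the slow path, which is the behaviour the commit message hopes for when no range is passed to nohz_full=.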
    • F
      context_tracking: Ground setup for static key use · 65f382fd
      Frederic Weisbecker committed
      Prepare for using a static key in the context tracking subsystem.
      This will help optimize the off case for its many users:

      * user_enter, user_exit, exception_enter, exception_exit, guest_enter,
        guest_exit, vtime_*()
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
  3. 13 Aug, 2013 4 commits
    • F
      nohz: Only enable context tracking on full dynticks CPUs · 2e709338
      Frederic Weisbecker committed
      The context tracking subsystem can selectively enable
      tracking on any defined subset of CPUs. This means that
      we can define a CPU range that doesn't run the context tracking
      and another range that does.

      What we want in practice is to enable the tracking on full
      dynticks CPUs only. To do this, we just need to pass
      our full dynticks CPU range selection from the full dynticks
      subsystem to the context tracking.

      This way we can spare the overhead of RCU user extended quiescent
      state and vtime maintenance on the CPUs that are outside the
      full dynticks range. Just keep in mind that the raw context tracking
      itself is still necessary everywhere.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
    • F
      context_tracking: Fix runtime CPU off-case · d65ec121
      Frederic Weisbecker committed
      As long as context tracking is enabled on any CPU, even
      a single one, all other CPUs need to keep track of their
      user <-> kernel boundary crossings as well.

      This is because a task can sleep while servicing an exception
      that happened in the kernel or in userspace. Then when the task
      eventually wakes up and returns from the exception, the CPU needs
      to know whether we resume in userspace or in the kernel. exception_exit()
      gets this information from exception_enter(), which saved the previous
      state.

      If the CPU where the exception happened didn't keep track of
      this information, exception_exit() doesn't know which state
      tracking to restore on the CPU where the task got migrated
      and we may return to userspace with the context tracking
      subsystem thinking that we are in kernel mode.

      This can be fixed in the long term if we move our context tracking
      probes to the very low level arch fast path user <-> kernel boundary,
      although even that is worrisome as an exception can still happen
      in the few instructions between the probe and the actual iret.

      Also we are not yet ready to set these probes in the fast path given
      the potential overhead problem it induces.

      So let's fix this by always enabling context tracking, even on CPUs
      that are not in the full dynticks range. OTOH we can spare the
      rcu_user_*() and vtime_user_*() calls there because the tick runs
      on these CPUs and we can handle the RCU state machine and cputime
      accounting through it.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
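The migration problem described in the commit above can be modelled with a small user-space sketch. It is not kernel code: the per-CPU state array, the `cpu_tracks` switch and the `*_on()` helpers are hypothetical names introduced for illustration, and the only behaviour taken from the commit message is that exception entry saves the previous state and exception exit restores it, possibly on another CPU.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_SKETCH 2

enum ctx_state_sketch { CTX_KERNEL_SKETCH, CTX_USER_SKETCH };

/* What the tracking subsystem believes each CPU is running. */
static enum ctx_state_sketch cpu_state[NR_CPUS_SKETCH];

/* Hypothetical per-CPU switch: does this CPU record boundary crossings? */
static bool cpu_tracks[NR_CPUS_SKETCH] = { true, true };

/* Going to user mode: recorded only if the CPU tracks crossings. */
static void user_enter_on(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
    if (cpu_tracks[cpu])
        cpu_state[cpu] = CTX_USER_SKETCH;
}

/* Exception entry: remember where we came from, then mark kernel mode. */
static enum ctx_state_sketch exception_enter_on(int cpu)
{
    enum ctx_state_sketch prev = cpu_state[cpu];

    cpu_state[cpu] = CTX_KERNEL_SKETCH;
    return prev;
}

/* Exception exit, possibly on another CPU after the task slept and
 * migrated: restore whatever exception entry saved. */
static void exception_exit_on(int cpu, enum ctx_state_sketch prev)
{
    cpu_state[cpu] = prev;
}
```

If CPU 0 does not track crossings, `exception_enter_on(0)` hands back a stale kernel state, and restoring it on CPU 1 leaves the model believing the task is in kernel mode while it actually resumes in userspace. That is the inconsistency the commit avoids by tracking on every CPU.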
    • F
      context_tracking: Fix guest accounting with native vtime · 2d854e57
      Frederic Weisbecker committed
      1) If context tracking is enabled with native vtime accounting (a
      combination that is useless except for dev testing), we call vtime_guest_enter()
      and vtime_guest_exit() on host <-> guest switches. But those are stubs
      in this configuration. As a result, cputime is not correctly flushed
      on kvm context switches.

      2) If context tracking runs but is disabled on some CPUs, those
      CPUs end up calling __guest_enter/__guest_exit which in turn
      call vtime_account_system(). We don't want to call this because we
      run in tick based accounting for these CPUs.

      Refactor the guest_enter/guest_exit code such that all combinations
      finally work.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
    • F
      sched: Consolidate open coded preemptible() checks · fbb00b56
      Frederic Weisbecker committed
      preempt_schedule() and preempt_schedule_context() open
      code their preemptibility checks.

      Use the standard API instead for consolidation.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Alex Shi <alex.shi@intel.com>
      Cc: Paul Turner <pjt@google.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Vincent Guittot <vincent.guittot@linaro.org>
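As a rough illustration of the consolidation described above, the sketch below models a preemptibility check as "preempt count is zero and interrupts are enabled" and routes a scheduling entry point through it instead of open coding the test. The variables and the `*_sketch` function names are hypothetical user-space stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the per-task preempt count and IRQ state. */
static int preempt_count_sketch;
static bool irqs_disabled_sketch;

/* Counts how many times the scheduler would have been entered. */
static int schedule_calls;

/* One shared helper instead of repeating the two-part test everywhere. */
static bool preemptible_sketch(void)
{
    return preempt_count_sketch == 0 && !irqs_disabled_sketch;
}

/* Entry point that bails out early when preemption is not allowed. */
static void preempt_schedule_sketch(void)
{
    if (!preemptible_sketch())
        return;
    schedule_calls++;
}
```

Keeping the condition in one helper means a later change to what "preemptible" means only has to be made in one place.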
  4. 19 Jun, 2013 1 commit
    • S
      tracing/context-tracking: Add preempt_schedule_context() for tracing · 29bb9e5a
      Steven Rostedt committed
      Dave Jones hit the following bug report:
      
       ===============================
       [ INFO: suspicious RCU usage. ]
       3.10.0-rc2+ #1 Not tainted
       -------------------------------
       include/linux/rcupdate.h:771 rcu_read_lock() used illegally while idle!
       other info that might help us debug this:
       RCU used illegally from idle CPU! rcu_scheduler_active = 1, debug_locks = 0
       RCU used illegally from extended quiescent state!
       2 locks held by cc1/63645:
        #0:  (&rq->lock){-.-.-.}, at: [<ffffffff816b39fd>] __schedule+0xed/0x9b0
        #1:  (rcu_read_lock){.+.+..}, at: [<ffffffff8109d645>] cpuacct_charge+0x5/0x1f0
      
       CPU: 1 PID: 63645 Comm: cc1 Not tainted 3.10.0-rc2+ #1 [loadavg: 40.57 27.55 13.39 25/277 64369]
       Hardware name: Gigabyte Technology Co., Ltd. GA-MA78GM-S2H/GA-MA78GM-S2H, BIOS F12a 04/23/2010
        0000000000000000 ffff88010f78fcf8 ffffffff816ae383 ffff88010f78fd28
        ffffffff810b698d ffff88011c092548 000000000023d073 ffff88011c092500
        0000000000000001 ffff88010f78fd60 ffffffff8109d7c5 ffffffff8109d645
       Call Trace:
        [<ffffffff816ae383>] dump_stack+0x19/0x1b
        [<ffffffff810b698d>] lockdep_rcu_suspicious+0xfd/0x130
        [<ffffffff8109d7c5>] cpuacct_charge+0x185/0x1f0
        [<ffffffff8109d645>] ? cpuacct_charge+0x5/0x1f0
        [<ffffffff8108dffc>] update_curr+0xec/0x240
        [<ffffffff8108f528>] put_prev_task_fair+0x228/0x480
        [<ffffffff816b3a71>] __schedule+0x161/0x9b0
        [<ffffffff816b4721>] preempt_schedule+0x51/0x80
        [<ffffffff816b4800>] ? __cond_resched_softirq+0x60/0x60
        [<ffffffff816b6824>] ? retint_careful+0x12/0x2e
        [<ffffffff810ff3cc>] ftrace_ops_control_func+0x1dc/0x210
        [<ffffffff816be280>] ftrace_call+0x5/0x2f
        [<ffffffff816b681d>] ? retint_careful+0xb/0x2e
        [<ffffffff816b4805>] ? schedule_user+0x5/0x70
        [<ffffffff816b4805>] ? schedule_user+0x5/0x70
        [<ffffffff816b6824>] ? retint_careful+0x12/0x2e
       ------------[ cut here ]------------
      
      What happened was that the function tracer traced the schedule_user() code
      that tells RCU that the system is coming back from userspace, and to
      add the CPU back to the RCU monitoring.
      
      Because the function tracer makes preempt_disable/enable_notrace() calls,
      preempt_enable_notrace() checks the NEED_RESCHED flag. If it is set,
      then preempt_schedule() is called. But this is called before the user_exit()
      function can inform the kernel that the CPU is no longer in user mode and
      needs to be accounted for by RCU.

      The fix is to create a new preempt_schedule_context() that checks if
      the kernel is still in user mode and if so switches it to kernel mode
      before calling schedule. It also switches back to user mode coming back
      from schedule if need be.

      The only user of this currently is preempt_enable_notrace(), which is
      only used by the tracing subsystem.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1369423420.6828.226.camel@gandalf.local.home
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
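The fix described in this commit can be sketched as a user-space model: before scheduling, switch the tracked context to kernel mode if it still says user mode, then switch back afterwards. Everything here is an assumption made for illustration (a single global context, a stub scheduler, the `*_sketch` names); it only mirrors the sequence given in the commit message and is not the kernel implementation.

```c
#include <assert.h>

enum ctx_sketch { CTX_KERNEL, CTX_USER };

/* What the tracking code currently believes about this CPU. */
static enum ctx_sketch ctx = CTX_KERNEL;

/* Counts scheduler entries made while the model still said "user mode",
 * i.e. the situation that produced the RCU warning in the report. */
static int schedule_calls_in_user;

/* Stub scheduler: only records whether it was entered in a bad state. */
static void schedule_stub(void)
{
    if (ctx == CTX_USER)
        schedule_calls_in_user++;
}

/* Sketch of the fix: leave user mode (from the tracker's point of view)
 * before scheduling, and go back to it afterwards if that is where we
 * came from. */
static void preempt_schedule_context_sketch(void)
{
    enum ctx_sketch prev = ctx;

    if (prev == CTX_USER)
        ctx = CTX_KERNEL;   /* plays the role of user_exit() */

    schedule_stub();

    if (prev == CTX_USER)
        ctx = CTX_USER;     /* plays the role of user_enter() */
}
```

Calling `schedule_stub()` directly while `ctx` is `CTX_USER` increments the counter, whereas going through `preempt_schedule_context_sketch()` does not, and the context is back to `CTX_USER` on return.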
  5. 31 May, 2013 1 commit
    • F
      kvm: Move guest entry/exit APIs to context_tracking · 521921ba
      Frederic Weisbecker committed
      The kvm_host.h header file doesn't handle inclusion
      well when archs don't support KVM.

      This results in build failures for such archs when they
      want to implement context tracking, because this subsystem
      includes kvm_host.h in order to implement the
      guest_enter/exit APIs but it doesn't handle the KVM off case.

      To fix this, move the guest_enter()/guest_exit()
      declarations and generic implementation to the context
      tracking headers. These generic APIs actually belong to
      this subsystem, alongside other domain boundary tracking
      like user_enter() et al.

      KVM now properly becomes a user of this library, not the
      other buggy way around.
      Reported-by: Kevin Hilman <khilman@linaro.org>
      Reviewed-by: Kevin Hilman <khilman@linaro.org>
      Tested-by: Kevin Hilman <khilman@linaro.org>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Kevin Hilman <khilman@linaro.org>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  6. 28 Jan, 2013 2 commits
    • F
      cputime: Safely read cputime of full dynticks CPUs · 6a61671b
      Frederic Weisbecker committed
      While remotely reading the cputime of a task running on a
      full dynticks CPU, the values stored in the utime/stime fields
      of struct task_struct may be stale. They may be those
      of the last kernel <-> user transition time snapshot, and
      we need to add the tickless time spent since this snapshot.

      To fix this, flush the cputime of the dynticks CPUs on
      kernel <-> user transition and record the time / context
      where we did this. Then, based on this snapshot and the current
      time, perform the fixup on the reader side from the task_times()
      accessors.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      [fixed kvm module related build errors]
      Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
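The snapshot-plus-fixup scheme described in this commit can be sketched as follows. This is a simplified user-space model with hypothetical names (`struct task_sketch`, `flush_on_transition`, `task_times_sketch`) and abstract time units; it leaves out the synchronisation a real remote reader would need and only shows the arithmetic: accumulated time plus the time elapsed since the last recorded transition, charged to the context recorded at that transition.

```c
#include <assert.h>
#include <stdint.h>

enum vtime_ctx_sketch { VTIME_USER_SKETCH, VTIME_SYS_SKETCH };

/* Hypothetical per-task accounting state. */
struct task_sketch {
    uint64_t utime;                 /* user time flushed so far */
    uint64_t stime;                 /* system time flushed so far */
    uint64_t snap_time;             /* time of the last transition */
    enum vtime_ctx_sketch snap_ctx; /* context entered at that time */
};

/* Writer side: on a kernel <-> user transition, charge the elapsed time
 * to the context being left and record the new snapshot. */
static void flush_on_transition(struct task_sketch *t, uint64_t now,
                                enum vtime_ctx_sketch next)
{
    uint64_t delta = now - t->snap_time;

    if (t->snap_ctx == VTIME_USER_SKETCH)
        t->utime += delta;
    else
        t->stime += delta;

    t->snap_time = now;
    t->snap_ctx = next;
}

/* Reader side: the stored fields may be stale, so add the time spent
 * since the snapshot to whichever context the task was last seen in. */
static void task_times_sketch(const struct task_sketch *t, uint64_t now,
                              uint64_t *ut, uint64_t *st)
{
    uint64_t delta = now - t->snap_time;

    *ut = t->utime;
    *st = t->stime;
    if (t->snap_ctx == VTIME_USER_SKETCH)
        *ut += delta;
    else
        *st += delta;
}
```

For a task that starts in kernel context at time 0, enters user mode at time 10 and is read at time 25, the reader reports 15 units of user time and 10 of system time even though the stored `utime` is still 0.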
    • F
      cputime: Generic on-demand virtual cputime accounting · abf917cd
      Frederic Weisbecker committed
      If we want to stop the tick beyond idle, we need to be
      able to account the cputime without using the tick.

      Virtual based cputime accounting solves that problem by
      hooking into kernel/user boundaries.

      However, implementing CONFIG_VIRT_CPU_ACCOUNTING requires
      low level hooks and involves more overhead. But we already
      have a generic context tracking subsystem that is required
      for RCU needs by archs which plan to shut down the tick
      outside idle.
      
      This patch implements a generic virtual based cputime
      accounting that relies on these generic kernel/user hooks.
      
      There are some upsides of doing this:
      
      - This requires no arch code to implement CONFIG_VIRT_CPU_ACCOUNTING
      if context tracking is already built (already necessary for RCU in full
      tickless mode).
      
      - We can rely on the generic context tracking subsystem to dynamically
      (de)activate the hooks, so that we can switch anytime between virtual
      and tick based accounting. This way we don't have the overhead
      of the virtual accounting when the tick is running periodically.
      
      And one downside:
      
      - There is probably more overhead than a native virtual based cputime
      accounting. But this relies on hooks that are already set anyway.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
  7. 27 Jan, 2013 2 commits
    • F
      context_tracking: Add comments on interface and internals · 4eacdf18
      Frederic Weisbecker committed
      This subsystem lacks many explanations of its purpose and
      design. Add these missing comments.

      v4: Document function parameter to be more kernel-doc
      friendly, as per Namhyung's suggestion.
      Reported-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Alessio Igor Bogani <abogani@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Geoff Levand <geoff@infradead.org>
      Cc: Gilad Ben Yossef <gilad@benyossef.com>
      Cc: Hakan Akkan <hakanakkan@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • F
      context_tracking: Export context state for generic vtime · 95a79fd4
      Frederic Weisbecker committed
      Export the context state: whether we run in user / kernel
      from the context tracking subsystem point of view.
      
      This is going to be used by the generic virtual cputime
      accounting subsystem that is needed to implement the full
      dynticks.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
  8. 01 Dec, 2012 1 commit
    • F
      context_tracking: New context tracking subsystem · 91d1aa43
      Frederic Weisbecker committed
      Create a new subsystem that probes on kernel boundaries
      to keep track of the transitions between level contexts
      with two basic initial contexts: user or kernel.

      This is an abstraction of some RCU code that uses such tracking
      to implement its userspace extended quiescent state.

      We need to pull this up from RCU into this new level of indirection
      because this tracking is also going to be used to implement an "on
      demand" generic virtual cputime accounting. A necessary step to
      shut down the tick while still accounting the cputime.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Gilad Ben-Yossef <gilad@benyossef.com>
      Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
      [ paulmck: fix whitespace error and email address. ]
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>