1. 08 3月, 2016 1 次提交
  2. 29 2月, 2016 1 次提交
    • R
      sched, time: Switch VIRT_CPU_ACCOUNTING_GEN to jiffy granularity · ff9a9b4c
      Rik van Riel 提交于
      When profiling syscall overhead on nohz-full kernels,
      after removing __acct_update_integrals() from the profile,
      native_sched_clock() remains as the top CPU user. This can be
      reduced by moving VIRT_CPU_ACCOUNTING_GEN to jiffy granularity.
      
      This will reduce timing accuracy on nohz_full CPUs to jiffy
      based sampling, just like on normal CPUs. It results in
      totally removing native_sched_clock from the profile, and
      significantly speeding up the syscall entry and exit path,
      as well as irq entry and exit, and KVM guest entry & exit.
      
      Additionally, only call the more expensive functions (and
      advance the seqlock) when jiffies actually changed.
      
      This code relies on another CPU advancing jiffies when the
      system is busy. On a nohz_full system, this is done by a
      housekeeping CPU.
      
      A microbenchmark calling an invalid syscall number 10 million
      times in a row speeds up an additional 30% over the numbers
      with just the previous patches, for a total speedup of about
      40% over 4.4 and 4.5-rc1.
      
      Run times for the microbenchmark:
      
       4.4				3.8 seconds
       4.5-rc1			3.7 seconds
       4.5-rc1 + first patch		3.3 seconds
       4.5-rc1 + first 3 patches	3.1 seconds
       4.5-rc1 + all patches		2.3 seconds
      
      A non-NOHZ_FULL cpu (not the housekeeping CPU):
      
       all kernels			1.86 seconds
      Signed-off-by: NRik van Riel <riel@redhat.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: clark@redhat.com
      Cc: eric.dumazet@gmail.com
      Cc: fweisbec@gmail.com
      Cc: luto@amacapital.net
      Link: http://lkml.kernel.org/r/1455152907-18495-5-git-send-email-riel@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      ff9a9b4c
  3. 21 12月, 2015 1 次提交
    • S
      missing include asm/paravirt.h in cputime.c · 1fe7c4ef
      Stefano Stabellini 提交于
      Add include asm/paravirt.h to cputime.c, as steal_account_process_tick
      calls paravirt_steal_clock, which is defined in asm/paravirt.h.
      
      The ifdef CONFIG_PARAVIRT is necessary because not all archs have an
      asm/paravirt.h to include.
      
      The reason why currently cputime.c compiles, even though include
      <asm/paravirt.h> is missing, is that on x86 asm/paravirt.h is included
      by one of the other headers included in kernel/sched/cputime.c:
      
      On arm and arm64, where I am about to introduce asm/paravirt.h and
      stolen time support, without #include <asm/paravirt.h> in cputime.c, I
      would get an error.
      Signed-off-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com>
      Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      1fe7c4ef
  4. 04 12月, 2015 7 次提交
  5. 01 10月, 2015 1 次提交
  6. 03 8月, 2015 1 次提交
  7. 08 5月, 2015 1 次提交
  8. 03 10月, 2014 1 次提交
  9. 19 9月, 2014 1 次提交
  10. 08 9月, 2014 2 次提交
    • R
      sched, time: Atomically increment stime & utime · eb1b4af0
      Rik van Riel 提交于
      The functions task_cputime_adjusted and thread_group_cputime_adjusted()
      can be called locklessly, as well as concurrently on many different CPUs.
      
      This can occasionally lead to the utime and stime reported by times(), and
      other syscalls like it, going backward. The cause for this appears to be
      multiple threads racing in cputime_adjust(), both with values for utime or
      stime that is larger than the original, but each with a different value.
      
      Sometimes the larger value gets saved first, only to be immediately
      overwritten with a smaller value by another thread.
      
      Using atomic exchange prevents that problem, and ensures time
      progresses monotonically.
      Signed-off-by: NRik van Riel <riel@redhat.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: umgwanakikbuti@gmail.com
      Cc: fweisbec@gmail.com
      Cc: akpm@linux-foundation.org
      Cc: srao@redhat.com
      Cc: lwoodman@redhat.com
      Cc: atheurer@redhat.com
      Cc: oleg@redhat.com
      Link: http://lkml.kernel.org/r/1408133138-22048-4-git-send-email-riel@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      eb1b4af0
    • R
      time, signal: Protect resource use statistics with seqlock · e78c3496
      Rik van Riel 提交于
      Both times() and clock_gettime(CLOCK_PROCESS_CPUTIME_ID) have scalability
      issues on large systems, due to both functions being serialized with a
      lock.
      
      The lock protects against reporting a wrong value, due to a thread in the
      task group exiting, its statistics reporting up to the signal struct, and
      that exited task's statistics being counted twice (or not at all).
      
      Protecting that with a lock results in times() and clock_gettime() being
      completely serialized on large systems.
      
      This can be fixed by using a seqlock around the events that gather and
      propagate statistics. As an additional benefit, the protection code can
      be moved into thread_group_cputime(), slightly simplifying the calling
      functions.
      
      In the case of posix_cpu_clock_get_task() things can be simplified a
      lot, because the calling function already ensures that the task sticks
      around, and the rest is now taken care of in thread_group_cputime().
      
      This way the statistics reporting code can run lockless.
      Signed-off-by: NRik van Riel <riel@redhat.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alex Thorlton <athorlton@sgi.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Daeseok Youn <daeseok.youn@gmail.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Guillaume Morin <guillaume@morinfr.org>
      Cc: Ionut Alexa <ionut.m.alexa@gmail.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Michal Schmidt <mschmidt@redhat.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Vladimir Davydov <vdavydov@parallels.com>
      Cc: umgwanakikbuti@gmail.com
      Cc: fweisbec@gmail.com
      Cc: srao@redhat.com
      Cc: lwoodman@redhat.com
      Cc: atheurer@redhat.com
      Link: http://lkml.kernel.org/r/20140816134010.26a9b572@annuminas.surriel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      e78c3496
  11. 20 8月, 2014 1 次提交
  12. 07 5月, 2014 1 次提交
  13. 13 3月, 2014 1 次提交
    • F
      cputime: Fix jiffies based cputime assumption on steal accounting · dee08a72
      Frederic Weisbecker 提交于
      The steal guest time accounting code assumes that cputime_t is based on
      jiffies. So when CONFIG_NO_HZ_FULL=y, which implies that cputime_t
      is based on nsecs, steal_account_process_tick() passes the delta in
      jiffies to account_steal_time() which then accounts it as if it's a
      value in nsecs.
      
      As a result, accounting 1 second of steal time (with HZ=100 that would
      be 100 jiffies) is spuriously accounted as 100 nsecs.
      
      As such /proc/stat may report 0 values of steal time even when two
      guests have run concurrently for a few seconds on the same host and
      same CPU.
      
      In order to fix this, lets convert the nsecs based steal delta to
      cputime instead of jiffies by using the right conversion API.
      
      Given that the steal time is stored in cputime_t and this type can have
      a smaller granularity than nsecs, we only account the rounded converted
      value and leave the remaining nsecs for the next deltas.
      Reported-by: NHuiqingding <huding@redhat.com>
      Reported-by: NMarcelo Tosatti <mtosatti@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: NRik van Riel <riel@redhat.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      dee08a72
  14. 09 2月, 2014 1 次提交
  15. 04 9月, 2013 1 次提交
    • S
      sched/cputime: Do not scale when utime == 0 · 5a8e01f8
      Stanislaw Gruszka 提交于
      scale_stime() silently assumes that stime < rtime, otherwise
      when stime == rtime and both values are big enough (operations
      on them do not fit in 32 bits), the resulting scaling stime can
      be bigger than rtime. In consequence utime = rtime - stime
      results in negative value.
      
      User space visible symptoms of the bug are overflowed TIME
      values on ps/top, for example:
      
       $ ps aux | grep rcu
       root         8  0.0  0.0      0     0 ?        S    12:42   0:00 [rcuc/0]
       root         9  0.0  0.0      0     0 ?        S    12:42   0:00 [rcub/0]
       root        10 62422329  0.0  0     0 ?        R    12:42 21114581:37 [rcu_preempt]
       root        11  0.1  0.0      0     0 ?        S    12:42   0:02 [rcuop/0]
       root        12 62422329  0.0  0     0 ?        S    12:42 21114581:35 [rcuop/1]
       root        10 62422329  0.0  0     0 ?        R    12:42 21114581:37 [rcu_preempt]
      
      or overflowed utime values read directly from /proc/$PID/stat
      
      Reference:
      
        https://lkml.org/lkml/2013/8/20/259Reported-and-tested-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: NStanislaw Gruszka <sgruszka@redhat.com>
      Cc: stable@vger.kernel.org
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Link: http://lkml.kernel.org/r/20130904131602.GC2564@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      5a8e01f8
  16. 16 8月, 2013 1 次提交
  17. 14 8月, 2013 6 次提交
    • F
      vtime: Always debug check snapshot source _before_ updating it · af2350bd
      Frederic Weisbecker 提交于
      The vtime delta update performed by get_vtime_delta() always check
      that the source of the snapshot is valid.
      
      Meanhile the snapshot updaters that rely on get_vtime_delta() also
      set the new snapshot origin. But some of them do this right before
      the call to get_vtime_delta(), making its debug check useless.
      
      This is easily fixable by moving the snapshot origin update after
      the call to get_vtime_delta(). The order doesn't matter there.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
      af2350bd
    • F
      vtime: Always scale generic vtime accounting results · b854fafa
      Frederic Weisbecker 提交于
      The cputime accounting in full dynticks can be a subtle
      mixup of CPUs using tick based accounting and others using
      generic vtime.
      
      As long as the tick can have a share on producing these stats, we
      want to scale the result against CFS precise accounting as the tick
      can miss some task hiding between the periodic interrupt.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
      b854fafa
    • F
      vtime: Optimize full dynticks accounting off case with static keys · b0493406
      Frederic Weisbecker 提交于
      If no CPU is in the full dynticks range, we can avoid the full
      dynticks cputime accounting through generic vtime along with its
      overhead and use the traditional tick based accounting instead.
      
      Let's do this and nope the off case with static keys.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
      b0493406
    • F
      vtime: Fix racy cputime delta update · 54461562
      Frederic Weisbecker 提交于
      get_vtime_delta() must be called under the task vtime_seqlock
      with the code that does the cputime accounting flush.
      
      Otherwise the cputime reader can be fooled and run into
      a race where it sees the snapshot update but misses the
      cputime flush. As a result it can report a cputime that is
      way too short.
      
      Fix vtime_account_user() that wasn't complying to that rule.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
      54461562
    • F
      vtime: Remove a few unneeded generic vtime state checks · 7621d1f8
      Frederic Weisbecker 提交于
      Some generic vtime APIs check if the vtime accounting
      is enabled on the local CPU before doing their work.
      
      Some of these are not needed because all their callers already
      take care of that. Let's remove the checks on these.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
      7621d1f8
    • F
      context_tracking: Optimize guest APIs off case with static key · 48d6a816
      Frederic Weisbecker 提交于
      Optimize guest entry/exit APIs with static keys. This minimize
      the overhead for those who enable CONFIG_NO_HZ_FULL without
      always using it. Having no range passed to nohz_full= should
      result in the probes overhead to be minimized.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
      48d6a816
  18. 13 8月, 2013 1 次提交
    • F
      vtime: Update a few comments · 5b206d48
      Frederic Weisbecker 提交于
      Update a stale comment from the old vtime era and document some
      locking that might be non obvious.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
      5b206d48
  19. 31 5月, 2013 1 次提交
    • F
      vtime: Use consistent clocks among nohz accounting · 45eacc69
      Frederic Weisbecker 提交于
      While computing the cputime delta of dynticks CPUs,
      we are mixing up clocks of differents natures:
      
      * local_clock() which takes care of unstable clock
      sources and fix these if needed.
      
      * sched_clock() which is the weaker version of
      local_clock(). It doesn't compute any fixup in case
      of unstable source.
      
      If the clock source is stable, those two clocks are the
      same and we can safely compute the difference against
      two random points.
      
      Otherwise it results in random deltas as sched_clock()
      can randomly drift away, back or forward, from local_clock().
      
      As a consequence, some strange behaviour with unstable tsc
      has been observed such as non progressing constant zero cputime.
      (The 'top' command showing no load).
      
      Fix this by only using local_clock(), or its irq safe/remote
      equivalent, in vtime code.
      Reported-by: NMike Galbraith <efault@gmx.de>
      Suggested-by: NMike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      45eacc69
  20. 28 5月, 2013 1 次提交
  21. 01 5月, 2013 3 次提交
  22. 10 4月, 2013 1 次提交
  23. 08 4月, 2013 1 次提交
  24. 14 3月, 2013 1 次提交
    • F
      sched: Lower chances of cputime scaling overflow · d9a3c982
      Frederic Weisbecker 提交于
      Some users have reported that after running a process with
      hundreds of threads on intensive CPU-bound loads, the cputime
      of the group started to freeze after a few days.
      
      This is due to how we scale the tick-based cputime against
      the scheduler precise execution time value.
      
      We add the values of all threads in the group and we multiply
      that against the sum of the scheduler exec runtime of the whole
      group.
      
      This easily overflows after a few days/weeks of execution.
      
      A proposed solution to solve this was to compute that multiplication
      on stime instead of utime:
         62188451
         ("cputime: Avoid multiplication overflow on utime scaling")
      
      The rationale behind that was that it's easy for a thread to
      spend most of its time in userspace under intensive CPU-bound workload
      but it's much harder to do CPU-bound intensive long run in the kernel.
      
      This postulate got defeated when a user recently reported he was still
      seeing cputime freezes after the above patch. The workload that
      triggers this issue relates to intensive networking workloads where
      most of the cputime is consumed in the kernel.
      
      To reduce much more the opportunities for multiplication overflow,
      lets reduce the multiplication factors to the remainders of the division
      between sched exec runtime and cputime. Assuming the difference between
      these shouldn't ever be that large, it could work on many situations.
      
      This gets the same results as in the upstream scaling code except for
      a small difference: the upstream code always rounds the results to
      the nearest integer not greater to what would be the precise result.
      The new code rounds to the nearest integer either greater or not
      greater. In practice this difference probably shouldn't matter but
      it's worth mentioning.
      
      If this solution appears not to be enough in the end, we'll
      need to partly revert back to the behaviour prior to commit
           0cf55e1e
           ("sched, cputime: Introduce thread_group_times()")
      
      Back then, the scaling was done on exit() time before adding the cputime
      of an exiting thread to the signal struct. And then we'll need to
      scale one-by-one the live threads cputime in thread_group_cputime(). The
      drawback may be a slightly slower code on exit time.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Stanislaw Gruszka <sgruszka@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      d9a3c982
  25. 08 3月, 2013 1 次提交
    • F
      cputime: Dynamically scale cputime for full dynticks accounting · 9fbc42ea
      Frederic Weisbecker 提交于
      The full dynticks cputime accounting is able to account either
      using the tick or the context tracking subsystem. This way
      the housekeeping CPU can keep the low overhead tick based
      solution.
      
      This latter mode has a low jiffies resolution granularity and
      need to be scaled against CFS precise runtime accounting to
      improve its result. We are doing this for CONFIG_TICK_CPU_ACCOUNTING,
      now we also need to expand it to full dynticks accounting dynamic
      off-case as well.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Kevin Hilman <khilman@linaro.org>
      Cc: Mats Liljegren <mats.liljegren@enea.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      9fbc42ea
  26. 24 2月, 2013 1 次提交
    • F
      cputime: Use local_clock() for full dynticks cputime accounting · 7f6575f1
      Frederic Weisbecker 提交于
      Running the full dynticks cputime accounting with preemptible
      kernel debugging trigger the following warning:
      
      	[    4.488303] BUG: using smp_processor_id() in preemptible [00000000] code: init/1
      	[    4.490971] caller is native_sched_clock+0x22/0x80
      	[    4.493663] Pid: 1, comm: init Not tainted 3.8.0+ #13
      	[    4.496376] Call Trace:
      	[    4.498996]  [<ffffffff813410eb>] debug_smp_processor_id+0xdb/0xf0
      	[    4.501716]  [<ffffffff8101e642>] native_sched_clock+0x22/0x80
      	[    4.504434]  [<ffffffff8101db99>] sched_clock+0x9/0x10
      	[    4.507185]  [<ffffffff81096ccd>] fetch_task_cputime+0xad/0x120
      	[    4.509916]  [<ffffffff81096dd5>] task_cputime+0x35/0x60
      	[    4.512622]  [<ffffffff810f146e>] acct_update_integrals+0x1e/0x40
      	[    4.515372]  [<ffffffff8117d2cf>] do_execve_common+0x4ff/0x5c0
      	[    4.518117]  [<ffffffff8117cf14>] ? do_execve_common+0x144/0x5c0
      	[    4.520844]  [<ffffffff81867a10>] ? rest_init+0x160/0x160
      	[    4.523554]  [<ffffffff8117d457>] do_execve+0x37/0x40
      	[    4.526276]  [<ffffffff810021a3>] run_init_process+0x23/0x30
      	[    4.528953]  [<ffffffff81867aac>] kernel_init+0x9c/0xf0
      	[    4.531608]  [<ffffffff8188356c>] ret_from_fork+0x7c/0xb0
      
      We use sched_clock() to perform and fixup the cputime
      accounting. However we are calling it with preemption enabled
      from the read side, which trigger the bug above.
      
      To fix this up, use local_clock() instead. It takes care of
      preemption and also provide a more reliable clock source. This
      is welcome for this kind of statistic that is widely relied on
      in userspace.
      Reported-by: NThomas Gleixner <tglx@linutronix.de>
      Reported-by: NIngo Molnar <mingo@kernel.org>
      Suggested-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Kevin Hilman <khilman@linaro.org>
      Link: http://lkml.kernel.org/r/1361636925-22288-3-git-send-email-fweisbec@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      7f6575f1