You need to sign in or sign up before continuing.
  1. 07 4月, 2010 1 次提交
    • A
      timers: Introduce the concept of timer slack for legacy timers · 3bbb9ec9
      Arjan van de Ven 提交于
      While HR timers have had the concept of timer slack for quite some time
      now, the legacy timers lacked this concept, and had to make do with
      round_jiffies() and friends.
      
      Timer slack is important for power management; grouping timers reduces the
      number of wakeups which in turn reduces power consumption.
      
      This patch introduces timer slack to the legacy timers using the following
      pieces:
      * A slack field in the timer struct
      * An api (set_timer_slack) that callers can use to set explicit timer slack
      * A default slack of 0.4% of the requested delay for callers that do not set
        any explicit slack
      * Rounding code that is part of mod_timer() that tries to
        group timers around jiffies values every 'power of two'
        (so quick timers will group around every 2, but longer timers
        will group around every 4, 8, 16, 32 etc)
      Signed-off-by: NArjan van de Ven <arjan@linux.intel.com>
      Cc: johnstul@us.ibm.com
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      3bbb9ec9
  2. 13 3月, 2010 3 次提交
  3. 21 1月, 2010 1 次提交
  4. 17 12月, 2009 1 次提交
  5. 21 9月, 2009 1 次提交
    • I
      perf: Do the big rename: Performance Counters -> Performance Events · cdd6c482
      Ingo Molnar 提交于
      Bye-bye Performance Counters, welcome Performance Events!
      
      In the past few months the perfcounters subsystem has grown out its
      initial role of counting hardware events, and has become (and is
      becoming) a much broader generic event enumeration, reporting, logging,
      monitoring, analysis facility.
      
      Naming its core object 'perf_counter' and naming the subsystem
      'perfcounters' has become more and more of a misnomer. With pending
      code like hw-breakpoints support the 'counter' name is less and
      less appropriate.
      
      All in one, we've decided to rename the subsystem to 'performance
      events' and to propagate this rename through all fields, variables
      and API names. (in an ABI compatible fashion)
      
      The word 'event' is also a bit shorter than 'counter' - which makes
      it slightly more convenient to write/handle as well.
      
      Thanks goes to Stephane Eranian who first observed this misnomer and
      suggested a rename.
      
      User-space tooling and ABI compatibility is not affected - this patch
      should be function-invariant. (Also, defconfigs were not touched to
      keep the size down.)
      
      This patch has been generated via the following script:
      
        FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')
      
        sed -i \
          -e 's/PERF_EVENT_/PERF_RECORD_/g' \
          -e 's/PERF_COUNTER/PERF_EVENT/g' \
          -e 's/perf_counter/perf_event/g' \
          -e 's/nb_counters/nb_events/g' \
          -e 's/swcounter/swevent/g' \
          -e 's/tpcounter_event/tp_event/g' \
          $FILES
      
        for N in $(find . -name perf_counter.[ch]); do
          M=$(echo $N | sed 's/perf_counter/perf_event/g')
          mv $N $M
        done
      
        FILES=$(find . -name perf_event.*)
      
        sed -i \
          -e 's/COUNTER_MASK/REG_MASK/g' \
          -e 's/COUNTER/EVENT/g' \
          -e 's/\<event\>/event_id/g' \
          -e 's/counter/event/g' \
          -e 's/Counter/Event/g' \
          $FILES
      
      ... to keep it as correct as possible. This script can also be
      used by anyone who has pending perfcounters patches - it converts
      a Linux kernel tree over to the new naming. We tried to time this
      change to the point in time where the amount of pending patches
      is the smallest: the end of the merge window.
      
      Namespace clashes were fixed up in a preparatory patch - and some
      stylistic fallout will be fixed up in a subsequent patch.
      
      ( NOTE: 'counters' are still the proper terminology when we deal
        with hardware registers - and these sed scripts are a bit
        over-eager in renaming them. I've undone some of that, but
        in case there's something left where 'counter' would be
        better than 'event' we can undo that on an individual basis
        instead of touching an otherwise nicely automated patch. )
      Suggested-by: NStephane Eranian <eranian@google.com>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: NPaul Mackerras <paulus@samba.org>
      Reviewed-by: NArjan van de Ven <arjan@linux.intel.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: <linux-arch@vger.kernel.org>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      cdd6c482
  6. 29 8月, 2009 1 次提交
  7. 26 8月, 2009 1 次提交
  8. 23 8月, 2009 1 次提交
    • P
      rcu: Simplify rcu_pending()/rcu_check_callbacks() API · a157229c
      Paul E. McKenney 提交于
      All calls from outside RCU are of the form:
      
      	if (rcu_pending(cpu))
      		rcu_check_callbacks(cpu, user);
      
      This is silly, instead we put a call to rcu_pending() in
      rcu_check_callbacks(), and then make the outside calls be to
      rcu_check_callbacks().  This cuts down on the code a bit and
      also gives the compiler a better chance of optimizing.
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: akpm@linux-foundation.org
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josht@linux.vnet.ibm.com
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      LKML-Reference: <125097461311-git-send-email->
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      a157229c
  9. 05 8月, 2009 1 次提交
    • M
      timers: Cache __next_timer_interrupt result · 97fd9ed4
      Martin Schwidefsky 提交于
      Each time a cpu goes to sleep on a NOHZ=y system the timer
      wheel is searched for the next timer interrupt. It can take
      quite a few cycles to find the next pending timer.
      
      This patch adds a field to tvec_base that caches the result of
      __next_timer_interrupt.
      
      The hit ratio is around 80% on my thinkpad under normal use, on
      a server I've seen hit ratios from 5% to 95% dependent on the
      workload.
      
      -v2: jiffies wrap fixes
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: john stultz <johnstul@us.ibm.com>
      Cc: Venki Pallipadi <venkatesh.pallipadi@intel.com>
      LKML-Reference: <20090721202505.7d56a079@skybase>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      97fd9ed4
  10. 19 7月, 2009 1 次提交
  11. 24 6月, 2009 1 次提交
    • H
      timer stats: Optimize by adding quick check to avoid function calls · 507e1231
      Heiko Carstens 提交于
      When the kernel is configured with CONFIG_TIMER_STATS but timer
      stats are runtime disabled we still get calls to
      __timer_stats_timer_set_start_info which initializes some
      fields in the corresponding struct timer_list.
      
      So add some quick checks in the the timer stats setup functions
      to avoid function calls to __timer_stats_timer_set_start_info
      when timer stats are disabled.
      
      In an artificial workload that does nothing but playing ping
      pong with a single tcp packet via loopback this decreases cpu
      consumption by 1 - 1.5%.
      
      This is part of a modified function trace output on SLES11:
      
       perl-2497  [00] 28630647177732388 [+  125]: sk_reset_timer <-tcp_v4_rcv
       perl-2497  [00] 28630647177732513 [+  125]: mod_timer <-sk_reset_timer
       perl-2497  [00] 28630647177732638 [+  125]: __timer_stats_timer_set_start_info <-mod_timer
       perl-2497  [00] 28630647177732763 [+  125]: __mod_timer <-mod_timer
       perl-2497  [00] 28630647177732888 [+  125]: __timer_stats_timer_set_start_info <-__mod_timer
       perl-2497  [00] 28630647177733013 [+   93]: lock_timer_base <-__mod_timer
      Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Mustafa Mesanovic <mustafa.mesanovic@de.ibm.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      LKML-Reference: <20090623153811.GA4641@osiris.boeblingen.de.ibm.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      507e1231
  12. 29 5月, 2009 1 次提交
  13. 15 5月, 2009 2 次提交
    • T
      sched, timers: cleanup avenrun users · 2d02494f
      Thomas Gleixner 提交于
      avenrun is an rough estimate so we don't have to worry about
      consistency of the three avenrun values. Remove the xtime lock
      dependency and provide a function to scale the values. Cleanup the
      users.
      
      [ Impact: cleanup ]
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      2d02494f
    • T
      sched, timers: move calc_load() to scheduler · dce48a84
      Thomas Gleixner 提交于
      Dimitri Sivanich noticed that xtime_lock is held write locked across
      calc_load() which iterates over all online CPUs. That can cause long
      latencies for xtime_lock readers on large SMP systems. 
      
      The load average calculation is an rough estimate anyway so there is
      no real need to protect the readers vs. the update. It's not a problem
      when the avenrun array is updated while a reader copies the values.
      
      Instead of iterating over all online CPUs let the scheduler_tick code
      update the number of active tasks shortly before the avenrun update
      happens. The avenrun update itself is handled by the CPU which calls
      do_timer().
      
      [ Impact: reduce xtime_lock write locked section ]
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      dce48a84
  14. 13 5月, 2009 2 次提交
    • A
      timers: Logic to move non pinned timers · eea08f32
      Arun R Bharadwaj 提交于
      * Arun R Bharadwaj <arun@linux.vnet.ibm.com> [2009-04-16 12:11:36]:
      
      This patch migrates all non pinned timers and hrtimers to the current
      idle load balancer, from all the idle CPUs. Timers firing on busy CPUs
      are not migrated.
      
      While migrating hrtimers, care should be taken to check if migrating
      a hrtimer would result in a latency or not. So we compare the expiry of the
      hrtimer with the next timer interrupt on the target cpu and migrate the
      hrtimer only if it expires *after* the next interrupt on the target cpu.
      So, added a clockevents_get_next_event() helper function to return the
      next_event on the target cpu's clock_event_device.
      
      [ tglx: cleanups and simplifications ]
      Signed-off-by: NArun R Bharadwaj <arun@linux.vnet.ibm.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      eea08f32
    • A
      timers: Framework for identifying pinned timers · 597d0275
      Arun R Bharadwaj 提交于
      * Arun R Bharadwaj <arun@linux.vnet.ibm.com> [2009-04-16 12:11:36]:
      
      This patch creates a new framework for identifying cpu-pinned timers
      and hrtimers.
      
      This framework is needed because pinned timers are expected to fire on
      the same CPU on which they are queued. So it is essential to identify
      these and not migrate them, in case there are any.
      
      For regular timers, the currently existing add_timer_on() can be used
      queue pinned timers and subsequently mod_timer_pinned() can be used
      to modify the 'expires' field.
      
      For hrtimers, new modes HRTIMER_ABS_PINNED and HRTIMER_REL_PINNED are
      added to queue cpu-pinned hrtimer.
      
      [ tglx: use .._PINNED mode argument instead of creating tons of new
      functions ]
      Signed-off-by: NArun R Bharadwaj <arun@linux.vnet.ibm.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      597d0275
  15. 02 5月, 2009 1 次提交
  16. 06 4月, 2009 1 次提交
    • P
      perf_counter: unify and fix delayed counter wakeup · 925d519a
      Peter Zijlstra 提交于
      While going over the wakeup code I noticed delayed wakeups only work
      for hardware counters but basically all software counters rely on
      them.
      
      This patch unifies and generalizes the delayed wakeup to fix this
      issue.
      
      Since we're dealing with NMI context bits here, use a cmpxchg() based
      single link list implementation to track counters that have pending
      wakeups.
      
      [ This should really be generic code for delayed wakeups, but since we
        cannot use cmpxchg()/xchg() in generic code, I've let it live in the
        perf_counter code. -- Eric Dumazet could use it to aggregate the
        network wakeups. ]
      
      Furthermore, the x86 method of using TIF flags was flawed in that its
      quite possible to end up setting the bit on the idle task, loosing the
      wakeup.
      
      The powerpc method uses per-cpu storage and does appear to be
      sufficient.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: NPaul Mackerras <paulus@samba.org>
      Orig-LKML-Reference: <20090330171023.153932974@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      925d519a
  17. 02 4月, 2009 1 次提交
    • R
      timers: add missing kernel-doc · 633fe795
      Randy Dunlap 提交于
      Add missing kernel-doc parameter notation and change function
      name to its new name:
      
        Warning(kernel/timer.c:543): No description found for parameter 'name'
        Warning(kernel/timer.c:543): No description found for parameter 'key'
      Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com>
      Cc: akpm <akpm@linux-foundation.org>
      Cc: Johannes Berg <johannes@sipsolutions.net>
      LKML-Reference: <20090401174723.f0bea0eb.randy.dunlap@oracle.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      633fe795
  18. 19 2月, 2009 1 次提交
    • I
      timers: add mod_timer_pending() · 74019224
      Ingo Molnar 提交于
      Impact: new timer API
      
      Based on an idea from Martin Josefsson with the help of
      Patrick McHardy and Stephen Hemminger:
      
      introduce the mod_timer_pending() API which is a mod_timer()
      offspring that is an invariant on already removed timers.
      
      (regular mod_timer() re-activates non-pending timers.)
      
      This is useful for the networking code in that it can
      allow unserialized mod_timer_pending() timer-forwarding
      calls, but a single del_timer*() will stop the timer
      from being reactivated again.
      
      Also while at it:
      
      - optimize the regular mod_timer() path some more, the
        timer-stat and a debug check was needlessly duplicated
        in __mod_timer().
      
      - make the exports come straight after the function, as
        most other exports in timer.c already did.
      
      - eliminate __mod_timer() as an external API, change the
        users to mod_timer().
      
      The regular mod_timer() code path is not impacted
      significantly, due to inlining optimizations and due to
      the simplifications.
      
      Based-on-patch-from: Stephen Hemminger <shemminger@vyatta.com>
      Acked-by: NStephen Hemminger <shemminger@vyatta.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Patrick McHardy <kaber@trash.net>
      Cc: netdev@vger.kernel.org
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      74019224
  19. 15 2月, 2009 1 次提交
  20. 14 1月, 2009 4 次提交
  21. 31 12月, 2008 2 次提交
    • M
      [PATCH] idle cputime accounting · 79741dd3
      Martin Schwidefsky 提交于
      The cpu time spent by the idle process actually doing something is
      currently accounted as idle time. This is plain wrong, the architectures
      that support VIRT_CPU_ACCOUNTING=y can do better: distinguish between the
      time spent doing nothing and the time spent by idle doing work. The first
      is accounted with account_idle_time and the second with account_system_time.
      The architectures that use the account_xxx_time interface directly and not
      the account_xxx_ticks interface now need to do the check for the idle
      process in their arch code. In particular to improve the system vs true
      idle time accounting the arch code needs to measure the true idle time
      instead of just testing for the idle process.
      To improve the tick based accounting as well we would need an architecture
      primitive that can tell us if the pt_regs of the interrupted context
      points to the magic instruction that halts the cpu.
      
      In addition idle time is no more added to the stime of the idle process.
      This field now contains the system time of the idle process as it should
      be. On systems without VIRT_CPU_ACCOUNTING this will always be zero as
      every tick that occurs while idle is running will be accounted as idle
      time.
      
      This patch contains the necessary common code changes to be able to
      distinguish idle system time and true idle time. The architectures with
      support for VIRT_CPU_ACCOUNTING need some changes to exploit this.
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      79741dd3
    • M
      [PATCH] fix scaled & unscaled cputime accounting · 457533a7
      Martin Schwidefsky 提交于
      The utimescaled / stimescaled fields in the task structure and the
      global cpustat should be set on all architectures. On s390 the calls
      to account_user_time_scaled and account_system_time_scaled never have
      been added. In addition system time that is accounted as guest time
      to the user time of a process is accounted to the scaled system time
      instead of the scaled user time.
      To fix the bugs and to prevent future forgetfulness this patch merges
      account_system_time_scaled into account_system_time and
      account_user_time_scaled into account_user_time.
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Cc: Michael Neuling <mikey@neuling.org>
      Acked-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      457533a7
  22. 14 11月, 2008 1 次提交
  23. 06 11月, 2008 1 次提交
  24. 21 8月, 2008 1 次提交
  25. 11 8月, 2008 1 次提交
  26. 25 5月, 2008 1 次提交
    • C
      Remove argument from open_softirq which is always NULL · 962cf36c
      Carlos R. Mafra 提交于
      As git-grep shows, open_softirq() is always called with the last argument
      being NULL
      
      block/blk-core.c:       open_softirq(BLOCK_SOFTIRQ, blk_done_softirq, NULL);
      kernel/hrtimer.c:       open_softirq(HRTIMER_SOFTIRQ, run_hrtimer_softirq, NULL);
      kernel/rcuclassic.c:    open_softirq(RCU_SOFTIRQ, rcu_process_callbacks, NULL);
      kernel/rcupreempt.c:    open_softirq(RCU_SOFTIRQ, rcu_process_callbacks, NULL);
      kernel/sched.c: open_softirq(SCHED_SOFTIRQ, run_rebalance_domains, NULL);
      kernel/softirq.c:       open_softirq(TASKLET_SOFTIRQ, tasklet_action, NULL);
      kernel/softirq.c:       open_softirq(HI_SOFTIRQ, tasklet_hi_action, NULL);
      kernel/timer.c: open_softirq(TIMER_SOFTIRQ, run_timer_softirq, NULL);
      net/core/dev.c: open_softirq(NET_TX_SOFTIRQ, net_tx_action, NULL);
      net/core/dev.c: open_softirq(NET_RX_SOFTIRQ, net_rx_action, NULL);
      
      This observation has already been made by Matthew Wilcox in June 2002
      (http://www.cs.helsinki.fi/linux/linux-kernel/2002-25/0687.html)
      
      "I notice that none of the current softirq routines use the data element
      passed to them."
      
      and the situation hasn't changed since them. So it appears we can safely
      remove that extra argument to save 128 (54) bytes of kernel data (text).
      Signed-off-by: NCarlos R. Mafra <crmafra@ift.unesp.br>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      962cf36c
  27. 13 5月, 2008 1 次提交
  28. 30 4月, 2008 1 次提交
  29. 17 4月, 2008 1 次提交
  30. 26 3月, 2008 1 次提交
    • T
      NOHZ: reevaluate idle sleep length after add_timer_on() · 06d8308c
      Thomas Gleixner 提交于
      add_timer_on() can add a timer on a CPU which is currently in a long
      idle sleep, but the timer wheel is not reevaluated by the nohz code on
      that CPU. So a timer can be delayed for quite a long time. This
      triggered a false positive in the clocksource watchdog code.
      
      To avoid this we need to wake up the idle CPU and enforce the
      reevaluation of the timer wheel for the next timer event.
      
      Add a function, which checks a given CPU for idle state, marks the
      idle task with NEED_RESCHED and sends a reschedule IPI to notify the
      other CPU of the change in the timer wheel.
      
      Call this function from add_timer_on().
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: NIngo Molnar <mingo@elte.hu>
      Cc: stable@kernel.org
      
      --
       include/linux/sched.h |    6 ++++++
       kernel/sched.c        |   43 +++++++++++++++++++++++++++++++++++++++++++
       kernel/timer.c        |   10 +++++++++-
       3 files changed, 58 insertions(+), 1 deletion(-)
      06d8308c
  31. 09 2月, 2008 2 次提交