Commit dfb4357d authored by Kees Cook, committed by Thomas Gleixner

time: Remove CONFIG_TIMER_STATS

Currently CONFIG_TIMER_STATS exposes process information across namespaces:

kernel/time/timer_list.c print_timer():

        SEQ_printf(m, ", %s/%d", tmp, timer->start_pid);

/proc/timer_list:

 #11: <0000000000000000>, hrtimer_wakeup, S:01, do_nanosleep, cron/2570

Given that the tracer can give the same information, this patch entirely
removes CONFIG_TIMER_STATS.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: John Stultz <john.stultz@linaro.org>
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: linux-doc@vger.kernel.org
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Xing Gao <xgao01@email.wm.edu>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jessica Frazelle <me@jessfraz.com>
Cc: kernel-hardening@lists.openwall.com
Cc: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Michal Marek <mmarek@suse.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-api@vger.kernel.org
Cc: Arjan van de Ven <arjan@linux.intel.com>
Link: http://lkml.kernel.org/r/20170208192659.GA32582@beast
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Parent bb42ca47
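
As a point of reference for the commit message's claim that the tracer can
give the same information: the existing timer:timer_start tracepoint records
the timer, its callback, and the comm/pid of the task that armed it. Below is
a minimal userspace sketch of reading that event through tracefs; the
/sys/kernel/tracing mount point, the need for root, and the unbounded read
loop are assumptions of this example, not part of the commit.

/*
 * Sketch: stream timer:timer_start events from tracefs.
 * Assumes tracefs is mounted at /sys/kernel/tracing and that the
 * process has sufficient privileges to access it.
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define TRACEFS "/sys/kernel/tracing"

/* Write a short control string to a tracefs file. */
static int write_str(const char *path, const char *val)
{
	int fd = open(path, O_WRONLY);

	if (fd < 0)
		return -1;
	if (write(fd, val, strlen(val)) < 0) {
		close(fd);
		return -1;
	}
	return close(fd);
}

int main(void)
{
	char buf[4096];
	ssize_t n;
	int fd;

	/* Enable the timer:timer_start event. */
	if (write_str(TRACEFS "/events/timer/timer_start/enable", "1"))
		return 1;

	/*
	 * Stream the live trace; each record carries the arming task's
	 * comm/pid plus the timer and callback addresses.
	 */
	fd = open(TRACEFS "/trace_pipe", O_RDONLY);
	if (fd < 0)
		return 1;
	while ((n = read(fd, buf, sizeof(buf))) > 0)
		fwrite(buf, 1, (size_t)n, stdout);
	close(fd);
	return 0;
}

From a shell, the equivalent is writing 1 to
events/timer/timer_start/enable and reading trace_pipe directly.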
timer_stats - timer usage statistics
------------------------------------
timer_stats is a debugging facility to make the timer (ab)usage in a Linux
system visible to kernel and userspace developers. If enabled in the config
but not used it has almost zero runtime overhead, and a relatively small
data structure overhead. Even when collection is enabled at runtime, all the
locking is per-CPU and lookup is hashed.

timer_stats should be used by kernel and userspace developers to verify that
their code does not make undue use of timers. This helps to avoid unnecessary
wakeups and thereby optimize power consumption.

It can be enabled by CONFIG_TIMER_STATS in the "Kernel hacking" configuration
section.
timer_stats collects information about the timer events which are fired in a
Linux system over a sample period:

- the pid of the task (process) which initialized the timer
- the name of the process which initialized the timer
- the function where the timer was initialized
- the callback function which is associated with the timer
- the number of events (callbacks)

timer_stats adds an entry to /proc: /proc/timer_stats
This entry is used to control the statistics functionality and to read out the
sampled information.
The timer_stats functionality is inactive on bootup.

To activate a sample period, issue:
# echo 1 >/proc/timer_stats

To stop a sample period, issue:
# echo 0 >/proc/timer_stats

The statistics can be retrieved by:
# cat /proc/timer_stats

While sampling is enabled, each readout from /proc/timer_stats will see
newly updated statistics. Once sampling is disabled, the sampled information
is kept until a new sample period is started. This allows multiple readouts.
Sample output of /proc/timer_stats:
Timerstats sample period: 3.888770 s
12, 0 swapper hrtimer_stop_sched_tick (hrtimer_sched_tick)
15, 1 swapper hcd_submit_urb (rh_timer_func)
4, 959 kedac schedule_timeout (process_timeout)
1, 0 swapper page_writeback_init (wb_timer_fn)
28, 0 swapper hrtimer_stop_sched_tick (hrtimer_sched_tick)
22, 2948 IRQ 4 tty_flip_buffer_push (delayed_work_timer_fn)
3, 3100 bash schedule_timeout (process_timeout)
1, 1 swapper queue_delayed_work_on (delayed_work_timer_fn)
1, 1 swapper queue_delayed_work_on (delayed_work_timer_fn)
1, 1 swapper neigh_table_init_no_netlink (neigh_periodic_timer)
1, 2292 ip __netdev_watchdog_up (dev_watchdog)
1, 23 events/1 do_cache_clean (delayed_work_timer_fn)
90 total events, 30.0 events/sec
The first column is the number of events, the second column the pid, and the
third column the name of the process. The fourth column shows the function
which initialized the timer and, in parentheses, the callback function which
was executed on expiry.
Thomas, Ingo
Added a flag to indicate a 'deferrable timer' in /proc/timer_stats. A
deferrable timer will appear as follows:
10D, 1 swapper queue_delayed_work_on (delayed_work_timer_fn)
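
For context on what is being removed, here is a minimal userspace sketch of
the sample-period protocol documented above. It assumes a pre-4.11 kernel
built with CONFIG_TIMER_STATS=y; the two-byte control writes ("1\n" / "0\n")
match the count != 2 check in tstats_write() further below.

/*
 * Sketch: drive one /proc/timer_stats sample period and dump the result.
 * Works only where CONFIG_TIMER_STATS exists, i.e. before this commit.
 */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static void ctl(const char *val)
{
	FILE *f = fopen("/proc/timer_stats", "w");

	if (!f) {
		perror("/proc/timer_stats");
		exit(1);
	}
	fputs(val, f);		/* "1\n" starts, "0\n" stops */
	fclose(f);
}

int main(void)
{
	char line[256];
	FILE *f;

	ctl("1\n");		/* start a sample period */
	sleep(10);		/* collect for ten seconds */
	ctl("0\n");		/* stop; data is kept until the next start */

	f = fopen("/proc/timer_stats", "r");
	if (!f) {
		perror("/proc/timer_stats");
		return 1;
	}
	while (fgets(line, sizeof(line), f))
		fputs(line, stdout);	/* dump the sampled statistics */
	fclose(f);
	return 0;
}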
@@ -88,12 +88,6 @@ enum hrtimer_restart {
* @base: pointer to the timer base (per cpu and per clock)
* @state: state information (See bit values above)
* @is_rel: Set if the timer was armed relative
* @start_pid: timer statistics field to store the pid of the task which
* started the timer
* @start_site: timer statistics field to store the site where the timer
* was started
* @start_comm: timer statistics field to store the name of the process which
* started the timer
*
* The hrtimer structure must be initialized by hrtimer_init()
*/
@@ -104,11 +98,6 @@ struct hrtimer {
struct hrtimer_clock_base *base;
u8 state;
u8 is_rel;
#ifdef CONFIG_TIMER_STATS
int start_pid;
void *start_site;
char start_comm[16];
#endif
};
/**
......
@@ -20,11 +20,6 @@ struct timer_list {
unsigned long data;
u32 flags;
#ifdef CONFIG_TIMER_STATS
int start_pid;
void *start_site;
char start_comm[16];
#endif
#ifdef CONFIG_LOCKDEP
struct lockdep_map lockdep_map;
#endif
@@ -197,46 +192,6 @@ extern int mod_timer_pending(struct timer_list *timer, unsigned long expires);
*/
#define NEXT_TIMER_MAX_DELTA ((1UL << 30) - 1)
/*
* Timer-statistics info:
*/
#ifdef CONFIG_TIMER_STATS
extern int timer_stats_active;
extern void init_timer_stats(void);
extern void timer_stats_update_stats(void *timer, pid_t pid, void *startf,
void *timerf, char *comm, u32 flags);
extern void __timer_stats_timer_set_start_info(struct timer_list *timer,
void *addr);
static inline void timer_stats_timer_set_start_info(struct timer_list *timer)
{
if (likely(!timer_stats_active))
return;
__timer_stats_timer_set_start_info(timer, __builtin_return_address(0));
}
static inline void timer_stats_timer_clear_start_info(struct timer_list *timer)
{
timer->start_site = NULL;
}
#else
static inline void init_timer_stats(void)
{
}
static inline void timer_stats_timer_set_start_info(struct timer_list *timer)
{
}
static inline void timer_stats_timer_clear_start_info(struct timer_list *timer)
{
}
#endif
extern void add_timer(struct timer_list *timer);
extern int try_to_del_timer_sync(struct timer_list *timer);
......
@@ -850,7 +850,6 @@ void __kthread_queue_delayed_work(struct kthread_worker *worker,
list_add(&work->node, &worker->delayed_work_list);
work->worker = worker;
timer_stats_timer_set_start_info(&dwork->timer);
timer->expires = jiffies + delay;
add_timer(timer);
}
......
@@ -15,6 +15,5 @@ ifeq ($(CONFIG_GENERIC_CLOCKEVENTS_BROADCAST),y)
endif
obj-$(CONFIG_GENERIC_SCHED_CLOCK) += sched_clock.o
obj-$(CONFIG_TICK_ONESHOT) += tick-oneshot.o tick-sched.o
obj-$(CONFIG_TIMER_STATS) += timer_stats.o
obj-$(CONFIG_DEBUG_FS) += timekeeping_debug.o
obj-$(CONFIG_TEST_UDELAY) += test_udelay.o
@@ -766,34 +766,6 @@ void hrtimers_resume(void)
clock_was_set_delayed();
}
static inline void timer_stats_hrtimer_set_start_info(struct hrtimer *timer)
{
#ifdef CONFIG_TIMER_STATS
if (timer->start_site)
return;
timer->start_site = __builtin_return_address(0);
memcpy(timer->start_comm, current->comm, TASK_COMM_LEN);
timer->start_pid = current->pid;
#endif
}
static inline void timer_stats_hrtimer_clear_start_info(struct hrtimer *timer)
{
#ifdef CONFIG_TIMER_STATS
timer->start_site = NULL;
#endif
}
static inline void timer_stats_account_hrtimer(struct hrtimer *timer)
{
#ifdef CONFIG_TIMER_STATS
if (likely(!timer_stats_active))
return;
timer_stats_update_stats(timer, timer->start_pid, timer->start_site,
timer->function, timer->start_comm, 0);
#endif
}
/*
* Counterpart to lock_hrtimer_base above:
*/
@@ -932,7 +904,6 @@ remove_hrtimer(struct hrtimer *timer, struct hrtimer_clock_base *base, bool rest
* rare case and less expensive than a smp call.
*/
debug_deactivate(timer);
timer_stats_hrtimer_clear_start_info(timer);
reprogram = base->cpu_base == this_cpu_ptr(&hrtimer_bases);
if (!restart)
@@ -990,8 +961,6 @@ void hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
/* Switch the timer base, if necessary: */
new_base = switch_hrtimer_base(timer, base, mode & HRTIMER_MODE_PINNED);
timer_stats_hrtimer_set_start_info(timer);
leftmost = enqueue_hrtimer(timer, new_base);
if (!leftmost)
goto unlock;
@@ -1128,12 +1097,6 @@ static void __hrtimer_init(struct hrtimer *timer, clockid_t clock_id,
base = hrtimer_clockid_to_base(clock_id);
timer->base = &cpu_base->clock_base[base];
timerqueue_init(&timer->node);
#ifdef CONFIG_TIMER_STATS
timer->start_site = NULL;
timer->start_pid = -1;
memset(timer->start_comm, 0, TASK_COMM_LEN);
#endif
}
/**
@@ -1217,7 +1180,6 @@ static void __run_hrtimer(struct hrtimer_cpu_base *cpu_base,
raw_write_seqcount_barrier(&cpu_base->seq);
__remove_hrtimer(timer, base, HRTIMER_STATE_INACTIVE, 0);
timer_stats_account_hrtimer(timer);
fn = timer->function;
/*
......
@@ -571,38 +571,6 @@ internal_add_timer(struct timer_base *base, struct timer_list *timer)
trigger_dyntick_cpu(base, timer);
}
#ifdef CONFIG_TIMER_STATS
void __timer_stats_timer_set_start_info(struct timer_list *timer, void *addr)
{
if (timer->start_site)
return;
timer->start_site = addr;
memcpy(timer->start_comm, current->comm, TASK_COMM_LEN);
timer->start_pid = current->pid;
}
static void timer_stats_account_timer(struct timer_list *timer)
{
void *site;
/*
* start_site can be concurrently reset by
* timer_stats_timer_clear_start_info()
*/
site = READ_ONCE(timer->start_site);
if (likely(!site))
return;
timer_stats_update_stats(timer, timer->start_pid, site,
timer->function, timer->start_comm,
timer->flags);
}
#else
static void timer_stats_account_timer(struct timer_list *timer) {}
#endif
#ifdef CONFIG_DEBUG_OBJECTS_TIMERS
static struct debug_obj_descr timer_debug_descr;
@@ -789,11 +757,6 @@ static void do_init_timer(struct timer_list *timer, unsigned int flags,
{
timer->entry.pprev = NULL;
timer->flags = flags | raw_smp_processor_id();
#ifdef CONFIG_TIMER_STATS
timer->start_site = NULL;
timer->start_pid = -1;
memset(timer->start_comm, 0, TASK_COMM_LEN);
#endif
lockdep_init_map(&timer->lockdep_map, name, key, 0);
}
@@ -1001,8 +964,6 @@ __mod_timer(struct timer_list *timer, unsigned long expires, bool pending_only)
base = lock_timer_base(timer, &flags);
}
timer_stats_timer_set_start_info(timer);
ret = detach_if_pending(timer, base, false);
if (!ret && pending_only)
goto out_unlock;
@@ -1130,7 +1091,6 @@ void add_timer_on(struct timer_list *timer, int cpu)
struct timer_base *new_base, *base;
unsigned long flags;
timer_stats_timer_set_start_info(timer);
BUG_ON(timer_pending(timer) || !timer->function);
new_base = get_timer_cpu_base(timer->flags, cpu);
@@ -1176,7 +1136,6 @@ int del_timer(struct timer_list *timer)
debug_assert_init(timer);
timer_stats_timer_clear_start_info(timer);
if (timer_pending(timer)) {
base = lock_timer_base(timer, &flags);
ret = detach_if_pending(timer, base, true);
@@ -1204,10 +1163,9 @@ int try_to_del_timer_sync(struct timer_list *timer)
base = lock_timer_base(timer, &flags);
if (base->running_timer != timer) {
timer_stats_timer_clear_start_info(timer);
ret = detach_if_pending(timer, base, true);
}
spin_unlock_irqrestore(&base->lock, flags);
return ret;
@@ -1331,7 +1289,6 @@ static void expire_timers(struct timer_base *base, struct hlist_head *head)
unsigned long data;
timer = hlist_entry(head->first, struct timer_list, entry);
timer_stats_account_timer(timer);
base->running_timer = timer;
detach_timer(timer, true);
@@ -1868,7 +1825,6 @@ static void __init init_timer_cpus(void)
void __init init_timers(void)
{
init_timer_cpus();
init_timer_stats();
open_softirq(TIMER_SOFTIRQ, run_timer_softirq);
}
......
@@ -62,21 +62,11 @@ static void
print_timer(struct seq_file *m, struct hrtimer *taddr, struct hrtimer *timer,
int idx, u64 now)
{
#ifdef CONFIG_TIMER_STATS
char tmp[TASK_COMM_LEN + 1];
#endif
SEQ_printf(m, " #%d: ", idx); SEQ_printf(m, " #%d: ", idx);
print_name_offset(m, taddr); print_name_offset(m, taddr);
SEQ_printf(m, ", "); SEQ_printf(m, ", ");
print_name_offset(m, timer->function); print_name_offset(m, timer->function);
SEQ_printf(m, ", S:%02x", timer->state); SEQ_printf(m, ", S:%02x", timer->state);
#ifdef CONFIG_TIMER_STATS
SEQ_printf(m, ", ");
print_name_offset(m, timer->start_site);
memcpy(tmp, timer->start_comm, TASK_COMM_LEN);
tmp[TASK_COMM_LEN] = 0;
SEQ_printf(m, ", %s/%d", tmp, timer->start_pid);
#endif
SEQ_printf(m, "\n"); SEQ_printf(m, "\n");
SEQ_printf(m, " # expires at %Lu-%Lu nsecs [in %Ld to %Ld nsecs]\n", SEQ_printf(m, " # expires at %Lu-%Lu nsecs [in %Ld to %Ld nsecs]\n",
(unsigned long long)ktime_to_ns(hrtimer_get_softexpires(timer)), (unsigned long long)ktime_to_ns(hrtimer_get_softexpires(timer)),
......
/*
* kernel/time/timer_stats.c
*
* Collect timer usage statistics.
*
* Copyright(C) 2006, Red Hat, Inc., Ingo Molnar
* Copyright(C) 2006 Timesys Corp., Thomas Gleixner <tglx@timesys.com>
*
* timer_stats is based on timer_top, a similar functionality which was part of
* Con Kolivas dyntick patch set. It was developed by Daniel Petrini at the
* Instituto Nokia de Tecnologia - INdT - Manaus. timer_top's design was based
* on dynamic allocation of the statistics entries and linear search based
* lookup combined with a global lock, rather than the static array, hash
* and per-CPU locking which is used by timer_stats. It was written for the
* pre hrtimer kernel code and therefore did not take hrtimers into account.
* Nevertheless it provided the base for the timer_stats implementation and
* was a helpful source of inspiration. Kudos to Daniel and the Nokia folks
* for this effort.
*
* timer_top.c is
* Copyright (C) 2005 Instituto Nokia de Tecnologia - INdT - Manaus
* Written by Daniel Petrini <d.pensator@gmail.com>
* timer_top.c was released under the GNU General Public License version 2
*
* We export the addresses and counting of timer functions being called,
* the pid and cmdline from the owner process if applicable.
*
* Start/stop data collection:
* # echo [1|0] >/proc/timer_stats
*
* Display the information collected so far:
* # cat /proc/timer_stats
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#include <linux/proc_fs.h>
#include <linux/module.h>
#include <linux/spinlock.h>
#include <linux/sched.h>
#include <linux/seq_file.h>
#include <linux/kallsyms.h>
#include <linux/uaccess.h>
/*
* This is our basic unit of interest: a timer expiry event identified
* by the timer, its start/expire functions and the PID of the task that
* started the timer. We count the number of times an event happens:
*/
struct entry {
/*
* Hash list:
*/
struct entry *next;
/*
* Hash keys:
*/
void *timer;
void *start_func;
void *expire_func;
pid_t pid;
/*
* Number of timeout events:
*/
unsigned long count;
u32 flags;
/*
* We save the command-line string to preserve
* this information past task exit:
*/
char comm[TASK_COMM_LEN + 1];
} ____cacheline_aligned_in_smp;
/*
* Spinlock protecting the tables - not taken during lookup:
*/
static DEFINE_RAW_SPINLOCK(table_lock);
/*
* Per-CPU lookup locks for fast hash lookup:
*/
static DEFINE_PER_CPU(raw_spinlock_t, tstats_lookup_lock);
/*
* Mutex to serialize state changes with show-stats activities:
*/
static DEFINE_MUTEX(show_mutex);
/*
* Collection status, active/inactive:
*/
int __read_mostly timer_stats_active;
/*
* Beginning/end timestamps of measurement:
*/
static ktime_t time_start, time_stop;
/*
* tstat entry structs only get allocated while collection is
* active and never freed during that time - this simplifies
* things quite a bit.
*
* They get freed when a new collection period is started.
*/
#define MAX_ENTRIES_BITS 10
#define MAX_ENTRIES (1UL << MAX_ENTRIES_BITS)
static unsigned long nr_entries;
static struct entry entries[MAX_ENTRIES];
static atomic_t overflow_count;
/*
* The entries are in a hash-table, for fast lookup:
*/
#define TSTAT_HASH_BITS (MAX_ENTRIES_BITS - 1)
#define TSTAT_HASH_SIZE (1UL << TSTAT_HASH_BITS)
#define TSTAT_HASH_MASK (TSTAT_HASH_SIZE - 1)
#define __tstat_hashfn(entry) \
(((unsigned long)(entry)->timer ^ \
(unsigned long)(entry)->start_func ^ \
(unsigned long)(entry)->expire_func ^ \
(unsigned long)(entry)->pid ) & TSTAT_HASH_MASK)
#define tstat_hashentry(entry) (tstat_hash_table + __tstat_hashfn(entry))
static struct entry *tstat_hash_table[TSTAT_HASH_SIZE] __read_mostly;
static void reset_entries(void)
{
nr_entries = 0;
memset(entries, 0, sizeof(entries));
memset(tstat_hash_table, 0, sizeof(tstat_hash_table));
atomic_set(&overflow_count, 0);
}
static struct entry *alloc_entry(void)
{
if (nr_entries >= MAX_ENTRIES)
return NULL;
return entries + nr_entries++;
}
static int match_entries(struct entry *entry1, struct entry *entry2)
{
return entry1->timer == entry2->timer &&
entry1->start_func == entry2->start_func &&
entry1->expire_func == entry2->expire_func &&
entry1->pid == entry2->pid;
}
/*
* Look up whether an entry matching this item is present
* in the hash already. Must be called with irqs off and the
* lookup lock held:
*/
static struct entry *tstat_lookup(struct entry *entry, char *comm)
{
struct entry **head, *curr, *prev;
head = tstat_hashentry(entry);
curr = *head;
/*
* The fastpath is when the entry is already hashed,
* we do this with the lookup lock held, but with the
* table lock not held:
*/
while (curr) {
if (match_entries(curr, entry))
return curr;
curr = curr->next;
}
/*
* Slowpath: allocate, set up and link a new hash entry:
*/
prev = NULL;
curr = *head;
raw_spin_lock(&table_lock);
/*
* Make sure we have not raced with another CPU:
*/
while (curr) {
if (match_entries(curr, entry))
goto out_unlock;
prev = curr;
curr = curr->next;
}
curr = alloc_entry();
if (curr) {
*curr = *entry;
curr->count = 0;
curr->next = NULL;
memcpy(curr->comm, comm, TASK_COMM_LEN);
smp_mb(); /* Ensure that curr is initialized before insert */
if (prev)
prev->next = curr;
else
*head = curr;
}
out_unlock:
raw_spin_unlock(&table_lock);
return curr;
}
/**
* timer_stats_update_stats - Update the statistics for a timer.
* @timer: pointer to either a timer_list or a hrtimer
* @pid: the pid of the task which set up the timer
* @startf: pointer to the function which did the timer setup
* @timerf: pointer to the timer callback function of the timer
* @comm: name of the process which set up the timer
* @tflags: The flags field of the timer
*
* When the timer is already registered, then the event counter is
* incremented. Otherwise the timer is registered in a free slot.
*/
void timer_stats_update_stats(void *timer, pid_t pid, void *startf,
void *timerf, char *comm, u32 tflags)
{
/*
* It doesn't matter which lock we take:
*/
raw_spinlock_t *lock;
struct entry *entry, input;
unsigned long flags;
if (likely(!timer_stats_active))
return;
lock = &per_cpu(tstats_lookup_lock, raw_smp_processor_id());
input.timer = timer;
input.start_func = startf;
input.expire_func = timerf;
input.pid = pid;
input.flags = tflags;
raw_spin_lock_irqsave(lock, flags);
if (!timer_stats_active)
goto out_unlock;
entry = tstat_lookup(&input, comm);
if (likely(entry))
entry->count++;
else
atomic_inc(&overflow_count);
out_unlock:
raw_spin_unlock_irqrestore(lock, flags);
}
static void print_name_offset(struct seq_file *m, unsigned long addr)
{
char symname[KSYM_NAME_LEN];
if (lookup_symbol_name(addr, symname) < 0)
seq_printf(m, "<%p>", (void *)addr);
else
seq_printf(m, "%s", symname);
}
static int tstats_show(struct seq_file *m, void *v)
{
struct timespec64 period;
struct entry *entry;
unsigned long ms;
long events = 0;
ktime_t time;
int i;
mutex_lock(&show_mutex);
/*
* If still active then calculate up to now:
*/
if (timer_stats_active)
time_stop = ktime_get();
time = ktime_sub(time_stop, time_start);
period = ktime_to_timespec64(time);
ms = period.tv_nsec / 1000000;
seq_puts(m, "Timer Stats Version: v0.3\n");
seq_printf(m, "Sample period: %ld.%03ld s\n", (long)period.tv_sec, ms);
if (atomic_read(&overflow_count))
seq_printf(m, "Overflow: %d entries\n", atomic_read(&overflow_count));
seq_printf(m, "Collection: %s\n", timer_stats_active ? "active" : "inactive");
for (i = 0; i < nr_entries; i++) {
entry = entries + i;
if (entry->flags & TIMER_DEFERRABLE) {
seq_printf(m, "%4luD, %5d %-16s ",
entry->count, entry->pid, entry->comm);
} else {
seq_printf(m, " %4lu, %5d %-16s ",
entry->count, entry->pid, entry->comm);
}
print_name_offset(m, (unsigned long)entry->start_func);
seq_puts(m, " (");
print_name_offset(m, (unsigned long)entry->expire_func);
seq_puts(m, ")\n");
events += entry->count;
}
ms += period.tv_sec * 1000;
if (!ms)
ms = 1;
if (events && period.tv_sec)
seq_printf(m, "%ld total events, %ld.%03ld events/sec\n",
events, events * 1000 / ms,
(events * 1000000 / ms) % 1000);
else
seq_printf(m, "%ld total events\n", events);
mutex_unlock(&show_mutex);
return 0;
}
/*
* After a state change, make sure all concurrent lookup/update
* activities have stopped:
*/
static void sync_access(void)
{
unsigned long flags;
int cpu;
for_each_online_cpu(cpu) {
raw_spinlock_t *lock = &per_cpu(tstats_lookup_lock, cpu);
raw_spin_lock_irqsave(lock, flags);
/* nothing */
raw_spin_unlock_irqrestore(lock, flags);
}
}
static ssize_t tstats_write(struct file *file, const char __user *buf,
size_t count, loff_t *offs)
{
char ctl[2];
if (count != 2 || *offs)
return -EINVAL;
if (copy_from_user(ctl, buf, count))
return -EFAULT;
mutex_lock(&show_mutex);
switch (ctl[0]) {
case '0':
if (timer_stats_active) {
timer_stats_active = 0;
time_stop = ktime_get();
sync_access();
}
break;
case '1':
if (!timer_stats_active) {
reset_entries();
time_start = ktime_get();
smp_mb();
timer_stats_active = 1;
}
break;
default:
count = -EINVAL;
}
mutex_unlock(&show_mutex);
return count;
}
static int tstats_open(struct inode *inode, struct file *filp)
{
return single_open(filp, tstats_show, NULL);
}
static const struct file_operations tstats_fops = {
.open = tstats_open,
.read = seq_read,
.write = tstats_write,
.llseek = seq_lseek,
.release = single_release,
};
void __init init_timer_stats(void)
{
int cpu;
for_each_possible_cpu(cpu)
raw_spin_lock_init(&per_cpu(tstats_lookup_lock, cpu));
}
static int __init init_tstats_procfs(void)
{
struct proc_dir_entry *pe;
pe = proc_create("timer_stats", 0644, NULL, &tstats_fops);
if (!pe)
return -ENOMEM;
return 0;
}
__initcall(init_tstats_procfs);
@@ -1523,8 +1523,6 @@ static void __queue_delayed_work(int cpu, struct workqueue_struct *wq,
return;
}
timer_stats_timer_set_start_info(&dwork->timer);
dwork->wq = wq;
dwork->cpu = cpu;
timer->expires = jiffies + delay;
......
@@ -980,20 +980,6 @@ config DEBUG_TIMEKEEPING
If unsure, say N.
config TIMER_STATS
bool "Collect kernel timers statistics"
depends on DEBUG_KERNEL && PROC_FS
help
If you say Y here, additional code will be inserted into the
timer routines to collect statistics about kernel timers being
reprogrammed. The statistics can be read from /proc/timer_stats.
The statistics collection is started by writing 1 to /proc/timer_stats,
writing 0 stops it. This feature is useful to collect information
about timer usage patterns in kernel and userspace. This feature
is lightweight if enabled in the kernel config but not activated
(it defaults to deactivated on bootup and will only be activated
if some application like powertop activates it explicitly).
config DEBUG_PREEMPT
bool "Debug preemptible kernel"
depends on DEBUG_KERNEL && PREEMPT && TRACE_IRQFLAGS_SUPPORT
......