- 14 4月, 2009 1 次提交
-
-
由 Tom Zanussi 提交于
Add a new config option, CONFIG_EVENT_TRACING that gets selected when CONFIG_TRACING is selected and adds everything needed by the stuff in trace_export - basically all the event tracing support needed by e.g. bprint, minus the actual events, which are only included if CONFIG_EVENT_TRACER is selected. So CONFIG_EVENT_TRACER can be used to turn on or off the generated events (what I think of as the 'event tracer'), while CONFIG_EVENT_TRACING turns on or off the base event tracing support used by both the event tracer and the other things such as bprint that can't be configured out. Signed-off-by: NTom Zanussi <tzanussi@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: fweisbec@gmail.com LKML-Reference: <1239178441.10295.34.camel@tropicana> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 25 3月, 2009 1 次提交
-
-
由 Jason Baron 提交于
This patch combines Greg Bank's dprintk() work with the existing dynamic printk patchset, we are now calling it 'dynamic debug'. The new feature of this patchset is a richer /debugfs control file interface, (an example output from my system is at the bottom), which allows fined grained control over the the debug output. The output can be controlled by function, file, module, format string, and line number. for example, enabled all debug messages in module 'nf_conntrack': echo -n 'module nf_conntrack +p' > /mnt/debugfs/dynamic_debug/control to disable them: echo -n 'module nf_conntrack -p' > /mnt/debugfs/dynamic_debug/control A further explanation can be found in the documentation patch. Signed-off-by: NGreg Banks <gnb@sgi.com> Signed-off-by: NJason Baron <jbaron@redhat.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
- 13 3月, 2009 1 次提交
-
-
由 Frederic Weisbecker 提交于
Impact: new feature This adds the generic support for syscalls tracing. This is currently exploited through a devoted tracer but other tracing engines can use it. (They just have to play with {start,stop}_ftrace_syscalls() and use the display callbacks unless they want to override them.) The syscalls prototypes definitions are abused here to steal some metadata informations: - syscall name, param types, param names, number of params The syscall addr is not directly saved during this definition because we don't know if its prototype is available in the namespace. But we don't really need it. The arch has just to build a function able to resolve the syscall number to its metadata struct. The current tracer prints the syscall names, parameters names and values (and their types optionally). Currently the value is a raw hex but higher level values diplaying is on my TODO list. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1236955332-10133-2-git-send-email-fweisbec@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 09 3月, 2009 1 次提交
-
-
由 Ingo Molnar 提交于
Impact: fix kernel crash when using trace_printk() trace_printk_fmt section is defined into the readonly section. But we do: trace_printk_fmt = fmt; to fill in that table of format strings - which is not read-only. Under CONFIG_DEBUG_RODATA=y this crashes ... Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <1236356510-8381-5-git-send-email-fweisbec@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 07 3月, 2009 1 次提交
-
-
由 Lai Jiangshan 提交于
Impact: add a generic printk() for tracing, like trace_printk() trace_bprintk() uses the infrastructure to record events on ring_buffer. [ fweisbec@gmail.com: ported to latest -tip, made it work if !CONFIG_MODULES, never free the format strings from modules because we can't keep track of them and conditionnaly create the ftrace format strings section (reported by Steven Rostedt) ] Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Acked-by: NSteven Rostedt <rostedt@goodmis.org> LKML-Reference: <1236356510-8381-4-git-send-email-fweisbec@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 25 2月, 2009 1 次提交
-
-
由 Steven Rostedt 提交于
This patch creates the event tracing infrastructure of ftrace. It will create the files: /debug/tracing/available_events /debug/tracing/set_event The available_events will list the trace points that have been registered with the event tracer. set_events will allow the user to enable or disable an event hook. example: # echo sched_wakeup > /debug/tracing/set_event Will enable the sched_wakeup event (if it is registered). # echo "!sched_wakeup" >> /debug/tracing/set_event Will disable the sched_wakeup event (and only that event). # echo > /debug/tracing/set_event Will disable all events (notice the '>') # cat /debug/tracing/available_events > /debug/tracing/set_event Will enable all registered event hooks. Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
-
- 31 1月, 2009 1 次提交
-
-
由 Tejun Heo 提交于
Impact: fix linker screwup on x86_32 Recent x86_64 zerobased patches introduced PERCPU_VADDR() to put .data.percpu to a predefined address and re-defined PERCPU() in terms of it. The new macro defined one extra symbol, __per_cpu_load, for LMA of the section so that the init data could be accessed. This new symbol introduced the following problems to x86_32. 1. If __per_cpu_load is defined outside of .data.percpu as an absolute symbol, relocation generation for relocatable kernel fails due to absolute relocation. 2. If __per_cpu_load is put inside .data.percpu with absolute address assignment to work around #1, linker gets confused and under certain configurations ends up relocating the symbol against .data.percpu such that the load address gets added on top of already set load address. As x86_32 doesn't use predefined address for .data.percpu, there's no need for it to care about the possibility of __per_cpu_load being different from __per_cpu_start. This patch defines PERCPU() separately so that __per_cpu_load is defined inside .data.percpu so that everything is ordinary linking-wise. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 30 1月, 2009 1 次提交
-
-
由 Ingo Molnar 提交于
This reverts commit 5a611268. It is causing occasional boot crashes, caused by certain linker versions (GNU ld version 2.18.50.0.6-2 20080403) messing up: 82dcc000 D __per_cpu_load c16e6000 A __per_cpu_load_abs The __per_cpu_load value is out of whack. Hpa noticed the following detail: * (gdb) p/x -(0xc16e6000-0x82dcc000) * $2 = 0xc16e6000 * I.e. one is the other << 1 The two symbols should be equal. Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 26 1月, 2009 1 次提交
-
-
由 Brian Gerst 提交于
This patch fixes this linker error: WARNING: Absolute relocations present Offset Info Type Sym.Value Sym.Name c0a4e07d 00e78001 R_386_32 c0ab0000 __per_cpu_load Now, __per_cpu_load is a section-relative symbol: c0aa4000 D __per_cpu_load c0aa4000 A __per_cpu_load_abs Signed-off-by: NBrian Gerst <brgerst@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 20 1月, 2009 2 次提交
-
-
由 Tejun Heo 提交于
Impact: cleanup With .data.percpu.first in place, PERCPU_VADDR_PREALLOC() is no longer necessary. Kill it. Signed-off-by: NTejun Heo <tj@kernel.org>
-
由 Brian Gerst 提交于
Impact: cleanup Refactor the DEFINE_PER_CPU_* macros and add .data.percpu.first section. Signed-off-by: NBrian Gerst <brgerst@gmail.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
- 17 1月, 2009 1 次提交
-
-
由 Tejun Heo 提交于
The newly added PERCPU_*() macros define and use __per_cpu_load but VMLINUX_SYMBOL() was missing from usages causing build failures on archs where linker visible symbol is different from C symbols (e.g. blackfin). Signed-off-by: NTejun Heo <tj@kernel.org>
-
- 16 1月, 2009 2 次提交
-
-
由 Tejun Heo 提交于
[ Based on original patch from Christoph Lameter and Mike Travis. ] Currently pdas and percpu areas are allocated separately. %gs points to local pda and percpu area can be reached using pda->data_offset. This patch folds pda into percpu area. Due to strange gcc requirement, pda needs to be at the beginning of the percpu area so that pda->stack_canary is at %gs:40. To achieve this, a new percpu output section macro - PERCPU_VADDR_PREALLOC() - is added and used to reserve pda sized chunk at the start of the percpu area. After this change, for boot cpu, %gs first points to pda in the data.init area and later during setup_per_cpu_areas() gets updated to point to the actual pda. This means that setup_per_cpu_areas() need to reload %gs for CPU0 while clearing pda area for other cpus as cpu0 already has modified it when control reaches setup_per_cpu_areas(). This patch also removes now unnecessary get_local_pda() and its call sites. A lot of this patch is taken from Mike Travis' "x86_64: Fold pda into per cpu area" patch. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Tejun Heo 提交于
[ Based on original patch from Christoph Lameter and Mike Travis. ] This patch makes percpu symbols zerobased on x86_64 SMP by adding PERCPU_VADDR() to vmlinux.lds.h which helps setting explicit vaddr on the percpu output section and using it in vmlinux_64.lds.S. A new PHDR is added as existing ones cannot contain sections near address zero. PERCPU_VADDR() also adds a new symbol __per_cpu_load which always points to the vaddr of the loaded percpu data.init region. The following adjustments have been made to accomodate the address change. * code to locate percpu gdt_page in head_64.S is updated to add the load address to the gdt_page offset. * __per_cpu_load is used in places where access to the init data area is necessary. * pda->data_offset is initialized soon after C code is entered as zero value doesn't work anymore. This patch is mostly taken from Mike Travis' "x86_64: Base percpu variables at zero" patch. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 12 12月, 2008 1 次提交
-
-
由 Frederic Weisbecker 提交于
Impact: let the function-graph-tracer be aware of the irq entrypoints Add a new .irqentry.text section to store the irq entrypoints functions inside the same section. This way, the tracer will be able to signal an interrupts triggering on output by recognizing these entrypoints. Also, make this section recordable for dynamic tracing. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 23 11月, 2008 2 次提交
-
-
由 Steven Rostedt 提交于
Impact: feature to profile if statements This patch adds a branch profiler for all if () statements. The results will be found in: /debugfs/tracing/profile_branch For example: miss hit % Function File Line ------- --------- - -------- ---- ---- 0 1 100 x86_64_start_reservations head64.c 127 0 1 100 copy_bootdata head64.c 69 1 0 0 x86_64_start_kernel head64.c 111 32 0 0 set_intr_gate desc.h 319 1 0 0 reserve_ebda_region head.c 51 1 0 0 reserve_ebda_region head.c 47 0 1 100 reserve_ebda_region head.c 42 0 0 X maxcpus main.c 165 Miss means the branch was not taken. Hit means the branch was taken. The percent is the percentage the branch was taken. This adds a significant amount of overhead and should only be used by those analyzing their system. Signed-off-by: NSteven Rostedt <srostedt@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Steven Rostedt 提交于
Impact: clean up to make one profiler of like and unlikely tracer The likely and unlikely profiler prints out the file and line numbers of the annotated branches that it is profiling. It shows the number of times it was correct or incorrect in its guess. Having two different files or sections for that matter to tell us if it was a likely or unlikely is pretty pointless. We really only care if it was correct or not. Signed-off-by: NSteven Rostedt <srostedt@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 16 11月, 2008 1 次提交
-
-
由 Mathieu Desnoyers 提交于
Impact: API *CHANGE*. Must update all tracepoint users. Add DEFINE_TRACE() to tracepoints to let them declare the tracepoint structure in a single spot for all the kernel. It helps reducing memory consumption, especially when declaring a lot of tracepoints, e.g. for kmalloc tracing. *API CHANGE WARNING*: now, DECLARE_TRACE() must be used in headers for tracepoint declarations rather than DEFINE_TRACE(). This is the sane way to do it. The name previously used was misleading. Updates scheduler instrumentation to follow this API change. Signed-off-by: NMathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 13 11月, 2008 1 次提交
-
-
由 Steven Rostedt 提交于
Impact: name change of unlikely tracer and profiler Ingo Molnar suggested changing the config from UNLIKELY_PROFILE to BRANCH_PROFILING. I never did like the "unlikely" name so I went one step farther, and renamed all the unlikely configurations to a "BRANCH" variant. Signed-off-by: NSteven Rostedt <srostedt@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 12 11月, 2008 1 次提交
-
-
由 Steven Rostedt 提交于
Impact: new unlikely/likely profiler Andrew Morton recently suggested having an in-kernel way to profile likely and unlikely macros. This patch achieves that goal. When configured, every(*) likely and unlikely macro gets a counter attached to it. When the condition is hit, the hit and misses of that condition are recorded. These numbers can later be retrieved by: /debugfs/tracing/profile_likely - All likely markers /debugfs/tracing/profile_unlikely - All unlikely markers. # cat /debug/tracing/profile_unlikely | head correct incorrect % Function File Line ------- --------- - -------- ---- ---- 2167 0 0 do_arch_prctl process_64.c 832 0 0 0 do_arch_prctl process_64.c 804 2670 0 0 IS_ERR err.h 34 71230 5693 7 __switch_to process_64.c 673 76919 0 0 __switch_to process_64.c 639 43184 33743 43 __switch_to process_64.c 624 12740 64181 83 __switch_to process_64.c 594 12740 64174 83 __switch_to process_64.c 590 # cat /debug/tracing/profile_unlikely | \ awk '{ if ($3 > 25) print $0; }' |head -20 44963 35259 43 __switch_to process_64.c 624 12762 67454 84 __switch_to process_64.c 594 12762 67447 84 __switch_to process_64.c 590 1478 595 28 syscall_get_error syscall.h 51 0 2821 100 syscall_trace_leave ptrace.c 1567 0 1 100 native_smp_prepare_cpus smpboot.c 1237 86338 265881 75 calc_delta_fair sched_fair.c 408 210410 108540 34 calc_delta_mine sched.c 1267 0 54550 100 sched_info_queued sched_stats.h 222 51899 66435 56 pick_next_task_fair sched_fair.c 1422 6 10 62 yield_task_fair sched_fair.c 982 7325 2692 26 rt_policy sched.c 144 0 1270 100 pre_schedule_rt sched_rt.c 1261 1268 48073 97 pick_next_task_rt sched_rt.c 884 0 45181 100 sched_info_dequeued sched_stats.h 177 0 15 100 sched_move_task sched.c 8700 0 15 100 sched_move_task sched.c 8690 53167 33217 38 schedule sched.c 4457 0 80208 100 sched_info_switch sched_stats.h 270 30585 49631 61 context_switch sched.c 2619 # cat /debug/tracing/profile_likely | awk '{ if ($3 > 25) print $0; }' 39900 36577 47 pick_next_task sched.c 4397 20824 15233 42 switch_mm mmu_context_64.h 18 0 7 100 __cancel_work_timer workqueue.c 560 617 66484 99 clocksource_adjust timekeeping.c 456 0 346340 100 audit_syscall_exit auditsc.c 1570 38 347350 99 audit_get_context auditsc.c 732 0 345244 100 audit_syscall_entry auditsc.c 1541 38 1017 96 audit_free auditsc.c 1446 0 1090 100 audit_alloc auditsc.c 862 2618 1090 29 audit_alloc auditsc.c 858 0 6 100 move_masked_irq migration.c 9 1 198 99 probe_sched_wakeup trace_sched_switch.c 58 2 2 50 probe_wakeup trace_sched_wakeup.c 227 0 2 100 probe_wakeup_sched_switch trace_sched_wakeup.c 144 4514 2090 31 __grab_cache_page filemap.c 2149 12882 228786 94 mapping_unevictable pagemap.h 50 4 11 73 __flush_cpu_slab slub.c 1466 627757 330451 34 slab_free slub.c 1731 2959 61245 95 dentry_lru_del_init dcache.c 153 946 1217 56 load_elf_binary binfmt_elf.c 904 102 82 44 disk_put_part genhd.h 206 1 1 50 dst_gc_task dst.c 82 0 19 100 tcp_mss_split_point tcp_output.c 1126 As you can see by the above, there's a bit of work to do in rethinking the use of some unlikelys and likelys. Note: the unlikely case had 71 hits that were more than 25%. Note: After submitting my first version of this patch, Andrew Morton showed me a version written by Daniel Walker, where I picked up the following ideas from: 1) Using __builtin_constant_p to avoid profiling fixed values. 2) Using __FILE__ instead of instruction pointers. 3) Using the preprocessor to stop all profiling of likely annotations from vsyscall_64.c. Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar for their feed back on this patch. (*) Not ever unlikely is recorded, those that are used by vsyscalls (a few of them) had to have profiling disabled. Signed-off-by: NSteven Rostedt <srostedt@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Steven Rostedt <srostedt@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 17 10月, 2008 1 次提交
-
-
由 Jason Baron 提交于
Base infrastructure to enable per-module debug messages. I've introduced CONFIG_DYNAMIC_PRINTK_DEBUG, which when enabled centralizes control of debugging statements on a per-module basis in one /proc file, currently, <debugfs>/dynamic_printk/modules. When, CONFIG_DYNAMIC_PRINTK_DEBUG, is not set, debugging statements can still be enabled as before, often by defining 'DEBUG' for the proper compilation unit. Thus, this patch set has no affect when CONFIG_DYNAMIC_PRINTK_DEBUG is not set. The infrastructure currently ties into all pr_debug() and dev_dbg() calls. That is, if CONFIG_DYNAMIC_PRINTK_DEBUG is set, all pr_debug() and dev_dbg() calls can be dynamically enabled/disabled on a per-module basis. Future plans include extending this functionality to subsystems, that define their own debug levels and flags. Usage: Dynamic debugging is controlled by the debugfs file, <debugfs>/dynamic_printk/modules. This file contains a list of the modules that can be enabled. The format of the file is as follows: <module_name> <enabled=0/1> . . . <module_name> : Name of the module in which the debug call resides <enabled=0/1> : whether the messages are enabled or not For example: snd_hda_intel enabled=0 fixup enabled=1 driver enabled=0 Enable a module: $echo "set enabled=1 <module_name>" > dynamic_printk/modules Disable a module: $echo "set enabled=0 <module_name>" > dynamic_printk/modules Enable all modules: $echo "set enabled=1 all" > dynamic_printk/modules Disable all modules: $echo "set enabled=0 all" > dynamic_printk/modules Finally, passing "dynamic_printk" at the command line enables debugging for all modules. This mode can be turned off via the above disable command. [gkh: minor cleanups and tweaks to make the build work quietly] Signed-off-by: NJason Baron <jbaron@redhat.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
- 16 10月, 2008 3 次提交
-
-
由 Thomas Gleixner 提交于
Revert the dynarray changes. They need more thought and polishing. Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Yinghai Lu 提交于
allow dyn-array in per_cpu area, allocated dynamically. usage: | /* in .h */ | struct kernel_stat { | struct cpu_usage_stat cpustat; | unsigned int *irqs; | }; | | /* in .c */ | DEFINE_PER_CPU(struct kernel_stat, kstat); | | DEFINE_PER_CPU_DYN_ARRAY_ADDR(per_cpu__kstat_irqs, per_cpu__kstat.irqs, sizeof(unsigned int), nr_irqs, sizeof(unsigned long), NULL); after setup_percpu()/per_cpu_alloc_dyn_array(), the dyn_array in per_cpu area is ready to use. Signed-off-by: NYinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Yinghai Lu 提交于
Allow crazy big arrays via bootmem at init stage. Architectures use CONFIG_HAVE_DYN_ARRAY to enable it. usage: | static struct irq_desc irq_desc_init __initdata = { | .status = IRQ_DISABLED, | .chip = &no_irq_chip, | .handle_irq = handle_bad_irq, | .depth = 1, | .lock = __SPIN_LOCK_UNLOCKED(irq_desc->lock), | #ifdef CONFIG_SMP | .affinity = CPU_MASK_ALL | #endif | }; | | static void __init init_work(void *data) | { | struct dyn_array *da = data; | struct irq_desc *desc; | int i; | | desc = *da->name; | | for (i = 0; i < *da->nr; i++) | memcpy(&desc[i], &irq_desc_init, sizeof(struct irq_desc)); | } | | struct irq_desc *irq_desc; | DEFINE_DYN_ARRAY(irq_desc, sizeof(struct irq_desc), nr_irqs, PAGE_SIZE, init_work); after pre_alloc_dyn_array() after setup_arch(), the array is ready to be used. Via this facility we can replace irq_desc[NR_IRQS] array with dyn_array irq_desc[nr_irqs]. v2: remove _nopanic in pre_alloc_dyn_array() Signed-off-by: NYinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 14 10月, 2008 2 次提交
-
-
由 Steven Rostedt 提交于
This patch creates a section in the kernel called "__mcount_loc". This will hold a list of pointers to the mcount relocation for each call site of mcount. For example: objdump -dr init/main.o [...] Disassembly of section .text: 0000000000000000 <do_one_initcall>: 0: 55 push %rbp [...] 000000000000017b <init_post>: 17b: 55 push %rbp 17c: 48 89 e5 mov %rsp,%rbp 17f: 53 push %rbx 180: 48 83 ec 08 sub $0x8,%rsp 184: e8 00 00 00 00 callq 189 <init_post+0xe> 185: R_X86_64_PC32 mcount+0xfffffffffffffffc [...] We will add a section to point to each function call. .section __mcount_loc,"a",@progbits [...] .quad .text + 0x185 [...] The offset to of the mcount call site in init_post is an offset from the start of the section, and not the start of the function init_post. The mcount relocation is at the call site 0x185 from the start of the .text section. .text + 0x185 == init_post + 0xa We need a way to add this __mcount_loc section in a way that we do not lose the relocations after final link. The .text section here will be attached to all other .text sections after final link and the offsets will be meaningless. We need to keep track of where these .text sections are. To do this, we use the start of the first function in the section. do_one_initcall. We can make a tmp.s file with this function as a reference to the start of the .text section. .section __mcount_loc,"a",@progbits [...] .quad do_one_initcall + 0x185 [...] Then we can compile the tmp.s into a tmp.o gcc -c tmp.s -o tmp.o And link it into back into main.o. ld -r main.o tmp.o -o tmp_main.o mv tmp_main.o main.o But we have a problem. What happens if the first function in a section is not exported, and is a static function. The linker will not let the tmp.o use it. This case exists in main.o as well. Disassembly of section .init.text: 0000000000000000 <set_reset_devices>: 0: 55 push %rbp 1: 48 89 e5 mov %rsp,%rbp 4: e8 00 00 00 00 callq 9 <set_reset_devices+0x9> 5: R_X86_64_PC32 mcount+0xfffffffffffffffc The first function in .init.text is a static function. 00000000000000a8 t __setup_set_reset_devices 000000000000105f t __setup_str_set_reset_devices 0000000000000000 t set_reset_devices The lowercase 't' means that set_reset_devices is local and is not exported. If we simply try to link the tmp.o with the set_reset_devices we end up with two symbols: one local and one global. .section __mcount_loc,"a",@progbits .quad set_reset_devices + 0x10 00000000000000a8 t __setup_set_reset_devices 000000000000105f t __setup_str_set_reset_devices 0000000000000000 t set_reset_devices U set_reset_devices We still have an undefined reference to set_reset_devices, and if we try to compile the kernel, we will end up with an undefined reference to set_reset_devices, or even worst, it could be exported someplace else, and then we will have a reference to the wrong location. To handle this case, we make an intermediate step using objcopy. We convert set_reset_devices into a global exported symbol before linking it with tmp.o and set it back afterwards. 00000000000000a8 t __setup_set_reset_devices 000000000000105f t __setup_str_set_reset_devices 0000000000000000 T set_reset_devices 00000000000000a8 t __setup_set_reset_devices 000000000000105f t __setup_str_set_reset_devices 0000000000000000 T set_reset_devices 00000000000000a8 t __setup_set_reset_devices 000000000000105f t __setup_str_set_reset_devices 0000000000000000 t set_reset_devices Now we have a section in main.o called __mcount_loc that we can place somewhere in the kernel using vmlinux.ld.S and access it to convert all these locations that call mcount into nops before starting SMP and thus, eliminating the need to do this with kstop_machine. Note, A well documented perl script (scripts/recordmcount.pl) is used to do all this in one location. Signed-off-by: NSteven Rostedt <srostedt@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Mathieu Desnoyers 提交于
Implementation of kernel tracepoints. Inspired from the Linux Kernel Markers. Allows complete typing verification by declaring both tracing statement inline functions and probe registration/unregistration static inline functions within the same macro "DEFINE_TRACE". No format string is required. See the tracepoint Documentation and Samples patches for usage examples. Taken from the documentation patch : "A tracepoint placed in code provides a hook to call a function (probe) that you can provide at runtime. A tracepoint can be "on" (a probe is connected to it) or "off" (no probe is attached). When a tracepoint is "off" it has no effect, except for adding a tiny time penalty (checking a condition for a branch) and space penalty (adding a few bytes for the function call at the end of the instrumented function and adds a data structure in a separate section). When a tracepoint is "on", the function you provide is called each time the tracepoint is executed, in the execution context of the caller. When the function provided ends its execution, it returns to the caller (continuing from the tracepoint site). You can put tracepoints at important locations in the code. They are lightweight hooks that can pass an arbitrary number of parameters, which prototypes are described in a tracepoint declaration placed in a header file." Addition and removal of tracepoints is synchronized by RCU using the scheduler (and preempt_disable) as guarantees to find a quiescent state (this is really RCU "classic"). The update side uses rcu_barrier_sched() with call_rcu_sched() and the read/execute side uses "preempt_disable()/preempt_enable()". We make sure the previous array containing probes, which has been scheduled for deletion by the rcu callback, is indeed freed before we proceed to the next update. It therefore limits the rate of modification of a single tracepoint to one update per RCU period. The objective here is to permit fast batch add/removal of probes on _different_ tracepoints. Changelog : - Use #name ":" #proto as string to identify the tracepoint in the tracepoint table. This will make sure not type mismatch happens due to connexion of a probe with the wrong type to a tracepoint declared with the same name in a different header. - Add tracepoint_entry_free_old. - Change __TO_TRACE to get rid of the 'i' iterator. Masami Hiramatsu <mhiramat@redhat.com> : Tested on x86-64. Performance impact of a tracepoint : same as markers, except that it adds about 70 bytes of instructions in an unlikely branch of each instrumented function (the for loop, the stack setup and the function call). It currently adds a memory read, a test and a conditional branch at the instrumentation site (in the hot path). Immediate values will eventually change this into a load immediate, test and branch, which removes the memory read which will make the i-cache impact smaller (changing the memory read for a load immediate removes 3-4 bytes per site on x86_32 (depending on mov prefixes), or 7-8 bytes on x86_64, it also saves the d-cache hit). About the performance impact of tracepoints (which is comparable to markers), even without immediate values optimizations, tests done by Hideo Aoki on ia64 show no regression. His test case was using hackbench on a kernel where scheduler instrumentation (about 5 events in code scheduler code) was added. Quoting Hideo Aoki about Markers : I evaluated overhead of kernel marker using linux-2.6-sched-fixes git tree, which includes several markers for LTTng, using an ia64 server. While the immediate trace mark feature isn't implemented on ia64, there is no major performance regression. So, I think that we don't have any issues to propose merging marker point patches into Linus's tree from the viewpoint of performance impact. I prepared two kernels to evaluate. The first one was compiled without CONFIG_MARKERS. The second one was enabled CONFIG_MARKERS. I downloaded the original hackbench from the following URL: http://devresources.linux-foundation.org/craiger/hackbench/src/hackbench.c I ran hackbench 5 times in each condition and calculated the average and difference between the kernels. The parameter of hackbench: every 50 from 50 to 800 The number of CPUs of the server: 2, 4, and 8 Below is the results. As you can see, major performance regression wasn't found in any case. Even if number of processes increases, differences between marker-enabled kernel and marker- disabled kernel doesn't increase. Moreover, if number of CPUs increases, the differences doesn't increase either. Curiously, marker-enabled kernel is better than marker-disabled kernel in more than half cases, although I guess it comes from the difference of memory access pattern. * 2 CPUs Number of | without | with | diff | diff | processes | Marker [Sec] | Marker [Sec] | [Sec] | [%] | -------------------------------------------------------------- 50 | 4.811 | 4.872 | +0.061 | +1.27 | 100 | 9.854 | 10.309 | +0.454 | +4.61 | 150 | 15.602 | 15.040 | -0.562 | -3.6 | 200 | 20.489 | 20.380 | -0.109 | -0.53 | 250 | 25.798 | 25.652 | -0.146 | -0.56 | 300 | 31.260 | 30.797 | -0.463 | -1.48 | 350 | 36.121 | 35.770 | -0.351 | -0.97 | 400 | 42.288 | 42.102 | -0.186 | -0.44 | 450 | 47.778 | 47.253 | -0.526 | -1.1 | 500 | 51.953 | 52.278 | +0.325 | +0.63 | 550 | 58.401 | 57.700 | -0.701 | -1.2 | 600 | 63.334 | 63.222 | -0.112 | -0.18 | 650 | 68.816 | 68.511 | -0.306 | -0.44 | 700 | 74.667 | 74.088 | -0.579 | -0.78 | 750 | 78.612 | 79.582 | +0.970 | +1.23 | 800 | 85.431 | 85.263 | -0.168 | -0.2 | -------------------------------------------------------------- * 4 CPUs Number of | without | with | diff | diff | processes | Marker [Sec] | Marker [Sec] | [Sec] | [%] | -------------------------------------------------------------- 50 | 2.586 | 2.584 | -0.003 | -0.1 | 100 | 5.254 | 5.283 | +0.030 | +0.56 | 150 | 8.012 | 8.074 | +0.061 | +0.76 | 200 | 11.172 | 11.000 | -0.172 | -1.54 | 250 | 13.917 | 14.036 | +0.119 | +0.86 | 300 | 16.905 | 16.543 | -0.362 | -2.14 | 350 | 19.901 | 20.036 | +0.135 | +0.68 | 400 | 22.908 | 23.094 | +0.186 | +0.81 | 450 | 26.273 | 26.101 | -0.172 | -0.66 | 500 | 29.554 | 29.092 | -0.461 | -1.56 | 550 | 32.377 | 32.274 | -0.103 | -0.32 | 600 | 35.855 | 35.322 | -0.533 | -1.49 | 650 | 39.192 | 38.388 | -0.804 | -2.05 | 700 | 41.744 | 41.719 | -0.025 | -0.06 | 750 | 45.016 | 44.496 | -0.520 | -1.16 | 800 | 48.212 | 47.603 | -0.609 | -1.26 | -------------------------------------------------------------- * 8 CPUs Number of | without | with | diff | diff | processes | Marker [Sec] | Marker [Sec] | [Sec] | [%] | -------------------------------------------------------------- 50 | 2.094 | 2.072 | -0.022 | -1.07 | 100 | 4.162 | 4.273 | +0.111 | +2.66 | 150 | 6.485 | 6.540 | +0.055 | +0.84 | 200 | 8.556 | 8.478 | -0.078 | -0.91 | 250 | 10.458 | 10.258 | -0.200 | -1.91 | 300 | 12.425 | 12.750 | +0.325 | +2.62 | 350 | 14.807 | 14.839 | +0.032 | +0.22 | 400 | 16.801 | 16.959 | +0.158 | +0.94 | 450 | 19.478 | 19.009 | -0.470 | -2.41 | 500 | 21.296 | 21.504 | +0.208 | +0.98 | 550 | 23.842 | 23.979 | +0.137 | +0.57 | 600 | 26.309 | 26.111 | -0.198 | -0.75 | 650 | 28.705 | 28.446 | -0.259 | -0.9 | 700 | 31.233 | 31.394 | +0.161 | +0.52 | 750 | 34.064 | 33.720 | -0.344 | -1.01 | 800 | 36.320 | 36.114 | -0.206 | -0.57 | -------------------------------------------------------------- Signed-off-by: NMathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Acked-by: NMasami Hiramatsu <mhiramat@redhat.com> Acked-by: N'Peter Zijlstra' <peterz@infradead.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 02 8月, 2008 1 次提交
-
-
由 Yoshinori Sato 提交于
ARCH=h8300: init/main.c:781: undefined reference to `___early_initcall_end' Same problem have __start___bug_table __stop___bug_table __tracedata_start __tracedata_end __per_cpu_start __per_cpu_end When defining a symbol in vmlinux.lds, use the VMLINUX_SYMBOL macro. VMLINUX_SYMBOL adds a prefix charactor. You can't just use straight symbol names in common header files as they dont take into consideration weird arch-specific ABI conventions. in the case of Blackfin/h8300, the ABI dictates that any C-visible symbols have an underscore prefixed to them. Thus all symbols in vmlinux.lds.h need to be wrapped in VMLINUX_SYMBOL() so that each arch can put hide this magic in their own files. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: NYoshinori Sato <ysato@users.sourceforge.jp> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: "Mike Frysinger" <vapier.adi@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 27 7月, 2008 1 次提交
-
-
Added early initcall (pre-SMP) support, using an identical interface to that of regular initcalls. Functions called from do_pre_smp_initcalls() could be converted to use this cleaner interface. This is required by CPU hotplug, because early users have to register notifiers before going SMP. One such CPU hotplug user is the relay interface with buffer-only channels, which needs to register such a notifier, to be usable in early code. This in turn is used by kmemtrace. Signed-off-by: NEduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Cc: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 26 7月, 2008 1 次提交
-
-
由 Jan Beulich 提交于
Due to the addition of __attribute__((__cold__)) to a few symbols without adjusting the linker scripts, those symbols currently may end up outside the [_stext,_etext) range, as they get placed in .text.unlikely by (at least) gcc 4.3.0. This may confuse code not only outside of the kernel, symbol_put_addr()'s BUG() could also trigger. Hence we need to add .text.unlikely (and for future uses of __attribute__((__hot__)) also .text.hot) to the TEXT_TEXT() macro. Issue observed by Lukas Lipavsky. Signed-off-by: NJan Beulich <jbeulich@novell.com> Tested-by: NLukas Lipavsky <llipavsky@suse.cz> Cc: <stable@kernel.org> Signed-off-by: NSam Ravnborg <sam@ravnborg.org>
-
- 10 7月, 2008 1 次提交
-
-
由 David Woodhouse 提交于
Some drivers have their own hacks to bypass the kernel's firmware loader and build their firmware into the kernel; this renders those unnecessary. Other drivers don't use the firmware loader at all, because they always want the firmware to be available. This allows them to start using the firmware loader. A third set of drivers already use the firmware loader, but can't be used without help from userspace, which sometimes requires an initrd. This allows them to work in a static kernel. Signed-off-by: NDavid Woodhouse <dwmw2@infradead.org>
-
- 11 6月, 2008 1 次提交
-
-
由 Rafael J. Wysocki 提交于
Some quirks should be called with interrupt disabled, we can't directly call them in .resume_early. Also the patch introduces pci_fixup_resume_early and pci_fixup_suspend, which matches current device core callbacks (.suspend/.resume_early). TBD: Somebody knows why we need quirk resume should double check if a quirk should be called in resume or resume_early. I changed some per my understanding, but can't make sure I fixed all. Signed-off-by: NShaohua Li <shaohua.li@intel.com> Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
-
- 25 5月, 2008 3 次提交
-
-
由 Jan Beulich 提交于
.. allowing it to be write-protected just as other read-only data under CONFIG_DEBUG_RODATA. Signed-off-by: NJan Beulich <jbeulich@novell.com> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Jan Beulich 提交于
Signed-off-by: NJan Beulich <jbeulich@novell.com> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Eric Dumazet 提交于
While examining holes in percpu section I found this : c05f5000 D per_cpu__current_task c05f5000 D __per_cpu_start c05f5004 D per_cpu__cpu_number c05f5008 D per_cpu__irq_regs c05f500c d per_cpu__cpu_devices c05f5040 D per_cpu__cyc2ns <Big Hole of about 4000 bytes> c05f6000 d per_cpu__cpuid4_info c05f6004 d per_cpu__cache_kobject c05f6008 d per_cpu__index_kobject <Big Hole of about 4000 bytes> c05f7000 D per_cpu__gdt_page This is because gdt_page is a percpu variable, defined with a page alignement, and linker is doing its job, two times because of .o nesting in the build process. I introduced a new macro DEFINE_PER_CPU_PAGE_ALIGNED() to avoid wasting this space. All page aligned variables (only one at this time) are put in a separate subsection .data.percpu.page_aligned, at the very begining of percpu zone. Before patch , on a x86_32 machine : .data.percpu 30232 3227471872 .data.percpu 22168 3227471872 Thats 8064 bytes saved for each CPU. Signed-off-by: NEric Dumazet <dada1@cosmosbay.com> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 20 2月, 2008 1 次提交
-
-
由 Sam Ravnborg 提交于
When adding __devinitconst etc. the __initconst variant were missed. Add this one and proper definitions for .head.text for use in .S files. The naming .head.text is preferred over .text.head as the latter will conflict for a function named head when introducing -ffunctions-sections. Signed-off-by: NSam Ravnborg <sam@ravnborg.org>
-
- 30 1月, 2008 1 次提交
-
-
由 Arjan van de Ven 提交于
Latest update; I now have 4 NX tests, but 2 fail so they're #if 0'd. I also cleaned up the NX test code quite a bit, and got rid of the ugly exception table sorting stuff. From: Arjan van de Ven <arjan@linux.intel.com> This patch adds testcases for the CONFIG_DEBUG_RODATA configuration option as well as the NX CPU feature/mappings. Both testcases can move to tests/ once that patch gets merged into mainline. (I'm half considering moving the rodata test into mm/init.c but I'll wait with that until init.c is unified) As part of this I had to fix a not-quite-right alignment in the vmlinux.lds.h for the RODATA sections, which lead to 1 page less being marked read only. Signed-off-by: NArjan van de Ven <arjan@linux.intel.com> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 29 1月, 2008 4 次提交
-
-
由 Sam Ravnborg 提交于
Today we have the following annotations for functions/data referencing __init/__exit functions / data: __init_refok => for init functions __initdata_refok => for init data __exit_refok => for exit functions There is really no difference between the __init and __exit versions and simplify it and to introduce a shorter annotation the following new annotations are introduced: __ref => for functions (code) that references __*init / __*exit __refdata => for variables __refconst => for const variables Whit this annotation is it more obvious what the annotation is for and there is no longer the arbitary division between __init and __exit code. The mechanishm is the same as before - a special section is created which is made part of the usual sections in the linker script. We will start to see annotations like this: -static struct pci_serial_quirk pci_serial_quirks[] = { +static const struct pci_serial_quirk pci_serial_quirks[] __refconst = { ----------------- -static struct notifier_block __cpuinitdata cpuid_class_cpu_notifier = +static struct notifier_block cpuid_class_cpu_notifier __refdata = ---------------- -static int threshold_cpu_callback(struct notifier_block *nfb, +static int __ref threshold_cpu_callback(struct notifier_block *nfb, [The above is just random samples]. Note: No modifications were needed in modpost to support the new sections due to the newly introduced blacklisting. Signed-off-by: NSam Ravnborg <sam@ravnborg.org>
-
由 Adrian Bunk 提交于
Simplify the dependencies on __mem{init,exit}* (ACPI_HOTPLUG_MEMORY requires MEMORY_HOTPLUG). Signed-off-by: NAdrian Bunk <bunk@kernel.org> Signed-off-by: NSam Ravnborg <sam@ravnborg.org>
-
由 Sam Ravnborg 提交于
Introducing separate sections for __dev* (HOTPLUG), __cpu* (HOTPLUG_CPU) and __mem* (MEMORY_HOTPLUG) allows us to do a much more reliable Section mismatch check in modpost. We are no longer dependent on the actual configuration of for example HOTPLUG. This has the effect that all users see much more Section mismatch warnings than before because they were almost all hidden when HOTPLUG was enabled. The advantage of this is that when building a piece of code then it is much more likely that the Section mismatch errors are spotted and the warnings will be felt less random of nature. Signed-off-by: NSam Ravnborg <sam@ravnborg.org> Cc: Greg KH <greg@kroah.com> Cc: Randy Dunlap <randy.dunlap@oracle.com> Cc: Adrian Bunk <bunk@kernel.org>
-
由 Sam Ravnborg 提交于
This patch consolidate all definitions of .init.text, .init.data and .exit.text, .exit.data section definitions in the generic vmlinux.lds.h. This is a preparational patch - alone it does not buy us much good. Signed-off-by: NSam Ravnborg <sam@ravnborg.org>
-