提交 ab486bc9 编写于 作者: L Linus Torvalds

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk

Pull printk updates from Petr Mladek:

 - Add a console_msg_format command line option:

     The value "default" keeps the old "[time stamp] text\n" format. The
     value "syslog" allows to see the syslog-like "<log
     level>[timestamp] text" format.

     This feature was requested by people doing regression tests, for
     example, 0day robot. They want to have both filtered and full logs
     at hands.

 - Reduce the risk of softlockup:

     Pass the console owner in a busy loop.

     This is a new approach to the old problem. It was first proposed by
     Steven Rostedt on Kernel Summit 2017. It marks a context in which
     the console_lock owner calls console drivers and could not sleep.
     On the other side, printk() callers could detect this state and use
     a busy wait instead of a simple console_trylock(). Finally, the
     console_lock owner checks if there is a busy waiter at the end of
     the special context and eventually passes the console_lock to the
     waiter.

     The hand-off works surprisingly well and helps in many situations.
     Well, there is still a possibility of the softlockup, for example,
     when the flood of messages stops and the last owner still has too
     much to flush.

     There is increasing number of people having problems with
     printk-related softlockups. We might eventually need to get better
     solution. Anyway, this looks like a good start and promising
     direction.

 - Do not allow to schedule in console_unlock() called from printk():

     This reverts an older controversial commit. The reschedule helped
     to avoid softlockups. But it also slowed down the console output.
     This patch is obsoleted by the new console waiter logic described
     above. In fact, the reschedule made the hand-off less effective.

 - Deprecate "%pf" and "%pF" format specifier:

     It was needed on ia64, ppc64 and parisc64 to dereference function
     descriptors and show the real function address. It is done
     transparently by "%ps" and "pS" format specifier now.

     Sergey Senozhatsky found that all the function descriptors were in
     a special elf section and could be easily detected.

 - Remove printk_symbol() API:

     It has been obsoleted by "%pS" format specifier, and this change
     helped to remove few continuous lines and a less intuitive old API.

 - Remove redundant memsets:

     Sergey removed unnecessary memset when processing printk.devkmsg
     command line option.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk: (27 commits)
  printk: drop redundant devkmsg_log_str memsets
  printk: Never set console_may_schedule in console_trylock()
  printk: Hide console waiter logic into helpers
  printk: Add console owner and waiter logic to load balance console writes
  kallsyms: remove print_symbol() function
  checkpatch: add pF/pf deprecation warning
  symbol lookup: introduce dereference_symbol_descriptor()
  parisc64: Add .opd based function descriptor dereference
  powerpc64: Add .opd based function descriptor dereference
  ia64: Add .opd based function descriptor dereference
  sections: split dereference_function_descriptor()
  openrisc: Fix conflicting types for _exext and _stext
  lib: do not use print_symbol()
  irq debug: do not use print_symbol()
  sysfs: do not use print_symbol()
  drivers: do not use print_symbol()
  x86: do not use print_symbol()
  unicore32: do not use print_symbol()
  sh: do not use print_symbol()
  mn10300: do not use print_symbol()
  ...
......@@ -646,6 +646,20 @@
console=brl,ttyS0
For now, only VisioBraille is supported.
console_msg_format=
[KNL] Change console messages format
default
By default we print messages on consoles in
"[time stamp] text\n" format (time stamp may not be
printed, depending on CONFIG_PRINTK_TIME or
`printk_time' param).
syslog
Switch to syslog format: "<%u>[time stamp] text\n"
IOW, each message will have a facility and loglevel
prefix. The format is similar to one used by syslog()
syscall, or to executing "dmesg -S --raw" or to reading
from /proc/kmsg.
consoleblank= [KNL] The console blank (screen saver) timeout in
seconds. A value of 0 disables the blank timer.
Defaults to 0.
......
......@@ -68,42 +68,32 @@ Symbols/Function Pointers
::
%pS versatile_init+0x0/0x110
%ps versatile_init
%pF versatile_init+0x0/0x110
%pf versatile_init
%pS versatile_init+0x0/0x110
%pSR versatile_init+0x9/0x110
(with __builtin_extract_return_addr() translation)
%ps versatile_init
%pB prev_fn_of_versatile_init+0x88/0x88
The ``F`` and ``f`` specifiers are for printing function pointers,
for example, f->func, &gettimeofday. They have the same result as
``S`` and ``s`` specifiers. But they do an extra conversion on
ia64, ppc64 and parisc64 architectures where the function pointers
are actually function descriptors.
The ``S`` and ``s`` specifiers are used for printing a pointer in symbolic
format. They result in the symbol name with (S) or without (s)
offsets. If KALLSYMS are disabled then the symbol address is printed instead.
The ``S`` and ``s`` specifiers can be used for printing symbols
from direct addresses, for example, __builtin_return_address(0),
(void *)regs->ip. They result in the symbol name with (S) or
without (s) offsets. If KALLSYMS are disabled then the symbol
address is printed instead.
Note, that the ``F`` and ``f`` specifiers are identical to ``S`` (``s``)
and thus deprecated. We have ``F`` and ``f`` because on ia64, ppc64 and
parisc64 function pointers are indirect and, in fact, are function
descriptors, which require additional dereferencing before we can lookup
the symbol. As of now, ``S`` and ``s`` perform dereferencing on those
platforms (when needed), so ``F`` and ``f`` exist for compatibility
reasons only.
The ``B`` specifier results in the symbol name with offsets and should be
used when printing stack backtraces. The specifier takes into
consideration the effect of compiler optimisations which may occur
when tail-calls are used and marked with the noreturn GCC attribute.
Examples::
printk("Going to call: %pF\n", gettimeofday);
printk("Going to call: %pF\n", p->func);
printk("%s: called from %pS\n", __func__, (void *)_RET_IP_);
printk("%s: called from %pS\n", __func__,
(void *)__builtin_return_address(0));
printk("Faulted at %pS\n", (void *)regs->ip);
printk(" %s%pB\n", (reliable ? "" : "? "), (void *)*stack);
Kernel Pointers
---------------
......
......@@ -154,8 +154,8 @@ static ssize_t dev_attr_show(struct kobject *kobj, struct attribute *attr,
if (dev_attr->show)
ret = dev_attr->show(dev, dev_attr, buf);
if (ret >= (ssize_t)PAGE_SIZE) {
print_symbol("dev_attr_show: %s returned bad count\n",
(unsigned long)dev_attr->show);
printk("dev_attr_show: %pS returned bad count\n",
dev_attr->show);
}
return ret;
}
......
......@@ -167,8 +167,8 @@ static ssize_t dev_attr_show(struct kobject *kobj, struct attribute *attr,
if (dev_attr->show)
ret = dev_attr->show(dev, dev_attr, buf);
if (ret >= (ssize_t)PAGE_SIZE) {
print_symbol("dev_attr_show: %s returned bad count\n",
(unsigned long)dev_attr->show);
printk("dev_attr_show: %pS returned bad count\n",
dev_attr->show);
}
return ret;
}
......
......@@ -21,7 +21,6 @@
#include <linux/unistd.h>
#include <linux/user.h>
#include <linux/interrupt.h>
#include <linux/kallsyms.h>
#include <linux/init.h>
#include <linux/elfcore.h>
#include <linux/pm.h>
......@@ -121,8 +120,8 @@ void __show_regs(struct pt_regs *regs)
show_regs_print_info(KERN_DEFAULT);
print_symbol("PC is at %s\n", instruction_pointer(regs));
print_symbol("LR is at %s\n", regs->ARM_lr);
printk("PC is at %pS\n", (void *)instruction_pointer(regs));
printk("LR is at %pS\n", (void *)regs->ARM_lr);
printk("pc : [<%08lx>] lr : [<%08lx>] psr: %08lx\n",
regs->ARM_pc, regs->ARM_lr, regs->ARM_cpsr);
printk("sp : %08lx ip : %08lx fp : %08lx\n",
......
......@@ -35,7 +35,6 @@
#include <linux/delay.h>
#include <linux/reboot.h>
#include <linux/interrupt.h>
#include <linux/kallsyms.h>
#include <linux/init.h>
#include <linux/cpu.h>
#include <linux/elfcore.h>
......@@ -221,8 +220,8 @@ void __show_regs(struct pt_regs *regs)
show_regs_print_info(KERN_DEFAULT);
print_pstate(regs);
print_symbol("pc : %s\n", regs->pc);
print_symbol("lr : %s\n", lr);
printk("pc : %pS\n", (void *)regs->pc);
printk("lr : %pS\n", (void *)lr);
printk("sp : %016llx\n", sp);
i = top_reg;
......
......@@ -11,7 +11,6 @@
#include <linux/module.h>
#include <linux/ptrace.h>
#include <linux/sched/debug.h>
#include <linux/kallsyms.h>
#include <linux/bug.h>
#include <asm/soc.h>
......@@ -375,8 +374,7 @@ static void show_trace(unsigned long *stack, unsigned long *endstack)
if (i % 5 == 0)
pr_debug("\n ");
#endif
pr_debug(" [<%08lx>]", addr);
print_symbol(" %s\n", addr);
pr_debug(" [<%08lx>] %pS\n", addr, (void *)addr);
i++;
}
}
......
......@@ -27,6 +27,8 @@ extern char __start_gate_brl_fsys_bubble_down_patchlist[], __end_gate_brl_fsys_b
extern char __start_unwind[], __end_unwind[];
extern char __start_ivt_text[], __end_ivt_text[];
#define HAVE_DEREFERENCE_FUNCTION_DESCRIPTOR 1
#undef dereference_function_descriptor
static inline void *dereference_function_descriptor(void *ptr)
{
......@@ -38,6 +40,12 @@ static inline void *dereference_function_descriptor(void *ptr)
return ptr;
}
#undef dereference_kernel_function_descriptor
static inline void *dereference_kernel_function_descriptor(void *ptr)
{
if (ptr < (void *)__start_opd || ptr >= (void *)__end_opd)
return ptr;
return dereference_function_descriptor(ptr);
}
#endif /* _ASM_IA64_SECTIONS_H */
......@@ -36,6 +36,7 @@
#include <asm/patch.h>
#include <asm/unaligned.h>
#include <asm/sections.h>
#define ARCH_MODULE_DEBUG 0
......@@ -918,3 +919,14 @@ module_arch_cleanup (struct module *mod)
if (mod->arch.core_unw_table)
unw_remove_unwind_table(mod->arch.core_unw_table);
}
void *dereference_module_function_descriptor(struct module *mod, void *ptr)
{
Elf64_Shdr *opd = mod->arch.opd;
if (ptr < (void *)opd->sh_addr ||
ptr >= (void *)(opd->sh_addr + opd->sh_size))
return ptr;
return dereference_function_descriptor(ptr);
}
......@@ -13,7 +13,6 @@
#include <linux/pm.h>
#include <linux/elf.h>
#include <linux/errno.h>
#include <linux/kallsyms.h>
#include <linux/kernel.h>
#include <linux/mm.h>
#include <linux/slab.h>
......@@ -69,7 +68,6 @@ void
ia64_do_show_stack (struct unw_frame_info *info, void *arg)
{
unsigned long ip, sp, bsp;
char buf[128]; /* don't make it so big that it overflows the stack! */
printk("\nCall Trace:\n");
do {
......@@ -79,11 +77,9 @@ ia64_do_show_stack (struct unw_frame_info *info, void *arg)
unw_get_sp(info, &sp);
unw_get_bsp(info, &bsp);
snprintf(buf, sizeof(buf),
" [<%016lx>] %%s\n"
printk(" [<%016lx>] %pS\n"
" sp=%016lx bsp=%016lx\n",
ip, sp, bsp);
print_symbol(buf, ip);
ip, (void *)ip, sp, bsp);
} while (unw_unwind(info) >= 0);
}
......@@ -111,7 +107,7 @@ show_regs (struct pt_regs *regs)
printk("psr : %016lx ifs : %016lx ip : [<%016lx>] %s (%s)\n",
regs->cr_ipsr, regs->cr_ifs, ip, print_tainted(),
init_utsname()->release);
print_symbol("ip is at %s\n", ip);
printk("ip is at %pS\n", (void *)ip);
printk("unat: %016lx pfs : %016lx rsc : %016lx\n",
regs->ar_unat, regs->ar_pfs, regs->ar_rsc);
printk("rnat: %016lx bsps: %016lx pr : %016lx\n",
......
......@@ -109,7 +109,9 @@ SECTIONS {
RODATA
.opd : AT(ADDR(.opd) - LOAD_OFFSET) {
__start_opd = .;
*(.opd)
__end_opd = .;
}
/*
......
......@@ -22,7 +22,6 @@
#include <linux/delay.h>
#include <linux/spinlock.h>
#include <linux/interrupt.h>
#include <linux/kallsyms.h>
#include <linux/pci.h>
#include <linux/kdebug.h>
#include <linux/bug.h>
......@@ -262,8 +261,7 @@ void show_trace(unsigned long *sp)
raslot = ULONG_MAX;
else
printk(" ?");
print_symbol(" %s", addr);
printk("\n");
printk(" %pS\n", (void *)addr);
}
}
......
......@@ -39,8 +39,7 @@
#include <asm/io.h>
#include <asm/pgtable.h>
#include <asm/unwinder.h>
extern char _etext, _stext;
#include <asm/sections.h>
int kstack_depth_to_print = 0x180;
int lwa_flag;
......
......@@ -29,7 +29,9 @@ SECTIONS
. = ALIGN(16);
/* Linkage tables */
.opd : {
__start_opd = .;
*(.opd)
__end_opd = .;
} PROVIDE (__gp = .);
.plt : {
*(.plt)
......
......@@ -6,8 +6,14 @@
#include <asm-generic/sections.h>
#ifdef CONFIG_64BIT
#define HAVE_DEREFERENCE_FUNCTION_DESCRIPTOR 1
#undef dereference_function_descriptor
void *dereference_function_descriptor(void *);
#undef dereference_kernel_function_descriptor
void *dereference_kernel_function_descriptor(void *);
#endif
#endif
......@@ -66,6 +66,7 @@
#include <asm/pgtable.h>
#include <asm/unwind.h>
#include <asm/sections.h>
#if 0
#define DEBUGP printk
......@@ -954,3 +955,18 @@ void module_arch_cleanup(struct module *mod)
{
deregister_unwind_table(mod);
}
#ifdef CONFIG_64BIT
void *dereference_module_function_descriptor(struct module *mod, void *ptr)
{
unsigned long start_opd = (Elf64_Addr)mod->core_layout.base +
mod->arch.fdesc_offset;
unsigned long end_opd = start_opd +
mod->arch.fdesc_count * sizeof(Elf64_Fdesc);
if (ptr < (void *)start_opd || ptr >= (void *)end_opd)
return ptr;
return dereference_function_descriptor(ptr);
}
#endif
......@@ -315,6 +315,15 @@ void *dereference_function_descriptor(void *ptr)
ptr = p;
return ptr;
}
void *dereference_kernel_function_descriptor(void *ptr)
{
if (ptr < (void *)__start_opd ||
ptr >= (void *)__end_opd)
return ptr;
return dereference_function_descriptor(ptr);
}
#endif
static inline unsigned long brk_rnd(void)
......
......@@ -100,7 +100,9 @@ SECTIONS
. = ALIGN(16);
/* Linkage tables */
.opd : {
__start_opd = .;
*(.opd)
__end_opd = .;
} PROVIDE (__gp = .);
.plt : {
*(.plt)
......
......@@ -45,6 +45,9 @@ struct mod_arch_specific {
unsigned long tramp;
#endif
/* For module function descriptor dereference */
unsigned long start_opd;
unsigned long end_opd;
#else /* powerpc64 */
/* Indices of PLT sections within module. */
unsigned int core_plt_section;
......
......@@ -66,6 +66,9 @@ static inline int overlaps_kvm_tmp(unsigned long start, unsigned long end)
}
#ifdef PPC64_ELF_ABI_v1
#define HAVE_DEREFERENCE_FUNCTION_DESCRIPTOR 1
#undef dereference_function_descriptor
static inline void *dereference_function_descriptor(void *ptr)
{
......@@ -76,6 +79,15 @@ static inline void *dereference_function_descriptor(void *ptr)
ptr = p;
return ptr;
}
#undef dereference_kernel_function_descriptor
static inline void *dereference_kernel_function_descriptor(void *ptr)
{
if (ptr < (void *)__start_opd || ptr >= (void *)__end_opd)
return ptr;
return dereference_function_descriptor(ptr);
}
#endif /* PPC64_ELF_ABI_v1 */
#endif
......
......@@ -93,6 +93,15 @@ static unsigned int local_entry_offset(const Elf64_Sym *sym)
{
return 0;
}
void *dereference_module_function_descriptor(struct module *mod, void *ptr)
{
if (ptr < (void *)mod->arch.start_opd ||
ptr >= (void *)mod->arch.end_opd)
return ptr;
return dereference_function_descriptor(ptr);
}
#endif
#define STUB_MAGIC 0x73747562 /* stub */
......@@ -344,6 +353,11 @@ int module_frob_arch_sections(Elf64_Ehdr *hdr,
else if (strcmp(secstrings+sechdrs[i].sh_name,"__versions")==0)
dedotify_versions((void *)hdr + sechdrs[i].sh_offset,
sechdrs[i].sh_size);
else if (!strcmp(secstrings + sechdrs[i].sh_name, ".opd")) {
me->arch.start_opd = sechdrs[i].sh_addr;
me->arch.end_opd = sechdrs[i].sh_addr +
sechdrs[i].sh_size;
}
/* We don't handle .init for the moment: rename to _init */
while ((p = strstr(secstrings + sechdrs[i].sh_name, ".init")))
......
......@@ -287,7 +287,9 @@ SECTIONS
}
.opd : AT(ADDR(.opd) - LOAD_OFFSET) {
__start_opd = .;
*(.opd)
__end_opd = .;
}
. = ALIGN(256);
......
......@@ -20,7 +20,6 @@
#include <linux/sched/task_stack.h>
#include <linux/slab.h>
#include <linux/elfcore.h>
#include <linux/kallsyms.h>
#include <linux/fs.h>
#include <linux/ftrace.h>
#include <linux/hw_breakpoint.h>
......@@ -37,8 +36,8 @@ void show_regs(struct pt_regs * regs)
printk("\n");
show_regs_print_info(KERN_DEFAULT);
print_symbol("PC is at %s\n", instruction_pointer(regs));
print_symbol("PR is at %s\n", regs->pr);
printk("PC is at %pS\n", (void *)instruction_pointer(regs));
printk("PR is at %pS\n", (void *)regs->pr);
printk("PC : %08lx SP : %08lx SR : %08lx ",
regs->pc, regs->regs[15], regs->sr);
......
......@@ -23,7 +23,6 @@
#include <linux/delay.h>
#include <linux/reboot.h>
#include <linux/interrupt.h>
#include <linux/kallsyms.h>
#include <linux/init.h>
#include <linux/cpu.h>
#include <linux/elfcore.h>
......@@ -139,8 +138,8 @@ void __show_regs(struct pt_regs *regs)
char buf[64];
show_regs_print_info(KERN_DEFAULT);
print_symbol("PC is at %s\n", instruction_pointer(regs));
print_symbol("LR is at %s\n", regs->UCreg_lr);
printk("PC is at %pS\n", (void *)instruction_pointer(regs));
printk("LR is at %pS\n", (void *)regs->UCreg_lr);
printk(KERN_DEFAULT "pc : [<%08lx>] lr : [<%08lx>] psr: %08lx\n"
"sp : %08lx ip : %08lx fp : %08lx\n",
regs->UCreg_pc, regs->UCreg_lr, regs->UCreg_asr,
......
......@@ -14,7 +14,6 @@
#include <linux/capability.h>
#include <linux/miscdevice.h>
#include <linux/ratelimit.h>
#include <linux/kallsyms.h>
#include <linux/rcupdate.h>
#include <linux/kobject.h>
#include <linux/uaccess.h>
......@@ -235,7 +234,7 @@ static void __print_mce(struct mce *m)
m->cs, m->ip);
if (m->cs == __KERNEL_CS)
print_symbol("{%s}", m->ip);
pr_cont("{%pS}", (void *)m->ip);
pr_cont("\n");
}
......
......@@ -29,7 +29,6 @@
#include <linux/slab.h>
#include <linux/uaccess.h>
#include <linux/io.h>
#include <linux/kallsyms.h>
#include <asm/pgtable.h>
#include <linux/mmiotrace.h>
#include <asm/e820/api.h> /* for ISA_START_ADDRESS */
......@@ -123,8 +122,8 @@ static void die_kmmio_nesting_error(struct pt_regs *regs, unsigned long addr)
pr_emerg("unexpected fault for address: 0x%08lx, last fault for address: 0x%08lx\n",
addr, my_reason->addr);
print_pte(addr);
print_symbol(KERN_EMERG "faulting IP is at %s\n", regs->ip);
print_symbol(KERN_EMERG "last faulting IP was at %s\n", my_reason->ip);
pr_emerg("faulting IP is at %pS\n", (void *)regs->ip);
pr_emerg("last faulting IP was at %pS\n", (void *)my_reason->ip);
#ifdef __i386__
pr_emerg("eax: %08lx ebx: %08lx ecx: %08lx edx: %08lx\n",
regs->ax, regs->bx, regs->cx, regs->dx);
......
......@@ -20,7 +20,6 @@
#include <linux/of.h>
#include <linux/of_device.h>
#include <linux/genhd.h>
#include <linux/kallsyms.h>
#include <linux/mutex.h>
#include <linux/pm_runtime.h>
#include <linux/netdevice.h>
......@@ -685,8 +684,8 @@ static ssize_t dev_attr_show(struct kobject *kobj, struct attribute *attr,
if (dev_attr->show)
ret = dev_attr->show(dev, dev_attr, buf);
if (ret >= (ssize_t)PAGE_SIZE) {
print_symbol("dev_attr_show: %s returned bad count\n",
(unsigned long)dev_attr->show);
printk("dev_attr_show: %pS returned bad count\n",
dev_attr->show);
}
return ret;
}
......
......@@ -11,7 +11,6 @@
#include <linux/module.h>
#include <linux/kobject.h>
#include <linux/kallsyms.h>
#include <linux/slab.h>
#include <linux/list.h>
#include <linux/mutex.h>
......@@ -69,8 +68,8 @@ static int sysfs_kf_seq_show(struct seq_file *sf, void *v)
* indicate truncated result or overflow in normal use cases.
*/
if (count >= (ssize_t)PAGE_SIZE) {
print_symbol("fill_read_buffer: %s returned bad count\n",
(unsigned long)ops->show);
printk("fill_read_buffer: %pS returned bad count\n",
ops->show);
/* Try to struggle along */
count = PAGE_SIZE - 1;
}
......
......@@ -30,6 +30,7 @@
* __ctors_start, __ctors_end
* __irqentry_text_start, __irqentry_text_end
* __softirqentry_text_start, __softirqentry_text_end
* __start_opd, __end_opd
*/
extern char _text[], _stext[], _etext[];
extern char _data[], _sdata[], _edata[];
......@@ -49,12 +50,15 @@ extern char __start_once[], __end_once[];
/* Start and end of .ctors section - used for constructor calls. */
extern char __ctors_start[], __ctors_end[];
/* Start and end of .opd section - used for function descriptors. */
extern char __start_opd[], __end_opd[];
extern __visible const void __nosave_begin, __nosave_end;
/* function descriptor handling (if any). Override
* in asm/sections.h */
/* Function descriptor handling (if any). Override in asm/sections.h */
#ifndef dereference_function_descriptor
#define dereference_function_descriptor(p) (p)
#define dereference_kernel_function_descriptor(p) (p)
#endif
/* random extra sections (if any). Override
......
......@@ -9,6 +9,10 @@
#include <linux/errno.h>
#include <linux/kernel.h>
#include <linux/stddef.h>
#include <linux/mm.h>
#include <linux/module.h>
#include <asm/sections.h>
#define KSYM_NAME_LEN 128
#define KSYM_SYMBOL_LEN (sizeof("%s+%#lx/%#lx [%s]") + (KSYM_NAME_LEN - 1) + \
......@@ -16,6 +20,56 @@
struct module;
static inline int is_kernel_inittext(unsigned long addr)
{
if (addr >= (unsigned long)_sinittext
&& addr <= (unsigned long)_einittext)
return 1;
return 0;
}
static inline int is_kernel_text(unsigned long addr)
{
if ((addr >= (unsigned long)_stext && addr <= (unsigned long)_etext) ||
arch_is_kernel_text(addr))
return 1;
return in_gate_area_no_mm(addr);
}
static inline int is_kernel(unsigned long addr)
{
if (addr >= (unsigned long)_stext && addr <= (unsigned long)_end)
return 1;
return in_gate_area_no_mm(addr);
}
static inline int is_ksym_addr(unsigned long addr)
{
if (IS_ENABLED(CONFIG_KALLSYMS_ALL))
return is_kernel(addr);
return is_kernel_text(addr) || is_kernel_inittext(addr);
}
static inline void *dereference_symbol_descriptor(void *ptr)
{
#ifdef HAVE_DEREFERENCE_FUNCTION_DESCRIPTOR
struct module *mod;
ptr = dereference_kernel_function_descriptor(ptr);
if (is_ksym_addr((unsigned long)ptr))
return ptr;
preempt_disable();
mod = __module_address((unsigned long)ptr);
preempt_enable();
if (mod)
ptr = dereference_module_function_descriptor(mod, ptr);
#endif
return ptr;
}
#ifdef CONFIG_KALLSYMS
/* Lookup the address for a symbol. Returns 0 if not found. */
unsigned long kallsyms_lookup_name(const char *name);
......@@ -40,9 +94,6 @@ extern int sprint_symbol(char *buffer, unsigned long address);
extern int sprint_symbol_no_offset(char *buffer, unsigned long address);
extern int sprint_backtrace(char *buffer, unsigned long address);
/* Look up a kernel symbol and print it to the kernel messages. */
extern void __print_symbol(const char *fmt, unsigned long address);
int lookup_symbol_name(unsigned long addr, char *symname);
int lookup_symbol_attrs(unsigned long addr, unsigned long *size, unsigned long *offset, char *modname, char *name);
......@@ -112,23 +163,8 @@ static inline int kallsyms_show_value(void)
return false;
}
/* Stupid that this does nothing, but I didn't create this mess. */
#define __print_symbol(fmt, addr)
#endif /*CONFIG_KALLSYMS*/
/* This macro allows us to keep printk typechecking */
static __printf(1, 2)
void __check_printsym_format(const char *fmt, ...)
{
}
static inline void print_symbol(const char *fmt, unsigned long addr)
{
__check_printsym_format(fmt, "");
__print_symbol(fmt, (unsigned long)
__builtin_extract_return_addr((void *)addr));
}
static inline void print_ip_sym(unsigned long ip)
{
printk("[<%p>] %pS\n", (void *) ip, (void *) ip);
......
......@@ -612,6 +612,9 @@ int ref_module(struct module *a, struct module *b);
__mod ? __mod->name : "kernel"; \
})
/* Dereference module function descriptor */
void *dereference_module_function_descriptor(struct module *mod, void *ptr);
/* For kallsyms to ask for address resolution. namebuf should be at
* least KSYM_NAME_LEN long: a pointer to namebuf is returned if
* found, otherwise NULL. */
......@@ -766,6 +769,13 @@ static inline bool is_module_sig_enforced(void)
return false;
}
/* Dereference module function descriptor */
static inline
void *dereference_module_function_descriptor(struct module *mod, void *ptr)
{
return ptr;
}
#endif /* CONFIG_MODULES */
#ifdef CONFIG_SYSFS
......
......@@ -3,8 +3,6 @@
* Debugging printout:
*/
#include <linux/kallsyms.h>
#define ___P(f) if (desc->status_use_accessors & f) printk("%14s set\n", #f)
#define ___PS(f) if (desc->istate & f) printk("%14s set\n", #f)
/* FIXME */
......@@ -19,14 +17,14 @@ static inline void print_irq_desc(unsigned int irq, struct irq_desc *desc)
printk("irq %d, desc: %p, depth: %d, count: %d, unhandled: %d\n",
irq, desc, desc->depth, desc->irq_count, desc->irqs_unhandled);
printk("->handle_irq(): %p, ", desc->handle_irq);
print_symbol("%s\n", (unsigned long)desc->handle_irq);
printk("->irq_data.chip(): %p, ", desc->irq_data.chip);
print_symbol("%s\n", (unsigned long)desc->irq_data.chip);
printk("->handle_irq(): %p, %pS\n",
desc->handle_irq, desc->handle_irq);
printk("->irq_data.chip(): %p, %pS\n",
desc->irq_data.chip, desc->irq_data.chip);
printk("->action(): %p\n", desc->action);
if (desc->action) {
printk("->action->handler(): %p, ", desc->action->handler);
print_symbol("%s\n", (unsigned long)desc->action->handler);
printk("->action->handler(): %p, %pS\n",
desc->action->handler, desc->action->handler);
}
___P(IRQ_LEVEL);
......
......@@ -12,7 +12,6 @@
* compression (see scripts/kallsyms.c for a more complete description)
*/
#include <linux/kallsyms.h>
#include <linux/module.h>
#include <linux/init.h>
#include <linux/seq_file.h>
#include <linux/fs.h>
......@@ -20,15 +19,12 @@
#include <linux/err.h>
#include <linux/proc_fs.h>
#include <linux/sched.h> /* for cond_resched */
#include <linux/mm.h>
#include <linux/ctype.h>
#include <linux/slab.h>
#include <linux/filter.h>
#include <linux/ftrace.h>
#include <linux/compiler.h>
#include <asm/sections.h>
/*
* These will be re-linked against their real values
* during the second link stage.
......@@ -52,37 +48,6 @@ extern const u16 kallsyms_token_index[] __weak;
extern const unsigned long kallsyms_markers[] __weak;
static inline int is_kernel_inittext(unsigned long addr)
{
if (addr >= (unsigned long)_sinittext
&& addr <= (unsigned long)_einittext)
return 1;
return 0;
}
static inline int is_kernel_text(unsigned long addr)
{
if ((addr >= (unsigned long)_stext && addr <= (unsigned long)_etext) ||
arch_is_kernel_text(addr))
return 1;
return in_gate_area_no_mm(addr);
}
static inline int is_kernel(unsigned long addr)
{
if (addr >= (unsigned long)_stext && addr <= (unsigned long)_end)
return 1;
return in_gate_area_no_mm(addr);
}
static int is_ksym_addr(unsigned long addr)
{
if (IS_ENABLED(CONFIG_KALLSYMS_ALL))
return is_kernel(addr);
return is_kernel_text(addr) || is_kernel_inittext(addr);
}
/*
* Expand a compressed symbol data into the resulting uncompressed string,
* if uncompressed string is too long (>= maxlen), it will be truncated,
......@@ -464,17 +429,6 @@ int sprint_backtrace(char *buffer, unsigned long address)
return __sprint_symbol(buffer, address, -1, 1);
}
/* Look up a kernel symbol and print it to the kernel messages. */
void __print_symbol(const char *fmt, unsigned long address)
{
char buffer[KSYM_SYMBOL_LEN];
sprint_symbol(buffer, address);
printk(fmt, buffer);
}
EXPORT_SYMBOL(__print_symbol);
/* To avoid using get_symbol_offset for every symbol, we carry prefix along. */
struct kallsym_iter {
loff_t pos;
......
......@@ -3953,6 +3953,12 @@ static const char *get_ksymbol(struct module *mod,
return symname(kallsyms, best);
}
void * __weak dereference_module_function_descriptor(struct module *mod,
void *ptr)
{
return ptr;
}
/* For kallsyms to ask for address resolution. NULL means not found. Careful
* not to lock to avoid deadlock on oopses, simply disable preemption. */
const char *module_address_lookup(unsigned long addr,
......
......@@ -131,13 +131,10 @@ static int __init control_devkmsg(char *str)
/*
* Set sysctl string accordingly:
*/
if (devkmsg_log == DEVKMSG_LOG_MASK_ON) {
memset(devkmsg_log_str, 0, DEVKMSG_STR_MAX_SIZE);
strncpy(devkmsg_log_str, "on", 2);
} else if (devkmsg_log == DEVKMSG_LOG_MASK_OFF) {
memset(devkmsg_log_str, 0, DEVKMSG_STR_MAX_SIZE);
strncpy(devkmsg_log_str, "off", 3);
}
if (devkmsg_log == DEVKMSG_LOG_MASK_ON)
strcpy(devkmsg_log_str, "on");
else if (devkmsg_log == DEVKMSG_LOG_MASK_OFF)
strcpy(devkmsg_log_str, "off");
/* else "ratelimit" which is set by default. */
/*
......@@ -277,6 +274,13 @@ EXPORT_SYMBOL(console_set_on_cmdline);
/* Flag: console code may call schedule() */
static int console_may_schedule;
enum con_msg_format_flags {
MSG_FORMAT_DEFAULT = 0,
MSG_FORMAT_SYSLOG = (1 << 0),
};
static int console_msg_format = MSG_FORMAT_DEFAULT;
/*
* The printk log buffer consists of a chain of concatenated variable
* length records. Every record starts with a record header, containing
......@@ -1543,6 +1547,146 @@ SYSCALL_DEFINE3(syslog, int, type, char __user *, buf, int, len)
return do_syslog(type, buf, len, SYSLOG_FROM_READER);
}
/*
* Special console_lock variants that help to reduce the risk of soft-lockups.
* They allow to pass console_lock to another printk() call using a busy wait.
*/
#ifdef CONFIG_LOCKDEP
static struct lockdep_map console_owner_dep_map = {
.name = "console_owner"
};
#endif
static DEFINE_RAW_SPINLOCK(console_owner_lock);
static struct task_struct *console_owner;
static bool console_waiter;
/**
* console_lock_spinning_enable - mark beginning of code where another
* thread might safely busy wait
*
* This basically converts console_lock into a spinlock. This marks
* the section where the console_lock owner can not sleep, because
* there may be a waiter spinning (like a spinlock). Also it must be
* ready to hand over the lock at the end of the section.
*/
static void console_lock_spinning_enable(void)
{
raw_spin_lock(&console_owner_lock);
console_owner = current;
raw_spin_unlock(&console_owner_lock);
/* The waiter may spin on us after setting console_owner */
spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);
}
/**
* console_lock_spinning_disable_and_check - mark end of code where another
* thread was able to busy wait and check if there is a waiter
*
* This is called at the end of the section where spinning is allowed.
* It has two functions. First, it is a signal that it is no longer
* safe to start busy waiting for the lock. Second, it checks if
* there is a busy waiter and passes the lock rights to her.
*
* Important: Callers lose the lock if there was a busy waiter.
* They must not touch items synchronized by console_lock
* in this case.
*
* Return: 1 if the lock rights were passed, 0 otherwise.
*/
static int console_lock_spinning_disable_and_check(void)
{
int waiter;
raw_spin_lock(&console_owner_lock);
waiter = READ_ONCE(console_waiter);
console_owner = NULL;
raw_spin_unlock(&console_owner_lock);
if (!waiter) {
spin_release(&console_owner_dep_map, 1, _THIS_IP_);
return 0;
}
/* The waiter is now free to continue */
WRITE_ONCE(console_waiter, false);
spin_release(&console_owner_dep_map, 1, _THIS_IP_);
/*
* Hand off console_lock to waiter. The waiter will perform
* the up(). After this, the waiter is the console_lock owner.
*/
mutex_release(&console_lock_dep_map, 1, _THIS_IP_);
return 1;
}
/**
* console_trylock_spinning - try to get console_lock by busy waiting
*
* This allows to busy wait for the console_lock when the current
* owner is running in specially marked sections. It means that
* the current owner is running and cannot reschedule until it
* is ready to lose the lock.
*
* Return: 1 if we got the lock, 0 othrewise
*/
static int console_trylock_spinning(void)
{
struct task_struct *owner = NULL;
bool waiter;
bool spin = false;
unsigned long flags;
if (console_trylock())
return 1;
printk_safe_enter_irqsave(flags);
raw_spin_lock(&console_owner_lock);
owner = READ_ONCE(console_owner);
waiter = READ_ONCE(console_waiter);
if (!waiter && owner && owner != current) {
WRITE_ONCE(console_waiter, true);
spin = true;
}
raw_spin_unlock(&console_owner_lock);
/*
* If there is an active printk() writing to the
* consoles, instead of having it write our data too,
* see if we can offload that load from the active
* printer, and do some printing ourselves.
* Go into a spin only if there isn't already a waiter
* spinning, and there is an active printer, and
* that active printer isn't us (recursive printk?).
*/
if (!spin) {
printk_safe_exit_irqrestore(flags);
return 0;
}
/* We spin waiting for the owner to release us */
spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);
/* Owner will clear console_waiter on hand off */
while (READ_ONCE(console_waiter))
cpu_relax();
spin_release(&console_owner_dep_map, 1, _THIS_IP_);
printk_safe_exit_irqrestore(flags);
/*
* The owner passed the console lock to us.
* Since we did not spin on console lock, annotate
* this as a trylock. Otherwise lockdep will
* complain.
*/
mutex_acquire(&console_lock_dep_map, 0, 1, _THIS_IP_);
return 1;
}
/*
* Call the console drivers, asking them to write out
* log_buf[start] to log_buf[end - 1].
......@@ -1748,13 +1892,20 @@ asmlinkage int vprintk_emit(int facility, int level,
/* If called from the scheduler, we can not call up(). */
if (!in_sched) {
/*
* Disable preemption to avoid being preempted while holding
* console_sem which would prevent anyone from printing to
* console
*/
preempt_disable();
/*
* Try to acquire and then immediately release the console
* semaphore. The release will print out buffers and wake up
* /dev/kmsg and syslog() users.
*/
if (console_trylock())
if (console_trylock_spinning())
console_unlock();
preempt_enable();
}
return printed_len;
......@@ -1855,6 +2006,8 @@ static ssize_t msg_print_ext_header(char *buf, size_t size,
static ssize_t msg_print_ext_body(char *buf, size_t size,
char *dict, size_t dict_len,
char *text, size_t text_len) { return 0; }
static void console_lock_spinning_enable(void) { }
static int console_lock_spinning_disable_and_check(void) { return 0; }
static void call_console_drivers(const char *ext_text, size_t ext_len,
const char *text, size_t len) {}
static size_t msg_print_text(const struct printk_log *msg,
......@@ -1913,6 +2066,17 @@ static int __add_preferred_console(char *name, int idx, char *options,
c->index = idx;
return 0;
}
static int __init console_msg_format_setup(char *str)
{
if (!strcmp(str, "syslog"))
console_msg_format = MSG_FORMAT_SYSLOG;
if (!strcmp(str, "default"))
console_msg_format = MSG_FORMAT_DEFAULT;
return 1;
}
__setup("console_msg_format=", console_msg_format_setup);
/*
* Set up a console. Called via do_early_param() in init/main.c
* for each "console=" parameter in the boot command line.
......@@ -2069,20 +2233,7 @@ int console_trylock(void)
return 0;
}
console_locked = 1;
/*
* When PREEMPT_COUNT disabled we can't reliably detect if it's
* safe to schedule (e.g. calling printk while holding a spin_lock),
* because preempt_disable()/preempt_enable() are just barriers there
* and preempt_count() is always 0.
*
* RCU read sections have a separate preemption counter when
* PREEMPT_RCU enabled thus we must take extra care and check
* rcu_preempt_depth(), otherwise RCU read sections modify
* preempt_count().
*/
console_may_schedule = !oops_in_progress &&
preemptible() &&
!rcu_preempt_depth();
console_may_schedule = 0;
return 1;
}
EXPORT_SYMBOL(console_trylock);
......@@ -2215,7 +2366,10 @@ void console_unlock(void)
goto skip;
}
len += msg_print_text(msg, false, text + len, sizeof(text) - len);
len += msg_print_text(msg,
console_msg_format & MSG_FORMAT_SYSLOG,
text + len,
sizeof(text) - len);
if (nr_ext_console_drivers) {
ext_len = msg_print_ext_header(ext_text,
sizeof(ext_text),
......@@ -2229,14 +2383,29 @@ void console_unlock(void)
console_seq++;
raw_spin_unlock(&logbuf_lock);
/*
* While actively printing out messages, if another printk()
* were to occur on another CPU, it may wait for this one to
* finish. This task can not be preempted if there is a
* waiter waiting to take over.
*/
console_lock_spinning_enable();
stop_critical_timings(); /* don't trace print latency */
call_console_drivers(ext_text, ext_len, text, len);
start_critical_timings();
if (console_lock_spinning_disable_and_check()) {
printk_safe_exit_irqrestore(flags);
return;
}
printk_safe_exit_irqrestore(flags);
if (do_cond_resched)
cond_resched();
}
console_locked = 0;
/* Release the exclusive_console once it is used */
......
// SPDX-License-Identifier: GPL-2.0
#include "sched.h"
#include <linux/proc_fs.h>
#include <linux/seq_file.h>
#include <linux/kallsyms.h>
#include <linux/utsname.h>
#include <linux/security.h>
#include <linux/export.h>
#include "sched.h"
unsigned int __read_mostly sysctl_sched_autogroup_enabled = 1;
static struct autogroup autogroup_default;
static atomic_t autogroup_seq_nr;
......
......@@ -5,7 +5,6 @@
* DEBUG_PREEMPT variant of smp_processor_id().
*/
#include <linux/export.h>
#include <linux/kallsyms.h>
#include <linux/sched.h>
notrace static unsigned int check_preemption_disabled(const char *what1,
......@@ -43,7 +42,7 @@ notrace static unsigned int check_preemption_disabled(const char *what1,
printk(KERN_ERR "BUG: using %s%s() in preemptible [%08x] code: %s/%d\n",
what1, what2, preempt_count() - 1, current->comm, current->pid);
print_symbol("caller is %s\n", (long)__builtin_return_address(0));
printk("caller is %pS\n", __builtin_return_address(0));
dump_stack();
out_enable:
......
......@@ -42,7 +42,6 @@
#include "../mm/internal.h" /* For the trace_print_flags arrays */
#include <asm/page.h> /* for PAGE_SIZE */
#include <asm/sections.h> /* for dereference_function_descriptor() */
#include <asm/byteorder.h> /* cpu_to_le16 */
#include <linux/string_helpers.h>
......@@ -1863,10 +1862,10 @@ char *pointer(const char *fmt, char *buf, char *end, void *ptr,
switch (*fmt) {
case 'F':
case 'f':
ptr = dereference_function_descriptor(ptr);
/* Fallthrough */
case 'S':
case 's':
ptr = dereference_symbol_descriptor(ptr);
/* Fallthrough */
case 'B':
return symbol_string(buf, end, ptr, spec, fmt);
case 'R':
......
......@@ -5759,18 +5759,25 @@ sub process {
for (my $count = $linenr; $count <= $lc; $count++) {
my $fmt = get_quoted_string($lines[$count - 1], raw_line($count, 0));
$fmt =~ s/%%//g;
if ($fmt =~ /(\%[\*\d\.]*p(?![\WFfSsBKRraEhMmIiUDdgVCbGNOx]).)/) {
if ($fmt =~ /(\%[\*\d\.]*p(?![\WSsBKRraEhMmIiUDdgVCbGNOx]).)/) {
$bad_extension = $1;
last;
}
}
if ($bad_extension ne "") {
my $stat_real = raw_line($linenr, 0);
my $ext_type = "Invalid";
my $use = "";
for (my $count = $linenr + 1; $count <= $lc; $count++) {
$stat_real = $stat_real . "\n" . raw_line($count, 0);
}
if ($bad_extension =~ /p[Ff]/) {
$ext_type = "Deprecated";
$use = " - use %pS instead";
$use =~ s/pS/ps/ if ($bad_extension =~ /pf/);
}
WARN("VSPRINTF_POINTER_EXTENSION",
"Invalid vsprintf pointer extension '$bad_extension'\n" . "$here\n$stat_real\n");
"$ext_type vsprintf pointer extension '$bad_extension'$use\n" . "$here\n$stat_real\n");
}
}
......
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册