提交 · 7dec935a3aa04412cba2cebe1524ae0d34a30c24 · OpenHarmony / kernel_linux

04 3月, 2014 1 次提交

tracepoint: Do not waste memory on mods with no tracepoints · 7dec935a

由 Steven Rostedt (Red Hat) 提交于 2月 26, 2014

No reason to allocate tp_module structures for modules that have no
tracepoints. This just wastes memory.

Fixes: b75ef8b4 "Tracepoint: Dissociate from module mutex"
Acked-by: NMathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

7dec935a

21 2月, 2014 17 次提交

ftrace: Have static function trace clear ENABLED flag on unregister · 1fcc1553

由 Steven Rostedt (Red Hat) 提交于 2月 19, 2014

The ENABLED flag needs to be cleared when a ftrace_ops is unregistered
otherwise it wont be able to be registered again.

This is only for static tracing and does not affect DYNAMIC_FTRACE at
all.
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

1fcc1553

tracing: Add trace_clock=<clock> kernel parameter · e1e232ca

由 Steven Rostedt 提交于 2月 10, 2014

Being able to change the trace clock at boot can be advantageous if
you need a better source of when things happen across CPUs. The default
trace clock is the fastest, but it uses local clocks which may not be
synced across CPUs and it does not let you know when events took place
with respect to events on other CPUs.

The global trace clock can help in this case, and if you do not care
about timings, the counter "clock" is the best, as that is just a  simple
atomic counter that is incremented for every event.

Usage is to add "trace_clock=counter" on the kernel command line. You
can replace counter with "global" or any of the clocks listed in
/sys/kernel/debug/tracing/trace_clock
Suggested-by: NThomas Gleixner <tglx@linutronix.de>
Tested-by: NThomas Gleixner <tglx@linutronix.de>
Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
Appreciated-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

e1e232ca

tracing/uprobes: Support mix of ftrace and perf · 43fe9891

由 Namhyung Kim 提交于 1月 17, 2014

It seems there's no reason to prevent mixed used of ftrace and perf
for a single uprobe event.  At least the kprobes already support it.

Link: http://lkml.kernel.org/r/1389946120-19610-6-git-send-email-namhyung@kernel.orgReviewed-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: zhangwei(Jovi) <jovi.zhangwei@huawei.com>
Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

43fe9891

tracing/uprobes: Support event triggering · ca3b1620

由 Namhyung Kim 提交于 1月 17, 2014

Add support for event triggering to uprobes.  This is same as kprobes
support added by Tom (plus cleanup by Steven).

Link: http://lkml.kernel.org/r/1389946120-19610-5-git-send-email-namhyung@kernel.orgReviewed-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: zhangwei(Jovi) <jovi.zhangwei@huawei.com>
Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

ca3b1620

tracing/uprobes: Support ftrace_event_file base multibuffer · 70ed91c6

由 zhangwei(Jovi) 提交于 1月 17, 2014

Support multi-buffer on uprobe-based dynamic events by
using ftrace_event_file.

This patch is based kprobe-based dynamic events multibuffer
support work initially, commited by Masami(commit 41a7dd42),
but revised as below:

Oleg changed the kprobe-based multibuffer design from
array-pointers of ftrace_event_file into simple list,
so this patch also change to the list design.

rcu_read_lock/unlock added into uprobe_trace_func/uretprobe_trace_func,
to synchronize with ftrace_event_file list add and delete.

Even though we allow multi-uprobes instances now,
but TP_FLAG_PROFILE/TP_FLAG_TRACE are still mutually exclusive
in probe_event_enable currently, this means we cannot allow
one user is using uprobe-tracer, and another user is using
perf-probe on same uprobe concurrently.
(Perhaps this will be fix in future, kprobe don't have this
limitation now)

Link: http://lkml.kernel.org/r/1389946120-19610-4-git-send-email-namhyung@kernel.orgReviewed-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Reviewed-by: NOleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Nzhangwei(Jovi) <jovi.zhangwei@huawei.com>
Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

70ed91c6

tracing/uprobes: Move argument fetching to uprobe_dispatcher() · dd9fa555

由 Namhyung Kim 提交于 1月 17, 2014

A single uprobe event might serve different users like ftrace and
perf.  And this is especially important for upcoming multi buffer
support.  But in this case it'll fetch (same) data from userspace
multiple times.  So move it to the beginning of the dispatcher
function and reuse it for each users.

Link: http://lkml.kernel.org/r/1389946120-19610-3-git-send-email-namhyung@kernel.orgReviewed-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: zhangwei(Jovi) <jovi.zhangwei@huawei.com>
Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

dd9fa555

tracing/uprobes: Rename uprobe_{trace,perf}_print() functions · a43b9704

由 Namhyung Kim 提交于 1月 17, 2014

The uprobe_{trace,perf}_print functions are misnomers since what they
do is not printing.  There's also a real print function named
print_uprobe_event() so they'll only increase confusion IMHO.

Rename them with double underscores to follow convention of kprobe.

Link: http://lkml.kernel.org/r/1389946120-19610-2-git-send-email-namhyung@kernel.orgReviewed-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

a43b9704

ftrace: Allow for function tracing instance to filter functions · 591dffda

由 Steven Rostedt (Red Hat) 提交于 1月 10, 2014

Create a "set_ftrace_filter" and "set_ftrace_notrace" files in the instance
directories to let users filter of functions to trace for the given instance.
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

591dffda

ftrace: Pass in global_ops for use with filtering files · e3b3e2e8

由 Steven Rostedt (Red Hat) 提交于 11月 11, 2013

In preparation for having the function tracing instances be able to
filter on functions, the generic filter functions must first be
converted to take in the global_ops as a parameter.
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

e3b3e2e8

ftrace: Allow instances to use function tracing · f20a5806

由 Steven Rostedt (Red Hat) 提交于 11月 07, 2013

Allow instances (sub-buffers) to enable function tracing.
Each instance will have its own function tracing capability.
For now, instances will not have function stack tracing, or will
they be able to pick and choose what functions they can trace.

Picking and choosing their own functions will come later.
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

f20a5806

tracing: Convert tracer->enabled to counter · 50512ab5

由 Steven Rostedt (Red Hat) 提交于 1月 14, 2014

As tracers will soon be used by instances, the tracer enabled field
needs to be converted to a counter instead of a boolean.
This counter is protected by the trace_types_lock mutex.
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

50512ab5

tracing: Disable tracers before deletion of instance · 6b450d25

由 Steven Rostedt (Red Hat) 提交于 1月 14, 2014

When an instance is about to be deleted, make sure the tracer
is set to nop. If it isn't reset the tracer and set it to the nop
tracer, otherwise memory leaks and bad pointers may result.
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

6b450d25

ftrace: Copy ops private to global_ops private · e6435e96

由 Steven Rostedt (Red Hat) 提交于 1月 10, 2014

If global_ops function is being called directly, instead of the global_ops
list function, set the global_ops private to be the same as the ops private
that's being called directly.
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

e6435e96

tracing: Only let top level have option files · f1b21c9a

由 Steven Rostedt (Red Hat) 提交于 1月 14, 2014

Currently, only the top level instance can have tracing options.
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

f1b21c9a

tracing: Set up infrastructure to allow tracers for instances · 607e2ea1

由 Steven Rostedt (Red Hat) 提交于 11月 06, 2013

Currently the tracers (function, function_graph, irqsoff, etc) can only
be used by the top level tracing directory (not for instances).

This sets up the infrastructure to allow instances to be able to
run a separate tracer apart from the what the top level tracing is
doing.

As tracers need to adapt for being used by instances, the tracers
must flag if they can be used by instances or not. Currently only the
'nop' tracer can be used by all instances.
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

607e2ea1

tracing: Pass trace_array to flag_changed callback · bf6065b5

由 Steven Rostedt (Red Hat) 提交于 1月 10, 2014

As options (flags) may affect instances instead of being global
the flag_changed() callbacks need to receive the trace_array descriptor
of the instance they will be modifying.
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

bf6065b5

tracing: Pass trace_array to set_flag callback · 8c1a49ae

由 Steven Rostedt (Red Hat) 提交于 1月 10, 2014

As options (flags) may affect instances instead of being global
the set_flag() callbacks need to receive the trace_array descriptor
of the instance they will be modifying.
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

8c1a49ae

14 2月, 2014 1 次提交

tick: Clear broadcast pending bit when switching to oneshot · dd5fd9b9

由 Thomas Gleixner 提交于 2月 11, 2014

AMD systems which use the C1E workaround in the amd_e400_idle routine
trigger the WARN_ON_ONCE in the broadcast code when onlining a CPU.

The reason is that the idle routine of those AMD systems switches the
cpu into forced broadcast mode early on before the newly brought up
CPU can switch over to high resolution / NOHZ mode. The timer related
CPU1 bringup looks like this:

  clockevent_register_device(local_apic);
  tick_setup(local_apic);
  ...
  idle()
	tick_broadcast_on_off(FORCE);
	tick_broadcast_oneshot_control(ENTER)
	  cpumask_set(cpu, broadcast_oneshot_mask);
	halt();

Now the broadcast interrupt on CPU0 sets CPU1 in the
broadcast_pending_mask and wakes CPU1. So CPU1 continues:

	local_apic_timer_interrupt()
	   tick_handle_periodic();
	   softirq()
	     tick_init_highres();
	       cpumask_clr(cpu, broadcast_oneshot_mask);
	
	tick_broadcast_oneshot_control(ENTER)
	   WARN_ON(cpumask_test(cpu, broadcast_pending_mask);

So while we remove CPU1 from the broadcast_oneshot_mask when we switch
over to highres mode, we do not clear the pending bit, which then
triggers the warning when we go back to idle.

The reason why this is only visible on C1E affected AMD systems is
that the other machines enter the deep sleep states via
acpi_idle/intel_idle and exit the broadcast mode before executing the
remote triggered local_apic_timer_interrupt. So the pending bit is
already cleared when the switch over to highres mode is clearing the
oneshot mask.

The solution is simple: Clear the pending bit together with the mask
bit when we switch over to highres mode.

Stanislaw came up independently with the same patch by enforcing the
C1E workaround and debugging the fallout. I picked mine, because mine
has a changelog :)
Reported-by: Npoma <pomidorabelisima@gmail.com>
Debugged-by: NStanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Olaf Hering <olaf@aepfle.de>
Cc: Dave Jones <davej@redhat.com>
Cc: Justin M. Forbes <jforbes@redhat.com>
Cc: Josh Boyer <jwboyer@redhat.com>
Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1402111434180.21991@ionos.tec.linutronix.de
Cc: stable@vger.kernel.org # 3.10+
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

dd5fd9b9

12 2月, 2014 1 次提交

ring-buffer: Fix first commit on sub-buffer having non-zero delta · d651aa1d

由 Steven Rostedt (Red Hat) 提交于 2月 11, 2014

Each sub-buffer (buffer page) has a full 64 bit timestamp. The events on
that page use a 27 bit delta against that timestamp in order to save on
bits written to the ring buffer. If the time between events is larger than
what the 27 bits can hold, a "time extend" event is added to hold the
entire 64 bit timestamp again and the events after that hold a delta from
that timestamp.

As a "time extend" is always paired with an event, it is logical to just
allocate the event with the time extend, to make things a bit more efficient.

Unfortunately, when the pairing code was written, it removed the "delta = 0"
from the first commit on a page, causing the events on the page to be
slightly skewed.

Fixes: 69d1b839 "ring-buffer: Bind time extend and data events together"
Cc: stable@vger.kernel.org # 2.6.37+
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

d651aa1d

11 2月, 2014 1 次提交

genirq: Add missing irq_to_desc export for CONFIG_SPARSE_IRQ=n · 2c45aada

由 Paul Gortmaker 提交于 2月 10, 2014

In allmodconfig builds for sparc and any other arch which does
not set CONFIG_SPARSE_IRQ, the following will be seen at modpost:

  CC [M]  lib/cpu-notifier-error-inject.o
  CC [M]  lib/pm-notifier-error-inject.o
ERROR: "irq_to_desc" [drivers/gpio/gpio-mcp23s08.ko] undefined!
make[2]: *** [__modpost] Error 1

This happens because commit 3911ff30 ("genirq: export
handle_edge_irq() and irq_to_desc()") added one export for it, but
there were actually two instances of it, in an if/else clause for
CONFIG_SPARSE_IRQ.  Add the second one.
Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: stable@vger.kernel.org	# 3.4+
Link: http://lkml.kernel.org/r/1392057610-11514-1-git-send-email-paul.gortmaker@windriver.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

2c45aada

09 2月, 2014 1 次提交

genirq: Add devm_request_any_context_irq() · 0668d306

由 Stephen Boyd 提交于 1月 02, 2014

Some drivers use request_any_context_irq() but there isn't a
devm_* function for it. Add one so that these drivers don't need
to explicitly free the irq on driver detach.
Signed-off-by: NStephen Boyd <sboyd@codeaurora.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Link: http://lkml.kernel.org/r/1388709460-19222-3-git-send-email-sboyd@codeaurora.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

0668d306

06 2月, 2014 2 次提交

time: Fix overflow when HZ is smaller than 60 · 80d767d7

由 Mikulas Patocka 提交于 1月 24, 2014

When compiling for the IA-64 ski emulator, HZ is set to 32 because the
emulation is slow and we don't want to waste too many cycles processing
timers. Alpha also has an option to set HZ to 32.

This causes integer underflow in
kernel/time/jiffies.c:
kernel/time/jiffies.c:66:2: warning: large integer implicitly truncated to unsigned type [-Woverflow]
  .mult  = NSEC_PER_JIFFY << JIFFIES_SHIFT, /* details above */
  ^

This patch reduces the JIFFIES_SHIFT value to avoid the overflow.
Signed-off-by: NMikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1401241639100.23871@file01.intranet.prod.int.rdu2.redhat.com
Cc: stable@vger.kernel.org
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

80d767d7

execve: use 'struct filename *' for executable name passing · c4ad8f98

由 Linus Torvalds 提交于 2月 05, 2014

This changes 'do_execve()' to get the executable name as a 'struct
filename', and to free it when it is done.  This is what the normal
users want, and it simplifies and streamlines their error handling.

The controlled lifetime of the executable name also fixes a
use-after-free problem with the trace_sched_process_exec tracepoint: the
lifetime of the passed-in string for kernel users was not at all
obvious, and the user-mode helper code used UMH_WAIT_EXEC to serialize
the pathname allocation lifetime with the execve() having finished,
which in turn meant that the trace point that happened after
mm_release() of the old process VM ended up using already free'd memory.

To solve the kernel string lifetime issue, this simply introduces
"getname_kernel()" that works like the normal user-space getname()
function, except with the source coming from kernel memory.

As Oleg points out, this also means that we could drop the tcomm[] array
from 'struct linux_binprm', since the pathname lifetime now covers
setup_new_exec().  That would be a separate cleanup.
Reported-by: NIgor Zhbanov <i.zhbanov@samsung.com>
Tested-by: NSteven Rostedt <rostedt@goodmis.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

c4ad8f98

05 2月, 2014 1 次提交

genirq: Generic irq chip requires IRQ_DOMAIN · 923fa4ea

由 Nitin A Kamble 提交于 1月 30, 2014

The generic_chip.c uses interfaces from irq_domain.c which is
controlled by the IRQ_DOMAIN config option, but there is no Kconfig
dependency so the build can fail:

linux/kernel/irq/generic-chip.c:400:11: error:
'irq_domain_xlate_onetwocell' undeclared here (not in a function)

Select IRQ_DOMAIN when GENERIC_IRQ_CHIP is selected.
Signed-off-by: NNitin A Kamble <nitin.a.kamble@intel.com>
Link: http://lkml.kernel.org/r/1391129410-54548-2-git-send-email-nitin.a.kamble@intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org # 3.11+

923fa4ea

31 1月, 2014 2 次提交

kernel/smp.c: remove cpumask_ipi · 73f94550

由 Roman Gushchin 提交于 1月 30, 2014

After commit 9a46ad6d ("smp: make smp_call_function_many() use logic
similar to smp_call_function_single()"), cfd->cpumask is accessed only
in smp_call_function_many().  So there is no more need to copy it into
cfd->cpumask_ipi before putting csd into the list.  The cpumask_ipi
field is obsolete and can be removed.
Signed-off-by: NRoman Gushchin <klamm@yandex-team.ru>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Wang YanQing <udknight@gmail.com>
Cc: Xie XiuQi <xiexiuqi@huawei.com>
Cc: Shaohua Li <shli@fusionio.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

73f94550

kernel: use lockless list for smp_call_function_single · 6897fc22

由 Christoph Hellwig 提交于 1月 30, 2014

Make smp_call_function_single and friends more efficient by using a
lockless list.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

6897fc22

28 1月, 2014 6 次提交

sched: Make sched_class::get_rr_interval() optional · a57beec5

由 Peter Zijlstra 提交于 1月 27, 2014

Not all classes implement (or can implement) a useful get_rr_interval()
function, default to a 0 time-slice for them.

This fixes a crash reported by Tommi Rantala.
Reported-by: NTommi Rantala <tt.rantala@gmail.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Tommi Rantala <tt.rantala@gmail.com>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140127105413.GC11314@laptop.programming.kicks-ass.netSigned-off-by: NIngo Molnar <mingo@kernel.org>

a57beec5

sched/deadline: Add sched_dl documentation · 712e5e34

由 Dario Faggioli 提交于 1月 27, 2014

Add in Documentation/scheduler/ some hints about the design
choices, the usage and the future possible developments of the
sched_dl scheduling class and of the SCHED_DEADLINE policy.
Reviewed-by: NHenrik Austad <henrik@austad.us>
Signed-off-by: NDario Faggioli <raistlin@linux.it>
Signed-off-by: NJuri Lelli <juri.lelli@gmail.com>
[ Re-wrote sections 2 and 3. ]
Signed-off-by: NLuca Abeni <luca.abeni@unitn.it>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1390821615-23247-1-git-send-email-juri.lelli@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

712e5e34

softirq: use const char * const for softirq_to_name, whitespace neatening · ce85b4f2

由 Joe Perches 提交于 1月 27, 2014

Reduce data size a little.
Reduce checkpatch noise.

$ size kernel/softirq.o*
   text	   data	    bss	    dec	    hex	filename
  11554	   6013	   4008	  21575	   5447	kernel/softirq.o.new
  11474	   6093	   4008	  21575	   5447	kernel/softirq.o.old
Signed-off-by: NJoe Perches <joe@perches.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

ce85b4f2

softirq: convert printks to pr_<level> · 40322764

由 Joe Perches 提交于 1月 27, 2014

Use a more current logging style.
Signed-off-by: NJoe Perches <joe@perches.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

40322764

softirq: use ffs() in __do_softirq() · 2e702b9f

由 Joe Perches 提交于 1月 27, 2014

Possible speed improvement of __do_softirq() by using ffs() instead of
using a while loop with an & 1 test then single bit shift.
Signed-off-by: NJoe Perches <joe@perches.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

2e702b9f

kernel/kexec.c: use vscnprintf() instead of vsnprintf() in vmcoreinfo_append_str() · a19428e5

由 Chen Gang 提交于 1月 27, 2014

vsnprintf() may let 'r' larger than sizeof(buf), in this case, if 'r' is
also less than "vmcoreinfo_max_size - vmcoreinfo_size" (left size of
destination buffer), next memcpy() will read the unexpected addresses.
Signed-off-by: NChen Gang <gang.chen@asianux.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

a19428e5

25 1月, 2014 4 次提交

hung_task: Display every hung task warning · 270750db

由 Aaron Tomlin 提交于 1月 20, 2014

When khungtaskd detects hung tasks, it prints out
backtraces from a number of those tasks.

Limiting the number of backtraces being printed
out can result in the user not seeing the information
necessary to debug the issue. The hung_task_warnings
sysctl controls this feature.

This patch makes it possible for hung_task_warnings
to accept a special value to print an unlimited
number of backtraces when khungtaskd detects hung
tasks.

The special value is -1. To use this value it is
necessary to change types from ulong to int.
Signed-off-by: NAaron Tomlin <atomlin@redhat.com>
Reviewed-by: NRik van Riel <riel@redhat.com>
Acked-by: NDavid Rientjes <rientjes@google.com>
Cc: oleg@redhat.com
Link: http://lkml.kernel.org/r/1390239253-24030-3-git-send-email-atomlin@redhat.com
[ Build warning fix. ]
Signed-off-by: NIngo Molnar <mingo@kernel.org>

270750db

introduce __fcheck_files() to fix rcu_dereference_check_fdtable(), kill rcu_my_thread_group_empty() · a8d4b834

由 Oleg Nesterov 提交于 1月 11, 2014

rcu_dereference_check_fdtable() looks very wrong,

1. rcu_my_thread_group_empty() was added by 844b9a87 "vfs: fix
   RCU-lockdep false positive due to /proc" but it doesn't really
   fix the problem. A CLONE_THREAD (without CLONE_FILES) task can
   hit the same race with get_files_struct().

   And otoh rcu_my_thread_group_empty() can suppress the correct
   warning if the caller is the CLONE_FILES (without CLONE_THREAD)
   task.

2. files->count == 1 check is not really right too. Even if this
   files_struct is not shared it is not safe to access it lockless
   unless the caller is the owner.

   Otoh, this check is sub-optimal. files->count == 0 always means
   it is safe to use it lockless even if files != current->files,
   but put_files_struct() has to take rcu_read_lock(). See the next
   patch.

This patch removes the buggy checks and turns fcheck_files() into
__fcheck_files() which uses rcu_dereference_raw(), the "unshared"
callers, fget_light() and fget_raw_light(), can use it to avoid
the warning from RCU-lockdep.

fcheck_files() is trivially reimplemented as rcu_lockdep_assert()
plus __fcheck_files().
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

a8d4b834

sysctl: Add neg_one as a standard constraint · 2397efb1

由 Aaron Tomlin 提交于 1月 20, 2014

Add neg_one to the list of standard constraints - will be used by the next patch.
Signed-off-by: NAaron Tomlin <atomlin@redhat.com>
Acked-by: NRik van Riel <riel@redhat.com>
Acked-by: NDavid Rientjes <rientjes@google.com>
Cc: oleg@redhat.com
Link: http://lkml.kernel.org/r/1390239253-24030-2-git-send-email-atomlin@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

2397efb1

kgdb/kdb: Fix no KDB config problem · fc8b1374

由 Mike Travis 提交于 1月 14, 2014

Some code added to the debug_core module had KDB dependencies
that it shouldn't have.  Move the KDB dependent REASON back to
the caller to remove the dependency in the debug core code.

Update the call from the UV NMI handler to conform to the new
interface.
Signed-off-by: NMike Travis <travis@sgi.com>
Reviewed-by: NHedi Berriche <hedi@sgi.com>
Cc: Russ Anderson <rja@sgi.com>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Link: http://lkml.kernel.org/r/20140114162551.318251993@asylum.americas.sgi.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

fc8b1374

24 1月, 2014 3 次提交

kdump: fix exported size of vmcoreinfo note · 77019967

由 Vivek Goyal 提交于 1月 23, 2014

Right now we seem to be exporting the max data size contained inside
vmcoreinfo note.  But this does not include the size of meta data around
vmcore info data.  Like name of the note and starting and ending elf_note.

I think user space expects total size and that size is put in PT_NOTE elf
header.  Things seem to be fine so far because we are not using vmcoreinfo
note to the maximum capacity.  But as it starts filling up, to capacity,
at some point of time, problem will be visible.

I don't think user space will be broken with this change.  So there is no
need to introduce vmcoreinfo2.  This change is safe and backward
compatible.  More explanation on why this change is safe is below.

vmcoreinfo contains information about kernel which user space needs to
know to do things like filtering.  For example, various kernel config
options or information about size or offset of some data structures etc.
All this information is commmunicated to user space with an ELF note
present in ELF /proc/vmcore file.

Currently vmcoreinfo data size is 4096.  With some elf note meta data
around it, actual size is 4132 bytes.  But we are using barely 25% of that
size.  Rest is empty.  So even if we tell user space that size of ELf note
is 4096 and not 4132, nothing will be broken becase after around 1000
bytes, everything is zero anyway.

But once we start filling up the note to the capacity, and not report the
full size of note, bad things will start happening.  Either some data will
be lost or tools will be confused that they did not fine the zero note at
the end.

So I think this change is safe and should not break existing tools.
Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
Cc: Ken'ichi Ohmichi <oomichi@mxs.nes.nec.co.jp>
Cc: Dan Aloni <da-x@monatomic.org>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

77019967

kexec: add sysctl to disable kexec_load · 7984754b

由 Kees Cook 提交于 1月 23, 2014

For general-purpose (i.e.  distro) kernel builds it makes sense to build
with CONFIG_KEXEC to allow end users to choose what kind of things they
want to do with kexec.  However, in the face of trying to lock down a
system with such a kernel, there needs to be a way to disable kexec_load
(much like module loading can be disabled).  Without this, it is too easy
for the root user to modify kernel memory even when CONFIG_STRICT_DEVMEM
and modules_disabled are set.  With this change, it is still possible to
load an image for use later, then disable kexec_load so the image (or lack
of image) can't be altered.

The intention is for using this in environments where "perfect"
enforcement is hard.  Without a verified boot, along with verified
modules, and along with verified kexec, this is trying to give a system a
better chance to defend itself (or at least grow the window of
discoverability) against attack in the face of a privilege escalation.

In my mind, I consider several boot scenarios:

1) Verified boot of read-only verified root fs loading fd-based
   verification of kexec images.
2) Secure boot of writable root fs loading signed kexec images.
3) Regular boot loading kexec (e.g. kcrash) image early and locking it.
4) Regular boot with no control of kexec image at all.

1 and 2 don't exist yet, but will soon once the verified kexec series has
landed.  4 is the state of things now.  The gap between 2 and 4 is too
large, so this change creates scenario 3, a middle-ground above 4 when 2
and 1 are not possible for a system.
Signed-off-by: NKees Cook <keescook@chromium.org>
Acked-by: NRik van Riel <riel@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

7984754b

kernel/signal.c: change do_signal_stop/do_sigaction to use while_each_thread() · 8d38f203

由 Oleg Nesterov 提交于 1月 23, 2014

Change do_signal_stop() and do_sigaction() to avoid next_thread() and use
while_each_thread() instead.
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Kees Cook <keescook@chromium.org>
Reviewed-by: NSameer Nanda <snanda@chromium.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

8d38f203

OpenHarmony / kernel_linux 上一次同步 4 年多

OpenHarmony / kernel_linux
上一次同步 4 年多