Commits · c1e7abbc7afc97367cd77c8f2895c2169a8f9c87 · openeuler / raspberrypi-kernel

12 Nov, 2008 1 commit

ring-buffer: buffer record on/off switch · a3583244

Authored by Steven Rostedt on Nov 11, 2008

Impact: enable/disable ring buffer recording API added

Several kernel developers have requested that there be a way to stop
recording into the ring buffers with a simple switch that can also
be enabled from userspace. This patch adds a new kernel API to the
ring buffers called:

 tracing_on()
 tracing_off()

When tracing_off() is called, none of the ring buffers will be able to
record into their buffers.

tracing_on() will enable the ring buffers again.

These two act like an on/off switch. That is, there is no counting of the
number of times tracing_off or tracing_on has been called.

A new file is added to the debugfs/tracing directory called

  tracing_on

This allows for userspace applications to also flip the switch.

  echo 0 > /debugfs/tracing/tracing_on

disables the tracing.

  echo 1 > /debugfs/tracing/tracing_on

enables it.

Note, this does not disable or enable any tracers. It only sets or clears
a flag that needs to be set in order for the ring buffers to write to
their buffers. It is a global flag, and affects all ring buffers.

The buffers start out with tracing_on enabled.

There are now three flags that control recording into the buffers:

 tracing_on: which affects all ring buffer tracers.

 buffer->record_disabled: which affects an allocated buffer, which may be set
     if an anomaly is detected, and tracing is disabled.

 cpu_buffer->record_disabled: which is set by tracing_stop() or if an
     anomaly is detected. tracing_start() cannot re-enable this if
     an anomaly occurred.

The userspace debugfs/tracing/tracing_enabled is implemented with
tracing_stop(), but user space code cannot enable it if the kernel
called tracing_stop().

Userspace can enable tracing_on even if the kernel disabled it.
It is just a switch used to stop tracing if a condition was hit.
tracing_on is not for protecting critical areas in the kernel nor is
it for stopping tracing if an anomaly occurred. This is because userspace
can reenable it at any time.
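
The way the three flags combine can be modelled in a few lines of userspace C. This is only an illustrative sketch: the struct layouts and every `model_*` name are hypothetical and are not the kernel's ring-buffer code; only the flag names come from the text above.

```c
#include <stdbool.h>

/* Hypothetical userspace model; only the flag names come from the commit text. */
static bool model_global_tracing_on = true;  /* buffers start out enabled */

struct model_cpu_buffer { int record_disabled; };

struct model_buffer {
    int record_disabled;
    struct model_cpu_buffer cpu;
};

/* Plain on/off switch: no counting of how often each side is called. */
static void model_tracing_on(void)  { model_global_tracing_on = true; }
static void model_tracing_off(void) { model_global_tracing_on = false; }

/* A write is recorded only if all three levels allow it. */
static bool model_may_record(const struct model_buffer *b)
{
    if (!model_global_tracing_on)
        return false;
    if (b->record_disabled)
        return false;
    if (b->cpu.record_disabled)
        return false;
    return true;
}
```

In this model, calling `model_tracing_off()` twice and `model_tracing_on()` once leaves recording enabled, which is the "no counting" behaviour described above.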

Side effect: With this patch, I discovered a dead variable in ftrace.c
  called tracing_on. This patch removes it.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>

11 Nov, 2008 7 commits

sched: release buddies on yield · 2002c695

Authored by Peter Zijlstra on Nov 11, 2008

Clear buddies on yield, so that the buddy rules don't schedule them
despite them being placed right-most.

This fixed a performance regression with yield-happy binary JVMs.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Tested-by: Lin Ming <ming.m.lin@intel.com>

timers: handle HRTIMER_CB_IRQSAFE_UNLOCKED correctly from softirq context · 5d5254f0

Authored by Gautham R Shenoy on Oct 25, 2008

Impact: fix incorrect locking triggered during hotplug-intense stress-tests

While migrating the CB_IRQSAFE_UNLOCKED timers during a cpu-offline,
we queue them on the cb_pending list, so that they won't go
stale.

Thus, when the callbacks of the timers run from the softirq context,
they could run into potential deadlocks, since these callbacks
assume that they're running with irq's disabled, thereby annoying
lockdep!

Fix this by emulating hardirq context while running these callbacks from
the hrtimer softirq.

=================================
[ INFO: inconsistent lock state ]
2.6.27 #2
--------------------------------
inconsistent {in-hardirq-W} -> {hardirq-on-W} usage.
ksoftirqd/0/4 [HC0[0]:SC1[1]:HE1:SE0] takes:
 (&rq->lock){++..}, at: [<c011db84>] sched_rt_period_timer+0x9e/0x1fc
{in-hardirq-W} state was registered at:
  [<c014103c>] __lock_acquire+0x549/0x121e
  [<c0107890>] native_sched_clock+0x88/0x99
  [<c013aa12>] clocksource_get_next+0x39/0x3f
  [<c0139abc>] update_wall_time+0x616/0x7df
  [<c0141d6b>] lock_acquire+0x5a/0x74
  [<c0121724>] scheduler_tick+0x3a/0x18d
  [<c047ed45>] _spin_lock+0x1c/0x45
  [<c0121724>] scheduler_tick+0x3a/0x18d
  [<c0121724>] scheduler_tick+0x3a/0x18d
  [<c012c436>] update_process_times+0x3a/0x44
  [<c013c044>] tick_periodic+0x63/0x6d
  [<c013c062>] tick_handle_periodic+0x14/0x5e
  [<c010568c>] timer_interrupt+0x44/0x4a
  [<c0150c9f>] handle_IRQ_event+0x13/0x3d
  [<c0151c14>] handle_level_irq+0x79/0xbd
  [<c0105634>] do_IRQ+0x69/0x7d
  [<c01041e4>] common_interrupt+0x28/0x30
  [<c047007b>] aac_probe_one+0x1a3/0x3f3
  [<c047ec2d>] _spin_unlock_irqrestore+0x36/0x39
  [<c01512b4>] setup_irq+0x1be/0x1f9
  [<c065d70b>] start_kernel+0x259/0x2c5
  [<ffffffff>] 0xffffffff
irq event stamp: 50102
hardirqs last  enabled at (50102): [<c047ebf4>] _spin_unlock_irq+0x20/0x23
hardirqs last disabled at (50101): [<c047edc2>] _spin_lock_irq+0xa/0x4b
softirqs last  enabled at (50088): [<c0128ba6>] do_softirq+0x37/0x4d
softirqs last disabled at (50099): [<c0128ba6>] do_softirq+0x37/0x4d

other info that might help us debug this:
no locks held by ksoftirqd/0/4.

stack backtrace:
Pid: 4, comm: ksoftirqd/0 Not tainted 2.6.27 #2
 [<c013f6cb>] print_usage_bug+0x13e/0x147
 [<c013fef5>] mark_lock+0x493/0x797
 [<c01410b1>] __lock_acquire+0x5be/0x121e
 [<c0141d6b>] lock_acquire+0x5a/0x74
 [<c011db84>] sched_rt_period_timer+0x9e/0x1fc
 [<c047ed45>] _spin_lock+0x1c/0x45
 [<c011db84>] sched_rt_period_timer+0x9e/0x1fc
 [<c011db84>] sched_rt_period_timer+0x9e/0x1fc
 [<c01210fd>] finish_task_switch+0x41/0xbd
 [<c0107890>] native_sched_clock+0x88/0x99
 [<c011dae6>] sched_rt_period_timer+0x0/0x1fc
 [<c0136dda>] run_hrtimer_pending+0x54/0xe5
 [<c011dae6>] sched_rt_period_timer+0x0/0x1fc
 [<c0128afb>] __do_softirq+0x7b/0xef
 [<c0128ba6>] do_softirq+0x37/0x4d
 [<c0128c12>] ksoftirqd+0x56/0xc5
 [<c0128bbc>] ksoftirqd+0x0/0xc5
 [<c0134649>] kthread+0x38/0x5d
 [<c0134611>] kthread+0x0/0x5d
 [<c0104477>] kernel_thread_helper+0x7/0x10
 =======================

Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

fix for account_group_exec_runtime(), make sure ->signal can't be freed under rq->lock · ad474cac

Authored by Oleg Nesterov on Nov 10, 2008

Impact: fix hang/crash on ia64 under high load

This is ugly, but the simplest patch by far.

Unlike other similar routines, account_group_exec_runtime() could be
called "implicitly" from within the scheduler after exit_notify(). This
means we can race with the parent doing release_task(), so we can't just
check ->signal != NULL.

Change __exit_signal() to do spin_unlock_wait(&task_rq(tsk)->lock)
before __cleanup_signal() to make sure ->signal can't be freed under
task_rq(tsk)->lock. Note that task_rq_unlock_wait() doesn't care
about the case when tsk changes cpu/rq under us, this should be OK.

Thanks to Ingo who nacked my previous buggy patch.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Reported-by: Doug Chapman <doug.chapman@hp.com>

ring-buffer: prevent infinite looping on time stamping · 4143c5cb

Authored by Steven Rostedt on Nov 10, 2008

Impact: removal of unnecessary looping

The lockless part of the ring buffer allows for reentry into the code
from interrupts. A timestamp is taken, a test is performed and if it
detects that an interrupt occurred that did tracing, it tries again.

The problem arises if the timestamp code itself causes a trace.
The detection will detect this and loop again. The difference between
this and an interrupt doing tracing, is that this will fail every time,
and cause an infinite loop.

Currently, we test if the loop happens 1000 times, and if so, it will
produce a warning and disable the ring buffer.

The problem with this approach is that it makes it difficult to perform
some types of tracing (tracing the timestamp code itself).

Each trace entry has a delta timestamp from the previous entry.
If a trace entry is reserved but an interrupt occurs and traces before
the previous entry is committed, the delta timestamp for that entry will
be zero. This actually makes sense in terms of tracing, because the
interrupt entry happened before the preempted entry was committed, so
one may consider the two happening at the same time. The order is
still preserved in the buffer.

With this idea, instead of trying to get a new timestamp if an interrupt
made it in between the timestamp and the test, the entry could simply
make the delta zero and continue. This will prevent interrupts or
tracers in the timer code from causing the above loop.
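
The zero-delta idea can be sketched in userspace C as follows. This is a minimal model under stated assumptions, not the ring buffer's actual code: `struct model_writer`, its `commits` counter and `model_entry_delta()` are hypothetical names invented for the illustration.

```c
#include <stdint.h>

/* Hypothetical model of a writer; names are illustrative, not kernel code. */
struct model_writer {
    uint64_t write_stamp;   /* timestamp of the last committed entry */
    unsigned long commits;  /* bumped every time an entry is committed */
};

/*
 * Compute the delta for a new entry. 'ts' was sampled at the point where
 * 'seen_commits' was read. If another entry was committed in between (an
 * interrupt, or the timestamp code tracing itself), use a zero delta
 * instead of sampling again and retrying.
 */
static uint64_t model_entry_delta(const struct model_writer *w,
                                  uint64_t ts, unsigned long seen_commits)
{
    if (w->commits != seen_commits)
        return 0;               /* interrupted: treat as simultaneous */
    if (ts < w->write_stamp)
        return 0;               /* never produce a negative delta */
    return ts - w->write_stamp;
}
```

Because the function never loops, a tracer that fires inside the timestamp code can no longer keep it spinning; the cost is that such entries share a timestamp with their neighbour.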

Signed-off-by: Steven Rostedt <srostedt@redhat.com>

ftrace: disable tracing on resize · bf5e6519

Authored by Steven Rostedt on Nov 10, 2008

Impact: fix for bug on resize

This patch addresses the bug found here:

 http://bugzilla.kernel.org/show_bug.cgi?id=11996

When ftrace converted to the new unified trace buffer, the resizing of
the buffer was not protected as much as it was originally. If tracing
is performed while the resize occurs, then the buffer can be corrupted.

This patch disables all ftrace buffer modifications before a resize
takes place.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>

nohz: disable tick_nohz_kick_tick() for now · ae99286b

Authored by Thomas Gleixner on Nov 10, 2008

Impact: nohz powersavings and wakeup regression

Commit fb02fbc1 (NOHZ: restart tick device from irq_enter()) causes a
serious wakeup regression.

While the patch is correct, it does not take into account that spurious
wakeups happen on x86. A fix for this issue is available, but we just
revert to the .27 behaviour and let long running softirqs screw
themselves.

Disable it for now.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

irq: call __irq_enter() before calling the tick_idle_check · ee5f80a9

Authored by Thomas Gleixner on Nov 07, 2008

Impact: avoid spurious ksoftirqd wakeups

The tick idle check, which is called from irq_enter(), was run before
the call to __irq_enter(), so the in_interrupt() bits in preempt_count
were not yet set. As a result, raising a softirq woke up ksoftirqd for
nothing, as the softirq was handled on return from interrupt.

Call __irq_enter() before calling into the tick idle check code.
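
The ordering problem can be illustrated with a small userspace model. Everything here is hypothetical (the `model_*` names, the offset value, and the assumption that the idle check ends up raising a softirq); it only shows why the order of the two calls matters.

```c
/* Hypothetical userspace model; none of these are the kernel's definitions. */
#define MODEL_HARDIRQ_OFFSET 0x10000u

static unsigned int model_preempt_count;
static int model_ksoftirqd_wakeups;

static int model_in_interrupt(void)
{
    return model_preempt_count != 0;
}

/* Outside interrupt context, a raised softirq needs the daemon to run it. */
static void model_raise_softirq(void)
{
    if (!model_in_interrupt())
        model_ksoftirqd_wakeups++;
}

/* Assumption for the sketch: the idle check ends up raising a softirq. */
static void model_tick_idle_check(void)
{
    model_raise_softirq();
}

/* Old order: the check runs before the interrupt bits are set. */
static void model_irq_enter_old(void)
{
    model_tick_idle_check();
    model_preempt_count += MODEL_HARDIRQ_OFFSET;
}

/* New order: set the interrupt bits first, then run the check. */
static void model_irq_enter_new(void)
{
    model_preempt_count += MODEL_HARDIRQ_OFFSET;
    model_tick_idle_check();
}

static void model_irq_exit(void)
{
    model_preempt_count -= MODEL_HARDIRQ_OFFSET;
}
```

With the old order the model counts one needless wakeup per interrupt; with the new order it counts none.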

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

10 Nov, 2008 1 commit

sched: clean up debug info · 5ac5c4d6

Authored by Peter Zijlstra on Nov 10, 2008

Impact: clean up and fix debug info printout

While looking over the sched_debug code I noticed that we printed the rq
schedstats for every cfs_rq; amend this.

Also change nr_spread_over into an int, and fix a little buglet in
min_vruntime printing.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

07 Nov, 2008 3 commits

sched: fix memory leak in a failure path · ca3273f9

Authored by Li Zefan on Nov 07, 2008

Impact: fix rare memory leak in the sched-domains manual reconfiguration code

In the failure path, rd is not attached to a sched domain,
so it causes a leak.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

sched: fix a bug in sched domain degenerate · f29c9b1c

Authored by Li Zefan on Nov 06, 2008

Impact: re-add incorrectly eliminated sched domain layers

(1) on i386 with SCHED_SMT and SCHED_MC enabled
	# mount -t cgroup -o cpuset xxx /mnt
	# echo 0 > /mnt/cpuset.sched_load_balance
	# mkdir /mnt/0
	# echo 0 > /mnt/0/cpuset.cpus
	# dmesg
	CPU0 attaching sched-domain:
	 domain 0: span 0 level CPU
	  groups: 0

(2) on i386 with SCHED_MC enabled but SCHED_SMT disabled
	# same as (1)
	# dmesg
	CPU0 attaching NULL sched-domain.

The bug is that some sched domains may be skipped unintentionally when
degenerating (optimizing) sched domains.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

cgroups: fix invalid cgrp->dentry before cgroup has been completely removed · 24eb0899

Authored by Li Zefan on Nov 06, 2008

This fixes an oops when reading /proc/sched_debug.

A cgroup won't be removed completely until finishing cgroup_diput(), so we
shouldn't invalidate cgrp->dentry in cgroup_rmdir().  Otherwise, when a
group is being removed while cgroup_path() gets called, we may trigger
NULL dereference BUG.

The bug can be reproduced:

 # cat test.sh
 #!/bin/sh
 mount -t cgroup -o cpu xxx /mnt
 for (( ; ; ))
 {
	mkdir /mnt/sub
	rmdir /mnt/sub
 }
 # ./test.sh &
 # cat /proc/sched_debug

BUG: unable to handle kernel NULL pointer dereference at 00000038
IP: [<c045a47f>] cgroup_path+0x39/0x90
...
Call Trace:
 [<c0420344>] ? print_cfs_rq+0x6e/0x75d
 [<c0421160>] ? sched_debug_show+0x72d/0xc1e
...

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Paul Menage <menage@google.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: <stable@kernel.org>		[2.6.26.x, 2.6.27.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

06 Nov, 2008 3 commits

cpumask: introduce new API, without changing anything · 2d3854a3

Authored by Rusty Russell on Nov 05, 2008

Impact: introduce new APIs

We want to deprecate cpumasks on the stack, as we are headed for
ginormous numbers of CPUs.  Eventually, we want to head towards an
undefined 'struct cpumask' so they can never be declared on the stack.

1) New cpumask functions which take pointers instead of copies.
   (cpus_* -> cpumask_*)

2) Several new helpers to reduce requirements for temporary cpumasks
   (cpumask_first_and, cpumask_next_and, cpumask_any_and)

3) Helpers for declaring cpumasks on or offstack for large NR_CPUS
   (cpumask_var_t, alloc_cpumask_var and free_cpumask_var)

4) 'struct cpumask' for explicitness and to mark new-style code.

5) Make iterator functions stop at nr_cpu_ids (a runtime constant),
   not NR_CPUS for time efficiency and for smaller dynamic allocations
   in future.

6) cpumask_copy() so we can allocate less than a full cpumask eventually
   (for alloc_cpumask_var), and so we can eliminate the 'struct cpumask'
   definition eventually.

7) work_on_cpu() helper for running a task on a CPU, rather than saving
   the old cpumask for the current thread and manipulating it.

8) smp_call_function_many() which is smp_call_function_mask() except
   taking a cpumask pointer.

Note that this patch simply introduces the new functions and leaves
the obsolescent ones in place.  This is to simplify the transition
patches.
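
Points 1, 2 and 5 above can be illustrated with a small userspace model. This is a sketch only: `struct model_cpumask`, `model_nr_cpu_ids` and the helper names are hypothetical stand-ins and do not reproduce the kernel's definitions.

```c
#include <limits.h>

#define MODEL_NR_CPUS 128   /* hypothetical compile-time maximum */
#define MODEL_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define MODEL_LONGS \
    ((MODEL_NR_CPUS + MODEL_BITS_PER_LONG - 1) / MODEL_BITS_PER_LONG)

struct model_cpumask { unsigned long bits[MODEL_LONGS]; };

static unsigned int model_nr_cpu_ids = 8;   /* hypothetical runtime CPU count */

static void model_set_cpu(unsigned int cpu, struct model_cpumask *m)
{
    m->bits[cpu / MODEL_BITS_PER_LONG] |= 1UL << (cpu % MODEL_BITS_PER_LONG);
}

static int model_test_cpu(unsigned int cpu, const struct model_cpumask *m)
{
    return (m->bits[cpu / MODEL_BITS_PER_LONG] >>
            (cpu % MODEL_BITS_PER_LONG)) & 1UL;
}

/*
 * First CPU set in both masks, or model_nr_cpu_ids if there is none.
 * Takes pointers, needs no temporary mask, and stops at the runtime limit
 * rather than at the compile-time maximum.
 */
static unsigned int model_first_and(const struct model_cpumask *a,
                                    const struct model_cpumask *b)
{
    unsigned int cpu;

    for (cpu = 0; cpu < model_nr_cpu_ids; cpu++)
        if (model_test_cpu(cpu, a) && model_test_cpu(cpu, b))
            return cpu;
    return model_nr_cpu_ids;
}
```

The old style would build an intersection mask on the stack and then search it; passing two pointers avoids that temporary, which matters once a mask covers thousands of CPUs.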

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

Add round_jiffies_up and related routines · 9c133c46

Authored by Alan Stern on Nov 06, 2008

This patch (as1158b) adds round_jiffies_up() and friends.  These
routines work like the analogous round_jiffies() functions, except
that they will never round down.

The new routines will be useful for timeouts where we don't care
exactly when the timer expires, provided it doesn't expire too soon.
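
The difference between rounding that may go down and rounding that never does can be sketched as below. This is a simplified model, not the kernel routines: `MODEL_HZ`, the quarter-second threshold and both function names are assumptions made for the illustration, and any per-CPU skew is ignored.

```c
/* Hypothetical tick rate for the sketch; the kernel's HZ is a build option. */
#define MODEL_HZ 250UL

/* Round j up to the next multiple of MODEL_HZ; aligned values stay put. */
static unsigned long model_round_up(unsigned long j)
{
    unsigned long rem = j % MODEL_HZ;

    return rem ? j + (MODEL_HZ - rem) : j;
}

/* Rounding that may go down, shown for contrast (threshold is assumed). */
static unsigned long model_round_nearest(unsigned long j)
{
    unsigned long rem = j % MODEL_HZ;

    return rem < MODEL_HZ / 4 ? j - rem : j + (MODEL_HZ - rem);
}
```

For a timeout, `model_round_nearest()` could return a value earlier than the one requested, while `model_round_up()` never returns less than its input.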

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

generic-ipi: fix the smp_mb() placement · 561920a0

Authored by Suresh Siddha on Oct 30, 2008

smp_mb() is needed (to make the memory operations visible globally) before
sending the IPI on the sender, and the receiver (on Alpha at least) needs
smp_read_barrier_depends() in the handler before reading the call_single_queue
list in a lock-free fashion.

On x86, x2apic mode register accesses for sending IPI's don't have serializing
semantics. So the need for smp_mb() before sending the IPI becomes more
critical in x2apic mode.

Remove the unnecessary smp_mb() in csd_flag_wait(), as the presence of that
smp_mb() doesn't mean anything on the sender, when the IPI receiver is not
doing anything special (like a memory fence) after clearing the CSD_FLAG_WAIT.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

05 Nov, 2008 5 commits

sched: fix buddies for group scheduling · 02479099

Authored by Peter Zijlstra on Nov 04, 2008

Impact: scheduling order fix for group scheduling

For each level in the hierarchy, set the buddy to point to the right entity.
Therefore, when we do the hierarchical schedule, we have a fair chance of
ending up where we meant to.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

sched: backward looking buddy · 4793241b

Authored by Peter Zijlstra on Nov 04, 2008

Impact: improve/change/fix wakeup-buddy scheduling

Currently we only have a forward looking buddy, that is, we prefer to
schedule to the task we last woke up, under the presumption that it's
going to consume the data we just produced, and therefore will have
cache hot benefits.

This allows co-waking producer/consumer task pairs to run ahead of the
pack for a little while, keeping their cache warm. Without this, we
would interleave all pairs, utterly trashing the cache.

This patch introduces a backward looking buddy, that is, suppose that
in the above scenario, the consumer preempts the producer before it
can go to sleep, we will therefore miss the wakeup from consumer to
producer (it's already running, after all), breaking the cycle and
reverting to the cache-trashing interleaved schedule pattern.

The backward buddy will try to schedule back to the task that woke us
up in case the forward buddy is not available, under the assumption
that the last task will be the most cache-hot task around, barring
current.

This will basically allow a task to continue after it got preempted.

In order to avoid starvation, we allow either buddy to get wakeup_gran
ahead of the pack.
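
The selection order described above can be sketched as a small userspace function. It is a simplified model, not the scheduler's code: `struct model_entity`, `model_pick_next()` and the single `gran` parameter standing in for wakeup_gran are assumptions made for the illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified entity: only the virtual runtime matters here. */
struct model_entity { int64_t vruntime; };

/* A buddy may run if it is at most 'gran' ahead of the leftmost entity. */
static int model_buddy_ok(const struct model_entity *buddy,
                          const struct model_entity *leftmost, int64_t gran)
{
    return buddy && buddy->vruntime - leftmost->vruntime <= gran;
}

/* Prefer the forward buddy, then the backward buddy, else the leftmost. */
static const struct model_entity *
model_pick_next(const struct model_entity *leftmost,
                const struct model_entity *next_buddy,
                const struct model_entity *last_buddy, int64_t gran)
{
    if (model_buddy_ok(next_buddy, leftmost, gran))
        return next_buddy;
    if (model_buddy_ok(last_buddy, leftmost, gran))
        return last_buddy;
    return leftmost;
}
```

The `gran` bound is what keeps the sketch from starving other tasks: once a buddy has run further ahead than the allowed granularity, the leftmost entity is picked again.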

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

sched: fix fair preempt check · d95f98d0

Authored by Peter Zijlstra on Nov 04, 2008

Impact: fix cross-class preemption

Inter-class wakeup preemptions should go on class order.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

sched: cleanup fair task selection · f4b6755f

Authored by Peter Zijlstra on Nov 04, 2008

Impact: cleanup

Clean up task selection.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

ftrace: fix breakage in bin_fmt results · 072ba498

Authored by Eric Anholt on Oct 26, 2008

In 777e208d we changed from outputting
field->cpu (a char) to iter->cpu (unsigned int), increasing the resulting
structure size by 3 bytes.

Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

03 Nov, 2008 3 commits

tracing, ring-buffer: add paranoid checks for loops · 818e3dd3

Authored by Steven Rostedt on Oct 31, 2008

While writing a new tracer, I had a bug where I caused the ring-buffer
to recurse in a bad way. The bug was with the tracer I was writing
and not the ring-buffer itself. But it took a long time to find the
problem.

This patch adds paranoid checks into the ring-buffer infrastructure
that will catch bugs of this nature.

Note: I put the bug back in the tracer and this patch showed the error
      nicely and prevented the lockup.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

ftrace: use kretprobe trampoline name to test in output · b3aa5577

Authored by Steven Rostedt on Oct 31, 2008

Impact: ia64+tracing build fix

When a function is kprobed, the return address is set to the
kprobe_trampoline, or something similar. This caused the output
of the trace to look confusing when the parent seemed to be this
"kprobe_trampoline" function.

To fix this, Abhishek Sagar added a test of the instruction pointer
of the parent to see if it matched the kprobe_trampoline. If it
did, the output would print a "[unknown/kretprobe'd]" instead.

Unfortunately, not all archs do this the same way, and the trampoline
function may not be exported, which causes failures in builds.

This patch will compare the name instead of the pointer to see
if it matches. This prevents us from depending on a function
being exported, and should work on all archs. The worst that can
happen is that an arch might use a different name and then we
go back to the confusing output. At least the arch will still build.

Reported-by: Abhishek Sagar <sagar.abhishek@gmail.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Tested-by: Abhishek Sagar <sagar.abhishek@gmail.com>
Acked-by: Abhishek Sagar <sagar.abhishek@gmail.com>

tracing, alpha: undefined reference to `save_stack_trace' · c2c80529

Authored by Al Viro on Oct 31, 2008

Impact: build fix on !stacktrace architectures

Only select STACKTRACE on architectures that have STACKTRACE_SUPPORT

... since we also need to ifdef out the guts of ftrace_trace_stack().
We also want to disallow setting TRACE_ITER_STACKTRACE in trace_flags
on such configs, but that can wait.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

02 Nov, 2008 2 commits

PM_TEST_SUSPEND should depend on RTC_CLASS, not RTC_LIB · 28959742

Authored by Al Viro on Nov 01, 2008

Insufficient dependency - we really want CONFIG_RTC_CLASS=y there.
That will give us CONFIG_RTC_LIB=y, so the old dependency can be
simply replaced.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

reserve_region_with_split: Fix GFP_KERNEL usage under spinlock · 42c02023

Authored by Linus Torvalds on Nov 01, 2008

This one apparently doesn't generate any warnings, because the function
is only used during system bootup, when the warnings are disabled.  But
it's still very wrong.

The __reserve_region_with_split() function is called with the
resource_lock held for writing, so it must only ever do GFP_ATOMIC
allocations.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

31 Oct, 2008 7 commits

ftrace: handle archs that do not support irqs_disabled_flags · 9244489a

Authored by Steven Rostedt on Oct 24, 2008

Impact: build fix on non-lockdep architectures

Some architectures do not support a way to read the irq flags that
are set by "local_irq_save(flags)" to determine if interrupts were
disabled or enabled. Ftrace uses this information to show the user
whether the trace occurred with interrupts enabled or disabled.

Besides the fact that those archs that do not support this will fail to
compile, unless they fix it, we do not want to have the trace simply
say interrupts were not disabled or they were enabled, without knowing
the real answer.

This patch adds a 'X' in the output to let the user know that the
architecture they are running on does not support a way for the tracer
to determine if interrupts were enabled or disabled. It also lets those
same archs compile with tracing enabled.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

'kill sig -1' must only apply to caller's namespace · d25141a8

Authored by Sukadev Bhattiprolu on Oct 29, 2008

Currently "kill <sig> -1" kills processes in all namespaces and breaks the
isolation of namespaces.  An earlier attempt to fix this was discussed at:

	http://lkml.org/lkml/2008/7/23/148

As suggested by Oleg Nesterov in that thread, use a "task_pid_vnr() > 1"
check, since task_pid_vnr() returns 0 if the process is outside the caller's
namespace.
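
The effect of the check can be shown with a tiny userspace model. It is a sketch only: `struct model_task` and both functions are hypothetical, and the `vnr` field merely stands in for the value task_pid_vnr() would report from the caller's point of view.

```c
#include <stddef.h>

/* Hypothetical task model: 'vnr' is the pid as seen from the caller's
 * namespace, and 0 means the task is not visible there. */
struct model_task { int vnr; };

/* A task is a target only if it is visible and is not the namespace's init. */
static int model_is_kill_all_target(const struct model_task *t)
{
    return t->vnr > 1;
}

/* Count how many tasks in an array a "kill -1" would signal in this model. */
static size_t model_count_targets(const struct model_task *tasks, size_t n)
{
    size_t i, hits = 0;

    for (i = 0; i < n; i++)
        if (model_is_kill_all_target(&tasks[i]))
            hits++;
    return hits;
}
```

A single comparison covers both exclusions: pid 1 (the namespace's init) and pid 0 (tasks outside the namespace) both fail `vnr > 1`.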

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Tested-by: Daniel Hokka Zakrisson <daniel@hozac.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kernel/profile: fix profile_init() section mismatch · ce05fcc3

Authored by Paul Mundt on Oct 29, 2008

profile_init() calls into alloc_bootmem() on early initialization.  While
alloc_bootmem() is __init, the reference itself is safe in that it is
tucked below a !slab_is_available() check.  So, flag profile_init() as
__ref.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

freezer_cg: simplify freezer_change_state() · 51308ee5

Authored by Li Zefan on Oct 29, 2008

Just call unfreeze_cgroup() if goal_state == THAWED, and call
try_to_freeze_cgroup() if goal_state == FROZEN.

No behavior has been changed.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Cedric Le Goater <clg@fr.ibm.com>
Acked-by: Matt Helsley <matthltc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

freezer_cg: use thaw_process() in unfreeze_cgroup() · 00c2e63c

Authored by Li Zefan on Oct 29, 2008

Don't duplicate the implementation of thaw_process().

[akpm@linux-foundation.org: make __thaw_process() static]

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Cedric Le Goater <clg@fr.ibm.com>
Acked-by: Matt Helsley <matthltc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

freezer_cg: remove redundant check in freezer_can_attach() · 80a6a2cf

Authored by Li Zefan on Oct 29, 2008

It is sufficient to check if @task is frozen, and no need to check if the
original freezer is frozen.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Cedric Le Goater <clg@fr.ibm.com>
Acked-by: Matt Helsley <matthltc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

freezer_cg: fix improper BUG_ON() causing oops · 7ccb9743

Authored by Li Zefan on Oct 29, 2008

The BUG_ON() should be protected by freezer->lock, otherwise it can be
triggered easily when a task has been unfrozen but the corresponding
cgroup hasn't been changed to FROZEN state.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Cedric Le Goater <clg@fr.ibm.com>
Acked-by: Matt Helsley <matthltc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

30 Oct, 2008 2 commits

sched: change sched_debug's mode to 0444 · a9cf4ddb

Authored by Li Zefan on Oct 30, 2008

Impact: change /proc/sched_debug from rw-r--r-- to r--r--r--

/proc/sched_debug is read-only.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

ftrace: fix trace_nop config select · f3384b28

Authored by Steven Rostedt on Oct 29, 2008

Impact: build fix on non-function-tracing architectures

The trace_nop is the tracer that is defined when no tracer is set in
the ftrace infrastructure.

The trace_nop was mistakenly selected by HAVE_FTRACE due to the confusion
between ftrace infrastructure and the ftrace function tracer (which has
been solved by renaming the function tracer).

This patch changes the select to the appropriate TRACING.

This patch should fix compile errors on architectures that do not define
the FUNCTION_TRACER.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

29 Oct, 2008 2 commits

resources: fix x86info results ioremap.c:226 __ioremap_caller+0xf2/0x2d6() WARNINGs · d68612b2

Authored by Suresh Siddha on Oct 28, 2008

Impact: avoid false-positive WARN_ON()

Andi Kleen reported:
> When running x86info on a 2.6.27-git8 system I get
>
> resource map sanity check conflict: 0x9e000 0x9efff 0x10000 0x9e7ff System RAM
> ------------[ cut here ]------------
> WARNING: at /home/lsrc/linux/arch/x86/mm/ioremap.c:226 __ioremap_caller+0xf2/0x2d6()
> ...

Some of the pages below the 1MB ISA addresses will typically be shared by
both the BIOS and system-usable RAM. For example:
	BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
	BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)

x86info reads the low physical address using /dev/mem, which internally
uses ioremap() for accessing non RAM pages. ioremap() of such low
pages conflicts with multiple resource entities leading to the
above warning.

Change iomem_map_sanity_check() to allow mapping a page that spans multiple
resource entities (the minimum granularity that one can map is a page anyhow).
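
A page-granular version of such a check can be sketched in userspace C. This is a minimal model under stated assumptions, not the kernel function: the 4 KiB page size, `struct model_res` and `model_map_conflicts()` are hypothetical, and the test values read the warning above as a mapping of 0x9e000-0x9efff against a resource covering 0x10000-0x9e7ff.

```c
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12   /* assumes 4 KiB pages */
#define MODEL_PFN(x) ((uint64_t)(x) >> MODEL_PAGE_SHIFT)

/* Inclusive resource range, like the start/end pairs printed in the warning. */
struct model_res { uint64_t start, end; };

/* Returns 1 if mapping [addr, addr + size) conflicts with resource r. */
static int model_map_conflicts(const struct model_res *r,
                               uint64_t addr, uint64_t size)
{
    if (r->start >= addr + size || r->end < addr)
        return 0;   /* no overlap at all */
    if (MODEL_PFN(r->start) <= MODEL_PFN(addr) &&
        MODEL_PFN(r->end) >= MODEL_PFN(addr + size - 1))
        return 0;   /* resource covers the mapping at page granularity */
    return 1;       /* mapping really straddles the resource boundary */
}
```

Comparing page frame numbers instead of byte addresses is what lets a resource that ends partway through a page still count as covering a one-page mapping of that page.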

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

ftrace: perform an initialization for ftrace to enable it · 0b6e4d56

Authored by Frederic Weisbecker on Oct 28, 2008

Impact: corrects a bug which made the non-dyn function tracer not functional

With the latest git, the non-dynamic function tracer didn't produce any trace.

The problem was that ftrace_enabled wasn't initialized to 1
because ftrace has no init function when DYNAMIC_FTRACE is disabled.

So when a tracer tried to register an ftrace_ops struct,
__register_ftrace_function failed to set the hook.

This patch corrects it by adding an init function to initialize
ftrace during boot.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

28 Oct, 2008 4 commits

ftrace: fix current_tracer error return · 60063a66

Authored by Steven Rostedt on Oct 28, 2008

The commit (in linux-tip) c2931e05
(ftrace: return an error when setting a nonexistent tracer)
added useful code that would error when a bad tracer was written into
the current_tracer file.

But this had a bug if the amount written was more than the amount read by
that code. The first iteration would set the tracer correctly, but since
it did not consume the rest of what was written (usually whitespace), the
userspace utility would continue to write what was not consumed. This
second iteration would fail to find a tracer and return -EINVAL. Funny
thing is that the tracer would have already been set.

This patch just consumes all the data that is written to the file.
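
The behaviour of a write handler that consumes everything can be sketched in userspace C. This is an illustration under stated assumptions, not the ftrace code: `MODEL_MAX_NAME`, `model_set_tracer_write()` and the whitespace handling are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define MODEL_MAX_NAME 32   /* hypothetical limit on a tracer name */

/*
 * Hypothetical write handler. 'current' must have room for
 * MODEL_MAX_NAME + 1 bytes. At most MODEL_MAX_NAME bytes of the input are
 * used as the name and trailing whitespace is stripped, but the whole
 * write is reported as consumed.
 */
static long model_set_tracer_write(char *current, const char *buf, size_t cnt)
{
    size_t n = cnt < MODEL_MAX_NAME ? cnt : MODEL_MAX_NAME;

    memcpy(current, buf, n);
    while (n > 0 && (current[n - 1] == ' ' || current[n - 1] == '\n'))
        n--;
    current[n] = '\0';

    /* Returning a smaller count would make the caller write the rest again. */
    return (long)cnt;
}
```

If the handler returned only the number of bytes it had looked at, a userspace tool would issue a second write containing the leftover whitespace, which is the situation that produced the spurious -EINVAL described above.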

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

lockdep: fix irqs on/off ip tracing · 6afe40b4

Authored by Heiko Carstens on Oct 28, 2008

Impact: fix lockdep lock-api-caller output when irqsoff tracing is enabled

81d68a96 "ftrace: trace irq disabled critical timings" added wrappers around
trace_hardirqs_on/off_caller. However these functions use
__builtin_return_address(0) to figure out which function actually disabled
or enabled irqs. The result is that we save the ips of trace_hardirqs_on/off
instead of the real caller. Not very helpful.

However, since the patch from Steven, the ip already gets passed. So use that
and get rid of __builtin_return_address(0) in these two functions.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

lockdep: minor fix for debug_show_all_locks() · 46fec7ac

Authored by qinghuang feng on Oct 28, 2008

When we failed to get tasklist_lock eventually (count equals 0),
we should only print " ignoring it.\n", and not print
" locked it.\n" needlessly.

Signed-off-by: Qinghuang Feng <qhfeng.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

tracing: fix a build error on alpha · 21798a84

Authored by Frederic Weisbecker on Oct 28, 2008

Impact: build fix on Alpha

When tracing is enabled, some archs include <linux/irqflags.h>
in their <asm/system.h>, but others like alpha or m68k don't.

Build error on alpha:

kernel/trace/trace.c: In function 'tracing_cpumask_write':
kernel/trace/trace.c:2145: error: implicit declaration of function 'raw_local_irq_disable'
kernel/trace/trace.c:2162: error: implicit declaration of function 'raw_local_irq_enable'

Tested on Alpha through a cross-compiler (should correct a similar issue on m68k).

Reported-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>