1. 26 Feb 2009, 1 commit
    • perfcounters: fix a few minor cleanliness issues · f3dfd265
      Committed by Paul Mackerras
      This fixes three issues noticed by Arnd Bergmann:
      
      - Add #ifdef __KERNEL__ and move some things around in perf_counter.h
        to make sure only the bits that userspace needs are exported to
        userspace.
      
      - Use __u64, __s64, __u32 types in the structs exported to userspace
        rather than u64, s64, u32.
      
      - Make the sys_perf_counter_open syscall available to the SPUs on
        Cell platforms.
      
      And one issue that I noticed in looking at the code again:
      
      - Wrap the perf_counter_open syscall with SYSCALL_DEFINE4 so we get
        the proper handling of int arguments on ppc64 (and some other 64-bit
        architectures).
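      For the last point, a minimal sketch of what such a wrapper looks
      like, assuming the four-argument syscall of that era (the parameter
      names and the helper it calls are illustrative, not taken from the
      patch):

      	/* Hedged sketch: SYSCALL_DEFINE4 generates a wrapper that
      	 * sign-extends the int arguments, which a bare asmlinkage
      	 * declaration does not guarantee on ppc64 and some other
      	 * 64-bit architectures. */
      	SYSCALL_DEFINE4(perf_counter_open,
      			const struct perf_counter_hw_event __user *, hw_event_uptr,
      			pid_t, pid, int, cpu, int, group_fd)
      	{
      		/* illustrative helper holding the original syscall body */
      		return do_perf_counter_open(hw_event_uptr, pid, cpu, group_fd);
      	}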
      Reported-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
  2. 23 Feb 2009, 1 commit
  3. 22 Feb 2009, 5 commits
  4. 19 Feb 2009, 5 commits
  5. 18 Feb 2009, 2 commits
    • block: fix bad definition of BIO_RW_SYNC · 93dbb393
      Committed by Jens Axboe
      We can't OR shift values: these flags are bit numbers, not bit
      masks. So get rid of BIO_RW_SYNC and use BIO_RW_SYNCIO and
      BIO_RW_UNPLUG explicitly. This brings back the behaviour from
      before 213d9417.
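      A minimal userspace illustration of the bug, with made-up flag
      values (the real ones differ): since the flags are bit numbers
      used as shift counts, OR-ing two of them just yields another bit
      number, not a combined mask.

      	#include <stdio.h>

      	enum { BIO_RW_SYNCIO = 4, BIO_RW_UNPLUG = 5 };	/* illustrative values */
      	#define BIO_RW_SYNC (BIO_RW_SYNCIO | BIO_RW_UNPLUG)	/* 4 | 5 == 5: wrong! */

      	int main(void)
      	{
      		/* sets only the UNPLUG bit; SYNCIO is silently lost */
      		printf("1 << BIO_RW_SYNC: 0x%x\n", 1 << BIO_RW_SYNC);
      		/* what was actually intended */
      		printf("intended mask:    0x%x\n",
      		       (1 << BIO_RW_SYNCIO) | (1 << BIO_RW_UNPLUG));
      		return 0;
      	}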
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • tracing/function-graph-tracer: trace the idle tasks · 5b058bcd
      Committed by Frederic Weisbecker
      When the function graph tracer is activated, it iterates over the
      task list to allocate a stack on which to store the return
      addresses.
      
      But the per-CPU idle tasks are not covered by the
      do_each_thread / while_each_thread iteration.
      
      So we have to iterate over them manually, as sketched below.
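      
      A minimal sketch of that manual pass, assuming the ftrace helpers
      of that era (idle_task(), ftrace_graph_init_task() and the
      ret_stack field; treat the exact names as illustrative):
      
      	/* do_each_thread() never visits the per-cpu idle tasks, so
      	 * walk the online cpus and give each idle task its own
      	 * return-address stack explicitly. */
      	int cpu;
      
      	for_each_online_cpu(cpu) {
      		struct task_struct *idle = idle_task(cpu);
      
      		if (!idle->ret_stack)
      			ftrace_graph_init_task(idle);
      	}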
      
      This fixes some weirdness in the traces and many lost traces.
      Examples on two cpus:
      
       0)   Xorg-4287    |   2.906 us    |              }
       0)   Xorg-4287    |   3.965 us    |            }
       0)   Xorg-4287    |   5.302 us    |          }
       ------------------------------------------
       0)   Xorg-4287    =>    <idle>-0
       ------------------------------------------
      
       0)    <idle>-0    |   2.861 us    |                        }
       0)    <idle>-0    |   0.526 us    |                        set_normalized_timespec();
       0)    <idle>-0    |   7.201 us    |                      }
       0)    <idle>-0    |   8.214 us    |                    }
       0)    <idle>-0    |               |                    clockevents_program_event() {
       0)    <idle>-0    |               |                      lapic_next_event() {
       0)    <idle>-0    |   0.510 us    |                        native_apic_mem_write();
       0)    <idle>-0    |   1.546 us    |                      }
       0)    <idle>-0    |   2.583 us    |                    }
       0)    <idle>-0    | + 12.435 us   |                  }
       0)    <idle>-0    | + 13.470 us   |                }
       0)    <idle>-0    |   0.608 us    |                _spin_unlock_irqrestore();
       0)    <idle>-0    | + 23.270 us   |              }
       0)    <idle>-0    | + 24.336 us   |            }
       0)    <idle>-0    | + 25.417 us   |          }
       0)    <idle>-0    |   0.593 us    |          _spin_unlock();
       0)    <idle>-0    | + 41.869 us   |        }
       0)    <idle>-0    | + 42.906 us   |      }
       0)    <idle>-0    | + 95.035 us   |    }
       0)    <idle>-0    |   0.540 us    |    menu_reflect();
       0)    <idle>-0    | ! 100.404 us  |  }
       0)    <idle>-0    |   0.564 us    |  mce_idle_callback();
       0)    <idle>-0    |               |  enter_idle() {
       0)    <idle>-0    |   0.526 us    |    mce_idle_callback();
       0)    <idle>-0    |   1.757 us    |  }
       0)    <idle>-0    |               |  cpuidle_idle_call() {
       0)    <idle>-0    |               |    menu_select() {
       0)    <idle>-0    |   0.525 us    |      pm_qos_requirement();
       0)    <idle>-0    |   0.518 us    |      tick_nohz_get_sleep_length();
       0)    <idle>-0    |   2.621 us    |    }
      [...]
       1)    <idle>-0    |   0.518 us    |              touch_softlockup_watchdog();
       1)    <idle>-0    | + 14.355 us   |            }
       1)    <idle>-0    | + 22.840 us   |          }
       1)    <idle>-0    | + 25.949 us   |        }
       1)    <idle>-0    |               |        handle_irq() {
       1)    <idle>-0    |   0.511 us    |          irq_to_desc();
       1)    <idle>-0    |               |          handle_edge_irq() {
       1)    <idle>-0    |   0.638 us    |            _spin_lock();
       1)    <idle>-0    |               |            ack_apic_edge() {
       1)    <idle>-0    |   0.510 us    |              irq_to_desc();
       1)    <idle>-0    |               |              move_native_irq() {
       1)    <idle>-0    |   0.510 us    |                irq_to_desc();
       1)    <idle>-0    |   1.532 us    |              }
       1)    <idle>-0    |   0.511 us    |              native_apic_mem_write();
       ------------------------------------------
       1)    <idle>-0    =>    cat-5073
       ------------------------------------------
      
       1)    cat-5073    |   3.731 us    |                    }
       1)    cat-5073    |               |                    run_local_timers() {
       1)    cat-5073    |   0.533 us    |                      hrtimer_run_queues();
       1)    cat-5073    |               |                      raise_softirq() {
       1)    cat-5073    |               |                        __raise_softirq_irqoff() {
       1)    cat-5073    |               |                          /* nr: 1 */
       1)    cat-5073    |   2.718 us    |                        }
       1)    cat-5073    |   3.814 us    |                      }
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  6. 16 Feb 2009, 2 commits
  7. 14 Feb 2009, 1 commit
  8. 13 Feb 2009, 2 commits
    • timers: more consistently use clock vs timer · 3997ad31
      Committed by Peter Zijlstra
      While reviewing the manpages, I noticed I'd missed some clock vs
      timer sites.
      
      Make sure that all timer functions call cpu_timer_sample_group()
      and not cpu_clock_sample_group(). This ensures that we enable the
      process-wide timer in time, and therefore pay the O(n)
      thread-group cost from the syscall itself.
      
      Not doing it there would leave that work to the first jiffy tick
      after the timer is set, resulting in a very expensive tick (but
      only once) and a delay in actually starting the timer.
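      
      As a hedged illustration of the clock-vs-timer distinction (the
      helper names follow the text above; the surrounding function and
      exact signatures are illustrative):
      
      	static void timer_path_sample(struct k_itimer *timer,
      				      struct task_struct *p)
      	{
      		union cpu_time_count now;
      
      		/* timer paths: samples AND enables the process-wide
      		 * timer accounting as a side effect */
      		cpu_timer_sample_group(timer->it_clock, p, &now);
      
      		/* clock paths keep using cpu_clock_sample_group(),
      		 * which samples without enabling anything */
      	}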
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perfcounters: make context switch and migration software counters work again · c07c99b6
      Committed by Paul Mackerras
      Jaswinder Singh Rajput reported that commit 23a185ca caused the
      context switch and migration software counters to always report zero.
      With that commit, the software counters only count events that occur
      between sched-in and sched-out for a task.  This is necessary for the
      counter enable/disable prctls and ioctls to work.  However, the
      context switch and migration counts are incremented after sched-out
      for one task and before sched-in for the next.  Since the increment
      doesn't occur while a task is scheduled in (as far as the software
      counters are concerned) it doesn't count towards any counter.
      
      Thus the context switch and migration counters need to count events
      that occur at any time, provided the counter is enabled, not just
      those that occur while the task is scheduled in (from the perf_counter
      subsystem's point of view).  The problem though is that the software
      counter code can't tell the difference between being enabled and being
      scheduled in, and between being disabled and being scheduled out,
      since we use the one pair of enable/disable entry points for both.
      That is, the high-level disable operation simply arranges for the
      counter to not be scheduled in any more, and the high-level enable
      operation arranges for it to be scheduled in again.
      
      One way to solve this would be to have sched_in/out operations in the
      hw_perf_counter_ops struct as well as enable/disable.  However, this
      takes a simpler approach: it adds a 'prev_state' field to the
      perf_counter struct that allows a counter's enable method to know
      whether the counter was previously disabled or just inactive
      (scheduled out), and therefore whether the enable method is being
      called as a result of a high-level enable or a schedule-in operation.
      
      This then allows the context switch, migration and page fault counters
      to reset their hw.prev_count value in their enable functions only if
      they are called as a result of a high-level enable operation.
      Although page faults would normally only occur while the counter is
      scheduled in, this changes the page fault counter code too in case
      there are ever circumstances where page faults get counted against a
      task while its counters are not scheduled in.
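      
      A hedged sketch of that enable-path check, using the names from
      the text above (the helper that reads the current count is
      illustrative):
      
      	static void sw_counter_enable(struct perf_counter *counter)
      	{
      		/* reset the baseline only on a real high-level enable
      		 * (previously OFF); on a plain sched-in (previously
      		 * INACTIVE) keep counting from the old baseline */
      		if (counter->prev_state <= PERF_COUNTER_STATE_OFF)
      			atomic64_set(&counter->hw.prev_count,
      				     sw_counter_current_value(counter));
      	}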
      Reported-by: Jaswinder Singh Rajput <jaswinder@kernel.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  9. 12 Feb 2009, 4 commits
  10. 11 Feb 2009, 7 commits
    • sched: revert recent sync wakeup changes · fc631c82
      Committed by Peter Zijlstra
      Intel reported a 10% regression (mysql+sysbench) on a 16-way machine
      with these patches:
      
        1596e297: sched: symmetric sync vs avg_overlap
        d942fb6c: sched: fix sync wakeups
      
      Revert them.
      Reported-by: N"Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
      Bisected-by: NLin Ming <ming.m.lin@intel.com>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
    • perfcounters: fix refcounting bug, take 2 · 4bcf349a
      Committed by Paul Mackerras
      Only free child_counter if it has a parent; if it doesn't, then it
      has a file pointing to it and we'll free it in perf_release.
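      
      In sketch form, assuming the teardown code of that era (the free
      helper's name is illustrative):
      
      	if (child_counter->parent) {
      		/* inherited counter: no file references it, free now */
      		free_counter(child_counter);
      	} else {
      		/* a file still points at it; perf_release() frees it
      		 * when the file's last reference goes away */
      	}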
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • timers: fix TIMER_ABSTIME for process wide cpu timers · 4da94d49
      Committed by Peter Zijlstra
      The POSIX timer interface allows for absolute expiry times through
      the TIMER_ABSTIME flag; therefore we have to synchronize the timer
      to the clock every time we start it.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • timers: split process wide cpu clocks/timers, fix · 3fccfd67
      Committed by Peter Zijlstra
      To decrease the chance of a missed enable, always enable the timer
      when we sample it; we'll always disable it when we find that there
      are no active timers in the jiffy tick.
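      
      A hedged sketch of that sampling rule, close to the
      thread_group_cputimer shape of that era (field and helper names
      are illustrative):
      
      	void thread_group_cputimer_sample(struct task_struct *tsk,
      					  struct task_cputime *times)
      	{
      		struct thread_group_cputimer *cputimer = &tsk->signal->cputimer;
      		unsigned long flags;
      
      		spin_lock_irqsave(&cputimer->lock, flags);
      		if (!cputimer->running) {
      			/* enable on every sample ... */
      			cputimer->running = 1;
      			thread_group_cputime(tsk, &cputimer->cputime);
      		}
      		*times = cputimer->cputime;
      		spin_unlock_irqrestore(&cputimer->lock, flags);
      		/* ... and only the jiffy tick clears ->running again,
      		 * once it finds no active timers left */
      	}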
      
      This fixes a flood of warnings reported by Mike Galbraith.
      Reported-by: Mike Galbraith <efault@gmx.de>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perfcounters: fix use after free in perf_release() · 5af75917
      Committed by Mike Galbraith
      Running...
      
        while true; do
          foo -d 1 -f 1 -c 100000 & sleep 1
          kerneltop -d 1 -f 1 -e 1 -c 25000 -p `pidof foo`
        done
      
        while true; do
          killall foo; killall kerneltop; sleep 2
        done
      
      ...in two shells with SLUB_DEBUG enabled produces a flood of:
      BUG task_struct: Poison overwritten.
      
      Fix the use-after-free bug in perf_release().
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ptrace, x86: fix the usage of ptrace_fork() · 06eb23b1
      Committed by Oleg Nesterov
      I noticed by pure accident that we have ptrace_fork() and friends.
      These were added by "x86, bts: add fork and exit handling", commit
      bf53de90.
      
      I can't test this (ds_request_bts() returns -EOPNOTSUPP here), but
      I strongly believe this needs the fix. I think something like this
      program
      
      	/* headers needed to build the reproducer; struct
      	 * ptrace_bts_config and the PTRACE_BTS_* requests were
      	 * x86-specific, from <asm/ptrace-abi.h> in kernels of that
      	 * era (a reasonable guess, not part of the original) */
      	#include <sys/ptrace.h>
      	#include <sys/wait.h>
      	#include <signal.h>
      	#include <unistd.h>
      	#include <asm/ptrace-abi.h>
      
      	int main(void)
      	{
      		int pid = fork();
      
      		if (!pid) {
      			ptrace(PTRACE_TRACEME, 0, NULL, NULL);
      			kill(getpid(), SIGSTOP);
      			fork();
      		} else {
      			struct ptrace_bts_config bts = {
      				.flags = PTRACE_BTS_O_ALLOC,
      				.size  = 4 * 4096,
      			};
      
      			wait(NULL);
      
      			ptrace(PTRACE_SETOPTIONS, pid, NULL, PTRACE_O_TRACEFORK);
      			ptrace(PTRACE_BTS_CONFIG, pid, &bts, sizeof(bts));
      			ptrace(PTRACE_CONT, pid, NULL, NULL);
      
      			sleep(1);
      		}
      
      		return 0;
      	}
      
      should crash the kernel.
      
      If the task is traced by its natural parent, ptrace_reparented()
      returns 0, but we should clear ->btsxxx anyway.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Markus Metzger <markus.t.metzger@intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counters: allow users to count user, kernel and/or hypervisor events · 0475f9ea
      Committed by Paul Mackerras
      Impact: new perf_counter feature
      
      This extends the perf_counter_hw_event struct with bits that specify
      that events in user, kernel and/or hypervisor mode should not be
      counted (i.e. should be excluded), and adds code to program the PMU
      mode selection bits accordingly on x86 and powerpc.
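      
      The new bits look roughly like this (a sketch: exclude_hv is named
      in the text below, exclude_user and exclude_kernel are inferred
      from the exclude_* naming, and the surrounding struct layout is
      illustrative):
      
      	struct perf_counter_hw_event {
      		__u64	type;			/* event selector */
      		__u64	exclude_user   : 1,	/* don't count user events */
      			exclude_kernel : 1,	/* don't count kernel events */
      			exclude_hv     : 1;	/* don't count hypervisor events */
      	};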
      
      For software counters, we don't currently have the infrastructure to
      distinguish which mode an event occurs in, so we currently fail the
      counter initialization if the setting of the hw_event.exclude_* bits
      would require us to distinguish.  Context switches and CPU migrations
      are currently considered to occur in kernel mode.
      
      On x86, this changes the previous policy that only root can count
      kernel events.  Now non-root users can count kernel events or exclude
      them.  Non-root users still can't use NMI events, though.  On x86 we
      don't appear to have any way to control whether hypervisor events are
      counted or not, so hw_event.exclude_hv is ignored.
      
      On powerpc, the selection of whether to count events in user, kernel
      and/or hypervisor mode is PMU-wide, not per-counter, so this adds a
      check that the hw_event.exclude_* settings are the same as other events
      on the PMU.  Counters being added to a group have to have the same
      settings as the other hardware counters in the group.  Counters and
      groups can only be enabled in hw_perf_group_sched_in or power_perf_enable
      if they have the same settings as any other counters already on the
      PMU.  If we are not running on a hypervisor, the exclude_hv setting
      is ignored (by forcing it to 0) since we can't ever get any
      hypervisor events.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
  11. 10 Feb 2009, 3 commits
    • profiling: fix broken profiling regression · acd89579
      Committed by Hugh Dickins
      Impact: fix broken /proc/profile on UP machines
      
      Commit c309b917 "cpumask: convert
      kernel/profile.c" broke profiling.  prof_cpu_mask was previously
      initialized to CPU_MASK_ALL, but left uninitialized in that commit.
      We need to copy cpu_possible_mask (cpu_online_mask is not enough).
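      
      In sketch form, the default the fix restores (assuming the
      cpumask_var_t API introduced by that conversion; the exact
      placement in the init path is illustrative):
      
      	if (!alloc_cpumask_var(&prof_cpu_mask, GFP_KERNEL))
      		return -ENOMEM;
      	/* restore the old CPU_MASK_ALL default; cpu_online_mask would
      	 * miss CPUs that come online later */
      	cpumask_copy(prof_cpu_mask, cpu_possible_mask);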
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • stackprotector: update make rules · 5d707e9c
      Committed by Tejun Heo
      Impact: no default -fno-stack-protector if stackp is enabled, cleanup
      
      Stackprotector make rules had the following problems.
      
      * The cc support test and the warning are scattered across the
        makefile and kernel/panic.c.
      
      * -fno-stack-protector was always added regardless of configuration.
      
      Update the rules so that the cc support test and the warning are
      contained in the makefile, and -fno-stack-protector is added if and
      only if stackp is turned off. While at it, prepare for 32-bit
      support.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • elf: add ELF_CORE_COPY_KERNEL_REGS() · 6cd61c0b
      Committed by Tejun Heo
      ELF core dump is used for both userland core dumps and kernel
      crash dumps. Depending on the architecture, registers might need
      to be accessed differently for userland and the kernel. Allow
      architectures to define ELF_CORE_COPY_KERNEL_REGS() and use a
      different operation for the kernel register dump.
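      
      A hedged sketch of the shape such a hook usually takes (an inline
      wrapper choosing between the kernel and userland copy; the
      wrapper's name is illustrative):
      
      	static inline void elf_core_copy_kernel_regs(elf_gregset_t *elfregs,
      						     struct pt_regs *regs)
      	{
      	#ifdef ELF_CORE_COPY_KERNEL_REGS
      		/* arch-provided way to capture kernel-mode registers */
      		ELF_CORE_COPY_KERNEL_REGS((*elfregs), regs);
      	#else
      		/* fall back to the userland register copy */
      		elf_core_copy_regs(elfregs, regs);
      	#endif
      	}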
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  12. 09 Feb 2009, 6 commits
  13. 07 Feb 2009, 1 commit