Commits · 18240904960a39e582ced8ba8ececb10b8c22dd3 · OpenHarmony / kernel_linux

15 Sep, 2009 8 commits

Authored by Anirban Sinha on Sep 14, 2009

console_print() is an old legacy interface mostly unused in the entire
kernel tree. It's best to clean up its existing use and let developers
use their own implementation of it as they feel fit.
Signed-off-by: Anirban Sinha <asinha@zeugmasystems.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

353f6dd2

CRED: Allow put_cred() to cope with a NULL groups list · 4a5d6ba1

Authored by David Howells on Sep 14, 2009

put_cred() will oops if given a NULL groups list, but that is now possible with
the existence of cred_alloc_blank(), as used in keyctl_session_to_parent().

Added in commit:

	commit ee18d64c
	Author: David Howells <dhowells@redhat.com>
	Date:   Wed Sep 2 09:14:21 2009 +0100
	KEYS: Add a keyctl to install a process's session keyring on its parent [try #6]
Reported-by: Marc Dionne <marc.c.dionne@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
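
The NULL-tolerant release described in this commit can be illustrated with a minimal user-space sketch. The types `group_list` and `credentials` and the function `credentials_release()` are hypothetical stand-ins, not the kernel's `struct cred` API; the sketch only shows the pattern of skipping the release when no groups list is attached.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel's group list and credential types. */
struct group_list {
	int refcount;
};

struct credentials {
	struct group_list *groups;	/* may be NULL for a blank credential */
};

/* Drop one reference; free the list when the count reaches zero. */
static void group_list_put(struct group_list *gl)
{
	if (--gl->refcount == 0)
		free(gl);
}

/*
 * Release a credential's groups list, tolerating a NULL list.
 * Returns 1 if a list was released and 0 if there was none.
 */
static int credentials_release(struct credentials *c)
{
	if (c->groups == NULL)
		return 0;	/* blank credential: nothing to put */

	group_list_put(c->groups);
	c->groups = NULL;
	return 1;
}
```

Returning a status is only for illustration, so that a caller can observe which path was taken.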

4a5d6ba1

PM: Trivial fixes · 8de03073

Authored by Wu Fengguang on Jul 22, 2009

Fix the definition of BM_BITS_PER_BLOCK and kerneldoc
description of create_bm_block_list().

[rjw: Added changelog.]
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>

8de03073

PM / Hibernate / Memory hotplug: Always use for_each_populated_zone() · 98e73dc5

Authored by Gerald Schaefer on Jul 22, 2009

Use for_each_populated_zone() instead of for_each_zone() in hibernation
code. This fixes a bug on s390, where we allow both config options
HIBERNATION and MEMORY_HOTPLUG, so that we also have a ZONE_MOVABLE
here. We only allow hibernation if no memory hotplug operation was
performed, so in fact both features can only be used exclusively, but
this way we don't need 2 differently configured (distribution) kernels.

If we have an unpopulated ZONE_MOVABLE, we allow hibernation but run
into a BUG_ON() in memory_bm_test/set/clear_bit() because hibernation
code iterates through all zones, not only the populated zones, in
several places. For example, swsusp_free() does for_each_zone() and
then checks for pfn_valid(), which is true even if the zone is not
populated, resulting in a BUG_ON() later because the pfn cannot be
found in the memory bitmap.

Replacing all occurrences of for_each_zone() in hibernation code with
for_each_populated_zone() would fix this issue.

[rjw: Rebased on top of linux-next hibernation patches.]
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
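
The difference between visiting every zone and visiting only populated zones can be sketched in user space as follows. `zone_sketch` and both helpers are hypothetical names; the sketch assumes that a zone counts as populated when it has a non-zero number of present pages.

```c
#include <stddef.h>

/* Hypothetical, simplified zone descriptor. */
struct zone_sketch {
	const char *name;
	unsigned long present_pages;	/* 0 for an unpopulated zone */
};

/* Assumption: "populated" means the zone has at least one present page. */
static int zone_is_populated(const struct zone_sketch *z)
{
	return z->present_pages != 0;
}

/* Count the zones a populated-only iteration would visit. */
static size_t count_populated_zones(const struct zone_sketch *zones, size_t n)
{
	size_t i, visited = 0;

	for (i = 0; i < n; i++) {
		if (!zone_is_populated(&zones[i]))
			continue;	/* skip, e.g., an empty ZONE_MOVABLE */
		visited++;
	}
	return visited;
}
```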

98e73dc5

PM/Hibernate: Do not try to allocate too much memory too hard (rev. 2) · ef4aede3

Authored by Rafael J. Wysocki on Jul 08, 2009

We want to avoid attempting to free too much memory too hard during
hibernation, so estimate the minimum size of the image to use as the
lower limit for preallocating memory.

The approach here is based on the (experimental) observation that we
can't free more page frames than the sum of:

* global_page_state(NR_SLAB_RECLAIMABLE)
* global_page_state(NR_ACTIVE_ANON)
* global_page_state(NR_INACTIVE_ANON)
* global_page_state(NR_ACTIVE_FILE)
* global_page_state(NR_INACTIVE_FILE)

minus

* global_page_state(NR_FILE_MAPPED)

Namely, if this number is subtracted from the number of saveable
pages in the system, we get a good estimate of the minimum reasonable
size of a hibernation image.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Wu Fengguang <fengguang.wu@intel.com>
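
The estimate described in this commit message can be written out as a small arithmetic sketch. The struct `page_counts` and both functions are hypothetical; the fields simply mirror the counters listed above, and clamping the results at zero is an assumption made here to keep the unsigned arithmetic safe, not something the message states.

```c
/* Hypothetical snapshot of the page counters named in the commit message. */
struct page_counts {
	unsigned long slab_reclaimable;
	unsigned long active_anon;
	unsigned long inactive_anon;
	unsigned long active_file;
	unsigned long inactive_file;
	unsigned long file_mapped;
};

/* Upper bound on freeable page frames: the sum of the five counters
 * minus the mapped file pages (clamped at zero, an assumption). */
static unsigned long max_freeable_pages(const struct page_counts *pc)
{
	unsigned long sum = pc->slab_reclaimable
			  + pc->active_anon + pc->inactive_anon
			  + pc->active_file + pc->inactive_file;

	return sum > pc->file_mapped ? sum - pc->file_mapped : 0;
}

/* Minimum reasonable image size: saveable pages minus what can be freed. */
static unsigned long minimum_image_pages(unsigned long saveable,
					 const struct page_counts *pc)
{
	unsigned long freeable = max_freeable_pages(pc);

	return saveable > freeable ? saveable - freeable : 0;
}
```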

ef4aede3

PM/Hibernate: Do not release preallocated memory unnecessarily (rev. 2) · 64a473cb

Authored by Rafael J. Wysocki on Jul 08, 2009

Since the hibernation code is now going to use allocations of memory
to make enough room for the image, it can also use the page frames
allocated at this stage as image page frames.  The low-level
hibernation code needs to be rearranged for this purpose, but it
allows us to avoid freeing a great number of pages and allocating
these same pages once again later, so it generally is worth doing.

[rev. 2: Take highmem into account correctly.]
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>

64a473cb

PM/Hibernate: Rework shrinking of memory · 4bb33435

Authored by Rafael J. Wysocki on Jul 08, 2009

Rework swsusp_shrink_memory() so that it calls shrink_all_memory()
just once to make some room for the image and then allocates memory
to apply more pressure to the memory management subsystem, if
necessary.

Unfortunately, we don't seem to be able to drop shrink_all_memory()
entirely just yet, because that would lead to huge performance
regressions in some test cases.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>

4bb33435

PM: Fix typo in label name s/Platofrm_finish/Platform_finish/ · e681c9dd

Authored by Thadeu Lima de Souza Cascardo on Jul 08, 2009

Although the same label name is used somewhere else in the file, this
particular label was consistently typoed in all of its uses.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>

e681c9dd

11 Sep, 2009 4 commits

block: add blk-iopoll, a NAPI like approach for block devices · 5e605b64

Authored by Jens Axboe on Aug 05, 2009

This borrows some code from NAPI and implements a polled completion
mode for block devices. The idea is the same as NAPI - instead of
doing the command completion when the irq occurs, schedule a dedicated
softirq in the hopes that we will complete more IO when the iopoll
handler is invoked. Devices have a budget of commands assigned, and will
stay in polled mode as long as they continue to consume their budget
from the iopoll softirq handler. If they do not, the device is set back
to interrupt completion mode.

This patch holds the core bits for blk-iopoll, device driver support
sold separately.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
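
The budget rule described in this commit message can be modelled with a short user-space sketch. `polled_dev` and `poll_once()` are hypothetical names and do not reflect the blk-iopoll API; the sketch only shows a device staying in polled mode while it consumes its full budget and dropping back to interrupt mode otherwise.

```c
#include <stdbool.h>

/* Hypothetical model of a device with completions waiting to be reaped. */
struct polled_dev {
	int pending;		/* completed commands not yet processed */
	bool polled_mode;	/* true while serviced from the poll handler */
};

/*
 * One invocation of the poll handler: reap at most `budget` completions.
 * If the device did not use its whole budget, switch it back to
 * interrupt-driven completion. Returns the number of completions reaped.
 */
static int poll_once(struct polled_dev *dev, int budget)
{
	int done = dev->pending < budget ? dev->pending : budget;

	dev->pending -= done;
	if (done < budget)
		dev->polled_mode = false;	/* budget not consumed */
	return done;
}
```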

5e605b64

writeback: add name to backing_dev_info · d993831f

Authored by Jens Axboe on Jun 12, 2009

This enables us to track who does what and print info. Its main use
is catching dirty inodes on the default_backing_dev_info, so we can
fix that up.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

d993831f

sched: Fix sched::sched_stat_wait tracepoint field · e1f84508

Authored by Ingo Molnar on Sep 10, 2009

This weird perf trace output:

  cc1-9943  [001]  2802.059479616: sched_stat_wait: task: as:9944 wait: 2801938766276 [ns]

Is caused by setting one component field of the delta to zero
a bit too early. Move it to later.

( Note, this does not affect the NEW_FAIR_SLEEPERS interactivity bug,
  it's just a reporting bug in essence. )
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Nikos Chantziaras <realnc@arcor.de>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <4AA93D34.8040500@arcor.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

e1f84508

sched: Disable NEW_FAIR_SLEEPERS for now · 3f2aa307

Authored by Ingo Molnar on Sep 10, 2009

Nikos Chantziaras and Jens Axboe reported that turning off
NEW_FAIR_SLEEPERS improves desktop interactivity visibly.

Nikos described his experiences the following way:

  " With this setting, I can do "nice -n 19 make -j20" and
    still have a very smooth desktop and watch a movie at
    the same time.  Various other annoyances (like the
    "logout/shutdown/restart" dialog of KDE not appearing
    at all until the background fade-out effect has finished)
    are also gone.  So this seems to be the single most
    important setting that vastly improves desktop behavior,
    at least here. "

Jens described it the following way, referring to a 10-seconds
xmodmap scheduling delay he was trying to debug:

  " Then I tried switching NO_NEW_FAIR_SLEEPERS on, and then
    I get:

    Performance counter stats for 'xmodmap .xmodmap-carl':

         9.009137  task-clock-msecs         #      0.447 CPUs
               18  context-switches         #      0.002 M/sec
                1  CPU-migrations           #      0.000 M/sec
              315  page-faults              #      0.035 M/sec

    0.020167093  seconds time elapsed

    Woot! "

So disable it for now. In perf trace output I can see weird
delta timestamps:

  cc1-9943  [001]  2802.059479616: sched_stat_wait: task: as:9944 wait: 2801938766276 [ns]

That nsec field is not supposed to be that large. More digging
is needed, but let's turn it off while the real bug is found.
Reported-by: Nikos Chantziaras <realnc@arcor.de>
Tested-by: Nikos Chantziaras <realnc@arcor.de>
Reported-by: Jens Axboe <jens.axboe@oracle.com>
Tested-by: Jens Axboe <jens.axboe@oracle.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <4AA93D34.8040500@arcor.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

3f2aa307

09 Sep, 2009 3 commits

sched: Keep kthreads at default priority · 61cbe54d

Authored by Mike Galbraith on Sep 09, 2009

Remove the kthread/workqueue priority boost, which increases worst-case
desktop latencies.
Signed-off-by: Mike Galbraith <efault@gmx.de>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1252486344.28645.18.camel@marge.simson.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

61cbe54d

sched: Re-tune the scheduler latency defaults to decrease worst-case latencies · 172e082a

Authored by Mike Galbraith on Sep 09, 2009

Reduce the latency target from 20 msecs to 5 msecs.

Why? Larger latencies increase spread, which is good for scaling,
but bad for worst case latency.

We still have the ilog(nr_cpus) rule to scale up on bigger
server boxes.
Signed-off-by: Mike Galbraith <efault@gmx.de>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1252486344.28645.18.camel@marge.simson.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
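
The effect of a logarithmic scale-up on the new 5 msec target can be illustrated with a small sketch. It assumes the rule multiplies the base value by 1 + ilog2(nr_cpus); that factor, and the names `ilog2_uint()` and `scaled_latency_ns()`, are assumptions made for illustration and are not taken from the commit message.

```c
/* Integer base-2 logarithm, rounded down; returns 0 for inputs 0 and 1. */
static unsigned int ilog2_uint(unsigned int n)
{
	unsigned int log = 0;

	while (n >>= 1)
		log++;
	return log;
}

/*
 * Scale a base latency target by an assumed factor of 1 + ilog2(nr_cpus),
 * so larger machines get a proportionally larger target.
 */
static unsigned long scaled_latency_ns(unsigned long base_ns,
				       unsigned int nr_cpus)
{
	return base_ns * (1 + ilog2_uint(nr_cpus));
}
```

Under this assumption, a 5 msec base stays at 5 msec on one CPU, becomes 15 msec on 4 CPUs, and 25 msec on 16 CPUs.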

172e082a

sched: Turn off child_runs_first · 2bba22c5

Authored by Mike Galbraith on Sep 09, 2009

Set child_runs_first default to off.

It hurts 'optimal' make -j<NR_CPUS> workloads as make jobs
get preempted by child tasks, reducing parallelism.

Note, this patch might make existing races in user
applications more prominent than before - so breakages
might be bisected to this commit.

Child-runs-first is broken on SMP to begin with, and we
already had it off briefly in v2.6.23 so most of the
offenders ought to be fixed. Would be nice not to revert
this commit but fix those apps finally ...
Signed-off-by: Mike Galbraith <efault@gmx.de>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1252486344.28645.18.camel@marge.simson.net>
[ made the sysctl independent of CONFIG_SCHED_DEBUG, in case
  people want to work around broken apps. ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>

2bba22c5

08 Sep, 2009 3 commits

sched: Ensure that a child can't gain time over its parent after fork() · b5d9d734

Authored by Mike Galbraith on Sep 08, 2009

A fork/exec load is usually "pass the baton", so the child
should never be placed behind the parent.  With START_DEBIT we
make room for the new task, but with child_runs_first, that
room comes out of the _parent's_ hide. There's nothing to say
that the parent wasn't ahead of min_vruntime at fork() time,
which means that the "baton carrier", who is essentially the
parent in drag, can gain time and increase scheduling latencies
for waiters.

With NEW_FAIR_SLEEPERS + START_DEBIT + child_runs_first
enabled, we essentially pass the sleeper fairness off to the
child, which is fine, but if we don't base placement on the
parent's updated vruntime, we can end up compounding latency
woes if the child itself then does fork/exec.  The debit
incurred at fork doesn't hurt the parent who is then going to
sleep and maybe exit, but the child who acquires the error
harms all comers.

This improves latencies of make -j<n> kernel build workloads.
Reported-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

b5d9d734

sched: Deal with low-load in wake_affine() · 71a29aa7

Authored by Peter Zijlstra on Sep 07, 2009

wake_affine() would always fail under low-load situations where
both prev and this were idle, because adding a single task will
always be a significant imbalance, even if there's nothing
around that could balance it.

Deal with this by allowing imbalance when there's nothing you
can do about it.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

71a29aa7

sched: Remove short cut from select_task_rq_fair() · cdd2ab3d

Authored by Peter Zijlstra on Sep 07, 2009

select_task_rq_fair() incorrectly skips the wake_affine()
logic; remove this short cut.

When prev_cpu == this_cpu, the code jumps straight to the
wake_idle() logic, which doesn't give the wake_affine() logic
the chance to pin the task to this cpu.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

cdd2ab3d

07 Sep, 2009 1 commit

Security/SELinux: includecheck fix kernel/sysctl.c · b6d9c256

Authored by Jaswinder Singh Rajput on Sep 04, 2009

Fix the following 'make includecheck' warning:

  kernel/sysctl.c: linux/security.h is included more than once.
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: James Morris <jmorris@namei.org>

b6d9c256

05 Sep, 2009 10 commits

ring-buffer: only enable ring_buffer_swap_cpu when needed · 85bac32c

Authored by Steven Rostedt on Sep 04, 2009

Since the ability to swap the cpu buffers adds a small overhead to
the recording of a trace, we only want to add it when needed.

Only the irqsoff and preemptoff tracers use this feature, and both are
not recommended for production kernels. This patch disables its use
when neither irqsoff nor preemptoff is configured.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

85bac32c

ring-buffer: check for swapped buffers in start of committing · 62f0b3eb

Authored by Steven Rostedt on Sep 04, 2009

Because the irqsoff tracer can swap an internal CPU buffer, it is possible
that a swap happens between the start of the write and before the committing
bit is set (the committing bit will disable swapping).

This patch adds a check for this and will fail the write if it detects it.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

62f0b3eb

tracing: report error in trace if we fail to swap latency buffer · e8165dbb

Authored by Steven Rostedt on Sep 03, 2009

The irqsoff tracer will fail to swap the cpu buffer with the max
buffer if it preempts a commit. Instead of ignoring this, this patch
makes the tracer report it if the last max latency failed due to preempting
a current commit.

The output of the latency tracer will look like this:

 # tracer: irqsoff
 #
 # irqsoff latency trace v1.1.5 on 2.6.31-rc5
 # --------------------------------------------------------------------
 # latency: 112 us, #1/1, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
 #    -----------------
 #    | task: -4281 (uid:0 nice:0 policy:0 rt_prio:0)
 #    -----------------
 #  => started at: save_args
 #  => ended at:   __do_softirq
 #
 #
 #                  _------=> CPU#
 #                 / _-----=> irqs-off
 #                | / _----=> need-resched
 #                || / _---=> hardirq/softirq
 #                ||| / _--=> preempt-depth
 #                |||| /
 #                |||||     delay
 #  cmd     pid   ||||| time  |   caller
 #     \   /      |||||   \   |   /
    bash-4281    1d.s6  265us : update_max_tr_single: Failed to swap buffers due to commit in progress

Note the latency time and the functions that disabled the irqs or preemption
will still be listed.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

e8165dbb

tracing: add trace_array_printk for internal tracers to use · 659372d3

Authored by Steven Rostedt on Sep 03, 2009

This patch adds a trace_array_printk to allow a tracer to use the
trace_printk on its own trace array.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

659372d3

tracing: pass around ring buffer instead of tracer · e77405ad

Authored by Steven Rostedt on Sep 02, 2009

The latency tracers (irqsoff and wakeup) can swap trace buffers
on the fly. If an event is happening and has reserved data on one of
the buffers, and the latency tracer swaps the global buffer with the
max buffer, the result is that the event may commit the data to the
wrong buffer.

This patch changes the trace recording API to receive the
buffer that was used to reserve a commit. This buffer can then be passed
in to the commit.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

e77405ad

tracing: make tracing_reset safe for external use · f633903a

Authored by Steven Rostedt on Sep 04, 2009

Resetting the trace buffer without first disabling the buffer and
waiting for any writers to complete can corrupt the ring buffer.

This patch makes the external version of tracing_reset safe from
corruption by disabling the ring buffer and calling synchronize_sched.

This version can no longer be called from interrupt context. But all those
callers have been removed.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

f633903a

tracing: use timestamp to determine start of latency traces · 2f26ebd5

Authored by Steven Rostedt on Sep 01, 2009

Currently the latency tracers reset the ring buffer. Unfortunately
if a commit is in process (due to a trace event), this can corrupt
the ring buffer. When this happens, the ring buffer will detect
the corruption and then permanently disable the ring buffer.

The bug does not crash the system, but it does prevent further tracing
after the bug is hit.

Instead of resetting the trace buffers, the timestamp of the start of
the trace is now used. The buffers will still contain the previous
data, but the output will not count any data that is before the
timestamp of the trace.

Note, this only affects the static trace output (trace) and not the
runtime trace output (trace_pipe). The runtime trace output does not
make sense for the latency tracers anyway.
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
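
The timestamp-based approach from this commit message can be sketched as a simple filter. `entry_sketch` and `count_since()` are hypothetical names; the sketch only shows old data staying in the buffer while entries older than the trace's start timestamp are left out of the count.

```c
#include <stddef.h>

/* Hypothetical trace entry carrying only a timestamp. */
struct entry_sketch {
	unsigned long long ts;
};

/*
 * Count the entries that belong to the current trace, i.e. those whose
 * timestamp is not before start_ts. Older entries remain in the buffer
 * but are ignored rather than erased.
 */
static size_t count_since(const struct entry_sketch *entries, size_t n,
			  unsigned long long start_ts)
{
	size_t i, visible = 0;

	for (i = 0; i < n; i++)
		if (entries[i].ts >= start_ts)
			visible++;
	return visible;
}
```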

2f26ebd5

tracing/filters: Defer pred allocation, fix memory leak · c58b4321

Authored by Li Zefan on Sep 01, 2009

The predicates of an event and their filter structure are allocated
when we create an event filter for the first time.

These objects must be created once but each time we come with a new
filter, we overwrite such pre-existing allocation, if any.

Thus, this patch checks if the filter has already been allocated
before going ahead.
Spotted-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
LKML-Reference: <4A9CB1BA.3060402@cn.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
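
The leak described in this commit message comes from allocating over an existing pointer. A minimal user-space sketch of the "allocate only once" check follows; `filter_sketch`, `event_sketch`, `MAX_PREDS_SKETCH` and `init_filter_once()` are hypothetical names and do not mirror the kernel's filter structures.

```c
#include <stddef.h>
#include <stdlib.h>

#define MAX_PREDS_SKETCH 8	/* hypothetical capacity, not the kernel's value */

struct filter_sketch {
	int n_preds;
	int preds[MAX_PREDS_SKETCH];
};

struct event_sketch {
	struct filter_sketch *filter;	/* NULL until the first filter is set */
};

/*
 * Allocate the filter storage only if it does not exist yet.
 * Returns 0 on success and -1 if the allocation fails.
 */
static int init_filter_once(struct event_sketch *ev)
{
	if (ev->filter)
		return 0;	/* already allocated: reuse, do not overwrite */

	ev->filter = calloc(1, sizeof(*ev->filter));
	return ev->filter ? 0 : -1;
}
```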

c58b4321

tracing: remove users of tracing_reset · 76f0d073

Authored by Steven Rostedt on Sep 04, 2009

The function tracing_reset is deprecated for outside use of trace.c.

The new function to reset the buffers is tracing_reset_online_cpus.

The reason for this is that resetting the buffers while the event
trace points are active can corrupt the buffers, because they may
be writing at the time of reset. The tracing_reset_online_cpus disables
writes and waits for current writers to finish.

This patch replaces all users of tracing_reset except for the latency
tracers. Those changes require more work and will be removed in the
following patches.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

76f0d073

tracing: disable buffers and synchronize_sched before resetting · 621968cd

Authored by Steven Rostedt on Sep 04, 2009

Resetting the ring buffers while traces are happening can corrupt
the ring buffer and disable it (no kernel crash to worry about).

The safest thing to do is disable the ring buffers, call synchronize_sched()
to wait for all current writers to finish and then reset the buffer.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

621968cd

04 Sep, 2009 11 commits

tracing: disable update max tracer while reading trace · b8de7bd1

Authored by Steven Rostedt on Aug 31, 2009

When reading the tracer from the trace file, updating the max latency
may corrupt the output. This patch disables the tracing of the max
latency while reading the trace file.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

b8de7bd1

tracing: print out start and stop in latency traces · 8248ac05

Authored by Steven Rostedt on Sep 02, 2009

During development of the tracer, we would copy information from
the live tracer to the max tracer with one memcpy. Since then we
added a generic ring buffer and we handle the copies differently now.
Unfortunately, we never copied the critical section information, and
we lost the output:

 #  => started at: kmem_cache_alloc
 #  => ended at:   kmem_cache_alloc

This patch adds back the critical start and end copying as well as
removes the unused "trace_idx" and "overrun" fields of the
trace_array_cpu structure.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

8248ac05

ring-buffer: disable all cpu buffers when one finds a problem · 077c5407

Authored by Steven Rostedt on Sep 03, 2009

Currently the way RB_WARN_ON works, is to disable either the current
CPU buffer or all CPU buffers, depending on whether a ring_buffer or
ring_buffer_per_cpu struct was passed into the macro.

Most users of the RB_WARN_ON pass in the CPU buffer, so only the one
CPU buffer gets disabled but the rest are still active. This may
confuse users even though a warning is sent to the console.

This patch changes the macro to disable the entire buffer even if
the CPU buffer is passed in.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

077c5407

ring-buffer: do not count discarded events · a1863c21

Authored by Steven Rostedt on Sep 03, 2009

The latency tracers report the number of items in the trace buffer.
This uses the ring buffer data to calculate this. Because discarded
events are also counted, the numbers do not match the number of items
that are printed. The ring buffer also adds a "padding" item to the
end of each buffer page which also gets counted as a discarded item.

This patch decrements the counter to the page entries on a discard.
This allows us to ignore discarded entries while reading the buffer.

Decrementing the counter is still safe since it can only happen while
the committing flag is still set.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

a1863c21

ring-buffer: remove ring_buffer_event_discard · dc892f73

Authored by Steven Rostedt on Sep 03, 2009

The function ring_buffer_event_discard can be used on any item in the
ring buffer, even after the item was committed. This function provides
no safety nets and is very race prone.

An item may be safely removed from the ring buffer before it is committed
with the ring_buffer_discard_commit.

Since there are currently no users of this function, and because this
function is racy and error prone, this patch removes it altogether.

Note, removing this function also allows the counters to ignore
all discarded events (patches will follow).
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

dc892f73

ring-buffer: fix ring_buffer_read crossing pages · 7e9391cf

Authored by Steven Rostedt on Sep 03, 2009

When the ring buffer uses an iterator (static read mode, not on-the-fly
reading) and crosses a page boundary, it will skip the first
entry on the next page. The reason is that the last entry of a page
is usually padding if the page is not full. The padding will not be
returned to the user.

The problem arises on ring_buffer_read because it also increments the
iterator. Because both the read and peek use the same rb_iter_peek,
the rb_iter_peek will return the padding but also increment to the next
item. This is because the ring_buffer_peek will not increment it
itself.

The ring_buffer_read will increment it again and then call rb_iter_peek
again to get the next item. But that will be the second item, not the
first one on the page.

The reason this never showed up before, is because the ftrace utility
always calls ring_buffer_peek first and only uses ring_buffer_read
to increment to the next item. The ring_buffer_peek will always keep
the pointer to a valid item and not padding. This just hid the bug.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

7e9391cf

ring-buffer: remove unnecessary cpu_relax · 1b959e18

Authored by Steven Rostedt on Sep 03, 2009

The loops in the ring buffer that use cpu_relax are not dependent on
other CPUs. They simply came across some padding in the ring buffer and
are skipping over them. It is a normal loop and does not require a
cpu_relax.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

1b959e18

ring-buffer: do not swap buffers during a commit · 98277991

Authored by Steven Rostedt on Sep 02, 2009

If a commit is taking place on a CPU ring buffer, do not allow it to
be swapped. Return -EBUSY when this is detected instead.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
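
The guard described in this commit message can be illustrated with a small user-space sketch. `cpu_buffer_sketch` and `swap_cpu_buffers()` are hypothetical names, and a plain integer stands in for the committing state; the sketch only shows a swap being refused with -EBUSY while a commit is in progress on either buffer.

```c
#include <errno.h>

/* Hypothetical per-CPU buffer with a flag set while a commit is running. */
struct cpu_buffer_sketch {
	int committing;	/* non-zero while a writer is inside a commit */
	int id;
};

/*
 * Swap two buffer pointers unless either buffer is in the middle of a
 * commit. Returns 0 on success and -EBUSY if the swap was refused.
 */
static int swap_cpu_buffers(struct cpu_buffer_sketch **a,
			    struct cpu_buffer_sketch **b)
{
	struct cpu_buffer_sketch *tmp;

	if ((*a)->committing || (*b)->committing)
		return -EBUSY;

	tmp = *a;
	*a = *b;
	*b = tmp;
	return 0;
}
```

A real implementation would need the check and the swap to be safe against concurrent writers; this single-threaded sketch does not attempt that.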

98277991

ring-buffer: do not reset while in a commit · 41b6a95d

Authored by Steven Rostedt on Sep 02, 2009

The callers of reset must ensure that no commit can be taking place
at the time of the reset. If it does then we may corrupt the ring buffer.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

41b6a95d

sched: Fix dynamic power-balancing crash · d7ea17a7

Authored by Ingo Molnar on Sep 04, 2009

This crash:

[ 1774.088275] divide error: 0000 [#1] SMP
[ 1774.100355] CPU 13
[ 1774.102498] Modules linked in:
[ 1774.105631] Pid: 30881, comm: hackbench Not tainted 2.6.31-rc8-tip-01308-g484d664-dirty #1629 X8DTN
[ 1774.114807] RIP: 0010:[<ffffffff81041c38>]  [<ffffffff81041c38>] sched_balance_self+0x19b/0x2d4

Triggers because update_group_power() modifies the sd tree and does
temporary calculations there - not considering that other CPUs
could observe intermediate values, such as the zero initial value.

Calculate it in a temporary variable instead. (we need no memory
barrier as these are all statistical values anyway)
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20090904092742.GA11014@elte.hu>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
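
The fix described in this commit message, accumulating in a local variable and storing the result once, can be sketched as follows. `group_sketch` and `update_group_power_sketch()` are hypothetical names; the sketch only shows that the shared field never holds the zero starting value or a partial sum while the calculation runs.

```c
#include <stddef.h>

/* Hypothetical group descriptor whose cpu_power is read by other CPUs. */
struct group_sketch {
	unsigned long cpu_power;	/* readers divide by this value */
};

/*
 * Recompute the group's power from its children. The running total is
 * kept in a local variable, so the shared field is written exactly once
 * with the final value instead of being reset to zero and built up in
 * place.
 */
static void update_group_power_sketch(struct group_sketch *group,
				      const unsigned long *child_power,
				      size_t n)
{
	unsigned long power = 0;
	size_t i;

	for (i = 0; i < n; i++)
		power += child_power[i];

	group->cpu_power = power;	/* single store of the final value */
}
```

In portable concurrent C the final store would normally be an atomic or otherwise annotated access; the sketch leaves that out, in line with the message's remark that these are statistical values.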

d7ea17a7

sched: Remove reciprocal for cpu_power · 18a3885f

Authored by Peter Zijlstra on Sep 01, 2009

It's a source of failures. Also, now that cpu_power is dynamic,
it's a waste of time.

before:
<idle>-0   [000]   132.877936: find_busiest_group: avg_load: 0 group_load: 8241 power: 1

after:
bash-1689  [001]   137.862151: find_busiest_group: avg_load: 10636288 group_load: 10387 power: 1

[ v2: build fix from Andreas Herrmann ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Acked-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Acked-by: Gautham R Shenoy <ego@in.ibm.com>
Cc: Balbir Singh <balbir@in.ibm.com>
LKML-Reference: <20090901083826.425896304@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

18a3885f

OpenHarmony / kernel_linux, last synced more than 4 years ago
