提交 · bc2a27cd27271c5257989a57f511be86b26f5e54 · OpenHarmony / kernel_linux

13 9月, 2012 5 次提交

sched: cpu_power: enable ARCH_POWER · bc2a27cd

由 Vincent Guittot 提交于 7月 09, 2012

Heteregeneous ARM platform uses arch_scale_freq_power function
to reflect the relative capacity of each core
Signed-off-by: NVincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1341826026-6504-6-git-send-email-vincent.guittot@linaro.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

bc2a27cd

sched/nohz: Clean up select_nohz_load_balancer() · c1cc017c

由 Alex Shi 提交于 9月 10, 2012

There is no load_balancer to be selected now. It just sets the
state of the nohz tick to stop.

So rename the function, pass the 'cpu' as a parameter and then
remove the useless call from tick_nohz_restart_sched_tick().

[ s/set_nohz_tick_stopped/nohz_balance_enter_idle/g
  s/clear_nohz_tick_stopped/nohz_balance_exit_idle/g ]
Signed-off-by: NAlex Shi <alex.shi@intel.com>
Acked-by: NSuresh Siddha <suresh.b.siddha@intel.com>
Cc: Venkatesh Pallipadi <venki@google.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1347261059-24747-1-git-send-email-alex.shi@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

c1cc017c

sched: Fix load avg vs. cpu-hotplug · 08bedae1

由 Peter Zijlstra 提交于 9月 06, 2012

Commit f319da0c ("sched: Fix load avg vs cpu-hotplug") was an
incomplete fix:

In particular, the problem is that at the point it calls
calc_load_migrate() nr_running := 1 (the stopper thread), so move the
call to CPU_DEAD where we're sure that nr_running := 0.

Also note that we can call calc_load_migrate() without serialization, we
know the state of rq is stable since its cpu is dead, and we modify the
global state using appropriate atomic ops.
Suggested-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1346882630.2600.59.camel@twinsSigned-off-by: NIngo Molnar <mingo@kernel.org>

08bedae1

sched: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW · f3e94786

由 Peter Zijlstra 提交于 9月 12, 2012

Now that the last architecture to use this has stopped doing so (ARM,
thanks Catalin!) we can remove this complexity from the scheduler
core.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Link: http://lkml.kernel.org/n/tip-g9p2a1w81xxbrze25v9zpzbf@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

f3e94786

sched: Fix nohz_idle_balance() · 5ed4f1d9

由 Vincent Guittot 提交于 9月 13, 2012

On tickless systems, one CPU runs load balance for all idle CPUs.

The cpu_load of this CPU is updated before starting the load balance
of each other idle CPUs. We should instead update the cpu_load of
the balance_cpu.
Signed-off-by: NVincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Venkatesh Pallipadi <venki@google.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/1347509486-8688-1-git-send-email-vincent.guittot@linaro.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

5ed4f1d9

04 9月, 2012 8 次提交

sched: Remove useless code in yield_to() · 38b8dd6f

由 Michael Wang 提交于 7月 03, 2012

It's impossible to enter the else branch if we have set
skip_clock_update in task_yield_fair(), as yield_to_task_fair()
will directly return true after invoke task_yield_fair().
Signed-off-by: NMichael Wang <wangyun@linux.vnet.ibm.com>
Acked-by: NMike Galbraith <efault@gmx.de>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/4FF2925A.9060005@linux.vnet.ibm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

38b8dd6f

sched: Add time unit suffix to sched sysctl knobs · d00535db

由 Namhyung Kim 提交于 8月 16, 2012

Unlike others, sched_migration_cost, sched_time_avg and
sched_shares_window doesn't have time unit as suffix. Add them.
Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1345083330-19486-1-git-send-email-namhyung@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

d00535db

sched/debug: Limit sd->*_idx range on sysctl · 201c373e

由 Namhyung Kim 提交于 8月 16, 2012

Various sd->*_idx's are used for refering the rq's load average table
when selecting a cpu to run. However they can be set to any number
with sysctl knobs so that it can crash the kernel if something bad is
given. Fix it by limiting them into the actual range.
Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1345104204-8317-1-git-send-email-namhyung@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

201c373e

sched: Remove AFFINE_WAKEUPS feature flag · c751134e

由 Namhyung Kim 提交于 8月 16, 2012

Commit beac4c7e ("sched: Remove AFFINE_WAKEUPS feature") removed
use of the flag but left the definition. Get rid of it.
Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Link: http://lkml.kernel.org/r/1345090865-20851-1-git-send-email-namhyung@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

c751134e

sched: Fix kernel-doc warnings in kernel/sched/fair.c · 9450d57e

由 Randy Dunlap 提交于 8月 18, 2012

Fix two kernel-doc warnings in kernel/sched/fair.c:

Warning(kernel/sched/fair.c:3660): Excess function parameter 'cpus' description in 'update_sg_lb_stats'
Warning(kernel/sched/fair.c:3806): Excess function parameter 'cpus' description in 'update_sd_lb_stats'
Signed-off-by: NRandy Dunlap <rdunlap@xenotime.net>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/50303714.3090204@xenotime.netSigned-off-by: NIngo Molnar <mingo@kernel.org>

9450d57e

sched: Unthrottle rt runqueues in __disable_runtime() · a4c96ae3

由 Peter Boonstoppel 提交于 8月 09, 2012

migrate_tasks() uses _pick_next_task_rt() to get tasks from the
real-time runqueues to be migrated. When rt_rq is throttled
_pick_next_task_rt() won't return anything, in which case
migrate_tasks() can't move all threads over and gets stuck in an
infinite loop.

Instead unthrottle rt runqueues before migrating tasks.

Additionally: move unthrottle_offline_cfs_rqs() to rq_offline_fair()
Signed-off-by: NPeter Boonstoppel <pboonstoppel@nvidia.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Turner <pjt@google.com>
Link: http://lkml.kernel.org/r/5FBF8E85CA34454794F0F7ECBA79798F379D3648B7@HQMAIL04.nvidia.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

a4c96ae3

sched: Add missing call to calc_load_exit_idle() · 749c8814

由 Charles Wang 提交于 8月 20, 2012

Azat Khuzhin reported high loadavg in Linux v3.6

After checking the upstream scheduler code, I found Peter's commit:

5167e8d5 sched/nohz: Rewrite and fix load-avg computation -- again

not fully applied, missing the call to calc_load_exit_idle().

After that idle exit in sampling window will always be calculated
to non-idle, and the load will be higher than normal.

This patch adds the missing call to calc_load_exit_idle().
Signed-off-by: NCharles Wang <muming.wq@taobao.com>
Cc: stable@kernel.org
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1345449754-27130-1-git-send-email-muming.wq@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

749c8814

sched: Fix load avg vs cpu-hotplug · f319da0c

由 Peter Zijlstra 提交于 8月 20, 2012

Rabik and Paul reported two different issues related to the same few
lines of code.

Rabik's issue is that the nr_uninterruptible migration code is wrong in
that he sees artifacts due to this (Rabik please do expand in more
detail).

Paul's issue is that this code as it stands relies on us using
stop_machine() for unplug, we all would like to remove this assumption
so that eventually we can remove this stop_machine() usage altogether.

The only reason we'd have to migrate nr_uninterruptible is so that we
could use for_each_online_cpu() loops in favour of
for_each_possible_cpu() loops, however since nr_uninterruptible() is the
only such loop and its using possible lets not bother at all.

The problem Rabik sees is (probably) caused by the fact that by
migrating nr_uninterruptible we screw rq->calc_load_active for both rqs
involved.

So don't bother with fancy migration schemes (meaning we now have to
keep using for_each_possible_cpu()) and instead fold any nr_active delta
after we migrate all tasks away to make sure we don't have any skewed
nr_active accounting.
Reported-by: NRakib Mullick <rakib.mullick@gmail.com>
Reported-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1345454817.23018.27.camel@twinsSigned-off-by: NIngo Molnar <mingo@kernel.org>

f319da0c

02 9月, 2012 1 次提交

time: Move ktime_t overflow checking into timespec_valid_strict · cee58483

由 John Stultz 提交于 8月 31, 2012

Andreas Bombe reported that the added ktime_t overflow checking added to
timespec_valid in commit 4e8b1452 ("time: Improve sanity checking of
timekeeping inputs") was causing problems with X.org because it caused
timeouts larger then KTIME_T to be invalid.

Previously, these large timeouts would be clamped to KTIME_MAX and would
never expire, which is valid.

This patch splits the ktime_t overflow checking into a new
timespec_valid_strict function, and converts the timekeeping codes
internal checking to use this more strict function.
Reported-and-tested-by: NAndreas Bombe <aeb@debian.org>
Cc: Zhouping Liu <zliu@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

cee58483

22 8月, 2012 6 次提交

task_work: add a scheduling point in task_work_run() · 88ec2789

由 Eric Dumazet 提交于 8月 21, 2012

It seems commit 4a9d4b02 (switch fput to task_work_add) reintroduced
the problem addressed in commit 944be0b2 (close_files(): add scheduling
point)

If a server process with a lot of files (say 2 million tcp sockets)
is killed, we can spend a lot of time in task_work_run() and trigger
a soft lockup.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

88ec2789

time: Avoid making adjustments if we haven't accumulated anything · bf2ac312

由 John Stultz 提交于 8月 21, 2012

If update_wall_time() is called and the current offset isn't large
enough to accumulate, avoid re-calling timekeeping_adjust which may
change the clock freq and can cause 1ns inconsistencies with
CLOCK_REALTIME_COARSE/CLOCK_MONOTONIC_COARSE.
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1345595449-34965-5-git-send-email-john.stultz@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

bf2ac312

time: Avoid potential shift overflow with large shift values · 6ea565a9

由 John Stultz 提交于 8月 21, 2012

Andreas Schwab noticed that the 1 << tk->shift could overflow if the
shift value was greater than 30, since 1 would be a 32bit long on
32bit architectures. This issue was introduced by 1e75fa8b (time:
Condense timekeeper.xtime into xtime_sec)

Use 1ULL instead to ensure we don't overflow on the shift.
Reported-by: NAndreas Schwab <schwab@linux-m68k.org>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1345595449-34965-4-git-send-email-john.stultz@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

6ea565a9

time: Fix casting issue in timekeeping_forward_now · 85dc8f05

由 Andreas Schwab 提交于 8月 21, 2012

arch_gettimeoffset returns a u32 value which when shifted by tk->shift
can overflow. This issue was introduced with 1e75fa8b (time: Condense
timekeeper.xtime into xtime_sec)

Cast it to u64 first.
Signed-off-by: NAndreas Schwab <schwab@linux-m68k.org>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1345595449-34965-3-git-send-email-john.stultz@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

85dc8f05

time: Ensure we normalize the timekeeper in tk_xtime_add · 784ffcbb

由 John Stultz 提交于 8月 21, 2012

Andreas noticed problems with resume on specific hardware after commit
1e75fa8b (time: Condense timekeeper.xtime into xtime_sec) combined
with commit b44d50dc (time: Fix casting issue in tk_set_xtime and
tk_xtime_add)

After some digging I realized we aren't normalizing the timekeeper
after the add. Add the missing normalize call.
Reported-by: NAndreas Schwab <schwab@linux-m68k.org>
Tested-by: NAndreas Schwab <schwab@linux-m68k.org>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1345595449-34965-2-git-send-email-john.stultz@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

784ffcbb

task_work: add a scheduling point in task_work_run() · f341861f

由 Eric Dumazet 提交于 8月 21, 2012

It seems commit 4a9d4b02 ("switch fput to task_work_add") re-
introduced the problem addressed in 944be0b2 ("close_files(): add
scheduling point")

If a server process with a lot of files (say 2 million tcp sockets) is
killed, we can spend a lot of time in task_work_run() and trigger a soft
lockup.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

f341861f

21 8月, 2012 1 次提交

uprobes: Fix mmap_region()'s mm->mm_rb corruption if uprobe_mmap() fails · c7a3a88c

由 Oleg Nesterov 提交于 8月 19, 2012

This patch fixes:

  https://bugzilla.redhat.com/show_bug.cgi?id=843640

If mmap_region()->uprobe_mmap() fails, unmap_and_free_vma path
does unmap_region() but does not remove the soon-to-be-freed vma
from rb tree. Actually there are more problems but this is how
William noticed this bug.

Perhaps we could do do_munmap() + return in this case, but in
fact it is simply wrong to abort if uprobe_mmap() fails. Until
at least we move the !UPROBE_COPY_INSN code from
install_breakpoint() to uprobe_register().

For example, uprobe_mmap()->install_breakpoint() can fail if the
probed insn is not supported (remember, uprobe_register()
succeeds if nobody mmaps inode/offset), mmap() should not fail
in this case.

dup_mmap()->uprobe_mmap() is wrong too by the same reason,
fork() can race with uprobe_register() and fail for no reason if
it wins the race and does install_breakpoint() first.

And, if nothing else, both mmap_region() and dup_mmap() return
success if uprobe_mmap() fails. Change them to ignore the error
code from uprobe_mmap().
Reported-and-tested-by: NWilliam Cohen <wcohen@redhat.com>
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org> # v3.5
Cc: Anton Arapov <anton@redhat.com>
Cc: William Cohen <wcohen@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20120819171042.GB26957@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

c7a3a88c

20 8月, 2012 2 次提交

cputime: Consolidate vtime handling on context switch · baa36046

由 Frederic Weisbecker 提交于 6月 18, 2012

The archs that implement virtual cputime accounting all
flush the cputime of a task when it gets descheduled
and sometimes set up some ground initialization for the
next task to account its cputime.

These archs all put their own hooks in their context
switch callbacks and handle the off-case themselves.

Consolidate this by creating a new account_switch_vtime()
callback called in generic code right after a context switch
and that these archs must implement to flush the prev task
cputime and initialize the next task cputime related state.
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Acked-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>

baa36046

sched: Move cputime code to its own file · 73fbec60

由 Frederic Weisbecker 提交于 6月 16, 2012

Extract cputime code from the giant sched/core.c and
put it in its own file. This make it easier to deal with
this particular area and de-bloat a bit more core.c
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Acked-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>

73fbec60

19 8月, 2012 1 次提交

alpha: take a bunch of syscalls into osf_sys.c · be53db6e

由 Al Viro 提交于 8月 19, 2012

New helper: current_thread_info().  Allows to do a bunch of odd syscalls
in C. While we are at it, there had never been a reason to do
osf_getpriority() in assembler.  We also get "namespace"-aware (read:
consistent with getuid(2), etc.) behaviour from getx?id() syscalls now.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NMichael Cree <mcree@orcon.net.nz>
Acked-by: NMatt Turner <mattst88@gmail.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

be53db6e

18 8月, 2012 1 次提交

tracing/syscalls: Fix perf syscall tracing when syscall_nr == -1 · 60916a93

由 Will Deacon 提交于 8月 16, 2012

syscall_get_nr can return -1 in the case that the task is not executing
a system call.

This patch fixes perf_syscall_{enter,exit} to check that the syscall
number is valid before using it as an index into a bitmap.

Link: http://lkml.kernel.org/r/1345137254-7377-1-git-send-email-will.deacon@arm.com

Cc: Jason Baron <jbaron@redhat.com>
Cc: Wade Farnsworth <wade_farnsworth@mentor.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

60916a93

15 8月, 2012 4 次提交

time: Improve sanity checking of timekeeping inputs · 4e8b1452

由 John Stultz 提交于 8月 08, 2012

Unexpected behavior could occur if the time is set to a value large
enough to overflow a 64bit ktime_t (which is something larger then the
year 2262).

Also unexpected behavior could occur if large negative offsets are
injected via adjtimex.

So this patch improves the sanity check timekeeping inputs by
improving the timespec_valid() check, and then makes better use of
timespec_valid() to make sure we don't set the time to an invalid
negative value or one that overflows ktime_t.

Note: This does not protect from setting the time close to overflowing
ktime_t and then letting natural accumulation cause the overflow.
Reported-by: NCAI Qian <caiqian@redhat.com>
Reported-by: NSasha Levin <levinsasha928@gmail.com>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Zhouping Liu <zliu@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1344454580-17031-1-git-send-email-john.stultz@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

4e8b1452

audit: clean up refcounting in audit-tree · b3e8692b

由 Miklos Szeredi 提交于 8月 15, 2012

Drop the initial reference by fsnotify_init_mark early instead of
audit_tree_freeing_mark() at destroy time.

In the cases we destroy the mark before we drop the initial reference we need to
get rid of the get_mark that balances the put_mark in audit_tree_freeing_mark().
Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>

b3e8692b

audit: fix refcounting in audit-tree · a2140fc0

由 Miklos Szeredi 提交于 8月 15, 2012

Refcounting of fsnotify_mark in audit tree is broken.  E.g:

                              refcount
create_chunk
  alloc_chunk                 1
  fsnotify_add_mark           2

untag_chunk
  fsnotify_get_mark           3
  fsnotify_destroy_mark
    audit_tree_freeing_mark   2
  fsnotify_put_mark           1
  fsnotify_put_mark           0
  via destroy_list
    fsnotify_mark_destroy    -1

This was reported by various people as triggering Oops when stopping auditd.

We could just remove the put_mark from audit_tree_freeing_mark() but that would
break freeing via inode destruction.  So this patch simply omits a put_mark
after calling destroy_mark or adds a get_mark before.

The additional get_mark is necessary where there's no other put_mark after
fsnotify_destroy_mark() since it assumes that the caller is holding a reference
(or the inode is keeping the mark pinned, not the case here AFAICS).
Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
Reported-by: NValentin Avram <aval13@gmail.com>
Reported-by: NPeter Moody <pmoody@google.com>
Acked-by: NEric Paris <eparis@redhat.com>
CC: stable@vger.kernel.org

a2140fc0

audit: don't free_chunk() after fsnotify_add_mark() · 0fe33aae

由 Miklos Szeredi 提交于 8月 15, 2012

Don't do free_chunk() after fsnotify_add_mark().  That one does a delayed unref
via the destroy list and this results in use-after-free.
Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
Acked-by: NEric Paris <eparis@redhat.com>
CC: stable@vger.kernel.org

0fe33aae

14 8月, 2012 9 次提交

sched: recover SD_WAKE_AFFINE in select_task_rq_fair and code clean up · f03542a7

由 Alex Shi 提交于 7月 26, 2012

Since power saving code was removed from sched now, the implement
code is out of service in this function, and even pollute other logical.
like, 'want_sd' never has chance to be set '0', that remove the effect
of SD_WAKE_AFFINE here.

So, clean up the obsolete code, includes SD_PREFER_LOCAL.
Signed-off-by: NAlex Shi <alex.shi@intel.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/5028F431.6000306@intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

f03542a7

sched: using dst_rq instead of this_rq during load balance · 78feefc5

由 Michael Wang 提交于 8月 06, 2012

As we already have dst_rq in lb_env, using or changing "this_rq" do not
make sense.

This patch will replace "this_rq" with dst_rq in load_balance, and we
don't need to change "this_rq" while process LBF_SOME_PINNED any more.
Signed-off-by: NMichael Wang <wangyun@linux.vnet.ibm.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/501F8357.3070102@linux.vnet.ibm.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

78feefc5

sched: Document schedule() entry points · edde96ea

由 Pekka Enberg 提交于 8月 04, 2012

This patch adds a comment on top of the schedule() function to explain
to scheduler newbies how the main scheduler function is entered.
Acked-by: NRandy Dunlap <rdunlap@xenotime.net>
Explained-by: NIngo Molnar <mingo@kernel.org>
Explained-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: NPekka Enberg <penberg@kernel.org>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1344070187-2420-1-git-send-email-penberg@kernel.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

edde96ea

sched: Fix __sched_period comment · 532b1858

由 Borislav Petkov 提交于 8月 08, 2012

It should be sched_nr_latency so fix it before it annoys me more.
Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1344435364-18632-1-git-send-email-bp@amd64.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

532b1858

sched: Fix migration thread runtime bogosity · 8f618968

由 Mike Galbraith 提交于 8月 04, 2012

Make stop scheduler class do the same accounting as other classes,

Migration threads can be caught in the act while doing exec balancing,
leading to the below due to use of unmaintained ->se.exec_start.  The
load that triggered this particular instance was an apparently out of
control heavily threaded application that does system monitoring in
what equated to an exec bomb, with one of the VERY frequently migrated
tasks being ps.

%CPU   PID USER     CMD
99.3    45 root     [migration/10]
97.7    53 root     [migration/12]
97.0    57 root     [migration/13]
90.1    49 root     [migration/11]
89.6    65 root     [migration/15]
88.7    17 root     [migration/3]
80.4    37 root     [migration/8]
78.1    41 root     [migration/9]
44.2    13 root     [migration/2]
Signed-off-by: NMike Galbraith <mgalbraith@suse.de>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1344051854.6739.19.camel@marge.simpson.netSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

8f618968

sched,rt: fix isolated CPUs leaving root_task_group indefinitely throttled · e221d028

由 Mike Galbraith 提交于 8月 07, 2012

Root task group bandwidth replenishment must service all CPUs, regardless of
where the timer was last started, and regardless of the isolation mechanism,
lest 'Quoth the Raven, "Nevermore"' become rt scheduling policy.
Signed-off-by: NMike Galbraith <efault@gmx.de>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1344326558.6968.25.camel@marge.simpson.netSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

e221d028

sched,cgroup: Fix up task_groups list · 35cf4e50

由 Mike Galbraith 提交于 8月 07, 2012

With multiple instances of task_groups, for_each_rt_rq() is a noop,
no task groups having been added to the rt.c list instance.  This
renders __enable/disable_runtime() and print_rt_stats() noop, the
user (non) visible effect being that rt task groups are missing in
/proc/sched_debug.
Signed-off-by: NMike Galbraith <efault@gmx.de>
Cc: stable@kernel.org # v3.3+
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1344308413.6846.7.camel@marge.simpson.netSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

35cf4e50

sched: fix divide by zero at {thread_group,task}_times · bea6832c

由 Stanislaw Gruszka 提交于 8月 08, 2012

On architectures where cputime_t is 64 bit type, is possible to trigger
divide by zero on do_div(temp, (__force u32) total) line, if total is a
non zero number but has lower 32 bit's zeroed. Removing casting is not
a good solution since some do_div() implementations do cast to u32
internally.

This problem can be triggered in practice on very long lived processes:

  PID: 2331   TASK: ffff880472814b00  CPU: 2   COMMAND: "oraagent.bin"
   #0 [ffff880472a51b70] machine_kexec at ffffffff8103214b
   #1 [ffff880472a51bd0] crash_kexec at ffffffff810b91c2
   #2 [ffff880472a51ca0] oops_end at ffffffff814f0b00
   #3 [ffff880472a51cd0] die at ffffffff8100f26b
   #4 [ffff880472a51d00] do_trap at ffffffff814f03f4
   #5 [ffff880472a51d60] do_divide_error at ffffffff8100cfff
   #6 [ffff880472a51e00] divide_error at ffffffff8100be7b
      [exception RIP: thread_group_times+0x56]
      RIP: ffffffff81056a16  RSP: ffff880472a51eb8  RFLAGS: 00010046
      RAX: bc3572c9fe12d194  RBX: ffff880874150800  RCX: 0000000110266fad
      RDX: 0000000000000000  RSI: ffff880472a51eb8  RDI: 001038ae7d9633dc
      RBP: ffff880472a51ef8   R8: 00000000b10a3a64   R9: ffff880874150800
      R10: 00007fcba27ab680  R11: 0000000000000202  R12: ffff880472a51f08
      R13: ffff880472a51f10  R14: 0000000000000000  R15: 0000000000000007
      ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
   #7 [ffff880472a51f00] do_sys_times at ffffffff8108845d
   #8 [ffff880472a51f40] sys_times at ffffffff81088524
   #9 [ffff880472a51f80] system_call_fastpath at ffffffff8100b0f2
      RIP: 0000003808caac3a  RSP: 00007fcba27ab6d8  RFLAGS: 00000202
      RAX: 0000000000000064  RBX: ffffffff8100b0f2  RCX: 0000000000000000
      RDX: 00007fcba27ab6e0  RSI: 000000000076d58e  RDI: 00007fcba27ab6e0
      RBP: 00007fcba27ab700   R8: 0000000000000020   R9: 000000000000091b
      R10: 00007fcba27ab680  R11: 0000000000000202  R12: 00007fff9ca41940
      R13: 0000000000000000  R14: 00007fcba27ac9c0  R15: 00007fff9ca41940
      ORIG_RAX: 0000000000000064  CS: 0033  SS: 002b

Cc: stable@vger.kernel.org
Signed-off-by: NStanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20120808092714.GA3580@redhat.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

bea6832c

sched, cgroup: Reduce rq->lock hold times for large cgroup hierarchies · a35b6466

由 Peter Zijlstra 提交于 8月 08, 2012

Peter Portante reported that for large cgroup hierarchies (and or on
large CPU counts) we get immense lock contention on rq->lock and stuff
stops working properly.

His workload was a ton of processes, each in their own cgroup,
everybody idling except for a sporadic wakeup once every so often.

It was found that:

  schedule()
    idle_balance()
      load_balance()
        local_irq_save()
        double_rq_lock()
        update_h_load()
          walk_tg_tree(tg_load_down)
            tg_load_down()

Results in an entire cgroup hierarchy walk under rq->lock for every
new-idle balance and since new-idle balance isn't throttled this
results in a lot of work while holding the rq->lock.

This patch does two things, it removes the work from under rq->lock
based on the good principle of race and pray which is widely employed
in the load-balancer as a whole. And secondly it throttles the
update_h_load() calculation to max once per jiffy.

I considered excluding update_h_load() for new-idle balance
all-together, but purely relying on regular balance passes to update
this data might not work out under some rare circumstances where the
new-idle busiest isn't the regular busiest for a while (unlikely, but
a nightmare to debug if someone hits it and suffers).

Cc: pjt@google.com
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Reported-by: NPeter Portante <pportant@redhat.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-aaarrzfpnaam7pqrekofu8a6@git.kernel.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

a35b6466

13 8月, 2012 1 次提交

printk: Fix calculation of length used to discard records · e3756477

由 Jeff Mahoney 提交于 8月 10, 2012

While tracking down a weird buffer overflow issue in a program that
looked to be sane, I started double checking the length returned by
syslog(SYSLOG_ACTION_READ_ALL, ...) to make sure it wasn't overflowing
the buffer.

Sure enough, it was.  I saw this in strace:

  11339 syslog(SYSLOG_ACTION_READ_ALL, "<5>[244017.708129] REISERFS (dev"..., 8192) = 8279

It turns out that the loops that calculate how much space the entries
will take when they're copied don't include the newlines and prefixes
that will be included in the final output since prev flags is passed as
zero.

This patch properly accounts for it and fixes the overflow.

CC: stable@kernel.org
Signed-off-by: NJeff Mahoney <jeffm@suse.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e3756477

09 8月, 2012 1 次提交

Revert "NMI watchdog: fix for lockup detector breakage on resume" · 300d3739

由 Rafael J. Wysocki 提交于 8月 07, 2012

Revert commit 45226e94 (NMI watchdog: fix for lockup detector breakage
on resume) which breaks resume from system suspend on my SH7372
Mackerel board (by causing a NULL pointer dereference to happen) and
is generally wrong, because it abuses the CPU hotplug functionality
in a shamelessly blatant way.

The original issue should be addressed through appropriate syscore
resume callback instead.
Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl>

300d3739

OpenHarmony / kernel_linux 上一次同步 3 年多

OpenHarmony / kernel_linux
上一次同步 3 年多