  1. 09 Dec 2010 (2 commits)
  2. 07 Dec 2010 (2 commits)
    • PM / Hibernate: Fix memory corruption related to swap · c9e664f1
      Authored by Rafael J. Wysocki
      There is a problem that swap pages allocated before the creation of
      a hibernation image can be released and used for storing the contents
      of different memory pages while the image is being saved.  Since the
      kernel stored in the image doesn't know about this, memory corruption
      occurs after resume from hibernation, especially on systems with
      relatively small RAM that need to swap often.
      
      This issue can be addressed by keeping the GFP_IOFS bits clear
      in gfp_allowed_mask during the entire hibernation, including the
      saving of the image, until the system is finally turned off or
      the hibernation is aborted.  Unfortunately, for this purpose
      it's necessary to rework the way in which the hibernate and
      suspend code manipulates gfp_allowed_mask.
      
      This change is based on an earlier patch from Hugh Dickins.
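      
      A minimal sketch of the idea, assuming helpers roughly like the ones
      below (names and placement are illustrative, not necessarily the literal
      patch): the I/O and FS bits are cleared from gfp_allowed_mask for the
      whole hibernation window and only restored once the image has been
      written out or hibernation has been aborted.
      
        /* sketch only, not the verbatim upstream code */
        #include <linux/gfp.h>

        static gfp_t saved_gfp_mask;

        void pm_restrict_gfp_mask(void)
        {
                WARN_ON(saved_gfp_mask);
                saved_gfp_mask = gfp_allowed_mask;
                /* allocations may no longer start I/O, so the swap pages
                   holding the image cannot be recycled behind its back */
                gfp_allowed_mask &= ~GFP_IOFS;
        }

        void pm_restore_gfp_mask(void)
        {
                if (saved_gfp_mask) {
                        gfp_allowed_mask = saved_gfp_mask;
                        saved_gfp_mask = 0;
                }
        }
      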
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      Reported-by: Ondrej Zary <linux@rainbow-software.org>
      Acked-by: Hugh Dickins <hughd@google.com>
      Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: stable@kernel.org
    • PM / Hibernate: Use async I/O when reading compressed hibernation image · 9f339caf
      Authored by Bojan Smojver
      This is a fix for reading an LZO-compressed image using async I/O.
      Essentially, instead of having just one page into which we keep
      reading blocks from swap, we allocate enough of them to cover the
      largest compressed size and then let block I/O pick them all up. Once
      we have them all (and here we wait), we decompress them, as usual.
      Obviously, the very first block we still pick up synchronously,
      because we need to know the size of the lot before we pick up the
      rest.
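      
      A rough sketch of that flow; the helper names below (swap_read_page(),
      hib_wait_on_bio_chain(), LZO_HEADER) are assumptions modelled on
      kernel/power/swap.c and may not match the actual patch line for line.
      
        static int load_compressed_chunk(struct swap_map_handle *handle,
                                         unsigned char *cmp_buf, size_t *cmp_len)
        {
                struct bio *bio_chain = NULL;
                unsigned int nr_pages, i;
                int error;

                /* First page synchronously: it carries the compressed length. */
                error = swap_read_page(handle, cmp_buf, NULL);
                if (error)
                        return error;
                *cmp_len = *(size_t *)cmp_buf;

                nr_pages = DIV_ROUND_UP(LZO_HEADER + *cmp_len, PAGE_SIZE);

                /* Remaining pages asynchronously, chained into one bio list. */
                for (i = 1; i < nr_pages && !error; i++)
                        error = swap_read_page(handle, cmp_buf + i * PAGE_SIZE,
                                               &bio_chain);

                /* Wait for the whole chain before handing the buffer to LZO. */
                if (!error)
                        error = hib_wait_on_bio_chain(&bio_chain);
                return error;
        }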
      
      Also fixed the copyright line, which I had forgotten before.
      Signed-off-by: Bojan Smojver <bojan@rexursive.com>
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
  3. 03 Dec 2010 (1 commit)
    • do_exit(): make sure that we run with get_fs() == USER_DS · 33dd94ae
      Authored by Nelson Elhage
      If a user manages to trigger an oops with fs set to KERNEL_DS, fs is not
      otherwise reset before do_exit().  do_exit may later (via mm_release in
      fork.c) do a put_user to a user-controlled address, potentially allowing
      a user to leverage an oops into a controlled write into kernel memory.
      
      This is only triggerable in the presence of another bug, but this
      potentially turns a lot of DoS bugs into privilege escalations, so it's
      worth fixing.  I have proof-of-concept code which uses this bug along
      with CVE-2010-3849 to write a zero to an arbitrary kernel address, so
      I've tested that this is not theoretical.
      
      A more logical place to put this fix might be when we know an oops has
      occurred, before we call do_exit(), but that would involve changing
      every architecture, in multiple places.
      
      Let's just stick it in do_exit instead.
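      
      As a sketch (not the verbatim diff, and the exact placement within the
      function may differ), the fix boils down to a single line near the top
      of do_exit():
      
        NORET_TYPE void do_exit(long code)
        {
                /*
                 * If we oopsed while running with get_fs() == KERNEL_DS,
                 * restore USER_DS so that any later put_user() done on our
                 * behalf (e.g. clearing child_tid in mm_release()) cannot
                 * write to a kernel address.
                 */
                set_fs(USER_DS);

                /* ... the rest of do_exit() is unchanged ... */
        }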
      
      [akpm@linux-foundation.org: update code comment]
      Signed-off-by: Nelson Elhage <nelhage@ksplice.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. 01 Dec 2010 (1 commit)
  5. 30 Nov 2010 (2 commits)
    • sched: Add 'autogroup' scheduling feature: automated per session task groups · 5091faa4
      Authored by Mike Galbraith
      A recurring complaint from CFS users is that parallel kbuild has
      a negative impact on desktop interactivity.  This patch
      implements an idea from Linus, to automatically create task
      groups.  Currently, only per session autogroups are implemented,
      but the patch leaves the way open for enhancement.
      
      Implementation: each task's signal struct contains an inherited
      pointer to a refcounted autogroup struct containing a task group
      pointer, the default for all tasks pointing to the
      init_task_group.  When a task calls setsid(), a new task group
      is created, the process is moved into the new task group, and a
      reference to the previous task group is dropped.  Child
      processes inherit this task group thereafter, and increase its
      refcount.  When the last thread of a process exits, the
      process's reference is dropped, such that when the last process
      referencing an autogroup exits, the autogroup is destroyed.
      
      At runqueue selection time, IFF a task has no cgroup assignment,
      its current autogroup is used.
      
      Autogroup bandwidth is controllable by setting its nice level
      through the proc filesystem:
      
        cat /proc/<pid>/autogroup
      
      Displays the task's group and the group's nice level.
      
        echo <nice level> > /proc/<pid>/autogroup
      
      Sets the task group's shares to the weight of a nice <level> task.
      Setting the nice level is rate limited for !admin users due to the
      abuse risk of task group locking.
      
      The feature is enabled from boot by default if
      CONFIG_SCHED_AUTOGROUP=y is selected, but can be disabled via
      the boot option noautogroup, and can also be turned on/off on
      the fly via:
      
        echo [01] > /proc/sys/kernel/sched_autogroup_enabled
      
      ... which will automatically move tasks to/from the root task group.
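      
      A simplified sketch of the structures and the setsid() hook described
      above; names are modelled on the patch but should be read as
      illustrative rather than authoritative:
      
        struct autogroup {
                struct kref             kref;   /* dropped as referencing processes exit */
                struct task_group       *tg;    /* backing scheduler task group */
        };

        /* Called from setsid(): move the caller into a fresh per-session group. */
        void sched_autogroup_create_attach(struct task_struct *p)
        {
                struct autogroup *ag = autogroup_create();

                autogroup_move_group(p, ag);    /* attach p, drop ref on its old group */
                autogroup_kref_put(ag);         /* move_group took its own reference */
        }
      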
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Paul Turner <pjt@google.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      [ Removed the task_group_path() debug code, and fixed !EVENTFD build failure. ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      LKML-Reference: <1290281700.28711.9.camel@maggy.simson.net>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • sched: Fix unregister_fair_sched_group() · 822bc180
      Authored by Paul Turner
      In the flipping and flopping between calling
      unregister_fair_sched_group() on a per-cpu versus per-group basis
      we ended up in a bad state.
      
      Remove from the list for the passed cpu as opposed to some
      arbitrary index.
      
      ( This fixes explosions w/ autogroup as well as a group
        creation/destruction stress test. )
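      
      Sketched from the description (details may differ from the actual diff),
      the essence is that the leaf cfs_rq is removed for the cpu that was
      passed in, rather than for some other index:
      
        void unregister_fair_sched_group(struct task_group *tg, int cpu)
        {
                struct rq *rq = cpu_rq(cpu);
                unsigned long flags;

                raw_spin_lock_irqsave(&rq->lock, flags);
                /* remove the entry belonging to 'cpu', not an arbitrary index */
                list_del_leaf_cfs_rq(tg->cfs_rq[cpu]);
                raw_spin_unlock_irqrestore(&rq->lock, flags);
        }
      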
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Paul Turner <pjt@google.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <20101130005740.080828123@google.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  6. 26 Nov 2010 (6 commits)
    • sched: Remove unused argument dest_cpu to migrate_task() · b7a2b39d
      Authored by Nikanth Karthikesan
      Remove the unused argument 'dest_cpu' of migrate_task() and pass the
      runqueue instead, as it is always known at the call site.
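      
      Illustratively, the change is just a signature swap (the exact
      prototypes shown here are an assumption about the 2010-era code):
      
        /* before: dest_cpu was accepted but never used */
        static bool migrate_task(struct task_struct *p, int dest_cpu);

        /* after: callers already have the runqueue in hand, so pass that */
        static bool migrate_task(struct task_struct *p, struct rq *rq);
      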
      Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <201011261237.09187.knikanth@suse.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • mutexes, sched: Introduce arch_mutex_cpu_relax() · 335d7afb
      Authored by Gerald Schaefer
      The spinning mutex implementation uses cpu_relax() in busy loops as a
      compiler barrier. Depending on the architecture, cpu_relax() may do more
      than needed in these specific mutex spin loops. On System z we also give
      up the time slice of the virtual cpu in cpu_relax(), which prevents
      effective spinning on the mutex.
      
      This patch replaces cpu_relax() in the spinning mutex code with
      arch_mutex_cpu_relax(), which can be defined by each architecture that
      selects HAVE_ARCH_MUTEX_CPU_RELAX. The default is still cpu_relax(), so
      this patch should not affect architectures other than System z for now.
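      
      A sketch of the default wiring described above (the call site shown is
      illustrative):
      
        /* default in the mutex code: fall back to a plain cpu_relax() */
        #ifndef CONFIG_HAVE_ARCH_MUTEX_CPU_RELAX
        #define arch_mutex_cpu_relax()  cpu_relax()
        #endif

        /* ...and the optimistic spin loops then use it instead of cpu_relax(): */
        while (lock->owner == owner)
                arch_mutex_cpu_relax();
      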
      Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1290437256.7455.4.camel@thinkpad>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • nohz: Fix printk_needs_cpu() return value on offline cpus · 61ab2544
      Authored by Heiko Carstens
      This patch fixes a hang observed with 2.6.32 kernels where timers got enqueued
      on offline cpus.
      
      printk_needs_cpu() may return 1 if called on offline cpus. When a cpu gets
      offlined it schedules the idle process which, before killing its own cpu, will
      call tick_nohz_stop_sched_tick(). That function in turn will call
      printk_needs_cpu() in order to check if the local tick can be disabled. On
      offline cpus this function should naturally return 0 since, whether or not the
      tick gets disabled, the cpu will be dead shortly after. That is besides the
      fact that __cpu_disable() should already have made sure that no interrupts on
      the offlined cpu will be delivered anyway.
      
      In this case it prevents tick_nohz_stop_sched_tick() from calling
      select_nohz_load_balancer(). No idea if that really is a problem. However what
      made me debug this is that on 2.6.32 the function get_nohz_load_balancer() is
      used within __mod_timer() to select a cpu on which a timer gets enqueued. If
      printk_needs_cpu() returns 1 then the nohz_load_balancer cpu doesn't get
      updated when a cpu gets offlined. It may contain the cpu number of an offline
      cpu. In turn timers get enqueued on an offline cpu and not very surprisingly
      they never expire and cause system hangs.
      
      This has been observed on 2.6.32 kernels. On current kernels __mod_timer() uses
      get_nohz_timer_target() which doesn't have that problem. However there might be
      other problems because of the too-early exit from tick_nohz_stop_sched_tick() in
      case a cpu goes offline.
      
      The easiest way to fix this is just to test if the current cpu is offline and call
      printk_tick() directly which clears the condition.
      
      Alternatively I tried a cpu hotplug notifier which would clear the condition,
      however between calling the notifier function and printk_needs_cpu() something
      could have called printk() again and the problem is back again. This seems to
      be the safest fix.
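      
      The resulting check is tiny; sketched from the description above, so the
      exact body may differ:
      
        int printk_needs_cpu(int cpu)
        {
                /* An offlined cpu must not keep the tick alive: flush the
                   pending-printk condition directly instead of returning 1. */
                if (cpu_is_offline(cpu))
                        printk_tick();
                return __this_cpu_read(printk_pending);
        }
      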
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: stable@kernel.org
      LKML-Reference: <20101126120235.406766476@de.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • printk: Fix wake_up_klogd() vs cpu hotplug · 49f41383
      Authored by Heiko Carstens
      wake_up_klogd() may get called from preemptible context but uses
      __raw_get_cpu_var() to write to a per cpu variable. If it gets preempted
      between getting the address and writing to it, the cpu in question could be
      offline by the time the process gets scheduled back, and the write then hits
      the per cpu data of an offline cpu.
      
      This buggy behaviour was introduced with fa33507a "printk: robustify
      printk, fix #2" which was supposed to fix a "using smp_processor_id() in
      preemptible" warning.
      
      Let's use this_cpu_write() instead which disables preemption and makes sure
      that the outlined scenario cannot happen.
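      
      Sketch of the change; the surrounding waitqueue check is an assumption
      about the rest of the function:
      
        void wake_up_klogd(void)
        {
                if (waitqueue_active(&log_wait))
                        /* was: __raw_get_cpu_var(printk_pending) = 1; */
                        this_cpu_write(printk_pending, 1);
        }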
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <20101126124247.GC7023@osiris.boeblingen.de.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf: Fix the software context switch counter · ee6dcfa4
      Authored by Peter Zijlstra
      Stephane noticed that because the perf_sw_event() call is inside the
      perf_event_task_sched_out() call it won't get called unless we
      have a per-task counter.
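      
      Sketched from the description (the prototypes are assumptions for the
      2010-era API): the software event is now counted unconditionally before
      the per-task switch-out path runs.
      
        static inline void perf_event_task_sched_out(struct task_struct *task,
                                                     struct task_struct *next)
        {
                /* count the context switch even when no per-task events exist */
                perf_sw_event(PERF_COUNT_SW_CONTEXT_SWITCHES, 1, 1, NULL, 0);

                __perf_event_task_sched_out(task, next);
        }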
      Reported-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf: Fix inherit vs. context rotation bug · dddd3379
      Authored by Thomas Gleixner
      It was found that sometimes children of tasks with inherited events had
      one extra event. Eventually it turned out to be due to the list rotation
      not being exclusive with the list iteration in the inheritance code.
      
      Cure this by temporarily disabling the rotation while we inherit the events.
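      
      A sketch of that mechanism (field and function names are assumptions):
      the context gains a flag that the rotation path checks, and the
      inheritance code sets and clears it, under the context lock, around its
      walk of the event list.
      
        struct perf_event_context {
                /* ... */
                int     rotate_disable;  /* non-zero while events are inherited */
        };

        static void rotate_ctx(struct perf_event_context *ctx)
        {
                if (ctx->rotate_disable)
                        return;          /* inheritance is iterating the list */
                /* ... rotate the event list as before ... */
        }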
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Cc: <stable@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  7. 23 Nov 2010 (5 commits)
  8. 20 Nov 2010 (1 commit)
    • Revert "kernel: make /proc/kallsyms mode 400 to reduce ease of attacking" · 33e0d57f
      Authored by Linus Torvalds
      This reverts commit 59365d13.
      
      It turns out that this can break certain existing user land setups.
      Quoth Sarah Sharp:
      
       "On Wednesday, I updated my branch to commit 460781b5 from linus' tree,
        and my box would not boot.  klogd segfaulted, which stalled the whole
        system.
      
        At first I thought it actually hung the box, but it continued booting
        after 5 minutes, and I was able to log in.  It dropped back to the
        text console instead of the graphical bootup display for that period
        of time.  dmesg surprisingly still works.  I've bisected the problem
        down to this commit (commit 59365d13)
      
        The box is running klogd 1.5.5ubuntu3 (from Jaunty).  Yes, I know
        that's old.  I read the bit in the commit about changing the
        permissions of kallsyms after boot, but if I can't boot that doesn't
        help."
      
      So let's just keep the old default, and encourage distributions to do
      the "chmod -r /proc/kallsyms" in their bootup scripts.  This is not
      worth a kernel option to change default behavior, since it's so easily
      done in user space.
      Reported-and-bisected-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Cc: Marcus Meissner <meissner@suse.de>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Eugene Teo <eugeneteo@kernel.org>
      Cc: Jesper Juhl <jj@chaosbits.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  9. 18 Nov 2010 (19 commits)
  10. 17 Nov 2010 (1 commit)
    • kernel: make /proc/kallsyms mode 400 to reduce ease of attacking · 59365d13
      Authored by Marcus Meissner
      Making /proc/kallsyms readable only for root by default makes it
      slightly harder for attackers to write generic kernel exploits by
      removing one source of knowledge where things are in the kernel.
      
      This is the second submission; discussion on the first submission mostly
      concerned that this is just one hole in the sieve ...  but one of the
      bigger ones.
      
      Changing the permissions of at least System.map and vmlinux is also
      required to close the same set of leaks, but that is a packaging issue.
      
      The target of this starter patch and its follow-ups is removing any kind
      of kernel-space address information leak from the kernel.
      
      [ Side note: the default of root-only reading is the "safe" value, and
        it's easy enough to then override at any time after boot.  The /proc
        filesystem allows root to change the permissions with a regular
        chmod, so you can "revert" this at run-time by simply doing
      
          chmod og+r /proc/kallsyms
      
        as root if you really want regular users to see the kernel symbols.
        It does help some tools like "perf" figure them out without any
        setup, so it may well make sense in some situations.  - Linus ]
      Signed-off-by: Marcus Meissner <meissner@suse.de>
      Acked-by: Tejun Heo <tj@kernel.org>
      Acked-by: Eugene Teo <eugeneteo@kernel.org>
      Reviewed-by: Jesper Juhl <jj@chaosbits.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>