1. 15 March 2008, 1 commit
    • sched: fix race in schedule() · 0e1f3483
      Authored by Hiroshi Shimamoto
      Fix a hard-to-trigger crash seen in the -rt kernel that also affects
      the vanilla scheduler.
      
      There is a race condition between schedule() and some dequeue/enqueue
      functions: rt_mutex_setprio(), __setscheduler() and sched_move_task().
      
      When scheduling to idle, idle_balance() is called to pull tasks from
      other busy processors. It might drop the rq lock, which means that those
      three functions can encounter on_rq=0 and running=1. The current task
      should be put while it is still running.
      
      Here is a possible scenario:
      
         CPU0                               CPU1
          |                              schedule()
          |                              ->deactivate_task()
          |                              ->idle_balance()
          |                              -->load_balance_newidle()
      rt_mutex_setprio()                     |
          |                              --->double_lock_balance()
          *get lock                          *rel lock
          * on_rq=0, running=1               |
          * sched_class is changed           |
          *rel lock                          *get lock
          :                                  |
                                             :
                                         ->put_prev_task_rt()
                                         ->pick_next_task_fair()
                                             => panic
      
      The current process on CPU1 (P1) is scheduling. P1 is deactivated, and
      the scheduler looks for another process on other CPUs' runqueues because
      CPU1 will be idle. idle_balance(), load_balance_newidle() and
      double_lock_balance() are called, and double_lock_balance() can drop
      the rq lock. Meanwhile, CPU0 is trying to boost the priority of P1.
      As a result of the boost, only P1's prio and sched_class are changed to
      RT. The sched entities of P1 and P1's group are never put. This leaves
      the cfs_rq invalid, because the cfs_rq has curr but no leaf, yet
      pick_next_task_fair() is called and the kernel panics.
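      
      A minimal sketch of the put/restore pattern those three functions need,
      assuming the dequeue_task()/put_prev_task()/set_curr_task()/enqueue_task()
      helpers from kernel/sched.c of that era (an illustration, not the literal
      patch):
      
      	/* with the rq lock held, inside rt_mutex_setprio(),
      	 * __setscheduler() or sched_move_task() */
      	on_rq = p->se.on_rq;
      	running = task_current(rq, p);
      	if (on_rq)
      		dequeue_task(rq, p, 0);
      	if (running)
      		p->sched_class->put_prev_task(rq, p);	/* put the running task too */
      
      	/* ... change the task's prio / sched_class here ... */
      
      	if (running)
      		p->sched_class->set_curr_task(rq);
      	if (on_rq)
      		enqueue_task(rq, p, 0);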
      Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  2. 13 March 2008, 1 commit
  3. 12 March 2008, 1 commit
    • Hibernation: Fix mark_nosave_pages() · a82f7119
      Authored by Rafael J. Wysocki
      There is a problem in the hibernation code that triggers on some NUMA
      systems on which pfn_valid() returns 'true' for some PFNs that don't
      belong to any zone.  Namely, there is a BUG_ON() in
      memory_bm_find_bit() that triggers for PFNs not belonging to any
      zone and passing the pfn_valid() test.  On the affected systems it
      triggers when we mark PFNs reported by the platform as not saveable,
      because the PFNs in question belong to a region mapped directly using
      ioremap() (i.e. the ACPI data area) and they pass the pfn_valid()
      test.
      
      Modify memory_bm_find_bit() so that it returns an error if the given
      PFN doesn't belong to any zone, instead of crashing the kernel, and
      ignore the result it returns in mark_nosave_pages() while marking the
      "nosave" memory regions.
      
      This doesn't affect the hibernation functionality, as we won't touch
      the PFNs in question anyway.
      
      http://bugzilla.kernel.org/show_bug.cgi?id=9966
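      
      A rough sketch of the behaviour described above; the signatures and the
      zone-lookup helper are assumed for illustration, this is not the literal
      patch:
      
      	/* in memory_bm_find_bit(): report failure instead of BUG()
      	 * when the PFN falls outside every zone */
      	zone_bm = find_zone_bitmap(bm, pfn);	/* hypothetical lookup helper */
      	if (!zone_bm)
      		return -EFAULT;
      
      	/* in mark_nosave_pages(): ignore such PFNs instead of crashing */
      	if (memory_bm_find_bit(bm, pfn, &addr, &bit))
      		continue;
      	set_bit(bit, addr);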
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      Signed-off-by: Len Brown <len.brown@intel.com>
  4. 11 March 2008, 5 commits
    • keep rd->online and cpu_online_map in sync · 08f503b0
      Authored by Gregory Haskins
      It is possible for the root-domain cache of online cpus to become
      out of sync with the global cpu_online_map.  This is because we
      currently trigger removal of cpus too early in the notifier chain.
      Other DOWN_PREPARE handlers may in fact run and reconfigure the
      root-domain topology, thereby stomping on our own offline handling.
      
      The end result is that rd->online may become out of sync with
      cpu_online_map, which results in potential task misrouting.
      
      So change the offline handling to be more tightly coupled with the
      global offline process by triggering on CPU_DYING instead of
      CPU_DOWN_PREPARE.
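      
      A rough sketch of the idea; the notifier shape is assumed for
      illustration, while rd->online and cpu_online_map are the structures
      named above:
      
      	/* hotplug notifier: take the cpu out of the root-domain's online
      	 * mask at CPU_DYING, in lockstep with the cpu_online_map update,
      	 * rather than at CPU_DOWN_PREPARE time */
      	case CPU_DYING:
      	case CPU_DYING_FROZEN:
      		spin_lock_irqsave(&rq->lock, flags);
      		if (rq->rd) {
      			BUG_ON(!cpu_isset(cpu, rq->rd->span));
      			cpu_clear(cpu, rq->rd->online);
      		}
      		spin_unlock_irqrestore(&rq->lock, flags);
      		break;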
      Signed-off-by: Gregory Haskins <ghaskins@novell.com>
      Cc: Gautham R Shenoy <ego@in.ibm.com>
      Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • Revert "cpu hotplug: adjust root-domain->online span in response to hotplug event" · 1f94ef59
      Authored by Gregory Haskins
      This reverts commit 393d94d9.
      
      Let's fix this right.
      Signed-off-by: Gregory Haskins <ghaskins@novell.com>
      Cc: Gautham R Shenoy <ego@in.ibm.com>
      Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • rcu: move PREEMPT_RCU config option back under PREEMPT · 21bbb39c
      Authored by Paul E. McKenney
      The original preemptible-RCU patch put the choice between classic and
      preemptible RCU into kernel/Kconfig.preempt, which resulted in build failures
      on machines not supporting CONFIG_PREEMPT.  This choice was therefore moved to
      init/Kconfig, which worked, but placed the choice between classic and
      preemptible RCU at the top level, a very obtuse choice indeed.
      
      This patch changes from the Kconfig "choice" mechanism to a pair of booleans,
      only one of which (CONFIG_PREEMPT_RCU) is user-visible, and is located in
      kernel/Kconfig.preempt, where one would expect it to be.  The other
      (CONFIG_CLASSIC_RCU) is in init/Kconfig so that it is available to all
      architectures, hopefully avoiding build breakage.  Thanks to Roman Zippel for
      suggesting this approach.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: Dipankar Sarma <dipankar@in.ibm.com>
      Cc: Josh Triplett <josh@freedesktop.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • modules: warn about suspicious return values from module's ->init() hook · e24e2e64
      Authored by Alexey Dobriyan
      The return value convention for a module's init function is 0/-E.
      Sometimes, e.g. during forward-porting, mistakes happen and a buggy
      module is created in which the result of a comparison such as
      "workqueue != NULL" is propagated all the way up to sys_init_module.
      What happens is that some other module already created the workqueue in
      question, our module creates it again, and the module is loaded
      successfully.
      
      Or it could be some other bug.
      
      Let's make such mistakes much more visible.  In retrospect, such
      messages would have noticeably shortened some of my head-scratching
      sessions.
      
      Note that dump_stack() is just a way to get the user's attention.
      Sample message:
      
      sys_init_module: 'foo'->init suspiciously returned 1, it should follow 0/-E convention
      sys_init_module: loading module anyway...
      Pid: 4223, comm: modprobe Not tainted 2.6.24-25f66630 #5
      
      Call Trace:
       [<ffffffff80254b05>] sys_init_module+0xe5/0x1d0
       [<ffffffff8020b39b>] system_call_after_swapgs+0x7b/0x80
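      
      For illustration, a hypothetical module init of the kind that triggers
      the message above (the 'foo' module and its workqueue are made up):
      
      	#include <linux/module.h>
      	#include <linux/workqueue.h>
      
      	static struct workqueue_struct *foo_wq;
      
      	static int __init foo_init(void)
      	{
      		foo_wq = create_singlethread_workqueue("foo");
      		/* BUG: returns 1 on success instead of following the
      		 * 0/-E convention; correct would be
      		 * "return foo_wq ? 0 : -ENOMEM;" */
      		return foo_wq != NULL;
      	}
      	module_init(foo_init);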
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • modules: fix module waiting for dependent modules' init · 6c5db22d
      Authored by Rusty Russell
      Commit c9a3ba55 (module: wait for dependent modules doing init.) didn't quite
      work because the waiter holds the module lock, meaning that the state of the
      module it's waiting for cannot change.
      
      Fortunately, it's fairly simple to update the state outside the lock and do
      the wakeup.
      
      Thanks to Jan Glauber for tracking this down and testing (qdio and qeth).
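      
      A minimal sketch of the approach, assuming the module_wq wait queue and
      MODULE_STATE_LIVE from kernel/module.c of that era (not the literal
      patch):
      
      	/* after ->init() succeeds in sys_init_module(): update the state
      	 * and wake dependent waiters *before* taking module_mutex, so a
      	 * waiter that holds the mutex still sees the change */
      	mod->state = MODULE_STATE_LIVE;
      	wake_up(&module_wq);
      
      	mutex_lock(&module_mutex);
      	/* drop the initial reference, free init sections, ... */
      	mutex_unlock(&module_mutex);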
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jan Glauber <jang@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  5. 10 March 2008, 1 commit
  6. 09 March 2008, 4 commits
  7. 07 March 2008, 6 commits
  8. 06 March 2008, 1 commit
  9. 05 March 2008, 7 commits
  10. 04 March 2008, 4 commits
    • freezer vs stopped or traced · 13b1c3d4
      Authored by Roland McGrath
      This changes the "freezer" code used by suspend/hibernate in its treatment
      of tasks in TASK_STOPPED (job control stop) and TASK_TRACED (ptrace) states.
      
      As I understand it, the intent of the "freezer" is to hold all tasks
      from doing anything significant.  For this purpose, TASK_STOPPED and
      TASK_TRACED are "frozen enough".  It's possible the tasks might resume
      from ptrace calls (if the tracer were unfrozen) or from signals
      (including ones that could come via timer interrupts, etc).  But this
      doesn't matter as long as they quickly block again while "freezing" is
      in effect.  Some minor adjustments to the signal.c code make sure that
      try_to_freeze() very shortly follows all wakeups from both kinds of
      stop.  This lets the freezer code safely leave stopped tasks unmolested.
      
      Changing this fixes the longstanding bug where, after resuming from
      suspend/hibernate, your shell reports "[1] Stopped" and the like for all
      the jobs you stopped with ^Z et al, as if you had freshly fg'd and ^Z'd them.
      It also removes from the freezer the arcane special case treatment for
      ptrace'd tasks, which relied on intimate knowledge of ptrace internals.
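      
      A hypothetical helper expressing the rule described above (the name and
      exact form are made up; the real change is spread across the freezer and
      signal code):
      
      	/* a task in job-control stop or ptrace stop counts as frozen
      	 * enough, as long as it is not itself being asked to enter the
      	 * refrigerator */
      	static inline int frozen_enough(struct task_struct *p)
      	{
      		return frozen(p) ||
      		       (task_is_stopped_or_traced(p) && !freezing(p));
      	}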
      Signed-off-by: Roland McGrath <roland@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • exit_notify: fix kill_orphaned_pgrp() usage with mt exit · 821c7de7
      Authored by Oleg Nesterov
      1. exit_notify() always calls kill_orphaned_pgrp(). This is wrong; we
         should do this only when the whole process exits.
      
      2. exit_notify() uses "current" as "ignored_task", which is obviously
         wrong.  Use ->group_leader instead.
      
      Test case:
      
      	void hup(int sig)
      	{
      		printf("HUP received\n");
      	}
      
      	void *tfunc(void *arg)
      	{
      		sleep(2);
      		printf("sub-thread exited\n");
      		return NULL;
      	}
      
      	int main(int argc, char *argv[])
      	{
      		if (!fork()) {
      			signal(SIGHUP, hup);
      			kill(getpid(), SIGSTOP);
      			exit(0);
      		}
      
      		pthread_t thr;
      		pthread_create(&thr, NULL, tfunc, NULL);
      
      		sleep(1);
      		printf("main thread exited\n");
      		syscall(__NR_exit, 0);
      
      		return 0;
      	}
      
      output:
      
      	main thread exited
      	HUP received
      	Hangup
      
      With this patch the output is:
      
      	main thread exited
      	sub-thread exited
      	HUP received
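      
      A minimal sketch of the intended call site, assuming exit_notify()
      computes a group_dead flag for the exiting thread (illustrative only):
      
      	/* only when the whole process exits, and ignoring the group
      	 * leader rather than the exiting thread itself */
      	if (group_dead)
      		kill_orphaned_pgrp(tsk->group_leader, NULL);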
      Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • will_become_orphaned_pgrp: partially fix insufficient ->exit_state check · 05e83df6
      Authored by Oleg Nesterov
      p->exit_state != 0 doesn't mean the process is dead; it may still have
      sub-threads.  Change the code to use
      "p->exit_state && thread_group_empty(p)" instead.
      
      Without this patch, ^Z doesn't deliver SIGTSTP to the foreground process
      if the main thread has exited.
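      
      The new check, written out as a hypothetical helper (the name is made
      up; the real test is inline in the pgrp walk):
      
      	/* a task stops counting for the orphaned-pgrp decision only once
      	 * it is dead *and* has no remaining live sub-threads */
      	static int task_fully_dead(struct task_struct *p)
      	{
      		return p->exit_state && thread_group_empty(p);
      	}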
      
      However, the new check is not perfect either.  There is a window after
      exit_notify() drops the tasklist lock and before release_task() runs.
      Suppose that the last (non-leader) thread exits.  The entire group has
      then exited, but thread_group_empty() is not yet true.
      
      As Eric pointed out, is_global_init() is wrong as well, but I did not
      dare to do other changes.
      
      Just for the record, has_stopped_jobs() is absolutely wrong too.  But we
      can't fix it now, we should first fix SIGNAL_STOP_STOPPED issues.
      
      Even with this patch, ^Z doesn't play well with a dead main thread.
      The task is stopped correctly, but do_wait(WSTOPPED) won't see it.  This
      is another, unrelated issue that will (hopefully) be fixed separately.
      Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • introduce kill_orphaned_pgrp() helper · f49ee505
      Authored by Oleg Nesterov
      Factor out the common code in reparent_thread() and exit_notify().
      
      No functional changes.
      Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  11. 01 March 2008, 7 commits
  12. 26 February 2008, 2 commits