提交 · 00a5825332769706eb2e276c101dd20269a53992 · openanolis / cloud-kernel

07 12月, 2007 1 次提交

Fix oprofile configuration breakage · 00a58253

由 Ralf Baechle 提交于 12月 06, 2007

The cleanup 09cadedb broke the oprofile
configuration for MIPS by allowing oprofile support to be built for
kernel models where oprofile doesn't have a chance in hell to work.

Just a dependecy list on a number of architectures is - surprise - broken
and should as per past discussions probably in most considered to be
broken in most cases.  So I introduce a dependency for the oprofile
configuration on ARCH_SUPPORTS_OPROFILE.
Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

00a58253

06 12月, 2007 2 次提交

Avoid potential NULL dereference in unregister_sysctl_table · f1dad166

由 Pavel Emelyanov 提交于 12月 04, 2007

register_sysctl_table() can return NULL sometimes, e.g.  when kmalloc()
returns NULL or when sysctl check fails.

I've also noticed, that many (most?) code in the kernel doesn't check for
the return value from register_sysctl_table() and later simply calls the
unregister_sysctl_table() with potentially NULL argument.

This is unlikely on a common kernel configuration, but in case we're
dealing with modules and/or fault-injection support, there's a slight
possibility of an OOPS.

Changing all the users to check for return code from the registering does
not look like a good solution - there are too many code doing this and
failure in sysctl tables registration is not a good reason to abort module
loading (in most of the cases).

So I think, that we can just have this check in unregister_sysctl_table
just to avoid accidental OOPS-es (actually, the unregister_sysctl_table()
did exactly this, before the start_unregistering() appeared).
Signed-off-by: NPavel Emelyanov <xemul@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

f1dad166

fix clone(CLONE_NEWPID) · 5cd17569

由 Eric W. Biederman 提交于 12月 04, 2007

Currently we are complicating the code in copy_process, the clone ABI, and
if we fix the bugs sys_setsid itself, with an unnecessary open coded
version of sys_setsid.

So just simplify everything and don't special case the session and pgrp of
the initial process in a pid namespace.

Having this special case actually presents to user space the classic linux
startup conditions with session == pgrp == 0 for /sbin/init.

We already handle sending signals to processes in a child pid namespace.

We need to handle sending signals to processes in a parent pid namespace
for cases like SIGCHILD and SIGIO.

This makes nothing extra visible inside a pid namespace.  So this extra
special case appears to have no redeeming merits.

Further removing this special case increases the flexibility of how we can
use pid namespaces, by not requiring the initial process in a pid namespace
to be a daemon.
Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

5cd17569

05 12月, 2007 8 次提交

futex: correctly return -EFAULT not -EINVAL · cde898fa

由 Thomas Gleixner 提交于 12月 05, 2007

return -EFAULT not -EINVAL. Found by review.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

cde898fa

lockdep: in_range() fix · 54561783

由 Oleg Nesterov 提交于 12月 05, 2007

Torsten Kaiser wrote:

| static inline int in_range(const void *start, const void *addr, const void *end)
| {
|         return addr >= start && addr <= end;
| }
| This  will return true, if addr is in the range of start (including)
| to end (including).
|
| But debug_check_no_locks_freed() seems does:
| const void *mem_to = mem_from + mem_len
| -> mem_to is the last byte of the freed range, that fits in_range
| lock_from = (void *)hlock->instance;
| -> first byte of the lock
| lock_to = (void *)(hlock->instance + 1);
| -> first byte of the next lock, not last byte of the lock that is being checked!
|
| The test is:
| if (!in_range(mem_from, lock_from, mem_to) &&
|                                         !in_range(mem_from, lock_to, mem_to))
|                         continue;
| So it tests, if the first byte of the lock is in the range that is freed ->OK
| And if the first byte of the *next* lock is in the range that is freed
| -> Not OK.

We can also simplify in_range checks, we need only 2 comparisons, not 4.
If the lock is not in memory range, it should be either at the left of range
or at the right.
Signed-off-by: NOleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>

54561783

lockdep: fix debug_show_all_locks() · 85684873

由 Ingo Molnar 提交于 12月 05, 2007

fix the oops that can be seen in:

   http://bugzilla.kernel.org/attachment.cgi?id=13828&action=view

it is not safe to print the locks of running tasks.

(even with this fix we have a small race - but this is a debug
 function after all.)
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>

85684873

sched: style cleanups · 41a2d6cf

由 Ingo Molnar 提交于 12月 05, 2007

style cleanup of various changes that were done recently.

no code changed:

      text    data     bss     dec     hex filename
     23680    2542      28   26250    668a sched.o.before
     23680    2542      28   26250    668a sched.o.after
Signed-off-by: NIngo Molnar <mingo@elte.hu>

41a2d6cf

futex: fix for futex_wait signal stack corruption · ce6bd420

由 Steven Rostedt 提交于 12月 05, 2007

David Holmes found a bug in the -rt tree with respect to
pthread_cond_timedwait. After trying his test program on the latest git
from mainline, I found the bug was there too.  The bug he was seeing
that his test program showed, was that if one were to do a "Ctrl-Z" on a
process that was in the pthread_cond_timedwait, and then did a "bg" on
that process, it would return with a "-ETIMEDOUT" but early. That is,
the timer would go off early.

Looking into this, I found the source of the problem. And it is a rather
nasty bug at that.

Here's the relevant code from kernel/futex.c: (not in order in the file)

[...]
smlinkage long sys_futex(u32 __user *uaddr, int op, u32 val,
                          struct timespec __user *utime, u32 __user *uaddr2,
                          u32 val3)
{
        struct timespec ts;
        ktime_t t, *tp = NULL;
        u32 val2 = 0;
        int cmd = op & FUTEX_CMD_MASK;

        if (utime && (cmd == FUTEX_WAIT || cmd == FUTEX_LOCK_PI)) {
                if (copy_from_user(&ts, utime, sizeof(ts)) != 0)
                        return -EFAULT;
                if (!timespec_valid(&ts))
                        return -EINVAL;

                t = timespec_to_ktime(ts);
                if (cmd == FUTEX_WAIT)
                        t = ktime_add(ktime_get(), t);
                tp = &t;
        }
[...]
        return do_futex(uaddr, op, val, tp, uaddr2, val2, val3);
}

[...]

long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout,
                u32 __user *uaddr2, u32 val2, u32 val3)
{
        int ret;
        int cmd = op & FUTEX_CMD_MASK;
        struct rw_semaphore *fshared = NULL;

        if (!(op & FUTEX_PRIVATE_FLAG))
                fshared = &current->mm->mmap_sem;

        switch (cmd) {
        case FUTEX_WAIT:
                ret = futex_wait(uaddr, fshared, val, timeout);

[...]

static int futex_wait(u32 __user *uaddr, struct rw_semaphore *fshared,
                      u32 val, ktime_t *abs_time)
{
[...]
               struct restart_block *restart;
                restart = &current_thread_info()->restart_block;
                restart->fn = futex_wait_restart;
                restart->arg0 = (unsigned long)uaddr;
                restart->arg1 = (unsigned long)val;
                restart->arg2 = (unsigned long)abs_time;
                restart->arg3 = 0;
                if (fshared)
                        restart->arg3 |= ARG3_SHARED;
                return -ERESTART_RESTARTBLOCK;
[...]

static long futex_wait_restart(struct restart_block *restart)
{
        u32 __user *uaddr = (u32 __user *)restart->arg0;
        u32 val = (u32)restart->arg1;
        ktime_t *abs_time = (ktime_t *)restart->arg2;
        struct rw_semaphore *fshared = NULL;

        restart->fn = do_no_restart_syscall;
        if (restart->arg3 & ARG3_SHARED)
                fshared = &current->mm->mmap_sem;
        return (long)futex_wait(uaddr, fshared, val, abs_time);
}

So when the futex_wait is interrupt by a signal we break out of the
hrtimer code and set up or return from signal. This code does not return
back to userspace, so we set up a RESTARTBLOCK.  The bug here is that we
save the "abs_time" which is a pointer to the stack variable "ktime_t t"
from sys_futex.

This returns and unwinds the stack before we get to call our signal. On
return from the signal we go to futex_wait_restart, where we update all
the parameters for futex_wait and call it. But here we have a problem
where abs_time is no longer valid.

I verified this with print statements, and sure enough, what abs_time
was set to ends up being garbage when we get to futex_wait_restart.

The solution I did to solve this (with input from Linus Torvalds)
was to add unions to the restart_block to allow system calls to
use the restart with specific parameters.  This way the futex code now
saves the time in a 64bit value in the restart block instead of storing
it on the stack.

Note: I'm a bit nervious to add "linux/types.h" and use u32 and u64
in thread_info.h, when there's a #ifdef __KERNEL__ just below that.
Not sure what that is there for.  If this turns out to be a problem, I've
tested this with using "unsigned int" for u32 and "unsigned long long" for
u64 and it worked just the same. I'm using u32 and u64 just to be
consistent with what the futex code uses.
Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>

ce6bd420

D
[SYSCTL_CHECK]: Fix typo in KERN_SPARC_SCONS_PWROFF entry string. · 874a5f87
由 David S. Miller 提交于 11月 19, 2007
```
Based upon a report by Mikael Pettersson.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
874a5f87

sched: default to more agressive yield for SCHED_BATCH tasks · db292ca3

由 Ingo Molnar 提交于 12月 04, 2007

do more agressive yield for SCHED_BATCH tuned tasks: they are all
about throughput anyway. This allows a gentler migration path for
any apps that relied on stronger yield.
Signed-off-by: NIngo Molnar <mingo@elte.hu>

db292ca3

sched: fix crash in sys_sched_rr_get_interval() · 77034937

由 Ingo Molnar 提交于 12月 04, 2007

Luiz Fernando N. Capitulino reported that sched_rr_get_interval()
crashes for SCHED_OTHER tasks that are on an idle runqueue.

The fix is to return a 0 timeslice for tasks that are on an idle
runqueue. (and which are not running, obviously)

this also shrinks the code a bit:

   text    data     bss     dec     hex filename
  47903    3934     336   52173    cbcd sched.o.before
  47885    3934     336   52155    cbbb sched.o.after
Reported-by: NLuiz Fernando N. Capitulino <lcapitulino@mandriva.com.br>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

77034937

04 12月, 2007 1 次提交

uml: add !UML dependencies · b00296fb

由 Al Viro 提交于 12月 01, 2007

The previous commit ("uml: keep UML Kconfig in sync with x86") is not
enough, unfortunately.  If we go that way, we need to add dependencies
on !UML for several options.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NJeff Dike <jdike@linux.intel.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b00296fb

03 12月, 2007 1 次提交

sched: cpu accounting controller (V2) · d842de87

由 Srivatsa Vaddagiri 提交于 12月 02, 2007

Commit cfb52856 removed a useful feature for
us, which provided a cpu accounting resource controller.  This feature would be
useful if someone wants to group tasks only for accounting purpose and doesnt
really want to exercise any control over their cpu consumption.

The patch below reintroduces the feature. It is based on Paul Menage's
original patch (Commit 62d0df64), with
these differences:

        - Removed load average information. I felt it needs more thought (esp
	  to deal with SMP and virtualized platforms) and can be added for
	  2.6.25 after more discussions.
        - Convert group cpu usage to be nanosecond accurate (as rest of the cfs
	  stats are) and invoke cpuacct_charge() from the respective scheduler
	  classes
	- Make accounting scalable on SMP systems by splitting the usage
	  counter to be per-cpu
	- Move the code from kernel/cpu_acct.c to kernel/sched.c (since the
	  code is not big enough to warrant a new file and also this rightly
	  needs to live inside the scheduler. Also things like accessing
	  rq->lock while reading cpu usage becomes easier if the code lived in
	  kernel/sched.c)

The patch also modifies the cpu controller not to provide the same accounting
information.
Tested-by: NBalbir Singh <balbir@linux.vnet.ibm.com>

 Tested the patches on top of 2.6.24-rc3. The patches work fine. Ran
 some simple tests like cpuspin (spin on the cpu), ran several tasks in
 the same group and timed them. Compared their time stamps with
 cpuacct.usage.
Signed-off-by: NSrivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Signed-off-by: NBalbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

d842de87

30 11月, 2007 4 次提交

wait_task_stopped(): pass correct exit_code to wait_noreap_copyout() · e6ceb32a

由 Scott James Remnant 提交于 11月 28, 2007

In wait_task_stopped() exit_code already contains the right value for the
si_status member of siginfo, and this is simply set in the non WNOWAIT
case.

If you call waitid() with a stopped or traced process, you'll get the signal
in siginfo.si_status as expected -- however if you call waitid(WNOWAIT) at the
same time, you'll get the signal << 8 | 0x7f

Pass it unchanged to wait_noreap_copyout(); we would only need to shift it
and add 0x7f if we were returning it in the user status field and that
isn't used for any function that permits WNOWAIT.
Signed-off-by: NScott James Remnant <scott@ubuntu.com>
Signed-off-by: NOleg Nesterov <oleg@tv-sign.ru>
Cc: Roland McGrath <roland@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e6ceb32a

FRV: fix the extern declaration of kallsyms_num_syms · 9e6c1e63

由 David Howells 提交于 11月 28, 2007

Fix the extern declaration of kallsyms_num_syms to indicate that the symbol
does not reside in the small-data storage space, and so may not be accessed
relative to the small data base register.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

9e6c1e63

Isolate the UTS namespace's domainname and hostname back · 32df81cb

由 Pavel Emelyanov 提交于 11月 28, 2007

Commit 7d69a1f4 ("remove CONFIG_UTS_NS
and CONFIG_IPC_NS") by Cedric Le Goater accidentally removed the code
that prevented the uts->hostname and uts->domainname values from being
overwritten from another namespace.

In other words, setting hostname/domainname via sysfs (echo xxx >
/proc/sys/kernel/(host|domain)name) cased the new value to be set in
init UTS namespace only.

Return the isolation back.
Signed-off-by: NPavel Emelyanov <xemul@openvz.org>
Acked-by: NCedric Le Goater <clg@fr.ibm.com>
Acked-by: NSerge Hallyn <serue@us.ibm.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

32df81cb

wait_task_stopped(): don't use task_pid_nr_ns() lockless · c8950783

由 Oleg Nesterov 提交于 11月 28, 2007

wait_task_stopped(WNOWAIT) does task_pid_nr_ns() without tasklist/rcu lock,
we can read an already freed memory.  Use the cached pid_t value.
Signed-off-by: NOleg Nesterov <oleg@tv-sign.ru>
Looks-good-to: Roland McGrath <roland@redhat.com>
Acked-by: NPavel Emelyanov <xemul@openvz.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

c8950783

28 11月, 2007 5 次提交

I
sched: clean up kernel/sched_stat.h · f95e0d1c
由 Ingo Molnar 提交于 11月 28, 2007
```
clean up kernel/sched_stat.h.
Signed-off-by: NIngo Molnar <mingo@elte.hu>
```
f95e0d1c
I
sched: clean up overlong line in kernel/sched_debug.c · c1a89740
由 Ingo Molnar 提交于 11月 28, 2007
```
clean up overlong line in kernel/sched_debug.c.
Signed-off-by: NIngo Molnar <mingo@elte.hu>
```
c1a89740

sched: clean up, move __sched_text_start/end to sched.h · deaf2227

由 Ingo Molnar 提交于 11月 28, 2007

move __sched_text_start/end to sched.h. No code changed:

   text    data     bss     dec     hex filename
  26582    2310      28   28920    70f8 sched.o.before
  26582    2310      28   28920    70f8 sched.o.after
Signed-off-by: NIngo Molnar <mingo@elte.hu>

deaf2227

I
sched: clean up sd_alloc_ctl_cpu_table() definition · 9a4e7159
由 Ingo Molnar 提交于 11月 28, 2007
```
clean up sd_alloc_ctl_cpu_table() definition.
Signed-off-by: NIngo Molnar <mingo@elte.hu>
```
9a4e7159

softlockup: fix false positives on CONFIG_NOHZ · d3938204

由 Thomas Gleixner 提交于 11月 28, 2007

David Miller reported soft lockup false-positives that trigger
on NOHZ due to CPUs idling for more than 10 seconds.

The solution is touch the softlockup watchdog when we return from
idle. (by definition we are not 'locked up' when we were idle)

http://bugzilla.kernel.org/show_bug.cgi?id=9409Reported-by: NDavid Miller <davem@davemloft.net>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

d3938204

27 11月, 2007 5 次提交

sched: bump version of kernel/sched_debug.c · f7b9329e

由 Ingo Molnar 提交于 11月 26, 2007

bump version of kernel/sched_debug.c and remove CFS version
information from it.
Signed-off-by: NIngo Molnar <mingo@elte.hu>

f7b9329e

sched: fix minimum granularity tunings · 722aab0c

由 Zou Nan hai 提交于 11月 26, 2007

increase the default minimum granularity some more - this gives us
more performance in aim7 benchmarks.

also correct some comments: we scale with ilog(ncpus) + 1.
Signed-off-by: NZou Nan hai <nanhai.zou@intel.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

722aab0c

sched: fix kernel/acct.c comment · bcbe4a07

由 Ingo Molnar 提交于 11月 26, 2007

fix kernel/acct.c comment.

noticed by Lin Tan. Comment suggested by Olaf Kirch.

also see:

  http://bugzilla.kernel.org/show_bug.cgi?id=8220

Reported-by: tammy000@gmail.com
Signed-off-by: NIngo Molnar <mingo@elte.hu>

bcbe4a07

sched: don't forget to unlock uids_mutex on error paths · 5e8869bb

由 Pavel Emelyanov 提交于 11月 26, 2007

The commit

 commit 5cb350ba
 Author: Dhaval Giani <dhaval@linux.vnet.ibm.com>
 Date:   Mon Oct 15 17:00:14 2007 +0200

    sched: group scheduling, sysfs tunables

introduced the uids_mutex and the helpers to lock/unlock it.
Unfortunately, the error paths of alloc_uid() were not patched
to unlock it.
Signed-off-by: NPavel Emelyanov <xemul@openvz.org>
Acked-by: NDhaval Giani <dhaval@linux.vnet.ibm.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

5e8869bb

time: add ADJ_OFFSET_SS_READ · 52bfb360

由 John Stultz 提交于 11月 26, 2007

Michael Kerrisk reported that a long standing bug in the adjtimex()
system call causes glibc's adjtime(3) function to deliver the wrong
results if 'delta' is NULL.

add the ADJ_OFFSET_SS_READ API detail, which will be used by glibc
to fix this API compatibility bug.

Also see: http://bugzilla.kernel.org/show_bug.cgi?id=6761

[ mingo@elte.hu: added patch description and made it backwards compatible ]

NOTE: the new flag is defined 0xa001 so that it returns -EINVAL on
older kernels - this way glibc can use it safely. Suggested by Ulrich
Drepper.
Acked-by: NUlrich Drepper <drepper@redhat.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

52bfb360

20 11月, 2007 5 次提交

[S390] appldata: remove unused binary sysctls. · 37e3a6ac

由 Heiko Carstens 提交于 11月 20, 2007

Remove binary sysctls that never worked due to missing strategy functions.

Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Gerald Schaefer <geraldsc@de.ibm.com>
Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

37e3a6ac

[S390] cmm: remove unused binary sysctls. · 43ebbf11

由 Heiko Carstens 提交于 11月 20, 2007

Remove binary sysctls that never worked due to missing strategy functions.

Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

43ebbf11

[IPVS]: Move remaining sysctl handlers over to CTL_UNNUMBERED · 9055fa1f

由 Simon Horman 提交于 11月 19, 2007

Switch the remaining IPVS sysctl entries over to to use CTL_UNNUMBERED,
I stronly doubt that anyone is using the sys_sysctl interface to
these variables.
Signed-off-by: NSimon Horman <horms@verge.net.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9055fa1f

[IPVS]: Fix sysctl warnings about missing strategy in schedulers · 9e103fa6

由 Simon Horman 提交于 11月 19, 2007

sysctl table check failed: /net/ipv4/vs/lblc_expiration .3.5.21.19 Missing strategy
[...]
sysctl table check failed: /net/ipv4/vs/lblcr_expiration .3.5.21.20 Missing strategy

Switch these entried over to use CTL_UNNUMBERED as clearly
the sys_syscal portion wasn't working.

This is along the same lines as Christian Borntraeger's patch that fixes
up entries with no stratergy in net/ipv4/ipvs/ip_vs_ctl.c
Signed-off-by: NSimon Horman <horms@verge.net.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9e103fa6

[IPVS]: Fix sysctl warnings about missing strategy · 611cd55b

由 Christian Borntraeger 提交于 11月 19, 2007

Running the latest git code I get the following messages during boot:
sysctl table check failed: /net/ipv4/vs/drop_entry .3.5.21.4 Missing strategy
[...]		  
sysctl table check failed: /net/ipv4/vs/drop_packet .3.5.21.5 Missing strategy
[...]
sysctl table check failed: /net/ipv4/vs/secure_tcp .3.5.21.6 Missing strategy
[...]
sysctl table check failed: /net/ipv4/vs/sync_threshold .3.5.21.24 Missing strategy

I removed the binary sysctl handler for those messages and also removed
the definitions in ip_vs.h. The alternative would be to implement a 
proper strategy handler, but syscall sysctl is deprecated.

There are other sysctl definitions that are commented out or work with 
the default sysctl_data strategy. I did not touch these. 
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Acked-by: NSimon Horman <horms@verge.net.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

611cd55b

19 11月, 2007 1 次提交

module: fix and elaborate comments · 9a4b9708

由 Matti Linnanvuori 提交于 11月 08, 2007

Fix and elaborate comments.
Signed-off-by: NMatti Linnanvuori <mattilinnanvuori@yahoo.com>
Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>

9a4b9708

17 11月, 2007 2 次提交

ntp: fix typo that makes sync_cmos_clock erratic · fa6a1a55

由 David P. Reed 提交于 11月 14, 2007

Fix a typo in ntp.c that has caused updating of the persistent (RTC)
clock when synced to NTP to behave erratically.

When debugging a freeze that arises on my AMD64 machines when I
run the ntpd service, I added a number of printk's to monitor the
sync_cmos_clock procedure.  I discovered that it was not syncing to
cmos RTC every 11 minutes as documented, but instead would keep trying
every second for hours at a time.  The reason turned out to be a typo
in sync_cmos_clock, where it attempts to ensure that
update_persistent_clock is called very close to 500 msec. after a 1
second boundary (required by the PC RTC's spec). That typo referred to
"xtime" in one spot, rather than "now", which is derived from "xtime"
but not equal to it.  This makes the test erratic, creating a
"coin-flip" that decides when update_persistent_clock is called - when
it is called, which is rarely, it may be at any time during the one
second period, rather than close to 500 msec, so the value written is
needlessly incorrect, too.

Signed-off-by: David P. Reed
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

fa6a1a55

x86: ignore the sys_getcpu() tcache parameter · 4307d1e5

由 Ingo Molnar 提交于 11月 07, 2007

dont use the vgetcpu tcache - it's causing problems with tasks
migrating, they'll see the old cache up to a jiffy after the
migration, further increasing the costs of the migration.

In the worst case they see a complete bogus information from
the tcache, when a sys_getcpu() call "invalidated" the cache
info by incrementing the jiffies _and_ the cpuid info in the
cache and the following vdso_getcpu() call happens after
vdso_jiffies have been incremented.
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NUlrich Drepper <drepper@redhat.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

4307d1e5

16 11月, 2007 5 次提交

sched: reorder SCHED_FEAT_ bits · 9612633a

由 Ingo Molnar 提交于 11月 15, 2007

reorder SCHED_FEAT_ bits so that the used ones come first. Makes
tuning instructions easier.
Signed-off-by: NIngo Molnar <mingo@elte.hu>

9612633a

sched: make sched_nr_latency static · 518b22e9

由 Adrian Bunk 提交于 11月 15, 2007

sched_nr_latency can now become static.
Signed-off-by: NAdrian Bunk <bunk@kernel.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

518b22e9

sched: remove activate_idle_task() · 94bc9a7b

由 Dmitry Adamushko 提交于 11月 15, 2007

cpu_down() code is ok wrt sched_idle_next() placing the 'idle' task not
at the beginning of the queue.

So get rid of activate_idle_task() and make use of activate_task() instead.
It is the same as activate_task(), except for the update_rq_clock(rq) call
that is redundant.

Code size goes down:

   text    data     bss     dec     hex filename
  47853    3934     336   52123    cb9b sched.o.before
  47828    3934     336   52098    cb82 sched.o.after
Signed-off-by: NDmitry Adamushko <dmitry.adamushko@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

94bc9a7b

sched: fix __set_task_cpu() SMP race · ce96b5ac

由 Dmitry Adamushko 提交于 11月 15, 2007

Grant Wilson has reported rare SCHED_FAIR_USER crashes on his quad-core
system, which crashes can only be explained via runqueue corruption.

there is a narrow SMP race in __set_task_cpu(): after ->cpu is set up to
a new value, task_rq_lock(p, ...) can be successfuly executed on another
CPU. We must ensure that updates of per-task data have been completed by
this moment.

this bug has been hiding in the Linux scheduler for an eternity (we never
had any explicit barrier for task->cpu in set_task_cpu() - so the bug was
introduced in 2.5.1), but only became visible via set_task_cfs_rq() being
accidentally put after the task->cpu update. It also probably needs a
sufficiently out-of-order CPU to trigger.
Reported-by: NGrant Wilson <grant.wilson@zen.co.uk>
Signed-off-by: NDmitry Adamushko <dmitry.adamushko@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

ce96b5ac

sched: fix SCHED_FIFO tasks & FAIR_GROUP_SCHED · dae51f56

由 Oleg Nesterov 提交于 11月 15, 2007

Suppose that the SCHED_FIFO task does

	switch_uid(new_user);

Now, p->se.cfs_rq and p->se.parent both point into the old
user_struct->tg because sched_move_task() doesn't call set_task_cfs_rq()
for !fair_sched_class case.

Suppose that old user_struct/task_group is freed/reused, and the task
does

	sched_setscheduler(SCHED_NORMAL);

__setscheduler() sets fair_sched_class, but doesn't update
->se.cfs_rq/parent which point to the freed memory.

This means that check_preempt_wakeup() doing

		while (!is_same_group(se, pse)) {
			se = parent_entity(se);
			pse = parent_entity(pse);
		}

may OOPS in a similar way if rq->curr or p did something like above.

Perhaps we need something like the patch below, note that
__setscheduler() can't do set_task_cfs_rq().
Signed-off-by: NOleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

dae51f56

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功