1. 09 May, 2008 (31 commits)
  2. 08 May, 2008 (9 commits)
    • sched: fix weight calculations · 46151122
      Committed by Mike Galbraith
      The conversion between virtual and real time is as follows:
      
        dvt = rw/w * dt <=> dt = w/rw * dvt
      
      Since we want the fair sleeper granularity to be in real time, we actually
      need to do:
      
        dvt = - rw/w * l
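
      In code form, a minimal sketch of just this conversion (the helper
      names and types are illustrative, not the scheduler's own):

        #include <stdint.h>

        /* dvt = rw/w * dt: convert a real-time delta dt into virtual time
         * for an entity of weight w on a runqueue of total weight rw. */
        static inline uint64_t real_to_virtual(uint64_t dt, uint64_t rw,
                                               uint64_t w)
        {
                return dt * rw / w;
        }

        /* The fair sleeper granularity l is specified in real time, so it
         * must be converted before being applied: dvt = - rw/w * l. */
        static inline int64_t sleeper_delta_vtime(uint64_t l, uint64_t rw,
                                                  uint64_t w)
        {
                return -(int64_t)real_to_virtual(l, rw, w);
        }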
      
      This bug could be related to the regression reported by Yanmin Zhang:
      
      | Comparing with kernel 2.6.25, sysbench+mysql(oltp, readonly) has lots
      | of regressions with 2.6.26-rc1:
      |
      | 1) 8-core stoakley: 28%;
      | 2) 16-core tigerton: 20%;
      | 3) Itanium Montvale: 50%.
      Reported-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      46151122
    • semaphore: fix · bf726eab
      Committed by Ingo Molnar
      Yanmin Zhang reported:
      
      | Comparing with kernel 2.6.25, AIM7 (use tmpfs) has more than 40%
      | regression under 2.6.26-rc1 on my 8-core stoakley, 16-core tigerton,
      | and Itanium Montecito. Bisect located the patch below:
      |
      | 64ac24e7 is first bad commit
      | commit 64ac24e7
      | Author: Matthew Wilcox <matthew@wil.cx>
      | Date:   Fri Mar 7 21:55:58 2008 -0500
      |
      |     Generic semaphore implementation
      |
      | After I manually reverted the patch against 2.6.26-rc1 while fixing
      | lots of conflicts/errors, aim7 regression became less than 2%.
      
      I reproduced the AIM7 workload and can confirm Yanmin's findings that
      2.6.26-rc1 regresses over 2.6.25 - by over 67% here.
      
      Looking at the workload I found and fixed what I believe to be the real
      bug causing the AIM7 regression: it was inefficient wakeup / scheduling
      / locking behavior of the new generic semaphore code, causing suboptimal
      performance.
      
      The problem comes from the following code. The new semaphore code does
      this on down():
      
              spin_lock_irqsave(&sem->lock, flags);
              if (likely(sem->count > 0))
                      sem->count--;
              else
                      __down(sem);
              spin_unlock_irqrestore(&sem->lock, flags);
      
      and this on up():
      
              spin_lock_irqsave(&sem->lock, flags);
              if (likely(list_empty(&sem->wait_list)))
                      sem->count++;
              else
                      __up(sem);
              spin_unlock_irqrestore(&sem->lock, flags);
      
      where __up() does:
      
              list_del(&waiter->list);
              waiter->up = 1;
              wake_up_process(waiter->task);
      
      and where __down() does this in essence:
      
              list_add_tail(&waiter.list, &sem->wait_list);
              waiter.task = task;
              waiter.up = 0;
              for (;;) {
                      [...]
                      spin_unlock_irq(&sem->lock);
                      timeout = schedule_timeout(timeout);
                      spin_lock_irq(&sem->lock);
                      if (waiter.up)
                              return 0;
              }
      
      The fastpath looks good and obvious, but note the following property of
      the contended path: if there's a task on the ->wait_list, the up() of
      the current owner will "pass over" ownership to that waiting task, in a
      wake-one manner, via the waiter->up flag and by removing the waiter from
      the wait list.
      
      That is all fine in principle, but as implemented in
      kernel/semaphore.c it also creates a nasty, hidden source of contention!
      
      The contention comes from the following property of the new semaphore
      code: the new owner owns the semaphore exclusively, even if it is not
      running yet.
      
      So if the old owner, even if just a few instructions later, does a
      down() [lock_kernel()] again, it will be blocked and will have to wait
      on the new owner to eventually be scheduled (possibly on another CPU)!
      Or if another task gets to lock_kernel() sooner than the "new owner" is
      scheduled, it will be blocked unnecessarily and for a very long time
      when there are 2000 tasks running.
      
      I.e. the implementation of the new semaphore code does wake-one and
      lock ownership in a very restrictive way - it does not allow
      opportunistic re-locking of the lock at all and keeps the scheduler from
      picking task order intelligently.
      
      This kind of scheduling, with 2000 AIM7 processes running, creates awful
      cross-scheduling between those 2000 tasks, causes reduced parallelism, a
      throttled runqueue length and a lot of idle time. With an increasing
      number of CPUs the behavior in AIM7 gets exponentially worse, as a
      newly woken new-owner task becomes less and less likely to actually
      run anytime soon.
      
      Note that it takes just a tiny bit of contention for the 'new-semaphore
      catastrophe' to happen: the wakeup latencies get added to whatever small
      contention there is, and quickly snowball out of control!
      
      I believe Yanmin's findings and numbers support this analysis too.
      
      The best fix for this problem is to use the same scheduling logic that
      the kernel/mutex.c code uses: keep the wake-one behavior (that is OK and
      wanted because we do not want to over-schedule), but also allow
      opportunistic locking of the lock even if a wakee is already "in
      flight".
      
      The patch below implements this new logic. With this patch applied the
      AIM7 regression is largely fixed on my quad testbox:
      
        # v2.6.25 vanilla:
        ..................
        Tasks   Jobs/Min        JTI     Real    CPU     Jobs/sec/task
        2000    56096.4         91      207.5   789.7   0.4675
        2000    55894.4         94      208.2   792.7   0.4658
      
        # v2.6.26-rc1-166-gc0a18111 vanilla:
        ...................................
        Tasks   Jobs/Min        JTI     Real    CPU     Jobs/sec/task
        2000    33230.6         83      350.3   784.5   0.2769
        2000    31778.1         86      366.3   783.6   0.2648
      
        # v2.6.26-rc1-166-gc0a18111 + semaphore-speedup:
        ...............................................
        Tasks   Jobs/Min        JTI     Real    CPU     Jobs/sec/task
        2000    55707.1         92      209.0   795.6   0.4642
        2000    55704.4         96      209.0   796.0   0.4642
      
      i.e. a 67% speedup. We are now back to within 1% of the v2.6.25
      performance levels and have zero idle time during the test, as expected.
      
      Btw., interactivity also improved dramatically with the fix - for
      example console-switching became almost instantaneous during this
      workload (which after all is running 2000 tasks at once!); without the
      patch it was stuck for a minute at times.
      
      There's another nice side-effect of this speedup patch: the new generic
      semaphore code got even smaller:
      
         text    data     bss     dec     hex filename
         1241       0       0    1241     4d9 semaphore.o.before
         1207       0       0    1207     4b7 semaphore.o.after
      
      (because the waiter.up complication got removed.)
      
      Longer-term we should look into using the mutex code for the generic
      semaphore code as well - but it's not easy due to legacies, and it's
      outside the scope of v2.6.26 and outside the scope of this patch as
      well.
      Bisected-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      bf726eab
    • Revert "relay: fix splice problem" · 75065ff6
      Committed by Jens Axboe
      This reverts commit c3270e57.
      75065ff6
    • [ALSA] soc at91 minor bug fixes · e3a2efa6
      Committed by Patrik Sevallius
      Found these two bugs while browsing through the code.  The first one is
      a cut-n-paste bug: instead of disabling the clock when request_irq()
      fails, it enabled it once more.  The second one fixes a debug printout:
      AT91_SSC_IER is write-only, AT91_SSC_IMR is readable (the printed string
      actually says imr).
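
      As a hedged sketch (identifiers are illustrative, not the exact at91
      driver code), the error path now undoes the earlier clk_enable():

        ret = request_irq(irq, at91_ssc_interrupt, 0, "at91-ssc", ssc_p);
        if (ret < 0) {
                /* must undo the earlier clk_enable(); the bug called
                 * clk_enable() a second time here */
                clk_disable(ssc_clk);
                return ret;
        }

      while the debug printout now reads the readable AT91_SSC_IMR register
      instead of the write-only AT91_SSC_IER.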
      
      Frank Mandarino was busy so he asked me to send these to this list.
      
      /Patrik
      Signed-off-by: Patrik Sevallius <patrik.sevallius@enea.com>
      Acked-by: Frank Mandarino <fmandarino@endrelia.com>
      Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      e3a2efa6
    • [ALSA] soc - at91-pcm - Fix line wrapping · 30a717f7
      Committed by Mark Brown
      There's more checkpatch stuff to fix in the driver; this just fixes the
      minimum required for the following patch to be clean.
      Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      30a717f7
    • net: Added ASSERT_RTNL() to dev_open() and dev_close(). · e46b66bc
      Committed by Ben Hutchings
      dev_open() and dev_close() must be called holding the RTNL, since they
      call device functions and netdevice notifiers that are promised the RTNL.
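
      A minimal sketch of the expected calling convention (dev being some
      already-registered struct net_device):

        int err;

        rtnl_lock();
        err = dev_open(dev);    /* ASSERT_RTNL() inside is satisfied */
        rtnl_unlock();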
      Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e46b66bc
    • can: Fix can_send() handling on dev_queue_xmit() failures · c2ab7ac2
      Committed by Oliver Hartkopp
      The tx packet counting and the local loopback of CAN frames should
      only happen when the CAN frame has been enqueued to the netdevice
      tx queue successfully.
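
      A hedged sketch of the intended ordering (simplified, not the exact
      af_can.c code):

        err = dev_queue_xmit(skb);
        if (err > 0)
                err = net_xmit_errno(err);
        if (err)
                return err;     /* no counting, no loopback on failure */

        /* only now: tx packet counting and local loopback */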
      
      Thanks to Andre Naujoks <nautsch@gmail.com> for reporting this issue.
      Signed-off-by: Oliver Hartkopp <oliver@hartkopp.net>
      Signed-off-by: Urs Thuermann <urs@isnogud.escape.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c2ab7ac2
    • netns: Fix arbitrary net_device-s corruptions on net_ns stop. · aca51397
      Committed by Pavel Emelyanov
      When a net namespace is destroyed, some devices (those not explicitly
      killed on namespace stop) are moved back to init_net.
      
      The problem is that this net_ns change has one point of failure:
      __dev_alloc_name() may be called if a name collision occurs (and
      this is easy to trigger). This allocator performs a likely-to-fail
      GFP_ATOMIC allocation to find a suitable number. The other possible
      error conditions (the device being ns-local or not registered) are
      always false in this case.
      
      So, when this call fails, the device is unregistered. But this is
      *not* the right thing to do, since after this the device may be
      released (and kfree-ed) improperly. E.g. bridges require more
      actions (sysfs update, timer disarming, etc.), and some other devices
      want to remove their private areas from lists.

      I.e. arbitrary use-after-free cases may occur.
      
      The proposed fix is the following: since the only reason for
      dev_change_net_namespace() to fail is the name generation, we can
      give it a unique fallback name without any %d in it - the dev<ifindex>
      one - since ifindexes are still unique.
      
      So make this change, raise the failure-case printk loglevel to 
      EMERG and replace the unregister_netdevice call with BUG().
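
      In rough shape (a sketch; details may differ from the actual patch):

        char fb_name[IFNAMSIZ];

        /* dev<ifindex> cannot collide - ifindexes are unique */
        snprintf(fb_name, IFNAMSIZ, "dev%d", dev->ifindex);
        err = dev_change_net_namespace(dev, &init_net, fb_name);
        if (err) {
                printk(KERN_EMERG "%s: failed to move %s to init_net: %d\n",
                       __func__, dev->name, err);
                BUG();
        }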
      
      [ Use snprintf() -DaveM ]
      Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      aca51397