Commits · 4265f161b6bb7b31163671329b1142b9023bf4e3 · Linux-御风守护者 / linux

17 Mar, 2008 7 commits

virtio: fix race in enable_cb · 4265f161

Authored by Christian Borntraeger on Mar 14, 2008

There is a race in virtio_net, dealing with disabling/enabling the callback.
I saw the following oops:

kernel BUG at /space/kvm/drivers/virtio/virtio_ring.c:218!
illegal operation: 0001 [#1] SMP
Modules linked in: sunrpc dm_mod
CPU: 2 Not tainted 2.6.25-rc1zlive-host-10623-gd358142-dirty #99
Process swapper (pid: 0, task: 000000000f85a610, ksp: 000000000f873c60)
Krnl PSW : 0404300180000000 00000000002b81a6 (vring_disable_cb+0x16/0x20)
           R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:0 CC:3 PM:0 EA:3
Krnl GPRS: 0000000000000001 0000000000000001 0000000010005800 0000000000000001
           000000000f3a0900 000000000f85a610 0000000000000000 0000000000000000
           0000000000000000 000000000f870000 0000000000000000 0000000000001237
           000000000f3a0920 000000000010ff74 00000000002846f6 000000000fa0bcd8
Krnl Code: 00000000002b819a: a7110001           tmll    %r1,1
           00000000002b819e: a7840004           brc     8,2b81a6
           00000000002b81a2: a7f40001           brc     15,2b81a4
          >00000000002b81a6: a51b0001           oill    %r1,1
           00000000002b81aa: 40102000           sth     %r1,0(%r2)
           00000000002b81ae: 07fe               bcr     15,%r14
           00000000002b81b0: eb7ff0380024       stmg    %r7,%r15,56(%r15)
           00000000002b81b6: a7f13e00           tmll    %r15,15872
Call Trace:
([<000000000fa0bcd0>] 0xfa0bcd0)
 [<00000000002b8350>] vring_interrupt+0x5c/0x6c
 [<000000000010ab08>] do_extint+0xb8/0xf0
 [<0000000000110716>] ext_no_vtime+0x16/0x1a
 [<0000000000107e72>] cpu_idle+0x1c2/0x1e0

The problem can be triggered with a high amount of host->guest traffic.
I think it's the following race:

poll says netif_rx_complete
poll calls enable_cb
enable_cb opens the interrupt mask
a new packet comes, an interrupt is triggered----\
enable_cb sees that there is more work           |
enable_cb disables the interrupt                 |
       .                                         V
       .                            interrupt is delivered
       .                            skb_recv_done does atomic napi test, ok
 some waiting                       disable_cb is called->check fails->bang!
       .
poll would do napi check
poll would do disable_cb

The fix is to let enable_cb not disable the interrupt again, but expect the
caller to do the cleanup if it returns false. In that case, the interrupt is
only disabled if the napi test_set_bit was successful.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (cleaned up doco)

4265f161
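The fix described in 4265f161 can be sketched in userspace C. This is a toy model under stated assumptions, not the driver code: `toy_vq`, `toy_enable_cb`, `toy_disable_cb`, `toy_napi_schedule_prep` and `toy_poll_complete` are hypothetical names, and the `assert` in `toy_disable_cb` stands in for the check that fired at virtio_ring.c:218. The point it illustrates is that enable_cb only reports "more work arrived", and the callback is disabled again only by whoever wins the NAPI-style test-and-set.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy virtqueue state (hypothetical; not the kernel's vring structures). */
struct toy_vq {
    bool cb_disabled; /* models the "no interrupt" flag */
    int  pending;     /* buffers queued by the host, not yet consumed */
};

/* Models disable_cb: disabling an already disabled callback is a bug. */
static void toy_disable_cb(struct toy_vq *vq)
{
    assert(!vq->cb_disabled);
    vq->cb_disabled = true;
}

/*
 * Models the fixed enable_cb: re-enable the callback and report whether
 * the queue is still empty. It never disables the callback itself.
 */
static bool toy_enable_cb(struct toy_vq *vq)
{
    vq->cb_disabled = false;
    return vq->pending == 0; /* false: caller must clean up */
}

/* Models the napi test-and-set: only one side may win. */
static bool toy_napi_schedule_prep(bool *scheduled)
{
    if (*scheduled)
        return false;
    *scheduled = true;
    return true;
}

/*
 * Tail of the poll routine. Returns true when polling must continue
 * because more work arrived while the callback was being re-enabled.
 */
static bool toy_poll_complete(struct toy_vq *vq, bool *napi_scheduled)
{
    *napi_scheduled = false; /* models netif_rx_complete */
    if (!toy_enable_cb(vq) && toy_napi_schedule_prep(napi_scheduled)) {
        /* We won the test-and-set, so we are the only one disabling. */
        toy_disable_cb(vq);
        return true;
    }
    return false;
}
```

In this model an interrupt handler would call the same `toy_napi_schedule_prep()` before `toy_disable_cb()`, so the poll side and the interrupt side can never both disable the callback.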

virtio: Enable netpoll interface for netconsole logging · da74e89d

Authored by Amit Shah on Feb 29, 2008

Add a new poll_controller handler that the netpoll interface needs.

This enables netconsole logging from a kvm guest over the virtio
net interface.
Signed-off-by: Amit Shah <amitshah@gmx.net>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

da74e89d

virtio: handle > 2 billion page balloon targets · bdc1681c

Authored by Rusty Russell on Mar 17, 2008

If the host asks for a huge target, towards_target() can overflow, and
we end up oopsing as we try to release more pages than we have.  The simple
fix is to use a 64-bit value.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

bdc1681c
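The overflow described in bdc1681c can be reproduced with plain integer arithmetic. The sketch below is an illustration only: `towards_target_32` and `towards_target_64` are hypothetical stand-ins for the balloon driver's helper, taking the target and the current page count as arguments, and the 32-bit variant relies on the usual two's-complement wraparound when converting to `int32_t`.

```c
#include <assert.h>
#include <stdint.h>

/* Difference squeezed into 32 signed bits: wraps for targets above 2^31. */
static int32_t towards_target_32(uint32_t target, uint32_t num_pages)
{
    return (int32_t)(target - num_pages);
}

/* Difference computed in 64 bits: the sign always reflects the direction. */
static int64_t towards_target_64(uint32_t target, uint32_t num_pages)
{
    int64_t diff = (int64_t)target - (int64_t)num_pages;

    /* The difference of two 32-bit counts always fits in 64 bits. */
    assert(diff >= -(int64_t)UINT32_MAX && diff <= (int64_t)UINT32_MAX);
    return diff;
}
```

With a target of 3,000,000,000 pages and an empty balloon, the 32-bit version returns a negative number, which reads as "release pages", while the 64-bit version returns the intended positive distance.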

virtio: Fix sysfs bits to have proper block symlink · c4839346

Authored by Jeremy Katz on Mar 02, 2008

Fix up so that the virtio_blk devices in sysfs link correctly to their
block device.  This then allows them to be detected by hal, etc.
Signed-off-by: Jeremy Katz <katzj@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

c4839346

virtio: Use spin_lock_irqsave/restore for virtio-pci · 27ebe308

Authored by Anthony Liguori on Mar 02, 2008

virtio-pci acquires its spin lock in an interrupt context, so it's necessary
to use the spin_lock_irqsave/restore variants.  This patch fixes guest SMP when
using virtio devices in KVM.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

27ebe308

Linux 2.6.25-rc6 · a978b30a

Authored by Linus Torvalds on Mar 16, 2008

a978b30a

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kyle/parisc-2.6 · 69d1d523

Authored by Linus Torvalds on Mar 16, 2008

* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kyle/parisc-2.6:
  [PARISC] make ptr_to_pide() static
  [PARISC] head.S: section mismatch fixes
  [PARISC] add back Crestone Peak cpu
  [PARISC] futex: special case cmpxchg NULL in kernel space
  [PARISC] clean up show_stack
  [PARISC] add pa8900 CPUs to hardware inventory
  [PARISC] clean up include/asm-parisc/elf.h
  [PARISC] move defconfig to arch/parisc/configs/
  [PARISC] add back AD1889 MAINTAINERS entry
  [PARISC] pdc_console: fix bizarre panic on boot
  [PARISC] dump_stack in show_regs
  [PARISC] pdc_stable: fix compile errors
  [PARISC] remove unused pdc_iodc_printf function
  [PARISC] bump __NR_syscalls
  [PARISC] unbreak pgalloc.h
  [PARISC] move VMALLOC_* definitions to fixmap.h
  [PARISC] wire up timerfd syscalls
  [PARISC] remove old timerfd syscall

69d1d523

16 Mar, 2008 21 commits

[PARISC] make ptr_to_pide() static · 56ee0cfd

Authored by FUJITA Tomonori on Mar 10, 2008

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>

56ee0cfd

[PARISC] head.S: section mismatch fixes · 0c634cc6

Authored by Helge Deller on Dec 26, 2007

- move boot_args[] into the init section
- move $global$ into the read_mostly section
- fix the following two section mismatches:
WARNING: vmlinux.o(.text+0x9c): Section mismatch: reference to .init.text:start_kernel (between '$pgt_fill_loop' and '$is_pa20')
WARNING: vmlinux.o(.text+0xa0): Section mismatch: reference to .init.text:start_kernel (between '$pgt_fill_loop' and '$is_pa20')
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>

0c634cc6

[PARISC] add back Crestone Peak cpu · ab86adb4

Authored by Kyle McMartin on Mar 01, 2008

Crestone Peak Slow is the 800MHz PA-8800 cpu in the C8000.
0x88B is probably the Crestone Peak Fast.
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>

ab86adb4

[PARISC] futex: special case cmpxchg NULL in kernel space · c20a84c9

Authored by Kyle McMartin on Mar 01, 2008

Commit a0c1e907 added code to futex.c
to detect whether futex_atomic_cmpxchg_inatomic was implemented at run
time:

+       curval = cmpxchg_futex_value_locked(NULL, 0, 0);
+       if (curval == -EFAULT)
+               futex_cmpxchg_enabled = 1;

This is bogus on parisc, since page zero in kernel virtual space is the
gateway page for syscall entry, and should not be read from the kernel.
(That, and we really don't like the kernel faulting on its own address
 space...)
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>

c20a84c9

[PARISC] clean up show_stack · dc39455e

Authored by Kyle McMartin on Mar 01, 2008

When we show_regs, we obviously have a struct pt_regs of the calling
frame. Use these in show_stack so we don't have the entire bogus call trace
up to the show_stack call.
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>

dc39455e

[PARISC] add pa8900 CPUs to hardware inventory · b23f5baa

Authored by James Bottomley on Feb 20, 2008

This patch adds the known pa8900 CPUs to the inventory list and removes
the Crestone Peak one, which apparently never escaped into the wild.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>

b23f5baa

[PARISC] clean up include/asm-parisc/elf.h · fd5d3f6a

Authored by Randolph Chung on Feb 24, 2008

Clean up some cruft. No functionality changes.
Signed-off-by: Randolph Chung <tausq@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>

fd5d3f6a

[PARISC] move defconfig to arch/parisc/configs/ · c04f7ae2

Authored by Adrian Bunk on Feb 26, 2008

This patch moves the default parisc defconfig to
arch/parisc/configs/generic_defconfig where it belongs and selects it as
the default defconfig through KBUILD_DEFCONFIG.
Signed-off-by: Adrian Bunk <adrian.bunk@movial.fi>
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>

c04f7ae2

[PARISC] add back AD1889 MAINTAINERS entry · 2f39d519

Authored by Thibaut VARENE on Feb 20, 2008

Signed-off-by: Thibaut VARENE <T-Bone@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>

2f39d519

[PARISC] pdc_console: fix bizarre panic on boot · ef1afd4d

Authored by Kyle McMartin on Feb 18, 2008

Commit 721fdf34 introduced a subtle bug
by accidentally removing the "static" from iodc_dbuf. This resulted in what
appeared to be a trap without *current set to a task, probably the result of
a trap in real mode while calling firmware.

Also do other misc clean ups. Since the only input from firmware is
non-blocking, share iodc_dbuf between input and output, and spinlock the
only callers.
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>

ef1afd4d

[PARISC] dump_stack in show_regs · d0347b49

Authored by Kyle McMartin on Feb 18, 2008

Originally, show_stack was used in BUG() output. However, a recent commit
changed it to print register state (no idea what that's supposed to help,
really...) and parisc was missing a backtrace because of it.
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>

d0347b49

[PARISC] pdc_stable: fix compile errors · ff451d70

Authored by Joel Soete on Feb 18, 2008

Signed-off-by: Joel Soete <rubisher@scarlet.be>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>

ff451d70

[PARISC] remove unused pdc_iodc_printf function · 179183bf

Authored by Kyle McMartin on Feb 18, 2008

Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>

179183bf

[PARISC] bump __NR_syscalls · e2be75ae

Authored by Kyle McMartin on Feb 18, 2008

Oops, forgot this in the previous commit.
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>

e2be75ae

[PARISC] unbreak pgalloc.h · 9aa150b8

Authored by Kyle McMartin on Feb 18, 2008

Commit 2f569afd broke the compile
rather spectacularly. Fix code errors.
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>

9aa150b8

[PARISC] move VMALLOC_* definitions to fixmap.h · d912e1dc

Authored by Kyle McMartin on Feb 18, 2008

They make way more sense here, really...
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>

d912e1dc

[PARISC] wire up timerfd syscalls · ff80c66a

Authored by Kyle McMartin on Feb 18, 2008

Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>

ff80c66a

[PARISC] remove old timerfd syscall · 0cb845ec

Authored by Kyle McMartin on Feb 18, 2008

Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>

0cb845ec

ACPI: Remove ACPI_CUSTOM_DSDT_INITRD option · 9a9e0d68

Authored by Linus Torvalds on Mar 15, 2008

This essentially reverts commit 71fc47a9
("ACPI: basic initramfs DSDT override support"), because the code simply
isn't ready.

It did ugly things to the init sequence to populate the rootfs image
early, but that just ended up showing other problems with the whole
approach.  The fact is, the VFS layer simply isn't initialized this
early, and the relevant ACPI code should either run much later, or this
shouldn't be done at all.

For 2.6.25, we'll just pick the latter option.  We can revisit this
concept later if necessary.

Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: Tilman Schmidt <tilman@imap.cc>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Renninger <trenn@suse.de>
Cc: Eric Piel <eric.piel@tremplin-utc.net>
Cc: Len Brown <len.brown@intel.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Markus Gaugusch <dsdt@gaugusch.at>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

9a9e0d68

tifm_sd: DATA_CARRY is not boolean in tifm_sd_transfer_data() · ce636452

Authored by Roel Kluin on Mar 15, 2008

DATA_CARRY is not boolean
Signed-off-by: Roel Kluin <12o3l@tiscali.nl>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ce636452

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 · afbf331e

Authored by Linus Torvalds on Mar 15, 2008

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  [NET]: Fix tbench regression in 2.6.25-rc1

afbf331e

15 Mar, 2008 10 commits

sched: simplify sched_slice() · 6a6029b8

Authored by Ingo Molnar on Mar 14, 2008

Use the existing calc_delta_mine() calculation for sched_slice(). This
saves a divide and simplifies the code because we share it with the
other cfs_rq->load users.

It also improves code size:

      text    data     bss     dec     hex filename
     42659    2740     144   45543    b1e7 sched.o.before
     42093    2740     144   44977    afb1 sched.o.after
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>

6a6029b8

sched: fix fair sleepers · e22ecef1

Authored by Ingo Molnar on Mar 14, 2008

Fair sleepers need to scale their latency target down by runqueue
weight. Otherwise busy systems will gain an ever larger sleep bonus.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>

e22ecef1

sched: fix overload performance: buddy wakeups · aa2ac252

Authored by Peter Zijlstra on Mar 14, 2008

Currently we schedule to the leftmost task in the runqueue. When the
runtimes are very short because of some server/client ping-pong,
especially in over-saturated workloads, this will cycle through all
tasks, trashing the cache.

Reduce cache trashing by keeping dependent tasks together by running
newly woken tasks first. However, by not running the leftmost task first
we could starve tasks because the wakee can gain unlimited runtime.

Therefore we only run the wakee if it's within a small
(wakeup_granularity) window of the leftmost task. This preserves
fairness, but does alternate server/client task groups.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

aa2ac252
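The selection rule from aa2ac252 (prefer the wakee only when it sits within a small window of the leftmost task) can be sketched as follows. This is a simplified model, not the scheduler code: `toy_entity` and `toy_pick_next` are hypothetical names, and the granularity is passed in as a plain number rather than derived from a sysctl and the runqueue load.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal stand-in for a schedulable entity (hypothetical). */
struct toy_entity {
    uint64_t vruntime;
};

/*
 * Choose between the leftmost (smallest vruntime) entity and the most
 * recently woken one. The wakee wins only if it is at most `gran`
 * ahead of the leftmost entity, which bounds how much extra runtime a
 * ping-ponging pair can take from everyone else.
 */
static const struct toy_entity *
toy_pick_next(const struct toy_entity *leftmost,
              const struct toy_entity *wakee, int64_t gran)
{
    assert(leftmost != NULL);

    if (wakee == NULL)
        return leftmost;

    /* Signed difference so the comparison survives u64 wraparound. */
    int64_t diff = (int64_t)(wakee->vruntime - leftmost->vruntime);

    if (diff < 0 || diff > gran)
        return leftmost;

    return wakee;
}
```

A wakee 500 units ahead of the leftmost task is picked when the window is 1000, while one 4000 units ahead is not, so the leftmost task cannot be starved indefinitely.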

sched: fix calc_delta_mine() · 27d11726

Authored by Ingo Molnar on Mar 14, 2008

lw->weight can be 0 for a short time during bootup.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>

27d11726

sched: fix update_load_add()/sub() · e89996ae

Authored by Ingo Molnar on Mar 14, 2008

Clear the cached inverse value when updating load. This is needed for
calc_delta_mine() to work correctly when using the rq load.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>

e89996ae
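Why the cached inverse in e89996ae has to be cleared can be shown with a small fixed-point model. Everything here is an assumption made for illustration: `toy_load`, `toy_inverse`, `toy_load_add`, `toy_load_sub` and `toy_scale` are hypothetical names, the 32-bit shift is an arbitrary choice, and the arithmetic ignores the overflow handling a real implementation would need.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_WMULT_SHIFT 32

/* Load with a lazily computed reciprocal; 0 means "not cached". */
struct toy_load {
    uint64_t weight;
    uint64_t inv_weight;
};

/* Return 2^32 / weight, computing and caching it on first use. */
static uint64_t toy_inverse(struct toy_load *lw)
{
    assert(lw->weight != 0);
    if (lw->inv_weight == 0)
        lw->inv_weight = (1ULL << TOY_WMULT_SHIFT) / lw->weight;
    return lw->inv_weight;
}

/* Changing the weight invalidates the cached reciprocal. */
static void toy_load_add(struct toy_load *lw, uint64_t inc)
{
    lw->weight += inc;
    lw->inv_weight = 0;
}

static void toy_load_sub(struct toy_load *lw, uint64_t dec)
{
    lw->weight -= dec;
    lw->inv_weight = 0;
}

/* delta * weight / lw->weight, done as a multiply by the reciprocal. */
static uint64_t toy_scale(uint64_t delta, uint64_t weight,
                          struct toy_load *lw)
{
    return (delta * weight * toy_inverse(lw)) >> TOY_WMULT_SHIFT;
}
```

If `toy_load_add()` left `inv_weight` untouched, a later `toy_scale()` would keep dividing by the old weight, which is the stale-cache behaviour the commit message describes.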

sched: min_vruntime fix · 3fe69747

Authored by Peter Zijlstra on Mar 14, 2008

Current min_vruntime tracking is incorrect and will cause serious
problems when we don't run the leftmost task for some reason.

min_vruntime does two things: 1) it's used to determine a forward
direction when the u64 vruntime wraps, 2) it's used to track the
leftmost vruntime to position newly enqueued tasks from.

The current logic advances min_vruntime whenever the current task's
vruntime advances. Because the current task may pass the leftmost task
still waiting, we're failing the second goal. This causes new tasks to be
placed too far ahead and thus penalizes their runtime.

Fix this by making min_vruntime the min_vruntime of the waiting tasks by
tracking it in enqueue/dequeue, and compare against current's vruntime
to obtain the absolute minimum when placing new tasks.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

3fe69747
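Both roles of min_vruntime mentioned in 3fe69747 can be illustrated with wrap-safe comparisons on unsigned 64-bit counters. This is a sketch under assumptions, not the patch itself: the `toy_` helpers are hypothetical, and the comparison is only meaningful while two values are less than 2^63 apart.

```c
#include <assert.h>
#include <stdint.h>

/* Later of two vruntimes, judged by the sign of their difference. */
static uint64_t toy_max_vruntime(uint64_t a, uint64_t b)
{
    int64_t delta = (int64_t)(b - a);
    return delta > 0 ? b : a;
}

/* Earlier of two vruntimes, judged the same way. */
static uint64_t toy_min_vruntime(uint64_t a, uint64_t b)
{
    int64_t delta = (int64_t)(b - a);
    return delta < 0 ? b : a;
}

/*
 * Base from which a new task is placed: the earlier of the leftmost
 * waiting task and the currently running task.
 */
static uint64_t toy_place_base(uint64_t leftmost_waiting, uint64_t curr)
{
    return toy_min_vruntime(leftmost_waiting, curr);
}

/* Small self-check showing that ordering survives a counter wrap. */
static void toy_wrap_demo(void)
{
    uint64_t before_wrap = UINT64_MAX - 5; /* just before wrapping */
    uint64_t after_wrap = 4;               /* ten ticks later */

    assert(toy_max_vruntime(before_wrap, after_wrap) == after_wrap);
    assert(toy_min_vruntime(before_wrap, after_wrap) == before_wrap);
}
```

If the running task has advanced to 250 while the leftmost waiting task is still at 100, `toy_place_base(100, 250)` yields 100, so a new task is not pushed ahead of a task that is still waiting.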

sched: fix race in schedule() · 0e1f3483

Authored by Hiroshi Shimamoto on Mar 10, 2008

Fix a hard to trigger crash seen in the -rt kernel that also affects
the vanilla scheduler.

There is a race condition between schedule() and some dequeue/enqueue
functions: rt_mutex_setprio(), __setscheduler() and sched_move_task().

When scheduling to idle, idle_balance() is called to pull tasks from
other busy processors. It might drop the rq lock, which means that those 3
functions can encounter on_rq=0 and running=1. The current task should be
put when running.

Here is a possible scenario:

   CPU0                               CPU1
    |                              schedule()
    |                              ->deactivate_task()
    |                              ->idle_balance()
    |                              -->load_balance_newidle()
rt_mutex_setprio()                     |
    |                              --->double_lock_balance()
    *get lock                          *rel lock
    * on_rq=0, running=1               |
    * sched_class is changed           |
    *rel lock                          *get lock
    :                                  |
                                       :
                                   ->put_prev_task_rt()
                                   ->pick_next_task_fair()
                                       => panic

The current process of CPU1 (P1) is scheduling out. P1 is deactivated, and
the scheduler looks for another process on another CPU's runqueue because
CPU1 would otherwise go idle. idle_balance(), load_balance_newidle() and
double_lock_balance() are called, and double_lock_balance() may drop
the rq lock. Meanwhile, CPU0 is trying to boost the priority of
P1. As a result of the boost, only P1's prio and sched_class are changed to
RT; the sched entities of P1 and P1's group are never put. This leaves the
cfs_rq invalid, because it has a curr but no leaf, yet
pick_next_task_fair() is still called, and the kernel panics.
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

0e1f3483

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6 · 4faa8496

Authored by Linus Torvalds on Mar 14, 2008

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
  firewire: fw-ohci: shut up false compiler warning on PPC32
  firewire: fw-ohci: use dma_alloc_coherent for ar_buffer
  ieee1394: sbp2: fix for SYM13FW500 bridge (Datafab disk)
  firewire: fw-sbp2: fix for SYM13FW500 bridge (Datafab disk)
  firewire: update Kconfig help text
  firewire: warn on fatal condition in topology code
  firewire: fw-sbp2: set single-phase retry_limit
  firewire: fw-ohci: Apple UniNorth 1st generation support
  firewire: fw-ohci: PPC PMac platform code
  firewire: endianess annotations
  firewire: endianess fix

4faa8496

nfsd: fix oops on access from high-numbered ports · b663c6fd

Authored by J. Bruce Fields on Mar 14, 2008

This bug was always here, but before my commit 6fa02839
("recheck for secure ports in fh_verify"), it could only be triggered by
failure of a kmalloc().  After that commit it could be triggered by a
client making a request from a non-reserved port for access to an export
marked "secure".  (Exports are "secure" by default.)

The result is a struct svc_export with a reference count one too low,
resulting in likely oopses next time the export is accessed.

The reference counting here is not straightforward; a later patch will
clean up fh_verify().

Thanks to Lukas Hejtmanek for the bug report and followup.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Lukas Hejtmanek <xhejtman@ics.muni.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

b663c6fd

struct export_operations: adjust comments to match current members · 9b89ca7a

Authored by Marc Dionne on Mar 14, 2008

The comments in the definition of struct export_operations don't match the
current members.

Add a comment for the 2 new functions and remove 2 comments for unused ones.
Signed-off-by: Marc Dionne <marc.c.dionne@gmail.com>
Acked-by: David Howells <dhowells@redhat.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

9b89ca7a

14 Mar, 2008 2 commits

firewire: fw-ohci: shut up false compiler warning on PPC32 · f5101d58

Authored by Stefan Richter on Mar 14, 2008

Shut up "may be used uninitialised in this function" warnings due to
PPC32's implementation of dma_alloc_coherent().
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>

f5101d58

firewire: fw-ohci: use dma_alloc_coherent for ar_buffer · bde1709a

Authored by Jarod Wilson on Mar 12, 2008

Currently, we do nothing to guarantee we have a consistent DMA buffer for
asynchronous receive packets. Rather than doing several syncs following a
dma_map_single() to get consistent buffers, just switch to using
dma_alloc_coherent().

Resolves constant buffer failures on my own x86_64 laptop w/4GB of RAM and
is likely to fix a number of other failures witnessed on x86_64 systems with
4GB of RAM or more.
Signed-off-by: Jarod Wilson <jwilson@redhat.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>

bde1709a
