1. 17 3月, 2008 7 次提交
    • C
      virtio: fix race in enable_cb · 4265f161
      Christian Borntraeger 提交于
      There is a race in virtio_net, dealing with disabling/enabling the callback.
      I saw the following oops:
      
      kernel BUG at /space/kvm/drivers/virtio/virtio_ring.c:218!
      illegal operation: 0001 [#1] SMP
      Modules linked in: sunrpc dm_mod
      CPU: 2 Not tainted 2.6.25-rc1zlive-host-10623-gd358142-dirty #99
      Process swapper (pid: 0, task: 000000000f85a610, ksp: 000000000f873c60)
      Krnl PSW : 0404300180000000 00000000002b81a6 (vring_disable_cb+0x16/0x20)
                 R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:0 CC:3 PM:0 EA:3
      Krnl GPRS: 0000000000000001 0000000000000001 0000000010005800 0000000000000001
                 000000000f3a0900 000000000f85a610 0000000000000000 0000000000000000
                 0000000000000000 000000000f870000 0000000000000000 0000000000001237
                 000000000f3a0920 000000000010ff74 00000000002846f6 000000000fa0bcd8
      Krnl Code: 00000000002b819a: a7110001           tmll    %r1,1
                 00000000002b819e: a7840004           brc     8,2b81a6
                 00000000002b81a2: a7f40001           brc     15,2b81a4
                >00000000002b81a6: a51b0001           oill    %r1,1
                 00000000002b81aa: 40102000           sth     %r1,0(%r2)
                 00000000002b81ae: 07fe               bcr     15,%r14
                 00000000002b81b0: eb7ff0380024       stmg    %r7,%r15,56(%r15)
                 00000000002b81b6: a7f13e00           tmll    %r15,15872
      Call Trace:
      ([<000000000fa0bcd0>] 0xfa0bcd0)
       [<00000000002b8350>] vring_interrupt+0x5c/0x6c
       [<000000000010ab08>] do_extint+0xb8/0xf0
       [<0000000000110716>] ext_no_vtime+0x16/0x1a
       [<0000000000107e72>] cpu_idle+0x1c2/0x1e0
      
      The problem can be triggered with a high amount of host->guest traffic.
      I think its the following race:
      
      poll says netif_rx_complete
      poll calls enable_cb
      enable_cb opens the interrupt mask
      a new packet comes, an interrupt is triggered----\
      enable_cb sees that there is more work           |
      enable_cb disables the interrupt                 |
             .                                         V
             .                            interrupt is delivered
             .                            skb_recv_done does atomic napi test, ok
       some waiting                       disable_cb is called->check fails->bang!
             .
      poll would do napi check
      poll would do disable_cb
      
      The fix is to let enable_cb not disable the interrupt again, but expect the
      caller to do the cleanup if it returns false. In that case, the interrupt is
      only disabled, if the napi test_set_bit was successful.
      Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (cleaned up doco)
      4265f161
    • A
      virtio: Enable netpoll interface for netconsole logging · da74e89d
      Amit Shah 提交于
      Add a new poll_controller handler that the netpoll interface needs.
      
      This enables netconsole logging from a kvm guest over the virtio
      net interface.
      Signed-off-by: NAmit Shah <amitshah@gmx.net>
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      da74e89d
    • R
      virtio: handle > 2 billion page balloon targets · bdc1681c
      Rusty Russell 提交于
      If the host asks for a huge target towards_target() can overflow, and
      we up oops as we try to release more pages than we have.  The simple
      fix is to use a 64-bit value.
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      bdc1681c
    • J
      virtio: Fix sysfs bits to have proper block symlink · c4839346
      Jeremy Katz 提交于
      Fix up so that the virtio_blk devices in sysfs link correctly to their
      block device.  This then allows them to be detected by hal, etc
      Signed-off-by: NJeremy Katz <katzj@redhat.com>
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      c4839346
    • A
      virtio: Use spin_lock_irqsave/restore for virtio-pci · 27ebe308
      Anthony Liguori 提交于
      virtio-pci acquires its spin lock in an interrupt context so it's necessary
      to use spin_lock_irqsave/restore variants.  This patch fixes guest SMP when
      using virtio devices in KVM.
      Signed-off-by: NAnthony Liguori <aliguori@us.ibm.com>
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      27ebe308
    • L
      Linux 2.6.25-rc6 · a978b30a
      Linus Torvalds 提交于
      a978b30a
    • L
      Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kyle/parisc-2.6 · 69d1d523
      Linus Torvalds 提交于
      * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kyle/parisc-2.6:
        [PARISC] make ptr_to_pide() static
        [PARISC] head.S: section mismatch fixes
        [PARISC] add back Crestone Peak cpu
        [PARISC] futex: special case cmpxchg NULL in kernel space
        [PARISC] clean up show_stack
        [PARISC] add pa8900 CPUs to hardware inventory
        [PARISC] clean up include/asm-parisc/elf.h
        [PARISC] move defconfig to arch/parisc/configs/
        [PARISC] add back AD1889 MAINTAINERS entry
        [PARISC] pdc_console: fix bizarre panic on boot
        [PARISC] dump_stack in show_regs
        [PARISC] pdc_stable: fix compile errors
        [PARISC] remove unused pdc_iodc_printf function
        [PARISC] bump __NR_syscalls
        [PARISC] unbreak pgalloc.h
        [PARISC] move VMALLOC_* definitions to fixmap.h
        [PARISC] wire up timerfd syscalls
        [PARISC] remove old timerfd syscall
      69d1d523
  2. 16 3月, 2008 21 次提交
  3. 15 3月, 2008 10 次提交
    • I
      sched: simplify sched_slice() · 6a6029b8
      Ingo Molnar 提交于
      Use the existing calc_delta_mine() calculation for sched_slice(). This
      saves a divide and simplifies the code because we share it with the
      other /cfs_rq->load users.
      
      It also improves code size:
      
            text    data     bss     dec     hex filename
           42659    2740     144   45543    b1e7 sched.o.before
           42093    2740     144   44977    afb1 sched.o.after
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      6a6029b8
    • I
      sched: fix fair sleepers · e22ecef1
      Ingo Molnar 提交于
      Fair sleepers need to scale their latency target down by runqueue
      weight. Otherwise busy systems will gain ever larger sleep bonus.
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      e22ecef1
    • P
      sched: fix overload performance: buddy wakeups · aa2ac252
      Peter Zijlstra 提交于
      Currently we schedule to the leftmost task in the runqueue. When the
      runtimes are very short because of some server/client ping-pong,
      especially in over-saturated workloads, this will cycle through all
      tasks trashing the cache.
      
      Reduce cache trashing by keeping dependent tasks together by running
      newly woken tasks first. However, by not running the leftmost task first
      we could starve tasks because the wakee can gain unlimited runtime.
      
      Therefore we only run the wakee if its within a small
      (wakeup_granularity) window of the leftmost task. This preserves
      fairness, but does alternate server/client task groups.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      aa2ac252
    • I
      sched: fix calc_delta_mine() · 27d11726
      Ingo Molnar 提交于
      lw->weight can be 0 for a short time during bootup.
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      27d11726
    • I
      sched: fix update_load_add()/sub() · e89996ae
      Ingo Molnar 提交于
      Clear the cached inverse value when updating load. This is needed for
      calc_delta_mine() to work correctly when using the rq load.
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      e89996ae
    • P
      sched: min_vruntime fix · 3fe69747
      Peter Zijlstra 提交于
      Current min_vruntime tracking is incorrect and will cause serious
      problems when we don't run the leftmost task for some reason.
      
      min_vruntime does two things; 1) it's used to determine a forward
      direction when the u64 vruntime wraps, 2) it's used to track the
      leftmost vruntime to position newly enqueued tasks from.
      
      The current logic advances min_vruntime whenever the current task's
      vruntime advance. Because the current task may pass the leftmost task
      still waiting we're failing the second goal. This causes new tasks to be
      placed too far ahead and thus penalizes their runtime.
      
      Fix this by making min_vruntime the min_vruntime of the waiting tasks by
      tracking it in enqueue/dequeue, and compare against current's vruntime
      to obtain the absolute minimum when placing new tasks.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      3fe69747
    • H
      sched: fix race in schedule() · 0e1f3483
      Hiroshi Shimamoto 提交于
      Fix a hard to trigger crash seen in the -rt kernel that also affects
      the vanilla scheduler.
      
      There is a race condition between schedule() and some dequeue/enqueue
      functions; rt_mutex_setprio(), __setscheduler() and sched_move_task().
      
      When scheduling to idle, idle_balance() is called to pull tasks from
      other busy processor. It might drop the rq lock. It means that those 3
      functions encounter on_rq=0 and running=1. The current task should be
      put when running.
      
      Here is a possible scenario:
      
         CPU0                               CPU1
          |                              schedule()
          |                              ->deactivate_task()
          |                              ->idle_balance()
          |                              -->load_balance_newidle()
      rt_mutex_setprio()                     |
          |                              --->double_lock_balance()
          *get lock                          *rel lock
          * on_rq=0, ruuning=1               |
          * sched_class is changed           |
          *rel lock                          *get lock
          :                                  |
                                             :
                                         ->put_prev_task_rt()
                                         ->pick_next_task_fair()
                                             => panic
      
      The current process of CPU1(P1) is scheduling. Deactivated P1, and the
      scheduler looks for another process on other CPU's runqueue because CPU1
      will be idle. idle_balance(), load_balance_newidle() and
      double_lock_balance() are called and double_lock_balance() could drop
      the rq lock. On the other hand, CPU0 is trying to boost the priority of
      P1. The result of boosting only P1's prio and sched_class are changed to
      RT. The sched entities of P1 and P1's group are never put. It makes
      cfs_rq invalid, because the cfs_rq has curr and no leaf, but
      pick_next_task_fair() is called, then the kernel panics.
      Signed-off-by: NHiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      0e1f3483
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6 · 4faa8496
      Linus Torvalds 提交于
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
        firewire: fw-ohci: shut up false compiler warning on PPC32
        firewire: fw-ohci: use dma_alloc_coherent for ar_buffer
        ieee1394: sbp2: fix for SYM13FW500 bridge (Datafab disk)
        firewire: fw-sbp2: fix for SYM13FW500 bridge (Datafab disk)
        firewire: update Kconfig help text
        firewire: warn on fatal condition in topology code
        firewire: fw-sbp2: set single-phase retry_limit
        firewire: fw-ohci: Apple UniNorth 1st generation support
        firewire: fw-ohci: PPC PMac platform code
        firewire: endianess annotations
        firewire: endianess fix
      4faa8496
    • J
      nfsd: fix oops on access from high-numbered ports · b663c6fd
      J. Bruce Fields 提交于
      This bug was always here, but before my commit 6fa02839
      ("recheck for secure ports in fh_verify"), it could only be triggered by
      failure of a kmalloc().  After that commit it could be triggered by a
      client making a request from a non-reserved port for access to an export
      marked "secure".  (Exports are "secure" by default.)
      
      The result is a struct svc_export with a reference count one too low,
      resulting in likely oopses next time the export is accessed.
      
      The reference counting here is not straightforward; a later patch will
      clean up fh_verify().
      
      Thanks to Lukas Hejtmanek for the bug report and followup.
      Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu>
      Cc: Lukas Hejtmanek <xhejtman@ics.muni.cz>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b663c6fd
    • M
      struct export_operations: adjust comments to match current members · 9b89ca7a
      Marc Dionne 提交于
      The comments in the definition of struct export_operations don't match the
      current members.
      
      Add a comment for the 2 new functions and remove 2 comments for unused ones.
      Signed-off-by: NMarc Dionne <marc.c.dionne@gmail.com>
      Acked-by: NDavid Howells <dhowells@redhat.com>
      Acked-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9b89ca7a
  4. 14 3月, 2008 2 次提交