1. 24 Jul, 2011 3 commits
  2. 23 Jul, 2011 1 commit
    • virtio: expose for non-virtualization users too · e7254219
      Committed by Ohad Ben-Cohen
      virtio has so far been used only in the context of virtualization,
      and the virtio Kconfig was sourced directly by the relevant arch
      Kconfigs when VIRTUALIZATION was selected.
      
      Now that we start using virtio for inter-processor communications,
      we need to source the virtio Kconfig outside of the virtualization
      scope too.
      
      Moreover, some architectures might use virtio for both virtualization
      and inter-processor communications, so directly sourcing virtio
      might yield unexpected results due to conflicting selections.
      
      The simple solution offered by this patch is to always source virtio's
      Kconfig in drivers/Kconfig, and remove it from the appropriate arch
      Kconfigs. Additionally, a virtio menu entry has been added so virtio
      drivers don't show up in the general drivers menu.
      
      This way anyone can use virtio, though it's arguably less accessible
      (and neat!) for virtualization users now.
      
      Note: some architectures (mips and sh) seem to have a VIRTUALIZATION
      menu merely for sourcing virtio's Kconfig, so that menu is removed too.
      Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
  3. 06 Jun, 2011 2 commits
    • [S390] kvm-s390: fix stfle facilities numbers >=64 · 9950f8be
      Committed by Christian Borntraeger
      Currently KVM masks out the known good facilities only for the first
      double word, but passes the 2nd double word without filtering. This
      breaks some code on newer systems:
      
      [    0.593966] ------------[ cut here ]------------
      [    0.594086] WARNING: at arch/s390/oprofile/hwsampler.c:696
      [    0.594213] Modules linked in:
      [    0.594321] Modules linked in:
      [    0.594439] CPU: 0 Not tainted 3.0.0-rc1 #46
      [    0.594564] Process swapper (pid: 1, task: 00000001effa8038, ksp: 00000001effafab8)
      [    0.594735] Krnl PSW : 0704100180000000 00000000004ab89a (hwsampler_setup+0x75a/0x7b8)
      [    0.594910]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:1 PM:0 EA:3
      [    0.595120] Krnl GPRS: ffffffff00000000 00000000ffffffea ffffffffffffffea 00000000004a98f8
      [    0.595351]            00000000004aa002 0000000000000001 000000000080e720 000000000088b9f8
      [    0.595522]            000000000080d3e8 0000000000000000 0000000000000000 000000000080e464
      [    0.595725]            0000000000000000 00000000005db198 00000000004ab3a2 00000001effafd98
      [    0.595901] Krnl Code: 00000000004ab88c: c0e5000673ca        brasl   %r14,57a020
      [    0.596071]            00000000004ab892: a7f4fc77            brc     15,4ab180
      [    0.596276]            00000000004ab896: a7f40001            brc     15,4ab898
      [    0.596454]           >00000000004ab89a: a7c8ffa1            lhi     %r12,-95
      [    0.596657]            00000000004ab89e: a7f4fc71            brc     15,4ab180
      [    0.596854]            00000000004ab8a2: a7f40001            brc     15,4ab8a4
      [    0.597029]            00000000004ab8a6: a7f4ff22            brc     15,4ab6ea
      [    0.597230]            00000000004ab8aa: c0200011009a        larl    %r2,6cb9de
      [    0.597441] Call Trace:
      [    0.597511] ([<00000000004ab3a2>] hwsampler_setup+0x262/0x7b8)
      [    0.597676]  [<0000000000875812>] oprofile_arch_init+0x32/0xd0
      [    0.597834]  [<0000000000875788>] oprofile_init+0x28/0x74
      [    0.597991]  [<00000000001001be>] do_one_initcall+0x3a/0x170
      [    0.598151]  [<000000000084fa22>] kernel_init+0x142/0x1ec
      [    0.598314]  [<000000000057db16>] kernel_thread_starter+0x6/0xc
      [    0.598468]  [<000000000057db10>] kernel_thread_starter+0x0/0xc
      [    0.598606] Last Breaking-Event-Address:
      [    0.598707]  [<00000000004ab896>] hwsampler_setup+0x756/0x7b8
      [    0.598863] ---[ end trace ce3179037f4e3e5b ]---
      
      So let's also mask the 2nd double word. Facilities 66,76,76,77 should be fine.
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
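The fix described in this commit amounts to applying a known-good mask to every double word of the facility list instead of only the first. Below is a minimal sketch in C, assuming the s390 convention that facility bit 0 is the most significant bit of the first double word. The helper names (`facility_bit`, `filter_facilities`, `test_facility`) and any mask contents passed in are illustrative assumptions, not the code or the values from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Facility bits are numbered from the most significant bit:
 * facility nr lives in double word nr / 64, at bit 63 - (nr % 64). */
static uint64_t facility_bit(unsigned int nr)
{
	return UINT64_C(1) << (63 - (nr % 64));
}

/* Hypothetical helper: keep only the facilities present in `allowed`.
 * Every double word is filtered, not just the first one. */
static void filter_facilities(uint64_t *list, const uint64_t *allowed,
			      size_t ndw)
{
	assert(list != NULL && allowed != NULL);
	for (size_t i = 0; i < ndw; i++)
		list[i] &= allowed[i];
}

/* Returns nonzero when facility nr is set in the list. */
static int test_facility(const uint64_t *list, unsigned int nr)
{
	return (list[nr / 64] & facility_bit(nr)) != 0;
}
```

With an `allowed` array whose second entry only contains, say, `facility_bit(66) | facility_bit(77)`, a guest-visible list that started out with all bits set would report facility 66 as present and facility 64 as absent after filtering.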
    • [S390] kvm-s390: Fix host crash on misbehaving guests · a578b37c
      Committed by Christian Borntraeger
      Commit 9ff4cfb3 ([S390] kvm-390: Let kernel exit SIE instruction on
      work) fixed a problem of commit cd3b70f5 ([S390] virtualization
      aware cpu measurement) but uncovered another one.
      
      If a kvm guest accesses guest real memory that doesn't exist, the
      page fault handler calls the sie hook, which then rewrites
      the return psw from sie_inst to either sie_exit or sie_reenter.
      On return, the page fault handler will then detect the wrong access
      as a kernel fault causing a kernel oops in sie_reenter or sie_exit.
      
      We have to add these two addresses to the exception table to allow
      graceful exits.
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
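The role of an exception table can be sketched as a lookup from a faulting instruction address to a fixup address: if the address is listed, execution continues at the fixup instead of producing an oops. The struct layout, the function name `search_fixup` and the addresses used with it are hypothetical and only illustrate the lookup; this is not the s390 implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One hypothetical table entry: where a fault may happen, and where to go. */
struct ex_entry {
	uintptr_t insn;   /* address that is allowed to fault */
	uintptr_t fixup;  /* address to continue at instead of oopsing */
};

/* Returns the fixup address, or 0 when the faulting address is not listed
 * (in which case the fault would be treated as a genuine kernel bug). */
static uintptr_t search_fixup(const struct ex_entry *tbl, size_t n,
			      uintptr_t addr)
{
	assert(tbl != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (tbl[i].insn == addr)
			return tbl[i].fixup;
	return 0;
}
```

In terms of the commit message, the two addresses sie_exit and sie_reenter would each get an entry, so a fault attributed to either of them resolves to a graceful exit path.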
  4. 20 Apr, 2011 1 commit
  5. 31 Mar, 2011 1 commit
  6. 17 Mar, 2011 1 commit
  7. 12 Jan, 2011 1 commit
  8. 05 Jan, 2011 1 commit
  9. 25 Oct, 2010 2 commits
  10. 01 Aug, 2010 6 commits
  11. 09 Jun, 2010 2 commits
  12. 27 May, 2010 1 commit
  13. 19 May, 2010 1 commit
  14. 17 May, 2010 3 commits
    • KVM: use the correct RCU API for PROVE_RCU=y · 90d83dc3
      Committed by Lai Jiangshan
      The RCU/SRCU APIs have already changed for proving RCU usage.
      
      I got the following dmesg when PROVE_RCU=y because we used the incorrect API.
      This patch converts rcu_dereference() to srcu_dereference() or a related API.
      
      ===================================================
      [ INFO: suspicious rcu_dereference_check() usage. ]
      ---------------------------------------------------
      arch/x86/kvm/mmu.c:3020 invoked rcu_dereference_check() without protection!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 1, debug_locks = 0
      2 locks held by qemu-system-x86/8550:
       #0:  (&kvm->slots_lock){+.+.+.}, at: [<ffffffffa011a6ac>] kvm_set_memory_region+0x29/0x50 [kvm]
       #1:  (&(&kvm->mmu_lock)->rlock){+.+...}, at: [<ffffffffa012262d>] kvm_arch_commit_memory_region+0xa6/0xe2 [kvm]
      
      stack backtrace:
      Pid: 8550, comm: qemu-system-x86 Not tainted 2.6.34-rc4-tip-01028-g939eab1 #27
      Call Trace:
       [<ffffffff8106c59e>] lockdep_rcu_dereference+0xaa/0xb3
       [<ffffffffa012f6c1>] kvm_mmu_calculate_mmu_pages+0x44/0x7d [kvm]
       [<ffffffffa012263e>] kvm_arch_commit_memory_region+0xb7/0xe2 [kvm]
       [<ffffffffa011a5d7>] __kvm_set_memory_region+0x636/0x6e2 [kvm]
       [<ffffffffa011a6ba>] kvm_set_memory_region+0x37/0x50 [kvm]
       [<ffffffffa015e956>] vmx_set_tss_addr+0x46/0x5a [kvm_intel]
       [<ffffffffa0126592>] kvm_arch_vm_ioctl+0x17a/0xcf8 [kvm]
       [<ffffffff810a8692>] ? unlock_page+0x27/0x2c
       [<ffffffff810bf879>] ? __do_fault+0x3a9/0x3e1
       [<ffffffffa011b12f>] kvm_vm_ioctl+0x364/0x38d [kvm]
       [<ffffffff81060cfa>] ? up_read+0x23/0x3d
       [<ffffffff810f3587>] vfs_ioctl+0x32/0xa6
       [<ffffffff810f3b19>] do_vfs_ioctl+0x495/0x4db
       [<ffffffff810e6b2f>] ? fget_light+0xc2/0x241
       [<ffffffff810e416c>] ? do_sys_open+0x104/0x116
       [<ffffffff81382d6d>] ? retint_swapgs+0xe/0x13
       [<ffffffff810f3ba6>] sys_ioctl+0x47/0x6a
       [<ffffffff810021db>] system_call_fastpath+0x16/0x1b
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
    • KVM: s390: Fix possible memory leak in kvm_arch_vcpu_create() · 7b06bf2f
      Committed by Wei Yongjun
      This patch fixes a possible memory leak in kvm_arch_vcpu_create()
      under s390, which would happen when kvm_arch_vcpu_create() fails.
      Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
      Acked-by: Carsten Otte <cotte@de.ibm.com>
      Cc: stable@kernel.org
      Signed-off-by: Avi Kivity <avi@redhat.com>
    • [S390] virtualization aware cpu measurement · cd3b70f5
      Committed by Carsten Otte
      Use the SPP instruction to set a tag on entry to / exit of the virtual
      machine context. This allows the cpu measurement facility to distinguish
      the samples from the host and the different guests.
      Signed-off-by: Carsten Otte <cotte@de.ibm.com>
  15. 30 Mar, 2010 1 commit
    • include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Committed by Tejun Heo
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities to include
      those headers directly instead of assuming availability.  As this
      conversion needs to touch a large number of source files, the
      following script is used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the following.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and tries to put the new include such that its order conforms
        to its surrounding.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         widely available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build tests were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable on most builds of
      the specific arch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
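The first rule of the script ("if only gfp is used, gfp.h, if slab is used, slab.h") can be sketched as a small decision function. This is written in C for illustration only; the actual script is the Python file linked in the commit message, and the token lists below are hypothetical stand-ins for whatever usage detection it really performs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token lists; the real script's heuristics are not
 * described in the commit message. */
static const char *const slab_tokens[] = {
	"kmalloc(", "kzalloc(", "kfree(", "kmem_cache_",
};
static const char *const gfp_tokens[] = {
	"GFP_KERNEL", "GFP_ATOMIC", "__get_free_page",
};

/* Returns 1 if any of the n tokens occurs in the source text. */
static int uses_any(const char *src, const char *const *tokens, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (strstr(src, tokens[i]) != NULL)
			return 1;
	return 0;
}

/* Returns the single header to include, or NULL if neither facility
 * is used. slab.h wins because it pulls in gfp.h itself. */
static const char *needed_include(const char *src)
{
	assert(src != NULL);
	if (uses_any(src, slab_tokens,
		     sizeof(slab_tokens) / sizeof(slab_tokens[0])))
		return "linux/slab.h";
	if (uses_any(src, gfp_tokens,
		     sizeof(gfp_tokens) / sizeof(gfp_tokens[0])))
		return "linux/gfp.h";
	return NULL;
}
```

A plain substring search like this would misfire on comments and strings, which is one reason a sketch of this kind still needs the manual review steps the commit message lists.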
  16. 01 Mar, 2010 3 commits
  17. 27 Feb, 2010 3 commits
  18. 25 Jan, 2010 1 commit
  19. 15 Jan, 2010 1 commit
    • vhost_net: a kernel-level virtio server · 3a4d5c94
      Committed by Michael S. Tsirkin
      What it is: vhost net is a character device that can be used to reduce
      the number of system calls involved in virtio networking.
      Existing virtio net code is used in the guest without modification.
      
      There's similarity with vringfd, with some differences and reduced scope
      - uses eventfd for signalling
      - structures can be moved around in memory at any time (good for
        migration, bug work-arounds in userspace)
      - write logging is supported (good for migration)
      - supports a memory table and not just an offset (needed for kvm)
      
      Common virtio-related code has been put in a separate file vhost.c and
      can be made into a separate module if/when more backends appear.  I used
      Rusty's lguest.c as the source for developing this part: this supplied
      me with witty comments I wouldn't be able to write myself.
      
      What it is not: vhost net is not a bus, and not a generic new system
      call. No assumptions are made on how guest performs hypercalls.
      Userspace hypervisors are supported as well as kvm.
      
      How it works: Basically, we connect virtio frontend (configured by
      userspace) to a backend. The backend could be a network device, or a tap
      device.  Backend is also configured by userspace, including vlan/mac
      etc.
      
      Status: This works for me, and I haven't seen any crashes.
      Compared to userspace, people reported improved latency (as I save up to
      4 system calls per packet), as well as better bandwidth and CPU
      utilization.
      
      Features that I plan to look at in the future:
      - mergeable buffers
      - zero copy
      - scalability tuning: figure out the best threading model to use
      
      Note on RCU usage (this is also documented in vhost.h, near
      private_pointer which is the value protected by this variant of RCU):
      what is happening is that the rcu_dereference() is being used in a
      workqueue item.  The role of rcu_read_lock() is taken on by the start of
      execution of the workqueue item, of rcu_read_unlock() by the end of
      execution of the workqueue item, and of synchronize_rcu() by
      flush_workqueue()/flush_work(). In the future we might need to apply
      some gcc attribute or sparse annotation to the function passed to
      INIT_WORK(). Paul's ack below is for this RCU usage.
      
      (Includes fixes by Alan Cox <alan@linux.intel.com>,
      David L Stevens <dlstevens@us.ibm.com>,
      Chris Wright <chrisw@redhat.com>)
      Acked-by: Rusty Russell <rusty@rustcorp.com.au>
      Acked-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  20. 07 Dec, 2009 1 commit
    • [S390] Improve address space mode selection. · b11b5334
      Committed by Martin Schwidefsky
      Introduce user_mode to replace the two variables switch_amode and
      s390_noexec. There are three valid combinations of the old values:
        1) switch_amode == 0 && s390_noexec == 0
        2) switch_amode == 1 && s390_noexec == 0
        3) switch_amode == 1 && s390_noexec == 1
      They get replaced by
        1) user_mode == HOME_SPACE_MODE
        2) user_mode == PRIMARY_SPACE_MODE
        3) user_mode == SECONDARY_SPACE_MODE
      The new kernel parameter user_mode=[primary,secondary,home] lets
      you choose the address space mode the user space processes should
      use. In addition the CONFIG_S390_SWITCH_AMODE config option
      is removed.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
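The mapping from the two old variables to the single new one, and the parsing of the `user_mode=` parameter value, can be sketched as follows. The enumerator names and the three accepted strings come from the commit message; the numeric enum values, the function names and the use of -1 for invalid input are assumptions made for this sketch.

```c
#include <assert.h>
#include <string.h>

/* Numeric values are arbitrary here; only the names come from the
 * commit message. */
enum user_mode {
	HOME_SPACE_MODE,
	PRIMARY_SPACE_MODE,
	SECONDARY_SPACE_MODE,
};

/* Map the old flag pair to the new value. The combination
 * switch_amode == 0 && s390_noexec == 1 was never valid, so it
 * yields -1. */
static int user_mode_from_flags(int switch_amode, int s390_noexec)
{
	if (!switch_amode)
		return s390_noexec ? -1 : HOME_SPACE_MODE;
	return s390_noexec ? SECONDARY_SPACE_MODE : PRIMARY_SPACE_MODE;
}

/* Parse the value given to user_mode=...; returns -1 if unknown. */
static int parse_user_mode(const char *arg)
{
	assert(arg != NULL);
	if (strcmp(arg, "primary") == 0)
		return PRIMARY_SPACE_MODE;
	if (strcmp(arg, "secondary") == 0)
		return SECONDARY_SPACE_MODE;
	if (strcmp(arg, "home") == 0)
		return HOME_SPACE_MODE;
	return -1;
}
```

Collapsing two flags into one enum removes the unrepresentable fourth combination, which is the main design benefit the commit message points at.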
  21. 03 Dec, 2009 4 commits
    • KVM: s390: Fix prefix register checking in arch/s390/kvm/sigp.c · f50146bd
      Committed by Carsten Otte
      This patch corrects the checking of the new address for the prefix register.
      On s390, the prefix register is used to address the cpu's lowcore (address
      0...8k). This check is supposed to verify that the memory is readable and
      present.
      copy_from_guest is a helper function that can be used to read from guest
      memory. It applies prefixing, adds the start address of the guest memory in
      user space, and then calls copy_from_user. Previous code was obviously broken
      for two reasons:
      - prefixing should not be applied here. The current prefix register is
        going to be updated soon, and the address we're looking for will be
        0..8k after we've updated the register
      - we're adding the guest origin (gmsor) twice: once in subject code
        and once in copy_from_guest
      
      With kuli, we did not hit this problem because (a) we were lucky with
      previous prefix register content, and (b) our guest memory was mmapped
      very low into user address space.
      
      Cc: stable@kernel.org
      Signed-off-by: Carsten Otte <cotte@de.ibm.com>
      Reported-by: Alexander Graf <agraf@suse.de>
      Signed-off-by: Avi Kivity <avi@redhat.com>
    • KVM: s390: Make psw available on all exits, not just a subset · d7b0b5eb
      Committed by Carsten Otte
      This patch moves the s390 processor status word into the base kvm_run
      struct and keeps it up to date on all userspace exits.
      
      This breaks the userspace ABI; however, there are no applications
      in the wild using it.  A capability check is provided so users can
      verify the updated API exists.
      
      Cc: stable@kernel.org
      Signed-off-by: Carsten Otte <cotte@de.ibm.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
    • KVM: Activate Virtualization On Demand · 10474ae8
      Committed by Alexander Graf
      X86 CPUs need to have some magic happening to enable the virtualization
      extensions on them. This magic can result in unpleasant results for
      users, like blocking other VMMs from working (vmx) or using invalid TLB
      entries (svm).
      
      Currently KVM activates virtualization when the respective kernel module
      is loaded. This blocks us from autoloading KVM modules without breaking
      other VMMs.
      
      To circumvent this problem at least a bit, this patch introduces
      on-demand activation of virtualization. This means that virtualization
      is instead enabled on creation of the first virtual machine and
      disabled on destruction of the last one.
      
      So using this, KVM can be easily autoloaded, while keeping other
      hypervisors usable.
      Signed-off-by: Alexander Graf <agraf@suse.de>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
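The "enable on first VM, disable on last VM" behaviour is essentially a usage counter. Below is a minimal single-threaded sketch in C. All names (`usage_count`, `hw_enabled`, `vm_created`, `vm_destroyed`, and the two stub functions) are hypothetical, and the locking and per-CPU handling that real kernel code would need are deliberately left out.

```c
#include <assert.h>

static int usage_count;  /* number of live virtual machines */
static int hw_enabled;   /* stand-in for the real hardware state */

/* Stubs standing in for turning the virtualization extensions on/off. */
static void hardware_enable(void)  { hw_enabled = 1; }
static void hardware_disable(void) { hw_enabled = 0; }

/* Called when a VM is created: only the first one enables hardware. */
static void vm_created(void)
{
	if (usage_count++ == 0)
		hardware_enable();
}

/* Called when a VM is destroyed: only the last one disables hardware. */
static void vm_destroyed(void)
{
	assert(usage_count > 0);
	if (--usage_count == 0)
		hardware_disable();
}
```

With this scheme, merely loading the module leaves `hw_enabled` at 0, which is what keeps other hypervisors usable until a virtual machine actually exists.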
    • KVM: Return -ENOTTY on unrecognized ioctls · 367e1319
      Committed by Avi Kivity
      Not the incorrect -EINVAL.
      Signed-off-by: Avi Kivity <avi@redhat.com>
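The convention this commit follows is that an ioctl handler reports an unknown command with -ENOTTY, keeping -EINVAL for recognized commands with bad arguments. A minimal sketch of such a dispatcher, with hypothetical command numbers and return values that are not taken from KVM:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical command numbers, for illustration only. */
enum {
	DEMO_GET_VERSION = 1,
	DEMO_CREATE_VM   = 2,
};

/* Dispatch a command; anything unrecognized falls through to -ENOTTY. */
static long demo_ioctl(unsigned int cmd)
{
	switch (cmd) {
	case DEMO_GET_VERSION:
		return 12;       /* arbitrary demo value */
	case DEMO_CREATE_VM:
		return 0;
	default:
		return -ENOTTY;  /* unrecognized command, not -EINVAL */
	}
}
```

Callers can then tell "this interface does not know the command" apart from "the command exists but its argument was rejected".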