1. 08 Mar, 2012 29 commits
  2. 05 Mar, 2012 11 commits
    • M
      KVM: x86: increase recommended max vcpus to 160 · a59cb29e
      Marcelo Tosatti committed
      Increase recommended max vcpus from 64 to 160 (tested internally
      at Red Hat).
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
      a59cb29e
    • I
      x86: Introduce x86_cpuinit.early_percpu_clock_init hook · df156f90
      Igor Mammedov committed
      When kvm guest uses kvmclock, it may hang on vcpu hot-plug.
      This is caused by an overflow in pvclock_get_nsec_offset,
      
          u64 delta = tsc - shadow->tsc_timestamp;
      
      which in turn is caused by undefined values from the percpu
      hv_clock that hasn't been initialized yet.
      The uninitialized clock on the CPU being booted is accessed from
         start_secondary
          -> smp_callin
            ->  smp_store_cpu_info
              -> identify_secondary_cpu
                -> mtrr_ap_init
                  -> mtrr_restore
                    -> stop_machine_from_inactive_cpu
                      -> queue_stop_cpus_work
                        ...
                          -> sched_clock
                            -> kvm_clock_read
      which is well before x86_cpuinit.setup_percpu_clockev call in
      start_secondary, where percpu clock is initialized.
      
      This patch introduces a hook that allows the per_cpu clock to be
      set up/initialized early, avoiding overflow due to reading
        - undefined values
        - old values if cpu was offlined and then onlined again
      
      Another possible early user of this clock source is ftrace that
      accesses it to get timestamps for ring buffer entries. So if
      mtrr_ap_init is moved from identify_secondary_cpu to past
      x86_cpuinit.setup_percpu_clockev in start_secondary, ftrace
      may cause the same overflow/hang on cpu hot-plug anyway.
      
      More complete description of the problem:
        https://lkml.org/lkml/2012/2/2/101
      
      Credits to Marcelo Tosatti <mtosatti@redhat.com> for hook idea.
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Igor Mammedov <imammedo@redhat.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
      df156f90
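The overflow described in this commit message can be illustrated with a minimal user-space sketch. It is not the kernel code: `struct shadow_time`, `nsec_offset_delta` and `early_clock_init` are hypothetical names introduced here, and the structure is reduced to the single field that matters. The sketch only shows why an unsigned subtraction against a stale or uninitialized timestamp produces a huge delta, and why refreshing the timestamp before the first read avoids it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the per-CPU clock shadow data. */
struct shadow_time {
    uint64_t tsc_timestamp; /* TSC value captured at the last clock update */
};

/*
 * Mirrors the subtraction quoted in the commit message. Both operands are
 * unsigned, so if tsc_timestamp is garbage (or left over from before the
 * CPU was offlined) and larger than tsc, the result wraps to a huge value.
 */
uint64_t nsec_offset_delta(uint64_t tsc, const struct shadow_time *shadow)
{
    assert(shadow != NULL);
    return tsc - shadow->tsc_timestamp;
}

/*
 * Illustrative model of an "early init" step: refresh the timestamp
 * before anything on the booting CPU reads the clock.
 */
void early_clock_init(struct shadow_time *shadow, uint64_t tsc_now)
{
    assert(shadow != NULL);
    shadow->tsc_timestamp = tsc_now;
}
```

With a stale timestamp of 1000 and a current TSC of 900, the delta wraps to just under 2^64; after `early_clock_init(&s, 900)` a read at TSC 950 gives 50.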
    • G
      KVM: x86: reset edge sense circuit of i8259 on init · 242ec97c
      Gleb Natapov committed
      The spec says that during initialization "The edge sense circuit is
      reset which means that following initialization an interrupt request
      (IR) input must make a low-to-high transition to generate an interrupt",
      but currently, if an edge-triggered interrupt is in the IRR, it is
      delivered after i8259 initialization.
      Signed-off-by: Gleb Natapov <gleb@redhat.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
      242ec97c
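The behaviour quoted from the spec can be sketched with a toy model of an interrupt controller. This is a simplification and not the KVM i8259 code: `struct pic_model`, `pic_set_line` and `pic_init_reset` are hypothetical names, and only the request register, the sampled line levels and an edge/level selector are modelled. The point is that after initialization a previously latched edge request is gone, and a line that is still high must go low and high again before it raises a new request.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model; field names are illustrative assumptions. */
struct pic_model {
    uint8_t irr;        /* latched interrupt requests, one bit per line */
    uint8_t last_level; /* last sampled level of each line */
    uint8_t elcr;       /* bit set = line is level-triggered */
};

/* Sample one input line and update the request register. */
void pic_set_line(struct pic_model *s, int irq, int level)
{
    assert(s != NULL && irq >= 0 && irq < 8);
    uint8_t mask = (uint8_t)(1u << irq);

    if (s->elcr & mask) {
        /* Level-triggered: the request simply follows the line. */
        if (level)
            s->irr |= mask;
        else
            s->irr &= (uint8_t)~mask;
    } else if (level) {
        /* Edge-triggered: latch only on a low-to-high transition. */
        if (!(s->last_level & mask))
            s->irr |= mask;
        s->last_level |= mask;
    } else {
        s->last_level &= (uint8_t)~mask;
    }
}

/* Model of initialization: drop latched edge-triggered requests. */
void pic_init_reset(struct pic_model *s)
{
    assert(s != NULL);
    s->irr &= s->elcr; /* keep only level-triggered requests */
}
```

In this model, raising line 3, resetting, and sampling line 3 high again leaves the request register empty; only a low followed by a high sets bit 3 again.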
    • A
      KVM: PPC: Add HPT preallocator · d2a1b483
      Alexander Graf committed
      We're currently allocating 16MB of linear memory on demand when creating
      a guest. That does work sometimes, but finding 16MB of linear memory
      available in the system at runtime is definitely not a given.
      
      So let's add another command line option similar to the RMA preallocator,
      that we can use to keep a pool of page tables around. Now, when a guest
      gets created it has a pretty low chance of receiving an OOM.
      Signed-off-by: Alexander Graf <agraf@suse.de>
      Signed-off-by: Avi Kivity <avi@redhat.com>
      d2a1b483
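The idea of keeping a pool of preallocated regions around can be sketched as a small fixed-size pool in user-space C. All names here (`struct linear_chunk`, `struct linear_pool`, `pool_take`, `pool_release`, `POOL_SLOTS`) are hypothetical and do not come from the patch. The sketch assumes the chunks were reserved once at startup, so creating a guest only has to claim a free entry rather than find contiguous memory at runtime.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define POOL_SLOTS 4 /* illustrative pool capacity */

/* One preallocated, physically contiguous region (model only). */
struct linear_chunk {
    void  *base;
    size_t size;
    bool   in_use;
};

struct linear_pool {
    struct linear_chunk chunks[POOL_SLOTS];
    size_t count; /* number of chunks actually reserved at startup */
};

/* Claim a free chunk, or return NULL when the pool is exhausted. */
struct linear_chunk *pool_take(struct linear_pool *pool)
{
    assert(pool != NULL && pool->count <= POOL_SLOTS);
    for (size_t i = 0; i < pool->count; i++) {
        if (!pool->chunks[i].in_use) {
            pool->chunks[i].in_use = true;
            return &pool->chunks[i];
        }
    }
    return NULL;
}

/* Return a chunk to the pool so a later guest can reuse it. */
void pool_release(struct linear_chunk *chunk)
{
    assert(chunk != NULL);
    chunk->in_use = false;
}
```

A caller would treat a `NULL` return as "no preallocated region left" and either fail guest creation or fall back to a runtime allocation; which of these the real code does is not stated in the commit message.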
    • A
      KVM: PPC: Initialize linears with zeros · b7f5d011
      Alexander Graf committed
      RMAs and HPT preallocated spaces should be zeroed, so we don't accidentally
      leak information from previous VM executions.
      Signed-off-by: Alexander Graf <agraf@suse.de>
      Acked-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Avi Kivity <avi@redhat.com>
      b7f5d011
    • A
      KVM: PPC: Convert RMA allocation into generic code · b4e70611
      Alexander Graf committed
      We have code to allocate big chunks of linear memory on bootup for later use.
      This code is currently used for RMA allocation, but can be useful
      beyond that.
      
      Make it generic so we can reuse it for other stuff later.
      Signed-off-by: Alexander Graf <agraf@suse.de>
      Acked-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Avi Kivity <avi@redhat.com>
      b4e70611
    • A
      KVM: PPC: E500: Fail init when not on e500v2 · 9cf7c0e4
      Alexander Graf committed
      When enabling the current KVM code on e500mc, I get the following oops:
      
          Oops: Exception in kernel mode, sig: 4 [#1]
          SMP NR_CPUS=8 P2041 RDB
          Modules linked in:
          NIP: c067df4c LR: c067df44 CTR: 00000000
          REGS: ee055ed0 TRAP: 0700   Not tainted  (3.2.0-10391-g36c5afe)
          MSR: 00029002 <CE,EE,ME>  CR: 24042022  XER: 00000000
          TASK = ee0429b0[1] 'swapper/0' THREAD: ee054000 CPU: 2
          GPR00: c067df44 ee055f80 ee0429b0 00000000 00000058 0000003f ee211600 60c6b864
          GPR08: 7cc903a6 0000002c 00000000 00000001 44042082 2d180088 00000000 00000000
          GPR16: c0000a00 00000014 3fffffff 03fe9000 00000015 7ff3be68 c06e0000 00000000
          GPR24: 00000000 00000000 00001720 c067df1c c06e0000 00000000 ee054000 c06ab51c
          NIP [c067df4c] kvmppc_e500_init+0x30/0xf8
          LR [c067df44] kvmppc_e500_init+0x28/0xf8
          Call Trace:
          [ee055f80] [c067df44] kvmppc_e500_init+0x28/0xf8 (unreliable)
          [ee055fb0] [c0001d30] do_one_initcall+0x50/0x1f0
          [ee055fe0] [c06721dc] kernel_init+0xa4/0x14c
          [ee055ff0] [c000e910] kernel_thread+0x4c/0x68
          Instruction dump:
          9421ffd0 7c0802a6 93410018 9361001c 90010034 93810020 93a10024 93c10028
          93e1002c 4bfffe7d 2c030000 408200a4 <7c1082a6> 90010008 7c1182a6 9001000c
          ---[ end trace b8ef4903fcbf9dd3 ]---
      
      Since it doesn't make sense to run the init function on any non-supported
      platform, we can just call our "is this platform supported?" function and
      bail out of init() if it's not.
      Signed-off-by: Alexander Graf <agraf@suse.de>
      Signed-off-by: Avi Kivity <avi@redhat.com>
      9cf7c0e4
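The "check support first, then bail out" pattern described above can be sketched as follows. The names `enum cpu_kind`, `check_supported` and `init_sketch` are hypothetical, and `-ENODEV` is used only as a user-space stand-in for whatever error code the real check returns. The flag `regs_accessed` models the platform-specific register accesses that caused the oops when run on an unsupported core.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Illustrative set of cores; only the first is treated as supported. */
enum cpu_kind { CPU_E500V2, CPU_E500MC, CPU_OTHER };

/* "Is this platform supported?" Returns 0 if so, a negative errno if not. */
int check_supported(enum cpu_kind kind)
{
    return kind == CPU_E500V2 ? 0 : -ENODEV;
}

/*
 * Init routine that refuses to run on unsupported hardware.
 * *regs_accessed is set only when the platform-specific setup is reached.
 */
int init_sketch(enum cpu_kind kind, int *regs_accessed)
{
    assert(regs_accessed != NULL);

    int r = check_supported(kind);
    if (r)
        return r; /* bail out before touching core-specific registers */

    *regs_accessed = 1; /* stands in for the real hardware setup */
    return 0;
}
```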
    • P
      KVM: Move gfn_to_memslot() to kvm_host.h · 9d4cba7f
      Paul Mackerras committed
      This moves __gfn_to_memslot() and search_memslots() from kvm_main.c to
      kvm_host.h to reduce the code duplication caused by the need for
      non-modular code in arch/powerpc/kvm/book3s_hv_rm_mmu.c to call
      gfn_to_memslot() in real mode.
      
      Rather than putting gfn_to_memslot() itself in a header, which would
      lead to increased code size, this puts __gfn_to_memslot() in a header.
      Then, the non-modular uses of gfn_to_memslot() are changed to call
      __gfn_to_memslot() instead.  This way there is only one place in the
      source code that needs to be changed should the gfn_to_memslot()
      implementation need to be modified.
      
      On powerpc, the Book3S HV style of KVM has code that is called from
      real mode which needs to call gfn_to_memslot() and thus needs this.
      (Module code is allocated in the vmalloc region, which can't be
      accessed in real mode.)
      
      With this, we can remove builtin_gfn_to_memslot() from book3s_hv_rm_mmu.c.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Acked-by: Avi Kivity <avi@redhat.com>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      Signed-off-by: Avi Kivity <avi@redhat.com>
      9d4cba7f
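The header-inline arrangement described above can be sketched in simplified form. The types below (`struct memslot`, `struct memslots`) are reduced stand-ins for the kernel structures, and the `_sketch` suffixes mark the functions as illustrative rather than the real `search_memslots()` and `__gfn_to_memslot()`. Because the helpers are `static inline`, every caller gets its own copy, so built-in code that cannot reach module text can still use the lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t gfn_t; /* guest frame number */

/* Reduced stand-ins for the kernel's memslot structures. */
struct memslot {
    gfn_t         base_gfn; /* first guest frame covered by the slot */
    unsigned long npages;   /* number of frames in the slot */
};

struct memslots {
    struct memslot slots[8];
    size_t         used;
};

/* Linear search for the slot containing gfn; NULL if none matches. */
static inline struct memslot *
search_memslots_sketch(struct memslots *s, gfn_t gfn)
{
    assert(s != NULL && s->used <= 8);
    for (size_t i = 0; i < s->used; i++) {
        struct memslot *m = &s->slots[i];
        if (gfn >= m->base_gfn && gfn < m->base_gfn + m->npages)
            return m;
    }
    return NULL;
}

/* Thin inline wrapper, so the lookup logic lives in exactly one place. */
static inline struct memslot *
gfn_to_memslot_inline_sketch(struct memslots *s, gfn_t gfn)
{
    return search_memslots_sketch(s, gfn);
}
```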
    • A
      KVM: x86 emulator: reject SYSENTER in compatibility mode on AMD guests · 1a18a69b
      Avi Kivity committed
      If the guest thinks it's an AMD, it will not have prepared the SYSENTER MSRs,
      and if the guest executes SYSENTER in compatibility mode, it will fail.
      
      Detect this condition and #UD instead, like the spec says.
      Signed-off-by: Avi Kivity <avi@redhat.com>
      1a18a69b
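The condition described in this commit can be sketched as a small decision function. It is a model, not the emulator code: `struct guest_cpu`, `enum emul_result` and `sysenter_check` are hypothetical names, and compatibility mode is modelled as "32-bit protected mode while long mode is active". Other checks a real SYSENTER emulation performs are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum cpu_mode { MODE_REAL, MODE_PROT32, MODE_PROT64 };

enum emul_result { EMUL_CONTINUE, EMUL_UD };

/* Reduced view of the guest state relevant to this one check. */
struct guest_cpu {
    enum cpu_mode mode;
    bool long_mode_active; /* long mode enabled and active */
    bool vendor_intel;     /* guest CPUID reports an Intel vendor string */
};

/*
 * Reject SYSENTER with #UD when a non-Intel guest runs it in
 * compatibility mode; otherwise let emulation proceed.
 */
enum emul_result sysenter_check(const struct guest_cpu *cpu)
{
    assert(cpu != NULL);

    bool compat_mode = cpu->mode == MODE_PROT32 && cpu->long_mode_active;
    if (compat_mode && !cpu->vendor_intel)
        return EMUL_UD;

    return EMUL_CONTINUE;
}
```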
    • J
      KVM: Don't mistreat edge-triggered INIT IPI as INIT de-assert. (LAPIC) · a52315e1
      Julian Stecklina committed
      If the guest programs an IPI with level=0 (de-assert) and trig_mode=0 (edge),
      it is erroneously treated as INIT de-assert and ignored, but to quote the
      spec: "For this delivery mode [INIT de-assert], the level flag must be set to
      0 and trigger mode flag to 1."
      Signed-off-by: Julian Stecklina <js@alien8.de>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
      a52315e1
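The rule quoted from the spec reduces to a two-input predicate, sketched below. `is_init_deassert` is a hypothetical helper name, and the encoding (0 = de-assert / edge, 1 = assert / level) follows the values given in the commit message. The bug described corresponds to testing only `level == 0`, which wrongly classifies an edge-triggered INIT as a de-assert.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * An INIT IPI is an "INIT level de-assert" only when the level flag is 0
 * AND the trigger mode flag is 1 (level). Any other combination should be
 * handled as a real INIT.
 */
bool is_init_deassert(int level, int trig_mode)
{
    assert((level == 0 || level == 1) && (trig_mode == 0 || trig_mode == 1));
    return level == 0 && trig_mode == 1;
}
```

For the case from the commit message, `is_init_deassert(0, 0)` is false, so the IPI is delivered as an INIT rather than ignored.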
    • M
      KVM: fix error handling for out of range irq · b93a3553
      Michael S. Tsirkin committed
      find_index_from_host_irq returns 0 on error
      but callers assume < 0 on error. This should
      not matter much: an out of range irq should never happen since
      irq handler was registered with this irq #,
      and even if it does we get a spurious msix irq in guest
      and typically nothing terrible happens.
      
      Still, better to make it consistent.
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
      b93a3553
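The inconsistency described above comes from 0 being a valid table index, so it cannot double as an error value. A minimal sketch, assuming a plain array of vectors in place of the real device structure (`find_index_sketch` and its parameters are hypothetical):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the index of irq in vectors[0..n), or -1 if it is not present.
 * A negative value is unambiguous, whereas 0 would be indistinguishable
 * from "found at the first entry".
 */
int find_index_sketch(const int *vectors, int n, int irq)
{
    assert(vectors != NULL && n >= 0);
    for (int i = 0; i < n; i++) {
        if (vectors[i] == irq)
            return i;
    }
    return -1;
}
```

Callers can then keep the `if (index < 0)` style of check the commit message says they already use, and an out-of-range irq is no longer mistaken for entry 0.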