1. 18 10月, 2012 2 次提交
  2. 17 10月, 2012 4 次提交
  3. 11 10月, 2012 2 次提交
    • C
      s390/kvm: dont announce RRBM support · 87cac8f8
      Christian Borntraeger 提交于
      Newer kernels (linux-next with the transparent huge page patches)
      use rrbm if the feature is announced via feature bit 66.
      RRBM will cause intercepts, so KVM does not handle it right now,
      causing an illegal instruction in the guest.
      The  easy solution is to disable the feature bit for the guest.
      
      This fixes bugs like:
      Kernel BUG at 0000000000124c2a [verbose debug info unavailable]
      illegal operation: 0001 [#1] SMP
      Modules linked in: virtio_balloon virtio_net ipv6 autofs4
      CPU: 0 Not tainted 3.5.4 #1
      Process fmempig (pid: 659, task: 000000007b712fd0, ksp: 000000007bed3670)
      Krnl PSW : 0704d00180000000 0000000000124c2a (pmdp_clear_flush_young+0x5e/0x80)
           R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 EA:3
           00000000003cc000 0000000000000004 0000000000000000 0000000079800000
           0000000000040000 0000000000000000 000000007bed3918 000000007cf40000
           0000000000000001 000003fff7f00000 000003d281a94000 000000007bed383c
           000000007bed3918 00000000005ecbf8 00000000002314a6 000000007bed36e0
       Krnl Code:>0000000000124c2a: b9810025          ogr     %r2,%r5
                 0000000000124c2e: 41343000           la      %r3,0(%r4,%r3)
                 0000000000124c32: a716fffa           brct    %r1,124c26
                 0000000000124c36: b9010022           lngr    %r2,%r2
                 0000000000124c3a: e3d0f0800004       lg      %r13,128(%r15)
                 0000000000124c40: eb22003f000c       srlg    %r2,%r2,63
      [ 2150.713198] Call Trace:
      [ 2150.713223] ([<00000000002312c4>] page_referenced_one+0x6c/0x27c)
      [ 2150.713749]  [<0000000000233812>] page_referenced+0x32a/0x410
      [...]
      
      CC: stable@vger.kernel.org
      CC: Alex Graf <agraf@suse.de>
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
      87cac8f8
    • J
      s390/kvm: Interrupt injection bugfix · 82a12737
      Jason J. Herne 提交于
      EXTERNAL_CALL and EMERGENCY type interrupts need to preserve their interrupt
      code parameter when being injected from user space.
      Signed-off-by: NJason J. Herne <jjherne@us.ibm.com>
      Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com>
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
      82a12737
  4. 09 10月, 2012 2 次提交
  5. 06 10月, 2012 30 次提交
    • J
      arch/powerpc/kvm/e500_tlb.c: fix error return code · 12ecd957
      Julia Lawall 提交于
      Convert a 0 error return code to a negative one, as returned elsewhere in the
      function.
      
      A new label is also added to avoid freeing things that are known to not yet
      be allocated.
      
      A simplified version of the semantic match that finds the first problem is as
      follows: (http://coccinelle.lip6.fr/)
      
      // <smpl>
      @@
      identifier ret;
      expression e,e1,e2,e3,e4,x;
      @@
      
      (
      if (\(ret != 0\|ret < 0\) || ...) { ... return ...; }
      |
      ret = 0
      )
      ... when != ret = e1
      *x = \(kmalloc\|kzalloc\|kcalloc\|devm_kzalloc\|ioremap\|ioremap_nocache\|devm_ioremap\|devm_ioremap_nocache\)(...);
      ... when != x = e2
          when != ret = e3
      *if (x == NULL || ...)
      {
        ... when != ret = e4
      *  return ret;
      }
      // </smpl>
      Signed-off-by: NJulia Lawall <julia@diku.dk>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      12ecd957
    • P
      KVM: PPC: Book3S HV: Provide a way for userspace to get/set per-vCPU areas · 55b665b0
      Paul Mackerras 提交于
      The PAPR paravirtualization interface lets guests register three
      different types of per-vCPU buffer areas in its memory for communication
      with the hypervisor.  These are called virtual processor areas (VPAs).
      Currently the hypercalls to register and unregister VPAs are handled
      by KVM in the kernel, and userspace has no way to know about or save
      and restore these registrations across a migration.
      
      This adds "register" codes for these three areas that userspace can
      use with the KVM_GET/SET_ONE_REG ioctls to see what addresses have
      been registered, and to register or unregister them.  This will be
      needed for guest hibernation and migration, and is also needed so
      that userspace can unregister them on reset (otherwise we corrupt
      guest memory after reboot by writing to the VPAs registered by the
      previous kernel).
      
      The "register" for the VPA is a 64-bit value containing the address,
      since the length of the VPA is fixed.  The "registers" for the SLB
      shadow buffer and dispatch trace log (DTL) are 128 bits long,
      consisting of the guest physical address in the high (first) 64 bits
      and the length in the low 64 bits.
      
      This also fixes a bug where we were calling init_vpa unconditionally,
      leading to an oops when unregistering the VPA.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      55b665b0
    • P
      KVM: PPC: Book3S: Get/set guest FP regs using the GET/SET_ONE_REG interface · a8bd19ef
      Paul Mackerras 提交于
      This enables userspace to get and set all the guest floating-point
      state using the KVM_[GS]ET_ONE_REG ioctls.  The floating-point state
      includes all of the traditional floating-point registers and the
      FPSCR (floating point status/control register), all the VMX/Altivec
      vector registers and the VSCR (vector status/control register), and
      on POWER7, the vector-scalar registers (note that each FP register
      is the high-order half of the corresponding VSR).
      
      Most of these are implemented in common Book 3S code, except for VSX
      on POWER7.  Because HV and PR differ in how they store the FP and VSX
      registers on POWER7, the code for these cases is not common.  On POWER7,
      the FP registers are the upper halves of the VSX registers vsr0 - vsr31.
      PR KVM stores vsr0 - vsr31 in two halves, with the upper halves in the
      arch.fpr[] array and the lower halves in the arch.vsr[] array, whereas
      HV KVM on POWER7 stores the whole VSX register in arch.vsr[].
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      [agraf: fix whitespace, vsx compilation]
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      a8bd19ef
    • P
      KVM: PPC: Book3S: Get/set guest SPRs using the GET/SET_ONE_REG interface · a136a8bd
      Paul Mackerras 提交于
      This enables userspace to get and set various SPRs (special-purpose
      registers) using the KVM_[GS]ET_ONE_REG ioctls.  With this, userspace
      can get and set all the SPRs that are part of the guest state, either
      through the KVM_[GS]ET_REGS ioctls, the KVM_[GS]ET_SREGS ioctls, or
      the KVM_[GS]ET_ONE_REG ioctls.
      
      The SPRs that are added here are:
      
      - DABR:  Data address breakpoint register
      - DSCR:  Data stream control register
      - PURR:  Processor utilization of resources register
      - SPURR: Scaled PURR
      - DAR:   Data address register
      - DSISR: Data storage interrupt status register
      - AMR:   Authority mask register
      - UAMOR: User authority mask override register
      - MMCR0, MMCR1, MMCRA: Performance monitor unit control registers
      - PMC1..PMC8: Performance monitor unit counter registers
      
      In order to reduce code duplication between PR and HV KVM code, this
      moves the kvm_vcpu_ioctl_[gs]et_one_reg functions into book3s.c and
      centralizes the copying between user and kernel space there.  The
      registers that are handled differently between PR and HV, and those
      that exist only in one flavor, are handled in kvmppc_[gs]et_one_reg()
      functions that are specific to each flavor.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      [agraf: minimal style fixes]
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      a136a8bd
    • S
      KVM: PPC: set IN_GUEST_MODE before checking requests · 5bd1cf11
      Scott Wood 提交于
      Avoid a race as described in the code comment.
      
      Also remove a related smp_wmb() from booke's kvmppc_prepare_to_enter().
      I can't see any reason for it, and the book3s_pr version doesn't have it.
      Signed-off-by: NScott Wood <scottwood@freescale.com>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      5bd1cf11
    • S
      KVM: PPC: e500: MMU API: fix leak of shared_tlb_pages · adbb48a8
      Scott Wood 提交于
      This was found by kmemleak.
      Signed-off-by: NScott Wood <scottwood@freescale.com>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      adbb48a8
    • S
      KVM: PPC: e500: fix allocation size error on g2h_tlb1_map · e400e72f
      Scott Wood 提交于
      We were only allocating half the bytes we need, which was made more
      obvious by a recent fix to the memset in  clear_tlb1_bitmap().
      Signed-off-by: NScott Wood <scottwood@freescale.com>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      Cc: stable@vger.kernel.org
      e400e72f
    • P
      KVM: PPC: Book3S HV: Fix calculation of guest phys address for MMIO emulation · 70bddfef
      Paul Mackerras 提交于
      In the case where the host kernel is using a 64kB base page size and
      the guest uses a 4k HPTE (hashed page table entry) to map an emulated
      MMIO device, we were calculating the guest physical address wrongly.
      We were calculating a gfn as the guest physical address shifted right
      16 bits (PAGE_SHIFT) but then only adding back in 12 bits from the
      effective address, since the HPTE had a 4k page size.  Thus the gpa
      reported to userspace was missing 4 bits.
      
      Instead, we now compute the guest physical address from the HPTE
      without reference to the host page size, and then compute the gfn
      by shifting the gpa right PAGE_SHIFT bits.
      Reported-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      70bddfef
    • P
      KVM: PPC: Book3S HV: Remove bogus update of physical thread IDs · 964ee98c
      Paul Mackerras 提交于
      When making a vcpu non-runnable we incorrectly changed the
      thread IDs of all other threads on the core, just remove that
      code.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      964ee98c
    • P
      KVM: PPC: Book3S HV: Fix updates of vcpu->cpu · a47d72f3
      Paul Mackerras 提交于
      This removes the powerpc "generic" updates of vcpu->cpu in load and
      put, and moves them to the various backends.
      
      The reason is that "HV" KVM does its own sauce with that field
      and the generic updates might corrupt it. The field contains the
      CPU# of the -first- HW CPU of the core always for all the VCPU
      threads of a core (the one that's online from a host Linux
      perspective).
      
      However, the preempt notifiers are going to be called on the
      threads VCPUs when they are running (due to them sleeping on our
      private waitqueue) causing unload to be called, potentially
      clobbering the value.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      a47d72f3
    • P
      KVM: PPC: Book3S HV: Handle memory slot deletion and modification correctly · dfe49dbd
      Paul Mackerras 提交于
      This adds an implementation of kvm_arch_flush_shadow_memslot for
      Book3S HV, and arranges for kvmppc_core_commit_memory_region to
      flush the dirty log when modifying an existing slot.  With this,
      we can handle deletion and modification of memory slots.
      
      kvm_arch_flush_shadow_memslot calls kvmppc_core_flush_memslot, which
      on Book3S HV now traverses the reverse map chains to remove any HPT
      (hashed page table) entries referring to pages in the memslot.  This
      gets called by generic code whenever deleting a memslot or changing
      the guest physical address for a memslot.
      
      We flush the dirty log in kvmppc_core_commit_memory_region for
      consistency with what x86 does.  We only need to flush when an
      existing memslot is being modified, because for a new memslot the
      rmap array (which stores the dirty bits) is all zero, meaning that
      every page is considered clean already, and when deleting a memslot
      we obviously don't care about the dirty bits any more.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      dfe49dbd
    • P
      KVM: PPC: Move kvm->arch.slot_phys into memslot.arch · a66b48c3
      Paul Mackerras 提交于
      Now that we have an architecture-specific field in the kvm_memory_slot
      structure, we can use it to store the array of page physical addresses
      that we need for Book3S HV KVM on PPC970 processors.  This reduces the
      size of struct kvm_arch for Book3S HV, and also reduces the size of
      struct kvm_arch_memory_slot for other PPC KVM variants since the fields
      in it are now only compiled in for Book3S HV.
      
      This necessitates making the kvm_arch_create_memslot and
      kvm_arch_free_memslot operations specific to each PPC KVM variant.
      That in turn means that we now don't allocate the rmap arrays on
      Book3S PR and Book E.
      
      Since we now unpin pages and free the slot_phys array in
      kvmppc_core_free_memslot, we no longer need to do it in
      kvmppc_core_destroy_vm, since the generic code takes care to free
      all the memslots when destroying a VM.
      
      We now need the new memslot to be passed in to
      kvmppc_core_prepare_memory_region, since we need to initialize its
      arch.slot_phys member on Book3S HV.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      a66b48c3
    • P
      KVM: PPC: Book3S HV: Take the SRCU read lock before looking up memslots · 2c9097e4
      Paul Mackerras 提交于
      The generic KVM code uses SRCU (sleeping RCU) to protect accesses
      to the memslots data structures against updates due to userspace
      adding, modifying or removing memory slots.  We need to do that too,
      both to avoid accessing stale copies of the memslots and to avoid
      lockdep warnings.  This therefore adds srcu_read_lock/unlock pairs
      around code that accesses and uses memslots.
      
      Since the real-mode handlers for H_ENTER, H_REMOVE and H_BULK_REMOVE
      need to access the memslots, and we don't want to call the SRCU code
      in real mode (since we have no assurance that it would only access
      the linear mapping), we hold the SRCU read lock for the VM while
      in the guest.  This does mean that adding or removing memory slots
      while some vcpus are executing in the guest will block for up to
      two jiffies.  This tradeoff is acceptable since adding/removing
      memory slots only happens rarely, while H_ENTER/H_REMOVE/H_BULK_REMOVE
      are performance-critical hot paths.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      2c9097e4
    • M
      KVM: PPC: bookehv: Allow duplicate calls of DO_KVM macro · d61966fc
      Mihai Caraman 提交于
      The current form of DO_KVM macro restricts its use to one call per input
      parameter set. This is caused by kvmppc_resume_\intno\()_\srr1 symbol
      definition.
      Duplicate calls of DO_KVM are required by distinct implementations of
      exeption handlers which are delegated at runtime. Use a rare label number
      to avoid conflicts with the calling contexts.
      Signed-off-by: NMihai Caraman <mihai.caraman@freescale.com>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      d61966fc
    • A
      KVM: PPC: BookE: Support FPU on non-hv systems · 7a08c274
      Alexander Graf 提交于
      When running on HV aware hosts, we can not trap when the guest sets the FP
      bit, so we just let it do so when it wants to, because it has full access to
      MSR.
      
      For non-HV aware hosts with an FPU (like 440), we need to also adjust the
      shadow MSR though. Otherwise the guest gets an FP unavailable trap even when
      it really enabled the FP bit in MSR.
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      7a08c274
    • A
      KVM: PPC: 440: Implement mfdcrx · ceb985f9
      Alexander Graf 提交于
      We need mfdcrx to execute properly on 460 cores.
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      ceb985f9
    • A
      KVM: PPC: 440: Implement mtdcrx · e4dcfe88
      Alexander Graf 提交于
      We need mtdcrx to execute properly on 460 cores.
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      e4dcfe88
    • A
      KVM: PPC: E500: Remove E500_TLB_DIRTY flag · 430c7ff5
      Alexander Graf 提交于
      Since we always mark pages as dirty immediately when mapping them read/write
      now, there's no need for the dirty flag in our cache.
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      430c7ff5
    • A
      KVM: PPC: Use symbols for exit trace · 166a2b70
      Alexander Graf 提交于
      Exit traces are a lot easier to read when you don't have to remember
      cryptic numbers for guest exit reasons. Symbolify them in our trace
      output.
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      166a2b70
    • A
      KVM: PPC: BookE: Add MCSR SPR support · 50c871ed
      Alexander Graf 提交于
      Add support for the MCSR SPR. This only implements the SPR storage
      bits, not actual machine checks.
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      50c871ed
    • A
      KVM: PPC: 44x: Initialize PVR · 491dd5b8
      Alexander Graf 提交于
      We need to make sure that vcpu->arch.pvr is initialized to a sane value,
      so let's just take the host PVR.
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      491dd5b8
    • B
      booke: Added ONE_REG interface for IAC/DAC debug registers · 6df8d3fc
      Bharat Bhushan 提交于
      IAC/DAC are defined as 32 bit while they are 64 bit wide. So ONE_REG
      interface is added to set/get them.
      Signed-off-by: NBharat Bhushan <bharat.bhushan@freescale.com>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      6df8d3fc
    • B
      KVM: PPC: booke: Add watchdog emulation · f61c94bb
      Bharat Bhushan 提交于
      This patch adds the watchdog emulation in KVM. The watchdog
      emulation is enabled by KVM_ENABLE_CAP(KVM_CAP_PPC_BOOKE_WATCHDOG) ioctl.
      The kernel timer are used for watchdog emulation and emulates
      h/w watchdog state machine. On watchdog timer expiry, it exit to QEMU
      if TCR.WRC is non ZERO. QEMU can reset/shutdown etc depending upon how
      it is configured.
      Signed-off-by: NLiu Yu <yu.liu@freescale.com>
      Signed-off-by: NScott Wood <scottwood@freescale.com>
      [bharat.bhushan@freescale.com: reworked patch]
      Signed-off-by: NBharat Bhushan <bharat.bhushan@freescale.com>
      [agraf: adjust to new request framework]
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      f61c94bb
    • A
      KVM: PPC: Add return value to core_check_requests · 7c973a2e
      Alexander Graf 提交于
      Requests may want to tell us that we need to go back into host state,
      so add a return value for the checks.
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      7c973a2e
    • A
      KVM: PPC: Add return value in prepare_to_enter · 7ee78855
      Alexander Graf 提交于
      Our prepare_to_enter helper wants to be able to return in more circumstances
      to the host than only when an interrupt is pending. Broaden the interface a
      bit and move even more generic code to the generic helper.
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      7ee78855
    • A
      KVM: PPC: Ignore EXITING_GUEST_MODE mode · 206c2ed7
      Alexander Graf 提交于
      We don't need to do anything when mode is EXITING_GUEST_MODE, because
      we essentially are outside of guest mode and did everything it asked
      us to do by the time we check it.
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      206c2ed7
    • A
      KVM: PPC: Move kvm_guest_enter call into generic code · 3766a4c6
      Alexander Graf 提交于
      We need to call kvm_guest_enter in booke and book3s, so move its
      call to generic code.
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      3766a4c6
    • A
      KVM: PPC: Book3S: PR: Rework irq disabling · bd2be683
      Alexander Graf 提交于
      Today, we disable preemption while inside guest context, because we need
      to expose to the world that we are not in a preemptible context. However,
      during that time we already have interrupts disabled, which would indicate
      that we are in a non-preemptible context.
      
      The reason the checks for irqs_disabled() fail for us though is that we
      manually control hard IRQs and ignore all the lazy EE framework. Let's
      stop doing that. Instead, let's always use lazy EE to indicate when we
      want to disable IRQs, but do a special final switch that gets us into
      EE disabled, but soft enabled state. That way when we get back out of
      guest state, we are immediately ready to process interrupts.
      
      This simplifies the code drastically and reduces the time that we appear
      as preempt disabled.
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      bd2be683
    • A
      KVM: PPC: Consistentify vcpu exit path · 24afa37b
      Alexander Graf 提交于
      When getting out of __vcpu_run, let's be consistent about the state we
      return in. We want to always
      
        * have IRQs enabled
        * have called kvm_guest_exit before
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      24afa37b
    • A
      KVM: PPC: Book3S: PR: Indicate we're out of guest mode · 0652eaae
      Alexander Graf 提交于
      When going out of guest mode, indicate that we are in vcpu->mode. That way
      requests from other CPUs don't needlessly need to kick us to process them,
      because it'll just happen next time we enter the guest.
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      0652eaae