1. 27 Jan, 2014 · 16 commits
    • KVM: PPC: Book3S HV: Basic little-endian guest support · d682916a
      Committed by Anton Blanchard
      We create a guest MSR from scratch when delivering exceptions in
      a few places.  Instead of extracting LPCR[ILE] and inserting it
      into MSR_LE each time, we simply create a new variable intr_msr which
      contains the entire MSR to use.  For a little-endian guest, userspace
      needs to set the ILE (interrupt little-endian) bit in the LPCR for
      each vcpu (or at least one vcpu in each virtual core).
      
      [paulus@samba.org - removed H_SET_MODE implementation from original
      version of the patch, and made kvmppc_set_lpcr update vcpu->arch.intr_msr.]
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
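Illustrative sketch, not taken from the patch: one way the cached interrupt MSR described above could be derived from the LPCR. The helper name `compute_intr_msr`, the base value `MSR_SF | MSR_ME`, and all bit positions are assumptions made for this example; check them against the kernel headers before relying on them.

```c
#include <stdint.h>

/* Bit positions below are assumptions for illustration only. */
#define MSR_SF   (1ULL << 63)  /* 64-bit mode */
#define MSR_ME   (1ULL << 12)  /* machine check enable */
#define MSR_LE   (1ULL << 0)   /* little-endian mode */
#define LPCR_ILE (1ULL << 25)  /* interrupt little-endian */

/*
 * Hypothetical helper: build the whole MSR used for interrupt delivery
 * once, whenever the LPCR changes, instead of extracting LPCR[ILE] and
 * inserting it into MSR_LE at every delivery site.
 */
static uint64_t compute_intr_msr(uint64_t lpcr)
{
    uint64_t intr_msr = MSR_SF | MSR_ME;  /* assumed base value */

    if (lpcr & LPCR_ILE)
        intr_msr |= MSR_LE;
    return intr_msr;
}
```

With a precomputed value like this, each exception-delivery path only has to load one field.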
    • KVM: PPC: Book3S HV: Add support for DABRX register on POWER7 · 8563bf52
      Committed by Paul Mackerras
      The DABRX (DABR extension) register on POWER7 processors provides finer
      control over which accesses cause a data breakpoint interrupt.  It
      contains 3 bits which indicate whether to enable accesses in user,
      kernel and hypervisor modes respectively to cause data breakpoint
      interrupts, plus one bit that enables both real mode and virtual mode
      accesses to cause interrupts.  Currently, KVM sets DABRX to allow
      both kernel and user accesses to cause interrupts while in the guest.
      
      This adds support for the guest to specify other values for DABRX.
      PAPR defines a H_SET_XDABR hcall to allow the guest to set both DABR
      and DABRX with one call.  This adds a real-mode implementation of
      H_SET_XDABR, which shares most of its code with the existing H_SET_DABR
      implementation.  To support this, we add a per-vcpu field to store the
      DABRX value plus code to get and set it via the ONE_REG interface.
      
      For Linux guests to use this new hcall, userspace needs to add
      "hcall-xdabr" to the set of strings in the /chosen/hypertas-functions
      property in the device tree.  If userspace does this and then migrates
      the guest to a host where the kernel doesn't include this patch, then
      userspace will need to implement H_SET_XDABR by writing the specified
      DABR value to the DABR using the ONE_REG interface.  In that case, the
      old kernel will set DABRX to DABRX_USER | DABRX_KERNEL.  That should
      still work correctly, at least for Linux guests, since Linux guests
      cope with getting data breakpoint interrupts in modes that weren't
      requested by just ignoring the interrupt, and Linux guests never set
      DABRX_BTI.
      
      The other thing this does is to make H_SET_DABR and H_SET_XDABR work
      on POWER8, which has the DAWR and DAWRX instead of DABR/X.  Guests that
      know about POWER8 should use H_SET_MODE rather than H_SET_[X]DABR, but
      guests running in POWER7 compatibility mode will still use H_SET_[X]DABR.
      For them, this adds the logic to convert DABR/X values into DAWR/X values
      on POWER8.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
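Illustrative sketch, not taken from the patch: the DABRX mode bits named above, the value an older host kernel would apply, and a guest-side check that ignores a breakpoint taken in a mode that was not requested. The bit positions and the helper names `old_kernel_dabrx` and `breakpoint_wanted` are assumptions for this example.

```c
#include <stdint.h>

/* Bit positions are assumed for illustration only. */
#define DABRX_USER   (1u << 0)  /* user-mode accesses cause interrupts */
#define DABRX_KERNEL (1u << 1)  /* kernel-mode accesses cause interrupts */
#define DABRX_HYP    (1u << 2)  /* hypervisor-mode accesses cause interrupts */
#define DABRX_BTI    (1u << 3)  /* both real-mode and virtual-mode accesses */

/* Value an older host kernel applies regardless of the guest's request. */
static uint32_t old_kernel_dabrx(void)
{
    return DABRX_USER | DABRX_KERNEL;
}

/*
 * Hypothetical guest-side filter: a breakpoint interrupt is acted on
 * only if the mode it was taken in was actually requested.
 */
static int breakpoint_wanted(uint32_t requested_dabrx, int in_kernel_mode)
{
    uint32_t mode_bit = in_kernel_mode ? DABRX_KERNEL : DABRX_USER;

    return (requested_dabrx & mode_bit) != 0;
}
```

A guest that asked only for user-mode breakpoints would therefore drop a kernel-mode hit delivered by the older host.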
    • KVM: PPC: Book3S HV: Prepare for host using hypervisor doorbells · 5d00f66b
      Committed by Paul Mackerras
      POWER8 has support for hypervisor doorbell interrupts.  Though the
      kernel doesn't use them for IPIs on the powernv platform yet, it
      probably will in future, so this makes KVM cope gracefully if a
      hypervisor doorbell interrupt arrives while in a guest.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Book3S HV: Handle new LPCR bits on POWER8 · e0622bd9
      Committed by Paul Mackerras
      POWER8 has a bit in the LPCR to enable or disable the PURR and SPURR
      registers to count when in the guest.  Set this bit.
      
      POWER8 has a field in the LPCR called AIL (Alternate Interrupt Location)
      which is used to enable relocation-on interrupts.  Allow userspace to
      set this field.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Book3S HV: Handle guest using doorbells for IPIs · aa31e843
      Committed by Paul Mackerras
      * SRR1 wake reason field for system reset interrupt on wakeup from nap
        is now a 4-bit field on P8, compared to 3 bits on P7.
      
      * Set PECEDP in LPCR when napping because of H_CEDE so guest doorbells
        will wake us up.
      
      * Waking up from nap because of a guest doorbell interrupt is not a
        reason to exit the guest.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
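Illustrative sketch, not taken from the patch: extracting the SRR1 wake reason with a field width that depends on the processor, 4 bits on P8 versus 3 bits on P7. The shift value and the helper name `wake_reason` are assumptions for this example; the real code does this in assembly.

```c
#include <stdint.h>

/* Assumed position of the low bit of the wake reason field. */
#define SRR1_WAKE_SHIFT 18

/* Hypothetical helper: return the wake reason field from SRR1. */
static unsigned int wake_reason(uint64_t srr1, int is_power8)
{
    uint64_t mask = is_power8 ? 0xf : 0x7;  /* 4-bit vs 3-bit field */

    return (unsigned int)((srr1 >> SRR1_WAKE_SHIFT) & mask);
}
```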
    • KVM: PPC: Book3S HV: Consolidate code that checks reason for wake from nap · e3bbbbfa
      Committed by Paul Mackerras
      Currently in book3s_hv_rmhandlers.S we have three places where we
      have woken up from nap mode and we check the reason field in SRR1
      to see what event woke us up.  This consolidates them into a new
      function, kvmppc_check_wake_reason.  It looks at the wake reason
      field in SRR1, and if it indicates that an external interrupt caused
      the wakeup, calls kvmppc_read_intr to check what sort of interrupt
      it was.
      
      This also consolidates the two places where we synthesize an external
      interrupt (0x500 vector) for the guest.  Now, if the guest exit code
      finds that there was an external interrupt which has been handled
      (i.e. it was an IPI indicating that there is now an interrupt pending
      for the guest), it jumps to deliver_guest_interrupt, which is in the
      last part of the guest entry code, where we synthesize guest external
      and decrementer interrupts.  That code has been streamlined a little
      and now clears LPCR[MER] when appropriate as well as setting it.
      
      The extra clearing of any pending IPI on a secondary, offline CPU
      thread before going back to nap mode has been removed.  It is no longer
      necessary now that we have code to read and acknowledge IPIs in the
      guest exit path.
      
      This fixes a minor bug in the H_CEDE real-mode handling - previously,
      if we found that other threads were already exiting the guest when we
      were about to go to nap mode, we would branch to the cede wakeup path
      and end up looking in SRR1 for a wakeup reason.  Now we branch to a
      point after we have checked the wakeup reason.
      
      This also fixes a minor bug in kvmppc_read_intr - previously it could
      return 0xff rather than 1, in the case where we find that a host IPI
      is pending after we have cleared the IPI.  Now it returns 1.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Book3S HV: Implement architecture compatibility modes for POWER8 · 5557ae0e
      Committed by Paul Mackerras
      This allows us to select architecture 2.05 (POWER6) or 2.06 (POWER7)
      compatibility modes on a POWER8 processor.  (Note that transactional
      memory is disabled for usermode if either or both of the PCR_TM_DIS
      and PCR_ARCH_206 bits are set.)
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Book3S HV: Add handler for HV facility unavailable · bd3048b8
      Committed by Michael Ellerman
      At present this should never happen, since the host kernel sets
      HFSCR to allow access to all facilities.  It's better to be prepared
      to handle it cleanly if it does ever happen, though.
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Book3S HV: Flush the correct number of TLB sets on POWER8 · ca252055
      Committed by Paul Mackerras
      POWER8 has 512 sets in the TLB, compared to 128 for POWER7, so we need
      to do more tlbiel instructions when flushing the TLB on POWER8.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
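Illustrative sketch, not taken from the patch: a flush loop whose iteration count follows the number of TLB sets stated above (512 on POWER8, 128 on POWER7). `tlbiel_set` is a hypothetical stand-in that only counts calls; the real code issues one tlbiel instruction per set from assembly.

```c
#define POWER7_TLB_SETS 128
#define POWER8_TLB_SETS 512

/* Hypothetical stand-in for one tlbiel on a given set; it only counts calls. */
static unsigned int tlbiel_count;

static void tlbiel_set(unsigned int set)
{
    (void)set;
    tlbiel_count++;
}

/* Flush every TLB set, using the set count of the running processor. */
static void flush_guest_tlb(int is_power8)
{
    unsigned int sets = is_power8 ? POWER8_TLB_SETS : POWER7_TLB_SETS;

    for (unsigned int set = 0; set < sets; set++)
        tlbiel_set(set);
}
```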
    • KVM: PPC: Book3S HV: Context-switch new POWER8 SPRs · b005255e
      Committed by Michael Neuling
      This adds fields to the struct kvm_vcpu_arch to store the new
      guest-accessible SPRs on POWER8, adds code to the get/set_one_reg
      functions to allow userspace to access this state, and adds code to
      the guest entry and exit to context-switch these SPRs between host
      and guest.
      
      Note that DPDES (Directed Privileged Doorbell Exception State) is
      shared between threads on a core; hence we store it in struct
      kvmppc_vcore and have the master thread save and restore it.
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Book3S HV: Align physical and virtual CPU thread numbers · e0b7ec05
      Committed by Paul Mackerras
      On a threaded processor such as POWER7, we group VCPUs into virtual
      cores and arrange that the VCPUs in a virtual core run on the same
      physical core.  Currently we don't enforce any correspondence between
      virtual thread numbers within a virtual core and physical thread
      numbers.  Physical threads are allocated starting at 0 on a first-come
      first-served basis to runnable virtual threads (VCPUs).
      
      POWER8 implements a new "msgsndp" instruction which guest kernels can
      use to interrupt other threads in the same core or sub-core.  Since
      the instruction takes the destination physical thread ID as a parameter,
      it becomes necessary to align the physical thread IDs with the virtual
      thread IDs, that is, to make sure virtual thread N within a virtual
      core always runs on physical thread N.
      
      This means that it's possible that thread 0, which is where we call
      __kvmppc_vcore_entry, may end up running some other vcpu than the
      one whose task called kvmppc_run_core(), or it may end up running
      no vcpu at all, if for example thread 0 of the virtual core is
      currently executing in userspace.  However, we do need thread 0
      to be responsible for switching the MMU -- a previous version of
      this patch that had other threads switching the MMU was found to
      be responsible for occasional memory corruption and machine check
      interrupts in the guest on POWER7 machines.
      
      To accommodate this, we no longer pass the vcpu pointer to
      __kvmppc_vcore_entry, but instead let the assembly code load it from
      the PACA.  Since the assembly code will need to know the kvm pointer
      and the thread ID for threads which don't have a vcpu, we move the
      thread ID into the PACA and we add a kvm pointer to the virtual core
      structure.
      
      In the case where thread 0 has no vcpu to run, it still calls into
      kvmppc_hv_entry in order to do the MMU switch, and then naps until
      either its vcpu is ready to run in the guest, or some other thread
      needs to exit the guest.  In the latter case, thread 0 jumps to the
      code that switches the MMU back to the host.  This control flow means
      that now we switch the MMU before loading any guest vcpu state.
      Similarly, on guest exit we now save all the guest vcpu state before
      switching the MMU back to the host.  This has required substantial
      code movement, making the diff rather large.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
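Illustrative sketch, not taken from the patch: the alignment rule "virtual thread N within a virtual core always runs on physical thread N", expressed as a fixed mapping instead of first-come first-served allocation. The struct `vcore_sketch`, the helper `vcpu_ptid`, and the thread count are assumptions for this example.

```c
/* Assumed number of hardware threads per core for this sketch. */
#define THREADS_PER_CORE 8

/* Hypothetical, minimal stand-in for the virtual core structure. */
struct vcore_sketch {
    int first_vcpuid;  /* id of the vcpu that is virtual thread 0 */
};

/*
 * Return the physical thread a vcpu must run on, or -1 if the vcpu
 * does not belong to this virtual core.
 */
static int vcpu_ptid(const struct vcore_sketch *vc, int vcpu_id)
{
    int ptid = vcpu_id - vc->first_vcpuid;

    if (ptid < 0 || ptid >= THREADS_PER_CORE)
        return -1;
    return ptid;
}
```

Because the mapping is fixed, physical thread 0 may have no runnable vcpu at all, which is the case the message describes for the MMU switch.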
    • KVM: PPC: Book3S HV: Don't set DABR on POWER8 · eee7ff9d
      Committed by Michael Neuling
      POWER8 doesn't have the DABR and DABRX registers; instead it has
      new DAWR/DAWRX registers, which will be handled in a later patch.
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • kvm/ppc: IRQ disabling cleanup · 6c85f52b
      Committed by Scott Wood
      Simplify the handling of lazy EE by going directly from fully-enabled
      to hard-disabled.  This replaces the lazy_irq_pending() check
      (including its misplaced kvm_guest_exit() call).
      
      As suggested by Tiejun Chen, move the interrupt disabling into
      kvmppc_prepare_to_enter() rather than have each caller do it.  Also
      move the IRQ enabling on heavyweight exit into
      kvmppc_prepare_to_enter().
      Signed-off-by: Scott Wood <scottwood@freescale.com>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: e500: Fix bad address type in deliver_tlb_misss() · 70713fe3
      Committed by Mihai Caraman
      Use gva_t instead of unsigned int for eaddr in deliver_tlb_miss().
      Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
      CC: stable@vger.kernel.org
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Book3S HV: use xics_wake_cpu only when defined · 48eaef05
      Committed by Andreas Schwab
      Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
      CC: stable@vger.kernel.org
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Book3S: MMIO emulation support for little endian guests · 73601775
      Committed by Cédric Le Goater
      MMIO emulation reads the last instruction executed by the guest
      and then emulates it. If the guest is running in little-endian order,
      or more generally in a different endian order from the host, the
      instruction needs to be byte-swapped before being emulated.
      
      This patch adds a helper routine which tests the endian order of
      the host and the guest in order to decide whether a byteswap is
      needed or not. It is then used to byteswap the last instruction
      of the guest in the endian order of the host before MMIO emulation
      is performed.
      
      Finally, kvmppc_handle_load() and kvmppc_handle_store() are modified
      to reverse the endianness of the MMIO if required.
      Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
      [agraf: add booke handling]
      Signed-off-by: Alexander Graf <agraf@suse.de>
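Illustrative sketch, not taken from the patch: deciding whether the guest's last instruction must be byte-swapped before emulation by comparing guest and host endianness. The helper names `guest_needs_byteswap`, `swap32`, and `last_inst_for_host` are hypothetical and are not the names used in the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* A swap is needed only when guest and host byte orders differ. */
static bool guest_needs_byteswap(bool guest_is_le, bool host_is_le)
{
    return guest_is_le != host_is_le;
}

/* Reverse the byte order of a 32-bit instruction word. */
static uint32_t swap32(uint32_t x)
{
    return ((x & 0x000000ffu) << 24) |
           ((x & 0x0000ff00u) << 8)  |
           ((x & 0x00ff0000u) >> 8)  |
           ((x & 0xff000000u) >> 24);
}

/* Return the instruction in host byte order, ready for emulation. */
static uint32_t last_inst_for_host(uint32_t raw_inst, bool guest_is_le,
                                   bool host_is_le)
{
    if (guest_needs_byteswap(guest_is_le, host_is_le))
        return swap32(raw_inst);
    return raw_inst;
}
```

The same test can drive the data path, so that load and store emulation reverses the MMIO value only when the byte orders differ.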
  2. 09 Jan, 2014 · 15 commits
  3. 21 Nov, 2013 · 2 commits
  4. 20 Nov, 2013 · 1 commit
  5. 19 Nov, 2013 · 1 commit
  6. 17 Nov, 2013 · 1 commit
  7. 15 Nov, 2013 · 4 commits
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · f0804804
      Committed by Linus Torvalds
      Pull KVM changes from Paolo Bonzini:
       "Here are the 3.13 KVM changes.  There was a lot of work on the PPC
        side: the HV and emulation flavors can now coexist in a single kernel
        is probably the most interesting change from a user point of view.
      
        On the x86 side there are nested virtualization improvements and a few
        bugfixes.
      
        ARM got transparent huge page support, improved overcommit, and
        support for big endian guests.
      
        Finally, there is a new interface to connect KVM with VFIO.  This
        helps with devices that use NoSnoop PCI transactions, letting the
        driver in the guest execute WBINVD instructions.  This includes some
        nVidia cards on Windows, that fail to start without these patches and
        the corresponding userspace changes"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (146 commits)
        kvm, vmx: Fix lazy FPU on nested guest
        arm/arm64: KVM: PSCI: propagate caller endianness to the incoming vcpu
        arm/arm64: KVM: MMIO support for BE guest
        kvm, cpuid: Fix sparse warning
        kvm: Delete prototype for non-existent function kvm_check_iopl
        kvm: Delete prototype for non-existent function complete_pio
        hung_task: add method to reset detector
        pvclock: detect watchdog reset at pvclock read
        kvm: optimize out smp_mb after srcu_read_unlock
        srcu: API for barrier after srcu read unlock
        KVM: remove vm mmap method
        KVM: IOMMU: hva align mapping page size
        KVM: x86: trace cpuid emulation when called from emulator
        KVM: emulator: cleanup decode_register_operand() a bit
        KVM: emulator: check rex prefix inside decode_register()
        KVM: x86: fix emulation of "movzbl %bpl, %eax"
        kvm_host: typo fix
        KVM: x86: emulate SAHF instruction
        MAINTAINERS: add tree for kvm.git
        Documentation/kvm: add a 00-INDEX file
        ...
    • Merge tag 'stable/for-linus-3.13-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · eda670c6
      Committed by Linus Torvalds
      Pull Xen updates from Konrad Rzeszutek Wilk:
       "This has tons of fixes and two major features which are concentrated
        around the Xen SWIOTLB library.
      
        The short <blurb> is that the tracing facility (just one function) has
        been added to SWIOTLB to make it easier to track I/O progress.
        Additionally under Xen and ARM (32 & 64) the Xen-SWIOTLB driver
        "is used to translate physical to machine and machine to physical
        addresses of foreign[guest] pages for DMA operations" (Stefano) when
        booting under hardware without proper IOMMU.
      
        There are also bug-fixes, cleanups, compile warning fixes, etc.
      
        The commit times for some of the commits is a bit fresh - that is b/c
        we wanted to make sure we have the Ack's from the ARM folks - which
        with the string of back-to-back conferences took a bit of time.  Rest
        assured - the code has been stewing in #linux-next for some time.
      
        Features:
         - SWIOTLB has tracing added when doing bounce buffer.
         - Xen ARM/ARM64 can use Xen-SWIOTLB.  This work allows Linux to
           safely program real devices for DMA operations when running as a
           guest on Xen on ARM, without IOMMU support. [*1]
         - xen_raw_printk works with PVHVM guests if needed.
      
        Bug-fixes:
         - Make memory ballooning work under HVM with large MMIO region.
         - Inform hypervisor of MCFG regions found in ACPI DSDT.
         - Remove deprecated IRQF_DISABLED.
         - Remove deprecated __cpuinit.
      
        [*1]:
        "On arm and arm64 all Xen guests, including dom0, run with second
         stage translation enabled.  As a consequence when dom0 programs a
         device for a DMA operation is going to use (pseudo) physical
         addresses instead machine addresses.  This work introduces two trees
         to track physical to machine and machine to physical mappings of
         foreign pages.  Local pages are assumed mapped 1:1 (physical address
         == machine address).  It enables the SWIOTLB-Xen driver on ARM and
         ARM64, so that Linux can translate physical addresses to machine
         addresses for dma operations when necessary.  " (Stefano)"
      
      * tag 'stable/for-linus-3.13-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (32 commits)
        xen/arm: pfn_to_mfn and mfn_to_pfn return the argument if nothing is in the p2m
        arm,arm64/include/asm/io.h: define struct bio_vec
        swiotlb-xen: missing include dma-direction.h
        pci-swiotlb-xen: call pci_request_acs only ifdef CONFIG_PCI
        arm: make SWIOTLB available
        xen: delete new instances of added __cpuinit
        xen/balloon: Set balloon's initial state to number of existing RAM pages
        xen/mcfg: Call PHYSDEVOP_pci_mmcfg_reserved for MCFG areas.
        xen: remove deprecated IRQF_DISABLED
        x86/xen: remove deprecated IRQF_DISABLED
        swiotlb-xen: fix error code returned by xen_swiotlb_map_sg_attrs
        swiotlb-xen: static inline xen_phys_to_bus, xen_bus_to_phys, xen_virt_to_bus and range_straddles_page_boundary
        grant-table: call set_phys_to_machine after mapping grant refs
        arm,arm64: do not always merge biovec if we are running on Xen
        swiotlb: print a warning when the swiotlb is full
        swiotlb-xen: use xen_dma_map/unmap_page, xen_dma_sync_single_for_cpu/device
        xen: introduce xen_dma_map/unmap_page and xen_dma_sync_single_for_cpu/device
        tracing/events: Fix swiotlb tracepoint creation
        swiotlb-xen: use xen_alloc/free_coherent_pages
        xen: introduce xen_alloc/free_coherent_pages
        ...
    • Merge tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux · b746f9c7
      Committed by Linus Torvalds
      Pull virtio updates from Rusty Russell:
       "Nothing really exciting: some groundwork for changing virtio endian,
        and some robustness fixes for broken virtio devices, plus minor
        tweaks"
      
      * tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
        virtio_scsi: verify if queue is broken after virtqueue_get_buf()
        x86, asmlinkage, lguest: Pass in globals into assembler statement
        virtio: mmio: fix signature checking for BE guests
        virtio_ring: adapt to notify() returning bool
        virtio_net: verify if queue is broken after virtqueue_get_buf()
        virtio_console: verify if queue is broken after virtqueue_get_buf()
        virtio_blk: verify if queue is broken after virtqueue_get_buf()
        virtio_ring: add new function virtqueue_is_broken()
        virtio_test: verify if virtqueue_kick() succeeded
        virtio_net: verify if virtqueue_kick() succeeded
        virtio_ring: let virtqueue_{kick()/notify()} return a bool
        virtio_ring: change host notification API
        virtio_config: remove virtio_config_val
        virtio: use size-based config accessors.
        virtio_config: introduce size-based accessors.
        virtio_ring: plug kmemleak false positive.
        virtio: pm: use CONFIG_PM_SLEEP instead of CONFIG_PM
    • Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux · ce6513f7
      Committed by Linus Torvalds
      Pull module updates from Rusty Russell:
       "Mainly boring here, too.  rmmod --wait finally removed, though"
      
      * tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
        modpost: fix bogus 'exported twice' warnings.
        init: fix in-place parameter modification regression
        asmlinkage, module: Make ksymtab and kcrctab symbols and __this_module __visible
        kernel: add support for init_array constructors
        modpost: Optionally ignore secondary errors seen if a single module build fails
        module: remove rmmod --wait option.