1. 19 Apr, 2012 1 commit
    • KVM: lock slots_lock around device assignment · 21a1416a
      Committed by Alex Williamson
      As pointed out by Jason Baron, when assigning a device to a guest
      we first set the iommu domain pointer, which enables mapping
      and unmapping of memory slots to the iommu.  This leaves a window
      where this path is enabled, but we haven't synchronized the iommu
      mappings to the existing memory slots.  Thus a slot being removed
      at that point could send us down unexpected code paths removing
      non-existent pinnings and iommu mappings.  Take the slots_lock
      around creating the iommu domain and initial mappings as well as
      around iommu teardown to avoid this race.
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
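      The locking rule described in this commit can be sketched in userspace C. This is only a toy model under stated assumptions: `toy_vm`, `toy_add_slot`, `toy_assign_device`, and `toy_remove_slot` are hypothetical names, a pthread mutex stands in for slots_lock, and boolean flags stand in for real pinnings and iommu mappings. It is not the kernel code.

```c
#include <pthread.h>
#include <stdbool.h>

#define TOY_MAX_SLOTS 4

/* Hypothetical toy model: one lock guards both the slot list and the
 * "iommu enabled" flag, so they can never be observed out of sync. */
struct toy_vm {
    pthread_mutex_t slots_lock;
    bool iommu_enabled;
    bool slot_present[TOY_MAX_SLOTS];
    bool slot_mapped[TOY_MAX_SLOTS];
};

void toy_vm_init(struct toy_vm *vm)
{
    pthread_mutex_init(&vm->slots_lock, NULL);
    vm->iommu_enabled = false;
    for (int i = 0; i < TOY_MAX_SLOTS; i++) {
        vm->slot_present[i] = false;
        vm->slot_mapped[i] = false;
    }
}

void toy_add_slot(struct toy_vm *vm, int i)
{
    if (i < 0 || i >= TOY_MAX_SLOTS)
        return;
    pthread_mutex_lock(&vm->slots_lock);
    vm->slot_present[i] = true;
    if (vm->iommu_enabled)
        vm->slot_mapped[i] = true;
    pthread_mutex_unlock(&vm->slots_lock);
}

/* Enable the domain AND map all existing slots under the same lock hold,
 * so no slot removal can run between the two steps. */
void toy_assign_device(struct toy_vm *vm)
{
    pthread_mutex_lock(&vm->slots_lock);
    vm->iommu_enabled = true;
    for (int i = 0; i < TOY_MAX_SLOTS; i++)
        if (vm->slot_present[i])
            vm->slot_mapped[i] = true;
    pthread_mutex_unlock(&vm->slots_lock);
}

/* Returns 0 on success, -1 if the slot index is invalid or if the domain
 * is enabled but the slot was never mapped (the inconsistent state). */
int toy_remove_slot(struct toy_vm *vm, int i)
{
    int ret = 0;

    if (i < 0 || i >= TOY_MAX_SLOTS)
        return -1;
    pthread_mutex_lock(&vm->slots_lock);
    if (vm->slot_present[i]) {
        if (vm->iommu_enabled) {
            if (!vm->slot_mapped[i])
                ret = -1;
            vm->slot_mapped[i] = false;
        }
        vm->slot_present[i] = false;
    }
    pthread_mutex_unlock(&vm->slots_lock);
    return ret;
}
```

      In this model, if `toy_assign_device` released the lock between setting `iommu_enabled` and mapping the slots, a concurrent `toy_remove_slot` could hit the `-1` branch, which corresponds to the window the commit closes.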
  2. 12 Apr, 2012 1 commit
    • KVM: unmap pages from the iommu when slots are removed · 32f6daad
      Committed by Alex Williamson
      We've been adding new mappings, but not destroying old mappings.
      This can lead to a page leak as pages are pinned using
      get_user_pages, but only unpinned with put_page if they still
      exist in the memslots list on vm shutdown.  A memslot that is
      destroyed while an iommu domain is enabled for the guest will
      therefore result in an elevated page reference count that is
      never cleared.
      
      Additionally, without this fix, the iommu is only programmed
      with the first translation for a gpa.  This can result in
      peer-to-peer errors if a mapping is destroyed and replaced by a
      new mapping at the same gpa as the iommu will still be pointing
      to the original, pinned memory address.
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
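      Both problems named in this commit (the leaked page reference and the stale translation) can be illustrated with a small userspace model. Everything here is an assumption made for illustration: `toy_page`, `toy_iommu`, `toy_map_slot`, and `toy_unmap_slot` are hypothetical, a plain counter stands in for the page reference count, and a pointer table stands in for the iommu page tables.

```c
#include <stddef.h>

#define TOY_GFNS 8

struct toy_page { int refcount; };

/* Hypothetical translation table: guest frame number -> pinned page. */
struct toy_iommu { struct toy_page *entry[TOY_GFNS]; };

/* Pin each page and install its translation. A gfn that already has a
 * translation is skipped, so a stale entry would shadow the new page. */
int toy_map_slot(struct toy_iommu *d, size_t base_gfn,
                 struct toy_page *pages, size_t npages)
{
    if (base_gfn > TOY_GFNS || npages > TOY_GFNS - base_gfn)
        return -1;
    for (size_t i = 0; i < npages; i++) {
        size_t gfn = base_gfn + i;

        if (d->entry[gfn])
            continue;
        pages[i].refcount++;          /* "pin" */
        d->entry[gfn] = &pages[i];
    }
    return 0;
}

/* Drop the translation and the pin taken by toy_map_slot(). */
void toy_unmap_slot(struct toy_iommu *d, size_t base_gfn, size_t npages)
{
    if (base_gfn >= TOY_GFNS)
        return;
    for (size_t i = 0; i < npages && base_gfn + i < TOY_GFNS; i++) {
        struct toy_page *p = d->entry[base_gfn + i];

        if (!p)
            continue;
        p->refcount--;                /* "unpin" */
        d->entry[base_gfn + i] = NULL;
    }
}
```

      In this model, skipping `toy_unmap_slot` when a slot goes away leaves the old pages with an elevated `refcount`, and a later `toy_map_slot` at the same gfn leaves the table pointing at the old pages.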
  3. 20 Mar, 2012 1 commit
  4. 08 Mar, 2012 8 commits
  5. 05 Mar, 2012 5 commits
  6. 01 Feb, 2012 1 commit
    • KVM: Fix __set_bit() race in mark_page_dirty() during dirty logging · 50e92b3c
      Committed by Takuya Yoshikawa
      It is possible that the __set_bit() in mark_page_dirty() is called
      simultaneously on the same region of memory, which may result in only
      one bit being set, because some callers do not take mmu_lock before
      mark_page_dirty().
      
      This problem is hard to produce because when we reach mark_page_dirty()
      beginning from, e.g., tdp_page_fault(), mmu_lock is being held during
      __direct_map():  making kvm-unit-tests' dirty log api test write to two
      pages concurrently was not useful for this reason.
      
      So we have confirmed that there can actually be a race condition by
      checking whether some callers really reach there without holding mmu_lock
      using spin_is_locked():  probably they were from kvm_write_guest_page().
      
      To fix this race, this patch changes the bit operation to the atomic
      version:  note that nr_dirty_pages also suffers from the race but we do
      not need exactly correct numbers for now.
      Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
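      The difference between a plain and an atomic bit set can be sketched with C11 atomics in userspace. This is a minimal illustration, not the kernel's bit operations: `toy_set_bit_atomic` and `toy_test_bit` are hypothetical helpers, and `atomic_fetch_or` stands in for the atomic bit operation the patch switches to.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stddef.h>

#define TOY_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Atomic read-modify-write: two threads setting different bits in the
 * same word cannot overwrite each other's update. A plain
 * "word |= mask" is a separate load and store, so one update can be
 * lost when both run concurrently. */
void toy_set_bit_atomic(size_t nr, atomic_ulong *bitmap)
{
    atomic_fetch_or(&bitmap[nr / TOY_BITS_PER_WORD],
                    1UL << (nr % TOY_BITS_PER_WORD));
}

/* Returns 1 if bit nr is set, 0 otherwise. */
int toy_test_bit(size_t nr, atomic_ulong *bitmap)
{
    unsigned long word = atomic_load(&bitmap[nr / TOY_BITS_PER_WORD]);

    return (int)((word >> (nr % TOY_BITS_PER_WORD)) & 1UL);
}
```

      A dirty bitmap shared by several writers would call `toy_set_bit_atomic` for each dirtied page index, with no lock needed for the bitmap itself.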
  7. 13 Jan, 2012 1 commit
  8. 27 Dec, 2011 14 commits
  9. 26 Dec, 2011 1 commit
    • KVM: Device assignment permission checks · 3d27e23b
      Committed by Alex Williamson
      Only allow KVM device assignment to attach to devices which:
      
       - Are not bridges
       - Have BAR resources (assume others are special devices)
       - The user has permissions to use
      
      Assigning a bridge is a configuration error: it's not supported, and
      typically doesn't result in the behavior the user is expecting anyway.
      Devices without BAR resources are typically chipset components that
      also don't have host drivers.  We don't want users to hold such devices
      captive or cause system problems by fencing them off into an iommu
      domain.  We determine "permission to use" by testing whether the user
      has access to the PCI sysfs resource files.  By default a normal user
      will not have access to these files, so it provides a good indication
      that an administration agent has granted the user access to the device.
      
      [Yang Bai: add missing #include]
      [avi: fix comment style]
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      Signed-off-by: Yang Bai <hamo.by@gmail.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
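      The three checks listed in this commit can be sketched as a userspace analogue. This is not the kernel implementation: `toy_pci_dev`, `toy_user_can_access`, and `toy_check_assignable` are hypothetical, the choice of `-EPERM` and `-EACCES` as return values is an assumption, and the POSIX `access()` call stands in for whatever in-kernel permission test is actually used on the sysfs resource files.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical description of a device as seen by the checker. */
struct toy_pci_dev {
    bool is_bridge;                   /* bridges are never assignable */
    int nr_bars;                      /* number of BAR resources */
    const char *sysfs_resource_path;  /* e.g. a resourceN file in sysfs */
};

/* "Permission to use": can the caller read and write the resource file? */
bool toy_user_can_access(const char *path)
{
    return path != NULL && access(path, R_OK | W_OK) == 0;
}

/* Returns 0 if the device passes all three checks, a negative errno
 * value otherwise (the specific codes are an assumption). */
int toy_check_assignable(const struct toy_pci_dev *dev)
{
    if (dev->is_bridge)
        return -EPERM;
    if (dev->nr_bars <= 0)
        return -EPERM;
    if (!toy_user_can_access(dev->sysfs_resource_path))
        return -EACCES;
    return 0;
}
```

      The checks are ordered from cheapest to most expensive, so the filesystem is only consulted for devices that are otherwise eligible.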
  10. 25 Dec, 2011 1 commit
  11. 10 Nov, 2011 1 commit
    • iommu/core: split mapping to page sizes as supported by the hardware · 7d3002cc
      Committed by Ohad Ben-Cohen
      When mapping a memory region, split it to page sizes as supported
      by the iommu hardware. Always prefer bigger pages, when possible,
      in order to reduce the TLB pressure.
      
      The logic to do that is now added to the IOMMU core, so neither the iommu
      drivers themselves nor users of the IOMMU API have to duplicate it.
      
      This allows a more lenient granularity of mappings; traditionally the
      IOMMU API took 'order' (of a page) as a mapping size, and directly let
      the low level iommu drivers handle the mapping, but now that the IOMMU
      core can split arbitrary memory regions into pages, we can remove this
      limitation, so users don't have to split those regions by themselves.
      
      Currently the supported page sizes are advertised once and they then
      remain static. That works well for OMAP and MSM but it would probably
      not fly well with Intel's hardware, where the page size capabilities
      seem to have the potential to be different between several DMA
      remapping devices.
      
      register_iommu() currently sets a default pgsize behavior, so we can convert
      the IOMMU drivers in subsequent patches. After all the drivers
      are converted, the temporary default settings will be removed.
      
      Mainline users of the IOMMU API (kvm and omap-iovmm) are adapted
      to deal with bytes instead of page order.
      
      Many thanks to Joerg Roedel <Joerg.Roedel@amd.com> for significant review!
      Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
      Cc: David Brown <davidb@codeaurora.org>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Joerg Roedel <Joerg.Roedel@amd.com>
      Cc: Stepan Moskovchenko <stepanm@codeaurora.org>
      Cc: KyongHo Cho <pullip.cho@samsung.com>
      Cc: Hiroshi DOYU <hdoyu@nvidia.com>
      Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
      Cc: kvm@vger.kernel.org
      Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
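      The splitting idea described in this commit (take the biggest supported page that fits both the remaining size and the alignment of the addresses) can be sketched in userspace C. This is a simplified model written for illustration, not the IOMMU core code: `toy_pick_pgsize` and `toy_count_chunks` are hypothetical names, and the supported sizes are assumed to be given as a bitmap in which bit n set means pages of 2^n bytes are supported.

```c
#include <stddef.h>
#include <stdint.h>

/* Index of the highest set bit; v must be non-zero. */
static unsigned toy_highest_bit(uint64_t v)
{
    unsigned i = 0;

    while (v >>= 1)
        i++;
    return i;
}

/* Index of the lowest set bit; v must be non-zero. */
static unsigned toy_lowest_bit(uint64_t v)
{
    unsigned i = 0;

    while (!(v & 1)) {
        v >>= 1;
        i++;
    }
    return i;
}

/* Largest supported page size that fits in `size` and respects the
 * alignment of both addresses; 0 if no supported size fits. */
uint64_t toy_pick_pgsize(uint64_t pgsize_bitmap, uint64_t iova,
                         uint64_t paddr, uint64_t size)
{
    uint64_t addr_merge = iova | paddr;
    uint64_t mask;
    unsigned idx;

    if (size == 0)
        return 0;
    idx = toy_highest_bit(size);              /* cap by remaining size */
    if (addr_merge) {
        unsigned align = toy_lowest_bit(addr_merge);

        if (align < idx)
            idx = align;                      /* cap by alignment */
    }
    /* Keep only supported sizes no larger than 2^idx. */
    mask = (idx >= 63) ? UINT64_MAX : ((UINT64_C(2) << idx) - 1);
    mask &= pgsize_bitmap;
    return mask ? UINT64_C(1) << toy_highest_bit(mask) : 0;
}

/* Walk the region chunk by chunk; returns the number of chunks, or -1
 * if the region cannot be covered. Stores the first chunk size in
 * *first when first is non-NULL. */
long toy_count_chunks(uint64_t pgsize_bitmap, uint64_t iova,
                      uint64_t paddr, uint64_t size, uint64_t *first)
{
    long n = 0;

    while (size) {
        uint64_t pg = toy_pick_pgsize(pgsize_bitmap, iova, paddr, size);

        if (!pg)
            return -1;
        if (n == 0 && first)
            *first = pg;
        iova += pg;
        paddr += pg;
        size -= pg;
        n++;
    }
    return n;
}
```

      With a bitmap advertising 4 KiB and 2 MiB pages (`0x201000`), an aligned region of 2 MiB + 8 KiB is covered by one 2 MiB chunk followed by two 4 KiB chunks, while a 2 MiB region starting at a 4 KiB offset falls back to 4 KiB chunks throughout.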
  12. 01 Nov, 2011 2 commits
  13. 21 Oct, 2011 2 commits
  14. 26 Sep, 2011 1 commit