1. 01 Jun, 2011 1 commit
    • intel-iommu: Enable super page (2MiB, 1GiB, etc.) support · 6dd9a7c7
      Youquan Song committed
      There are no externally-visible changes with this. In the loop in the
      internal __domain_mapping() function, we simply detect if we are mapping:
        - size >= 2MiB, and
        - virtual address aligned to 2MiB, and
        - physical address aligned to 2MiB, and
        - on hardware that supports superpages.
      
      (and likewise for larger superpages).
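The eligibility test listed above can be sketched as a small standalone function. This is only an illustration under stated assumptions: `largest_page_level`, its parameters, and the 9-bits-per-level stride are hypothetical names and values chosen here, not the driver's actual code.

```c
#include <stdint.h>

#define STRIDE_SHIFT 9  /* assumption: each page-table level resolves 9 bits */

/*
 * Hypothetical helper: return the largest page-table level usable for a
 * mapping (1 = 4KiB, 2 = 2MiB, 3 = 1GiB). iov_pfn and phys_pfn are the
 * virtual and physical page frame numbers in 4KiB units, nr_pages is the
 * mapping size in 4KiB pages, and max_super_levels is the number of
 * superpage levels the hardware supports (0 = none).
 */
static int largest_page_level(uint64_t iov_pfn, uint64_t phys_pfn,
                              uint64_t nr_pages, int max_super_levels)
{
    const uint64_t stride_mask = ((uint64_t)1 << STRIDE_SHIFT) - 1;
    uint64_t merged = iov_pfn | phys_pfn;  /* both addresses must be aligned */
    int level = 1;

    while (max_super_levels > 0 && (merged & stride_mask) == 0) {
        nr_pages >>= STRIDE_SHIFT;  /* size in units of the next level up */
        if (nr_pages == 0)
            break;                  /* mapping is too small for that level */
        merged >>= STRIDE_SHIFT;
        level++;
        max_super_levels--;
    }
    return level;
}
```

With two supported superpage levels, a 512-page mapping whose two frame numbers are both multiples of 512 yields level 2, while a 511-page mapping or a misaligned one falls back to level 1.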
      
      We automatically use a superpage for such mappings. We never have to
      worry about *breaking* superpages, since we trust that we will always
      *unmap* the same range that was mapped. So all we need to do is ensure
      that dma_pte_clear_range() will also cope with superpages.
      
      Adjust pfn_to_dma_pte() to take a superpage 'level' as an argument, so
      it can return a PTE at the appropriate level rather than always
      extending the page tables all the way down to level 1. Again, this is
      simplified by the fact that we should never encounter existing small
      pages when we're creating a mapping; any old mapping that used the same
      virtual range will have been entirely removed and its obsolete page
      tables freed.
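As a rough illustration of a level-aware walk, the toy below descends a radix tree only as far as the requested level and returns the slot found there. The types and names (`toy_table`, `walk_to_level`, `level_index`) are hypothetical, and the sketch assumes, as the text does, that no conflicting small-page tables already exist under the slot.

```c
#include <stdint.h>
#include <stdlib.h>

#define STRIDE_SHIFT      9
#define ENTRIES_PER_TABLE (1u << STRIDE_SHIFT)

/* Toy page-table node: each slot points to a lower table or is empty. */
struct toy_table {
    struct toy_table *next[ENTRIES_PER_TABLE];
};

/* Index used at 'level' (1 = lowest) for a given page frame number. */
static unsigned int level_index(uint64_t pfn, int level)
{
    return (unsigned int)((pfn >> ((level - 1) * STRIDE_SHIFT)) &
                          (ENTRIES_PER_TABLE - 1));
}

/*
 * Walk from the top-level table down to target_level, allocating any
 * missing intermediate tables, and return the slot at target_level.
 * Returns NULL if an allocation fails.
 */
static struct toy_table **walk_to_level(struct toy_table *top, int top_level,
                                        uint64_t pfn, int target_level)
{
    struct toy_table *table = top;
    int level = top_level;

    while (level > target_level) {
        struct toy_table **slot = &table->next[level_index(pfn, level)];

        if (*slot == NULL) {
            *slot = calloc(1, sizeof(**slot));
            if (*slot == NULL)
                return NULL;
        }
        table = *slot;
        level--;
    }
    return &table->next[level_index(pfn, target_level)];
}
```

Asking for level 2 stops one table short of the bottom, so no level-1 table is allocated for a range that will be covered by a single 2MiB entry.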
      
      Provide an 'intel_iommu=sp_off' argument on the command line as a
      chicken bit. Not that it should ever be required.
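A minimal sketch of how a comma-separated option string such as `intel_iommu=sp_off` might be turned into flags. The struct and function names are hypothetical, and the parsing is a userspace illustration rather than the kernel's own option handler; only the option strings `sp_off` and `igfx_off` come from the commit messages on this page.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical flags; both features default to enabled. */
struct toy_iommu_opts {
    bool superpage;  /* cleared by "sp_off" */
    bool map_gfx;    /* cleared by "igfx_off" */
};

/* Parse the value that follows "intel_iommu=" on the command line. */
static void parse_intel_iommu_opts(const char *str, struct toy_iommu_opts *opts)
{
    opts->superpage = true;
    opts->map_gfx = true;

    while (str != NULL && *str != '\0') {
        size_t len = strcspn(str, ",");  /* length of the current token */

        if (len == 6 && strncmp(str, "sp_off", 6) == 0)
            opts->superpage = false;
        else if (len == 8 && strncmp(str, "igfx_off", 8) == 0)
            opts->map_gfx = false;

        str += len;
        if (*str == ',')
            str++;  /* skip the separator */
    }
}
```

Unknown tokens are ignored in this sketch, so a misspelt option leaves superpages enabled.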
      
      ==
      
      The original commit seen in the iommu-2.6.git was Youquan's
      implementation (and completion) of my own half-baked code which I'd
      typed into an email. Followed by half a dozen subsequent 'fixes'.
      
      I've taken the unusual step of rewriting history and collapsing the
      original commits in order to keep the main history simpler, and make
      life easier for the people who are going to have to backport this to
      older kernels. And also so I can give it a more coherent commit comment
      which (hopefully) gives a better explanation of what's going on.
      
      The original sequence of commits leading to identical code was:
      
      Youquan Song (3):
            intel-iommu: super page support
            intel-iommu: Fix superpage alignment calculation error
            intel-iommu: Fix superpage level calculation error in dma_pfn_level_pte()
      
      David Woodhouse (4):
            intel-iommu: Precalculate superpage support for dmar_domain
            intel-iommu: Fix hardware_largepage_caps()
            intel-iommu: Fix inappropriate use of superpages in __domain_mapping()
            intel-iommu: Fix phys_pfn in __domain_mapping for sglist pages
      Signed-off-by: Youquan Song <youquan.song@intel.com>
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
  2. 24 May, 2011 2 commits
  3. 21 Apr, 2011 1 commit
    • intel_iommu: disable all VT-d PMRs when TXT launched · 51a63e67
      Joseph Cihula committed
      Intel VT-d Protected Memory Regions (PMRs) are supposed to be disabled,
      on each VT-d engine, after DMA remapping is enabled on the engines.
      This is because the behavior of having both enabled is not deterministic
      and because, if TXT has been used to launch the kernel, the PMRs may be
      programmed to cover memory regions that will be used for DMA.
      
      Under some circumstances (certain quirks detected, lack of multiple
      devices, etc.), the current code does not set up DMA remapping on some
      VT-d engines.  In such cases it also skips disabling the PMRs.  This
      causes failures when the kernel is launched with TXT (most often this
      occurs on the graphics engine and results in colored vertical bars on
      the display).
      
      This patch detects when the kernel has been launched with TXT and then
      disables the PMRs on all VT-d engines.  When remapping is not being
      enabled because of possible ACPI DMAR table errors, the VT-d engine
      addresses may be incorrect, and so cannot be safely programmed even to
      disable PMRs.  Because part of the TXT launch process is the
      verification of these addresses, it is always safe to disable PMRs if
      the TXT launch has succeeded; hence this is done only in such cases.
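The policy described above can be sketched with simulated engines. Everything here is a toy: `toy_engine`, `TOY_PMEN_EPM` and the chosen bit position are assumptions for illustration, and no real register access is shown.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed "enable protected memory" bit in a simulated register. */
#define TOY_PMEN_EPM (1u << 31)

struct toy_engine {
    uint32_t pmen;           /* simulated protected-memory enable register */
    bool remapping_enabled;  /* was DMA remapping set up on this engine? */
};

static void toy_disable_pmr(struct toy_engine *e)
{
    e->pmen &= ~TOY_PMEN_EPM;
}

/*
 * Disable PMRs on every engine that has remapping enabled and, when the
 * kernel was launched with TXT, on all remaining engines as well.
 */
static void toy_init_engines(struct toy_engine *engines, size_t n,
                             bool txt_launched)
{
    for (size_t i = 0; i < n; i++) {
        if (engines[i].remapping_enabled || txt_launched)
            toy_disable_pmr(&engines[i]);
    }
}
```

Without a TXT launch, an engine that skipped remapping keeps its PMRs untouched, which mirrors the caution about possibly incorrect engine addresses.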
      Signed-off-by: Joseph Cihula <joseph.cihula@intel.com>
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
  4. 31 Mar, 2011 1 commit
  5. 29 Mar, 2011 1 commit
  6. 24 Mar, 2011 1 commit
  7. 12 Mar, 2011 2 commits
  8. 18 Jan, 2011 1 commit
  9. 23 Sep, 2010 1 commit
  10. 22 Sep, 2010 2 commits
  11. 10 Aug, 2010 1 commit
  12. 05 Aug, 2010 1 commit
  13. 19 Jul, 2010 1 commit
  14. 15 Jun, 2010 3 commits
  15. 17 May, 2010 2 commits
  16. 09 Apr, 2010 5 commits
  17. 08 Mar, 2010 2 commits
  18. 17 Dec, 2009 1 commit
  19. 08 Dec, 2009 4 commits
    • Revert "Intel IOMMU: Avoid memory allocation failures in dma map api calls" · 354bb65e
      KOSAKI Motohiro committed
      Commit eb3fa7cb ("Intel IOMMU: Avoid memory allocation failures in dma
      map api calls") said:
      
          Intel IOMMU driver needs memory during DMA map calls to setup its
          internal page tables and for other data structures.  As we all know
          that these DMA map calls are mostly called in the interrupt context
          or with the spinlock held by the upper level drivers(network/storage
          drivers), so in order to avoid any memory allocation failure due to
          low memory issues, this patch makes memory allocation by temporarily
          setting PF_MEMALLOC flags for the current task before making memory
          allocation calls.
      
          We evaluated mempools as a backup when kmem_cache_alloc() fails
          and found that mempools are really not useful here because
           1) We don't know for sure how much to reserve in advance
           2) And mempools are not useful for GFP_ATOMIC case (as we call
              memory alloc functions with GFP_ATOMIC)
      
          (akpm: point 2 is wrong...)
      
      The above description doesn't justify wasting system emergency memory
      at all. Non-MM subsystems must not use PF_MEMALLOC. Memory reclaim needs
      a little memory of its own, and nothing must deprive it of that.
      Otherwise the system can suffer mysterious hang-ups and/or OOM killer
      invocations.
      
      Plus, akpm already pointed out what we should do.
      
      So this patch reverts it.
      
      Cc: Keshavamurthy Anil S <anil.s.keshavamurthy@intel.com>
      Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
    • intel-iommu: ignore page table validation in pass through mode · 1672af11
      Chris Wright committed
      We are seeing a bug when booting w/ iommu=pt with current upstream
      (bisect blames 19943b0e "intel-iommu: Unify hardware and software
      passthrough support").
      
      The issue is specific to this loop during identity map initialization
      of each device:
      
      domain_context_mapping_one(si_domain, ..., CONTEXT_TT_PASS_THROUGH)
      ...
      		/* Skip top levels of page tables for
      		 * iommu which has less agaw than default.
      		 */
      		for (agaw = domain->agaw; agaw != iommu->agaw; agaw--) {
      			pgd = phys_to_virt(dma_pte_addr(pgd));
      			if (!dma_pte_present(pgd)) {      <------ failing here
      				spin_unlock_irqrestore(&iommu->lock, flags);
      				return -ENOMEM;
      			}
      		}
      
      This box has 2 iommu's in it.  The catchall iommu has MGAW == 48, and
      SAGAW == 4.  The other iommu has MGAW == 39, SAGAW == 2.
      
      The device that's failing the above pgd test is the only device connected
      to the non-catchall iommu, which has a smaller address width than the
      domain default.  This test is not necessary since the context is in PT
      mode and the ASR is ignored.
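A toy version of the idea, assuming hypothetical names throughout (`toy_context_setup`, `toy_pgd`, `TT_PASS_THROUGH`): in pass-through mode the page-table root is never consulted, so the level-skipping walk and its presence check are bypassed entirely.

```c
#include <stddef.h>

enum toy_tt { TT_MULTI_LEVEL, TT_PASS_THROUGH };

/* Toy page directory: 'lower' is NULL when the next level is not present. */
struct toy_pgd {
    struct toy_pgd *lower;
};

/*
 * Choose the page-table root for a context entry. Returns 0 on success
 * or -1 if a level that must be skipped is not present. In pass-through
 * mode the root is unused, so no validation is done. 'pgd' must not be
 * NULL, and '*root_out' is left untouched on failure.
 */
static int toy_context_setup(enum toy_tt mode, const struct toy_pgd *pgd,
                             int domain_agaw, int iommu_agaw,
                             const struct toy_pgd **root_out)
{
    if (mode == TT_PASS_THROUGH) {
        *root_out = NULL;  /* the root address is ignored in this mode */
        return 0;
    }

    /* Skip top levels when the IOMMU has a smaller width than the domain. */
    for (int agaw = domain_agaw; agaw > iommu_agaw; agaw--) {
        pgd = pgd->lower;
        if (pgd == NULL)
            return -1;
    }
    *root_out = pgd;
    return 0;
}
```

With an empty top-level table and a narrower IOMMU, the multi-level call fails while the pass-through call succeeds, which is the behaviour the fix is after.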
      
      Thanks to Don Dutile for discovering and debugging this one.
      
      Cc: stable@kernel.org
      Signed-off-by: Chris Wright <chrisw@sous-sol.org>
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
    • intel-iommu: Fix oops with intel_iommu=igfx_off · 44cd613c
      David Woodhouse committed
      The hotplug notifier will call find_domain() to see if the device in
      question has been assigned an IOMMU domain. However, this should never
      be called for devices with a "dummy" domain, such as graphics devices
      when intel_iommu=igfx_off is set and the corresponding IOMMU isn't even
      initialised. If you do that, it'll oops as it dereferences the (-1)
      pointer.
      
      The notifier function should check iommu_no_mapping() for the
      device before doing anything else.
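The ordering of the checks can be sketched as follows. The types, the `DUMMY_DOMAIN` sentinel and the function names are hypothetical stand-ins; the point is only that the "no mapping" test comes before anything dereferences the domain pointer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct toy_domain {
    int id;
};

/* Assumed sentinel marking a device that bypasses the IOMMU. */
#define DUMMY_DOMAIN ((struct toy_domain *)(intptr_t)-1)

struct toy_device {
    struct toy_domain *domain;  /* NULL, DUMMY_DOMAIN, or a real domain */
};

static bool toy_no_mapping(const struct toy_device *dev)
{
    return dev->domain == DUMMY_DOMAIN;
}

/*
 * Toy notifier body: returns the id of the domain the device should be
 * removed from, or -1 when there is nothing to do.
 */
static int toy_notifier(const struct toy_device *dev)
{
    if (toy_no_mapping(dev))  /* check first, before any dereference */
        return -1;
    if (dev->domain == NULL)
        return -1;
    return dev->domain->id;
}
```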
      
      Cc: stable@kernel.org
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
    • intel-iommu: Check for an RMRR which ends before it starts. · 5595b528
      David Woodhouse committed
      Some HP BIOSes report an RMRR region (a region which needs a 1:1 mapping
      in the IOMMU for a given device) which has an end address lower than its
      start address. Detect that and warn, rather than triggering the
      BUG() in dma_pte_clear_range().
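A minimal sketch of such a sanity check, written as ordinary userspace C. The function name and message text are hypothetical; only the condition (end address lower than start address) comes from the description above.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Return true if the reserved region is usable. A region whose end lies
 * below its start is reported and rejected, so the caller can skip the
 * 1:1 mapping instead of tripping an assertion later.
 */
static bool rmrr_range_is_sane(uint64_t start, uint64_t end)
{
    if (end < start) {
        fprintf(stderr,
                "firmware bug: RMRR [0x%llx-0x%llx] ends before it starts\n",
                (unsigned long long)start, (unsigned long long)end);
        return false;
    }
    return true;
}
```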
      
      Cc: stable@kernel.org
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
  20. 25 Nov, 2009 1 commit
  21. 12 Nov, 2009 2 commits
    • intel-iommu: Support PCIe hot-plug · 99dcaded
      Fenghua Yu committed
      To support PCIe hot-plug in the IOMMU, we register a notifier to respond
      to device change actions.
      
      When the notifier gets BUS_NOTIFY_UNBOUND_DRIVER, it removes the device
      from its DMAR domain.
      
      A hot-added device will be added to an IOMMU domain when it first
      performs an IOMMU operation, so no extra code is needed for hot-add.
      
      Without the patch, after a hot-remove, a hot-added device on the same
      slot will not work.
      Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
      Tested-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
    • intel-iommu: Obey coherent_dma_mask for alloc_coherent on passthrough · e8bb910d
      Alex Williamson committed
      The model for IOMMU passthrough is that decent devices that can cope
      with DMA to all of memory get passthrough; crappy devices with a limited
      dma_mask don't -- they get to use the IOMMU anyway.
      
      This is done on the basis that IOMMU passthrough is usually wanted for
      performance reasons, and it's only the decent PCI devices that you
      really care about performance for, while the crappy 32-bit ones like
      your USB controller can just use the IOMMU and you won't really care.
      
      Unfortunately, the check for this was only looking at dev->dma_mask, not
      at dev->coherent_dma_mask. And some devices have a 32-bit
      coherent_dma_mask even though they have a full 64-bit dma_mask.
      
      Even more unfortunately, fixing that simple oversight would upset
      certain broken HP devices. Not only do they have a 32-bit
      coherent_dma_mask, but they also have a tendency to do stray DMA to
      unmapped addresses. And then they die when they take the DMA fault they
      so richly deserve.
      
      So if we do the 'correct' fix, it'll mean that affected users have to
      disable IOMMU support completely on "a large percentage of servers from
      a major vendor."
      
      Personally, I have little sympathy -- given that this is the _same_
      'major vendor' who is shipping machines which claim to have IOMMU
      support but have obviously never _once_ booted a VT-d capable OS to do
      any form of QA. But strictly speaking, it _would_ be a regression even
      though it only ever worked by fluke.
      
      For 2.6.33, we'll come up with a quirk which gives swiotlb support
      for this particular device, and other devices with an inadequate
      coherent_dma_mask will just get normal IOMMU mapping.
      
      The simplest fix for 2.6.32, though, is just to jump through some hoops
      to try to allocate coherent DMA memory for such devices in a place that
      they can reach. We'd use dma_generic_alloc_coherent() for this if it
      existed on IA64.
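One way to picture "a place that they can reach" is a zone choice driven by the coherent mask. This is a sketch under assumptions: the enum, the function name and the `TOY_BIT_MASK` macro are hypothetical, and the real allocator flags are not shown.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical allocation zones, from least to most restrictive. */
enum toy_zone {
    TOY_ZONE_ANY,     /* anywhere in memory */
    TOY_ZONE_32BIT,   /* below 4GiB */
    TOY_ZONE_LOW_DMA  /* the platform's most restricted legacy DMA zone */
};

#define TOY_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/*
 * Pick where coherent memory should come from. A device translated by
 * the IOMMU can be handed any page; a passthrough device whose
 * coherent mask cannot cover all of memory needs a restricted zone.
 */
static enum toy_zone pick_coherent_zone(bool uses_iommu,
                                        uint64_t coherent_mask,
                                        uint64_t required_mask)
{
    if (uses_iommu)
        return TOY_ZONE_ANY;
    if (coherent_mask >= required_mask)
        return TOY_ZONE_ANY;
    if (coherent_mask < TOY_BIT_MASK(32))
        return TOY_ZONE_LOW_DMA;
    return TOY_ZONE_32BIT;
}
```

A passthrough device with a 32-bit coherent mask on a machine that needs 36 address bits would be steered to the below-4GiB zone in this sketch.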
      Signed-off-by: Alex Williamson <alex.williamson@hp.com>
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
  22. 10 Nov, 2009 1 commit
    • x86: Handle HW IOMMU initialization failure gracefully · 75f1cdf1
      FUJITA Tomonori committed
      If HW IOMMU initialization fails (Intel VT-d often does this,
      typically due to BIOS bugs), we fall back to nommu. That doesn't
      work on most systems, since nowadays they have more than 4GB of
      memory, so we must use swiotlb instead of nommu.
      
      The problem is that it's too late to initialize swiotlb when HW
      IOMMU initialization fails. We need to allocate swiotlb memory
      earlier from bootmem allocator. Chris explained the issue in
      detail:
      
        http://marc.info/?l=linux-kernel&m=125657444317079&w=2
      
      The current x86 IOMMU initialization sequence is too complicated
      and handling the above issue makes it more hacky.
      
      This patch changes x86 IOMMU initialization sequence to handle
      the above issue cleanly.
      
      The new x86 IOMMU initialization sequence is:
      
      1. we initialize swiotlb (and set swiotlb to 1) in the case
         of (max_pfn > MAX_DMA32_PFN && !no_iommu). dma_ops is set to
         swiotlb_dma_ops or nommu_dma_ops. If swiotlb usage is forced by
         the boot option, we finish here.
      
      2. we call the detection functions of all the IOMMUs
      
      3. the detection function sets x86_init.iommu.iommu_init to the
         IOMMU initialization function (so we can avoid calling the
         initialization functions of all the IOMMUs needlessly).
      
      4. if the IOMMU initialization function doesn't need swiotlb,
         it sets swiotlb to zero (e.g. when the initialization is
         successful).
      
      5. if we find that swiotlb is set to zero, we free swiotlb
         resource.
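The five steps above can be modelled as a small decision function. This is a toy model, not the x86 code: the struct, enum and function names are hypothetical, and detection and initialization are reduced to boolean inputs. It also assumes that no hardware IOMMU is used when `no_iommu` is set.

```c
#include <stdbool.h>

enum toy_dma_ops { OPS_NOMMU, OPS_SWIOTLB, OPS_HW_IOMMU };

/* Boolean inputs standing in for boot-time state. */
struct toy_boot {
    bool mem_above_4g;   /* max_pfn > MAX_DMA32_PFN */
    bool no_iommu;       /* IOMMU use disabled by the user */
    bool force_swiotlb;  /* swiotlb forced by the boot option */
    bool hw_detected;    /* a detection function found a HW IOMMU */
    bool hw_init_ok;     /* its initialization function succeeded */
};

/*
 * Returns the DMA ops that end up in use and reports through
 * 'swiotlb_kept' whether the early swiotlb allocation is retained.
 */
static enum toy_dma_ops toy_iommu_init(const struct toy_boot *b,
                                       bool *swiotlb_kept)
{
    /* Step 1: set up swiotlb early if it might be needed. */
    bool swiotlb = b->force_swiotlb || (b->mem_above_4g && !b->no_iommu);
    enum toy_dma_ops ops = swiotlb ? OPS_SWIOTLB : OPS_NOMMU;

    if (b->force_swiotlb) {
        *swiotlb_kept = true;
        return ops;  /* forced by the boot option: finish here */
    }

    /* Steps 2 and 3: detection selects one initialization function. */
    if (b->hw_detected && !b->no_iommu) {
        /* Step 4: a successful init no longer needs swiotlb. */
        if (b->hw_init_ok) {
            swiotlb = false;
            ops = OPS_HW_IOMMU;
        }
    }

    /* Step 5: the caller frees swiotlb when it is no longer wanted. */
    *swiotlb_kept = swiotlb;
    return ops;
}
```

In this model a failed hardware init on a machine with more than 4GB of memory leaves swiotlb in place, which is the graceful fallback the commit describes.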
      Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: chrisw@sous-sol.org
      Cc: dwmw2@infradead.org
      Cc: joerg.roedel@amd.com
      Cc: muli@il.ibm.com
      LKML-Reference: <1257849980-22640-10-git-send-email-fujita.tomonori@lab.ntt.co.jp>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  23. 05 Oct, 2009 1 commit
  24. 01 Oct, 2009 1 commit
  25. 20 Sep, 2009 1 commit