1. 17 Dec 2011, 1 commit
    • iommu: Export intel_iommu_enabled to signal when iommu is in use · 8bc1f85c
      Committed by Eugeni Dodonov
      In i915 driver, we do not enable either rc6 or semaphores on SNB when dmar
      is enabled. The new 'intel_iommu_enabled' variable signals when the
      iommu code is in operation.
      
      Cc: Ted Phelps <phelps@gnusto.com>
      Cc: Peter <pab1612@gmail.com>
      Cc: Lukas Hejtmanek <xhejtman@fi.muni.cz>
      Cc: Andrew Lutomirski <luto@mit.edu>
      CC: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Eugeni Dodonov <eugeni.dodonov@intel.com>
      Signed-off-by: Keith Packard <keithp@keithp.com>
  2. 09 Dec 2011, 1 commit
    • memblock: Kill early_node_map[] · 0ee332c1
      Committed by Tejun Heo
      Now all ARCH_POPULATES_NODE_MAP archs select HAVE_MEMBLOCK_NODE_MAP -
      there's no user of early_node_map[] left.  Kill early_node_map[] and
      replace ARCH_POPULATES_NODE_MAP with HAVE_MEMBLOCK_NODE_MAP.  Also,
      relocate for_each_mem_pfn_range() and helper from mm.h to memblock.h
      as page_alloc.c would no longer host an alternative implementation.
      
      This change is ultimately one to one mapping and shouldn't cause any
      observable difference; however, after the recent changes, there are
      some functions which now would fit memblock.c better than page_alloc.c
      and dependency on HAVE_MEMBLOCK_NODE_MAP instead of HAVE_MEMBLOCK
      doesn't make much sense on some of them.  Further cleanups for
      functions inside HAVE_MEMBLOCK_NODE_MAP in mm.h would be nice.
      
      -v2: Fix compile bug introduced by mis-spelling
       CONFIG_HAVE_MEMBLOCK_NODE_MAP to CONFIG_MEMBLOCK_HAVE_NODE_MAP in
       mmzone.h.  Reported by Stephen Rothwell.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Chen Liqin <liqin.chen@sunplusct.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
  3. 06 Dec 2011, 1 commit
  4. 15 Nov 2011, 2 commits
  5. 10 Nov 2011, 2 commits
    • iommu/intel: announce supported page sizes · 6d1c56a9
      Committed by Ohad Ben-Cohen
      Let the IOMMU core know we support arbitrary page sizes (as long as
      they are a power-of-two multiple of 4KiB).
      
      This way the IOMMU core will retain the existing behavior we're used to;
      it will let us map regions that:
      - have a size that is a power-of-two multiple of 4KiB
      - are naturally aligned
      
      Note: Intel IOMMU hardware doesn't support arbitrary page sizes,
      but the driver does (it splits arbitrary-sized mappings into
      the pages supported by the hardware).
      
      To make everything simpler for now, though, this patch effectively tells
      the IOMMU core to keep giving this driver the same memory regions it did
      before, so nothing is changed as far as it's concerned.
      
      At this point, the page sizes announced remain static within the IOMMU
      core. To correctly utilize the pgsize-splitting of the IOMMU core by
      this driver, it seems that some core changes should still be done,
      because Intel's IOMMU page size capabilities seem to have the potential
      to be different between different DMA remapping devices.
      Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
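As a minimal userspace sketch of the rule described above, the check below accepts a region only when its size is a single power of two of at least 4KiB and its address is naturally aligned to that size. The bitmap value and every `sketch_` name are assumptions made for illustration; they are not taken from the commit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bitmap: bit n set means a page size of 2^n bytes is accepted.
 * Every bit from 12 (4KiB) upward is set. */
#define SKETCH_PGSIZE_BITMAP (~UINT64_C(0xFFF))

/* True when size is a single power of two that the bitmap advertises. */
static bool sketch_size_supported(uint64_t size)
{
    bool power_of_two = size != 0 && (size & (size - 1)) == 0;

    return power_of_two && (size & SKETCH_PGSIZE_BITMAP) != 0;
}

/* True when the region can be handed over as one naturally aligned chunk. */
static bool sketch_region_ok(uint64_t addr, uint64_t size)
{
    return sketch_size_supported(size) && (addr & (size - 1)) == 0;
}
```

With this rule a 2MiB region at a 2MiB-aligned address passes, while a 12KiB region (not a power of two) or an 8KiB region at a 4KiB-aligned address does not.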
    • iommu/core: stop converting bytes to page order back and forth · 5009065d
      Committed by Ohad Ben-Cohen
      Express sizes in bytes rather than in page order, to eliminate the
      size->order->size conversions we have whenever the IOMMU API is calling
      the low level drivers' map/unmap methods.
      
      Adapt all existing drivers.
      Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
      Cc: David Brown <davidb@codeaurora.org>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Joerg Roedel <Joerg.Roedel@amd.com>
      Cc: Stepan Moskovchenko <stepanm@codeaurora.org>
      Cc: KyongHo Cho <pullip.cho@samsung.com>
      Cc: Hiroshi DOYU <hdoyu@nvidia.com>
      Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
      Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
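The size->order->size round trip mentioned above can be sketched in userspace C as follows. The helper names are hypothetical; the first one only mirrors the idea behind the kernel's get_order(), assuming 4KiB base pages.

```c
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12u   /* assumes 4KiB base pages */

/* Smallest order such that (4KiB << order) >= size. */
static unsigned int sketch_size_to_order(size_t size)
{
    unsigned int order = 0;

    while (((size_t)1 << (order + SKETCH_PAGE_SHIFT)) < size)
        order++;
    return order;
}

/* Inverse direction: bytes covered by a mapping of the given order. */
static size_t sketch_order_to_size(unsigned int order)
{
    return (size_t)1 << (order + SKETCH_PAGE_SHIFT);
}
```

The round trip is lossless for power-of-two sizes such as 8KiB, but a 12KiB request comes back as 16KiB, which is one reason to pass byte sizes straight through.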
  6. 01 Nov 2011, 1 commit
  7. 21 Oct 2011, 3 commits
  8. 19 Oct 2011, 3 commits
  9. 15 Oct 2011, 2 commits
  10. 11 Oct 2011, 1 commit
    • intel-iommu: Fix AB-BA lockdep report · 3e7abe25
      Committed by Roland Dreier
      When unbinding a device so that I could pass it through to a KVM VM, I
      got the lockdep report below.  It looks like a legitimate lock
      ordering problem:
      
       - domain_context_mapping_one() takes iommu->lock and calls
         iommu_support_dev_iotlb(), which takes device_domain_lock (inside
         iommu->lock).
      
       - domain_remove_one_dev_info() starts by taking device_domain_lock
         then takes iommu->lock inside it (near the end of the function).
      
      So this is the classic AB-BA deadlock.  It looks like a safe fix is to
      simply release device_domain_lock a bit earlier, since as far as I can
      tell, it doesn't protect any of the stuff accessed at the end of
      domain_remove_one_dev_info() anyway.
      
      BTW, the use of device_domain_lock looks a bit unsafe to me... it's
      at least not obvious to me why we aren't vulnerable to the race below:
      
        iommu_support_dev_iotlb()
                                                domain_remove_dev_info()
      
        lock device_domain_lock
          find info
        unlock device_domain_lock
      
                                                lock device_domain_lock
                                                  find same info
                                                unlock device_domain_lock
      
                                                free_devinfo_mem(info)
      
        do stuff with info after it's free
      
      However I don't understand the locking here well enough to know if
      this is a real problem, let alone what the best fix is.
      
      Anyway here's the full lockdep output that prompted all of this:
      
           =======================================================
           [ INFO: possible circular locking dependency detected ]
           2.6.39.1+ #1
           -------------------------------------------------------
           bash/13954 is trying to acquire lock:
            (&(&iommu->lock)->rlock){......}, at: [<ffffffff812f6421>] domain_remove_one_dev_info+0x121/0x230
      
           but task is already holding lock:
            (device_domain_lock){-.-...}, at: [<ffffffff812f6508>] domain_remove_one_dev_info+0x208/0x230
      
           which lock already depends on the new lock.
      
           the existing dependency chain (in reverse order) is:
      
           -> #1 (device_domain_lock){-.-...}:
                  [<ffffffff8109ca9d>] lock_acquire+0x9d/0x130
                  [<ffffffff81571475>] _raw_spin_lock_irqsave+0x55/0xa0
                  [<ffffffff812f8350>] domain_context_mapping_one+0x600/0x750
                  [<ffffffff812f84df>] domain_context_mapping+0x3f/0x120
                  [<ffffffff812f9175>] iommu_prepare_identity_map+0x1c5/0x1e0
                  [<ffffffff81ccf1ca>] intel_iommu_init+0x88e/0xb5e
                  [<ffffffff81cab204>] pci_iommu_init+0x16/0x41
                  [<ffffffff81002165>] do_one_initcall+0x45/0x190
                  [<ffffffff81ca3d3f>] kernel_init+0xe3/0x168
                  [<ffffffff8157ac24>] kernel_thread_helper+0x4/0x10
      
           -> #0 (&(&iommu->lock)->rlock){......}:
                  [<ffffffff8109bf3e>] __lock_acquire+0x195e/0x1e10
                  [<ffffffff8109ca9d>] lock_acquire+0x9d/0x130
                  [<ffffffff81571475>] _raw_spin_lock_irqsave+0x55/0xa0
                  [<ffffffff812f6421>] domain_remove_one_dev_info+0x121/0x230
                  [<ffffffff812f8b42>] device_notifier+0x72/0x90
                  [<ffffffff8157555c>] notifier_call_chain+0x8c/0xc0
                  [<ffffffff81089768>] __blocking_notifier_call_chain+0x78/0xb0
                  [<ffffffff810897b6>] blocking_notifier_call_chain+0x16/0x20
                  [<ffffffff81373a5c>] __device_release_driver+0xbc/0xe0
                  [<ffffffff81373ccf>] device_release_driver+0x2f/0x50
                  [<ffffffff81372ee3>] driver_unbind+0xa3/0xc0
                  [<ffffffff813724ac>] drv_attr_store+0x2c/0x30
                  [<ffffffff811e4506>] sysfs_write_file+0xe6/0x170
                  [<ffffffff8117569e>] vfs_write+0xce/0x190
                  [<ffffffff811759e4>] sys_write+0x54/0xa0
                  [<ffffffff81579a82>] system_call_fastpath+0x16/0x1b
      
           other info that might help us debug this:
      
           6 locks held by bash/13954:
            #0:  (&buffer->mutex){+.+.+.}, at: [<ffffffff811e4464>] sysfs_write_file+0x44/0x170
            #1:  (s_active#3){++++.+}, at: [<ffffffff811e44ed>] sysfs_write_file+0xcd/0x170
            #2:  (&__lockdep_no_validate__){+.+.+.}, at: [<ffffffff81372edb>] driver_unbind+0x9b/0xc0
            #3:  (&__lockdep_no_validate__){+.+.+.}, at: [<ffffffff81373cc7>] device_release_driver+0x27/0x50
            #4:  (&(&priv->bus_notifier)->rwsem){.+.+.+}, at: [<ffffffff8108974f>] __blocking_notifier_call_chain+0x5f/0xb0
            #5:  (device_domain_lock){-.-...}, at: [<ffffffff812f6508>] domain_remove_one_dev_info+0x208/0x230
      
           stack backtrace:
           Pid: 13954, comm: bash Not tainted 2.6.39.1+ #1
           Call Trace:
            [<ffffffff810993a7>] print_circular_bug+0xf7/0x100
            [<ffffffff8109bf3e>] __lock_acquire+0x195e/0x1e10
            [<ffffffff810972bd>] ? trace_hardirqs_off+0xd/0x10
            [<ffffffff8109d57d>] ? trace_hardirqs_on_caller+0x13d/0x180
            [<ffffffff8109ca9d>] lock_acquire+0x9d/0x130
            [<ffffffff812f6421>] ? domain_remove_one_dev_info+0x121/0x230
            [<ffffffff81571475>] _raw_spin_lock_irqsave+0x55/0xa0
            [<ffffffff812f6421>] ? domain_remove_one_dev_info+0x121/0x230
            [<ffffffff810972bd>] ? trace_hardirqs_off+0xd/0x10
            [<ffffffff812f6421>] domain_remove_one_dev_info+0x121/0x230
            [<ffffffff812f8b42>] device_notifier+0x72/0x90
            [<ffffffff8157555c>] notifier_call_chain+0x8c/0xc0
            [<ffffffff81089768>] __blocking_notifier_call_chain+0x78/0xb0
            [<ffffffff810897b6>] blocking_notifier_call_chain+0x16/0x20
            [<ffffffff81373a5c>] __device_release_driver+0xbc/0xe0
            [<ffffffff81373ccf>] device_release_driver+0x2f/0x50
            [<ffffffff81372ee3>] driver_unbind+0xa3/0xc0
            [<ffffffff813724ac>] drv_attr_store+0x2c/0x30
            [<ffffffff811e4506>] sysfs_write_file+0xe6/0x170
            [<ffffffff8117569e>] vfs_write+0xce/0x190
            [<ffffffff811759e4>] sys_write+0x54/0xa0
            [<ffffffff81579a82>] system_call_fastpath+0x16/0x1b
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
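The shape of the fix described above (release the outer lock before taking the other one) can be sketched with userspace mutexes standing in for the kernel spinlocks. All names here are hypothetical, and the bodies are placeholders rather than the driver's logic.

```c
#include <pthread.h>

static pthread_mutex_t sketch_iommu_lock = PTHREAD_MUTEX_INITIALIZER;   /* lock A */
static pthread_mutex_t sketch_domain_lock = PTHREAD_MUTEX_INITIALIZER;  /* lock B */
static int sketch_mapped;   /* only modified while lock A is held */

/* Map path: takes A, then B inside it. */
static void sketch_map_path(void)
{
    pthread_mutex_lock(&sketch_iommu_lock);
    pthread_mutex_lock(&sketch_domain_lock);
    sketch_mapped++;
    pthread_mutex_unlock(&sketch_domain_lock);
    pthread_mutex_unlock(&sketch_iommu_lock);
}

/* Remove path after the fix: B is dropped before A is taken, so the two
 * locks are never held in B-then-A order. */
static void sketch_remove_path(void)
{
    pthread_mutex_lock(&sketch_domain_lock);
    /* ... unlink the device info from the shared list here ... */
    pthread_mutex_unlock(&sketch_domain_lock);

    pthread_mutex_lock(&sketch_iommu_lock);
    sketch_mapped--;
    pthread_mutex_unlock(&sketch_iommu_lock);
}
```

Because the remove path never holds B while waiting for A, the only nesting left is A-then-B, and two threads running these paths concurrently cannot block each other in a cycle.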
  11. 21 Sep 2011, 3 commits
  12. 13 Sep 2011, 1 commit
  13. 21 Jun 2011, 1 commit
    • x86/ia64: intel-iommu: move to drivers/iommu/ · 166e9278
      Committed by Ohad Ben-Cohen
      This should ease finding similarities with different platforms,
      with the intention of solving problems once in a generic framework
      which everyone can use.
      
      Note: to move intel-iommu.c, the declaration of pci_find_upstream_pcie_bridge()
      has to move from drivers/pci/pci.h to include/linux/pci.h. This is handled
      in this patch, too.
      
      As suggested, also drop DMAR's EXPERIMENTAL tag while we're at it.
      
      Compile-tested on x86_64.
      Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
      Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
  14. 08 Jun 2011, 1 commit
  15. 01 Jun 2011, 8 commits
    • intel-iommu: Fix off-by-one in RMRR setup · 70e535d1
      Committed by David Woodhouse
      We were mapping an extra byte (and hence usually an extra page):
      iommu_prepare_identity_map() expects to be given an 'end' argument which
      is the last byte to be mapped; not the first byte *not* to be mapped.
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
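A small sketch of why an exclusive end handed to an inclusive-end interface usually costs a whole extra page. The helper is hypothetical and assumes 4KiB pages; it only counts the pages touched by a byte range whose last byte is given.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12u   /* assumes 4KiB pages */

/* Pages touched when 'last' is the final byte to map (inclusive end). */
static uint64_t sketch_pages_inclusive(uint64_t start, uint64_t last)
{
    return (last >> SKETCH_PAGE_SHIFT) - (start >> SKETCH_PAGE_SHIFT) + 1;
}
```

For a 4KiB region starting at 0x1000, passing the last byte 0x1fff yields one page, while passing the first unmapped byte 0x2000 yields two.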
    • intel-iommu: Add domain check in domain_remove_one_dev_info · 8519dc44
      Committed by Mike Habeck
      The comment in domain_remove_one_dev_info() states "No need to compare
      PCI domain; it has to be the same". But for the si_domain that isn't
      going to be true, as it consists of all the PCI devices that are
      identity mapped thus multiple PCI domains can be in si_domain.  The
      code needs to validate the PCI domain too.
      Signed-off-by: Mike Habeck <habeck@sgi.com>
      Signed-off-by: Mike Travis <travis@sgi.com>
      Cc: stable@kernel.org
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
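The point above is that bus and device/function numbers alone do not identify a device once several PCI domains (segments) share one IOMMU domain. A minimal sketch of a comparison that includes the segment, using a hypothetical structure rather than the driver's own:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-device record; field names are for illustration only. */
struct sketch_dev_info {
    uint16_t segment;   /* PCI domain (segment) number */
    uint8_t bus;
    uint8_t devfn;
};

/* Two records match only when segment, bus and devfn all agree. */
static bool sketch_same_device(const struct sketch_dev_info *a,
                               const struct sketch_dev_info *b)
{
    return a->segment == b->segment && a->bus == b->bus &&
           a->devfn == b->devfn;
}
```

Two devices at bus 1, devfn 8 in segments 0 and 1 would be confused by a bus/devfn-only comparison; with the segment included they are distinct.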
    • intel-iommu: Remove Host Bridge devices from identity mapping · 825507d6
      Committed by Mike Travis
      When using the 1:1 (identity) PCI DMA remapping, PCI Host Bridge devices
      that do not use the IOMMU cause a kernel panic.  Fix that by not
      inserting those devices into the si_domain.
      Signed-off-by: Mike Travis <travis@sgi.com>
      Reviewed-by: Mike Habeck <habeck@sgi.com>
      Cc: stable@kernel.org
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
    • intel-iommu: Use coherent DMA mask when requested · c681d0ba
      Committed by Mike Travis
      The __intel_map_single function is not honoring the passed in DMA mask.
      This results in not using the coherent DMA mask when called from
      intel_alloc_coherent().
      Signed-off-by: Mike Travis <travis@sgi.com>
      Acked-by: Chris Wright <chrisw@sous-sol.org>
      Reviewed-by: Mike Habeck <habeck@sgi.com>
      Cc: stable@kernel.org
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
    • intel-iommu: Speed up processing of the identity_mapping function · cb452a40
      Committed by Mike Travis
      When there are a large number of PCI devices, and the pass-through
      option for iommu is set, much time is spent in the identity_mapping
      function hunting through the iommu domains to check if a specific
      device is "identity mapped".
      
      Speed up the function by checking the cached info to see if
      it's mapped to the static identity domain.
      Signed-off-by: Mike Travis <travis@sgi.com>
      Reviewed-by: Mike Habeck <habeck@sgi.com>
      Cc: stable@kernel.org
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
    • intel-iommu: Check for identity mapping candidate using system dma mask · 8fcc5372
      Committed by Chris Wright
      The identity mapping code appears to make the assumption that if the
      device's dma_mask is greater than 32 bits the device can use identity
      mapping.  But that is not true: take the case where we have a 40-bit
      device
      in a 44-bit architecture. The device can potentially receive a
      physical address that it will truncate and cause incorrect addresses
      to be used.
      
      Instead check to see if the device's dma_mask is large enough
      to address the system's dma_mask.
      Signed-off-by: Mike Travis <travis@sgi.com>
      Reviewed-by: Mike Habeck <habeck@sgi.com>
      Cc: stable@kernel.org
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
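The 40-bit device in a 44-bit system case above can be worked through with a short sketch. The function names are hypothetical and the two checks only illustrate the old and new reasoning; they are not the driver's code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask with the low 'bits' bits set (all ones for 64 or more). */
static uint64_t sketch_dma_bit_mask(unsigned int bits)
{
    return bits >= 64 ? ~UINT64_C(0) : (UINT64_C(1) << bits) - 1;
}

/* Old heuristic: anything wider than 32 bits was treated as a candidate. */
static bool sketch_old_check(uint64_t dev_mask)
{
    return dev_mask > sketch_dma_bit_mask(32);
}

/* New idea: the device must reach every address the system can hand out. */
static bool sketch_new_check(uint64_t dev_mask, uint64_t system_mask)
{
    return dev_mask >= system_mask;
}
```

A 40-bit device passes the old check, yet a 44-bit physical address would be truncated by it; the new comparison against the system mask rejects that device and still accepts a 64-bit one.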
    • intel-iommu: Only unlink device domains from iommu · 9b4554b2
      Committed by Alex Williamson
      Commit a97590e5 added unlinking domains from iommus to reciprocate the
      iommu from domains unlinking that was already done.  We actually want
      to only do this for device domains and never for the static
      identity map domain or VM domains.  The SI domain is special and
      never freed, while VM domain->id lives in their own special address
      space, separate from iommu->domain_ids.
      
      In the current code, a VM can get domain->id zero, then mark that
      domain unused when unbound from pci-stub.  This leads to DMAR
      write faults when the device is re-bound to the host driver.
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      Cc: stable@kernel.org
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
    • intel-iommu: Enable super page (2MiB, 1GiB, etc.) support · 6dd9a7c7
      Committed by Youquan Song
      There are no externally-visible changes with this. In the loop in the
      internal __domain_mapping() function, we simply detect if we are mapping:
        - size >= 2MiB, and
        - virtual address aligned to 2MiB, and
        - physical address aligned to 2MiB, and
        - on hardware that supports superpages.
      
      (and likewise for larger superpages).
      
      We automatically use a superpage for such mappings. We never have to
      worry about *breaking* superpages, since we trust that we will always
      *unmap* the same range that was mapped. So all we need to do is ensure
      that dma_pte_clear_range() will also cope with superpages.
      
      Adjust pfn_to_dma_pte() to take a superpage 'level' as an argument, so
      it can return a PTE at the appropriate level rather than always
      extending the page tables all the way down to level 1. Again, this is
      simplified by the fact that we should never encounter existing small
      pages when we're creating a mapping; any old mapping that used the same
      virtual range will have been entirely removed and its obsolete page
      tables freed.
      
      Provide an 'intel_iommu=sp_off' argument on the command line as a
      chicken bit. Not that it should ever be required.
      
      ==
      
      The original commit seen in the iommu-2.6.git was Youquan's
      implementation (and completion) of my own half-baked code which I'd
      typed into an email. Followed by half a dozen subsequent 'fixes'.
      
      I've taken the unusual step of rewriting history and collapsing the
      original commits in order to keep the main history simpler, and make
      life easier for the people who are going to have to backport this to
      older kernels. And also so I can give it a more coherent commit comment
      which (hopefully) gives a better explanation of what's going on.
      
      The original sequence of commits leading to identical code was:
      
      Youquan Song (3):
            intel-iommu: super page support
            intel-iommu: Fix superpage alignment calculation error
            intel-iommu: Fix superpage level calculation error in dma_pfn_level_pte()
      
      David Woodhouse (4):
            intel-iommu: Precalculate superpage support for dmar_domain
            intel-iommu: Fix hardware_largepage_caps()
            intel-iommu: Fix inappropriate use of superpages in __domain_mapping()
            intel-iommu: Fix phys_pfn in __domain_mapping for sglist pages
      Signed-off-by: Youquan Song <youquan.song@intel.com>
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
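The superpage conditions listed above (size, virtual alignment, physical alignment, hardware support) can be sketched as a level-selection helper. This is an illustration under stated assumptions: 4KiB base pages, a factor of 512 per level, and a hypothetical `hw_max_level` parameter standing in for the hardware capability.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT   12u   /* 4KiB base pages */
#define SKETCH_LEVEL_STRIDE 9u    /* each level up multiplies the page size by 512 */

/* Returns 1 for 4KiB, 2 for 2MiB, 3 for 1GiB, and so on, never exceeding
 * hw_max_level (1 means the hardware has no superpage support). */
static unsigned int sketch_pick_level(uint64_t iova, uint64_t phys,
                                      uint64_t size, unsigned int hw_max_level)
{
    unsigned int level = 1;

    if (hw_max_level > 4)   /* keep the shift below well within 64 bits */
        hw_max_level = 4;

    while (level < hw_max_level) {
        uint64_t next = UINT64_C(1)
                        << (SKETCH_PAGE_SHIFT + SKETCH_LEVEL_STRIDE * level);

        /* Stop if the mapping is too small or either address is misaligned. */
        if (size < next || ((iova | phys) & (next - 1)) != 0)
            break;
        level++;
    }
    return level;
}
```

A 2MiB mapping with both addresses 2MiB-aligned gets level 2 when the hardware allows it; shrinking the size, misaligning either address, or passing `hw_max_level` of 1 falls back to level 1.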
  16. 24 May 2011, 2 commits
  17. 21 Apr 2011, 1 commit
    • intel_iommu: disable all VT-d PMRs when TXT launched · 51a63e67
      Committed by Joseph Cihula
      Intel VT-d Protected Memory Regions (PMRs) are supposed to be disabled,
      on each VT-d engine, after DMA remapping is enabled on the engines.
      This is because the behavior of having both enabled is not deterministic
      and because, if TXT has been used to launch the kernel, the PMRs may be
      programmed to cover memory regions that will be used for DMA.
      
      Under some circumstances (certain quirks detected, lack of multiple
      devices, etc.), the current code does not set up DMA remapping on some
      VT-d engines.  In such cases it also skips disabling the PMRs.  This
      causes failures when the kernel is launched with TXT (most often this
      occurs on the graphics engine and results in colored vertical bars on
      the display).
      
      This patch detects when the kernel has been launched with TXT and then
      disables the PMRs on all VT-d engines.  In some cases where the reason
      that remapping is not being enabled is due to possible ACPI DMAR table
      errors, the VT-d engine addresses may not be correct and thus not able
      to be safely programmed even to disable PMRs.  Because part of the TXT
      launch process is the verification of these addresses, it will always be
      safe to disable PMRs if the TXT launch has succeeded, so this is done
      only in such cases.
      Signed-off-by: Joseph Cihula <joseph.cihula@intel.com>
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
  18. 11 Apr 2011, 1 commit
  19. 31 Mar 2011, 1 commit
  20. 29 Mar 2011, 1 commit
  21. 24 Mar 2011, 1 commit
  22. 12 Mar 2011, 2 commits