1. 28 Jan, 2021 1 commit
    • iommu/vt-d: Do not use flush-queue when caching-mode is on · 29b32839
      Committed by Nadav Amit
      When an Intel IOMMU is virtualized, and a physical device is
      passed-through to the VM, changes of the virtual IOMMU need to be
      propagated to the physical IOMMU. The hypervisor therefore needs to
      monitor PTE mappings in the IOMMU page-tables. The Intel specification
      provides a "caching-mode" capability that a virtual IOMMU uses to report
      that the IOMMU is virtualized and that a TLB flush is needed after mapping,
      so that the hypervisor can propagate virtual IOMMU mappings to the physical
      IOMMU. To the best of my knowledge no real physical IOMMU reports
      "caching-mode" as turned on.
      
      Synchronizing the virtual and the physical IOMMU tables is expensive if
      the hypervisor is unaware which PTEs have changed, as the hypervisor is
      required to walk all the virtualized tables and look for changes.
      Consequently, domain flushes are much more expensive than page-specific
      flushes on virtualized IOMMUs with passthrough devices. The kernel
      therefore exploited the "caching-mode" indication to avoid domain
      flushing and use page-specific flushing in virtualized environments. See
      commit 78d5f0f5 ("intel-iommu: Avoid global flushes with caching
      mode.")
      
      This behavior changed after commit 13cf0174 ("iommu/vt-d: Make use
      of iova deferred flushing"). Now, when batched TLB flushing is used (the
      default), full TLB domain flushes are performed frequently, requiring
      the hypervisor to perform expensive synchronization between the virtual
      TLB and the physical one.
      
      Getting batched TLB flushes to use page-specific invalidations again in
      such circumstances is not easy, since the TLB invalidation scheme
      assumes that "full" domain TLB flushes are performed for scalability.
      
      Disable batched TLB flushes when caching-mode is on, as the performance
      benefit from using batched TLB invalidations is likely to be much
      smaller than the overhead of the virtual-to-physical IOMMU page-tables
      synchronization.
      
      Fixes: 13cf0174 ("iommu/vt-d: Make use of iova deferred flushing")
      Signed-off-by: Nadav Amit <namit@vmware.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Lu Baolu <baolu.lu@linux.intel.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: stable@vger.kernel.org
      Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
      Link: https://lore.kernel.org/r/20210127175317.1600473-1-namit@vmware.com
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      29b32839
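The decision described in this commit can be sketched in user-space C. This is a minimal model, not the driver's code: `struct iommu_unit` and `use_flush_queue()` are hypothetical names, and the only hardware fact assumed is that Caching Mode (CM) is bit 7 of the VT-d capability register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Caching Mode (CM) is bit 7 of the VT-d capability register. */
#define CAP_CACHING_MODE(cap) (((cap) >> 7) & 1)

/* Hypothetical stand-in for the driver's per-IOMMU state. */
struct iommu_unit {
    uint64_t cap; /* raw capability register value */
};

/*
 * Decide whether unmaps may be batched in a flush queue.
 * strict: immediate invalidation was already requested elsewhere.
 */
static bool use_flush_queue(const struct iommu_unit *units, size_t n, bool strict)
{
    assert(units != NULL || n == 0);

    if (strict)
        return false;

    for (size_t i = 0; i < n; i++) {
        /* Virtual IOMMU: domain-wide flushes are costly, so do not batch. */
        if (CAP_CACHING_MODE(units[i].cap))
            return false;
    }
    return true;
}
```

A single unit reporting caching mode is enough to turn batching off in this sketch, which matches the commit's reasoning that the synchronization overhead likely outweighs the batching benefit.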
  2. 13 Jan, 2021 1 commit
  3. 07 Jan, 2021 2 commits
  4. 01 Dec, 2020 1 commit
  5. 27 Nov, 2020 1 commit
  6. 26 Nov, 2020 1 commit
  7. 25 Nov, 2020 5 commits
  8. 23 Nov, 2020 1 commit
  9. 18 Nov, 2020 2 commits
  10. 03 Nov, 2020 1 commit
    • iommu/vt-d: Fix kernel NULL pointer dereference in find_domain() · 6097df45
      Committed by Lu Baolu
      If find_domain() is called for a device which hasn't been probed by the
      iommu core, the kernel NULL pointer dereference below occurs.
      
      [  362.736947] BUG: kernel NULL pointer dereference, address: 0000000000000038
      [  362.743953] #PF: supervisor read access in kernel mode
      [  362.749115] #PF: error_code(0x0000) - not-present page
      [  362.754278] PGD 0 P4D 0
      [  362.756843] Oops: 0000 [#1] SMP NOPTI
      [  362.760528] CPU: 0 PID: 844 Comm: cat Not tainted 5.9.0-rc4-intel-next+ #1
      [  362.767428] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake
                     U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3384.A02.1909200816
                     09/20/2019
      [  362.781109] RIP: 0010:find_domain+0xd/0x40
      [  362.785234] Code: 48 81 fb 60 28 d9 b2 75 de 5b 41 5c 41 5d 5d c3 0f 1f 00 66
                           2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 87 e0 02 00
                           00 55 <48> 8b 40 38 48 89 e5 48 83 f8 fe 0f 94 c1 48 85 ff
                           0f 94 c2 08 d1
      [  362.804041] RSP: 0018:ffffb09cc1f0bd38 EFLAGS: 00010046
      [  362.809292] RAX: 0000000000000000 RBX: ffff905b98e4fac8 RCX: 0000000000000000
      [  362.816452] RDX: 0000000000000001 RSI: ffff905b98e4fac8 RDI: ffff905b9ccd40d0
      [  362.823617] RBP: ffffb09cc1f0bda0 R08: ffffb09cc1f0bd48 R09: 000000000000000f
      [  362.830778] R10: ffffffffb266c080 R11: ffff905b9042602d R12: ffff905b98e4fac8
      [  362.837944] R13: ffffb09cc1f0bd48 R14: ffff905b9ccd40d0 R15: ffff905b98e4fac8
      [  362.845108] FS:  00007f8485460740(0000) GS:ffff905b9fc00000(0000)
                     knlGS:0000000000000000
      [  362.853227] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  362.858996] CR2: 0000000000000038 CR3: 00000004627a6003 CR4: 0000000000770ef0
      [  362.866161] PKRU: fffffffc
      [  362.868890] Call Trace:
      [  362.871363]  ? show_device_domain_translation+0x32/0x100
      [  362.876700]  ? bind_store+0x110/0x110
      [  362.880387]  ? klist_next+0x91/0x120
      [  362.883987]  ? domain_translation_struct_show+0x50/0x50
      [  362.889237]  bus_for_each_dev+0x79/0xc0
      [  362.893121]  domain_translation_struct_show+0x36/0x50
      [  362.898204]  seq_read+0x135/0x410
      [  362.901545]  ? handle_mm_fault+0xeb8/0x1750
      [  362.905755]  full_proxy_read+0x5c/0x90
      [  362.909526]  vfs_read+0xa6/0x190
      [  362.912782]  ksys_read+0x61/0xe0
      [  362.916037]  __x64_sys_read+0x1a/0x20
      [  362.919725]  do_syscall_64+0x37/0x80
      [  362.923329]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [  362.928405] RIP: 0033:0x7f84855c5e95
      
      Filter out those devices to avoid such error.
      
      Fixes: e2726dae ("iommu/vt-d: debugfs: Add support to show page table internals")
      Reported-and-tested-by: Xu Pengfei <pengfei.xu@intel.com>
      Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
      Cc: stable@vger.kernel.org # v5.6+
      Link: https://lore.kernel.org/r/20201028070725.24979-1-baolu.lu@linux.intel.com
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      6097df45
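The filtering this commit describes can be illustrated with a small user-space model. All structure layouts and the name `find_domain_sketch()` are hypothetical simplifications; the sketch only assumes that a device's IOMMU bookkeeping pointer stays NULL until the iommu core has probed the device, and the actual patch may differ in detail.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct dmar_domain { int id; };
struct device_domain_info { struct dmar_domain *domain; };
struct dev_iommu { void *priv; };           /* set up when the core probes the device */
struct device { struct dev_iommu *iommu; }; /* NULL until the device is probed */

/* Return the attached domain, or NULL for devices the core never probed. */
static struct dmar_domain *find_domain_sketch(const struct device *dev)
{
    const struct device_domain_info *info;

    /* Filter out unprobed devices instead of dereferencing a NULL pointer. */
    if (dev == NULL || dev->iommu == NULL)
        return NULL;

    info = dev->iommu->priv;
    return info ? info->domain : NULL;
}

/* Usage: a probed device yields its domain, an unprobed one yields NULL. */
static void find_domain_demo(void)
{
    struct dmar_domain dom = { 1 };
    struct device_domain_info info = { &dom };
    struct dev_iommu core = { &info };
    struct device probed = { &core };
    struct device unprobed = { NULL };

    assert(find_domain_sketch(&probed) == &dom);
    assert(find_domain_sketch(&unprobed) == NULL);
}
```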
  11. 02 Nov, 2020 1 commit
  12. 06 Oct, 2020 2 commits
  13. 01 Oct, 2020 3 commits
  14. 25 Sep, 2020 1 commit
    • dma-mapping: add a new dma_alloc_pages API · efa70f2f
      Committed by Christoph Hellwig
      This API is the equivalent of alloc_pages, except that the returned memory
      is guaranteed to be DMA addressable by the passed in device.  The
      implementation will also be used to provide a more sensible replacement
      for the DMA_ATTR_NON_CONSISTENT flag.
      
      Additionally dma_alloc_noncoherent is switched over to use dma_alloc_pages
      as its backend.

      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> (MIPS part)
      efa70f2f
  15. 24 Sep, 2020 1 commit
  16. 18 Sep, 2020 1 commit
  17. 11 Sep, 2020 1 commit
  18. 04 Sep, 2020 2 commits
    • iommu/vt-d: Handle 36bit addressing for x86-32 · 29aaebbc
      Committed by Chris Wilson
      Beware that the address size for x86-32 may exceed unsigned long.
      
      [    0.368971] UBSAN: shift-out-of-bounds in drivers/iommu/intel/iommu.c:128:14
      [    0.369055] shift exponent 36 is too large for 32-bit type 'long unsigned int'
      
      If we don't handle the wide addresses, the pages are mismapped and the
      device read/writes go astray, detected as DMAR faults and leading to
      device failure. The behaviour changed (from working to broken) in commit
      fa954e68 ("iommu/vt-d: Delegate the dma domain to upper layer"), but
      the error looks older.
      
      Fixes: fa954e68 ("iommu/vt-d: Delegate the dma domain to upper layer")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
      Cc: James Sewart <jamessewart@arista.com>
      Cc: Lu Baolu <baolu.lu@linux.intel.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: <stable@vger.kernel.org> # v5.3+
      Link: https://lore.kernel.org/r/20200822160209.28512-1-chris@chris-wilson.co.uk
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      29aaebbc
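The UBSAN report above (shift exponent 36 on a 32-bit `unsigned long`) can be avoided by doing the arithmetic in a type that is 64 bits wide on every architecture. The helpers below are a sketch under assumptions: their names and the 9-bit stride are modeled on page-table level helpers and are not copied from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define LEVEL_STRIDE 9 /* assumed: each page-table level resolves 9 bits */

static inline unsigned int level_to_offset_bits(int level)
{
    assert(level >= 1 && level <= 8); /* keeps every shift below 64 bits */
    return (unsigned int)(level - 1) * LEVEL_STRIDE;
}

/* 64-bit mask: stays well defined where unsigned long has only 32 bits. */
static inline uint64_t level_mask(int level)
{
    return ~UINT64_C(0) << level_to_offset_bits(level);
}

/* Number of page frames covered by one entry at this level. */
static inline uint64_t level_size(int level)
{
    return UINT64_C(1) << level_to_offset_bits(level);
}

/* Round a page frame number up to the next boundary of this level. */
static inline uint64_t align_to_level(uint64_t pfn, int level)
{
    return (pfn + level_size(level) - 1) & level_mask(level);
}
```

With these definitions, level 5 gives a shift of 36 bits, the exponent from the report, and the result no longer depends on the width of `unsigned long`.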
    • iommu/vt-d: Fix NULL pointer dereference in dev_iommu_priv_set() · 2d33b7d6
      Committed by Lu Baolu
      dev_iommu_priv_set() must be called after probe_device(). This fixes
      a NULL pointer dereference bug when booting a system with the kernel cmdline
      "intel_iommu=on,igfx_off", where dev_iommu_priv_set() is abused.
      
      The following stacktrace was produced:
      
       Command line: BOOT_IMAGE=/isolinux/bzImage console=tty1 intel_iommu=on,igfx_off
       ...
       DMAR: Host address width 39
       DMAR: DRHD base: 0x000000fed90000 flags: 0x0
       DMAR: dmar0: reg_base_addr fed90000 ver 1:0 cap 1c0000c40660462 ecap 19e2ff0505e
       DMAR: DRHD base: 0x000000fed91000 flags: 0x1
       DMAR: dmar1: reg_base_addr fed91000 ver 1:0 cap d2008c40660462 ecap f050da
       DMAR: RMRR base: 0x0000009aa9f000 end: 0x0000009aabefff
       DMAR: RMRR base: 0x0000009d000000 end: 0x0000009f7fffff
       DMAR: No ATSR found
       BUG: kernel NULL pointer dereference, address: 0000000000000038
       #PF: supervisor write access in kernel mode
       #PF: error_code(0x0002) - not-present page
       PGD 0 P4D 0
       Oops: 0002 [#1] SMP PTI
       CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.9.0-devel+ #2
       Hardware name: LENOVO 20HGS0TW00/20HGS0TW00, BIOS N1WET46S (1.25s ) 03/30/2018
       RIP: 0010:intel_iommu_init+0xed0/0x1136
       Code: fe e9 61 02 00 00 bb f4 ff ff ff e9 57 02 00 00 48 63 d1 48 c1 e2 04 48
             03 50 20 48 8b 12 48 85 d2 74 0b 48 8b 92 d0 02 00 00 48 89 7a 38 ff c1
             e9 15 f5 ff ff 48 c7 c7 60 99 ac a7 49 c7 c7 a0
       RSP: 0000:ffff96d180073dd0 EFLAGS: 00010282
       RAX: ffff8c91037a7d20 RBX: 0000000000000000 RCX: 0000000000000000
       RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffffffffff
       RBP: ffff96d180073e90 R08: 0000000000000001 R09: ffff8c91039fe3c0
       R10: 0000000000000226 R11: 0000000000000226 R12: 000000000000000b
       R13: ffff8c910367c650 R14: ffffffffa8426d60 R15: 0000000000000000
       FS:  0000000000000000(0000) GS:ffff8c9107480000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 0000000000000038 CR3: 00000004b100a001 CR4: 00000000003706e0
       Call Trace:
        ? _raw_spin_unlock_irqrestore+0x1f/0x30
        ? call_rcu+0x10e/0x320
        ? trace_hardirqs_on+0x2c/0xd0
        ? rdinit_setup+0x2c/0x2c
        ? e820__memblock_setup+0x8b/0x8b
        pci_iommu_init+0x16/0x3f
        do_one_initcall+0x46/0x1e4
        kernel_init_freeable+0x169/0x1b2
        ? rest_init+0x9f/0x9f
        kernel_init+0xa/0x101
        ret_from_fork+0x22/0x30
       Modules linked in:
       CR2: 0000000000000038
       ---[ end trace 3653722a6f936f18 ]---
      
      Fixes: 01b9d4e2 ("iommu/vt-d: Use dev_iommu_priv_get/set()")
      Reported-by: Torsten Hilbrich <torsten.hilbrich@secunet.com>
      Reported-by: Wendy Wang <wendy.wang@intel.com>
      Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
      Tested-by: Torsten Hilbrich <torsten.hilbrich@secunet.com>
      Link: https://lore.kernel.org/linux-iommu/96717683-70be-7388-3d2f-61131070a96a@secunet.com/
      Link: https://lore.kernel.org/r/20200903065132.16879-1-baolu.lu@linux.intel.com
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      2d33b7d6
  19. 24 Aug, 2020 1 commit
  20. 24 Jul, 2020 8 commits
  21. 17 Jul, 2020 1 commit
  22. 11 Jul, 2020 1 commit
  23. 30 Jun, 2020 1 commit