1. 04 9月, 2019 3 次提交
  2. 03 9月, 2019 1 次提交
  3. 21 8月, 2019 1 次提交
  4. 11 8月, 2019 1 次提交
    • C
      dma-mapping: fix page attributes for dma_mmap_* · 33dcb37c
      Christoph Hellwig 提交于
      All the way back to introducing dma_common_mmap we've defaulted to mark
      the pages as uncached.  But this is wrong for DMA coherent devices.
      Later on DMA_ATTR_WRITE_COMBINE also got incorrect treatment as that
      flag is only treated special on the alloc side for non-coherent devices.
      
      Introduce a new dma_pgprot helper that deals with the check for coherent
      devices so that only the remapping cases ever reach arch_dma_mmap_pgprot
      and we thus ensure no aliasing of page attributes happens, which makes
      the powerpc version of arch_dma_mmap_pgprot obsolete and simplifies the
      remaining ones.
      
      Note that this means arch_dma_mmap_pgprot is a bit misnamed now, but
      we'll phase it out soon.
      
      Fixes: 64ccc9c0 ("common: dma-mapping: add support for generic dma_mmap_* calls")
      Reported-by: NShawn Anastasio <shawn@anastas.io>
      Reported-by: NGavin Li <git@thegavinli.com>
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com> # arm64
      33dcb37c
  5. 09 8月, 2019 4 次提交
  6. 06 8月, 2019 3 次提交
  7. 23 7月, 2019 1 次提交
  8. 22 7月, 2019 8 次提交
    • S
      iommu/vt-d: Print pasid table entries MSB to LSB in debugfs · 7f6cade5
      Sai Praneeth Prakhya 提交于
      Commit dd5142ca ("iommu/vt-d: Add debugfs support to show scalable mode
      DMAR table internals") prints content of pasid table entries from LSB to
      MSB where as other entries are printed MSB to LSB. So, to maintain
      uniformity among all entries and to not confuse the user, print MSB first.
      
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Lu Baolu <baolu.lu@linux.intel.com>
      Cc: Sohil Mehta <sohil.mehta@intel.com>
      Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
      Signed-off-by: NSai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Fixes: dd5142ca ("iommu/vt-d: Add debugfs support to show scalable mode DMAR table internals")
      Signed-off-by: NJoerg Roedel <jroedel@suse.de>
      7f6cade5
    • J
      iommu/virtio: Update to most recent specification · ae24fb49
      Jean-Philippe Brucker 提交于
      Following specification review a few things were changed in v8 of the
      virtio-iommu series [1], but have been omitted when merging the base
      driver. Add them now:
      
      * Remove the EXEC flag.
      * Add feature bit for the MMIO flag.
      * Change domain_bits to domain_range.
      * Add NOMEM status flag.
      
      [1] https://lore.kernel.org/linux-iommu/20190530170929.19366-1-jean-philippe.brucker@arm.com/
      
      Fixes: edcd69ab ("iommu: Add virtio-iommu driver")
      Reported-by: NEric Auger <eric.auger@redhat.com>
      Signed-off-by: NJean-Philippe Brucker <jean-philippe@linaro.org>
      Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
      Reviewed-by: NEric Auger <eric.auger@redhat.com>
      Tested-by: NEric Auger <eric.auger@redhat.com>
      Acked-by: NJoerg Roedel <jroedel@suse.de>
      ae24fb49
    • C
      iommu/iova: Remove stale cached32_node · 9eed17d3
      Chris Wilson 提交于
      Since the cached32_node is allowed to be advanced above dma_32bit_pfn
      (to provide a shortcut into the limited range), we need to be careful to
      remove the to be freed node if it is the cached32_node.
      
      [   48.477773] BUG: KASAN: use-after-free in __cached_rbnode_delete_update+0x68/0x110
      [   48.477812] Read of size 8 at addr ffff88870fc19020 by task kworker/u8:1/37
      [   48.477843]
      [   48.477879] CPU: 1 PID: 37 Comm: kworker/u8:1 Tainted: G     U            5.2.0+ #735
      [   48.477915] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS BNKBL357.86A.0052.2017.0918.1346 09/18/2017
      [   48.478047] Workqueue: i915 __i915_gem_free_work [i915]
      [   48.478075] Call Trace:
      [   48.478111]  dump_stack+0x5b/0x90
      [   48.478137]  print_address_description+0x67/0x237
      [   48.478178]  ? __cached_rbnode_delete_update+0x68/0x110
      [   48.478212]  __kasan_report.cold.3+0x1c/0x38
      [   48.478240]  ? __cached_rbnode_delete_update+0x68/0x110
      [   48.478280]  ? __cached_rbnode_delete_update+0x68/0x110
      [   48.478308]  __cached_rbnode_delete_update+0x68/0x110
      [   48.478344]  private_free_iova+0x2b/0x60
      [   48.478378]  iova_magazine_free_pfns+0x46/0xa0
      [   48.478403]  free_iova_fast+0x277/0x340
      [   48.478443]  fq_ring_free+0x15a/0x1a0
      [   48.478473]  queue_iova+0x19c/0x1f0
      [   48.478597]  cleanup_page_dma.isra.64+0x62/0xb0 [i915]
      [   48.478712]  __gen8_ppgtt_cleanup+0x63/0x80 [i915]
      [   48.478826]  __gen8_ppgtt_cleanup+0x42/0x80 [i915]
      [   48.478940]  __gen8_ppgtt_clear+0x433/0x4b0 [i915]
      [   48.479053]  __gen8_ppgtt_clear+0x462/0x4b0 [i915]
      [   48.479081]  ? __sg_free_table+0x9e/0xf0
      [   48.479116]  ? kfree+0x7f/0x150
      [   48.479234]  i915_vma_unbind+0x1e2/0x240 [i915]
      [   48.479352]  i915_vma_destroy+0x3a/0x280 [i915]
      [   48.479465]  __i915_gem_free_objects+0xf0/0x2d0 [i915]
      [   48.479579]  __i915_gem_free_work+0x41/0xa0 [i915]
      [   48.479607]  process_one_work+0x495/0x710
      [   48.479642]  worker_thread+0x4c7/0x6f0
      [   48.479687]  ? process_one_work+0x710/0x710
      [   48.479724]  kthread+0x1b2/0x1d0
      [   48.479774]  ? kthread_create_worker_on_cpu+0xa0/0xa0
      [   48.479820]  ret_from_fork+0x1f/0x30
      [   48.479864]
      [   48.479907] Allocated by task 631:
      [   48.479944]  save_stack+0x19/0x80
      [   48.479994]  __kasan_kmalloc.constprop.6+0xc1/0xd0
      [   48.480038]  kmem_cache_alloc+0x91/0xf0
      [   48.480082]  alloc_iova+0x2b/0x1e0
      [   48.480125]  alloc_iova_fast+0x58/0x376
      [   48.480166]  intel_alloc_iova+0x90/0xc0
      [   48.480214]  intel_map_sg+0xde/0x1f0
      [   48.480343]  i915_gem_gtt_prepare_pages+0xb8/0x170 [i915]
      [   48.480465]  huge_get_pages+0x232/0x2b0 [i915]
      [   48.480590]  ____i915_gem_object_get_pages+0x40/0xb0 [i915]
      [   48.480712]  __i915_gem_object_get_pages+0x90/0xa0 [i915]
      [   48.480834]  i915_gem_object_prepare_write+0x2d6/0x330 [i915]
      [   48.480955]  create_test_object.isra.54+0x1a9/0x3e0 [i915]
      [   48.481075]  igt_shared_ctx_exec+0x365/0x3c0 [i915]
      [   48.481210]  __i915_subtests.cold.4+0x30/0x92 [i915]
      [   48.481341]  __run_selftests.cold.3+0xa9/0x119 [i915]
      [   48.481466]  i915_live_selftests+0x3c/0x70 [i915]
      [   48.481583]  i915_pci_probe+0xe7/0x220 [i915]
      [   48.481620]  pci_device_probe+0xe0/0x180
      [   48.481665]  really_probe+0x163/0x4e0
      [   48.481710]  device_driver_attach+0x85/0x90
      [   48.481750]  __driver_attach+0xa5/0x180
      [   48.481796]  bus_for_each_dev+0xda/0x130
      [   48.481831]  bus_add_driver+0x205/0x2e0
      [   48.481882]  driver_register+0xca/0x140
      [   48.481927]  do_one_initcall+0x6c/0x1af
      [   48.481970]  do_init_module+0x106/0x350
      [   48.482010]  load_module+0x3d2c/0x3ea0
      [   48.482058]  __do_sys_finit_module+0x110/0x180
      [   48.482102]  do_syscall_64+0x62/0x1f0
      [   48.482147]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [   48.482190]
      [   48.482224] Freed by task 37:
      [   48.482273]  save_stack+0x19/0x80
      [   48.482318]  __kasan_slab_free+0x12e/0x180
      [   48.482363]  kmem_cache_free+0x70/0x140
      [   48.482406]  __free_iova+0x1d/0x30
      [   48.482445]  fq_ring_free+0x15a/0x1a0
      [   48.482490]  queue_iova+0x19c/0x1f0
      [   48.482624]  cleanup_page_dma.isra.64+0x62/0xb0 [i915]
      [   48.482749]  __gen8_ppgtt_cleanup+0x63/0x80 [i915]
      [   48.482873]  __gen8_ppgtt_cleanup+0x42/0x80 [i915]
      [   48.482999]  __gen8_ppgtt_clear+0x433/0x4b0 [i915]
      [   48.483123]  __gen8_ppgtt_clear+0x462/0x4b0 [i915]
      [   48.483250]  i915_vma_unbind+0x1e2/0x240 [i915]
      [   48.483378]  i915_vma_destroy+0x3a/0x280 [i915]
      [   48.483500]  __i915_gem_free_objects+0xf0/0x2d0 [i915]
      [   48.483622]  __i915_gem_free_work+0x41/0xa0 [i915]
      [   48.483659]  process_one_work+0x495/0x710
      [   48.483704]  worker_thread+0x4c7/0x6f0
      [   48.483748]  kthread+0x1b2/0x1d0
      [   48.483787]  ret_from_fork+0x1f/0x30
      [   48.483831]
      [   48.483868] The buggy address belongs to the object at ffff88870fc19000
      [   48.483868]  which belongs to the cache iommu_iova of size 40
      [   48.483920] The buggy address is located 32 bytes inside of
      [   48.483920]  40-byte region [ffff88870fc19000, ffff88870fc19028)
      [   48.483964] The buggy address belongs to the page:
      [   48.484006] page:ffffea001c3f0600 refcount:1 mapcount:0 mapping:ffff8888181a91c0 index:0x0 compound_mapcount: 0
      [   48.484045] flags: 0x8000000000010200(slab|head)
      [   48.484096] raw: 8000000000010200 ffffea001c421a08 ffffea001c447e88 ffff8888181a91c0
      [   48.484141] raw: 0000000000000000 0000000000120012 00000001ffffffff 0000000000000000
      [   48.484188] page dumped because: kasan: bad access detected
      [   48.484230]
      [   48.484265] Memory state around the buggy address:
      [   48.484314]  ffff88870fc18f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [   48.484361]  ffff88870fc18f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [   48.484406] >ffff88870fc19000: fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc
      [   48.484451]                                ^
      [   48.484494]  ffff88870fc19080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [   48.484530]  ffff88870fc19100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108602
      Fixes: e60aa7b5 ("iommu/iova: Extend rbtree node caching")
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: <stable@vger.kernel.org> # v4.15+
      Reviewed-by: NRobin Murphy <robin.murphy@arm.com>
      Signed-off-by: NJoerg Roedel <jroedel@suse.de>
      9eed17d3
    • D
      iommu/vt-d: Check if domain->pgd was allocated · 3ee9eca7
      Dmitry Safonov 提交于
      There is a couple of places where on domain_init() failure domain_exit()
      is called. While currently domain_init() can fail only if
      alloc_pgtable_page() has failed.
      
      Make domain_exit() check if domain->pgd present, before calling
      domain_unmap(), as it theoretically should crash on clearing pte entries
      in dma_pte_clear_level().
      
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Lu Baolu <baolu.lu@linux.intel.com>
      Cc: iommu@lists.linux-foundation.org
      Signed-off-by: NDmitry Safonov <dima@arista.com>
      Reviewed-by: NLu Baolu <baolu.lu@linux.intel.com>
      Signed-off-by: NJoerg Roedel <jroedel@suse.de>
      3ee9eca7
    • D
      iommu/vt-d: Don't queue_iova() if there is no flush queue · effa4678
      Dmitry Safonov 提交于
      Intel VT-d driver was reworked to use common deferred flushing
      implementation. Previously there was one global per-cpu flush queue,
      afterwards - one per domain.
      
      Before deferring a flush, the queue should be allocated and initialized.
      
      Currently only domains with IOMMU_DOMAIN_DMA type initialize their flush
      queue. It's probably worth to init it for static or unmanaged domains
      too, but it may be arguable - I'm leaving it to iommu folks.
      
      Prevent queuing an iova flush if the domain doesn't have a queue.
      The defensive check seems to be worth to keep even if queue would be
      initialized for all kinds of domains. And is easy backportable.
      
      On 4.19.43 stable kernel it has a user-visible effect: previously for
      devices in si domain there were crashes, on sata devices:
      
       BUG: spinlock bad magic on CPU#6, swapper/0/1
        lock: 0xffff88844f582008, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
       CPU: 6 PID: 1 Comm: swapper/0 Not tainted 4.19.43 #1
       Call Trace:
        <IRQ>
        dump_stack+0x61/0x7e
        spin_bug+0x9d/0xa3
        do_raw_spin_lock+0x22/0x8e
        _raw_spin_lock_irqsave+0x32/0x3a
        queue_iova+0x45/0x115
        intel_unmap+0x107/0x113
        intel_unmap_sg+0x6b/0x76
        __ata_qc_complete+0x7f/0x103
        ata_qc_complete+0x9b/0x26a
        ata_qc_complete_multiple+0xd0/0xe3
        ahci_handle_port_interrupt+0x3ee/0x48a
        ahci_handle_port_intr+0x73/0xa9
        ahci_single_level_irq_intr+0x40/0x60
        __handle_irq_event_percpu+0x7f/0x19a
        handle_irq_event_percpu+0x32/0x72
        handle_irq_event+0x38/0x56
        handle_edge_irq+0x102/0x121
        handle_irq+0x147/0x15c
        do_IRQ+0x66/0xf2
        common_interrupt+0xf/0xf
       RIP: 0010:__do_softirq+0x8c/0x2df
      
      The same for usb devices that use ehci-pci:
       BUG: spinlock bad magic on CPU#0, swapper/0/1
        lock: 0xffff88844f402008, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
       CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.43 #4
       Call Trace:
        <IRQ>
        dump_stack+0x61/0x7e
        spin_bug+0x9d/0xa3
        do_raw_spin_lock+0x22/0x8e
        _raw_spin_lock_irqsave+0x32/0x3a
        queue_iova+0x77/0x145
        intel_unmap+0x107/0x113
        intel_unmap_page+0xe/0x10
        usb_hcd_unmap_urb_setup_for_dma+0x53/0x9d
        usb_hcd_unmap_urb_for_dma+0x17/0x100
        unmap_urb_for_dma+0x22/0x24
        __usb_hcd_giveback_urb+0x51/0xc3
        usb_giveback_urb_bh+0x97/0xde
        tasklet_action_common.isra.4+0x5f/0xa1
        tasklet_action+0x2d/0x30
        __do_softirq+0x138/0x2df
        irq_exit+0x7d/0x8b
        smp_apic_timer_interrupt+0x10f/0x151
        apic_timer_interrupt+0xf/0x20
        </IRQ>
       RIP: 0010:_raw_spin_unlock_irqrestore+0x17/0x39
      
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Lu Baolu <baolu.lu@linux.intel.com>
      Cc: iommu@lists.linux-foundation.org
      Cc: <stable@vger.kernel.org> # 4.14+
      Fixes: 13cf0174 ("iommu/vt-d: Make use of iova deferred flushing")
      Signed-off-by: NDmitry Safonov <dima@arista.com>
      Reviewed-by: NLu Baolu <baolu.lu@linux.intel.com>
      Signed-off-by: NJoerg Roedel <jroedel@suse.de>
      effa4678
    • L
      iommu/vt-d: Avoid duplicated pci dma alias consideration · 55752949
      Lu Baolu 提交于
      As we have abandoned the home-made lazy domain allocation
      and delegated the DMA domain life cycle up to the default
      domain mechanism defined in the generic iommu layer, we
      needn't consider pci alias anymore when mapping/unmapping
      the context entries. Without this fix, we see kernel NULL
      pointer dereference during pci device hot-plug test.
      
      Cc: Ashok Raj <ashok.raj@intel.com>
      Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
      Cc: Kevin Tian <kevin.tian@intel.com>
      Fixes: fa954e68 ("iommu/vt-d: Delegate the dma domain to upper layer")
      Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
      Reported-and-tested-by: NXu Pengfei <pengfei.xu@intel.com>
      Signed-off-by: NJoerg Roedel <jroedel@suse.de>
      55752949
    • J
      Revert "iommu/vt-d: Consolidate domain_init() to avoid duplication" · 301e7ee1
      Joerg Roedel 提交于
      This reverts commit 123b2ffc.
      
      This commit reportedly caused boot failures on some systems
      and needs to be reverted for now.
      Signed-off-by: NJoerg Roedel <jroedel@suse.de>
      301e7ee1
    • Q
      iommu/amd: fix a crash in iova_magazine_free_pfns · 8cf66504
      Qian Cai 提交于
      The commit b3aa14f0 ("iommu: remove the mapping_error dma_map_ops
      method") incorrectly changed the checking from dma_ops_alloc_iova() in
      map_sg() causes a crash under memory pressure as dma_ops_alloc_iova()
      never return DMA_MAPPING_ERROR on failure but 0, so the error handling
      is all wrong.
      
         kernel BUG at drivers/iommu/iova.c:801!
          Workqueue: kblockd blk_mq_run_work_fn
          RIP: 0010:iova_magazine_free_pfns+0x7d/0xc0
          Call Trace:
           free_cpu_cached_iovas+0xbd/0x150
           alloc_iova_fast+0x8c/0xba
           dma_ops_alloc_iova.isra.6+0x65/0xa0
           map_sg+0x8c/0x2a0
           scsi_dma_map+0xc6/0x160
           pqi_aio_submit_io+0x1f6/0x440 [smartpqi]
           pqi_scsi_queue_command+0x90c/0xdd0 [smartpqi]
           scsi_queue_rq+0x79c/0x1200
           blk_mq_dispatch_rq_list+0x4dc/0xb70
           blk_mq_sched_dispatch_requests+0x249/0x310
           __blk_mq_run_hw_queue+0x128/0x200
           blk_mq_run_work_fn+0x27/0x30
           process_one_work+0x522/0xa10
           worker_thread+0x63/0x5b0
           kthread+0x1d2/0x1f0
           ret_from_fork+0x22/0x40
      
      Fixes: b3aa14f0 ("iommu: remove the mapping_error dma_map_ops method")
      Signed-off-by: NQian Cai <cai@lca.pw>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8cf66504
  9. 04 7月, 2019 2 次提交
  10. 02 7月, 2019 1 次提交
    • W
      iommu/arm-smmu-v3: Fix compilation when CONFIG_CMA=n · 900a85ca
      Will Deacon 提交于
      When compiling a kernel without support for CMA, CONFIG_CMA_ALIGNMENT
      is not defined which results in the following build failure:
      
      In file included from ./include/linux/list.h:9:0
                       from ./include/linux/kobject.h:19,
                       from ./include/linux/of.h:17
                       from ./include/linux/irqdomain.h:35,
                       from ./include/linux/acpi.h:13,
                       from drivers/iommu/arm-smmu-v3.c:12:
      drivers/iommu/arm-smmu-v3.c: In function ‘arm_smmu_device_hw_probe’:
      drivers/iommu/arm-smmu-v3.c:194:40: error: ‘CONFIG_CMA_ALIGNMENT’ undeclared (first use in this function)
       #define Q_MAX_SZ_SHIFT   (PAGE_SHIFT + CONFIG_CMA_ALIGNMENT)
      
      Fix the breakage by capping the maximum queue size based on MAX_ORDER
      when CMA is not enabled.
      Reported-by: NZhangshaokun <zhangshaokun@hisilicon.com>
      Signed-off-by: NWill Deacon <will@kernel.org>
      Tested-by: NShaokun Zhang <zhangshaokun@hisilicon.com>
      Signed-off-by: NJoerg Roedel <jroedel@suse.de>
      900a85ca
  11. 01 7月, 2019 5 次提交
  12. 25 6月, 2019 2 次提交
  13. 24 6月, 2019 1 次提交
    • S
      driver_find_device: Unify the match function with class_find_device() · 92ce7e83
      Suzuki K Poulose 提交于
      The driver_find_device() accepts a match function pointer to
      filter the devices for lookup, similar to bus/class_find_device().
      However, there is a minor difference in the prototype for the
      match parameter for driver_find_device() with the now unified
      version accepted by {bus/class}_find_device(), where it doesn't
      accept a "const" qualifier for the data argument. This prevents
      us from reusing the generic match functions for driver_find_device().
      
      For this reason, change the prototype of the driver_find_device() to
      make the "match" parameter in line with {bus/class}_find_device()
      and adjust its callers to use the const qualifier. Also, we could
      now promote the "data" parameter to const as we pass it down
      as a const parameter to the match functions.
      
      Cc: Corey Minyard <minyard@acm.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Thierry Reding <thierry.reding@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "Rafael J. Wysocki" <rafael@kernel.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Peter Oberparleiter <oberpar@linux.ibm.com>
      Cc: Sebastian Ott <sebott@linux.ibm.com>
      Cc: David Airlie <airlied@linux.ie>
      Cc: Daniel Vetter <daniel@ffwll.ch>
      Cc: Nehal Shah <nehal-bakulchandra.shah@amd.com>
      Cc: Shyam Sundar S K <shyam-sundar.s-k@amd.com>
      Cc: Lee Jones <lee.jones@linaro.org>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: NSuzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      92ce7e83
  14. 23 6月, 2019 1 次提交
    • P
      Revert "iommu/vt-d: Fix lock inversion between iommu->lock and device_domain_lock" · 0aafc8ae
      Peter Xu 提交于
      This reverts commit 7560cc3c.
      
      With 5.2.0-rc5 I can easily trigger this with lockdep and iommu=pt:
      
          ======================================================
          WARNING: possible circular locking dependency detected
          5.2.0-rc5 #78 Not tainted
          ------------------------------------------------------
          swapper/0/1 is trying to acquire lock:
          00000000ea2b3beb (&(&iommu->lock)->rlock){+.+.}, at: domain_context_mapping_one+0xa5/0x4e0
          but task is already holding lock:
          00000000a681907b (device_domain_lock){....}, at: domain_context_mapping_one+0x8d/0x4e0
          which lock already depends on the new lock.
          the existing dependency chain (in reverse order) is:
          -> #1 (device_domain_lock){....}:
                 _raw_spin_lock_irqsave+0x3c/0x50
                 dmar_insert_one_dev_info+0xbb/0x510
                 domain_add_dev_info+0x50/0x90
                 dev_prepare_static_identity_mapping+0x30/0x68
                 intel_iommu_init+0xddd/0x1422
                 pci_iommu_init+0x16/0x3f
                 do_one_initcall+0x5d/0x2b4
                 kernel_init_freeable+0x218/0x2c1
                 kernel_init+0xa/0x100
                 ret_from_fork+0x3a/0x50
          -> #0 (&(&iommu->lock)->rlock){+.+.}:
                 lock_acquire+0x9e/0x170
                 _raw_spin_lock+0x25/0x30
                 domain_context_mapping_one+0xa5/0x4e0
                 pci_for_each_dma_alias+0x30/0x140
                 dmar_insert_one_dev_info+0x3b2/0x510
                 domain_add_dev_info+0x50/0x90
                 dev_prepare_static_identity_mapping+0x30/0x68
                 intel_iommu_init+0xddd/0x1422
                 pci_iommu_init+0x16/0x3f
                 do_one_initcall+0x5d/0x2b4
                 kernel_init_freeable+0x218/0x2c1
                 kernel_init+0xa/0x100
                 ret_from_fork+0x3a/0x50
      
          other info that might help us debug this:
           Possible unsafe locking scenario:
                 CPU0                    CPU1
                 ----                    ----
            lock(device_domain_lock);
                                         lock(&(&iommu->lock)->rlock);
                                         lock(device_domain_lock);
            lock(&(&iommu->lock)->rlock);
      
           *** DEADLOCK ***
          2 locks held by swapper/0/1:
           #0: 00000000033eb13d (dmar_global_lock){++++}, at: intel_iommu_init+0x1e0/0x1422
           #1: 00000000a681907b (device_domain_lock){....}, at: domain_context_mapping_one+0x8d/0x4e0
      
          stack backtrace:
          CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.2.0-rc5 #78
          Hardware name: LENOVO 20KGS35G01/20KGS35G01, BIOS N23ET50W (1.25 ) 06/25/2018
          Call Trace:
           dump_stack+0x85/0xc0
           print_circular_bug.cold.57+0x15c/0x195
           __lock_acquire+0x152a/0x1710
           lock_acquire+0x9e/0x170
           ? domain_context_mapping_one+0xa5/0x4e0
           _raw_spin_lock+0x25/0x30
           ? domain_context_mapping_one+0xa5/0x4e0
           domain_context_mapping_one+0xa5/0x4e0
           ? domain_context_mapping_one+0x4e0/0x4e0
           pci_for_each_dma_alias+0x30/0x140
           dmar_insert_one_dev_info+0x3b2/0x510
           domain_add_dev_info+0x50/0x90
           dev_prepare_static_identity_mapping+0x30/0x68
           intel_iommu_init+0xddd/0x1422
           ? printk+0x58/0x6f
           ? lockdep_hardirqs_on+0xf0/0x180
           ? do_early_param+0x8e/0x8e
           ? e820__memblock_setup+0x63/0x63
           pci_iommu_init+0x16/0x3f
           do_one_initcall+0x5d/0x2b4
           ? do_early_param+0x8e/0x8e
           ? rcu_read_lock_sched_held+0x55/0x60
           ? do_early_param+0x8e/0x8e
           kernel_init_freeable+0x218/0x2c1
           ? rest_init+0x230/0x230
           kernel_init+0xa/0x100
           ret_from_fork+0x3a/0x50
      
      domain_context_mapping_one() is taking device_domain_lock first then
      iommu lock, while dmar_insert_one_dev_info() is doing the reverse.
      
      That should be introduced by commit:
      
      7560cc3c ("iommu/vt-d: Fix lock inversion between iommu->lock and
                    device_domain_lock", 2019-05-27)
      
      So far I still cannot figure out how the previous deadlock was
      triggered (I cannot find iommu lock taken before calling of
      iommu_flush_dev_iotlb()), however I'm pretty sure that that change
      should be incomplete at least because it does not fix all the places
      so we're still taking the locks in different orders, while reverting
      that commit is very clean to me so far that we should always take
      device_domain_lock first then the iommu lock.
      
      We can continue to try to find the real culprit mentioned in
      7560cc3c, but for now I think we should revert it to fix current
      breakage.
      
      CC: Joerg Roedel <joro@8bytes.org>
      CC: Lu Baolu <baolu.lu@linux.intel.com>
      CC: dave.jiang@intel.com
      Signed-off-by: NPeter Xu <peterx@redhat.com>
      Tested-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NJoerg Roedel <jroedel@suse.de>
      0aafc8ae
  15. 19 6月, 2019 4 次提交
  16. 18 6月, 2019 2 次提交