1. 21 Sep, 2019 2 commits
    • iommu/amd: Fix race in increase_address_space() · 0d50f7b1
      Committed by Joerg Roedel
      [ Upstream commit 754265bcab78a9014f0f99cd35e0d610fcd7dfa7 ]
      
      After the conversion to lock-less dma-api call the
      increase_address_space() function can be called without any
      locking. Multiple CPUs could potentially race for increasing
      the address space, leading to invalid domain->mode settings
      and invalid page-tables. This has been happening in the wild
      under high IO load and memory pressure.
      
      Fix the race by locking this operation. The function is
      called infrequently so that this does not introduce
      a performance regression in the dma-api path again.
      
      Reported-by: Qian Cai <cai@lca.pw>
      Fixes: 256e4621 ('iommu/amd: Make use of the generic IOVA allocator')
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
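The locking fix in the commit above can be modeled in a few lines. This is a minimal Python sketch, not the kernel code: `Domain`, `root_depth`, and `grow_until` are hypothetical names, and a `threading.Lock` stands in for the domain lock. It only illustrates why the mode and the root table have to change together under one lock.

```python
import threading

PAGE_MODE_6_LEVEL = 6  # deepest page-table mode in this model


class Domain:
    """Hypothetical stand-in for the protection domain state."""

    def __init__(self):
        self.mode = 3          # number of page-table levels in use
        self.root_depth = 3    # depth of the table the root pointer refers to
        self.lock = threading.Lock()


def increase_address_space(domain):
    """Add one page-table level; mode and root change together under the lock."""
    with domain.lock:
        if domain.mode == PAGE_MODE_6_LEVEL:
            return False        # already at the maximum, nothing to do
        domain.root_depth += 1  # models installing a new top-level table
        domain.mode += 1
        return True


def grow_until(domain, wanted_mode):
    """Caller loop: keep growing until the wanted mode is reached."""
    while domain.mode < wanted_mode:
        if not increase_address_space(domain):
            break


domain = Domain()
workers = [threading.Thread(target=grow_until, args=(domain, PAGE_MODE_6_LEVEL))
           for _ in range(8)]
for w in workers:
    w.start()
for w in workers:
    w.join()
```

Because every update happens inside the lock, the model never ends up with a mode that disagrees with the depth of the root table, regardless of how many threads race.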
    • iommu/amd: Flush old domains in kdump kernel · 52f32e4a
      Committed by Stuart Hayes
      [ Upstream commit 36b7200f67dfe75b416b5281ed4ace9927b513bc ]
      
      When devices are attached to the amd_iommu in a kdump kernel, the old device
      table entries (DTEs), which were copied from the crashed kernel, will be
      overwritten with a new domain number.  When the new DTE is written, the IOMMU
      is told to flush the DTE from its internal cache--but it is not told to flush
      the translation cache entries for the old domain number.
      
      Without this patch, AMD systems using the tg3 network driver fail when kdump
      tries to save the vmcore to a network system, showing network timeouts and
      (sometimes) IOMMU errors in the kernel log.
      
      This patch will flush IOMMU translation cache entries for the old domain when
      a DTE gets overwritten with a new domain number.
      
      Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
      Fixes: 3ac3e5ee ('iommu/amd: Copy old trans table from old kernel')
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
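The behavior described in the commit above can be sketched as a toy model. Everything here is an assumption made for illustration: `FakeIommu`, `set_dte`, and the two flush lists are hypothetical and do not correspond to driver functions. The point is only that overwriting a DTE with a new domain number should also invalidate translations cached under the old one.

```python
class FakeIommu:
    """Hypothetical model of an IOMMU with a device table and two caches."""

    def __init__(self, copied_dev_table):
        self.dev_table = dict(copied_dev_table)  # devid -> domain id
        self.dte_flushes = []      # device IDs whose cached DTE was flushed
        self.domain_flushes = []   # domain IDs whose translations were flushed

    def set_dte(self, devid, new_domid):
        old_domid = self.dev_table.get(devid)
        self.dev_table[devid] = new_domid
        self.dte_flushes.append(devid)
        # The fix: also flush translations cached under the old domain ID.
        if old_domid is not None and old_domid != new_domid:
            self.domain_flushes.append(old_domid)


iommu = FakeIommu({0x0300: 5})   # entry inherited from the crashed kernel
iommu.set_dte(0x0300, 1)         # kdump kernel attaches the device
```

Without the extra flush, stale translations tagged with domain 5 could still be served from the cache after the device has been moved to domain 1.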
  2. 16 Sep, 2019 1 commit
    • iommu/iova: Remove stale cached32_node · a532a120
      Committed by Chris Wilson
      [ Upstream commit 9eed17d37c77171cf5ffb95c4257f87df3cd4c8f ]
      
      Since the cached32_node is allowed to be advanced above dma_32bit_pfn
      (to provide a shortcut into the limited range), we need to be careful to
      remove the to be freed node if it is the cached32_node.
      
      [   48.477773] BUG: KASAN: use-after-free in __cached_rbnode_delete_update+0x68/0x110
      [   48.477812] Read of size 8 at addr ffff88870fc19020 by task kworker/u8:1/37
      [   48.477843]
      [   48.477879] CPU: 1 PID: 37 Comm: kworker/u8:1 Tainted: G     U            5.2.0+ #735
      [   48.477915] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS BNKBL357.86A.0052.2017.0918.1346 09/18/2017
      [   48.478047] Workqueue: i915 __i915_gem_free_work [i915]
      [   48.478075] Call Trace:
      [   48.478111]  dump_stack+0x5b/0x90
      [   48.478137]  print_address_description+0x67/0x237
      [   48.478178]  ? __cached_rbnode_delete_update+0x68/0x110
      [   48.478212]  __kasan_report.cold.3+0x1c/0x38
      [   48.478240]  ? __cached_rbnode_delete_update+0x68/0x110
      [   48.478280]  ? __cached_rbnode_delete_update+0x68/0x110
      [   48.478308]  __cached_rbnode_delete_update+0x68/0x110
      [   48.478344]  private_free_iova+0x2b/0x60
      [   48.478378]  iova_magazine_free_pfns+0x46/0xa0
      [   48.478403]  free_iova_fast+0x277/0x340
      [   48.478443]  fq_ring_free+0x15a/0x1a0
      [   48.478473]  queue_iova+0x19c/0x1f0
      [   48.478597]  cleanup_page_dma.isra.64+0x62/0xb0 [i915]
      [   48.478712]  __gen8_ppgtt_cleanup+0x63/0x80 [i915]
      [   48.478826]  __gen8_ppgtt_cleanup+0x42/0x80 [i915]
      [   48.478940]  __gen8_ppgtt_clear+0x433/0x4b0 [i915]
      [   48.479053]  __gen8_ppgtt_clear+0x462/0x4b0 [i915]
      [   48.479081]  ? __sg_free_table+0x9e/0xf0
      [   48.479116]  ? kfree+0x7f/0x150
      [   48.479234]  i915_vma_unbind+0x1e2/0x240 [i915]
      [   48.479352]  i915_vma_destroy+0x3a/0x280 [i915]
      [   48.479465]  __i915_gem_free_objects+0xf0/0x2d0 [i915]
      [   48.479579]  __i915_gem_free_work+0x41/0xa0 [i915]
      [   48.479607]  process_one_work+0x495/0x710
      [   48.479642]  worker_thread+0x4c7/0x6f0
      [   48.479687]  ? process_one_work+0x710/0x710
      [   48.479724]  kthread+0x1b2/0x1d0
      [   48.479774]  ? kthread_create_worker_on_cpu+0xa0/0xa0
      [   48.479820]  ret_from_fork+0x1f/0x30
      [   48.479864]
      [   48.479907] Allocated by task 631:
      [   48.479944]  save_stack+0x19/0x80
      [   48.479994]  __kasan_kmalloc.constprop.6+0xc1/0xd0
      [   48.480038]  kmem_cache_alloc+0x91/0xf0
      [   48.480082]  alloc_iova+0x2b/0x1e0
      [   48.480125]  alloc_iova_fast+0x58/0x376
      [   48.480166]  intel_alloc_iova+0x90/0xc0
      [   48.480214]  intel_map_sg+0xde/0x1f0
      [   48.480343]  i915_gem_gtt_prepare_pages+0xb8/0x170 [i915]
      [   48.480465]  huge_get_pages+0x232/0x2b0 [i915]
      [   48.480590]  ____i915_gem_object_get_pages+0x40/0xb0 [i915]
      [   48.480712]  __i915_gem_object_get_pages+0x90/0xa0 [i915]
      [   48.480834]  i915_gem_object_prepare_write+0x2d6/0x330 [i915]
      [   48.480955]  create_test_object.isra.54+0x1a9/0x3e0 [i915]
      [   48.481075]  igt_shared_ctx_exec+0x365/0x3c0 [i915]
      [   48.481210]  __i915_subtests.cold.4+0x30/0x92 [i915]
      [   48.481341]  __run_selftests.cold.3+0xa9/0x119 [i915]
      [   48.481466]  i915_live_selftests+0x3c/0x70 [i915]
      [   48.481583]  i915_pci_probe+0xe7/0x220 [i915]
      [   48.481620]  pci_device_probe+0xe0/0x180
      [   48.481665]  really_probe+0x163/0x4e0
      [   48.481710]  device_driver_attach+0x85/0x90
      [   48.481750]  __driver_attach+0xa5/0x180
      [   48.481796]  bus_for_each_dev+0xda/0x130
      [   48.481831]  bus_add_driver+0x205/0x2e0
      [   48.481882]  driver_register+0xca/0x140
      [   48.481927]  do_one_initcall+0x6c/0x1af
      [   48.481970]  do_init_module+0x106/0x350
      [   48.482010]  load_module+0x3d2c/0x3ea0
      [   48.482058]  __do_sys_finit_module+0x110/0x180
      [   48.482102]  do_syscall_64+0x62/0x1f0
      [   48.482147]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [   48.482190]
      [   48.482224] Freed by task 37:
      [   48.482273]  save_stack+0x19/0x80
      [   48.482318]  __kasan_slab_free+0x12e/0x180
      [   48.482363]  kmem_cache_free+0x70/0x140
      [   48.482406]  __free_iova+0x1d/0x30
      [   48.482445]  fq_ring_free+0x15a/0x1a0
      [   48.482490]  queue_iova+0x19c/0x1f0
      [   48.482624]  cleanup_page_dma.isra.64+0x62/0xb0 [i915]
      [   48.482749]  __gen8_ppgtt_cleanup+0x63/0x80 [i915]
      [   48.482873]  __gen8_ppgtt_cleanup+0x42/0x80 [i915]
      [   48.482999]  __gen8_ppgtt_clear+0x433/0x4b0 [i915]
      [   48.483123]  __gen8_ppgtt_clear+0x462/0x4b0 [i915]
      [   48.483250]  i915_vma_unbind+0x1e2/0x240 [i915]
      [   48.483378]  i915_vma_destroy+0x3a/0x280 [i915]
      [   48.483500]  __i915_gem_free_objects+0xf0/0x2d0 [i915]
      [   48.483622]  __i915_gem_free_work+0x41/0xa0 [i915]
      [   48.483659]  process_one_work+0x495/0x710
      [   48.483704]  worker_thread+0x4c7/0x6f0
      [   48.483748]  kthread+0x1b2/0x1d0
      [   48.483787]  ret_from_fork+0x1f/0x30
      [   48.483831]
      [   48.483868] The buggy address belongs to the object at ffff88870fc19000
      [   48.483868]  which belongs to the cache iommu_iova of size 40
      [   48.483920] The buggy address is located 32 bytes inside of
      [   48.483920]  40-byte region [ffff88870fc19000, ffff88870fc19028)
      [   48.483964] The buggy address belongs to the page:
      [   48.484006] page:ffffea001c3f0600 refcount:1 mapcount:0 mapping:ffff8888181a91c0 index:0x0 compound_mapcount: 0
      [   48.484045] flags: 0x8000000000010200(slab|head)
      [   48.484096] raw: 8000000000010200 ffffea001c421a08 ffffea001c447e88 ffff8888181a91c0
      [   48.484141] raw: 0000000000000000 0000000000120012 00000001ffffffff 0000000000000000
      [   48.484188] page dumped because: kasan: bad access detected
      [   48.484230]
      [   48.484265] Memory state around the buggy address:
      [   48.484314]  ffff88870fc18f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [   48.484361]  ffff88870fc18f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [   48.484406] >ffff88870fc19000: fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc
      [   48.484451]                                ^
      [   48.484494]  ffff88870fc19080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [   48.484530]  ffff88870fc19100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108602
      Fixes: e60aa7b5 ("iommu/iova: Extend rbtree node caching")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: <stable@vger.kernel.org> # v4.15+
      Reviewed-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
  3. 06 Sep, 2019 1 commit
  4. 25 Aug, 2019 1 commit
  5. 04 Aug, 2019 1 commit
    • iommu/vt-d: Don't queue_iova() if there is no flush queue · 4fd0eb60
      Committed by Dmitry Safonov
      commit effa467870c7612012885df4e246bdb8ffd8e44c upstream.
      
      The Intel VT-d driver was reworked to use the common deferred flushing
      implementation. Previously there was one global per-cpu flush queue;
      afterwards, one per domain.
      
      Before deferring a flush, the queue should be allocated and initialized.
      
      Currently only domains of type IOMMU_DOMAIN_DMA initialize their flush
      queue. It is probably worth initializing it for static or unmanaged
      domains too, but that is arguable, so I'm leaving it to the IOMMU folks.
      
      Prevent queuing an iova flush if the domain doesn't have a queue.
      The defensive check seems worth keeping even if the queue were
      initialized for all kinds of domains, and it is easy to backport.
      
      On the 4.19.43 stable kernel this has a user-visible effect: devices in
      the si domain previously crashed, for example SATA devices:
      
       BUG: spinlock bad magic on CPU#6, swapper/0/1
        lock: 0xffff88844f582008, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
       CPU: 6 PID: 1 Comm: swapper/0 Not tainted 4.19.43 #1
       Call Trace:
        <IRQ>
        dump_stack+0x61/0x7e
        spin_bug+0x9d/0xa3
        do_raw_spin_lock+0x22/0x8e
        _raw_spin_lock_irqsave+0x32/0x3a
        queue_iova+0x45/0x115
        intel_unmap+0x107/0x113
        intel_unmap_sg+0x6b/0x76
        __ata_qc_complete+0x7f/0x103
        ata_qc_complete+0x9b/0x26a
        ata_qc_complete_multiple+0xd0/0xe3
        ahci_handle_port_interrupt+0x3ee/0x48a
        ahci_handle_port_intr+0x73/0xa9
        ahci_single_level_irq_intr+0x40/0x60
        __handle_irq_event_percpu+0x7f/0x19a
        handle_irq_event_percpu+0x32/0x72
        handle_irq_event+0x38/0x56
        handle_edge_irq+0x102/0x121
        handle_irq+0x147/0x15c
        do_IRQ+0x66/0xf2
        common_interrupt+0xf/0xf
       RIP: 0010:__do_softirq+0x8c/0x2df
      
      The same for usb devices that use ehci-pci:
       BUG: spinlock bad magic on CPU#0, swapper/0/1
        lock: 0xffff88844f402008, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
       CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.43 #4
       Call Trace:
        <IRQ>
        dump_stack+0x61/0x7e
        spin_bug+0x9d/0xa3
        do_raw_spin_lock+0x22/0x8e
        _raw_spin_lock_irqsave+0x32/0x3a
        queue_iova+0x77/0x145
        intel_unmap+0x107/0x113
        intel_unmap_page+0xe/0x10
        usb_hcd_unmap_urb_setup_for_dma+0x53/0x9d
        usb_hcd_unmap_urb_for_dma+0x17/0x100
        unmap_urb_for_dma+0x22/0x24
        __usb_hcd_giveback_urb+0x51/0xc3
        usb_giveback_urb_bh+0x97/0xde
        tasklet_action_common.isra.4+0x5f/0xa1
        tasklet_action+0x2d/0x30
        __do_softirq+0x138/0x2df
        irq_exit+0x7d/0x8b
        smp_apic_timer_interrupt+0x10f/0x151
        apic_timer_interrupt+0xf/0x20
        </IRQ>
       RIP: 0010:_raw_spin_unlock_irqrestore+0x17/0x39
      
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Lu Baolu <baolu.lu@linux.intel.com>
      Cc: iommu@lists.linux-foundation.org
      Cc: <stable@vger.kernel.org> # 4.14+
      Fixes: 13cf0174 ("iommu/vt-d: Make use of iova deferred flushing")
      Signed-off-by: Dmitry Safonov <dima@arista.com>
      Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      [v4.14-port notes:
      o minor conflict with untrusted IOMMU devices check under if-condition]
      Signed-off-by: Dmitry Safonov <dima@arista.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
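The defensive check from the commit above can be illustrated with a small model. This is a sketch under stated assumptions: `IovaDomain`, `has_flush_queue`, and `unmap` are hypothetical Python names chosen for the example, and the "sync" path merely records that the IOVA was released immediately instead of being deferred.

```python
class IovaDomain:
    """Hypothetical IOVA domain that may or may not own a flush queue."""

    def __init__(self, with_flush_queue):
        self.fq = [] if with_flush_queue else None  # None: never initialized
        self.freed = []

    def has_flush_queue(self):
        return self.fq is not None


def unmap(domain, iova_pfn, strict=False):
    """Defer the flush only when the domain really has a queue to defer to."""
    if strict or not domain.has_flush_queue():
        domain.freed.append(iova_pfn)  # flush synchronously and free now
        return "sync"
    domain.fq.append(iova_pfn)         # deferred: released when the queue drains
    return "deferred"


dma_domain = IovaDomain(with_flush_queue=True)
si_domain = IovaDomain(with_flush_queue=False)
dma_result = unmap(dma_domain, 0x1000)
si_result = unmap(si_domain, 0x2000)
```

In the model, the domain without a queue falls back to the synchronous path rather than touching an uninitialized structure, which is the situation the "spinlock bad magic" traces point at.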
  6. 26 Jul, 2019 1 commit
  7. 19 Jun, 2019 1 commit
  8. 15 Jun, 2019 2 commits
  9. 26 May, 2019 1 commit
  10. 10 May, 2019 1 commit
    • iommu/amd: Set exclusion range correctly · 29184cba
      Committed by Joerg Roedel
      [ Upstream commit 3c677d206210f53a4be972211066c0f1cd47fe12 ]
      
      The exclusion range limit register needs to contain the
      base-address of the last page that is part of the range, as
      bits 0-11 of this register are treated as 0xfff by the
      hardware for comparisons.
      
      So correctly set the exclusion range in the hardware to the
      last page which is _in_ the range.
      
      Fixes: b2026aa2 ('x86, AMD IOMMU: add functions for programming IOMMU MMIO space')
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
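The arithmetic behind the commit above is easy to check by hand. The sketch below assumes 4 KiB pages; the function names are hypothetical, and `exclusion_limit_off_by_one` is an assumed illustration of the mistake (using the first byte after the range), not a quote of the old driver code.

```python
PAGE_SIZE = 4096
PAGE_MASK = ~(PAGE_SIZE - 1) & 0xFFFFFFFFFFFFFFFF


def exclusion_limit(start, length):
    """Base address of the last page that is inside the range."""
    return (start + length - 1) & PAGE_MASK


def exclusion_limit_off_by_one(start, length):
    """Assumed faulty variant: base of the first page after the range."""
    return (start + length) & PAGE_MASK


def last_excluded_byte(limit):
    """Hardware treats bits 0-11 of the limit as 0xfff when comparing."""
    return limit | 0xFFF


start, length = 0x10000, 0x2000
good = last_excluded_byte(exclusion_limit(start, length))
bad = last_excluded_byte(exclusion_limit_off_by_one(start, length))
```

For this two-page range the corrected limit ends exactly at the last byte of the range, while the off-by-one variant excludes one extra page.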
  11. 04 May, 2019 1 commit
  12. 20 Apr, 2019 2 commits
    • iommu/dmar: Fix buffer overflow during PCI bus notification · 38855a84
      Committed by Julia Cartwright
      [ Upstream commit cffaaf0c816238c45cd2d06913476c83eb50f682 ]
      
      Commit 57384592 ("iommu/vt-d: Store bus information in RMRR PCI
      device path") changed the type of the path data, however, the change in
      path type was not reflected in size calculations.  Update to use the
      correct type and prevent a buffer overflow.
      
      This bug manifests in systems with deep PCI hierarchies, and can lead to
      an overflow of the statically allocated buffer (dmar_pci_notify_info_buf),
      or can lead to overflow of slab-allocated data.
      
         BUG: KASAN: global-out-of-bounds in dmar_alloc_pci_notify_info+0x1d5/0x2e0
         Write of size 1 at addr ffffffff90445d80 by task swapper/0/1
         CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W       4.14.87-rt49-02406-gd0a0e96 #1
         Call Trace:
          ? dump_stack+0x46/0x59
          ? print_address_description+0x1df/0x290
          ? dmar_alloc_pci_notify_info+0x1d5/0x2e0
          ? kasan_report+0x256/0x340
          ? dmar_alloc_pci_notify_info+0x1d5/0x2e0
          ? e820__memblock_setup+0xb0/0xb0
          ? dmar_dev_scope_init+0x424/0x48f
          ? __down_write_common+0x1ec/0x230
          ? dmar_dev_scope_init+0x48f/0x48f
          ? dmar_free_unused_resources+0x109/0x109
          ? cpumask_next+0x16/0x20
          ? __kmem_cache_create+0x392/0x430
          ? kmem_cache_create+0x135/0x2f0
          ? e820__memblock_setup+0xb0/0xb0
          ? intel_iommu_init+0x170/0x1848
          ? _raw_spin_unlock_irqrestore+0x32/0x60
          ? migrate_enable+0x27a/0x5b0
          ? sched_setattr+0x20/0x20
          ? migrate_disable+0x1fc/0x380
          ? task_rq_lock+0x170/0x170
          ? try_to_run_init_process+0x40/0x40
          ? locks_remove_file+0x85/0x2f0
          ? dev_prepare_static_identity_mapping+0x78/0x78
          ? rt_spin_unlock+0x39/0x50
          ? lockref_put_or_lock+0x2a/0x40
          ? dput+0x128/0x2f0
          ? __rcu_read_unlock+0x66/0x80
          ? __fput+0x250/0x300
          ? __rcu_read_lock+0x1b/0x30
          ? mntput_no_expire+0x38/0x290
          ? e820__memblock_setup+0xb0/0xb0
          ? pci_iommu_init+0x25/0x63
          ? pci_iommu_init+0x25/0x63
          ? do_one_initcall+0x7e/0x1c0
          ? initcall_blacklisted+0x120/0x120
          ? kernel_init_freeable+0x27b/0x307
          ? rest_init+0xd0/0xd0
          ? kernel_init+0xf/0x120
          ? rest_init+0xd0/0xd0
          ? ret_from_fork+0x1f/0x40
         The buggy address belongs to the variable:
          dmar_pci_notify_info_buf+0x40/0x60
      
      Fixes: 57384592 ("iommu/vt-d: Store bus information in RMRR PCI device path")
      Signed-off-by: Julia Cartwright <julia@ni.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
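The size mismatch described in the commit above can be shown with plain arithmetic. This sketch makes several assumptions that are not stated in the commit message: the old path entry is taken to hold two bytes (device, function), the new one three bytes (bus, device, function), and `HEADER_SIZE` is an arbitrary hypothetical value for the fixed part of the structure.

```python
import struct

HEADER_SIZE = 32  # hypothetical size of the fixed part of the notify info

OLD_ENTRY = struct.calcsize("2B")  # assumed old layout: device, function
NEW_ENTRY = struct.calcsize("3B")  # assumed new layout: bus, device, function


def alloc_size(level, entry_size):
    """Bytes reserved for a device that sits `level` bridges deep."""
    return HEADER_SIZE + level * entry_size


def bytes_written(level):
    """Bytes actually filled in once every path entry uses the new type."""
    return HEADER_SIZE + level * NEW_ENTRY


def overflow(level):
    """How far the writes run past a buffer sized with the old entry type."""
    return max(0, bytes_written(level) - alloc_size(level, OLD_ENTRY))
```

Under these assumptions the overrun grows by one byte per hierarchy level, which matches the observation that the bug shows up on deep PCI hierarchies.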
    • iommu/vt-d: Check capability before disabling protected memory · cff04fad
      Committed by Lu Baolu
      [ Upstream commit 5bb71fc790a88d063507dc5d445ab8b14e845591 ]
      
      The spec states in 10.4.16 that the Protected Memory Enable
      Register should be treated as read-only for implementations
      not supporting protected memory regions (PLMR and PHMR fields
      reported as Clear in the Capability register).
      
      Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
      Cc: mark gross <mgross@intel.com>
      Suggested-by: Ashok Raj <ashok.raj@intel.com>
      Fixes: f8bab735 ("intel-iommu: PMEN support")
      Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
  13. 06 Apr, 2019 1 commit
    • iommu/io-pgtable-arm-v7s: Only kmemleak_ignore L2 tables · fc96b44c
      Committed by Nicolas Boichat
      [ Upstream commit 032ebd8548c9d05e8d2bdc7a7ec2fe29454b0ad0 ]
      
      L1 tables are allocated with __get_dma_pages, and therefore already
      ignored by kmemleak.
      
      Without this, the kernel would print this error message on boot,
      when the first L1 table is allocated:
      
      [    2.810533] kmemleak: Trying to color unknown object at 0xffffffd652388000 as Black
      [    2.818190] CPU: 5 PID: 39 Comm: kworker/5:0 Tainted: G S                4.19.16 #8
      [    2.831227] Workqueue: events deferred_probe_work_func
      [    2.836353] Call trace:
      ...
      [    2.852532]  paint_ptr+0xa0/0xa8
      [    2.855750]  kmemleak_ignore+0x38/0x6c
      [    2.859490]  __arm_v7s_alloc_table+0x168/0x1f4
      [    2.863922]  arm_v7s_alloc_pgtable+0x114/0x17c
      [    2.868354]  alloc_io_pgtable_ops+0x3c/0x78
      ...
      
      Fixes: e5fc9753 ("iommu/io-pgtable: Add ARMv7 short descriptor support")
      Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
  14. 03 Apr, 2019 1 commit
    • iommu/io-pgtable-arm-v7s: request DMA32 memory, and improve debugging · c9874d39
      Committed by Nicolas Boichat
      commit 0a352554da69b02f75ca3389c885c741f1f63235 upstream.
      
      IOMMUs using ARMv7 short-descriptor format require page tables (level 1
      and 2) to be allocated within the first 4GB of RAM, even on 64-bit
      systems.
      
      For level 1/2 pages, ensure GFP_DMA32 is used if CONFIG_ZONE_DMA32 is
      defined (e.g.  on arm64 platforms).
      
      For level 2 pages, allocate a slab cache in SLAB_CACHE_DMA32.  Note that
      we do not explicitly pass GFP_DMA[32] to kmem_cache_zalloc, as this is
      not strictly necessary, and would cause a warning in mm/sl*b.c, as we
      did not update GFP_SLAB_BUG_MASK.
      
      Also, print an error when the physical address does not fit in
      32-bit, to make debugging easier in the future.
      
      Link: http://lkml.kernel.org/r/20181210011504.122604-3-drinkcat@chromium.org
      Fixes: ad67f5a6 ("arm64: replace ZONE_DMA with ZONE_DMA32")
      Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Hsin-Yi Wang <hsinyi@chromium.org>
      Cc: Huaisheng Ye <yehs1@lenovo.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Matthias Brugger <matthias.bgg@gmail.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Sasha Levin <Alexander.Levin@microsoft.com>
      Cc: Tomasz Figa <tfiga@google.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Yingjoe Chen <yingjoe.chen@mediatek.com>
      Cc: Yong Wu <yong.wu@mediatek.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  15. 27 Mar, 2019 1 commit
    • iommu/amd: fix sg->dma_address for sg->offset bigger than PAGE_SIZE · 86915713
      Committed by Stanislaw Gruszka
      commit 4e50ce03976fbc8ae995a000c4b10c737467beaa upstream.
      
      Take into account that sg->offset can be bigger than PAGE_SIZE when
      setting segment sg->dma_address. Otherwise sg->dma_address will point
      at a different page, which makes DMA impossible, with errors like this:
      
      xhci_hcd 0000:38:00.3: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000fdaa70c0 flags=0x0020]
      xhci_hcd 0000:38:00.3: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000fdaa7040 flags=0x0020]
      xhci_hcd 0000:38:00.3: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000fdaa7080 flags=0x0020]
      xhci_hcd 0000:38:00.3: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000fdaa7100 flags=0x0020]
      xhci_hcd 0000:38:00.3: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000fdaa7000 flags=0x0020]
      
      Additionally, with a wrong sg->dma_address unmap_sg will free the wrong pages,
      which can cause crashes like this:
      
      Feb 28 19:27:45 kernel: BUG: Bad page state in process cinnamon  pfn:39e8b1
      Feb 28 19:27:45 kernel: Disabling lock debugging due to kernel taint
      Feb 28 19:27:45 kernel: flags: 0x2ffff0000000000()
      Feb 28 19:27:45 kernel: raw: 02ffff0000000000 0000000000000000 ffffffff00000301 0000000000000000
      Feb 28 19:27:45 kernel: raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
      Feb 28 19:27:45 kernel: page dumped because: nonzero _refcount
      Feb 28 19:27:45 kernel: Modules linked in: ccm fuse arc4 nct6775 hwmon_vid amdgpu nls_iso8859_1 nls_cp437 edac_mce_amd vfat fat kvm_amd ccp rng_core kvm mt76x0u mt76x0_common mt76x02_usb irqbypass mt76_usb mt76x02_lib mt76 crct10dif_pclmul crc32_pclmul chash mac80211 amd_iommu_v2 ghash_clmulni_intel gpu_sched i2c_algo_bit ttm wmi_bmof snd_hda_codec_realtek snd_hda_codec_generic drm_kms_helper snd_hda_codec_hdmi snd_hda_intel drm snd_hda_codec aesni_intel snd_hda_core snd_hwdep aes_x86_64 crypto_simd snd_pcm cfg80211 cryptd mousedev snd_timer glue_helper pcspkr r8169 input_leds realtek agpgart libphy rfkill snd syscopyarea sysfillrect sysimgblt fb_sys_fops soundcore sp5100_tco k10temp i2c_piix4 wmi evdev gpio_amdpt pinctrl_amd mac_hid pcc_cpufreq acpi_cpufreq sg ip_tables x_tables ext4(E) crc32c_generic(E) crc16(E) mbcache(E) jbd2(E) fscrypto(E) sd_mod(E) hid_generic(E) usbhid(E) hid(E) dm_mod(E) serio_raw(E) atkbd(E) libps2(E) crc32c_intel(E) ahci(E) libahci(E) libata(E) xhci_pci(E) xhci_hcd(E)
      Feb 28 19:27:45 kernel:  scsi_mod(E) i8042(E) serio(E) bcache(E) crc64(E)
      Feb 28 19:27:45 kernel: CPU: 2 PID: 896 Comm: cinnamon Tainted: G    B   W   E     4.20.12-arch1-1-custom #1
      Feb 28 19:27:45 kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B450M Pro4, BIOS P1.20 06/26/2018
      Feb 28 19:27:45 kernel: Call Trace:
      Feb 28 19:27:45 kernel:  dump_stack+0x5c/0x80
      Feb 28 19:27:45 kernel:  bad_page.cold.29+0x7f/0xb2
      Feb 28 19:27:45 kernel:  __free_pages_ok+0x2c0/0x2d0
      Feb 28 19:27:45 kernel:  skb_release_data+0x96/0x180
      Feb 28 19:27:45 kernel:  __kfree_skb+0xe/0x20
      Feb 28 19:27:45 kernel:  tcp_recvmsg+0x894/0xc60
      Feb 28 19:27:45 kernel:  ? reuse_swap_page+0x120/0x340
      Feb 28 19:27:45 kernel:  ? ptep_set_access_flags+0x23/0x30
      Feb 28 19:27:45 kernel:  inet_recvmsg+0x5b/0x100
      Feb 28 19:27:45 kernel:  __sys_recvfrom+0xc3/0x180
      Feb 28 19:27:45 kernel:  ? handle_mm_fault+0x10a/0x250
      Feb 28 19:27:45 kernel:  ? syscall_trace_enter+0x1d3/0x2d0
      Feb 28 19:27:45 kernel:  ? __audit_syscall_exit+0x22a/0x290
      Feb 28 19:27:45 kernel:  __x64_sys_recvfrom+0x24/0x30
      Feb 28 19:27:45 kernel:  do_syscall_64+0x5b/0x170
      Feb 28 19:27:45 kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Cc: stable@vger.kernel.org
      Reported-and-tested-by: Jan Viktorin <jan.viktorin@gmail.com>
      Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
      Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
      Fixes: 80187fd3 ('iommu/amd: Optimize map_sg and unmap_sg')
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
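The address computation fixed by the commit above can be worked through numerically. This is a sketch with assumptions: 4 KiB pages, hypothetical function names, and a mapping that covers the physical page containing the first byte of the segment, so that only the in-page part of the offset still has to be added to the IOVA.

```python
PAGE_SIZE = 4096


def mapped_phys_page(page_phys, offset):
    """Physical page that holds the first byte of the segment."""
    return (page_phys + offset) & ~(PAGE_SIZE - 1)


def dma_address_fixed(iova_base, offset):
    """Add only the in-page remainder of the offset to the mapped IOVA."""
    return iova_base + (offset & (PAGE_SIZE - 1))


def dma_address_unfixed(iova_base, offset):
    """Adding the whole offset overshoots when offset >= PAGE_SIZE."""
    return iova_base + offset


page_phys, offset = 0x100000, 0x1040   # offset is larger than one page
iova_base = 0x8000                     # hypothetical IOVA of the mapped page
```

With these numbers the data starts 0x40 bytes into physical page 0x101000, so the device should be handed IOVA 0x8040. Adding the full offset yields 0x9040, which lies in a different page.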
  16. 14 Mar, 2019 3 commits
  17. 13 Feb, 2019 4 commits
    • iommu/arm-smmu-v3: Use explicit mb() when moving cons pointer · 710e1e56
      Committed by Will Deacon
      [ Upstream commit a868e8530441286342f90c1fd9c5f24de3aa2880 ]
      
      After removing an entry from a queue (e.g. reading an event in
      arm_smmu_evtq_thread()) it is necessary to advance the MMIO consumer
      pointer to free the queue slot back to the SMMU. A memory barrier is
      required here so that all reads targeting the queue entry have
      completed before the consumer pointer is updated.
      
      The implementation of queue_inc_cons() relies on a writel() to complete
      the previous reads, but this is incorrect because writel() is only
      guaranteed to complete prior writes. This patch replaces the call to
      writel() with an mb(); writel_relaxed() sequence, which gives us the
      read->write ordering which we require.
      
      Cc: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • iommu/arm-smmu: Add support for qcom,smmu-v2 variant · 61010bd9
      Committed by Vivek Gautam
      [ Upstream commit 89cddc563743cb1e0068867ac97013b2a5bf86aa ]
      
      qcom,smmu-v2 is an arm,smmu-v2 implementation with specific
      clock and power requirements.
      On msm8996, multiple cores, viz. mdss, video, etc. use this
      smmu. On sdm845, this smmu is used with gpu.
      Add bindings for the same.
      
      Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org>
      Reviewed-by: Rob Herring <robh@kernel.org>
      Reviewed-by: Tomasz Figa <tfiga@chromium.org>
      Tested-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
      Reviewed-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • iommu/arm-smmu-v3: Avoid memory corruption from Hisilicon MSI payloads · 00b0fbb8
      Committed by Zhen Lei
      [ Upstream commit 84a9a75774961612d0c7dd34a1777e8f98a65abd ]
      
      The GITS_TRANSLATER MMIO doorbell register in the ITS hardware is
      architected to be 4 bytes in size, yet on hi1620 and earlier, Hisilicon
      have allocated the adjacent 4 bytes to carry some IMPDEF sideband
      information which results in an 8-byte MSI payload being delivered when
      signalling an interrupt:
      
      MSIAddr:
      	 |----4bytes----|----4bytes----|
      	 |    MSIData   |    IMPDEF    |
      
      This poses no problem for the ITS hardware because the adjacent 4 bytes
      are reserved in the memory map. However, when delivering MSIs to memory,
      as we do in the SMMUv3 driver for signalling the completion of a SYNC
      command, the extended payload will corrupt the 4 bytes adjacent to the
      "sync_count" member in struct arm_smmu_device. Fortunately, the current
      layout allocates these bytes to padding, but this is fragile and we
      should make this explicit.
      
      Reviewed-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
      [will: Rewrote commit message and comment]
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • iommu/amd: Fix amd_iommu=force_isolation · 3a6f1afa
      Committed by Yu Zhao
      [ Upstream commit c12b08ebbe16f0d3a96a116d86709b04c1ee8e74 ]
      
      The parameter is still there but it's ignored. We need to check its
      value before deciding to go into passthrough mode for AMD IOMMU v2
      capable device.
      
      We occasionally use this parameter to force v2 capable device into
      translation mode to debug memory corruption that we suspect is
      caused by DMA writes.
      
      To address the following comment from Joerg Roedel on the first
      version, v2 capability of device is completely ignored.
      > This breaks the iommu_v2 use-case, as it needs a direct mapping for the
      > devices that support it.
      
      And from Documentation/admin-guide/kernel-parameters.txt:
        This option does not override iommu=pt
      
      Fixes: aafd8ba0 ("iommu/amd: Implement add_device and remove_device")
      Signed-off-by: Yu Zhao <yuzhao@google.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
  18. 07 Feb, 2019 1 commit
  19. 13 Jan, 2019 1 commit
  20. 10 Jan, 2019 1 commit
    • iommu/arm-smmu-v3: Fix big-endian CMD_SYNC writes · 1817b2cc
      Committed by Robin Murphy
      commit 3cd508a8c1379427afb5e16c2e0a7c986d907853 upstream.
      
      When we insert the sync sequence number into the CMD_SYNC.MSIData field,
      we do so in CPU-native byte order, before writing out the whole command
      as explicitly little-endian dwords. Thus on big-endian systems, the SMMU
      will receive and write back a byteswapped version of sync_nr, which would
      be perfect if it were targeting a similarly-little-endian ITS, but since
      it's actually writing back to memory being polled by the CPUs, they're
      going to end up seeing the wrong thing.
      
      Since the SMMU doesn't care what the MSIData actually contains, the
      minimal-overhead solution is to simply add an extra byteswap initially,
      such that it then writes back the big-endian format directly.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 37de98f8 ("iommu/arm-smmu-v3: Use CMD_SYNC completion MSI")
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
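The byte-order problem from the commit above can be reproduced with the `struct` module. This is a simplified model, not the driver: the helper names are hypothetical, the SMMU is assumed to store the 32-bit MSIData to memory in little-endian order, and the polling CPU is assumed to read that word as big-endian.

```python
import struct


def swab32(value):
    """Reverse the byte order of a 32-bit value."""
    return int.from_bytes(value.to_bytes(4, "little"), "big")


def smmu_writeback(msidata):
    """Model: the SMMU stores MSIData to memory as a little-endian word."""
    return struct.pack("<I", msidata)


def big_endian_cpu_read(buf):
    """Model: a big-endian CPU polls the same four bytes."""
    return struct.unpack(">I", buf)[0]


def msidata_for_command(sync_nr, apply_fix):
    """With the fix, the value is byteswapped once before it enters the command."""
    return swab32(sync_nr) if apply_fix else sync_nr


sync_nr = 0x12345678
seen_without_fix = big_endian_cpu_read(
    smmu_writeback(msidata_for_command(sync_nr, apply_fix=False)))
seen_with_fix = big_endian_cpu_read(
    smmu_writeback(msidata_for_command(sync_nr, apply_fix=True)))
```

In the model, the extra swap cancels the one introduced by the write-back, so the big-endian reader sees the sequence number it expects.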
  21. 13 Dec, 2018 4 commits
  22. 14 Nov, 2018 1 commit
  23. 05 Oct, 2018 1 commit
    • iommu/amd: Clear memory encryption mask from physical address · b3e9b515
      Committed by Singh, Brijesh
      Boris Ostrovsky reported a memory leak with device passthrough when SME
      is active.
      
      The VFIO driver uses iommu_iova_to_phys() to get the physical address for
      an iova. This physical address is later passed into vfio_unmap_unpin() to
      unpin the memory. The vfio_unmap_unpin() uses pfn_valid() before unpinning
      the memory. The pfn_valid() check was failing because encryption mask was
      part of the physical address returned. This resulted in the memory not
      being unpinned and therefore leaked after the guest terminates.
      
      The memory encryption mask must be cleared from the physical address in
      iommu_iova_to_phys().
      
      Fixes: 2543a786 ("iommu/amd: Allow the AMD IOMMU to work with memory encryption")
      Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: <iommu@lists.linux-foundation.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: kvm@vger.kernel.org
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: <stable@vger.kernel.org> # 4.14+
      Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
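The leak described in the commit above comes down to one stray bit in an address. The sketch below is a model under assumptions: the encryption bit position (`SME_MASK`, bit 47) and `MAX_PFN` are hypothetical values picked for the example, and `pfn_valid` is a simplified stand-in for the real check.

```python
PAGE_SHIFT = 12
SME_MASK = 1 << 47      # hypothetical position of the encryption bit
MAX_PFN = 1 << 24       # hypothetical amount of RAM, in pages


def sme_set(addr):
    return addr | SME_MASK


def sme_clr(addr):
    return addr & ~SME_MASK


def pfn_valid(pfn):
    """Simplified model: a frame is valid if it lies inside RAM."""
    return 0 <= pfn < MAX_PFN


def iova_to_phys(pte_addr, offset):
    """Return the physical address with the encryption mask cleared."""
    return sme_clr(pte_addr) | offset


pte_addr = sme_set(0x1234000)          # page address as stored in the PTE
phys = iova_to_phys(pte_addr, 0x10)
```

If the mask is left in place, the resulting frame number is far outside RAM, the validity check fails, and the page is never unpinned.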
  24. 26 Sep, 2018 1 commit
  25. 25 Sep, 2018 2 commits
  26. 24 Aug, 2018 2 commits
    • iommu/rockchip: Move irq request past pm_runtime_enable · 1aa55ca9
      Committed by Marc Zyngier
      Enabling the interrupt early, before power has been applied to the
      device, can result in an interrupt being delivered too early if:
      
      - the IOMMU shares an interrupt with a VOP
      - the VOP has a pending interrupt (after a kexec, for example)
      
      In these conditions, we end up taking the interrupt without
      the IOMMU being ready to handle the interrupt (not powered on).
      
      Moving the interrupt request past the pm_runtime_enable() call
      makes sure we can at least access the IOMMU registers. Note that
      this is only a partial fix, and that the VOP interrupt will still
      be screaming until the VOP driver kicks in, which advocates for
      a more synchronized interrupt enabling/disabling approach.
      
      Fixes: 0f181d3c ("iommu/rockchip: Add runtime PM support")
      Reviewed-by: Heiko Stuebner <heiko@sntech.de>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Olof Johansson <olof@lixom.net>
    • iommu/rockchip: Handle errors returned from PM framework · 3fc7c5c0
      Committed by Marc Zyngier
      pm_runtime_get_if_in_use can fail: either PM has been disabled
      altogether (-EINVAL), or the device hasn't been enabled yet (0).
      Sadly, the Rockchip IOMMU driver tends to conflate the two things
      by considering a non-zero return value as successful.
      
      This has the consequence of hiding other bugs, so let's handle this
      case throughout the driver, with a WARN_ON_ONCE so that we can try
      and work out what happened.
      
      Fixes: 0f181d3c ("iommu/rockchip: Add runtime PM support")
      Reviewed-by: Heiko Stuebner <heiko@sntech.de>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Olof Johansson <olof@lixom.net>
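The return-value confusion described in the commit above can be spelled out explicitly. This is a sketch, not driver code: `conflated_check` and `classify` are hypothetical helpers, and the three outcomes follow the commit message (negative error, zero when the device is not in use, positive when a reference was taken).

```python
import errno


def conflated_check(ret):
    """Pre-fix pattern: any non-zero value counts as success."""
    return bool(ret)


def classify(ret):
    """Separate the three outcomes the commit message distinguishes."""
    if ret < 0:
        return "error"        # e.g. -EINVAL: runtime PM is disabled
    if ret == 0:
        return "not-in-use"   # device not enabled yet, registers untouchable
    return "in-use"           # a reference was taken, safe to proceed


pm_disabled = -errno.EINVAL
```

The conflated check accepts `-EINVAL` as if the device were powered, which is the case the driver now reports with a warning instead of proceeding.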
  27. 18 Aug, 2018 1 commit