1. 22 Mar, 2017 · 1 commit
    • K
      iommu/vt-d: Fix NULL pointer dereference in device_to_iommu · 5003ae1e
      Koos Vriezen committed
      The function device_to_iommu() in the Intel VT-d driver
      lacks a NULL-ptr check, resulting in this oops at boot on
      some platforms:
      
       BUG: unable to handle kernel NULL pointer dereference at 00000000000007ab
       IP: [<ffffffff8132234a>] device_to_iommu+0x11a/0x1a0
       PGD 0
      
       [...]
      
       Call Trace:
         ? find_or_alloc_domain.constprop.29+0x1a/0x300
         ? dw_dma_probe+0x561/0x580 [dw_dmac_core]
         ? __get_valid_domain_for_dev+0x39/0x120
         ? __intel_map_single+0x138/0x180
         ? intel_alloc_coherent+0xb6/0x120
         ? sst_hsw_dsp_init+0x173/0x420 [snd_soc_sst_haswell_pcm]
         ? mutex_lock+0x9/0x30
         ? kernfs_add_one+0xdb/0x130
         ? devres_add+0x19/0x60
         ? hsw_pcm_dev_probe+0x46/0xd0 [snd_soc_sst_haswell_pcm]
         ? platform_drv_probe+0x30/0x90
         ? driver_probe_device+0x1ed/0x2b0
         ? __driver_attach+0x8f/0xa0
         ? driver_probe_device+0x2b0/0x2b0
         ? bus_for_each_dev+0x55/0x90
         ? bus_add_driver+0x110/0x210
         ? 0xffffffffa11ea000
         ? driver_register+0x52/0xc0
         ? 0xffffffffa11ea000
         ? do_one_initcall+0x32/0x130
         ? free_vmap_area_noflush+0x37/0x70
         ? kmem_cache_alloc+0x88/0xd0
         ? do_init_module+0x51/0x1c4
         ? load_module+0x1ee9/0x2430
         ? show_taint+0x20/0x20
         ? kernel_read_file+0xfd/0x190
         ? SyS_finit_module+0xa3/0xb0
         ? do_syscall_64+0x4a/0xb0
         ? entry_SYSCALL64_slow_path+0x25/0x25
       Code: 78 ff ff ff 4d 85 c0 74 ee 49 8b 5a 10 0f b6 9b e0 00 00 00 41 38 98 e0 00 00 00 77 da 0f b6 eb 49 39 a8 88 00 00 00 72 ce eb 8f <41> f6 82 ab 07 00 00 04 0f 85 76 ff ff ff 0f b6 4d 08 88 0e 49
       RIP  [<ffffffff8132234a>] device_to_iommu+0x11a/0x1a0
        RSP <ffffc90001457a78>
       CR2: 00000000000007ab
       ---[ end trace 16f974b6d58d0aad ]---
      
      Add the missing pointer check.
      
      Fixes: 1c387188 ("iommu/vt-d: Fix IOMMU lookup for SR-IOV Virtual Functions")
      Signed-off-by: Koos Vriezen <koos.vriezen@gmail.com>
      Cc: stable@vger.kernel.org # 4.8.15+
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
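The shape of such a fix can be sketched in plain C. This is a user-space illustration only, not the kernel code: `struct ex_pci_dev` and `ex_is_virtfn()` are hypothetical names, and the only point carried over from the commit is that the pointer is tested before a field is read through it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a PCI device descriptor. */
struct ex_pci_dev {
    bool is_virtfn;
};

/* Returns true only if pdev is non-NULL and marks a virtual function.
 * The NULL test short-circuits, so the field is never read via NULL. */
static bool ex_is_virtfn(const struct ex_pci_dev *pdev)
{
    return pdev != NULL && pdev->is_virtfn;
}
```

Calling `ex_is_virtfn(NULL)` simply returns false instead of faulting, which is the behaviour a lookup needs when it is handed a device that has no PCI descriptor.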
  2. 28 Feb, 2017 · 1 commit
  3. 25 Feb, 2017 · 1 commit
  4. 10 Feb, 2017 · 3 commits
  5. 31 Jan, 2017 · 2 commits
    • D
      iommu/vt-d: Don't over-free page table directories · f7116e11
      David Dillow committed
      dma_pte_free_level() recurses down the IOMMU page tables and frees
      directory pages that are entirely contained in the given PFN range.
      Unfortunately, it incorrectly calculates the starting address covered
      by the PTE under consideration, which can lead to it clearing an entry
      that is still in use.
      
      This occurs if we have a scatterlist with an entry that has a length
      greater than 1026 MB and is aligned to 2 MB for both the IOMMU and
      physical addresses. For example, if __domain_mapping() is asked to map a
      two-entry scatterlist with 2 MB and 1028 MB segments to PFN 0xffff80000,
      it will ask dma_pte_free_pagetable() to free the PFNs from
      0xffff80200 to 0xffffc05ff, and it will also incorrectly clear the PFNs from
      0xffff80000 to 0xffff801ff because of this issue. The current code will
      set level_pfn to 0xffff80200, and 0xffff80200-0xffffc01ff fits inside
      the range being cleared. Properly setting the level_pfn for the current
      level under consideration catches that this PTE is outside of the range
      being cleared.
      
      This patch also changes the value passed into dma_pte_free_level() when
      it recurses. This only affects the first PTE of the range being cleared,
      and is handled by the existing code that ensures we start our cursor no
      lower than start_pfn.
      
      This was found when using dma_map_sg() to map large chunks of contiguous
      memory, which immediately led to faults on the first access of the
      erroneously-deleted mappings.
      
      Fixes: 3269ee0b ("intel-iommu: Fix leaks in pagetable freeing")
      Reviewed-by: Benjamin Serebrin <serebrin@google.com>
      Signed-off-by: David Dillow <dillow@google.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
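The arithmetic in the example above can be checked with a small user-space sketch. It assumes 512 entries (9 bits) per page-table level, and the helper names prefixed with `ex_` are hypothetical; they only imitate the kind of mask and size calculation the commit message talks about, and are not the driver's code.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_LEVEL_STRIDE 9  /* assumed: 512 entries per page-table level */

/* Number of PFNs covered by one entry at the given level (level 1 = 1 PFN). */
static uint64_t ex_level_size(int level)
{
    return (uint64_t)1 << ((level - 1) * EX_LEVEL_STRIDE);
}

/* Mask that rounds a PFN down to the start of its entry at the given level. */
static uint64_t ex_level_mask(int level)
{
    return ~(ex_level_size(level) - 1);
}

/* True if the entry starting at level_pfn lies entirely in [start_pfn, last_pfn]. */
static bool ex_entry_within(uint64_t level_pfn, int level,
                            uint64_t start_pfn, uint64_t last_pfn)
{
    return start_pfn <= level_pfn &&
           last_pfn >= level_pfn + ex_level_size(level) - 1;
}
```

With the range 0xffff80200 to 0xffffc05ff and an entry that covers 0x40000 PFNs (level 3 under the assumption above), masking the cursor with the mask of the level below leaves 0xffff80200, and 0xffff80200 to 0xffffc01ff appears to fit in the range. Masking with the entry's own level gives 0xffff80000, which is below the start of the range, so the entry is correctly left alone.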
    • A
      iommu/vt-d: Tylersburg isoch identity map check is done too late. · 21e722c4
      Ashok Raj committed
      The check to set the identity map for Tylersburg is done too late. It needs
      to be done before the check for the identity_map domain.
      
      To: Joerg Roedel <joro@8bytes.org>
      To: David Woodhouse <dwmw2@infradead.org>
      Cc: iommu@lists.linux-foundation.org
      Cc: linux-kernel@vger.kernel.org
      Cc: stable@vger.kernel.org
      Cc: Ashok Raj <ashok.raj@intel.com>
      
      Fixes: 86080ccc ("iommu/vt-d: Allocate si_domain in init_dmars()")
      Signed-off-by: Ashok Raj <ashok.raj@intel.com>
      Reported-by: Yunhong Jiang <yunhong.jiang@intel.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
  6. 23 Jan, 2017 · 1 commit
  7. 04 Jan, 2017 · 2 commits
    • J
      iommu/vt-d: Fix pasid table size encoding · 65ca7f5f
      Jacob Pan committed
      Different encodings are used to represent supported PASID bits
      and number of PASID table entries.
      The current code assigns ecap_pss directly to extended context
      table entry PTS which is wrong and could result in writing
      non-zero bits to the reserved fields. IOMMU fault reason
      11 will be reported when reserved bits are nonzero.
      This patch converts ecap_pss to the extended context entry PTS encoding
      based on the VT-d spec, Chapter 9.4, as follows:
       - number of PASID bits = ecap_pss + 1
       - number of PASID table entries = 2^(pts + 5)
      The software-assigned pasid_max limit is also respected, to
      match the allocation limit of the PASID table.
      
      cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      cc: Ashok Raj <ashok.raj@intel.com>
      Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
      Tested-by: Mika Kuoppala <mika.kuoppala@intel.com>
      Fixes: 2f26e0a9 ('iommu/vt-d: Add basic SVM PASID support')
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
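The two encodings quoted above can be related with a short sketch. It is an assumption-laden illustration rather than the driver's function: `ex_pss_to_pts()` is a hypothetical name, and `pasid_max` is assumed to be a power-of-two limit on the number of table entries.

```c
#include <stdint.h>

/*
 * Convert the PSS capability field to the PTS encoding:
 *   PASID bits    = ecap_pss + 1
 *   table entries = 2^(pts + 5)
 * pasid_max is an assumed software limit on table entries (a power of two).
 */
static unsigned int ex_pss_to_pts(unsigned int ecap_pss, uint32_t pasid_max)
{
    unsigned int bits = ecap_pss + 1;   /* supported PASID width */
    unsigned int max_bits = 0;

    /* Integer log2 of the software limit. */
    while (max_bits < 31 && ((uint32_t)1 << (max_bits + 1)) <= pasid_max)
        max_bits++;

    if (bits > max_bits)
        bits = max_bits;                /* respect the software limit */

    /* The smallest encodable table has 2^5 entries. */
    return bits < 5 ? 0 : bits - 5;
}
```

For a device reporting `ecap_pss = 19` (20 PASID bits) and a limit of 0x100000 entries, the result is 15, since 2^(15 + 5) = 2^20. Writing 19 directly into the field would describe a table of 2^24 entries instead.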
    • X
      iommu/vt-d: Flush old iommu caches for kdump when the device gets context mapped · aec0e861
      Xunlei Pang committed
      We hit a DMAR fault on both hpsa P420i and P421 SmartArray controllers
      under kdump. It can be reproduced reliably on several different machines;
      the dmesg log looks like this:
      HP HPSA Driver (v 3.4.16-0)
      hpsa 0000:02:00.0: using doorbell to reset controller
      hpsa 0000:02:00.0: board ready after hard reset.
      hpsa 0000:02:00.0: Waiting for controller to respond to no-op
      DMAR: Setting identity map for device 0000:02:00.0 [0xe8000 - 0xe8fff]
      DMAR: Setting identity map for device 0000:02:00.0 [0xf4000 - 0xf4fff]
      DMAR: Setting identity map for device 0000:02:00.0 [0xbdf6e000 - 0xbdf6efff]
      DMAR: Setting identity map for device 0000:02:00.0 [0xbdf6f000 - 0xbdf7efff]
      DMAR: Setting identity map for device 0000:02:00.0 [0xbdf7f000 - 0xbdf82fff]
      DMAR: Setting identity map for device 0000:02:00.0 [0xbdf83000 - 0xbdf84fff]
      DMAR: DRHD: handling fault status reg 2
      DMAR: [DMA Read] Request device [02:00.0] fault addr fffff000 [fault reason 06] PTE Read access is not set
      hpsa 0000:02:00.0: controller message 03:00 timed out
      hpsa 0000:02:00.0: no-op failed; re-trying
      
      After some debugging, we found that the fault address comes from DMA initiated at
      the driver probe stage after reset (not in-flight DMA), and that the corresponding
      PTE value is correct. The fault is likely due to stale IOMMU caches left by
      the in-flight DMA that preceded it.
      
      Thus we need to flush the old caches after context mapping is set up for the
      device, at which point the device is supposed to have finished its reset at the
      driver probe stage and no in-flight DMA remains.
      
      I'm not sure whether the hardware is responsible for invalidating all the related
      caches previously allocated in the IOMMU hardware, but that does not seem to be
      the case for hpsa; in fact, many device drivers have problems resetting the
      hardware properly. In any case, flushing (again) in software in the kdump kernel
      when the device gets context mapped, which is a quite infrequent operation, does
      little harm.
      
      With this patch, the problematic machine can survive the kdump tests.
      
      CC: Myron Stowe <myron.stowe@gmail.com>
      CC: Joseph Szczypek <jszczype@redhat.com>
      CC: Don Brace <don.brace@microsemi.com>
      CC: Baoquan He <bhe@redhat.com>
      CC: Dave Young <dyoung@redhat.com>
      Fixes: 091d42e4 ("iommu/vt-d: Copy translation tables from old kernel")
      Fixes: dbcd861f ("iommu/vt-d: Do not re-use domain-ids from the old kernel")
      Fixes: cf484d0e ("iommu/vt-d: Mark copied context entries")
      Signed-off-by: Xunlei Pang <xlpang@redhat.com>
      Tested-by: Don Brace <don.brace@microsemi.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
  8. 02 Dec, 2016 · 1 commit
  9. 08 Nov, 2016 · 1 commit
  10. 30 Oct, 2016 · 1 commit
    • A
      iommu/vt-d: Fix IOMMU lookup for SR-IOV Virtual Functions · 1c387188
      Ashok Raj committed
      The VT-d specification (§8.3.3) says:
          ‘Virtual Functions’ of a ‘Physical Function’ are under the scope
          of the same remapping unit as the ‘Physical Function’.
      
      The BIOS is not required to list all the possible VFs in the scope
      tables, and arguably *shouldn't* make any attempt to do so, since there
      could be a huge number of them.
      
      This has been broken basically for ever — the VF is never going to match
      against a specific unit's scope, so it ends up being assigned to the
      INCLUDE_ALL IOMMU. Which was always actually correct by coincidence, but
      now we're looking at Root-Complex integrated devices with SR-IOV support
      it's going to start being wrong.
      
      Fix it to simply use pci_physfn() before doing the lookup for PCI devices.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
      Signed-off-by: Ashok Raj <ashok.raj@intel.com>
      Signed-off-by: David Woodhouse <dwmw2@infradead.org>
  11. 05 Sep, 2016 · 2 commits
  12. 04 Aug, 2016 · 1 commit
    • K
      dma-mapping: use unsigned long for dma_attrs · 00085f1e
      Krzysztof Kozlowski committed
      The dma-mapping core and the implementations do not change the DMA
      attributes passed by pointer.  Thus the pointer can point to const data.
      However the attributes do not have to be a bitfield.  Instead unsigned
      long will do fine:
      
      1. This is just simpler.  Both in terms of reading the code and setting
         attributes.  Instead of initializing local attributes on the stack
         and passing pointer to it to dma_set_attr(), just set the bits.
      
      2. It brings safety and const-correctness checking because the
         attributes are passed by value.
      
      Semantic patches for this change (at least most of them):
      
          virtual patch
          virtual context
      
          @r@
          identifier f, attrs;
      
          @@
          f(...,
          - struct dma_attrs *attrs
          + unsigned long attrs
          , ...)
          {
          ...
          }
      
          @@
          identifier r.f;
          @@
          f(...,
          - NULL
          + 0
           )
      
      and
      
          // Options: --all-includes
          virtual patch
          virtual context
      
          @r@
          identifier f, attrs;
          type t;
      
          @@
          t f(..., struct dma_attrs *attrs);
      
          @@
          identifier r.f;
          @@
          f(...,
          - NULL
          + 0
           )
      
      Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com
      Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
      Acked-by: Vineet Gupta <vgupta@synopsys.com>
      Acked-by: Robin Murphy <robin.murphy@arm.com>
      Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
      Acked-by: Mark Salter <msalter@redhat.com> [c6x]
      Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> [cris]
      Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> [drm]
      Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
      Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
      Acked-by: Fabien Dessenne <fabien.dessenne@st.com> [bdisp]
      Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> [vb2-core]
      Acked-by: David Vrabel <david.vrabel@citrix.com> [xen]
      Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [xen swiotlb]
      Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
      Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon]
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
      Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390]
      Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
      Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> [avr32]
      Acked-by: Vineet Gupta <vgupta@synopsys.com> [arc]
      Acked-by: Robin Murphy <robin.murphy@arm.com> [arm64 and dma-iommu]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  13. 28 Jul, 2016 · 1 commit
    • L
      Add braces to avoid "ambiguous ‘else’" compiler warnings · 194dc870
      Linus Torvalds committed
      Some of our "for_each_xyz()" macro constructs make gcc unhappy about
      lack of braces around if-statements inside or outside the loop, because
      the loop construct itself has an "if-then-else" statement inside of it.
      
      The resulting warnings look something like this:
      
        drivers/gpu/drm/i915/i915_debugfs.c: In function ‘i915_dump_lrc’:
        drivers/gpu/drm/i915/i915_debugfs.c:2103:6: warning: suggest explicit braces to avoid ambiguous ‘else’ [-Wparentheses]
           if (ctx != dev_priv->kernel_context)
              ^
      
      even if the code itself is fine.
      
      Since the warning is fairly easy to avoid by adding braces around the
      if-statement near the for_each_xyz() construct, do so, rather than
      disabling the otherwise potentially useful warning.
      
      (The if-then-else statements used in the "for_each_xyz()" constructs are
      designed to be inherently safe even with no braces, but in this case
      it's quite understandable that gcc isn't really able to tell that).
      
      This finally leaves the standard "allmodconfig" build with just a
      handful of remaining warnings, so new and valid warnings hopefully will
      stand out.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
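A toy iterator macro shows the pattern being described. `ex_for_each_even()` is a made-up macro for illustration, not one of the kernel's `for_each_xyz()` helpers; like them, it ends in a dangling `else` so that the caller's statement becomes the else branch.

```c
/* Hypothetical iterator: runs the following statement for even i in [0, n). */
#define ex_for_each_even(i, n)              \
    for ((i) = 0; (i) < (n); (i)++)         \
        if ((i) % 2 != 0) {} else

/* Counts the even values below n that are also multiples of four. */
static int ex_count_multiples_of_four(int n)
{
    int i, count = 0;

    /* Braces around the body keep the inner "if" from sitting directly
     * after the macro's "else", which is the situation the commit
     * message says gcc warns about. */
    ex_for_each_even(i, n) {
        if (i % 4 == 0)
            count++;
    }
    return count;
}
```

The braces do not change what the loop does; they only make the nesting explicit, which matches the commit's remark that the code itself was fine.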
  14. 14 Jul, 2016 · 1 commit
  15. 04 Jul, 2016 · 1 commit
    • A
      iommu/vt-d: Fix infinite loop in free_all_cpu_cached_iovas · 0caa7616
      Aaron Campbell committed
      Per VT-d spec Section 10.4.2 ("Capability Register"), the maximum
      number of possible domains is 64K; indeed this is the maximum value
      that the cap_ndoms() macro will expand to.  Since the value 65536
      will not fit in a u16, the 'did' variable must be promoted to an
      int, otherwise the test for < 65536 will always be true and the
      loop will never end.
      
      The symptom, in my case, was a hung machine during suspend.
      
      Fixes: 3bd4f911 ("iommu/vt-d: Fix overflow of iommu->domains array")
      Signed-off-by: Aaron Campbell <aaron@monkey.org>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
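The wraparound is easy to reproduce outside the kernel. The sketch below uses hypothetical helper names and adds a `max_steps` cap purely so that the demonstration returns; the loop in the commit had no such escape.

```c
#include <stdint.h>

/* Runs "for (did = 0; did < limit; did++)" with a 16-bit counter and
 * reports how many steps were taken, giving up after max_steps. */
static unsigned long ex_steps_u16(unsigned long limit, unsigned long max_steps)
{
    unsigned long steps = 0;
    uint16_t did;

    for (did = 0; did < limit && steps < max_steps; did++)
        steps++;    /* did wraps from 65535 back to 0 */
    return steps;
}

/* Same loop with an int counter, which can represent 65536. */
static unsigned long ex_steps_int(unsigned long limit, unsigned long max_steps)
{
    unsigned long steps = 0;
    int did;

    for (did = 0; (unsigned long)did < limit && steps < max_steps; did++)
        steps++;
    return steps;
}
```

With a limit of 65536 the 16-bit version only stops because of the cap, while the `int` version stops after exactly 65536 steps.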
  16. 27 Jun, 2016 · 1 commit
  17. 17 Jun, 2016 · 1 commit
  18. 15 Jun, 2016 · 1 commit
  19. 21 Apr, 2016 · 7 commits
  20. 07 Apr, 2016 · 1 commit
  21. 05 Apr, 2016 · 1 commit
  22. 01 Mar, 2016 · 1 commit
  23. 29 Jan, 2016 · 1 commit
  24. 16 Dec, 2015 · 1 commit
    • D
      Revert "scatterlist: use sg_phys()" · 3e6110fd
      Dan Williams committed
      commit db0fa0cb "scatterlist: use sg_phys()" did replacements of
      the form:
      
          phys_addr_t phys = page_to_phys(sg_page(s));
          phys_addr_t phys = sg_phys(s) & PAGE_MASK;
      
      However, this breaks platforms where sizeof(phys_addr_t) >
      sizeof(unsigned long).  Revert for 4.3 and 4.4 to make room for a
      combined helper in 4.5.
      
      Cc: <stable@vger.kernel.org>
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Fixes: db0fa0cb ("scatterlist: use sg_phys()")
      Suggested-by: Joerg Roedel <joro@8bytes.org>
      Reported-by: Vitaly Lavrov <vel21ripn@gmail.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  25. 07 Nov, 2015 · 1 commit
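One plausible reading of the breakage is an integer-width problem, sketched below with fixed-width types. The assumption, not stated in the commit text, is that the page mask has the width of `unsigned long` (32 bits here) while the physical address is 64 bits wide; the `ex_` names are hypothetical.

```c
#include <stdint.h>

/* Assumed: a page mask as wide as a 32-bit unsigned long, 4 KiB pages. */
#define EX_PAGE_MASK_32 ((uint32_t)~(uint32_t)0xfff)

/* Masking a 64-bit address with a 32-bit mask: the mask is zero-extended,
 * so address bits 32 and above are cleared along with the page offset. */
static uint64_t ex_page_base_truncated(uint64_t phys)
{
    return phys & EX_PAGE_MASK_32;
}

/* Masking with a mask as wide as the address keeps the upper bits. */
static uint64_t ex_page_base(uint64_t phys)
{
    return phys & ~(uint64_t)0xfff;
}
```

For the address 0x123456789 the wide mask yields 0x123456000, while the narrow one yields 0x23456000, which points at a different page altogether.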
    • M
      mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep... · d0164adc
      Mel Gorman committed
      mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd
      
      __GFP_WAIT has been used to identify atomic context in callers that hold
      spinlocks or are in interrupts.  They are expected to be high priority and
      have access to one of two watermarks lower than "min" which can be referred
      to as the "atomic reserve".  __GFP_HIGH users get access to the first
      lower watermark and can be called the "high priority reserve".
      
      Over time, callers had a requirement to not block when fallback options
      were available.  Some have abused __GFP_WAIT leading to a situation where
      an optimistic allocation with a fallback option can access atomic
      reserves.
      
      This patch uses __GFP_ATOMIC to identify callers that are truly atomic,
      cannot sleep and have no alternative.  High priority users continue to use
      __GFP_HIGH.  __GFP_DIRECT_RECLAIM identifies callers that can sleep and
      are willing to enter direct reclaim.  __GFP_KSWAPD_RECLAIM identifies
      callers that want to wake kswapd for background reclaim.  __GFP_WAIT is
      redefined as a caller that is willing to enter direct reclaim and wake
      kswapd for background reclaim.
      
      This patch then converts a number of sites:
      
      o __GFP_ATOMIC is used by callers that are high priority and have memory
        pools for those requests. GFP_ATOMIC uses this flag.
      
      o Callers that have a limited mempool to guarantee forward progress clear
        __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall
        into this category where kswapd will still be woken but atomic reserves
        are not used as there is a one-entry mempool to guarantee progress.
      
      o Callers that are checking if they are non-blocking should use the
        helper gfpflags_allow_blocking() where possible. This is because
        checking for __GFP_WAIT as was done historically now can trigger false
        positives. Some exceptions like dm-crypt.c exist where the code intent
        is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to
        flag manipulations.
      
      o Callers that built their own GFP flags instead of starting with GFP_KERNEL
        and friends now also need to specify __GFP_KSWAPD_RECLAIM.
      
      The first key hazard to watch out for is callers that removed __GFP_WAIT
      and were depending on access to atomic reserves for inconspicuous reasons.
      In some cases it may be appropriate for them to use __GFP_HIGH.
      
      The second key hazard is callers that assembled their own combination of
      GFP flags instead of starting with something like GFP_KERNEL.  They may
      now wish to specify __GFP_KSWAPD_RECLAIM.  It's almost certainly harmless
      if it's missed in most cases as other activity will wake kswapd.
      Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Vitaly Wool <vitalywool@gmail.com>
      Cc: Rik van Riel <riel@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  26. 25 Oct, 2015 · 1 commit
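The false positive mentioned for callers that test the old combined flag can be illustrated with placeholder bits. The `EX_` values below are invented for the example and are not the kernel's definitions; only the relationship between the flags follows the description above.

```c
#include <stdbool.h>

/* Placeholder bit values for illustration only; not the kernel's. */
#define EX_GFP_HIGH            0x01u
#define EX_GFP_ATOMIC          0x02u
#define EX_GFP_DIRECT_RECLAIM  0x04u
#define EX_GFP_KSWAPD_RECLAIM  0x08u

/* Per the description: willing to enter direct reclaim and wake kswapd. */
#define EX_GFP_WAIT (EX_GFP_DIRECT_RECLAIM | EX_GFP_KSWAPD_RECLAIM)

/* A caller may sleep only if it allows direct reclaim. */
static bool ex_allow_blocking(unsigned int flags)
{
    return (flags & EX_GFP_DIRECT_RECLAIM) != 0;
}

/* Historical-style test against the combined flag. */
static bool ex_legacy_wait_test(unsigned int flags)
{
    return (flags & EX_GFP_WAIT) != 0;
}
```

A mempool-style caller that clears the direct-reclaim bit but keeps the kswapd bit must not sleep, yet the legacy-style test still returns true for it, whereas the dedicated helper returns false.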
    • D
      iommu/vt-d: Clean up pasid_enabled() and ecs_enabled() dependencies · d42fde70
      David Woodhouse committed
      When booted with intel_iommu=ecs_off we were still allocating the PASID
      tables even though we couldn't actually use them. We really want to make
      the pasid_enabled() macro depend on ecs_enabled().
      
      Which is unfortunate, because currently they're the other way round to
      cope with the Broadwell/Skylake problems with ECS.
      
      Instead of having ecs_enabled() depend on pasid_enabled(), which was never
      something that made me happy anyway, make it depend in the normal case
      on the "broken PASID" bit 28 *not* being set.
      
      Then pasid_enabled() can depend on ecs_enabled() as it should. And we also
      don't need to mess with it if we ever see an implementation that has some
      features requiring ECS (like PRI) but which *doesn't* have PASID support.
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
  27. 22 Oct, 2015 · 1 commit
  28. 19 Oct, 2015 · 1 commit
  29. 15 Oct, 2015 · 1 commit