Commits · 1e6aaae93e9ddb9dc664993eb949b1da94cab3a5 · openeuler / Kernel

13 Aug, 2020 · 1 commit

mm: do page fault accounting in handle_mm_fault · bce617ed

Authored by Peter Xu on Aug 11, 2020

Patch series "mm: Page fault accounting cleanups", v5.

This is v5 of the pf accounting cleanup series.  It originates from Gerald
Schaefer's report a week ago of incorrect page fault accounting for
retried page faults after commit 4064b982 ("mm: allow VM_FAULT_RETRY for
multiple times"):

  https://lore.kernel.org/lkml/20200610174811.44b94525@thinkpad/

What this series did:

  - Correct page fault accounting: we do accounting for a page fault
    (no matter whether it's from #PF handling, or gup, or anything else)
    only with the one that completed the fault.  For example, page fault
    retries should not be counted in page fault counters.  The same
    applies to the perf events.

  - Unify definition of PERF_COUNT_SW_PAGE_FAULTS: currently this perf
    event is used in an ad hoc way across different archs.

    Case (1): for many archs it's done at the entry of a page fault
    handler, so that it will also cover e.g. erroneous faults.

    Case (2): for some other archs, it is only accounted when the page
    fault is resolved successfully.

    Case (3): there are still quite a few archs that have not enabled
    this perf event.

    Since this series will touch nearly all the archs, we unify this
    perf event to always follow case (1), which is the one that makes most
    sense.  And since we moved the accounting into handle_mm_fault, the
    other two MAJ/MIN perf events are well taken care of naturally.

  - Unify definition of "major faults": the definition of "major
    fault" is slightly changed when used in accounting (not
    VM_FAULT_MAJOR).  More information in patch 1.

  - Always account the page fault onto the one that triggered the page
    fault.  This does not matter much for #PF handling, but mostly for
    gup.  More information on this in patch 25.

Patchset layout:

Patch 1:     Introduced the accounting in handle_mm_fault(), not enabled.
Patch 2-23:  Enable the new accounting for arch #PF handlers one by one.
Patch 24:    Enable the new accounting for the remaining outliers (gup, iommu, etc.)
Patch 25:    Clean up GUP task_struct pointer since it's not needed any more

This patch (of 25):

This is a preparation patch to move page fault accounting into the
general code in handle_mm_fault().  This includes both the per-task
flt_maj/flt_min counters, and the major/minor page fault perf events.  To
do this, the pt_regs pointer is passed into handle_mm_fault().

PERF_COUNT_SW_PAGE_FAULTS should still be kept in per-arch page fault
handlers.

So far, every pt_regs pointer passed into handle_mm_fault() is NULL,
which means this patch should have no intended functional change.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200707225021.200906-1-peterx@redhat.com
Link: http://lkml.kernel.org/r/20200707225021.200906-2-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

bce617ed
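
The accounting rule described in commit bce617ed above (count a fault only on the attempt that completes it, as either major or minor) can be sketched roughly as follows. This is an illustration only, not the kernel code: `task_counters`, `fault_outcome` and `account_fault` are hypothetical names, and treating errors like retries is an assumption drawn from the phrase "only with the one that completed the fault".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the per-task flt_maj/flt_min counters. */
struct task_counters {
    unsigned long maj_flt;
    unsigned long min_flt;
};

/* Hypothetical outcome of one attempt at handling a fault. */
enum fault_outcome {
    FAULT_COMPLETED,  /* the fault was resolved by this attempt */
    FAULT_RETRY,      /* the handler asked the caller to try again */
    FAULT_ERROR       /* the fault could not be resolved (assumed uncounted) */
};

/*
 * Account one fault attempt. Only the attempt that completes the fault
 * is counted, so retries and errors leave the counters untouched.
 * Returns true if the attempt was counted.
 */
static bool account_fault(struct task_counters *t,
                          enum fault_outcome outcome, bool major)
{
    assert(t != NULL);

    if (outcome != FAULT_COMPLETED)
        return false;

    if (major)
        t->maj_flt++;
    else
        t->min_flt++;
    return true;
}
```

With this shape, a fault that is retried twice and then completes bumps exactly one counter, whichever caller path (#PF handler or gup) drove the attempts.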

24 Jul, 2020 · 6 commits

iommu/vt-d: Rename intel-pasid.h to pasid.h · 02f3effd

Authored by Lu Baolu on Jul 24, 2020

As Intel VT-d files have been moved to their own subdirectory, the prefix
makes no sense. No functional changes.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200724014925.15523-13-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>

02f3effd

iommu/vt-d: Add page response ops support · 8b737121

Authored by Lu Baolu on Jul 24, 2020

After page requests are handled, software must respond to the device
which raised the page request with the result. This is done through
the iommu ops.page_response if the request was reported outside the
vendor iommu driver through iommu_report_device_fault(). This adds the
VT-d implementation of page_response ops.

Co-developed-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Co-developed-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20200724014925.15523-12-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>

8b737121

iommu/vt-d: Report page request faults for guest SVA · eb8d93ea

Authored by Lu Baolu on Jul 24, 2020

A pasid might be bound to a page table from a VM guest via the iommu
ops.sva_bind_gpasid. In this case, when a DMA page fault is detected
on the physical IOMMU, we need to inject the page fault request into
the guest. After the guest completes handling the page fault, a page
response needs to be sent back via the iommu ops.page_response().

This adds support to report a page request fault. Any external module
which is interested in handling this fault should register a notifier
with iommu_register_device_fault_handler().

Co-developed-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Co-developed-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20200724014925.15523-11-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>

eb8d93ea

iommu/vt-d: Add a helper to get svm and sdev for pasid · 19abcf70

Authored by Lu Baolu on Jul 24, 2020

There are several places in the code that need to get the pointers of
svm and sdev according to a pasid and device. Add a helper to achieve
this for code consolidation and readability.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20200724014925.15523-10-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>

19abcf70

iommu/vt-d: Refactor device_to_iommu() helper · dd6692f1

Authored by Lu Baolu on Jul 24, 2020

It is refactored in two ways:

- Make it global so that it could be used in other files.

- Make bus/devfn optional so that callers could ignore these two returned
values when they only want to get the corresponding iommu pointer.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20200724014925.15523-9-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>

dd6692f1
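
The "optional bus/devfn" part of the refactoring in commit dd6692f1 follows a common C pattern: output parameters that may be NULL. A minimal sketch, assuming a made-up `toy_device` record and an integer standing in for the iommu pointer (neither is the kernel's actual type or signature):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device record; not the kernel's struct device. */
struct toy_device {
    uint8_t bus;
    uint8_t devfn;
    int iommu_id;   /* stands in for the iommu pointer */
};

/*
 * Look up the IOMMU for a device. The bus and devfn outputs are
 * optional: callers that only need the IOMMU pass NULL for both.
 */
static int toy_device_to_iommu(const struct toy_device *dev,
                               uint8_t *bus, uint8_t *devfn)
{
    assert(dev != NULL);

    if (bus)
        *bus = dev->bus;
    if (devfn)
        *devfn = dev->devfn;
    return dev->iommu_id;
}
```

Callers that previously had to declare throwaway `bus` and `devfn` variables can now pass `NULL, NULL`.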

iommu/vt-d: Disable multiple GPASID-dev bind · d315e9e6

Authored by Jacob Pan on Jul 24, 2020

Consider the unlikely use case where multiple aux domains from the same
pdev are attached to a single guest and then bound to a single process
(thus the same PASID) within that guest. We cannot easily support this
case by refcounting the number of users, because there is only one SL
page table per PASID while we have multiple aux domains and thus
multiple SL page tables for the same PASID.

Extra unbinding of a guest PASID can happen due to a race between normal
and exception cases. Termination of one aux domain may affect others
unless we actively track and switch aux domains to ensure the validity
of SL page tables and TLB states in the shared PASID entry.

Support for sharing second level PGDs across domains can reduce the
complexity but this is not available due to the limitations on VFIO
container architecture. We can revisit this decision once sharing PGDs
is available.

Overall, the complexity and potential glitches do not warrant supporting
this unlikely use case, so this patch removes it.

Fixes: 56722a43 ("iommu/vt-d: Add bind guest PASID support")
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20200724014925.15523-8-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>

d315e9e6

10 Jun, 2020 · 2 commits

iommu/vt-d: Move Intel IOMMU driver into subdirectory · 672cf6df

Authored by Joerg Roedel on Jun 09, 2020

Move all files related to the Intel IOMMU driver into its own
subdirectory.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200609130303.26974-3-joro@8bytes.org

672cf6df

mmap locking API: use coccinelle to convert mmap_sem rwsem call sites · d8ed45c5

Authored by Michel Lespinasse on Jun 08, 2020

This change converts the existing mmap_sem rwsem calls to use the new mmap
locking API instead.

The change is generated using coccinelle with the following rule:

// spatch --sp-file mmap_lock_api.cocci --in-place --include-headers --dir .

@@
expression mm;
@@
(
-init_rwsem
+mmap_init_lock
|
-down_write
+mmap_write_lock
|
-down_write_killable
+mmap_write_lock_killable
|
-down_write_trylock
+mmap_write_trylock
|
-up_write
+mmap_write_unlock
|
-downgrade_write
+mmap_write_downgrade
|
-down_read
+mmap_read_lock
|
-down_read_killable
+mmap_read_lock_killable
|
-down_read_trylock
+mmap_read_trylock
|
-up_read
+mmap_read_unlock
)
-(&mm->mmap_sem)
+(mm)

Signed-off-by: Michel Lespinasse <walken@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Liam Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ying Han <yinghan@google.com>
Link: http://lkml.kernel.org/r/20200520052908.204642-5-walken@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d8ed45c5
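
The effect of the coccinelle rule in commit d8ed45c5 is that call sites stop naming the lock field and pass the mm to a wrapper instead. The sketch below illustrates that shape with a toy, single-threaded lock. All `toy_` names are hypothetical and the lock is not a real rwsem; only the before/after calling convention is the point.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-threaded reader/writer lock; a stand-in for the rwsem. */
struct toy_rwsem {
    int readers;
    bool writer;
};

/* Toy address space that embeds the lock, like mm->mmap_sem. */
struct toy_mm {
    struct toy_rwsem mmap_sem;
};

/* Old style: callers reach into the mm and name the lock field. */
static bool toy_down_read_trylock(struct toy_rwsem *sem)
{
    if (sem->writer)
        return false;
    sem->readers++;
    return true;
}

static void toy_up_read(struct toy_rwsem *sem)
{
    assert(sem->readers > 0);
    sem->readers--;
}

/* New style: wrappers take the mm, hiding which field holds the lock. */
static bool toy_mmap_read_trylock(struct toy_mm *mm)
{
    return toy_down_read_trylock(&mm->mmap_sem);
}

static void toy_mmap_read_unlock(struct toy_mm *mm)
{
    toy_up_read(&mm->mmap_sem);
}
```

Once every caller goes through the wrappers, the lock field can be renamed or reimplemented without touching the call sites again.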

29 May, 2020 · 1 commit

iommu/vt-d: Fix compile warning · 71974cfb

Authored by Jacob Pan on May 28, 2020

Make intel_svm_unbind_mm() a static function.

Fixes: 064a57d7 ("iommu/vt-d: Replace intel SVM APIs with generic SVA APIs")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/1590689031-79318-1-git-send-email-jacob.jun.pan@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>

71974cfb

25 May, 2020 · 1 commit

iommu/vt-d: Fix pointer cast warnings on 32 bit · bfe6240d

Authored by Lu Baolu on May 19, 2020

Pointers should be cast to unsigned long to avoid "cast from pointer
to integer of different size" warnings.

drivers/iommu/intel-pasid.c:818:6: warning:
cast from pointer to integer of different size [-Wpointer-to-int-cast]
drivers/iommu/intel-pasid.c:821:9: warning:
cast from pointer to integer of different size [-Wpointer-to-int-cast]
drivers/iommu/intel-pasid.c:824:23: warning:
cast from pointer to integer of different size [-Wpointer-to-int-cast]
drivers/iommu/intel-svm.c:343:45: warning:
cast to pointer from integer of different size [-Wint-to-pointer-cast]

Fixes: b0d1f874 ("iommu/vt-d: Add nested translation helper function")
Fixes: 56722a43 ("iommu/vt-d: Add bind guest PASID support")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200519013423.11971-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>

bfe6240d
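
The warnings quoted in commit bfe6240d arise when a pointer is cast directly to a 64-bit integer on a platform with 32-bit pointers. A small user-space sketch of the usual remedy, going through a pointer-sized integer first, is shown below. It uses `uintptr_t` where the commit text says `unsigned long`; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a pointer to a 64-bit value without a size-mismatch warning.
 * The inner cast yields an integer of pointer width; the outer cast
 * then widens it to 64 bits.
 */
static uint64_t ptr_to_u64(const void *p)
{
    return (uint64_t)(uintptr_t)p;
}

/* The reverse direction: narrow to pointer width, then cast. */
static void *u64_to_ptr(uint64_t v)
{
    assert(v <= UINTPTR_MAX);   /* value must fit in a pointer */
    return (void *)(uintptr_t)v;
}
```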

18 May, 2020 · 8 commits

iommu/vt-d: Remove duplicated check in intel_svm_bind_mm() · 7482fd59

Authored by Lu Baolu on May 16, 2020

The info and info->pasid_support have already been checked in the
previous intel_iommu_enable_pasid() call. No need to check again.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200516062101.29541-18-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>

7482fd59

iommu/vt-d: Remove redundant IOTLB flush · 81ebd91a

Authored by Lu Baolu on May 16, 2020

The IOTLB flush is already included in the PASID tear down and the page
request drain process. There is no need to flush again.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20200516062101.29541-17-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>

81ebd91a

iommu/vt-d: Add page request draining support · 66ac4db3

Authored by Lu Baolu on May 16, 2020

When a PASID is stopped or terminated, there can be pending PRQs
(requests that haven't received responses) in remapping hardware.
This adds the interface to drain page requests and calls it when a
PASID is terminated.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200516062101.29541-16-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>

66ac4db3

iommu/vt-d: Disable non-recoverable fault processing before unbind · 37e91bd4

Authored by Lu Baolu on May 16, 2020

When a PASID is used for SVA by the device, it's possible that the PASID
entry is cleared before the device flushes all ongoing DMA requests. The
IOMMU should tolerate and ignore the non-recoverable faults caused by the
untranslated requests from this device.

For example, when an exception happens, the process terminates before the
device driver stops DMA and calls the IOMMU driver to unbind the PASID.
The flow of process exit is as follows:

do_exit() {
     exit_mm() {
             mm_put();
             exit_mmap() {
                     intel_invalidate_range() //mmu notifier
                     tlb_finish_mmu()
                     mmu_notifier_release(mm) {
                             intel_iommu_release() {
[2]                                  intel_iommu_teardown_pasid();
                                     intel_iommu_flush_tlbs();
                             }
                     }
                     unmap_vmas();
                     free_pgtables();
             };
     }
     exit_files(tsk) {
             close_files() {
                     dsa_close();
[1]                  dsa_stop_dma();
                     intel_svm_unbind_pasid();
             }
     }
}

Care must be taken on VT-d to avoid unrecoverable faults between the time
window of [1] and [2]. [Process exit flow was contributed by Jacob Pan.]

Intel VT-d provides such a function through the FPD bit of the PASID entry.
This sets the FPD bit when the PASID entry is changing from present to
nonpresent in the mm notifier and will clear it when the pasid is unbound.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Link: https://lore.kernel.org/r/20200516062101.29541-15-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>

37e91bd4

iommu/vt-d: Multiple descriptors per qi_submit_sync() · 8a1d8246

Authored by Lu Baolu on May 16, 2020

The current qi_submit_sync() only supports a single invalidation
descriptor per submission and appends a wait descriptor after each
submission to poll for hardware completion. This extends the
qi_submit_sync() helper to support multiple descriptors, and adds an
option so that the caller can specify the Page-request Drain (PD) bit in
the wait descriptor.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20200516062101.29541-13-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>

8a1d8246

iommu/vt-d: Replace intel SVM APIs with generic SVA APIs · 064a57d7

Authored by Jacob Pan on May 16, 2020

This patch is an initial step to replace Intel SVM code with the
following IOMMU SVA ops:
intel_svm_bind_mm() => iommu_sva_bind_device()
intel_svm_unbind_mm() => iommu_sva_unbind_device()
intel_svm_is_pasid_valid() => iommu_sva_get_pasid()

The features below will continue to work but are not included in this patch
in that they are handled mostly within the IOMMU subsystem.
- IO page fault
- mmu notifier

Consolidation of the above will come after merging generic IOMMU sva
code[1]. There should not be any changes needed for SVA users such as
accelerator device drivers during this time.

[1] http://jpbrucker.net/sva/

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200516062101.29541-12-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>

064a57d7

iommu/vt-d: Add get_domain_info() helper · e85bb99b

Authored by Lu Baolu on May 16, 2020

Add a get_domain_info() helper to retrieve the valid per-device
iommu private data.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200516062101.29541-10-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>

e85bb99b

iommu/vt-d: Add bind guest PASID support · 56722a43

Authored by Jacob Pan on May 16, 2020

When supporting guest SVA with emulated IOMMU, the guest PASID
table is shadowed in VMM. Updates to guest vIOMMU PASID table
will result in PASID cache flush which will be passed down to
the host as bind guest PASID calls.

The SL page tables will be harvested from the device's
default domain (request w/o PASID), or aux domain in case of
mediated device.

    .-------------.  .---------------------------.
    |   vIOMMU    |  | Guest process CR3, FL only|
    |             |  '---------------------------'
    .----------------/
    | PASID Entry |--- PASID cache flush -
    '-------------'                       |
    |             |                       V
    |             |                CR3 in GPA
    '-------------'
Guest
------| Shadow |--------------------------|--------
      v        v                          v
Host
    .-------------.  .----------------------.
    |   pIOMMU    |  | Bind FL for GVA-GPA  |
    |             |  '----------------------'
    .----------------/  |
    | PASID Entry |     V (Nested xlate)
    '----------------\.------------------------------.
    |             |   |SL for GPA-HPA, default domain|
    |             |   '------------------------------'
    '-------------'
Where:
 - FL = First level/stage one page tables
 - SL = Second level/stage two page tables

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200516062101.29541-5-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>

56722a43

27 Mar, 2020 · 1 commit

iommu/vt-d: Fix mm reference leak · 902baf61

Authored by Jacob Pan on Mar 19, 2020

Move canonical address check before mmget_not_zero() to avoid mm
reference leak.

Fixes: 9d8c3af3 ("iommu/vt-d: IOMMU Page Request needs to check if address is canonical.")
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

902baf61
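
The ordering fix in commit 902baf61 is an instance of a general rule: validate inputs before taking a reference, so that the early-exit path has nothing to release. The sketch below shows that ordering with a toy refcount. It assumes 48-bit canonical addresses (bits 63..47 all equal), and `toy_ref`, `toy_is_canonical` and `toy_handle_request` are hypothetical names rather than kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy refcounted object standing in for the mm. */
struct toy_ref {
    int refcount;
};

/* True if bits 63..47 of addr are all equal (assumed 48-bit canonical form). */
static bool toy_is_canonical(uint64_t addr)
{
    uint64_t top = addr >> 47;          /* the 17 upper bits */

    return top == 0 || top == 0x1FFFF;
}

/*
 * Handle one request. The address is validated before the reference
 * is taken, so the early return has nothing to release.
 */
static bool toy_handle_request(struct toy_ref *mm, uint64_t addr)
{
    assert(mm != NULL && mm->refcount > 0);

    if (!toy_is_canonical(addr))
        return false;      /* no reference held yet, nothing leaks */

    mm->refcount++;        /* take the reference */
    /* ... the request would be serviced here ... */
    mm->refcount--;        /* drop the reference */
    return true;
}
```

Had the check come after the increment, the early return would have needed its own decrement, which is the kind of omission that produces a leak.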

19 Mar, 2020 · 1 commit

iommu/vt-d: Fix page request descriptor size · 52355fb1

Authored by Jacob Pan on Mar 17, 2020

Intel VT-d might support PRS (Page Request Support) when it's
running in the scalable mode. Each page request descriptor
occupies 32 bytes and is 32-byte aligned. The page request
descriptor offset mask should be 32-byte aligned.

Fixes: 5b438f4b ("iommu/vt-d: Support page request in scalable mode")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

52355fb1
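
To make the "32-byte aligned offset mask" in commit 52355fb1 concrete, here is a small sketch of a ring whose offsets advance one 32-byte descriptor at a time. The ring size of 4096 bytes and all `TOY_`/`toy_` names are assumptions for illustration; only the 32-byte descriptor size comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PRQ_RING_BYTES 4096u  /* assumed ring size, a power of two */
#define TOY_PRQ_DESC_BYTES 32u    /* descriptor size stated in the commit */

/* Offset mask that keeps offsets inside the ring and 32-byte aligned. */
#define TOY_PRQ_RING_MASK (TOY_PRQ_RING_BYTES - TOY_PRQ_DESC_BYTES)

/* Advance a ring offset by one descriptor, wrapping at the ring end. */
static uint32_t toy_prq_next(uint32_t offset)
{
    assert((offset & ~TOY_PRQ_RING_MASK) == 0);  /* aligned, in range */
    return (offset + TOY_PRQ_DESC_BYTES) & TOY_PRQ_RING_MASK;
}
```

Because the mask clears the low five bits, any offset passed through it lands on a descriptor boundary; a mask built for 16-byte descriptors would let offsets point into the middle of a 32-byte descriptor.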

07 Jan, 2020 · 7 commits

iommu/vt-d: Add PASID_FLAG_FL5LP for first-level pasid setup · 87208f22

Authored by Lu Baolu on Jan 02, 2020

The current intel_pasid_setup_first_level() uses 5-level paging for
first level translation if CPUs use 5-level paging mode too.
This makes sense for SVA usages since the page table is shared
between CPUs and IOMMUs. But it makes no sense if we only want
to use first level for IOVA translation. Add PASID_FLAG_FL5LP
bit in the flags which indicates whether the 5-level paging
mode should be used.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

87208f22

iommu/vt-d: Misc macro clean up for SVM · 034d4731

Authored by Jacob Pan on Jan 02, 2020

Use the combined macro for_each_svm_dev() to simplify SVM device
iteration and error checking.

Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

034d4731

iommu/vt-d: Avoid sending invalid page response · 5f75585e

Authored by Jacob Pan on Jan 02, 2020

Page responses should only be sent when last page in group (LPIG) or
private data is present in the page request. This patch avoids sending
invalid descriptors.

Fixes: 5d308fc1 ("iommu/vt-d: Add 256-bit invalidation descriptor support")
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

5f75585e
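
The condition stated in commit 5f75585e reduces to a two-input predicate. A minimal sketch, with a made-up `toy_page_req` struct that carries only the two flags the message mentions (the real descriptor layout is not shown here):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of page request fields; not the kernel layout. */
struct toy_page_req {
    bool lpig;               /* last page in group */
    bool priv_data_present;  /* request carries private data */
};

/* A response is due only for LPIG requests or ones with private data. */
static bool toy_needs_page_response(const struct toy_page_req *req)
{
    assert(req != NULL);
    return req->lpig || req->priv_data_present;
}
```

A request with neither flag set gets no response descriptor at all, which is the case the patch stops mishandling.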

iommu/vt-d: Replace Intel specific PASID allocator with IOASID · 59a62337

Authored by Jacob Pan on Jan 02, 2020

Make use of generic IOASID code to manage PASID allocation,
free, and lookup. Replace Intel specific code.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

59a62337

iommu/vt-d: Fix off-by-one in PASID allocation · 39d630e3

Authored by Jacob Pan on Jan 02, 2020

The PASID allocator uses IDR, which is exclusive for the end of the
allocation range. There is no need to decrement pasid_max.

Fixes: af395073 ("iommu/vt-d: Apply global PASID in SVA")
Reported-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

39d630e3
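
The off-by-one in commit 39d630e3 comes from treating an exclusive upper bound as inclusive. The toy allocator below hands out IDs from a half-open range [start, end) to show why decrementing the bound loses one usable ID. It is a linear-scan stand-in written for this illustration, not the IDR implementation, and the `toy_` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_IDS 64

/* Toy ID allocator over a half-open range [start, end). */
struct toy_ida {
    bool used[TOY_MAX_IDS];
};

/*
 * Allocate the lowest free ID in [start, end). The end bound is
 * exclusive, so passing end = max already keeps IDs below max;
 * passing max - 1 would waste the last usable ID.
 * Returns the ID, or -1 if the range is exhausted.
 */
static int toy_alloc_id(struct toy_ida *ida, int start, int end)
{
    assert(start >= 0 && end <= TOY_MAX_IDS && start <= end);

    for (int id = start; id < end; id++) {
        if (!ida->used[id]) {
            ida->used[id] = true;
            return id;
        }
    }
    return -1;
}
```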

iommu/vt-d: Reject SVM bind for failed capability check · 6eba09a4

Authored by Jacob Pan on Jan 02, 2020

Add a check during SVM bind to ensure CPU and IOMMU hardware capabilities
are met.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

6eba09a4

iommu/vt-d: Fix CPU and IOMMU SVM feature matching checks · ff3dc652

Authored by Jacob Pan on Jan 02, 2020

Shared Virtual Memory (SVM) is based on a collective set of hardware
features detected at runtime. There are requirements for matching CPU
and IOMMU capabilities.

The current code checks the CPU and IOMMU feature set for SVM support,
but the result is neither stored nor used. Therefore, SVM can still be
used even when these checks fail. The consequences can be:
1. CPU uses 5-level paging mode for virtual address of 57 bits, but
IOMMU can only support 4-level paging mode with 48 bits address for DMA.
2. 1GB page size is used by CPU but IOMMU does not support it. VT-d
unrecoverable faults may be generated.

The best solution to fix these problems is to prevent them in the first
place.

This patch consolidates code for checking PASID, CPU vs. IOMMU paging
mode compatibility, as well as provides specific error messages for
each failed check. On sane hardware configurations, these error messages
shall never appear in kernel log.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

ff3dc652
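
The two mismatch cases listed in commit ff3dc652 (5-level paging and 1GB pages) can be expressed as a capability comparison whose result is actually returned to the caller. A rough sketch under the assumption that each side's features are plain booleans; `toy_caps` and `toy_svm_caps_match` are illustrative names, not kernel symbols:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical capability flags; the names are illustrative only. */
struct toy_caps {
    bool five_level_paging;  /* 57-bit virtual addresses */
    bool gb_pages;           /* 1GB page size */
};

/*
 * Every paging feature the CPU uses must also be supported by the
 * IOMMU. The result is returned so the caller can reject the bind.
 */
static bool toy_svm_caps_match(const struct toy_caps *cpu,
                               const struct toy_caps *iommu)
{
    assert(cpu != NULL && iommu != NULL);

    if (cpu->five_level_paging && !iommu->five_level_paging)
        return false;
    if (cpu->gb_pages && !iommu->gb_pages)
        return false;
    return true;
}
```

The check is one-directional: an IOMMU that supports more than the CPU uses is fine, while the reverse is rejected.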

18 Dec, 2019 · 1 commit

iommu/vt-d: Remove incorrect PSI capability check · f81b846d

Authored by Lu Baolu on Nov 20, 2019

The PSI (Page Selective Invalidation) bit in the capability register
is only valid for second-level translation. Intel IOMMU supporting
scalable mode must support page/address selective IOTLB invalidation
for first-level translation. Remove the PSI capability check in SVA
cache invalidation code.

Fixes: 8744daf4 ("iommu/vt-d: Remove global page flush support")
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

f81b846d

03 Sep, 2019 · 1 commit

iommu/vt-d: Remove global page flush support · 8744daf4

Authored by Jacob Pan on Aug 26, 2019

Global pages support is removed from VT-d spec 3.0. Since the global
pages G flag only affects first-level paging structures and because DMA
requests with PASID are only supported by VT-d spec 3.0 and onward, we
can safely remove global pages support.

For kernel shared virtual address IOTLB invalidation, PASID
granularity and page selective within PASID will be used. There is
no global granularity supported. Without this fix, IOTLB invalidation
will cause an invalid descriptor error in the queued invalidation (QI)
interface.

Fixes: 1c4f88b7 ("iommu/vt-d: Shared virtual address in scalable mode")
Reported-by: Sanjay K Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

8744daf4

05 Jun, 2019 · 1 commit

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 288 · 2025cf9e

Authored by Thomas Gleixner on May 29, 2019

Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms and conditions of the gnu general public license
  version 2 as published by the free software foundation this program
  is distributed in the hope it will be useful but without any
  warranty without even the implied warranty of merchantability or
  fitness for a particular purpose see the gnu general public license
  for more details

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 263 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141901.208660670@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

2025cf9e

27 May, 2019 · 1 commit

iommu/vt-d: Fix bind svm with multiple devices · d7af4d98

Authored by Jacob Pan on May 08, 2019

If multiple devices try to bind to the same mm/PASID, we need to
set up first level PASID entries for all the devices. The current
code does not consider this case, which results in failed DMA for
devices after the first bind.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reported-by: Mike Campin <mike.campin@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

d7af4d98

11 Apr, 2019 · 1 commit

iommu/vt-d: Make intel_iommu_enable_pasid() more generic · d7cbc0f3

Authored by Lu Baolu on Mar 25, 2019

This moves intel_iommu_enable_pasid() out of the scope of
CONFIG_INTEL_IOMMU_SVM, as more and more features require the
pasid function.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

d7cbc0f3

01 Mar, 2019 · 1 commit

iommu/vt-d: Fix NULL pointer reference in intel_svm_bind_mm() · c56cba5d

Authored by Lu Baolu on Mar 01, 2019

Intel IOMMU could be turned off with intel_iommu=off. If Intel
IOMMU is off, the intel_iommu struct will not be initialized.
When device drivers call intel_svm_bind_mm(), a NULL pointer
reference will happen there.

Add a dmar_disabled check to avoid the NULL pointer reference.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reported-by: Dave Jiang <dave.jiang@intel.com>
Fixes: 2f26e0a9 ("iommu/vt-d: Add basic SVM PASID support")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

c56cba5d

31 Jan, 2019 · 1 commit

iommu/vt-d: Remove change_pte notifier · 1a9eb9b9

Authored by Peter Xu on Jan 30, 2019

The change_pte() interface is tailored for PFN updates, while the
other notifier invalidate_range() should be enough for Intel IOMMU
cache flushing.  Actually we've done a similar thing for AMD IOMMU
already in 8301da53 ("iommu/amd: Remove change_pte mmu_notifier
call-back", 2014-07-30) but the Intel IOMMU driver still has it.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

1a9eb9b9

11 Jan, 2019 · 1 commit

iommu/vt-d: Support page request in scalable mode · 5b438f4b

Authored by Jacob Pan on Jan 11, 2019

VT-d Rev3.0 has made a few changes to the page request interface:

1. widened PRQ descriptor from 128 bits to 256 bits;
2. removed streaming response type;
3. introduced private data that requires a page response even if the
   request is not the last request in group (LPIG).

This is a supplement to commit 1c4f88b7 ("iommu/vt-d: Shared
virtual address in scalable mode") and makes the svm code compliant
with VT-d Rev3.0.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Liu Yi L <yi.l.liu@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Fixes: 1c4f88b7 ("iommu/vt-d: Shared virtual address in scalable mode")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

5b438f4b

11 Dec, 2018 · 4 commits

iommu/vt-d: Remove deferred invalidation · 6d68b88e

Authored by Lu Baolu on Dec 10, 2018

Deferred invalidation is an ECS specific feature. It will not be
supported when IOMMU works in scalable mode. As we deprecated the
ECS support, remove deferred invalidation and clean up the code.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Liu Yi L <yi.l.liu@intel.com>
Cc: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

6d68b88e

iommu/vt-d: Shared virtual address in scalable mode · 1c4f88b7

Authored by Lu Baolu on Dec 10, 2018

This patch enables the current SVA (Shared Virtual Address)
implementation to work in the scalable mode.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

1c4f88b7

iommu/vt-d: Add 256-bit invalidation descriptor support · 5d308fc1

Authored by Lu Baolu on Dec 10, 2018

Intel VT-d spec rev3.0 requires software to use 256-bit
descriptors in the invalidation queue. As the spec reads in
section 6.5.2:

Remapping hardware supporting Scalable Mode Translations
(ECAP_REG.SMTS=1) allow software to additionally program
the width of the descriptors (128-bits or 256-bits) that
will be written into the Queue. Software should setup the
Invalidation Queue for 256-bit descriptors before programming
remapping hardware for scalable-mode translation as
128-bit descriptors are treated as invalid descriptors
(see Table 21 in Section 6.5.2.10) in scalable-mode.

This patch adds 256-bit invalidation descriptor support
if the hardware presents scalable mode capability.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

5d308fc1
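
One practical consequence of commit 5d308fc1 is that the byte offset of a queue entry depends on the descriptor width. The sketch below assumes the width is selected by a single scalable-mode flag (16 bytes for 128-bit descriptors, 32 bytes for 256-bit ones); the function names are hypothetical and the real driver's bookkeeping is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Descriptor width in bytes: 128 bits, or 256 bits in scalable mode. */
static size_t toy_qi_desc_bytes(bool scalable_mode)
{
    return scalable_mode ? 32 : 16;
}

/* Byte offset of the descriptor at a given queue index. */
static size_t toy_qi_desc_offset(size_t index, bool scalable_mode)
{
    size_t offset = index * toy_qi_desc_bytes(scalable_mode);

    /* Guard against multiplication overflow for very large indexes. */
    assert(offset / toy_qi_desc_bytes(scalable_mode) == index);
    return offset;
}
```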

iommu/vt-d: Manage scalalble mode PASID tables · 0bbeb01a

Authored by Lu Baolu on Dec 10, 2018

In scalable mode, the pasid structure is a two level table with
a pasid directory table and a pasid table. Any pasid entry
can be identified by a pasid value as follows.

   1
   9                       6 5      0
    .-----------------------.-------.
    |              PASID    |       |
    '-----------------------'-------'    .-------------.
             |                    |      |             |
             |                    |      |             |
             |                    |      |             |
             |     .-----------.  |      .-------------.
             |     |           |  |----->| PASID Entry |
             |     |           |  |      '-------------'
             |     |           |  |Plus  |             |
             |     .-----------.  |      |             |
             |---->| DIR Entry |-------->|             |
             |     '-----------'         '-------------'
.---------.  |Plus |           |
| Context |  |     |           |
|  Entry  |------->|           |
'---------'        '-----------'

This changes the pasid table APIs to support scalable mode
PASID directory and PASID table. It also adds a helper to
get the PASID table entry according to the pasid value.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

0bbeb01a
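
The diagram in commit 0bbeb01a splits a 20-bit PASID into a directory index (bits 19..6) and a table index (bits 5..0). The index arithmetic can be sketched as below; the macro and function names are made up for this example, and only the bit positions are taken from the diagram.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PASID_BITS      20  /* PASID occupies bits 19..0 */
#define TOY_PASID_TBL_BITS   6  /* bits 5..0 index the PASID table */

/* Bits 19..6 of the PASID select the directory entry. */
static uint32_t toy_pasid_dir_index(uint32_t pasid)
{
    assert(pasid < (1u << TOY_PASID_BITS));
    return pasid >> TOY_PASID_TBL_BITS;
}

/* Bits 5..0 select the entry within the PASID table. */
static uint32_t toy_pasid_tbl_index(uint32_t pasid)
{
    assert(pasid < (1u << TOY_PASID_BITS));
    return pasid & ((1u << TOY_PASID_TBL_BITS) - 1);
}
```

For instance, PASID 64 is the first entry of the second table: directory index 1, table index 0.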

openeuler / Kernel · synced successfully almost 2 years ago
