Commits · 5c1c86031ead5f13674fff31de9c2bf503c1c11a · openeuler / Kernel

21 Nov, 2018 1 commit

drm/i915/gvt: Avoid use-after-free iterating the gtt list · 7513edbc

Committed by Chris Wilson on Nov 20, 2018

Found by smatch:

drivers/gpu/drm/i915/gvt/gtt.c:2452 intel_vgpu_destroy_ggtt_mm() error: dereferencing freed memory 'pos'

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

7513edbc
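
A smatch report of this kind usually means a list is walked while its entries are being freed, so the loop reads the link of a node that no longer exists. The kernel's usual remedy is list_for_each_entry_safe(), which caches the successor first. The diff is not shown on this page, so the following is only a minimal userspace sketch of that pattern; struct node, push() and destroy_all() are hypothetical names, not code from gtt.c.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an entry kept on a list. */
struct node {
    struct node *next;
    int value;
};

/* Push a new node on the front of the list and return the new head
 * (the old head is returned unchanged if allocation fails). */
static struct node *push(struct node *head, int value)
{
    struct node *n = malloc(sizeof(*n));

    if (!n)
        return head;
    n->value = value;
    n->next = head;
    return n;
}

/* Free every node and return how many were freed. The successor is
 * saved before free(), so the loop never dereferences freed memory. */
static int destroy_all(struct node **head)
{
    struct node *pos = *head;
    struct node *next;
    int freed = 0;

    while (pos) {
        next = pos->next;   /* read the link while pos is still valid */
        free(pos);
        pos = next;
        freed++;
    }
    *head = NULL;
    return freed;
}
```

The buggy variant would advance with `pos = pos->next` after `free(pos)`, which is exactly the dereference smatch flags.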

31 Oct, 2018 1 commit

drm/i915/gvt: support inconsecutive partial gtt entry write · bc0686ff

Committed by Hang Yuan on Sep 19, 2018

Previously we assumed that the two 4-byte writes to the same PTE arrive
consecutively. Recently we also observed non-consecutive partial writes, so
this patch extends the previous solution. It now uses a list to save
outstanding partial writes. If a partial write can be combined with another
one in the list to construct a full PTE, its shadow entry is updated.
Otherwise, the partial write is saved in the list.

v2: invalidate old entry and flush ggtt (Zhenyu)
v3: split old ggtt page unmap to another patch (Zhenyu)
v4: refine code (Zhenyu)

Signed-off-by: Hang Yuan <hang.yuan@linux.intel.com>
Cc: Yan Zhao <yan.y.zhao@intel.com>
Cc: Xiaolin Zhang <xiaolin.zhang@intel.com>
Cc: Zhenyu Wang <zhenyu.z.wang@intel.com>
Reviewed-by: Xiaolin Zhang <xiaolin.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

bc0686ff

08 Oct, 2018 1 commit

drm/i915/gvt: invalidate old ggtt page when update ggtt entry · f42259ef

Committed by Hang Yuan on Sep 19, 2018

Previously the dma map of a ggtt page was cancelled only when the ggtt
entry was cleared. This patch also cancels the dma map of the old ggtt page
when the ggtt entry is updated with a new page address.

Fixes: 7598e870(drm/i915/gvt: Missed to cancel dma map for ggtt entries)
Signed-off-by: Hang Yuan <hang.yuan@linux.intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

f42259ef

07 Aug, 2018 1 commit

drm/i915/gvt: Fix function comment doc errors · a752b070

Committed by Zhenyu Wang on Jul 31, 2018

Fix the remaining wrong function comment docs caught by W=1.

Reviewed-by: Hang Yuan <hang.yuan@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

a752b070

09 Jul, 2018 12 commits

drm/i915/gvt: Fix error handling in ppgtt_populate_spt_by_guest_entry · 80e76ea6

Committed by Changbin Du on May 15, 2018

Don't forget to free the allocated spt if shadowing failed.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

80e76ea6

drm/i915/gvt: Handle special sequence on PDE IPS bit · 54c81653

Committed by Changbin Du on May 15, 2018

If the guest updates the 64K gtt entries before changing the IPS bit of the
PDE, we need to re-shadow the whole page table, because we have ignored all
updates to unused entries.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

54c81653

drm/i915/gvt: Add 2M huge gtt support · b901b252

Committed by Changbin Du on May 15, 2018

This adds 2M huge gtt support for GVTg. Unlike a 64K gtt entry, we can
shadow a 2M guest entry with a real huge gtt entry. But before that, we have
to check that the memory is physically contiguous, that it is aligned, and
that the page size is supported on the host. We can get all supported page
sizes from intel_device_info.page_sizes.

Finally, we must split the 2M page into smaller pages if we cannot
satisfy the guest huge page.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

b901b252
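
The decision described in this commit (shadow with a real 2M entry only when alignment, contiguity and host support all hold, otherwise split) can be sketched as below. This is an illustration under assumptions, not the driver's logic: plan_2m_shadow(), enum shadow_plan and the encoding of the host page-size set as a bitmask of byte sizes are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SZ_4K (1ULL << 12)
#define SZ_2M (1ULL << 21)

enum shadow_plan {
    SHADOW_2M_HUGE,     /* map with one real 2M entry */
    SHADOW_SPLIT_4K     /* fall back to many 4K entries */
};

/* host_page_sizes is assumed to be a bitmask in which each supported
 * page size sets the bit equal to its size in bytes. */
static enum shadow_plan plan_2m_shadow(uint64_t host_addr, bool contiguous,
                                       uint64_t host_page_sizes)
{
    bool aligned = (host_addr & (SZ_2M - 1)) == 0;
    bool supported = (host_page_sizes & SZ_2M) != 0;

    if (aligned && contiguous && supported)
        return SHADOW_2M_HUGE;
    return SHADOW_SPLIT_4K;
}

/* Number of 4K entries needed to cover one split 2M page. */
static unsigned int split_entry_count(void)
{
    return (unsigned int)(SZ_2M / SZ_4K);
}
```

A split 2M page needs 512 small entries, which is one full page table of 8-byte entries.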

drm/i915/kvmgt: Support setting dma map for huge pages · 79e542f5

Committed by Changbin Du on May 15, 2018

To support huge gtt, we need to support huge pages in kvmgt first.
This patch adds a 'size' param to the intel_gvt_mpt::dma_map_guest_page
API and implements it in kvmgt.

v2: rebase.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

79e542f5

drm/i915/gvt: Add 64K huge gtt support · eb3a3530

Committed by Changbin Du on May 15, 2018

Finally, this adds the first huge gtt support for GVTg: 64K pages. Since
64K pages and 4K pages cannot be mixed in the same page table, we always
split a 64K entry into small 4K pages. And when unshadowing a guest 64K
entry, we need to ensure that all the shadowed entries in the shadow page
table also get cleared.

For a page table which has 64K gtt entries, only PTE#0, PTE#16, PTE#32, ...
PTE#496 are used. Updates to unused PTEs should be ignored.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

eb3a3530

drm/i915/gvt: Make PTE iterator 64K entry aware · 4c9414d7

Committed by Changbin Du on May 15, 2018

A 64K PTE is special: only PTE#0, PTE#16, PTE#32, ... PTE#496 are used in
the page table.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

4c9414d7
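
The indexing rule stated above (every 16th slot of a 512-entry table is used in 64K mode) can be illustrated with a short sketch. The helper names pte_step(), used_pte_indices() and pte_is_used() are hypothetical and are not the iterator macros of the driver.

```c
#include <assert.h>
#include <stdbool.h>

#define PTES_PER_TABLE 512      /* PTE#0 .. PTE#511 */
#define PTES_PER_64K   16       /* 64K / 4K */

/* Distance between used PTE slots: 16 in 64K mode, 1 otherwise. */
static int pte_step(bool is_64k)
{
    return is_64k ? PTES_PER_64K : 1;
}

/* Store the used slot indices in out[] and return how many there are. */
static int used_pte_indices(bool is_64k, int out[PTES_PER_TABLE])
{
    int n = 0;

    for (int i = 0; i < PTES_PER_TABLE; i += pte_step(is_64k))
        out[n++] = i;
    return n;
}

/* A write to a slot that is not used in the current mode can be ignored. */
static bool pte_is_used(bool is_64k, int index)
{
    return index % pte_step(is_64k) == 0;
}
```

In 64K mode this yields 32 used slots, the last one being PTE#496, which matches the list in the commit message.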

drm/i915/gvt: Split ppgtt_alloc_spt into two parts · 155521c9

Committed by Changbin Du on May 15, 2018

We need an interface to allocate a pure shadow page, i.e. one that has no
guest page associated with it. Such a shadow page is used to shadow a 2M
huge gtt entry.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

155521c9

drm/i915/gvt: Add GTT clear_pse operation · c3e69763

Committed by Changbin Du on May 15, 2018

Add a clear_pse operation in case we need to split a huge gtt into small pages.

v2: correct description.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

c3e69763

drm/i915/gvt: Add software PTE flag to mark special 64K splited entry · 71634848

Committed by Changbin Du on May 15, 2018

This adds a software PTE flag on the Ignored bit of the PTE. It will be used
to identify split 64K shadow entries.

v2: fix mask definition.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

71634848

drm/i915/gvt: Detect 64K gtt entry by IPS bit of PDE · 40b27176

Committed by Changbin Du on May 15, 2018

This change helps us detect the real entry type from the PSE and IPS settings.
For a 64K entry, we also need to check the register GEN8_GAMW_ECO_DEV_RW_IA.

v2: Extend IPS mmio control to Gen10. (Matthew Auld)

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

40b27176

drm/i915/gvt: Add PTE IPS bit operations · 6fd79378

Committed by Changbin Du on May 15, 2018

Add three IPS operation functions to test/set/clear IPS in the PDE.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

6fd79378

drm/i915/gvt: Add new 64K entry type · b294657d

Committed by Changbin Du on May 15, 2018

Add a new entry type GTT_TYPE_PPGTT_PTE_64K_ENTRY. A 64K entry is very
different from a 2M/1G entry: it is controlled by the IPS bit in the upper
PDE. To leverage the current logic, I take the IPS bit as 'PSE' for the PTE
level, which means 64K entries can also be processed by get_pse_type().

v2: Make it bisectable.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

b294657d

02 Jul, 2018 1 commit

drm/i915/gvt: fix a bug of partially write ggtt enties · 510fe10b

Committed by Zhao Yan on Jun 19, 2018

When the guest writes ggtt entries, it can write 8 bytes at a time if
gtt_entry_size is 8. But qemu could split the 8 bytes into 2 consecutive
4-byte writes.

If each 4-byte partial write could trigger a host ggtt write, it is very
possible that a wrong combination is written to the host ggtt. E.g.
the higher 4 bytes hold the old value but the lower 4 bytes hold the new
value; this 8-byte combination is wrong but is written to the ggtt, thus
causing bugs.

To handle this condition, we just record the first 4-byte write, then wait
until the second 4-byte write comes and write the combined 64-bit data to
the host ggtt table.

To save memory space and to spot partial writes as early as possible, we
don't keep this information for every ggtt index. Instead, we just record
the last ggtt write position, and assume the two 4-byte writes come in
consecutively for each vgpu.

This assumption is right based on the characteristic of a ggtt entry, which
stores a memory address. When gtt_entry_size is 8, the guest physical memory
address should be 64 bits, so any sane guest driver should write 8-byte
long data at a time, so 2 consecutive 4-byte writes at the same ggtt index
should be trapped in gvt.

v2:
when an incomplete ggtt entry write is detected, e.g.
    1. guest only writes 4 bytes at a ggtt offset and no longer writes the
       remaining 4 bytes.
    2. guest writes 4 bytes of a ggtt offset, then writes at other ggtt
       offsets, then returns to write the remaining 4 bytes of the first
       ggtt offset.
add error handling logic to remap the host entry to the scratch page, and
mark the guest virtual ggtt entry as not present.  (zhenyu wang)

Signed-off-by: Zhao Yan <yan.y.zhao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

510fe10b

13 Jun, 2018 2 commits

drm/i915/gvt: Enable gtt initialization for BXT. · 665004b8

Committed by Colin Xu on Jun 11, 2018

Initialize the BXT gtt in the same way as SKL/KBL.

v2: All supported platforms share the same gtt ops.
    Remove the platform check for now and let is_supported_device()
    be the gatekeeper.

Signed-off-by: Colin Xu <colin.xu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

665004b8

treewide: Use array_size() in vzalloc() · fad953ce

Committed by Kees Cook on Jun 12, 2018

The vzalloc() function has no 2-factor argument form, so multiplication
factors need to be wrapped in array_size(). This patch replaces cases of:

        vzalloc(a * b)

with:
        vzalloc(array_size(a, b))

as well as handling cases of:

        vzalloc(a * b * c)

with:

        vzalloc(array3_size(a, b, c))

This does, however, attempt to ignore constant size factors like:

        vzalloc(4 * 1024)

though any constants defined via macros get caught up in the conversion.

Any factors with a sizeof() of "unsigned char", "char", and "u8" were
dropped, since they're redundant.

The Coccinelle script used for this was:

// Fix redundant parens around sizeof().
@@
type TYPE;
expression THING, E;
@@

(
  vzalloc(
-	(sizeof(TYPE)) * E
+	sizeof(TYPE) * E
  , ...)
|
  vzalloc(
-	(sizeof(THING)) * E
+	sizeof(THING) * E
  , ...)
)

// Drop single-byte sizes and redundant parens.
@@
expression COUNT;
typedef u8;
typedef __u8;
@@

(
  vzalloc(
-	sizeof(u8) * (COUNT)
+	COUNT
  , ...)
|
  vzalloc(
-	sizeof(__u8) * (COUNT)
+	COUNT
  , ...)
|
  vzalloc(
-	sizeof(char) * (COUNT)
+	COUNT
  , ...)
|
  vzalloc(
-	sizeof(unsigned char) * (COUNT)
+	COUNT
  , ...)
|
  vzalloc(
-	sizeof(u8) * COUNT
+	COUNT
  , ...)
|
  vzalloc(
-	sizeof(__u8) * COUNT
+	COUNT
  , ...)
|
  vzalloc(
-	sizeof(char) * COUNT
+	COUNT
  , ...)
|
  vzalloc(
-	sizeof(unsigned char) * COUNT
+	COUNT
  , ...)
)

// 2-factor product with sizeof(type/expression) and identifier or constant.
@@
type TYPE;
expression THING;
identifier COUNT_ID;
constant COUNT_CONST;
@@

(
  vzalloc(
-	sizeof(TYPE) * (COUNT_ID)
+	array_size(COUNT_ID, sizeof(TYPE))
  , ...)
|
  vzalloc(
-	sizeof(TYPE) * COUNT_ID
+	array_size(COUNT_ID, sizeof(TYPE))
  , ...)
|
  vzalloc(
-	sizeof(TYPE) * (COUNT_CONST)
+	array_size(COUNT_CONST, sizeof(TYPE))
  , ...)
|
  vzalloc(
-	sizeof(TYPE) * COUNT_CONST
+	array_size(COUNT_CONST, sizeof(TYPE))
  , ...)
|
  vzalloc(
-	sizeof(THING) * (COUNT_ID)
+	array_size(COUNT_ID, sizeof(THING))
  , ...)
|
  vzalloc(
-	sizeof(THING) * COUNT_ID
+	array_size(COUNT_ID, sizeof(THING))
  , ...)
|
  vzalloc(
-	sizeof(THING) * (COUNT_CONST)
+	array_size(COUNT_CONST, sizeof(THING))
  , ...)
|
  vzalloc(
-	sizeof(THING) * COUNT_CONST
+	array_size(COUNT_CONST, sizeof(THING))
  , ...)
)

// 2-factor product, only identifiers.
@@
identifier SIZE, COUNT;
@@

  vzalloc(
-	SIZE * COUNT
+	array_size(COUNT, SIZE)
  , ...)

// 3-factor product with 1 sizeof(type) or sizeof(expression), with
// redundant parens removed.
@@
expression THING;
identifier STRIDE, COUNT;
type TYPE;
@@

(
  vzalloc(
-	sizeof(TYPE) * (COUNT) * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  vzalloc(
-	sizeof(TYPE) * (COUNT) * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  vzalloc(
-	sizeof(TYPE) * COUNT * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  vzalloc(
-	sizeof(TYPE) * COUNT * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  vzalloc(
-	sizeof(THING) * (COUNT) * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  vzalloc(
-	sizeof(THING) * (COUNT) * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  vzalloc(
-	sizeof(THING) * COUNT * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  vzalloc(
-	sizeof(THING) * COUNT * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
)

// 3-factor product with 2 sizeof(variable), with redundant parens removed.
@@
expression THING1, THING2;
identifier COUNT;
type TYPE1, TYPE2;
@@

(
  vzalloc(
-	sizeof(TYPE1) * sizeof(TYPE2) * COUNT
+	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
  , ...)
|
  vzalloc(
-	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
  , ...)
|
  vzalloc(
-	sizeof(THING1) * sizeof(THING2) * COUNT
+	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
  , ...)
|
  vzalloc(
-	sizeof(THING1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
  , ...)
|
  vzalloc(
-	sizeof(TYPE1) * sizeof(THING2) * COUNT
+	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
  , ...)
|
  vzalloc(
-	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
  , ...)
)

// 3-factor product, only identifiers, with redundant parens removed.
@@
identifier STRIDE, SIZE, COUNT;
@@

(
  vzalloc(
-	(COUNT) * STRIDE * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  vzalloc(
-	COUNT * (STRIDE) * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  vzalloc(
-	COUNT * STRIDE * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  vzalloc(
-	(COUNT) * (STRIDE) * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  vzalloc(
-	COUNT * (STRIDE) * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  vzalloc(
-	(COUNT) * STRIDE * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  vzalloc(
-	(COUNT) * (STRIDE) * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  vzalloc(
-	COUNT * STRIDE * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
)

// Any remaining multi-factor products, first at least 3-factor products
// when they're not all constants...
@@
expression E1, E2, E3;
constant C1, C2, C3;
@@

(
  vzalloc(C1 * C2 * C3, ...)
|
  vzalloc(
-	E1 * E2 * E3
+	array3_size(E1, E2, E3)
  , ...)
)

// And then all remaining 2 factors products when they're not all constants.
@@
expression E1, E2;
constant C1, C2;
@@

(
  vzalloc(C1 * C2, ...)
|
  vzalloc(
-	E1 * E2
+	array_size(E1, E2)
  , ...)
)

Signed-off-by: Kees Cook <keescook@chromium.org>

fad953ce
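
The point of wrapping the factors is that the product is checked for overflow before it reaches the allocator. The kernel helpers saturate to SIZE_MAX in that case, so the allocation fails instead of returning an undersized buffer. The following userspace sketch shows the idea; sketch_array_size() and sketch_array3_size() are hypothetical stand-ins rather than the kernel's own definitions, and __builtin_mul_overflow() assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Multiply two size factors, saturating to SIZE_MAX on overflow so that
 * a following allocation request is guaranteed to fail. */
static size_t sketch_array_size(size_t a, size_t b)
{
    size_t bytes;

    if (__builtin_mul_overflow(a, b, &bytes))
        return SIZE_MAX;
    return bytes;
}

/* Three-factor variant with the same saturating behaviour. */
static size_t sketch_array3_size(size_t a, size_t b, size_t c)
{
    size_t bytes;

    if (__builtin_mul_overflow(a, b, &bytes))
        return SIZE_MAX;
    if (__builtin_mul_overflow(bytes, c, &bytes))
        return SIZE_MAX;
    return bytes;
}
```

With an open-coded `a * b`, an overflowing product would wrap to a small number and the caller would then write past the end of the buffer it received.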

11 Jun, 2018 1 commit

drm/i915/gvt: removed unnecessary boundary check · 65957195

Committed by Xinyun Liu on Jun 07, 2018

The type is already checked at function entry, so it is unnecessary
to check it again.

Signed-off-by: Xinyun Liu <xinyun.liu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

65957195

30 Mar, 2018 2 commits

drm/i915/gvt: Cancel dma map when resetting ggtt entries · f4c43db3

Committed by Changbin Du on Mar 27, 2018

Ditto, don't forget ggtt entries during reset.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

f4c43db3

drm/i915/gvt: Missed to cancel dma map for ggtt entries · 7598e870

Committed by Changbin Du on Mar 27, 2018

We already cancel the dma map for ppgtt entries. We also need to do it for
ggtt entries when they are invalidated.

This can fix task hung issue as:
[13517.791767] INFO: task gvt_service_thr:1081 blocked for more than 120 seconds.
[13517.792584] Not tainted 4.14.15+ #3
[13517.793417] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[13517.794267] gvt_service_thr D 0 1081 2 0x80000000
[13517.795132] Call Trace:
[13517.795996] ? __schedule+0x493/0x77b
[13517.796859] schedule+0x79/0x82
[13517.797740] schedule_preempt_disabled+0x5/0x6
[13517.798614] __mutex_lock.isra.0+0x2b5/0x445
[13517.799504] ? __switch_to_asm+0x24/0x60
[13517.800381] ? intel_gvt_cleanup+0x10/0x10
[13517.801261] ? intel_gvt_schedule+0x19/0x2b9
[13517.802107] intel_gvt_schedule+0x19/0x2b9
[13517.802954] ? intel_gvt_cleanup+0x10/0x10
[13517.803824] gvt_service_thread+0xe3/0x10d
[13517.804704] ? wait_woken+0x68/0x68
[13517.805588] kthread+0x118/0x120
[13517.806478] ? kthread_create_on_node+0x3a/0x3a
[13517.807381] ? call_usermodehelper_exec_async+0x113/0x11a
[13517.808307] ret_from_fork+0x35/0x40

v3: split out ggtt reset case.
v2: also unmap ggtt during reset.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

7598e870

19 Mar, 2018 2 commits

drm/i915/gvt: Invalidate vGPU PPGTT mm objects during a vGPU reset. · 730c8ead

Committed by Zhi Wang on Feb 07, 2018

Different OSes might handle GVT PPGTT creation/destroy notifications
differently during a vGPU reset, so a better approach is to invalidate all
vGPU PPGTT mm objects during a vGPU reset.

Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

730c8ead

drm/i915/gvt: fix spelling mistake: "destoried" -> "destroyed" · 84f69ba0

Committed by Colin Ian King on Mar 12, 2018

Trivial fix to a spelling mistake in the gvt_err error message text.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

84f69ba0

06 Mar, 2018 16 commits

drm/i915/gvt: Fix guest vGPU hang caused by very high dma setup overhead · cf4ee73f

Committed by Changbin Du on Mar 01, 2018

The current kvmgt implementation implicitly sets up the dma mapping in the
MPT API gfn_to_mfn. First, this design goes against the API's original
purpose. Second, this design has no unmap path. The result is that the
dma mappings keep growing. In the multi-VM case, they quickly consume the
low 4GB of the IOMMU IOVA address space, and tons of rbtree entries are
created in the IOMMU IOVA allocator. Finally, a single IOVA allocation can
take as long as ~70ms. Such latency is intolerable.

To address both issues above, this patch introduces two new MPT APIs:
  o dma_map_guest_page - setup dma map for guest page
  o dma_unmap_guest_page - cancel dma map for guest page

kvmgt implements these 2 APIs. To reduce the dma setup overhead for
duplicated pages (e.g. scratch pages), two caches are used: one maps a
gfn to a struct gvt_dma, the other maps a dma addr to a struct gvt_dma.

With these 2 new APIs, the gtt is now able to cancel the dma mapping when a
page table is invalidated. The dma mappings no longer grow gradually.

v2: follow the old logic for VFIO_IOMMU_NOTIFY_DMA_UNMAP at this point.

Cc: Hang Yuan <hang.yuan@intel.com>
Cc: Xiong Zhang <xiong.y.zhang@intel.com>
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

cf4ee73f
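
The idea of a reference-counted mapping cache that can be looked up by gfn on map and by dma address on unmap can be sketched as follows. This is a simplified userspace model, not kvmgt's code: a linear scan over a fixed array replaces the two caches, the "dma address" comes from a fake counter, and struct dma_entry, sketch_dma_map() and sketch_dma_unmap() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_SLOTS 16          /* hypothetical capacity */

/* Hypothetical stand-in for one cached mapping. */
struct dma_entry {
    bool in_use;
    uint64_t gfn;
    uint64_t dma_addr;
    unsigned int ref;
};

struct dma_cache {
    struct dma_entry slot[CACHE_SLOTS];
    uint64_t next_page;         /* fake allocator state */
};

static struct dma_entry *find_by_gfn(struct dma_cache *c, uint64_t gfn)
{
    for (size_t i = 0; i < CACHE_SLOTS; i++)
        if (c->slot[i].in_use && c->slot[i].gfn == gfn)
            return &c->slot[i];
    return NULL;
}

static struct dma_entry *find_by_dma(struct dma_cache *c, uint64_t dma_addr)
{
    for (size_t i = 0; i < CACHE_SLOTS; i++)
        if (c->slot[i].in_use && c->slot[i].dma_addr == dma_addr)
            return &c->slot[i];
    return NULL;
}

/* Map a guest page: reuse an existing mapping if the gfn is already
 * cached, otherwise create one. Returns 0 on success, -1 if full. */
static int sketch_dma_map(struct dma_cache *c, uint64_t gfn, uint64_t *dma_addr)
{
    struct dma_entry *e = find_by_gfn(c, gfn);

    if (e) {
        e->ref++;
        *dma_addr = e->dma_addr;
        return 0;
    }
    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        e = &c->slot[i];
        if (e->in_use)
            continue;
        e->in_use = true;
        e->gfn = gfn;
        e->dma_addr = (++c->next_page) << 12;   /* pretend page address */
        e->ref = 1;
        *dma_addr = e->dma_addr;
        return 0;
    }
    return -1;
}

/* Drop one reference; the mapping goes away with the last one. */
static void sketch_dma_unmap(struct dma_cache *c, uint64_t dma_addr)
{
    struct dma_entry *e = find_by_dma(c, dma_addr);

    if (e && --e->ref == 0)
        e->in_use = false;
}

/* Number of mappings currently alive. */
static size_t live_mappings(const struct dma_cache *c)
{
    size_t n = 0;

    for (size_t i = 0; i < CACHE_SLOTS; i++)
        n += c->slot[i].in_use;
    return n;
}
```

Mapping the same gfn twice yields one entry with two references, so a frequently shared page does not consume additional address space.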

drm/i915/gvt: Define PTE addr mask with GENMASK_ULL · 420fba78

Committed by Changbin Du on Jan 30, 2018

Define the masks better.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

420fba78
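
GENMASK_ULL(h, l) builds a 64-bit mask with bits l through h set, which reads more clearly than a hand-written hex constant. The sketch below re-creates the macro in userspace under the name SKETCH_GENMASK_ULL so that it does not pretend to be the kernel header. The bit range 45:12 is only an assumed example; the ranges actually used by the patch are not shown on this page.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace re-creation of a "bits h..l set" mask, valid for
 * 0 <= l <= h <= 63. */
#define SKETCH_GENMASK_ULL(h, l) \
    (((~0ULL) << (l)) & (~0ULL >> (63 - (h))))

/* Hypothetical address field occupying bits 45:12 of an entry. */
#define SKETCH_ADDR_MASK SKETCH_GENMASK_ULL(45, 12)

/* Extract the address bits of an entry, dropping flag bits. */
static uint64_t pte_addr(uint64_t pte)
{
    return pte & SKETCH_ADDR_MASK;
}
```

Writing the field as a bit range keeps the mask and the hardware documentation visibly in step.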

drm/i915/gvt: Manage shadow pages with radix tree · b6c126a3

Committed by Changbin Du on Jan 30, 2018

We don't know how many page tables will be shadowed. The number varies
considerably with guest load, so a radix tree is a better choice for us.
Since the Page Frame Number is used as the key, most of the bits are
common.

Here is some performance data (duration in us) for looking up an
element:
Before: (aka. ppgtt_find_shadow_page)
 0.308 0.292 0.246 0.432 0.143 ... 0.311 0.225 0.382 0.199 0.325
After: (aka. intel_vgpu_find_spt_by_mfn)
 0.106 0.106 0.107 0.106 0.105 0.107 ... 0.107 0.109 0.105 0.108

This time I didn't get the early data of the hash table. The data was
measured when the desktop was shown.

As with the last change, the overall benchmark is almost unchanged, but
we get better scalability.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

b6c126a3

drm/i915/gvt: Provide generic page_track infrastructure for write-protected page · e502a2af

Committed by Changbin Du on Jan 30, 2018

This patch provides a generic page_track infrastructure for write-protected
guest pages. The old page_track logic is rewritten and now lives in a new
standalone page_track.c. This page track infrastructure can be used by both
vGUC and GTT shadowing.

The important change is that it uses a radix tree instead of a hash table.
We don't have a predictable number of pages that will be tracked.

Here is some performance data (duration in us) for looking up an element:
Before: (aka. intel_vgpu_find_tracked_page)
 0.091 0.089 0.090 ... 0.093 0.091 0.087 ... 0.292 0.285 0.292 0.291
After: (aka. intel_vgpu_find_page_track)
 0.104 0.105 0.100 0.102 0.102 0.100 ... 0.101 0.101 0.105 0.105

The hash table performs well at the beginning, but degrades as more pages
are tracked, even when no 3D applications are running. As expected, the
radix tree has a stable and very short lookup duration.

The overall benchmark (tested with Heaven Benchmark) improved marginally
since this is not the bottleneck. What we benefit more from this change
is scalability.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

e502a2af

drm/i915/gvt: Don't extend page_track to mpt layer · 09475728

Committed by Changbin Du on Jan 30, 2018

Don't extend page_track to the mpt layer. Keep MPT simple and clean.
Meanwhile, remove gtt.n_tracked_guest_page, which doesn't make much
sense.

v2: clean up gtt.n_tracked_guest_page.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

09475728

drm/i915/gvt: Rename shadow_page to short name spt · d87f5ff3

Committed by Changbin Du on Jan 30, 2018

The target structure of some functions is struct intel_vgpu_ppgtt_spt, yet
their names are xxx_shadow_page. They should be xxx_shadow_page_table. Let's
use the short name 'spt' instead to reduce the length, and do the same for
the hash table name.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

d87f5ff3

drm/i915/gvt: Rework shadow page management code · 44b46733

Committed by Changbin Du on Jan 30, 2018

This is another big one: the GVT shadow page management code is
heavily refined.

The new code uses only struct intel_vgpu_ppgtt_spt to represent a vgpu
shadow page table, with or without a guest page associated with it. A pure
shadow page (no guest page associated) will be used to shadow a split 2M
huge gtt. In this case, spt.guest_page.gfn should be zero.

To search for an existing shadow page table, we have two new interfaces:
 - intel_vgpu_find_spt_by_gfn(): find a spt by guest gfn. It must not
   be a pure spt.
 - intel_vgpu_find_spt_by_mfn(): find the spt using the shadow page mfn in
   the shadowed PTE.

The oos_page management remains as it was.

v2: Split some changes into small standalone patches.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

44b46733

drm/i915/gvt: Refine pte shadowing process · 72f03d7e

Committed by Changbin Du on Jan 30, 2018

Make the shadow PTE population code clearer. Later we will add huge gtt
support based on this.

v2:
  - rebase to latest code.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Reviewed-by: Zhi Wang <zhi.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

72f03d7e

drm/i915/gvt: Use standard pte bit definition · d861ca23

Committed by Changbin Du on Jan 30, 2018

A GTT entry has a format similar to the CPU PTE. We prefer named macros
over hardcoded values.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

d861ca23

drm/i915/gvt: Factor out intel_vgpu_{get, put}_ppgtt_mm interface · e6e9c46f

Committed by Changbin Du on Jan 30, 2018

Factor out these two interfaces so we can kill some duplicated code in
scheduler.c.

v2:
  - rename to intel_vgpu_{get,put}_ppgtt_mm
  - refine handle_g2v_notification

Signed-off-by: Changbin Du <changbin.du@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

e6e9c46f

drm/i915/gvt: Rename ggtt related functions to be more specific · a143cef7

Committed by Changbin Du on Jan 30, 2018

Accurate names help avoid confusion and so improve readability.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

a143cef7

drm/i915/gvt: Add verbose gtt shadow logs · bc37ab56

Committed by Changbin Du on Jan 30, 2018

This adds a new macro gvt_vdbg_mm() to print more verbose logs for
gtt shadowing. The added verbose logs are very useful for debugging.
gvt_vdbg_mm() only takes effect if VERBOSE_DEBUG is defined by
the developer.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

bc37ab56

drm/i915/gvt: Refine ggtt_set_shadow_entry · b0c766bf

Committed by Changbin Du on Jan 30, 2018

Less code, and use the existing helper ggtt_set_host_entry.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

b0c766bf

drm/i915/gvt: Refine ggtt and ppgtt root entry ops · 3aff3512

Committed by Changbin Du on Jan 30, 2018

Separate ggtt and ppgtt since they are different. A little more code, but
straightforward.

And move these helpers to gtt.c since that is the only client.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

3aff3512

drm/i915/gvt: Refine the intel_vgpu_mm reference management · 1bc25851

Committed by Changbin Du on Jan 30, 2018

If we manage an object with a reference count, then its life cycle
must follow the reference count operations. Meanwhile, change the
operation functions to the generic names *put* and *get*.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

1bc25851
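
The rule stated in this commit, that the object lives exactly as long as its reference count is non-zero, can be shown with a tiny userspace sketch. The kernel offers struct kref for this purpose; the version below uses a plain counter, is not thread-safe, and struct mm_obj, mm_create(), mm_get() and mm_put() are hypothetical names rather than the driver's functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical reference-counted object. */
struct mm_obj {
    unsigned int ref;
    bool *destroyed;    /* set when the object is freed, for observation */
};

/* Create an object holding one reference owned by the caller. */
static struct mm_obj *mm_create(bool *destroyed)
{
    struct mm_obj *mm = malloc(sizeof(*mm));

    if (!mm)
        return NULL;
    mm->ref = 1;
    mm->destroyed = destroyed;
    *destroyed = false;
    return mm;
}

/* Take an additional reference. */
static void mm_get(struct mm_obj *mm)
{
    mm->ref++;
}

/* Drop a reference; the last put destroys the object. */
static void mm_put(struct mm_obj *mm)
{
    if (--mm->ref == 0) {
        *mm->destroyed = true;
        free(mm);
    }
}
```

No code path frees the object directly; destruction only ever happens inside the final put.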

drm/i915/gvt: Rework shadow graphic memory management code · ede9d0cf

Committed by Changbin Du on Jan 30, 2018

This is a big one: the GVT shadow graphic memory management code is
heavily refined. The new code is more straightforward, with less code.

The struct intel_vgpu_mm is restructured to be clearly defined and to use
accurate names, and some of the original fields that were really redundant
are removed.

Now we only manage ppgtt mm objects with mm->ppgtt_mm.lru_list. There is no
need to mix ppgtt and ggtt together, since one vGPU has only one ggtt object.

v4: Don't invoke ppgtt_free_all_shadow_page before intel_vgpu_destroy_all_ppgtt_mm.
v3: Add GVT_RING_CTX_NR_PDPS to avoid confusion about the PDPs.
v2: Split some changes into small standalone patches.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

ede9d0cf
