Commits · 0a445945be6d10c5e6fd5599a27e43b6a7fdf14d · openeuler / Kernel

15 Aug, 2017 1 commit

drm/i915/gvt: Fix guest i915 full ppgtt blocking issue · 6b3816d6

Committed by Tina Zhang on Aug 14, 2017

Guest i915 full ppgtt functionality was blocked by an issue that could
lead to a GPU hardware hang. The guest i915 driver may update the ppgtt
table just before the workload is submitted to the hardware by the
device model. The device model did not handle this case well, because of
the small time window between removing the old ppgtt entry and adding
the new one. Errors occur when the hardware executes the workload during
that window. This patch closes the window by adding the new ppgtt entry
first and then removing the old one.

Changes in v2:
- Move VGT_CAPS_FULL_PPGTT introduction to patch 2/4. (Joonas)

Changes since v2:
- Divide the whole patch set into two separate patch series: one
  patch on the i915 side to check the guest i915 full ppgtt capability
  and enable it when the device model supports it, and another on the
  gvt side which fixes the blocking issue and enables the device model
  to provide the capability to the guest. This patch focuses on the gvt
  side. (Joonas)
- Change the title from "reorder the shadow ppgtt update process by adding
  entry first" to "Fix guest i915 full ppgtt blocking issue". (Tina)

Changes since v3:
- Rebase to the latest branch.

Changes since v4:
- Tested by Tina Zhang.

Changes since v5:
- Rebase to the latest branch.

v6:
- Update full 48bit ppgtt definition

Cc: Tina Zhang <tina.zhang@intel.com>
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
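
The add-first, remove-second ordering described in this commit can be sketched in isolation. This is a minimal illustration, not the GVT-g code: `shadow_entry`, `shadow_slot` and `shadow_update` are hypothetical names, and a single atomically swapped pointer stands in for the shadow page table.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical shadow page-table entry; not the GVT-g structure. */
struct shadow_entry {
    unsigned long gfn;   /* guest frame the entry maps */
    int refcount;        /* 0 once the entry has been retired */
};

/* One slot that a reader (the "hardware") may look at at any time. */
struct shadow_slot {
    _Atomic(struct shadow_entry *) cur;
};

/*
 * Publish the new entry first, then retire the old one. A reader
 * sees either the old or the new entry, never an empty slot.
 */
static struct shadow_entry *shadow_update(struct shadow_slot *slot,
                                          struct shadow_entry *new_entry)
{
    struct shadow_entry *old;

    new_entry->refcount = 1;
    old = atomic_exchange(&slot->cur, new_entry);
    if (old)
        old->refcount = 0;   /* retired only after the swap */
    return old;
}
```

The point of the sketch is only the order of the two steps: there is no moment at which the slot is empty.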

10 Aug, 2017 13 commits

drm/i915/gvt: Add shadow context descriptor updating · 9dfb8e5b

Committed by Kechen Lu on Aug 10, 2017

The current context logic only updates a context's descriptor when the
context is pinned into graphics memory space. This does not satisfy the
requirements of a shadow context: the addressing mode in a pinned shadow
context's descriptor may need to change to follow the guest addressing
mode, but an already pinned shadow context never gets a chance to update
its descriptor. The shadow context is then used with the wrong
descriptor, which leads to a GPU hang. This patch fixes the issue by
letting a pinned shadow context update the addressing mode in its
descriptor on demand.

This fixes the GPU hang that happens after changing the grub parameter
i915.enable_ppgtt from 0x01 to 0x03, or vice versa, and then rebooting
the guest.
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Signed-off-by: Kechen Lu <kechen.lu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: expose vGPU context hw id · a45050d7

Committed by Zhenyu Wang on Aug 01, 2017

This exposes the vGPU context hw id in mdev sysfs, where it can be used
for vGPU-based profiling. The retrieved vGPU context hw id can be passed
to the i915 perf ioctl to profile the target vGPU.

Cc: Jiao Pengyuan <pengyuan.jiao@intel.com>
Cc: Niu Bing <bing.niu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Refine the intel_vgpu_reset_gtt reset function · 4d3e67bb

Committed by Chuanxiao Dong on Aug 04, 2017

A vGPU reset does not need to reset the gtt/ppgtt. Resetting it makes
GVT redo the ppgtt shadow for every workload, which caused really bad
performance after a vGPU reset. This patch makes sure the ppgtt is
cleaned only on a device model level reset.
Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Add carefully checking in GTT walker paths · 4b2dbbc2

Committed by Changbin Du on Aug 02, 2017

While debugging the gtt code, we found that intel_vgpu_gma_to_gpa() can
translate any given GMA even when the GMA is not valid. This is because
the GTT ops suppress possible errors, so an invalid PT entry may be
returned to the caller.

This patch changes the prototype of the pte ops to propagate status to
callers, and makes the GTT walker stop as soon as an error is detected
to prevent undefined behavior.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
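
The idea of entry accessors that return a status, with the walker stopping at the first failure, can be sketched on a toy two-level table. Everything here (`toy_gtt`, `get_entry`, `gma_to_gpa`, the 2-bit indices and the entry layout) is a made-up illustration and not the GVT-g pte ops.

```c
#include <errno.h>
#include <stdint.h>

#define ENTRIES 4
#define PRESENT 0x1u

/* Hypothetical two-level table; tables[0] is the root. */
struct toy_gtt {
    uint64_t tables[1 + ENTRIES][ENTRIES];
};

/* Entry getter that reports failure instead of hiding it. */
static int get_entry(const struct toy_gtt *gtt, uint64_t table,
                     unsigned index, uint64_t *out)
{
    if (table > ENTRIES || index >= ENTRIES)
        return -EINVAL;
    *out = gtt->tables[table][index];
    if (!(*out & PRESENT))
        return -ENXIO;      /* entry not present */
    return 0;
}

/* Address layout: [root index:2][leaf index:2][page offset:12]. */
static int gma_to_gpa(const struct toy_gtt *gtt, uint64_t gma, uint64_t *gpa)
{
    uint64_t e;
    int ret;

    if (gma >> 16)          /* beyond what the two levels can map */
        return -EINVAL;

    ret = get_entry(gtt, 0, (gma >> 14) & 3, &e);
    if (ret)
        return ret;         /* stop at the first error */

    /* A root entry stores the number of the next table above bit 12. */
    ret = get_entry(gtt, e >> 12, (gma >> 12) & 3, &e);
    if (ret)
        return ret;

    *gpa = (e & ~0xfffULL) | (gma & 0xfff);
    return 0;
}
```

A caller checks the return value before using `*gpa`, so an unmapped address yields an error code rather than a translated garbage address.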

drm/i915/gvt: Remove duplicated MMIO entries · 36ed7e97

Committed by Jian Jun Chen on Jul 19, 2017

Remove duplicated MMIO entries from the tracked MMIO list. -EEXIST
is returned if a duplicate is found when a new MMIO entry is added.

v2:
- Use WARN(1, ...) for more verbose message. (Zhenyu)
Signed-off-by: Jian Jun Chen <jian.jun.chen@intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Reviewed-by: Yulei Zhang <yulei.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
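
Rejecting a duplicate registration with -EEXIST might look like the following sketch. The flat array and the names `mmio_table` and `mmio_track` are assumptions made for illustration; the driver itself keeps tracked MMIO in a hash table.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TRACKED 16

/* Hypothetical list of tracked register offsets. */
struct mmio_table {
    uint32_t offsets[MAX_TRACKED];
    size_t count;
};

/* Add an offset; refuse duplicates so each register is tracked once. */
static int mmio_track(struct mmio_table *t, uint32_t offset)
{
    for (size_t i = 0; i < t->count; i++)
        if (t->offsets[i] == offset)
            return -EEXIST;
    if (t->count == MAX_TRACKED)
        return -ENOMEM;
    t->offsets[t->count++] = offset;
    return 0;
}
```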

drm/i915/gvt: take runtime pm when do early scan and shadow · 73821a53

Committed by Zhenyu Wang on Jul 10, 2017

Runtime pm needs to be taken when doing the early scan/shadow of a
workload for request operations.

Fixes: 7fa56bd159bc ("drm/i915/gvt: Audit and shadow workload during ELSP writing")
Cc: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Replace duplicated code with exist function · 64d8bb83

Committed by Ping Gao on Jul 04, 2017

Use the existing function intel_gvt_ggtt_validate_range to replace
duplicated code that does the same thing.
Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: To check whether workload scan and shadow has mutex hold · 87e919d7

Committed by Ping Gao on Jul 04, 2017

The workload scan and shadow function must be called with
drm.struct_mutex held. To avoid misuse of this function, add a lockdep
assert to it.
Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Audit and shadow workload during ELSP writing · d0302e74

Committed by Ping Gao on Jun 29, 2017

Audit and shadow the workload ahead of vGPU scheduling. This
eliminates GPU idle time and improves multi-VM performance.

The performance of Heaven running simultaneously in 3 VMs improved
by 20% after this patch.

v2: Remove the condition current->vgpu==vgpu when shadowing during ELSP
writing.
Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Factor out scan and shadow from workload dispatch · 89ea20b9

Committed by Ping Gao on Jun 29, 2017

To perform the workload scan and shadow at the ELSP writing stage for
better performance, the workload scan and shadow code should be
factored out of dispatch_workload().

v2: Put context pin before i915_add_request;
    Refine the comments;
    Rename some APIs;

v3: workload->status should be set only when an error happens.
v4: i915_add_request is a must after i915_gem_request_alloc.
Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Optimize ring switch 2x faster again by light weight mmio access wrapper · 4671ea20

Committed by Changbin Du on Jun 23, 2017

I915_READ/WRITE is not just an mmio read/write; it also includes debug
checking and Forcewake domain lookup. This is too heavy for the GVT
ring switch case, which accesses a batch of mmio registers on each ring
switch. We can handle Forcewake manually and use the raw
i915_read/write instead. The benefit is 2x faster mmio switch
performance.
         Before       After
cycles  ~550000      ~250000

v2: Use existing I915_READ_FW/I915_WRITE_FW macro. (zhenyu)
Signed-off-by: Changbin Du <changbin.du@intel.com>
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Optimize ring switch 2x faster by removing unnecessary POSTING_READ · f846c8de

Committed by Changbin Du on Jun 23, 2017

There are lots of POSTING_READs, one alongside each mmio write op, but
they are not actually necessary. They just add latency, since a PCIe
read op is a non-posted transaction and is very slow.

For a PCIe device, the strong ordering rules for memory transactions are:
  o PCIe mmio write sequence is FIFO. Posted request cannot
    pass previous posted request.
  o PCIe mmio read will not go ahead of previous write.

Intel graphics doesn't support RO (relaxed ordering), so the above
rules apply. In our case, we only need one POSTING_READ at the end.
This removes half of the mmio read ops, and the average ring switch
performance is nearly doubled.
         Before       After
cycles  ~970000      ~550000
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
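
The difference between a read after every write and one read at the end can be shown with a simulated register file that only counts accesses. It is a sketch under the ordering assumptions stated in the commit message; `fake_mmio` and the two restore helpers are hypothetical names, and no real MMIO is touched.

```c
#include <stddef.h>
#include <stdint.h>

#define NREGS 8

/* Simulated register file that counts accesses; not real MMIO. */
struct fake_mmio {
    uint32_t regs[NREGS];
    unsigned writes;
    unsigned reads;
};

static void reg_write(struct fake_mmio *m, size_t r, uint32_t v)
{
    m->regs[r] = v;
    m->writes++;
}

static uint32_t reg_read(struct fake_mmio *m, size_t r)
{
    m->reads++;
    return m->regs[r];
}

/* Old pattern: a posting read after every write. */
static void restore_with_read_each(struct fake_mmio *m,
                                   const uint32_t *v, size_t n)
{
    for (size_t i = 0; i < n && i < NREGS; i++) {
        reg_write(m, i, v[i]);
        (void)reg_read(m, i);
    }
}

/* New pattern: all writes first, then one posting read at the end. */
static void restore_with_single_read(struct fake_mmio *m,
                                     const uint32_t *v, size_t n)
{
    size_t i;

    for (i = 0; i < n && i < NREGS; i++)
        reg_write(m, i, v[i]);
    if (i > 0)
        (void)reg_read(m, i - 1);   /* flushes the whole write batch */
}
```

Both helpers leave the registers in the same state; only the number of slow reads differs.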

drm/i915/gvt: Use gvt_err to print the resource not enough error · 4cf196eb

Committed by Chuanxiao Dong on Jun 13, 2017

It is better to use gvt_err when gvt resources are insufficient, so
that the user is notified through the kernel dmesg; this kind of
error message is gvt related.
Suggested-by: Bing Niu <bing.niu@intel.com>
Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
Cc: Bing Niu <bing.niu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

27 Jul, 2017 2 commits

drm/i915/hsw+: Add has_fuses power well attribute · b2891eb2

Committed by Imre Deak on Jul 11, 2017

The pattern of a power well backing a set of fuses whose initialization
we need to wait for during power well enabling is common to all GEN9+
platforms. Adding support for this to the HSW power well enable helper
allows us to use the HSW/BDW power well code for GEN9+ as well in a
follow-up patch.

v2:
- Use an enum for power gates instead of raw numbers. (Ville)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170711204236.5618-6-imre.deak@intel.com
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/hsw+: Unify the hsw/bdw and gen9+ power well req/state macros · 1af474fe

Committed by Imre Deak on Jul 06, 2017

Although on HSW/BDW there is only a single display global power well,
it's programmed the same way as other GEN9+ power wells. This also
means we can get at the HSW/BDW request and status flags the same way
it's done on GEN9+ by assigning the corresponding HSW/BDW power well ID.
This ID was assigned in a recent patch, so we can now switch to using
the same macros everywhere on HSW+.

Updating the HSW power well control register with RMW is not strictly
necessary, but this will allow us to use the same code for GEN9+.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1499352040-8819-13-git-send-email-imre.deak@intel.com
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

11 Jul, 2017 5 commits

drm/i915/gvt: Use fence error from GVT request for workload status · 0cf5ec41

Committed by Chuanxiao Dong on Jun 23, 2017

req->fence.error is set if the request caused a GPU hang, so we can
propagate this value to workload->status to indicate whether the GVT
request caused any problem. If it caused a GPU hang, we shouldn't
trigger any context switch back to the guest.

v2:
- only take -EIO from fence->error. (Zhenyu)

Fixes: 8f1117ab (drm/i915/gvt: handle workload lifecycle properly)
Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
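
The v2 rule, taking only -EIO from the fence error, can be expressed as a small pure function. This is a sketch of the decision only; `status_from_fence` and `may_switch_back_to_guest` are hypothetical helpers, not functions from the driver.

```c
#include <errno.h>

/*
 * Fold a request's fence error into the workload status. Only -EIO,
 * which marks a GPU hang, is taken; other fence errors leave the
 * status unchanged.
 */
static int status_from_fence(int current_status, int fence_error)
{
    if (fence_error == -EIO)
        return -EIO;
    return current_status;
}

/* A workload that hung the GPU must not be switched back to the guest. */
static int may_switch_back_to_guest(int status)
{
    return status == 0;
}
```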

drm/i915/gvt: remove scheduler_mutex in per-engine workload_thread · 4cc74389

Committed by Weinan Li on Jun 19, 2017

For vGPU workloads, GVT-g now uses a per-vGPU scheduler, and the
per-ring work_thread only picks workloads belonging to the current
vGPU. With the time slice based scheduler, it waits for all engines to
become idle before doing a vGPU switch. So the per-ring work_thread can
dispatch freely; different rings running in different vGPUs won't
happen.

For workloads between a vGPU and the host, this scheduler_mutex can't
block the host from dispatching workloads to other ring engines.

Remove this mutex, since it hurts performance when applications use
more than one ring engine in a vGPU.

ring0 running in vGPU1, ring1 running in Host. Will happen.
ring0 running in vGPU1, ring1 running in vGPU2. Won't happen.
Signed-off-by: Weinan Li <weinan.z.li@intel.com>
Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Revert "drm/i915/gvt: Fix possible recursive locking issue" · 08673c3e

Committed by Chuanxiao Dong on Jul 07, 2017

This reverts commit 62d02fd1.

The rwsem recursive locking trace should not be fixed on the kvmgt side
with a workqueue; it is an issue that should be fixed in VFIO. So this
commit should be reverted.
Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: stable@vger.kernel.org # v4.10+
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Audit the command buffer address · 3364bf5f

Committed by Ping Gao on Jul 04, 2017

The command buffer addresses in a context, such as the ring buffer base
address and the wa_ctx address, need to be audited to make sure they
are in the valid GGTT range.
Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Fix a memory leak in intel_gvt_init_gtt() · 0de98709

Committed by Zhou, Wenjia on Jul 04, 2017

If setup_spt_oos() fails in intel_gvt_init_gtt(), the page that was
allocated by get_zeroed_page() and mapped by dma_map_page() is leaked.

Unmap and free the page after SPT oos initialization fails to fix
this issue.
Signed-off-by: Zhou, Wenjia <zhiyuan_zhu@htc.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

29 Jun, 2017 1 commit

drm/i915/gvt: Make function dpy_reg_mmio_readx safe · 5cd82b75

Committed by Changbin Du on Jun 13, 2017

The dpy_reg_mmio_read_x functions copy 4 bytes of data directly to the
target address without considering the length. This may corrupt the
target memory if the requested length is less than 4 bytes. Fix it for
safety, even though we already have some checks to avoid this.
For convenience, the 3 functions are merged.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
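
A length-aware copy of the kind this commit describes might look like the sketch below. `vreg_read` is a hypothetical helper written for illustration; it does not reproduce the merged handler from the patch.

```c
#include <stdint.h>
#include <string.h>

/*
 * Copy at most `bytes` bytes of a 32-bit virtual register into dst,
 * never more than the register holds. Returns the number of bytes
 * actually copied.
 */
static size_t vreg_read(const uint32_t *vreg, void *dst, size_t bytes)
{
    size_t n = bytes < sizeof(*vreg) ? bytes : sizeof(*vreg);

    memcpy(dst, vreg, n);
    return n;
}
```

With a 2-byte request only two bytes of the destination are written, so memory just past a short buffer stays untouched.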

27 Jun, 2017 2 commits

drm/i915/gvt: Don't read ADPA_CRT_HOTPLUG_MONITOR from host · 75e64ff2

Committed by Xiong Zhang on Jun 28, 2017

When the host connects a crt screen, a linux guest detects two
screens: crt and dp. This is wrong, as the linux guest has only
one dp.

To keep the guest from seeing the host crt screen, we should set
ADPA_CRT_HOTPLUG_MONITOR to none. But MMIO_RO(PCH_ADPA) prevents
that, so MMIO_DH should be used instead of MMIO_RO.

v2: Clear its status to none at initialization, so the guest doesn't
    get the host crt.(Zhangyu)
v3: SKL doesn't have this register, limit it to pre_skl.(xiong)
Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Set initial PORT_CLK_SEL vreg for BDW · 295a0d0b

Committed by Xiong Zhang on Jun 20, 2017

On BDW, when the host physical screen and the guest virtual screen
aren't on the same DDI port, the guest i915 driver prints the following
error and stops running.
[    6.775873] BUG: unable to handle kernel NULL pointer dereference
at 0000000000000068
[    6.775928] IP: intel_ddi_clock_get+0x81/0x430 [i915]
[    6.776206] Call Trace:
[    6.776233]  ? vgpu_read32+0x4f/0x100 [i915]
[    6.776264]  intel_ddi_get_config+0x11c/0x230 [i915]
[    6.776298]  intel_modeset_setup_hw_state+0x313/0xd40 [i915]
[    6.776334]  intel_modeset_init+0xe49/0x18d0 [i915]
[    6.776368]  ? vgpu_write32+0x53/0x100 [i915]
[    6.776731]  ? intel_i2c_reset+0x42/0x50 [i915]
[    6.777085]  ? intel_setup_gmbus+0x32a/0x350 [i915]
[    6.777427]  i915_driver_load+0xabc/0x14d0 [i915]
[    6.777768]  i915_pci_probe+0x4f/0x70 [i915]

The null pointer is the guest's intel_crtc_state->shared_dpll, which is
set in haswell_get_ddi_pll(). When the guest and host screens are on
different DDI ports, the host driver won't set PORT_CLK_SEL(guest_port),
so haswell_get_ddi_pll() returns null and doesn't set
pipe_config->shared_dpll. Once later code references this structure, it
prints the above error.

This patch sets the initial value of the guest PORT_CLK_SEL(guest_port)
to LCPLL_810. The guest i915 driver will then reset this value according
to the guest screen mode.
Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

26 Jun, 2017 2 commits

drm/i915/gvt: Fix inconsistent locks holding sequence · f16bd3dd

Committed by Chuanxiao Dong on Jun 26, 2017

There are two kinds of locking sequence.

One is in the thread started by the vfio ioctl to do the iommu
unmapping. The locking sequence is:
	down_read(&group_lock) ----> mutex_lock(&cache_lock)

The other is in the vfio release thread, which unpins all the cached
pages. The locking sequence is:
	mutex_lock(&cache_lock) ---> down_read(&group_lock)

The cache_lock protects the rb tree of cache nodes, and doing the vfio
unpin doesn't require this lock. Move the vfio unpin out of the
cache_lock protected region.

v2:
- use for style instead of do{}while(1). (Zhenyu)

Fixes: f30437c5 ("drm/i915/gvt: add KVMGT support")
Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: stable@vger.kernel.org # v4.10+
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
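
Moving the unpin call outside the region protected by the cache lock can be sketched in user space with a pthread mutex. The structure, the `lock_held` bookkeeping field and all function names are assumptions made for this illustration; the `unpin` callback stands in for the vfio unpin call, which takes its own lock.

```c
#include <pthread.h>
#include <stddef.h>

#define CACHE_MAX 8

/* Hypothetical page cache; cache_lock protects pfns[] and count only. */
struct page_cache {
    pthread_mutex_t cache_lock;
    int lock_held;               /* bookkeeping for the demo check below */
    unsigned long pfns[CACHE_MAX];
    size_t count;
};

typedef void (*unpin_fn)(struct page_cache *c, unsigned long pfn);

/* Pop one entry under the lock; returns 0 when the cache is empty. */
static int cache_pop(struct page_cache *c, unsigned long *pfn)
{
    int found = 0;

    pthread_mutex_lock(&c->cache_lock);
    c->lock_held = 1;
    if (c->count > 0) {
        *pfn = c->pfns[--c->count];
        found = 1;
    }
    c->lock_held = 0;
    pthread_mutex_unlock(&c->cache_lock);
    return found;
}

/* Unpin every cached page, calling unpin() only after the lock is dropped. */
static size_t cache_release_all(struct page_cache *c, unpin_fn unpin)
{
    unsigned long pfn;
    size_t released = 0;

    for (; cache_pop(c, &pfn); released++)
        unpin(c, pfn);
    return released;
}

/* Demo callback: records whether the cache lock was held during unpin. */
static int unpin_saw_lock_held;

static void demo_unpin(struct page_cache *c, unsigned long pfn)
{
    (void)pfn;
    if (c->lock_held)
        unpin_saw_lock_held = 1;
}
```

Because the callback never runs with `cache_lock` held, whatever lock it takes cannot form an inverted ordering with the cache lock.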

drm/i915/gvt: Fix possible recursive locking issue · 62d02fd1

Committed by Chuanxiao Dong on Jun 26, 2017

vfio_unpin_pages takes a read semaphore that is already held in the
same thread by the vfio ioctl. This causes the warning below:

[ 5102.127454] ============================================
[ 5102.133379] WARNING: possible recursive locking detected
[ 5102.139304] 4.12.0-rc4+ #3 Not tainted
[ 5102.143483] --------------------------------------------
[ 5102.149407] qemu-system-x86/1620 is trying to acquire lock:
[ 5102.155624]  (&container->group_lock){++++++}, at: [<ffffffff817768c6>] vfio_unpin_pages+0x96/0xf0
[ 5102.165626]
but task is already holding lock:
[ 5102.172134]  (&container->group_lock){++++++}, at: [<ffffffff8177728f>] vfio_fops_unl_ioctl+0x5f/0x280
[ 5102.182522]
other info that might help us debug this:
[ 5102.189806]  Possible unsafe locking scenario:

[ 5102.196411]        CPU0
[ 5102.199136]        ----
[ 5102.201861]   lock(&container->group_lock);
[ 5102.206527]   lock(&container->group_lock);
[ 5102.211191]
*** DEADLOCK ***

[ 5102.217796]  May be due to missing lock nesting notation

[ 5102.225370] 3 locks held by qemu-system-x86/1620:
[ 5102.230618]  #0:  (&container->group_lock){++++++}, at: [<ffffffff8177728f>] vfio_fops_unl_ioctl+0x5f/0x280
[ 5102.241482]  #1:  (&(&iommu->notifier)->rwsem){++++..}, at: [<ffffffff810de775>] __blocking_notifier_call_chain+0x35/0x70
[ 5102.253713]  #2:  (&vgpu->vdev.cache_lock){+.+...}, at: [<ffffffff8157b007>] intel_vgpu_iommu_notifier+0x77/0x120
[ 5102.265163]
stack backtrace:
[ 5102.270022] CPU: 5 PID: 1620 Comm: qemu-system-x86 Not tainted 4.12.0-rc4+ #3
[ 5102.277991] Hardware name: Intel Corporation S1200RP/S1200RP, BIOS S1200RP.86B.03.01.APER.061220151418 06/12/2015
[ 5102.289445] Call Trace:
[ 5102.292175]  dump_stack+0x85/0xc7
[ 5102.295871]  validate_chain.isra.21+0x9da/0xaf0
[ 5102.300925]  __lock_acquire+0x405/0x820
[ 5102.305202]  lock_acquire+0xc7/0x220
[ 5102.309191]  ? vfio_unpin_pages+0x96/0xf0
[ 5102.313666]  down_read+0x2b/0x50
[ 5102.317259]  ? vfio_unpin_pages+0x96/0xf0
[ 5102.321732]  vfio_unpin_pages+0x96/0xf0
[ 5102.326024]  intel_vgpu_iommu_notifier+0xe5/0x120
[ 5102.331283]  notifier_call_chain+0x4a/0x70
[ 5102.335851]  __blocking_notifier_call_chain+0x4d/0x70
[ 5102.341490]  blocking_notifier_call_chain+0x16/0x20
[ 5102.346935]  vfio_iommu_type1_ioctl+0x87b/0x920
[ 5102.351994]  vfio_fops_unl_ioctl+0x81/0x280
[ 5102.356660]  ? __fget+0xf0/0x210
[ 5102.360261]  do_vfs_ioctl+0x93/0x6a0
[ 5102.364247]  ? __fget+0x111/0x210
[ 5102.367942]  SyS_ioctl+0x41/0x70
[ 5102.371542]  entry_SYSCALL_64_fastpath+0x1f/0xbe

Putting vfio_unpin_pages in a workqueue fixes this.

v2:
- use for style instead of do{}while(1). (Zhenyu)
v3:
- rename gvt_cache_mark to gvt_cache_mark_remove. (Zhenyu)

Fixes: 659643f7 ("drm/i915/gvt/kvmgt: add vfio/mdev support to KVMGT")
Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: stable@vger.kernel.org # v4.10+
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

21 Jun, 2017 1 commit

drm/i915: Allow contexts to be unreferenced locklessly · 5f09a9c8

Committed by Chris Wilson on Jun 20, 2017

If we move the actual cleanup of the context to a worker, we can allow
the final free to be called from any context and avoid undue latency in
the caller.

v2: Negotiate handling the delayed contexts free by flushing the
workqueue before calling i915_gem_context_fini() and performing the final
free of the kernel context directly
v3: Flush deferred frees before new context allocations
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170620110547.15947-2-chris@chris-wilson.co.uk

08 Jun, 2017 13 commits

drm/i915/gvt: Refine virtual reset function · 615c16a9

Committed by fred gao on May 25, 2017

During the emulation of a virtual reset:
1. Only reset the engine-related mmio, which ends at MMIO
   offset Master_IRQ; display registers are not included.

2. Fences do not need to be set to their default
   values either, to prevent screen flickering.

This fixes the guest screen hang seen while running
force TDR in a Linux guest.

v2:
- only reset the engine related mmio. (Zhenyu & Zhiyuan)
v3:
- IMR/Ring mode registers are not save/restored. (Changbin)
v4:
- redefine the MMIO reset offset for easy understanding. (Zhenyu)
- pvinfo can be reset. (Zhenyu)
v5:
- add more comments for mmio reset. (Zhenyu)

Cc: Changbin Du <changbin.du@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Lv zhiyuan <zhiyuan.lv@intel.com>
Cc: Zhang Yulei <yulei.zhang@intel.com>
Signed-off-by: fred gao <fred.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Fix GDRST vreg state after reset · 0811fa66

Committed by fred gao on May 24, 2017

Emulate the GDRST read behavior correctly to ack the
guest reset request.

v2:
- split the original patch into two:
  GDRST read handler and virtual gpu reset. (Zhenyu)
v3:
- emulate the GDRST read right after write. (Zhenyu)

Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Zhang Yulei <yulei.zhang@intel.com>
Signed-off-by: fred gao <fred.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Tuning the size of MMIO hash lookup table to 2048 · 178cd160

Committed by Changbin Du on Jun 06, 2017

On the Skylake platform, up to 2039 virtual mmio registers are traced.
Tune the hash table size accordingly to improve lookup performance.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
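
The choice of 2048 follows from rounding the entry count up to a power of two. As a small sketch of that arithmetic, with `hash_bits_for` being a hypothetical helper (kernel hash tables declared with DECLARE_HASHTABLE are sized by a bit count rather than an entry count):

```c
/* Smallest number of hash bits whose table holds at least n buckets. */
static unsigned hash_bits_for(unsigned n)
{
    unsigned bits = 0;

    while (bits < 31 && (1u << bits) < n)
        bits++;
    return bits;
}
```

For 2039 tracked registers this yields 11 bits, that is 2048 buckets, so on average each bucket holds at most about one entry.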

drm/i915/gvt: Add helper for tuning MMIO hash table · fbfd76c3

Committed by Changbin Du on Jun 06, 2017

We count all the tracked virtual MMIO registers, which can help us to
tune the MMIO hash table.

v2: Move num_tracked_mmio into gvt structure.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Make the MMIO attribute wrappers be inline · 5c6d4c67

Committed by Changbin Du on Jun 06, 2017

Function calls are expensive. I have seen obvious overhead from calls
to these wrappers in perf data, especially from the cmd parser side.
So make these simple wrappers inline to eliminate that overhead.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Make mmio_attribute as type u8 to save 1.5MB memory · 56a78de5

Committed by Changbin Du on Jun 06, 2017

Type u8 is big enough to contain all MMIO attribute flags. As the
total MMIO size is 2MB, this saves 1.5MB of memory.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
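
The 1.5MB figure can be checked with a little arithmetic. The sketch assumes one attribute per 4-byte register and a previous attribute type of 32 bits; both are inferences from the numbers in the message, not details stated in it.

```c
#include <stddef.h>
#include <stdint.h>

#define MMIO_SPACE_BYTES (2u * 1024 * 1024)  /* 2MB MMIO space */
#define MMIO_REG_BYTES   4u                  /* assumed: one attribute per DWORD */

/* Size of an attribute table with one attr_size-byte entry per register. */
static size_t attr_table_bytes(size_t attr_size)
{
    return (MMIO_SPACE_BYTES / MMIO_REG_BYTES) * attr_size;
}
```

Under these assumptions, 512K entries cost 2MB as 32-bit values and 512KB as u8 values, a difference of 1.5MB.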

drm/i915/gvt: Cleanup struct intel_gvt_mmio_info · d8d94ba3

Committed by Changbin Du on Jun 06, 2017

The size, length, addr_mask fields actually are not necessary. Every
tracked mmio has DWORD size, and addr_mask is a legacy field.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Optimize MMIO register handling for some large MMIO blocks · 65f9f6fe

Committed by Changbin Du on Jun 06, 2017

Some of the traced MMIO registers form large contiguous sections. These
fill up the MMIO lookup hash table, wasting a lot of memory and
lowering lookup performance.

Here we pick out these sections and handle them specially. These
sections include:
  o Display pipe registers, total 768.
  o The PVINFO page, total 1024.
  o MCHBAR_MIRROR, total 65536.
  o CSR_MMIO, total 3072.

So we removed 70,400 items from the hash table and sped up guest
boot time by ~500ms.

v2:
  o add a local function find_mmio_block().
  o fix comments.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
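
Handling a large contiguous range with one table row instead of thousands of hash entries can be sketched as a simple range lookup. The offsets, sizes and names below are invented for illustration and are not the real register ranges; only the function name echoes the find_mmio_block() mentioned in the v2 notes.

```c
#include <stddef.h>
#include <stdint.h>

/* One row describes a whole contiguous section of registers. */
struct mmio_block {
    uint32_t start;
    uint32_t size;      /* in bytes */
    const char *name;
};

/* Made-up ranges; the real driver lists its own sections. */
static const struct mmio_block blocks[] = {
    { 0x10000, 0x1000, "block_a" },
    { 0x40000, 0x4000, "block_b" },
};

/* Return the block containing offset, or NULL to fall back to the hash. */
static const struct mmio_block *find_mmio_block(uint32_t offset)
{
    for (size_t i = 0; i < sizeof(blocks) / sizeof(blocks[0]); i++) {
        if (offset >= blocks[i].start &&
            offset - blocks[i].start < blocks[i].size)
            return &blocks[i];
    }
    return NULL;
}
```

A lookup first tries the few block rows and consults the hash table only for offsets outside every block.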

drm/i915/gvt: add gtt_invalidate API to flush the GTT TLB · af2c6399

Committed by Chuanxiao Dong on Jun 02, 2017

Add a gtt_invalidate API to handle the GTT TLB flush instead of
hiding it in the write_pte64 function. This avoids overkill when using
write_pte64.
Suggested-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Add runtime_pm get/put to protect MMIO accessing · 9b7bd65e

Committed by Chuanxiao Dong on Jun 02, 2017

In some cases, GVT-g accesses MMIO without holding runtime_pm. This
patch adds an inline API for doing the runtime_pm get/put, to make
sure the i915 HW is really powered on when HW MMIO is accessed.
Suggested-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: remove redundant -Wall · 89009b77

Committed by Nick Desaulniers on May 21, 2017

This flag is already set in the top level Makefile of the kernel.

Also, by having set CONFIG_DRM_I915_GVT, thereby appending -Wall to
ccflags, you undo all the -Wno-* cflags previously set in the Make
variable KBUILD_CFLAGS.

For example:

cc foo.c -Wall -Wno-format -Wall

resets -Wformat.
Signed-off-by: Nick Desaulniers <nick.desaulniers@gmail.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Legacy HSW related MMIO handler clean up · a1dcba90

Committed by fred gao on May 25, 2017

Remove all the legacy pre-BDW mmio handlers and the corresponding
usage/definition, since pre-BDW platforms are not supported in the GVT
environment.

v2:
- clean up all the leftover pre-BDW code, e.g.
  all D_HSW usage and itself, D_IVB, D_PRE_BDW. (Zhenyu)
v3:
- change is based on gvt-staging. (Zhenyu)
Signed-off-by: fred gao <fred.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Trigger scheduling after context complete · f100daec

Committed by Ping Gao on May 24, 2017

The time based scheduler polls the context busy status every
micro-second during a vGPU switch. This leaves the GPU idle for a while
when the context is very small and completes before the next
micro-second arrives. Triggering scheduling immediately after the
context completes eliminates the GPU idle time and improves performance.

Create two vGPUs of the same type and run Heaven simultaneously:
Before this patch:
 +---------+----------+----------+
 |         |  vGPU1   |   vGPU2  |
 +---------+----------+----------+
 |  Heaven |  357     |    354   |
 +---------+----------+----------+

After this patch:
 +---------+----------+----------+
 |         |  vGPU1   |   vGPU2  |
 +---------+----------+----------+
 |  Heaven |  397     |    398   |
 +---------+----------+----------+

v2: Protect need_reschedule with gvt-lock.
Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Weinan Li <weinan.z.li@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
