1. 06 Dec, 2017 1 commit
  2. 28 Nov, 2017 2 commits
    • drm/i915/gvt: Move request alloc to dispatch_workload path only · c3c80f07
      Authored by fred gao
      Performance was previously improved by auditing and shadowing the
      workload ahead of vGPU scheduling. However, more requests can then be
      allocated in submit_context before the previous request is added, so
      the timeline holds a later seqno than that request.
      
      This patch moves the request allocation to the dispatch_workload
      function, the same place where the request is added.
      
      This fixes the kernel BUG on the (timeline->seqno != request->fence.seqno)
      check in add_request.
      
      Fixes: 89ea20b9 ("drm/i915/gvt: Factor out scan and shadow from workload dispatch")
      Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
      Signed-off-by: fred gao <fred.gao@intel.com>
      Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
      (cherry picked from commit f2880e04)
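The seqno mismatch described in this commit can be illustrated with a small user-space toy model. This is only a sketch under stated assumptions: `toy_timeline`, `toy_request` and every function below are hypothetical names, not the i915 API, and the model assumes that allocating a request reserves the next seqno on its timeline.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: these names are hypothetical, not the i915 API. */
struct toy_timeline {
    uint32_t seqno;            /* last seqno reserved on this timeline */
};

struct toy_request {
    struct toy_timeline *tl;
    uint32_t seqno;            /* seqno reserved when the request was allocated */
};

/* Assumption: allocation reserves the next seqno on the timeline. */
static struct toy_request toy_request_alloc(struct toy_timeline *tl)
{
    assert(tl != NULL);
    struct toy_request rq = { .tl = tl, .seqno = ++tl->seqno };
    return rq;
}

/* Models the check quoted in the commit message: the add is only valid
 * while the timeline still holds the seqno this request reserved. */
static bool toy_request_add_ok(const struct toy_request *rq)
{
    return rq->tl->seqno == rq->seqno;
}

/* Old flow: a second request is allocated before the first one is added. */
static bool toy_alloc_ahead_of_dispatch(void)
{
    struct toy_timeline tl = { 0 };
    struct toy_request first = toy_request_alloc(&tl);
    struct toy_request second = toy_request_alloc(&tl);

    (void)second;
    return toy_request_add_ok(&first);   /* timeline has already moved on */
}

/* New flow: each request is allocated right where it is added. */
static bool toy_alloc_at_dispatch(void)
{
    struct toy_timeline tl = { 0 };
    struct toy_request first = toy_request_alloc(&tl);
    bool first_ok = toy_request_add_ok(&first);
    struct toy_request second = toy_request_alloc(&tl);
    bool second_ok = toy_request_add_ok(&second);

    return first_ok && second_ok;
}
```

In this model the early-allocation flow fails the check for the first request, while allocating at dispatch time keeps the timeline and the request in step.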
    • drm/i915/gvt: Fix unsafe locking caused by spin_unlock_bh · 679fd3eb
      Authored by Changbin Du
      The caller of shadow_context_status_change may disable irqs. So it is not
      safe to use spin_unlock_bh in such context. Let's switch to irqsave version
      for safety.
      
      ------------[ cut here ]------------
      WARNING: CPU: 2 PID: 4504 at kernel/softirq.c:161 __local_bh_enable_ip+0x46/0x60
      [  168.797710] Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT, BIOS 1.2.8 01/26/2016
      [  168.797712] task: ffff8c693d22db80 task.stack: ffffb51b482bc000
      [  168.797718] RIP: 0010:__local_bh_enable_ip+0x46/0x60
      [  168.797721] RSP: 0018:ffffb51b482bfa10 EFLAGS: 00010046
      [  168.797724] RAX: 0000000000000046 RBX: ffff8c6900278000 RCX: 00000000ffffffff
      [  168.797726] RDX: 0000000000000001 RSI: 0000000000000200 RDI: ffffffffc06a0330
      [  168.797728] RBP: ffffb51b482bfa10 R08: 0000000000000000 R09: ffff8c690027cb90
      [  168.797730] R10: ffffb51b482bfa40 R11: 00000004072f0001 R12: 0000000000000000
      [  168.797732] R13: 0000000000000000 R14: ffff8c690027ca9c R15: 0000000000000000
      [  168.797735] FS:  00007ff187c56700(0000) GS:ffff8c6959d00000(0000) knlGS:0000000000000000
      [  168.797738] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  168.797740] CR2: 0000562bc0c3991f CR3: 0000000430614006 CR4: 00000000003606e0
      [  168.797742] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  168.797744] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  168.797745] Call Trace:
      [  168.797755]  _raw_spin_unlock_bh+0x1e/0x20
      [  168.797826]  shadow_context_status_change+0x120/0x1e0 [i915]
      [  168.797831]  notifier_call_chain+0x4a/0x70
      [  168.797834]  atomic_notifier_call_chain+0x1a/0x20
      [  168.797896]  execlists_cancel_port_requests+0x4f/0x80 [i915]
      [  168.797956]  reset_common_ring+0x30/0x100 [i915]
      [  168.798007]  i915_gem_reset_engine+0x114/0x330 [i915]
      [  168.798060]  ? i915_gem_retire_requests+0x75/0x180 [i915]
      [  168.798111]  i915_gem_reset+0x3e/0xb0 [i915]
      [  168.798149]  i915_reset+0x10b/0x1c0 [i915]
      [  168.798187]  i915_reset_device+0x209/0x220 [i915]
      [  168.798225]  ? gen8_gt_irq_ack+0x170/0x170 [i915]
      [  168.798229]  ? __queue_work+0x430/0x430
      [  168.798270]  i915_handle_error+0x285/0x420 [i915]
      [  168.798275]  ? mntput+0x24/0x40
      [  168.798281]  ? terminate_walk+0x8e/0xf0
      [  168.798328]  i915_wedged_set+0x84/0xc0 [i915]
      [  168.798333]  simple_attr_write+0xab/0xc0
      [  168.798337]  full_proxy_write+0x54/0x90
      [  168.798343]  __vfs_write+0x37/0x170
      [  168.798349]  ? common_file_perm+0x4c/0x100
      [  168.798355]  ? apparmor_file_permission+0x1a/0x20
      [  168.798361]  ? security_file_permission+0x3b/0xc0
      [  168.798365]  vfs_write+0xb8/0x1b0
      [  168.798370]  SyS_write+0x55/0xc0
      [  168.798376]  entry_SYSCALL_64_fastpath+0x1e/0xa9
      
      Fixes: 0e86cc9c ("drm/i915/gvt: implement per-vm mmio switching optimization")
      Signed-off-by: Changbin Du <changbin.du@intel.com>
      Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
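Why the `_bh` unlock is unsafe when the caller has already disabled irqs can be sketched with a user-space toy model. Everything here is a hypothetical stand-in (`toy_cpu` and the `toy_*` helpers are not kernel functions); the model only assumes that re-enabling bottom halves with irqs off is what gets flagged, as the trace above suggests, and that an irqsave/irqrestore pair puts the caller's irq state back as it found it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy CPU state; all names here are hypothetical stand-ins. */
struct toy_cpu {
    bool irqs_off;        /* true while "interrupts" are disabled */
    int bh_disabled;      /* nesting depth of "bottom halves disabled" */
    int warnings;         /* times the bh-enable path ran with irqs off */
};

static void toy_lock_bh(struct toy_cpu *cpu)
{
    cpu->bh_disabled++;
}

static void toy_unlock_bh(struct toy_cpu *cpu)
{
    /* Stand-in for the WARNING in the trace: re-enabling bottom halves
     * while irqs are disabled is flagged. */
    if (cpu->irqs_off)
        cpu->warnings++;
    cpu->bh_disabled--;
}

static unsigned long toy_lock_irqsave(struct toy_cpu *cpu)
{
    unsigned long flags = cpu->irqs_off ? 1UL : 0UL;

    cpu->irqs_off = true;
    return flags;
}

static void toy_unlock_irqrestore(struct toy_cpu *cpu, unsigned long flags)
{
    cpu->irqs_off = (flags != 0);
}

/* A notifier callback invoked by a caller that already disabled irqs. */
static int toy_notifier_warnings(bool use_irqsave)
{
    struct toy_cpu cpu = { .irqs_off = true, .bh_disabled = 0, .warnings = 0 };

    if (use_irqsave) {
        unsigned long flags = toy_lock_irqsave(&cpu);
        toy_unlock_irqrestore(&cpu, flags);
    } else {
        toy_lock_bh(&cpu);
        toy_unlock_bh(&cpu);
    }
    assert(cpu.irqs_off);   /* the caller's irq state must be preserved */
    return cpu.warnings;
}
```

With the `_bh` pair the model records one warning; with the irqsave pair it records none and the caller's irq state is unchanged.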
  3. 05 Oct, 2017 1 commit
  4. 13 Sep, 2017 1 commit
  5. 08 Sep, 2017 3 commits
    • drm/i915/gvt: Refine error handling in dispatch_workload · 0f43702a
      Authored by fred gao
      When an error occurs in dispatch_workload, this patch does the proper
      cleanup and rolls back to the original state before the workload is
      abandoned.
      
      v2:
      - split the mixed several error paths for better review. (Zhenyu)
      
      v3:
      - original PTR_ERR(cs) is good and code cleanup. (Zhenyu)
      
      v4:
      - reuse the existing i915_add_request for error handling. (Zhenyu)
      
      v5:
      - remove the duplicate error handling release_shadow_wa_ctx and
        move the engine->context_unpin upper. (Zhenyu)
      
      v6:
      - keep the old label "out". (Zhenyu)
      Signed-off-by: fred gao <fred.gao@intel.com>
      Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
    • drm/i915/gvt: Add error handling for intel_gvt_scan_and_shadow_workload · a3cfdca9
      Authored by fred gao
      When an error occurs after shadow_indirect_ctx, this patch does the
      proper cleanup and rolls the shadowed indirect context back to its
      original state before the workload is abandoned.
      
      v2:
      - split the mixed several error paths for better review. (Zhenyu)
      
      v3:
      - no return check for clean up functions. (Changbin)
      
      v4:
      - expose and reuse the existing release_shadow_wa_ctx. (Zhenyu)
      
      v5:
      - move the release function to scheduler.c file. (Zhenyu)
      
      v6:
      - move error handling code of intel_gvt_scan_and_shadow_workload
        to here. (Zhenyu)
      Signed-off-by: fred gao <fred.gao@intel.com>
      Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
    • drm/i915/gvt: Separate cmd scan from request allocation · 0a53bc07
      Authored by fred gao
      Currently the i915 request structure and shadow ring buffer are
      allocated before the command scan, so the previous state has to be
      restored if any error happens afterwards in the long dispatch_workload
      path.
      
      This patch introduces a reserved ring buffer created at the beginning
      of vGPU initialization. The workload is copied to this reserved buffer
      and scanned first; the i915 request and shadow ring buffer are allocated
      only after the scan succeeds.
      
      To balance memory usage against buffer allocation time, when a workload
      larger than the current buffer arrives, the buffer is reallocated to
      that size and kept until an even larger workload arrives.
      
      v2:
      - use kmalloc for the smaller ring buffer, realloc if required. (Zhenyu)
      
      v3:
      - remove the dynamically allocated ring buffer. (Zhenyu)
      
      v4:
      - code style polish.
      - kfree previous allocated buffer once kmalloc failed. (Zhenyu)
      Signed-off-by: fred gao <fred.gao@intel.com>
      Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
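The grow-only reserved buffer described in this commit might look roughly like the following user-space sketch. It uses `malloc`/`realloc`/`free` in place of the kernel allocators, and `toy_scan_buf` with its helpers are hypothetical names chosen for illustration, not the driver's own structures.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical user-space stand-in for the reserved scan buffer. */
struct toy_scan_buf {
    void *va;
    size_t size;
};

/* Reserve the initial buffer once, at "vGPU init" time. */
static int toy_scan_buf_init(struct toy_scan_buf *buf, size_t reserved)
{
    assert(reserved > 0);
    buf->va = malloc(reserved);
    buf->size = buf->va ? reserved : 0;
    return buf->va ? 0 : -1;
}

/* Copy a workload into the buffer, growing it only when the workload is
 * larger than anything seen so far. Returns 0 on success, -1 on failure. */
static int toy_scan_buf_load(struct toy_scan_buf *buf, const void *src, size_t len)
{
    if (len > buf->size) {
        void *bigger = realloc(buf->va, len);

        if (!bigger) {
            /* Release the previous buffer as well, as the v4 note describes. */
            free(buf->va);
            buf->va = NULL;
            buf->size = 0;
            return -1;
        }
        buf->va = bigger;
        buf->size = len;
    }
    if (len > 0)
        memcpy(buf->va, src, len);
    return 0;
}

static void toy_scan_buf_fini(struct toy_scan_buf *buf)
{
    free(buf->va);
    buf->va = NULL;
    buf->size = 0;
}
```

The buffer never shrinks in this sketch, so a small workload following a large one reuses the larger allocation instead of paying for another allocation.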
  6. 10 Aug, 2017 4 commits
  7. 02 Aug, 2017 1 commit
  8. 11 Jul, 2017 2 commits
  9. 21 Jun, 2017 1 commit
  10. 08 Jun, 2017 2 commits
    • drm/i915/gvt: Trigger scheduling after context complete · f100daec
      Authored by Ping Gao
      The time based scheduler polls the context busy status every
      micro-second during vGPU switch, which leaves the GPU idle for a while
      when the context is very small and completes before the next
      micro-second arrives. Triggering scheduling immediately after context
      completion eliminates the GPU idle time and improves performance.
      
      Create two vGPUs of the same type and run Heaven simultaneously:
      Before this patch:
       +---------+----------+----------+
       |         |  vGPU1   |   vGPU2  |
       +---------+----------+----------+
       |  Heaven |  357     |    354   |
       +---------+----------+----------+
      
      After this patch:
       +---------+----------+----------+
       |         |  vGPU1   |   vGPU2  |
       +---------+----------+----------+
       |  Heaven |  397     |    398   |
       +---------+----------+----------+
      
      v2: Protect need_reschedule with the gvt lock.
      Signed-off-by: Ping Gao <ping.a.gao@intel.com>
      Signed-off-by: Weinan Li <weinan.z.li@intel.com>
      Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
    • drm/i915/gvt: implement per-vm mmio switching optimization · 0e86cc9c
      Authored by Changbin Du
      Commit ab9da627906a ("drm/i915: make context status notifier head be
      per engine") gives us a chance to inspect every single request. Then
      we can eliminate unnecessary mmio switching for same vGPU. We only
      need mmio switching for different VMs (including host).
      
      This patch introduced a new general API intel_gvt_switch_mmio() to
      replace the old intel_gvt_load/restore_render_mmio(). This function
      can be further optimized for vGPU to vGPU switching.
      
      To support individual ring switch, we track the owner that occupies
      each ring. When another VM or the host requests a ring, we do the mmio
      context switching. Otherwise there is no need to switch the ring.
      
      This optimization is very useful if only one guest has plenty of
      workloads and the host is mostly idle. In the best case no mmio
      switching happens at all.
      
      v2:
        o fix missing ring switch issue. (chuanxiao)
        o support individual ring switch.
      Signed-off-by: Changbin Du <changbin.du@intel.com>
      Reviewed-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
      Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
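The per-ring owner tracking described in this commit can be sketched as a small user-space model. The names (`toy_gvt`, `toy_request_ring`, `TOY_HOST`, the ring count of 4) are hypothetical and chosen only for illustration; the point marked in the comment is where the commit's intel_gvt_switch_mmio() would be expected to run.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NUM_RINGS 4        /* assumption: ring count is illustrative */
#define TOY_HOST      (-1)     /* owner id for the host; vGPU ids are >= 0 */

/* Hypothetical tracker: who currently owns each ring. */
struct toy_gvt {
    int ring_owner[TOY_NUM_RINGS];
    unsigned int switches;     /* number of mmio switches performed */
};

static void toy_gvt_init(struct toy_gvt *gvt)
{
    for (int ring = 0; ring < TOY_NUM_RINGS; ring++)
        gvt->ring_owner[ring] = TOY_HOST;
    gvt->switches = 0;
}

/* Called when `next_owner` is about to run on `ring`.
 * Returns true if an mmio switch was needed. */
static bool toy_request_ring(struct toy_gvt *gvt, int ring, int next_owner)
{
    assert(ring >= 0 && ring < TOY_NUM_RINGS);

    /* Same owner as before: nothing to save or restore. */
    if (gvt->ring_owner[ring] == next_owner)
        return false;

    /* Owner changed: this is where the mmio state of the previous owner
     * would be saved and the next owner's state loaded. */
    gvt->switches++;
    gvt->ring_owner[ring] = next_owner;
    return true;
}
```

When a single guest keeps submitting to the same ring, every call after the first returns false in this model, which matches the best case the commit message describes.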
  11. 04 May, 2017 1 commit
  12. 28 Apr, 2017 1 commit
    • drm/i915: Sanitize engine context sizes · 63ffbcda
      Authored by Joonas Lahtinen
      Pre-calculate engine context size based on engine class and device
      generation and store it in the engine instance.
      
      v2:
      - Squash and get rid of hw_context_size (Chris)
      
      v3:
      - Move after MMIO init for probing on Gen7 and 8 (Chris)
      - Retained rounding (Tvrtko)
      v4:
      - Rebase for deferred legacy context allocation
      Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
      Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Oscar Mateo <oscar.mateo@intel.com>
      Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
      Cc: intel-gvt-dev@lists.freedesktop.org
      Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
  13. 13 Apr, 2017 1 commit
  14. 29 Mar, 2017 1 commit
  15. 22 Mar, 2017 1 commit
    • drm/i915/gvt: Use force single submit flag to distinguish gvt request from i915 request · bc2d4b62
      Authored by Changbin Du
      My previous commit ab9da627906a ("drm/i915: make context status
      notifier head be per engine") relies on scheduler->current_workload[x]
      to distinguish a gvt special request from an i915 request. But this is
      not always reliable, since there is no synchronization between
      workload_thread and the lrc irq handler.
      
          lrc irq handler               workload_thread
               ----                          ----
        pick i915 requests;
                                      intel_vgpu_submit_execlist();
                                      current_workload[x] = xxx;
        shadow_context_status_change();
      
      Then current_workload[x] is not NULL, but the current request belongs to
      i915 itself. So instead we check the ctx flag
      CONTEXT_FORCE_SINGLE_SUBMISSION: only gvt requests set this flag, and
      they always set it.
      
      v2: Reverse the order of multi-condition 'if' statement.
      
      Fixes: ab9da6279 ("drm/i915: make context status notifier head be per engine")
      Signed-off-by: Changbin Du <changbin.du@intel.com>
      Reviewed-by: Yulei Zhang <yulei.zhang@intel.com>
      Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
  16. 21 Mar, 2017 1 commit
  17. 17 Mar, 2017 6 commits
  18. 06 Mar, 2017 1 commit
    • drm/i915/gvt: handle workload lifecycle properly · 8f1117ab
      Authored by Chuanxiao Dong
      Currently i915 has a request replay mechanism which makes sure a
      request can be replayed after a GPU reset. With this mechanism, gvt
      should wait until the GVT request seqno has passed before completing
      the current workload, so that a context switch interrupt arrives before
      gvt frees the workload. In this way, the workload lifecycle matches the
      i915 request lifecycle: the workload can only be freed after the
      request is completed.
      
      v2: use gvt_dbg_sched instead of gvt_err to print when wait again
      Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
      Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
  19. 23 Feb, 2017 1 commit
  20. 14 Feb, 2017 1 commit
  21. 09 Feb, 2017 1 commit
  22. 09 Jan, 2017 2 commits
  23. 19 Dec, 2016 1 commit
  24. 25 Nov, 2016 1 commit
  25. 14 Nov, 2016 1 commit
    • drm/i915/gvt: fix deadlock in workload_thread · 90d27a1b
      Authored by Pei Zhang
      This is a classic ABBA deadlock involving two mutex objects,
      gvt.lock (a) and drm.struct_mutex (b). The deadlock happens between
      these threads:
      1. intel_gvt_create/destroy_vgpu: P(a)->P(b)
      2. workload_thread: P(b)->P(a)
      
      The fix is to align the lock acquisition order in both threads. This
      patch chooses to adjust the order in the workload_thread function.
      
      This fixes the lockup symptom seen in the guest-reboot stress test.
      
      v2: adjust sequence in workload_thread based on zhenyu's suggestion.
          adjust sequence in create/destroy_vgpu function.
      v3: fix to still require struct_mutex for dispatch_workload()
      Signed-off-by: Pei Zhang <pei.zhang@intel.com>
      [zhenyuw: fix unused variables warnings.]
      Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
  26. 10 Nov, 2016 1 commit
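The ABBA pattern in this commit can be detected with a very small lock-order recorder, sketched below as a single-threaded user-space model. All names (`toy_order`, `toy_acquire`, the two lock ids) are hypothetical, and the "aligned" order used in the usage note, (a) before (b), is an assumption for illustration; the commit message does not spell out which order was finally adopted.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical ids for the two locks named in the commit message. */
enum { TOY_GVT_LOCK = 0, TOY_STRUCT_MUTEX = 1, TOY_NUM_LOCKS = 2 };

/* Records which "held -> acquired" orderings have been observed. */
struct toy_order {
    bool seen[TOY_NUM_LOCKS][TOY_NUM_LOCKS];  /* seen[a][b]: a held while taking b */
    bool held[TOY_NUM_LOCKS];
};

static void toy_order_init(struct toy_order *o)
{
    memset(o, 0, sizeof(*o));
}

/* Returns false if taking `id` now inverts an ordering observed earlier,
 * i.e. the two code paths could deadlock against each other. */
static bool toy_acquire(struct toy_order *o, int id)
{
    bool ok = true;

    assert(id >= 0 && id < TOY_NUM_LOCKS);
    for (int h = 0; h < TOY_NUM_LOCKS; h++) {
        if (!o->held[h])
            continue;
        if (o->seen[id][h])      /* earlier: id before h; now: h before id */
            ok = false;
        o->seen[h][id] = true;
    }
    o->held[id] = true;
    return ok;
}

static void toy_release(struct toy_order *o, int id)
{
    assert(id >= 0 && id < TOY_NUM_LOCKS);
    o->held[id] = false;
}
```

Replaying the two paths from the commit message, (a) then (b) followed by (b) then (a), makes the second inner `toy_acquire` return false; once both paths take the locks in the same order, every call returns true.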