1. 24 Oct 2019, 1 commit
    • drm/i915/execlists: Force preemption · 3a7a92ab
      Committed by Chris Wilson
      If the preempted context takes too long to relinquish control, e.g. it
      is stuck inside a shader with arbitration disabled, evict that context
      with an engine reset. This ensures that preemptions are reasonably
      responsive, providing a tighter QoS for the more important context at
      the cost of flagging unresponsive contexts more frequently (i.e. instead
      of using an ~10s hangcheck, we now evict at ~100ms). The challenge
      lies in picking a timeout that can be reasonably serviced by HW for
      typical workloads, balancing the existing clients against the needs for
      responsiveness.
      
      Note that coupled with timeslicing, this will lead to rapid GPU "hang"
      detection with multiple active contexts vying for GPU time.
      
      The forced preemption mechanism can be compiled out with
      
      	./scripts/config --set-val DRM_I915_PREEMPT_TIMEOUT 0
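      
      To illustrate the mechanism, a minimal, hypothetical sketch follows
      (the sketch_* names are invented for this example and are not the
      upstream implementation): arm a timer when a preemption is issued,
      and if the outgoing context has not yielded before the timeout
      expires, fall back to an engine reset.
      
      #include <linux/timer.h>
      #include <linux/jiffies.h>
      
      #define SKETCH_PREEMPT_TIMEOUT_MS 100 /* ~100ms, per the text above */
      
      struct sketch_engine {
              struct timer_list preempt_timer;
              bool preempt_pending;
      };
      
      static void sketch_reset_engine(struct sketch_engine *engine)
      {
              /* stand-in for the real engine-reset path */
      }
      
      static void sketch_preempt_timeout(struct timer_list *t)
      {
              struct sketch_engine *engine = from_timer(engine, t, preempt_timer);
      
              if (engine->preempt_pending)
                      sketch_reset_engine(engine);
      }
      
      static void sketch_engine_init(struct sketch_engine *engine)
      {
              engine->preempt_pending = false;
              timer_setup(&engine->preempt_timer, sketch_preempt_timeout, 0);
      }
      
      static void sketch_queue_preemption(struct sketch_engine *engine)
      {
              engine->preempt_pending = true;
              mod_timer(&engine->preempt_timer,
                        jiffies + msecs_to_jiffies(SKETCH_PREEMPT_TIMEOUT_MS));
      }
      
      static void sketch_preemption_complete(struct sketch_engine *engine)
      {
              engine->preempt_pending = false;
              del_timer(&engine->preempt_timer);
      }
      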
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20191023133108.21401-2-chris@chris-wilson.co.uk
      3a7a92ab
  2. 22 Oct 2019, 1 commit
  3. 18 Oct 2019, 2 commits
  4. 17 Oct 2019, 1 commit
  5. 16 Oct 2019, 1 commit
  6. 14 Oct 2019, 2 commits
  7. 10 Oct 2019, 1 commit
  8. 08 Oct 2019, 1 commit
    • drm/i915/selftests: Appease lockdep · 1664f35a
      Committed by Chris Wilson
      Disable irqs around updating the context image to keep lockdep happy:
      
      <4>[  673.483340] WARNING: possible irq lock inversion dependency detected
      <4>[  673.483342] 5.4.0-rc1-CI-Trybot_5118+ #1 Tainted: G     U
      <4>[  673.483342] --------------------------------------------------------
      <4>[  673.483343] swapper/2/0 just changed the state of lock:
      <4>[  673.483344] ffff88845db885a0 (&i915_request_get(rq)->submit/1){-...}, at: __i915_sw_fence_complete+0x1b2/0x250 [i915]
      <4>[  673.483387] but this lock took another, HARDIRQ-unsafe lock in the past:
      <4>[  673.483388]  (&ce->pin_mutex/2){+...}
      <4>[  673.483389]
      
                        and interrupts could create inverse lock ordering between them.
      
      <4>[  673.483390]
                        other info that might help us debug this:
      <4>[  673.483390] Chain exists of:
                          &i915_request_get(rq)->submit/1 --> &engine->active.lock --> &ce->pin_mutex/2
      
      <4>[  673.483392]  Possible interrupt unsafe locking scenario:
      
      <4>[  673.483392]        CPU0                    CPU1
      <4>[  673.483393]        ----                    ----
      <4>[  673.483393]   lock(&ce->pin_mutex/2);
      <4>[  673.483394]                                local_irq_disable();
      <4>[  673.483395]                                lock(&i915_request_get(rq)->submit/1);
      <4>[  673.483396]                                lock(&engine->active.lock);
      <4>[  673.483396]   <Interrupt>
      <4>[  673.483397]     lock(&i915_request_get(rq)->submit/1);
      <4>[  673.483398]
                         *** DEADLOCK ***
      
      <4>[  673.483398] 2 locks held by swapper/2/0:
      <4>[  673.483399]  #0: ffff8883f61ac9b0 (&(&gt->irq_lock)->rlock){-.-.}, at: gen11_gt_irq_handler+0x42/0x280 [i915]
      <4>[  673.483433]  #1: ffff88845db8c418 (&(&rq->lock)->rlock){-.-.}, at: intel_engine_breadcrumbs_irq+0x34a/0x5a0 [i915]
      <4>[  673.483463]
                        the shortest dependencies between 2nd lock and 1st lock:
      <4>[  673.483466]   -> (&ce->pin_mutex/2){+...} ops: 614520 {
      <4>[  673.483468]      HARDIRQ-ON-W at:
      <4>[  673.483471]                         lock_acquire+0xa7/0x1c0
      <4>[  673.483501]                         live_unlite_restore+0x1d8/0x6c0 [i915]
      <4>[  673.483543]                         __i915_subtests+0xb8/0x210 [i915]
      <4>[  673.483581]                         __run_selftests+0x112/0x170 [i915]
      <4>[  673.483615]                         i915_live_selftests+0x2c/0x60 [i915]
      <4>[  673.483644]                         i915_pci_probe+0x93/0x1b0 [i915]
      <4>[  673.483646]                         pci_device_probe+0x9e/0x120
      <4>[  673.483648]                         really_probe+0xea/0x420
      <4>[  673.483649]                         driver_probe_device+0x10b/0x120
      <4>[  673.483651]                         device_driver_attach+0x4a/0x50
      <4>[  673.483652]                         __driver_attach+0x97/0x130
      <4>[  673.483653]                         bus_for_each_dev+0x74/0xc0
      <4>[  673.483654]                         bus_add_driver+0x142/0x220
      <4>[  673.483655]                         driver_register+0x56/0xf0
      <4>[  673.483657]                         do_one_initcall+0x58/0x2ff
      <4>[  673.483659]                         do_init_module+0x56/0x1f8
      <4>[  673.483660]                         load_module+0x243e/0x29f0
      <4>[  673.483661]                         __do_sys_finit_module+0xe9/0x110
      <4>[  673.483662]                         do_syscall_64+0x4f/0x210
      <4>[  673.483665]                         entry_SYSCALL_64_after_hwframe+0x49/0xbe
      <4>[  673.483665]      INITIAL USE at:
      <4>[  673.483667]                        lock_acquire+0xa7/0x1c0
      <4>[  673.483698]                        live_unlite_restore+0x1d8/0x6c0 [i915]
      <4>[  673.483733]                        __i915_subtests+0xb8/0x210 [i915]
      <4>[  673.483764]                        __run_selftests+0x112/0x170 [i915]
      <4>[  673.483793]                        i915_live_selftests+0x2c/0x60 [i915]
      <4>[  673.483821]                        i915_pci_probe+0x93/0x1b0 [i915]
      <4>[  673.483822]                        pci_device_probe+0x9e/0x120
      <4>[  673.483824]                        really_probe+0xea/0x420
      <4>[  673.483825]                        driver_probe_device+0x10b/0x120
      <4>[  673.483826]                        device_driver_attach+0x4a/0x50
      <4>[  673.483827]                        __driver_attach+0x97/0x130
      <4>[  673.483828]                        bus_for_each_dev+0x74/0xc0
      <4>[  673.483829]                        bus_add_driver+0x142/0x220
      <4>[  673.483830]                        driver_register+0x56/0xf0
      <4>[  673.483831]                        do_one_initcall+0x58/0x2ff
      <4>[  673.483833]                        do_init_module+0x56/0x1f8
      <4>[  673.483834]                        load_module+0x243e/0x29f0
      <4>[  673.483835]                        __do_sys_finit_module+0xe9/0x110
      <4>[  673.483836]                        do_syscall_64+0x4f/0x210
      <4>[  673.483837]                        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      <4>[  673.483838]    }
      <4>[  673.483868]    ... key      at: [<ffffffffa0a8f132>] __key.70113+0x2/0xffffffffffef2ed0 [i915]
      <4>[  673.483869]    ... acquired at:
      <4>[  673.483935]    __execlists_reset+0xfb/0xc20 [i915]
      <4>[  673.483965]    execlists_reset+0x3d/0x50 [i915]
      <4>[  673.483995]    intel_engine_reset+0xdf/0x230 [i915]
      <4>[  673.484022]    live_preempt_hang+0x1d7/0x2e0 [i915]
      <4>[  673.484064]    __i915_subtests+0xb8/0x210 [i915]
      <4>[  673.484130]    __run_selftests+0x112/0x170 [i915]
      <4>[  673.484163]    i915_live_selftests+0x2c/0x60 [i915]
      <4>[  673.484193]    i915_pci_probe+0x93/0x1b0 [i915]
      <4>[  673.484194]    pci_device_probe+0x9e/0x120
      <4>[  673.484195]    really_probe+0xea/0x420
      <4>[  673.484196]    driver_probe_device+0x10b/0x120
      <4>[  673.484197]    device_driver_attach+0x4a/0x50
      <4>[  673.484198]    __driver_attach+0x97/0x130
      <4>[  673.484199]    bus_for_each_dev+0x74/0xc0
      <4>[  673.484200]    bus_add_driver+0x142/0x220
      <4>[  673.484202]    driver_register+0x56/0xf0
      <4>[  673.484203]    do_one_initcall+0x58/0x2ff
      <4>[  673.484204]    do_init_module+0x56/0x1f8
      <4>[  673.484205]    load_module+0x243e/0x29f0
      <4>[  673.484206]    __do_sys_finit_module+0xe9/0x110
      <4>[  673.484207]    do_syscall_64+0x4f/0x210
      <4>[  673.484208]    entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      <4>[  673.484209]  -> (&engine->active.lock){..-.} ops: 972791 {
      <4>[  673.484211]     IN-SOFTIRQ-W at:
      <4>[  673.484213]                       lock_acquire+0xa7/0x1c0
      <4>[  673.484214]                       _raw_spin_lock_irqsave+0x33/0x50
      <4>[  673.484244]                       execlists_submission_tasklet+0xaf/0x100 [i915]
      <4>[  673.484246]                       tasklet_action_common.isra.18+0x6c/0x1c0
      <4>[  673.484247]                       __do_softirq+0xdf/0x47f
      <4>[  673.484248]                       irq_exit+0xba/0xc0
      <4>[  673.484249]                       do_IRQ+0x83/0x160
      <4>[  673.484250]                       ret_from_intr+0x0/0x1d
      <4>[  673.484252]                       cpuidle_enter_state+0xb2/0x450
      <4>[  673.484253]                       cpuidle_enter+0x24/0x40
      <4>[  673.484254]                       do_idle+0x1e7/0x250
      <4>[  673.484256]                       cpu_startup_entry+0x14/0x20
      <4>[  673.484257]                       start_secondary+0x15f/0x1b0
      <4>[  673.484258]                       secondary_startup_64+0xa4/0xb0
      <4>[  673.484259]     INITIAL USE at:
      <4>[  673.484261]                      lock_acquire+0xa7/0x1c0
      <4>[  673.484290]                      intel_engine_init_active+0x7e/0xb0 [i915]
      <4>[  673.484305]                      intel_engines_setup+0x1cd/0x3b0 [i915]
      <4>[  673.484305]                      i915_gem_init+0x12d/0x900 [i915]
      <4>[  673.484305]                      i915_driver_probe+0xb70/0x15d0 [i915]
      <4>[  673.484305]                      i915_pci_probe+0x43/0x1b0 [i915]
      <4>[  673.484305]                      pci_device_probe+0x9e/0x120
      <4>[  673.484305]                      really_probe+0xea/0x420
      <4>[  673.484305]                      driver_probe_device+0x10b/0x120
      <4>[  673.484305]                      device_driver_attach+0x4a/0x50
      <4>[  673.484305]                      __driver_attach+0x97/0x130
      <4>[  673.484305]                      bus_for_each_dev+0x74/0xc0
      <4>[  673.484305]                      bus_add_driver+0x142/0x220
      <4>[  673.484305]                      driver_register+0x56/0xf0
      <4>[  673.484305]                      do_one_initcall+0x58/0x2ff
      <4>[  673.484305]                      do_init_module+0x56/0x1f8
      <4>[  673.484305]                      load_module+0x243e/0x29f0
      <4>[  673.484305]                      __do_sys_finit_module+0xe9/0x110
      <4>[  673.484305]                      do_syscall_64+0x4f/0x210
      <4>[  673.484305]                      entry_SYSCALL_64_after_hwframe+0x49/0xbe
      <4>[  673.484305]   }
      <4>[  673.484305]   ... key      at: [<ffffffffa0a8f160>] __key.70307+0x0/0xffffffffffef2ea0 [i915]
      <4>[  673.484305]   ... acquired at:
      <4>[  673.484305]    _raw_spin_lock_irqsave+0x33/0x50
      <4>[  673.484305]    execlists_submit_request+0x2b/0x1e0 [i915]
      <4>[  673.484305]    submit_notify+0xa8/0x13c [i915]
      <4>[  673.484305]    __i915_sw_fence_complete+0x81/0x250 [i915]
      <4>[  673.484305]    i915_sw_fence_wake+0x51/0x70 [i915]
      <4>[  673.484305]    __i915_sw_fence_complete+0x1ee/0x250 [i915]
      <4>[  673.484305]    dma_i915_sw_fence_wake+0x1b/0x30 [i915]
      <4>[  673.484305]    dma_fence_signal_locked+0x9e/0x1b0
      <4>[  673.484305]    dma_fence_signal+0x1f/0x40
      <4>[  673.484305]    fence_work+0x28/0x80 [i915]
      <4>[  673.484305]    process_one_work+0x26a/0x620
      <4>[  673.484305]    worker_thread+0x37/0x380
      <4>[  673.484305]    kthread+0x119/0x130
      <4>[  673.484305]    ret_from_fork+0x24/0x50
      
      <4>[  673.484305] -> (&i915_request_get(rq)->submit/1){-...} ops: 857694 {
      <4>[  673.484305]    IN-HARDIRQ-W at:
      <4>[  673.484305]                     lock_acquire+0xa7/0x1c0
      <4>[  673.484305]                     _raw_spin_lock_irqsave_nested+0x39/0x50
      <4>[  673.484305]                     __i915_sw_fence_complete+0x1b2/0x250 [i915]
      <4>[  673.484305]                     intel_engine_breadcrumbs_irq+0x3d0/0x5a0 [i915]
      <4>[  673.484305]                     cs_irq_handler+0x39/0x50 [i915]
      <4>[  673.484305]                     gen11_gt_irq_handler+0x17b/0x280 [i915]
      <4>[  673.484305]                     gen11_irq_handler+0x54/0xf0 [i915]
      <4>[  673.484305]                     __handle_irq_event_percpu+0x41/0x2c0
      <4>[  673.484305]                     handle_irq_event_percpu+0x2b/0x70
      <4>[  673.484305]                     handle_irq_event+0x2f/0x50
      <4>[  673.484305]                     handle_edge_irq+0x99/0x1b0
      <4>[  673.484305]                     do_IRQ+0x7e/0x160
      <4>[  673.484305]                     ret_from_intr+0x0/0x1d
      <4>[  673.484305]                     cpuidle_enter_state+0xb2/0x450
      <4>[  673.484305]                     cpuidle_enter+0x24/0x40
      <4>[  673.484305]                     do_idle+0x1e7/0x250
      <4>[  673.484305]                     cpu_startup_entry+0x14/0x20
      <4>[  673.484305]                     start_secondary+0x15f/0x1b0
      <4>[  673.484305]                     secondary_startup_64+0xa4/0xb0
      <4>[  673.484305]    INITIAL USE at:
      <4>[  673.484305]                    lock_acquire+0xa7/0x1c0
      <4>[  673.484305]                    _raw_spin_lock_irqsave_nested+0x39/0x50
      <4>[  673.484305]                    __i915_sw_fence_complete+0x1b2/0x250 [i915]
      <4>[  673.484305]                    __engine_park+0x233/0x420 [i915]
      <4>[  673.484305]                    ____intel_wakeref_put_last+0x1c/0x70 [i915]
      <4>[  673.484305]                    intel_gt_resume+0x202/0x2c0 [i915]
      <4>[  673.484305]                    i915_gem_init+0x36e/0x900 [i915]
      <4>[  673.484305]                    i915_driver_probe+0xb70/0x15d0 [i915]
      <4>[  673.484305]                    i915_pci_probe+0x43/0x1b0 [i915]
      <4>[  673.484305]                    pci_device_probe+0x9e/0x120
      <4>[  673.484305]                    really_probe+0xea/0x420
      <4>[  673.484305]                    driver_probe_device+0x10b/0x120
      <4>[  673.484305]                    device_driver_attach+0x4a/0x50
      <4>[  673.484305]                    __driver_attach+0x97/0x130
      <4>[  673.484305]                    bus_for_each_dev+0x74/0xc0
      <4>[  673.484305]                    bus_add_driver+0x142/0x220
      <4>[  673.484305]                    driver_register+0x56/0xf0
      <4>[  673.484305]                    do_one_initcall+0x58/0x2ff
      <4>[  673.484305]                    do_init_module+0x56/0x1f8
      <4>[  673.484305]                    load_module+0x243e/0x29f0
      <4>[  673.484305]                    __do_sys_finit_module+0xe9/0x110
      <4>[  673.484305]                    do_syscall_64+0x4f/0x210
      <4>[  673.484305]                    entry_SYSCALL_64_after_hwframe+0x49/0xbe
      <4>[  673.484305]  }
      <4>[  673.484305]  ... key      at: [<ffffffffa0a8f6a1>] __key.80173+0x1/0xffffffffffef2960 [i915]
      <4>[  673.484305]  ... acquired at:
      <4>[  673.484305]    mark_lock+0x382/0x500
      <4>[  673.484305]    __lock_acquire+0x7e1/0x15d0
      <4>[  673.484305]    lock_acquire+0xa7/0x1c0
      <4>[  673.484305]    _raw_spin_lock_irqsave_nested+0x39/0x50
      <4>[  673.484305]    __i915_sw_fence_complete+0x1b2/0x250 [i915]
      <4>[  673.484305]    intel_engine_breadcrumbs_irq+0x3d0/0x5a0 [i915]
      <4>[  673.484305]    cs_irq_handler+0x39/0x50 [i915]
      <4>[  673.484305]    gen11_gt_irq_handler+0x17b/0x280 [i915]
      <4>[  673.484305]    gen11_irq_handler+0x54/0xf0 [i915]
      <4>[  673.484305]    __handle_irq_event_percpu+0x41/0x2c0
      <4>[  673.484305]    handle_irq_event_percpu+0x2b/0x70
      <4>[  673.484305]    handle_irq_event+0x2f/0x50
      <4>[  673.484305]    handle_edge_irq+0x99/0x1b0
      <4>[  673.484305]    do_IRQ+0x7e/0x160
      <4>[  673.484305]    ret_from_intr+0x0/0x1d
      <4>[  673.484305]    cpuidle_enter_state+0xb2/0x450
      <4>[  673.484305]    cpuidle_enter+0x24/0x40
      <4>[  673.484305]    do_idle+0x1e7/0x250
      <4>[  673.484305]    cpu_startup_entry+0x14/0x20
      <4>[  673.484305]    start_secondary+0x15f/0x1b0
      <4>[  673.484305]    secondary_startup_64+0xa4/0xb0
      
      <4>[  673.484305]
                        stack backtrace:
      <4>[  673.484305] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G     U            5.4.0-rc1-CI-Trybot_5118+ #1
      <4>[  673.484305] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3183.A00.1905020411 05/02/2019
      <4>[  673.484305] Call Trace:
      <4>[  673.484305]  <IRQ>
      <4>[  673.484305]  dump_stack+0x67/0x9b
      <4>[  673.484305]  check_usage_forwards+0x13c/0x150
      <4>[  673.484305]  ? mark_lock+0x382/0x500
      <4>[  673.484305]  mark_lock+0x382/0x500
      <4>[  673.484305]  ? check_usage_backwards+0x140/0x140
      <4>[  673.484305]  __lock_acquire+0x7e1/0x15d0
      <4>[  673.484305]  ? debug_object_deactivate+0x17e/0x190
      <4>[  673.484305]  lock_acquire+0xa7/0x1c0
      <4>[  673.484305]  ? __i915_sw_fence_complete+0x1b2/0x250 [i915]
      <4>[  673.484305]  _raw_spin_lock_irqsave_nested+0x39/0x50
      <4>[  673.484305]  ? __i915_sw_fence_complete+0x1b2/0x250 [i915]
      <4>[  673.484305]  __i915_sw_fence_complete+0x1b2/0x250 [i915]
      <4>[  673.484305]  intel_engine_breadcrumbs_irq+0x3d0/0x5a0 [i915]
      <4>[  673.484305]  cs_irq_handler+0x39/0x50 [i915]
      <4>[  673.484305]  gen11_gt_irq_handler+0x17b/0x280 [i915]
      <4>[  673.484305]  gen11_irq_handler+0x54/0xf0 [i915]
      <4>[  673.484305]  __handle_irq_event_percpu+0x41/0x2c0
      <4>[  673.484305]  handle_irq_event_percpu+0x2b/0x70
      <4>[  673.484305]  handle_irq_event+0x2f/0x50
      <4>[  673.484305]  handle_edge_irq+0x99/0x1b0
      <4>[  673.484305]  do_IRQ+0x7e/0x160
      <4>[  673.484305]  common_interrupt+0xf/0xf
      <4>[  673.484305]  </IRQ>
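      
      A minimal sketch of the shape of such a fix (hypothetical sketch_*
      names, not the selftest code itself): perform the context-image
      update with interrupts disabled, so the lock involved is never held
      in a hardirq-unsafe way on one path while also being reachable from
      interrupt context on another.
      
      #include <linux/irqflags.h>
      
      struct sketch_context;
      
      static void sketch_write_context_image(struct sketch_context *ce)
      {
              /* stand-in for rewriting the saved register state */
      }
      
      static void sketch_update_context_image(struct sketch_context *ce)
      {
              unsigned long flags;
      
              local_irq_save(flags);          /* no hardirq can interleave here */
              sketch_write_context_image(ce);
              local_irq_restore(flags);       /* so lockdep sees no inversion */
      }
      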
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20191004203121.31138-1-chris@chris-wilson.co.uk
      1664f35a
  9. 04 Oct 2019, 4 commits
  10. 03 Oct 2019, 1 commit
  11. 28 Sep 2019, 1 commit
  12. 25 Sep 2019, 1 commit
  13. 20 Sep 2019, 1 commit
    • drm/i915: Mark i915_request.timeline as a volatile, rcu pointer · d19d71fc
      Committed by Chris Wilson
      The request->timeline is only valid until the request is retired (i.e.
      before it is completed). Upon retiring the request, the context may be
      unpinned and freed, and along with it the timeline may be freed. We
      therefore need to be very careful when chasing rq->timeline that the
      pointer does not disappear beneath us. The vast majority of users are in
      a protected context, either during request construction or retirement,
      where the timeline->mutex is held and the timeline cannot disappear. It
      is those few off the beaten path (where we access a second timeline) that
      need extra scrutiny -- to be added in the next patch after first adding
      the warnings about dangerous access.
      
      One complication, where we cannot use the timeline->mutex itself, is
      during request submission onto hardware (under spinlocks). Here, we want
      to check on the timeline to finalize the breadcrumb, and so we need to
      impose a second rule to ensure that the request->timeline is indeed
      valid. As we are submitting the request, its context and timeline must
      be pinned, as it will be used by the hardware. Since it is pinned, we
      know the request->timeline must still be valid, and we cannot submit the
      idle barrier until after we release the engine->active.lock, ergo while
      submitting and holding that spinlock, a second thread cannot release the
      timeline.
      
      v2: Don't be lazy inside selftests; hold the timeline->mutex for as long
      as we need it, and tidy up acquiring the timeline with a bit of
      refactoring (i915_active_add_request)
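      
      For those off-the-beaten-path accesses, the rule above amounts to an
      RCU lookup plus revalidation. A simplified, hypothetical sketch (the
      sketch_* types stand in for i915_request/intel_timeline and are not
      the driver's actual helpers):
      
      #include <linux/rcupdate.h>
      #include <linux/kref.h>
      
      struct sketch_timeline {
              struct kref kref;
      };
      
      struct sketch_request {
              struct sketch_timeline __rcu *timeline;
      };
      
      static struct sketch_timeline *
      sketch_get_timeline(struct sketch_request *rq)
      {
              struct sketch_timeline *tl;
      
              rcu_read_lock();
              tl = rcu_dereference(rq->timeline);
              if (tl && !kref_get_unless_zero(&tl->kref))
                      tl = NULL;      /* the timeline was already being freed */
              rcu_read_unlock();
      
              return tl;      /* caller drops the reference when done */
      }
      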
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190919111912.21631-1-chris@chris-wilson.co.uk
      d19d71fc
  14. 13 Sep 2019, 1 commit
  15. 19 Aug 2019, 1 commit
  16. 12 Aug 2019, 1 commit
  17. 08 Aug 2019, 1 commit
  18. 06 Aug 2019, 1 commit
  19. 31 Jul 2019, 1 commit
  20. 16 Jul 2019, 1 commit
  21. 15 Jul 2019, 1 commit
  22. 13 Jul 2019, 1 commit
  23. 10 Jul 2019, 1 commit
  24. 09 Jul 2019, 1 commit
  25. 03 Jul 2019, 1 commit
  26. 20 Jun 2019, 1 commit
  27. 19 Jun 2019, 1 commit
  28. 14 Jun 2019, 1 commit
  29. 11 Jun 2019, 1 commit
  30. 29 May 2019, 1 commit
  31. 28 May 2019, 2 commits
  32. 22 May 2019, 3 commits
    • drm/i915/execlists: Virtual engine bonding · ee113690
      Committed by Chris Wilson
      Some users require that when a master batch is executed on one particular
      engine, a companion batch is run simultaneously on a specific slave
      engine. For this purpose, we introduce virtual engine bonding, allowing
      maps of master:slaves to be constructed to constrain which physical
      engines a virtual engine may select given a fence on a master engine.
      
      For the moment, we continue to ignore the issue of preemption deferring
      the master request for later. Ideally, we would like to then also remove
      the slave and run something else rather than have it stall the pipeline.
      With load balancing, we should be able to move workload around it, but
      there is a similar stall on the master pipeline while it may wait for
      the slave to be executed. At the cost of more latency for the bonded
      request, it may be interesting to launch both on their engines in
      lockstep. (Bubbles abound.)
      
      Opens: Also what about bonding an engine as its own master? It doesn't
      break anything internally, so allow the silliness.
      
      v2: Emancipate the bonds
      v3: Couple in delayed scheduling for the selftests
      v4: Handle invalid mutually exclusive bonding
      v5: Mention what the uapi does
      v6: s/nbond/num_bonds/
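      
      As a hedged illustration of the uapi side, a bond could be described
      with the i915_context_engines_bond extension roughly as below (check
      the field names against your installed i915_drm.h; allocation of the
      trailing engines[] array and the rest of the engine-map setup are
      omitted):
      
      #include <string.h>
      #include <drm/i915_drm.h>
      
      static void sketch_fill_bond(struct i915_context_engines_bond *bond)
      {
              memset(bond, 0, sizeof(*bond));
      
              bond->base.name = I915_CONTEXT_ENGINES_EXT_BOND;
              bond->master.engine_class = I915_ENGINE_CLASS_VIDEO;
              bond->master.engine_instance = 0;  /* master fence on vcs0 */
              bond->virtual_index = 1;           /* slot of the bonded virtual engine */
              bond->num_bonds = 1;               /* one allowed sibling follows in engines[] */
      }
      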
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-9-chris@chris-wilson.co.uk
      ee113690
    • drm/i915: Apply an execution_mask to the virtual_engine · 78e41ddd
      Committed by Chris Wilson
      Allow the user to direct which physical engines of the virtual engine
      they wish to execute on, as sometimes it is necessary to override the
      load balancing algorithm.
      
      v2: Only kick the virtual engines on context-out if required
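      
      Conceptually, the override can be pictured as below: a hypothetical
      sketch (invented sketch_* names, not the driver's code) of a virtual
      engine only considering siblings whose bit is set in the request's
      execution_mask.
      
      #include <linux/bits.h>
      
      struct sketch_phys_engine {
              unsigned int id;        /* engine instance number */
      };
      
      struct sketch_virtual_engine {
              unsigned int num_siblings;
              struct sketch_phys_engine *siblings[8];
      };
      
      static struct sketch_phys_engine *
      sketch_pick_sibling(const struct sketch_virtual_engine *ve,
                          unsigned long execution_mask)
      {
              unsigned int n;
      
              for (n = 0; n < ve->num_siblings; n++) {
                      if (execution_mask & BIT(ve->siblings[n]->id))
                              return ve->siblings[n]; /* allowed by the mask */
              }
      
              return NULL;    /* the mask excluded every sibling */
      }
      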
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-7-chris@chris-wilson.co.uk
      78e41ddd
    • drm/i915: Load balancing across a virtual engine · 6d06779e
      Committed by Chris Wilson
      Having allowed the user to define a set of engines that they will want
      to only use, we go one step further and allow them to bind those engines
      into a single virtual instance. Submitting a batch to the virtual engine
      will then forward it to any one of the set in a manner as best to
      distribute load.  The virtual engine has a single timeline across all
      engines (it operates as a single queue), so it is not able to concurrently
      run batches across multiple engines by itself; that is left up to the user
      to submit multiple concurrent batches to multiple queues. Multiple users
      will be load balanced across the system.
      
      The mechanism used for load balancing in this patch is a late greedy
      balancer. When a request is ready for execution, it is added to each
      engine's queue, and when an engine is ready for its next request it
      claims it from the virtual engine. The first engine to do so wins, i.e.
      the request is executed at the earliest opportunity (idle moment) in the
      system.
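      
      The claim itself can be sketched as a simple atomic race (a
      hypothetical sketch, not the driver's actual queue handling): the
      same virtual request is visible to every sibling, and the first
      physical engine to flip the flag wins.
      
      #include <linux/atomic.h>
      
      struct sketch_vrequest {
              atomic_t claimed;       /* 0 = still up for grabs, 1 = taken */
      };
      
      static bool sketch_claim(struct sketch_vrequest *rq)
      {
              /* the first sibling to move 0 -> 1 executes the request */
              return atomic_cmpxchg(&rq->claimed, 0, 1) == 0;
      }
      
      static void sketch_engine_dequeue(struct sketch_vrequest *rq)
      {
              if (!sketch_claim(rq))
                      return; /* another physical engine already won the race */
      
              /* ... submit the request to this engine's ELSP ... */
      }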
      
      As not all HW is created equal, the user is still able to skip the
      virtual engine and execute the batch on a specific engine, all within the
      same queue. It will then be executed in order on the correct engine,
      with execution on other virtual engines being moved away due to the load
      detection.
      
      A couple of areas for potential improvement left!
      
      - The virtual engine always takes priority over equal-priority tasks.
      Mostly broken up by applying FQ_CODEL rules for prioritising new clients,
      and hopefully the virtual and real engines are not then congested (i.e.
      all work is via virtual engines, or all work is to the real engine).
      
      - We require the breadcrumb irq around every virtual engine request. For
      normal engines, we eliminate the need for the slow round trip via
      interrupt by using the submit fence and queueing in order. For virtual
      engines, we have to allow any job to transfer to a new ring, and cannot
      coalesce the submissions, so require the completion fence instead,
      forcing the persistent use of interrupts.
      
      - We only drip feed single requests through each virtual engine and onto
      the physical engines, even if there was enough work to fill all ELSP,
      leaving small stalls with an idle CS event at the end of every request.
      Could we be greedy and fill both slots? Being lazy is virtuous for load
      distribution on less-than-full workloads though.
      
      Other areas of improvement are more general, such as reducing lock
      contention, reducing dispatch overhead, looking at direct submission
      rather than bouncing around tasklets etc.
      
      sseu: Lift the restriction to allow sseu to be reconfigured on virtual
      engines composed of RENDER_CLASS (rcs).
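      
      As a hedged userspace illustration of this interface (verify the
      macro and field names against your installed i915_drm.h), a 2-wide
      virtual engine built from vcs0 and vcs1 could be installed as slot 0
      of a context's engine map roughly like this:
      
      #include <stdint.h>
      #include <sys/ioctl.h>
      #include <drm/i915_drm.h>
      
      static int sketch_create_virtual_engine(int drm_fd, uint32_t ctx_id)
      {
              I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(balance, 2) = {
                      .base.name = I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE,
                      .engine_index = 0,      /* which map slot becomes the virtual engine */
                      .num_siblings = 2,
                      .engines = {
                              { I915_ENGINE_CLASS_VIDEO, 0 },  /* vcs0 */
                              { I915_ENGINE_CLASS_VIDEO, 1 },  /* vcs1 */
                      },
              };
              I915_DEFINE_CONTEXT_PARAM_ENGINES(engines, 1) = {
                      .extensions = (uintptr_t)&balance,
                      /* slot 0 is a placeholder, filled in by the extension above */
                      .engines = { { I915_ENGINE_CLASS_INVALID,
                                     I915_ENGINE_CLASS_INVALID_NONE } },
              };
              struct drm_i915_gem_context_param param = {
                      .ctx_id = ctx_id,
                      .param = I915_CONTEXT_PARAM_ENGINES,
                      .size = sizeof(engines),
                      .value = (uintptr_t)&engines,
              };
      
              return ioctl(drm_fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &param);
      }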
      
      v2: macroize check_user_mbz()
      v3: Cancel virtual engines on wedging
      v4: Commence commenting
      v5: Replace 64b sibling_mask with a list of class:instance
      v6: Drop the one-element array in the uabi
      v7: Assert it is a virtual engine in to_virtual_engine()
      v8: Skip over holes in [class][inst] so we can selftest with (vcs0, vcs2)
      
      Link: https://github.com/intel/media-driver/pull/283
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-6-chris@chris-wilson.co.uk
      6d06779e