1. 27 Apr 2019 (1 commit)
  2. 25 Apr 2019 (3 commits)
    • drm/i915: Invert the GEM wakeref hierarchy · 79ffac85
      Committed by Chris Wilson
      In the current scheme, on submitting a request we take a single global
      GEM wakeref, which trickles down to wake up all GT power domains. This
      is undesirable as we would like to be able to localise our power
      management to the available power domains and to remove the global GEM
      operations from the heart of the driver. (The intent there is to push
      global GEM decisions to the boundary as used by the GEM user interface.)
      
      Now during request construction, each request is responsible via its
      logical context to acquire a wakeref on each power domain it intends to
      utilize. Currently, each request takes a wakeref on the engine(s) and
      the engines themselves take a chipset wakeref. This gives us a
      transition on each engine which we can extend if we want to insert more
      power management control (such as soft rc6). The global GEM operations
      that currently require a struct_mutex are reduced to listening to pm
      events from the chipset GT wakeref. As we reduce the struct_mutex
      requirement, these listeners should evaporate.
      
      Perhaps the biggest immediate change is that this removes the
      struct_mutex requirement around GT power management, allowing us greater
      flexibility in request construction. Another important knock-on effect
      is that by tracking engine usage, we can insert a switch back to the
      kernel context on that engine immediately, avoiding any extra delay or
      inserting global synchronisation barriers. This makes tracking when an
      engine and its associated contexts are idle much easier -- important for
      when we forgo our assumed execution ordering and need idle barriers to
      unpin used contexts. In the process, it means we remove a large chunk of
      code whose only purpose was to switch back to the kernel context.
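      
      As a sketch of the inverted hierarchy (stand-in names and simplified
      counters, not the driver's actual intel_engine_pm/intel_gt_pm API):
      
          /* Sketch: each request pins its engine, and the first engine to
           * wake pins the chipset (GT), instead of a single global GEM
           * wakeref trickling down. */
          #include <stdatomic.h>
          
          struct gt_pm { atomic_int wakeref; };
          struct engine_pm { atomic_int wakeref; struct gt_pm *gt; };
          
          static void gt_pm_get(struct gt_pm *gt)
          {
                  if (atomic_fetch_add(&gt->wakeref, 1) == 0) {
                          /* 0 -> 1: wake the chipset power domains */
                  }
          }
          
          static void engine_pm_get(struct engine_pm *engine)
          {
                  /* the 0 -> 1 transition is the point where extra control
                   * (such as soft rc6) could later be inserted */
                  if (atomic_fetch_add(&engine->wakeref, 1) == 0)
                          gt_pm_get(engine->gt);
          }
          
          /* Request construction pins each engine its context will use. */
          static void request_construct(struct engine_pm *engine)
          {
                  engine_pm_get(engine);
          }
      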
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Imre Deak <imre.deak@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190424200717.1686-5-chris@chris-wilson.co.uk
    • drm/i915: Introduce context->enter() and context->exit() · 6eee33e8
      Committed by Chris Wilson
      We wish to start segregating the power management into different control
      domains, both with respect to the hardware and the user interface. The
      first step is that, at the lowest level of the request flow, we want to
      process a context event (and not a global GEM operation). In this patch,
      we introduce the context callbacks that in future patches will be
      redirected to per-engine interfaces leading to global operations as
      required.
      
      The intent is that this will be guarded by the timeline->mutex, except
      that retiring has not quite finished transitioning over from being
      guarded by struct_mutex. So at the moment it is protected by
      struct_mutex, with a reminder to switch.
      
      v2: Rename default handlers to intel_context_enter_engine.
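      
      A hedged sketch of the callback shape this introduces (field and helper
      names follow the description above and v2's intel_context_enter_engine
      naming, but may differ from the tree):
      
          struct intel_context;
          
          struct intel_context_ops {
                  void (*enter)(struct intel_context *ce);
                  void (*exit)(struct intel_context *ce);
          };
          
          struct intel_context {
                  /* guarded by timeline->mutex (struct_mutex for now) */
                  unsigned int active_count;
                  const struct intel_context_ops *ops;
          };
          
          /* First user in fires ->enter(); last user out fires ->exit(). */
          static inline void intel_context_enter(struct intel_context *ce)
          {
                  if (!ce->active_count++)
                          ce->ops->enter(ce);
          }
          
          static inline void intel_context_exit(struct intel_context *ce)
          {
                  if (!--ce->active_count)
                          ce->ops->exit(ce);
          }
      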
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190424200717.1686-3-chris@chris-wilson.co.uk
    • drm/i915: Move GraphicsTechnology files under gt/ · 112ed2d3
      Committed by Chris Wilson
      Start partitioning off the code that talks to the hardware (GT) from the
      uapi layers and move the device-facing code under gt/.
      
      One casualty is s/intel_ringbuffer.h/intel_engine.h/ with the plan to
      subdivide that header and body further (and split out the submission
      code from the ringbuffer and logical context handling). This patch aims
      to be simple motion so git can fixup inflight patches with little mess.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Acked-by: Jani Nikula <jani.nikula@intel.com>
      Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190424174839.7141-1-chris@chris-wilson.co.uk
  3. 24 Apr 2019 (1 commit)
  4. 12 Apr 2019 (3 commits)
    • drm/i915: Flush the CSB pointer reset · 0edda1d6
      Committed by Chris Wilson
      The HW resets its CSB tail pointer on resetting the engine. Most of the
      time. In case it doesn't (and for system resume) we write the expected
      value anyway. For extra paranoia, flush the write before we invalidate
      the cacheline.
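      
      The ordering, as a small kernel-style sketch (the helper names are
      illustrative; wmb() is the kernel's generic write barrier):
      
          void invalidate_csb_entries(void); /* assumed: clflushes the CSB */
          
          static void reset_csb_pointers_sketch(u32 *csb_write, u32 reset_value)
          {
                  *csb_write = reset_value;  /* the value HW should have left */
                  wmb();                     /* flush the write ... */
                  invalidate_csb_entries();  /* ... before dropping the line */
          }
      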
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190412110159.10495-1-chris@chris-wilson.co.uk
    • drm/i915/execlists: Always reset the context's RING registers · 1863e302
      Committed by Chris Wilson
      During reset, we try to stop the active ring. This has the consequence
      that we often clobber the RING registers within the context image. When
      we find an active request, we update the context image to rerun that
      request (if it was guilty, we replace the hanging user payload with
      NOPs). However, we were ignoring an active context if the request had
      completed, with the consequence that the next submission on that context
      would start with RING_HEAD==0 and not the tail of the previous request,
      causing all requests still in the ring to be rerun. Rare, but
      occasionally seen within CI, where we would spot that the context seqno
      would reverse and complain that we were retiring an incomplete request.
      The fix is sketched after the trace below.
      
          <0> [412.390350]   <idle>-0       3d.s2 408373352us : __i915_request_submit: rcs0 fence 1e95b:3640 -> current 3638
          <0> [412.390350]   <idle>-0       3d.s2 408373353us : __i915_request_submit: rcs0 fence 1e95b:3642 -> current 3638
          <0> [412.390350]   <idle>-0       3d.s2 408373354us : __i915_request_submit: rcs0 fence 1e95b:3644 -> current 3638
          <0> [412.390350]   <idle>-0       3d.s2 408373354us : __i915_request_submit: rcs0 fence 1e95b:3646 -> current 3638
          <0> [412.390350]   <idle>-0       3d.s2 408373356us : __execlists_submission_tasklet: rcs0 in[0]:  ctx=2.1, fence 1e95b:3646 (current 3638), prio=4
          <0> [412.390350] i915_sel-4613    0.... 408373374us : __i915_request_commit: rcs0 fence 1e95b:3648
          <0> [412.390350] i915_sel-4613    0d..1 408373377us : process_csb: rcs0 cs-irq head=2, tail=3
          <0> [412.390350] i915_sel-4613    0d..1 408373377us : process_csb: rcs0 csb[3]: status=0x00000001:0x00000000, active=0x1
          <0> [412.390350] i915_sel-4613    0d..1 408373378us : __i915_request_submit: rcs0 fence 1e95b:3648 -> current 3638
          <0> [412.390350]   <idle>-0       3..s1 408373378us : execlists_submission_tasklet: rcs0 awake?=1, active=5
          <0> [412.390350] i915_sel-4613    0d..1 408373379us : __execlists_submission_tasklet: rcs0 in[0]:  ctx=2.2, fence 1e95b:3648 (current 3638), prio=4
          <0> [412.390350] i915_sel-4613    0.... 408373381us : i915_reset_engine: rcs0 flags=4
          <0> [412.390350] i915_sel-4613    0.... 408373382us : execlists_reset_prepare: rcs0: depth<-0
          <0> [412.390350]   <idle>-0       3d.s2 408373390us : process_csb: rcs0 cs-irq head=3, tail=4
          <0> [412.390350]   <idle>-0       3d.s2 408373390us : process_csb: rcs0 csb[4]: status=0x00008002:0x00000002, active=0x1
          <0> [412.390350]   <idle>-0       3d.s2 408373390us : process_csb: rcs0 out[0]: ctx=2.2, fence 1e95b:3648 (current 3640), prio=4
          <0> [412.390350] i915_sel-4613    0.... 408373401us : intel_engine_stop_cs: rcs0
          <0> [412.390350] i915_sel-4613    0d..1 408373402us : process_csb: rcs0 cs-irq head=4, tail=4
          <0> [412.390350] i915_sel-4613    0.... 408373403us : intel_gpu_reset: engine_mask=1
          <0> [412.390350] i915_sel-4613    0d..1 408373408us : execlists_cancel_port_requests: rcs0:port0 fence 1e95b:3648, (current 3648)
          <0> [412.390350] i915_sel-4613    0.... 408373442us : intel_engine_cancel_stop_cs: rcs0
          <0> [412.390350] i915_sel-4613    0.... 408373442us : execlists_reset_finish: rcs0: depth->0
          <0> [412.390350] ksoftirq-26      3..s. 408373442us : execlists_submission_tasklet: rcs0 awake?=1, active=0
          <0> [412.390350] ksoftirq-26      3d.s1 408373443us : process_csb: rcs0 cs-irq head=5, tail=5
          <0> [412.390350] i915_sel-4613    0.... 408373475us : i915_request_retire: rcs0 fence 1e95b:3640, current 3648
          <0> [412.390350] i915_sel-4613    0.... 408373476us : i915_request_retire: __retire_engine_request(rcs0) fence 1e95b:3640, current 3648
          <0> [412.390350] i915_sel-4613    0.... 408373494us : __i915_request_commit: rcs0 fence 1e95b:3650
          <0> [412.390350] i915_sel-4613    0d..1 408373496us : process_csb: rcs0 cs-irq head=5, tail=5
          <0> [412.390350] i915_sel-4613    0d..1 408373496us : __i915_request_submit: rcs0 fence 1e95b:3650 -> current 3648
          <0> [412.390350] i915_sel-4613    0d..1 408373498us : __execlists_submission_tasklet: rcs0 in[0]:  ctx=2.1, fence 1e95b:3650 (current 3648), prio=6
          <0> [412.390350] i915_sel-4613    0.... 408373500us : i915_request_retire_upto: rcs0 fence 1e95b:3648, current 3648
          <0> [412.390350] i915_sel-4613    0.... 408373500us : i915_request_retire: rcs0 fence 1e95b:3642, current 3648
          <0> [412.390350] i915_sel-4613    0.... 408373501us : i915_request_retire: __retire_engine_request(rcs0) fence 1e95b:3642, current 3648
          <0> [412.390350] i915_sel-4613    0.... 408373514us : i915_request_retire: rcs0 fence 1e95b:3644, current 3648
          <0> [412.390350] i915_sel-4613    0.... 408373515us : i915_request_retire: __retire_engine_request(rcs0) fence 1e95b:3644, current 3648
          <0> [412.390350] i915_sel-4613    0.... 408373527us : i915_request_retire: rcs0 fence 1e95b:3646, current 3640
          <0> [412.390350]   <idle>-0       3..s1 408373569us : execlists_submission_tasklet: rcs0 awake?=1, active=1
          <0> [412.390350]   <idle>-0       3d.s2 408373569us : process_csb: rcs0 cs-irq head=5, tail=1
          <0> [412.390350]   <idle>-0       3d.s2 408373570us : process_csb: rcs0 csb[0]: status=0x00000001:0x00000000, active=0x1
          <0> [412.390350]   <idle>-0       3d.s2 408373570us : process_csb: rcs0 csb[1]: status=0x00000018:0x00000002, active=0x5
          <0> [412.390350]   <idle>-0       3d.s2 408373570us : process_csb: rcs0 out[0]: ctx=2.1, fence 1e95b:3650 (current 3650), prio=6
          <0> [412.390350]   <idle>-0       3d.s2 408373571us : process_csb: rcs0 completed ctx=2
          <0> [412.390350] i915_sel-4613    0.... 408373621us : i915_request_retire: i915_request_retire:253 GEM_BUG_ON(!i915_request_completed(request))
      
      v2: Fixup the cancellation path to drain the CSB and reset the pointers.
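      
      A sketch of the fix (CTX_* indices and the ring structure are simplified
      placeholders for the context-image register state):
      
          /* On reset, rewrite the RING_* registers saved in the context image
           * even when the last request completed, so the context resumes from
           * the true ring position instead of RING_HEAD==0. */
          enum { CTX_RING_HEAD = 1, CTX_RING_TAIL = 2 }; /* placeholder slots */
          
          struct ring_pos { unsigned int head, tail; };
          
          static void reset_ring_regs_sketch(unsigned int *regs,
                                             const struct ring_pos *ring)
          {
                  regs[CTX_RING_HEAD] = ring->head; /* true resume point */
                  regs[CTX_RING_TAIL] = ring->tail; /* don't rerun the ring */
          }
      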
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190411130515.20716-2-chris@chris-wilson.co.uk
    • drm/i915/guc: Implement reset locally · 292ad25c
      Committed by Chris Wilson
      Before causing guc and execlists to diverge further (breaking guc in the
      process), take a copy of the current reset procedure and make it local to
      the guc submission backend.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190411130515.20716-1-chris@chris-wilson.co.uk
  5. 11 Apr 2019 (3 commits)
  6. 05 Apr 2019 (2 commits)
    • drm/i915: Make RING_PDP relative to engine->mmio_base · 6d425728
      Committed by Chris Wilson
      The PDP registers are an oddity inside the set of context saved
      registers in that they take the engine as a parameter to the macro and
      not the mmio_base as the others do. Make it accept the engine->mmio_base
      for consistency in programming the context registers.
      
      add/remove: 0/0 grow/shrink: 2/1 up/down: 3/-32 (-29)
      Function                                     old     new   delta
      emit_ppgtt_update                            324     326      +2
      capture                                     5102    5103      +1
      execlists_init_reg_state.isra               1128    1096     -32
      
      And similar savings later!
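      
      For illustration, the shape of the change (the 0x270 offset is as found
      in i915_reg.h; verify against the tree):
      
          /* Before: the macro took the engine and reached into mmio_base:
           *   #define GEN8_RING_PDP_UDW(engine, n) \
           *           _MMIO((engine)->mmio_base + 0x270 + (n) * 8 + 4)
           * After: it takes mmio_base directly, like the other context regs:
           */
          #define GEN8_RING_PDP_UDW(base, n) _MMIO((base) + 0x270 + (n) * 8 + 4)
          #define GEN8_RING_PDP_LDW(base, n) _MMIO((base) + 0x270 + (n) * 8)
      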
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190405123831.9724-1-chris@chris-wilson.co.uk
    • drm/i915/execlists: Enable coarse preemption boundaries for gen8 · bac24f59
      Committed by Chris Wilson
      When we introduced preemption, we chose to keep it disabled for gen8 as
      supporting preemption inside GPGPU user batches required various w/a in
      userspace. Since then, the desire to preempt long queues of requests
      between batches (e.g. within busywaiting semaphores) has grown. So allow
      arbitration within the busywaits and between requests, but disable
      arbitration within user batches so that we can preempt between requests
      and not risk breaking GPGPU.
      
      However, since this preemption is much coarser and doesn't interfere
      with userspace, we decline to include it amongst the scheduler
      capabilities. (This is also required for us to skip over the preemption
      selftests that expect to be able to preempt user batches.)
      
      Michal suggested that we could perhaps allow preemption inside gen8
      userspace batches if we can satisfy ourselves that the default
      preemption settings are viable with existing userspace (principally
      OpenCL which already should carry any known workaround). We could then
      merge the two code paths back into one, even dropping the artificial
      has-preemption device feature flag.
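      
      Sketched below, the implied batch-buffer emission: MI_ARB_ON_OFF
      brackets the user batch so that arbitration points occur between
      requests, while the user payload (carrying no arbitration points of its
      own) is never preempted mid-batch. Treat the exact flag encodings as
      assumptions to check against the tree:
      
          static u32 *gen8_emit_bb_start_sketch(u32 *cs, u64 offset)
          {
                  *cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE; /* arbitration on */
          
                  *cs++ = MI_BATCH_BUFFER_START_GEN8;
                  *cs++ = lower_32_bits(offset);
                  *cs++ = upper_32_bits(offset);
          
                  *cs++ = MI_ARB_ON_OFF | MI_ARB_DISABLE; /* off again until
                                                           * the next request
                                                           * re-enables it */
                  *cs++ = MI_NOOP;
                  return cs;
          }
      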
      
      Testcase: igt/gem_exec_scheduler/semaphore-user
      References: beecec90 ("drm/i915/execlists: Preemption!")
      Fixes: e8861964 ("drm/i915: Use HW semaphores for inter-engine synchronisation on gen8+")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Michal Winiarski <michal.winiarski@intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Michal Winiarski <michal.winiarski@intel.com> #irc
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190329134024.5254-1-chris@chris-wilson.co.uk
  7. 27 Mar 2019 (2 commits)
  8. 22 Mar 2019 (2 commits)
    • drm/i915: Allow contexts to share a single timeline across all engines · ea593dbb
      Committed by Chris Wilson
      Previously, our view has been always to run the engines independently
      within a context. (Multiple engines happened before we had contexts and
      timelines, so they always operated independently and that behaviour
      persisted into contexts.) However, at the user level the context often
      represents a single timeline (e.g. GL contexts) and userspace must
      ensure that the individual engines are serialised to present that
      ordering to the client (or forget about this detail entirely and hope no
      one notices - a fair ploy if the client can only directly control one
      engine themselves ;)
      
      In the next patch, we will want to construct a set of engines that
      operate as one, that have a single timeline interwoven between them, to
      present a single virtual engine to the user. (They submit to the virtual
      engine, then we decide which engine to execute on based on load.)
      
      To that end, we want to be able to create contexts which have a single
      timeline (fence context) shared between all engines, rather than multiple
      timelines.
      
      v2: Move the specialised timeline ordering to its own function.
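      
      From userspace this surfaces as a context-create flag; a sketch assuming
      the I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE flag and the
      CONTEXT_CREATE_EXT ioctl from this series (error handling elided):
      
          #include <stdint.h>
          #include <sys/ioctl.h>
          #include <drm/i915_drm.h> /* uAPI header; include path may vary */
          
          /* Create a context whose engines are serialised on one timeline. */
          static uint32_t create_single_timeline_ctx(int i915_fd)
          {
                  struct drm_i915_gem_context_create_ext arg = {
                          .flags = I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE,
                  };
          
                  if (ioctl(i915_fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT, &arg))
                          return 0; /* kernel too old or flag rejected */
                  return arg.ctx_id;
          }
      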
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190322092325.5883-4-chris@chris-wilson.co.uk
    • drm/i915: Flush pages on acquisition · a679f58d
      Committed by Chris Wilson
      When we return pages to the system, we ensure that they are marked as
      being in the CPU domain since any external access is uncontrolled and we
      must assume the worst. This means that we need to always flush the pages
      on acquisition if we need to use them on the GPU, and from the beginning
      have used set-domain. Set-domain is overkill for the purpose as it is a
      general synchronisation barrier, but our intent is to only flush the
      pages being swapped in. If we move that flush into the pages acquisition
      phase, we know then that when we have obj->mm.pages, they are coherent
      with the GPU and need only maintain that status without resorting to
      heavy handed use of set-domain.
      
      The principal knock-on effect for userspace is through mmap-gtt
      pagefaulting. Our uAPI has always implied that the GTT mmap was async
      (especially as when any pagefault occurs is unpredictable to userspace)
      and so userspace had to apply explicit domain control itself
      (set-domain). However, swapping is transparent to the kernel, and so on
      first fault we need to acquire the pages and make them coherent for
      access through the GTT. Our use of set-domain here leaks into the uABI
      that the first pagefault was synchronous. This is unintentional and,
      barring a few igt, should go unnoticed; nevertheless we bump the uABI
      version for mmap-gtt to reflect the change in behaviour.
      
      Another implication of the change is that gem_create() is presumed to
      create an object that is coherent with the CPU and is in the CPU write
      domain, so a set-domain(CPU) following a gem_create() would be a minor
      operation that merely checked whether we could allocate all pages for
      the object. On applying this change, a set-domain(CPU) causes a clflush
      as we acquire the pages. This will have a small impact on mesa as we move
      the clflush here on !llc from execbuf time to create, but that should
      have minimal performance impact as the same clflush exists but is now
      done early and because of the clflush issue, userspace recycles bo and
      so should resist allocating fresh objects.
      
      Internally, the presumption that objects are created in the CPU
      write-domain and remain so through writes to obj->mm.mapping is more
      prevalent than I expected; but easy enough to catch and apply a manual
      flush.
      
      For the future, we should push the page flush from the central
      set_pages() into the callers so that we can more finely control when it
      is applied, but for now doing it in one location is easier to validate, at
      the cost of sometimes flushing when there is no need.
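      
      A sketch of the acquisition-time flush (simplified stand-ins for the
      driver's set_pages path; drm_clflush_sg() is the DRM helper):
      
          struct obj_sketch { struct sg_table *pages; };
          
          static void set_pages_sketch(struct obj_sketch *obj,
                                       struct sg_table *pages, bool has_llc)
          {
                  obj->pages = pages;
          
                  /* On !llc, flush once on acquisition so the pages can be
                   * assumed GPU-coherent from here on, with no later
                   * set-domain barrier needed. */
                  if (!has_llc)
                          drm_clflush_sg(pages);
          }
      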
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Matthew Auld <matthew.william.auld@gmail.com>
      Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
      Cc: Antonio Argenziano <antonio.argenziano@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190321161908.8007-1-chris@chris-wilson.co.uk
  9. 21 Mar 2019 (2 commits)
  10. 19 Mar 2019 (4 commits)
  11. 15 Mar 2019 (2 commits)
    • drm/i915: Always kick the execlists tasklet after reset · 41a1bde3
      Committed by Chris Wilson
      With direct submission disabled while the reset is in progress, we have
      a small window where we may forgo the submission of a new request and
      not notice its addition during execlists_reset_finish. To close this
      window, always schedule the submission tasklet on coming out of reset to
      catch any residual work; a sketch follows the log below.
      
      <6> [333.144082] i915: Running intel_hangcheck_live_selftests/igt_reset_engines
      <3> [333.296927] i915_reset_engine(rcs0:idle): failed to idle after reset
      <6> [333.296932] i915 0000:00:02.0: [drm] rcs0
      <6> [333.296934] i915 0000:00:02.0: [drm] 	Hangcheck 0:a9ddf7a5 [4157 ms]
      <6> [333.296936] i915 0000:00:02.0: [drm] 	Reset count: 36048 (global 754)
      <6> [333.296938] i915 0000:00:02.0: [drm] 	Requests:
      <6> [333.296997] i915 0000:00:02.0: [drm] 	RING_START: 0x00000000
      <6> [333.296999] i915 0000:00:02.0: [drm] 	RING_HEAD:  0x00000000
      <6> [333.297001] i915 0000:00:02.0: [drm] 	RING_TAIL:  0x00000000
      <6> [333.297003] i915 0000:00:02.0: [drm] 	RING_CTL:   0x00000000
      <6> [333.297005] i915 0000:00:02.0: [drm] 	RING_MODE:  0x00000200 [idle]
      <6> [333.297007] i915 0000:00:02.0: [drm] 	RING_IMR: fffffeff
      <6> [333.297010] i915 0000:00:02.0: [drm] 	ACTHD:  0x00000000_00000000
      <6> [333.297012] i915 0000:00:02.0: [drm] 	BBADDR: 0x00000000_00000000
      <6> [333.297015] i915 0000:00:02.0: [drm] 	DMA_FADDR: 0x00000000_00000000
      <6> [333.297017] i915 0000:00:02.0: [drm] 	IPEIR: 0x00000000
      <6> [333.297019] i915 0000:00:02.0: [drm] 	IPEHR: 0x00000000
      <6> [333.297021] i915 0000:00:02.0: [drm] 	Execlist status: 0x00000001 00000000
      <6> [333.297023] i915 0000:00:02.0: [drm] 	Execlist CSB read 5, write 5 [mmio:7], tasklet queued? no (enabled)
      <6> [333.297025] i915 0000:00:02.0: [drm] 		ELSP[0] idle
      <6> [333.297027] i915 0000:00:02.0: [drm] 		ELSP[1] idle
      <6> [333.297028] i915 0000:00:02.0: [drm] 		HW active? 0x0
      <6> [333.297044] i915 0000:00:02.0: [drm] 		Queue priority hint: -8186
      <6> [333.297067] i915 0000:00:02.0: [drm] 		Q  2afac:5f2+  prio=-8186 @ 50ms: (null)
      <6> [333.297068] i915 0000:00:02.0: [drm] HWSP:
      <6> [333.297071] i915 0000:00:02.0: [drm] [0000] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      <6> [333.297073] i915 0000:00:02.0: [drm] *
      <6> [333.297075] i915 0000:00:02.0: [drm] [0040] 00000001 00000000 00000018 00000002 00000001 00000000 00000018 00000000
      <6> [333.297077] i915 0000:00:02.0: [drm] [0060] 00000001 00000000 00008002 00000002 00000000 00000000 00000000 00000005
      <6> [333.297079] i915 0000:00:02.0: [drm] [0080] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      <6> [333.297081] i915 0000:00:02.0: [drm] *
      <6> [333.297083] i915 0000:00:02.0: [drm] [00c0] 00000000 00000000 00000000 00000000 a9ddf7a5 00000000 00000000 00000000
      <6> [333.297085] i915 0000:00:02.0: [drm] [00e0] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      <6> [333.297087] i915 0000:00:02.0: [drm] *
      <6> [333.297089] i915 0000:00:02.0: [drm] Idle? no
      <6> [333.297090] i915_reset_engine(rcs0:idle): 3000 resets
      <3> [333.297092] i915/intel_hangcheck_live_selftests: igt_reset_engines failed with error -5
      <3> [333.455460] i915 0000:00:02.0: Failed to idle engines, declaring wedged!
      ...
      <0> [333.491294] i915_sel-4916    1.... 333262143us : i915_reset_engine: rcs0 flags=4
      <0> [333.491328] i915_sel-4916    1.... 333262143us : execlists_reset_prepare: rcs0: depth<-0
      <0> [333.491362] i915_sel-4916    1.... 333262143us : intel_engine_stop_cs: rcs0
      <0> [333.491396] i915_sel-4916    1d..1 333262144us : process_csb: rcs0 cs-irq head=5, tail=5
      <0> [333.491424] i915_sel-4916    1.... 333262145us : intel_gpu_reset: engine_mask=1
      <0> [333.491454] kworker/-214     5.... 333262184us : i915_gem_switch_to_kernel_context: awake?=yes
      <0> [333.491487] kworker/-214     5.... 333262192us : i915_request_add: rcs0 fence 2afac:1522
      <0> [333.491520] kworker/-214     5.... 333262193us : i915_request_add: marking (null) as active
      <0> [333.491553] i915_sel-4916    1.... 333262199us : intel_engine_cancel_stop_cs: rcs0
      <0> [333.491587] i915_sel-4916    1.... 333262199us : execlists_reset_finish: rcs0: depth->0
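      
      A sketch of closing that window (tasklet_enable()/tasklet_hi_schedule()
      are the standard kernel APIs; the surrounding shape is illustrative):
      
          #include <linux/interrupt.h>
          
          static void execlists_reset_finish_sketch(struct tasklet_struct *t)
          {
                  tasklet_enable(t);      /* re-allow direct submission */
                  tasklet_hi_schedule(t); /* and kick it unconditionally to
                                           * catch any residual work */
          }
      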
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190313162835.30228-1-chris@chris-wilson.co.uk
    • drm/i915/gtt: Rename i915_vm_is_48b to i915_vm_is_4lvl · a9fe9ca4
      Committed by Chris Wilson
      Large ppGTTs are differentiated by the requirement to go to four levels
      to address more than 32b. Given the introduction of more 4-level ppGTTs
      with different numbers of addressable bits, rename i915_vm_is_48b() to
      better reflect the commonality of using 4 levels.
      
      Based on a patch by Bob Paauwe.
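      
      The renamed helper plausibly reduces to a test of whether the address
      space spans more than 32 bits (a sketch; compare with i915_gem_gtt.h):
      
          /* "4 level" simply means the VM addresses more than 32 bits. */
          static inline bool i915_vm_is_4lvl_sketch(u64 vm_total)
          {
                  return (vm_total - 1) >> 32;
          }
      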
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Bob Paauwe <bob.j.paauwe@intel.com>
      Cc: Matthew Auld <matthew.william.auld@gmail.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190314223839.28258-4-chris@chris-wilson.co.uk
  12. 12 Mar 2019 (1 commit)
  13. 08 Mar 2019 (6 commits)
  14. 06 Mar 2019 (2 commits)
  15. 02 Mar 2019 (3 commits)
    • drm/i915: Prioritise non-busywait semaphore workloads · f9e9e9de
      Committed by Chris Wilson
      We don't want to busywait on the GPU if we have other work to do. If we
      give non-busywaiting workloads higher (initial) priority than workloads
      that require a busywait, we will prioritise work that is ready to run
      immediately. We then also have to be careful that we don't give earlier
      semaphores an accidental boost because later work doesn't wait on other
      rings, hence we keep a history of semaphore usage of the dependency chain.
      
      v2: Stop rolling the bits into a chain and just use a flag in case this
      request or any of our dependencies use a semaphore. The rolling around
      was contagious as Tvrtko was heard to fall off his chair.
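      
      A sketch of the v2 flag approach (flag and bit names are assumed from
      the description, not verbatim):
      
          #define SCHED_HAS_SEMAPHORE_CHAIN (1u << 0) /* assumed flag name */
          #define PRIORITY_NOSEMAPHORE      (1 << 0)  /* internal prio bit */
          
          /* At submission, chains that never busywait on a semaphore get a
           * small initial boost so ready-to-run work outranks busywaiters. */
          static int initial_priority_sketch(unsigned int sched_flags, int prio)
          {
                  if (!(sched_flags & SCHED_HAS_SEMAPHORE_CHAIN))
                          prio |= PRIORITY_NOSEMAPHORE;
                  return prio;
          }
      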
      
      Testcase: igt/gem_exec_schedule/semaphore
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190301170901.8340-4-chris@chris-wilson.co.uk
    • drm/i915: Use HW semaphores for inter-engine synchronisation on gen8+ · e8861964
      Committed by Chris Wilson
      Having introduced per-context seqno, we now have a means to identify
      progress across the system without fear of the rollback that befell the
      global_seqno. That is, we can program a MI_SEMAPHORE_WAIT operation in
      advance of submission safe in the knowledge that our target seqno and
      address is stable.
      
      However, since we are telling the GPU to busy-spin on the target address
      until it matches the signaling seqno, we only want to do so when we are
      sure that busy-spin will be completed quickly. To achieve this we only
      submit the request to HW once the signaler is itself executing (modulo
      preemption causing us to wait longer), and we only do so for default and
      above priority requests (so that idle priority tasks never themselves
      hog the GPU waiting for others).
      
      As might be reasonably expected, HW semaphores excel in inter-engine
      synchronisation microbenchmarks (where the 3x reduced latency / increased
      throughput more than offset the power cost of spinning on a second ring)
      and have significant improvement (can be up to ~10%, most see no change)
      for single clients that utilize multiple engines (typically media players
      and transcoders), without regressing multiple clients that can saturate
      the system or changing the power envelope dramatically.
      
      v3: Drop the older NEQ branch, now we pin the signaler's HWSP anyway.
      v4: Tell the world and include it as part of scheduler caps.
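      
      The emitted wait plausibly looks like this (MI_SEMAPHORE_* flag names
      follow i915's gen8 command definitions; treat the exact encoding as an
      assumption):
      
          /* Busy-poll the signaler's per-context seqno in its HWSP until it
           * reaches the target value. */
          static u32 *emit_semaphore_wait_sketch(u32 *cs, u32 target_seqno,
                                                 u32 hwsp_offset)
          {
                  *cs++ = MI_SEMAPHORE_WAIT |
                          MI_SEMAPHORE_GLOBAL_GTT | /* address is in the GGTT */
                          MI_SEMAPHORE_POLL |       /* busy-spin, no IRQ */
                          MI_SEMAPHORE_SAD_GTE_SDD; /* until *addr >= seqno */
                  *cs++ = target_seqno;
                  *cs++ = hwsp_offset; /* stable per-context HWSP address */
                  *cs++ = 0;           /* upper 32b of the address */
                  return cs;
          }
      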
      
      Testcase: igt/gem_exec_whisper
      Testcase: igt/benchmarks/gem_wsim
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190301170901.8340-3-chris@chris-wilson.co.uk
    • drm/i915/execlists: Suppress redundant preemption · 1e3f697e
      Committed by Chris Wilson
      On unwinding the active request we give it a small (limited to internal
      priority levels) boost to prevent it from being gazumped a second time.
      However, this means that it can be promoted to above the request that
      triggered the preemption request, causing a preempt-to-idle cycle for no
      change. We can avoid this if we take the boost into account when
      checking if the preemption request is valid.
      
      v2: After preemption the active request will be after the preemptee if
      they end up with equal priority.
      
      v3: Tvrtko pointed out that this, the existing logic, makes
      I915_PRIORITY_WAIT non-preemptible. Document this interesting quirk!
      
      v4: Prove Tvrtko was right about WAIT being non-preemptible and test it.
      v5: Except not all priorities were made equal, and the WAIT not preempting
      is only if we start off as !NEWCLIENT.
      
      v6: More commentary after coming to an understanding about what I had
      forgotten to say.
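      
      In sketch form, the check credits the running request with the boost it
      would receive on unwind before comparing (names are illustrative):
      
          #include <stdbool.h>
          
          #define ACTIVE_BOOST (1 << 0) /* assumed internal priority bit */
          
          static int effective_prio_sketch(int prio, bool has_started)
          {
                  if (has_started)
                          prio |= ACTIVE_BOOST; /* boost it gets if unwound */
                  return prio;
          }
          
          static bool need_preempt_sketch(int queue_prio, int active_prio,
                                          bool started)
          {
                  /* strictly greater: equal priority never preempts-to-idle */
                  return queue_prio > effective_prio_sketch(active_prio, started);
          }
      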
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190301170901.8340-1-chris@chris-wilson.co.uk
  16. 01 Mar 2019 (1 commit)
    • drm/i915/execlists: Suppress mere WAIT preemption · b5773a36
      Committed by Chris Wilson
      WAIT is occasionally suppressed by virtue of preempted requests being
      promoted to NEWCLIENT if they have not already received that boost.
      Make this consistent for all WAIT boosts: they are not allowed to
      preempt executing contexts and are merely granted the right to be at the
      front of the queue for the next execution slot. This is in keeping with
      the desire that the WAIT boost be a minor tweak that does not give
      excessive promotion to its user and open ourselves to trivial abuse.
      
      The problem with the inconsistent WAIT preemption becomes more apparent
      as the preemption is propagated across the engines, where one engine may
      preempt and the other not, and we end up relying on the exact execution
      order being consistent across engines (e.g. using HW semaphores to
      coordinate parallel execution).
      
      v2: Also protect GuC submission from false preemption loops.
      v3: Build bug safeguards and better debug messages for st.
      v4: Do the priority bumping in unsubmit (i.e. on preemption/reset
      unwind), applying it earlier during submit causes out-of-order execution
      combined with execute fences.
      v5: Call sw_fence_fini for our dummy request (Matthew)
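      
      Sketched, the v4 bump on unsubmit (the priority bit and its relation to
      the WAIT boost are assumptions from the description):
      
          #define NO_PREEMPTION_BIT (1 << 0) /* assumed internal bit */
          
          struct rq_sketch { int priority; };
          
          /* On preemption/reset unwind, mark the request so a mere WAIT boost
           * can no longer preempt it; it only keeps its place at the front of
           * the queue for the next execution slot. */
          static void unsubmit_bump_sketch(struct rq_sketch *rq)
          {
                  rq->priority |= NO_PREEMPTION_BIT;
          }
      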
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Matthew Auld <matthew.auld@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190228220639.3173-1-chris@chris-wilson.co.uk
  17. 28 Feb 2019 (2 commits)