1. 30 Apr 2019, 2 commits
  2. 26 Apr 2019, 1 commit
  3. 17 Apr 2019, 5 commits
  4. 11 Apr 2019, 1 commit
  5. 10 Apr 2019, 2 commits
  6. 08 Apr 2019, 1 commit
  7. 03 Apr 2019, 1 commit
  8. 25 Mar 2019, 2 commits
  9. 21 Mar 2019, 1 commit
  10. 20 Mar 2019, 1 commit
  11. 15 Mar 2019, 1 commit
  12. 14 Mar 2019, 1 commit
  13. 06 Mar 2019, 1 commit
  14. 05 Mar 2019, 1 commit
  15. 02 Mar 2019, 1 commit
  16. 21 Feb 2019, 1 commit
    • drm/i915: Reduce the RPS shock · 2a8862d2
      Committed by Chris Wilson
      Limit deboosting and boosting to keep ourselves at the extremes
      when in the respective power modes (i.e. slowly decrease frequencies
      while in the HIGH_POWER zone and slowly increase frequencies while
      in the LOW_POWER zone). On idle, we will hit the timeout and drop
      to the next level quickly, and conversely if busy we expect to
      hit a waitboost and rapidly switch into max power.
      
      This should improve the UX by keeping the GPU clocks higher than simple
      busyness alone would suggest: waiting for pageflips switches us into the
      INTERACTIVE mode, and waitboosting raises the clocks further. This will
      incur some additional power; our saving grace should be rc6 and
      powergating keeping the extra current draw in check.
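      As a rough, standalone illustration of the clamping idea described above
      (this is not the code from the patch; the enum and helper names below are
      invented for this sketch):

          /*
           * Sketch only: while in HIGH_POWER, deboosts are limited to a
           * single step, and while in LOW_POWER, boosts are limited to a
           * single step, so the frequency drifts slowly towards the
           * opposite extreme instead of snapping between the two.
           */
          enum rps_power_zone { RPS_LOW_POWER, RPS_BETWEEN, RPS_HIGH_POWER };

          static int clamp_rps_adjust(enum rps_power_zone zone, int adj)
          {
                  if (zone == RPS_HIGH_POWER && adj < -1)
                          adj = -1;       /* decrease frequency slowly */
                  if (zone == RPS_LOW_POWER && adj > 1)
                          adj = 1;        /* increase frequency slowly */
                  return adj;
          }

      On idle, the timeout path still drops to the next level quickly, and a
      waitboost still jumps straight to max power, exactly as described above.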
      
      Food for future thought: deadline scheduling. If we know certain contexts
      (high-priority compositors) absolutely must hit the next vblank, then we
      can raise the frequencies ahead of time. Part of this is covered by
      per-context frequencies, where userspace is given control over the
      frequency range it wants the GPU to execute at (for largely the same
      problem as this, where the workload is very latency sensitive but at the
      EI level appears mostly idle). Indeed, the per-context series does extend
      the modeset boosting to include a frequency-range tweak, which seems
      applicable to solving this jittery UX behaviour.
      Reported-by: Lyude Paul <lyude@redhat.com>
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109408
      References: 0d55babc ("drm/i915: Drop stray clearing of rps->last_adj")
      References: 60548c55 ("drm/i915: Interactive RPS mode")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Lyude Paul <lyude@redhat.com>
      Cc: Eero Tamminen <eero.t.tamminen@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Michel Thierry <michel.thierry@intel.com>
      
      Quoting Lyude Paul:
      > Before reverting 0d55babc: [4.20]
      >
      > 35 measurements [of gnome-shell animations]
      > Average: 33.65657142857143 FPS
      > FPS observed: 20.8 - 46.87 FPS
      > Percentage under 60 FPS: 100.0%
      > Percentage under 55 FPS: 100.0%
      > Percentage under 50 FPS: 100.0%
      > Percentage under 45 FPS: 97.14285714285714%
      > Percentage under 40 FPS: 97.14285714285714%
      > Percentage under 35 FPS: 45.714285714285715%
      > Percentage under 30 FPS: 11.428571428571429%
      > Percentage under 25 FPS: 2.857142857142857%
      >
      > After reverting: [4.19 behaviour]
      >
      > 30 measurements
      > Average: 49.833666666666666 FPS
      > FPS observed: 33.85 - 60.0 FPS
      > Percentage under 60 FPS: 86.66666666666667%
      > Percentage under 55 FPS: 70.0%
      > Percentage under 50 FPS: 53.333333333333336%
      > Percentage under 45 FPS: 20.0%
      > Percentage under 40 FPS: 6.666666666666667%
      > Percentage under 35 FPS: 6.666666666666667%
      > Percentage under 30 FPS: 0%
      > Percentage under 25 FPS: 0%
      >
      > Patched:
      > 42 measurements
      > Average: 46.05428571428571 FPS
      > FPS observed: 1.82 - 59.98 FPS
      > Percentage under 60 FPS: 88.09523809523809%
      > Percentage under 55 FPS: 61.904761904761905%
      > Percentage under 50 FPS: 45.23809523809524%
      > Percentage under 45 FPS: 35.714285714285715%
      > Percentage under 40 FPS: 33.33333333333333%
      > Percentage under 35 FPS: 19.047619047619047%
      > Percentage under 30 FPS: 7.142857142857142%
      > Percentage under 25 FPS: 4.761904761904762%
      Tested-by: Lyude Paul <lyude@redhat.com>
      Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190219122215.8941-13-chris@chris-wilson.co.uk
  17. 16 Feb 2019, 1 commit
  18. 30 Jan 2019, 2 commits
    • drm/i915: Replace global breadcrumbs with per-context interrupt tracking · 52c0fdb2
      Committed by Chris Wilson
      A few years ago, see commit 688e6c72 ("drm/i915: Slaughter the
      thundering i915_wait_request herd"), the issue of handling multiple
      clients waiting in parallel was brought to our attention. The
      requirement was that every client should be woken immediately upon its
      request being signaled, without incurring any cpu overhead.
      
      Handling certain fragility of our hw meant that we could not do a simple
      check inside the irq handler (some generations required almost unbounded
      delays before we could be sure of seqno coherency), and so request
      completion checking required delegation.
      
      Before commit 688e6c72, the solution was simple. Every client
      waiting on a request would be woken on every interrupt and each would do
      a heavyweight check to see if their request was complete. Commit
      688e6c72 introduced an rbtree so that only the earliest waiter on
      the global timeline would be woken, and would wake the next and so on.
      (Along with various complications to handle requests being reordered
      along the global timeline, and also a requirement for a kthread to provide
      a delegate for fence signaling that had no process context.)
      
      The global rbtree depends on knowing the execution timeline (and global
      seqno). Without knowing that order, we must instead check all contexts
      queued to the HW to see which may have advanced. We trim that list by
      only checking queued contexts that are being waited on, but still we
      keep a list of all active contexts and their active signalers that we
      inspect from inside the irq handler. By moving the waiters onto the fence
      signal list, we can combine the client wakeup with the dma_fence
      signaling (a dramatic reduction in complexity, but it does require the HW
      to be coherent: the seqno must be visible from the cpu before the
      interrupt is raised - we keep a timer backup just in case).
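      A heavily simplified sketch of that per-context signaling walk follows
      (the struct and field names are invented for illustration; only the
      list helpers, READ_ONCE() and dma_fence_signal() are real kernel APIs,
      and the real implementation handles locking, the timer backup and much
      more):

          #include <linux/compiler.h>
          #include <linux/dma-fence.h>
          #include <linux/list.h>

          /*
           * Sketch only: each context keeps its own list of in-flight
           * requests that have signalers attached. The irq handler walks
           * just those lists, checks completion against the CPU-visible
           * seqno for that context, and signals the dma_fence directly,
           * which wakes every waiter attached to that fence.
           */
          struct sketch_request {
                  struct list_head link;
                  u32 seqno;
                  struct dma_fence fence;
          };

          struct sketch_context {
                  struct list_head signalers;     /* requests with waiters */
                  u32 *hwsp_seqno;                /* seqno visible to the CPU */
          };

          static void sketch_signal_breadcrumbs(struct sketch_context *ce)
          {
                  struct sketch_request *rq, *rn;

                  list_for_each_entry_safe(rq, rn, &ce->signalers, link) {
                          /* requests are kept in submission order */
                          if ((s32)(READ_ONCE(*ce->hwsp_seqno) - rq->seqno) < 0)
                                  break;

                          list_del(&rq->link);
                          dma_fence_signal(&rq->fence);
                  }
          }

      The key difference from the old scheme is that the wakeup becomes a
      property of the fence itself rather than of a global rbtree of waiting
      tasks.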
      
      Having previously fixed all the issues with irq-seqno serialisation (by
      inserting delays onto the GPU after each request instead of random delays
      on the CPU after each interrupt), we can rely on the seqno state to
      perform direct wakeups from the interrupt handler. This allows us to
      preserve our single context switch behaviour of the current routine,
      with the only downside that we lose the RT priority sorting of wakeups.
      In general, direct wakeup latency of multiple clients is about the same
      (about 10% better in most cases) with a reduction in total CPU time spent
      in the waiter (about 20-50% depending on gen). Average herd behaviour is
      improved, but at the cost of not delegating wakeups on task_prio.
      
      v2: Capture fence signaling state for error state and add comments to
      warm even the coldest of hearts.
      v3: Check if the request is still active before busywaiting
      v4: Reduce the amount of pointer misdirection with list_for_each_safe
      and using a local i915_request variable inside the loops
      v5: Add a missing pluralisation to a purely informative selftest message.
      
      References: 688e6c72 ("drm/i915: Slaughter the thundering i915_wait_request herd")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190129205230.19056-2-chris@chris-wilson.co.uk
    • drm/i915: Remove the intel_engine_notify tracepoint · 3df0bd19
      Committed by Chris Wilson
      The global seqno is defunct and so we have no meaningful indicator of
      forward progress for an engine. You need to listen to the request
      signaling tracepoints instead.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190129205230.19056-1-chris@chris-wilson.co.uk
  19. 28 Jan 2019, 1 commit
  20. 26 Jan 2019, 1 commit
    • drm/i915: Don't try to use the hardware frame counter with i965gm TV output · 32db0b65
      Committed by Ville Syrjälä
      On i965gm the hardware frame counter does not work when
      the TV encoder is active. So let's not try to consult
      the hardware frame counter in that case. Instead we'll
      fall back to the timestamp based guesstimation method
      used on gen2.
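      A standalone sketch of that fallback decision (the types and helper
      below are invented and do not match the i915 source; the real change
      centres on intel_crtc_max_vblank_count() and i915_get_vblank_counter()
      as noted in the v2/v3 remarks further down):

          #include <stdbool.h>
          #include <stdint.h>

          /*
           * Sketch only: reporting a maximum hardware frame-counter value
           * of 0 for the affected pipe tells the vblank code there is no
           * usable hw counter, so it falls back to timestamp-based
           * estimation, the same path gen2 already uses.
           */
          struct pipe_sketch {
                  bool is_i965gm;
                  bool tv_encoder_active;
          };

          static uint32_t sketch_max_vblank_count(const struct pipe_sketch *p)
          {
                  if (p->is_i965gm && p->tv_encoder_active)
                          return 0;        /* no hw counter: use timestamps */

                  return 0xffffff;         /* e.g. a 24-bit hw frame counter */
          }

      In the actual driver the decision is made per-crtc, hence the v2 note
      about using the per-crtc maximum exclusively and leaving the per-device
      maximum at zero.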
      
      Note that the pipe timings generated by the TV encoder
      are also rather peculiar. Apparently the pipe wants to
      run at a much higher speed (related to the oversample
      clock somehow it seems) but during the vertical active
      period the TV encoder stalls the pipe every few lines
      to keep its speed in check. But once the vertical
      blanking period is reached the pipe gets to run at full
      speed. This means our vblank timestamp estimates are
      suspect. Fixing all that would require quite a bit
      more work. This simple fix at least avoids the nasty
      vblank timeouts that are happening currently.
      
      Curiously the frame counter works just fine on i945gm
      and gm45. I don't really understand what kind of mishap
      occurred with the hardware design on i965gm. Sadly
      I wasn't able to find any chicken bits etc. that would
      fix the frame counter :(
      
      v2: Move the zero vs. non-zero hw counter value handling
          into i915_get_vblank_counter() (Daniel)
          Use the per-crtc maximum exclusively, leaving the
          per-device maximum at zero
      v3: max_vblank_count not populated yet in intel_enable_pipe()
          use intel_crtc_max_vblank_count() instead
      
      Cc: stable@vger.kernel.org
      Cc: Daniel Vetter <daniel@ffwll.ch>
      Fixes: 51e31d49 ("drm/i915: Use generic vblank wait")
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93782
      Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190122125149.GE5527@ideak-desk.fi.intel.com
      Reviewed-by: Imre Deak <imre.deak@intel.com>
  21. 24 Jan 2019, 1 commit
  22. 23 Jan 2019, 1 commit
  23. 22 Jan 2019, 1 commit
  24. 17 Jan 2019, 2 commits
  25. 15 Jan 2019, 2 commits
  26. 10 Jan 2019, 1 commit
  27. 09 Jan 2019, 1 commit
  28. 31 Dec 2018, 1 commit
  29. 18 Dec 2018, 1 commit
  30. 13 Dec 2018, 1 commit