1. 14 3月, 2017 1 次提交
  2. 09 3月, 2017 2 次提交
    • I
      drm/i915/gen9: Increase PCODE request timeout to 50ms · d253371c
      Imre Deak 提交于
      After
      commit 2c7d0602
      Author: Imre Deak <imre.deak@intel.com>
      Date:   Mon Dec 5 18:27:37 2016 +0200
      
          drm/i915/gen9: Fix PCODE polling during CDCLK change notification
      
      there is still one report of the CDCLK-change request timing out on a
      KBL machine, see the Reference link. On that machine the maximum time
      the request took to succeed was 34ms, so increase the timeout to 50ms.
      
      v2:
      - Change timeout from 100 to 50 ms to maintain the current 50 ms limit
        for atomic waits in the driver. (Chris, Tvrtko)
      
      Reference: https://bugs.freedesktop.org/show_bug.cgi?id=99345
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NImre Deak <imre.deak@intel.com>
      Acked-by: NChris Wilson <chris@chris-wilson.co.uk>
      Link: http://patchwork.freedesktop.org/patch/msgid/1487946730-17162-1-git-send-email-imre.deak@intel.com
      (cherry picked from commit 0129936d)
      Signed-off-by: NJani Nikula <jani.nikula@intel.com>
      d253371c
    • M
      drm/i915: Avoid tweaking evaluation thresholds on Baytrail v3 · 34dc8993
      Mika Kuoppala 提交于
      Certain Baytrails, namely the 4 cpu core variants, have been
      plaqued by spurious system hangs, mostly occurring with light loads.
      
      Multiple bisects by various people point to a commit which changes the
      reclocking strategy for Baytrail to follow its bigger brethen:
      commit 8fb55197 ("drm/i915: Agressive downclocking on Baytrail")
      
      There is also a review comment attached to this commit from Deepak S
      on avoiding punit access on Cherryview and thus it was excluded on
      common reclocking path. By taking the same approach and omitting
      the punit access by not tweaking the thresholds when the hardware
      has been asked to move into different frequency, considerable gains
      in stability have been observed.
      
      With J1900 box, light render/video load would end up in system hang
      in usually less than 12 hours. With this patch applied, the cumulative
      uptime has now been 34 days without issues. To provoke system hang,
      light loads on both render and bsd engines in parallel have been used:
      glxgears >/dev/null 2>/dev/null &
      mpv --vo=vaapi --hwdec=vaapi --loop=inf vid.mp4
      
      So far, author has not witnessed system hang with above load
      and this patch applied. Reports from the tenacious people at
      kernel bugzilla are also promising.
      
      Considering that the punit access frequency with this patch is
      considerably less, there is a possibility that this will push
      the, still unknown, root cause past the triggering point on most loads.
      
      But as we now can reliably reproduce the hang independently,
      we can reduce the pain that users are having and use a
      static thresholds until a root cause is found.
      
      v3: don't break debugfs and simplification (Chris Wilson)
      
      References: https://bugzilla.kernel.org/show_bug.cgi?id=109051
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Jani Nikula <jani.nikula@intel.com>
      Cc: fritsch@xbmc.org
      Cc: miku@iki.fi
      Cc: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>
      CC: Michal Feix <michal@feix.cz>
      Cc: Hans de Goede <hdegoede@redhat.com>
      Cc: Deepak S <deepak.s@linux.intel.com>
      Cc: Jarkko Nikula <jarkko.nikula@linux.intel.com>
      Cc: <stable@vger.kernel.org> # v4.2+
      Acked-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      Acked-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NMika Kuoppala <mika.kuoppala@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/1487166779-26945-1-git-send-email-mika.kuoppala@intel.com
      (cherry picked from commit 6067a27d)
      Signed-off-by: NJani Nikula <jani.nikula@intel.com>
      34dc8993
  3. 03 1月, 2017 1 次提交
  4. 23 12月, 2016 1 次提交
  5. 20 12月, 2016 2 次提交
    • I
      drm/i915/gen9: Fix PCODE polling during SAGV disabling · dccf82ad
      Imre Deak 提交于
      According to the previous patch, it's possible atm that we call
      intel_do_sagv_disable() only once during the 1ms period and time out if
      that call fails. As opposed to this the spec says that we need to keep
      retrying this request for a 1ms duration, so let's do this similarly to
      the CDCLK change notification request.
      
      v4-5:
      - Rebased on the reply_mask, reply change.
      v6:
      - Remove w/s change. (Lyude)
      - Rebased on the timeout_base argument change.
      
      Cc: Lyude <cpaul@redhat.com>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Fixes: 656d1b89 ("drm/i915/skl: Add support for the SAGV, fix underrun hangs")
      Signed-off-by: NImre Deak <imre.deak@intel.com>
      Reviewed-by: Lyude <lyude@redhat.com> (v4)
      Link: http://patchwork.freedesktop.org/patch/msgid/1480955258-26311-2-git-send-email-imre.deak@intel.com
      (cherry picked from commit b3b8e999)
      Signed-off-by: NJani Nikula <jani.nikula@intel.com>
      dccf82ad
    • I
      drm/i915/gen9: Fix PCODE polling during CDCLK change notification · 2c7d0602
      Imre Deak 提交于
      commit 848496e5
      Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Date:   Wed Jul 13 16:32:03 2016 +0300
      
          drm/i915: Wait up to 3ms for the pcu to ack the cdclk change request on SKL
      
      increased the timeout to match the spec, but we still see a timeout on
      at least one SKL. A CDCLK change request following the failed one will
      succeed nevertheless.
      
      I could reproduce this problem easily by running kms_pipe_crc_basic in a
      loop. In all failure cases _wait_for() was pre-empted for >3ms and so in
      the worst case - when the pre-emption happened right after calculating
      timeout__ in _wait_for() - we called skl_cdclk_wait_for_pcu_ready() only
      once which failed and so _wait_for() timed out. As opposed to this the
      spec says to keep retrying the request for at most a 3ms period.
      
      To fix this send the first request explicitly to guarantee that there is
      3ms between the first and last request. Though this matches the spec, I
      noticed that in rare cases this can still time out if we sent only a few
      requests (in the worst case 2) _and_ PCODE is busy for some reason even
      after a previous request and a 3ms delay. To work around this retry the
      polling with pre-emption disabled to maximize the number of requests.
      Also increase the timeout to 10ms to account for interrupts that could
      reduce the number of requests. With this change I couldn't trigger
      the problem.
      
      v2:
      - Use 1ms poll period instead of 10us. (Chris)
      v3:
      - Poll with pre-emption disabled to increase the number of request
        attempts. (Ville, Chris)
      - Factor out a helper to poll, it's also needed by the next patch.
      v4:
      - Pass reply_mask, reply to skl_pcode_request(), instead of assuming the
        reply is generic. (Ville)
      v5:
      - List the request specific timeout values as code comment. (Ville)
      v6:
      - Try the poll first with preemption enabled.
      - Add code comment about first request being queued by PCODE. (Art)
      - Add timeout_base_ms argument. (Ville)
      v7:
      - Clarify code comment about first queued request. (Chris)
      
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Art Runyan <arthur.j.runyan@intel.com>
      Cc: <stable@vger.kernel.org> # v4.2- : 3b2c1710 : drm/i915: Wait up to 3ms
      Cc: <stable@vger.kernel.org> # v4.2-
      Fixes: 5d96d8af ("drm/i915/skl: Deinit/init the display at suspend/resume")
      Reference: https://bugs.freedesktop.org/show_bug.cgi?id=97929
      Testcase: igt/kms_pipe_crc_basic/suspend-read-crc-pipe-B
      Signed-off-by: NImre Deak <imre.deak@intel.com>
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Link: http://patchwork.freedesktop.org/patch/msgid/1480955258-26311-1-git-send-email-imre.deak@intel.com
      (cherry picked from commit a0b8a1fe)
      Signed-off-by: NJani Nikula <jani.nikula@intel.com>
      2c7d0602
  6. 19 12月, 2016 1 次提交
    • C
      drm/i915: Unify active context tracking between legacy/execlists/guc · e8a9c58f
      Chris Wilson 提交于
      The requests conversion introduced a nasty bug where we could generate a
      new request in the middle of constructing a request if we needed to idle
      the system in order to evict space for a context. The request to idle
      would be executed (and waited upon) before the current one, creating a
      minor havoc in the seqno accounting, as we will consider the current
      request to already be completed (prior to deferred seqno assignment) but
      ring->last_retired_head would have been updated and still could allow
      us to overwrite the current request before execution.
      
      We also employed two different mechanisms to track the active context
      until it was switched out. The legacy method allowed for waiting upon an
      active context (it could forcibly evict any vma, including context's),
      but the execlists method took a step backwards by pinning the vma for
      the entire active lifespan of the context (the only way to evict was to
      idle the entire GPU, not individual contexts). However, to circumvent
      the tricky issue of locking (i.e. we cannot take struct_mutex at the
      time of i915_gem_request_submit(), where we would want to move the
      previous context onto the active tracker and unpin it), we take the
      execlists approach and keep the contexts pinned until retirement.
      The benefit of the execlists approach, more important for execlists than
      legacy, was the reduction in work in pinning the context for each
      request - as the context was kept pinned until idle, it could short
      circuit the pinning for all active contexts.
      
      We introduce new engine vfuncs to pin and unpin the context
      respectively. The context is pinned at the start of the request, and
      only unpinned when the following request is retired (this ensures that
      the context is idle and coherent in main memory before we unpin it). We
      move the engine->last_context tracking into the retirement itself
      (rather than during request submission) in order to allow the submission
      to be reordered or unwound without undue difficultly.
      
      And finally an ulterior motive for unifying context handling was to
      prepare for mock requests.
      
      v2: Rename to last_retired_context, split out legacy_context tracking
      for MI_SET_CONTEXT.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20161218153724.8439-3-chris@chris-wilson.co.uk
      e8a9c58f
  7. 15 12月, 2016 3 次提交
  8. 09 12月, 2016 2 次提交
    • I
      drm/i915/gen9: Fix PCODE polling during SAGV disabling · b3b8e999
      Imre Deak 提交于
      According to the previous patch, it's possible atm that we call
      intel_do_sagv_disable() only once during the 1ms period and time out if
      that call fails. As opposed to this the spec says that we need to keep
      retrying this request for a 1ms duration, so let's do this similarly to
      the CDCLK change notification request.
      
      v4-5:
      - Rebased on the reply_mask, reply change.
      v6:
      - Remove w/s change. (Lyude)
      - Rebased on the timeout_base argument change.
      
      Cc: Lyude <cpaul@redhat.com>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Fixes: 656d1b89 ("drm/i915/skl: Add support for the SAGV, fix underrun hangs")
      Signed-off-by: NImre Deak <imre.deak@intel.com>
      Reviewed-by: Lyude <lyude@redhat.com> (v4)
      Link: http://patchwork.freedesktop.org/patch/msgid/1480955258-26311-2-git-send-email-imre.deak@intel.com
      b3b8e999
    • I
      drm/i915/gen9: Fix PCODE polling during CDCLK change notification · a0b8a1fe
      Imre Deak 提交于
      commit 848496e5
      Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Date:   Wed Jul 13 16:32:03 2016 +0300
      
          drm/i915: Wait up to 3ms for the pcu to ack the cdclk change request on SKL
      
      increased the timeout to match the spec, but we still see a timeout on
      at least one SKL. A CDCLK change request following the failed one will
      succeed nevertheless.
      
      I could reproduce this problem easily by running kms_pipe_crc_basic in a
      loop. In all failure cases _wait_for() was pre-empted for >3ms and so in
      the worst case - when the pre-emption happened right after calculating
      timeout__ in _wait_for() - we called skl_cdclk_wait_for_pcu_ready() only
      once which failed and so _wait_for() timed out. As opposed to this the
      spec says to keep retrying the request for at most a 3ms period.
      
      To fix this send the first request explicitly to guarantee that there is
      3ms between the first and last request. Though this matches the spec, I
      noticed that in rare cases this can still time out if we sent only a few
      requests (in the worst case 2) _and_ PCODE is busy for some reason even
      after a previous request and a 3ms delay. To work around this retry the
      polling with pre-emption disabled to maximize the number of requests.
      Also increase the timeout to 10ms to account for interrupts that could
      reduce the number of requests. With this change I couldn't trigger
      the problem.
      
      v2:
      - Use 1ms poll period instead of 10us. (Chris)
      v3:
      - Poll with pre-emption disabled to increase the number of request
        attempts. (Ville, Chris)
      - Factor out a helper to poll, it's also needed by the next patch.
      v4:
      - Pass reply_mask, reply to skl_pcode_request(), instead of assuming the
        reply is generic. (Ville)
      v5:
      - List the request specific timeout values as code comment. (Ville)
      v6:
      - Try the poll first with preemption enabled.
      - Add code comment about first request being queued by PCODE. (Art)
      - Add timeout_base_ms argument. (Ville)
      v7:
      - Clarify code comment about first queued request. (Chris)
      
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Art Runyan <arthur.j.runyan@intel.com>
      Cc: <stable@vger.kernel.org> # v4.2- : 3b2c1710 : drm/i915: Wait up to 3ms
      Cc: <stable@vger.kernel.org> # v4.2-
      Fixes: 5d96d8af ("drm/i915/skl: Deinit/init the display at suspend/resume")
      Reference: https://bugs.freedesktop.org/show_bug.cgi?id=97929
      Testcase: igt/kms_pipe_crc_basic/suspend-read-crc-pipe-B
      Signed-off-by: NImre Deak <imre.deak@intel.com>
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Link: http://patchwork.freedesktop.org/patch/msgid/1480955258-26311-1-git-send-email-imre.deak@intel.com
      a0b8a1fe
  9. 08 12月, 2016 4 次提交
  10. 07 12月, 2016 2 次提交
  11. 05 12月, 2016 13 次提交
  12. 02 12月, 2016 4 次提交
  13. 24 11月, 2016 3 次提交
  14. 17 11月, 2016 1 次提交