1. 07 Sep, 2017 1 commit
  2. 01 Sep, 2017 2 commits
  3. 22 Aug, 2017 1 commit
  4. 27 Jul, 2017 1 commit
    • C
      drm/i915: Disable per-engine reset for Broxton · 2b49e721
      Chris Wilson committed
      Triggering a GPU reset for one engine affects another, notably
      corrupting the context status buffer (CSB), effectively losing track of
      inflight requests.
      
      Adding a few printks:
      diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
      index ad41836fa5e5..a969456bc0fa 100644
      --- a/drivers/gpu/drm/i915/i915_drv.c
      +++ b/drivers/gpu/drm/i915/i915_drv.c
      @@ -1953,6 +1953,7 @@ int i915_reset_engine(struct intel_engine_cs *engine)
                      goto out;
              }
      
      +       pr_err("Resetting %s\n", engine->name);
              ret = intel_gpu_reset(engine->i915, intel_engine_flag(engine));
              if (ret) {
                      /* If we fail here, we expect to fallback to a global reset */
      diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
      index 716e5c9ea222..a72bc35d0870 100644
      --- a/drivers/gpu/drm/i915/intel_lrc.c
      +++ b/drivers/gpu/drm/i915/intel_lrc.c
      @@ -355,6 +355,7 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
                                      execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN);
                              port_set(&port[n], port_pack(rq, count));
                              desc = execlists_update_context(rq);
      +                       pr_err("%s: in (rq=%x) ctx=%d\n", engine->name, rq->global_seqno, upper_32_bits(desc));
                              GEM_DEBUG_EXEC(port[n].context_id = upper_32_bits(desc));
                      } else {
                              GEM_BUG_ON(!n);
      @@ -594,9 +595,23 @@ static void intel_lrc_irq_handler(unsigned long data)
                              if (!(status & GEN8_CTX_STATUS_COMPLETED_MASK))
                                      continue;
      
      +                       pr_err("%s: out CSB (%x head=%d, tail=%d), ctx=%d, rq=%d\n",
      +                                       engine->name,
      +                                       readl(csb_mmio),
      +                                       head, tail,
      +                                       readl(buf+2*head+1),
      +                                       port->context_id);
      +
                              /* Check the context/desc id for this event matches */
      -                       GEM_DEBUG_BUG_ON(readl(buf + 2 * head + 1) !=
      -                                        port->context_id);
      +                       if (readl(buf + 2 * head + 1) != port->context_id) {
      +                               pr_err("%s: BUG CSB (%x head=%d, tail=%d), ctx=%d, rq=%d\n",
      +                                               engine->name,
      +                                               readl(csb_mmio),
      +                                               head, tail,
      +                                               readl(buf+2*head+1),
      +                                               port->context_id);
      +                               BUG();
      +                       }
      
                              rq = port_unpack(port, &count);
                              GEM_BUG_ON(count == 0);
      
      Results in:
      
      [ 6423.006602] Resetting rcs0
      [ 6423.009080] rcs0: in (rq=fffffe70) ctx=1
      [ 6423.009216] rcs0: in (rq=fffffe6f) ctx=3
      [ 6423.009542] rcs0: out CSB (2 head=1, tail=2), ctx=3, rq=3
      [ 6423.009619] Resetting bcs0
      [ 6423.009980] rcs0: BUG CSB (0 head=1, tail=2), ctx=0, rq=3
      
      Note that this bug may affect all machines and not just Broxton;
      Broxton is just the first machine on which I have confirmed it.
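
The check that the debug patch above turns into a hard BUG() can be sketched outside the kernel. The following is a minimal user-space model, not driver code: `struct mock_csb`, `MOCK_CSB_ENTRIES` and the `mock_*` helpers are hypothetical names, and the assumption that a reset of another engine zeroes the buffer is taken only from the `ctx=0` value in the log above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MOCK_CSB_ENTRIES 6 /* assumed ring size for this sketch */

/*
 * Hypothetical stand-in for the CSB: each entry is two dwords, the
 * status at index 2 * i and the context ID at index 2 * i + 1, which
 * mirrors the readl(buf + 2 * head + 1) access in the diff.
 */
struct mock_csb {
	uint32_t buf[2 * MOCK_CSB_ENTRIES];
};

/* Return the context ID recorded for CSB entry `head`. */
static uint32_t mock_csb_context_id(const struct mock_csb *csb,
				    unsigned int head)
{
	assert(head < MOCK_CSB_ENTRIES);
	return csb->buf[2 * head + 1];
}

/* True if the completion event at `head` belongs to the tracked port. */
static bool mock_csb_event_matches(const struct mock_csb *csb,
				   unsigned int head,
				   uint32_t port_context_id)
{
	return mock_csb_context_id(csb, head) == port_context_id;
}

/*
 * Assumption for illustration only: model the observed side effect of
 * resetting a different engine by clearing every entry.
 */
static void mock_foreign_engine_reset(struct mock_csb *csb)
{
	memset(csb->buf, 0, sizeof(csb->buf));
}
```

Storing context ID 3 in entry 1 makes `mock_csb_event_matches(&csb, 1, 3)` true; after `mock_foreign_engine_reset()` the same call returns false, which corresponds to the `BUG CSB ... ctx=0, rq=3` line in the log.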
      
      Fixes: 142bc7d9 ("drm/i915: Modify error handler for per engine hang recovery")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: Michel Thierry <michel.thierry@intel.com>
      Acked-by: Michel Thierry <michel.thierry@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20170721123238.16428-13-chris@chris-wilson.co.uk
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
  5. 07 Jul, 2017 1 commit
  6. 21 Jun, 2017 1 commit
    • M
      drm/i915: Modify error handler for per engine hang recovery · 142bc7d9
      Michel Thierry committed
      This is a preparatory patch that modifies the error handler to do
      per-engine hang recovery. The patch that implements the actual reset
      sequence follows later in the series. The aim is to prepare the existing
      recovery function to use the new per-engine path where applicable (which
      fails at this point because the core implementation is lacking) and then
      continue recovery using the legacy full GPU reset.
      
      A helper function is also added to query the availability of engine
      reset. A subsequent patch will add the capability to query which type
      of reset is present (engine -> full -> no-reset) via the get-param
      ioctl.
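
The availability query described above (engine -> full -> no-reset) could look roughly like the sketch below. It is an illustration under stated assumptions, not the helper from the patch: `struct mock_device`, `enum mock_reset_level` and both functions are hypothetical, and treating a reset parameter of 1 as "full reset allowed" is an assumption; only the `>= 2` threshold for engine reset comes from the v8 changelog entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ordering of reset capabilities, weakest first. */
enum mock_reset_level {
	MOCK_RESET_NONE = 0,
	MOCK_RESET_FULL = 1,
	MOCK_RESET_ENGINE = 2,
};

/* Hypothetical device description; not the driver's real structure. */
struct mock_device {
	bool has_reset_engine; /* platform definition advertises engine reset */
	bool has_full_reset;   /* platform supports a full chip reset */
	int reset_param;       /* models the i915.reset module parameter */
};

/* Engine reset needs both platform support and reset_param >= 2. */
static bool mock_has_reset_engine(const struct mock_device *dev)
{
	assert(dev != NULL);
	return dev->has_reset_engine && dev->reset_param >= 2;
}

/* Report the strongest reset type available, falling back in order. */
static enum mock_reset_level mock_query_reset(const struct mock_device *dev)
{
	if (mock_has_reset_engine(dev))
		return MOCK_RESET_ENGINE;
	/* Assumption: any non-zero parameter permits a full reset. */
	if (dev->has_full_reset && dev->reset_param >= 1)
		return MOCK_RESET_FULL;
	return MOCK_RESET_NONE;
}
```

A caller would only need the single return value to decide which recovery path to attempt first.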
      
      It has been decided that the error events that are used to notify the
      user of a reset will only be sent in the case of a full chip reset. In
      the case of single (or multiple) engine resets, userspace won't be
      notified by these events.
      
      Note that this implementation of engine reset is for i915 directly
      submitting to the ELSP, where the driver manages the hang detection,
      recovery and resubmission. With GuC submission these tasks are shared
      between driver and firmware; i915 will still be responsible for detecting
      a hang, and when it does it will have to request GuC to reset that engine
      and remind the firmware about the outstanding submissions. This will be
      added in a different patch.
      
      v2: rebase, advertise engine reset availability in platform definition,
      add note about GuC submission.
      v3: s/*engine_reset*/*reset_engine*/. (Chris)
      Handle reset as 2 level resets, by first going to engine only and
      falling back to full/chip reset as needed, i.e. reset_engine will need
      the struct_mutex.
      v4: Pass the engine mask to i915_reset. (Chris)
      v5: Rebase, update selftests.
      v6: Rebase, prepare for mutex-less reset engine.
      v7: Pass reset_engine mask as a function parameter, and iterate over the
      engine mask for reset_engine. (Chris)
      v8: Use i915.reset >=2 in has_reset_engine; remove redundant reset
      logging; add a reset-engine-in-progress flag to prevent concurrent
      resets, and avoid dual purposing of reset-backoff. (Chris)
      v9: Support reset of different engines in parallel (Chris)
      v10: Handle reset-engine flag locking better (Chris)
      v11: Squash in reporting of per-engine-reset availability.
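
The two-level flow from the changelog (iterate over the engine mask, try a per-engine reset, then fall back to a full reset for whatever is left) can be sketched as follows. This is a user-space model under assumptions: `struct mock_gpu` and every `mock_*` function are hypothetical, and locking, the reset-in-progress flag and parallel resets are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MOCK_NUM_ENGINES 4 /* assumed engine count for this sketch */

/* Hypothetical GPU state used only to observe which path was taken. */
struct mock_gpu {
	bool has_reset_engine;       /* per-engine reset is available */
	uint32_t engine_reset_fails; /* bit n set: reset of engine n fails */
	unsigned int engine_resets;  /* successful per-engine resets */
	unsigned int full_resets;    /* full chip resets performed */
};

/* Try to reset one engine; returns 0 on success, -1 on failure. */
static int mock_reset_engine(struct mock_gpu *gpu, unsigned int id)
{
	assert(id < MOCK_NUM_ENGINES);
	if (gpu->engine_reset_fails & (1u << id))
		return -1;
	gpu->engine_resets++;
	return 0;
}

/* Legacy path: reset the whole chip. */
static void mock_full_reset(struct mock_gpu *gpu)
{
	gpu->full_resets++;
}

/* First level: per-engine resets. Second level: full reset if needed. */
static void mock_handle_error(struct mock_gpu *gpu, uint32_t engine_mask)
{
	unsigned int id;

	if (gpu->has_reset_engine) {
		for (id = 0; id < MOCK_NUM_ENGINES; id++) {
			uint32_t bit = 1u << id;

			if (!(engine_mask & bit))
				continue;
			/* Clear the bit only if this engine recovered. */
			if (mock_reset_engine(gpu, id) == 0)
				engine_mask &= ~bit;
		}
	}

	/* Anything still marked as hung needs the full chip reset. */
	if (engine_mask)
		mock_full_reset(gpu);
}
```

Clearing bits from the mask as engines recover keeps the fallback decision to a single test at the end, which is one straightforward way to express "engine first, full reset as needed".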
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Signed-off-by: Ian Lister <ian.lister@intel.com>
      Signed-off-by: Tomas Elf <tomas.elf@intel.com>
      Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
      Signed-off-by: Michel Thierry <michel.thierry@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170615201828.23144-4-michel.thierry@intel.com
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170620095751.13127-5-chris@chris-wilson.co.uk
  7. 13 Jun, 2017 1 commit
  8. 10 Jun, 2017 5 commits
  9. 09 Jun, 2017 2 commits
  10. 07 Jun, 2017 6 commits
  11. 30 May, 2017 1 commit
  12. 28 Apr, 2017 1 commit
    • J
      drm/i915: Eliminate HAS_HW_CONTEXTS · f2e4d76e
      Joonas Lahtinen committed
      HAS_HW_CONTEXTS is a misleading condition for GPU reset and CCID;
      replace it with Gen-specific checks (to be updated in next patches).
      
      HAS_HW_CONTEXTS in i915_l3_write is bogus because each HAS_L3_DPF
      match also has .has_hw_contexts = 1 set.
      
      This leads to us being able to get rid of the property completely.
      
      v2:
      - Keep the checks at Gen6 for no functional change (Ville)
      Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Mika Kuoppala <mika.kuoppala@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
  13. 29 Mar, 2017 1 commit
  14. 27 Mar, 2017 1 commit
  15. 14 Feb, 2017 1 commit
    • C
      drm/i915: Provide a hook for selftests · 953c7f82
      Chris Wilson committed
      Some pieces of code are independent of hardware but are very tricky to
      exercise through the normal userspace ABI or via debugfs hooks. Being
      able to create mock unit tests and execute them through CI is vital.
      Start by adding a central point where we can execute unit tests and
      a parameter to enable them. This is disabled by default as the
      expectation is that these tests will occasionally explode.
      
      To facilitate integration with igt, any parameter beginning with
      i915.igt__ is interpreted as a subtest executable independently via
      igt/drv_selftest.
      
      Two classes of selftests are recognised: mock unit tests and integration
      tests. Mock unit tests are run as soon as the module is loaded, before
      the device is probed. At that point there is no driver instantiated and
      all hw interactions must be "mocked". This is very useful for writing
      universal tests to exercise code not typically run on a broad range of
      architectures. Alternatively, you can hook into the live selftests and
      run when the device has been instantiated - hw interactions are real.
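
The split between mock and live selftests, and the `i915.igt__` parameter prefix, can be illustrated with a small table-driven runner. This is a sketch under assumptions rather than the hook added by the patch: `struct mock_selftest`, `mock_run_selftests()` and the two example tests are hypothetical, and only the `i915.igt__` prefix is taken from the text above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical selftest entry: a name plus a function to execute. */
struct mock_selftest {
	const char *name;
	int (*fn)(void *data);
};

/* Run every test in order and stop at the first failure. */
static int mock_run_selftests(const struct mock_selftest *tests,
			      size_t count, void *data)
{
	size_t i;

	for (i = 0; i < count; i++) {
		int err;

		assert(tests[i].fn != NULL);
		err = tests[i].fn(data);
		if (err)
			return err;
	}
	return 0;
}

/* Parameters with this prefix are treated as individual subtests. */
static bool mock_is_igt_subtest_param(const char *param)
{
	static const char prefix[] = "i915.igt__";

	return strncmp(param, prefix, sizeof(prefix) - 1) == 0;
}

/* Mock test: runs before any device exists, so it ignores `data`. */
static int mock_test_sanitycheck(void *unused)
{
	(void)unused;
	return 0;
}

/* Live test: requires an instantiated device to be passed in. */
static int live_test_needs_device(void *device)
{
	return device ? 0 : -ENODEV;
}
```

Passing `NULL` as the data pointer models the mock phase, where no driver has been instantiated; the live phase would pass the probed device instead.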
      
      v2: Add a macro for compiling conditional code for mock objects inside
      real objects.
      v3: Differentiate between mock unit tests and late integration test.
      v4: List the tests in natural order, use igt to sort after modparam.
      v5: s/late/live/
      v6: s/unsigned long/unsigned int/
      v7: Use igt_ prefixes for long helpers.
      v8: Deobfuscate macros overriding functions, stop using -I$(src)
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-1-chris@chris-wilson.co.uk
  16. 30 Jan, 2017 1 commit
  17. 05 Jan, 2017 1 commit
    • P
      drm/i915: actually drive the BDW reserved IDs · 98b2f01c
      Paulo Zanoni committed
      Back in 2014, commit fb7023e0 ("drm/i915: BDW: Adding Reserved PCI
      IDs.") added the reserved PCI IDs in order to try to make sure we had
      working drivers in case we ever released products using these IDs
      (since we had instances of this type of problem in the past). The
      problem is that the patch only touched the macros used by
      early-quirks.c and by the user space components that rely on
      i915_pciids.h, it didn't touch the macros used by i915_pci.c. So we
      correctly handled the stolen memory for these theoretical IDs, but we
      didn't actually drive the devices from i915.ko.
      
      So this patch fixes the original commit by actually making i915.ko
      drive these IDs, which was the goal. There's no information on what
      would be the GT count on these IDs, so we just go with the safer
      intel_broadwell_info, at the risk of ignoring a possibly nonexistent
      BSD2_RING.
      
      I did some checking, and it seems that these IDs are driven by
      intel-gpu-tools, xf86-video-intel and libdrm (since they contain old
      copies of i915_pciids.h), but they are not checked by mesa.
      
      The alternative to this patch would be to just assume we're actually
      never going to use these IDs, and then remove them from our ID lists
      and make sure our user space components sync the latest i915_pciids.h
      copy. I'm fine with either approach, as long as we make sure that
      every component tries to drive the same list of PCI IDs.
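
One way to keep every consumer on the same list of PCI IDs is to define the list once and expand it into each table. The sketch below shows that idea in plain C; it is not the layout of i915_pciids.h. The macro names, `struct mock_id` and the device IDs 0x1001 to 0x1003 are all hypothetical placeholders, not real Broadwell IDs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: a device ID and the info attached to it. */
struct mock_id {
	uint16_t device;
	const char *info;
};

/*
 * Single source of truth for the ID list. The IDs are made up for this
 * sketch; the last one stands in for a "reserved" ID.
 */
#define MOCK_BDW_IDS(X, info) \
	X(0x1001, info) \
	X(0x1002, info) \
	X(0x1003, info)

#define MOCK_ID(id, info) { id, info },
#define MOCK_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Table used by the (hypothetical) early quirk code. */
static const struct mock_id mock_quirk_table[] = {
	MOCK_BDW_IDS(MOCK_ID, "stolen memory quirk")
};

/* Table used by the (hypothetical) driver probe code. */
static const struct mock_id mock_driver_table[] = {
	MOCK_BDW_IDS(MOCK_ID, "broadwell info")
};

/* Linear lookup; returns NULL when the device ID is not listed. */
static const struct mock_id *mock_lookup(const struct mock_id *table,
					 size_t count, uint16_t device)
{
	size_t i;

	assert(table != NULL);
	for (i = 0; i < count; i++) {
		if (table[i].device == device)
			return &table[i];
	}
	return NULL;
}
```

Because both tables expand the same macro, an ID added for the quirk code is automatically visible to the driver table as well, which avoids the mismatch this commit fixes.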
      
      Fixes: fb7023e0 ("drm/i915: BDW: Adding Reserved PCI IDs.")
      Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Cc: Ben Widawsky <ben@bwidawsk.net>
      Cc: Jani Nikula <jani.nikula@intel.com>
      Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
      Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/1483473860-17644-3-git-send-email-paulo.r.zanoni@intel.com
  18. 21 Dec, 2016 1 commit
  19. 20 Dec, 2016 1 commit
  20. 08 Dec, 2016 1 commit
  21. 07 Dec, 2016 6 commits
  22. 01 Dec, 2016 3 commits