1. 29 1月, 2016 2 次提交
    • T
      drm/i915: Fix premature LRC unpin in GuC mode · f4e2dece
      Tvrtko Ursulin 提交于
      In GuC mode LRC pinning lifetime depends exclusively on the
      request liftime. Since that is terminated by the seqno update
      that opens up a race condition between GPU finishing writing
      out the context image and the driver unpinning the LRC.
      
      To extend the LRC lifetime we will employ a similar approach
      to what legacy ringbuffer submission does.
      
      We will start tracking the last submitted context per engine
      and keep it pinned until it is replaced by another one.
      
      Note that the driver unload path is a bit fragile and could
      benefit greatly from efforts to unify the legacy and exec
      list submission code paths.
      
      At the moment i915_gem_context_fini has special casing for the
      two which are potentialy not needed, and also depends on
      i915_gem_cleanup_ringbuffer running before itself.
      
      v2:
       * Move pinning into engine->emit_request and actually fix
         the reference/unreference logic. (Chris Wilson)
      
       * ring->dev can be NULL on driver unload so use a different
         route towards it.
      
      v3:
       * Rebase.
       * Handle the reset path. (Chris Wilson)
       * Exclude default context from the pinning - it is impossible
         to get it right before default context special casing in
         general is eliminated.
      
      v4:
       * Rebased & moved context tracking to
         intel_logical_ring_advance_and_submit.
      Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Issue: VIZ-4277
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Nick Hoath <nicholas.hoath@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/1453976997-25424-1-git-send-email-tvrtko.ursulin@linux.intel.com
      f4e2dece
    • T
      drm/i915: Extract context unpinning to its own function · a0b4a6a8
      Tvrtko Ursulin 提交于
      Will enable cleaner implementation of a following fix and
      easier code unification in the future.
      
      Idea and code by Chris Wilson.
      
      v2: Do not return before last_contexts on engines are unpinned.
      Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      a0b4a6a8
  2. 21 1月, 2016 1 次提交
  3. 13 1月, 2016 1 次提交
  4. 06 1月, 2016 1 次提交
    • C
      drm/i915: Restore inhibiting the load of the default context · 42f1cae8
      Chris Wilson 提交于
      Following a GPU reset, we may leave the context in a poorly defined
      state, and reloading from that context will leave the GPU flummoxed. For
      secondary contexts, this will lead to that context being banned - but
      currently it is also causing the default context to become banned,
      leading to turmoil in the shared state.
      
      This is a regression from
      
      commit 6702cf16 [v4.1]
      Author: Ben Widawsky <benjamin.widawsky@intel.com>
      Date:   Mon Mar 16 16:00:58 2015 +0000
      
          drm/i915: Initialize all contexts
      
      which quietly introduced the removal of the MI_RESTORE_INHIBIT on the
      default context.
      
      v2: Mark the global default context as uninitialized on GPU reset so
      that the context-local workarounds are reloaded upon re-enabling.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Michel Thierry <michel.thierry@intel.com>
      Cc: Mika Kuoppala <mika.kuoppala@intel.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Link: http://patchwork.freedesktop.org/patch/msgid/1448630935-27377-1-git-send-email-chris@chris-wilson.co.ukReviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
      Cc: stable@vger.kernel.org
      [danvet: This seems to fix a gpu hand on after the first resume,
      resulting in any future suspend operation failing with -EIO because
      the gpu seems to be in a funky state. Somehow this patch fixes that.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      42f1cae8
  5. 10 12月, 2015 2 次提交
  6. 08 12月, 2015 1 次提交
  7. 24 11月, 2015 1 次提交
  8. 18 11月, 2015 1 次提交
  9. 19 10月, 2015 1 次提交
  10. 09 10月, 2015 1 次提交
  11. 06 10月, 2015 1 次提交
    • T
      drm/i915: Clean up associated VMAs on context destruction · e9f24d5f
      Tvrtko Ursulin 提交于
      Prevent leaking VMAs and PPGTT VMs when objects are imported
      via flink.
      
      Scenario is that any VMAs created by the importer will be left
      dangling after the importer exits, or destroys the PPGTT context
      with which they are associated.
      
      This is caused by object destruction not running when the
      importer closes the buffer object handle due the reference held
      by the exporter. This also leaks the VM since the VMA has a
      reference on it.
      
      In practice these leaks can be observed by stopping and starting
      the X server on a kernel with fbcon compiled in. Every time
      X server exits another VMA will be leaked against the fbcon's
      frame buffer object.
      
      Also on systems where flink buffer sharing is used extensively,
      like Android, this leak has even more serious consequences.
      
      This version is takes a general approach from the  earlier work
      by Rafael Barbalho (drm/i915: Clean-up PPGTT on context
      destruction) and tries to incorporate the subsequent discussion
      between Chris Wilson and Daniel Vetter.
      
      v2:
      
      Removed immediate cleanup on object retire - it was causing a
      recursive VMA unbind via i915_gem_object_wait_rendering. And
      it is in fact not even needed since by definition context
      cleanup worker runs only after the last context reference has
      been dropped, hence all VMAs against the VM belonging to the
      context are already on the inactive list.
      
      v3:
      
      Previous version could deadlock since VMA unbind waits on any
      rendering on an object to complete. Objects can be busy in a
      different VM which would mean that the cleanup loop would do
      the wait with the struct mutex held.
      
      This is an even simpler approach where we just unbind VMAs
      without waiting since we know all VMAs belonging to this VM
      are idle, and there is nothing in flight, at the point
      context destructor runs.
      
      v4:
      
      Double underscore prefix for __915_vma_unbind_no_wait and a
      commit message typo fix. (Michel Thierry)
      
      Note that this is just a partial/interim fix since we have a bit a
      fundamental issue with cleaning up, e.g.
      
      https://bugs.freedesktop.org/show_bug.cgi?id=87729Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Testcase: igt/gem_ppgtt.c/flink-and-exit-vma-leak
      Reviewed-by: NMichel Thierry <michel.thierry@intel.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Rafael Barbalho <rafael.barbalho@intel.com>
      Cc: Michel Thierry <michel.thierry@intel.com>
      [danvet: Add a note that this isn't everything.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      e9f24d5f
  12. 02 9月, 2015 1 次提交
  13. 14 8月, 2015 1 次提交
    • C
      drm/i915: Remove the failed context from the fpriv->context_idr · 37876df6
      Chris Wilson 提交于
      If we encounter an allocation failure during ppggt creation (trivial
      even with 16Gib+ RAM!), we need to remove the dead context from the
      fpriv->context_idr along with the references.
      
      gem_exec_ctx: page allocation failure: order:0, mode:0x8004
      CPU: 3 PID: 27272 Comm: gem_exec_ctx Tainted: G        W       4.2.0-rc5+ #37
       0000000000000000 ffff880086ff7a78 ffffffff816b947a ffff88041ed90038
       0000000000008004 ffff880086ff7b08 ffffffff8114b1a5 ffff880086ff7ac8
       ffffffff8108d848 0000000000000000 ffffffff81ce84b8 0000000000000000
      Call Trace:
       [<ffffffff816b947a>] dump_stack+0x45/0x57
       [<ffffffff8114b1a5>] warn_alloc_failed+0xd5/0x120
       [<ffffffff8108d848>] ? __wake_up+0x48/0x60
       [<ffffffff8114e0ed>] __alloc_pages_nodemask+0x73d/0x8e0
       [<ffffffffc0472238>] ? i915_gem_execbuffer2+0x148/0x240 [i915]
       [<ffffffffc0474240>] __setup_page_dma+0x30/0x110 [i915]
       [<ffffffffc0477f61>] gen8_ppgtt_init+0x31/0x2f0 [i915]
       [<ffffffffc04785e0>] i915_ppgtt_init+0x30/0x80 [i915]
       [<ffffffffc0478928>] i915_ppgtt_create+0x48/0xc0 [i915]
       [<ffffffffc046c9c2>] i915_gem_create_context+0x1c2/0x390 [i915]
       [<ffffffffc046d9cb>] i915_gem_context_create_ioctl+0x5b/0xa0 [i915]
      
      leading to an oops in i915_gem_context_close. Also note that this
      benchmark should not be running out of memory in the first place...
      
      Testcase: igt/benchmark/gem_exec_ctx -b create # ppgtt >= 2
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      37876df6
  14. 14 7月, 2015 1 次提交
    • C
      drm/i915: Store device pointer in contexts for late tracepoint usafe · 9ea4feec
      Chris Wilson 提交于
      [ 1572.417121] BUG: unable to handle kernel NULL pointer dereference at           (null)
      [ 1572.421010] IP: [<ffffffffa00b2514>] ftrace_raw_event_i915_context+0x5d/0x70 [i915]
      [ 1572.424970] PGD 1766a3067 PUD 1767a2067 PMD 0
      [ 1572.428892] Oops: 0000 [#1] SMP
      [ 1572.432787] Modules linked in: ipv6 dm_mod iTCO_wdt iTCO_vendor_support snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_controller snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer snd soundcore serio_raw pcspkr lpc_ich i2c_i801 mfd_core battery ac acpi_cpufreq i915 button video drm_kms_helper drm
      [ 1572.441720] CPU: 2 PID: 18853 Comm: kworker/u8:0 Not tainted 4.0.0_kcloud_3f0360_20150429+ #588
      [ 1572.446298] Workqueue: i915 i915_gem_retire_work_handler [i915]
      [ 1572.450876] task: ffff880002f428f0 ti: ffff880035724000 task.ti: ffff880035724000
      [ 1572.455557] RIP: 0010:[<ffffffffa00b2514>]  [<ffffffffa00b2514>] ftrace_raw_event_i915_context+0x5d/0x70 [i915]
      [ 1572.460423] RSP: 0018:ffff880035727ce8  EFLAGS: 00010286
      [ 1572.465262] RAX: ffff880073f1643c RBX: ffff880002da9058 RCX: ffff880073e5db40
      [ 1572.470179] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880035727ce8
      [ 1572.475107] RBP: ffff88007bb11a00 R08: 0000000000000000 R09: 0000000000000000
      [ 1572.480034] R10: 0000000000362200 R11: 0000000000000008 R12: 0000000000000000
      [ 1572.484952] R13: ffff880035727d78 R14: ffff880002dc1c98 R15: ffff880002dc1dc8
      [ 1572.489886] FS:  0000000000000000(0000) GS:ffff88017fd00000(0000) knlGS:0000000000000000
      [ 1572.494883] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [ 1572.499859] CR2: 0000000000000000 CR3: 000000017572a000 CR4: 00000000001006e0
      [ 1572.504842] Stack:
      [ 1572.509834]  ffff88017b0090c0 ffff880073f16438 ffff880002da9058 ffff880073f1643c
      [ 1572.514904]  0000000000000246 ffff880100000000 ffff88007bb11a00 ffff880002ddeb10
      [ 1572.519985]  ffff8801759f79c0 ffffffffa0092ff0 0000000000000000 ffff88007bb11a00
      [ 1572.525049] Call Trace:
      [ 1572.530093]  [<ffffffffa0092ff0>] ? i915_gem_context_free+0xa8/0xc1 [i915]
      [ 1572.535227]  [<ffffffffa009b969>] ? i915_gem_request_free+0x4e/0x50 [i915]
      [ 1572.540347]  [<ffffffffa00b5533>] ? intel_execlists_retire_requests+0x14c/0x159 [i915]
      [ 1572.545500]  [<ffffffffa009d9ea>] ? i915_gem_retire_requests+0x9d/0xeb [i915]
      [ 1572.550664]  [<ffffffffa009dd8c>] ? i915_gem_retire_work_handler+0x4c/0x61 [i915]
      [ 1572.555825]  [<ffffffff8104ca7f>] ? process_one_work+0x1b2/0x31d
      [ 1572.560951]  [<ffffffff8104d278>] ? worker_thread+0x24d/0x339
      [ 1572.566033]  [<ffffffff8104d02b>] ? cancel_delayed_work_sync+0xa/0xa
      [ 1572.571140]  [<ffffffff81050b25>] ? kthread+0xce/0xd6
      [ 1572.576191]  [<ffffffff81050a57>] ? kthread_create_on_node+0x162/0x162
      [ 1572.581228]  [<ffffffff8179b3c8>] ? ret_from_fork+0x58/0x90
      [ 1572.586259]  [<ffffffff81050a57>] ? kthread_create_on_node+0x162/0x162
      [ 1572.591318] Code: de 48 89 e7 e8 09 4d 00 e1 48 85 c0 74 27 48 89 68 10 48 8b 55 38 48 89 e7 48 89 50 18 48 8b 55 10 48 8b 12 48 8b 12 48 8b 52 38 <8b> 12 89 50 08 e8 95 4d 00 e1 48 83 c4 30 5b 5d 41 5c c3 41 55
      [ 1572.596981] RIP  [<ffffffffa00b2514>] ftrace_raw_event_i915_context+0x5d/0x70 [i915]
      [ 1572.602464]  RSP <ffff880035727ce8>
      [ 1572.607911] CR2: 0000000000000000
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90112#c23Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      9ea4feec
  15. 09 7月, 2015 1 次提交
  16. 06 7月, 2015 1 次提交
  17. 23 6月, 2015 11 次提交
  18. 29 5月, 2015 1 次提交
  19. 21 5月, 2015 1 次提交
    • C
      drm/i915: Implement inter-engine read-read optimisations · b4716185
      Chris Wilson 提交于
      Currently, we only track the last request globally across all engines.
      This prevents us from issuing concurrent read requests on e.g. the RCS
      and BCS engines (or more likely the render and media engines). Without
      semaphores, we incur costly stalls as we synchronise between rings -
      greatly impacting the current performance of Broadwell versus Haswell in
      certain workloads (like video decode). With the introduction of
      reference counted requests, it is much easier to track the last request
      per ring, as well as the last global write request so that we can
      optimise inter-engine read read requests (as well as better optimise
      certain CPU waits).
      
      v2: Fix inverted readonly condition for nonblocking waits.
      v3: Handle non-continguous engine array after waits
      v4: Rebase, tidy, rewrite ring list debugging
      v5: Use obj->active as a bitfield, it looks cool
      v6: Micro-optimise, mostly involving moving code around
      v7: Fix retire-requests-upto for execlists (and multiple rq->ringbuf)
      v8: Rebase
      v9: Refactor i915_gem_object_sync() to allow the compiler to better
      optimise it.
      
      Benchmark: igt/gem_read_read_speed
      hsw:gt3e (with semaphores):
      Before: Time to read-read 1024k:		275.794µs
      After:  Time to read-read 1024k:		123.260µs
      
      hsw:gt3e (w/o semaphores):
      Before: Time to read-read 1024k:		230.433µs
      After:  Time to read-read 1024k:		124.593µs
      
      bdw-u (w/o semaphores):             Before          After
      Time to read-read 1x1:            26.274µs       10.350µs
      Time to read-read 128x128:        40.097µs       21.366µs
      Time to read-read 256x256:        77.087µs       42.608µs
      Time to read-read 512x512:       281.999µs      181.155µs
      Time to read-read 1024x1024:    1196.141µs     1118.223µs
      Time to read-read 2048x2048:    5639.072µs     5225.837µs
      Time to read-read 4096x4096:   22401.662µs    21137.067µs
      Time to read-read 8192x8192:   89617.735µs    85637.681µs
      
      Testcase: igt/gem_concurrent_blit (read-read and friends)
      Cc: Lionel Landwerlin <lionel.g.landwerlin@linux.intel.com>
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> [v8]
      [danvet: s/\<rq\>/req/g]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      b4716185
  20. 24 4月, 2015 2 次提交
  21. 21 4月, 2015 1 次提交
  22. 10 4月, 2015 1 次提交
  23. 20 3月, 2015 3 次提交
    • B
      drm/i915: Initialize all contexts · 6702cf16
      Ben Widawsky 提交于
      The problem is we're going to switch to a new context, which could be
      the default context. The plan was to use restore inhibit, which would be
      fine, except if we are using dynamic page tables (which we will). If we
      use dynamic page tables and we don't load new page tables, the previous
      page tables might go away, and future operations will fault.
      
      CTXA runs.
      switch to default, restore inhibit
      CTXA dies and has its address space taken away.
      Run CTXB, tries to save using the context A's address space - this
      fails.
      
      The general solution is to make sure every context has it's own state,
      and its own address space. For cases when we must restore inhibit, first
      thing we do is load a valid address space. I thought this would be
      enough, but apparently there are references within the context itself
      which will refer to the old address space - therefore, we also must
      reinitialize.
      
      v2: to->ppgtt is only valid in full ppgtt.
      v3: Rebased.
      v4: Make post PDP update clearer.
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
      Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      6702cf16
    • B
      drm/i915: Track page table reload need · 563222a7
      Ben Widawsky 提交于
      This patch was formerly known as, "Force pd restore when PDEs change,
      gen6-7." I had to change the name because it is needed for GEN8 too.
      
      The real issue this is trying to solve is when a new object is mapped
      into the current address space. The GPU does not snoop the new mapping
      so we must do the gen specific action to reload the page tables.
      
      GEN8 and GEN7 do differ in the way they load page tables for the RCS.
      GEN8 does so with the context restore, while GEN7 requires the proper
      load commands in the command streamer. Non-render is similar for both.
      
      Caveat for GEN7
      The docs say you cannot change the PDEs of a currently running context.
      We never map new PDEs of a running context, and expect them to be
      present - so I think this is okay. (We can unmap, but this should also
      be okay since we only unmap unreferenced objects that the GPU shouldn't
      be tryingto va->pa xlate.) The MI_SET_CONTEXT command does have a flag
      to signal that even if the context is the same, force a reload. It's
      unclear exactly what this does, but I have a hunch it's the right thing
      to do.
      
      The logic assumes that we always emit a context switch after mapping new
      PDEs, and before we submit a batch. This is the case today, and has been
      the case since the inception of hardware contexts. A note in the comment
      let's the user know.
      
      It's not just for gen8. If the current context has mappings change, we
      need a context reload to switch
      
      v2: Rebased after ppgtt clean up patches. Split the warning for aliasing
      and true ppgtt options. And do not break aliasing ppgtt, where to->ppgtt
      is always null.
      
      v3: Invalidate PPGTT TLBs inside alloc_va_range.
      
      v4: Rename ppgtt_invalidate_tlbs to mark_tlbs_dirty and move
      pd_dirty_rings from i915_address_space to i915_hw_ppgtt. Fixes when
      neither ctx->ppgtt and aliasing_ppgtt exist.
      
      v5: Removed references to teardown_va_range.
      
      v6: Updated needs_pd_load_pre/post.
      
      v7: Fix pd_dirty_rings check in needs_pd_load_post, and update/move
      comment about updated PDEs to object_pin/bind (Mika).
      
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
      Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      563222a7
    • B
      drm/i915: Extract context switch skip and add pd load logic · 317b4e90
      Ben Widawsky 提交于
      In Gen8, PDPs are saved and restored with legacy contexts (legacy contexts
      only exist on the render ring). So change the ordering of LRI vs MI_SET_CONTEXT
      for the initialization of the context. Also the only cases in which we
      need to manually update the PDPs are when MI_RESTORE_INHIBIT has been
      set in MI_SET_CONTEXT (i.e. when the context is not yet initialized or
      it is the default context).
      
      Legacy submission is not available post GEN8, so it isn't necessary to
      add extra checks for newer generations.
      
      v2: Use new functions to replace the logic right away (Daniel)
      v3: Add missing pd load logic.
      v4: Add warning in case pd_load_pre & pd_load_post are true, and add
      missing trace_switch_mm. Cleaned up pd_load conditions. Add more
      information about when is pd_load_post needed. (Mika)
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
      Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      317b4e90
  24. 24 2月, 2015 1 次提交
  25. 08 1月, 2015 1 次提交