1. 03 Mar, 2017 2 commits
  2. 02 Mar, 2017 4 commits
  3. 28 Feb, 2017 1 commit
    • C
      drm/i915: Signal first fence from irq handler if complete · 56299fb7
      Chris Wilson committed
      As execlists and other non-semaphore multi-engine devices coordinate
      between engines using interrupts, we can shave off a few tens of
      microseconds of scheduling latency by doing the fence signaling from the
      interrupt as opposed to a RT kthread. (Realistically the delay adds
      about 1% to an individual cross-engine workload.) We only signal the
      first fence in order to limit the amount of work we move into the
      interrupt handler. We also have to remember that our breadcrumbs may be
      unordered with respect to the interrupt and so we still require the
      waiter process to perform some heavyweight coherency fixups, as well as
      traversing the tree of waiters.
      
      v2: No need for early exit in irq handler - it breaks the flow between
      patches and prevents the tracepoint
      v3: Restore rcu hold across irq signaling of request
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170227205850.2828-2-chris@chris-wilson.co.uk
      56299fb7
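A minimal sketch of the idea in the commit message above: the interrupt path looks only at the first waiter and signals it if the hardware seqno has already passed it, leaving every other waiter to the regular waiter process. The names `struct waiter`, `seqno_passed` and `irq_signal_first` are hypothetical and are not taken from the driver; the coherency fixups and the tree of waiters mentioned in the message are deliberately left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical waiter record; the real driver keeps waiters in a tree. */
struct waiter {
	uint32_t seqno;       /* seqno this waiter is waiting for */
	bool signaled;        /* set once the fence has been signaled */
	struct waiter *next;  /* later waiters, never touched from the irq */
};

/* Wraparound-safe "has hw reached target?" comparison on 32-bit seqnos. */
static bool seqno_passed(uint32_t hw, uint32_t target)
{
	return (int32_t)(hw - target) >= 0;
}

/*
 * Called from the (simulated) interrupt handler. Only the first waiter is
 * considered, which bounds the work done in interrupt context.
 * Returns true if the first waiter was signaled by this call.
 */
static bool irq_signal_first(struct waiter *first, uint32_t hw_seqno)
{
	if (first == NULL || first->signaled)
		return false;
	if (!seqno_passed(hw_seqno, first->seqno))
		return false;

	first->signaled = true;
	return true;
}
```

With two waiters queued for seqnos 5 and 6, an interrupt that reports seqno 6 signals only the waiter for 5 in this sketch; the waiter for 6 is picked up later outside the interrupt.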
  4. 23 Feb, 2017 12 commits
  5. 21 Feb, 2017 2 commits
  6. 17 Feb, 2017 1 commit
  7. 16 Feb, 2017 1 commit
  8. 14 Feb, 2017 2 commits
    • T
      drm/i915: Emit to ringbuffer directly · 73dec95e
      Tvrtko Ursulin committed
      This removes the usage of intel_ring_emit in favour of
      directly writing to the ring buffer.
      
      intel_ring_emit was preventing the compiler from optimising
      the fetch and increment of the current ring buffer pointer and
      therefore generating very verbose code for every write.
      
      It had no useful purpose since all ringbuffer operations
      are started and ended with intel_ring_begin and
      intel_ring_advance respectively, with no bail out in the
      middle possible, so it is fine to increment the tail in
      intel_ring_begin and let the code manage the pointer
      itself.
      
      Useless instruction removal amounts to approximately
      two and a half kilobytes of saved text on my build.
      
      Not sure if this has any measurable performance
      implications but executing a ton of useless instructions
      on fast paths cannot be good.
      
      v2:
       * Change return from intel_ring_begin to error pointer by
         popular demand.
       * Move tail increment to intel_ring_advance to enable some
         error checking.
      
      v3:
       * Move tail advance back into intel_ring_begin.
       * Rebase and tidy.
      
      v4:
       * Complete rebase after a few months since v3.
      
      v5:
       * Remove unnecessary cast and fix !debug compile. (Chris Wilson)
      
      v6:
       * Make intel_ring_offset take request as well.
       * Fix recording of request postfix plus a sprinkle of asserts.
         (Chris Wilson)
      
      v7:
       * Use intel_ring_offset to get the postfix. (Chris Wilson)
       * Convert GVT code as well.
      
      v8:
       * Rename *out++ to *cs++.
      
      v9:
       * Fix GVT out to cs conversion in GVT.
      
      v10:
       * Rebase for new intel_ring_begin in selftests.
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Zhi Wang <zhi.a.wang@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170214113242.29241-1-tvrtko.ursulin@linux.intel.com
      73dec95e
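The begin / `*cs++` / advance pattern described in the commit message above can be sketched roughly as follows. This is a standalone illustration, not the driver code: `struct ring`, `ring_begin`, `ring_advance` and `emit_two` are hypothetical names, the ring never wraps, and a NULL return stands in for the error pointer that the message says intel_ring_begin returns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_DWORDS 64u  /* hypothetical ring size in 32-bit words */

struct ring {
	uint32_t buf[RING_DWORDS];
	uint32_t tail;   /* index of the next free dword */
};

/*
 * Reserve num_dwords and return a pointer to the first of them. The tail
 * is advanced here, so the caller only has to manage a local pointer.
 * Returns NULL when the reservation does not fit (no wrapping in this
 * sketch).
 */
static uint32_t *ring_begin(struct ring *ring, uint32_t num_dwords)
{
	uint32_t *cs;

	if (num_dwords > RING_DWORDS - ring->tail)
		return NULL;

	cs = &ring->buf[ring->tail];
	ring->tail += num_dwords;
	return cs;
}

/* Nothing to update; just check the caller wrote exactly what it reserved. */
static void ring_advance(const struct ring *ring, const uint32_t *cs)
{
	assert(cs == &ring->buf[ring->tail]);
}

/* Emit two dwords with plain stores through the local pointer. */
static int emit_two(struct ring *ring, uint32_t a, uint32_t b)
{
	uint32_t *cs = ring_begin(ring, 2);

	if (cs == NULL)
		return -1;

	*cs++ = a;
	*cs++ = b;
	ring_advance(ring, cs);
	return 0;
}
```

Because `cs` is a local variable, the compiler is free to keep it in a register across the stores, which is the saving the commit message is after.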
    • C
      drm/i915: Add selftests for i915_gem_request · c835c550
      Chris Wilson committed
      Simple starting point for adding selftests for i915_gem_request, first
      mock a device (with engines and contexts) that allows us to construct
      and execute a request, along with waiting for the request to complete.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-10-chris@chris-wilson.co.uk
      c835c550
  9. 09 Feb, 2017 1 commit
  10. 16 Jan, 2017 1 commit
  11. 11 Jan, 2017 1 commit
    • C
      drm/i915: Add a sanity check that no request is submitted in the middle · c781c978
      Chris Wilson committed
      It is an error to start a new request on the same timeline (ringbuffer)
      as the current one before the current is submitted. If there are two
      requests emitting to the ringbuffer at the same time, the operation is
      undefined. We can catch this by checking for the timeline having a later
      seqno than ours when we come to submit our request.
      
      Currently we have this check at the end of __i915_add_request, but
      having an early check as well isolates a failure in the caller versus a
      failure in sealing the request (i.e. from inside __i915_add_request
      itself). For example, CI is currently tripping over this late assertion
      on ctg/ilk:
      
      [  100.329399] [IGT] gem_cs_tlb: starting subtest basic-default
      [  100.336333] ------------[ cut here ]------------
      [  100.336341] kernel BUG at drivers/gpu/drm/i915/i915_gem_request.c:908!
      [  100.336347] invalid opcode: 0000 [#1] PREEMPT SMP
      [  100.336351] Modules linked in: snd_hda_intel i915 snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core snd_pcm coretemp mei_me lpc_ich mei e1000e ptp pps_core [last unloaded: i915]
      [  100.336373] CPU: 0 PID: 6308 Comm: gem_cs_tlb Tainted: G     U          4.10.0-rc3-CI-CI_DRM_2045+ #1
      [  100.336380] Hardware name: LENOVO 7465CTO/7465CTO, BIOS 6DET44WW (2.08 ) 04/22/2009
      [  100.336386] task: ffff88012b738040 task.stack: ffffc90000560000
      [  100.336441] RIP: 0010:__i915_add_request+0x4aa/0x510 [i915]
      [  100.336445] RSP: 0018:ffffc90000563ac0 EFLAGS: 00010212
      [  100.336451] RAX: 0000000000005d52 RBX: ffff880133bb84c0 RCX: 0000000000000001
      [  100.336456] RDX: 0000000080000001 RSI: ffff88012b738860 RDI: 00000000ffffffff
      [  100.336461] RBP: ffffc90000563b00 R08: ffff880133bb8780 R09: 0000000000000000
      [  100.336466] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88012f53d950
      [  100.336472] R13: ffff88012a2b0af8 R14: ffff88012a5b0008 R15: ffff88012f53d960
      [  100.336477] FS:  00007f0d19da38c0(0000) GS:ffff88013bc00000(0000) knlGS:0000000000000000
      [  100.336483] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  100.336488] CR2: 00007f0d17706000 CR3: 000000012aa3e000 CR4: 00000000000406f0
      [  100.336496] Call Trace:
      [  100.336527]  i915_gem_switch_to_kernel_context+0x131/0x1b0 [i915]
      [  100.336559]  i915_gem_evict_vm+0x202/0x2b0 [i915]
      [  100.336590]  i915_gem_execbuffer_reserve.isra.9+0x3ae/0x440 [i915]
      [  100.336623]  i915_gem_do_execbuffer.isra.15+0x6d9/0x1b20 [i915]
      [  100.336656]  i915_gem_execbuffer2+0xc0/0x250 [i915]
      [  100.336666]  drm_ioctl+0x200/0x450
      [  100.336697]  ? i915_gem_execbuffer+0x330/0x330 [i915]
      [  100.336708]  do_vfs_ioctl+0x90/0x6e0
      [  100.336716]  ? up_read+0x1a/0x40
      [  100.336723]  ? trace_hardirqs_on_caller+0x122/0x1b0
      [  100.336730]  SyS_ioctl+0x3c/0x70
      [  100.336738]  entry_SYSCALL_64_fastpath+0x1c/0xb1
      [  100.336745] RIP: 0033:0x7f0d187cb357
      [  100.336750] RSP: 002b:00007ffe0b2f7c28 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
      [  100.336761] RAX: ffffffffffffffda RBX: 00007ffe0b2f7d60 RCX: 00007f0d187cb357
      [  100.336768] RDX: 00007ffe0b2f7d00 RSI: 0000000040406469 RDI: 0000000000000003
      [  100.336775] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000022
      [  100.336782] R10: 0000000000000007 R11: 0000000000000246 R12: 0000000000000002
      [  100.336789] R13: 0000000000419101 R14: 00007ffe0b2f7d60 R15: 00007ffe0b2f7d50
      [  100.336797] Code: 5f 74 1e e9 d4 fb ff ff e8 bc 1e 9c e0 e9 ae fb ff ff 4c 89 e7 e8 77 22 fd ff e9 88 fd ff ff 0f 0b e8 a3 1e 9c e0 e9 b1 fb ff ff <0f> 0b 0f 0b e8 fd af ab e0 85 c0 75 c2 48 c7 c2 80 2c 71 a0 be
      [  100.336877] RIP: __i915_add_request+0x4aa/0x510 [i915] RSP: ffffc90000563ac0
      [  100.336886] ---[ end trace 22b36545479e5eb7 ]---
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170111140858.1922-1-chris@chris-wilson.co.uk
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      c781c978
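The check described in the commit message above (the timeline must not hold a later seqno than ours when we come to submit) might look like the sketch below. The types and the names `struct timeline`, `struct request`, `request_may_submit` and `request_submit` are hypothetical; the driver asserts on failure, whereas this sketch returns false so the condition can be observed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical timeline: remembers the last seqno submitted on it. */
struct timeline {
	uint32_t last_submitted_seqno;
};

/* Hypothetical request: its seqno is fixed when construction starts. */
struct request {
	struct timeline *tl;
	uint32_t seqno;
};

/* Wraparound-safe "a is at or after b" on 32-bit seqnos. */
static bool seqno_passed(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) >= 0;
}

/*
 * A request may only be submitted if nothing with an equal or later seqno
 * has been submitted on its timeline in the meantime.
 */
static bool request_may_submit(const struct request *rq)
{
	return !seqno_passed(rq->tl->last_submitted_seqno, rq->seqno);
}

/* Submit the request, or report that another request jumped ahead. */
static bool request_submit(struct request *rq)
{
	if (!request_may_submit(rq))
		return false;

	rq->tl->last_submitted_seqno = rq->seqno;
	return true;
}
```

If a request with seqno 12 is started and submitted while the request with seqno 11 is still being constructed, the later attempt to submit 11 fails the check. Calling `request_may_submit` early as well as at submission is what lets a failure be attributed to the caller rather than to the sealing step.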
  12. 19 Dec, 2016 2 commits
    • C
      drm/i915: Swap if(enable_execlists) in i915_gem_request_alloc for a vfunc · f73e7399
      Chris Wilson committed
      A fairly trivial move of a matching pair of routines (for preparing a
      request for construction) onto an engine vfunc. The ulterior motive is
      to be able to create a mock request implementation.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20161218153724.8439-7-chris@chris-wilson.co.uk
      f73e7399
    • C
      drm/i915: Unify active context tracking between legacy/execlists/guc · e8a9c58f
      Chris Wilson committed
      The requests conversion introduced a nasty bug where we could generate a
      new request in the middle of constructing a request if we needed to idle
      the system in order to evict space for a context. The request to idle
      would be executed (and waited upon) before the current one, creating a
      minor havoc in the seqno accounting, as we will consider the current
      request to already be completed (prior to deferred seqno assignment) but
      ring->last_retired_head would have been updated and still could allow
      us to overwrite the current request before execution.
      
      We also employed two different mechanisms to track the active context
      until it was switched out. The legacy method allowed for waiting upon an
      active context (it could forcibly evict any vma, including context's),
      but the execlists method took a step backwards by pinning the vma for
      the entire active lifespan of the context (the only way to evict was to
      idle the entire GPU, not individual contexts). However, to circumvent
      the tricky issue of locking (i.e. we cannot take struct_mutex at the
      time of i915_gem_request_submit(), where we would want to move the
      previous context onto the active tracker and unpin it), we take the
      execlists approach and keep the contexts pinned until retirement.
      The benefit of the execlists approach, more important for execlists than
      legacy, was the reduction in work in pinning the context for each
      request - as the context was kept pinned until idle, it could short
      circuit the pinning for all active contexts.
      
      We introduce new engine vfuncs to pin and unpin the context
      respectively. The context is pinned at the start of the request, and
      only unpinned when the following request is retired (this ensures that
      the context is idle and coherent in main memory before we unpin it). We
      move the engine->last_context tracking into the retirement itself
      (rather than during request submission) in order to allow the submission
      to be reordered or unwound without undue difficulty.
      
      And finally an ulterior motive for unifying context handling was to
      prepare for mock requests.
      
      v2: Rename to last_retired_context, split out legacy_context tracking
      for MI_SET_CONTEXT.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20161218153724.8439-3-chris@chris-wilson.co.uk
      e8a9c58f
  13. 05 Dec, 2016 1 commit
  14. 25 Nov, 2016 5 commits
  15. 21 Nov, 2016 2 commits
  16. 19 Nov, 2016 1 commit
    • C
      drm/i915: Check that each request phase is completed before retiring · 786d290c
      Chris Wilson committed
      Trying to chase an impossible bug (ivb):
      
      [  207.765411] [drm:i915_reset_and_wakeup [i915]] resetting chip
      [  207.765734] [drm:i915_gem_reset [i915]] resetting render ring to restart from tail of request 0x4ee834
      [  207.765791] [drm:intel_print_rc6_info [i915]] Enabling RC6 states: RC6 on RC6p on RC6pp off
      [  207.767213] [drm:intel_guc_setup [i915]] GuC fw status: path (null), fetch NONE, load NONE
      [  207.767515] kernel BUG at drivers/gpu/drm/i915/i915_gem_request.c:203!
      [  207.767551] invalid opcode: 0000 [#1] PREEMPT SMP
      [  207.767576] Modules linked in: snd_hda_intel i915 cdc_ncm usbnet mii x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel lpc_ich snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core mei_me mei snd_pcm sdhci_pci sdhci mmc_core e1000e ptp pps_core [last unloaded: i915]
      [  207.767808] CPU: 3 PID: 8855 Comm: gem_ringfill Tainted: G     U          4.9.0-rc5-CI-Patchwork_3052+ #1
      [  207.767854] Hardware name: LENOVO 2356GCG/2356GCG, BIOS G7ET31WW (1.13 ) 07/02/2012
      [  207.767894] task: ffff88012c82a740 task.stack: ffffc9000383c000
      [  207.767927] RIP: 0010:[<ffffffffa00a0a3a>]  [<ffffffffa00a0a3a>] i915_gem_request_retire+0x2a/0x4b0 [i915]
      [  207.767999] RSP: 0018:ffffc9000383fb20  EFLAGS: 00010293
      [  207.768027] RAX: 00000000004ee83c RBX: ffff880135dcb480 RCX: 00000000004ee83a
      [  207.768062] RDX: ffff88012fea42a8 RSI: 0000000000000001 RDI: ffff88012c82af68
      [  207.768095] RBP: ffffc9000383fb48 R08: 0000000000000000 R09: 0000000000000000
      [  207.768129] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880135dcb480
      [  207.768163] R13: ffff88012fea42a8 R14: 0000000000000000 R15: 00000000000001d8
      [  207.768200] FS:  00007f955f658740(0000) GS:ffff88013e2c0000(0000) knlGS:0000000000000000
      [  207.768239] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  207.768258] CR2: 0000555899725930 CR3: 00000001316f6000 CR4: 00000000001406e0
      [  207.768286] Stack:
      [  207.768299]  ffff880135dcb480 ffff880135dcbe00 ffff88012fea42a8 0000000000000000
      [  207.768350]  00000000000001d8 ffffc9000383fb70 ffffffffa00a1339 0000000000000000
      [  207.768402]  ffff88012f296c88 00000000000003f0 ffffc9000383fbb0 ffffffffa00b582d
      [  207.768453] Call Trace:
      [  207.768493]  [<ffffffffa00a1339>] i915_gem_request_retire_upto+0x49/0x90 [i915]
      [  207.768553]  [<ffffffffa00b582d>] intel_ring_begin+0x15d/0x2d0 [i915]
      [  207.768608]  [<ffffffffa00b59cb>] intel_ring_alloc_request_extras+0x2b/0x40 [i915]
      [  207.768667]  [<ffffffffa00a2fd9>] i915_gem_request_alloc+0x359/0x440 [i915]
      [  207.768723]  [<ffffffffa008bd03>] i915_gem_do_execbuffer.isra.15+0x783/0x1a10 [i915]
      [  207.768766]  [<ffffffff811a6a2e>] ? __might_fault+0x3e/0x90
      [  207.768816]  [<ffffffffa008d380>] i915_gem_execbuffer2+0xc0/0x250 [i915]
      [  207.768854]  [<ffffffff815532a6>] drm_ioctl+0x1f6/0x480
      [  207.768900]  [<ffffffffa008d2c0>] ? i915_gem_execbuffer+0x330/0x330 [i915]
      [  207.768939]  [<ffffffff81202f6e>] do_vfs_ioctl+0x8e/0x690
      [  207.768972]  [<ffffffff818193ac>] ? retint_kernel+0x2d/0x2d
      [  207.769004]  [<ffffffff810d6ef2>] ? trace_hardirqs_on_caller+0x122/0x1b0
      [  207.769039]  [<ffffffff812035ac>] SyS_ioctl+0x3c/0x70
      [  207.769068]  [<ffffffff818189ae>] entry_SYSCALL_64_fastpath+0x1c/0xb1
      [  207.769103] Code: 90 55 48 89 e5 41 57 41 56 41 55 41 54 49 89 fc 53 8b 35 fa 7b e1 e1 85 f6 0f 85 55 03 00 00 41 8b 84 24 80 02 00 00 85 c0 75 02 <0f> 0b 49 8b 94 24 a8 00 00 00 48 8b 8a e0 01 00 00 8b 89 c0 00
      [  207.769400] RIP  [<ffffffffa00a0a3a>] i915_gem_request_retire+0x2a/0x4b0 [i915]
      [  207.769463]  RSP <ffffc9000383fb20>
      
      Let's add a couple more BUG_ONs before this to ascertain that the request
      did make it to hardware. The impossible part of this stacktrace is that
      the request must have been considered completed by i915_request_wait()
      before we tried to retire it.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20161118143412.26508-1-chris@chris-wilson.co.uk
      Reviewed-by: Matthew Auld <matthew.auld@intel.com>
      786d290c
  17. 18 Nov, 2016 1 commit
    • C
      drm/i915: Be more careful to drop the GT wakeref · 4302055b
      Chris Wilson committed
      Since we can retire requests from multiple paths, we cannot assume that
      i915_gem_retire_requests() is the sole path on which we can transition
      to gt.active_requests == 0. A consequence of this is that we would skip
      the function if we had already retired all the requests and not
      scheduled the idle worker.
      
      This is fallout from changing the routine from considering active_engines
      (for which it was the only consumer) to active_requests.
      
      v2: Move kicking the idle worker to i915_gem_request_retire() otherwise
      we could postpone the idle callback every time we called retire_requests
      even though we did no work.
      v3: We only need to move the idle work kicking!
      v4: Drop the BUG_ON(!awake) as we may be called from the shrinker in the
      middle of constructing a request before we have marked the device awake.
      v5: Add a BUG_ON() for active_requests underflow upon retirement (Joonas)
      
      Fixes: 28176ef4 ("drm/i915: Reserve space in the global seqno during request allocation")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20161115164620.17185-1-chris@chris-wilson.co.uk
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      4302055b
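The accounting this commit message describes can be illustrated with a small standalone sketch: the idle work is kicked from the per-request retire step, on whichever path drops the count to zero, and an assertion guards against underflow (the v5 note). `struct gt_state`, `gt_request_add` and `gt_request_retire` are hypothetical names, and a boolean flag stands in for scheduling the idle worker.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the GT bookkeeping discussed above. */
struct gt_state {
	unsigned int active_requests;
	bool awake;             /* device marked awake (wakeref held) */
	bool idle_work_queued;  /* stands in for kicking the idle worker */
};

/* The first active request wakes the device and cancels pending idling. */
static void gt_request_add(struct gt_state *gt)
{
	if (gt->active_requests++ == 0) {
		gt->awake = true;
		gt->idle_work_queued = false;
	}
}

/*
 * Called for every retired request, from any retirement path. Whichever
 * call makes the 1 -> 0 transition queues the idle work, so no path can
 * leave the wakeref held by skipping a separate "retire all" routine.
 */
static void gt_request_retire(struct gt_state *gt)
{
	assert(gt->active_requests > 0);  /* underflow check, as in v5 */

	if (--gt->active_requests == 0)
		gt->idle_work_queued = true;
}
```

There is intentionally no assertion on `awake` in the retire step, mirroring the v4 note that retirement may run before the device has been marked awake.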