1. 15 Aug, 2017 1 commit
  2. 27 Jul, 2017 9 commits
  3. 21 Jul, 2017 1 commit
  4. 13 Jul, 2017 1 commit
    • M
      drm/i915: use __GFP_RETRY_MAYFAIL · dbb32956
      Michal Hocko committed
      Commit 24f8e00a ("drm/i915: Prefer to report ENOMEM rather than
      incur the oom for gfx allocations") has tried to remove disruptive OOM
      killer because the userspace should be able to cope with allocation
      failures.
      
      At the time only __GFP_NORETRY could achieve that and it turned out that
      this would fail the allocations just too easily.  So "drm/i915: Remove
      __GFP_NORETRY from our buffer allocator" removed it and hoped for a
      better solution.  __GFP_RETRY_MAYFAIL is that solution.  It will keep
      retrying the allocation until there is no more progress and we would go
      OOM.  Instead we fail the allocation and let the caller deal with it.
      
      Link: http://lkml.kernel.org/r/20170623085345.11304-6-mhocko@kernel.org
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Alex Belits <alex.belits@cavium.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Darrick J. Wong <darrick.wong@oracle.com>
      Cc: David Daney <david.daney@cavium.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: NeilBrown <neilb@suse.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      dbb32956
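The difference between the three behaviours described in this commit can be illustrated with a toy userspace model. This is only a sketch: the `policy` and `outcome` enums, `model_alloc()`, and the idea of a fixed list of reclaim passes are assumptions made for illustration, not the kernel page allocator.

```c
#include <stddef.h>

/* Hypothetical policies, named after the gfp flags discussed above. */
enum policy { POLICY_NORETRY, POLICY_DEFAULT, POLICY_RETRY_MAYFAIL };

enum outcome { ALLOC_OK, ALLOC_FAILED, ALLOC_OOM_KILL };

/*
 * Toy model: 'free_pages' pages are free, and reclaim pass i frees
 * reclaim[i] more pages. A pass that frees nothing means no progress.
 */
static enum outcome model_alloc(enum policy p, size_t want, size_t free_pages,
                                const size_t *reclaim, size_t passes)
{
    size_t i;

    if (free_pages >= want)
        return ALLOC_OK;

    for (i = 0; i < passes; i++) {
        free_pages += reclaim[i];
        if (free_pages >= want)
            return ALLOC_OK;
        if (p == POLICY_NORETRY)
            return ALLOC_FAILED;   /* gives up after a single pass */
        if (reclaim[i] == 0)
            break;                 /* reclaim stopped making progress */
    }

    /* Out of options: only the default policy falls back to the OOM killer. */
    return p == POLICY_DEFAULT ? ALLOC_OOM_KILL : ALLOC_FAILED;
}
```

In this model the no-retry policy fails as soon as one pass is insufficient, while the retry-may-fail policy keeps going for as long as reclaim makes progress and still returns a failure to the caller instead of triggering an OOM kill.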
  5. 28 Jun, 2017 2 commits
  6. 23 Jun, 2017 1 commit
    • C
      drm/i915: Break modeset deadlocks on reset · 36703e79
      Chris Wilson committed
      Trying to do a modeset from within a reset is fraught with danger. We
      can fall into a cyclic deadlock where the modeset is waiting on a
      previous modeset that is waiting on a request, and since the GPU hung
      that request completion is waiting on the reset. As modesetting doesn't
      allow its locks to be broken and restarted, or for its *own* reset
      mechanism to take over the display, we have to do something very
      evil instead. If we detect that we are stuck waiting to prepare the
      display reset (by using a very simple timeout), resort to cancelling all
      in-flight requests and throwing the user data into /dev/null, which is
      marginally better than the driver locking up and keeping that data to
      itself.
      
      This is not a fix; this is just a workaround that unbreaks machines
      until we can resolve the deadlock in a way that doesn't lose data!
      
      v2: Move the retirement from set-wedged to the i915_reset() error path,
      after which we no longer need any delayed worker cleanup for
      i915_handle_error()
      v3: C abuse for syntactic sugar
      v4: Cover all waits with the timeout to catch more driver breakage
      
      References: https://bugs.freedesktop.org/show_bug.cgi?id=99093
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Cc: Mika Kuoppala <mika.kuoppala@intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170622105625.16952-1-chris@chris-wilson.co.uk
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      36703e79
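The timeout-then-cancel workaround described in this commit can be modelled in a few lines. This is a minimal sketch under stated assumptions: `struct toy_gpu`, `toy_set_wedged()`, `toy_prepare_reset()`, and the tick-based timeout are hypothetical stand-ins, not the i915 reset code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the driver state; not the i915 structures. */
struct toy_gpu {
    int inflight;          /* requests still queued on the hung GPU */
    int modeset_ready_at;  /* tick at which display prepare finishes, <0 = never */
    bool wedged;           /* set once in-flight requests were cancelled */
};

/* Cancel everything in flight so waiters on those requests are released. */
static void toy_set_wedged(struct toy_gpu *gpu)
{
    gpu->inflight = 0;
    gpu->wedged = true;
}

/*
 * Wait up to 'timeout' ticks for the display to be prepared for reset.
 * Returns true if user data was preserved, false if it had to be discarded.
 */
static bool toy_prepare_reset(struct toy_gpu *gpu, int timeout)
{
    int tick;

    for (tick = 0; tick < timeout; tick++) {
        if (gpu->modeset_ready_at >= 0 && tick >= gpu->modeset_ready_at)
            return true;   /* no deadlock: proceed with a normal reset */
    }

    /* Stuck: break the cycle by cancelling the requests being waited on. */
    toy_set_wedged(gpu);
    return false;
}
```

The trade-off matches the commit message: the fallback path loses the queued work, which is only marginally better than a permanent lockup.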
  7. 21 Jun, 2017 4 commits
  8. 19 Jun, 2017 2 commits
    • C
      drm/i915: Remove __GFP_NORETRY from our buffer allocator · ce2c5872
      Chris Wilson committed
      I tried __GFP_NORETRY in the belief that __GFP_RECLAIM was effective. It
      struggles with handling reclaim of our dirty buffers and relies on
      reclaim via kswapd. As a result, a single pass of direct reclaim is
      unreliable when i915 occupies the majority of available memory, and the
      only means of effectively waiting on kswapd to make progress is by not
      setting the __GFP_NORETRY flag and looping. That leaves us with the
      dilemma of invoking the oomkiller instead of propagating the allocation
      failure back to userspace where it can be handled more gracefully (one
      hopes).  In the future we may have __GFP_MAYFAIL to allow repeats up until
      we genuinely run out of memory and the oomkiller would have been invoked.
      Until then, let the oomkiller wreak havoc.
      
      v2: Stop playing with side-effects of gfp flags and await __GFP_MAYFAIL
      v3: Update comments that direct reclaim only appears to be ignoring our
      dirty buffers!
      
      Fixes: 24f8e00a ("drm/i915: Prefer to report ENOMEM rather than incur the oom for gfx allocations")
      Testcase: igt/gem_tiled_swapping
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Michal Hocko <mhocko@suse.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170609110350.1767-2-chris@chris-wilson.co.uk
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      (cherry picked from commit eaf41801)
      Signed-off-by: Jani Nikula <jani.nikula@intel.com>
      ce2c5872
    • C
      drm/i915: Encourage our shrinker more when our shmemfs allocations fails · b8d5a9cc
      Chris Wilson committed
      Commit 24f8e00a ("drm/i915: Prefer to report ENOMEM rather than
      incur the oom for gfx allocations") made the bold decision to try and
      avoid the oomkiller by reporting -ENOMEM to userspace if our allocation
      failed after attempting to free enough buffer objects. In short, it
      appears we were giving up too easily (even before we start wondering if
      one pass of reclaim is as strong as we would like). Part of the problem
      is that if we only shrink just enough pages for our expected allocation,
      the likelihood of those pages becoming available to us is less than 100%.
      To counteract that we ask for twice the number of pages to be made
      available. Furthermore, we allow the shrinker to pull pages from the
      active list in later passes.
      
      v2: Be a little more cautious in paging out gfx buffers, and leave that
      to a more balanced approach from shrink_slab(). Important when combined
      with "drm/i915: Start writeback from the shrinker" as anything shrunk is
      immediately swapped out and so should be more conservative.
      
      Fixes: 24f8e00a ("drm/i915: Prefer to report ENOMEM rather than incur the oom for gfx allocations")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170609110350.1767-1-chris@chris-wilson.co.uk
      (cherry picked from commit 4846bf0c)
      Signed-off-by: Jani Nikula <jani.nikula@intel.com>
      b8d5a9cc
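The "ask for twice as much, and dig deeper on later passes" idea from this commit can be sketched as follows. The two pools, `take()`, and `toy_shrink_for_alloc()` are hypothetical names for illustration and do not reflect the real i915 shrinker interface.

```c
#include <stddef.h>

/* Hypothetical pools of reclaimable pages; not the i915 shrinker API. */
struct toy_pools {
    size_t inactive;   /* idle buffer pages, cheap to reclaim */
    size_t active;     /* pages still in use by the GPU, costlier to reclaim */
};

/* Remove up to 'want' pages from a pool and report how many were taken. */
static size_t take(size_t *pool, size_t want)
{
    size_t got = *pool < want ? *pool : want;

    *pool -= got;
    return got;
}

/*
 * Try to free pages for an allocation of 'need' pages. The target is
 * doubled because not every freed page ends up available to the caller.
 * Pass 0 only touches the inactive pool; later passes may also use the
 * active pool.
 */
static size_t toy_shrink_for_alloc(struct toy_pools *p, size_t need, int pass)
{
    size_t target = 2 * need;
    size_t freed = take(&p->inactive, target);

    if (pass > 0 && freed < target)
        freed += take(&p->active, target - freed);

    return freed;
}
```

A caller would typically retry its allocation after each pass and increment `pass` only if the allocation still fails.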
  9. 16 Jun, 2017 5 commits
    • C
      drm/i915: Async GPU relocation processing · 7dd4f672
      Chris Wilson committed
      If the user requires patching of their batch or auxiliary buffers, we
      currently make the alterations on the cpu. If they are active on the GPU
      at the time, we wait under the struct_mutex for them to finish executing
      before we rewrite the contents. This happens if shared relocation trees
      are used between different contexts with separate address spaces (and the
      buffers then have different addresses in each), as the 3D state will need
      to be adjusted between execution on each context. However, we don't need
      to use the CPU to do the relocation patching, as we could queue commands
      to the GPU to perform it and use fences to serialise the operation with
      the current activity and future - so the operation on the GPU appears
      just as atomic as performing it immediately. Performing the relocation
      rewrites on the GPU is not free: in terms of pure throughput, the number
      of relocations/s is about halved. More importantly, so is the time
      under the struct_mutex.
      
      v2: Break out the request/batch allocation for clearer error flow.
      v3: A few asserts to ensure rq ordering is maintained
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      7dd4f672
    • C
      drm/i915: Wait upon userptr get-user-pages within execbuffer · 8a2421bd
      Chris Wilson committed
      This simply hides the EAGAIN caused by userptr when userspace causes
      resource contention. However, it is quite beneficial with highly
      contended userptr users as we avoid repeating the setup costs and
      kernel-user context switches.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
      8a2421bd
    • C
      drm/i915: Store a direct lookup from object handle to vma · 4ff4b44c
      Chris Wilson committed
      The advent of full-ppgtt led to an extra indirection between the object
      and its binding. That extra indirection has a noticeable impact on how
      fast we can convert from the user handles to our internal vma for
      execbuffer. In order to bypass the extra indirection, we use a
      resizable hashtable to jump from the object to the per-ctx vma.
      rhashtable was considered but we don't need the online resizing feature
      and the extra complexity proved to undermine its usefulness. Instead, we
      simply reallocate the hashtable on demand in a background task and
      serialize it before iterating.
      
      In non-full-ppgtt modes, multiple files and multiple contexts can share
      the same vma. This leads to having multiple possible handle->vma links,
      so we only use the first to establish the fast path. The majority of
      buffers are not shared and so we should still be able to realise
      speedups with multiple clients.
      
      v2: Prettier names, more magic.
      v3: Many style tweaks, most notably hiding the misuse of execobj[].rsvd2
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      4ff4b44c
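A direct handle-to-vma lookup with on-demand reallocation can be sketched as a small open-addressing hash table. Everything here is an assumption for illustration: the `toy_lut` names, the multiplicative hash, the 3/4 load threshold, and growing synchronously inside the insert (the commit describes reallocating in a background task instead).

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical handle->vma lookup table; names are illustrative only. */
struct toy_lut {
    uint32_t *handles;   /* 0 marks an empty slot, so valid handles start at 1 */
    void **vmas;
    size_t size;         /* always a power of two */
    size_t count;
};

static int toy_lut_init(struct toy_lut *lut, size_t size)
{
    lut->handles = calloc(size, sizeof(*lut->handles));
    lut->vmas = calloc(size, sizeof(*lut->vmas));
    lut->size = size;
    lut->count = 0;
    if (!lut->handles || !lut->vmas) {
        free(lut->handles);
        free(lut->vmas);
        return -1;
    }
    return 0;
}

/* Linear probing: stop at the handle itself or at the first empty slot. */
static size_t toy_slot(const struct toy_lut *lut, uint32_t handle)
{
    size_t i = (handle * 2654435761u) & (lut->size - 1);

    while (lut->handles[i] && lut->handles[i] != handle)
        i = (i + 1) & (lut->size - 1);
    return i;
}

/* Returns the stored vma pointer, or NULL if the handle is unknown. */
static void *toy_lut_lookup(const struct toy_lut *lut, uint32_t handle)
{
    return lut->vmas[toy_slot(lut, handle)];
}

static int toy_lut_insert(struct toy_lut *lut, uint32_t handle, void *vma)
{
    size_t i;

    if (4 * (lut->count + 1) > 3 * lut->size) {
        /* Grow on demand: rebuild into a table twice the size. */
        struct toy_lut bigger;

        if (toy_lut_init(&bigger, 2 * lut->size))
            return -1;
        for (i = 0; i < lut->size; i++)
            if (lut->handles[i])
                toy_lut_insert(&bigger, lut->handles[i], lut->vmas[i]);
        free(lut->handles);
        free(lut->vmas);
        *lut = bigger;
    }

    i = toy_slot(lut, handle);
    if (!lut->handles[i])
        lut->count++;
    lut->handles[i] = handle;
    lut->vmas[i] = vma;
    return 0;
}
```

Keeping the load below 3/4 guarantees that probing always reaches an empty slot, so a lookup for a missing handle terminates and returns NULL.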
    • C
      drm/i915: Store i915_gem_object_is_coherent() as a bit next to cache-dirty · 7fc92e96
      Chris Wilson committed
      For ease of use (i.e. avoiding a few checks and function calls), store
      the object's cache coherency next to the cache-dirty bit.
      
      Specifically this patch aims to reduce the frequency of no-op calls to
      i915_gem_object_clflush() to counter-act the increase of such calls for
      GPU only objects in the previous patch.
      
      v2: Replace cache_dirty & ~cache_coherent with cache_dirty &&
      !cache_coherent as gcc generates much better code for the latter
      (Tvrtko)
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Dongwon Kim <dongwon.kim@intel.com>
      Cc: Matt Roper <matthew.d.roper@intel.com>
      Tested-by: Dongwon Kim <dongwon.kim@intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170616105455.16977-1-chris@chris-wilson.co.uk
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      7fc92e96
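Storing the two flags as adjacent bits makes the flush decision a single boolean expression. The sketch below assumes a hypothetical `struct toy_obj` and helper name; only the `cache_dirty && !cache_coherent` form is taken from the commit message.

```c
#include <stdbool.h>

/* Hypothetical object with the two flags stored side by side. */
struct toy_obj {
    unsigned int cache_coherent:1;  /* CPU cache is coherent for this object */
    unsigned int cache_dirty:1;     /* CPU cache may hold unflushed writes */
};

/* A flush is only useful when the cache is dirty and not coherent. */
static bool toy_needs_clflush(const struct toy_obj *obj)
{
    return obj->cache_dirty && !obj->cache_coherent;
}
```

Checking this before calling a flush routine avoids the no-op calls the commit aims to reduce.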
    • C
      drm/i915: Mark CPU cache as dirty on every transition for CPU writes · e27ab73d
      Chris Wilson committed
      Currently, we only mark the CPU cache as dirty if we skip a clflush.
      This leads to some confusion where we have to ask if the object is in
      the write domain or missed a clflush. If we always mark the cache as
      dirty, this becomes a much simpler question to answer.
      
      The goal remains to do as few clflushes as required and to do them as
      late as possible, in the hope of deferring the work to a kthread and not
      block the caller (e.g. execbuf, flips).
      
      v2: Always call clflush before GPU execution when the cache_dirty flag
      is set. This may cause some extra work on llc systems that migrate dirty
      buffers back and forth - but we do try to limit that by only setting
      cache_dirty at the end of the gpu sequence.
      
      v3: Always mark the cache as dirty upon a level change, as we need to
      invalidate any stale cachelines due to external writes.
      Reported-by: Dongwon Kim <dongwon.kim@intel.com>
      Fixes: a6a7cc4b ("drm/i915: Always flush the dirty CPU cache when pinning the scanout")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Dongwon Kim <dongwon.kim@intel.com>
      Cc: Matt Roper <matthew.d.roper@intel.com>
      Tested-by: Dongwon Kim <dongwon.kim@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170615123850.26843-1-chris@chris-wilson.co.uk
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      e27ab73d
  10. 14 Jun, 2017 3 commits
  11. 07 Jun, 2017 1 commit
    • C
      drm/i915: Short-circuit i915_gem_wait_for_idle() if already idle · e0da1963
      Chris Wilson committed
      If the device is asleep (no GT wakeref), we know the GPU is already idle.
      If we add an early return, we can avoid touching registers and checking
      hw state outside of the assumed GT wakelock. This prevents causing such
      errors whilst debugging:
      
      [ 2613.401647] RPM wakelock ref not held during HW access
      [ 2613.401684] ------------[ cut here ]------------
      [ 2613.401720] WARNING: CPU: 5 PID: 7739 at drivers/gpu/drm/i915/intel_drv.h:1787 gen6_read32+0x21f/0x2b0 [i915]
      [ 2613.401731] Modules linked in: snd_hda_intel i915 vgem snd_hda_codec_hdmi x86_pkg_temp_thermal intel_powerclamp snd_hda_codec_realtek coretemp snd_hda_codec_generic crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm r8169 mii mei_me lpc_ich mei prime_numbers [last unloaded: i915]
      [ 2613.401823] CPU: 5 PID: 7739 Comm: drv_missed_irq Tainted: G     U          4.12.0-rc2-CI-CI_DRM_421+ #1
      [ 2613.401825] Hardware name: MSI MS-7924/Z97M-G43(MS-7924), BIOS V1.12 02/15/2016
      [ 2613.401840] task: ffff880409e3a740 task.stack: ffffc900084dc000
      [ 2613.401861] RIP: 0010:gen6_read32+0x21f/0x2b0 [i915]
      [ 2613.401863] RSP: 0018:ffffc900084dfce8 EFLAGS: 00010292
      [ 2613.401869] RAX: 000000000000002a RBX: ffff8804016a8000 RCX: 0000000000000006
      [ 2613.401871] RDX: 0000000000000006 RSI: ffffffff81cbf2d9 RDI: ffffffff81c9e3a7
      [ 2613.401874] RBP: ffffc900084dfd18 R08: ffff880409e3afc8 R09: 0000000000000000
      [ 2613.401877] R10: 000000008a1c483f R11: 0000000000000000 R12: 000000000000209c
      [ 2613.401879] R13: 0000000000000001 R14: ffff8804016a8000 R15: ffff8804016ac150
      [ 2613.401882] FS:  00007f39ef3dd8c0(0000) GS:ffff88041fb40000(0000) knlGS:0000000000000000
      [ 2613.401885] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 2613.401887] CR2: 00000000023717c8 CR3: 00000002e7b34000 CR4: 00000000001406e0
      [ 2613.401889] Call Trace:
      [ 2613.401912]  intel_engine_is_idle+0x76/0x90 [i915]
      [ 2613.401931]  i915_gem_wait_for_idle+0xe6/0x1e0 [i915]
      [ 2613.401951]  fault_irq_set+0x40/0x90 [i915]
      [ 2613.401970]  i915_ring_test_irq_set+0x42/0x50 [i915]
      [ 2613.401976]  simple_attr_write+0xc7/0xe0
      [ 2613.401981]  full_proxy_write+0x4f/0x70
      [ 2613.401987]  __vfs_write+0x23/0x120
      [ 2613.401992]  ? rcu_read_lock_sched_held+0x75/0x80
      [ 2613.401996]  ? rcu_sync_lockdep_assert+0x2a/0x50
      [ 2613.401999]  ? __sb_start_write+0xfa/0x1f0
      [ 2613.402004]  vfs_write+0xc5/0x1d0
      [ 2613.402008]  ? trace_hardirqs_on_caller+0xe7/0x1c0
      [ 2613.402013]  SyS_write+0x44/0xb0
      [ 2613.402020]  entry_SYSCALL_64_fastpath+0x1c/0xb1
      [ 2613.402022] RIP: 0033:0x7f39eded6670
      [ 2613.402025] RSP: 002b:00007fffdcdcb1a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      [ 2613.402030] RAX: ffffffffffffffda RBX: ffffffff81470203 RCX: 00007f39eded6670
      [ 2613.402033] RDX: 0000000000000001 RSI: 000000000041bc33 RDI: 0000000000000006
      [ 2613.402036] RBP: ffffc900084dff88 R08: 00007f39ef3dd8c0 R09: 0000000000000001
      [ 2613.402038] R10: 0000000000000000 R11: 0000000000000246 R12: 000000000041bc33
      [ 2613.402041] R13: 0000000000000006 R14: 0000000000000000 R15: 0000000000000000
      [ 2613.402046]  ? __this_cpu_preempt_check+0x13/0x20
      [ 2613.402052] Code: 01 9b fa e0 0f ff e9 28 fe ff ff 80 3d 6a dd 0e 00 00 0f 85 29 fe ff ff 48 c7 c7 48 19 29 a0 c6 05 56 dd 0e 00 01 e8 da 9a fa e0 <0f> ff e9 0f fe ff ff b9 01 00 00 00 ba 01 00 00 00 44 89 e6 48
      [ 2613.402199] ---[ end trace 31f0cfa93ab632bf ]---
      
      Fixes: 25112b64 ("drm/i915: Wait for all engines to be idle as part of i915_gem_wait_for_idle()")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170530121334.17364-1-chris@chris-wilson.co.uk
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      (cherry picked from commit 863e9fde)
      Signed-off-by: Jani Nikula <jani.nikula@intel.com>
      e0da1963
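The early return described in this commit can be modelled with a counter standing in for register reads. `struct toy_dev`, `toy_engine_is_idle()`, and `toy_wait_for_idle()` are hypothetical names for illustration; the real function operates on the driver's own state and wakeref tracking.

```c
#include <stdbool.h>

/* Hypothetical device state; 'reg_reads' counts simulated register accesses. */
struct toy_dev {
    bool gt_awake;     /* true while a GT wakeref is held */
    int busy_engines;
    int reg_reads;
};

static bool toy_engine_is_idle(struct toy_dev *dev)
{
    dev->reg_reads++;              /* stands in for a hardware register read */
    return dev->busy_engines == 0;
}

/* Returns 0 when idle, -1 if still busy after 'max_polls' checks. */
static int toy_wait_for_idle(struct toy_dev *dev, int max_polls)
{
    /* Asleep means already idle: return before touching any registers. */
    if (!dev->gt_awake)
        return 0;

    while (max_polls-- > 0) {
        if (toy_engine_is_idle(dev))
            return 0;
    }
    return -1;
}
```

With the device asleep, the function returns without incrementing `reg_reads`, which is the property that avoids the "RPM wakelock ref not held" warning shown in the trace.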
  12. 02 Jun, 2017 1 commit
  13. 31 May, 2017 1 commit
    • C
      drm/i915: Short-circuit i915_gem_wait_for_idle() if already idle · 863e9fde
      Chris Wilson committed
      If the device is asleep (no GT wakeref), we know the GPU is already idle.
      If we add an early return, we can avoid touching registers and checking
      hw state outside of the assumed GT wakelock. This prevents causing such
      errors whilst debugging:
      
      [ 2613.401647] RPM wakelock ref not held during HW access
      [ 2613.401684] ------------[ cut here ]------------
      [ 2613.401720] WARNING: CPU: 5 PID: 7739 at drivers/gpu/drm/i915/intel_drv.h:1787 gen6_read32+0x21f/0x2b0 [i915]
      [ 2613.401731] Modules linked in: snd_hda_intel i915 vgem snd_hda_codec_hdmi x86_pkg_temp_thermal intel_powerclamp snd_hda_codec_realtek coretemp snd_hda_codec_generic crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm r8169 mii mei_me lpc_ich mei prime_numbers [last unloaded: i915]
      [ 2613.401823] CPU: 5 PID: 7739 Comm: drv_missed_irq Tainted: G     U          4.12.0-rc2-CI-CI_DRM_421+ #1
      [ 2613.401825] Hardware name: MSI MS-7924/Z97M-G43(MS-7924), BIOS V1.12 02/15/2016
      [ 2613.401840] task: ffff880409e3a740 task.stack: ffffc900084dc000
      [ 2613.401861] RIP: 0010:gen6_read32+0x21f/0x2b0 [i915]
      [ 2613.401863] RSP: 0018:ffffc900084dfce8 EFLAGS: 00010292
      [ 2613.401869] RAX: 000000000000002a RBX: ffff8804016a8000 RCX: 0000000000000006
      [ 2613.401871] RDX: 0000000000000006 RSI: ffffffff81cbf2d9 RDI: ffffffff81c9e3a7
      [ 2613.401874] RBP: ffffc900084dfd18 R08: ffff880409e3afc8 R09: 0000000000000000
      [ 2613.401877] R10: 000000008a1c483f R11: 0000000000000000 R12: 000000000000209c
      [ 2613.401879] R13: 0000000000000001 R14: ffff8804016a8000 R15: ffff8804016ac150
      [ 2613.401882] FS:  00007f39ef3dd8c0(0000) GS:ffff88041fb40000(0000) knlGS:0000000000000000
      [ 2613.401885] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 2613.401887] CR2: 00000000023717c8 CR3: 00000002e7b34000 CR4: 00000000001406e0
      [ 2613.401889] Call Trace:
      [ 2613.401912]  intel_engine_is_idle+0x76/0x90 [i915]
      [ 2613.401931]  i915_gem_wait_for_idle+0xe6/0x1e0 [i915]
      [ 2613.401951]  fault_irq_set+0x40/0x90 [i915]
      [ 2613.401970]  i915_ring_test_irq_set+0x42/0x50 [i915]
      [ 2613.401976]  simple_attr_write+0xc7/0xe0
      [ 2613.401981]  full_proxy_write+0x4f/0x70
      [ 2613.401987]  __vfs_write+0x23/0x120
      [ 2613.401992]  ? rcu_read_lock_sched_held+0x75/0x80
      [ 2613.401996]  ? rcu_sync_lockdep_assert+0x2a/0x50
      [ 2613.401999]  ? __sb_start_write+0xfa/0x1f0
      [ 2613.402004]  vfs_write+0xc5/0x1d0
      [ 2613.402008]  ? trace_hardirqs_on_caller+0xe7/0x1c0
      [ 2613.402013]  SyS_write+0x44/0xb0
      [ 2613.402020]  entry_SYSCALL_64_fastpath+0x1c/0xb1
      [ 2613.402022] RIP: 0033:0x7f39eded6670
      [ 2613.402025] RSP: 002b:00007fffdcdcb1a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      [ 2613.402030] RAX: ffffffffffffffda RBX: ffffffff81470203 RCX: 00007f39eded6670
      [ 2613.402033] RDX: 0000000000000001 RSI: 000000000041bc33 RDI: 0000000000000006
      [ 2613.402036] RBP: ffffc900084dff88 R08: 00007f39ef3dd8c0 R09: 0000000000000001
      [ 2613.402038] R10: 0000000000000000 R11: 0000000000000246 R12: 000000000041bc33
      [ 2613.402041] R13: 0000000000000006 R14: 0000000000000000 R15: 0000000000000000
      [ 2613.402046]  ? __this_cpu_preempt_check+0x13/0x20
      [ 2613.402052] Code: 01 9b fa e0 0f ff e9 28 fe ff ff 80 3d 6a dd 0e 00 00 0f 85 29 fe ff ff 48 c7 c7 48 19 29 a0 c6 05 56 dd 0e 00 01 e8 da 9a fa e0 <0f> ff e9 0f fe ff ff b9 01 00 00 00 ba 01 00 00 00 44 89 e6 48
      [ 2613.402199] ---[ end trace 31f0cfa93ab632bf ]---
      
      Fixes: 25112b64 ("drm/i915: Wait for all engines to be idle as part of i915_gem_wait_for_idle()")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170530121334.17364-1-chris@chris-wilson.co.uk
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      863e9fde
  14. 26 May, 2017 1 commit
  15. 18 May, 2017 1 commit
  16. 17 May, 2017 5 commits
  17. 03 May, 2017 1 commit
    • C
      drm/i915: Squash repeated awaits on the same fence · 47979480
      Chris Wilson committed
      Track the latest fence waited upon on each context, and only add a new
      asynchronous wait if the new fence is more recent than the recorded
      fence for that context. This requires us to filter out unordered
      timelines, which are noted by DMA_FENCE_NO_CONTEXT. However, in the
      absence of a universal identifier, we have to use our own
      i915->mm.unordered_timeline token.
      
      v2: Throw around the debug crutches
      v3: Inline the likely case of the pre-allocation cache being full.
      v4: Drop the pre-allocation support, we can lose the most recent fence
      in case of allocation failure -- it just means we may emit more awaits
      than strictly necessary but will not break.
      v5: Trim allocation size for leaf nodes, they only need an array of u32
      not pointers.
      v6: Create mock_timeline to tidy selftest writing
      v7: s/intel_timeline_sync_get/intel_timeline_sync_is_later/ (Tvrtko)
      v8: Prune the stale sync points when we idle.
      v9: Include a small benchmark in the kselftests
      v10: Separate the idr implementation into its own compartment. (Tvrtko)
      v11: Refactor igt_sync kselftests to avoid deep nesting (Tvrtko)
      v12: __sync_leaf_idx() to assert that p->height is 0 when checking leaves
      v13: kselftests to investigate struct i915_syncmap itself (Tvrtko)
      v14: Foray into ascii art graphs
      v15: Take into account that the random lookup/insert does 2 prng calls,
      not 1, when benchmarking, and use for_each_set_bit() (Tvrtko)
      v16: Improved ascii art
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170503093924.5320-4-chris@chris-wilson.co.uk
      47979480
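The core of squashing repeated awaits is remembering the latest sequence number awaited per fence context and skipping anything not newer. The sketch below is an assumption-laden illustration: it uses a small fixed array rather than the tree structure the commit introduces, and all `toy_` names and `TOY_MAX_CONTEXTS` are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_CONTEXTS 8   /* arbitrary bound for the sketch */

/* Hypothetical record of the latest fence awaited per context. */
struct toy_syncmap {
    uint64_t context[TOY_MAX_CONTEXTS];
    uint32_t seqno[TOY_MAX_CONTEXTS];
    int count;
};

/* Wraparound-safe "a is at or after b" for 32-bit sequence numbers. */
static bool toy_seqno_passed(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) >= 0;
}

/*
 * Returns true if a new wait on (context, seqno) must be emitted, and
 * records it. Returns false if an equal or later fence from the same
 * context was already awaited.
 */
static bool toy_await_needed(struct toy_syncmap *map, uint64_t context,
                             uint32_t seqno)
{
    int i;

    for (i = 0; i < map->count; i++) {
        if (map->context[i] != context)
            continue;
        if (toy_seqno_passed(map->seqno[i], seqno))
            return false;          /* already covered by an earlier await */
        map->seqno[i] = seqno;
        return true;
    }

    /* Losing a record is harmless: it only costs a redundant await later. */
    if (map->count < TOY_MAX_CONTEXTS) {
        map->context[map->count] = context;
        map->seqno[map->count] = seqno;
        map->count++;
    }
    return true;
}
```

As noted under v4 of the commit, failing to record a fence is safe because the only consequence is emitting more awaits than strictly necessary; the sketch mirrors that when its array is full.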