1. 14 May, 2018 1 commit
  2. 08 May, 2018 1 commit
  3. 04 May, 2018 1 commit
    • C
      drm/i915: Lazily unbind vma on close · 3365e226
      Chris Wilson committed
      When userspace is passing around swapbuffers using DRI, we frequently
      have to open and close the same object in the foreign address space.
      This shows itself as the same object being rebound at roughly 30fps
      (with a second object also being rebound at 30fps), which involves us
      having to rewrite the page tables and maintain the drm_mm range manager
      every time.
      
      However, since the object still exists and it is only the local handle
      that disappears, if we are lazy and do not unbind the VMA immediately
      when the local user closes the object but defer it until the GPU is
      idle, then we can reuse the same VMA binding. We still have to be
      careful to mark the handle and lookup tables as closed to maintain the
      uABI, just allowing the underlying VMA to be resurrected if the user is
      able to access the same object from the same context again.
      
      If the object itself is destroyed (no userspace client keeps a handle to
      it), the VMA will be reaped immediately as usual.
      
      In the future, this will be even more useful as instantiating a new VMA
      for use on the GPU will become heavier. A nuisance indeed, so nip it in
      the bud.
      
      v2: s/__i915_vma_final_close/i915_vma_destroy/ etc.
      v3: Leave a hint as to why we deferred the unbind on close.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180503195115.22309-1-chris@chris-wilson.co.uk
      3365e226
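The deferred-unbind idea described in this commit can be illustrated with a minimal sketch. It is a toy model only: `struct toy_vma` and the `toy_vma_*` helpers are hypothetical names invented for illustration, not the i915 structures or functions. It assumes closing a handle only marks the VMA, that reopening resurrects the existing binding, and that a separate "park" step run at GPU idle reaps whatever is still closed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a VMA; not the i915 struct. */
struct toy_vma {
	bool bound;              /* page tables currently map the object */
	bool closed;             /* local handle closed; unbind deferred */
	unsigned int bind_count; /* how many times we paid for a bind */
};

/* Bind only if needed, counting each real (expensive) bind. */
void toy_vma_bind(struct toy_vma *vma)
{
	if (!vma->bound) {
		vma->bound = true;
		vma->bind_count++;
	}
}

/* Handle closed: mark the VMA but leave the binding in place. */
void toy_vma_close(struct toy_vma *vma)
{
	vma->closed = true;
}

/* Same object reopened from the same context: reuse the binding. */
void toy_vma_reopen(struct toy_vma *vma)
{
	vma->closed = false;
	toy_vma_bind(vma); /* no-op if the binding survived */
}

/* GPU went idle: unbind every VMA that is still marked closed. */
void toy_vma_park(struct toy_vma *vmas, size_t count)
{
	for (size_t i = 0; i < count; i++) {
		if (vmas[i].closed)
			vmas[i].bound = false;
	}
}
```

In this model a close followed by a reopen before the GPU idles costs no extra bind, while a reopen after the park step pays for a fresh one.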
  4. 03 May, 2018 2 commits
    • C
      drm/i915: Split i915_gem_timeline into individual timelines · a89d1f92
      Chris Wilson committed
      We need to move to a more flexible timeline that doesn't assume one
      fence context per engine, and so allow for a single timeline to be used
      across a combination of engines. This means that preallocating a fence
      context per engine is now a hindrance, and so we want to introduce the
      singular timeline. From the code perspective, this has the notable
      advantage of clearing up a lot of murky semantics and some clumsy
      pointer chasing.
      
      By splitting the timeline up into a single entity rather than an array
      of per-engine timelines, we can realise the goal of the previous patch
      of tracking the timeline alongside the ring.
      
      v2: Tweak wait_for_idle to stop the compiler thinking that ret may be
      uninitialised.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180502163839.3248-2-chris@chris-wilson.co.uk
      a89d1f92
    • C
      drm/i915: Move timeline from GTT to ring · 65fcb806
      Chris Wilson committed
      In the future, we want to move a request between engines. To achieve
      this, we first realise that we have two timelines in effect here. The
      first runs through the GTT and is required for ordering vma access, which is
      tracked currently by engine. The second is implied by sequential
      execution of commands inside the ringbuffer. This timeline is one that
      maps to userspace's expectations when submitting requests (i.e. given the
      same context, batch A is executed before batch B). As the ring's
      timelines map to userspace and the GTT timeline is an implementation
      detail, move the timeline from the GTT into the ring itself (per-context
      in logical-ring-contexts/execlists, or a global per-engine timeline for
      the shared ringbuffers in legacy submission).
      
      The two timelines are still assumed to be equivalent at the moment (no
      migrating requests between engines yet) and so we can simply move from
      one to the other without adding extra ordering.
      
      v2: Reinforce that one isn't allowed to mix the engine execution
      timeline with the client timeline from userspace (on the ring).
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180502163839.3248-1-chris@chris-wilson.co.uk
      65fcb806
  5. 30 April, 2018 3 commits
  6. 26 April, 2018 1 commit
  7. 19 April, 2018 1 commit
  8. 12 April, 2018 1 commit
  9. 11 April, 2018 1 commit
  10. 07 April, 2018 3 commits
  11. 24 March, 2018 1 commit
  12. 16 March, 2018 4 commits
  13. 14 March, 2018 4 commits
    • J
      drm/i915/guc: Check the locking status of GuC WOPCM registers · f08e2035
      Jackie Li committed
      GuC WOPCM registers are write-once registers. Current driver code accesses
      these registers without checking whether they are still writable, which
      will lead to unpredictable driver behavior if these registers were touched
      by other components (such as faulty BIOS code).
      
      This patch moves the GuC WOPCM register update code into intel_wopcm.c
      and adds checks before and after the update to the GuC WOPCM registers so that
      we can make sure the driver is in a known state after writing to these
      write-once registers.
      
      v6:
       - Made sure module reloading won't bug the kernel while doing
         locking status checking
      
      v7:
       - Fixed patch format issues
      
      v8:
       - Fixed coding style issue on register lock bit macro definition (Sagar)
      
      v9:
       - Avoided using redundant !! to cast uint to bool (Chris)
       - Return error code instead of GEM_BUG_ON for locked with invalid register
         values case (Sagar)
       - Updated guc_wopcm_hw_init to use guc_wopcm as first parameter (Michal)
       - Added code to set and validate the HuC_LOADING_AGENT_GUC bit in GuC
         WOPCM offset register based on the presence of HuC firmware (Michal)
       - Use bit fields instead of macros for GuC WOPCM flags (Michal)
      
      v10:
       - Refined variable names, removed redundant comments (Joonas)
       - Introduced lockable_reg to handle the write once register write and
         propagate the write error to caller (Joonas)
       - Used lockable_reg abstraction to avoid locking bit check on generic
         i915_reg_t (Michal)
       - Added log message for error paths (Michal)
       - Removed hw_updated flag and rely only on real hardware status
      
      v11:
       - Replaced lockable_reg with simplified function (Michal)
       - Used new macros for locking bits of WOPCM size/offset registers instead
         of using BIT(0) directly (Michal)
       - use intel_wopcm_init_hw() called from intel_gem_init_hw() to do GuC
         WOPCM register setup instead of calling from intel_uc_init_hw() (Michal)
      
      v12:
       - Updated function kernel-doc to align with code changes (Michal)
       - Updated code to use wopcm pointer directly (Michal)
      
      v13:
       - Updated the ordering of s-o-b/cc/r-b tags (Sagar)
      
      BSpec: 10875, 10833
      Signed-off-by: Jackie Li <yaodong.li@intel.com>
      Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> (v11)
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v12)
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/1520987574-19351-5-git-send-email-yaodong.li@intel.com
      f08e2035
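The check-before-and-after approach for write-once registers described in this commit can be sketched roughly as follows. This is a simulation under stated assumptions, not the driver code: `struct toy_reg`, `TOY_REG_LOCKED`, and the `toy_*` functions are hypothetical names, and the register is modelled as a plain variable whose writes are ignored once the lock bit is set. The lock bit is assumed to be bit 0, so callers must not use that bit for payload.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical lock bit; the payload must not use this bit. */
#define TOY_REG_LOCKED (1u << 0)

/* Simulated write-once register: writes are ignored once locked. */
struct toy_reg {
	uint32_t value;
};

uint32_t toy_reg_read(const struct toy_reg *reg)
{
	return reg->value;
}

void toy_reg_write(struct toy_reg *reg, uint32_t val)
{
	if (!(reg->value & TOY_REG_LOCKED))
		reg->value = val;
}

/*
 * Write val with the lock bit set, unless something else already locked
 * the register. Either way, read back and report an error if the final
 * contents are not what the caller expects.
 */
int toy_write_and_verify(struct toy_reg *reg, uint32_t val)
{
	uint32_t want = val | TOY_REG_LOCKED;

	/* Check before: only attempt the write if still unlocked. */
	if (!(toy_reg_read(reg) & TOY_REG_LOCKED))
		toy_reg_write(reg, want);

	/* Check after: the register must now hold exactly our value. */
	return toy_reg_read(reg) == want ? 0 : -EIO;
}
```

Returning an error code rather than asserting mirrors the v9 note about returning an error for the "locked with invalid register values" case; repeating the same write (for example on a module reload) succeeds because the read-back still matches.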
    • J
      drm/i915: Implement dynamic GuC WOPCM offset and size calculation · 6b0478fb
      Jackie Li committed
      Hardware may have specific restrictions on GuC WOPCM offset and size. On
      Gen9, the value of the GuC WOPCM size register needs to be larger than the
      value of GuC WOPCM offset register + a Gen9 specific offset (144KB) for
      reserved GuC WOPCM. Failing to enforce such a restriction on GuC WOPCM size
      will lead to GuC firmware execution failures. On the other hand, with the
      current static GuC WOPCM offset and size values (512KB for both offset and
      size), the GuC WOPCM size verification will fail on Gen9. This can be
      fixed by lowering the GuC WOPCM offset, calculating its value based on the
      HuC firmware size (which is likely less than 200KB on Gen9), so that we
      have a GuC WOPCM size value which is large enough to pass the GuC WOPCM
      size check.
      
      This patch updates the reserved GuC WOPCM size for RC6 context on Gen9 to
      24KB to strictly align with the Gen9 GuC WOPCM layout. It also adds support
      to verify the GuC WOPCM size against the Gen9 hardware restrictions. To
      meet all above requirements, let's provide dynamic partitioning of the
      WOPCM that will be based on platform specific HuC/GuC firmware sizes.
      
      v2:
       - Removed intel_wopcm_init (Ville/Sagar/Joonas)
       - Renamed and Moved the intel_wopcm_partition into intel_guc (Sagar)
       - Removed unnecessary function calls (Joonas)
       - Init GuC WOPCM partition as soon as firmware fetching is completed
      
      v3:
       - Fixed indentation issues (Chris)
       - Removed layering violation code (Chris/Michal)
       - Created separate files for GuC wopcm code (Michal)
       - Used inline function to avoid code duplication (Michal)
      
      v4:
       - Preset the GuC WOPCM top during early GuC init (Chris)
       - Fail intel_uc_init_hw() as soon as GuC WOPCM partitioning failed
      
      v5:
       - Moved GuC DMA WOPCM register updating code into intel_wopcm.c
       - Took care of the locking status before writing to GuC DMA
         Write-Once registers. (Joonas)
      
      v6:
       - Made sure the GuC WOPCM size is a multiple of 4K (4K aligned)
      
      v8:
       - Updated comments and fixed naming issues (Sagar/Joonas)
       - Updated commit message to include more description about the hardware
         restriction on GuC WOPCM size (Sagar)
      
      v9:
       - Minor changes to variable names and code comments (Sagar)
       - Added detailed GuC WOPCM layout drawing (Sagar/Michal)
       - Refined macro definitions to be reader friendly (Michal)
       - Removed redundant check of the valid flag (Michal)
       - Unified first parameter for exported GuC WOPCM functions (Michal)
       - Refined the name and parameter list of hardware restriction checking
         functions (Michal)
      
      v10:
       - Used shorter function name for internal functions (Joonas)
       - Moved init-early function into c file (Joonas)
       - Consolidated and removed redundant size checks (Joonas/Michal)
       - Removed unnecessary unlikely() from code which is only called once
         during boot (Joonas)
       - More fixes to kernel-doc format and content (Michal)
       - Avoided the use of PAGE_MASK for 4K pages (Michal)
       - Added error log messages to error paths (Michal)
      
      v11:
       - Replaced intel_guc_wopcm with more generic intel_wopcm and attached
         intel_wopcm to drm_i915_private instead of intel_guc (Michal)
       - dynamic calculation of GuC non-wopcm memory start (a.k.a WOPCM Top
         offset from GuC WOPCM base) (Michal)
       - Moved WOPCM macro definitions into .c source file (Michal)
       - Exported WOPCM layout diagram as kernel-doc (Michal)
      
      v12:
       - Updated naming, function kernel-doc to align with new changes (Michal)
      
      v13:
       - Updated the ordering of s-o-b/cc/r-b tags (Sagar)
       - Corrected one tense error in comment (Sagar)
       - Corrected typos and removed spurious comments (Joonas)
      
      Bspec: 12690
      Signed-off-by: Jackie Li <yaodong.li@intel.com>
      Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
      Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
      Cc: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com>
      Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
      Cc: John Spotswood <john.a.spotswood@intel.com>
      Cc: Oscar Mateo <oscar.mateo@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> (v8)
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v9)
      Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> (v11)
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v12)
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/1520987574-19351-2-git-send-email-yaodong.li@intel.com
      6b0478fb
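The arithmetic behind this commit can be made concrete with a rough sketch. Only the numbers stated in the message are taken from it (the 144KB Gen9 offset, the 24KB reserved for the RC6 context, 4K alignment, and the 512KB static values). Everything else is an assumption for illustration: the `toy_*` names are hypothetical, the total WOPCM size is a caller-supplied parameter, and the base is simply the HuC firmware size rounded up to 4K, which is a simplification of whatever the driver actually reserves.

```c
#include <errno.h>
#include <stdint.h>

#define KB(x)                 ((uint32_t)(x) * 1024u)
#define TOY_PAGE_SIZE         KB(4)   /* size must be 4K aligned */
#define TOY_GEN9_EXTRA_OFFSET KB(144) /* Gen9 offset from the commit text */
#define TOY_GEN9_CTX_RESERVED KB(24)  /* RC6 context reservation on Gen9 */

/* Hypothetical result type: where the GuC region starts and how big it is. */
struct toy_wopcm {
	uint32_t guc_base;
	uint32_t guc_size;
};

/* Gen9 restriction: size must be larger than base + 144KB. */
int toy_gen9_check(uint32_t base, uint32_t size)
{
	return size > base + TOY_GEN9_EXTRA_OFFSET ? 0 : -E2BIG;
}

/* Derive base and size from the HuC firmware size instead of constants. */
int toy_wopcm_partition(uint32_t total, uint32_t huc_fw_size,
			struct toy_wopcm *out)
{
	uint32_t base, size;

	if (huc_fw_size > total)
		return -E2BIG;

	/* Round the base up to the next 4K boundary. */
	base = (huc_fw_size + TOY_PAGE_SIZE - 1) & ~(TOY_PAGE_SIZE - 1);
	if (base > total || total - base <= TOY_GEN9_CTX_RESERVED)
		return -E2BIG;

	/* Whatever is left after the reservation, rounded down to 4K. */
	size = (total - base - TOY_GEN9_CTX_RESERVED) & ~(TOY_PAGE_SIZE - 1);

	if (toy_gen9_check(base, size))
		return -E2BIG;

	out->guc_base = base;
	out->guc_size = size;
	return 0;
}
```

With the static 512KB/512KB values the check fails because 512KB is not larger than 512KB + 144KB. Assuming a 1MB total and a 200KB HuC image, the sketch yields a 200KB base and an 800KB size, which passes.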
    • C
      drm/i915: Show GEM_TRACE when detecting a failed GPU idle · 629820fc
      Chris Wilson committed
      If we timeout waiting for the GPU to idle, something went seriously
      wrong. We currently dump the engine state, but we can also dump the
      ftrace buffer showing our last operations (when available).
      
      In passing, note that since commit 559e040f ("drm/i915: Show the GPU
      state when declaring wedged"), we now show the engine state twice, once
      in detecting the failed idle and then again on declaring wedged.
      
      v2: ftrace_dump() takes a parameter specifying whether to dump all cpu
      buffers or the local cpu's.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180309101114.1138-1-chris@chris-wilson.co.uk
      629820fc
    • D
      drm/i915/frontbuffer: Pull frontbuffer_flush out of gem_obj_pin_to_display · 07bcd99b
      Dhinakaran Pandiyan committed
      i915_gem_obj_pin_to_display() calls frontbuffer_flush with origin set to
      DIRTYFB. The callers however are at a vantage point to decide if hardware
      frontbuffer tracking can do the flush for us. For example, legacy cursor
      updates, like flips, write to MMIO registers, which then triggers PSR flush
      by the hardware. Moving frontbuffer_flush out will enable us to skip a
      software initiated flush by setting origin to FLIP. Thanks to Chris for the
      idea.
      
      v2:
      Rebased due to Ville adding intel_plane_pin_fb().
      Minor code reordering as fb_obj_flush doesn't need struct_mutex (Chris)
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
      Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
      Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180307033420.3086-1-dhinakaran.pandiyan@intel.com
      07bcd99b
  14. 13 March, 2018 2 commits
  15. 09 March, 2018 4 commits
  16. 06 March, 2018 1 commit
    • C
      drm/i915: Suspend submission tasklets around wedging · 88d3dfb6
      Chris Wilson committed
      After staring hard at sequences like
      
      [   28.199013]  systemd-1       2..s. 26062228us : execlists_submission_tasklet: rcs0 cs-irq head=0 [0?], tail=1 [1?]
      [   28.199095]  systemd-1       2..s. 26062229us : execlists_submission_tasklet: rcs0 csb[1]: status=0x00000018:0x00000000, active=0x1
      [   28.199177]  systemd-1       2..s. 26062230us : execlists_submission_tasklet: rcs0 out[0]: ctx=0.1, seqno=3, prio=-1024
      [   28.199258]  systemd-1       2..s. 26062231us : execlists_submission_tasklet: rcs0 completed ctx=0
      [   28.199340]  gem_eio-829     1..s1 26066853us : execlists_submission_tasklet: rcs0 in[0]:  ctx=1.1, seqno=1, prio=0
      [   28.199421]   <idle>-0       2..s. 26066863us : execlists_submission_tasklet: rcs0 cs-irq head=1 [1?], tail=2 [2?]
      [   28.199503]   <idle>-0       2..s. 26066865us : execlists_submission_tasklet: rcs0 csb[2]: status=0x00000001:0x00000000, active=0x1
      [   28.199585]  gem_eio-829     1..s1 26067077us : execlists_submission_tasklet: rcs0 in[1]:  ctx=3.1, seqno=2, prio=0
      [   28.199667]  gem_eio-829     1..s1 26067078us : execlists_submission_tasklet: rcs0 in[0]:  ctx=1.2, seqno=1, prio=0
      [   28.199749]   <idle>-0       2..s. 26067084us : execlists_submission_tasklet: rcs0 cs-irq head=2 [2?], tail=3 [3?]
      [   28.199830]   <idle>-0       2..s. 26067085us : execlists_submission_tasklet: rcs0 csb[3]: status=0x00008002:0x00000001, active=0x1
      [   28.199912]   <idle>-0       2..s. 26067086us : execlists_submission_tasklet: rcs0 out[0]: ctx=1.2, seqno=1, prio=0
      [   28.199994]  gem_eio-829     2..s. 28246084us : execlists_submission_tasklet: rcs0 cs-irq head=3 [3?], tail=4 [4?]
      [   28.200096]  gem_eio-829     2..s. 28246088us : execlists_submission_tasklet: rcs0 csb[4]: status=0x00000014:0x00000001, active=0x5
      [   28.200178]  gem_eio-829     2..s. 28246089us : execlists_submission_tasklet: rcs0 out[0]: ctx=0.0, seqno=0, prio=0
      [   28.200260]  gem_eio-829     2..s. 28246127us : execlists_submission_tasklet: execlists_submission_tasklet:886 GEM_BUG_ON(buf[2 * head + 1] != port->context_id)
      
      the conclusion is that the only place where the ports are reset to zero,
      is from engine->cancel_requests called during i915_gem_set_wedged().
      
      The race is horrible as it results from calling set-wedged on active HW
      (the GPU reset failed) and as such we need to be careful as the HW state
      changes beneath us. Fortunately, it's the same scary conditions as
      affect normal reset, so we can reuse the same machinery to disable state
      tracking as we clobber it.
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104945
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: Michel Thierry <michel.thierry@intel.com>
      Fixes: af7a8ffa ("drm/i915: Use rcu instead of stop_machine in set_wedged")
      Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180302113324.23189-2-chris@chris-wilson.co.uk
      (cherry picked from commit 963ddd63)
      Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
      88d3dfb6
  17. 03 March, 2018 2 commits
    • M
      drm/i915/uc: Introduce intel_uc_suspend|resume · 7cfca4af
      Michal Wajdeczko committed
      We want to use higher level 'uc' functions as the main entry points to
      the GuC/HuC code to hide some details and keep code layered.
      
      While here, move the call to disable_guc_interrupts after sending the suspend
      action to the GuC to allow it to also work with CTB as the comm mechanism.
      
      v2: update commit msg (Sagar)
      Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
      Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180302111550.21328-1-michal.wajdeczko@intel.com
      7cfca4af
    • C
      drm/i915: Suspend submission tasklets around wedging · 963ddd63
      Chris Wilson committed
      After staring hard at sequences like
      
      [   28.199013]  systemd-1       2..s. 26062228us : execlists_submission_tasklet: rcs0 cs-irq head=0 [0?], tail=1 [1?]
      [   28.199095]  systemd-1       2..s. 26062229us : execlists_submission_tasklet: rcs0 csb[1]: status=0x00000018:0x00000000, active=0x1
      [   28.199177]  systemd-1       2..s. 26062230us : execlists_submission_tasklet: rcs0 out[0]: ctx=0.1, seqno=3, prio=-1024
      [   28.199258]  systemd-1       2..s. 26062231us : execlists_submission_tasklet: rcs0 completed ctx=0
      [   28.199340]  gem_eio-829     1..s1 26066853us : execlists_submission_tasklet: rcs0 in[0]:  ctx=1.1, seqno=1, prio=0
      [   28.199421]   <idle>-0       2..s. 26066863us : execlists_submission_tasklet: rcs0 cs-irq head=1 [1?], tail=2 [2?]
      [   28.199503]   <idle>-0       2..s. 26066865us : execlists_submission_tasklet: rcs0 csb[2]: status=0x00000001:0x00000000, active=0x1
      [   28.199585]  gem_eio-829     1..s1 26067077us : execlists_submission_tasklet: rcs0 in[1]:  ctx=3.1, seqno=2, prio=0
      [   28.199667]  gem_eio-829     1..s1 26067078us : execlists_submission_tasklet: rcs0 in[0]:  ctx=1.2, seqno=1, prio=0
      [   28.199749]   <idle>-0       2..s. 26067084us : execlists_submission_tasklet: rcs0 cs-irq head=2 [2?], tail=3 [3?]
      [   28.199830]   <idle>-0       2..s. 26067085us : execlists_submission_tasklet: rcs0 csb[3]: status=0x00008002:0x00000001, active=0x1
      [   28.199912]   <idle>-0       2..s. 26067086us : execlists_submission_tasklet: rcs0 out[0]: ctx=1.2, seqno=1, prio=0
      [   28.199994]  gem_eio-829     2..s. 28246084us : execlists_submission_tasklet: rcs0 cs-irq head=3 [3?], tail=4 [4?]
      [   28.200096]  gem_eio-829     2..s. 28246088us : execlists_submission_tasklet: rcs0 csb[4]: status=0x00000014:0x00000001, active=0x5
      [   28.200178]  gem_eio-829     2..s. 28246089us : execlists_submission_tasklet: rcs0 out[0]: ctx=0.0, seqno=0, prio=0
      [   28.200260]  gem_eio-829     2..s. 28246127us : execlists_submission_tasklet: execlists_submission_tasklet:886 GEM_BUG_ON(buf[2 * head + 1] != port->context_id)
      
      the conclusion is that the only place where the ports are reset to zero,
      is from engine->cancel_requests called during i915_gem_set_wedged().
      
      The race is horrible as it results from calling set-wedged on active HW
      (the GPU reset failed) and as such we need to be careful as the HW state
      changes beneath us. Fortunately, it's the same scary conditions as
      affect normal reset, so we can reuse the same machinery to disable state
      tracking as we clobber it.
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104945
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: Michel Thierry <michel.thierry@intel.com>
      Fixes: af7a8ffa ("drm/i915: Use rcu instead of stop_machine in set_wedged")
      Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180302113324.23189-2-chris@chris-wilson.co.uk
      963ddd63
  18. 02 March, 2018 1 commit
  19. 22 February, 2018 1 commit
  20. 21 February, 2018 2 commits
    • C
      drm/i915: Move the policy for placement of the GGTT vma into the caller · 5935485f
      Chris Wilson committed
      Currently we make the unilateral decision inside
      i915_gem_object_pin_to_display() where the VMA should reside (inside
      the fence and mappable region or above?). This is not our decision to
      make as it impacts on how the display engine can use the resulting
      scanout object, and it would rather instruct us where to place the VMA so
      that it can enable the features it wants. As such, make the pin flags an
      argument to i915_gem_object_pin_to_display() and control them from
      intel_pin_and_fence_fb_obj().
      
      Whilst taking control of the mapping for ourselves, start tracking how
      we use it to avoid trying to free a fence we never claimed:
      
      <3>[  227.151869] GEM_BUG_ON(vma->fence->pin_count <= 0)
      <4>[  227.152064] ------------[ cut here ]------------
      <2>[  227.152068] kernel BUG at drivers/gpu/drm/i915/i915_vma.h:391!
      <4>[  227.152084] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
      <0>[  227.152092] Dumping ftrace buffer:
      <0>[  227.152099]    (ftrace buffer empty)
      <4>[  227.152102] Modules linked in: i915 snd_hda_codec_analog snd_hda_codec_generic coretemp snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm lpc_ich e1000e mei_me mei prime_numbers
      <4>[  227.152131] CPU: 1 PID: 1587 Comm: kworker/u16:49 Tainted: G     U           4.16.0-rc1-gbab67b2f6177-kasan_7+ #1
      <4>[  227.152134] Hardware name: Dell Inc. OptiPlex 755                 /0PU052, BIOS A08 02/19/2008
      <4>[  227.152236] Workqueue: events_unbound intel_atomic_commit_work [i915]
      <4>[  227.152292] RIP: 0010:intel_unpin_fb_vma+0x23a/0x2a0 [i915]
      <4>[  227.152295] RSP: 0018:ffff88005aad7b68 EFLAGS: 00010286
      <4>[  227.152300] RAX: 0000000000000026 RBX: ffff88005c359580 RCX: 0000000000000000
      <4>[  227.152304] RDX: 0000000000000026 RSI: ffffffff8707d840 RDI: ffffed000b55af63
      <4>[  227.152307] RBP: ffff880056817e58 R08: 0000000000000001 R09: 0000000000000000
      <4>[  227.152311] R10: ffff88005aad7b88 R11: 0000000000000000 R12: ffff8800568184d0
      <4>[  227.152314] R13: ffff880065b5ab08 R14: 0000000000000000 R15: dffffc0000000000
      <4>[  227.152318] FS:  0000000000000000(0000) GS:ffff88006ac40000(0000) knlGS:0000000000000000
      <4>[  227.152322] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      <4>[  227.152325] CR2: 00007f5fb25550a8 CR3: 0000000068c78000 CR4: 00000000000006e0
      <4>[  227.152328] Call Trace:
      <4>[  227.152385]  intel_cleanup_plane_fb+0x6b/0xd0 [i915]
      <4>[  227.152395]  drm_atomic_helper_cleanup_planes+0x166/0x280
      <4>[  227.152452]  intel_atomic_commit_tail+0x159d/0x3380 [i915]
      <4>[  227.152463]  ? process_one_work+0x66e/0x1460
      <4>[  227.152516]  ? skl_update_crtcs+0x9c0/0x9c0 [i915]
      <4>[  227.152523]  ? lock_acquire+0x13d/0x390
      <4>[  227.152527]  ? lock_acquire+0x13d/0x390
      <4>[  227.152534]  process_one_work+0x71a/0x1460
      <4>[  227.152540]  ? __schedule+0x815/0x1e20
      <4>[  227.152547]  ? pwq_dec_nr_in_flight+0x2b0/0x2b0
      <4>[  227.152553]  ? _raw_spin_lock_irq+0xa/0x40
      <4>[  227.152559]  worker_thread+0xdf/0xf60
      <4>[  227.152569]  ? process_one_work+0x1460/0x1460
      <4>[  227.152573]  kthread+0x2cf/0x3c0
      <4>[  227.152578]  ? _kthread_create_on_node+0xa0/0xa0
      <4>[  227.152583]  ret_from_fork+0x3a/0x50
      <4>[  227.152591] Code: c6 00 11 86 c0 48 c7 c7 e0 bd 85 c0 e8 60 e7 a9 c4 0f ff e9 1f fe ff ff 48 c7 c6 40 10 86 c0 48 c7 c7 e0 ca 85 c0 e8 2b 95 bd c4 <0f> 0b 48 89 ef e8 4c 44 e8 c4 e9 ef fd ff ff e8 42 44 e8 c4 e9
      <1>[  227.152720] RIP: intel_unpin_fb_vma+0x23a/0x2a0 [i915] RSP: ffff88005aad7b68
      
      v2: i915_vma_pin_fence() is a no-op if a fence isn't required, so check
      vma->fence as well.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180220134208.24988-2-chris@chris-wilson.co.uk
      5935485f
    • C
      drm/i915: Also check view->type for a normal GGTT view · ac87a6fd
      Chris Wilson committed
      We cannot simply use !view as shorthand for all normal GGTT views as a
      few callers will always populate a i915_ggtt_view struct and set the
      type to NORMAL instead. So check for (!view || view->type == NORMAL)
      inside i915_gem_object_ggtt_pin().
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180220134208.24988-1-chris@chris-wilson.co.uk
      ac87a6fd
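The predicate this commit describes, `(!view || view->type == NORMAL)`, can be shown in isolation. The types below are hypothetical stand-ins (`toy_ggtt_view`, `toy_view_type`), not the i915 definitions; the sketch only assumes that a NULL view and an explicitly populated view of normal type must be treated the same way.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view types; only the NORMAL case matters here. */
enum toy_view_type {
	TOY_VIEW_NORMAL = 0,
	TOY_VIEW_ROTATED,
	TOY_VIEW_PARTIAL,
};

struct toy_ggtt_view {
	enum toy_view_type type;
};

/*
 * A missing view and an explicit NORMAL view both mean "normal".
 * Testing !view alone would misclassify callers that always fill in
 * the struct and set the type to NORMAL.
 */
bool toy_is_normal_view(const struct toy_ggtt_view *view)
{
	return !view || view->type == TOY_VIEW_NORMAL;
}
```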
  21. 20 February, 2018 1 commit
    • C
      drm/i915: Track number of pending freed objects · c9c70471
      Chris Wilson committed
      During igt, we frequently call into the driver to reset both HW and
      driver state (idling the device, waiting for it to become idle and
      freeing off old objects) to ensure that we start each test/subtest/pass
      from known state. This process incurs an RCU barrier or two to ensure
      that any such pending frees are indeed flushed before we return.
      However, unconditionally waiting on the RCU barrier adds needless delay
      to many callers, which adds up to several seconds when repeated thousands
      of times. By tracking how many frees are outstanding, we can skip the
      rcu_barrier() when we know there are none.
      
      The same path is used during suspend, where we may be able to save the
      unconditional RCU barrier.
      
      To put it into perspective with a completely meaningless
      microbenchmark, igt/gem_sync/idle is improved from 50ms to 30us on bdw.
      
      v2: Remove the extra synchronize_rcu() inside i915_drop_caches_set()
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180219220631.25001-1-chris@chris-wilson.co.uk
      c9c70471
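The optimisation in this commit, skipping an expensive barrier when a counter shows nothing is pending, can be sketched in userspace C11. This is an illustration only: `struct toy_mm` and the `toy_*` functions are hypothetical, the RCU barrier is replaced by a stub that just counts how often it runs, and the stub is assumed to complete every pending free.

```c
#include <stdatomic.h>

/* Hypothetical bookkeeping; not the i915 structures. */
struct toy_mm {
	atomic_uint free_count; /* frees queued but not yet completed */
	unsigned int barriers;  /* how often we paid for the barrier */
};

/* Called when an object is queued for deferred freeing. */
void toy_queue_free(struct toy_mm *mm)
{
	atomic_fetch_add(&mm->free_count, 1);
}

/* Stand-in for the expensive barrier; assumed to flush all frees. */
void toy_expensive_barrier(struct toy_mm *mm)
{
	mm->barriers++;
	atomic_store(&mm->free_count, 0);
}

/* Only wait for pending frees if the counter says there are any. */
void toy_drain_freed_objects(struct toy_mm *mm)
{
	if (atomic_load(&mm->free_count) == 0)
		return; /* nothing outstanding: skip the barrier */

	toy_expensive_barrier(mm);
}
```

Callers that drain repeatedly with nothing queued, as a test harness resetting state between subtests might, never reach the barrier in this model.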
  22. 16 February, 2018 1 commit
  23. 10 February, 2018 1 commit