1. 01 6月, 2018 2 次提交
  2. 30 5月, 2018 1 次提交
  3. 24 5月, 2018 1 次提交
    • C
      drm/i915: Look for an active kernel context before switching · 09a4c02e
      Chris Wilson 提交于
      We were not very carefully checking to see if an older request on the
      engine was an earlier switch-to-kernel-context before deciding to emit a
      new switch. The end result would be that we could get into a permanent
      loop of trying to emit a new request to perform the switch simply to
      flush the existing switch.
      
      What we need is a means of tracking the completion of each timeline
      versus the kernel context, that is to detect if a more recent request
      has been submitted that would result in a switch away from the kernel
      context. To realise this, we need only to look in our syncmap on the
      kernel context and check that we have synchronized against all active
      rings.
      
      v2: Since all ringbuffer clients currently share the same timeline, we do
      have to use the gem_context to distinguish clients.
      
      As a bonus, include all the tracing used to debug the death inside
      suspend.
      
      v3: Test, test, test. Construct a selftest to exercise and assert the
      expected behaviour that multiple switch-to-contexts do not emit
      redundant requests.
      Reported-by: NMika Kuoppala <mika.kuoppala@intel.com>
      Fixes: a89d1f92 ("drm/i915: Split i915_gem_timeline into individual timelines")
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@intel.com>
      Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180524081135.15278-1-chris@chris-wilson.co.uk
      09a4c02e
  4. 18 5月, 2018 2 次提交
  5. 17 5月, 2018 4 次提交
  6. 14 5月, 2018 1 次提交
  7. 08 5月, 2018 1 次提交
  8. 04 5月, 2018 1 次提交
    • C
      drm/i915: Lazily unbind vma on close · 3365e226
      Chris Wilson 提交于
      When userspace is passing around swapbuffers using DRI, we frequently
      have to open and close the same object in the foreign address space.
      This shows itself as the same object being rebound at roughly 30fps
      (with a second object also being rebound at 30fps), which involves us
      having to rewrite the page tables and maintain the drm_mm range manager
      every time.
      
      However, since the object still exists and it is only the local handle
      that disappears, if we are lazy and do not unbind the VMA immediately
      when the local user closes the object but defer it until the GPU is
      idle, then we can reuse the same VMA binding. We still have to be
      careful to mark the handle and lookup tables as closed to maintain the
      uABI, just allowing the underlying VMA to be resurrected if the user is
      able to access the same object from the same context again.
      
      If the object itself is destroyed (neither userspace keeping a handle to
      it), the VMA will be reaped immediately as usual.
      
      In the future, this will be even more useful as instantiating a new VMA
      for use on the GPU will become heavier. A nuisance indeed, so nip it in
      the bud.
      
      v2: s/__i915_vma_final_close/i915_vma_destroy/ etc.
      v3: Leave a hint as to why we deferred the unbind on close.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180503195115.22309-1-chris@chris-wilson.co.uk
      3365e226
  9. 03 5月, 2018 2 次提交
    • C
      drm/i915: Split i915_gem_timeline into individual timelines · a89d1f92
      Chris Wilson 提交于
      We need to move to a more flexible timeline that doesn't assume one
      fence context per engine, and so allow for a single timeline to be used
      across a combination of engines. This means that preallocating a fence
      context per engine is now a hindrance, and so we want to introduce the
      singular timeline. From the code perspective, this has the notable
      advantage of clearing up a lot of mirky semantics and some clumsy
      pointer chasing.
      
      By splitting the timeline up into a single entity rather than an array
      of per-engine timelines, we can realise the goal of the previous patch
      of tracking the timeline alongside the ring.
      
      v2: Tweak wait_for_idle to stop the compiling thinking that ret may be
      uninitialised.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180502163839.3248-2-chris@chris-wilson.co.uk
      a89d1f92
    • C
      drm/i915: Move timeline from GTT to ring · 65fcb806
      Chris Wilson 提交于
      In the future, we want to move a request between engines. To achieve
      this, we first realise that we have two timelines in effect here. The
      first runs through the GTT is required for ordering vma access, which is
      tracked currently by engine. The second is implied by sequential
      execution of commands inside the ringbuffer. This timeline is one that
      maps to userspace's expectations when submitting requests (i.e. given the
      same context, batch A is executed before batch B). As the rings's
      timelines map to userspace and the GTT timeline an implementation
      detail, move the timeline from the GTT into the ring itself (per-context
      in logical-ring-contexts/execlists, or a global per-engine timeline for
      the shared ringbuffers in legacy submission.
      
      The two timelines are still assumed to be equivalent at the moment (no
      migrating requests between engines yet) and so we can simply move from
      one to the other without adding extra ordering.
      
      v2: Reinforce that one isn't allowed to mix the engine execution
      timeline with the client timeline from userspace (on the ring).
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180502163839.3248-1-chris@chris-wilson.co.uk
      65fcb806
  10. 30 4月, 2018 3 次提交
  11. 26 4月, 2018 1 次提交
  12. 19 4月, 2018 1 次提交
  13. 12 4月, 2018 1 次提交
  14. 11 4月, 2018 1 次提交
  15. 07 4月, 2018 3 次提交
  16. 24 3月, 2018 1 次提交
  17. 16 3月, 2018 4 次提交
  18. 14 3月, 2018 4 次提交
    • J
      drm/i915/guc: Check the locking status of GuC WOPCM registers · f08e2035
      Jackie Li 提交于
      GuC WOPCM registers are write-once registers. Current driver code accesses
      these registers without checking the accessibility to these registers which
      will lead to unpredictable driver behaviors if these registers were touch
      by other components (such as faulty BIOS code).
      
      This patch moves the GuC WOPCM registers updating code into intel_wopcm.c
      and adds check before and after the update to GuC WOPCM registers so that
      we can make sure the driver is in a known state after writing to these
      write-once registers.
      
      v6:
       - Made sure module reloading won't bug the kernel while doing
         locking status checking
      
      v7:
       - Fixed patch format issues
      
      v8:
       - Fixed coding style issue on register lock bit macro definition (Sagar)
      
      v9:
       - Avoided to use redundant !! to cast uint to bool (Chris)
       - Return error code instead of GEM_BUG_ON for locked with invalid register
         values case (Sagar)
       - Updated guc_wopcm_hw_init to use guc_wopcm as first parameter (Michal)
       - Added code to set and validate the HuC_LOADING_AGENT_GUC bit in GuC
         WOPCM offset register based on the presence of HuC firmware (Michal)
       - Use bit fields instead of macros for GuC WOPCM flags (Michal)
      
      v10:
       - Refined variable names, removed redundant comments (Joonas)
       - Introduced lockable_reg to handle the write once register write and
         propagate the write error to caller (Joonas)
       - Used lockable_reg abstraction to avoid locking bit check on generic
         i915_reg_t (Michal)
       - Added log message for error paths (Michal)
       - Removed hw_updated flag and only relies on real hardware status
      
      v11:
       - Replaced lockable_reg with simplified function (Michal)
       - Used new macros for locking bits of WOPCM size/offset registers instead
         of using BIT(0) directly (Michal)
       - use intel_wopcm_init_hw() called from intel_gem_init_hw() to do GuC
         WOPCM register setup instead of calling from intel_uc_init_hw() (Michal)
      
      v12:
       - Updated function kernel-doc to align with code changes (Michal)
       - Updated code to use wopcm pointer directly (Michal)
      
      v13:
       - Updated the ordering of s-o-b/cc/r-b tags (Sagar)
      
      BSpec: 10875, 10833
      Signed-off-by: NJackie Li <yaodong.li@intel.com>
      Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> (v11)
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v12)
      Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/1520987574-19351-5-git-send-email-yaodong.li@intel.com
      f08e2035
    • J
      drm/i915: Implement dynamic GuC WOPCM offset and size calculation · 6b0478fb
      Jackie Li 提交于
      Hardware may have specific restrictions on GuC WOPCM offset and size. On
      Gen9, the value of the GuC WOPCM size register needs to be larger than the
      value of GuC WOPCM offset register + a Gen9 specific offset (144KB) for
      reserved GuC WOPCM. Fail to enforce such a restriction on GuC WOPCM size
      will lead to GuC firmware execution failures. On the other hand, with
      current static GuC WOPCM offset and size values (512KB for both offset and
      size), the GuC WOPCM size verification will fail on Gen9 even if it can be
      fixed by lowering the GuC WOPCM offset by calculating its value based on
      HuC firmware size (which is likely less than 200KB on Gen9), so that we can
      have a GuC WOPCM size value which is large enough to pass the GuC WOPCM
      size check.
      
      This patch updates the reserved GuC WOPCM size for RC6 context on Gen9 to
      24KB to strictly align with the Gen9 GuC WOPCM layout. It also adds support
      to verify the GuC WOPCM size aganist the Gen9 hardware restrictions. To
      meet all above requirements, let's provide dynamic partitioning of the
      WOPCM that will be based on platform specific HuC/GuC firmware sizes.
      
      v2:
       - Removed intel_wopcm_init (Ville/Sagar/Joonas)
       - Renamed and Moved the intel_wopcm_partition into intel_guc (Sagar)
       - Removed unnecessary function calls (Joonas)
       - Init GuC WOPCM partition as soon as firmware fetching is completed
      
      v3:
       - Fixed indentation issues (Chris)
       - Removed layering violation code (Chris/Michal)
       - Created separat files for GuC wopcm code  (Michal)
       - Used inline function to avoid code duplication (Michal)
      
      v4:
       - Preset the GuC WOPCM top during early GuC init (Chris)
       - Fail intel_uc_init_hw() as soon as GuC WOPCM partitioning failed
      
      v5:
       - Moved GuC DMA WOPCM register updating code into intel_wopcm.c
       - Took care of the locking status before writing to GuC DMA
         Write-Once registers. (Joonas)
      
      v6:
       - Made sure the GuC WOPCM size to be multiple of 4K (4K aligned)
      
      v8:
       - Updated comments and fixed naming issues (Sagar/Joonas)
       - Updated commit message to include more description about the hardware
         restriction on GuC WOPCM size (Sagar)
      
      v9:
       - Minor changes variable names and code comments (Sagar)
       - Added detailed GuC WOPCM layout drawing (Sagar/Michal)
       - Refined macro definitions to be reader friendly (Michal)
       - Removed redundent check to valid flag (Michal)
       - Unified first parameter for exported GuC WOPCM functions (Michal)
       - Refined the name and parameter list of hardware restriction checking
         functions (Michal)
      
      v10:
       - Used shorter function name for internal functions (Joonas)
       - Moved init-ealry function into c file (Joonas)
       - Consolidated and removed redundant size checks (Joonas/Michal)
       - Removed unnecessary unlikely() from code which is only called once
         during boot (Joonas)
       - More fixes to kernel-doc format and content (Michal)
       - Avoided the use of PAGE_MASK for 4K pages (Michal)
       - Added error log messages to error paths (Michal)
      
      v11:
       - Replaced intel_guc_wopcm with more generic intel_wopcm and attached
         intel_wopcm to drm_i915_private instead intel_guc (Michal)
       - dynamic calculation of GuC non-wopcm memory start (a.k.a WOPCM Top
         offset from GuC WOPCM base) (Michal)
       - Moved WOPCM marco definitions into .c source file (Michal)
       - Exported WOPCM layout diagram as kernel-doc (Michal)
      
      v12:
       - Updated naming, function kernel-doc to align with new changes (Michal)
      
      v13:
       - Updated the ordering of s-o-b/cc/r-b tags (Sagar)
       - Corrected one tense error in comment (Sagar)
       - Corrected typos and removed spurious comments (Joonas)
      
      Bspec: 12690
      Signed-off-by: NJackie Li <yaodong.li@intel.com>
      Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
      Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
      Cc: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com>
      Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
      Cc: John Spotswood <john.a.spotswood@intel.com>
      Cc: Oscar Mateo <oscar.mateo@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> (v8)
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v9)
      Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> (v11)
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v12)
      Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/1520987574-19351-2-git-send-email-yaodong.li@intel.com
      6b0478fb
    • C
      drm/i915: Show GEM_TRACE when detecting a failed GPU idle · 629820fc
      Chris Wilson 提交于
      If we timeout waiting for the GPU to idle, something went seriously
      wrong. We currently dump the engine state, but we can also dump the
      ftrace buffer showing our last operations (when available).
      
      In passing, note that since commit 559e040f ("drm/i915: Show the GPU
      state when declaring wedged", we now show the engine state twice, once
      in detecting the failed idle and then again on declaring wedged.
      
      v2: ftrace_dump() takes a parameter specifying whether to dump all cpu
      buffers or the local cpu's.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180309101114.1138-1-chris@chris-wilson.co.uk
      629820fc
    • D
      drm/i915/frontbuffer: Pull frontbuffer_flush out of gem_obj_pin_to_display · 07bcd99b
      Dhinakaran Pandiyan 提交于
      i915_gem_obj_pin_to_display() calls frontbuffer_flush with origin set to
      DIRTYFB. The callers however are at a vantage point to decide if hardware
      frontbuffer tracking can do the flush for us. For example, legacy cursor
      updates, like flips, write to MMIO registers, which then triggers PSR flush
      by the hardware. Moving frontbuffer_flush out will enable us to skip a
      software initiated flush by setting origin to FLIP. Thanks to Chris for the
      idea.
      
      v2:
      Rebased due to Ville adding intel_plane_pin_fb().
      Minor code reordering as fb_obj_flush doesn't need struct_mutex (Chris)
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
      Signed-off-by: NDhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
      Reviewed-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
      Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180307033420.3086-1-dhinakaran.pandiyan@intel.com
      07bcd99b
  19. 13 3月, 2018 2 次提交
  20. 09 3月, 2018 4 次提交