1. 10 12月, 2014 2 次提交
    • D
      drm/i915: Invert the mask and val arguments in wa_add() and WA_REG() · cf4b0de6
      Damien Lespiau 提交于
      While trying to unify the order of those arguments throughout the
      driver, Daniel noticed what we were inverting them in this part of the
      code.
      Suggested-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: NDamien Lespiau <damien.lespiau@intel.com>
      Signed-off-by: NJani Nikula <jani.nikula@intel.com>
      cf4b0de6
    • D
      drm/i915/bdw: Fix the write setting up the WIZ hashing mode · 98533251
      Damien Lespiau 提交于
      I was playing with clang and oh surprise! a warning trigerred by
      -Wshift-overflow (gcc doesn't have this one):
      
          WA_SET_BIT_MASKED(GEN7_GT_MODE,
                            GEN6_WIZ_HASHING_MASK | GEN6_WIZ_HASHING_16x4);
      
          drivers/gpu/drm/i915/intel_ringbuffer.c:786:2: warning: signed shift result
            (0x28002000000) requires 43 bits to represent, but 'int' only has 32 bits
            [-Wshift-overflow]
              WA_SET_BIT_MASKED(GEN7_GT_MODE,
              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          drivers/gpu/drm/i915/intel_ringbuffer.c:737:15: note: expanded from macro
            'WA_SET_BIT_MASKED'
              WA_REG(addr, _MASKED_BIT_ENABLE(mask), (mask) & 0xffff)
      
      Turned out GEN6_WIZ_HASHING_MASK was already shifted by 16, and we were
      trying to shift it a bit more.
      
      The other thing is that it's not the usual case of setting WA bits here, we
      need to have separate mask and value.
      
      To fix this, I've introduced a new _MASKED_FIELD() macro that takes both the
      (unshifted) mask and the desired value and the rest of the patch ripples
      through from it.
      
      This bug was introduced when reworking the WA emission in:
      
        Commit 7225342a
        Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
        Date:   Tue Oct 7 17:21:26 2014 +0300
      
            drm/i915: Build workaround list in ring initialization
      
      v2: Invert the order of the mask and value arguments (Daniel Vetter)
          Rewrite _MASKED_BIT_ENABLE() and _MASKED_BIT_DISABLE() with
          _MASKED_FIELD() (Jani Nikula)
          Make sure we only evaluate 'a' once in _MASKED_BIT_ENABLE() (Dave Gordon)
          Add check to ensure the value is within the mask boundaries (Chris Wilson)
      
      v3: Ensure the the value and mask are 16 bits (Dave Gordon)
      
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: Arun Siluvery <arun.siluvery@linux.intel.com>
      Signed-off-by: NDamien Lespiau <damien.lespiau@intel.com>
      Signed-off-by: NJani Nikula <jani.nikula@intel.com>
      98533251
  2. 20 11月, 2014 2 次提交
    • C
      drm/i915: Remove DRI1 ring accessors and API · 5c6c6003
      Chris Wilson 提交于
      With the deprecation of UMS, and by association DRI1, we have a tough
      choice when updating the ring access routines. We either rewrite the
      DRI1 routines blindly without testing (so likely to be broken) or take
      the liberty of declaring them no longer supported and remove them
      entirely. This takes the latter approach.
      
      v2: Also remove the DRI1 sarea updates
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      [danvet: Fix rebase conflicts.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      5c6c6003
    • T
      drm/i915/bdw: Pin the ringbuffer backing object to GGTT on-demand · 7ba717cf
      Thomas Daniel 提交于
      Same as with the context, pinning to GGTT regardless is harmful (it
      badly fragments the GGTT and can even exhaust it).
      
      Unfortunately, this case is also more complex than the previous one
      because we need to map and access the ringbuffer in several places
      along the execbuffer path (and we cannot make do by leaving the
      default ringbuffer pinned, as before). Also, the context object
      itself contains a pointer to the ringbuffer address that we have to
      keep updated if we are going to allow the ringbuffer to move around.
      
      v2: Same as with the context pinning, we cannot really do it during
      an interrupt. Also, pin the default ringbuffers objects regardless
      (makes error capture a lot easier).
      
      v3: Rebased. Take a pin reference of the ringbuffer for each item
      in the execlist request queue because the hardware may still be using
      the ringbuffer after the MI_USER_INTERRUPT to notify the seqno update
      is executed.  The ringbuffer must remain pinned until the context save
      is complete.  No longer pin and unpin ringbuffer in
      populate_lr_context() - this transient address is meaningless and the
      pinning can cause a sleep while atomic.
      
      v4: Moved ringbuffer pin and unpin into the lr_context_pin functions.
      Downgraded pinning check BUG_ONs to WARN_ONs.
      
      v5: Reinstated WARN_ONs for unexpected execlist states.  Removed unused
      variable.
      
      Issue: VIZ-4277
      Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
      Signed-off-by: NThomas Daniel <thomas.daniel@intel.com>
      Reviewed-by: NAkash Goel <akash.goels@gmail.com>
      Reviewed-by: Deepak S<deepak.s@linux.intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      7ba717cf
  3. 14 11月, 2014 4 次提交
  4. 05 11月, 2014 1 次提交
  5. 24 10月, 2014 3 次提交
  6. 30 9月, 2014 1 次提交
  7. 29 9月, 2014 1 次提交
    • R
      drm/i915: Minimize the huge amount of unecessary fbc sw cache clean. · 1d73c2a8
      Rodrigo Vivi 提交于
      The sw cache clean on BDW is a tempoorary workaround because we cannot
      set cache clean on blt ring with risk of hungs. So we are doing the cache clean on sw.
      However we are doing much more than needed. Not only when using blt ring.
      So, with this extra w/a we minimize the ammount of cache cleans and call it only
      on same cases that it was being called on gen7.
      
      The traditional FBC Cache clean happens over LRI on BLT ring when there is a
      frontbuffer touch happening. frontbuffer tracking set fbc_dirty variable
      to let BLT flush that it must clean FBC cache.
      
      fbc.need_sw_cache_clean works in the opposite information direction
      of ring->fbc_dirty telling software on frontbuffer tracking to perform
      the cache clean on sw side.
      
      v2: Clean it a little bit and fully check for Broadwell instead of gen8.
      
      v3: Rebase after frontbuffer organization.
      
      v4: Wiggle confused me. So fixing v3!
      
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
      Reviewed-by: NPaulo Zanoni <paulo.r.zanoni@intel.com>
      Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      1d73c2a8
  8. 24 9月, 2014 2 次提交
  9. 19 9月, 2014 2 次提交
    • D
      drm/i915: Fix irq checks in ring->irq_get/put functions · 7cd512f1
      Daniel Vetter 提交于
      Yet another place that wasn't properly transformed when implementing
      SOix. While at it convert the checks to WARN_ON on gen5+ (since we
      don't have UMS potentially doing stupid things on those platforms).
      And also add the corresponding checks to the put functions (again with
      a WARN_ON) for gen5+.
      
      v2: Drop the WARNINGS in the irq_put functions (including the existing
      one for vebox), Chris convinced me that they're not that terribly
      useful.
      
      v3: Don't forget about execlist code.
      
      Cc: Imre Deak <imre.deak@intel.com>
      Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
      Cc: "Volkin, Bradley D" <bradley.d.volkin@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com>
      7cd512f1
    • C
      drm/i915: HSW always use GGTT selector for secure batches · 77072258
      Chris Wilson 提交于
      gen6 and earlier conflate address space selection (ppgtt vs ggtt) with
      the security bit (i.e. only privileged batches were allowed to run from
      ggtt). From Haswell only, you are able to select the security bit
      separate from the address space - and we always requested to use ppgtt.
      This breaks the golden render state batch execution with full-ppgtt as
      that is only present in the global GTT and more generally any secure
      batch that is not colocated in the ppgtt and ggtt. So we need to
      disable the use of the ppgtt selector bit for secure batches, or else we
      hang immediately upon boot and thence after every GPU reset...
      
      v2: Only HSW differentiates between secure dispatch and ggtt, so simply
      ignore the differentiation and always use secure==ggtt.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
      [danvet: Rectify commit message as noted by Chris.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      77072258
  10. 15 9月, 2014 1 次提交
  11. 08 9月, 2014 1 次提交
    • C
      drm/i915: Evict CS TLBs between batches · c4d69da1
      Chris Wilson 提交于
      Running igt, I was encountering the invalid TLB bug on my 845g, despite
      that it was using the CS workaround. Examining the w/a buffer in the
      error state, showed that the copy from the user batch into the
      workaround itself was suffering from the invalid TLB bug (the first
      cacheline was broken with the first two words reversed). Time to try a
      fresh approach. This extends the workaround to write into each page of
      our scratch buffer in order to overflow the TLB and evict the invalid
      entries. This could be refined to only do so after we update the GTT,
      but for simplicity, we do it before each batch.
      
      I suspect this supersedes our current workaround, but for safety keep
      doing both.
      
      v2: The magic number shall be 2.
      
      This doesn't conclusively prove that it is the mythical TLB bug we've
      been trying to workaround for so long, that it requires touching a number
      of pages to prevent the corruption indicates to me that it is TLB
      related, but the corruption (the reversed cacheline) is more subtle than
      a TLB bug, where we would expect it to read the wrong page entirely.
      
      Oh well, it prevents a reliable hang for me and so probably for others
      as well.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Cc: stable@vger.kernel.org
      Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: NJani Nikula <jani.nikula@intel.com>
      c4d69da1
  12. 04 9月, 2014 1 次提交
  13. 03 9月, 2014 7 次提交
  14. 20 8月, 2014 1 次提交
  15. 13 8月, 2014 1 次提交
    • D
      drm/i915: Fix up checks for aliasing ppgtt · 896ab1a5
      Daniel Vetter 提交于
      A subsequent patch will no longer initialize the aliasing ppgtt if we
      have full ppgtt enabled, since we simply don't need that any more.
      
      Unfortunately a few places check for the aliasing ppgtt instead of
      checking for ppgtt in general. Fix them up.
      
      One special case are the gtt offset and size macros, which have some
      code to remap the aliasing ppgtt to the global gtt. The aliasing ppgtt
      is _not_ a logical address space, so passing that in as the vm is
      plain and simple a bug. So just WARN about it and carry on - we have a
      gracefully fall-through anyway if we can't find the vma.
      Reviewed-by: NMichel Thierry <michel.thierry@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      896ab1a5
  16. 12 8月, 2014 2 次提交
    • O
      drm/i915/bdw: GEN-specific logical ring emit flush · 4712274c
      Oscar Mateo 提交于
      Same as the legacy-style ring->flush.
      
      v2: The BSD invalidate bit still exists in GEN8! Add it for the VCS
      rings (but still consolidate the blt and bsd ring flushes into one).
      This was noticed by Brad Volkin.
      
      v3: The command for BSD and for other rings is slightly different:
      get it exactly the same as in gen6_ring_flush + gen6_bsd_ring_flush
      Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
      Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com>
      [danvet: Checkpatch.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      4712274c
    • O
      drm/i915/bdw: New logical ring submission mechanism · 82e104cc
      Oscar Mateo 提交于
      Well, new-ish: if all this code looks familiar, that's because it's
      a clone of the existing submission mechanism (with some modifications
      here and there to adapt it to LRCs and Execlists).
      
      And why did we do this instead of reusing code, one might wonder?
      Well, there are some fears that the differences are big enough that
      they will end up breaking all platforms.
      
      Also, Execlists offer several advantages, like control over when the
      GPU is done with a given workload, that can help simplify the
      submission mechanism, no doubt. I am interested in getting Execlists
      to work first and foremost, but in the future this parallel submission
      mechanism will help us to fine tune the mechanism without affecting
      old gens.
      
      v2: Pass the ringbuffer only (whenever possible).
      Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
      Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com>
      [danvet: Appease checkpatch. Again. And drop the legacy sarea gunk
      that somehow crept in.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      82e104cc
  17. 11 8月, 2014 5 次提交
  18. 08 8月, 2014 2 次提交
  19. 07 8月, 2014 1 次提交