1. 14 2月, 2015 9 次提交
  2. 29 1月, 2015 1 次提交
    • S
      drm/i915/skl: Enabling PSR on Skylake · e3d99845
      Sonika Jindal 提交于
      Mainly taking care of some register offsets, otherwise things are similar to
      hsw. Also, programming ddi aux to use hardcoded values for psr data select.
      
      v2: introduce  EDP_PSR_AUX_BASE macro (Chris)
      v3: Moving to HW tracking for SKL+ platforms, so activating source psr during
      psr_enabling and then avoiding psr entries and exits for each frontbuffer
      updates.
      v4: Using SKL DDI AUX regs instead of changing PSR_AUX regs definition (Rodrigo)
      Signed-off-by: NSonika Jindal <sonika.jindal@intel.com>
      Reviewed-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
      [danvet: Drop the hunks to short-circuit sw tracking: We'd need to
      push this down one level, and I don't fully trust the test coverage
      yet to do so. So much prefer we pick a whitelist approach for the
      cases we know work correctly.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      e3d99845
  3. 27 1月, 2015 3 次提交
  4. 13 1月, 2015 1 次提交
  5. 06 1月, 2015 1 次提交
    • K
      drm/i915: Make sample_c messages go faster on Haswell. · 94411593
      Kenneth Graunke 提交于
      Haswell significantly improved the performance of sampler_c messages,
      but the optimization appears to be off by default.  Later platforms
      remove this bit, and apparently always enable the optimization.
      
      Improves performance in "Counter Strike: Global Offensive" by 18%
      at default settings on Iris Pro.
      
      This may break sampling of paletted formats (P8/A8P8/P8A8).  It's
      unclear whether it affects sampling of paletted formats in general,
      or just the sample_c message (which is never used).
      
      While libva does have support for using paletted formats (primarily
      for OSDs), that support appears to have been broken for at least a
      year, so I couldn't observe a regression from this:
      
      I tried to get libva-intel to use paletted formats, and observe a
      regression...but the only thing I found that used it was mplayer's OSD
      (on screen display).  Even without my patch, the colors were totally
      wrong with that, and it's according to a few distro wikis, that's been
      the case for over a year.
      
      If libva's code for paletted formats /is/ broken, they could always
      add code to disable this bit using the command validator when fixing
      it.
      
      Further investigation from Haihao shows that libva mplayer OSD seems
      to work at least on his setup (still unclear what's wron with Ken's),
      and that it's not affected by this patch. Quoting the discussion
      between Haihao and Ken:
      
      > > > If you use "-vo gl" or "-vo xv", the OSD is solid white text with a black
      > > > border around it.  I presume that it's supposed to be white with vaapi as
      > > > well, but I guess I'm not entirely sure.
      > > >
      > > > It's possible that the optimization doesn't affect the palette as long as
      > > > you never use sample_c with the paletted textures.
      > >
      > > I verified the palette takes effect in the following way:
      > >
      > > 1. Only support P8A8 format in the driver
      > >
      > > 2. ran the above command and I saw white OSD text
      > >
      > > 3. Only support P4A4 format in the driver and don't use
      > > 3DSTATE_SAMPLER_PALETTE_LOAD0 to load the value to the texture palette,
      > > so the palette keeps unchanged.
      > >
      > > 4. ran the above command and I saw black OSD text.
      > >
      > > 5. Load the right value to the texture palette and ran the above command
      > > again, I saw white OSD text.
      > >
      > > Hence I think sample_c with the paletted textures is used in the driver.
      >
      > That sounds like the palette is actually working, then.  Great :)
      >
      > I doubt that libva would use sample_c - sampling with a shadow comparison?
      > It looks like it just uses sample and sample+killpix.
      
      You are right, libva driver doesn't use sample_c message.
      
      > I'm pretty sure the sample_c optimization just uses the palette memory as
      > storage for some stuff, so it's quite possible it just works if you're
      > only using sample and sample+killpix.
      
      Thanks for the explanation, it makes sense to me.
      Signed-off-by: NKenneth Graunke <kenneth@whitecape.org>
      [danvet: Add wa name from Ville's review to the comment and copypaste
      the explanation why we don't care about libva (already broken) from
      Ken. Also add conclusion from libva devs that&why this is all fine.]
      Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
      Cc: "Xiang, Haihao" <haihao.xiang@intel.com>
      Cc: libva@lists.freedesktop.org
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      94411593
  6. 16 12月, 2014 3 次提交
    • C
      drm/i915: Disable PSMI sleep messages on all rings around context switches · 2c550183
      Chris Wilson 提交于
      There exists a current workaround to prevent a hang on context switch
      should the ring go to sleep in the middle of the restore,
      WaProgramMiArbOnOffAroundMiSetContext (applicable to all gen7+). In
      spite of disabling arbitration (which prevents the ring from powering
      down during the critical section) we were still hitting hangs that had
      the hallmarks of the known erratum. That is we are still seeing hangs
      "on the last instruction in the context restore". By comparing -nightly
      (broken) with requests (working), we were able to deduce that it was the
      semaphore LRI cross-talk that reproduced the original failure. The key
      was that requests implemented deferred semaphore signalling, and
      disabling that, i.e. emitting the semaphore signal to every other ring
      after every batch restored the frequent hang.  Explicitly disabling PSMI
      sleep on the RCS ring was insufficient, all the rings had to be awake to
      prevent the hangs. Fortunately, we can reduce the wakelock to the
      MI_SET_CONTEXT operation itself, and so should be able to limit the extra
      power implications.
      
      Since the MI_ARB_ON_OFF workaround is listed for all gen7 and above
      products, we should apply this extra hammer for all of the same
      platforms despite so far that we have only been able to reproduce the
      hang on certain ivb and hsw models. The last question is whether we want
      to always use the extra hammer or only when we know semaphores are in
      operation. At the moment, we only use LRI on non-RCS rings for
      semaphores, but that may change in the future with the possibility of
      reintroducing this bug under subtle conditions.
      
      v2: Make it explicit that the PSMI LRI are an extension to the original
      workaround for the other rings.
      v3: Bikeshedding variable names and whitespacing
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80660
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83677
      Cc: Simon Farnsworth <simon@farnz.org.uk>
      Cc: Daniel Vetter <daniel@ffwll.ch>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Tested-by: NPeter Frühberger <fritsch@xbmc.org>
      Reviewed-by: NDaniel Vetter <daniel@ffwll.ch>
      Cc: stable@vger.kernel.org
      Signed-off-by: NJani Nikula <jani.nikula@intel.com>
      2c550183
    • C
      drm/i915: Invalidate media caches on gen7 · 148b83d0
      Chris Wilson 提交于
      In the gen7 pipe control there is an extra bit to flush the media
      caches, so let's set it during cache invalidation flushes.
      
      v2: Rename to MEDIA_STATE_CLEAR to be more inline with spec.
      
      Cc: Simon Farnsworth <simon@farnz.org.uk>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      Cc: stable@vger.kernel.org
      Signed-off-by: NJani Nikula <jani.nikula@intel.com>
      148b83d0
    • J
      drm/i915: Add GPGPU_THREADS_DISPATCHED to the register whitelist · c61200c2
      Jordan Justen 提交于
      This will allow us to read the number of dispatched compute threads
      for GL_ARB_pipeline_statistics_query.
      Signed-off-by: NJordan Justen <jordan.l.justen@intel.com>
      Cc: Ben Widawsky <ben@bwidawsk.net>
      Reviewed-by: NBen Widawsky <ben@bwidawsk.net>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      c61200c2
  7. 11 12月, 2014 3 次提交
  8. 10 12月, 2014 1 次提交
    • D
      drm/i915/bdw: Fix the write setting up the WIZ hashing mode · 98533251
      Damien Lespiau 提交于
      I was playing with clang and oh surprise! a warning trigerred by
      -Wshift-overflow (gcc doesn't have this one):
      
          WA_SET_BIT_MASKED(GEN7_GT_MODE,
                            GEN6_WIZ_HASHING_MASK | GEN6_WIZ_HASHING_16x4);
      
          drivers/gpu/drm/i915/intel_ringbuffer.c:786:2: warning: signed shift result
            (0x28002000000) requires 43 bits to represent, but 'int' only has 32 bits
            [-Wshift-overflow]
              WA_SET_BIT_MASKED(GEN7_GT_MODE,
              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          drivers/gpu/drm/i915/intel_ringbuffer.c:737:15: note: expanded from macro
            'WA_SET_BIT_MASKED'
              WA_REG(addr, _MASKED_BIT_ENABLE(mask), (mask) & 0xffff)
      
      Turned out GEN6_WIZ_HASHING_MASK was already shifted by 16, and we were
      trying to shift it a bit more.
      
      The other thing is that it's not the usual case of setting WA bits here, we
      need to have separate mask and value.
      
      To fix this, I've introduced a new _MASKED_FIELD() macro that takes both the
      (unshifted) mask and the desired value and the rest of the patch ripples
      through from it.
      
      This bug was introduced when reworking the WA emission in:
      
        Commit 7225342a
        Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
        Date:   Tue Oct 7 17:21:26 2014 +0300
      
            drm/i915: Build workaround list in ring initialization
      
      v2: Invert the order of the mask and value arguments (Daniel Vetter)
          Rewrite _MASKED_BIT_ENABLE() and _MASKED_BIT_DISABLE() with
          _MASKED_FIELD() (Jani Nikula)
          Make sure we only evaluate 'a' once in _MASKED_BIT_ENABLE() (Dave Gordon)
          Add check to ensure the value is within the mask boundaries (Chris Wilson)
      
      v3: Ensure the the value and mask are 16 bits (Dave Gordon)
      
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: Arun Siluvery <arun.siluvery@linux.intel.com>
      Signed-off-by: NDamien Lespiau <damien.lespiau@intel.com>
      Signed-off-by: NJani Nikula <jani.nikula@intel.com>
      98533251
  9. 05 12月, 2014 3 次提交
  10. 03 12月, 2014 4 次提交
  11. 20 11月, 2014 2 次提交
    • D
      drm/i915: Pin tiled objects for L-shaped configs · 656bfa3a
      Daniel Vetter 提交于
      Let's just throw in the towel on this one and take the cheap way out.
      
      Based on a patch from Chris Wilson, but checking for a different bit.
      Chris' patch checked for even bank layout, this one here for a magic
      bit. Given the evidence we've gathered (not much) both work I think,
      but checking for the magic bit might be more accurate.
      
      Anyway, works on my gm45 here.
      
      For paranoi restrict to gen4 (and mobile), since we've only ever seen
      this on gm45 and i965gm.
      
      Also add some debugfs output so that we can skip the tiled swapping
      tests properly in these cases.
      
      v2: Clean up the quirk'ed pin count in free_object to avoid upsetting
      the WARN_ON. Spotted by Chris.
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=28813
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45092Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com>
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      656bfa3a
    • T
      drm/i915: Use efficient frequency for HSW/BDW · 93ee2920
      Tom O'Rourke 提交于
      Added gen6_init_rps_frequencies() to initialize
      the rps frequency values.  This function replaces
      parse_rp_state_cap().  In addition to reading RPn,
      RP0, and RP1 from RP_STATE_CAP register, the new
      function reads efficient frequency (aka RPe) from
      pcode for Haswell and Broadwell and sets the turbo
      softlimits.  The turbo minimum frequency softlimit
      is set to RPe for Haswell and Broadwell and to RPn
      otherwise.
      
      For RPe, the efficiency is based on the frequency/power
      ratio (MHz/W); this is considering GT power and not
      package power.  The efficent frequency is the highest
      frequency for which the frequency/power ratio is within
      some threshold of the highest frequency/power ratio.
      A fixed decrease in frequency results in smaller
      decrease in power at frequencies less than RPe than
      at frequencies above RPe.
      
      v2: Following suggestions from Chris Wilson and
      Daniel Vetter to extend and rename parse_rp_state_cap
      and to open-code a poorly named function.
      Signed-off-by: NTom O'Rourke <Tom.O'Rourke@intel.com>
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      [danvet: Remove unused variables.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      93ee2920
  12. 18 11月, 2014 1 次提交
    • V
      drm/i915: Drop the HSW special case from __gen6_gt_wait_for_thread_c0() · eb88bd1b
      Ville Syrjälä 提交于
      Bits [18:16] of GEN6_GT_THREAD_STATUS_REG have always had the same
      meaning since SNB. So treating them as something special for HSW doesn't
      make sense to me.
      
      Also the bits *seem* to work exactly the same way on IVB, HSW GT2 and
      HSW GT3. At least intel_reg_read gives the identical results on all
      platforms with and without forcewake.
      
      Also the HSW PM guide rev 0.99 (ww05 2013) doesn't say anything about
      those bits. It just says to poll for bits [2:0]. As does the more recent
      BDW PM guide.
      
      So just drop the HSW special case and treat all platforms the same way.
      Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
      Reviewed-by: Deepak S<deepak.s@linux.intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      eb88bd1b
  13. 17 11月, 2014 1 次提交
  14. 14 11月, 2014 7 次提交