1. 28 5月, 2019 1 次提交
    • V
      drm/i915: Make sure we have enough memory bandwidth on ICL · c457d9cf
      Ville Syrjälä 提交于
      ICL has so many planes that it can easily exceed the maximum
      effective memory bandwidth of the system. We must therefore check
      that we don't exceed that limit.
      
      The algorithm is very magic number heavy and lacks sufficient
      explanation for now. We also have no sane way to query the
      memory clock and timings, so we must rely on a combination of
      raw readout from the memory controller and hardcoded assumptions.
      The memory controller values obviously change as the system
      jumps between the different SAGV points, so we try to stabilize
      it first by disabling SAGV for the duration of the readout.
      
      The utilized bandwidth is tracked via a device wide atomic
      private object. That is actually not robust because we can't
      afford to enforce strict global ordering between the pipes.
      Thus I think I'll need to change this to simply chop up the
      available bandwidth between all the active pipes. Each pipe
      can then do whatever it wants as long as it doesn't exceed
      its budget. That scheme will also require that we assume that
      any number of planes could be active at any time.
      
      TODO: make it robust and deal with all the open questions
      
      v2: Sleep longer after disabling SAGV
      v3: Poll for the dclk to get raised (seen it take 250ms!)
          If the system has 2133MT/s memory then we pointlessly
          wait one full second :(
      v4: Use the new pcode interface to get the qgv points rather
          that using hardcoded numbers
      v5: Move the pcode stuff into intel_bw.c (Matt)
          s/intel_sagv_info/intel_qgv_info/
          Do the NV12/P010 as per spec for now (Matt)
          s/IS_ICELAKE/IS_GEN11/
      v6: Ignore bandwidth limits if the pcode query fails
      Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
      Reviewed-by: NMatt Roper <matthew.d.roper@intel.com>
      Acked-by: NClint Taylor <Clinton.A.Taylor@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
      c457d9cf
  2. 25 5月, 2019 1 次提交
  3. 23 5月, 2019 1 次提交
  4. 20 5月, 2019 1 次提交
  5. 14 5月, 2019 1 次提交
    • I
      drm/i915: Add support for asynchronous display power disabling · e0da2d63
      Imre Deak 提交于
      By disabling a power domain asynchronously we can restrict holding a
      reference on that power domain to the actual code sequence that
      requires the power to be on for the HW access it's doing, by also
      avoiding unneeded on-off-on togglings of the power domain (since the
      disabling happens with a delay).
      
      One benefit is potential power saving due to the following two reasons:
      1. The fact that we will now be holding the reference only for the
         necessary duration by the end of the patchset. While simply not
         delaying the disabling has the same benefit, it has the problem that
         frequent on-off-on power switching has its own power cost (see the 2.
         point below) and the debug trace for power well on/off events will
         cause a lot of dmesg spam (see details about this further below).
      2. Avoiding the power cost of freuqent on-off-on power switching. This
         requires us to find the optimal disabling delay based on the measured
         power cost of on->off and off->on switching of each power well vs.
         the power of keeping the given power well on.
      
         In this patchset I'm not providing this optimal delay for two
         reasons:
         a) I don't have the means yet to perform the measurement (with high
            enough signal-to-noise ratio, or with the help of an energy
            counter that takes switching into account). I'm currently looking
            for a way to measure this.
      
         b) Before reducing the disabling delay we need an alternative way for
            debug tracing powerwell on/off events. Simply avoiding/throttling
            the debug messages is not a solution, see further below.
      
         Note that even in the case where we can't measure any considerable
         power cost of frequent on-off switching of powerwells, it still would
         make sense to do the disabling asynchronously (with 0 delay) to avoid
         blocking on the disabling. On VLV I measured this disabling time
         overhead to be 1ms on average with a worst case of 4ms.
      
      In the case of the AUX power domains on ICL we would also need to keep
      the sequence where we hold the power reference short, the way it would
      be by the end of this patchset where we hold it only for the actual AUX
      transfer. Anything else would make the locking we need for ICL TypeC
      ports (whenever we hold a reference on any AUX power domain) rather
      problematic, adding for instance unnecessary lockdep dependencies to
      the required TypeC port lock.
      
      I chose the disabling delay to be 100msec for now to avoid the unneeded
      toggling (and so not to introduce dmesg spamming) in the DP MST sideband
      signaling code. We could optimize this delay later, once we have the
      means to measure the switching power cost (see above).
      
      Note that simply removing/throttling the debug tracing for power well
      on/off events is not a solution. We need to know the exact spots of
      these events and cannot rely only on incorrect register accesses caught
      (due to not holding a wakeref at the time of access). Incorrect
      powerwell enabling/disabling could lead to other problems, for instance
      we need to keep certain powerwells enabled for the duration of modesets
      and AUX transfers.
      
      v2:
      - Clarify the commit log parts about power cost measurement and the
        problem of simply removing/throttling debug tracing. (Chris)
      - Optimize out local wakeref vars at intel_runtime_pm_put_raw() and
        intel_display_power_put_async() call sites if
        CONFIG_DRM_I915_DEBUG_RUNTIME_PM=n. (Chris)
      - Rebased on v2 of the wakeref w/o power-on guarantee patch.
      - Add missing docbook headers.
      v3:
      - Checkpatch spelling/missing-empty-line fix.
      v4:
      - Fix unintended local wakeref var optimization when using
        call-arguments with side-effects, by using inline funcs instead of
        macros. In this patch in particular this will fix the
        intel_display_power_grab_async_put_ref()->intel_runtime_pm_put_raw()
        call).
      
        No size change in practice (would be the same disregarding the
        corresponding change in intel_display_power_grab_async_put_ref()):
        $ size i915-macro.ko
           text	   data	    bss	    dec	    hex	filename
        2455190	 105890	  10272	2571352	 273c58	i915-macro.ko
        $ size i915-inline.ko
           text	   data	    bss	    dec	    hex	filename
        2455195	 105890	  10272	2571357	 273c5d	i915-inline.ko
      
        Kudos to Stan for reporting the raw-wakeref WARNs this issue caused. His
        config has CONFIG_DRM_I915_DEBUG_RUNTIME_PM=n, which I didn't retest
        after v1, and we are also not testing this config in CI.
      
        Now tested both with CONFIG_DRM_I915_DEBUG_RUNTIME_PM=y/n on ICL,
        connecting both Chamelium and regular DP, HDMI sinks.
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
      Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
      Signed-off-by: NImre Deak <imre.deak@intel.com>
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190513192533.12586-1-imre.deak@intel.com
      e0da2d63
  6. 08 5月, 2019 3 次提交
  7. 03 5月, 2019 8 次提交
  8. 02 5月, 2019 1 次提交
  9. 01 5月, 2019 1 次提交
  10. 30 4月, 2019 8 次提交
  11. 26 4月, 2019 5 次提交
  12. 25 4月, 2019 4 次提交
  13. 24 4月, 2019 1 次提交
  14. 20 4月, 2019 1 次提交
    • C
      drm/i915: Start writeback from the shrinker · 2d6692e6
      Chris Wilson 提交于
      When we are called to relieve mempressue via the shrinker, the only way
      we can make progress is either by discarding unwanted pages (those
      objects that userspace has marked MADV_DONTNEED) or by reclaiming the
      dirty objects via swap. As we know that is the only way to make further
      progress, we can initiate the writeback as we invalidate the objects.
      This means the objects we put onto the inactive anon lru list are
      already marked for reclaim+writeback and so will trigger a wait upon the
      writeback inside direct reclaim, greatly improving the success rate of
      direct reclaim on i915 objects.
      
      The corollary is that we may start a slow swap on opportunistic
      mempressure from the likes of the compaction + migration kthreads. This
      is limited by those threads only being allowed to shrink idle pages, but
      also that if we reactivate the page before it is swapped out by gpu
      activity, we only page the cost of repinning the page. The cost is most
      felt when an object is reused after mempressure, which hopefully
      excludes the latency sensitive tasks (as we are just extending the
      impact of swap thrashing to them).
      
      Apparently this is not the first time we've had this idea. Back in
      commit 5537252b ("drm/i915: Invalidate our pages under memory
      pressure") we wanted to start writeback but settled on invalidate after
      Hugh Dickins warned us about a possibility of a deadlock within shmemfs
      if we started writeback from shrink_slab. Looking at the callchain,
      using writeback from i915_gem_shrink should be equivalent to the pageout
      also employed by shrink_slab, i.e. it should not be any riskier afaict.
      
      v2: Leave mmapings intact. At this point, the only mmapings of our
      objects will be via CPU mmaps on the shmemfs filp, which are
      out-of-scope for our LRU tracking. Instead leave those pages to the
      inactive anon LRU page list for aging and pageout as normal.
      
      v3: Be selective on which paths trigger writeback, in particular
      excluding paths shrinking just to reclaim vm space (e.g. mmap, vmap
      reapers) and avoid starting writeback on the entire process space from
      within the pm freezer.
      
      References: https://bugs.freedesktop.org/show_bug.cgi?id=108686Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Matthew Auld <matthew.auld@intel.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Michal Hocko <mhocko@suse.com>
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> #v1
      Link: https://patchwork.freedesktop.org/patch/msgid/20190420115539.29081-1-chris@chris-wilson.co.uk
      2d6692e6
  15. 19 4月, 2019 1 次提交
  16. 17 4月, 2019 1 次提交
  17. 11 4月, 2019 1 次提交