1. 14 Apr, 2016 9 commits
  2. 13 Apr, 2016 25 commits
  3. 12 Apr, 2016 6 commits
    • T
      drm/i915: Only grab correct forcewake for the engine with execlists · 3756685a
      Committed by Tvrtko Ursulin
      Rather than blindly waking up all forcewake domains on command
      submission, we can teach each engine what is (or are) the correct
      one to take.
      
      On platforms with multiple forcewake domains like VLV, CHV, SKL
      and BXT, this has the potential of lowering the GPU and CPU
      power use and submission latency.
      
      To implement it we add a function named
      intel_uncore_forcewake_for_reg whose purpose is to query which
      forcewake domains need to be taken to read or write a specific
      register with raw mmio accessors.
      
      This enables the execlists engine setup to query which
      forcewake domains are relevant per engine on the currently
      running platform.
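
As a rough illustration of the query described above, the following sketch maps a register offset to the forcewake domains that must be held for a raw access, and lets an engine cache the answer at setup time. It is a simplified userspace model, not the driver's code: the domain bits, the range table, the offsets, and the names forcewake_for_reg and engine_init are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical domain bits; the real driver defines its own. */
enum fw_domains {
    FW_NONE    = 0,
    FW_RENDER  = 1 << 0,
    FW_BLITTER = 1 << 1,
    FW_MEDIA   = 1 << 2,
};

/* Maps an inclusive register offset range to a domain mask. */
struct fw_range {
    uint32_t start;
    uint32_t end;
    unsigned int domains;
};

/* Made-up ranges for illustration only, not taken from any platform. */
static const struct fw_range example_ranges[] = {
    { 0x02000, 0x02fff, FW_RENDER },
    { 0x12000, 0x12fff, FW_MEDIA },
    { 0x22000, 0x22fff, FW_BLITTER },
};

/* Return the domains to hold before a raw mmio access to `offset`. */
static unsigned int forcewake_for_reg(uint32_t offset)
{
    size_t i;

    for (i = 0; i < sizeof(example_ranges) / sizeof(example_ranges[0]); i++) {
        if (offset >= example_ranges[i].start &&
            offset <= example_ranges[i].end)
            return example_ranges[i].domains;
    }
    return FW_NONE; /* register needs no forcewake in this model */
}

/* Hypothetical engine state: the mask is looked up once at setup. */
struct engine {
    uint32_t submit_reg;
    unsigned int fw_domains;
};

static void engine_init(struct engine *e, uint32_t submit_reg)
{
    assert(e != NULL);
    e->submit_reg = submit_reg;
    e->fw_domains = forcewake_for_reg(submit_reg);
}
```

With the mask cached per engine, the submission path only has to wake e->fw_domains instead of every domain.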
      
      v2:
        * Kerneldoc.
        * Split from intel_uncore.c macro extraction, WARN_ON,
          no warns on old platforms. (Chris Wilson)
      
      v3:
        * Single domain per engine, mention all registers,
          bi-directional function and a new name, fix handling
          of gen6 and gen7 writes. (Chris Wilson)
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Link: http://patchwork.freedesktop.org/patch/msgid/1460468251-14069-1-git-send-email-tvrtko.ursulin@linux.intel.com
    • T
      drm/i915: Remove forcewake request registers from the shadowed table · a70ecc16
      Committed by Tvrtko Ursulin
      Chris Wilson points out that we can remove them from the array
      since they are always written to with raw accessors.
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
    • T
      drm/i915: Extract knowledge of register forcewake domains · 6863b76c
      Committed by Tvrtko Ursulin
      Knowledge of which register per platform belongs in which
      forcewake domain was embedded in the MMIO accessors themselves.
      
      Extract it into standalone macros so they can be used from
      new code in the following patches.
      
      This causes GCC to compile some of the MMIO accessors slightly
      differently and grows the code a tiny amount. But none of the
      growth is on the fast-path so it does not matter hugely.
      
      Affected sizes before:
      
      00000000000026f0 00000000000001a5 t gen6_read16
      0000000000002390 00000000000001a5 t gen6_read32
      00000000000028a0 00000000000001a5 t gen6_read64
      
      00000000000061d0 000000000000019e t gen8_write16
      0000000000006510 000000000000019d t gen8_write32
      0000000000006370 000000000000019d t gen8_write64
      00000000000021f0 000000000000019d t gen8_write8
      
      Affected sizes after:
      
      0000000000002840 00000000000001aa t gen6_read16
      00000000000024e0 00000000000001a9 t gen6_read32
      00000000000029f0 00000000000001a9 t gen6_read64
      
      0000000000004f20 00000000000001b5 t gen8_write16
      0000000000004ba0 00000000000001b4 t gen8_write32
      00000000000050e0 00000000000001b4 t gen8_write64
      0000000000004d60 00000000000001b4 t gen8_write8
      
      Other MMIO accessors are not affected in size.
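
The growth implied by the listings above can be worked out directly, assuming the second column is the symbol size in hexadecimal bytes. The sketch below is only a calculator over the numbers quoted in this message; the struct and function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Symbol sizes in bytes, copied from the before/after listings. */
struct sym_size {
    const char *name;
    unsigned long before;
    unsigned long after;
};

static const struct sym_size affected[] = {
    { "gen6_read16",  0x1a5, 0x1aa },
    { "gen6_read32",  0x1a5, 0x1a9 },
    { "gen6_read64",  0x1a5, 0x1a9 },
    { "gen8_write16", 0x19e, 0x1b5 },
    { "gen8_write32", 0x19d, 0x1b4 },
    { "gen8_write64", 0x19d, 0x1b4 },
    { "gen8_write8",  0x19d, 0x1b4 },
};

/* Growth of one symbol in bytes. */
static long growth(const struct sym_size *s)
{
    assert(s != NULL);
    return (long)s->after - (long)s->before;
}

/* Sum of the growth over all affected accessors. */
static long total_growth(void)
{
    long total = 0;
    size_t i;

    for (i = 0; i < sizeof(affected) / sizeof(affected[0]); i++)
        total += growth(&affected[i]);
    return total;
}
```

By this count the gen6 read accessors grow by 4 to 5 bytes each and the gen8 write accessors by 23 bytes each, 105 bytes in total.
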
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
    • T
      drm/i915: Do not serialize forcewake acquire across domains · 4e1176dd
      Committed by Tvrtko Ursulin
      On platforms with multiple forcewake domains it seems more efficient
      to request all desired ones and then to wait for acks to avoid
      needlessly serializing on each domain.
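
The intuition can be put into a small cost model. The sketch below assumes each domain acknowledges independently after some latency and that outstanding requests progress concurrently; neither assumption is verified here, and the function names and any latencies fed to them are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Serialized: request one domain, wait for its ack, then move on.
 * The total wait is the sum of the per-domain ack latencies.
 */
static unsigned int serialized_wait_us(const unsigned int *ack_us, size_t n)
{
    unsigned int total = 0;
    size_t i;

    assert(ack_us != NULL || n == 0);
    for (i = 0; i < n; i++)
        total += ack_us[i];
    return total;
}

/*
 * Batched: request every domain first, then wait for all acks.
 * Under the concurrency assumption the total wait is bounded by
 * the slowest domain rather than the sum.
 */
static unsigned int batched_wait_us(const unsigned int *ack_us, size_t n)
{
    unsigned int worst = 0;
    size_t i;

    assert(ack_us != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (ack_us[i] > worst)
            worst = ack_us[i];
    return worst;
}
```

For example, with three domains acking after 20, 35 and 15 microseconds, the serialized path waits 70 microseconds in this model while the batched path waits 35.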
      
      v2: Rebase.
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Link: http://patchwork.freedesktop.org/patch/msgid/1460045074-1006-1-git-send-email-tvrtko.ursulin@linux.intel.com
    • T
      drm/i915: Simplify for_each_fw_domain iterators · 33c582c1
      Committed by Tvrtko Ursulin
      As the vast majority of users do not use the domain id variable,
      we can eliminate it from the iterator and also change the latter
      using the same principle as was recently done for for_each_engine.
      
      For a couple of callers which do need the domain mask, store it
      in the domain array (which already has the domain id), then both
      can be retrieved thence.
      
      Result is clearer code and smaller generated binary, especially
      in the tight fw get/put loops. Also, relationship between domain
      id and mask is no longer assumed in the macro.
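
A minimal sketch of the iterator shape described above, assuming a fixed-size domain array in which each entry stores both its id and its mask. Only the name for_each_fw_domain_masked comes from this message; the struct layout, FW_DOMAIN_COUNT, uncore_init and fw_get are hypothetical, and the real macro may differ.

```c
#include <assert.h>
#include <stddef.h>

#define FW_DOMAIN_COUNT 3

struct fw_domain {
    unsigned int id;
    unsigned int mask;       /* stored per domain, not derived from id */
    unsigned int wake_count;
};

struct uncore {
    struct fw_domain fw_domain[FW_DOMAIN_COUNT];
};

/*
 * Walk the array by pointer, with no separate id variable.
 * The inverted test with an empty body keeps a caller's trailing
 * `else` from binding to the macro's own `if`.
 */
#define for_each_fw_domain_masked(d, mask__, u)              \
    for ((d) = &(u)->fw_domain[0];                           \
         (d) < &(u)->fw_domain[FW_DOMAIN_COUNT];             \
         (d)++)                                              \
        if (!((d)->mask & (mask__))) {} else

static void uncore_init(struct uncore *u)
{
    unsigned int i;

    assert(u != NULL);
    for (i = 0; i < FW_DOMAIN_COUNT; i++) {
        u->fw_domain[i].id = i;
        u->fw_domain[i].mask = 1u << i; /* any mapping would work */
        u->fw_domain[i].wake_count = 0;
    }
}

/* Take a reference on every domain selected by `mask`. */
static void fw_get(struct uncore *u, unsigned int mask)
{
    struct fw_domain *d;

    for_each_fw_domain_masked(d, mask, u)
        d->wake_count++;
}
```

Callers that need the id or the mask read d->id or d->mask from the entry itself.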
      
      v2: Improve grammar in the commit message and rename the
          iterator to for_each_fw_domain_masked for consistency.
          (Dave Gordon)
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
    • T
      drm/i915: Use consistent forcewake auto-release timeout across kernel configs · a57a4a67
      Committed by Tvrtko Ursulin
      Because it is based on jiffies, the current implementation releases the
      forcewake at some point between immediately and an upper bound of 1ms
      to 10ms, depending on the kernel configuration (CONFIG_HZ).
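
The dependence on CONFIG_HZ is simple arithmetic. The sketch below assumes the release timer is armed one jiffy ahead, so it fires on the next tick, which may be anywhere from zero to one full jiffy away. The function names are made up for illustration.

```c
#include <assert.h>

/* Length of one jiffy in microseconds for a given CONFIG_HZ value. */
static unsigned int jiffy_us(unsigned int hz)
{
    assert(hz > 0);
    return 1000000u / hz;
}

/*
 * Upper bound on the release delay when the timer is armed for the
 * next tick (assumption). The lower bound is zero, since the tick
 * may arrive right after the timer is armed.
 */
static unsigned int max_release_delay_us(unsigned int hz)
{
    return jiffy_us(hz);
}
```

Under that assumption HZ=1000 gives a 0-1ms window and HZ=100 gives a 0-10ms window, which matches the range quoted above.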
      
      This is probably not what has been desired, since the dynamics of keeping
      parts of the GPU awake should not be correlated with this kernel
      configuration parameter.
      
      Change the auto-release mechanism to use hrtimers and set the timeout to
      1ms with 1ms of slack. This should make the GPU power consistent
      across kernel configs, and timer slack should enable some timer coalescing
      where multiple force-wake domains exist, or with unrelated timers.
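
To make the effect of slack concrete, the following model treats a timer with a timeout and slack as a window in which it is allowed to fire, and checks whether two such windows overlap so that a single wakeup could serve both. It is a plain userspace sketch, not kernel code: the struct and function names are hypothetical, and the kernel's actual coalescing policy is not modelled.

```c
#include <assert.h>

/* Interval (in microseconds) in which a timer is allowed to fire. */
struct fire_window {
    unsigned int earliest_us;
    unsigned int latest_us;
};

/* Window for a timer armed at `armed_at_us` with a timeout and slack. */
static struct fire_window timer_window(unsigned int armed_at_us,
                                       unsigned int timeout_us,
                                       unsigned int slack_us)
{
    struct fire_window w;

    w.earliest_us = armed_at_us + timeout_us;
    w.latest_us = w.earliest_us + slack_us;
    return w;
}

/* Two timers can share one wakeup if their windows overlap. */
static int can_coalesce(struct fire_window a, struct fire_window b)
{
    return a.earliest_us <= b.latest_us && b.earliest_us <= a.latest_us;
}
```

With a 1ms timeout and 1ms of slack, a timer armed at t=0 may fire in 1-2ms and one armed at t=0.6ms in 1.6-2.6ms, so in this model both could be served by one wakeup; with zero slack they could not.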
      
      For GlBench/T-Rex this decreases the number of forcewake releases from
      ~480 to ~300 per second, and for a heavy combined OGL/OCL test from
      ~670 to ~360 (HZ=1000 kernel).
      
      Even though this reduction can be attributed to the average release period
      extending from 0-1ms to 1-2ms, as discussed above, it will make the
      forcewake timeout consistent for different CONFIG_HZ values.
      
      Real-life measurements with the above workloads have shown that, with
      this patch, both auto-release the forcewake 2-4 times per 10ms, even
      though the number of forcewake gets is dramatically different.
      
      T-Rex requests between 5-10 explicit gets and 5-10 implicit gets in each
      10ms period, while the OGL/OCL test requests 250 and 380 times in the same
      period.
      
      The two data points together suggest that the nature of the forcewake
      accesses is bursty and that further changes and potential timeout
      extensions, or moving the start of timeout from the first to the last
      automatic forcewake grab, should be carefully measured for power and
      performance effects.
      
      v2:
        * Commit spelling. (Dave Gordon)
        * More discussion on numbers in the commit. (Chris Wilson)
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>