1. 16 10月, 2013 2 次提交
  2. 15 10月, 2013 8 次提交
  3. 11 10月, 2013 1 次提交
  4. 10 10月, 2013 3 次提交
  5. 09 10月, 2013 1 次提交
  6. 04 10月, 2013 3 次提交
    • C
      drm/i915: Tweak RPS thresholds to more aggressively downclock · dd75fdc8
      Chris Wilson 提交于
      After applying wait-boost we often find ourselves stuck at higher clocks
      than required. The current threshold value requires the GPU to be
      continuously and completely idle for 313ms before it is dropped by one
      bin. Conversely, we require the GPU to be busy for an average of 90% over
      a 84ms period before we upclock. So the current thresholds almost never
      downclock the GPU, and respond very slowly to sudden demands for more
      power. It is easy to observe that we currently lock into the wrong bin
      and both underperform in benchmarks and consume more power than optimal
      (just by repeating the task and measuring the different results).
      
      An alternative approach, as discussed in the bspec, is to use a
      continuous threshold for upclocking, and an average value for downclocking.
      This is good for quickly detecting and reacting to state changes within a
      frame, however it fails with the common throttling method of waiting
      upon the outstanding frame - at least it is difficult to choose a
      threshold that works well at 15,000fps and at 60fps. So continue to use
      average busy/idle loads to determine frequency change.
      
      v2: Use 3 power zones to keep frequencies low in steady-state mostly
      idle (e.g. scrolling, interactive 2D drawing), and frequencies high
      for demanding games. In between those end-states, we use a
      fast-reclocking algorithm to converge more quickly on the desired bin.
      
      v3: Bug fixes - make sure we reset adj after switching power zones.
      
      v4: Tune - drop the continuous busy thresholds as it prevents us from
      choosing the right frequency for glxgears style swap benchmarks. Instead
      the goal is to be able to find the right clocks irrespective of the
      wait-boost.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Kenneth Graunke <kenneth@whitecape.org>
      Cc: Stéphane Marchesin <stephane.marchesin@gmail.com>
      Cc: Owen Taylor <otaylor@redhat.com>
      Cc: "Meng, Mengmeng" <mengmeng.meng@intel.com>
      Cc: "Zhuang, Lena" <lena.zhuang@intel.com>
      Reviewed-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      dd75fdc8
    • C
      drm/i915: Boost RPS frequency for CPU stalls · b29c19b6
      Chris Wilson 提交于
      If we encounter a situation where the CPU blocks waiting for results
      from the GPU, give the GPU a kick to boost its the frequency.
      
      This should work to reduce user interface stalls and to quickly promote
      mesa to high frequencies - but the cost is that our requested frequency
      stalls high (as we do not idle for long enough before rc6 to start
      reducing frequencies, nor are we aggressive at down clocking an
      underused GPU). However, this should be mitigated by rc6 itself powering
      off the GPU when idle, and that energy use is dependent upon the workload
      of the GPU in addition to its frequency (e.g. the math or sampler
      functions only consume power when used). Still, this is likely to
      adversely affect light workloads.
      
      In particular, this nearly eliminates the highly noticeable wake-up lag
      in animations from idle. For example, expose or workspace transitions.
      (However, given the situation where we fail to downclock, our requested
      frequency is almost always the maximum, except for Baytrail where we
      manually downclock upon idling. This often masks the latency of
      upclocking after being idle, so animations are typically smooth - at the
      cost of increased power consumption.)
      
      Stéphane raised the concern that this will punish good applications and
      reward bad applications - but due to the nature of how mesa performs its
      client throttling, I believe all mesa applications will be roughly
      equally affected. To address this concern, and to prevent applications
      like compositors from permanently boosting the RPS state, we ratelimit the
      frequency of the wait-boosts each client recieves.
      
      Unfortunately, this techinique is ineffective with Ironlake - which also
      has dynamic render power states and suffers just as dramatically. For
      Ironlake, the thermal/power headroom is shared with the CPU through
      Intelligent Power Sharing and the intel-ips module. This leaves us with
      no GPU boost frequencies available when coming out of idle, and due to
      hardware limitations we cannot change the arbitration between the CPU and
      GPU quickly enough to be effective.
      
      v2: Limit each client to receiving a single boost for each active period.
          Tested by QA to only marginally increase power, and to demonstrably
          increase throughput in games. No latency measurements yet.
      
      v3: Cater for front-buffer rendering with manual throttling.
      
      v4: Tidy up.
      
      v5: Sadly the compositor needs frequent boosts as it may never idle, but
      due to its picking mechanism (using ReadPixels) may require frequent
      waits. Those waits, along with the waits for the vrefresh swap, conspire
      to keep the GPU at low frequencies despite the interactive latency. To
      overcome this we ditch the one-boost-per-active-period and just ratelimit
      the number of wait-boosts each client can receive.
      Reported-and-tested-by: NPaul Neumann <paul104x@yahoo.de>
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68716Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Kenneth Graunke <kenneth@whitecape.org>
      Cc: Stéphane Marchesin <stephane.marchesin@gmail.com>
      Cc: Owen Taylor <otaylor@redhat.com>
      Cc: "Meng, Mengmeng" <mengmeng.meng@intel.com>
      Cc: "Zhuang, Lena" <lena.zhuang@intel.com>
      Reviewed-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      [danvet: No extern for function prototypes in headers.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      b29c19b6
    • B
      drm/i915: Clean up the ring scaling calculations · f6aca45c
      Ben Widawsky 提交于
      This patch attempts to clean up the ring/IA scaling programming in the
      following ways.
      1. Fix the comment about the DDR frequency. The math is 266MHz, not
      133MHz. Formula was right, docs are wrong.
      
      2. Mask the DCLK register since I don't know how it is defined on future
      platforms.
      
      3. use mult_frac instead of magic math.
      
      This helps for future platform enabling.
      
      v2: Actually use the right patch. The v1 was a mix of things, none of
      which was right. Note that due to rounding, we actually get different
      values (slightly higher) for the effective ring frequency.
      
      v3: Use 1.25 instead of 1.33 as the original code did. (Jesse)
      
      CC: Jesse Barnes <jbarnes@virtuousgeek.org>
      CC: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      Reviewed-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      f6aca45c
  7. 01 10月, 2013 5 次提交
  8. 21 9月, 2013 4 次提交
  9. 20 9月, 2013 1 次提交
  10. 17 9月, 2013 5 次提交
  11. 10 9月, 2013 4 次提交
  12. 09 9月, 2013 1 次提交
  13. 04 9月, 2013 1 次提交
  14. 03 9月, 2013 1 次提交
    • M
      drm/i915: Don't mask EI UP interrupt on IVB|SNB · a9c1f90c
      Mika Kuoppala 提交于
      Submitting a batchbuffer which simulates a gpu
      hang by doing MI_BATCH_BUFFER_START into itself,
      to test hangcheck, started to hard hang the whole box
      (IVB). Bisecting lead to this commit:
      
      commit 664b422c2966cd39b8f67e8d53a566ea8c877cd6
      Author: Vinit Azad <vinit.azad@intel.com>
      Date:   Wed Aug 14 13:34:33 2013 -0700
      
          drm/i915: Only unmask required PM interrupts
      
      Experimenting with the mask register showed that
      unmasking EI UP will prevent the hard hang in IVB and SNB.
      HSW doesn't hang with EI UP masked.
      
      Considering we are just disabling interrupts that aren't even
      delivered to driver, this change is more likely to paper over some
      weirdness in gpu's internal state machine. But until better
      explanation can be found, let's trade little bit of power
      for stability on these architectures.
      
      v2: - Unmask EI_EXPIRED directly in I915_WRITE (Vinit)
      v3: - Only unmask on SNB and IVB
      
      Cc: Vinit Azad <vinit.azad@intel.com>
      Signed-off-by: NMika Kuoppala <mika.kuoppala@intel.com>
      Acked-by: NVinit Azad <vinit.azad@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      a9c1f90c