1. 21 8月, 2012 1 次提交
    • D
      drm/i915: use hsw rps tuning values everywhere on gen6+ · 1ee9ae32
      Daniel Vetter 提交于
      James Bottomley reported [1] a massive power regression, due to the
      enabling of semaphores by default in 3.5. A workaround for him is to
      again disable semaphores. And indeed, his system has a very hard time
      to enter rc6 with semaphores enabled.
      
      Ben Widawsky run around with a kill-a-watt a lot and noticed:
      - There are indeed a few rare systems that seem to have a hard time
        entering rc6 when desktop-idle.
      - One machine, The Indestructible Toshiba regressed in this behaviour
        between 3.5 and 3.6 in a merge commit! So rc6 behaviour with the
        current setting seems to be highly timing dependent and not robust
        at all.
      - The behaviour James reported wrt semaphores seems to be a freak
        timing thing that only happens on his specific machine, confirming
        that enabling semaphores shouldn't reduce rc6 residency.
      
      Now furthermore the Google ChromeOS guys reported [2] a while ago that
      at least on some machines a simply a blinking cursor can keep the gpu
      turbo at the highest frequency. This is because the current rps limits
      used on snb/ivb are highly asymmetric.
      
      On the theory that gpu turbo and rc6 tuning values are related, we've
      tried whether the much saner looking (since much less asymmetric) rps
      tuning values used for hsw would also help entering rc6 more robustly.
      
      And it seems to mostly work, and we don't really have the resources to
      through-roughly tune things in any better way: The values from the
      ChromeOS ppl seem to fare a bit worse for James' machine, so I guess
      we better stick with something vpg (the gpu hw/windows group)
      provided, hoping that they've done their jobs.
      
      Reference[1]: http://lists.freedesktop.org/archives/dri-devel/2012-July/025675.html
      Reference[2]: http://lists.freedesktop.org/archives/intel-gfx/2012-July/018692.html
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53393Tested-by: NBen Widawsky <ben@bwidawsk.net>
      Cc: stable@vger.kernel.org
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      1ee9ae32
  2. 27 7月, 2012 1 次提交
    • D
      drm/i915: fix forcewake related hangs on snb · 6af2d180
      Daniel Vetter 提交于
      ... by adding seemingly redudant posting reads.
      
      This little dragon lair exploded the first time around when we've
      refactored the code a bit to use the common wait_for_atomic_us in
      "drm/i915: Group the GT routines together in both code and vtable",
      which caused QA to file fdo bug #51738.
      
      Chris Wilson entertained a few approaches to fixing #51738: Replacing
      the udelay(1) with the previously-used udelay(10) (or any other
      "sufficiently larger" delay), adding a posting read, or ditching the
      delay completely and using cpu_relax. We went with the cpu_relax and
      "915: Workaround hang with BSD and forcewake on SandyBridge". Which
      blew up in fdo bug #52424, but adding the posting read while still
      using cpu_relax seems to also fix that, it looks like the
      posting read is the important ingriedient to fix these rc6 related
      hangs on snb.
      
      Popular theories as to why this is like it is include:
      - A herd of pink elephants got royally angered somehow.
      
      - The gpu has internally different functional units and judging by the
        register offsets, the forcewake request register and the forcewake
        ack registers are _not_ in the same functional unit (or at least
        aren't reached through the same routes). Hence the posting read
        syncs up with the wrong block and gets the entire gpu confused.
      
      - ...
      
      As a minimal ducttape fix for 3.6, let's just put these posting reads
      into place again. We can try fancier approaches (like adding back the
      cpu_relax instead of the udelay) in -next.
      
      This (re-)fixes a regression introduced in
      
      commit 990bbdad
      Author: Chris Wilson <chris@chris-wilson.co.uk>
      Date:   Mon Jul 2 11:51:02 2012 -0300
      
          drm/i915: Group the GT routines together in both code and vtable
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Tested-by: NChris Wilson <chris@chris-wilson.co.uk>
      Tested-by: NDu Yan <yanx.du@intel.com>
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=52424
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51738uSigned-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      6af2d180
  3. 20 7月, 2012 2 次提交
  4. 05 7月, 2012 9 次提交
  5. 04 7月, 2012 1 次提交
  6. 26 6月, 2012 2 次提交
  7. 21 6月, 2012 1 次提交
  8. 19 6月, 2012 4 次提交
  9. 18 6月, 2012 1 次提交
    • B
      drm/i915: set IDICOS to medium uncore resources · 20848223
      Ben Widawsky 提交于
      I'm seeing about a 5% FPS improvement across various benchmarks on my
      IVB i3. Rumor has it that the higher end parts show even more benefit.
      
      This derives from a patch originally given to me by Bernard. The docs
      are  confusing about the definition names (ie. medium really seems like
      max), but it would seem it gives more cache to the GT at the expense of
      uncore. This configuration makes the split most in favor of the GT. I've
      not tried the other IDICOS values.
      
      Cc: "Kilarski, Bernard R" <bernard.r.kilarski@intel.com>
      Acked-by: NEric Anholt <eric@anholt.net>
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      20848223
  10. 14 6月, 2012 1 次提交
  11. 25 5月, 2012 1 次提交
  12. 24 5月, 2012 2 次提交
  13. 20 5月, 2012 5 次提交
  14. 03 5月, 2012 9 次提交