    drm/i915: Boost RPS frequency for CPU stalls · b29c19b6
    Committed by Chris Wilson
    If we encounter a situation where the CPU blocks waiting for results
    from the GPU, give the GPU a kick to boost its frequency.
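
The idea above can be sketched as a small userspace model. This is only an illustration, not the driver code: `struct gpu_model`, its fields, and `wait_for_gpu()` are hypothetical names, and the sketch assumes the boost simply raises the requested frequency to the maximum.

```c
#include <stdbool.h>

/* Hypothetical model of a GPU and its requested frequency range (MHz). */
struct gpu_model {
    unsigned int cur_freq;  /* currently requested frequency */
    unsigned int min_freq;  /* lowest frequency we may request */
    unsigned int max_freq;  /* highest frequency we may request */
    bool work_done;         /* true once the GPU has produced the result */
};

/*
 * Called when the CPU is about to block on a GPU result.
 * Returns true if a boost was applied, false if no wait was needed.
 */
static bool wait_for_gpu(struct gpu_model *gpu)
{
    if (gpu->work_done)
        return false;               /* result already available, no stall */

    /* Assumption: the boost jumps straight to the maximum frequency. */
    gpu->cur_freq = gpu->max_freq;
    return true;
}
```

Only a wait that would actually stall the CPU triggers the boost; a wait on an already completed result leaves the frequency untouched.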
    
    This should work to reduce user interface stalls and to quickly promote
    mesa to high frequencies - but the cost is that our requested frequency
    stays high (as we do not idle for long enough before rc6 to start
    reducing frequencies, nor are we aggressive at down clocking an
    underused GPU). However, this should be mitigated by rc6 itself powering
    off the GPU when idle, and that energy use is dependent upon the workload
    of the GPU in addition to its frequency (e.g. the math or sampler
    functions only consume power when used). Still, this is likely to
    adversely affect light workloads.
    
    In particular, this nearly eliminates the highly noticeable wake-up lag
    in animations from idle, for example in expose or workspace transitions.
    (However, given the situation where we fail to downclock, our requested
    frequency is almost always the maximum, except for Baytrail where we
    manually downclock upon idling. This often masks the latency of
    upclocking after being idle, so animations are typically smooth - at the
    cost of increased power consumption.)
    
    Stéphane raised the concern that this will punish good applications and
    reward bad applications - but due to the nature of how mesa performs its
    client throttling, I believe all mesa applications will be roughly
    equally affected. To address this concern, and to prevent applications
    like compositors from permanently boosting the RPS state, we ratelimit the
    frequency of the wait-boosts each client receives.
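
A per-client ratelimit of this kind could look roughly like the following sketch. It is a standalone model under stated assumptions: `struct client`, `client_may_boost()`, and the 100 ms window `BOOST_INTERVAL_MS` are hypothetical and chosen for illustration, not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed ratelimit window; the value is illustrative only. */
#define BOOST_INTERVAL_MS 100

/* Hypothetical per-client bookkeeping for wait-boosts. */
struct client {
    bool boosted;            /* has this client ever been granted a boost? */
    uint64_t last_boost_ms;  /* timestamp of the most recent granted boost */
};

/*
 * Decide whether a waiting client may boost the GPU frequency at time
 * now_ms. At most one boost is granted per BOOST_INTERVAL_MS per client.
 */
static bool client_may_boost(struct client *c, uint64_t now_ms)
{
    if (c->boosted && now_ms - c->last_boost_ms < BOOST_INTERVAL_MS)
        return false;        /* still inside the window: deny the boost */

    c->boosted = true;
    c->last_boost_ms = now_ms;
    return true;
}
```

Because the limit is tracked per client, a compositor that waits constantly cannot pin the frequency at maximum on its own, while an occasional waiter still gets its boost immediately.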
    
    Unfortunately, this technique is ineffective with Ironlake - which also
    has dynamic render power states and suffers just as dramatically. For
    Ironlake, the thermal/power headroom is shared with the CPU through
    Intelligent Power Sharing and the intel-ips module. This leaves us with
    no GPU boost frequencies available when coming out of idle, and due to
    hardware limitations we cannot change the arbitration between the CPU and
    GPU quickly enough to be effective.
    
    v2: Limit each client to receiving a single boost for each active period.
        Tested by QA to only marginally increase power, and to demonstrably
        increase throughput in games. No latency measurements yet.
    
    v3: Cater for front-buffer rendering with manual throttling.
    
    v4: Tidy up.
    
    v5: Sadly the compositor needs frequent boosts as it may never idle, but
    due to its picking mechanism (using ReadPixels) may require frequent
    waits. Those waits, along with the waits for the vrefresh swap, conspire
    to keep the GPU at low frequencies despite the interactive latency. To
    overcome this we ditch the one-boost-per-active-period and just ratelimit
    the number of wait-boosts each client can receive.

    Reported-and-tested-by: Paul Neumann <paul104x@yahoo.de>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68716
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Kenneth Graunke <kenneth@whitecape.org>
    Cc: Stéphane Marchesin <stephane.marchesin@gmail.com>
    Cc: Owen Taylor <otaylor@redhat.com>
    Cc: "Meng, Mengmeng" <mengmeng.meng@intel.com>
    Cc: "Zhuang, Lena" <lena.zhuang@intel.com>
    Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    [danvet: No extern for function prototypes in headers.]
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
i915_gem.c 126.7 KB