1. 06 Nov, 2013 2 commits
  2. 01 Nov, 2013 1 commit
  3. 30 Oct, 2013 1 commit
    • drm/i915: rename i915_init_power_well to init_power_domains_init · ddb642fb
      Imre Deak committed
      Similarly rename the other related functions in the power domain
      interface.
      
      Higher level driver code calling these functions knows only about power
      domains, not the underlying power wells, which may differ between
      platforms. Also, these functions really init/cleanup/resume power
      domains, and only through those the related power wells, so rename them
      accordingly.
      
      Note that I left i915_{request,release}_power_well as is, since that
      really changes the state only of a single power well (and is HSW
      specific). It should also get a better name once we make it more
      generic by controlling things through a new audio power domain.
      
      v4:
      - use intel prefix instead of i915 everywhere (Paulo)
      - use a $prefix_$block_$action format (Daniel)
      Signed-off-by: Imre Deak <imre.deak@intel.com>
      Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
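      As an illustration of the $prefix_$block_$action format, a sketch of how
      the renamed power-domain entry points could look; the exact
      intel_power_domains_* names and signatures below are assumptions based
      on the commit message, not quoted from the patch:

          struct drm_device;

          /* Power-domain level interface called by higher level driver code
           * (was i915_init_power_well() and friends). */
          int  intel_power_domains_init(struct drm_device *dev);
          void intel_power_domains_remove(struct drm_device *dev);
          void intel_power_domains_init_hw(struct drm_device *dev);

          /* Kept under their old names: these toggle a single (HSW) power
           * well and should eventually move behind an audio power domain. */
          void i915_request_power_well(void);
          void i915_release_power_well(void);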
  4. 28 Oct, 2013 3 commits
    • drm/i915: remove device field from struct power_well · b4ed4484
      Imre Deak committed
      The only real need for this field was in
      i915_{request,release}_power_well, but there we can get at it with a bit
      of container_of magic. Also, since in the future we'll have multiple
      power wells, each with its own power_well struct, it makes sense to
      remove the field, where it would otherwise just be redundant.
      Suggested-by: Paulo Zanoni <paulo.zanoni@intel.com>
      Signed-off-by: Imre Deak <imre.deak@intel.com>
      Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
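      A stand-alone sketch (plain user-space C with a simplified, assumed
      struct layout) of the container_of trick this commit relies on: given a
      pointer to the embedded power_well, the enclosing device structure can
      be recovered, so the well no longer needs its own device back-pointer:

          #include <stddef.h>
          #include <stdio.h>

          #define container_of(ptr, type, member) \
                  ((type *)((char *)(ptr) - offsetof(type, member)))

          struct i915_power_well {
                  int count;              /* refcount only - no dev field */
          };

          struct drm_i915_private {
                  const char *name;       /* stand-in for the real device state */
                  struct i915_power_well power_well;
          };

          int main(void)
          {
                  struct drm_i915_private dev_priv = {
                          .name = "i915", .power_well = { .count = 1 },
                  };
                  struct i915_power_well *well = &dev_priv.power_well;

                  /* Recover dev_priv from the embedded power_well pointer. */
                  struct drm_i915_private *owner =
                          container_of(well, struct drm_i915_private, power_well);

                  printf("owner=%s count=%d\n", owner->name, well->count);
                  return 0;
          }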
    • drm/i915: use power get/put instead of set for power on after init · baa70707
      Imre Deak committed
      Currently we make sure that all power domains are enabled during driver
      init and turn off unneeded ones only after the first modeset. Similarly,
      during suspend we enable all power domains, which will remain on through
      the following resume until the first modeset.
      
      This logic is supported by intel_set_power_well() in the power domain
      framework. It would be nice to simplify the API so that we only have
      get/put functions and make it more explicit at the higher level how this
      "power well on during init" logic works. This will also make it easier
      if in the future we want to shorten the time the power wells are on.

      For this, add a new device private flag tracking whether we have the
      power wells on because of init/suspend, and use only
      intel_display_power_get()/put(). As nothing else uses
      intel_set_power_well(), we can remove it.
      
      This also fixes
      
      commit 6efdf354
      Author: Imre Deak <imre.deak@intel.com>
      Date:   Wed Oct 16 17:25:52 2013 +0300
      
          drm/i915: enable only the needed power domains during modeset
      
      where removing intel_set_power_well() resulted in not releasing the
      reference on the power well that was taken during init and thus leaving
      the power well on all the time. Regression reported by Paulo.
      
      v2:
      - move the init_power_on flag to the power_domains struct (Daniel)
      
      v3:
      - add note about this being a regression fix too (Paulo)
      Signed-off-by: Imre Deak <imre.deak@intel.com>
      Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
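      A self-contained model of the flow described above: driver init (and
      suspend) takes a plain get() reference and records it in an
      init_power_on flag, and the first modeset drops that reference again
      with put(). Every name and the refcounting below are illustrative, not
      the driver's actual API:

          #include <stdbool.h>
          #include <stdio.h>

          struct power_domains {
                  int  count;             /* users that currently need the well */
                  bool init_power_on;     /* reference held for init/suspend    */
          };

          static void power_get(struct power_domains *pd)
          {
                  if (pd->count++ == 0)
                          puts("power well ON");
          }

          static void power_put(struct power_domains *pd)
          {
                  if (--pd->count == 0)
                          puts("power well OFF");
          }

          /* Called from driver init and from suspend. */
          static void init_power_enable(struct power_domains *pd)
          {
                  if (!pd->init_power_on) {
                          power_get(pd);
                          pd->init_power_on = true;
                  }
          }

          /* The first modeset takes its own reference, then releases the
           * init-time one - the reference that was previously left unreleased
           * (the regression fixed here). */
          static void first_modeset(struct power_domains *pd)
          {
                  power_get(pd);
                  if (pd->init_power_on) {
                          pd->init_power_on = false;
                          power_put(pd);
                  }
          }

          int main(void)
          {
                  struct power_domains pd = { 0, false };

                  init_power_enable(&pd); /* load: well turns on             */
                  first_modeset(&pd);     /* ownership passes to the modeset */
                  power_put(&pd);         /* last user gone: well turns off  */
                  return 0;
          }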
    • drm/i915: prepare for multiple power wells · 83c00f55
      Imre Deak committed
      In the future we'll need to support multiple power wells, so prepare for
      that here. Create a new power domains struct which contains all
      power domain/well specific fields. Since we'll have one lock protecting
      all power wells, move power_well->lock to the new struct too.
      
      No functional change.
      Signed-off-by: Imre Deak <imre.deak@intel.com>
      Reviewed-by: Paulo Zanoni <paulo.zanoni@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
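      A rough sketch of the kind of container struct this introduces: all
      power domain/well state grouped in one place, with a single lock shared
      by every power well. The field names below are assumptions based on the
      commit message:

          #include <stdbool.h>

          struct mutex { int placeholder; };  /* stand-in for the kernel mutex */

          struct i915_power_well {
                  const char *name;
                  int count;                  /* per-well reference count */
          };

          struct i915_power_domains {
                  /* Moved here from power_well->lock: one lock now protects
                   * all power wells. */
                  struct mutex lock;
                  bool init_power_on;
                  int power_well_count;
                  struct i915_power_well *power_wells; /* room for several wells */
          };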
  5. 27 Oct, 2013 3 commits
  6. 22 Oct, 2013 2 commits
  7. 21 Oct, 2013 1 commit
  8. 16 Oct, 2013 7 commits
  9. 15 Oct, 2013 9 commits
  10. 11 Oct, 2013 1 commit
  11. 10 Oct, 2013 3 commits
  12. 09 Oct, 2013 1 commit
  13. 04 Oct, 2013 3 commits
    • drm/i915: Tweak RPS thresholds to more aggressively downclock · dd75fdc8
      Chris Wilson committed
      After applying wait-boost we often find ourselves stuck at higher clocks
      than required. The current threshold value requires the GPU to be
      continuously and completely idle for 313ms before it is dropped by one
      bin. Conversely, we require the GPU to be busy for an average of 90% over
      an 84ms period before we upclock. So the current thresholds almost never
      downclock the GPU, and respond very slowly to sudden demands for more
      power. It is easy to observe that we currently lock into the wrong bin
      and both underperform in benchmarks and consume more power than optimal
      (just by repeating the task and measuring the different results).
      
      An alternative approach, as discussed in the bspec, is to use a
      continuous threshold for upclocking, and an average value for downclocking.
      This is good for quickly detecting and reacting to state changes within a
      frame, however it fails with the common throttling method of waiting
      upon the outstanding frame - at least it is difficult to choose a
      threshold that works well at 15,000fps and at 60fps. So continue to use
      average busy/idle loads to determine frequency change.
      
      v2: Use 3 power zones to keep frequencies low in steady-state mostly
      idle (e.g. scrolling, interactive 2D drawing), and frequencies high
      for demanding games. In between those end-states, we use a
      fast-reclocking algorithm to converge more quickly on the desired bin.
      
      v3: Bug fixes - make sure we reset adj after switching power zones.
      
      v4: Tune - drop the continuous busy thresholds, as they prevent us from
      choosing the right frequency for glxgears-style swap benchmarks. Instead
      the goal is to be able to find the right clocks irrespective of the
      wait-boost.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Kenneth Graunke <kenneth@whitecape.org>
      Cc: Stéphane Marchesin <stephane.marchesin@gmail.com>
      Cc: Owen Taylor <otaylor@redhat.com>
      Cc: "Meng, Mengmeng" <mengmeng.meng@intel.com>
      Cc: "Zhuang, Lena" <lena.zhuang@intel.com>
      Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
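      An illustrative model of the averaged up/down decision described above:
      the GPU's average busy percentage over the last evaluation interval is
      compared against per-zone thresholds, with three power zones keeping
      clocks low when mostly idle and high for demanding games. The threshold
      numbers are placeholders, not the values programmed by this patch:

          #include <stdio.h>

          enum rps_zone { ZONE_LOW, ZONE_MID, ZONE_HIGH };

          struct rps_thresholds { int up_pct; int down_pct; };

          static struct rps_thresholds thresholds_for(enum rps_zone zone)
          {
                  switch (zone) {
                  case ZONE_LOW:
                          return (struct rps_thresholds){ .up_pct = 90, .down_pct = 70 };
                  case ZONE_MID:
                          return (struct rps_thresholds){ .up_pct = 85, .down_pct = 60 };
                  default:
                          return (struct rps_thresholds){ .up_pct = 80, .down_pct = 50 };
                  }
          }

          /* Returns +1 to go one bin up, -1 to go one bin down, 0 to stay. */
          static int rps_adjust(enum rps_zone zone, int avg_busy_pct)
          {
                  struct rps_thresholds t = thresholds_for(zone);

                  if (avg_busy_pct > t.up_pct)
                          return +1;
                  if (avg_busy_pct < t.down_pct)
                          return -1;
                  return 0;
          }

          int main(void)
          {
                  /* 20% average busy in the low zone: drop one bin. */
                  printf("adjust = %+d\n", rps_adjust(ZONE_LOW, 20));
                  return 0;
          }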
    • drm/i915: Boost RPS frequency for CPU stalls · b29c19b6
      Chris Wilson committed
      If we encounter a situation where the CPU blocks waiting for results
      from the GPU, give the GPU a kick to boost its frequency.
      
      This should work to reduce user interface stalls and to quickly promote
      mesa to high frequencies - but the cost is that our requested frequency
      stalls high (as we do not idle for long enough before rc6 to start
      reducing frequencies, nor are we aggressive at downclocking an
      underused GPU). However, this should be mitigated by rc6 itself powering
      off the GPU when idle, and by the fact that energy use depends upon the
      workload of the GPU in addition to its frequency (e.g. the math or
      sampler functions only consume power when used). Still, this is likely
      to adversely affect light workloads.
      
      In particular, this nearly eliminates the highly noticeable wake-up lag
      in animations from idle - for example, expose or workspace transitions.
      (However, given the situation where we fail to downclock, our requested
      frequency is almost always the maximum, except for Baytrail where we
      manually downclock upon idling. This often masks the latency of
      upclocking after being idle, so animations are typically smooth - at the
      cost of increased power consumption.)
      
      Stéphane raised the concern that this will punish good applications and
      reward bad applications - but due to the nature of how mesa performs its
      client throttling, I believe all mesa applications will be roughly
      equally affected. To address this concern, and to prevent applications
      like compositors from permanently boosting the RPS state, we ratelimit
      the frequency of the wait-boosts each client receives.
      
      Unfortunately, this technique is ineffective with Ironlake - which also
      has dynamic render power states and suffers just as dramatically. For
      Ironlake, the thermal/power headroom is shared with the CPU through
      Intelligent Power Sharing and the intel-ips module. This leaves us with
      no GPU boost frequencies available when coming out of idle, and due to
      hardware limitations we cannot change the arbitration between the CPU and
      GPU quickly enough to be effective.
      
      v2: Limit each client to receiving a single boost for each active period.
          Tested by QA to only marginally increase power, and to demonstrably
          increase throughput in games. No latency measurements yet.
      
      v3: Cater for front-buffer rendering with manual throttling.
      
      v4: Tidy up.
      
      v5: Sadly the compositor needs frequent boosts as it may never idle, but
      due to its picking mechanism (using ReadPixels) may require frequent
      waits. Those waits, along with the waits for the vrefresh swap, conspire
      to keep the GPU at low frequencies despite the interactive latency. To
      overcome this we ditch the one-boost-per-active-period and just ratelimit
      the number of wait-boosts each client can receive.
      Reported-and-tested-by: Paul Neumann <paul104x@yahoo.de>
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68716
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Kenneth Graunke <kenneth@whitecape.org>
      Cc: Stéphane Marchesin <stephane.marchesin@gmail.com>
      Cc: Owen Taylor <otaylor@redhat.com>
      Cc: "Meng, Mengmeng" <mengmeng.meng@intel.com>
      Cc: "Zhuang, Lena" <lena.zhuang@intel.com>
      Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      [danvet: No extern for function prototypes in headers.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
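      A minimal model of the ratelimited wait-boost: when a client is about to
      block on the GPU it asks for a jump to the maximum frequency, but only
      if it has not already been boosted within a (placeholder) time window,
      so a compositor cannot keep the GPU pinned at its top bin. All names and
      the window length are illustrative:

          #include <stdbool.h>
          #include <stdio.h>
          #include <time.h>

          #define BOOST_WINDOW_SECONDS 1          /* placeholder ratelimit interval */

          struct rps_client {
                  time_t last_boost;              /* when this client was last boosted */
          };

          static bool client_may_boost(struct rps_client *client, time_t now)
          {
                  if (now - client->last_boost < BOOST_WINDOW_SECONDS)
                          return false;           /* boosted too recently */
                  client->last_boost = now;
                  return true;
          }

          /* Called on the wait path, when the CPU is about to block on the GPU. */
          static void wait_boost(struct rps_client *client, int *cur_freq, int max_freq)
          {
                  if (client_may_boost(client, time(NULL)))
                          *cur_freq = max_freq;   /* kick the GPU straight to max */
          }

          int main(void)
          {
                  struct rps_client client = { .last_boost = 0 };
                  int freq = 200, max = 1100;     /* MHz, made-up numbers */

                  wait_boost(&client, &freq, max);  /* first wait: boosted        */
                  wait_boost(&client, &freq, max);  /* second wait: ratelimited   */
                  printf("requested frequency: %d MHz\n", freq);
                  return 0;
          }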
    • drm/i915: Clean up the ring scaling calculations · f6aca45c
      Ben Widawsky committed
      This patch attempts to clean up the ring/IA scaling programming in the
      following ways:
      1. Fix the comment about the DDR frequency. The math is 266MHz, not
      133MHz. The formula was right; the docs are wrong.

      2. Mask the DCLK register, since I don't know how it is defined on
      future platforms.

      3. Use mult_frac() instead of magic math.

      This helps with future platform enabling.
      
      v2: Actually use the right patch. The v1 was a mix of things, none of
      which was right. Note that due to rounding, we actually get different
      values (slightly higher) for the effective ring frequency.
      
      v3: Use 1.25 instead of 1.33 as the original code did. (Jesse)
      
      CC: Jesse Barnes <jbarnes@virtuousgeek.org>
      CC: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
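      A small, compilable illustration of point 3 above: scaling the GPU
      frequency by 1.25 for the ring via mult_frac() rather than magic
      multiply/divide math. mult_frac() is reproduced here in the same shape
      as the kernel macro so the snippet stands alone; the frequencies are
      made-up numbers:

          #include <stdio.h>

          /* Multiply x by numer/denom without overflowing the intermediate
           * product (same shape as the kernel's mult_frac()). */
          #define mult_frac(x, numer, denom) ({                           \
                  typeof(x) quot = (x) / (denom);                         \
                  typeof(x) rem  = (x) % (denom);                         \
                  (quot * (numer)) + ((rem * (numer)) / (denom));         \
          })

          int main(void)
          {
                  unsigned int gpu_freq = 1100;   /* MHz, illustrative */

                  /* ring frequency = 1.25 * GPU frequency, i.e. * 5 / 4 */
                  unsigned int ring_freq = mult_frac(gpu_freq, 5, 4);

                  printf("gpu %u MHz -> ring %u MHz\n", gpu_freq, ring_freq);
                  return 0;
          }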
  14. 03 Oct, 2013 1 commit
  15. 02 Oct, 2013 1 commit
    • drm/i915: fix rps.vlv_work initialization · 671952a2
      Imre Deak committed
      During driver loading we are initializing rps.vlv_work in
      valleyview_enable_rps() via the rps.delayed_resume_work delayed work.
      This is too late since we are using vlv_work already via
      i915_driver_load()->intel_uncore_sanitize()->
      intel_disable_gt_powersave(). This at least leads to the following
      kernel warning:
      
       INFO: trying to register non-static key.
       the code is fine but needs lockdep annotation.
       turning off the locking correctness validator.
      
      Fix this by initializing vlv_work before we call intel_uncore_sanitize().
      
      The regression was introduced in
      
      commit 7dcd2677
      Author: Konstantin Khlebnikov <khlebnikov@openvz.org>
      Date:   Wed Jul 17 10:22:58 2013 +0400
      
          drm/i915: fix long-standing SNB regression in power consumption
          after resume
      
      though there was no good reason to initialize the static vlv_work from
      another delayed work to begin with (especially since this will happen
      multiple times).
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69397
      Tested-by: shui yangwei <yangweix.shui@intel.com>
      Signed-off-by: Imre Deak <imre.deak@intel.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
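      A kernel-style sketch of the ordering fix: initialize the static
      vlv_work once during driver load, before intel_uncore_sanitize() can
      reach it via intel_disable_gt_powersave(), instead of (re)initializing
      it from the delayed_resume_work path. The helper name, the handler name
      and the struct layout below are assumptions:

          #include <linux/workqueue.h>

          struct i915_rps_sketch {
                  struct delayed_work delayed_resume_work;
                  struct delayed_work vlv_work;
          };

          static void vlv_rps_work_handler(struct work_struct *work)
          {
                  /* ...drop the GPU frequency once the render ring has idled... */
          }

          static void i915_rps_setup_early(struct i915_rps_sketch *rps)
          {
                  /* Initialize exactly once at load time, before
                   * intel_uncore_sanitize() -> intel_disable_gt_powersave()
                   * touches vlv_work. Previously this ran later (and
                   * repeatedly) from valleyview_enable_rps(), triggering
                   * lockdep's "register non-static key" warning. */
                  INIT_DELAYED_WORK(&rps->vlv_work, vlv_rps_work_handler);
          }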
  16. 01 Oct, 2013 1 commit