提交 · 8d8b00318593e2852aef19b92f99851d50a203fc · openeuler / Kernel

12 9月, 2019 1 次提交

drm/i915/pmu: Skip busyness sampling when and where not needed · 54fc577d

由 Tvrtko Ursulin 提交于 9月 11, 2019

Since d0aa694b ("drm/i915/pmu: Always sample an active ringbuffer")
the cost of sampling the engine state on execlists platforms became a
little bit higher when both engine busyness and one of the wait states are
being monitored. (Previously the busyness sampling on legacy platforms was
done via seqno comparison so there was no cost of mmio read.)

We can avoid that by skipping busyness sampling when engine supports
software busy stats and so avoid the cost of potential mmio read and
sample accumulation.
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190911160730.22687-1-tvrtko.ursulin@linux.intel.com

54fc577d

15 8月, 2019 1 次提交

drm/i915: Convert a few more bland dmesg info to be device specific · 88f8065c

由 Chris Wilson 提交于 8月 15, 2019

Looking around the GT initialisation, we have a few log messages we
think are interesting enough present to the user (such as the amount of L4
cache) and a few to inform them of the result of actions or conflicting
HW restrictions (i.e. quirks). These are device specific messages, so
use the dev family of printk.

v2: shave off a few bytes of .rodata!
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: NMichal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190815093604.3618-1-chris@chris-wilson.co.uk

88f8065c

06 8月, 2019 1 次提交

drm/i915/gt: Move the [class][inst] lookup for engines onto the GT · 750e76b4

由 Chris Wilson 提交于 8月 06, 2019

To maintain a fast lookup from a GT centric irq handler, we want the
engine lookup tables on the intel_gt. To avoid having multiple copies of
the same multi-dimension lookup table, move the generic user engine
lookup into an rbtree (for fast and flexible indexing).

v2: Split uabi_instance cf uabi_class
v3: Set uabi_class/uabi_instance after collating all engines to provide a
stable uabi across parallel unordered construction.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> #v2
Link: https://patchwork.freedesktop.org/patch/msgid/20190806124300.24945-2-chris@chris-wilson.co.uk

750e76b4

02 8月, 2019 5 次提交

drm/i915/pmu: Atomically acquire the gt_pm wakeref · 51fbd8de

由 Chris Wilson 提交于 8月 02, 2019

Currently, we only sample if the intel_gt is awake, but we acquire our
own runtime_pm wakeref. Since intel_gt has transitioned to tracking its
own wakeref, we can atomically test and acquire that wakeref instead.

v2: Take engine->wakeref for engine sampling
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190801233616.23007-1-chris@chris-wilson.co.uk

51fbd8de

drm/i915/pmu: Make get_rc6 take intel_gt · 518ea582

由 Tvrtko Ursulin 提交于 8月 01, 2019

RC6 is a GT state so make the function parameter reflect that.
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190801162330.2729-4-tvrtko.ursulin@linux.intel.com

518ea582

drm/i915/pmu: Convert sampling to gt · 08ce5c64

由 Tvrtko Ursulin 提交于 8月 01, 2019

Engines and frequencies are a GT thing so adjust sampling routines to
match.
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190801162330.2729-3-tvrtko.ursulin@linux.intel.com

08ce5c64

drm/i915/pmu: Convert engine sampling to uncore mmio · 28fba096

由 Tvrtko Ursulin 提交于 8月 01, 2019

Drops one macro using implicit dev_priv.

v2:
 * Use ENGINE_READ_FW. (Chris)
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190801162330.2729-2-tvrtko.ursulin@linux.intel.com

28fba096

drm/i915/pmu: Make more struct i915_pmu centric · 908091c8

由 Tvrtko Ursulin 提交于 8月 01, 2019

Just tidy the code a bit by removing a sea of overly verbose i915->pmu.*.
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190801162330.2729-1-tvrtko.ursulin@linux.intel.com

908091c8

04 7月, 2019 1 次提交

drm/i915: Show support for accurate sw PMU busyness tracking · bf73fc0f

由 Chris Wilson 提交于 7月 03, 2019

Expose whether or not we support the PMU software tracking in our
scheduler capabilities, so userspace can query at runtime.

v2: Use I915_SCHEDULER_CAP_ENGINE_BUSY_STATS for a less ambiguous
capability name.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190703143702.11339-1-chris@chris-wilson.co.uk

bf73fc0f

14 6月, 2019 2 次提交

drm/i915: update with_intel_runtime_pm to use the rpm structure · c447ff7d

由 Daniele Ceraolo Spurio 提交于 6月 13, 2019

Matching the underlying get/put functions.
Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Acked-by: NImre Deak <imre.deak@intel.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190613232156.34940-8-daniele.ceraolospurio@intel.com

c447ff7d

drm/i915: update rpm_get/put to use the rpm structure · d858d569

由 Daniele Ceraolo Spurio 提交于 6月 13, 2019

The functions where internally already only using the structure, so we
need to just flip the interface.

v2: rebase
Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Acked-by: NImre Deak <imre.deak@intel.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190613232156.34940-7-daniele.ceraolospurio@intel.com

d858d569

12 6月, 2019 1 次提交

drm/i915: Remove I915_READ_NOTRACE · 5a31d30b

由 Tvrtko Ursulin 提交于 6月 11, 2019

Only a few call sites remain which have been converted to uncore mmio
accessors and so the macro can be removed.
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NJani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190611104548.30545-5-tvrtko.ursulin@linux.intel.com

5a31d30b

30 4月, 2019 1 次提交

drm/i915: move some leftovers to intel_pm.h from i915_drv.h · ecbb5fb7

由 Jani Nikula 提交于 4月 29, 2019

Commit 696173b0 ("drm/i915: extract intel_pm.h from intel_drv.h")
missed the declarations in i915_drv.h.
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NJani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/770f5f1c2dd99e4d6a314b70184e71b928a6d362.1556540890.git.jani.nikula@intel.com

ecbb5fb7

25 4月, 2019 1 次提交

drm/i915: Move GraphicsTechnology files under gt/ · 112ed2d3

由 Chris Wilson 提交于 4月 24, 2019

Start partitioning off the code that talks to the hardware (GT) from the
uapi layers and move the device facing code under gt/

One casualty is s/intel_ringbuffer.h/intel_engine.h/ with the plan to
subdivide that header and body further (and split out the submission
code from the ringbuffer and logical context handling). This patch aims
to be simple motion so git can fixup inflight patches with little mess.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Acked-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: NJani Nikula <jani.nikula@intel.com>
Acked-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190424174839.7141-1-chris@chris-wilson.co.uk

112ed2d3

06 3月, 2019 1 次提交

drm/i915: Store the BIT(engine->id) as the engine's mask · 8a68d464

由 Chris Wilson 提交于 3月 05, 2019

In the next patch, we are introducing a broad virtual engine to encompass
multiple physical engines, losing the 1:1 nature of BIT(engine->id). To
reflect the broader set of engines implied by the virtual instance, lets
store the full bitmask.

v2: Use intel_engine_mask_t (s/ring_mask/engine_mask/)
v3: Tvrtko voted for moah churn so teach everyone to not mention ring
and use $class$instance throughout.
v4: Comment upon the disparity in bspec for using VCS1,VCS2 in gen8 and
VCS[0-4] in later gen. We opt to keep the code consistent and use
0-index naming throughout.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190305180332.30900-1-chris@chris-wilson.co.uk

8a68d464

23 2月, 2019 1 次提交

drm/i915/pmu: Always sample an active ringbuffer · d0aa694b

由 Chris Wilson 提交于 2月 23, 2019

As we no longer have a precise indication of requests queued to an
engine, make no presumptions and just sample the ring registers to see
if the engine is busy.

v2: Report busy while the ring is idling on a semaphore/event.
v3: Give the struct a name!
v4: Always 0 outside the powerwell; trusting the powerwell is
accurate enough for our sampling pmu.
v5: Protect against gen7 mmio madness and try to improve grammar
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190223000102.14290-1-chris@chris-wilson.co.uk

d0aa694b

12 2月, 2019 1 次提交

drm/i915/pmu: Fix enable count array size and bounds checking · d8b879bb

由 Tvrtko Ursulin 提交于 2月 05, 2019

Enable count array is supposed to have one counter for each possible
engine sampler. As such, array sizing and bounds checking is not correct
and would blow up the asserts if more samplers were added.

No ill-effect in the current code base but lets fix it for correctness.

At the same time tidy the assert for readability and robustness.

v2:
 * One check per assert. (Chris Wilson)
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: b46a33e2 ("drm/i915/pmu: Expose a PMU interface for perf queries")
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205130353.21105-1-tvrtko.ursulin@linux.intel.com
(cherry picked from commit 26a11dee)
Signed-off-by: NJani Nikula <jani.nikula@intel.com>

d8b879bb

06 2月, 2019 1 次提交

drm/i915/pmu: Fix enable count array size and bounds checking · 26a11dee

由 Tvrtko Ursulin 提交于 2月 05, 2019

Enable count array is supposed to have one counter for each possible
engine sampler. As such, array sizing and bounds checking is not correct
and would blow up the asserts if more samplers were added.

No ill-effect in the current code base but lets fix it for correctness.

At the same time tidy the assert for readability and robustness.

v2:
 * One check per assert. (Chris Wilson)
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: b46a33e2 ("drm/i915/pmu: Expose a PMU interface for perf queries")
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205130353.21105-1-tvrtko.ursulin@linux.intel.com

26a11dee

16 1月, 2019 1 次提交

drm/i915: Move on the new pm runtime interface · 3b4ed2e2

由 Vincent Guittot 提交于 12月 21, 2018

Use the new PM-runtime interface to get the accounted suspended time:
pm_runtime_suspended_time().

This new interface helps to simplify and cleanup the code that computes
__I915_SAMPLE_RC6_ESTIMATED and to remove direct access to internals of
PM-runtime.
Signed-off-by: NVincent Guittot <vincent.guittot@linaro.org>
Reviewed-by: NUlf Hansson <ulf.hansson@linaro.org>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

3b4ed2e2

15 1月, 2019 3 次提交

drm/i915: Syntatic sugar for using intel_runtime_pm · d4225a53

由 Chris Wilson 提交于 1月 14, 2019

Frequently, we use intel_runtime_pm_get/_put around a small block.
Formalise that usage by providing a macro to define such a block with an
automatic closure to scope the intel_runtime_pm wakeref to that block,
i.e. macro abuse smelling of python.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-15-chris@chris-wilson.co.uk

d4225a53

drm/i915/pmu: Track rpm wakeref · 00e27cbe

由 Chris Wilson 提交于 1月 14, 2019

Track the wakeref used for temporary access to the device, and discard
it upon release so that leaks can be identified.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-8-chris@chris-wilson.co.uk

00e27cbe

drm/i915: Markup paired operations on wakerefs · 16e4dd03

由 Chris Wilson 提交于 1月 14, 2019

The majority of runtime-pm operations are bounded and scoped within a
function; these are easy to verify that the wakeref are handled
correctly. We can employ the compiler to help us, and reduce the number
of wakerefs tracked when debugging, by passing around cookies provided
by the various rpm_get functions to their rpm_put counterpart. This
makes the pairing explicit, and given the required wakeref cookie the
compiler can verify that we pass an initialised value to the rpm_put
(quite handy for double checking error paths).

For regular builds, the compiler should be able to eliminate the unused
local variables and the program growth should be minimal. Fwiw, it came
out as a net improvement as gcc was able to refactor rpm_get and
rpm_get_if_in_use together,

v2: Just s/rpm_put/rpm_put_unchecked/ everywhere, leaving the manual
mark up for smaller more targeted patches.
v3: Mention the cookie in Returns
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-2-chris@chris-wilson.co.uk

16e4dd03

05 8月, 2018 1 次提交

x86: Don't include linux/irq.h from asm/hardirq.h · 447ae316

由 Nicolai Stange 提交于 7月 29, 2018

The next patch in this series will have to make the definition of
irq_cpustat_t available to entering_irq().

Inclusion of asm/hardirq.h into asm/apic.h would cause circular header
dependencies like

  asm/smp.h
    asm/apic.h
      asm/hardirq.h
        linux/irq.h
          linux/topology.h
            linux/smp.h
              asm/smp.h

or

  linux/gfp.h
    linux/mmzone.h
      asm/mmzone.h
        asm/mmzone_64.h
          asm/smp.h
            asm/apic.h
              asm/hardirq.h
                linux/irq.h
                  linux/irqdesc.h
                    linux/kobject.h
                      linux/sysfs.h
                        linux/kernfs.h
                          linux/idr.h
                            linux/gfp.h

and others.

This causes compilation errors because of the header guards becoming
effective in the second inclusion: symbols/macros that had been defined
before wouldn't be available to intermediate headers in the #include chain
anymore.

A possible workaround would be to move the definition of irq_cpustat_t
into its own header and include that from both, asm/hardirq.h and
asm/apic.h.

However, this wouldn't solve the real problem, namely asm/harirq.h
unnecessarily pulling in all the linux/irq.h cruft: nothing in
asm/hardirq.h itself requires it. Also, note that there are some other
archs, like e.g. arm64, which don't have that #include in their
asm/hardirq.h.

Remove the linux/irq.h #include from x86' asm/hardirq.h.

Fix resulting compilation errors by adding appropriate #includes to *.c
files as needed.

Note that some of these *.c files could be cleaned up a bit wrt. to their
set of #includes, but that should better be done from separate patches, if
at all.
Signed-off-by: NNicolai Stange <nstange@suse.de>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

447ae316

05 6月, 2018 1 次提交

drm/i915/pmu: Do not assume fixed hrtimer period · 9f473ecf

由 Tvrtko Ursulin 提交于 6月 05, 2018

As Chris has discovered on his Ivybridge, and later automated test runs
have confirmed, on most of our platforms hrtimer faced with heavy GPU load
can occasionally become sufficiently imprecise to affect PMU sampling
calculations.

This means we cannot assume sampling frequency is what we asked for, but
we need to measure the interval ourselves.

This patch is similar to Chris' original proposal for per-engine counters,
but instead of introducing a new set to work around the problem with
frequency sampling, it swaps around the way internal frequency accounting
is done. Instead of accumulating current frequency and dividing by
sampling frequency on readout, it accumulates frequency scaled by each
period.

v2:
 * Typo in commit message, comment on period calculation and USEC_PER_SEC.
   (Chris Wilson)

Testcase: igt/perf_pmu/*busy* # snb, ivb, hsw
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180605140253.3541-1-tvrtko.ursulin@linux.intel.com

9f473ecf

18 4月, 2018 1 次提交

drm/i915/pmu: Inspect runtime PM state more carefully while estimating RC6 · e6be6bd8

由 Tvrtko Ursulin 提交于 4月 10, 2018

While thinking about sporadic failures of perf_pmu/rc6-runtime-pm* tests
on some CI machines I have concluded that: a) the PMU readout of RC6 can
race against runtime PM transitions, and b) there are other reasons than
being runtime suspended which can cause intel_runtime_pm_get_if_in_use to
fail.

Therefore when estimating RC6 the code needs to assert we are indeed in
suspended state, and if not, the best we can do is return the last known
RC6 value.

Without this check we can calculate the estimated value based on un-
initialized or inappropriate internal state, which can result in over-
estimation, or in any case incorrect value being returned.

v2:
 * Re-arrange the code a bit to avoid second unlock and return branch.
   (Chris Wilson)

v3:
 * Insert some strategic blank lines and improve commit msg.
   (Chris Wilson)
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 1fe699e3 ("drm/i915/pmu: Fix sleep under atomic in RC6 readout")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105010
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180410112704.24462-1-tvrtko.ursulin@linux.intel.com
(cherry picked from commit 2924bdee)
Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>

e6be6bd8

11 4月, 2018 1 次提交

drm/i915/pmu: Inspect runtime PM state more carefully while estimating RC6 · 2924bdee

由 Tvrtko Ursulin 提交于 4月 10, 2018

While thinking about sporadic failures of perf_pmu/rc6-runtime-pm* tests
on some CI machines I have concluded that: a) the PMU readout of RC6 can
race against runtime PM transitions, and b) there are other reasons than
being runtime suspended which can cause intel_runtime_pm_get_if_in_use to
fail.

Therefore when estimating RC6 the code needs to assert we are indeed in
suspended state, and if not, the best we can do is return the last known
RC6 value.

Without this check we can calculate the estimated value based on un-
initialized or inappropriate internal state, which can result in over-
estimation, or in any case incorrect value being returned.

v2:
 * Re-arrange the code a bit to avoid second unlock and return branch.
   (Chris Wilson)

v3:
 * Insert some strategic blank lines and improve commit msg.
   (Chris Wilson)
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 1fe699e3 ("drm/i915/pmu: Fix sleep under atomic in RC6 readout")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105010
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180410112704.24462-1-tvrtko.ursulin@linux.intel.com

2924bdee

16 3月, 2018 1 次提交

drm/i915/pmu: Work around compiler warnings on some kernel configs · 22de4e7a

由 Tvrtko Ursulin 提交于 3月 14, 2018

Arnd Bergman reports:
"""
The conditional spinlock confuses gcc into thinking the 'flags' value
might contain uninitialized data:

drivers/gpu/drm/i915/i915_pmu.c: In function '__i915_pmu_event_read':
arch/x86/include/asm/paravirt_types.h:573:3: error: 'flags' may be used uninitialized in this function [-Werror=maybe-uninitialized]

The code is correct, but it's easy to see how the compiler gets confused
here. This avoids the problem by pulling the lock outside of the function
into its only caller.
"""

On deeper look it seems this is caused by paravirt spinlocks
implementation when CONFIG_PARAVIRT_DEBUG is set, which by being
complicated, manages to convince gcc locked parameter can be changed
externally (impossible).

Work around it by removing the conditional locking parameters altogether.
(It was never the most elegant code anyway.)

Slight penalty we now pay is an additional irqsave spin lock/unlock cycle
on the event enable path. But since enable is not a fast path, that is
preferrable to the alternative solution which was doing MMIO under irqsave
spinlock.
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reported-by: NArnd Bergmann <arnd@arndb.de>
Fixes: 1fe699e3 ("drm/i915/pmu: Fix sleep under atomic in RC6 readout")
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: David Airlie <airlied@linux.ie>
Cc: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180314080535.17490-1-tvrtko.ursulin@linux.intel.com
(cherry picked from commit ad055fb8)
Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>

22de4e7a

15 3月, 2018 1 次提交

drm/i915/pmu: Work around compiler warnings on some kernel configs · ad055fb8

由 Tvrtko Ursulin 提交于 3月 14, 2018

Arnd Bergman reports:
"""
The conditional spinlock confuses gcc into thinking the 'flags' value
might contain uninitialized data:

drivers/gpu/drm/i915/i915_pmu.c: In function '__i915_pmu_event_read':
arch/x86/include/asm/paravirt_types.h:573:3: error: 'flags' may be used uninitialized in this function [-Werror=maybe-uninitialized]

The code is correct, but it's easy to see how the compiler gets confused
here. This avoids the problem by pulling the lock outside of the function
into its only caller.
"""

On deeper look it seems this is caused by paravirt spinlocks
implementation when CONFIG_PARAVIRT_DEBUG is set, which by being
complicated, manages to convince gcc locked parameter can be changed
externally (impossible).

Work around it by removing the conditional locking parameters altogether.
(It was never the most elegant code anyway.)

Slight penalty we now pay is an additional irqsave spin lock/unlock cycle
on the event enable path. But since enable is not a fast path, that is
preferrable to the alternative solution which was doing MMIO under irqsave
spinlock.
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reported-by: NArnd Bergmann <arnd@arndb.de>
Fixes: 1fe699e3 ("drm/i915/pmu: Fix sleep under atomic in RC6 readout")
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: David Airlie <airlied@linux.ie>
Cc: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180314080535.17490-1-tvrtko.ursulin@linux.intel.com

ad055fb8

10 3月, 2018 1 次提交

drm/i915: Make header i915_pmu.h more robust · 058a9b43

由 Michal Wajdeczko 提交于 3月 08, 2018

Definitions in i915_pmu.h header depend on other types and
declarations that were not explicitly included. Fix that by
adding related headers and forward declarations.
While here, change license text to SPDX format.

v2: don't drop "intel_ringbuffer.h" (Tvrtko)
Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180308095037.18264-4-michal.wajdeczko@intel.com

058a9b43

14 2月, 2018 3 次提交

drm/i915/pmu: Fix building without CONFIG_PM · 4b8b41d1

由 Chris Wilson 提交于 2月 13, 2018

As we peek inside struct device to query members guarded by CONFIG_PM,
so must be the code.
Reported-by: Nkbuild test robot <fengguang.wu@intel.com>
Fixes: 1fe699e3 ("drm/i915/pmu: Fix sleep under atomic in RC6 readout")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180207160428.17015-1-chris@chris-wilson.co.uk
(cherry picked from commit 05273c95)
Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180213095747.2424-4-tvrtko.ursulin@linux.intel.com

4b8b41d1

drm/i915/pmu: Fix sleep under atomic in RC6 readout · 4c83f0a7

由 Tvrtko Ursulin 提交于 2月 13, 2018

We are not allowed to call intel_runtime_pm_get from the PMU counter read
callback since the former can sleep, and the latter is running under IRQ
context.

To workaround this, we record the last known RC6 and while runtime
suspended estimate its increase by querying the runtime PM core
timestamps.

Downside of this approach is that we can temporarily lose a chunk of RC6
time, from the last PMU read-out to runtime suspend entry, but that will
eventually catch up, once device comes back online and in the presence of
PMU queries.

Also, we have to be careful not to overshoot the RC6 estimate, so once
resumed after a period of approximation, we only update the counter once
it catches up. With the observation that RC6 is increasing while the
device is suspended, this should not pose a problem and can only cause
slight inaccuracies due clock base differences.

v2: Simplify by estimating on top of PM core counters. (Imre)
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104943
Fixes: 6060b6ae ("drm/i915/pmu: Add RC6 residency metrics")
Testcase: igt/perf_pmu/rc6-runtime-pm
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: David Airlie <airlied@linux.ie>
Cc: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180206183311.17924-1-tvrtko.ursulin@linux.intel.com
(cherry picked from commit 1fe699e3)
Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180213095747.2424-3-tvrtko.ursulin@linux.intel.com

4c83f0a7

drm/i915/pmu: Fix PMU enable vs execlists tasklet race · d3f84c8b

由 Tvrtko Ursulin 提交于 2月 13, 2018

Commit 99e48bf9 ("drm/i915: Lock out execlist tasklet while peeking
inside for busy-stats") added a tasklet_disable call in busy stats
enabling, but we failed to understand that the PMU enable callback runs
as an hard IRQ (IPI).

Consequence of this is that the PMU enable callback can interrupt the
execlists tasklet, and will then deadlock when it calls
intel_engine_stats_enable->tasklet_disable.

To fix this, I realized it is possible to move the engine stats enablement
and disablement to PMU event init and destroy hooks. This allows for much
simpler implementation since those hooks run in normal context (can
sleep).

v2: Extract engine_event_destroy. (Chris Wilson)
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 99e48bf9 ("drm/i915: Lock out execlist tasklet while peeking inside for busy-stats")
Testcase: igt/perf_pmu/enable-race-*
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180205093448.13877-1-tvrtko.ursulin@linux.intel.com
(cherry picked from commit b2f78cda)
Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180213095747.2424-2-tvrtko.ursulin@linux.intel.com

d3f84c8b

08 2月, 2018 1 次提交

drm/i915/pmu: Fix building without CONFIG_PM · 05273c95

由 Chris Wilson 提交于 2月 07, 2018

As we peek inside struct device to query members guarded by CONFIG_PM,
so must be the code.
Reported-by: Nkbuild test robot <fengguang.wu@intel.com>
Fixes: 1fe699e3 ("drm/i915/pmu: Fix sleep under atomic in RC6 readout")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180207160428.17015-1-chris@chris-wilson.co.uk

05273c95

07 2月, 2018 1 次提交

drm/i915/pmu: Fix sleep under atomic in RC6 readout · 1fe699e3

由 Tvrtko Ursulin 提交于 2月 06, 2018

We are not allowed to call intel_runtime_pm_get from the PMU counter read
callback since the former can sleep, and the latter is running under IRQ
context.

To workaround this, we record the last known RC6 and while runtime
suspended estimate its increase by querying the runtime PM core
timestamps.

Downside of this approach is that we can temporarily lose a chunk of RC6
time, from the last PMU read-out to runtime suspend entry, but that will
eventually catch up, once device comes back online and in the presence of
PMU queries.

Also, we have to be careful not to overshoot the RC6 estimate, so once
resumed after a period of approximation, we only update the counter once
it catches up. With the observation that RC6 is increasing while the
device is suspended, this should not pose a problem and can only cause
slight inaccuracies due clock base differences.

v2: Simplify by estimating on top of PM core counters. (Imre)
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104943
Fixes: 6060b6ae ("drm/i915/pmu: Add RC6 residency metrics")
Testcase: igt/perf_pmu/rc6-runtime-pm
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: David Airlie <airlied@linux.ie>
Cc: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180206183311.17924-1-tvrtko.ursulin@linux.intel.com

1fe699e3

06 2月, 2018 1 次提交

drm/i915/pmu: Fix PMU enable vs execlists tasklet race · b2f78cda

由 Tvrtko Ursulin 提交于 2月 05, 2018

Commit 99e48bf9 ("drm/i915: Lock out execlist tasklet while peeking
inside for busy-stats") added a tasklet_disable call in busy stats
enabling, but we failed to understand that the PMU enable callback runs
as an hard IRQ (IPI).

Consequence of this is that the PMU enable callback can interrupt the
execlists tasklet, and will then deadlock when it calls
intel_engine_stats_enable->tasklet_disable.

To fix this, I realized it is possible to move the engine stats enablement
and disablement to PMU event init and destroy hooks. This allows for much
simpler implementation since those hooks run in normal context (can
sleep).

v2: Extract engine_event_destroy. (Chris Wilson)
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 99e48bf9 ("drm/i915: Lock out execlist tasklet while peeking inside for busy-stats")
Testcase: igt/perf_pmu/enable-race-*
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180205093448.13877-1-tvrtko.ursulin@linux.intel.com

b2f78cda

24 1月, 2018 1 次提交

drm/i915/pmu: Fix sysfs exported counter config · 8810bc56

由 Tvrtko Ursulin 提交于 1月 23, 2018

We need to generate the event config value using the uAPI class and not
the driver internal one.
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 109ec558 ("drm/i915/pmu: Only enumerate available counters in sysfs")
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180123134558.3222-1-tvrtko.ursulin@linux.intel.com

8810bc56

13 1月, 2018 2 次提交

drm/i915/pmu: Use kcalloc instead of kzalloc · dd5fec87

由 Tvrtko Ursulin 提交于 1月 12, 2018

kcalloc is preffered for allocating arrays.
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180112170340.5387-2-tvrtko.ursulin@linux.intel.com

dd5fec87

drm/i915/pmu: fix noderef.cocci warnings · 4c501230

由 Fengguang Wu 提交于 1月 12, 2018

drivers/gpu/drm/i915/i915_pmu.c:795:34-40: ERROR: application of sizeof to pointer

sizeof when applied to a pointer typed expression gives the size of
the pointer

Generated by: scripts/coccinelle/misc/noderef.cocci

Fixes: 109ec558 ("drm/i915/pmu: Only enumerate available counters in sysfs")
Signed-off-by: NFengguang Wu <fengguang.wu@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180112170340.5387-1-tvrtko.ursulin@linux.intel.com

4c501230

11 1月, 2018 2 次提交

drm/i915/pmu: Initialise our dynamic sysfs attributes for use with lockdep · 2bbba4e9

由 Chris Wilson 提交于 1月 11, 2018

As we kmalloc our dynamic sysfs attributes, we have to give them an
external static lock_class_key for them to use with lockdep.

Fixes: 109ec558 ("drm/i915/pmu: Only enumerate available counters in sysfs")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180111140402.3984-1-chris@chris-wilson.co.uk

2bbba4e9

drm/i915/pmu: Only enumerate available counters in sysfs · 109ec558

由 Tvrtko Ursulin 提交于 1月 11, 2018

Switch over to dynamically creating device attributes, which are in turn
used by the perf core to expose available counters in sysfs.

This way we do not expose counters which are not avaiable on the current
platform, and are so more consistent between what we reply to open
attempts via the perf_event_open(2), and what is discoverable in sysfs.

v2:
 * Simplify attribute pointer freeing loop.
 * Changed attr init from macro to function.
 * More common error unwind. (Chris Wilson)
 * Rename some locals. (Chris Wilson)

v3:
 * Fixed double semi-colon. (Chris Wilson)
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180111083525.32394-1-tvrtko.ursulin@linux.intel.com

109ec558

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功