1. 22 11月, 2017 2 次提交
    • T
      drm/i915: Engine busy time tracking · 30e17b78
      Tvrtko Ursulin 提交于
      Track total time requests have been executing on the hardware.
      
      We add new kernel API to allow software tracking of time GPU
      engines are spending executing requests.
      
      Both per-engine and global API is added with the latter also
      being exported for use by external users.
      
      v2:
       * Squashed with the internal API.
       * Dropped static key.
       * Made per-engine.
       * Store time in monotonic ktime.
      
      v3: Moved stats clearing to disable.
      
      v4:
       * Comments.
       * Don't export the API just yet.
      
      v5: Whitespace cleanup.
      
      v6:
       * Rename ref to active.
       * Drop engine aggregate stats for now.
       * Account initial busy period after enabling stats.
      
      v7:
       * Rebase.
      
      v8:
       * Move context in notification after the notifier. (Chris Wilson)
      
      v9:
      
      In cases where stats tracking is getting disabled while there is
      an active context on an engine, add up the current value to the
      total. This also implies we don't clear the total when tracking
      is disabled any longer. There is no real need to do so because
      we define the stats as relative while enabled, meaning
      comparison between two samples while tracking is enabled is the
      valid usage. However, when busy stats will later be plugged into
      the perf PMU API, it is beneficial to not reset the total, since
      the PMU core likes to do some counter disable/enable cycles on
      startup, and while doing so during a single long context
      executing on an engine we would lose some accuracy and so make
      unit testing more difficult than needs to be.
      
      v10:
       * Fix accounting for preemption.
      
      v11:
       * Rebase for i915_modparams.enable_execlists removal.
      Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-5-tvrtko.ursulin@linux.intel.com
      30e17b78
    • T
      drm/i915/pmu: Expose a PMU interface for perf queries · b46a33e2
      Tvrtko Ursulin 提交于
      From: Chris Wilson <chris@chris-wilson.co.uk>
      From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      From: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
      
      The first goal is to be able to measure GPU (and invidual ring) busyness
      without having to poll registers from userspace. (Which not only incurs
      holding the forcewake lock indefinitely, perturbing the system, but also
      runs the risk of hanging the machine.) As an alternative we can use the
      perf event counter interface to sample the ring registers periodically
      and send those results to userspace.
      
      Functionality we are exporting to userspace is via the existing perf PMU
      API and can be exercised via the existing tools. For example:
      
        perf stat -a -e i915/rcs0-busy/ -I 1000
      
      Will print the render engine busynnes once per second. All the performance
      counters can be enumerated (perf list) and have their unit of measure
      correctly reported in sysfs.
      
      v1-v2 (Chris Wilson):
      
      v2: Use a common timer for the ring sampling.
      
      v3: (Tvrtko Ursulin)
       * Decouple uAPI from i915 engine ids.
       * Complete uAPI defines.
       * Refactor some code to helpers for clarity.
       * Skip sampling disabled engines.
       * Expose counters in sysfs.
       * Pass in fake regs to avoid null ptr deref in perf core.
       * Convert to class/instance uAPI.
       * Use shared driver code for rc6 residency, power and frequency.
      
      v4: (Dmitry Rogozhkin)
       * Register PMU with .task_ctx_nr=perf_invalid_context
       * Expose cpumask for the PMU with the single CPU in the mask
       * Properly support pmu->stop(): it should call pmu->read()
       * Properly support pmu->del(): it should call stop(event, PERF_EF_UPDATE)
       * Introduce refcounting of event subscriptions.
       * Make pmu.busy_stats a refcounter to avoid busy stats going away
         with some deleted event.
       * Expose cpumask for i915 PMU to avoid multiple events creation of
         the same type followed by counter aggregation by perf-stat.
       * Track CPUs getting online/offline to migrate perf context. If (likely)
         cpumask will initially set CPU0, CONFIG_BOOTPARAM_HOTPLUG_CPU0 will be
         needed to see effect of CPU status tracking.
       * End result is that only global events are supported and perf stat
         works correctly.
       * Deny perf driver level sampling - it is prohibited for uncore PMU.
      
      v5: (Tvrtko Ursulin)
      
       * Don't hardcode number of engine samplers.
       * Rewrite event ref-counting for correctness and simplicity.
       * Store initial counter value when starting already enabled events
         to correctly report values to all listeners.
       * Fix RC6 residency readout.
       * Comments, GPL header.
      
      v6:
       * Add missing entry to v4 changelog.
       * Fix accounting in CPU hotplug case by copying the approach from
         arch/x86/events/intel/cstate.c. (Dmitry Rogozhkin)
      
      v7:
       * Log failure message only on failure.
       * Remove CPU hotplug notification state on unregister.
      
      v8:
       * Fix error unwind on failed registration.
       * Checkpatch cleanup.
      
      v9:
       * Drop the energy metric, it is available via intel_rapl_perf.
         (Ville Syrjälä)
       * Use HAS_RC6(p). (Chris Wilson)
       * Handle unsupported non-engine events. (Dmitry Rogozhkin)
       * Rebase for intel_rc6_residency_ns needing caller managed
         runtime pm.
       * Drop HAS_RC6 checks from the read callback since creating those
         events will be rejected at init time already.
       * Add counter units to sysfs so perf stat output is nicer.
       * Cleanup the attribute tables for brevity and readability.
      
      v10:
       * Fixed queued accounting.
      
      v11:
       * Move intel_engine_lookup_user to intel_engine_cs.c
       * Commit update. (Joonas Lahtinen)
      
      v12:
       * More accurate sampling. (Chris Wilson)
       * Store and report frequency in MHz for better usability from
         perf stat.
       * Removed metrics: queued, interrupts, rc6 counters.
       * Sample engine busyness based on seqno difference only
         for less MMIO (and forcewake) on all platforms. (Chris Wilson)
      
      v13:
       * Comment spelling, use mul_u32_u32 to work around potential GCC
         issue and somne code alignment changes. (Chris Wilson)
      
      v14:
       * Rebase.
      
      v15:
       * Rebase for RPS refactoring.
      
      v16:
       * Use the dynamic slot in the CPU hotplug state machine so that we are
         free to setup our state as multi-instance. Previously we were re-using
         the CPUHP_AP_PERF_X86_UNCORE_ONLINE slot which is neither used as
         multi-instance, nor owned by our driver to start with.
       * Register the CPU hotplug handlers after the PMU, otherwise the callback
         will get called before the PMU is initialized which can end up in
         perf_pmu_migrate_context with an un-initialized base.
       * Added workaround for a probable bug in cpuhp core.
      
      v17:
       * Remove workaround for the cpuhp bug.
      
      v18:
       * Rebase for drm_i915_gem_engine_class getting upstream before us.
      
      v19:
       * Rebase. (trivial)
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Signed-off-by: NDmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-2-tvrtko.ursulin@linux.intel.com
      b46a33e2
  2. 21 11月, 2017 4 次提交
  3. 16 11月, 2017 1 次提交
  4. 14 11月, 2017 4 次提交
  5. 11 11月, 2017 4 次提交
  6. 10 11月, 2017 1 次提交
    • C
      drm/i915: Restore the wait for idle engine after flushing interrupts · 30b29406
      Chris Wilson 提交于
      So it appears that commit 5427f207 ("drm/i915: Bump wait-times for
      the final CS interrupt before parking") was a little over optimistic in
      its belief that it had successfully waited for all residual activity on
      the engines before parking. Numerous sightings in CI since then of
      
      <7>[   52.542886] [IGT] core_auth: executing
      <3>[   52.561013] [drm:intel_engines_park [i915]] *ERROR* vcs0 is not idle before parking
      <7>[   52.561215] intel_engines_park vcs0
      <7>[   52.561229] intel_engines_park 	current seqno 98, last 98, hangcheck 0 [-247449 ms], inflight 0
      <7>[   52.561238] intel_engines_park 	Reset count: 0
      <7>[   52.561266] intel_engines_park 	Requests:
      <7>[   52.561363] intel_engines_park 	RING_START: 0x00000000 [0x00000000]
      <7>[   52.561377] intel_engines_park 	RING_HEAD:  0x00000000 [0x00000000]
      <7>[   52.561390] intel_engines_park 	RING_TAIL:  0x00000000 [0x00000000]
      <7>[   52.561406] intel_engines_park 	RING_CTL:   0x00000000
      <7>[   52.561422] intel_engines_park 	RING_MODE:  0x00000200 [idle]
      <7>[   52.561442] intel_engines_park 	ACTHD:  0x00000000_00000000
      <7>[   52.561459] intel_engines_park 	BBADDR: 0x00000000_00000000
      <7>[   52.561474] intel_engines_park 	Execlist status: 0x00000301 00000000
      <7>[   52.561489] intel_engines_park 	Execlist CSB read 5 [5 cached], write 5 [5 from hws], interrupt posted? no
      <7>[   52.561500] intel_engines_park 		ELSP[0] idle
      <7>[   52.561510] intel_engines_park 		ELSP[1] idle
      <7>[   52.561519] intel_engines_park 		HW active? 0x0
      <7>[   52.561608] intel_engines_park Idle? yes
      <7>[   52.561617] intel_engines_park
      
      on Braswell, which indicates that the engine just needs that little bit
      longer after flushing the tasklet to settle. So give it a few more
      milliseconds before declaring an err and applying the emergency brake.
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103479Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171110112550.28909-1-chris@chris-wilson.co.ukReviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
      30b29406
  7. 09 11月, 2017 2 次提交
  8. 08 11月, 2017 1 次提交
  9. 02 11月, 2017 1 次提交
  10. 01 11月, 2017 2 次提交
  11. 27 10月, 2017 2 次提交
  12. 26 10月, 2017 1 次提交
  13. 25 10月, 2017 1 次提交
  14. 24 10月, 2017 1 次提交
    • C
      drm/i915: Filter out spurious execlists context-switch interrupts · 4a118ecb
      Chris Wilson 提交于
      Back in commit a4b2b015 ("drm/i915: Don't mark an execlists
      context-switch when idle") we noticed the presence of late
      context-switch interrupts. We were able to filter those out by looking
      at whether the ELSP remained active, but in commit beecec90
      ("drm/i915/execlists: Preemption!") that became problematic as we now
      anticipate receiving a context-switch event for preemption while ELSP
      may be empty. To restore the spurious interrupt suppression, add a
      counter for the expected number of pending context-switches and skip if
      we do not need to handle this interrupt to make forward progress.
      
      v2: Don't forget to switch on for preempt.
      v3: Reduce the counter to a on/off boolean tracker. Declare the HW as
      active when we first submit, and idle after the final completion event
      (with which we confirm the HW says it is idle), and track each source
      of activity separately. With a finite number of sources, it should aide
      us in debugging which gets stuck.
      
      Fixes: beecec90 ("drm/i915/execlists: Preemption!")
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Michal Winiarski <michal.winiarski@intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
      Cc: Mika Kuoppala <mika.kuoppala@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171023213237.26536-3-chris@chris-wilson.co.ukReviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
      4a118ecb
  15. 18 10月, 2017 2 次提交
  16. 17 10月, 2017 1 次提交
  17. 16 10月, 2017 1 次提交
  18. 10 10月, 2017 1 次提交
  19. 06 10月, 2017 1 次提交
  20. 05 10月, 2017 3 次提交
  21. 04 10月, 2017 3 次提交
  22. 25 9月, 2017 1 次提交