1. 08 12月, 2017 1 次提交
    • T
      drm/i915: Restore GT performance in headless mode with DMC loaded · b6876374
      Tvrtko Ursulin 提交于
      It seems that the DMC likes to transition between the DC states a lot when
      there are no connected displays (no active power domains) during command
      submission.
      
      This activity on DC states has a negative impact on the performance of the
      chip with huge latencies observed in the interrupt handlers and elsewhere.
      Simple tests like igt/gem_latency -n 0 are slowed down by a factor of
      eight.
      
      Work around it by introducing a new power domain named,
      POWER_DOMAIN_GT_IRQ, associtated with the "DC off" power well, which is
      held for the duration of command submission activity.
      
      CNL has the same problem which will be addressed as a follow-up. Doing
      that requires a fix for a DC6 context corruption problem in the CNL DMC
      firmware which is yet to be released.
      
      v2:
       * Add commit text as comment in i915_gem_mark_busy. (Chris Wilson)
       * Protect macro body with braces. (Jani Nikula)
      
      v3:
       * Add dedicated power domain for clarity. (Chris, Imre)
       * Commit message and comment text updates.
       * Apply to all big-core GEN9 parts apart for Skylake which is pending DMC
         firmware release.
      
      v4:
       * Power domain should be inner to device runtime pm. (Chris)
       * Simplify NEEDS_CSR_GT_PERF_WA macro. (Chris)
       * Handle async DMC loading by moving the GT_IRQ power domain logic into
         intel_runtime_pm. (Daniel, Chris)
       * Include small core GEN9 as well. (Imre)
      
      v5
       * Special handling for async DMC load is not needed since on failure the
         power domain reference is kept permanently taken. (Imre)
      
      v6:
       * Drop the NEEDS_CSR_GT_PERF_WA macro since all firmwares have now been
         deployed. (Imre, Chris)
      Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100572
      Testcase: igt/gem_exec_nop/headless
      Cc: Imre Deak <imre.deak@intel.com>
      Acked-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
      Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v5)
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      [Imre: Add note about applying the WA on CNL as a follow-up]
      Signed-off-by: NImre Deak <imre.deak@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171205132854.26380-1-tvrtko.ursulin@linux.intel.com
      b6876374
  2. 07 12月, 2017 1 次提交
  3. 06 12月, 2017 3 次提交
  4. 02 12月, 2017 1 次提交
  5. 01 12月, 2017 1 次提交
  6. 28 11月, 2017 3 次提交
  7. 22 11月, 2017 6 次提交
    • T
      drm/i915: Convert intel_rc6_residency_us to ns · 36cc8b96
      Tvrtko Ursulin 提交于
      Will be used for exposing the PMU counters.
      
      v2:
       * Move intel_runtime_pm_get/put to the callers. (Chris Wilson)
       * Restore full unit conversion precision.
      Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-8-tvrtko.ursulin@linux.intel.com
      36cc8b96
    • T
      drm/i915/pmu: Expose a PMU interface for perf queries · b46a33e2
      Tvrtko Ursulin 提交于
      From: Chris Wilson <chris@chris-wilson.co.uk>
      From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      From: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
      
      The first goal is to be able to measure GPU (and invidual ring) busyness
      without having to poll registers from userspace. (Which not only incurs
      holding the forcewake lock indefinitely, perturbing the system, but also
      runs the risk of hanging the machine.) As an alternative we can use the
      perf event counter interface to sample the ring registers periodically
      and send those results to userspace.
      
      Functionality we are exporting to userspace is via the existing perf PMU
      API and can be exercised via the existing tools. For example:
      
        perf stat -a -e i915/rcs0-busy/ -I 1000
      
      Will print the render engine busynnes once per second. All the performance
      counters can be enumerated (perf list) and have their unit of measure
      correctly reported in sysfs.
      
      v1-v2 (Chris Wilson):
      
      v2: Use a common timer for the ring sampling.
      
      v3: (Tvrtko Ursulin)
       * Decouple uAPI from i915 engine ids.
       * Complete uAPI defines.
       * Refactor some code to helpers for clarity.
       * Skip sampling disabled engines.
       * Expose counters in sysfs.
       * Pass in fake regs to avoid null ptr deref in perf core.
       * Convert to class/instance uAPI.
       * Use shared driver code for rc6 residency, power and frequency.
      
      v4: (Dmitry Rogozhkin)
       * Register PMU with .task_ctx_nr=perf_invalid_context
       * Expose cpumask for the PMU with the single CPU in the mask
       * Properly support pmu->stop(): it should call pmu->read()
       * Properly support pmu->del(): it should call stop(event, PERF_EF_UPDATE)
       * Introduce refcounting of event subscriptions.
       * Make pmu.busy_stats a refcounter to avoid busy stats going away
         with some deleted event.
       * Expose cpumask for i915 PMU to avoid multiple events creation of
         the same type followed by counter aggregation by perf-stat.
       * Track CPUs getting online/offline to migrate perf context. If (likely)
         cpumask will initially set CPU0, CONFIG_BOOTPARAM_HOTPLUG_CPU0 will be
         needed to see effect of CPU status tracking.
       * End result is that only global events are supported and perf stat
         works correctly.
       * Deny perf driver level sampling - it is prohibited for uncore PMU.
      
      v5: (Tvrtko Ursulin)
      
       * Don't hardcode number of engine samplers.
       * Rewrite event ref-counting for correctness and simplicity.
       * Store initial counter value when starting already enabled events
         to correctly report values to all listeners.
       * Fix RC6 residency readout.
       * Comments, GPL header.
      
      v6:
       * Add missing entry to v4 changelog.
       * Fix accounting in CPU hotplug case by copying the approach from
         arch/x86/events/intel/cstate.c. (Dmitry Rogozhkin)
      
      v7:
       * Log failure message only on failure.
       * Remove CPU hotplug notification state on unregister.
      
      v8:
       * Fix error unwind on failed registration.
       * Checkpatch cleanup.
      
      v9:
       * Drop the energy metric, it is available via intel_rapl_perf.
         (Ville Syrjälä)
       * Use HAS_RC6(p). (Chris Wilson)
       * Handle unsupported non-engine events. (Dmitry Rogozhkin)
       * Rebase for intel_rc6_residency_ns needing caller managed
         runtime pm.
       * Drop HAS_RC6 checks from the read callback since creating those
         events will be rejected at init time already.
       * Add counter units to sysfs so perf stat output is nicer.
       * Cleanup the attribute tables for brevity and readability.
      
      v10:
       * Fixed queued accounting.
      
      v11:
       * Move intel_engine_lookup_user to intel_engine_cs.c
       * Commit update. (Joonas Lahtinen)
      
      v12:
       * More accurate sampling. (Chris Wilson)
       * Store and report frequency in MHz for better usability from
         perf stat.
       * Removed metrics: queued, interrupts, rc6 counters.
       * Sample engine busyness based on seqno difference only
         for less MMIO (and forcewake) on all platforms. (Chris Wilson)
      
      v13:
       * Comment spelling, use mul_u32_u32 to work around potential GCC
         issue and somne code alignment changes. (Chris Wilson)
      
      v14:
       * Rebase.
      
      v15:
       * Rebase for RPS refactoring.
      
      v16:
       * Use the dynamic slot in the CPU hotplug state machine so that we are
         free to setup our state as multi-instance. Previously we were re-using
         the CPUHP_AP_PERF_X86_UNCORE_ONLINE slot which is neither used as
         multi-instance, nor owned by our driver to start with.
       * Register the CPU hotplug handlers after the PMU, otherwise the callback
         will get called before the PMU is initialized which can end up in
         perf_pmu_migrate_context with an un-initialized base.
       * Added workaround for a probable bug in cpuhp core.
      
      v17:
       * Remove workaround for the cpuhp bug.
      
      v18:
       * Rebase for drm_i915_gem_engine_class getting upstream before us.
      
      v19:
       * Rebase. (trivial)
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Signed-off-by: NDmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-2-tvrtko.ursulin@linux.intel.com
      b46a33e2
    • T
      drm/i915: Extract intel_get_cagf · c84b2705
      Tvrtko Ursulin 提交于
      Code to be shared between debugfs and the PMU implementation.
      
      v2: Checkpatch cleanup.
      v3: Also consolidate i915_sysfs.c/gt_act_freq_mhz_show.
      v4: Rebase.
      Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-1-tvrtko.ursulin@linux.intel.com
      c84b2705
    • V
      drm/i915: Switch fbc over to for_each_new_intel_plane_in_state() · dd57602e
      Ville Syrjälä 提交于
      Stop using the old for_each_intel_plane_in_state() type iteration
      macro and replace it with for_each_new_intel_plane_in_state().
      And similarly replace drm_atomic_get_existing_crtc_state() with
      intel_atomic_get_new_crtc_state(). Switch over to intel_ types
      as well to make the code less cluttered.
      
      v2: s/plane/i9xx_plane/ etc. (James)
      
      Cc: James Ausmus <james.ausmus@intel.com>
      Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171117191917.11506-8-ville.syrjala@linux.intel.comSigned-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
      dd57602e
    • V
      drm/i915: Use enum i9xx_plane_id for the .get_fifo_size() hooks · bdaf8439
      Ville Syrjälä 提交于
      Replace the 0 and 1 with PLANE_A and PLANE_B in the pre-g4x wm code.
      
      v2: s/old_plane_id/i9xx_plane_id/ (Daniel)
      v3: s/plane/i9xx_plane/ etc. (James)
      
      Cc: James Ausmus <james.ausmus@intel.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171117191917.11506-5-ville.syrjala@linux.intel.comSigned-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
      bdaf8439
    • V
      drm/i915: s/enum plane/enum i9xx_plane_id/ · ed15030d
      Ville Syrjälä 提交于
      Rename enum plane to enum i9xx_plane_id to make it clear that it only
      applies to pre-SKL platforms.
      
      enum i9xx_plane_id is a global identifier, whereas enum plane_id is
      per-pipe. We need the old global identifier to index the primary plane
      (and the pre-g4x sprite C if we ever expose it) registers on pre-SKL
      platforms.
      
      v2: Reorder patches
      v3: s/old_plane_id/i9xx_plane_id/ (Daniel)
          Pimp the commit message a bit
          Note that i9xx_plane_id doesn't apply to SKL+
      v4: Rebase due to power domain handling in plane readout
      v5: Rebase due to crtc->dspaddr_offset removal
      v6: s/plane/i9xx_plane/ etc. (James)
      
      Cc: James Ausmus <james.ausmus@intel.com>
      Cc: Daniel Vetter <daniel@ffwll.ch>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171117191917.11506-4-ville.syrjala@linux.intel.comReviewed-by: NJames Ausmus <james.ausmus@intel.com>
      Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
      ed15030d
  8. 21 11月, 2017 4 次提交
  9. 18 11月, 2017 1 次提交
  10. 14 11月, 2017 1 次提交
  11. 13 11月, 2017 2 次提交
  12. 12 11月, 2017 1 次提交
  13. 11 11月, 2017 2 次提交
  14. 10 11月, 2017 3 次提交
  15. 08 11月, 2017 1 次提交
    • C
      drm/i915: Read ilk FDI PLL frequency once during initialisation · 58ecd9d5
      Chris Wilson 提交于
      During intel_atomic_check(), we do not take the intel_runtime_pm_get()
      wakeref and so should do the atomic modeset precalculations without
      referring to the HW. However, on Ironlake we see
      
      <7>[   23.487557] [drm:intel_atomic_check [i915]] [CONNECTOR:47:VGA-1] checking for sink bpp constrains
      <7>[   23.487615] [drm:intel_atomic_check [i915]] clamping display bpp (was 36) to default limit of 24
      <4>[   23.487621] RPM wakelock ref not held during HW access
      <4>[   23.487652] ------------[ cut here ]------------
      <4>[   23.487697] WARNING: CPU: 0 PID: 1343 at drivers/gpu/drm/i915/intel_drv.h:1813 gen5_read32+0x183/0x200 [i915]
      <4>[   23.487701] Modules linked in: snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul snd_hda_intel ghash_clmulni_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm lpc_ich e1000e mei_me ptp mei pps_core prime_numbers
      <4>[   23.487784] CPU: 0 PID: 1343 Comm: debugfs_test Tainted: G        W       4.14.0-rc7-CI-Trybot_1378+ #1
      <4>[   23.487788] Hardware name: Hewlett-Packard HP Compaq 8100 Elite SFF PC/304Ah, BIOS 786H1 v01.13 07/14/2011
      <4>[   23.487793] task: ffff8801f90aa6c0 task.stack: ffffc900013ec000
      <4>[   23.487838] RIP: 0010:gen5_read32+0x183/0x200 [i915]
      <4>[   23.487842] RSP: 0018:ffffc900013efb58 EFLAGS: 00010292
      <4>[   23.487849] RAX: 000000000000002a RBX: ffff880205c00000 RCX: 0000000000000006
      <4>[   23.487854] RDX: 000000000000140a RSI: ffffffff81d0eb14 RDI: ffffffff81cc26f6
      <4>[   23.487857] RBP: ffffc900013efb80 R08: ffff8801f90aaff8 R09: 0000000000000000
      <4>[   23.487861] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
      <4>[   23.487865] R13: 0000000000046000 R14: ffff88020ffaba78 R15: ffff88020b109bf8
      <4>[   23.487870] FS:  00007f53b5e40a40(0000) GS:ffff88021bc00000(0000) knlGS:0000000000000000
      <4>[   23.487874] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      <4>[   23.487878] CR2: 000055e41900c0e8 CR3: 00000001fa0d6005 CR4: 00000000000206f0
      <4>[   23.487882] Call Trace:
      <4>[   23.487931]  intel_atomic_check+0x745/0x1290 [i915]
      <4>[   23.487948]  drm_atomic_check_only+0x459/0x560
      <4>[   23.487956]  ? drm_atomic_set_crtc_for_connector+0xc9/0x100
      <4>[   23.488025]  drm_atomic_commit+0x18/0x50
      <4>[   23.488035]  restore_fbdev_mode_atomic+0x190/0x1f0
      <4>[   23.488059]  restore_fbdev_mode+0x32/0x120
      <4>[   23.488072]  drm_fb_helper_restore_fbdev_mode_unlocked+0x50/0xa0
      <4>[   23.488139]  intel_fbdev_restore_mode+0x34/0x90 [i915]
      <4>[   23.488194]  i915_driver_lastclose+0xe/0x10 [i915]
      <4>[   23.488208]  drm_lastclose+0x39/0xf0
      <4>[   23.488219]  drm_release+0x30c/0x3c0
      <4>[   23.488236]  __fput+0xb9/0x200
      <4>[   23.488252]  ____fput+0xe/0x10
      <4>[   23.488264]  task_work_run+0x89/0xc0
      <4>[   23.488278]  exit_to_usermode_loop+0x83/0x90
      <4>[   23.488290]  syscall_return_slowpath+0xd0/0x110
      <4>[   23.488304]  entry_SYSCALL_64_fastpath+0xaf/0xb1
      <4>[   23.488312] RIP: 0033:0x7f53b4317560
      <4>[   23.488320] RSP: 002b:00007ffca7e70748 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
      <4>[   23.488333] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 00007f53b4317560
      <4>[   23.488340] RDX: 0000000000000005 RSI: 00007ffca7e70640 RDI: 0000000000000004
      <4>[   23.488347] RBP: 000055e417783900 R08: 000055e418f9e290 R09: 0000000000000000
      <4>[   23.488356] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
      <4>[   23.488363] R13: 00007f53b4302c40 R14: 0000000000000000 R15: 0000000000000000
      <4>[   23.488384] Code: b5 f2 f2 e0 0f ff e9 c5 fe ff ff 80 3d 0e 4b 10 00 00 0f 85 c6 fe ff ff 48 c7 c7 30 73 29 a0 c6 05 fa 4a 10 00 01 e8 8e f2 f2 e0 <0f> ff e9 ac fe ff ff e8 51 9d f3 e0 85 c0 0f 85 01 ff ff ff 48
      <4>[   23.488780] ---[ end trace 6bc72ce7f1596190 ]---
      <7>[   23.488844] [drm:intel_atomic_check [i915]] checking fdi config on pipe A, lanes 1
      <7>[   23.488911] [drm:intel_atomic_check [i915]] hw max bpp: 36, pipe bpp: 24, dithering: 0
      
      due to intel_fdi_link_freq() poking at FDI_PLL_BIOS_0. Avoid this by
      recording the fdi pll frequency during device initiailisation.
      
      v2: Also extract the static FDI PLL frequencies for Sandybridge and
      Ivybridge.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171107214713.18704-1-chris@chris-wilson.co.ukReviewed-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
      58ecd9d5
  16. 06 11月, 2017 2 次提交
  17. 02 11月, 2017 2 次提交
  18. 01 11月, 2017 1 次提交
  19. 31 10月, 2017 1 次提交
  20. 28 10月, 2017 1 次提交
  21. 27 10月, 2017 1 次提交
  22. 25 10月, 2017 1 次提交
    • V
      drm/i915: Adjust system agent voltage on CNL if required by DDI ports · 53e9bf5e
      Ville Syrjälä 提交于
      On CNL we may need to bump up the system agent voltage not only due
      to CDCLK but also when driving DDI port with a sufficiently high clock.
      To that end start tracking the minimum acceptable voltage for each crtc.
      We do the tracking via crtcs because we don't have any kind of encoder
      state. Also there's no downside to doing it this way, and it matches how
      we track cdclk requirements on account of pixel rate.
      
      v2: Allow disabled crtcs to use the min voltage
          Add IS_CNL check to intel_ddi_compute_min_voltage() since
          we're using CNL specific values there
          s/intel_compute_min_voltage/cnl_compute_min_voltage/ since
          the function makes hw specific assumptions about the voltage
          values
      v3: Drop the test hack leftovers from skl_modeset_calc_cdclk()
      v4: s/voltage/voltage_level/ (Rodrigo)
          Replace DPLL DVFS FIXMEs with an explanation why we don't
          do anything there (Rodrigo)
      
      Cc: Mika Kahola <mika.kahola@intel.com>
      Cc: Manasi Navare <manasi.d.navare@intel.com>
      Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
      Reviewed-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171024095216.1638-9-ville.syrjala@linux.intel.com
      53e9bf5e