1. 07 3月, 2018 2 次提交
  2. 28 2月, 2018 1 次提交
    • C
      drm/i915: Don't deref request->ctx inside unlocked print_request() · 367a35a6
      Chris Wilson 提交于
      Although we protect the request itself, we don't lock inside
      intel_engine_dump() and so the request maybe retired as we peek into it.
      One consequence is that the request->ctx may be freed before we
      dereference it, leading to a use-after-free. Replace the hw_id we are
      peeking from inside request->ctx with the request->fence.context, with
      which we can still track from which context the request originated
      (although to tie to HW reports requires a little more legwork, but is
      good enough to follow the GEM traces).
      
      [52640.729670] general protection fault: 0000 [#2] SMP
      [52640.729694] Dumping ftrace buffer:
      [52640.729701]    (ftrace buffer empty)
      [52640.729705] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic x86_pkg_\
      temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul snd_hda_intel snd_hda_codec snd_hwdep gha\
      sh_clmulni_intel snd_hda_core snd_pcm mei_me mei i915 r8169 mii prime_numbers i2c_hid
      [52640.729748] CPU: 2 PID: 4335 Comm: gem_exec_schedu Tainted: G     UD W        4.16.0-rc3+ #7
      [52640.729759] Hardware name: Acer Aspire E5-575G/Ironman_SK  , BIOS V1.12 08/02/2016
      [52640.729803] RIP: 0010:print_request+0x2b/0xb0 [i915]
      [52640.729811] RSP: 0018:ffffc90001453c18 EFLAGS: 00010206
      [52640.729820] RAX: 6b6b6b6b6b6b6b6b RBX: ffff8801e0292d40 RCX: 0000000000000006
      [52640.729829] RDX: ffffc90001453c60 RSI: ffff8801e0292d40 RDI: 0000000000000003
      [52640.729838] RBP: ffffc90001453d80 R08: 0000000000000000 R09: 0000000000000001
      [52640.729847] R10: ffffc90001453bd0 R11: ffffc90001453c73 R12: ffffc90001453c60
      [52640.729856] R13: ffffc90001453d80 R14: ffff8801d5a683c8 R15: ffff8801e0292d40
      [52640.729866] FS:  00007f1ee50548c0(0000) GS:ffff8801e8200000(0000) knlGS:0000000000000000
      [52640.729876] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [52640.729884] CR2: 00007f1ee5077000 CR3: 00000001d9411004 CR4: 00000000003606e0
      [52640.729893] Call Trace:
      [52640.729922]  intel_engine_print_registers+0x623/0x890 [i915]
      [52640.729948]  intel_engine_dump+0x4a3/0x590 [i915]
      [52640.729957]  ? seq_printf+0x3a/0x50
      [52640.729977]  i915_engine_info+0xb8/0xe0 [i915]
      [52640.729984]  ? drm_mode_gamma_get_ioctl+0xf0/0xf0
      [52640.729990]  seq_read+0xd5/0x410
      [52640.729997]  full_proxy_read+0x4b/0x70
      [52640.730004]  __vfs_read+0x1e/0x120
      [52640.730009]  ? do_sys_open+0x134/0x220
      [52640.730015]  ? kmem_cache_free+0x174/0x2b0
      [52640.730021]  vfs_read+0xa1/0x150
      [52640.730026]  SyS_read+0x40/0xa0
      [52640.730032]  do_syscall_64+0x65/0x1a0
      [52640.730038]  entry_SYSCALL_64_after_hwframe+0x42/0xb7
      Reported-by: NMika Kuoppala <mika.kuoppala@intel.com>
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@intel.com>
      Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180228094732.28462-1-chris@chris-wilson.co.uk
      367a35a6
  3. 24 2月, 2018 1 次提交
  4. 22 2月, 2018 1 次提交
  5. 12 2月, 2018 2 次提交
  6. 09 2月, 2018 1 次提交
  7. 08 2月, 2018 1 次提交
  8. 01 2月, 2018 1 次提交
  9. 23 1月, 2018 1 次提交
  10. 20 1月, 2018 2 次提交
  11. 15 1月, 2018 1 次提交
  12. 11 1月, 2018 2 次提交
  13. 06 1月, 2018 1 次提交
  14. 23 12月, 2017 1 次提交
    • C
      drm/i915: Show HWSP in intel_engine_dump() · c1bf2728
      Chris Wilson 提交于
      Looking at a CI failure with an ominous line of
      [  362.550715] hangcheck current seqno ffffff6b, last ffffff8c, hangcheck ffffff6b [6016 ms], inflight 118
      with no apparent cause for the seqno to be negative, left me wondering
      if someone had scribbled over the HWSP. So include the HWSP in the
      engine dump to see if there are more signs of random scribbling.
      
      v2: Fix row pointer, i is now incremented by 8 so doesn't need scaling
      by 8, and we don't need to keep volatile here as the status_page isn't
      marked up as volatile itself.
      v3: Use hexdump, with suppression of identical lines. (Tvrtko)
          Which results in
      
      HWSP:
      00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      *
      00000040 00000001 00000000 00000018 00000002 00000001 00000000 00000018 00000000
      00000060 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000003
      00000080 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      *
      000000c0 00000002 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      000000e0 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      *
      
          instead of 128 lines of mostly 0s.
      v4: Tidy up the locals
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171222182521.18106-1-chris@chris-wilson.co.ukReviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      c1bf2728
  15. 20 12月, 2017 1 次提交
  16. 18 12月, 2017 1 次提交
  17. 13 12月, 2017 1 次提交
    • C
      drm/i915: Don't check #active_requests from i915_gem_wait_for_idle() · d7dc4131
      Chris Wilson 提交于
      i915_gem_wait_for_idle() is called from inside the shrinker, to ensure
      that we drain the last resources from the GPU in dire circumstances (OOM).
      As we may allocate whilst building a request, it is then possible to hit
      the shrinker with a request under construction, and so we must account
      for the incomplete request whilst waiting. In particular, we
      preincrement (in reserve_engine) the i915->gt.active_requests counter
      and mark the GPU as busy, therefore we can not use that counter for
      shortcircuiting the wait-for-idle.
      
      [  950.859024] GEM_BUG_ON(i915->gt.active_requests)
      [  950.859041] WARNING: CPU: 2 PID: 2178 at drivers/gpu/drm/i915/i915_gem.c:3615 i915_gem_wait_for_idle.part.56+0x166/0x4e0
      [  950.859041] Modules linked in: ccm tun fuse nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_mangle iptable_security iptable_raw arc4 iwldvm mac80211 snd_hda_codec_hdmi snd_hda_codec_idt snd_hda_codec_generic snd_hda_intel snd_hda_codec btusb snd_hda_core btrtl btbcm iwlwifi snd_hwdep btintel bluetooth snd_seq snd_seq_device snd_pcm ecdh_generic x86_pkg_temp_thermal tpm_infineon coretemp tpm_tis crc32_pclmul wmi_bmof crc32c_intel iTCO_wdt hp_wmi snd_timer iTCO_vendor_support sparse_keymap tpm_tis_core mei_me cfg80211
      [  950.859082]  snd joydev tpm mei rfkill pcspkr wmi soundcore lpc_ich hp_accel lis3lv02d input_polldev binfmt_misc e1000e ptp serio_raw pps_core
      [  950.859094] CPU: 2 PID: 2178 Comm: gem_exec_nop Tainted: G     U           4.15.0-rc2+ #900
      [  950.859102] Hardware name: Hewlett-Packard HP ProBook 6360b/1620, BIOS 68SCF Ver. B.42 12/29/2010
      [  950.859107] task: c5119cb4 task.stack: f3ccb8d8
      [  950.859112] EIP: i915_gem_wait_for_idle.part.56+0x166/0x4e0
      [  950.859113] EFLAGS: 00010296 CPU: 2
      [  950.859114] EAX: 00000024 EBX: f36c1888 ECX: f777a044 EDX: 00000007
      [  950.859115] ESI: f36c1888 EDI: edd53958 EBP: edd53970 ESP: edd53938
      [  950.859116]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
      [  950.859117] CR0: 80050033 CR2: b7f39000 CR3: 2f2b3000 CR4: 000406d0
      [  950.859118] Call Trace:
      [  950.859125]  ? drm_printk+0x70/0x70
      [  950.859129]  i915_gem_wait_for_idle+0x18/0x30
      [  950.859133]  i915_gem_shrink+0x360/0x410
      [  950.859138]  ? vmpressure+0xa8/0xf0
      [  950.859142]  ? ktime_get+0x4a/0x100
      [  950.859147]  i915_gem_shrink_all+0x21/0x40
      [  950.859151]  i915_gem_shrinker_oom+0x23/0x130
      [  950.859156]  notifier_call_chain+0x4e/0x70
      [  950.859160]  __blocking_notifier_call_chain+0x2f/0x60
      [  950.859164]  blocking_notifier_call_chain+0x11/0x20
      [  950.859169]  out_of_memory+0x207/0x280
      [  950.859174]  __alloc_pages_nodemask+0xd47/0xe60
      [  950.859179]  new_slab+0x32d/0x450
      [  950.859183]  ___slab_alloc.constprop.81+0x358/0x4e0
      [  950.859189]  ? i915_sw_fence_await_dma_fence+0x53/0x160
      [  950.859193]  ? __slab_free+0x1fe/0x310
      [  950.859197]  ? native_sched_clock+0x1e/0xc0
      [  950.859201]  ? i915_gem_request_alloc+0xcf/0x510
      [  950.859205]  ? sched_clock+0x9/0x10
      [  950.859209]  __slab_alloc.constprop.80+0x29/0x40
      [  950.859212]  ? __slab_alloc.constprop.80+0x29/0x40
      [  950.859216]  kmem_cache_alloc_trace+0x160/0x1a0
      [  950.859220]  ? i915_sw_fence_await_dma_fence+0x53/0x160
      [  950.859224]  i915_sw_fence_await_dma_fence+0x53/0x160
      [  950.859229]  i915_gem_request_await_dma_fence+0x1eb/0x390
      [  950.859233]  i915_gem_request_await_object+0xee/0x230
      [  950.859239]  i915_gem_do_execbuffer+0xc16/0x1200
      [  950.859246]  ? irqtime_account_irq+0x3e/0xc0
      [  950.859251]  ? irq_exit+0x4f/0xb0
      [  950.859257]  ? smp_apic_timer_interrupt+0x5f/0x110
      [  950.859261]  ? apic_timer_interrupt+0x35/0x3c
      [  950.859266]  i915_gem_execbuffer2_ioctl+0x212/0x440
      [  950.859270]  ? apic_timer_interrupt+0x35/0x3c
      [  950.859274]  ? i915_gem_do_execbuffer+0x1200/0x1200
      [  950.859279]  ? insn_get_seg_base+0x1b/0x50
      [  950.859283]  ? i915_gem_do_execbuffer+0x1200/0x1200
      [  950.859287]  drm_ioctl_kernel+0x51/0xa0
      [  950.859291]  drm_ioctl+0x2a3/0x350
      [  950.859294]  ? i915_gem_do_execbuffer+0x1200/0x1200
      [  950.859300]  ? sched_clock+0x9/0x10
      [  950.859303]  ? drm_getunique+0x70/0x70
      [  950.859308]  do_vfs_ioctl+0x7d/0x640
      [  950.859311]  ? native_sched_clock+0x1e/0xc0
      [  950.859315]  ? sched_clock+0x9/0x10
      [  950.859319]  ? sched_clock_cpu+0x13/0x120
      [  950.859323]  SyS_ioctl+0x4e/0x80
      [  950.859326]  do_fast_syscall_32+0x75/0x250
      [  950.859331]  ? irq_exit+0x4f/0xb0
      [  950.859334]  entry_SYSENTER_32+0x47/0x71
      [  950.859338] EIP: 0xb7f81d11
      [  950.859339] EFLAGS: 00000296 CPU: 2
      [  950.859340] EAX: ffffffda EBX: 00000003 ECX: 40406469 EDX: bfde4c20
      [  950.859340] ESI: 00000003 EDI: 40406469 EBP: 00000003 ESP: bfde4b38
      [  950.859341]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
      [  950.859343] Code: e8 30 60 01 00 83 c4 10 83 c3 04 39 f3 75 e0 8b 45 d8 8b 80 14 37 00 00 85 c0 74 13 68 dd 33 e4 c0 68 49 6f e3 c0 e8 4a 55 be ff <0f> ff 5e 5f b8 fe ff ff 3f bb 0a 00 00 00 e8 b7 14 c4 ff 8b 15
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171212132148.8124-1-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      d7dc4131
  18. 11 12月, 2017 1 次提交
  19. 09 12月, 2017 5 次提交
  20. 30 11月, 2017 1 次提交
  21. 29 11月, 2017 1 次提交
  22. 22 11月, 2017 2 次提交
    • T
      drm/i915: Engine busy time tracking · 30e17b78
      Tvrtko Ursulin 提交于
      Track total time requests have been executing on the hardware.
      
      We add new kernel API to allow software tracking of time GPU
      engines are spending executing requests.
      
      Both per-engine and global API is added with the latter also
      being exported for use by external users.
      
      v2:
       * Squashed with the internal API.
       * Dropped static key.
       * Made per-engine.
       * Store time in monotonic ktime.
      
      v3: Moved stats clearing to disable.
      
      v4:
       * Comments.
       * Don't export the API just yet.
      
      v5: Whitespace cleanup.
      
      v6:
       * Rename ref to active.
       * Drop engine aggregate stats for now.
       * Account initial busy period after enabling stats.
      
      v7:
       * Rebase.
      
      v8:
       * Move context in notification after the notifier. (Chris Wilson)
      
      v9:
      
      In cases where stats tracking is getting disabled while there is
      an active context on an engine, add up the current value to the
      total. This also implies we don't clear the total when tracking
      is disabled any longer. There is no real need to do so because
      we define the stats as relative while enabled, meaning
      comparison between two samples while tracking is enabled is the
      valid usage. However, when busy stats will later be plugged into
      the perf PMU API, it is beneficial to not reset the total, since
      the PMU core likes to do some counter disable/enable cycles on
      startup, and while doing so during a single long context
      executing on an engine we would lose some accuracy and so make
      unit testing more difficult than needs to be.
      
      v10:
       * Fix accounting for preemption.
      
      v11:
       * Rebase for i915_modparams.enable_execlists removal.
      Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-5-tvrtko.ursulin@linux.intel.com
      30e17b78
    • T
      drm/i915/pmu: Expose a PMU interface for perf queries · b46a33e2
      Tvrtko Ursulin 提交于
      From: Chris Wilson <chris@chris-wilson.co.uk>
      From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      From: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
      
      The first goal is to be able to measure GPU (and invidual ring) busyness
      without having to poll registers from userspace. (Which not only incurs
      holding the forcewake lock indefinitely, perturbing the system, but also
      runs the risk of hanging the machine.) As an alternative we can use the
      perf event counter interface to sample the ring registers periodically
      and send those results to userspace.
      
      Functionality we are exporting to userspace is via the existing perf PMU
      API and can be exercised via the existing tools. For example:
      
        perf stat -a -e i915/rcs0-busy/ -I 1000
      
      Will print the render engine busynnes once per second. All the performance
      counters can be enumerated (perf list) and have their unit of measure
      correctly reported in sysfs.
      
      v1-v2 (Chris Wilson):
      
      v2: Use a common timer for the ring sampling.
      
      v3: (Tvrtko Ursulin)
       * Decouple uAPI from i915 engine ids.
       * Complete uAPI defines.
       * Refactor some code to helpers for clarity.
       * Skip sampling disabled engines.
       * Expose counters in sysfs.
       * Pass in fake regs to avoid null ptr deref in perf core.
       * Convert to class/instance uAPI.
       * Use shared driver code for rc6 residency, power and frequency.
      
      v4: (Dmitry Rogozhkin)
       * Register PMU with .task_ctx_nr=perf_invalid_context
       * Expose cpumask for the PMU with the single CPU in the mask
       * Properly support pmu->stop(): it should call pmu->read()
       * Properly support pmu->del(): it should call stop(event, PERF_EF_UPDATE)
       * Introduce refcounting of event subscriptions.
       * Make pmu.busy_stats a refcounter to avoid busy stats going away
         with some deleted event.
       * Expose cpumask for i915 PMU to avoid multiple events creation of
         the same type followed by counter aggregation by perf-stat.
       * Track CPUs getting online/offline to migrate perf context. If (likely)
         cpumask will initially set CPU0, CONFIG_BOOTPARAM_HOTPLUG_CPU0 will be
         needed to see effect of CPU status tracking.
       * End result is that only global events are supported and perf stat
         works correctly.
       * Deny perf driver level sampling - it is prohibited for uncore PMU.
      
      v5: (Tvrtko Ursulin)
      
       * Don't hardcode number of engine samplers.
       * Rewrite event ref-counting for correctness and simplicity.
       * Store initial counter value when starting already enabled events
         to correctly report values to all listeners.
       * Fix RC6 residency readout.
       * Comments, GPL header.
      
      v6:
       * Add missing entry to v4 changelog.
       * Fix accounting in CPU hotplug case by copying the approach from
         arch/x86/events/intel/cstate.c. (Dmitry Rogozhkin)
      
      v7:
       * Log failure message only on failure.
       * Remove CPU hotplug notification state on unregister.
      
      v8:
       * Fix error unwind on failed registration.
       * Checkpatch cleanup.
      
      v9:
       * Drop the energy metric, it is available via intel_rapl_perf.
         (Ville Syrjälä)
       * Use HAS_RC6(p). (Chris Wilson)
       * Handle unsupported non-engine events. (Dmitry Rogozhkin)
       * Rebase for intel_rc6_residency_ns needing caller managed
         runtime pm.
       * Drop HAS_RC6 checks from the read callback since creating those
         events will be rejected at init time already.
       * Add counter units to sysfs so perf stat output is nicer.
       * Cleanup the attribute tables for brevity and readability.
      
      v10:
       * Fixed queued accounting.
      
      v11:
       * Move intel_engine_lookup_user to intel_engine_cs.c
       * Commit update. (Joonas Lahtinen)
      
      v12:
       * More accurate sampling. (Chris Wilson)
       * Store and report frequency in MHz for better usability from
         perf stat.
       * Removed metrics: queued, interrupts, rc6 counters.
       * Sample engine busyness based on seqno difference only
         for less MMIO (and forcewake) on all platforms. (Chris Wilson)
      
      v13:
       * Comment spelling, use mul_u32_u32 to work around potential GCC
         issue and somne code alignment changes. (Chris Wilson)
      
      v14:
       * Rebase.
      
      v15:
       * Rebase for RPS refactoring.
      
      v16:
       * Use the dynamic slot in the CPU hotplug state machine so that we are
         free to setup our state as multi-instance. Previously we were re-using
         the CPUHP_AP_PERF_X86_UNCORE_ONLINE slot which is neither used as
         multi-instance, nor owned by our driver to start with.
       * Register the CPU hotplug handlers after the PMU, otherwise the callback
         will get called before the PMU is initialized which can end up in
         perf_pmu_migrate_context with an un-initialized base.
       * Added workaround for a probable bug in cpuhp core.
      
      v17:
       * Remove workaround for the cpuhp bug.
      
      v18:
       * Rebase for drm_i915_gem_engine_class getting upstream before us.
      
      v19:
       * Rebase. (trivial)
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Signed-off-by: NDmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-2-tvrtko.ursulin@linux.intel.com
      b46a33e2
  23. 21 11月, 2017 4 次提交
  24. 16 11月, 2017 1 次提交
  25. 14 11月, 2017 4 次提交