1. 17 Sep, 2019 1 commit
  2. 13 Sep, 2019 1 commit
  3. 12 Sep, 2019 2 commits
  4. 10 Sep, 2019 2 commits
    • drm/i915/execlists: Ignore lost completion events · 198d2533
      Committed by Chris Wilson
      Icelake hit an issue where it missed reporting a completion event and
      instead jumped straight to an idle->active event (skipping over the
      active->idle and not even hitting the lite-restore preemption).
      
      661497511us : process_csb: rcs0 cs-irq head=11, tail=0
      661497512us : process_csb: rcs0 csb[0]: status=0x10008002:0x00000020 [lite-restore]
      661497512us : trace_ports: rcs0: preempted { 28cc8:11052, 0:0 }
      661497513us : trace_ports: rcs0: promote { 28cc8:11054, 0:0 }
      661497514us : __i915_request_submit: rcs0 fence 28cc8:11056, current 11052
      661497514us : __execlists_submission_tasklet: rcs0: queue_priority_hint:-2147483648, submit:yes
      661497515us : trace_ports: rcs0: submit { 28cc8:11056, 0:0 }
      661497530us : process_csb: rcs0 cs-irq head=0, tail=1
      661497530us : process_csb: rcs0 csb[1]: status=0x10008002:0x00000020 [lite-restore]
      661497531us : trace_ports: rcs0: preempted { 28cc8:11054!, 0:0 }
      661497535us : trace_ports: rcs0: promote { 28cc8:11056, 0:0 }
      661497540us : __i915_request_submit: rcs0 fence 28cc8:11058, current 11054
      661497544us : __execlists_submission_tasklet: rcs0: queue_priority_hint:-2147483648, submit:yes
      661497545us : trace_ports: rcs0: submit { 28cc8:11058, 0:0 }
      661497553us : process_csb: rcs0 cs-irq head=1, tail=2
      661497553us : process_csb: rcs0 csb[2]: status=0x10000001:0x00000000 [idle->active]
      661497574us : process_csb: process_csb:1538 GEM_BUG_ON(*execlists->active)
      
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190907084334.28952-1-chris@chris-wilson.co.uk
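The trace above ends in an assertion that the active ports are empty when an idle->active event arrives. A minimal sketch of the tolerant alternative follows, assuming a toy model of the ports: `toy_execlists`, `toy_flush_active` and `toy_promote` are hypothetical names for illustration and are not the i915 structures or functions. On a promote event it retires whatever is still marked active instead of asserting.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PORTS 2

/* Hypothetical toy model of the in-flight ports; not the i915 structures. */
struct toy_execlists {
	int active[MAX_PORTS + 1];   /* 0-terminated list of running request ids */
	int pending[MAX_PORTS + 1];  /* 0-terminated list awaiting promotion */
	int retired;                 /* how many requests were scheduled out */
};

/* Schedule out every request still marked active, returning the count. */
static int toy_flush_active(struct toy_execlists *el)
{
	int n = 0;

	for (size_t i = 0; i < MAX_PORTS && el->active[i]; i++) {
		el->active[i] = 0;
		n++;
	}
	el->retired += n;
	return n;
}

/*
 * Handle a "promote" event (idle->active or preemption). If the hardware
 * skipped the preceding active->idle event, 'active' is still populated;
 * rather than asserting that it is empty, retire it and carry on.
 */
static void toy_promote(struct toy_execlists *el)
{
	toy_flush_active(el);

	for (size_t i = 0; i <= MAX_PORTS; i++) {
		el->active[i] = el->pending[i];
		el->pending[i] = 0;
	}
	assert(el->pending[0] == 0);
}
```

With request 11056 still active and 11058 pending, as in the trace, calling `toy_promote` leaves 11058 active and counts 11056 as retired.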
    • drm/i915/execlists: Clear STOP_RING bit on reset · fa9a09f1
      Committed by Chris Wilson
      During reset, we try to ensure no forward progress of the CS prior to
      the reset by setting the STOP_RING bit in RING_MI_MODE. Since gen9, this
      register is context saved and so we end up in the odd situation where we
      save the STOP_RING bit and so try to stop the engine again immediately
      upon resume. This is quite unexpected and causes us to complain about an
      early CS completion event!
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111514
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190910080208.4223-1-chris@chris-wilson.co.uk
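As a rough illustration of the idea in this commit, the sketch below rewrites a saved register value so that restoring it clears the stop bit. It assumes a masked-register layout in which the upper 16 bits act as a write-enable mask for the lower 16, and it assumes STOP_RING sits at bit 8. Both the bit position and the helper name `toy_clear_stop_ring` are assumptions made for the example, not taken from the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for illustration: STOP_RING at bit 8, write-enable mask in bits 31:16. */
#define TOY_STOP_RING (UINT32_C(1) << 8)

/* Rewrite a saved masked-register value so that restoring it clears STOP_RING. */
static uint32_t toy_clear_stop_ring(uint32_t saved)
{
	saved &= ~TOY_STOP_RING;        /* value bit: 0 means "do not stop" */
	saved |= TOY_STOP_RING << 16;   /* mask bit: make the restore apply the value bit */
	return saved;
}
```

Other bits of the saved value are left untouched, so only the stop request is dropped when the context is resumed.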
  5. 09 Sep, 2019 1 commit
  6. 07 Sep, 2019 2 commits
  7. 31 Aug, 2019 2 commits
  8. 30 Aug, 2019 1 commit
  9. 28 Aug, 2019 2 commits
  10. 27 Aug, 2019 1 commit
  11. 22 Aug, 2019 1 commit
  12. 20 Aug, 2019 3 commits
  13. 17 Aug, 2019 2 commits
  14. 16 Aug, 2019 2 commits
  15. 15 Aug, 2019 3 commits
  16. 13 Aug, 2019 1 commit
  17. 12 Aug, 2019 1 commit
  18. 10 Aug, 2019 3 commits
  19. 09 Aug, 2019 3 commits
    • drm/i915/execlists: Backtrack along timeline · 6cd34b10
      Committed by Chris Wilson
      After a preempt-to-busy, we may find an active request that is caught
      between execution states. Walk back along the timeline instead of the
      execution list to be safe.
      
      [  106.417541] i915 0000:00:02.0: Resetting rcs0 for preemption time out
      [  106.417659] ==================================================================
      [  106.418041] BUG: KASAN: slab-out-of-bounds in __execlists_reset+0x2f2/0x440 [i915]
      [  106.418123] Read of size 8 at addr ffff888703506b30 by task swapper/1/0
      [  106.418194]
      [  106.418267] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G     U            5.3.0-rc3+ #5
      [  106.418344] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS BNKBL357.86A.0052.2017.0918.1346 09/18/2017
      [  106.418434] Call Trace:
      [  106.418508]  <IRQ>
      [  106.418585]  dump_stack+0x5b/0x90
      [  106.418941]  ? __execlists_reset+0x2f2/0x440 [i915]
      [  106.419022]  print_address_description+0x67/0x32d
      [  106.419376]  ? __execlists_reset+0x2f2/0x440 [i915]
      [  106.419731]  ? __execlists_reset+0x2f2/0x440 [i915]
      [  106.419810]  __kasan_report.cold.6+0x1a/0x3c
      [  106.419888]  ? __trace_bprintk+0xc0/0xd0
      [  106.420239]  ? __execlists_reset+0x2f2/0x440 [i915]
      [  106.420318]  check_memory_region+0x144/0x1c0
      [  106.420671]  __execlists_reset+0x2f2/0x440 [i915]
      [  106.421029]  execlists_reset+0x3d/0x50 [i915]
      [  106.421387]  intel_engine_reset+0x203/0x3a0 [i915]
      [  106.421744]  ? igt_reset_nop+0x2b0/0x2b0 [i915]
      [  106.421825]  ? _raw_spin_trylock_bh+0xe0/0xe0
      [  106.421901]  ? rcu_core+0x1b9/0x6a0
      [  106.422251]  preempt_reset+0x9a/0xf0 [i915]
      [  106.422333]  tasklet_action_common.isra.15+0xc0/0x1e0
      [  106.422685]  ? execlists_submit_request+0x200/0x200 [i915]
      [  106.422764]  __do_softirq+0x106/0x3cf
      [  106.422840]  irq_exit+0xdc/0xf0
      [  106.422914]  smp_apic_timer_interrupt+0x81/0x1c0
      [  106.422988]  apic_timer_interrupt+0xf/0x20
      [  106.423059]  </IRQ>
      [  106.423144] RIP: 0010:cpuidle_enter_state+0xc3/0x620
      [  106.423222] Code: 24 0f 1f 44 00 00 31 ff e8 da 87 9c ff 80 7c 24 10 00 74 12 9c 58 f6 c4 02 0f 85 33 05 00 00 31 ff e8 c1 77 a3 ff fb 45 85 e4 <0f> 89 bf 02 00 00 48 8d 7d 10 e8 4e 45 b9 ff c7 45 10 00 00 00 00
      [  106.423311] RSP: 0018:ffff88881c30fda8 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13
      [  106.423390] RAX: 0000000000000000 RBX: ffffffff825b4c80 RCX: ffffffff810c8a00
      [  106.423465] RDX: dffffc0000000000 RSI: 0000000039f89620 RDI: ffff88881f6b00a8
      [  106.423540] RBP: ffff88881f6b5bf8 R08: 0000000000000002 R09: 000000000002ed80
      [  106.423616] R10: 0000003fdd956146 R11: ffff88881c2d1e47 R12: 0000000000000008
      [  106.423691] R13: 0000000000000008 R14: ffffffff825b4f80 R15: ffffffff825b4fc0
      [  106.423772]  ? sched_idle_set_state+0x20/0x30
      [  106.423851]  ? cpuidle_enter_state+0xa6/0x620
      [  106.423874]  ? tick_nohz_idle_stop_tick+0x1d1/0x3f0
      [  106.423896]  cpuidle_enter+0x37/0x60
      [  106.423919]  do_idle+0x246/0x280
      [  106.423941]  ? arch_cpu_idle_exit+0x30/0x30
      [  106.423964]  ? __wake_up_common+0x46/0x240
      [  106.423986]  cpu_startup_entry+0x14/0x20
      [  106.424009]  start_secondary+0x1b0/0x200
      [  106.424031]  ? set_cpu_sibling_map+0x990/0x990
      [  106.424054]  secondary_startup_64+0xa4/0xb0
      [  106.424075]
      [  106.424096] Allocated by task 626:
      [  106.424119]  save_stack+0x19/0x80
      [  106.424143]  __kasan_kmalloc.constprop.7+0xc1/0xd0
      [  106.424165]  kmem_cache_alloc+0xb2/0x1d0
      [  106.424277]  i915_sched_lookup_priolist+0x1ab/0x320 [i915]
      [  106.424385]  execlists_submit_request+0x73/0x200 [i915]
      [  106.424498]  submit_notify+0x59/0x60 [i915]
      [  106.424600]  __i915_sw_fence_complete+0x9b/0x330 [i915]
      [  106.424713]  __i915_request_commit+0x4bf/0x570 [i915]
      [  106.424818]  intel_engine_pulse+0x213/0x310 [i915]
      [  106.424925]  context_close+0x22f/0x470 [i915]
      [  106.425033]  i915_gem_context_destroy_ioctl+0x7b/0xa0 [i915]
      [  106.425058]  drm_ioctl_kernel+0x131/0x170
      [  106.425081]  drm_ioctl+0x2d9/0x4f1
      [  106.425104]  do_vfs_ioctl+0x115/0x890
      [  106.425126]  ksys_ioctl+0x35/0x70
      [  106.425147]  __x64_sys_ioctl+0x38/0x40
      [  106.425169]  do_syscall_64+0x66/0x220
      [  106.425191]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [  106.425213]
      [  106.425234] Freed by task 0:
      [  106.425255] (stack is not available)
      [  106.425276]
      [  106.425297] The buggy address belongs to the object at ffff888703506a40
      [  106.425297]  which belongs to the cache i915_priolist of size 104
      [  106.425321] The buggy address is located 136 bytes to the right of
      [  106.425321]  104-byte region [ffff888703506a40, ffff888703506aa8)
      [  106.425345] The buggy address belongs to the page:
      [  106.425367] page:ffffea001c0d4180 refcount:1 mapcount:0 mapping:ffff88873e1cf740 index:0xffff888703506e40 compound_mapcount: 0
      [  106.425391] flags: 0x8000000000010200(slab|head)
      [  106.425415] raw: 8000000000010200 ffffea0020192b88 ffff8888174b5450 ffff88873e1cf740
      [  106.425439] raw: ffff888703506e40 000000000010000e 00000001ffffffff 0000000000000000
      [  106.425464] page dumped because: kasan: bad access detected
      [  106.425486]
      [  106.425506] Memory state around the buggy address:
      [  106.425528]  ffff888703506a00: fc fc fc fc fc fc fc fc 00 00 00 00 00 00 00 00
      [  106.425551]  ffff888703506a80: 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc
      [  106.425573] >ffff888703506b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [  106.425597]                                      ^
      [  106.425619]  ffff888703506b80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [  106.425642]  ffff888703506c00: fc fc fc fc fc fc fc fc 00 00 00 00 00 00 00 00
      [  106.425664] ==================================================================
      
      Fixes: 22b7a426 ("drm/i915/execlists: Preempt-to-busy")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190809073723.6593-1-chris@chris-wilson.co.uk
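The "walk back along the timeline" step can be pictured with a small sketch. It assumes each request keeps a pointer to the previous request on its own timeline. `toy_request` and `toy_active_request` are hypothetical names, and the real driver uses its own list helpers rather than a bare `prev` pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy request: a node on a per-context timeline, oldest first. */
struct toy_request {
	struct toy_request *prev;  /* previous request on the same timeline */
	unsigned int seqno;
	bool completed;
};

/*
 * Starting from the request believed to be active, walk back along the
 * timeline and return the oldest request that has not completed yet.
 */
static struct toy_request *toy_active_request(struct toy_request *rq)
{
	struct toy_request *active = rq;

	for (struct toy_request *p = rq->prev; p && !p->completed; p = p->prev)
		active = p;

	return active;
}
```

Because the walk only follows the request's own timeline, it never steps onto a list head or an unrelated object, which is the kind of out-of-bounds read the KASAN report above describes.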
    • drm/i915: extract i915_perf.h from i915_drv.h · db94e9f1
      Committed by Jani Nikula
      It used to be handy that we only had a couple of headers, but over time
      i915_drv.h has become unwieldy. Extract declarations to a separate
      header file corresponding to the implementation module, clarifying the
      modularity of the driver.
      
      Ensure the new header is self-contained, and do so with minimal further
      includes, using forward declarations as needed. Include the new header
      only where needed, and sort the modified include directives while at it
      and as needed.
      
      No functional changes.
      
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Jani Nikula <jani.nikula@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/d7826e365695f691a3ac69a69ff6f2bbdb62700d.1565271681.git.jani.nikula@intel.com
    • drm/i915: Defer final intel_wakeref_put to process context · c7302f20
      Committed by Chris Wilson
      As we need to acquire a mutex to serialise the final
      intel_wakeref_put, we need to ensure that we are in process context at
      that time. However, we want to allow operation on the intel_wakeref from
      inside timer and other hardirq context, which means that we need to defer
      that final put to a workqueue.
      
      Inside the final wakeref puts, we are safe to operate in any context, as
      we are simply marking up the HW and state tracking for the potential
      sleep. It's only the serialisation with the potentially sleeping get
      that requires careful wait avoidance. This allows us to retain the
      immediate processing as before (we only need to sleep over the same
      races as the current mutex_lock).
      
      v2: Add a selftest to ensure we exercise the code while lockdep watches.
      v3: That test was extremely loud and complained about many things!
      v4: Not a whale!
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111295
      References: https://bugs.freedesktop.org/show_bug.cgi?id=111245
      References: https://bugs.freedesktop.org/show_bug.cgi?id=111256
      Fixes: 18398904 ("drm/i915: Only recover active engines")
      Fixes: 51fbd8de ("drm/i915/pmu: Atomically acquire the gt_pm wakeref")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190808202758.10453-1-chris@chris-wilson.co.uk
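The deferral pattern described in this commit can be approximated in userspace C11. This is only a sketch under stated assumptions: `toy_wakeref`, `toy_wakeref_put` and `toy_wakeref_work` are hypothetical names, a boolean flag stands in for a queued work item, and no real mutex or workqueue is involved. Every put except the last is a lock-free decrement, and the last one is handed to a worker that would run in process context.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical toy wakeref; the names are illustrative, not the i915 API. */
struct toy_wakeref {
	atomic_int count;      /* number of holders */
	bool work_queued;      /* stands in for a queued work item */
	bool hw_awake;
};

/* Decrement unless the count is 1; returns false if this is the final put. */
static bool toy_dec_unless_last(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 1) {
		if (atomic_compare_exchange_weak(v, &old, old - 1))
			return true;
	}
	return false;
}

/* Runs later in process context, where taking a mutex would be allowed. */
static void toy_wakeref_work(struct toy_wakeref *wf)
{
	wf->work_queued = false;
	if (atomic_fetch_sub(&wf->count, 1) == 1)
		wf->hw_awake = false;
}

/* Safe from any context: the final put is handed to the worker. */
static void toy_wakeref_put(struct toy_wakeref *wf)
{
	if (!toy_dec_unless_last(&wf->count))
		wf->work_queued = true;
}
```

The count stays at 1 until the worker runs, so a concurrent get that arrives in between simply raises it again and the worker's decrement no longer powers anything down.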
  20. 07 Aug, 2019 1 commit
  21. 02 Aug, 2019 1 commit
  22. 01 Aug, 2019 1 commit
    • drm/i915/execlists: Always clear pending&inflight requests on reset · 10e36489
      Committed by Chris Wilson
      If we skip the reset as we found the engine inactive at the time of the
      reset, we still need to clear the residual inflight & pending request
      bookkeeping to reflect the current state of HW.
      
      Otherwise, we may end up stuck in a loop like:
      
      <7> [416.490346] hangcheck rcs0
      <7> [416.490371] hangcheck 	Awake? 1
      <7> [416.490376] hangcheck 	Hangcheck: 8003 ms ago
      <7> [416.490380] hangcheck 	Reset count: 0 (global 0)
      <7> [416.490383] hangcheck 	Requests:
      <7> [416.491210] hangcheck 	RING_START: 0x0017b000
      <7> [416.491983] hangcheck 	RING_HEAD:  0x00000048
      <7> [416.491992] hangcheck 	RING_TAIL:  0x00000048
      <7> [416.492006] hangcheck 	RING_CTL:   0x00000000
      <7> [416.492037] hangcheck 	RING_MODE:  0x00000200 [idle]
      <7> [416.492044] hangcheck 	RING_IMR: 00000000
      <7> [416.492809] hangcheck 	ACTHD:  0x00000000_9ca00048
      <7> [416.492824] hangcheck 	BBADDR: 0x00000000_00001004
      <7> [416.492838] hangcheck 	DMA_FADDR: 0x00000000_00000000
      <7> [416.492845] hangcheck 	IPEIR: 0x00000000
      <7> [416.492852] hangcheck 	IPEHR: 0x00000000
      <7> [416.492863] hangcheck 	Execlist status: 0x00018001 00000000, entries 12
      <7> [416.492869] hangcheck 	Execlist CSB read 1, write 1, tasklet queued? no (enabled)
      <7> [416.492938] hangcheck 		Pending[0] ring:{start:0017b000, hwsp:fedf9000, seqno:00016fd6}, rq:  20ffa:16fd6!+  prio=-4094 @ 8307ms: signaled
      <7> [416.492972] hangcheck 		Queue priority hint: -4093
      <7> [416.492979] hangcheck 		Q  20ffa:16fd8-  prio=-4093 @ 8307ms: [i915]
      <7> [416.492985] hangcheck 		Q  20ffa:16fda  prio=-4094 @ 8307ms: [i915]
      <7> [416.492990] hangcheck 		Q  20ffa:16fdc  prio=-4094 @ 8307ms: [i915]
      <7> [416.492996] hangcheck 		Q  20ffa:16fde  prio=-4094 @ 8307ms: [i915]
      <7> [416.493001] hangcheck 		Q  20ffa:16fe0  prio=-4094 @ 8307ms: [i915]
      <7> [416.493007] hangcheck 		Q  20ffa:16fe2  prio=-4094 @ 8307ms: [i915]
      <7> [416.493013] hangcheck 		Q  20ffa:16fe4  prio=-4094 @ 8307ms: [i915]
      <7> [416.493021] hangcheck 		...skipping 21 queued requests...
      <7> [416.493027] hangcheck 		Q  20ffa:17010  prio=-4094 @ 8307ms: [i915]
      <7> [416.493081] hangcheck HWSP:
      <7> [416.493089] hangcheck [0000] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      <7> [416.493094] hangcheck *
      <7> [416.493100] hangcheck [0040] 10008002 00000000 10000018 00000000 10000018 00000000 10000001 00000000
      <7> [416.493106] hangcheck [0060] 10000018 00000000 10000001 00000000 10000018 00000000 10000001 00000000
      <7> [416.493111] hangcheck *
      <7> [416.493117] hangcheck [00a0] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000001
      <7> [416.493123] hangcheck [00c0] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      <7> [416.493127] hangcheck *
      <7> [416.493132] hangcheck Idle? no
      <6> [416.512124] i915 0000:00:02.0: GPU HANG: ecode 11:0:0x00000000, hang on rcs0
      <6> [416.512205] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
      <6> [416.512207] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
      <6> [416.512208] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
      <6> [416.512210] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
      <6> [416.512212] [drm] GPU crash dump saved to /sys/class/drm/card0/error
      <5> [416.513602] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
      <7> [424.489258] hangcheck rcs0
      <7> [424.489263] hangcheck 	Awake? 1
      <7> [424.489267] hangcheck 	Hangcheck: 5954 ms ago
      <7> [424.489271] hangcheck 	Reset count: 1 (global 0)
      <7> [424.489274] hangcheck 	Requests:
      <7> [424.490128] hangcheck 	RING_START: 0x00000000
      <7> [424.490870] hangcheck 	RING_HEAD:  0x00000000
      <7> [424.490877] hangcheck 	RING_TAIL:  0x00000000
      <7> [424.490887] hangcheck 	RING_CTL:   0x00000000
      <7> [424.490897] hangcheck 	RING_MODE:  0x00000200 [idle]
      <7> [424.490904] hangcheck 	RING_IMR: 00000000
      <7> [424.490917] hangcheck 	ACTHD:  0x00000000_00000000
      <7> [424.490930] hangcheck 	BBADDR: 0x00000000_00000000
      <7> [424.490943] hangcheck 	DMA_FADDR: 0x00000000_00000000
      <7> [424.490950] hangcheck 	IPEIR: 0x00000000
      <7> [424.490956] hangcheck 	IPEHR: 0x00000000
      <7> [424.490968] hangcheck 	Execlist status: 0x00000001 00000000, entries 12
      <7> [424.490972] hangcheck 	Execlist CSB read 11, write 11, tasklet queued? no (enabled)
      <7> [424.490983] hangcheck 		Pending[0] ring:{start:0017b000, hwsp:fedf9000, seqno:00016fd6}, rq:  20ffa:16fd6!+  prio=-4094 @ 16305ms: signaled
      <7> [424.490989] hangcheck 		Queue priority hint: -4093
      <7> [424.490996] hangcheck 		Q  20ffa:16fd8-  prio=-4093 @ 16305ms: [i915]
      <7> [424.491001] hangcheck 		Q  20ffa:16fda  prio=-4094 @ 16305ms: [i915]
      <7> [424.491006] hangcheck 		Q  20ffa:16fdc  prio=-4094 @ 16305ms: [i915]
      <7> [424.491011] hangcheck 		Q  20ffa:16fde  prio=-4094 @ 16305ms: [i915]
      <7> [424.491016] hangcheck 		Q  20ffa:16fe0  prio=-4094 @ 16305ms: [i915]
      <7> [424.491022] hangcheck 		Q  20ffa:16fe2  prio=-4094 @ 16305ms: [i915]
      <7> [424.491048] hangcheck 		Q  20ffa:16fe4  prio=-4094 @ 16305ms: [i915]
      <7> [424.491057] hangcheck 		...skipping 21 queued requests...
      <7> [424.491063] hangcheck 		Q  20ffa:17010  prio=-4094 @ 16305ms: [i915]
      <7> [424.491095] hangcheck HWSP:
      <7> [424.491102] hangcheck [0000] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      <7> [424.491106] hangcheck *
      <7> [424.491113] hangcheck [0040] 10008002 00000000 10000018 00000000 10000018 00000000 10000001 00000000
      <7> [424.491118] hangcheck [0060] 10000018 00000000 10000001 00000000 10000018 00000000 10000001 00000000
      <7> [424.491122] hangcheck *
      <7> [424.491127] hangcheck [00a0] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 0000000b
      <7> [424.491133] hangcheck [00c0] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      <7> [424.491136] hangcheck *
      <7> [424.491141] hangcheck Idle? no
      <5> [424.491834] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
      
      Because the pending array was not cleared on reset, it persists
      indefinitely.
      
      Fixes: fff8102a ("drm/i915/execlists: Process interrupted context on reset")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Andi Shyti <andi.shyti@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190730133035.1977-2-chris@chris-wilson.co.uk
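The rule this commit describes, clearing the software bookkeeping even when the hardware reset itself is skipped, might look roughly like the following. It is a toy sketch: `toy_engine`, `toy_cancel_ports` and `toy_reset` are hypothetical names, and the arrays only mimic the pending and inflight tracking mentioned in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_PORTS 2

/* Hypothetical toy bookkeeping; 0 marks an empty slot. */
struct toy_engine {
	unsigned int inflight[TOY_PORTS + 1];
	unsigned int pending[TOY_PORTS + 1];
	unsigned int reset_count;
};

/* Forget everything we believed was submitted to the hardware. */
static void toy_cancel_ports(struct toy_engine *e)
{
	memset(e->inflight, 0, sizeof(e->inflight));
	memset(e->pending, 0, sizeof(e->pending));
}

/*
 * Reset path: the hardware reset may be skipped when the engine is found
 * idle, but the software bookkeeping is cleared either way.
 */
static void toy_reset(struct toy_engine *e, bool engine_was_active)
{
	if (engine_was_active)
		e->reset_count++;   /* stands in for the real reset work */

	toy_cancel_ports(e);
}
```

In the hangcheck dumps above, the `Pending[0]` entry for request 16fd6 survives the first reset unchanged; in this model the same entry would be zeroed on the idle path as well.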
  23. 30 Jul, 2019 2 commits
  24. 29 Jul, 2019 1 commit