1. 19 Jul, 2018 1 commit
    • drm/i915: Kill sink_crc for good · 5fd9df6a
      Rodrigo Vivi committed
      It was originally introduced following the VESA spec in order to validate PSR.
      
      However, we found so many issues around sink_crc that instead of helping PSR
      development it only brought another layer of trouble to the table.
      
      So, sink_crc has been a black hole for us in terms of time, effort and hope.
      
      The first problem is that the HW statement is clear: "Do not attempt to use
      aux communication with PSR enabled". So the main reason behind sink_crc is
      already compromised.
      
      For a while we hoped the aux-mutex could work around this problem on SKL+
      platforms, but that mutex was not reliable, not tested,
      and we shouldn't use it according to HW engineers.
      
      Also, neither the source nor the sink designed and implemented sink_crc to be
      used the way we are trying to use it here.
      
      Well, the sink side of things is also apparently not prepared for this
      case. Each panel that we tried seemed to behave differently with the same
      code and the same source.
      
      So, for all the time we lost trying to duct-tape all these different issues,
      I believe it is now time to move PSR to a more reliable validation.
      Maybe not a perfect one, as we dreamed for this sink_crc, but at least a more
      reliable one.
      
      Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180705192528.30515-1-rodrigo.vivi@intel.com
      5fd9df6a
  2. 18 Jul, 2018 2 commits
  3. 17 Jul, 2018 4 commits
  4. 16 Jul, 2018 3 commits
  5. 15 Jul, 2018 1 commit
  6. 14 Jul, 2018 4 commits
    • drm/i915/execlists: Drop clear_gtiir() on GPU reset · 60a94324
      Chris Wilson committed
      With the new CSB processing code, we are not vulnerable to delayed
      delivery of a pre-reset interrupt as we use the CSB status pointers in
      the HWSP to decide if we need to parse any CSB events and no longer need
      to wait for the first post-reset interrupt to be assured that the CSB
      mmio registers are valid.
      
      The new icl code to clear registers has a nasty lock inversion:
      [   57.409776] ======================================================
      [   57.409779] WARNING: possible circular locking dependency detected
      [   57.409783] 4.18.0-rc4-CI-CI_DII_1137+ #1 Tainted: G     U  W
      [   57.409785] ------------------------------------------------------
      [   57.409788] swapper/6/0 is trying to acquire lock:
      [   57.409790] 000000004f304ee5 (&engine->timeline.lock/1){-.-.}, at: execlists_submit_request+0x2b/0x1a0 [i915]
      [   57.409841]
                     but task is already holding lock:
      [   57.409844] 00000000aad89594 (&(&rq->lock)->rlock#2){-.-.}, at: notify_ring+0x2b2/0x480 [i915]
      [   57.409869]
                     which lock already depends on the new lock.
      
      [   57.409872]
                     the existing dependency chain (in reverse order) is:
      [   57.409876]
                     -> #2 (&(&rq->lock)->rlock#2){-.-.}:
      [   57.409900]        notify_ring+0x2b2/0x480 [i915]
      [   57.409922]        gen8_cs_irq_handler+0x39/0xa0 [i915]
      [   57.409943]        gen11_irq_handler+0x2f0/0x420 [i915]
      [   57.409949]        __handle_irq_event_percpu+0x42/0x370
      [   57.409952]        handle_irq_event_percpu+0x2b/0x70
      [   57.409956]        handle_irq_event+0x2f/0x50
      [   57.409959]        handle_edge_irq+0xe7/0x190
      [   57.409964]        handle_irq+0x67/0x160
      [   57.409967]        do_IRQ+0x5e/0x120
      [   57.409971]        ret_from_intr+0x0/0x1d
      [   57.409974]        _raw_spin_unlock_irqrestore+0x4e/0x60
      [   57.409979]        tasklet_action_common.isra.5+0x47/0xb0
      [   57.409982]        __do_softirq+0xd9/0x505
      [   57.409985]        irq_exit+0xa9/0xc0
      [   57.409988]        do_IRQ+0x9a/0x120
      [   57.409991]        ret_from_intr+0x0/0x1d
      [   57.409995]        cpuidle_enter_state+0xac/0x360
      [   57.409999]        do_idle+0x1f3/0x250
      [   57.410004]        cpu_startup_entry+0x6a/0x70
      [   57.410010]        start_secondary+0x19d/0x1f0
      [   57.410015]        secondary_startup_64+0xa5/0xb0
      [   57.410018]
                     -> #1 (&(&dev_priv->irq_lock)->rlock){-.-.}:
      [   57.410081]        clear_gtiir+0x30/0x200 [i915]
      [   57.410116]        execlists_reset+0x6e/0x2b0 [i915]
      [   57.410140]        i915_reset_engine+0x111/0x190 [i915]
      [   57.410165]        i915_handle_error+0x11a/0x4a0 [i915]
      [   57.410198]        i915_hangcheck_elapsed+0x378/0x530 [i915]
      [   57.410204]        process_one_work+0x248/0x6c0
      [   57.410207]        worker_thread+0x37/0x380
      [   57.410211]        kthread+0x119/0x130
      [   57.410215]        ret_from_fork+0x3a/0x50
      [   57.410217]
                     -> #0 (&engine->timeline.lock/1){-.-.}:
      [   57.410224]        _raw_spin_lock_irqsave+0x33/0x50
      [   57.410256]        execlists_submit_request+0x2b/0x1a0 [i915]
      [   57.410289]        submit_notify+0x8d/0x124 [i915]
      [   57.410314]        __i915_sw_fence_complete+0x81/0x250 [i915]
      [   57.410339]        dma_i915_sw_fence_wake+0xd/0x20 [i915]
      [   57.410344]        dma_fence_signal_locked+0x79/0x200
      [   57.410368]        notify_ring+0x2ba/0x480 [i915]
      [   57.410392]        gen8_cs_irq_handler+0x39/0xa0 [i915]
      [   57.410416]        gen11_irq_handler+0x2f0/0x420 [i915]
      [   57.410421]        __handle_irq_event_percpu+0x42/0x370
      [   57.410425]        handle_irq_event_percpu+0x2b/0x70
      [   57.410428]        handle_irq_event+0x2f/0x50
      [   57.410432]        handle_edge_irq+0xe7/0x190
      [   57.410436]        handle_irq+0x67/0x160
      [   57.410439]        do_IRQ+0x5e/0x120
      [   57.410445]        ret_from_intr+0x0/0x1d
      [   57.410449]        cpuidle_enter_state+0xac/0x360
      [   57.410453]        do_idle+0x1f3/0x250
      [   57.410456]        cpu_startup_entry+0x6a/0x70
      [   57.410460]        start_secondary+0x19d/0x1f0
      [   57.410464]        secondary_startup_64+0xa5/0xb0
      [   57.410466]
                     other info that might help us debug this:
      
      [   57.410471] Chain exists of:
                       &engine->timeline.lock/1 --> &(&dev_priv->irq_lock)->rlock --> &(&rq->lock)->rlock#2
      
      [   57.410481]  Possible unsafe locking scenario:
      
      [   57.410485]        CPU0                    CPU1
      [   57.410487]        ----                    ----
      [   57.410490]   lock(&(&rq->lock)->rlock#2);
      [   57.410494]                                lock(&(&dev_priv->irq_lock)->rlock);
      [   57.410498]                                lock(&(&rq->lock)->rlock#2);
      [   57.410503]   lock(&engine->timeline.lock/1);
      [   57.410506]
                      *** DEADLOCK ***
      
      [   57.410511] 4 locks held by swapper/6/0:
      [   57.410514]  #0: 0000000074575789 (&(&dev_priv->irq_lock)->rlock){-.-.}, at: gen11_irq_handler+0x8a/0x420 [i915]
      [   57.410542]  #1: 000000009b29b30e (rcu_read_lock){....}, at: notify_ring+0x1a/0x480 [i915]
      [   57.410573]  #2: 00000000aad89594 (&(&rq->lock)->rlock#2){-.-.}, at: notify_ring+0x2b2/0x480 [i915]
      [   57.410601]  #3: 000000009b29b30e (rcu_read_lock){....}, at: submit_notify+0x35/0x124 [i915]
      [   57.410635]
                     stack backtrace:
      [   57.410640] CPU: 6 PID: 0 Comm: swapper/6 Tainted: G     U  W         4.18.0-rc4-CI-CI_DII_1137+ #1
      [   57.410644] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP, BIOS ICLSFWR1.R00.2222.A01.1805300339 05/30/2018
      [   57.410650] Call Trace:
      [   57.410652]  <IRQ>
      [   57.410657]  dump_stack+0x67/0x9b
      [   57.410662]  print_circular_bug.isra.16+0x1c8/0x2b0
      [   57.410666]  __lock_acquire+0x1897/0x1b50
      [   57.410671]  ? lock_acquire+0xa6/0x210
      [   57.410674]  lock_acquire+0xa6/0x210
      [   57.410706]  ? execlists_submit_request+0x2b/0x1a0 [i915]
      [   57.410711]  _raw_spin_lock_irqsave+0x33/0x50
      [   57.410741]  ? execlists_submit_request+0x2b/0x1a0 [i915]
      [   57.410769]  execlists_submit_request+0x2b/0x1a0 [i915]
      [   57.410774]  ? _raw_spin_unlock_irqrestore+0x39/0x60
      [   57.410804]  submit_notify+0x8d/0x124 [i915]
      [   57.410828]  __i915_sw_fence_complete+0x81/0x250 [i915]
      [   57.410854]  dma_i915_sw_fence_wake+0xd/0x20 [i915]
      [   57.410858]  dma_fence_signal_locked+0x79/0x200
      [   57.410882]  notify_ring+0x2ba/0x480 [i915]
      [   57.410907]  gen8_cs_irq_handler+0x39/0xa0 [i915]
      [   57.410933]  gen11_irq_handler+0x2f0/0x420 [i915]
      [   57.410938]  __handle_irq_event_percpu+0x42/0x370
      [   57.410943]  handle_irq_event_percpu+0x2b/0x70
      [   57.410947]  handle_irq_event+0x2f/0x50
      [   57.410951]  handle_edge_irq+0xe7/0x190
      [   57.410955]  handle_irq+0x67/0x160
      [   57.410958]  do_IRQ+0x5e/0x120
      [   57.410962]  common_interrupt+0xf/0xf
      [   57.410965]  </IRQ>
      [   57.410969] RIP: 0010:cpuidle_enter_state+0xac/0x360
      [   57.410972] Code: 44 00 00 31 ff e8 84 93 91 ff 45 84 f6 74 12 9c 58 f6 c4 02 0f 85 31 02 00 00 31 ff e8 7d 30 98 ff e8 e8 0e 94 ff fb 4c 29 fb <48> ba cf f7 53 e3 a5 9b c4 20 48 89 d8 48 c1 fb 3f 48 f7 ea b8 ff
      [   57.411015] RSP: 0018:ffffc90000133e90 EFLAGS: 00000216 ORIG_RAX: ffffffffffffffdd
      [   57.411023] RAX: ffff8804ae748040 RBX: 000000000002a97d RCX: 0000000000000000
      [   57.411029] RDX: 0000000000000046 RSI: ffffffff82141263 RDI: ffffffff820f05a7
      [   57.411035] RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000000
      [   57.411041] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff8229f078
      [   57.411045] R13: ffff8804ab2adfa8 R14: 0000000000000000 R15: 0000000d5de092e3
      [   57.411052]  do_idle+0x1f3/0x250
      [   57.411055]  cpu_startup_entry+0x6a/0x70
      [   57.411059]  start_secondary+0x19d/0x1f0
      [   57.411064]  secondary_startup_64+0xa5/0xb0
      
      The easiest remedy is to remove the defunct code.
      
      Fixes: ff047a87 ("drm/i915/icl: Correctly clear lost ctx-switch interrupts across reset for Gen11")
      References: fd8526e5 ("drm/i915/execlists: Trust the CSB")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Michel Thierry <michel.thierry@intel.com>
      Cc: Oscar Mateo <oscar.mateo@intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
      Reviewed-by: Michel Thierry <michel.thierry@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180713203529.1973-3-chris@chris-wilson.co.uk
      60a94324
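The lockdep report above boils down to a cycle in a "held, then acquired" graph. The following is a minimal Python sketch of that idea, not the kernel's lockdep implementation. The edge list is read off the "Chain exists of" line in the report; the function name `find_cycle` and the shortened lock labels are illustrative assumptions.

```python
from collections import defaultdict


def find_cycle(edges):
    """Return one cycle in a lock-order graph as a list of lock names, or None.

    Each edge (held, acquired) means "acquired was taken while held was held".
    """
    graph = defaultdict(list)
    for held, acquired in edges:
        graph[held].append(acquired)

    WHITE, GREY, BLACK = 0, 1, 2       # unvisited, on the current path, done
    state = defaultdict(int)
    path = []

    def visit(node):
        state[node] = GREY
        path.append(node)
        for nxt in graph[node]:
            if state[nxt] == GREY:     # back edge: nxt is already on the path
                return path[path.index(nxt):] + [nxt]
            if state[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        state[node] = BLACK
        return None

    for node in list(graph):
        if state[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# Labels shortened from the report; the edges follow its dependency chain.
TIMELINE = "engine->timeline.lock"
IRQ = "dev_priv->irq_lock"
RQ = "rq->lock"

existing = [
    (TIMELINE, IRQ),   # chain entry #1: clear_gtiir() reached from execlists_reset()
    (IRQ, RQ),         # chain entry #2: notify_ring() under gen11_irq_handler()
]
new_edge = (RQ, TIMELINE)  # chain entry #0: execlists_submit_request() under rq->lock

print(find_cycle(existing))                # None
print(find_cycle(existing + [new_edge]))   # the three locks form a loop
print(find_cycle([(IRQ, RQ), new_edge]))   # None once the clear_gtiir() edge is gone
```

In this model, dropping the edge contributed by clear_gtiir() is enough to break the loop, which matches the remedy the commit message describes.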
    • drm/i915: Do not short-circuit tasklets during reset · 9701975e
      Chris Wilson committed
      Inside intel_engine_is_idle(), we flush the tasklet to ensure that it is
      being run in a timely fashion (ksoftirqd has taught us to expect the
      worst). However, if we are in the middle of reset, the HW may not yet be
      ready to execute the submission tasklet and so we must respect the
      disable flag.
      
      Fixes: dd0cf235 ("drm/i915: Speed up idle detection by kicking the tasklets")
      Testcase: igt/drv_selftest/live_hangcheck
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Reviewed-by: Michel Thierry <michel.thierry@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180713203529.1973-2-chris@chris-wilson.co.uk
      9701975e
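The rule in the commit above ("respect the disable flag") can be illustrated with a toy model. This is a Python sketch under stated assumptions, not the i915 code: the `Tasklet` class, its `disable_count` field and the `flush_if_enabled` helper are hypothetical names chosen for illustration. They only mimic the idea that a disabled tasklet must not be run directly by a caller who wants to hurry it along.

```python
class Tasklet:
    """Toy stand-in for a deferred-work item with a nestable disable count."""

    def __init__(self, func):
        self.func = func
        self.disable_count = 0   # > 0 means "do not run me right now"

    def disable(self):
        self.disable_count += 1

    def enable(self):
        if self.disable_count == 0:
            raise RuntimeError("enable() without a matching disable()")
        self.disable_count -= 1

    def is_enabled(self):
        return self.disable_count == 0


def flush_if_enabled(tasklet):
    """Run the tasklet body directly, but only if nobody has disabled it.

    Returns True if the body ran, False if the flush was skipped.
    """
    if not tasklet.is_enabled():
        return False             # e.g. a reset is in progress: leave it alone
    tasklet.func()
    return True


submitted = []
t = Tasklet(lambda: submitted.append("submit"))

t.disable()                      # reset begins: submission must not run
skipped = flush_if_enabled(t)    # the idle check tries to kick the tasklet
t.enable()                       # reset finished
ran = flush_if_enabled(t)        # now the direct flush is allowed
```

Here the first flush is skipped while the tasklet is disabled and the second one runs, so `submitted` ends up with a single entry.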
    • drm/i915/selftests: Include the start of each subtest in the GEM trace · 9dd1a981
      Chris Wilson committed
      Knowing the boundary of each subtest can be instrumental in digesting
      the voluminous trace output and finding the critical piece of
      information.
      
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Michel Thierry <michel.thierry@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180713203529.1973-1-chris@chris-wilson.co.uk
      9dd1a981
    • drm/i915/guc: Protect against no desc-pool on premature shutdown · 6710fcfc
      Chris Wilson committed
      Hopefully the final hack to get guc fault-injection happy before we can
      clean it up again, starting from a known good baseline...
      
      [  383.017530] BUG: unable to handle kernel NULL pointer dereference at 00000000000000a0
      [  383.017556] Oops: 0000 [#1] PREEMPT SMP PTI
      [  383.017566] CPU: 7 PID: 4725 Comm: drv_module_relo Tainted: G     U            4.18.0-rc4-CI-CI_DRM_4485+ #1
      [  383.017581] Hardware name: Micro-Star International Co., Ltd. MS-7B54/Z370M MORTAR (MS-7B54), BIOS 1.10 12/28/2017
      [  383.017664] RIP: 0010:guc_stage_desc_pool_destroy+0x17/0xe0 [i915]
      [  383.017674] Code: 59 a0 c6 05 02 59 18 00 01 e8 5e 01 c3 e0 eb b1 0f 1f 00 53 48 89 fb 48 81 c7 90 02 00 00 e8 60 64 45 e1 48 8b 83 80 02 00 00 <48> 8b 80 a0 00 00 00 48 8b 90 68 02 00 00 48 83 ea 01 48 81 fa ff
      [  383.017771] RSP: 0018:ffffc900004bbdd0 EFLAGS: 00010282
      [  383.017782] RAX: 0000000000000000 RBX: ffff88012ff41300 RCX: 0000000000000000
      [  383.017794] RDX: 0000000000000000 RSI: ffffc900004bbd80 RDI: 0000000000000000
      [  383.017805] RBP: ffff88012ff40000 R08: 00000000d876ee11 R09: 0000000000000000
      [  383.017817] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88012ff47770
      [  383.017828] R13: ffff88012ff40068 R14: ffff880264392ef8 R15: ffffffffa0639950
      [  383.017840] FS:  00007fb9c18c8980(0000) GS:ffff8802663c0000(0000) knlGS:0000000000000000
      [  383.017853] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  383.017864] CR2: 00000000000000a0 CR3: 00000001df6cc003 CR4: 00000000003606e0
      [  383.017875] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  383.017887] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  383.017898] Call Trace:
      [  383.017962]  intel_uc_fini+0x34/0xd0 [i915]
      [  383.018020]  i915_gem_fini+0x5c/0x100 [i915]
      [  383.018093]  i915_driver_unload+0xd2/0x110 [i915]
      [  383.018150]  i915_pci_remove+0x10/0x20 [i915]
      [  383.018165]  pci_device_remove+0x36/0xb0
      [  383.018179]  device_release_driver_internal+0x185/0x250
      [  383.018193]  driver_detach+0x35/0x70
      [  383.018205]  bus_remove_driver+0x53/0xd0
      [  383.018217]  pci_unregister_driver+0x25/0xa0
      [  383.018232]  __se_sys_delete_module+0x162/0x210
      [  383.018245]  ? do_syscall_64+0xd/0x190
      [  383.018257]  do_syscall_64+0x55/0x190
      [  383.018270]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [  383.018282] RIP: 0033:0x7fb9c0f7c1b7
      [  383.018290] Code: 73 01 c3 48 8b 0d d1 8c 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a1 8c 2c 00 f7 d8 64 89 01 48
      [  383.018408] RSP: 002b:00007fffa01c2aa8 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
      [  383.018425] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fb9c0f7c1b7
      [  383.018440] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 0000560b96856d48
      [  383.018454] RBP: 0000560b96856ce0 R08: 0000560b96856d4c R09: 00007fffa01c2ae8
      [  383.018468] R10: 00007fffa01c1aa4 R11: 0000000000000206 R12: 0000560b954f7470
      
      Testcase: igt/drv_module_reload/basic-reload-inject
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Michał Winiarski <michal.winiarski@intel.com>
      Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
      Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180713172658.14070-1-chris@chris-wilson.co.uk
      6710fcfc
  7. 13 Jul, 2018 20 commits
  8. 12 Jul, 2018 5 commits