1. 16 7月, 2018 3 次提交
  2. 15 7月, 2018 1 次提交
  3. 14 7月, 2018 4 次提交
    • C
      drm/i915/execlists: Drop clear_gtiir() on GPU reset · 60a94324
      Chris Wilson 提交于
      With the new CSB processing code, we are not vulnerable to delayed
      delivery of a pre-reset interrupt as we use the CSB status pointers in
      the HWSP to decide if we need to parse any CSB events and no longer need
      to wait for the first post-reset interrupt to be assured that the CSB
      mmio registers are valid.
      
      The new icl code to clear registers has a nasty lock inversion:
      [   57.409776] ======================================================
      [   57.409779] WARNING: possible circular locking dependency detected
      [   57.409783] 4.18.0-rc4-CI-CI_DII_1137+ #1 Tainted: G     U  W
      [   57.409785] ------------------------------------------------------
      [   57.409788] swapper/6/0 is trying to acquire lock:
      [   57.409790] 000000004f304ee5 (&engine->timeline.lock/1){-.-.}, at: execlists_submit_request+0x2b/0x1a0 [i915]
      [   57.409841]
                     but task is already holding lock:
      [   57.409844] 00000000aad89594 (&(&rq->lock)->rlock#2){-.-.}, at: notify_ring+0x2b2/0x480 [i915]
      [   57.409869]
                     which lock already depends on the new lock.
      
      [   57.409872]
                     the existing dependency chain (in reverse order) is:
      [   57.409876]
                     -> #2 (&(&rq->lock)->rlock#2){-.-.}:
      [   57.409900]        notify_ring+0x2b2/0x480 [i915]
      [   57.409922]        gen8_cs_irq_handler+0x39/0xa0 [i915]
      [   57.409943]        gen11_irq_handler+0x2f0/0x420 [i915]
      [   57.409949]        __handle_irq_event_percpu+0x42/0x370
      [   57.409952]        handle_irq_event_percpu+0x2b/0x70
      [   57.409956]        handle_irq_event+0x2f/0x50
      [   57.409959]        handle_edge_irq+0xe7/0x190
      [   57.409964]        handle_irq+0x67/0x160
      [   57.409967]        do_IRQ+0x5e/0x120
      [   57.409971]        ret_from_intr+0x0/0x1d
      [   57.409974]        _raw_spin_unlock_irqrestore+0x4e/0x60
      [   57.409979]        tasklet_action_common.isra.5+0x47/0xb0
      [   57.409982]        __do_softirq+0xd9/0x505
      [   57.409985]        irq_exit+0xa9/0xc0
      [   57.409988]        do_IRQ+0x9a/0x120
      [   57.409991]        ret_from_intr+0x0/0x1d
      [   57.409995]        cpuidle_enter_state+0xac/0x360
      [   57.409999]        do_idle+0x1f3/0x250
      [   57.410004]        cpu_startup_entry+0x6a/0x70
      [   57.410010]        start_secondary+0x19d/0x1f0
      [   57.410015]        secondary_startup_64+0xa5/0xb0
      [   57.410018]
                     -> #1 (&(&dev_priv->irq_lock)->rlock){-.-.}:
      [   57.410081]        clear_gtiir+0x30/0x200 [i915]
      [   57.410116]        execlists_reset+0x6e/0x2b0 [i915]
      [   57.410140]        i915_reset_engine+0x111/0x190 [i915]
      [   57.410165]        i915_handle_error+0x11a/0x4a0 [i915]
      [   57.410198]        i915_hangcheck_elapsed+0x378/0x530 [i915]
      [   57.410204]        process_one_work+0x248/0x6c0
      [   57.410207]        worker_thread+0x37/0x380
      [   57.410211]        kthread+0x119/0x130
      [   57.410215]        ret_from_fork+0x3a/0x50
      [   57.410217]
                     -> #0 (&engine->timeline.lock/1){-.-.}:
      [   57.410224]        _raw_spin_lock_irqsave+0x33/0x50
      [   57.410256]        execlists_submit_request+0x2b/0x1a0 [i915]
      [   57.410289]        submit_notify+0x8d/0x124 [i915]
      [   57.410314]        __i915_sw_fence_complete+0x81/0x250 [i915]
      [   57.410339]        dma_i915_sw_fence_wake+0xd/0x20 [i915]
      [   57.410344]        dma_fence_signal_locked+0x79/0x200
      [   57.410368]        notify_ring+0x2ba/0x480 [i915]
      [   57.410392]        gen8_cs_irq_handler+0x39/0xa0 [i915]
      [   57.410416]        gen11_irq_handler+0x2f0/0x420 [i915]
      [   57.410421]        __handle_irq_event_percpu+0x42/0x370
      [   57.410425]        handle_irq_event_percpu+0x2b/0x70
      [   57.410428]        handle_irq_event+0x2f/0x50
      [   57.410432]        handle_edge_irq+0xe7/0x190
      [   57.410436]        handle_irq+0x67/0x160
      [   57.410439]        do_IRQ+0x5e/0x120
      [   57.410445]        ret_from_intr+0x0/0x1d
      [   57.410449]        cpuidle_enter_state+0xac/0x360
      [   57.410453]        do_idle+0x1f3/0x250
      [   57.410456]        cpu_startup_entry+0x6a/0x70
      [   57.410460]        start_secondary+0x19d/0x1f0
      [   57.410464]        secondary_startup_64+0xa5/0xb0
      [   57.410466]
                     other info that might help us debug this:
      
      [   57.410471] Chain exists of:
                       &engine->timeline.lock/1 --> &(&dev_priv->irq_lock)->rlock --> &(&rq->lock)->rlock#2
      
      [   57.410481]  Possible unsafe locking scenario:
      
      [   57.410485]        CPU0                    CPU1
      [   57.410487]        ----                    ----
      [   57.410490]   lock(&(&rq->lock)->rlock#2);
      [   57.410494]                                lock(&(&dev_priv->irq_lock)->rlock);
      [   57.410498]                                lock(&(&rq->lock)->rlock#2);
      [   57.410503]   lock(&engine->timeline.lock/1);
      [   57.410506]
                      *** DEADLOCK ***
      
      [   57.410511] 4 locks held by swapper/6/0:
      [   57.410514]  #0: 0000000074575789 (&(&dev_priv->irq_lock)->rlock){-.-.}, at: gen11_irq_handler+0x8a/0x420 [i915]
      [   57.410542]  #1: 000000009b29b30e (rcu_read_lock){....}, at: notify_ring+0x1a/0x480 [i915]
      [   57.410573]  #2: 00000000aad89594 (&(&rq->lock)->rlock#2){-.-.}, at: notify_ring+0x2b2/0x480 [i915]
      [   57.410601]  #3: 000000009b29b30e (rcu_read_lock){....}, at: submit_notify+0x35/0x124 [i915]
      [   57.410635]
                     stack backtrace:
      [   57.410640] CPU: 6 PID: 0 Comm: swapper/6 Tainted: G     U  W         4.18.0-rc4-CI-CI_DII_1137+ #1
      [   57.410644] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP, BIOS ICLSFWR1.R00.2222.A01.1805300339 05/30/2018
      [   57.410650] Call Trace:
      [   57.410652]  <IRQ>
      [   57.410657]  dump_stack+0x67/0x9b
      [   57.410662]  print_circular_bug.isra.16+0x1c8/0x2b0
      [   57.410666]  __lock_acquire+0x1897/0x1b50
      [   57.410671]  ? lock_acquire+0xa6/0x210
      [   57.410674]  lock_acquire+0xa6/0x210
      [   57.410706]  ? execlists_submit_request+0x2b/0x1a0 [i915]
      [   57.410711]  _raw_spin_lock_irqsave+0x33/0x50
      [   57.410741]  ? execlists_submit_request+0x2b/0x1a0 [i915]
      [   57.410769]  execlists_submit_request+0x2b/0x1a0 [i915]
      [   57.410774]  ? _raw_spin_unlock_irqrestore+0x39/0x60
      [   57.410804]  submit_notify+0x8d/0x124 [i915]
      [   57.410828]  __i915_sw_fence_complete+0x81/0x250 [i915]
      [   57.410854]  dma_i915_sw_fence_wake+0xd/0x20 [i915]
      [   57.410858]  dma_fence_signal_locked+0x79/0x200
      [   57.410882]  notify_ring+0x2ba/0x480 [i915]
      [   57.410907]  gen8_cs_irq_handler+0x39/0xa0 [i915]
      [   57.410933]  gen11_irq_handler+0x2f0/0x420 [i915]
      [   57.410938]  __handle_irq_event_percpu+0x42/0x370
      [   57.410943]  handle_irq_event_percpu+0x2b/0x70
      [   57.410947]  handle_irq_event+0x2f/0x50
      [   57.410951]  handle_edge_irq+0xe7/0x190
      [   57.410955]  handle_irq+0x67/0x160
      [   57.410958]  do_IRQ+0x5e/0x120
      [   57.410962]  common_interrupt+0xf/0xf
      [   57.410965]  </IRQ>
      [   57.410969] RIP: 0010:cpuidle_enter_state+0xac/0x360
      [   57.410972] Code: 44 00 00 31 ff e8 84 93 91 ff 45 84 f6 74 12 9c 58 f6 c4 02 0f 85 31 02 00 00 31 ff e8 7d 30 98 ff e8 e8 0e 94 ff fb 4c 29 fb <48> ba cf f7 53 e3 a5 9b c4 20 48 89 d8 48 c1 fb 3f 48 f7 ea b8 ff
      [   57.411015] RSP: 0018:ffffc90000133e90 EFLAGS: 00000216 ORIG_RAX: ffffffffffffffdd
      [   57.411023] RAX: ffff8804ae748040 RBX: 000000000002a97d RCX: 0000000000000000
      [   57.411029] RDX: 0000000000000046 RSI: ffffffff82141263 RDI: ffffffff820f05a7
      [   57.411035] RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000000
      [   57.411041] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff8229f078
      [   57.411045] R13: ffff8804ab2adfa8 R14: 0000000000000000 R15: 0000000d5de092e3
      [   57.411052]  do_idle+0x1f3/0x250
      [   57.411055]  cpu_startup_entry+0x6a/0x70
      [   57.411059]  start_secondary+0x19d/0x1f0
      [   57.411064]  secondary_startup_64+0xa5/0xb0
      
      The easiest remedy is to remove the defunct code.
      
      Fixes: ff047a87 ("drm/i915/icl: Correctly clear lost ctx-switch interrupts across reset for Gen11")
      References: fd8526e5 ("drm/i915/execlists: Trust the CSB")
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Michel Thierry <michel.thierry@intel.com>
      Cc: Oscar Mateo <oscar.mateo@intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
      Reviewed-by: NMichel Thierry <michel.thierry@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180713203529.1973-3-chris@chris-wilson.co.uk
      60a94324
    • C
      drm/i915: Do not short-circuit tasklets during reset · 9701975e
      Chris Wilson 提交于
      Inside intel_engine_is_idle(), we flush the tasklet to ensure that is
      being run in a timely fashion (ksoftirqd has taught us to expect the
      worst). However, if we are in the middle of reset, the HW may not yet be
      ready to execute the submission tasklet and so we must respect the
      disable flag.
      
      Fixes: dd0cf235 ("drm/i915: Speed up idle detection by kicking the tasklets")
      Testcase: igt/drv_selftest/live_hangcheck
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Reviewed-by: NMichel Thierry <michel.thierry@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180713203529.1973-2-chris@chris-wilson.co.uk
      9701975e
    • C
      drm/i915/selftests: Include the start of each subtest in the GEM trace · 9dd1a981
      Chris Wilson 提交于
      Knowing the boundary of each subtest can be instrumental in digesting
      the voluminous trace output and finding the critical piece of
      information.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NMichel Thierry <michel.thierry@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180713203529.1973-1-chris@chris-wilson.co.uk
      9dd1a981
    • C
      drm/i915/guc: Protect against no desc-pool on premature shutdown · 6710fcfc
      Chris Wilson 提交于
      Hopefully the final hack to get guc fault-injection happy before we can
      clean it up again, starting from a known good baseline...
      
      [  383.017530] BUG: unable to handle kernel NULL pointer dereference at 00000000000000a0
      [  383.017556] Oops: 0000 [#1] PREEMPT SMP PTI
      [  383.017566] CPU: 7 PID: 4725 Comm: drv_module_relo Tainted: G     U            4.18.0-rc4-CI-CI_DRM_4485+ #1
      [  383.017581] Hardware name: Micro-Star International Co., Ltd. MS-7B54/Z370M MORTAR (MS-7B54), BIOS 1.10 12/28/2017
      [  383.017664] RIP: 0010:guc_stage_desc_pool_destroy+0x17/0xe0 [i915]
      [  383.017674] Code: 59 a0 c6 05 02 59 18 00 01 e8 5e 01 c3 e0 eb b1 0f 1f 00 53 48 89 fb 48 81 c7 90 02 00 00 e8 60 64 45 e1 48 8b 83 80 02 00 00 <48> 8b 80 a0 00 00 00 48 8b 90 68 02 00 00 48 83 ea 01 48 81 fa ff
      [  383.017771] RSP: 0018:ffffc900004bbdd0 EFLAGS: 00010282
      [  383.017782] RAX: 0000000000000000 RBX: ffff88012ff41300 RCX: 0000000000000000
      [  383.017794] RDX: 0000000000000000 RSI: ffffc900004bbd80 RDI: 0000000000000000
      [  383.017805] RBP: ffff88012ff40000 R08: 00000000d876ee11 R09: 0000000000000000
      [  383.017817] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88012ff47770
      [  383.017828] R13: ffff88012ff40068 R14: ffff880264392ef8 R15: ffffffffa0639950
      [  383.017840] FS:  00007fb9c18c8980(0000) GS:ffff8802663c0000(0000) knlGS:0000000000000000
      [  383.017853] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  383.017864] CR2: 00000000000000a0 CR3: 00000001df6cc003 CR4: 00000000003606e0
      [  383.017875] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  383.017887] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  383.017898] Call Trace:
      [  383.017962]  intel_uc_fini+0x34/0xd0 [i915]
      [  383.018020]  i915_gem_fini+0x5c/0x100 [i915]
      [  383.018093]  i915_driver_unload+0xd2/0x110 [i915]
      [  383.018150]  i915_pci_remove+0x10/0x20 [i915]
      [  383.018165]  pci_device_remove+0x36/0xb0
      [  383.018179]  device_release_driver_internal+0x185/0x250
      [  383.018193]  driver_detach+0x35/0x70
      [  383.018205]  bus_remove_driver+0x53/0xd0
      [  383.018217]  pci_unregister_driver+0x25/0xa0
      [  383.018232]  __se_sys_delete_module+0x162/0x210
      [  383.018245]  ? do_syscall_64+0xd/0x190
      [  383.018257]  do_syscall_64+0x55/0x190
      [  383.018270]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [  383.018282] RIP: 0033:0x7fb9c0f7c1b7
      [  383.018290] Code: 73 01 c3 48 8b 0d d1 8c 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a1 8c 2c 00 f7 d8 64 89 01 48
      [  383.018408] RSP: 002b:00007fffa01c2aa8 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
      [  383.018425] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fb9c0f7c1b7
      [  383.018440] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 0000560b96856d48
      [  383.018454] RBP: 0000560b96856ce0 R08: 0000560b96856d4c R09: 00007fffa01c2ae8
      [  383.018468] R10: 00007fffa01c1aa4 R11: 0000000000000206 R12: 0000560b954f7470
      
      Testcase: igt/drv_module_reload/basic-reload-inject
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Michał Winiarski <michal.winiarski@intel.com>
      Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
      Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Reviewed-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
      Reviewed-by: NMichal Wajdeczko <michal.wajdeczko@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180713172658.14070-1-chris@chris-wilson.co.uk
      6710fcfc
  4. 13 7月, 2018 20 次提交
  5. 12 7月, 2018 8 次提交
  6. 11 7月, 2018 4 次提交