1. 13 Dec, 2017 3 commits
    • drm/i915: Don't check #active_requests from i915_gem_wait_for_idle() · d7dc4131
      Chris Wilson committed
      i915_gem_wait_for_idle() is called from inside the shrinker, to ensure
      that we drain the last resources from the GPU in dire circumstances (OOM).
      As we may allocate whilst building a request, it is then possible to hit
      the shrinker with a request under construction, and so we must account
      for the incomplete request whilst waiting. In particular, we
      preincrement (in reserve_engine) the i915->gt.active_requests counter
      and mark the GPU as busy; therefore we cannot use that counter for
      short-circuiting the wait-for-idle.
      
      [  950.859024] GEM_BUG_ON(i915->gt.active_requests)
      [  950.859041] WARNING: CPU: 2 PID: 2178 at drivers/gpu/drm/i915/i915_gem.c:3615 i915_gem_wait_for_idle.part.56+0x166/0x4e0
      [  950.859041] Modules linked in: ccm tun fuse nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_mangle iptable_security iptable_raw arc4 iwldvm mac80211 snd_hda_codec_hdmi snd_hda_codec_idt snd_hda_codec_generic snd_hda_intel snd_hda_codec btusb snd_hda_core btrtl btbcm iwlwifi snd_hwdep btintel bluetooth snd_seq snd_seq_device snd_pcm ecdh_generic x86_pkg_temp_thermal tpm_infineon coretemp tpm_tis crc32_pclmul wmi_bmof crc32c_intel iTCO_wdt hp_wmi snd_timer iTCO_vendor_support sparse_keymap tpm_tis_core mei_me cfg80211
      [  950.859082]  snd joydev tpm mei rfkill pcspkr wmi soundcore lpc_ich hp_accel lis3lv02d input_polldev binfmt_misc e1000e ptp serio_raw pps_core
      [  950.859094] CPU: 2 PID: 2178 Comm: gem_exec_nop Tainted: G     U           4.15.0-rc2+ #900
      [  950.859102] Hardware name: Hewlett-Packard HP ProBook 6360b/1620, BIOS 68SCF Ver. B.42 12/29/2010
      [  950.859107] task: c5119cb4 task.stack: f3ccb8d8
      [  950.859112] EIP: i915_gem_wait_for_idle.part.56+0x166/0x4e0
      [  950.859113] EFLAGS: 00010296 CPU: 2
      [  950.859114] EAX: 00000024 EBX: f36c1888 ECX: f777a044 EDX: 00000007
      [  950.859115] ESI: f36c1888 EDI: edd53958 EBP: edd53970 ESP: edd53938
      [  950.859116]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
      [  950.859117] CR0: 80050033 CR2: b7f39000 CR3: 2f2b3000 CR4: 000406d0
      [  950.859118] Call Trace:
      [  950.859125]  ? drm_printk+0x70/0x70
      [  950.859129]  i915_gem_wait_for_idle+0x18/0x30
      [  950.859133]  i915_gem_shrink+0x360/0x410
      [  950.859138]  ? vmpressure+0xa8/0xf0
      [  950.859142]  ? ktime_get+0x4a/0x100
      [  950.859147]  i915_gem_shrink_all+0x21/0x40
      [  950.859151]  i915_gem_shrinker_oom+0x23/0x130
      [  950.859156]  notifier_call_chain+0x4e/0x70
      [  950.859160]  __blocking_notifier_call_chain+0x2f/0x60
      [  950.859164]  blocking_notifier_call_chain+0x11/0x20
      [  950.859169]  out_of_memory+0x207/0x280
      [  950.859174]  __alloc_pages_nodemask+0xd47/0xe60
      [  950.859179]  new_slab+0x32d/0x450
      [  950.859183]  ___slab_alloc.constprop.81+0x358/0x4e0
      [  950.859189]  ? i915_sw_fence_await_dma_fence+0x53/0x160
      [  950.859193]  ? __slab_free+0x1fe/0x310
      [  950.859197]  ? native_sched_clock+0x1e/0xc0
      [  950.859201]  ? i915_gem_request_alloc+0xcf/0x510
      [  950.859205]  ? sched_clock+0x9/0x10
      [  950.859209]  __slab_alloc.constprop.80+0x29/0x40
      [  950.859212]  ? __slab_alloc.constprop.80+0x29/0x40
      [  950.859216]  kmem_cache_alloc_trace+0x160/0x1a0
      [  950.859220]  ? i915_sw_fence_await_dma_fence+0x53/0x160
      [  950.859224]  i915_sw_fence_await_dma_fence+0x53/0x160
      [  950.859229]  i915_gem_request_await_dma_fence+0x1eb/0x390
      [  950.859233]  i915_gem_request_await_object+0xee/0x230
      [  950.859239]  i915_gem_do_execbuffer+0xc16/0x1200
      [  950.859246]  ? irqtime_account_irq+0x3e/0xc0
      [  950.859251]  ? irq_exit+0x4f/0xb0
      [  950.859257]  ? smp_apic_timer_interrupt+0x5f/0x110
      [  950.859261]  ? apic_timer_interrupt+0x35/0x3c
      [  950.859266]  i915_gem_execbuffer2_ioctl+0x212/0x440
      [  950.859270]  ? apic_timer_interrupt+0x35/0x3c
      [  950.859274]  ? i915_gem_do_execbuffer+0x1200/0x1200
      [  950.859279]  ? insn_get_seg_base+0x1b/0x50
      [  950.859283]  ? i915_gem_do_execbuffer+0x1200/0x1200
      [  950.859287]  drm_ioctl_kernel+0x51/0xa0
      [  950.859291]  drm_ioctl+0x2a3/0x350
      [  950.859294]  ? i915_gem_do_execbuffer+0x1200/0x1200
      [  950.859300]  ? sched_clock+0x9/0x10
      [  950.859303]  ? drm_getunique+0x70/0x70
      [  950.859308]  do_vfs_ioctl+0x7d/0x640
      [  950.859311]  ? native_sched_clock+0x1e/0xc0
      [  950.859315]  ? sched_clock+0x9/0x10
      [  950.859319]  ? sched_clock_cpu+0x13/0x120
      [  950.859323]  SyS_ioctl+0x4e/0x80
      [  950.859326]  do_fast_syscall_32+0x75/0x250
      [  950.859331]  ? irq_exit+0x4f/0xb0
      [  950.859334]  entry_SYSENTER_32+0x47/0x71
      [  950.859338] EIP: 0xb7f81d11
      [  950.859339] EFLAGS: 00000296 CPU: 2
      [  950.859340] EAX: ffffffda EBX: 00000003 ECX: 40406469 EDX: bfde4c20
      [  950.859340] ESI: 00000003 EDI: 40406469 EBP: 00000003 ESP: bfde4b38
      [  950.859341]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
      [  950.859343] Code: e8 30 60 01 00 83 c4 10 83 c3 04 39 f3 75 e0 8b 45 d8 8b 80 14 37 00 00 85 c0 74 13 68 dd 33 e4 c0 68 49 6f e3 c0 e8 4a 55 be ff <0f> ff 5e 5f b8 fe ff ff 3f bb 0a 00 00 00 e8 b7 14 c4 ff 8b 15
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171212132148.8124-1-chris@chris-wilson.co.uk
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
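The reasoning in the commit message above can be illustrated with a toy model in plain C. This is only a sketch under stated assumptions: `toy_gt`, `toy_reserve()`, `toy_submit()`, `toy_wait_for_idle()` and `toy_hw_idle()` are hypothetical names, not the driver's code, and the GPU is reduced to two counters. It shows that a counter bumped at reservation time can stay non-zero after all submitted work has drained, so it is not a reliable idle test.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these are not the i915 structures. */
struct toy_gt {
    unsigned int active_requests; /* bumped when a request is reserved */
    unsigned int submitted;       /* requests actually handed to the HW */
};

/* Reservation pre-increments the counter before the request is complete. */
static void toy_reserve(struct toy_gt *gt)
{
    gt->active_requests++;
}

/* Submission makes a previously reserved request visible to the HW. */
static void toy_submit(struct toy_gt *gt)
{
    gt->submitted++;
}

/* Drain everything that was submitted; reserved-only requests remain. */
static void toy_wait_for_idle(struct toy_gt *gt)
{
    gt->active_requests -= gt->submitted; /* retire completed requests */
    gt->submitted = 0;
}

/* The HW is idle once nothing submitted is outstanding. */
static bool toy_hw_idle(const struct toy_gt *gt)
{
    return gt->submitted == 0;
}
```

With one request submitted and a second one only reserved, `toy_wait_for_idle()` leaves the toy hardware idle while `active_requests` is still 1, which appears to be the situation the `GEM_BUG_ON` in the trace objected to.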
    • drm/i915: Dump the engine state before declaring wedged from wait_for_engines() · 59e4b19d
      Chris Wilson committed
      If wait_for_engines() fails and we resort to declaring the HW wedged,
      dump the engine state for debugging.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171211194135.27095-2-chris@chris-wilson.co.uk
    • drm/i915: Bump timeout for wait_for_engines() · ee42c00e
      Chris Wilson committed
      Extract the timeout we use in i915_gem_idle_work_handler() and reuse it
      for wait_for_engines() in i915_gem_wait_for_idle(). It too has the same
      problem in sometimes having to wait for an extended period before the HW
      settles, so make use of the same timeout.
      
      References: 5427f207 ("drm/i915: Bump wait-times for the final CS interrupt before parking")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171211194135.27095-1-chris@chris-wilson.co.uk
  2. 12 Dec, 2017 1 commit
  3. 08 Dec, 2017 2 commits
    • drm/i915: Restore GT performance in headless mode with DMC loaded · b6876374
      Tvrtko Ursulin committed
      It seems that the DMC likes to transition between the DC states a lot when
      there are no connected displays (no active power domains) during command
      submission.
      
      This activity on DC states has a negative impact on the performance of the
      chip with huge latencies observed in the interrupt handlers and elsewhere.
      Simple tests like igt/gem_latency -n 0 are slowed down by a factor of
      eight.
      
      Work around it by introducing a new power domain named
      POWER_DOMAIN_GT_IRQ, associated with the "DC off" power well, which is
      held for the duration of command submission activity.
      
      CNL has the same problem which will be addressed as a follow-up. Doing
      that requires a fix for a DC6 context corruption problem in the CNL DMC
      firmware which is yet to be released.
      
      v2:
       * Add commit text as comment in i915_gem_mark_busy. (Chris Wilson)
       * Protect macro body with braces. (Jani Nikula)
      
      v3:
       * Add dedicated power domain for clarity. (Chris, Imre)
       * Commit message and comment text updates.
       * Apply to all big-core GEN9 parts apart from Skylake which is pending DMC
         firmware release.
      
      v4:
       * Power domain should be inner to device runtime pm. (Chris)
       * Simplify NEEDS_CSR_GT_PERF_WA macro. (Chris)
       * Handle async DMC loading by moving the GT_IRQ power domain logic into
         intel_runtime_pm. (Daniel, Chris)
       * Include small core GEN9 as well. (Imre)
      
      v5:
       * Special handling for async DMC load is not needed since on failure the
         power domain reference is kept permanently taken. (Imre)
      
      v6:
       * Drop the NEEDS_CSR_GT_PERF_WA macro since all firmwares have now been
         deployed. (Imre, Chris)
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100572
      Testcase: igt/gem_exec_nop/headless
      Cc: Imre Deak <imre.deak@intel.com>
      Acked-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
      Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v5)
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      [Imre: Add note about applying the WA on CNL as a follow-up]
      Signed-off-by: Imre Deak <imre.deak@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171205132854.26380-1-tvrtko.ursulin@linux.intel.com
    • drm/i915: Refactor common list iteration over GGTT vma · e2189dd0
      Chris Wilson committed
      In quite a few places, we have a list iteration over the vma on an
      object that only wants to inspect GGTT vma. By construction, these are
      placed at the start of the list, so we have copied that knowledge into
      many callsites. Pull that knowledge back to i915_vma.h and provide a
      for_each_ggtt_vma() to tidy up the code.
      
      v2: Add a backreference from vma_create() to remind ourselves why we put
      ggtt vma at the head of the obj->vma_list (and ppgtt vma at the tail).
      v3: Fixup s/vma/V/
      Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171207211407.31549-1-chris@chris-wilson.co.uk
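The idea behind the iteration helper described above can be sketched in userspace C. This is a minimal illustration, assuming a hand-rolled singly linked list: `toy_vma`, `toy_obj`, `toy_obj_add_vma()` and `toy_for_each_ggtt_vma()` are hypothetical names, and the real `for_each_ggtt_vma()` in i915_vma.h is not reproduced here. GGTT entries are inserted at the head and all others at the tail, so the loop can stop at the first non-GGTT entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a vma; not the driver's struct i915_vma. */
struct toy_vma {
    bool is_ggtt;
    struct toy_vma *next;
};

/* Hypothetical stand-in for an object and its list of vma. */
struct toy_obj {
    struct toy_vma *head;
    struct toy_vma *tail;
};

/* GGTT vma go to the head of the list, everything else to the tail. */
static void toy_obj_add_vma(struct toy_obj *obj, struct toy_vma *vma)
{
    if (vma->is_ggtt) {
        vma->next = obj->head;
        obj->head = vma;
        if (!obj->tail)
            obj->tail = vma;
    } else {
        vma->next = NULL;
        if (obj->tail)
            obj->tail->next = vma;
        else
            obj->head = vma;
        obj->tail = vma;
    }
}

/* Walk only the leading run of GGTT vma; stop at the first other entry. */
#define toy_for_each_ggtt_vma(V, OBJ) \
    for ((V) = (OBJ)->head; (V) && (V)->is_ggtt; (V) = (V)->next)

/* Example caller: count the GGTT vma without touching the rest. */
static size_t toy_count_ggtt(const struct toy_obj *obj)
{
    const struct toy_vma *vma;
    size_t count = 0;

    toy_for_each_ggtt_vma(vma, obj)
        count++;
    return count;
}
```

Keeping the ordering rule and the early exit in one macro means callers no longer need to know where GGTT entries sit in the list.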
  4. 07 Dec, 2017 2 commits
  5. 06 Dec, 2017 1 commit
  6. 30 Nov, 2017 1 commit
  7. 28 Nov, 2017 1 commit
  8. 23 Nov, 2017 1 commit
    • drm/i915: Call i915_gem_init_userptr() before taking struct_mutex · ee48700d
      Chris Wilson committed
      We don't need struct_mutex to initialise userptr (it just allocates a
      workqueue for itself etc), but we do need struct_mutex later on in
      i915_gem_init() in order to feed requests onto the HW.
      
      This should break the chain:
      
      [  385.697902] ======================================================
      [  385.697907] WARNING: possible circular locking dependency detected
      [  385.697913] 4.14.0-CI-Patchwork_7234+ #1 Tainted: G     U
      [  385.697917] ------------------------------------------------------
      [  385.697922] perf_pmu/2631 is trying to acquire lock:
      [  385.697927]  (&mm->mmap_sem){++++}, at: [<ffffffff811bfe1e>] __might_fault+0x3e/0x90
      [  385.697941]
                     but task is already holding lock:
      [  385.697946]  (&cpuctx_mutex){+.+.}, at: [<ffffffff8116fe8c>] perf_event_ctx_lock_nested+0xbc/0x1d0
      [  385.697957]
                     which lock already depends on the new lock.
      
      [  385.697963]
                     the existing dependency chain (in reverse order) is:
      [  385.697970]
                     -> #4 (&cpuctx_mutex){+.+.}:
      [  385.697980]        __mutex_lock+0x86/0x9b0
      [  385.697985]        perf_event_init_cpu+0x5a/0x90
      [  385.697991]        perf_event_init+0x178/0x1a4
      [  385.697997]        start_kernel+0x27f/0x3f1
      [  385.698003]        verify_cpu+0x0/0xfb
      [  385.698006]
                     -> #3 (pmus_lock){+.+.}:
      [  385.698015]        __mutex_lock+0x86/0x9b0
      [  385.698020]        perf_event_init_cpu+0x21/0x90
      [  385.698025]        cpuhp_invoke_callback+0xca/0xc00
      [  385.698030]        _cpu_up+0xa7/0x170
      [  385.698035]        do_cpu_up+0x57/0x70
      [  385.698039]        smp_init+0x62/0xa6
      [  385.698044]        kernel_init_freeable+0x97/0x193
      [  385.698050]        kernel_init+0xa/0x100
      [  385.698055]        ret_from_fork+0x27/0x40
      [  385.698058]
                     -> #2 (cpu_hotplug_lock.rw_sem){++++}:
      [  385.698068]        cpus_read_lock+0x39/0xa0
      [  385.698073]        apply_workqueue_attrs+0x12/0x50
      [  385.698078]        __alloc_workqueue_key+0x1d8/0x4d8
      [  385.698134]        i915_gem_init_userptr+0x5f/0x80 [i915]
      [  385.698176]        i915_gem_init+0x7c/0x390 [i915]
      [  385.698213]        i915_driver_load+0x99e/0x15c0 [i915]
      [  385.698250]        i915_pci_probe+0x33/0x90 [i915]
      [  385.698256]        pci_device_probe+0xa1/0x130
      [  385.698262]        driver_probe_device+0x293/0x440
      [  385.698267]        __driver_attach+0xde/0xe0
      [  385.698272]        bus_for_each_dev+0x5c/0x90
      [  385.698277]        bus_add_driver+0x16d/0x260
      [  385.698282]        driver_register+0x57/0xc0
      [  385.698287]        do_one_initcall+0x3e/0x160
      [  385.698292]        do_init_module+0x5b/0x1fa
      [  385.698297]        load_module+0x2374/0x2dc0
      [  385.698302]        SyS_finit_module+0xaa/0xe0
      [  385.698307]        entry_SYSCALL_64_fastpath+0x1c/0xb1
      [  385.698311]
                     -> #1 (&dev->struct_mutex){+.+.}:
      [  385.698320]        __mutex_lock+0x86/0x9b0
      [  385.698361]        i915_mutex_lock_interruptible+0x4c/0x130 [i915]
      [  385.698403]        i915_gem_fault+0x206/0x760 [i915]
      [  385.698409]        __do_fault+0x1a/0x70
      [  385.698413]        __handle_mm_fault+0x7c4/0xdb0
      [  385.698417]        handle_mm_fault+0x154/0x300
      [  385.698440]        __do_page_fault+0x2d6/0x570
      [  385.698445]        page_fault+0x22/0x30
      [  385.698449]
                     -> #0 (&mm->mmap_sem){++++}:
      [  385.698459]        lock_acquire+0xaf/0x200
      [  385.698464]        __might_fault+0x68/0x90
      [  385.698470]        _copy_to_user+0x1e/0x70
      [  385.698475]        perf_read+0x1aa/0x290
      [  385.698480]        __vfs_read+0x23/0x120
      [  385.698484]        vfs_read+0xa3/0x150
      [  385.698488]        SyS_read+0x45/0xb0
      [  385.698493]        entry_SYSCALL_64_fastpath+0x1c/0xb1
      [  385.698497]
                     other info that might help us debug this:
      
      [  385.698505] Chain exists of:
                       &mm->mmap_sem --> pmus_lock --> &cpuctx_mutex
      
      [  385.698517]  Possible unsafe locking scenario:
      
      [  385.698522]        CPU0                    CPU1
      [  385.698526]        ----                    ----
      [  385.698529]   lock(&cpuctx_mutex);
      [  385.698553]                                lock(pmus_lock);
      [  385.698558]                                lock(&cpuctx_mutex);
      [  385.698564]   lock(&mm->mmap_sem);
      [  385.698568]
                      *** DEADLOCK ***
      
      [  385.698574] 1 lock held by perf_pmu/2631:
      [  385.698578]  #0:  (&cpuctx_mutex){+.+.}, at: [<ffffffff8116fe8c>] perf_event_ctx_lock_nested+0xbc/0x1d0
      [  385.698589]
                     stack backtrace:
      [  385.698595] CPU: 3 PID: 2631 Comm: perf_pmu Tainted: G     U          4.14.0-CI-Patchwork_7234+ #1
      [  385.698602] Hardware name:                  /NUC6CAYB, BIOS AYAPLCEL.86A.0040.2017.0619.1722 06/19/2017
      [  385.698609] Call Trace:
      [  385.698615]  dump_stack+0x5f/0x86
      [  385.698621]  print_circular_bug.isra.18+0x1d0/0x2c0
      [  385.698627]  __lock_acquire+0x19c3/0x1b60
      [  385.698634]  ? generic_exec_single+0x77/0xe0
      [  385.698640]  ? lock_acquire+0xaf/0x200
      [  385.698644]  lock_acquire+0xaf/0x200
      [  385.698650]  ? __might_fault+0x3e/0x90
      [  385.698655]  __might_fault+0x68/0x90
      [  385.698660]  ? __might_fault+0x3e/0x90
      [  385.698665]  _copy_to_user+0x1e/0x70
      [  385.698670]  perf_read+0x1aa/0x290
      [  385.698675]  __vfs_read+0x23/0x120
      [  385.698682]  ? __fget+0x101/0x1f0
      [  385.698686]  vfs_read+0xa3/0x150
      [  385.698691]  SyS_read+0x45/0xb0
      [  385.698696]  entry_SYSCALL_64_fastpath+0x1c/0xb1
      [  385.698701] RIP: 0033:0x7ff1c46876ed
      [  385.698705] RSP: 002b:00007fff13552f90 EFLAGS: 00000293 ORIG_RAX: 0000000000000000
      [  385.698712] RAX: ffffffffffffffda RBX: ffffc90000647ff0 RCX: 00007ff1c46876ed
      [  385.698718] RDX: 0000000000000010 RSI: 00007fff13552fa0 RDI: 0000000000000005
      [  385.698723] RBP: 000056063d300580 R08: 0000000000000000 R09: 0000000000000060
      [  385.698729] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000046
      [  385.698734] R13: 00007fff13552c6f R14: 00007ff1c6279d00 R15: 00007ff1c6279a40
      
      Testcase: igt/perf_pmu
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171122172621.16158-1-chris@chris-wilson.co.uk
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
  9. 22 Nov, 2017 1 commit
  10. 21 Nov, 2017 4 commits
  11. 20 Nov, 2017 1 commit
  12. 16 Nov, 2017 1 commit
  13. 15 Nov, 2017 1 commit
  14. 14 Nov, 2017 2 commits
  15. 12 Nov, 2017 1 commit
  16. 11 Nov, 2017 6 commits
  17. 09 Nov, 2017 1 commit
    • drm/i915: Lock llist_del_first() vs llist_del_all() · 0f763ff3
      Chris Wilson committed
      An oversight in commit 87701b4b ("drm/i915: Only free the oldest
      stale object before a fresh allocation") was that not only do we have to
      serialise concurrent users of llist_del_first(), but we also have to
      lock llist_del_first() vs llist_del_all().
      
      From llist.h,
      
       * This can be summarized as follows:
       *
       *           |   add    | del_first |  del_all
       * add       |    -     |     -     |     -
       * del_first |          |     L     |     L
       * del_all   |          |           |     -
       *
       * Where, a particular row's operation can happen concurrently with a column's
       * operation, with "-" being no lock needed, while "L" being lock is needed.
      
      This should hopefully explain:
      
      <4>[   89.287106] general protection fault: 0000 [#1] PREEMPT SMP
      <4>[   89.287126] Modules linked in: snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic x86_pkg_temp_thermal intel_powerclamp coretemp i915 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core r8169 mii mei_me mei snd_pcm prime_numbers i2c_hid pinctrl_geminilake pinctrl_intel
      <4>[   89.287226] CPU: 2 PID: 23 Comm: ksoftirqd/2 Tainted: G     U          4.14.0-rc8-CI-CI_DRM_3315+ #1
      <4>[   89.287247] Hardware name: Intel Corp. Geminilake/GLK RVP2 LP4SD (07), BIOS GELKRVPA.X64.0062.B30.1708222146 08/22/2017
      <4>[   89.287270] task: ffff88017ab34ec0 task.stack: ffffc90000128000
      <4>[   89.287290] RIP: 0010:llist_add_batch+0x4/0x20
      <4>[   89.287301] RSP: 0018:ffffc9000012bdb8 EFLAGS: 00010296
      <4>[   89.287314] RAX: ffffffff811017ad RBX: 6e468801a1560000 RCX: ef3e53fceecdeb81
      <4>[   89.287330] RDX: 6e468801a1566130 RSI: ffff880103d73d98 RDI: ffff880103d73d98
      <4>[   89.287346] RBP: ffffc9000012bdb8 R08: ffff88017ab35780 R09: 0000000000000000
      <4>[   89.287361] R10: ffffc9000012bd68 R11: 00000000abb18c3d R12: ffffffffa01369e0
      <4>[   89.287377] R13: ffff88017fd1b8f8 R14: ffff88017ab34ec0 R15: 000000000000000a
      <4>[   89.287393] FS:  0000000000000000(0000) GS:ffff88017fd00000(0000) knlGS:0000000000000000
      <4>[   89.287411] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      <4>[   89.287424] CR2: 00007ff0c0755018 CR3: 000000016df9b000 CR4: 00000000003406e0
      <4>[   89.287440] Call Trace:
      <4>[   89.287511]  __i915_gem_free_object_rcu+0x20/0x40 [i915]
      <4>[   89.287527]  rcu_process_callbacks+0x27a/0x730
      <4>[   89.287544]  __do_softirq+0xc0/0x4ae
      <4>[   89.287559]  ? smpboot_thread_fn+0x2d/0x280
      <4>[   89.287571]  run_ksoftirqd+0x1f/0x70
      <4>[   89.287582]  smpboot_thread_fn+0x18a/0x280
      <4>[   89.287595]  kthread+0x114/0x150
      <4>[   89.287605]  ? sort_range+0x30/0x30
      <4>[   89.287615]  ? kthread_create_on_node+0x40/0x40
      <4>[   89.287628]  ret_from_fork+0x27/0x40
      <4>[   89.287641] Code: 0d 48 83 ea 01 4c 89 c1 48 83 fa ff 74 12 48 23 0c d7 74 ed 48 c1 e2 06 48 0f bd c9 48 8d 04 0a 5d c3 90 90 90 90 90 55 48 89 e5 <48> 8b 0a 48 89 0e 48 89 c8 f0 48 0f b1 3a 48 39 c1 75 ed 48 85
      <1>[   89.287774] RIP: llist_add_batch+0x4/0x20 RSP: ffffc9000012bdb8
      <4>[   89.287826] ---[ end trace e775d15174d8ae02 ]---
      
      (Lockless lists are only easy (and lockless) when only using
      llist_add/llist_del_all!)
      
      Fixes: 87701b4b ("drm/i915: Only free the oldest stale object before
      a fresh allocation")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171106111508.11941-1-chris@chris-wilson.co.uk
      Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      (cherry picked from commit f991c492)
      Signed-off-by: Jani Nikula <jani.nikula@intel.com>
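The locking table quoted from llist.h can be made concrete with a small userspace model. The following is a sketch only, assuming C11 atomics and a trivial spinlock: `toy_llist`, `toy_llist_add()`, `toy_llist_del_first()` and `toy_llist_del_all()` are hypothetical names and this is neither the kernel's llist implementation nor the driver's fix. Adding stays lock-free, while both deletion paths take the same lock, matching the "L" entries for del_first vs del_first and del_first vs del_all.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical node and list types; not the kernel's struct llist_node. */
struct toy_node {
    struct toy_node *next;
};

struct toy_llist {
    _Atomic(struct toy_node *) first;
    atomic_flag del_lock; /* serialises del_first against del_first/del_all */
};

static void toy_llist_init(struct toy_llist *list)
{
    atomic_init(&list->first, NULL);
    atomic_flag_clear(&list->del_lock);
}

static void toy_lock(struct toy_llist *list)
{
    while (atomic_flag_test_and_set_explicit(&list->del_lock,
                                             memory_order_acquire))
        ; /* spin until the flag was previously clear */
}

static void toy_unlock(struct toy_llist *list)
{
    atomic_flag_clear_explicit(&list->del_lock, memory_order_release);
}

/* add: lock-free push with a compare-and-swap retry loop. */
static void toy_llist_add(struct toy_llist *list, struct toy_node *node)
{
    struct toy_node *first = atomic_load(&list->first);

    do {
        node->next = first;
    } while (!atomic_compare_exchange_weak(&list->first, &first, node));
}

/* del_first: reads first->next, so it must not race with other deleters. */
static struct toy_node *toy_llist_del_first(struct toy_llist *list)
{
    struct toy_node *first;

    toy_lock(list);
    first = atomic_load(&list->first);
    while (first &&
           !atomic_compare_exchange_weak(&list->first, &first, first->next))
        ; /* a concurrent add changed the head; retry with the new head */
    toy_unlock(list);
    return first;
}

/* del_all: one exchange, but locked here because del_first is also in use. */
static struct toy_node *toy_llist_del_all(struct toy_llist *list)
{
    struct toy_node *all;

    toy_lock(list);
    all = atomic_exchange(&list->first, NULL);
    toy_unlock(list);
    return all;
}
```

If only `toy_llist_add()` and `toy_llist_del_all()` were ever used, the lock could be dropped entirely, which is the point of the closing remark in the commit message.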
  18. 06 Nov, 2017 1 commit
    • drm/i915: Lock llist_del_first() vs llist_del_all() · f991c492
      Chris Wilson committed
      An oversight in commit 87701b4b ("drm/i915: Only free the oldest
      stale object before a fresh allocation") was that not only do we have to
      serialise concurrent users of llist_del_first(), but we also have to
      lock llist_del_first() vs llist_del_all().
      
      From llist.h,
      
       * This can be summarized as follows:
       *
       *           |   add    | del_first |  del_all
       * add       |    -     |     -     |     -
       * del_first |          |     L     |     L
       * del_all   |          |           |     -
       *
       * Where, a particular row's operation can happen concurrently with a column's
       * operation, with "-" being no lock needed, while "L" being lock is needed.
      
      This should hopefully explain:
      
      <4>[   89.287106] general protection fault: 0000 [#1] PREEMPT SMP
      <4>[   89.287126] Modules linked in: snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic x86_pkg_temp_thermal intel_powerclamp coretemp i915 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core r8169 mii mei_me mei snd_pcm prime_numbers i2c_hid pinctrl_geminilake pinctrl_intel
      <4>[   89.287226] CPU: 2 PID: 23 Comm: ksoftirqd/2 Tainted: G     U          4.14.0-rc8-CI-CI_DRM_3315+ #1
      <4>[   89.287247] Hardware name: Intel Corp. Geminilake/GLK RVP2 LP4SD (07), BIOS GELKRVPA.X64.0062.B30.1708222146 08/22/2017
      <4>[   89.287270] task: ffff88017ab34ec0 task.stack: ffffc90000128000
      <4>[   89.287290] RIP: 0010:llist_add_batch+0x4/0x20
      <4>[   89.287301] RSP: 0018:ffffc9000012bdb8 EFLAGS: 00010296
      <4>[   89.287314] RAX: ffffffff811017ad RBX: 6e468801a1560000 RCX: ef3e53fceecdeb81
      <4>[   89.287330] RDX: 6e468801a1566130 RSI: ffff880103d73d98 RDI: ffff880103d73d98
      <4>[   89.287346] RBP: ffffc9000012bdb8 R08: ffff88017ab35780 R09: 0000000000000000
      <4>[   89.287361] R10: ffffc9000012bd68 R11: 00000000abb18c3d R12: ffffffffa01369e0
      <4>[   89.287377] R13: ffff88017fd1b8f8 R14: ffff88017ab34ec0 R15: 000000000000000a
      <4>[   89.287393] FS:  0000000000000000(0000) GS:ffff88017fd00000(0000) knlGS:0000000000000000
      <4>[   89.287411] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      <4>[   89.287424] CR2: 00007ff0c0755018 CR3: 000000016df9b000 CR4: 00000000003406e0
      <4>[   89.287440] Call Trace:
      <4>[   89.287511]  __i915_gem_free_object_rcu+0x20/0x40 [i915]
      <4>[   89.287527]  rcu_process_callbacks+0x27a/0x730
      <4>[   89.287544]  __do_softirq+0xc0/0x4ae
      <4>[   89.287559]  ? smpboot_thread_fn+0x2d/0x280
      <4>[   89.287571]  run_ksoftirqd+0x1f/0x70
      <4>[   89.287582]  smpboot_thread_fn+0x18a/0x280
      <4>[   89.287595]  kthread+0x114/0x150
      <4>[   89.287605]  ? sort_range+0x30/0x30
      <4>[   89.287615]  ? kthread_create_on_node+0x40/0x40
      <4>[   89.287628]  ret_from_fork+0x27/0x40
      <4>[   89.287641] Code: 0d 48 83 ea 01 4c 89 c1 48 83 fa ff 74 12 48 23 0c d7 74 ed 48 c1 e2 06 48 0f bd c9 48 8d 04 0a 5d c3 90 90 90 90 90 55 48 89 e5 <48> 8b 0a 48 89 0e 48 89 c8 f0 48 0f b1 3a 48 39 c1 75 ed 48 85
      <1>[   89.287774] RIP: llist_add_batch+0x4/0x20 RSP: ffffc9000012bdb8
      <4>[   89.287826] ---[ end trace e775d15174d8ae02 ]---
      
      (Lockless lists are only easy (and lockless) when only using
      llist_add/llist_del_all!)
      
      Fixes: 87701b4b ("drm/i915: Only free the oldest stale object before
      a fresh allocation")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171106111508.11941-1-chris@chris-wilson.co.uk
      Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
  19. 03 Nov, 2017 1 commit
  20. 01 Nov, 2017 1 commit
  21. 31 Oct, 2017 1 commit
    • drm/i915: Hold rcu_read_lock when iterating over the radixtree (objects) · 23e87338
      Chris Wilson committed
      Kasan spotted
      
          [IGT] gem_tiled_pread_pwrite: exiting, ret=0
          ==================================================================
          BUG: KASAN: use-after-free in __i915_gem_object_reset_page_iter+0x15c/0x170 [i915]
          Read of size 8 at addr ffff8801359da310 by task kworker/3:2/182
      
          CPU: 3 PID: 182 Comm: kworker/3:2 Tainted: G     U          4.14.0-rc6-CI-Custom_3340+ #1
          Hardware name: Intel Corp. Geminilake/GLK RVP1 DDR4 (05), BIOS GELKRVPA.X64.0062.B30.1708222146 08/22/2017
          Workqueue: events __i915_gem_free_work [i915]
          Call Trace:
           dump_stack+0x68/0xa0
           print_address_description+0x78/0x290
           ? __i915_gem_object_reset_page_iter+0x15c/0x170 [i915]
           kasan_report+0x23d/0x350
           __asan_report_load8_noabort+0x19/0x20
           __i915_gem_object_reset_page_iter+0x15c/0x170 [i915]
           ? i915_gem_object_truncate+0x100/0x100 [i915]
           ? lock_acquire+0x380/0x380
           __i915_gem_object_put_pages+0x30d/0x530 [i915]
           __i915_gem_free_objects+0x551/0xbd0 [i915]
           ? lock_acquire+0x13e/0x380
           __i915_gem_free_work+0x4e/0x70 [i915]
           process_one_work+0x6f6/0x1590
           ? pwq_dec_nr_in_flight+0x2b0/0x2b0
           worker_thread+0xe6/0xe90
           ? pci_mmcfg_check_reserved+0x110/0x110
           kthread+0x309/0x410
           ? process_one_work+0x1590/0x1590
           ? kthread_create_on_node+0xb0/0xb0
           ret_from_fork+0x27/0x40
      
          Allocated by task 1801:
           save_stack_trace+0x1b/0x20
           kasan_kmalloc+0xee/0x190
           kasan_slab_alloc+0x12/0x20
           kmem_cache_alloc+0xdc/0x2e0
           radix_tree_node_alloc.constprop.12+0x48/0x330
           __radix_tree_create+0x274/0x480
           __radix_tree_insert+0xa2/0x610
           i915_gem_object_get_sg+0x224/0x670 [i915]
           i915_gem_object_get_page+0xb5/0x1c0 [i915]
           i915_gem_pread_ioctl+0x822/0xf60 [i915]
           drm_ioctl_kernel+0x13f/0x1c0
           drm_ioctl+0x6cf/0x980
           do_vfs_ioctl+0x184/0xf30
           SyS_ioctl+0x41/0x70
           entry_SYSCALL_64_fastpath+0x1c/0xb1
      
          Freed by task 37:
           save_stack_trace+0x1b/0x20
           kasan_slab_free+0xaf/0x190
           kmem_cache_free+0xbf/0x340
           radix_tree_node_rcu_free+0x79/0x90
           rcu_process_callbacks+0x46d/0xf40
           __do_softirq+0x21c/0x8d3
      
          The buggy address belongs to the object at ffff8801359da0f0
          which belongs to the cache radix_tree_node of size 576
          The buggy address is located 544 bytes inside of
          576-byte region [ffff8801359da0f0, ffff8801359da330)
          The buggy address belongs to the page:
          page:ffffea0004d67600 count:1 mapcount:0 mapping:          (null) index:0x0 compound_mapcount: 0
          flags: 0x8000000000008100(slab|head)
          raw: 8000000000008100 0000000000000000 0000000000000000 0000000100110011
          raw: ffffea0004b52920 ffffea0004b38020 ffff88015b416a80 0000000000000000
          page dumped because: kasan: bad access detected
      
          Memory state around the buggy address:
           ffff8801359da200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
           ffff8801359da280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
          >ffff8801359da300: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc
      			     ^
           ffff8801359da380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
           ffff8801359da400: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
          ==================================================================
          Disabling lock debugging due to kernel taint
      
      which looks like the slab containing the radixtree iter was freed as we
      traversed the tree. Taking the rcu read lock across the loop should
      prevent that (deferring all the frees until the end).
      Reported-by: Tomi Sarvela <tomi.p.sarvela@intel.com>
      Fixes: 96d77634 ("drm/i915: Use a radixtree for random access to the object's backing storage")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171026130032.10677-1-chris@chris-wilson.co.uk
      Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
      (cherry picked from commit bea6e987)
      Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
  22. 27 Oct, 2017 2 commits
    • drm/i915: Hold rcu_read_lock when iterating over the radixtree (objects) · bea6e987
      Chris Wilson committed
      Kasan spotted
      
          [IGT] gem_tiled_pread_pwrite: exiting, ret=0
          ==================================================================
          BUG: KASAN: use-after-free in __i915_gem_object_reset_page_iter+0x15c/0x170 [i915]
          Read of size 8 at addr ffff8801359da310 by task kworker/3:2/182
      
          CPU: 3 PID: 182 Comm: kworker/3:2 Tainted: G     U          4.14.0-rc6-CI-Custom_3340+ #1
          Hardware name: Intel Corp. Geminilake/GLK RVP1 DDR4 (05), BIOS GELKRVPA.X64.0062.B30.1708222146 08/22/2017
          Workqueue: events __i915_gem_free_work [i915]
          Call Trace:
           dump_stack+0x68/0xa0
           print_address_description+0x78/0x290
           ? __i915_gem_object_reset_page_iter+0x15c/0x170 [i915]
           kasan_report+0x23d/0x350
           __asan_report_load8_noabort+0x19/0x20
           __i915_gem_object_reset_page_iter+0x15c/0x170 [i915]
           ? i915_gem_object_truncate+0x100/0x100 [i915]
           ? lock_acquire+0x380/0x380
           __i915_gem_object_put_pages+0x30d/0x530 [i915]
           __i915_gem_free_objects+0x551/0xbd0 [i915]
           ? lock_acquire+0x13e/0x380
           __i915_gem_free_work+0x4e/0x70 [i915]
           process_one_work+0x6f6/0x1590
           ? pwq_dec_nr_in_flight+0x2b0/0x2b0
           worker_thread+0xe6/0xe90
           ? pci_mmcfg_check_reserved+0x110/0x110
           kthread+0x309/0x410
           ? process_one_work+0x1590/0x1590
           ? kthread_create_on_node+0xb0/0xb0
           ret_from_fork+0x27/0x40
      
          Allocated by task 1801:
           save_stack_trace+0x1b/0x20
           kasan_kmalloc+0xee/0x190
           kasan_slab_alloc+0x12/0x20
           kmem_cache_alloc+0xdc/0x2e0
           radix_tree_node_alloc.constprop.12+0x48/0x330
           __radix_tree_create+0x274/0x480
           __radix_tree_insert+0xa2/0x610
           i915_gem_object_get_sg+0x224/0x670 [i915]
           i915_gem_object_get_page+0xb5/0x1c0 [i915]
           i915_gem_pread_ioctl+0x822/0xf60 [i915]
           drm_ioctl_kernel+0x13f/0x1c0
           drm_ioctl+0x6cf/0x980
           do_vfs_ioctl+0x184/0xf30
           SyS_ioctl+0x41/0x70
           entry_SYSCALL_64_fastpath+0x1c/0xb1
      
          Freed by task 37:
           save_stack_trace+0x1b/0x20
           kasan_slab_free+0xaf/0x190
           kmem_cache_free+0xbf/0x340
           radix_tree_node_rcu_free+0x79/0x90
           rcu_process_callbacks+0x46d/0xf40
           __do_softirq+0x21c/0x8d3
      
          The buggy address belongs to the object at ffff8801359da0f0
          which belongs to the cache radix_tree_node of size 576
          The buggy address is located 544 bytes inside of
          576-byte region [ffff8801359da0f0, ffff8801359da330)
          The buggy address belongs to the page:
          page:ffffea0004d67600 count:1 mapcount:0 mapping:          (null) index:0x0 compound_mapcount: 0
          flags: 0x8000000000008100(slab|head)
          raw: 8000000000008100 0000000000000000 0000000000000000 0000000100110011
          raw: ffffea0004b52920 ffffea0004b38020 ffff88015b416a80 0000000000000000
          page dumped because: kasan: bad access detected
      
          Memory state around the buggy address:
           ffff8801359da200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
           ffff8801359da280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
          >ffff8801359da300: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc
      			     ^
           ffff8801359da380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
           ffff8801359da400: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
          ==================================================================
          Disabling lock debugging due to kernel taint
      
      which looks like the slab containing the radixtree iter was freed as we
      traversed the tree. Taking the rcu read lock across the loop should
      prevent that (deferring all the frees until the end).
      Reported-by: Tomi Sarvela <tomi.p.sarvela@intel.com>
      Fixes: 96d77634 ("drm/i915: Use a radixtree for random access to the object's backing storage")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171026130032.10677-1-chris@chris-wilson.co.uk
      Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
      bea6e987
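A minimal userspace sketch of the idea behind the fix above, not the i915 or kernel RCU code. It models "deferring all the frees until the end" with a single-threaded toy: while a read-side section is open, released nodes are only queued, so an iterator that still reads a node it has just deleted never touches freed memory. Every `toy_*` name is hypothetical, and real RCU uses grace periods across CPUs rather than freeing at unlock time.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy, single-threaded stand-in for RCU (assumption, not the kernel API):
 * frees are queued while a read-side section is open and are only carried
 * out once the last section closes. */
struct toy_node {
    struct toy_node *next;    /* list linkage followed by the iterator */
    struct toy_node *pending; /* linkage on the deferred-free queue */
    int key;
};

static int toy_readers;              /* open read-side sections */
static struct toy_node *toy_pending; /* nodes waiting to be released */
static int toy_freed;                /* nodes actually released so far */

/* Release everything on the deferred-free queue. */
static void toy_flush(void)
{
    while (toy_pending) {
        struct toy_node *n = toy_pending;

        toy_pending = n->pending;
        free(n);
        toy_freed++;
    }
}

static void toy_read_lock(void)
{
    toy_readers++;
}

static void toy_read_unlock(void)
{
    assert(toy_readers > 0);
    if (--toy_readers == 0)
        toy_flush();
}

/* Queue a node for release; it is freed at once only if no reader is open. */
static void toy_free_deferred(struct toy_node *n)
{
    n->pending = toy_pending;
    toy_pending = n;
    if (toy_readers == 0)
        toy_flush();
}

/* Build a list with keys 0..count-1. */
static struct toy_node *toy_build(int count)
{
    struct toy_node *head = NULL;

    for (int i = count - 1; i >= 0; i--) {
        struct toy_node *n = malloc(sizeof(*n));

        if (!n)
            abort();
        n->key = i;
        n->pending = NULL;
        n->next = head;
        head = n;
    }
    return head;
}

/* Delete every node while walking the list. Returns how many nodes were
 * released before the walk finished (0 means nothing was freed under us). */
static int toy_delete_all(struct toy_node **head)
{
    int freed_before = toy_freed;
    int freed_during;

    toy_read_lock();
    for (struct toy_node *n = *head; n; n = n->next)
        toy_free_deferred(n); /* n->next is still read after this call */
    freed_during = toy_freed - freed_before;
    *head = NULL;
    toy_read_unlock();        /* the deferred frees run here */
    return freed_during;
}
```

Dropping the `toy_read_lock()`/`toy_read_unlock()` pair in `toy_delete_all()` would free each node before the loop reads `n->next`, which is the same class of use-after-free that KASAN reported in the trace.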
    • M
      drm/i915/guc: Preemption! With GuC · c41937fd
      Michał Winiarski committed
      Pretty similar to what we have on execlists.
      We're reusing most of the GEM code, however, due to GuC quirks we need a
      couple of extra bits.
      Preemption is implemented as a GuC action, and actions can be pretty slow.
      Because of that, we're using a mutex to serialize them. Since we're
      requesting preemption from the tasklet, the task of creating a workitem
      and wrapping it in GuC action is delegated to a worker.
      
      To detect that preemption has finished, we're using an additional
      piece of HWSP, and since we're not getting context switch interrupts,
      we're also adding a user interrupt.
      
      The fact that our special preempt context has completed unfortunately
      doesn't mean that we're ready to submit new work. We also need to wait
      for GuC to finish its own processing.
      
      v2: Don't compile out the wait for GuC, handle workqueue flush on reset,
      no need for ordered workqueue, put on a reviewer hat when looking at my own
      patches (Chris)
      Move struct work around in intel_guc, move user interrupt outside of
      conditional (Michał)
      Keep ring around rather than chase through intel_context
      
      v3: Extract WA for flushing ggtt writes to a helper (Chris)
      Keep work_struct in intel_guc rather than engine (Michał)
      Use ordered workqueue for inject_preempt worker to avoid GuC quirks.
      
      v4: Drop now unused INTEL_GUC_PREEMPT_OPTION_IMMEDIATE (Daniele)
      Drop stray newlines, use container_of for intel_guc in worker,
      check for presence of workqueue when flushing it, rather than
      enable_guc_submission modparam, reorder preempt postprocessing (Chris)
      
      v5: Make wq NULL after destroying it
      
      v6: Swap struct guc_preempt_work members (Michał)
      Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Jeff McGee <jeff.mcgee@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Oscar Mateo <oscar.mateo@intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171026133558.19580-1-michal.winiarski@intel.com
      c41937fd
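The sequence described in the commit message above can be pictured as a small state machine. This is a sketch under stated assumptions, not the driver's code: every `toy_*` and `TOY_*` name is hypothetical, the mutex and workqueue are reduced to comments, and the "status page" is a plain integer standing in for the extra piece of HWSP.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical states for one preemption request (illustration only). */
enum toy_preempt_state {
    TOY_IDLE,        /* nothing in flight */
    TOY_QUEUED,      /* tasklet asked for preemption, worker has not run */
    TOY_ACTION_SENT, /* worker sent the slow action */
    TOY_CTX_DONE,    /* preempt context wrote its marker */
    TOY_READY        /* firmware finished its own processing */
};

#define TOY_PREEMPT_FINISHED 0x1u /* marker value, an assumption */

struct toy_guc {
    enum toy_preempt_state state;
    unsigned int status_page; /* stands in for the extra HWSP slot */
    int actions_sent;
};

/* Tasklet context: must not sleep, so it only records the request and
 * leaves the slow part to the worker. */
static bool toy_tasklet_request_preempt(struct toy_guc *guc)
{
    if (guc->state != TOY_IDLE)
        return false; /* a preemption is already in flight */
    guc->state = TOY_QUEUED;
    return true;
}

/* Worker context: may sleep, so in a real driver this is where a mutex
 * serialising actions would be taken around the send. */
static void toy_worker_inject_preempt(struct toy_guc *guc)
{
    if (guc->state != TOY_QUEUED)
        return;
    guc->actions_sent++;
    guc->state = TOY_ACTION_SENT;
}

/* Hardware side: the preempt context completes and writes the marker
 * (a user interrupt would then wake the submission path). */
static void toy_hw_preempt_context_complete(struct toy_guc *guc)
{
    if (guc->state != TOY_ACTION_SENT)
        return;
    guc->status_page = TOY_PREEMPT_FINISHED;
    guc->state = TOY_CTX_DONE;
}

/* Firmware side: finishes its own processing after the context is done. */
static void toy_fw_finish_processing(struct toy_guc *guc)
{
    if (guc->state == TOY_CTX_DONE)
        guc->state = TOY_READY;
}

/* Submission path: the marker alone is not enough, the firmware must have
 * finished as well. On success the request is retired. */
static bool toy_can_submit(struct toy_guc *guc)
{
    if (guc->status_page != TOY_PREEMPT_FINISHED)
        return false;
    if (guc->state != TOY_READY)
        return false;
    guc->status_page = 0;
    guc->state = TOY_IDLE;
    return true;
}
```

The point of the two checks in `toy_can_submit()` is the one the message makes: the preempt context completing does not by itself mean new work can be submitted.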
  23. 26 Oct, 2017 2 commits
  24. 24 Oct, 2017 2 commits