1. 26 11月, 2021 2 次提交
    • T
      drm/i915: Use __GFP_KSWAPD_RECLAIM in the capture code · 8b91cdd4
      Thomas Hellström 提交于
      The capture code is typically run entirely in the fence signalling
      critical path. We're about to add lockdep annotation in an upcoming patch
      which reveals a lockdep splat similar to the below one.
      
      Fix the associated potential deadlocks using __GFP_KSWAPD_RECLAIM
      (which is the same as GFP_WAIT, but open-coded for clarity) rather than
      GFP_KERNEL for memory allocation in the capture path. This has the
      potential drawback that capture might fail in situations with memory
      pressure.
      
      [  234.842048] WARNING: possible circular locking dependency detected
      [  234.842050] 5.15.0-rc7+ #20 Tainted: G     U  W
      [  234.842052] ------------------------------------------------------
      [  234.842054] gem_exec_captur/1180 is trying to acquire lock:
      [  234.842056] ffffffffa3e51c00 (fs_reclaim){+.+.}-{0:0}, at: __kmalloc+0x4d/0x330
      [  234.842063]
                     but task is already holding lock:
      [  234.842064] ffffffffa3f57620 (dma_fence_map){++++}-{0:0}, at: i915_vma_snapshot_resource_pin+0x27/0x30 [i915]
      [  234.842138]
                     which lock already depends on the new lock.
      
      [  234.842140]
                     the existing dependency chain (in reverse order) is:
      [  234.842142]
                     -> #2 (dma_fence_map){++++}-{0:0}:
      [  234.842145]        __dma_fence_might_wait+0x41/0xa0
      [  234.842149]        dma_resv_lockdep+0x1dc/0x28f
      [  234.842151]        do_one_initcall+0x58/0x2d0
      [  234.842154]        kernel_init_freeable+0x273/0x2bf
      [  234.842157]        kernel_init+0x16/0x120
      [  234.842160]        ret_from_fork+0x1f/0x30
      [  234.842163]
                     -> #1 (mmu_notifier_invalidate_range_start){+.+.}-{0:0}:
      [  234.842166]        fs_reclaim_acquire+0x6d/0xd0
      [  234.842168]        __kmalloc_node+0x51/0x3a0
      [  234.842171]        alloc_cpumask_var_node+0x1b/0x30
      [  234.842174]        native_smp_prepare_cpus+0xc7/0x292
      [  234.842177]        kernel_init_freeable+0x160/0x2bf
      [  234.842179]        kernel_init+0x16/0x120
      [  234.842181]        ret_from_fork+0x1f/0x30
      [  234.842184]
                     -> #0 (fs_reclaim){+.+.}-{0:0}:
      [  234.842186]        __lock_acquire+0x1161/0x1dc0
      [  234.842189]        lock_acquire+0xb5/0x2b0
      [  234.842192]        fs_reclaim_acquire+0xa1/0xd0
      [  234.842193]        __kmalloc+0x4d/0x330
      [  234.842196]        i915_vma_coredump_create+0x78/0x5b0 [i915]
      [  234.842253]        intel_engine_coredump_add_vma+0x36/0xe0 [i915]
      [  234.842307]        __i915_gpu_coredump+0x290/0x5e0 [i915]
      [  234.842365]        i915_capture_error_state+0x57/0xa0 [i915]
      [  234.842415]        intel_gt_handle_error+0x348/0x3e0 [i915]
      [  234.842462]        intel_gt_debugfs_reset_store+0x3c/0x90 [i915]
      [  234.842504]        simple_attr_write+0xc1/0xe0
      [  234.842507]        full_proxy_write+0x53/0x80
      [  234.842509]        vfs_write+0xbc/0x350
      [  234.842513]        ksys_write+0x58/0xd0
      [  234.842514]        do_syscall_64+0x38/0x90
      [  234.842516]        entry_SYSCALL_64_after_hwframe+0x44/0xae
      [  234.842519]
                     other info that might help us debug this:
      
      [  234.842521] Chain exists of:
                       fs_reclaim --> mmu_notifier_invalidate_range_start --> dma_fence_map
      
      [  234.842526]  Possible unsafe locking scenario:
      
      [  234.842528]        CPU0                    CPU1
      [  234.842529]        ----                    ----
      [  234.842531]   lock(dma_fence_map);
      [  234.842532]                                lock(mmu_notifier_invalidate_range_start);
      [  234.842535]                                lock(dma_fence_map);
      [  234.842537]   lock(fs_reclaim);
      [  234.842539]
                      *** DEADLOCK ***
      
      [  234.842540] 4 locks held by gem_exec_captur/1180:
      [  234.842543]  #0: ffff9007812d9460 (sb_writers#17){.+.+}-{0:0}, at: ksys_write+0x58/0xd0
      [  234.842547]  #1: ffff900781d9ecb8 (&attr->mutex){+.+.}-{3:3}, at: simple_attr_write+0x3a/0xe0
      [  234.842552]  #2: ffffffffc11913a8 (capture_mutex){+.+.}-{3:3}, at: i915_capture_error_state+0x1a/0xa0 [i915]
      [  234.842602]  #3: ffffffffa3f57620 (dma_fence_map){++++}-{0:0}, at: i915_vma_snapshot_resource_pin+0x27/0x30 [i915]
      [  234.842656]
                     stack backtrace:
      [  234.842658] CPU: 0 PID: 1180 Comm: gem_exec_captur Tainted: G     U  W         5.15.0-rc7+ #20
      [  234.842661] Hardware name: ASUS System Product Name/PRIME B560M-A AC, BIOS 0403 01/26/2021
      [  234.842664] Call Trace:
      [  234.842666]  dump_stack_lvl+0x57/0x72
      [  234.842669]  check_noncircular+0xde/0x100
      [  234.842672]  ? __lock_acquire+0x3bf/0x1dc0
      [  234.842675]  __lock_acquire+0x1161/0x1dc0
      [  234.842678]  lock_acquire+0xb5/0x2b0
      [  234.842680]  ? __kmalloc+0x4d/0x330
      [  234.842683]  ? finish_task_switch.isra.0+0xf2/0x360
      [  234.842686]  ? i915_vma_coredump_create+0x78/0x5b0 [i915]
      [  234.842734]  fs_reclaim_acquire+0xa1/0xd0
      [  234.842737]  ? __kmalloc+0x4d/0x330
      [  234.842739]  __kmalloc+0x4d/0x330
      [  234.842742]  i915_vma_coredump_create+0x78/0x5b0 [i915]
      [  234.842793]  ? capture_vma+0xbe/0x110 [i915]
      [  234.842844]  intel_engine_coredump_add_vma+0x36/0xe0 [i915]
      [  234.842892]  __i915_gpu_coredump+0x290/0x5e0 [i915]
      [  234.842939]  i915_capture_error_state+0x57/0xa0 [i915]
      [  234.842985]  intel_gt_handle_error+0x348/0x3e0 [i915]
      [  234.843032]  ? __mutex_lock+0x81/0x830
      [  234.843035]  ? simple_attr_write+0x3a/0xe0
      [  234.843038]  ? __lock_acquire+0x3bf/0x1dc0
      [  234.843041]  intel_gt_debugfs_reset_store+0x3c/0x90 [i915]
      [  234.843083]  ? _copy_from_user+0x45/0x80
      [  234.843086]  simple_attr_write+0xc1/0xe0
      [  234.843089]  full_proxy_write+0x53/0x80
      [  234.843091]  vfs_write+0xbc/0x350
      [  234.843094]  ksys_write+0x58/0xd0
      [  234.843096]  do_syscall_64+0x38/0x90
      [  234.843098]  entry_SYSCALL_64_after_hwframe+0x44/0xae
      [  234.843101] RIP: 0033:0x7fa467480877
      [  234.843103] Code: 75 05 48 83 c4 58 c3 e8 37 4e ff ff 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
      [  234.843108] RSP: 002b:00007ffd14d79b08 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      [  234.843112] RAX: ffffffffffffffda RBX: 00007ffd14d79b60 RCX: 00007fa467480877
      [  234.843114] RDX: 0000000000000014 RSI: 00007ffd14d79b60 RDI: 0000000000000007
      [  234.843116] RBP: 0000000000000007 R08: 0000000000000000 R09: 00007ffd14d79ab0
      [  234.843119] R10: ffffffffffffffff R11: 0000000000000246 R12: 0000000000000014
      [  234.843121] R13: 0000000000000000 R14: 00007ffd14d79b60 R15: 0000000000000005
      
      v5:
      - Use __GFP_KSWAPD_RECLAIM rather than __GFP_NOWAIT for clarity.
        (Daniel Vetter)
      v6:
      - Include an instance in execlists_capture_work().
      - Rework the commit message due to patch reordering.
      Signed-off-by: NThomas Hellström <thomas.hellstrom@linux.intel.com>
      Reviewed-by: NRamalingam C <ramalingam.c@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20211108174547.979714-3-thomas.hellstrom@linux.intel.com
      8b91cdd4
    • T
      drm/i915: Avoid allocating a page array for the gpu coredump · e45b98ba
      Thomas Hellström 提交于
      The gpu coredump typically takes place in a dma_fence signalling
      critical path, and hence can't use GFP_KERNEL allocations, as that
      means we might hit deadlocks under memory pressure. However
      changing to __GFP_KSWAPD_RECLAIM which will be done in an upcoming
      patch will instead mean a lower chance of the allocation succeeding.
      In particular large contigous allocations like the coredump page
      vector.
      Remove the page vector in favor of a linked list of single pages.
      Use the page lru list head as the list link, as the page owner is
      allowed to do that.
      Signed-off-by: NThomas Hellström <thomas.hellstrom@linux.intel.com>
      Reviewed-by: NRamalingam C <ramalingam.c@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20211108174547.979714-2-thomas.hellstrom@linux.intel.com
      e45b98ba
  2. 21 9月, 2021 1 次提交
  3. 12 8月, 2021 2 次提交
  4. 11 8月, 2021 2 次提交
  5. 28 7月, 2021 1 次提交
  6. 25 6月, 2021 1 次提交
  7. 19 6月, 2021 1 次提交
  8. 07 6月, 2021 1 次提交
  9. 03 6月, 2021 1 次提交
  10. 20 5月, 2021 4 次提交
  11. 07 5月, 2021 1 次提交
  12. 02 2月, 2021 1 次提交
  13. 20 1月, 2021 1 次提交
  14. 24 12月, 2020 1 次提交
  15. 09 11月, 2020 2 次提交
  16. 20 10月, 2020 1 次提交
    • C
      drm/i915: Use the active reference on the vma while capturing · db9bc2d3
      Chris Wilson 提交于
      During error capture, we need to take a reference to the vma from before
      the reset in order to catpure the contents of the vma later. Currently
      we are using both an active reference and a kref, but due to nature of
      the i915_vma reference handling, that kref is on the vma->obj and not
      the vma itself. This means the vma may be destroyed as soon as it is
      idle, that is in between the i915_active_release(&vma->active) and the
      i915_vma_put(vma):
      
      <3> [197.866181] BUG: KASAN: use-after-free in intel_engine_coredump_add_vma+0x36c/0x4a0 [i915]
      <3> [197.866339] Read of size 8 at addr ffff8881258cb800 by task gem_exec_captur/1041
      <3> [197.866467]
      <4> [197.866512] CPU: 2 PID: 1041 Comm: gem_exec_captur Not tainted 5.9.0-g5e4234f97efba-kasan_200+ #1
      <4> [197.866521] Hardware name: Intel Corp. Broxton P/Apollolake RVP1A, BIOS APLKRVPA.X64.0150.B11.1608081044 08/08/2016
      <4> [197.866530] Call Trace:
      <4> [197.866549]  dump_stack+0x99/0xd0
      <4> [197.866760]  ? intel_engine_coredump_add_vma+0x36c/0x4a0 [i915]
      <4> [197.866783]  print_address_description.constprop.8+0x3e/0x60
      <4> [197.866797]  ? kmsg_dump_rewind_nolock+0xd4/0xd4
      <4> [197.866819]  ? lockdep_hardirqs_off+0xd4/0x120
      <4> [197.867037]  ? intel_engine_coredump_add_vma+0x36c/0x4a0 [i915]
      <4> [197.867249]  ? intel_engine_coredump_add_vma+0x36c/0x4a0 [i915]
      <4> [197.867270]  kasan_report.cold.10+0x1f/0x37
      <4> [197.867492]  ? intel_engine_coredump_add_vma+0x36c/0x4a0 [i915]
      <4> [197.867710]  intel_engine_coredump_add_vma+0x36c/0x4a0 [i915]
      <4> [197.867949]  i915_gpu_coredump.part.29+0x150/0x7b0 [i915]
      <4> [197.868186]  i915_capture_error_state+0x5e/0xc0 [i915]
      <4> [197.868396]  intel_gt_handle_error+0x6eb/0xa20 [i915]
      <4> [197.868624]  ? intel_gt_reset_global+0x370/0x370 [i915]
      <4> [197.868644]  ? check_flags+0x50/0x50
      <4> [197.868662]  ? __lock_acquire+0xd59/0x6b00
      <4> [197.868678]  ? register_lock_class+0x1ad0/0x1ad0
      <4> [197.868944]  i915_wedged_set+0xcf/0x1b0 [i915]
      <4> [197.869147]  ? i915_wedged_get+0x90/0x90 [i915]
      <4> [197.869371]  ? i915_wedged_get+0x90/0x90 [i915]
      <4> [197.869398]  simple_attr_write+0x153/0x1c0
      <4> [197.869428]  full_proxy_write+0xee/0x180
      <4> [197.869442]  ? __sb_start_write+0x1f3/0x310
      <4> [197.869465]  vfs_write+0x1a3/0x640
      <4> [197.869492]  ksys_write+0xec/0x1c0
      <4> [197.869507]  ? __ia32_sys_read+0xa0/0xa0
      <4> [197.869525]  ? lockdep_hardirqs_on_prepare+0x32b/0x4e0
      <4> [197.869541]  ? syscall_enter_from_user_mode+0x1c/0x50
      <4> [197.869566]  do_syscall_64+0x33/0x80
      <4> [197.869579]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      <4> [197.869590] RIP: 0033:0x7fd8b7aee281
      <4> [197.869604] Code: c3 0f 1f 84 00 00 00 00 00 48 8b 05 59 8d 20 00 c3 0f 1f 84 00 00 00 00 00 8b 05 8a d1 20 00 85 c0 75 16 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 57 f3 c3 0f 1f 44 00 00 41 54 55 49 89 d4 53
      <4> [197.869613] RSP: 002b:00007ffea3b72008 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      <4> [197.869625] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fd8b7aee281
      <4> [197.869633] RDX: 0000000000000002 RSI: 00007fd8b81a82e7 RDI: 000000000000000d
      <4> [197.869641] RBP: 0000000000000002 R08: 0000000000000000 R09: 0000000000000034
      <4> [197.869650] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fd8b81a82e7
      <4> [197.869658] R13: 000000000000000d R14: 0000000000000000 R15: 0000000000000000
      <3> [197.869707]
      <3> [197.869757] Allocated by task 1041:
      <4> [197.869833]  kasan_save_stack+0x19/0x40
      <4> [197.869843]  __kasan_kmalloc.constprop.5+0xc1/0xd0
      <4> [197.869853]  kmem_cache_alloc+0x106/0x8e0
      <4> [197.870059]  i915_vma_instance+0x212/0x1930 [i915]
      <4> [197.870270]  eb_lookup_vmas+0xe06/0x1d10 [i915]
      <4> [197.870475]  i915_gem_do_execbuffer+0x131d/0x4080 [i915]
      <4> [197.870682]  i915_gem_execbuffer2_ioctl+0x103/0x5d0 [i915]
      <4> [197.870701]  drm_ioctl_kernel+0x1d2/0x270
      <4> [197.870710]  drm_ioctl+0x40d/0x85c
      <4> [197.870721]  __x64_sys_ioctl+0x10d/0x170
      <4> [197.870731]  do_syscall_64+0x33/0x80
      <4> [197.870740]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      <3> [197.870748]
      <3> [197.870798] Freed by task 22:
      <4> [197.870865]  kasan_save_stack+0x19/0x40
      <4> [197.870875]  kasan_set_track+0x1c/0x30
      <4> [197.870884]  kasan_set_free_info+0x1b/0x30
      <4> [197.870894]  __kasan_slab_free+0x111/0x160
      <4> [197.870903]  kmem_cache_free+0xcd/0x710
      <4> [197.871109]  i915_vma_parked+0x618/0x800 [i915]
      <4> [197.871307]  __gt_park+0xdb/0x1e0 [i915]
      <4> [197.871501]  ____intel_wakeref_put_last+0xb1/0x190 [i915]
      <4> [197.871516]  process_one_work+0x8dc/0x15d0
      <4> [197.871525]  worker_thread+0x82/0xb30
      <4> [197.871535]  kthread+0x36d/0x440
      <4> [197.871545]  ret_from_fork+0x22/0x30
      <3> [197.871553]
      <3> [197.871602] The buggy address belongs to the object at ffff8881258cb740
       which belongs to the cache i915_vma of size 968
      
      Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2553
      Fixes: 2850748e ("drm/i915: Pull i915_vma_pin under the vm->mutex")
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: <stable@vger.kernel.org> # v5.5+
      Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201016092527.29039-1-chris@chris-wilson.co.uk
      (cherry picked from commit 178536b8)
      Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
      db9bc2d3
  17. 16 10月, 2020 1 次提交
    • C
      drm/i915: Use the active reference on the vma while capturing · 178536b8
      Chris Wilson 提交于
      During error capture, we need to take a reference to the vma from before
      the reset in order to catpure the contents of the vma later. Currently
      we are using both an active reference and a kref, but due to nature of
      the i915_vma reference handling, that kref is on the vma->obj and not
      the vma itself. This means the vma may be destroyed as soon as it is
      idle, that is in between the i915_active_release(&vma->active) and the
      i915_vma_put(vma):
      
      <3> [197.866181] BUG: KASAN: use-after-free in intel_engine_coredump_add_vma+0x36c/0x4a0 [i915]
      <3> [197.866339] Read of size 8 at addr ffff8881258cb800 by task gem_exec_captur/1041
      <3> [197.866467]
      <4> [197.866512] CPU: 2 PID: 1041 Comm: gem_exec_captur Not tainted 5.9.0-g5e4234f97efba-kasan_200+ #1
      <4> [197.866521] Hardware name: Intel Corp. Broxton P/Apollolake RVP1A, BIOS APLKRVPA.X64.0150.B11.1608081044 08/08/2016
      <4> [197.866530] Call Trace:
      <4> [197.866549]  dump_stack+0x99/0xd0
      <4> [197.866760]  ? intel_engine_coredump_add_vma+0x36c/0x4a0 [i915]
      <4> [197.866783]  print_address_description.constprop.8+0x3e/0x60
      <4> [197.866797]  ? kmsg_dump_rewind_nolock+0xd4/0xd4
      <4> [197.866819]  ? lockdep_hardirqs_off+0xd4/0x120
      <4> [197.867037]  ? intel_engine_coredump_add_vma+0x36c/0x4a0 [i915]
      <4> [197.867249]  ? intel_engine_coredump_add_vma+0x36c/0x4a0 [i915]
      <4> [197.867270]  kasan_report.cold.10+0x1f/0x37
      <4> [197.867492]  ? intel_engine_coredump_add_vma+0x36c/0x4a0 [i915]
      <4> [197.867710]  intel_engine_coredump_add_vma+0x36c/0x4a0 [i915]
      <4> [197.867949]  i915_gpu_coredump.part.29+0x150/0x7b0 [i915]
      <4> [197.868186]  i915_capture_error_state+0x5e/0xc0 [i915]
      <4> [197.868396]  intel_gt_handle_error+0x6eb/0xa20 [i915]
      <4> [197.868624]  ? intel_gt_reset_global+0x370/0x370 [i915]
      <4> [197.868644]  ? check_flags+0x50/0x50
      <4> [197.868662]  ? __lock_acquire+0xd59/0x6b00
      <4> [197.868678]  ? register_lock_class+0x1ad0/0x1ad0
      <4> [197.868944]  i915_wedged_set+0xcf/0x1b0 [i915]
      <4> [197.869147]  ? i915_wedged_get+0x90/0x90 [i915]
      <4> [197.869371]  ? i915_wedged_get+0x90/0x90 [i915]
      <4> [197.869398]  simple_attr_write+0x153/0x1c0
      <4> [197.869428]  full_proxy_write+0xee/0x180
      <4> [197.869442]  ? __sb_start_write+0x1f3/0x310
      <4> [197.869465]  vfs_write+0x1a3/0x640
      <4> [197.869492]  ksys_write+0xec/0x1c0
      <4> [197.869507]  ? __ia32_sys_read+0xa0/0xa0
      <4> [197.869525]  ? lockdep_hardirqs_on_prepare+0x32b/0x4e0
      <4> [197.869541]  ? syscall_enter_from_user_mode+0x1c/0x50
      <4> [197.869566]  do_syscall_64+0x33/0x80
      <4> [197.869579]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      <4> [197.869590] RIP: 0033:0x7fd8b7aee281
      <4> [197.869604] Code: c3 0f 1f 84 00 00 00 00 00 48 8b 05 59 8d 20 00 c3 0f 1f 84 00 00 00 00 00 8b 05 8a d1 20 00 85 c0 75 16 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 57 f3 c3 0f 1f 44 00 00 41 54 55 49 89 d4 53
      <4> [197.869613] RSP: 002b:00007ffea3b72008 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      <4> [197.869625] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fd8b7aee281
      <4> [197.869633] RDX: 0000000000000002 RSI: 00007fd8b81a82e7 RDI: 000000000000000d
      <4> [197.869641] RBP: 0000000000000002 R08: 0000000000000000 R09: 0000000000000034
      <4> [197.869650] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fd8b81a82e7
      <4> [197.869658] R13: 000000000000000d R14: 0000000000000000 R15: 0000000000000000
      <3> [197.869707]
      <3> [197.869757] Allocated by task 1041:
      <4> [197.869833]  kasan_save_stack+0x19/0x40
      <4> [197.869843]  __kasan_kmalloc.constprop.5+0xc1/0xd0
      <4> [197.869853]  kmem_cache_alloc+0x106/0x8e0
      <4> [197.870059]  i915_vma_instance+0x212/0x1930 [i915]
      <4> [197.870270]  eb_lookup_vmas+0xe06/0x1d10 [i915]
      <4> [197.870475]  i915_gem_do_execbuffer+0x131d/0x4080 [i915]
      <4> [197.870682]  i915_gem_execbuffer2_ioctl+0x103/0x5d0 [i915]
      <4> [197.870701]  drm_ioctl_kernel+0x1d2/0x270
      <4> [197.870710]  drm_ioctl+0x40d/0x85c
      <4> [197.870721]  __x64_sys_ioctl+0x10d/0x170
      <4> [197.870731]  do_syscall_64+0x33/0x80
      <4> [197.870740]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      <3> [197.870748]
      <3> [197.870798] Freed by task 22:
      <4> [197.870865]  kasan_save_stack+0x19/0x40
      <4> [197.870875]  kasan_set_track+0x1c/0x30
      <4> [197.870884]  kasan_set_free_info+0x1b/0x30
      <4> [197.870894]  __kasan_slab_free+0x111/0x160
      <4> [197.870903]  kmem_cache_free+0xcd/0x710
      <4> [197.871109]  i915_vma_parked+0x618/0x800 [i915]
      <4> [197.871307]  __gt_park+0xdb/0x1e0 [i915]
      <4> [197.871501]  ____intel_wakeref_put_last+0xb1/0x190 [i915]
      <4> [197.871516]  process_one_work+0x8dc/0x15d0
      <4> [197.871525]  worker_thread+0x82/0xb30
      <4> [197.871535]  kthread+0x36d/0x440
      <4> [197.871545]  ret_from_fork+0x22/0x30
      <3> [197.871553]
      <3> [197.871602] The buggy address belongs to the object at ffff8881258cb740
       which belongs to the cache i915_vma of size 968
      
      Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2553
      Fixes: 2850748e ("drm/i915: Pull i915_vma_pin under the vm->mutex")
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: <stable@vger.kernel.org> # v5.5+
      Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201016092527.29039-1-chris@chris-wilson.co.uk
      178536b8
  18. 01 10月, 2020 1 次提交
  19. 18 9月, 2020 2 次提交
  20. 24 8月, 2020 1 次提交
  21. 11 7月, 2020 1 次提交
  22. 09 7月, 2020 3 次提交
  23. 23 6月, 2020 1 次提交
    • J
      drm/i915/params: switch to device specific parameters · 8a25c4be
      Jani Nikula 提交于
      Start using device specific parameters instead of module parameters for
      most things. The module parameters become the immutable initial values
      for i915 parameters. The device specific parameters in i915->params
      start life as a copy of i915_modparams. Any later changes are only
      reflected in the debugfs.
      
      The stragglers are:
      
      * i915.force_probe and i915.modeset. Needed before dev_priv is
        available. This is fine because the parameters are read-only and never
        modified.
      
      * i915.verbose_state_checks. Passing dev_priv to I915_STATE_WARN and
        I915_STATE_WARN_ON would result in massive and ugly churn. This is
        handled by not exposing the parameter via debugfs, and leaving the
        parameter writable in sysfs. This may be fixed up in follow-up work.
      
      * i915.inject_probe_failure. Only makes sense in terms of the module,
        not the device. This is handled by not exposing the parameter via
        debugfs.
      
      v2: Fix uc i915 lookup code (Michał Winiarski)
      
      Cc: Juha-Pekka Heikkilä <juha-pekka.heikkila@intel.com>
      Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
      Cc: Michał Winiarski <michal.winiarski@intel.com>
      Reviewed-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
      Acked-by: NMichał Winiarski <michal.winiarski@intel.com>
      Signed-off-by: NJani Nikula <jani.nikula@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20200618150402.14022-1-jani.nikula@intel.com
      8a25c4be
  24. 05 5月, 2020 1 次提交
    • C
      drm/i915: Avoid dereferencing a dead context · 30523408
      Chris Wilson 提交于
      Once the intel_context is closed, the GEM context may be freed and so
      the link from intel_context.gem_context is invalid.
      
      <3>[  219.782944] BUG: KASAN: use-after-free in intel_engine_coredump_alloc+0x1bc3/0x2250 [i915]
      <3>[  219.782996] Read of size 8 at addr ffff8881d7dff0b8 by task kworker/0:1/12
      
      <4>[  219.783052] CPU: 0 PID: 12 Comm: kworker/0:1 Tainted: G     U            5.7.0-rc2-g1f3ffd7683d54-kasan_118+ #1
      <4>[  219.783055] Hardware name: System manufacturer System Product Name/Z170 PRO GAMING, BIOS 3402 04/26/2017
      <4>[  219.783105] Workqueue: events heartbeat [i915]
      <4>[  219.783109] Call Trace:
      <4>[  219.783113]  <IRQ>
      <4>[  219.783119]  dump_stack+0x96/0xdb
      <4>[  219.783177]  ? intel_engine_coredump_alloc+0x1bc3/0x2250 [i915]
      <4>[  219.783182]  print_address_description.constprop.6+0x16/0x310
      <4>[  219.783239]  ? intel_engine_coredump_alloc+0x1bc3/0x2250 [i915]
      <4>[  219.783295]  ? intel_engine_coredump_alloc+0x1bc3/0x2250 [i915]
      <4>[  219.783300]  __kasan_report+0x137/0x190
      <4>[  219.783359]  ? intel_engine_coredump_alloc+0x1bc3/0x2250 [i915]
      <4>[  219.783366]  kasan_report+0x32/0x50
      <4>[  219.783426]  intel_engine_coredump_alloc+0x1bc3/0x2250 [i915]
      <4>[  219.783481]  execlists_reset+0x39c/0x13d0 [i915]
      <4>[  219.783494]  ? mark_held_locks+0x9e/0xe0
      <4>[  219.783546]  ? execlists_hold+0xfc0/0xfc0 [i915]
      <4>[  219.783551]  ? lockdep_hardirqs_on+0x348/0x5f0
      <4>[  219.783557]  ? _raw_spin_unlock_irqrestore+0x34/0x60
      <4>[  219.783606]  ? execlists_submission_tasklet+0x118/0x3a0 [i915]
      <4>[  219.783615]  tasklet_action_common.isra.14+0x13b/0x410
      <4>[  219.783623]  ? __do_softirq+0x1e4/0x9a7
      <4>[  219.783630]  __do_softirq+0x226/0x9a7
      <4>[  219.783643]  do_softirq_own_stack+0x2a/0x40
      <4>[  219.783647]  </IRQ>
      <4>[  219.783692]  ? heartbeat+0x3e2/0x10f0 [i915]
      <4>[  219.783696]  do_softirq.part.13+0x49/0x50
      <4>[  219.783700]  __local_bh_enable_ip+0x1a2/0x1e0
      <4>[  219.783748]  heartbeat+0x409/0x10f0 [i915]
      <4>[  219.783801]  ? __live_idle_pulse+0x9f0/0x9f0 [i915]
      <4>[  219.783806]  ? lock_acquire+0x1ac/0x8a0
      <4>[  219.783811]  ? process_one_work+0x811/0x1870
      <4>[  219.783827]  ? rcu_read_lock_sched_held+0x9c/0xd0
      <4>[  219.783832]  ? rcu_read_lock_bh_held+0xb0/0xb0
      <4>[  219.783836]  ? _raw_spin_unlock_irq+0x1f/0x40
      <4>[  219.783845]  process_one_work+0x8ca/0x1870
      <4>[  219.783848]  ? lock_acquire+0x1ac/0x8a0
      <4>[  219.783852]  ? worker_thread+0x1d0/0xb80
      <4>[  219.783864]  ? pwq_dec_nr_in_flight+0x2c0/0x2c0
      <4>[  219.783870]  ? do_raw_spin_lock+0x129/0x290
      <4>[  219.783886]  worker_thread+0x82/0xb80
      <4>[  219.783895]  ? __kthread_parkme+0xaf/0x1b0
      <4>[  219.783902]  ? process_one_work+0x1870/0x1870
      <4>[  219.783906]  kthread+0x34e/0x420
      <4>[  219.783911]  ? kthread_create_on_node+0xc0/0xc0
      <4>[  219.783918]  ret_from_fork+0x3a/0x50
      
      <3>[  219.783950] Allocated by task 1264:
      <4>[  219.783975]  save_stack+0x19/0x40
      <4>[  219.783978]  __kasan_kmalloc.constprop.3+0xa0/0xd0
      <4>[  219.784029]  i915_gem_create_context+0xa2/0xab8 [i915]
      <4>[  219.784081]  i915_gem_context_create_ioctl+0x1fa/0x450 [i915]
      <4>[  219.784085]  drm_ioctl_kernel+0x1d8/0x270
      <4>[  219.784088]  drm_ioctl+0x676/0x930
      <4>[  219.784092]  ksys_ioctl+0xb7/0xe0
      <4>[  219.784096]  __x64_sys_ioctl+0x6a/0xb0
      <4>[  219.784100]  do_syscall_64+0x94/0x530
      <4>[  219.784103]  entry_SYSCALL_64_after_hwframe+0x49/0xb3
      
      <3>[  219.784120] Freed by task 12:
      <4>[  219.784141]  save_stack+0x19/0x40
      <4>[  219.784145]  __kasan_slab_free+0x130/0x180
      <4>[  219.784148]  kmem_cache_free_bulk+0x1bd/0x500
      <4>[  219.784152]  kfree_rcu_work+0x1d8/0x890
      <4>[  219.784155]  process_one_work+0x8ca/0x1870
      <4>[  219.784158]  worker_thread+0x82/0xb80
      <4>[  219.784162]  kthread+0x34e/0x420
      <4>[  219.784165]  ret_from_fork+0x3a/0x50
      
      Fixes: 2e46a2a0 ("drm/i915: Use explicit flag to mark unreachable intel_context")
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Acked-by: NAkeem G Abodunrin <akeem.g.abodunrin@intel.com>
      Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20200428090255.10035-1-chris@chris-wilson.co.uk
      (cherry picked from commit 24aac336)
      Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
      30523408
  25. 30 4月, 2020 1 次提交
    • C
      drm/i915/gt: Keep a no-frills swappable copy of the default context state · be1cb55a
      Chris Wilson 提交于
      We need to keep the default context state around to instantiate new
      contexts (aka golden rendercontext), and we also keep it pinned while
      the engine is active so that we can quickly reset a hanging context.
      However, the default contexts are large enough to merit keeping in
      swappable memory as opposed to kernel memory, so we store them inside
      shmemfs. Currently, we use the normal GEM objects to create the default
      context image, but we can throw away all but the shmemfs file.
      
      This greatly simplifies the tricky power management code which wants to
      run underneath the normal GT locking, and we definitely do not want to
      use any high level objects that may appear to recurse back into the GT.
      Though perhaps the primary advantage of the complex GEM object is that
      we aggressively cache the mapping, but here we are recreating the
      vm_area everytime time we unpark. At the worst, we add a lightweight
      cache, but first find a microbenchmark that is impacted.
      
      Having started to create some utility functions to make working with
      shmemfs objects easier, we can start putting them to wider use, where
      GEM objects are overkill, such as storing persistent error state.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Matthew Auld <matthew.auld@intel.com>
      Cc: Ramalingam C <ramalingam.c@intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20200429172429.6054-1-chris@chris-wilson.co.uk
      be1cb55a
  26. 29 4月, 2020 1 次提交
    • C
      drm/i915: Avoid dereferencing a dead context · 24aac336
      Chris Wilson 提交于
      Once the intel_context is closed, the GEM context may be freed and so
      the link from intel_context.gem_context is invalid.
      
      <3>[  219.782944] BUG: KASAN: use-after-free in intel_engine_coredump_alloc+0x1bc3/0x2250 [i915]
      <3>[  219.782996] Read of size 8 at addr ffff8881d7dff0b8 by task kworker/0:1/12
      
      <4>[  219.783052] CPU: 0 PID: 12 Comm: kworker/0:1 Tainted: G     U            5.7.0-rc2-g1f3ffd7683d54-kasan_118+ #1
      <4>[  219.783055] Hardware name: System manufacturer System Product Name/Z170 PRO GAMING, BIOS 3402 04/26/2017
      <4>[  219.783105] Workqueue: events heartbeat [i915]
      <4>[  219.783109] Call Trace:
      <4>[  219.783113]  <IRQ>
      <4>[  219.783119]  dump_stack+0x96/0xdb
      <4>[  219.783177]  ? intel_engine_coredump_alloc+0x1bc3/0x2250 [i915]
      <4>[  219.783182]  print_address_description.constprop.6+0x16/0x310
      <4>[  219.783239]  ? intel_engine_coredump_alloc+0x1bc3/0x2250 [i915]
      <4>[  219.783295]  ? intel_engine_coredump_alloc+0x1bc3/0x2250 [i915]
      <4>[  219.783300]  __kasan_report+0x137/0x190
      <4>[  219.783359]  ? intel_engine_coredump_alloc+0x1bc3/0x2250 [i915]
      <4>[  219.783366]  kasan_report+0x32/0x50
      <4>[  219.783426]  intel_engine_coredump_alloc+0x1bc3/0x2250 [i915]
      <4>[  219.783481]  execlists_reset+0x39c/0x13d0 [i915]
      <4>[  219.783494]  ? mark_held_locks+0x9e/0xe0
      <4>[  219.783546]  ? execlists_hold+0xfc0/0xfc0 [i915]
      <4>[  219.783551]  ? lockdep_hardirqs_on+0x348/0x5f0
      <4>[  219.783557]  ? _raw_spin_unlock_irqrestore+0x34/0x60
      <4>[  219.783606]  ? execlists_submission_tasklet+0x118/0x3a0 [i915]
      <4>[  219.783615]  tasklet_action_common.isra.14+0x13b/0x410
      <4>[  219.783623]  ? __do_softirq+0x1e4/0x9a7
      <4>[  219.783630]  __do_softirq+0x226/0x9a7
      <4>[  219.783643]  do_softirq_own_stack+0x2a/0x40
      <4>[  219.783647]  </IRQ>
      <4>[  219.783692]  ? heartbeat+0x3e2/0x10f0 [i915]
      <4>[  219.783696]  do_softirq.part.13+0x49/0x50
      <4>[  219.783700]  __local_bh_enable_ip+0x1a2/0x1e0
      <4>[  219.783748]  heartbeat+0x409/0x10f0 [i915]
      <4>[  219.783801]  ? __live_idle_pulse+0x9f0/0x9f0 [i915]
      <4>[  219.783806]  ? lock_acquire+0x1ac/0x8a0
      <4>[  219.783811]  ? process_one_work+0x811/0x1870
      <4>[  219.783827]  ? rcu_read_lock_sched_held+0x9c/0xd0
      <4>[  219.783832]  ? rcu_read_lock_bh_held+0xb0/0xb0
      <4>[  219.783836]  ? _raw_spin_unlock_irq+0x1f/0x40
      <4>[  219.783845]  process_one_work+0x8ca/0x1870
      <4>[  219.783848]  ? lock_acquire+0x1ac/0x8a0
      <4>[  219.783852]  ? worker_thread+0x1d0/0xb80
      <4>[  219.783864]  ? pwq_dec_nr_in_flight+0x2c0/0x2c0
      <4>[  219.783870]  ? do_raw_spin_lock+0x129/0x290
      <4>[  219.783886]  worker_thread+0x82/0xb80
      <4>[  219.783895]  ? __kthread_parkme+0xaf/0x1b0
      <4>[  219.783902]  ? process_one_work+0x1870/0x1870
      <4>[  219.783906]  kthread+0x34e/0x420
      <4>[  219.783911]  ? kthread_create_on_node+0xc0/0xc0
      <4>[  219.783918]  ret_from_fork+0x3a/0x50
      
      <3>[  219.783950] Allocated by task 1264:
      <4>[  219.783975]  save_stack+0x19/0x40
      <4>[  219.783978]  __kasan_kmalloc.constprop.3+0xa0/0xd0
      <4>[  219.784029]  i915_gem_create_context+0xa2/0xab8 [i915]
      <4>[  219.784081]  i915_gem_context_create_ioctl+0x1fa/0x450 [i915]
      <4>[  219.784085]  drm_ioctl_kernel+0x1d8/0x270
      <4>[  219.784088]  drm_ioctl+0x676/0x930
      <4>[  219.784092]  ksys_ioctl+0xb7/0xe0
      <4>[  219.784096]  __x64_sys_ioctl+0x6a/0xb0
      <4>[  219.784100]  do_syscall_64+0x94/0x530
      <4>[  219.784103]  entry_SYSCALL_64_after_hwframe+0x49/0xb3
      
      <3>[  219.784120] Freed by task 12:
      <4>[  219.784141]  save_stack+0x19/0x40
      <4>[  219.784145]  __kasan_slab_free+0x130/0x180
      <4>[  219.784148]  kmem_cache_free_bulk+0x1bd/0x500
      <4>[  219.784152]  kfree_rcu_work+0x1d8/0x890
      <4>[  219.784155]  process_one_work+0x8ca/0x1870
      <4>[  219.784158]  worker_thread+0x82/0xb80
      <4>[  219.784162]  kthread+0x34e/0x420
      <4>[  219.784165]  ret_from_fork+0x3a/0x50
      
      Fixes: 2e46a2a0 ("drm/i915: Use explicit flag to mark unreachable intel_context")
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Acked-by: NAkeem G Abodunrin <akeem.g.abodunrin@intel.com>
      Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20200428090255.10035-1-chris@chris-wilson.co.uk
      24aac336
  27. 25 4月, 2020 1 次提交
  28. 08 4月, 2020 1 次提交
  29. 18 2月, 2020 1 次提交
  30. 17 2月, 2020 1 次提交