1. 17 May 2017 (1 commit)
    • drm/i915: Split execlist priority queue into rbtree + linked list · 6c067579
      Committed by Chris Wilson
      All the requests at the same priority are executed in FIFO order. They
      do not need to be stored in the rbtree themselves, as they are a simple
      list within a level. If we move the requests at one priority into a list,
      we can then reduce the rbtree to the set of priorities. This should keep
      the height of the rbtree small, as the number of active priorities cannot
      exceed the number of active requests and should typically be only a few.
      
      Currently, we have ~2k possible different priority levels, and that may
      increase to allow even finer-grained selection. Allocating those in
      advance seems a waste (and may be impossible), so we opt for allocating
      a level upon first use and freeing it once its requests are depleted. To
      avoid the possibility of an allocation failure causing us to lose a
      request, we preallocate the default priority (0) and bump any request to
      that priority if we fail to allocate the appropriate plist for it.
      Having a request (that is ready to run, so not leading to corruption)
      execute out-of-order is better than leaking the request (and its
      dependency tree) entirely.
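      As an illustration only (the struct and field names below are
      assumptions, not necessarily those used in the patch), the shape of such
      a queue is a small rbtree keyed by priority, each node carrying a plain
      request list, with the default level embedded so priority 0 cannot fail:

          #include <linux/rbtree.h>
          #include <linux/list.h>

          /* Sketch only: illustrative names, not the patch itself. */
          struct priolist {
                  struct rb_node node;          /* keyed by priority in the rbtree */
                  struct list_head requests;    /* FIFO of requests at this level */
                  int priority;
          };

          struct execlist_queue {
                  struct rb_root root;          /* one node per active priority level */
                  struct priolist default_prio; /* preallocated level for priority 0 */
                  bool no_priolist;             /* set after an allocation failure, cleared when idle */
          };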
      
      There should be a benefit to reducing execlists_dequeue() to principally
      using a simple list (and reducing the frequency of both rbtree iteration
      and balancing on erase) but for typical workloads, request coalescing
      should be small enough that we don't notice any change. The main gain is
      from improving PI calls to schedule, and the explicit list within a
      level should make request unwinding simpler (we just need to insert at
      the head of the list rather than the tail and not have to make the
      rbtree search more complicated).
      
      v2: Avoid use-after-free when deleting a depleted priolist
      
      v3: Michał found the solution to handling the allocation failure
      gracefully. If we disable all priority scheduling following the
      allocation failure, those requests will be executed in FIFO order and we
      ensure that this request and its dependencies stay in strict FIFO (even
      though the code does not realise it is operating on a single list).
      Normal scheduling is restored once we know the device is idle, until the
      next failure!
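      A rough sketch of how such a lookup-with-fallback could look, reusing
      the illustrative structures above (lookup_priolist is a hypothetical
      name and the details are assumptions, not the exact patch):

          #include <linux/slab.h>

          static struct priolist *lookup_priolist(struct execlist_queue *q, int prio)
          {
                  struct rb_node **p = &q->root.rb_node, *parent = NULL;
                  struct priolist *pl;

                  if (q->no_priolist)             /* earlier allocation failure: strict FIFO */
                          prio = 0;

                  while (*p) {
                          parent = *p;
                          pl = rb_entry(parent, struct priolist, node);
                          if (prio == pl->priority)
                                  return pl;
                          p = prio < pl->priority ? &parent->rb_left : &parent->rb_right;
                  }

                  if (prio == 0) {
                          pl = &q->default_prio;  /* preallocated, cannot fail */
                  } else {
                          pl = kmalloc(sizeof(*pl), GFP_ATOMIC);
                          if (!pl) {              /* bump to default rather than lose the request */
                                  q->no_priolist = true;
                                  return lookup_priolist(q, 0);
                          }
                  }

                  pl->priority = prio;
                  INIT_LIST_HEAD(&pl->requests);
                  rb_link_node(&pl->node, parent, p);
                  rb_insert_color(&pl->node, &q->root);
                  return pl;
          }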
      Suggested-by: Michał Wajdeczko <michal.wajdeczko@intel.com>
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Michał Winiarski <michal.winiarski@intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-8-chris@chris-wilson.co.uk
  2. 04 May 2017 (1 commit)
  3. 03 May 2017 (6 commits)
  4. 26 April 2017 (2 commits)
    • drm/i915: Confirm the request is still active before adding it to the await · 88326ef0
      Committed by Chris Wilson
      Although we do check the completion-status of the request before
      actually adding a wait on it (either to its submit fence or its
      completion dma-fence), we currently do not check before adding it to the
      dependency lists.
      
      In fact, without checking for a completed request we may try to use the
      signaler after it has been retired and its dependency tree freed:
      
      [   60.044057] BUG: KASAN: use-after-free in __list_add_valid+0x1d/0xd0 at addr ffff880348c9e6a0
      [   60.044118] Read of size 8 by task gem_exec_fence/530
      [   60.044164] CPU: 1 PID: 530 Comm: gem_exec_fence Tainted: G            E   4.11.0-rc7+ #46
      [   60.044226] Hardware name: (unreadable DMI strings), BIOS RYBDWi35.86A.0246.2
      [   60.044290] Call Trace:
      [   60.044337]  dump_stack+0x4d/0x6a
      [   60.044383]  kasan_object_err+0x21/0x70
      [   60.044435]  kasan_report+0x225/0x4e0
      [   60.044488]  ? __list_add_valid+0x1d/0xd0
      [   60.044534]  ? kasan_kmalloc+0xad/0xe0
      [   60.044587]  __asan_load8+0x5e/0x70
      [   60.044639]  __list_add_valid+0x1d/0xd0
      [   60.044788]  __i915_priotree_add_dependency+0x67/0x130 [i915]
      [   60.044895]  i915_gem_request_await_request+0xa8/0x370 [i915]
      [   60.044974]  i915_gem_request_await_dma_fence+0x129/0x140 [i915]
      [   60.045049]  i915_gem_do_execbuffer.isra.37+0xb0a/0x26b0 [i915]
      [   60.045077]  ? save_stack+0xb1/0xd0
      [   60.045105]  ? save_stack_trace+0x1b/0x20
      [   60.045132]  ? save_stack+0x46/0xd0
      [   60.045158]  ? kasan_kmalloc+0xad/0xe0
      [   60.045184]  ? __kmalloc+0xd8/0x670
      [   60.045229]  ? drm_ioctl+0x359/0x640 [drm]
      [   60.045256]  ? SyS_ioctl+0x41/0x70
      [   60.045330]  ? i915_vma_move_to_active+0x540/0x540 [i915]
      [   60.045360]  ? tty_insert_flip_string_flags+0xa1/0xf0
      [   60.045387]  ? tty_flip_buffer_push+0x63/0x70
      [   60.045414]  ? remove_wait_queue+0xa9/0xc0
      [   60.045441]  ? kasan_unpoison_shadow+0x35/0x50
      [   60.045467]  ? kasan_kmalloc+0xad/0xe0
      [   60.045494]  ? kasan_check_write+0x14/0x20
      [   60.045568]  i915_gem_execbuffer2+0xdb/0x2a0 [i915]
      [   60.045616]  drm_ioctl+0x359/0x640 [drm]
      [   60.045705]  ? i915_gem_execbuffer+0x5a0/0x5a0 [i915]
      [   60.045751]  ? drm_version+0x150/0x150 [drm]
      [   60.045778]  ? compat_start_thread+0x60/0x60
      [   60.045805]  ? plist_del+0xda/0x1a0
      [   60.045833]  do_vfs_ioctl+0x12e/0x910
      [   60.045860]  ? ioctl_preallocate+0x130/0x130
      [   60.045886]  ? pci_mmcfg_check_reserved+0xc0/0xc0
      [   60.045913]  ? vfs_write+0x196/0x240
      [   60.045939]  ? __fget_light+0xa7/0xc0
      [   60.045965]  SyS_ioctl+0x41/0x70
      [   60.045991]  entry_SYSCALL_64_fastpath+0x17/0x98
      [   60.046017] RIP: 0033:0x7feb2baefc47
      [   60.046042] RSP: 002b:00007fff56d28e58 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
      [   60.046075] RAX: ffffffffffffffda RBX: 00007fff56d290a8 RCX: 00007feb2baefc47
      [   60.046102] RDX: 00007fff56d29050 RSI: 00000000c0406469 RDI: 0000000000000003
      [   60.046129] RBP: 00007fff56d29050 R08: 000055ecc4cd27d0 R09: 00007feb2bda8600
      [   60.046154] R10: 0000000000000073 R11: 0000000000000246 R12: 00000000c0406469
      [   60.046177] R13: 0000000000000003 R14: 000000000000000f R15: 0000000000000099
      [   60.046203] Object at ffff880348c9e680, in cache i915_dependency size: 64
      [   60.046225] Allocated:
      [   60.046246] PID = 530
      [   60.046269]  save_stack_trace+0x1b/0x20
      [   60.046292]  save_stack+0x46/0xd0
      [   60.046318]  kasan_kmalloc+0xad/0xe0
      [   60.046343]  kasan_slab_alloc+0x12/0x20
      [   60.046368]  kmem_cache_alloc+0xab/0x650
      [   60.046445]  i915_gem_request_await_request+0x88/0x370 [i915]
      [   60.046559]  i915_gem_request_await_dma_fence+0x129/0x140 [i915]
      [   60.046705]  i915_gem_do_execbuffer.isra.37+0xb0a/0x26b0 [i915]
      [   60.046849]  i915_gem_execbuffer2+0xdb/0x2a0 [i915]
      [   60.046936]  drm_ioctl+0x359/0x640 [drm]
      [   60.046987]  do_vfs_ioctl+0x12e/0x910
      [   60.047038]  SyS_ioctl+0x41/0x70
      [   60.047090]  entry_SYSCALL_64_fastpath+0x17/0x98
      [   60.047139] Freed:
      [   60.047179] PID = 530
      [   60.047223]  save_stack_trace+0x1b/0x20
      [   60.047269]  save_stack+0x46/0xd0
      [   60.047317]  kasan_slab_free+0x72/0xc0
      [   60.047366]  kmem_cache_free+0x39/0x160
      [   60.047512]  i915_gem_request_retire+0x83f/0x930 [i915]
      [   60.047657]  i915_gem_request_alloc+0x166/0x600 [i915]
      [   60.047799]  i915_gem_do_execbuffer.isra.37+0xad8/0x26b0 [i915]
      [   60.047897]  i915_gem_execbuffer2+0xdb/0x2a0 [i915]
      [   60.047942]  drm_ioctl+0x359/0x640 [drm]
      [   60.047968]  do_vfs_ioctl+0x12e/0x910
      [   60.047993]  SyS_ioctl+0x41/0x70
      [   60.048019]  entry_SYSCALL_64_fastpath+0x17/0x98
      [   60.048044] Memory state around the buggy address:
      [   60.048066]  ffff880348c9e580: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
      [   60.048105]  ffff880348c9e600: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
      [   60.048138] >ffff880348c9e680: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
      [   60.048170]                                ^
      [   60.048191]  ffff880348c9e700: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
      [   60.048225]  ffff880348c9e780: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
      
      Note that hitting the use-after-free requires us to be passed back a
      request via a fence-array, i.e. explicit fences accumulated into a
      sync-file fence-array.
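      The shape of the fix is to bail out before touching the dependency
      lists when the signaler has already completed; the guard below is only
      a sketch, and the helper for the remaining bookkeeping is hypothetical:

          /* Sketch: guard at the top of the await path so a retired signaler's
           * (already freed) dependency tree is never touched.
           */
          int i915_gem_request_await_request(struct drm_i915_gem_request *to,
                                             struct drm_i915_gem_request *from)
          {
                  GEM_BUG_ON(to == from);

                  if (i915_gem_request_completed(from))
                          return 0;   /* nothing left to wait for */

                  /* __await_active_request() is a hypothetical helper standing
                   * in for the existing dependency/fence bookkeeping.
                   */
                  return __await_active_request(to, from);
          }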
      
      Fixes: 52e54209 ("drm/i915/scheduler: Record all dependencies upon request construction")
      Testcase: igt/gem_exec_fence/expired-history
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170422081537.6468-1-chris@chris-wilson.co.uk
      (cherry picked from commit ade0b0c9)
      Signed-off-by: Jani Nikula <jani.nikula@intel.com>
    • drm/i915: Skip waking the signaler when enabling before request submission · f7b02a52
      Committed by Chris Wilson
      If we are enabling the breadcrumbs signaling prior to submitting the
      request, we know that we cannot have missed the interrupt and can
      therefore skip immediately waking the signaler to check.
      
      This reduces a significant chunk of the __i915_gem_request_submit()
      overhead for inter-engine synchronisation, for example in gem_exec_whisper.
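      
      A hedged sketch of the shape of such a change (the wakeup parameter
      name and the exact call sites are assumptions): callers that enable
      signaling before submission pass false, so the signaler kthread is not
      poked when the interrupt cannot have been missed.

          /* Sketch only: wakeup says whether the interrupt may already have fired. */
          void intel_engine_enable_signaling(struct drm_i915_gem_request *request,
                                             bool wakeup)
          {
                  struct intel_breadcrumbs *b = &request->engine->breadcrumbs;

                  /* existing insertion into the signal tree elided */

                  if (wakeup)
                          wake_up_process(b->signaler); /* only if we may have missed the irq */
          }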
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170426080659.28771-1-chris@chris-wilson.co.uk
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
  5. 25 April 2017 (1 commit)
  6. 22 April 2017 (1 commit)
    • drm/i915: Confirm the request is still active before adding it to the await · ade0b0c9
      Committed by Chris Wilson
      Although we do check the completion-status of the request before
      actually adding a wait on it (either to its submit fence or its
      completion dma-fence), we currently do not check before adding it to the
      dependency lists.
      
      In fact, without checking for a completed request we may try to use the
      signaler after it has been retired and its dependency tree freed:
      
      [   60.044057] BUG: KASAN: use-after-free in __list_add_valid+0x1d/0xd0 at addr ffff880348c9e6a0
      [   60.044118] Read of size 8 by task gem_exec_fence/530
      [   60.044164] CPU: 1 PID: 530 Comm: gem_exec_fence Tainted: G            E   4.11.0-rc7+ #46
      [   60.044226] Hardware name: (unreadable DMI strings), BIOS RYBDWi35.86A.0246.2
      [   60.044290] Call Trace:
      [   60.044337]  dump_stack+0x4d/0x6a
      [   60.044383]  kasan_object_err+0x21/0x70
      [   60.044435]  kasan_report+0x225/0x4e0
      [   60.044488]  ? __list_add_valid+0x1d/0xd0
      [   60.044534]  ? kasan_kmalloc+0xad/0xe0
      [   60.044587]  __asan_load8+0x5e/0x70
      [   60.044639]  __list_add_valid+0x1d/0xd0
      [   60.044788]  __i915_priotree_add_dependency+0x67/0x130 [i915]
      [   60.044895]  i915_gem_request_await_request+0xa8/0x370 [i915]
      [   60.044974]  i915_gem_request_await_dma_fence+0x129/0x140 [i915]
      [   60.045049]  i915_gem_do_execbuffer.isra.37+0xb0a/0x26b0 [i915]
      [   60.045077]  ? save_stack+0xb1/0xd0
      [   60.045105]  ? save_stack_trace+0x1b/0x20
      [   60.045132]  ? save_stack+0x46/0xd0
      [   60.045158]  ? kasan_kmalloc+0xad/0xe0
      [   60.045184]  ? __kmalloc+0xd8/0x670
      [   60.045229]  ? drm_ioctl+0x359/0x640 [drm]
      [   60.045256]  ? SyS_ioctl+0x41/0x70
      [   60.045330]  ? i915_vma_move_to_active+0x540/0x540 [i915]
      [   60.045360]  ? tty_insert_flip_string_flags+0xa1/0xf0
      [   60.045387]  ? tty_flip_buffer_push+0x63/0x70
      [   60.045414]  ? remove_wait_queue+0xa9/0xc0
      [   60.045441]  ? kasan_unpoison_shadow+0x35/0x50
      [   60.045467]  ? kasan_kmalloc+0xad/0xe0
      [   60.045494]  ? kasan_check_write+0x14/0x20
      [   60.045568]  i915_gem_execbuffer2+0xdb/0x2a0 [i915]
      [   60.045616]  drm_ioctl+0x359/0x640 [drm]
      [   60.045705]  ? i915_gem_execbuffer+0x5a0/0x5a0 [i915]
      [   60.045751]  ? drm_version+0x150/0x150 [drm]
      [   60.045778]  ? compat_start_thread+0x60/0x60
      [   60.045805]  ? plist_del+0xda/0x1a0
      [   60.045833]  do_vfs_ioctl+0x12e/0x910
      [   60.045860]  ? ioctl_preallocate+0x130/0x130
      [   60.045886]  ? pci_mmcfg_check_reserved+0xc0/0xc0
      [   60.045913]  ? vfs_write+0x196/0x240
      [   60.045939]  ? __fget_light+0xa7/0xc0
      [   60.045965]  SyS_ioctl+0x41/0x70
      [   60.045991]  entry_SYSCALL_64_fastpath+0x17/0x98
      [   60.046017] RIP: 0033:0x7feb2baefc47
      [   60.046042] RSP: 002b:00007fff56d28e58 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
      [   60.046075] RAX: ffffffffffffffda RBX: 00007fff56d290a8 RCX: 00007feb2baefc47
      [   60.046102] RDX: 00007fff56d29050 RSI: 00000000c0406469 RDI: 0000000000000003
      [   60.046129] RBP: 00007fff56d29050 R08: 000055ecc4cd27d0 R09: 00007feb2bda8600
      [   60.046154] R10: 0000000000000073 R11: 0000000000000246 R12: 00000000c0406469
      [   60.046177] R13: 0000000000000003 R14: 000000000000000f R15: 0000000000000099
      [   60.046203] Object at ffff880348c9e680, in cache i915_dependency size: 64
      [   60.046225] Allocated:
      [   60.046246] PID = 530
      [   60.046269]  save_stack_trace+0x1b/0x20
      [   60.046292]  save_stack+0x46/0xd0
      [   60.046318]  kasan_kmalloc+0xad/0xe0
      [   60.046343]  kasan_slab_alloc+0x12/0x20
      [   60.046368]  kmem_cache_alloc+0xab/0x650
      [   60.046445]  i915_gem_request_await_request+0x88/0x370 [i915]
      [   60.046559]  i915_gem_request_await_dma_fence+0x129/0x140 [i915]
      [   60.046705]  i915_gem_do_execbuffer.isra.37+0xb0a/0x26b0 [i915]
      [   60.046849]  i915_gem_execbuffer2+0xdb/0x2a0 [i915]
      [   60.046936]  drm_ioctl+0x359/0x640 [drm]
      [   60.046987]  do_vfs_ioctl+0x12e/0x910
      [   60.047038]  SyS_ioctl+0x41/0x70
      [   60.047090]  entry_SYSCALL_64_fastpath+0x17/0x98
      [   60.047139] Freed:
      [   60.047179] PID = 530
      [   60.047223]  save_stack_trace+0x1b/0x20
      [   60.047269]  save_stack+0x46/0xd0
      [   60.047317]  kasan_slab_free+0x72/0xc0
      [   60.047366]  kmem_cache_free+0x39/0x160
      [   60.047512]  i915_gem_request_retire+0x83f/0x930 [i915]
      [   60.047657]  i915_gem_request_alloc+0x166/0x600 [i915]
      [   60.047799]  i915_gem_do_execbuffer.isra.37+0xad8/0x26b0 [i915]
      [   60.047897]  i915_gem_execbuffer2+0xdb/0x2a0 [i915]
      [   60.047942]  drm_ioctl+0x359/0x640 [drm]
      [   60.047968]  do_vfs_ioctl+0x12e/0x910
      [   60.047993]  SyS_ioctl+0x41/0x70
      [   60.048019]  entry_SYSCALL_64_fastpath+0x17/0x98
      [   60.048044] Memory state around the buggy address:
      [   60.048066]  ffff880348c9e580: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
      [   60.048105]  ffff880348c9e600: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
      [   60.048138] >ffff880348c9e680: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
      [   60.048170]                                ^
      [   60.048191]  ffff880348c9e700: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
      [   60.048225]  ffff880348c9e780: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
      
      Note that hitting the use-after-free requires us to be passed back a
      request via a fence-array, i.e. explicit fences accumulated into a
      sync-file fence-array.
      
      Fixes: 52e54209 ("drm/i915/scheduler: Record all dependencies upon request construction")
      Testcase: igt/gem_exec_fence/expired-history
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170422081537.6468-1-chris@chris-wilson.co.uk
  7. 15 April 2017 (1 commit)
  8. 07 April 2017 (2 commits)
  9. 31 March 2017 (4 commits)
  10. 30 March 2017 (1 commit)
  11. 21 March 2017 (1 commit)
  12. 17 March 2017 (1 commit)
  13. 03 March 2017 (2 commits)
  14. 02 March 2017 (4 commits)
  15. 28 February 2017 (1 commit)
    • drm/i915: Signal first fence from irq handler if complete · 56299fb7
      Committed by Chris Wilson
      As execlists and other non-semaphore multi-engine devices coordinate
      between engines using interrupts, we can shave a few tens of
      microseconds of scheduling latency by doing the fence signaling from
      the interrupt handler as opposed to an RT kthread. (Realistically the
      delay adds about 1% to an individual cross-engine workload.) We only
      signal the first fence in order to limit the amount of work we move
      into the interrupt handler. We also have to remember that our
      breadcrumbs may be unordered with respect to the interrupt, and so we
      still require the waiter process to perform some heavyweight coherency
      fixups, as well as traversing the tree of waiters.
      
      v2: No need for early exit in irq handler - it breaks the flow between
      patches and prevents the tracepoint
      v3: Restore rcu hold across irq signaling of request
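      
      A rough sketch of the flow in the interrupt handler (the helper names
      are illustrative, not the exact driver functions); the rcu read lock
      matches the v3 note above:

          /* Illustrative only: signal just the first waiting fence from the irq. */
          static void engine_irq_notify(struct intel_engine_cs *engine)
          {
                  struct drm_i915_gem_request *rq;

                  rcu_read_lock();
                  rq = first_waiting_request(engine);      /* hypothetical helper */
                  if (rq && i915_gem_request_completed(rq))
                          dma_fence_signal(&rq->fence);    /* only the first fence, in irq context */
                  rcu_read_unlock();

                  /* waiters still handle unordered breadcrumbs and the rest of the tree */
                  wake_up_waiters(engine);                 /* hypothetical helper */
          }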
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170227205850.2828-2-chris@chris-wilson.co.uk
  16. 23 February 2017 (11 commits)