1. 08 12月, 2017 1 次提交
    • T
      drm/i915: Restore GT performance in headless mode with DMC loaded · b6876374
      Tvrtko Ursulin 提交于
      It seems that the DMC likes to transition between the DC states a lot when
      there are no connected displays (no active power domains) during command
      submission.
      
      This activity on DC states has a negative impact on the performance of the
      chip with huge latencies observed in the interrupt handlers and elsewhere.
      Simple tests like igt/gem_latency -n 0 are slowed down by a factor of
      eight.
      
      Work around it by introducing a new power domain named,
      POWER_DOMAIN_GT_IRQ, associtated with the "DC off" power well, which is
      held for the duration of command submission activity.
      
      CNL has the same problem which will be addressed as a follow-up. Doing
      that requires a fix for a DC6 context corruption problem in the CNL DMC
      firmware which is yet to be released.
      
      v2:
       * Add commit text as comment in i915_gem_mark_busy. (Chris Wilson)
       * Protect macro body with braces. (Jani Nikula)
      
      v3:
       * Add dedicated power domain for clarity. (Chris, Imre)
       * Commit message and comment text updates.
       * Apply to all big-core GEN9 parts apart for Skylake which is pending DMC
         firmware release.
      
      v4:
       * Power domain should be inner to device runtime pm. (Chris)
       * Simplify NEEDS_CSR_GT_PERF_WA macro. (Chris)
       * Handle async DMC loading by moving the GT_IRQ power domain logic into
         intel_runtime_pm. (Daniel, Chris)
       * Include small core GEN9 as well. (Imre)
      
      v5
       * Special handling for async DMC load is not needed since on failure the
         power domain reference is kept permanently taken. (Imre)
      
      v6:
       * Drop the NEEDS_CSR_GT_PERF_WA macro since all firmwares have now been
         deployed. (Imre, Chris)
      Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100572
      Testcase: igt/gem_exec_nop/headless
      Cc: Imre Deak <imre.deak@intel.com>
      Acked-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
      Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v5)
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      [Imre: Add note about applying the WA on CNL as a follow-up]
      Signed-off-by: NImre Deak <imre.deak@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171205132854.26380-1-tvrtko.ursulin@linux.intel.com
      b6876374
  2. 24 11月, 2017 1 次提交
  3. 22 11月, 2017 1 次提交
  4. 20 11月, 2017 2 次提交
  5. 26 10月, 2017 1 次提交
  6. 11 10月, 2017 2 次提交
    • D
      drm/i915: Use rcu instead of stop_machine in set_wedged · af7a8ffa
      Daniel Vetter 提交于
      stop_machine is not really a locking primitive we should use, except
      when the hw folks tell us the hw is broken and that's the only way to
      work around it.
      
      This patch tries to address the locking abuse of stop_machine() from
      
      commit 20e4933c
      Author: Chris Wilson <chris@chris-wilson.co.uk>
      Date:   Tue Nov 22 14:41:21 2016 +0000
      
          drm/i915: Stop the machine as we install the wedged submit_request handler
      
      Chris said parts of the reasons for going with stop_machine() was that
      it's no overhead for the fast-path. But these callbacks use irqsave
      spinlocks and do a bunch of MMIO, and rcu_read_lock is _real_ fast.
      
      To stay as close as possible to the stop_machine semantics we first
      update all the submit function pointers to the nop handler, then call
      synchronize_rcu() to make sure no new requests can be submitted. This
      should give us exactly the huge barrier we want.
      
      I pondered whether we should annotate engine->submit_request as __rcu
      and use rcu_assign_pointer and rcu_dereference on it. But the reason
      behind those is to make sure the compiler/cpu barriers are there for
      when you have an actual data structure you point at, to make sure all
      the writes are seen correctly on the read side. But we just have a
      function pointer, and .text isn't changed, so no need for these
      barriers and hence no need for annotations.
      
      Unfortunately there's a complication with the call to
      intel_engine_init_global_seqno:
      
      - Without stop_machine we must hold the corresponding spinlock.
      
      - Without stop_machine we must ensure that all requests are marked as
        having failed with dma_fence_set_error() before we call it. That
        means we need to split the nop request submission into two phases,
        both synchronized with rcu:
      
        1. Only stop submitting the requests to hw and mark them as failed.
      
        2. After all pending requests in the scheduler/ring are suitably
        marked up as failed and we can force complete them all, also force
        complete by calling intel_engine_init_global_seqno().
      
      This should fix the followwing lockdep splat:
      
      ======================================================
      WARNING: possible circular locking dependency detected
      4.14.0-rc3-CI-CI_DRM_3179+ #1 Tainted: G     U
      ------------------------------------------------------
      kworker/3:4/562 is trying to acquire lock:
       (cpu_hotplug_lock.rw_sem){++++}, at: [<ffffffff8113d4bc>] stop_machine+0x1c/0x40
      
      but task is already holding lock:
       (&dev->struct_mutex){+.+.}, at: [<ffffffffa0136588>] i915_reset_device+0x1e8/0x260 [i915]
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #6 (&dev->struct_mutex){+.+.}:
             __lock_acquire+0x1420/0x15e0
             lock_acquire+0xb0/0x200
             __mutex_lock+0x86/0x9b0
             mutex_lock_interruptible_nested+0x1b/0x20
             i915_mutex_lock_interruptible+0x51/0x130 [i915]
             i915_gem_fault+0x209/0x650 [i915]
             __do_fault+0x1e/0x80
             __handle_mm_fault+0xa08/0xed0
             handle_mm_fault+0x156/0x300
             __do_page_fault+0x2c5/0x570
             do_page_fault+0x28/0x250
             page_fault+0x22/0x30
      
      -> #5 (&mm->mmap_sem){++++}:
             __lock_acquire+0x1420/0x15e0
             lock_acquire+0xb0/0x200
             __might_fault+0x68/0x90
             _copy_to_user+0x23/0x70
             filldir+0xa5/0x120
             dcache_readdir+0xf9/0x170
             iterate_dir+0x69/0x1a0
             SyS_getdents+0xa5/0x140
             entry_SYSCALL_64_fastpath+0x1c/0xb1
      
      -> #4 (&sb->s_type->i_mutex_key#5){++++}:
             down_write+0x3b/0x70
             handle_create+0xcb/0x1e0
             devtmpfsd+0x139/0x180
             kthread+0x152/0x190
             ret_from_fork+0x27/0x40
      
      -> #3 ((complete)&req.done){+.+.}:
             __lock_acquire+0x1420/0x15e0
             lock_acquire+0xb0/0x200
             wait_for_common+0x58/0x210
             wait_for_completion+0x1d/0x20
             devtmpfs_create_node+0x13d/0x160
             device_add+0x5eb/0x620
             device_create_groups_vargs+0xe0/0xf0
             device_create+0x3a/0x40
             msr_device_create+0x2b/0x40
             cpuhp_invoke_callback+0xc9/0xbf0
             cpuhp_thread_fun+0x17b/0x240
             smpboot_thread_fn+0x18a/0x280
             kthread+0x152/0x190
             ret_from_fork+0x27/0x40
      
      -> #2 (cpuhp_state-up){+.+.}:
             __lock_acquire+0x1420/0x15e0
             lock_acquire+0xb0/0x200
             cpuhp_issue_call+0x133/0x1c0
             __cpuhp_setup_state_cpuslocked+0x139/0x2a0
             __cpuhp_setup_state+0x46/0x60
             page_writeback_init+0x43/0x67
             pagecache_init+0x3d/0x42
             start_kernel+0x3a8/0x3fc
             x86_64_start_reservations+0x2a/0x2c
             x86_64_start_kernel+0x6d/0x70
             verify_cpu+0x0/0xfb
      
      -> #1 (cpuhp_state_mutex){+.+.}:
             __lock_acquire+0x1420/0x15e0
             lock_acquire+0xb0/0x200
             __mutex_lock+0x86/0x9b0
             mutex_lock_nested+0x1b/0x20
             __cpuhp_setup_state_cpuslocked+0x53/0x2a0
             __cpuhp_setup_state+0x46/0x60
             page_alloc_init+0x28/0x30
             start_kernel+0x145/0x3fc
             x86_64_start_reservations+0x2a/0x2c
             x86_64_start_kernel+0x6d/0x70
             verify_cpu+0x0/0xfb
      
      -> #0 (cpu_hotplug_lock.rw_sem){++++}:
             check_prev_add+0x430/0x840
             __lock_acquire+0x1420/0x15e0
             lock_acquire+0xb0/0x200
             cpus_read_lock+0x3d/0xb0
             stop_machine+0x1c/0x40
             i915_gem_set_wedged+0x1a/0x20 [i915]
             i915_reset+0xb9/0x230 [i915]
             i915_reset_device+0x1f6/0x260 [i915]
             i915_handle_error+0x2d8/0x430 [i915]
             hangcheck_declare_hang+0xd3/0xf0 [i915]
             i915_hangcheck_elapsed+0x262/0x2d0 [i915]
             process_one_work+0x233/0x660
             worker_thread+0x4e/0x3b0
             kthread+0x152/0x190
             ret_from_fork+0x27/0x40
      
      other info that might help us debug this:
      
      Chain exists of:
        cpu_hotplug_lock.rw_sem --> &mm->mmap_sem --> &dev->struct_mutex
      
       Possible unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        lock(&dev->struct_mutex);
                                     lock(&mm->mmap_sem);
                                     lock(&dev->struct_mutex);
        lock(cpu_hotplug_lock.rw_sem);
      
       *** DEADLOCK ***
      
      3 locks held by kworker/3:4/562:
       #0:  ("events_long"){+.+.}, at: [<ffffffff8109c64a>] process_one_work+0x1aa/0x660
       #1:  ((&(&i915->gpu_error.hangcheck_work)->work)){+.+.}, at: [<ffffffff8109c64a>] process_one_work+0x1aa/0x660
       #2:  (&dev->struct_mutex){+.+.}, at: [<ffffffffa0136588>] i915_reset_device+0x1e8/0x260 [i915]
      
      stack backtrace:
      CPU: 3 PID: 562 Comm: kworker/3:4 Tainted: G     U          4.14.0-rc3-CI-CI_DRM_3179+ #1
      Hardware name:                  /NUC7i5BNB, BIOS BNKBL357.86A.0048.2017.0704.1415 07/04/2017
      Workqueue: events_long i915_hangcheck_elapsed [i915]
      Call Trace:
       dump_stack+0x68/0x9f
       print_circular_bug+0x235/0x3c0
       ? lockdep_init_map_crosslock+0x20/0x20
       check_prev_add+0x430/0x840
       ? irq_work_queue+0x86/0xe0
       ? wake_up_klogd+0x53/0x70
       __lock_acquire+0x1420/0x15e0
       ? __lock_acquire+0x1420/0x15e0
       ? lockdep_init_map_crosslock+0x20/0x20
       lock_acquire+0xb0/0x200
       ? stop_machine+0x1c/0x40
       ? i915_gem_object_truncate+0x50/0x50 [i915]
       cpus_read_lock+0x3d/0xb0
       ? stop_machine+0x1c/0x40
       stop_machine+0x1c/0x40
       i915_gem_set_wedged+0x1a/0x20 [i915]
       i915_reset+0xb9/0x230 [i915]
       i915_reset_device+0x1f6/0x260 [i915]
       ? gen8_gt_irq_ack+0x170/0x170 [i915]
       ? work_on_cpu_safe+0x60/0x60
       i915_handle_error+0x2d8/0x430 [i915]
       ? vsnprintf+0xd1/0x4b0
       ? scnprintf+0x3a/0x70
       hangcheck_declare_hang+0xd3/0xf0 [i915]
       ? intel_runtime_pm_put+0x56/0xa0 [i915]
       i915_hangcheck_elapsed+0x262/0x2d0 [i915]
       process_one_work+0x233/0x660
       worker_thread+0x4e/0x3b0
       kthread+0x152/0x190
       ? process_one_work+0x660/0x660
       ? kthread_create_on_node+0x40/0x40
       ret_from_fork+0x27/0x40
      Setting dangerous option reset - tainting kernel
      i915 0000:00:02.0: Resetting chip after gpu hang
      Setting dangerous option reset - tainting kernel
      i915 0000:00:02.0: Resetting chip after gpu hang
      
      v2: Have 1 global synchronize_rcu() barrier across all engines, and
      improve commit message.
      
      v3: We need to protect the seqno update with the timeline spinlock (in
      set_wedged) to avoid racing with other updates of the seqno, like we
      already do in nop_submit_request (Chris).
      
      v4: Use two-phase sequence to plug the race Chris spotted where we can
      complete requests before they're marked up with -EIO.
      
      v5: Review from Chris:
      - simplify nop_submit_request.
      - Add comment to rcu_read_lock section.
      - Align comments with the new style.
      
      v6: Remove unused variable to appease CI.
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102886
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103096
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Marta Lofstedt <marta.lofstedt@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171011091019.1425-1-daniel.vetter@ffwll.ch
      af7a8ffa
    • S
      drm/i915: Name structure in dev_priv that contains RPS/RC6 state as "gt_pm" · 562d9bae
      Sagar Arun Kamble 提交于
      Prepared substructure rps for RPS related state. autoenable_work is
      used for RC6 too hence it is defined outside rps structure. As we do
      this lot many functions are refactored to use intel_rps *rps to access
      rps related members. Hence renamed intel_rps_client pointer variables
      to rps_client in various functions.
      
      v2: Rebase.
      
      v3: s/pm/gt_pm (Chris)
      Refactored access to rps structure by declaring struct intel_rps * in
      many functions.
      Signed-off-by: NSagar Arun Kamble <sagar.a.kamble@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Imre Deak <imre.deak@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: Radoslaw Szwichtenberg <radoslaw.szwichtenberg@intel.com> #1
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/1507360055-19948-9-git-send-email-sagar.a.kamble@intel.comAcked-by: NImre Deak <imre.deak@intel.com>
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171010213010.7415-8-chris@chris-wilson.co.uk
      562d9bae
  7. 05 10月, 2017 1 次提交
  8. 29 9月, 2017 1 次提交
  9. 23 9月, 2017 1 次提交
  10. 22 9月, 2017 1 次提交
  11. 18 8月, 2017 1 次提交
  12. 27 7月, 2017 3 次提交
  13. 28 6月, 2017 1 次提交
    • C
      drm/i915: Avoid keeping waitboost active for signaling threads · 7b92c1bd
      Chris Wilson 提交于
      Once a client has requested a waitboost, we keep that waitboost active
      until all clients are no longer waiting. This is because we don't
      distinguish which waiter deserves the boost. However, with the advent of
      fence signaling, the signaler threads appear as waiters to the RPS
      interrupt handler. So instead of using a single boolean to track when to
      keep the waitboost active, use a counter of all outstanding waitboosted
      requests.
      
      At this point, I have removed all vestiges of the rate limiting on
      clients. Whilst this means that compositors should remain more fluid,
      it also means that boosts are more prevalent. See commit b29c19b6
      ("drm/i915: Boost RPS frequency for CPU stalls") for a longer discussion
      on the pros and cons of both approaches.
      
      A drawback of this implementation is that it requires constant request
      submission to keep the waitboost trimmed (as it is now cancelled when the
      request is completed). This will be fine for a busy system, but near
      idle the boosts may be kept for longer than desired (effectively tens of
      vblanks worstcase) and there is a reliance on rc6 instead.
      
      v2: Remove defunct rps.client_lock
      Reported-by: NMichał Winiarski <michal.winiarski@intel.com>
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Michał Winiarski <michal.winiarski@intel.com>
      Reviewed-by: NMichał Winiarski <michal.winiarski@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170628123548.9236-1-chris@chris-wilson.co.uk
      7b92c1bd
  14. 19 6月, 2017 1 次提交
  15. 08 6月, 2017 2 次提交
  16. 23 5月, 2017 1 次提交
  17. 17 5月, 2017 1 次提交
    • C
      drm/i915: Split execlist priority queue into rbtree + linked list · 6c067579
      Chris Wilson 提交于
      All the requests at the same priority are executed in FIFO order. They
      do not need to be stored in the rbtree themselves, as they are a simple
      list within a level. If we move the requests at one priority into a list,
      we can then reduce the rbtree to the set of priorities. This should keep
      the height of the rbtree small, as the number of active priorities can not
      exceed the number of active requests and should be typically only a few.
      
      Currently, we have ~2k possible different priority levels, that may
      increase to allow even more fine grained selection. Allocating those in
      advance seems a waste (and may be impossible), so we opt for allocating
      upon first use, and freeing after its requests are depleted. To avoid
      the possibility of an allocation failure causing us to lose a request,
      we preallocate the default priority (0) and bump any request to that
      priority if we fail to allocate it the appropriate plist. Having a
      request (that is ready to run, so not leading to corruption) execute
      out-of-order is better than leaking the request (and its dependency
      tree) entirely.
      
      There should be a benefit to reducing execlists_dequeue() to principally
      using a simple list (and reducing the frequency of both rbtree iteration
      and balancing on erase) but for typical workloads, request coalescing
      should be small enough that we don't notice any change. The main gain is
      from improving PI calls to schedule, and the explicit list within a
      level should make request unwinding simpler (we just need to insert at
      the head of the list rather than the tail and not have to make the
      rbtree search more complicated).
      
      v2: Avoid use-after-free when deleting a depleted priolist
      
      v3: Michał found the solution to handling the allocation failure
      gracefully. If we disable all priority scheduling following the
      allocation failure, those requests will be executed in fifo and we will
      ensure that this request and its dependencies are in strict fifo (even
      when it doesn't realise it is only a single list). Normal scheduling is
      restored once we know the device is idle, until the next failure!
      Suggested-by: NMichał Wajdeczko <michal.wajdeczko@intel.com>
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Michał Winiarski <michal.winiarski@intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: NMichał Winiarski <michal.winiarski@intel.com>
      Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-8-chris@chris-wilson.co.uk
      6c067579
  18. 04 5月, 2017 1 次提交
  19. 03 5月, 2017 6 次提交
  20. 26 4月, 2017 2 次提交
    • C
      drm/i915: Confirm the request is still active before adding it to the await · 88326ef0
      Chris Wilson 提交于
      Although we do check the completion-status of the request before
      actually adding a wait on it (either to its submit fence or its
      completion dma-fence), we currently do not check before adding it to the
      dependency lists.
      
      In fact, without checking for a completed request we may try to use the
      signaler after it has been retired and its dependency tree freed:
      
      [   60.044057] BUG: KASAN: use-after-free in __list_add_valid+0x1d/0xd0 at addr ffff880348c9e6a0
      [   60.044118] Read of size 8 by task gem_exec_fence/530
      [   60.044164] CPU: 1 PID: 530 Comm: gem_exec_fence Tainted: G            E   4.11.0-rc7+ #46
      [   60.044226] Hardware name: ��������������������������������� ���������������������������������/���������������������������������, BIOS RYBDWi35.86A.0246.2
      [   60.044290] Call Trace:
      [   60.044337]  dump_stack+0x4d/0x6a
      [   60.044383]  kasan_object_err+0x21/0x70
      [   60.044435]  kasan_report+0x225/0x4e0
      [   60.044488]  ? __list_add_valid+0x1d/0xd0
      [   60.044534]  ? kasan_kmalloc+0xad/0xe0
      [   60.044587]  __asan_load8+0x5e/0x70
      [   60.044639]  __list_add_valid+0x1d/0xd0
      [   60.044788]  __i915_priotree_add_dependency+0x67/0x130 [i915]
      [   60.044895]  i915_gem_request_await_request+0xa8/0x370 [i915]
      [   60.044974]  i915_gem_request_await_dma_fence+0x129/0x140 [i915]
      [   60.045049]  i915_gem_do_execbuffer.isra.37+0xb0a/0x26b0 [i915]
      [   60.045077]  ? save_stack+0xb1/0xd0
      [   60.045105]  ? save_stack_trace+0x1b/0x20
      [   60.045132]  ? save_stack+0x46/0xd0
      [   60.045158]  ? kasan_kmalloc+0xad/0xe0
      [   60.045184]  ? __kmalloc+0xd8/0x670
      [   60.045229]  ? drm_ioctl+0x359/0x640 [drm]
      [   60.045256]  ? SyS_ioctl+0x41/0x70
      [   60.045330]  ? i915_vma_move_to_active+0x540/0x540 [i915]
      [   60.045360]  ? tty_insert_flip_string_flags+0xa1/0xf0
      [   60.045387]  ? tty_flip_buffer_push+0x63/0x70
      [   60.045414]  ? remove_wait_queue+0xa9/0xc0
      [   60.045441]  ? kasan_unpoison_shadow+0x35/0x50
      [   60.045467]  ? kasan_kmalloc+0xad/0xe0
      [   60.045494]  ? kasan_check_write+0x14/0x20
      [   60.045568]  i915_gem_execbuffer2+0xdb/0x2a0 [i915]
      [   60.045616]  drm_ioctl+0x359/0x640 [drm]
      [   60.045705]  ? i915_gem_execbuffer+0x5a0/0x5a0 [i915]
      [   60.045751]  ? drm_version+0x150/0x150 [drm]
      [   60.045778]  ? compat_start_thread+0x60/0x60
      [   60.045805]  ? plist_del+0xda/0x1a0
      [   60.045833]  do_vfs_ioctl+0x12e/0x910
      [   60.045860]  ? ioctl_preallocate+0x130/0x130
      [   60.045886]  ? pci_mmcfg_check_reserved+0xc0/0xc0
      [   60.045913]  ? vfs_write+0x196/0x240
      [   60.045939]  ? __fget_light+0xa7/0xc0
      [   60.045965]  SyS_ioctl+0x41/0x70
      [   60.045991]  entry_SYSCALL_64_fastpath+0x17/0x98
      [   60.046017] RIP: 0033:0x7feb2baefc47
      [   60.046042] RSP: 002b:00007fff56d28e58 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
      [   60.046075] RAX: ffffffffffffffda RBX: 00007fff56d290a8 RCX: 00007feb2baefc47
      [   60.046102] RDX: 00007fff56d29050 RSI: 00000000c0406469 RDI: 0000000000000003
      [   60.046129] RBP: 00007fff56d29050 R08: 000055ecc4cd27d0 R09: 00007feb2bda8600
      [   60.046154] R10: 0000000000000073 R11: 0000000000000246 R12: 00000000c0406469
      [   60.046177] R13: 0000000000000003 R14: 000000000000000f R15: 0000000000000099
      [   60.046203] Object at ffff880348c9e680, in cache i915_dependency size: 64
      [   60.046225] Allocated:
      [   60.046246] PID = 530
      [   60.046269]  save_stack_trace+0x1b/0x20
      [   60.046292]  save_stack+0x46/0xd0
      [   60.046318]  kasan_kmalloc+0xad/0xe0
      [   60.046343]  kasan_slab_alloc+0x12/0x20
      [   60.046368]  kmem_cache_alloc+0xab/0x650
      [   60.046445]  i915_gem_request_await_request+0x88/0x370 [i915]
      [   60.046559]  i915_gem_request_await_dma_fence+0x129/0x140 [i915]
      [   60.046705]  i915_gem_do_execbuffer.isra.37+0xb0a/0x26b0 [i915]
      [   60.046849]  i915_gem_execbuffer2+0xdb/0x2a0 [i915]
      [   60.046936]  drm_ioctl+0x359/0x640 [drm]
      [   60.046987]  do_vfs_ioctl+0x12e/0x910
      [   60.047038]  SyS_ioctl+0x41/0x70
      [   60.047090]  entry_SYSCALL_64_fastpath+0x17/0x98
      [   60.047139] Freed:
      [   60.047179] PID = 530
      [   60.047223]  save_stack_trace+0x1b/0x20
      [   60.047269]  save_stack+0x46/0xd0
      [   60.047317]  kasan_slab_free+0x72/0xc0
      [   60.047366]  kmem_cache_free+0x39/0x160
      [   60.047512]  i915_gem_request_retire+0x83f/0x930 [i915]
      [   60.047657]  i915_gem_request_alloc+0x166/0x600 [i915]
      [   60.047799]  i915_gem_do_execbuffer.isra.37+0xad8/0x26b0 [i915]
      [   60.047897]  i915_gem_execbuffer2+0xdb/0x2a0 [i915]
      [   60.047942]  drm_ioctl+0x359/0x640 [drm]
      [   60.047968]  do_vfs_ioctl+0x12e/0x910
      [   60.047993]  SyS_ioctl+0x41/0x70
      [   60.048019]  entry_SYSCALL_64_fastpath+0x17/0x98
      [   60.048044] Memory state around the buggy address:
      [   60.048066]  ffff880348c9e580: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
      [   60.048105]  ffff880348c9e600: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
      [   60.048138] >ffff880348c9e680: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
      [   60.048170]                                ^
      [   60.048191]  ffff880348c9e700: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
      [   60.048225]  ffff880348c9e780: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
      
      Note to hit the use-after-free requires us to be passed back a request
      via a fence-array, that is from explicit fencing accumulated into a
      sync-file fence-array.
      
      Fixes: 52e54209 ("drm/i915/scheduler: Record all dependencies upon request construction")
      Testcase: igt/gem_exec_fence/expired-history
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NMichał Winiarski <michal.winiarski@intel.com>
      Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170422081537.6468-1-chris@chris-wilson.co.uk
      (cherry picked from commit ade0b0c9)
      Signed-off-by: NJani Nikula <jani.nikula@intel.com>
      88326ef0
    • C
      drm/i915: Skip waking the signaler when enabling before request submission · f7b02a52
      Chris Wilson 提交于
      If we are enabling the breadcrumbs signaling prior to submitting the
      request, we know that we cannot have missed the interrupt and can
      therefore skip immediately waking the signaler to check.
      
      This reduces a significant chunk of the __i915_gem_request_submit()
      overhead for inter-engine synchronisation, for example in gem_exec_whisper.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170426080659.28771-1-chris@chris-wilson.co.ukReviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      f7b02a52
  21. 25 4月, 2017 1 次提交
  22. 22 4月, 2017 1 次提交
    • C
      drm/i915: Confirm the request is still active before adding it to the await · ade0b0c9
      Chris Wilson 提交于
      Although we do check the completion-status of the request before
      actually adding a wait on it (either to its submit fence or its
      completion dma-fence), we currently do not check before adding it to the
      dependency lists.
      
      In fact, without checking for a completed request we may try to use the
      signaler after it has been retired and its dependency tree freed:
      
      [   60.044057] BUG: KASAN: use-after-free in __list_add_valid+0x1d/0xd0 at addr ffff880348c9e6a0
      [   60.044118] Read of size 8 by task gem_exec_fence/530
      [   60.044164] CPU: 1 PID: 530 Comm: gem_exec_fence Tainted: G            E   4.11.0-rc7+ #46
      [   60.044226] Hardware name: ��������������������������������� ���������������������������������/���������������������������������, BIOS RYBDWi35.86A.0246.2
      [   60.044290] Call Trace:
      [   60.044337]  dump_stack+0x4d/0x6a
      [   60.044383]  kasan_object_err+0x21/0x70
      [   60.044435]  kasan_report+0x225/0x4e0
      [   60.044488]  ? __list_add_valid+0x1d/0xd0
      [   60.044534]  ? kasan_kmalloc+0xad/0xe0
      [   60.044587]  __asan_load8+0x5e/0x70
      [   60.044639]  __list_add_valid+0x1d/0xd0
      [   60.044788]  __i915_priotree_add_dependency+0x67/0x130 [i915]
      [   60.044895]  i915_gem_request_await_request+0xa8/0x370 [i915]
      [   60.044974]  i915_gem_request_await_dma_fence+0x129/0x140 [i915]
      [   60.045049]  i915_gem_do_execbuffer.isra.37+0xb0a/0x26b0 [i915]
      [   60.045077]  ? save_stack+0xb1/0xd0
      [   60.045105]  ? save_stack_trace+0x1b/0x20
      [   60.045132]  ? save_stack+0x46/0xd0
      [   60.045158]  ? kasan_kmalloc+0xad/0xe0
      [   60.045184]  ? __kmalloc+0xd8/0x670
      [   60.045229]  ? drm_ioctl+0x359/0x640 [drm]
      [   60.045256]  ? SyS_ioctl+0x41/0x70
      [   60.045330]  ? i915_vma_move_to_active+0x540/0x540 [i915]
      [   60.045360]  ? tty_insert_flip_string_flags+0xa1/0xf0
      [   60.045387]  ? tty_flip_buffer_push+0x63/0x70
      [   60.045414]  ? remove_wait_queue+0xa9/0xc0
      [   60.045441]  ? kasan_unpoison_shadow+0x35/0x50
      [   60.045467]  ? kasan_kmalloc+0xad/0xe0
      [   60.045494]  ? kasan_check_write+0x14/0x20
      [   60.045568]  i915_gem_execbuffer2+0xdb/0x2a0 [i915]
      [   60.045616]  drm_ioctl+0x359/0x640 [drm]
      [   60.045705]  ? i915_gem_execbuffer+0x5a0/0x5a0 [i915]
      [   60.045751]  ? drm_version+0x150/0x150 [drm]
      [   60.045778]  ? compat_start_thread+0x60/0x60
      [   60.045805]  ? plist_del+0xda/0x1a0
      [   60.045833]  do_vfs_ioctl+0x12e/0x910
      [   60.045860]  ? ioctl_preallocate+0x130/0x130
      [   60.045886]  ? pci_mmcfg_check_reserved+0xc0/0xc0
      [   60.045913]  ? vfs_write+0x196/0x240
      [   60.045939]  ? __fget_light+0xa7/0xc0
      [   60.045965]  SyS_ioctl+0x41/0x70
      [   60.045991]  entry_SYSCALL_64_fastpath+0x17/0x98
      [   60.046017] RIP: 0033:0x7feb2baefc47
      [   60.046042] RSP: 002b:00007fff56d28e58 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
      [   60.046075] RAX: ffffffffffffffda RBX: 00007fff56d290a8 RCX: 00007feb2baefc47
      [   60.046102] RDX: 00007fff56d29050 RSI: 00000000c0406469 RDI: 0000000000000003
      [   60.046129] RBP: 00007fff56d29050 R08: 000055ecc4cd27d0 R09: 00007feb2bda8600
      [   60.046154] R10: 0000000000000073 R11: 0000000000000246 R12: 00000000c0406469
      [   60.046177] R13: 0000000000000003 R14: 000000000000000f R15: 0000000000000099
      [   60.046203] Object at ffff880348c9e680, in cache i915_dependency size: 64
      [   60.046225] Allocated:
      [   60.046246] PID = 530
      [   60.046269]  save_stack_trace+0x1b/0x20
      [   60.046292]  save_stack+0x46/0xd0
      [   60.046318]  kasan_kmalloc+0xad/0xe0
      [   60.046343]  kasan_slab_alloc+0x12/0x20
      [   60.046368]  kmem_cache_alloc+0xab/0x650
      [   60.046445]  i915_gem_request_await_request+0x88/0x370 [i915]
      [   60.046559]  i915_gem_request_await_dma_fence+0x129/0x140 [i915]
      [   60.046705]  i915_gem_do_execbuffer.isra.37+0xb0a/0x26b0 [i915]
      [   60.046849]  i915_gem_execbuffer2+0xdb/0x2a0 [i915]
      [   60.046936]  drm_ioctl+0x359/0x640 [drm]
      [   60.046987]  do_vfs_ioctl+0x12e/0x910
      [   60.047038]  SyS_ioctl+0x41/0x70
      [   60.047090]  entry_SYSCALL_64_fastpath+0x17/0x98
      [   60.047139] Freed:
      [   60.047179] PID = 530
      [   60.047223]  save_stack_trace+0x1b/0x20
      [   60.047269]  save_stack+0x46/0xd0
      [   60.047317]  kasan_slab_free+0x72/0xc0
      [   60.047366]  kmem_cache_free+0x39/0x160
      [   60.047512]  i915_gem_request_retire+0x83f/0x930 [i915]
      [   60.047657]  i915_gem_request_alloc+0x166/0x600 [i915]
      [   60.047799]  i915_gem_do_execbuffer.isra.37+0xad8/0x26b0 [i915]
      [   60.047897]  i915_gem_execbuffer2+0xdb/0x2a0 [i915]
      [   60.047942]  drm_ioctl+0x359/0x640 [drm]
      [   60.047968]  do_vfs_ioctl+0x12e/0x910
      [   60.047993]  SyS_ioctl+0x41/0x70
      [   60.048019]  entry_SYSCALL_64_fastpath+0x17/0x98
      [   60.048044] Memory state around the buggy address:
      [   60.048066]  ffff880348c9e580: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
      [   60.048105]  ffff880348c9e600: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
      [   60.048138] >ffff880348c9e680: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
      [   60.048170]                                ^
      [   60.048191]  ffff880348c9e700: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
      [   60.048225]  ffff880348c9e780: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
      
      Note to hit the use-after-free requires us to be passed back a request
      via a fence-array, that is from explicit fencing accumulated into a
      sync-file fence-array.
      
      Fixes: 52e54209 ("drm/i915/scheduler: Record all dependencies upon request construction")
      Testcase: igt/gem_exec_fence/expired-history
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NMichał Winiarski <michal.winiarski@intel.com>
      Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170422081537.6468-1-chris@chris-wilson.co.uk
      ade0b0c9
  23. 15 4月, 2017 1 次提交
  24. 07 4月, 2017 2 次提交
  25. 31 3月, 2017 4 次提交