1. 01 11月, 2016 1 次提交
    • C
      drm/i915: Avoid accessing request->timeline outside of its lifetime · cb399eab
      Chris Wilson 提交于
      Whilst waiting on a request, we may do so without holding any locks or
      any guards beyond a reference to the request. In order to avoid taking
      locks within request deallocation, we drop references to its timeline
      (via the context and ppgtt) upon retirement. We should avoid chasing
      such pointers outside of their control, in particular we inspect the
      request->timeline to see if we may restore the RPS waitboost for a
      client. If we instead look at the engine->timeline, we will have similar
      behaviour on both full-ppgtt and !full-ppgtt systems and reduce the
      amount of reward we give towards stalling clients (i.e. only if the
      client stalls and the GPU is uncontended does it reclaim its boost).
      This restores behaviour back to pre-timelines, whilst fixing:
      
      [  645.078485] BUG: KASAN: use-after-free in i915_gem_object_wait_fence+0x1ee/0x2e0 at addr ffff8802335643a0
      [  645.078577] Read of size 4 by task gem_exec_schedu/28408
      [  645.078638] CPU: 1 PID: 28408 Comm: gem_exec_schedu Not tainted 4.9.0-rc2+ #64
      [  645.078724] Hardware name:                  /        , BIOS PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015
      [  645.078816]  ffff88022daef9a0 ffffffff8143d059 ffff880235402a80 ffff880233564200
      [  645.078998]  ffff88022daef9c8 ffffffff81229c5c ffff88022daefa48 ffff880233564200
      [  645.079172]  ffff880235402a80 ffff88022daefa38 ffffffff81229ef0 000000008110a796
      [  645.079345] Call Trace:
      [  645.079404]  [<ffffffff8143d059>] dump_stack+0x68/0x9f
      [  645.079467]  [<ffffffff81229c5c>] kasan_object_err+0x1c/0x70
      [  645.079534]  [<ffffffff81229ef0>] kasan_report_error+0x1f0/0x4b0
      [  645.079601]  [<ffffffff8122a244>] kasan_report+0x34/0x40
      [  645.079676]  [<ffffffff81634f5e>] ? i915_gem_object_wait_fence+0x1ee/0x2e0
      [  645.079741]  [<ffffffff81229951>] __asan_load4+0x61/0x80
      [  645.079807]  [<ffffffff81634f5e>] i915_gem_object_wait_fence+0x1ee/0x2e0
      [  645.079876]  [<ffffffff816364bf>] i915_gem_object_wait+0x19f/0x590
      [  645.079944]  [<ffffffff81636320>] ? i915_gem_object_wait_priority+0x500/0x500
      [  645.080016]  [<ffffffff8110fb30>] ? debug_show_all_locks+0x1e0/0x1e0
      [  645.080084]  [<ffffffff8110abdc>] ? check_chain_key+0x14c/0x210
      [  645.080157]  [<ffffffff8110a796>] ? __lock_is_held+0x46/0xc0
      [  645.080226]  [<ffffffff8163bc61>] ? i915_gem_set_domain_ioctl+0x141/0x690
      [  645.080296]  [<ffffffff8163bcc2>] i915_gem_set_domain_ioctl+0x1a2/0x690
      [  645.080366]  [<ffffffff811f8f85>] ? __might_fault+0x75/0xe0
      [  645.080433]  [<ffffffff815a55f7>] drm_ioctl+0x327/0x640
      [  645.080508]  [<ffffffff8163bb20>] ? i915_gem_obj_prepare_shmem_write+0x3a0/0x3a0
      [  645.080603]  [<ffffffff815a52d0>] ? drm_ioctl_permit+0x120/0x120
      [  645.080670]  [<ffffffff8110abdc>] ? check_chain_key+0x14c/0x210
      [  645.080738]  [<ffffffff81275717>] do_vfs_ioctl+0x127/0xa20
      [  645.080804]  [<ffffffff8120268c>] ? do_mmap+0x47c/0x580
      [  645.080871]  [<ffffffff811da567>] ? vm_mmap_pgoff+0x117/0x140
      [  645.080938]  [<ffffffff812755f0>] ? ioctl_preallocate+0x150/0x150
      [  645.081011]  [<ffffffff81108c53>] ? up_write+0x23/0x50
      [  645.081078]  [<ffffffff811da567>] ? vm_mmap_pgoff+0x117/0x140
      [  645.081145]  [<ffffffff811da450>] ? vma_is_stack_for_current+0x90/0x90
      [  645.081214]  [<ffffffff8110d853>] ? mark_held_locks+0x23/0xc0
      [  645.082030]  [<ffffffff81288408>] ? __fget+0x168/0x250
      [  645.082106]  [<ffffffff819ad517>] ? entry_SYSCALL_64_fastpath+0x5/0xb1
      [  645.082176]  [<ffffffff81288592>] ? __fget_light+0xa2/0xc0
      [  645.082242]  [<ffffffff8127604c>] SyS_ioctl+0x3c/0x70
      [  645.082309]  [<ffffffff819ad52e>] entry_SYSCALL_64_fastpath+0x1c/0xb1
      [  645.082374] Object at ffff880233564200, in cache kmalloc-8192 size: 8192
      [  645.082431] Allocated:
      [  645.082480] PID = 28408
      [  645.082535]  [  645.082566] [<ffffffff8103ae66>] save_stack_trace+0x16/0x20
      [  645.082623]  [  645.082656] [<ffffffff81228b06>] save_stack+0x46/0xd0
      [  645.082716]  [  645.082756] [<ffffffff812292fd>] kasan_kmalloc+0xad/0xe0
      [  645.082817]  [  645.082848] [<ffffffff81631752>] i915_ppgtt_create+0x52/0x220
      [  645.082908]  [  645.082941] [<ffffffff8161db96>] i915_gem_create_context+0x396/0x560
      [  645.083027]  [  645.083059] [<ffffffff8161f857>] i915_gem_context_create_ioctl+0x97/0xf0
      [  645.083152]  [  645.083183] [<ffffffff815a55f7>] drm_ioctl+0x327/0x640
      [  645.083243]  [  645.083274] [<ffffffff81275717>] do_vfs_ioctl+0x127/0xa20
      [  645.083334]  [  645.083372] [<ffffffff8127604c>] SyS_ioctl+0x3c/0x70
      [  645.083432]  [  645.083464] [<ffffffff819ad52e>] entry_SYSCALL_64_fastpath+0x1c/0xb1
      [  645.083551] Freed:
      [  645.083599] PID = 27629
      [  645.083648]  [  645.083676] [<ffffffff8103ae66>] save_stack_trace+0x16/0x20
      [  645.083738]  [  645.083770] [<ffffffff81228b06>] save_stack+0x46/0xd0
      [  645.083830]  [  645.083862] [<ffffffff81229203>] kasan_slab_free+0x73/0xc0
      [  645.083922]  [  645.083961] [<ffffffff812279c9>] kfree+0xa9/0x170
      [  645.084021]  [  645.084053] [<ffffffff81629f60>] i915_ppgtt_release+0x100/0x180
      [  645.084139]  [  645.084171] [<ffffffff8161d414>] i915_gem_context_free+0x1b4/0x230
      [  645.084257]  [  645.084288] [<ffffffff816537b2>] intel_lr_context_unpin+0x192/0x230
      [  645.084380]  [  645.084413] [<ffffffff81645250>] i915_gem_request_retire+0x620/0x630
      [  645.084500]  [  645.085226] [<ffffffff816473d1>] i915_gem_retire_requests+0x181/0x280
      [  645.085313]  [  645.085352] [<ffffffff816352ba>] i915_gem_retire_work_handler+0xca/0xe0
      [  645.085440]  [  645.085471] [<ffffffff810c725b>] process_one_work+0x4fb/0x920
      [  645.085532]  [  645.085562] [<ffffffff810c770d>] worker_thread+0x8d/0x840
      [  645.085622]  [  645.085653] [<ffffffff810d21e5>] kthread+0x185/0x1b0
      [  645.085718]  [  645.085750] [<ffffffff819ad7a7>] ret_from_fork+0x27/0x40
      [  645.085811] Memory state around the buggy address:
      [  645.085869]  ffff880233564280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [  645.085956]  ffff880233564300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [  645.086053] >ffff880233564380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [  645.086138]                                ^
      [  645.086193]  ffff880233564400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [  645.086283]  ffff880233564480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      
      v2: Add a comment to document the hint like nature of
       intel_engine_last_submit()
      
      Fixes: 73cb9701 ("drm/i915: Combine seqno + tracking into a global timeline struct")
      Fixes: 80b204bc ("drm/i915: Enable multiple timelines")
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20161101100317.11129-1-chris@chris-wilson.co.uk
      cb399eab
  2. 29 10月, 2016 6 次提交
    • C
      drm/i915: Convert breadcrumbs spinlock to be irqsafe · f6168e33
      Chris Wilson 提交于
      The breadcrumbs are about to be used from within IRQ context sections
      (e.g. nouveau signals a fence from an interrupt handler causing us to
      submit a new request) and/or from bottom-half tasklets (i.e.
      intel_lrc_irq_handler), therefore we need to employ the irqsafe spinlock
      variants.
      
      For example, deferring the request submission to the
      intel_lrc_irq_handler generates this trace:
      
      [   66.388639] =================================
      [   66.388650] [ INFO: inconsistent lock state ]
      [   66.388663] 4.9.0-rc2+ #56 Not tainted
      [   66.388672] ---------------------------------
      [   66.388682] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
      [   66.388695] swapper/1/0 [HC0[0]:SC1[1]:HE0:SE0] takes:
      [   66.388706]  (&(&b->lock)->rlock){+.?...} , at: [<ffffffff81401c88>] intel_engine_enable_signaling+0x78/0x150
      [   66.388761] {SOFTIRQ-ON-W} state was registered at:
      [   66.388772]   [   66.388783] [<ffffffff810bd842>] __lock_acquire+0x682/0x1870
      [   66.388795]   [   66.388803] [<ffffffff810bedbc>] lock_acquire+0x6c/0xb0
      [   66.388814]   [   66.388824] [<ffffffff8161753a>] _raw_spin_lock+0x2a/0x40
      [   66.388835]   [   66.388845] [<ffffffff81401e41>] intel_engine_reset_breadcrumbs+0x21/0xb0
      [   66.388857]   [   66.388866] [<ffffffff81403ae7>] gen8_init_common_ring+0x67/0x100
      [   66.388878]   [   66.388887] [<ffffffff81403b92>] gen8_init_render_ring+0x12/0x60
      [   66.388903]   [   66.388912] [<ffffffff813f8707>] i915_gem_init_hw+0xf7/0x2a0
      [   66.388927]   [   66.388936] [<ffffffff813f899b>] i915_gem_init+0xbb/0xf0
      [   66.388950]   [   66.388959] [<ffffffff813b4980>] i915_driver_load+0x7e0/0x1330
      [   66.388978]   [   66.388988] [<ffffffff813c09d8>] i915_pci_probe+0x28/0x40
      [   66.389003]   [   66.389013] [<ffffffff812fa0db>] pci_device_probe+0x8b/0xf0
      [   66.389028]   [   66.389037] [<ffffffff8147737e>] driver_probe_device+0x21e/0x430
      [   66.389056]   [   66.389065] [<ffffffff8147766e>] __driver_attach+0xde/0xe0
      [   66.389080]   [   66.389090] [<ffffffff814751ad>] bus_for_each_dev+0x5d/0x90
      [   66.389105]   [   66.389113] [<ffffffff81477799>] driver_attach+0x19/0x20
      [   66.389134]   [   66.389144] [<ffffffff81475ced>] bus_add_driver+0x15d/0x260
      [   66.389159]   [   66.389168] [<ffffffff81477e3b>] driver_register+0x5b/0xd0
      [   66.389183]   [   66.389281] [<ffffffff812fa19b>] __pci_register_driver+0x5b/0x60
      [   66.389301]   [   66.389312] [<ffffffff81aed333>] i915_init+0x3e/0x45
      [   66.389326]   [   66.389336] [<ffffffff81ac2ffa>] do_one_initcall+0x8b/0x118
      [   66.389350]   [   66.389359] [<ffffffff81ac323a>] kernel_init_freeable+0x1b3/0x23b
      [   66.389378]   [   66.389387] [<ffffffff8160fc39>] kernel_init+0x9/0x100
      [   66.389402]   [   66.389411] [<ffffffff816180e7>] ret_from_fork+0x27/0x40
      [   66.389426] irq event stamp: 315865
      [   66.389438] hardirqs last  enabled at (315864): [<ffffffff816178f1>] _raw_spin_unlock_irqrestore+0x31/0x50
      [   66.389469] hardirqs last disabled at (315865): [<ffffffff816176b3>] _raw_spin_lock_irqsave+0x13/0x50
      [   66.389499] softirqs last  enabled at (315818): [<ffffffff8107a04c>] _local_bh_enable+0x1c/0x50
      [   66.389530] softirqs last disabled at (315819): [<ffffffff8107a50e>] irq_exit+0xbe/0xd0
      [   66.389559]
      [   66.389559] other info that might help us debug this:
      [   66.389580]  Possible unsafe locking scenario:
      [   66.389580]
      [   66.389598]        CPU0
      [   66.389609]        ----
      [   66.389620]   lock(&(&b->lock)->rlock);
      [   66.389650]   <Interrupt>
      [   66.389661]     lock(&(&b->lock)->rlock);
      [   66.389690]
      [   66.389690]  *** DEADLOCK ***
      [   66.389690]
      [   66.389715] 2 locks held by swapper/1/0:
      [   66.389728]  #0: (&(&tl->lock)->rlock){..-...}, at: [<ffffffff81403e01>] intel_lrc_irq_handler+0x201/0x3c0
      [   66.389785]  #1: (&(&req->lock)->rlock/1){..-...}, at: [<ffffffff813fc0af>] __i915_gem_request_submit+0x8f/0x170
      [   66.389854]
      [   66.389854] stack backtrace:
      [   66.389959] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.9.0-rc2+ #56
      [   66.389976] Hardware name:                  /        , BIOS PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015
      [   66.389999]  ffff88027fd03c58 ffffffff812beae5 ffff88027696e680 ffffffff822afe20
      [   66.390036]  ffff88027fd03ca8 ffffffff810bb420 0000000000000001 0000000000000000
      [   66.390070]  0000000000000000 0000000000000006 0000000000000004 ffff88027696ee10
      [   66.390104] Call Trace:
      [   66.390117]  <IRQ>
      [   66.390128]  [<ffffffff812beae5>] dump_stack+0x68/0x93
      [   66.390147]  [<ffffffff810bb420>] print_usage_bug+0x1d0/0x1e0
      [   66.390164]  [<ffffffff810bb8a0>] mark_lock+0x470/0x4f0
      [   66.390181]  [<ffffffff810ba9d0>] ? print_shortest_lock_dependencies+0x1b0/0x1b0
      [   66.390203]  [<ffffffff810bd75d>] __lock_acquire+0x59d/0x1870
      [   66.390221]  [<ffffffff810bedbc>] lock_acquire+0x6c/0xb0
      [   66.390237]  [<ffffffff810bedbc>] ? lock_acquire+0x6c/0xb0
      [   66.390255]  [<ffffffff81401c88>] ? intel_engine_enable_signaling+0x78/0x150
      [   66.390273]  [<ffffffff8161753a>] _raw_spin_lock+0x2a/0x40
      [   66.390291]  [<ffffffff81401c88>] ? intel_engine_enable_signaling+0x78/0x150
      [   66.390309]  [<ffffffff81401c88>] intel_engine_enable_signaling+0x78/0x150
      [   66.390327]  [<ffffffff813fc170>] __i915_gem_request_submit+0x150/0x170
      [   66.390345]  [<ffffffff81403e8b>] intel_lrc_irq_handler+0x28b/0x3c0
      [   66.390363]  [<ffffffff81079d97>] tasklet_action+0x57/0xc0
      [   66.390380]  [<ffffffff8107a249>] __do_softirq+0x119/0x240
      [   66.390396]  [<ffffffff8107a50e>] irq_exit+0xbe/0xd0
      [   66.390414]  [<ffffffff8101afd5>] do_IRQ+0x65/0x110
      [   66.390431]  [<ffffffff81618806>] common_interrupt+0x86/0x86
      [   66.390446]  <EOI>
      [   66.390457]  [<ffffffff814ec6d1>] ? cpuidle_enter_state+0x151/0x200
      [   66.390480]  [<ffffffff814ec7a2>] cpuidle_enter+0x12/0x20
      [   66.390498]  [<ffffffff810b639e>] call_cpuidle+0x1e/0x40
      [   66.390516]  [<ffffffff810b65ae>] cpu_startup_entry+0x10e/0x1f0
      [   66.390534]  [<ffffffff81036133>] start_secondary+0x103/0x130
      
      (This is split out of the defer global seqno allocation patch due to
      realisation that we need a more complete conversion if we want to defer
      request submission even further.)
      
      v2: lockdep was warning about mixed SOFTIRQ contexts not HARDIRQ
      contexts so we only need to use spin_lock_bh and not disable interrupts.
      
      v3: We need full irq protection as we may be called from a third party
      interrupt handler (via fences).
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-32-chris@chris-wilson.co.uk
      f6168e33
    • C
      drm/i915: Move the global sync optimisation to the timeline · 85e17f59
      Chris Wilson 提交于
      Currently we try to reduce the number of synchronisations (now the
      number of requests we need to wait upon) by noting that if we have
      earlier waited upon a request, all subsequent requests in the timeline
      will be after the wait. This only applies to requests in this timeline,
      as other timelines will not be ordered by that waiter.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-30-chris@chris-wilson.co.uk
      85e17f59
    • C
      drm/i915: Introduce a global_seqno for each request · 65e4760e
      Chris Wilson 提交于
      Though we will have multiple timelines, we still have a single timeline
      of execution. This we can use to provide an execution and retirement order
      of requests. This keeps tracking execution of requests simple, and vital
      for preserving a single waiter (i.e. so that we can order the waiters so
      that only the earliest to wakeup need be woken). To accomplish this we
      distinguish the seqno used to order requests per-context (external) and
      that used internally for execution.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-26-chris@chris-wilson.co.uk
      65e4760e
    • C
      drm/i915: Combine seqno + tracking into a global timeline struct · 73cb9701
      Chris Wilson 提交于
      Our timelines are more than just a seqno. They also provide an ordered
      list of requests to be executed. Due to the restriction of handling
      individual address spaces, we are limited to a timeline per address
      space but we use a fence context per engine within.
      
      Our first step to introducing independent timelines per context (i.e. to
      allow each context to have a queue of requests to execute that have a
      defined set of dependencies on other requests) is to provide a timeline
      abstraction for the global execution queue.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-23-chris@chris-wilson.co.uk
      73cb9701
    • C
      drm/i915: Move GEM activity tracking into a common struct reservation_object · d07f0e59
      Chris Wilson 提交于
      In preparation to support many distinct timelines, we need to expand the
      activity tracking on the GEM object to handle more than just a request
      per engine. We already use the struct reservation_object on the dma-buf
      to handle many fence contexts, so integrating that into the GEM object
      itself is the preferred solution. (For example, we can now share the same
      reservation_object between every consumer/producer using this buffer and
      skip the manual import/export via dma-buf.)
      
      v2: Reimplement busy-ioctl (by walking the reservation object), postpone
      the ABI change for another day. Similarly use the reservation object to
      find the last_write request (if active and from i915) for choosing
      display CS flips.
      
      Caveats:
      
       * busy-ioctl: busy-ioctl only reports on the native fences, it will not
      warn of stalls (in set-domain-ioctl, pread/pwrite etc) if the object is
      being rendered to by external fences. It also will not report the same
      busy state as wait-ioctl (or polling on the dma-buf) in the same
      circumstances. On the plus side, it does retain reporting of which
      *i915* engines are engaged with this object.
      
       * non-blocking atomic modesets take a step backwards as the wait for
      render completion blocks the ioctl. This is fixed in a subsequent
      patch to use a fence instead for awaiting on the rendering, see
      "drm/i915: Restore nonblocking awaits for modesetting"
      
       * dynamic array manipulation for shared-fences in reservation is slower
      than the previous lockless static assignment (e.g. gem_exec_lut_handle
      runtime on ivb goes from 42s to 66s), mainly due to atomic operations
      (maintaining the fence refcounts).
      
       * loss of object-level retirement callbacks, emulated by VMA retirement
      tracking.
      
       * minor loss of object-level last activity information from debugfs,
      could be replaced with per-vma information if desired
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-21-chris@chris-wilson.co.uk
      d07f0e59
    • C
      drm/i915: Refactor object page API · a4f5ea64
      Chris Wilson 提交于
      The plan is to make obtaining the backing storage for the object avoid
      struct_mutex (i.e. use its own locking). The first step is to update the
      API so that normal users only call pin/unpin whilst working on the
      backing storage.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-12-chris@chris-wilson.co.uk
      a4f5ea64
  3. 25 10月, 2016 2 次提交
  4. 20 10月, 2016 1 次提交
  5. 14 10月, 2016 5 次提交
  6. 13 10月, 2016 2 次提交
  7. 12 10月, 2016 6 次提交
  8. 05 10月, 2016 2 次提交
  9. 21 9月, 2016 2 次提交
  10. 08 9月, 2016 1 次提交
  11. 06 9月, 2016 1 次提交
  12. 22 8月, 2016 1 次提交
  13. 20 8月, 2016 1 次提交
  14. 19 8月, 2016 1 次提交
  15. 15 8月, 2016 8 次提交