1. 01 11月, 2016 1 次提交
    • C
      drm/i915: Avoid accessing request->timeline outside of its lifetime · cb399eab
      Chris Wilson 提交于
      Whilst waiting on a request, we may do so without holding any locks or
      any guards beyond a reference to the request. In order to avoid taking
      locks within request deallocation, we drop references to its timeline
      (via the context and ppgtt) upon retirement. We should avoid chasing
      such pointers outside of their control, in particular we inspect the
      request->timeline to see if we may restore the RPS waitboost for a
      client. If we instead look at the engine->timeline, we will have similar
      behaviour on both full-ppgtt and !full-ppgtt systems and reduce the
      amount of reward we give towards stalling clients (i.e. only if the
      client stalls and the GPU is uncontended does it reclaim its boost).
      This restores behaviour back to pre-timelines, whilst fixing:
      
      [  645.078485] BUG: KASAN: use-after-free in i915_gem_object_wait_fence+0x1ee/0x2e0 at addr ffff8802335643a0
      [  645.078577] Read of size 4 by task gem_exec_schedu/28408
      [  645.078638] CPU: 1 PID: 28408 Comm: gem_exec_schedu Not tainted 4.9.0-rc2+ #64
      [  645.078724] Hardware name:                  /        , BIOS PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015
      [  645.078816]  ffff88022daef9a0 ffffffff8143d059 ffff880235402a80 ffff880233564200
      [  645.078998]  ffff88022daef9c8 ffffffff81229c5c ffff88022daefa48 ffff880233564200
      [  645.079172]  ffff880235402a80 ffff88022daefa38 ffffffff81229ef0 000000008110a796
      [  645.079345] Call Trace:
      [  645.079404]  [<ffffffff8143d059>] dump_stack+0x68/0x9f
      [  645.079467]  [<ffffffff81229c5c>] kasan_object_err+0x1c/0x70
      [  645.079534]  [<ffffffff81229ef0>] kasan_report_error+0x1f0/0x4b0
      [  645.079601]  [<ffffffff8122a244>] kasan_report+0x34/0x40
      [  645.079676]  [<ffffffff81634f5e>] ? i915_gem_object_wait_fence+0x1ee/0x2e0
      [  645.079741]  [<ffffffff81229951>] __asan_load4+0x61/0x80
      [  645.079807]  [<ffffffff81634f5e>] i915_gem_object_wait_fence+0x1ee/0x2e0
      [  645.079876]  [<ffffffff816364bf>] i915_gem_object_wait+0x19f/0x590
      [  645.079944]  [<ffffffff81636320>] ? i915_gem_object_wait_priority+0x500/0x500
      [  645.080016]  [<ffffffff8110fb30>] ? debug_show_all_locks+0x1e0/0x1e0
      [  645.080084]  [<ffffffff8110abdc>] ? check_chain_key+0x14c/0x210
      [  645.080157]  [<ffffffff8110a796>] ? __lock_is_held+0x46/0xc0
      [  645.080226]  [<ffffffff8163bc61>] ? i915_gem_set_domain_ioctl+0x141/0x690
      [  645.080296]  [<ffffffff8163bcc2>] i915_gem_set_domain_ioctl+0x1a2/0x690
      [  645.080366]  [<ffffffff811f8f85>] ? __might_fault+0x75/0xe0
      [  645.080433]  [<ffffffff815a55f7>] drm_ioctl+0x327/0x640
      [  645.080508]  [<ffffffff8163bb20>] ? i915_gem_obj_prepare_shmem_write+0x3a0/0x3a0
      [  645.080603]  [<ffffffff815a52d0>] ? drm_ioctl_permit+0x120/0x120
      [  645.080670]  [<ffffffff8110abdc>] ? check_chain_key+0x14c/0x210
      [  645.080738]  [<ffffffff81275717>] do_vfs_ioctl+0x127/0xa20
      [  645.080804]  [<ffffffff8120268c>] ? do_mmap+0x47c/0x580
      [  645.080871]  [<ffffffff811da567>] ? vm_mmap_pgoff+0x117/0x140
      [  645.080938]  [<ffffffff812755f0>] ? ioctl_preallocate+0x150/0x150
      [  645.081011]  [<ffffffff81108c53>] ? up_write+0x23/0x50
      [  645.081078]  [<ffffffff811da567>] ? vm_mmap_pgoff+0x117/0x140
      [  645.081145]  [<ffffffff811da450>] ? vma_is_stack_for_current+0x90/0x90
      [  645.081214]  [<ffffffff8110d853>] ? mark_held_locks+0x23/0xc0
      [  645.082030]  [<ffffffff81288408>] ? __fget+0x168/0x250
      [  645.082106]  [<ffffffff819ad517>] ? entry_SYSCALL_64_fastpath+0x5/0xb1
      [  645.082176]  [<ffffffff81288592>] ? __fget_light+0xa2/0xc0
      [  645.082242]  [<ffffffff8127604c>] SyS_ioctl+0x3c/0x70
      [  645.082309]  [<ffffffff819ad52e>] entry_SYSCALL_64_fastpath+0x1c/0xb1
      [  645.082374] Object at ffff880233564200, in cache kmalloc-8192 size: 8192
      [  645.082431] Allocated:
      [  645.082480] PID = 28408
      [  645.082535]  [  645.082566] [<ffffffff8103ae66>] save_stack_trace+0x16/0x20
      [  645.082623]  [  645.082656] [<ffffffff81228b06>] save_stack+0x46/0xd0
      [  645.082716]  [  645.082756] [<ffffffff812292fd>] kasan_kmalloc+0xad/0xe0
      [  645.082817]  [  645.082848] [<ffffffff81631752>] i915_ppgtt_create+0x52/0x220
      [  645.082908]  [  645.082941] [<ffffffff8161db96>] i915_gem_create_context+0x396/0x560
      [  645.083027]  [  645.083059] [<ffffffff8161f857>] i915_gem_context_create_ioctl+0x97/0xf0
      [  645.083152]  [  645.083183] [<ffffffff815a55f7>] drm_ioctl+0x327/0x640
      [  645.083243]  [  645.083274] [<ffffffff81275717>] do_vfs_ioctl+0x127/0xa20
      [  645.083334]  [  645.083372] [<ffffffff8127604c>] SyS_ioctl+0x3c/0x70
      [  645.083432]  [  645.083464] [<ffffffff819ad52e>] entry_SYSCALL_64_fastpath+0x1c/0xb1
      [  645.083551] Freed:
      [  645.083599] PID = 27629
      [  645.083648]  [  645.083676] [<ffffffff8103ae66>] save_stack_trace+0x16/0x20
      [  645.083738]  [  645.083770] [<ffffffff81228b06>] save_stack+0x46/0xd0
      [  645.083830]  [  645.083862] [<ffffffff81229203>] kasan_slab_free+0x73/0xc0
      [  645.083922]  [  645.083961] [<ffffffff812279c9>] kfree+0xa9/0x170
      [  645.084021]  [  645.084053] [<ffffffff81629f60>] i915_ppgtt_release+0x100/0x180
      [  645.084139]  [  645.084171] [<ffffffff8161d414>] i915_gem_context_free+0x1b4/0x230
      [  645.084257]  [  645.084288] [<ffffffff816537b2>] intel_lr_context_unpin+0x192/0x230
      [  645.084380]  [  645.084413] [<ffffffff81645250>] i915_gem_request_retire+0x620/0x630
      [  645.084500]  [  645.085226] [<ffffffff816473d1>] i915_gem_retire_requests+0x181/0x280
      [  645.085313]  [  645.085352] [<ffffffff816352ba>] i915_gem_retire_work_handler+0xca/0xe0
      [  645.085440]  [  645.085471] [<ffffffff810c725b>] process_one_work+0x4fb/0x920
      [  645.085532]  [  645.085562] [<ffffffff810c770d>] worker_thread+0x8d/0x840
      [  645.085622]  [  645.085653] [<ffffffff810d21e5>] kthread+0x185/0x1b0
      [  645.085718]  [  645.085750] [<ffffffff819ad7a7>] ret_from_fork+0x27/0x40
      [  645.085811] Memory state around the buggy address:
      [  645.085869]  ffff880233564280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [  645.085956]  ffff880233564300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [  645.086053] >ffff880233564380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [  645.086138]                                ^
      [  645.086193]  ffff880233564400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [  645.086283]  ffff880233564480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      
      v2: Add a comment to document the hint like nature of
       intel_engine_last_submit()
      
      Fixes: 73cb9701 ("drm/i915: Combine seqno + tracking into a global timeline struct")
      Fixes: 80b204bc ("drm/i915: Enable multiple timelines")
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20161101100317.11129-1-chris@chris-wilson.co.uk
      cb399eab
  2. 29 10月, 2016 10 次提交
    • C
      drm/i915: Enable multiple timelines · 80b204bc
      Chris Wilson 提交于
      With the infrastructure converted over to tracking multiple timelines in
      the GEM API whilst preserving the efficiency of using a single execution
      timeline internally, we can now assign a separate timeline to every
      context with full-ppgtt.
      
      v2: Add a comment to indicate the xfer between timelines upon submission.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-35-chris@chris-wilson.co.uk
      80b204bc
    • C
      drm/i915: Convert breadcrumbs spinlock to be irqsafe · f6168e33
      Chris Wilson 提交于
      The breadcrumbs are about to be used from within IRQ context sections
      (e.g. nouveau signals a fence from an interrupt handler causing us to
      submit a new request) and/or from bottom-half tasklets (i.e.
      intel_lrc_irq_handler), therefore we need to employ the irqsafe spinlock
      variants.
      
      For example, deferring the request submission to the
      intel_lrc_irq_handler generates this trace:
      
      [   66.388639] =================================
      [   66.388650] [ INFO: inconsistent lock state ]
      [   66.388663] 4.9.0-rc2+ #56 Not tainted
      [   66.388672] ---------------------------------
      [   66.388682] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
      [   66.388695] swapper/1/0 [HC0[0]:SC1[1]:HE0:SE0] takes:
      [   66.388706]  (&(&b->lock)->rlock){+.?...} , at: [<ffffffff81401c88>] intel_engine_enable_signaling+0x78/0x150
      [   66.388761] {SOFTIRQ-ON-W} state was registered at:
      [   66.388772]   [   66.388783] [<ffffffff810bd842>] __lock_acquire+0x682/0x1870
      [   66.388795]   [   66.388803] [<ffffffff810bedbc>] lock_acquire+0x6c/0xb0
      [   66.388814]   [   66.388824] [<ffffffff8161753a>] _raw_spin_lock+0x2a/0x40
      [   66.388835]   [   66.388845] [<ffffffff81401e41>] intel_engine_reset_breadcrumbs+0x21/0xb0
      [   66.388857]   [   66.388866] [<ffffffff81403ae7>] gen8_init_common_ring+0x67/0x100
      [   66.388878]   [   66.388887] [<ffffffff81403b92>] gen8_init_render_ring+0x12/0x60
      [   66.388903]   [   66.388912] [<ffffffff813f8707>] i915_gem_init_hw+0xf7/0x2a0
      [   66.388927]   [   66.388936] [<ffffffff813f899b>] i915_gem_init+0xbb/0xf0
      [   66.388950]   [   66.388959] [<ffffffff813b4980>] i915_driver_load+0x7e0/0x1330
      [   66.388978]   [   66.388988] [<ffffffff813c09d8>] i915_pci_probe+0x28/0x40
      [   66.389003]   [   66.389013] [<ffffffff812fa0db>] pci_device_probe+0x8b/0xf0
      [   66.389028]   [   66.389037] [<ffffffff8147737e>] driver_probe_device+0x21e/0x430
      [   66.389056]   [   66.389065] [<ffffffff8147766e>] __driver_attach+0xde/0xe0
      [   66.389080]   [   66.389090] [<ffffffff814751ad>] bus_for_each_dev+0x5d/0x90
      [   66.389105]   [   66.389113] [<ffffffff81477799>] driver_attach+0x19/0x20
      [   66.389134]   [   66.389144] [<ffffffff81475ced>] bus_add_driver+0x15d/0x260
      [   66.389159]   [   66.389168] [<ffffffff81477e3b>] driver_register+0x5b/0xd0
      [   66.389183]   [   66.389281] [<ffffffff812fa19b>] __pci_register_driver+0x5b/0x60
      [   66.389301]   [   66.389312] [<ffffffff81aed333>] i915_init+0x3e/0x45
      [   66.389326]   [   66.389336] [<ffffffff81ac2ffa>] do_one_initcall+0x8b/0x118
      [   66.389350]   [   66.389359] [<ffffffff81ac323a>] kernel_init_freeable+0x1b3/0x23b
      [   66.389378]   [   66.389387] [<ffffffff8160fc39>] kernel_init+0x9/0x100
      [   66.389402]   [   66.389411] [<ffffffff816180e7>] ret_from_fork+0x27/0x40
      [   66.389426] irq event stamp: 315865
      [   66.389438] hardirqs last  enabled at (315864): [<ffffffff816178f1>] _raw_spin_unlock_irqrestore+0x31/0x50
      [   66.389469] hardirqs last disabled at (315865): [<ffffffff816176b3>] _raw_spin_lock_irqsave+0x13/0x50
      [   66.389499] softirqs last  enabled at (315818): [<ffffffff8107a04c>] _local_bh_enable+0x1c/0x50
      [   66.389530] softirqs last disabled at (315819): [<ffffffff8107a50e>] irq_exit+0xbe/0xd0
      [   66.389559]
      [   66.389559] other info that might help us debug this:
      [   66.389580]  Possible unsafe locking scenario:
      [   66.389580]
      [   66.389598]        CPU0
      [   66.389609]        ----
      [   66.389620]   lock(&(&b->lock)->rlock);
      [   66.389650]   <Interrupt>
      [   66.389661]     lock(&(&b->lock)->rlock);
      [   66.389690]
      [   66.389690]  *** DEADLOCK ***
      [   66.389690]
      [   66.389715] 2 locks held by swapper/1/0:
      [   66.389728]  #0: (&(&tl->lock)->rlock){..-...}, at: [<ffffffff81403e01>] intel_lrc_irq_handler+0x201/0x3c0
      [   66.389785]  #1: (&(&req->lock)->rlock/1){..-...}, at: [<ffffffff813fc0af>] __i915_gem_request_submit+0x8f/0x170
      [   66.389854]
      [   66.389854] stack backtrace:
      [   66.389959] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.9.0-rc2+ #56
      [   66.389976] Hardware name:                  /        , BIOS PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015
      [   66.389999]  ffff88027fd03c58 ffffffff812beae5 ffff88027696e680 ffffffff822afe20
      [   66.390036]  ffff88027fd03ca8 ffffffff810bb420 0000000000000001 0000000000000000
      [   66.390070]  0000000000000000 0000000000000006 0000000000000004 ffff88027696ee10
      [   66.390104] Call Trace:
      [   66.390117]  <IRQ>
      [   66.390128]  [<ffffffff812beae5>] dump_stack+0x68/0x93
      [   66.390147]  [<ffffffff810bb420>] print_usage_bug+0x1d0/0x1e0
      [   66.390164]  [<ffffffff810bb8a0>] mark_lock+0x470/0x4f0
      [   66.390181]  [<ffffffff810ba9d0>] ? print_shortest_lock_dependencies+0x1b0/0x1b0
      [   66.390203]  [<ffffffff810bd75d>] __lock_acquire+0x59d/0x1870
      [   66.390221]  [<ffffffff810bedbc>] lock_acquire+0x6c/0xb0
      [   66.390237]  [<ffffffff810bedbc>] ? lock_acquire+0x6c/0xb0
      [   66.390255]  [<ffffffff81401c88>] ? intel_engine_enable_signaling+0x78/0x150
      [   66.390273]  [<ffffffff8161753a>] _raw_spin_lock+0x2a/0x40
      [   66.390291]  [<ffffffff81401c88>] ? intel_engine_enable_signaling+0x78/0x150
      [   66.390309]  [<ffffffff81401c88>] intel_engine_enable_signaling+0x78/0x150
      [   66.390327]  [<ffffffff813fc170>] __i915_gem_request_submit+0x150/0x170
      [   66.390345]  [<ffffffff81403e8b>] intel_lrc_irq_handler+0x28b/0x3c0
      [   66.390363]  [<ffffffff81079d97>] tasklet_action+0x57/0xc0
      [   66.390380]  [<ffffffff8107a249>] __do_softirq+0x119/0x240
      [   66.390396]  [<ffffffff8107a50e>] irq_exit+0xbe/0xd0
      [   66.390414]  [<ffffffff8101afd5>] do_IRQ+0x65/0x110
      [   66.390431]  [<ffffffff81618806>] common_interrupt+0x86/0x86
      [   66.390446]  <EOI>
      [   66.390457]  [<ffffffff814ec6d1>] ? cpuidle_enter_state+0x151/0x200
      [   66.390480]  [<ffffffff814ec7a2>] cpuidle_enter+0x12/0x20
      [   66.390498]  [<ffffffff810b639e>] call_cpuidle+0x1e/0x40
      [   66.390516]  [<ffffffff810b65ae>] cpu_startup_entry+0x10e/0x1f0
      [   66.390534]  [<ffffffff81036133>] start_secondary+0x103/0x130
      
      (This is split out of the defer global seqno allocation patch due to
      realisation that we need a more complete conversion if we want to defer
      request submission even further.)
      
      v2: lockdep was warning about mixed SOFTIRQ contexts not HARDIRQ
      contexts so we only need to use spin_lock_bh and not disable interrupts.
      
      v3: We need full irq protection as we may be called from a third party
      interrupt handler (via fences).
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-32-chris@chris-wilson.co.uk
      f6168e33
    • C
      drm/i915: Move the global sync optimisation to the timeline · 85e17f59
      Chris Wilson 提交于
      Currently we try to reduce the number of synchronisations (now the
      number of requests we need to wait upon) by noting that if we have
      earlier waited upon a request, all subsequent requests in the timeline
      will be after the wait. This only applies to requests in this timeline,
      as other timelines will not be ordered by that waiter.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-30-chris@chris-wilson.co.uk
      85e17f59
    • C
      drm/i915: Defer breadcrumb emission · caddfe71
      Chris Wilson 提交于
      Move the actual emission of the breadcrumb for closing the request from
      i915_add_request() to the submit callback. (It can be moved later when
      required.) This allows us to defer the allocation of the global_seqno
      from request construction to actual submission, allowing us to emit the
      requests out of order (wrt to the order of their construction, they
      still will only be executed one all of their dependencies are resolved
      including that all earlier requests on their timeline have been
      submitted.) We have to specialise how we then emit the request in order
      to write into the preallocated space, rather than at the tail of the
      ringbuffer (which will have been advanced by the addition of new
      requests).
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-29-chris@chris-wilson.co.uk
      caddfe71
    • C
      drm/i915: Record space required for breadcrumb emission · 98f29e8d
      Chris Wilson 提交于
      In the next patch, we will use deferred breadcrumb emission. That requires
      reserving sufficient space in the ringbuffer to emit the breadcrumb, which
      first requires us to know how large the breadcrumb is.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-28-chris@chris-wilson.co.uk
      98f29e8d
    • C
      drm/i915: Rename ->emit_request to ->emit_breadcrumb · 9b81d556
      Chris Wilson 提交于
      Now that the emission of the request tail and its submission to hardware
      are two separate steps, engine->emit_request() is confusing.
      engine->emit_request() is called to emit the breadcrumb commands for the
      request into the ring, name it such (engine->emit_breadcrumb).
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-27-chris@chris-wilson.co.uk
      9b81d556
    • C
      drm/i915: Combine seqno + tracking into a global timeline struct · 73cb9701
      Chris Wilson 提交于
      Our timelines are more than just a seqno. They also provide an ordered
      list of requests to be executed. Due to the restriction of handling
      individual address spaces, we are limited to a timeline per address
      space but we use a fence context per engine within.
      
      Our first step to introducing independent timelines per context (i.e. to
      allow each context to have a queue of requests to execute that have a
      defined set of dependencies on other requests) is to provide a timeline
      abstraction for the global execution queue.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-23-chris@chris-wilson.co.uk
      73cb9701
    • C
      drm/i915: Reuse the active golden render state batch · 4e50f082
      Chris Wilson 提交于
      The golden render state is constant, but we recreate the batch setting
      it up for every new context. If we keep that batch in a volatile cache
      we can safely reuse it whenever we need to initialise a new context. We
      mark the pages as purgeable and use the shrinker to recover pages from
      the batch whenever we face memory pressues, recreating that batch afresh
      on the next new context.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NJoonas Lahtinen <joonas.lahtien@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-8-chris@chris-wilson.co.uk
      4e50f082
    • C
      drm/i915: Rearrange i915_wait_request() accounting with callers · e95433c7
      Chris Wilson 提交于
      Our low-level wait routine has evolved from our generic wait interface
      that handled unlocked, RPS boosting, waits with time tracking. If we
      push our GEM fence tracking to use reservation_objects (required for
      handling multiple timelines), we lose the ability to pass the required
      information down to i915_wait_request(). However, if we push the extra
      functionality from i915_wait_request() to the individual callsites
      (i915_gem_object_wait_rendering and i915_gem_wait_ioctl) that make use
      of those extras, we can both simplify our low level wait and prepare for
      extending the GEM interface for use of reservation_objects.
      
      v2: Rewrite i915_wait_request() kerneldocs
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Matthew Auld <matthew.william.auld@gmail.com>
      Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-4-chris@chris-wilson.co.uk
      e95433c7
    • C
      drm/i915: Remove unused i915_gem_active_wait() in favour of _unlocked() · 2e36991a
      Chris Wilson 提交于
      Since we only use the more generic unlocked variant, just rename it as
      the normal i915_gem_active_wait(). The temporary cost is that we need to
      always acquire the reference in a RCU safe manner, but the benefit is
      that we will combine the common paths.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-5-chris@chris-wilson.co.uk
      2e36991a
  3. 14 10月, 2016 1 次提交
    • A
      drm/i915: Allocate intel_engine_cs structure only for the enabled engines · 3b3f1650
      Akash Goel 提交于
      With the possibility of addition of many more number of rings in future,
      the drm_i915_private structure could bloat as an array, of type
      intel_engine_cs, is embedded inside it.
      	struct intel_engine_cs engine[I915_NUM_ENGINES];
      Though this is still fine as generally there is only a single instance of
      drm_i915_private structure used, but not all of the possible rings would be
      enabled or active on most of the platforms. Some memory can be saved by
      allocating intel_engine_cs structure only for the enabled/active engines.
      Currently the engine/ring ID is kept static and dev_priv->engine[] is simply
      indexed using the enums defined in intel_engine_id.
      To save memory and continue using the static engine/ring IDs, 'engine' is
      defined as an array of pointers.
      	struct intel_engine_cs *engine[I915_NUM_ENGINES];
      dev_priv->engine[engine_ID] will be NULL for disabled engine instances.
      
      There is a text size reduction of 928 bytes, from 1028200 to 1027272, for
      i915.o file (but for i915.ko file text size remain same as 1193131 bytes).
      
      v2:
      - Remove the engine iterator field added in drm_i915_private structure,
        instead pass a local iterator variable to the for_each_engine**
        macros. (Chris)
      - Do away with intel_engine_initialized() and instead directly use the
        NULL pointer check on engine pointer. (Chris)
      
      v3:
      - Remove for_each_engine_id() macro, as the updated macro for_each_engine()
        can be used in place of it. (Chris)
      - Protect the access to Render engine Fault register with a NULL check, as
        engine specific init is done later in Driver load sequence.
      
      v4:
      - Use !!dev_priv->engine[VCS] style for the engine check in getparam. (Chris)
      - Kill the superfluous init_engine_lists().
      
      v5:
      - Cleanup the intel_engines_init() & intel_engines_setup(), with respect to
        allocation of intel_engine_cs structure. (Chris)
      
      v6:
      - Rebase.
      
      v7:
      - Optimize the for_each_engine_masked() macro. (Chris)
      - Change the type of 'iter' local variable to enum intel_engine_id. (Chris)
      - Rebase.
      
      v8: Rebase.
      
      v9: Rebase.
      
      v10:
      - For index calculation use engine ID instead of pointer based arithmetic in
        intel_engine_sync_index() as engine pointers are not contiguous now (Chris)
      - For appropriateness, rename local enum variable 'iter' to 'id'. (Joonas)
      - Use for_each_engine macro for cleanup in intel_engines_init() and remove
        check for NULL engine pointer in cleanup() routines. (Joonas)
      
      v11: Rebase.
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NAkash Goel <akash.goel@intel.com>
      Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/1476378888-7372-1-git-send-email-akash.goel@intel.com
      3b3f1650
  4. 12 10月, 2016 1 次提交
  5. 10 10月, 2016 2 次提交
  6. 07 10月, 2016 2 次提交
  7. 05 10月, 2016 1 次提交
  8. 21 9月, 2016 2 次提交
  9. 09 9月, 2016 4 次提交
    • C
      drm/i915: Drive request submission through fence callbacks · 5590af3e
      Chris Wilson 提交于
      Drive final request submission from a callback from the fence. This way
      the request is queued until all dependencies are resolved, at which
      point it is handed to the backend for queueing to hardware. At this
      point, no dependencies are set on the request, so the callback is
      immediate.
      
      A side-effect of imposing a heavier-irqsafe spinlock for execlist
      submission is that we lose the softirq enabling after scheduling the
      execlists tasklet. To compensate, we manually kickstart the softirq by
      disabling and enabling the bh around the fence signaling.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: NJohn Harrison <john.c.harrison@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20160909131201.16673-14-chris@chris-wilson.co.uk
      5590af3e
    • C
      drm/i915: Update reset path to fix incomplete requests · 821ed7df
      Chris Wilson 提交于
      Update reset path in preparation for engine reset which requires
      identification of incomplete requests and associated context and fixing
      their state so that engine can resume correctly after reset.
      
      The request that caused the hang will be skipped and head is reset to the
      start of breadcrumb. This allows us to resume from where we left-off.
      Since this request didn't complete normally we also need to cleanup elsp
      queue manually. This is vital if we employ nonblocking request
      submission where we may have a web of dependencies upon the hung request
      and so advancing the seqno manually is no longer trivial.
      
      ABI: gem_reset_stats / DRM_IOCTL_I915_GET_RESET_STATS
      
      We change the way we count pending batches. Only the active context
      involved in the reset is marked as either innocent or guilty, and not
      mark the entire world as pending. By inspection this only affects
      igt/gem_reset_stats (which assumes implementation details) and not
      piglit.
      
      ARB_robustness gives this guide on how we expect the user of this
      interface to behave:
      
       * Provide a mechanism for an OpenGL application to learn about
         graphics resets that affect the context.  When a graphics reset
         occurs, the OpenGL context becomes unusable and the application
         must create a new context to continue operation. Detecting a
         graphics reset happens through an inexpensive query.
      
      And with regards to the actual meaning of the reset values:
      
         Certain events can result in a reset of the GL context. Such a reset
         causes all context state to be lost. Recovery from such events
         requires recreation of all objects in the affected context. The
         current status of the graphics reset state is returned by
      
      	enum GetGraphicsResetStatusARB();
      
         The symbolic constant returned indicates if the GL context has been
         in a reset state at any point since the last call to
         GetGraphicsResetStatusARB. NO_ERROR indicates that the GL context
         has not been in a reset state since the last call.
         GUILTY_CONTEXT_RESET_ARB indicates that a reset has been detected
         that is attributable to the current GL context.
         INNOCENT_CONTEXT_RESET_ARB indicates a reset has been detected that
         is not attributable to the current GL context.
         UNKNOWN_CONTEXT_RESET_ARB indicates a detected graphics reset whose
         cause is unknown.
      
      The language here is explicit in that we must mark up the guilty batch,
      but is loose enough for us to relax the innocent (i.e. pending)
      accounting as only the active batches are involved with the reset.
      
      In the future, we are looking towards single engine resetting (with
      minimal locking), where it seems inappropriate to mark the entire world
      as innocent since the reset occurred on a different engine. Reducing the
      information available means we only have to encounter the pain once, and
      also reduces the information leaking from one context to another.
      
      v2: Legacy ringbuffer submission required a reset following hibernation,
      or else we restore stale values to the RING_HEAD and walked over
      stolen garbage.
      
      v3: GuC requires replaying the requests after a reset.
      
      v4: Restore engine IRQ after reset (so waiters will be woken!)
          Rearm hangcheck if resetting with a waiter.
      
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Mika Kuoppala <mika.kuoppala@intel.com>
      Cc: Arun Siluvery <arun.siluvery@linux.intel.com>
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20160909131201.16673-13-chris@chris-wilson.co.uk
      821ed7df
    • C
      drm/i915: Expand bool interruptible to pass flags to i915_wait_request() · ea746f36
      Chris Wilson 提交于
      We need finer control over wakeup behaviour during i915_wait_request(),
      so expand the current bool interruptible to a bitmask.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20160909131201.16673-9-chris@chris-wilson.co.uk
      ea746f36
    • C
      drm/i915: Simplify ELSP queue request tracking · 70c2a24d
      Chris Wilson 提交于
      Emulate HW to track and manage ELSP queue. A set of SW ports are defined
      and requests are assigned to these ports before submitting them to HW. This
      helps in cleaning up incomplete requests during reset recovery easier
      especially after engine reset by decoupling elsp queue management. This
      will become more clear in the next patch.
      
      In the engine reset case we want to resume where we left-off after skipping
      the incomplete batch which requires checking the elsp queue, removing
      element and fixing elsp_submitted counts in some cases. Instead of directly
      manipulating the elsp queue from reset path we can examine these ports, fix
      up ringbuffer pointers using the incomplete request and restart submissions
      again after reset.
      
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Mika Kuoppala <mika.kuoppala@intel.com>
      Cc: Arun Siluvery <arun.siluvery@linux.intel.com>
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Link: http://patchwork.freedesktop.org/patch/msgid/1470414607-32453-3-git-send-email-arun.siluvery@linux.intel.comReviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20160909131201.16673-6-chris@chris-wilson.co.uk
      70c2a24d
  10. 19 8月, 2016 1 次提交
  11. 17 8月, 2016 2 次提交
  12. 15 8月, 2016 5 次提交
  13. 10 8月, 2016 3 次提交
  14. 05 8月, 2016 1 次提交
    • C
      drm/i915: Enable i915_gem_wait_for_idle() without holding struct_mutex · dcff85c8
      Chris Wilson 提交于
      The principal motivation for this was to try and eliminate the
      struct_mutex from i915_gem_suspend - but we still need to hold the mutex
      current for the i915_gem_context_lost(). (The issue there is that there
      may be an indirect lockdep cycle between cpu_hotplug (i.e. suspend) and
      struct_mutex via the stop_machine().) For the moment, enabling last
      request tracking for the engine, allows us to do busyness checking and
      waiting without requiring the struct_mutex - which is useful in its own
      right.
      
      As a side-effect of having a robust means for tracking engine busyness,
      we can replace our other busyness heuristic, that of comparing against
      the last submitted seqno. For paranoid reasons, we have a semi-ordered
      check of that seqno inside the hangchecker, which we can now improve to
      an ordered check of the engine's busyness (removing a locked xchg in the
      process).
      
      v2: Pass along "bool interruptible" as being unlocked we cannot rely on
      i915->mm.interruptible being stable or even under our control.
      v3: Replace check Ironlake i915_gpu_busy() with the common precalculated value
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-6-git-send-email-chris@chris-wilson.co.uk
      dcff85c8
  15. 04 8月, 2016 2 次提交
  16. 03 8月, 2016 2 次提交