1. 07 Dec 2016, 1 commit
2. 06 Dec 2016, 1 commit
• drm/i915: Use memcpy_from_wc for GPU error capture · d637c178
  Authored by Chris Wilson
On all platforms we now always read the contents of buffers via the GTT,
i.e. using WC cpu access. Reads are slow, but they can be accelerated
with an internal read buffer using sse4.1 (movntdqa). This is our
i915_memcpy_from_wc() routine, which also checks for sse4.1 support
so that we can fall back to using a regular, slow memcpy if we need to.
      
When compressing the pages, the reads are currently done inside zlib's
fill_window() routine, so we must first copy each page into a temporary
buffer, which is then already inside the CPU cache and fast for zlib's
compression. When not compressing the pages, we don't need a temporary
and can use the accelerated read from WC straight into the destination.
      
      v2: Use zstream locals to reduce diff and allocate the additional
      temporary storage only if sse4.1 is supported.
      v3: Use length=0 for the sse4.1 support check
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161206124051.17040-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
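As a rough, user-space sketch of the copy strategy this commit describes
(a runtime sse4.1 check, streaming loads from WC memory, and a plain
memcpy fallback) — the function, alignment handling and use of a single
16-byte load are illustrative assumptions, not the driver's actual
i915_memcpy_from_wc() implementation, which streams through an internal
cache-resident buffer:

    #include <immintrin.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    /* Illustrative only; compile with -msse4.1. */
    static void memcpy_from_wc(void *dst, const void *src, size_t len)
    {
            if (__builtin_cpu_supports("sse4.1") &&
                !((uintptr_t)src & 15) && !(len & 15)) {
                    __m128i *d = dst;
                    const __m128i *s = src;

                    /* movntdqa: non-temporal load that pulls WC data
                     * through a read buffer instead of issuing slow
                     * uncached reads. */
                    for (; len >= 16; len -= 16)
                            _mm_storeu_si128(d++,
                                    _mm_stream_load_si128((__m128i *)s++));
            } else {
                    memcpy(dst, src, len);  /* fallback: slow WC reads */
            }
    }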
3. 02 Dec 2016, 1 commit
4. 21 Nov 2016, 5 commits
• drm/i915: Wipe hang stats as an embedded struct · bc1d53c6
  Authored by Mika Kuoppala
      Bannable property, banned status, guilty and active counts are
      properties of i915_gem_context. Make them so.
      
      v2: rebase
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/1479309634-28574-1-git-send-email-mika.kuoppala@intel.com
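Illustratively, the fields end up embedded roughly like this; the names
are assumptions based on the commit message, not a quote of the driver
source:

    #include <stdbool.h>

    struct i915_gem_context {
            /* ... */
            unsigned int guilty_count;  /* hangs this context caused */
            unsigned int active_count;  /* hangs while active, innocent */
            bool bannable;              /* may this context be banned? */
            bool banned;                /* has it been banned? */
    };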
• drm/i915: Add per client max context ban limit · b083a087
  Authored by Mika Kuoppala
If we have a bad client submitting unfavourably across different
contexts, creating new ones as it goes, per-context scoring of
badness doesn't remove the root cause: the offending client.
To counter this, keep track of context bans per client, and deny
access if a client is responsible for more than 3 context bans in
its lifetime.
      
      v2: move ban check to context create ioctl (Chris)
      v3: add commentary about hangs needed to reach client ban (Chris)
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
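A sketch of the policy as described, with invented names (the real
bookkeeping lives in the driver's file-private state, and per v2 the
check happens in the context-create ioctl):

    #include <errno.h>

    #define CLIENT_CONTEXT_BAN_LIMIT 3  /* "more than 3" per the text */

    struct client {
            unsigned int context_bans;  /* lifetime count, never reset */
    };

    /* Deny new contexts once the client itself is deemed bad. */
    static int check_client_ban(const struct client *c)
    {
            return c->context_bans > CLIENT_CONTEXT_BAN_LIMIT ? -EIO : 0;
    }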
• drm/i915: Add bannable context parameter · 84102171
  Authored by Mika Kuoppala
Now that the driver has per-context scoring of 'hanging badness',
and subsequent hangs during short windows are allowed
as long as progress is made in between, it no longer makes sense
to expose a ban timing window as a context parameter.

Let the scoring be the sole indicator for the ban policy, and
repurpose the ban-period context parameter as a boolean to get/set
the context's bannable property.
      
      v2: allow non root to opt into being banned (Chris)
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
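In outline, and with hypothetical names, the parameter becomes a
boolean toggle rather than a time window. The v2 rule, letting an
unprivileged caller opt *into* being bannable while still requiring
privilege to opt out, is sketched here as an assumption:

    #include <errno.h>
    #include <stdbool.h>
    #include <stdint.h>

    struct ctx { bool bannable; };

    static int set_bannable(struct ctx *ctx, uint64_t value, bool capable)
    {
            if (value > 1)
                    return -EINVAL;  /* strictly a boolean now */
            if (!value && !capable)
                    return -EPERM;   /* opting out needs privilege */
            ctx->bannable = value;   /* opting in is always allowed */
            return 0;
    }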
• drm/i915: Decouple hang detection from hangcheck period · 3fe3b030
  Authored by Mika Kuoppala
Hangcheck state accumulation has gained more steps
over the years, like head movement and, more recently, the
subunit inactivity check. As the subunit sampling is only
done if the previous state check showed inactivity, we
have added more stages (and time) to reach a hang verdict.

Asymmetric engine states led to different actual weights for
'one hangcheck unit', and some hangs demonstrated that,
due to the difference in stages, simpler engines were
falsely accused of hanging as their scores accumulated
much more quickly past the hang threshold.
      
To completely decouple the hangcheck guilty score
from the hangcheck period, convert the hangcheck score into a
rough measurement of the period of inactivity. As these are
tracked as jiffies, they remain meaningful across
reset boundaries. This makes finding a guilty engine
more accurate across multi-engine activity scenarios,
and especially across asymmetric engines.

We lose the ability to detect cross-batch malicious attempts
to hinder progress. The plan is to move that functionality
into context banning, which is a more natural fit,
later in the series.
      
      v2: use time_before macros (Chris)
          reinstate the pardoning of moving engine after hc (Chris)
      v3: avoid global state for per engine stall detection (Chris)
      v4: take timeline last retirement into account (Chris)
      v5: do debug print on pardoning, split out retirement timestamp (Chris)
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
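The heart of the change, sketched in kernel style with assumed names:
guilt is decided by elapsed inactivity time in jiffies, not by how many
hangcheck iterations an engine's state machine happened to need.

    #include <linux/jiffies.h>
    #include <linux/types.h>

    struct engine_hangcheck {
            unsigned long action_timestamp; /* jiffies of last progress */
    };

    /* Hung iff the engine has shown no progress for longer than the
     * timeout, regardless of how many sampling stages were involved. */
    static bool engine_stuck(const struct engine_hangcheck *hc,
                             unsigned long timeout)
    {
            return time_after(jiffies, hc->action_timestamp + timeout);
    }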
• drm/i915: Split up hangcheck phases · 6e16d028
  Authored by Mika Kuoppala
In order to simplify hangcheck state keeping, split the per-engine
hangcheck loop into three phases: state load, action, state save.

Add a few more hangcheck actions to distinguish between seqno, head
and subunit movements. This helps to gather all the hangcheck
actions under a single switch umbrella.
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
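The resulting shape of the loop, sketched with invented helpers
(hc_load/hc_judge/hc_save stand in for whatever the driver actually
calls its phases):

    #include <stdbool.h>

    enum hc_action { HC_SEQNO_MOVED, HC_HEAD_MOVED,
                     HC_SUBUNITS_MOVED, HC_DEAD };

    struct hc_state { unsigned long timestamp; bool stalled; };
    struct engine;

    void hc_load(struct engine *e, struct hc_state *hc);
    enum hc_action hc_judge(struct engine *e, struct hc_state *hc);
    void hc_save(struct engine *e, const struct hc_state *hc);

    static void hangcheck_one_engine(struct engine *e, unsigned long now)
    {
            struct hc_state hc;

            hc_load(e, &hc);               /* phase 1: load state  */

            switch (hc_judge(e, &hc)) {    /* phase 2: take action */
            case HC_SEQNO_MOVED:           /* request completed     */
            case HC_HEAD_MOVED:            /* request is executing  */
            case HC_SUBUNITS_MOVED:        /* engine still grinding */
                    hc.timestamp = now;
                    hc.stalled = false;
                    break;
            case HC_DEAD:
                    hc.stalled = true;
                    break;
            }

            hc_save(e, &hc);               /* phase 3: save state  */
    }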
5. 18 Nov 2016, 1 commit
6. 17 Nov 2016, 1 commit
7. 11 Nov 2016, 2 commits
8. 08 Nov 2016, 1 commit
9. 01 Nov 2016, 1 commit
• drm/i915: Avoid accessing request->timeline outside of its lifetime · cb399eab
  Authored by Chris Wilson
      Whilst waiting on a request, we may do so without holding any locks or
      any guards beyond a reference to the request. In order to avoid taking
      locks within request deallocation, we drop references to its timeline
(via the context and ppgtt) upon retirement. We should avoid chasing
such pointers outside of their control; in particular, we inspect
request->timeline to see if we may restore the RPS waitboost for a
client. If we instead look at engine->timeline, we get similar
behaviour on both full-ppgtt and !full-ppgtt systems and reduce the
reward we give to stalling clients (i.e. only if the client stalls and
the GPU is uncontended does it reclaim its boost). This restores the
pre-timelines behaviour, whilst fixing:
      
      [  645.078485] BUG: KASAN: use-after-free in i915_gem_object_wait_fence+0x1ee/0x2e0 at addr ffff8802335643a0
      [  645.078577] Read of size 4 by task gem_exec_schedu/28408
      [  645.078638] CPU: 1 PID: 28408 Comm: gem_exec_schedu Not tainted 4.9.0-rc2+ #64
      [  645.078724] Hardware name:                  /        , BIOS PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015
      [  645.078816]  ffff88022daef9a0 ffffffff8143d059 ffff880235402a80 ffff880233564200
      [  645.078998]  ffff88022daef9c8 ffffffff81229c5c ffff88022daefa48 ffff880233564200
      [  645.079172]  ffff880235402a80 ffff88022daefa38 ffffffff81229ef0 000000008110a796
      [  645.079345] Call Trace:
      [  645.079404]  [<ffffffff8143d059>] dump_stack+0x68/0x9f
      [  645.079467]  [<ffffffff81229c5c>] kasan_object_err+0x1c/0x70
      [  645.079534]  [<ffffffff81229ef0>] kasan_report_error+0x1f0/0x4b0
      [  645.079601]  [<ffffffff8122a244>] kasan_report+0x34/0x40
      [  645.079676]  [<ffffffff81634f5e>] ? i915_gem_object_wait_fence+0x1ee/0x2e0
      [  645.079741]  [<ffffffff81229951>] __asan_load4+0x61/0x80
      [  645.079807]  [<ffffffff81634f5e>] i915_gem_object_wait_fence+0x1ee/0x2e0
      [  645.079876]  [<ffffffff816364bf>] i915_gem_object_wait+0x19f/0x590
      [  645.079944]  [<ffffffff81636320>] ? i915_gem_object_wait_priority+0x500/0x500
      [  645.080016]  [<ffffffff8110fb30>] ? debug_show_all_locks+0x1e0/0x1e0
      [  645.080084]  [<ffffffff8110abdc>] ? check_chain_key+0x14c/0x210
      [  645.080157]  [<ffffffff8110a796>] ? __lock_is_held+0x46/0xc0
      [  645.080226]  [<ffffffff8163bc61>] ? i915_gem_set_domain_ioctl+0x141/0x690
      [  645.080296]  [<ffffffff8163bcc2>] i915_gem_set_domain_ioctl+0x1a2/0x690
      [  645.080366]  [<ffffffff811f8f85>] ? __might_fault+0x75/0xe0
      [  645.080433]  [<ffffffff815a55f7>] drm_ioctl+0x327/0x640
      [  645.080508]  [<ffffffff8163bb20>] ? i915_gem_obj_prepare_shmem_write+0x3a0/0x3a0
      [  645.080603]  [<ffffffff815a52d0>] ? drm_ioctl_permit+0x120/0x120
      [  645.080670]  [<ffffffff8110abdc>] ? check_chain_key+0x14c/0x210
      [  645.080738]  [<ffffffff81275717>] do_vfs_ioctl+0x127/0xa20
      [  645.080804]  [<ffffffff8120268c>] ? do_mmap+0x47c/0x580
      [  645.080871]  [<ffffffff811da567>] ? vm_mmap_pgoff+0x117/0x140
      [  645.080938]  [<ffffffff812755f0>] ? ioctl_preallocate+0x150/0x150
      [  645.081011]  [<ffffffff81108c53>] ? up_write+0x23/0x50
      [  645.081078]  [<ffffffff811da567>] ? vm_mmap_pgoff+0x117/0x140
      [  645.081145]  [<ffffffff811da450>] ? vma_is_stack_for_current+0x90/0x90
      [  645.081214]  [<ffffffff8110d853>] ? mark_held_locks+0x23/0xc0
      [  645.082030]  [<ffffffff81288408>] ? __fget+0x168/0x250
      [  645.082106]  [<ffffffff819ad517>] ? entry_SYSCALL_64_fastpath+0x5/0xb1
      [  645.082176]  [<ffffffff81288592>] ? __fget_light+0xa2/0xc0
      [  645.082242]  [<ffffffff8127604c>] SyS_ioctl+0x3c/0x70
      [  645.082309]  [<ffffffff819ad52e>] entry_SYSCALL_64_fastpath+0x1c/0xb1
      [  645.082374] Object at ffff880233564200, in cache kmalloc-8192 size: 8192
      [  645.082431] Allocated:
      [  645.082480] PID = 28408
      [  645.082535]  [  645.082566] [<ffffffff8103ae66>] save_stack_trace+0x16/0x20
      [  645.082623]  [  645.082656] [<ffffffff81228b06>] save_stack+0x46/0xd0
      [  645.082716]  [  645.082756] [<ffffffff812292fd>] kasan_kmalloc+0xad/0xe0
      [  645.082817]  [  645.082848] [<ffffffff81631752>] i915_ppgtt_create+0x52/0x220
      [  645.082908]  [  645.082941] [<ffffffff8161db96>] i915_gem_create_context+0x396/0x560
      [  645.083027]  [  645.083059] [<ffffffff8161f857>] i915_gem_context_create_ioctl+0x97/0xf0
      [  645.083152]  [  645.083183] [<ffffffff815a55f7>] drm_ioctl+0x327/0x640
      [  645.083243]  [  645.083274] [<ffffffff81275717>] do_vfs_ioctl+0x127/0xa20
      [  645.083334]  [  645.083372] [<ffffffff8127604c>] SyS_ioctl+0x3c/0x70
      [  645.083432]  [  645.083464] [<ffffffff819ad52e>] entry_SYSCALL_64_fastpath+0x1c/0xb1
      [  645.083551] Freed:
      [  645.083599] PID = 27629
      [  645.083648]  [  645.083676] [<ffffffff8103ae66>] save_stack_trace+0x16/0x20
      [  645.083738]  [  645.083770] [<ffffffff81228b06>] save_stack+0x46/0xd0
      [  645.083830]  [  645.083862] [<ffffffff81229203>] kasan_slab_free+0x73/0xc0
      [  645.083922]  [  645.083961] [<ffffffff812279c9>] kfree+0xa9/0x170
      [  645.084021]  [  645.084053] [<ffffffff81629f60>] i915_ppgtt_release+0x100/0x180
      [  645.084139]  [  645.084171] [<ffffffff8161d414>] i915_gem_context_free+0x1b4/0x230
      [  645.084257]  [  645.084288] [<ffffffff816537b2>] intel_lr_context_unpin+0x192/0x230
      [  645.084380]  [  645.084413] [<ffffffff81645250>] i915_gem_request_retire+0x620/0x630
      [  645.084500]  [  645.085226] [<ffffffff816473d1>] i915_gem_retire_requests+0x181/0x280
      [  645.085313]  [  645.085352] [<ffffffff816352ba>] i915_gem_retire_work_handler+0xca/0xe0
      [  645.085440]  [  645.085471] [<ffffffff810c725b>] process_one_work+0x4fb/0x920
      [  645.085532]  [  645.085562] [<ffffffff810c770d>] worker_thread+0x8d/0x840
      [  645.085622]  [  645.085653] [<ffffffff810d21e5>] kthread+0x185/0x1b0
      [  645.085718]  [  645.085750] [<ffffffff819ad7a7>] ret_from_fork+0x27/0x40
      [  645.085811] Memory state around the buggy address:
      [  645.085869]  ffff880233564280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [  645.085956]  ffff880233564300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [  645.086053] >ffff880233564380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [  645.086138]                                ^
      [  645.086193]  ffff880233564400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [  645.086283]  ffff880233564480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      
v2: Add a comment to document the hint-like nature of
 intel_engine_last_submit()
      
      Fixes: 73cb9701 ("drm/i915: Combine seqno + tracking into a global timeline struct")
      Fixes: 80b204bc ("drm/i915: Enable multiple timelines")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20161101100317.11129-1-chris@chris-wilson.co.uk
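The v2 comment's point, sketched: the engine's timeline outlives any
single request, so its last-submitted seqno can be consulted safely,
but only as a hint. The struct layout, field names and the READ_ONCE
detail below are assumptions, not the driver's definitions:

    #include <linux/compiler.h>
    #include <linux/types.h>

    struct timeline { u32 last_submitted_seqno; };
    struct engine { struct timeline *timeline; };

    /* Peek at the tail of the engine's submission queue.  Only a
     * hint: not serialised against new submissions, but unlike the
     * request's own timeline it cannot be freed under us. */
    static u32 engine_last_submit(const struct engine *engine)
    {
            return READ_ONCE(engine->timeline->last_submitted_seqno);
    }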
10. 29 Oct 2016, 6 commits
• drm/i915: Convert breadcrumbs spinlock to be irqsafe · f6168e33
  Authored by Chris Wilson
The breadcrumbs are about to be used from within IRQ context sections
(e.g. nouveau signalling a fence from an interrupt handler, causing us
to submit a new request) and/or from bottom-half tasklets (i.e.
intel_lrc_irq_handler); we therefore need to employ the irqsafe
spinlock variants.
      
      For example, deferring the request submission to the
      intel_lrc_irq_handler generates this trace:
      
      [   66.388639] =================================
      [   66.388650] [ INFO: inconsistent lock state ]
      [   66.388663] 4.9.0-rc2+ #56 Not tainted
      [   66.388672] ---------------------------------
      [   66.388682] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
      [   66.388695] swapper/1/0 [HC0[0]:SC1[1]:HE0:SE0] takes:
      [   66.388706]  (&(&b->lock)->rlock){+.?...} , at: [<ffffffff81401c88>] intel_engine_enable_signaling+0x78/0x150
      [   66.388761] {SOFTIRQ-ON-W} state was registered at:
      [   66.388772]   [   66.388783] [<ffffffff810bd842>] __lock_acquire+0x682/0x1870
      [   66.388795]   [   66.388803] [<ffffffff810bedbc>] lock_acquire+0x6c/0xb0
      [   66.388814]   [   66.388824] [<ffffffff8161753a>] _raw_spin_lock+0x2a/0x40
      [   66.388835]   [   66.388845] [<ffffffff81401e41>] intel_engine_reset_breadcrumbs+0x21/0xb0
      [   66.388857]   [   66.388866] [<ffffffff81403ae7>] gen8_init_common_ring+0x67/0x100
      [   66.388878]   [   66.388887] [<ffffffff81403b92>] gen8_init_render_ring+0x12/0x60
      [   66.388903]   [   66.388912] [<ffffffff813f8707>] i915_gem_init_hw+0xf7/0x2a0
      [   66.388927]   [   66.388936] [<ffffffff813f899b>] i915_gem_init+0xbb/0xf0
      [   66.388950]   [   66.388959] [<ffffffff813b4980>] i915_driver_load+0x7e0/0x1330
      [   66.388978]   [   66.388988] [<ffffffff813c09d8>] i915_pci_probe+0x28/0x40
      [   66.389003]   [   66.389013] [<ffffffff812fa0db>] pci_device_probe+0x8b/0xf0
      [   66.389028]   [   66.389037] [<ffffffff8147737e>] driver_probe_device+0x21e/0x430
      [   66.389056]   [   66.389065] [<ffffffff8147766e>] __driver_attach+0xde/0xe0
      [   66.389080]   [   66.389090] [<ffffffff814751ad>] bus_for_each_dev+0x5d/0x90
      [   66.389105]   [   66.389113] [<ffffffff81477799>] driver_attach+0x19/0x20
      [   66.389134]   [   66.389144] [<ffffffff81475ced>] bus_add_driver+0x15d/0x260
      [   66.389159]   [   66.389168] [<ffffffff81477e3b>] driver_register+0x5b/0xd0
      [   66.389183]   [   66.389281] [<ffffffff812fa19b>] __pci_register_driver+0x5b/0x60
      [   66.389301]   [   66.389312] [<ffffffff81aed333>] i915_init+0x3e/0x45
      [   66.389326]   [   66.389336] [<ffffffff81ac2ffa>] do_one_initcall+0x8b/0x118
      [   66.389350]   [   66.389359] [<ffffffff81ac323a>] kernel_init_freeable+0x1b3/0x23b
      [   66.389378]   [   66.389387] [<ffffffff8160fc39>] kernel_init+0x9/0x100
      [   66.389402]   [   66.389411] [<ffffffff816180e7>] ret_from_fork+0x27/0x40
      [   66.389426] irq event stamp: 315865
      [   66.389438] hardirqs last  enabled at (315864): [<ffffffff816178f1>] _raw_spin_unlock_irqrestore+0x31/0x50
      [   66.389469] hardirqs last disabled at (315865): [<ffffffff816176b3>] _raw_spin_lock_irqsave+0x13/0x50
      [   66.389499] softirqs last  enabled at (315818): [<ffffffff8107a04c>] _local_bh_enable+0x1c/0x50
      [   66.389530] softirqs last disabled at (315819): [<ffffffff8107a50e>] irq_exit+0xbe/0xd0
      [   66.389559]
      [   66.389559] other info that might help us debug this:
      [   66.389580]  Possible unsafe locking scenario:
      [   66.389580]
      [   66.389598]        CPU0
      [   66.389609]        ----
      [   66.389620]   lock(&(&b->lock)->rlock);
      [   66.389650]   <Interrupt>
      [   66.389661]     lock(&(&b->lock)->rlock);
      [   66.389690]
      [   66.389690]  *** DEADLOCK ***
      [   66.389690]
      [   66.389715] 2 locks held by swapper/1/0:
      [   66.389728]  #0: (&(&tl->lock)->rlock){..-...}, at: [<ffffffff81403e01>] intel_lrc_irq_handler+0x201/0x3c0
      [   66.389785]  #1: (&(&req->lock)->rlock/1){..-...}, at: [<ffffffff813fc0af>] __i915_gem_request_submit+0x8f/0x170
      [   66.389854]
      [   66.389854] stack backtrace:
      [   66.389959] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.9.0-rc2+ #56
      [   66.389976] Hardware name:                  /        , BIOS PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015
      [   66.389999]  ffff88027fd03c58 ffffffff812beae5 ffff88027696e680 ffffffff822afe20
      [   66.390036]  ffff88027fd03ca8 ffffffff810bb420 0000000000000001 0000000000000000
      [   66.390070]  0000000000000000 0000000000000006 0000000000000004 ffff88027696ee10
      [   66.390104] Call Trace:
      [   66.390117]  <IRQ>
      [   66.390128]  [<ffffffff812beae5>] dump_stack+0x68/0x93
      [   66.390147]  [<ffffffff810bb420>] print_usage_bug+0x1d0/0x1e0
      [   66.390164]  [<ffffffff810bb8a0>] mark_lock+0x470/0x4f0
      [   66.390181]  [<ffffffff810ba9d0>] ? print_shortest_lock_dependencies+0x1b0/0x1b0
      [   66.390203]  [<ffffffff810bd75d>] __lock_acquire+0x59d/0x1870
      [   66.390221]  [<ffffffff810bedbc>] lock_acquire+0x6c/0xb0
      [   66.390237]  [<ffffffff810bedbc>] ? lock_acquire+0x6c/0xb0
      [   66.390255]  [<ffffffff81401c88>] ? intel_engine_enable_signaling+0x78/0x150
      [   66.390273]  [<ffffffff8161753a>] _raw_spin_lock+0x2a/0x40
      [   66.390291]  [<ffffffff81401c88>] ? intel_engine_enable_signaling+0x78/0x150
      [   66.390309]  [<ffffffff81401c88>] intel_engine_enable_signaling+0x78/0x150
      [   66.390327]  [<ffffffff813fc170>] __i915_gem_request_submit+0x150/0x170
      [   66.390345]  [<ffffffff81403e8b>] intel_lrc_irq_handler+0x28b/0x3c0
      [   66.390363]  [<ffffffff81079d97>] tasklet_action+0x57/0xc0
      [   66.390380]  [<ffffffff8107a249>] __do_softirq+0x119/0x240
      [   66.390396]  [<ffffffff8107a50e>] irq_exit+0xbe/0xd0
      [   66.390414]  [<ffffffff8101afd5>] do_IRQ+0x65/0x110
      [   66.390431]  [<ffffffff81618806>] common_interrupt+0x86/0x86
      [   66.390446]  <EOI>
      [   66.390457]  [<ffffffff814ec6d1>] ? cpuidle_enter_state+0x151/0x200
      [   66.390480]  [<ffffffff814ec7a2>] cpuidle_enter+0x12/0x20
      [   66.390498]  [<ffffffff810b639e>] call_cpuidle+0x1e/0x40
      [   66.390516]  [<ffffffff810b65ae>] cpu_startup_entry+0x10e/0x1f0
      [   66.390534]  [<ffffffff81036133>] start_secondary+0x103/0x130
      
(This is split out of the defer-global-seqno-allocation patch due to
the realisation that we need a more complete conversion if we want to
defer request submission even further.)

v2: lockdep was warning about mixed SOFTIRQ contexts, not HARDIRQ
contexts, so we would only need spin_lock_bh and not disable interrupts.

v3: We need full irq protection after all, as we may be called from a
third-party interrupt handler (via fences).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-32-chris@chris-wilson.co.uk
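The mechanical shape of the conversion, on a made-up function (b->lock
here is the breadcrumbs spinlock the lockdep trace complains about):

    #include <linux/spinlock.h>

    struct intel_breadcrumbs { spinlock_t lock; /* ... */ };

    static void enable_signaling(struct intel_breadcrumbs *b)
    {
            unsigned long flags;

            /* Plain spin_lock() assumed process/softirq context only;
             * the irqsave variant is required now that a foreign fence
             * may call back into us from its hard-IRQ handler. */
            spin_lock_irqsave(&b->lock, flags);
            /* ... insert waiter, arm the user interrupt ... */
            spin_unlock_irqrestore(&b->lock, flags);
    }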
• drm/i915: Move the global sync optimisation to the timeline · 85e17f59
  Authored by Chris Wilson
      Currently we try to reduce the number of synchronisations (now the
      number of requests we need to wait upon) by noting that if we have
      earlier waited upon a request, all subsequent requests in the timeline
      will be after the wait. This only applies to requests in this timeline,
      as other timelines will not be ordered by that waiter.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-30-chris@chris-wilson.co.uk
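In outline, with assumed names and engine count: each timeline
remembers the last seqno it has already waited upon from every other
engine's timeline, so later waits on older requests can be elided.

    #include <linux/types.h>

    #define NUM_ENGINES 5  /* rcs, bcs, vcs, vcs2, vecs (assumed) */

    /* Wrap-safe seqno comparison: true if a is at or after b. */
    static bool seqno_passed(u32 a, u32 b)
    {
            return (s32)(a - b) >= 0;
    }

    struct timeline_sync {
            u32 sync_seqno[NUM_ENGINES]; /* last waited-upon, per source */
    };

    static bool wait_is_redundant(const struct timeline_sync *tl,
                                  int src, u32 seqno)
    {
            /* An earlier wait on this timeline already covered it. */
            return seqno_passed(tl->sync_seqno[src], seqno);
    }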
• drm/i915: Introduce a global_seqno for each request · 65e4760e
  Authored by Chris Wilson
Though we will have multiple timelines, we still have a single timeline
of execution. This we can use to provide an execution and retirement order
for requests. It keeps the tracking of request execution simple, and is
vital for preserving a single waiter (i.e. so that we can order the waiters
such that only the earliest to wake up need be woken). To accomplish this,
we distinguish the seqno used to order requests per-context (external) from
that used internally for execution.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-26-chris@chris-wilson.co.uk
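The distinction, as a tiny sketch with assumed field names: the
per-context seqno orders a request within its own timeline from
creation, while the global seqno exists only once the request is
committed to the single hardware execution timeline.

    #include <linux/types.h>

    struct request_seqnos {
            u32 fence_seqno;   /* per-context order, set at creation */
            u32 global_seqno;  /* execution order; 0 until submitted */
    };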
• drm/i915: Combine seqno + tracking into a global timeline struct · 73cb9701
  Authored by Chris Wilson
Our timelines are more than just a seqno. They also provide an ordered
list of requests to be executed. Due to the restriction of handling
individual address spaces, we are limited to one timeline per address
space, but we use a fence context per engine within it.
      
      Our first step to introducing independent timelines per context (i.e. to
      allow each context to have a queue of requests to execute that have a
      defined set of dependencies on other requests) is to provide a timeline
      abstraction for the global execution queue.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-23-chris@chris-wilson.co.uk
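Roughly the abstraction being introduced, with assumed field names: a
per-engine seqno plus an ordered request list, grouped into one global
timeline object per address space.

    #include <linux/list.h>
    #include <linux/types.h>

    #define NUM_ENGINES 5  /* assumed engine count */

    struct intel_timeline {
            u64 fence_context;         /* one fence context per engine */
            u32 seqno;                 /* last seqno emitted */
            struct list_head requests; /* ordered execution queue */
    };

    struct i915_gem_timeline {
            struct intel_timeline engine[NUM_ENGINES];
    };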
• drm/i915: Move GEM activity tracking into a common struct reservation_object · d07f0e59
  Authored by Chris Wilson
In preparation for supporting many distinct timelines, we need to expand the
      activity tracking on the GEM object to handle more than just a request
      per engine. We already use the struct reservation_object on the dma-buf
      to handle many fence contexts, so integrating that into the GEM object
      itself is the preferred solution. (For example, we can now share the same
      reservation_object between every consumer/producer using this buffer and
      skip the manual import/export via dma-buf.)
      
      v2: Reimplement busy-ioctl (by walking the reservation object), postpone
      the ABI change for another day. Similarly use the reservation object to
      find the last_write request (if active and from i915) for choosing
      display CS flips.
      
      Caveats:
      
 * busy-ioctl: busy-ioctl only reports on the native fences; it will not
warn of stalls (in set-domain-ioctl, pread/pwrite, etc.) if the object is
being rendered to by external fences. It also will not report the same
busy state as wait-ioctl (or polling on the dma-buf) in the same
circumstances. On the plus side, it does retain reporting of which
*i915* engines are engaged with this object.
      
       * non-blocking atomic modesets take a step backwards as the wait for
      render completion blocks the ioctl. This is fixed in a subsequent
      patch to use a fence instead for awaiting on the rendering, see
      "drm/i915: Restore nonblocking awaits for modesetting"
      
       * dynamic array manipulation for shared-fences in reservation is slower
      than the previous lockless static assignment (e.g. gem_exec_lut_handle
      runtime on ivb goes from 42s to 66s), mainly due to atomic operations
      (maintaining the fence refcounts).
      
       * loss of object-level retirement callbacks, emulated by VMA retirement
      tracking.
      
 * minor loss of object-level last-activity information from debugfs; this
could be replaced with per-vma information if desired.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-21-chris@chris-wilson.co.uk
• drm/i915: Refactor object page API · a4f5ea64
  Authored by Chris Wilson
      The plan is to make obtaining the backing storage for the object avoid
      struct_mutex (i.e. use its own locking). The first step is to update the
      API so that normal users only call pin/unpin whilst working on the
      backing storage.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-12-chris@chris-wilson.co.uk
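The calling convention after the refactor, sketched: ordinary users
bracket their access with pin/unpin and never touch page acquisition
directly, which is what later allows the object to manage its backing
store under its own lock instead of struct_mutex. The pin/unpin entry
points match the API the commit describes; the caller is invented.

    struct drm_i915_gem_object;

    int i915_gem_object_pin_pages(struct drm_i915_gem_object *obj);
    void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj);

    static int read_object(struct drm_i915_gem_object *obj)
    {
            int err;

            err = i915_gem_object_pin_pages(obj); /* acquire and hold */
            if (err)
                    return err;

            /* ... operate on the now-stable backing pages ... */

            i915_gem_object_unpin_pages(obj);     /* drop the pin */
            return 0;
    }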
11. 25 Oct 2016, 2 commits
12. 20 Oct 2016, 1 commit
13. 14 Oct 2016, 5 commits
14. 13 Oct 2016, 2 commits
15. 12 Oct 2016, 6 commits
16. 05 Oct 2016, 2 commits
17. 21 Sep 2016, 2 commits