1. 05 Apr, 2016 6 commits
  2. 04 Apr, 2016 2 commits
    • drm/i915: Move execlists irq handler to a bottom half · 27af5eea
      Committed by Tvrtko Ursulin
      Doing a lot of work in the interrupt handler introduces huge
      latencies to the system as a whole.
      
      The most dramatic effect can be seen by running an all engine
      stress test like igt/gem_exec_nop/all where, when the kernel
      config is lean enough, the whole system can be brought into
      multi-second periods of complete non-interactivity. That can
      look for example like this:
      
       NMI watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [kworker/u8:3:143]
       Modules linked in: [redacted for brevity]
       CPU: 0 PID: 143 Comm: kworker/u8:3 Tainted: G     U       L  4.5.0-160321+ #183
       Hardware name: Intel Corporation Broadwell Client platform/WhiteTip Mountain 1
       Workqueue: i915 gen6_pm_rps_work [i915]
       task: ffff8800aae88000 ti: ffff8800aae90000 task.ti: ffff8800aae90000
       RIP: 0010:[<ffffffff8104a3c2>]  [<ffffffff8104a3c2>] __do_softirq+0x72/0x1d0
       RSP: 0000:ffff88014f403f38  EFLAGS: 00000206
       RAX: ffff8800aae94000 RBX: 0000000000000000 RCX: 00000000000006e0
       RDX: 0000000000000020 RSI: 0000000004208060 RDI: 0000000000215d80
       RBP: ffff88014f403f80 R08: 0000000b1b42c180 R09: 0000000000000022
       R10: 0000000000000004 R11: 00000000ffffffff R12: 000000000000a030
       R13: 0000000000000082 R14: ffff8800aa4d0080 R15: 0000000000000082
       FS:  0000000000000000(0000) GS:ffff88014f400000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 00007fa53b90c000 CR3: 0000000001a0a000 CR4: 00000000001406f0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
       Stack:
        042080601b33869f ffff8800aae94000 00000000fffc2678 ffff88010000000a
        0000000000000000 000000000000a030 0000000000005302 ffff8800aa4d0080
        0000000000000206 ffff88014f403f90 ffffffff8104a716 ffff88014f403fa8
       Call Trace:
        <IRQ>
        [<ffffffff8104a716>] irq_exit+0x86/0x90
        [<ffffffff81031e7d>] smp_apic_timer_interrupt+0x3d/0x50
        [<ffffffff814f3eac>] apic_timer_interrupt+0x7c/0x90
        <EOI>
        [<ffffffffa01c5b40>] ? gen8_write64+0x1a0/0x1a0 [i915]
        [<ffffffff814f2b39>] ? _raw_spin_unlock_irqrestore+0x9/0x20
        [<ffffffffa01c5c44>] gen8_write32+0x104/0x1a0 [i915]
        [<ffffffff8132c6a2>] ? n_tty_receive_buf_common+0x372/0xae0
        [<ffffffffa017cc9e>] gen6_set_rps_thresholds+0x1be/0x330 [i915]
        [<ffffffffa017eaf0>] gen6_set_rps+0x70/0x200 [i915]
        [<ffffffffa0185375>] intel_set_rps+0x25/0x30 [i915]
        [<ffffffffa01768fd>] gen6_pm_rps_work+0x10d/0x2e0 [i915]
        [<ffffffff81063852>] ? finish_task_switch+0x72/0x1c0
        [<ffffffff8105ab29>] process_one_work+0x139/0x350
        [<ffffffff8105b186>] worker_thread+0x126/0x490
        [<ffffffff8105b060>] ? rescuer_thread+0x320/0x320
        [<ffffffff8105fa64>] kthread+0xc4/0xe0
        [<ffffffff8105f9a0>] ? kthread_create_on_node+0x170/0x170
        [<ffffffff814f351f>] ret_from_fork+0x3f/0x70
        [<ffffffff8105f9a0>] ? kthread_create_on_node+0x170/0x170
      
      I could not explain, or find a code path which would explain,
      a +20 second lockup, but from some instrumentation it was
      apparent that the proportion of time spent with interrupts off
      was between 10-25% under heavy load, which is quite bad.
      
      When an interrupt "cliff" is reached, which was >~320k irq/s on
      my machine, the whole system goes into a terrible state of the
      above described multi-second lockups.
      
      By moving the GT interrupt handling to a tasklet in the
      simplest way, the problem above disappears completely.
      
      Testing the effect on system-wide latencies using
      igt/gem_syslatency shows the following before this patch:
      
      gem_syslatency: cycles=1532739, latency mean=416531.829us max=2499237us
      gem_syslatency: cycles=1839434, latency mean=1458099.157us max=4998944us
      gem_syslatency: cycles=1432570, latency mean=2688.451us max=1201185us
      gem_syslatency: cycles=1533543, latency mean=416520.499us max=2498886us
      
      This shows that the unrelated process is experiencing huge
      delays in its wake-up latency. After the patch the results
      look like this:
      
      gem_syslatency: cycles=808907, latency mean=53.133us max=1640us
      gem_syslatency: cycles=862154, latency mean=62.778us max=2117us
      gem_syslatency: cycles=856039, latency mean=58.079us max=2123us
      gem_syslatency: cycles=841683, latency mean=56.914us max=1667us
      
      Showing a huge improvement in the unrelated process wake-up
      latency. It also shows an approximate halving in the number
      of total empty batches submitted during the test. This may
      not be worrying since the test puts the driver under
      a very unrealistic load with ncpu threads doing empty batch
      submission to all GPU engines each.
      
      Another benefit compared to the hard-irq handling is that now
      work on all engines can be dispatched in parallel since we can
      have up to number of CPUs active tasklets. (While previously
      a single hard-irq would serially dispatch on one engine after
      another.)
      
      A more interesting scenario with regard to throughput is
      "gem_latency -n 100", which shows 25% better throughput and
      CPU usage, and 14% better dispatch latencies.
      
      I did not find any gains or regressions with Synmark2 or
      GLbench under light testing. More benchmarking is certainly
      required.
      
      v2:
         * execlists_lock should be taken as spin_lock_bh when
           queuing work from userspace now. (Chris Wilson)
         * uncore.lock must be taken with spin_lock_irq when
           submitting requests since that now runs from either
           softirq or process context.
      
      v3:
         * Expanded commit message with more testing data;
         * converted missed locking sites to _bh;
         * added execlist_lock comment. (Chris Wilson)
      
      v4:
         * Mention dispatch parallelism in commit. (Chris Wilson)
         * Do not hold uncore.lock over MMIO reads since the block
           is already serialised per-engine via the tasklet itself.
           (Chris Wilson)
         * intel_lrc_irq_handler should be static. (Chris Wilson)
         * Cancel/sync the tasklet on GPU reset. (Chris Wilson)
         * Document and WARN that tasklet cannot be active/pending
           on engine cleanup. (Chris Wilson/Imre Deak)
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Imre Deak <imre.deak@intel.com>
      Testcase: igt/gem_exec_nop/all
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94350
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Link: http://patchwork.freedesktop.org/patch/msgid/1459768316-6670-1-git-send-email-tvrtko.ursulin@linux.intel.com
      27af5eea
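The deferral described in this commit can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct engine`, `top_half()`, `bottom_half()` and `run_bottom_halves()` are hypothetical names invented for illustration, not i915 or kernel APIs. The "top half" merely records the event and marks the bottom half as scheduled, while the expensive processing runs later, outside the simulated hard-irq path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_ENGINES 4

/* Hypothetical per-engine state; NOT the i915 data structures. */
struct engine {
    unsigned int pending_events; /* events latched by the simulated hard IRQ */
    unsigned int handled_events; /* events processed by the bottom half */
    bool bh_scheduled;           /* models a tasklet's "scheduled" flag */
};

/* Top half: do the bare minimum, then defer the real work. */
static void top_half(struct engine *e)
{
    assert(e != NULL);
    e->pending_events++;
    e->bh_scheduled = true; /* scheduling twice before a run coalesces */
}

/* Bottom half: the expensive processing, run outside the "hard IRQ". */
static void bottom_half(struct engine *e)
{
    assert(e != NULL);
    e->bh_scheduled = false;
    e->handled_events += e->pending_events;
    e->pending_events = 0;
}

/* Run each scheduled bottom half once; returns how many were run. */
static unsigned int run_bottom_halves(struct engine *engines, size_t n)
{
    unsigned int ran = 0;

    for (size_t i = 0; i < n; i++) {
        if (engines[i].bh_scheduled) {
            bottom_half(&engines[i]);
            ran++;
        }
    }
    return ran;
}
```

In this model two interrupts on the same engine before the bottom half runs result in a single bottom-half invocation that handles both events, which mirrors how scheduling an already scheduled tasklet does not queue it a second time.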
    • drm/i915/ddi: Silence compiler warning for unknown output type · 183aec16
      Committed by Chris Wilson
      Silences
      
      	src/drivers/gpu/drm/i915/intel_ddi.c: warning: 'port' may be used uninitialized in this function [-Wuninitialized]
      Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Link: http://patchwork.freedesktop.org/patch/msgid/1459717154-27607-1-git-send-email-chris@chris-wilson.co.uk
      183aec16
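The class of warning quoted in this commit typically arises when a variable is assigned only in some branches of a `switch`. The sketch below shows one common way to silence it, by making every path assign the variable. The enum names and `output_to_port()` are hypothetical and chosen for illustration; this is not the actual change made in intel_ddi.c.

```c
#include <assert.h>

/* Hypothetical types for illustration; not the i915 definitions. */
enum output_type { OUTPUT_DP, OUTPUT_HDMI, OUTPUT_UNKNOWN };
enum port { PORT_NONE = -1, PORT_A, PORT_B };

static enum port output_to_port(enum output_type type)
{
    enum port port;

    switch (type) {
    case OUTPUT_DP:
        port = PORT_A;
        break;
    case OUTPUT_HDMI:
        port = PORT_B;
        break;
    default:
        /*
         * Without this branch, an unknown type would leave 'port'
         * unassigned and the compiler may warn that it is used
         * uninitialized.
         */
        port = PORT_NONE;
        break;
    }

    /* Every path above has assigned 'port'. */
    assert(port == PORT_NONE || port >= PORT_A);
    return port;
}
```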
  3. 03 Apr, 2016 2 commits
  4. 02 Apr, 2016 8 commits
  5. 01 Apr, 2016 9 commits
  6. 31 Mar, 2016 7 commits
  7. 30 Mar, 2016 6 commits
    • drm/i915: Exit cherryview_irq_handler() after one pass · 579de73b
      Committed by Chris Wilson
      This effectively reverts
      
      commit 8e5fd599
      Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Date:   Wed Apr 9 13:28:50 2014 +0300
      
          drm/i915/chv: Make CHV irq handler loop until all interrupts are consumed
      
      as under continuous execlists load we can saturate the IRQ handler,
      destabilising the tsc clock and triggering the NMI watchdog to declare a hung
      CPU.
      
      [  552.756051] clocksource: timekeeping watchdog on CPU0: Marking clocksource 'tsc' as unstable because the skew is too large:
      [  552.756080] clocksource:                       'refined-jiffies' wd_now: 10003b480 wd_last: 10003b28c mask: ffffffff
      [  552.756091] clocksource:                       'tsc' cs_now: d55d31aa50 cs_last: d17446166c mask: ffffffffffffffff
      [  552.756210] clocksource: Switched to clocksource refined-jiffies
      [  575.217870] NMI watchdog: Watchdog detected hard LOCKUP on cpu 1
      [  575.217893] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.5.0-rc7+ #18
      [  575.217905] Hardware name:                  /NUC5CPYB, BIOS PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015
      [  575.217915]  0000000000000000 ffff88027fd05bc0 ffffffff81288c6d 0000000000000000
      [  575.217935]  0000000000000001 ffff88027fd05be0 ffffffff810e72d1 0000000000000000
      [  575.217951]  ffff88027fd05c80 ffff88027fd05c20 ffffffff81114b60 0000000181015f1e
      [  575.217967] Call Trace:
      [  575.217973]  <NMI>  [<ffffffff81288c6d>] dump_stack+0x4f/0x72
      [  575.217994]  [<ffffffff810e72d1>] watchdog_overflow_callback+0x151/0x160
      [  575.218003]  [<ffffffff81114b60>] __perf_event_overflow+0xa0/0x1e0
      [  575.218016]  [<ffffffff811154c4>] perf_event_overflow+0x14/0x20
      [  575.218028]  [<ffffffff8101d2ca>] intel_pmu_handle_irq+0x1da/0x460
      [  575.218042]  [<ffffffff814a8aae>] ? poll_idle+0x3e/0x70
      [  575.218052]  [<ffffffff814a8aae>] ? poll_idle+0x3e/0x70
      [  575.218064]  [<ffffffff81014ae8>] perf_event_nmi_handler+0x28/0x50
      [  575.218075]  [<ffffffff81007540>] nmi_handle+0x60/0x130
      [  575.218086]  [<ffffffff814a8aae>] ? poll_idle+0x3e/0x70
      [  575.218096]  [<ffffffff810079c0>] do_nmi+0x140/0x470
      [  575.218108]  [<ffffffff81559ec7>] end_repeat_nmi+0x1a/0x1e
      [  575.218119]  [<ffffffff814a8aae>] ? poll_idle+0x3e/0x70
      [  575.218129]  [<ffffffff814a8aae>] ? poll_idle+0x3e/0x70
      [  575.218139]  [<ffffffff814a8aae>] ? poll_idle+0x3e/0x70
      [  575.218148]  <<EOE>>  [<ffffffff814a8353>] cpuidle_enter_state+0xf3/0x2f0
      [  575.218164]  [<ffffffff814a8587>] cpuidle_enter+0x17/0x20
      [  575.218175]  [<ffffffff810aaa3a>] call_cpuidle+0x2a/0x40
      [  575.218185]  [<ffffffff810aade3>] cpu_startup_entry+0x273/0x330
      [  575.218196]  [<ffffffff81033a1e>] start_secondary+0x10e/0x130
      
      However, not servicing all available IIR within the handler does hurt the
      throughput of pathological nop execbuf by about 20%, with a similar effect
      upon the dispatch latency of a series of execbuf.
      
      v2: use do {} while(0) for a smaller patch, and easier to revert again
      
      I have reasonable confidence that we do not miss GT interrupts (as
      execlists provides a stress case with a failure mechanism easily
      detected by igt), however I have less confidence about all the other
      sources of interrupts and worry that we may lose a display hotplug
      interrupt, for example.
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93467
      Testcase: igt/gem_exec_nop/basic # requires NMI watchdog
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Cc: Antti Koskipää <antti.koskipaa@linux.intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: stable@vger.kernel.org
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/1457946117-6714-1-git-send-email-chris@chris-wilson.co.uk
      579de73b
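The difference between looping until all interrupts are consumed and exiting after one pass can be sketched with a simulated interrupt register. Everything here is an assumption made for illustration: `struct fake_hw`, `read_and_ack_iir()` and both handlers are hypothetical and do not reproduce cherryview_irq_handler(). The `do {} while (0)` form matches the approach mentioned in the v2 note, keeping the loop body intact while running it only once.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hardware model: an IIR that re-asserts under load. */
struct fake_hw {
    unsigned int iir;     /* currently pending interrupt bits */
    unsigned int refills; /* how many more times new work will arrive */
};

/* Read and clear the pending bits; under load, new bits arrive at once. */
static unsigned int read_and_ack_iir(struct fake_hw *hw)
{
    unsigned int iir;

    assert(hw != NULL);
    iir = hw->iir;
    hw->iir = 0;
    if (hw->refills > 0) {
        hw->refills--;
        hw->iir = 1;
    }
    return iir;
}

/* Loop until nothing is pending: time in the handler is unbounded. */
static unsigned int handler_loop_until_idle(struct fake_hw *hw)
{
    unsigned int passes = 0;

    for (;;) {
        if (read_and_ack_iir(hw) == 0)
            break;
        passes++; /* service the interrupt sources here */
    }
    return passes;
}

/* One pass only: leftover work raises a fresh interrupt later. */
static unsigned int handler_single_pass(struct fake_hw *hw)
{
    unsigned int passes = 0;

    do {
        if (read_and_ack_iir(hw) == 0)
            break;
        passes++; /* service the interrupt sources here */
    } while (0);
    return passes;
}
```

With work arriving continuously, the looping variant stays in the handler for as long as the load lasts, whereas the single-pass variant returns after servicing one batch and leaves the remaining bits pending.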
    • drm/i915: Rename __force_wake_get to __force_wake_auto · b208ba8e
      Committed by Chris Wilson
      __force_wake_get() only acquires a temporary wakeref on forcewake that is
      automatically released when a timer expires. When reading the code
      again, I confused __intel_uncore_forcewake_get() for __force_wake_get()
      and to my shame thought I found a bug in unbalanced wake_count handling.
      
      I claim that if the function had been called __force_wake_auto() instead
      I would not have embarrassed myself.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/1458829907-26596-1-git-send-email-chris@chris-wilson.co.uk
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      b208ba8e
    • drm/i915: Rename GGTT init functions · d85489d3
      Committed by Joonas Lahtinen
      Rename and document the GGTT init functions to give a better
      idea of the context where they are called from.
      
      i915_gem_gtt_init => i915_ggtt_init_hw
      i915_gem_init_global_gtt => i915_gem_init_ggtt
      i915_global_gtt_cleanup => i915_ggtt_cleanup_hw
      
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/1458830866-12578-1-git-send-email-joonas.lahtinen@linux.intel.com
      d85489d3
    • drm/i915: BUG_ON when ggtt_view is NULL · ade7daa1
      Committed by Matthew Auld
      Let's BUG_ON and not bother with a WARN and returning an error, so we can
      remove the need to pollute the code with error handling; after all, it is
      a programmer error to provide a NULL view. Also, while we're here, remove
      the redundant NULL ggtt_view check.
      
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
      Signed-off-by: Matthew Auld <matthew.auld@intel.com>
      Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/1458834860-7898-1-git-send-email-matthew.auld@intel.com
      ade7daa1
    • drm/i915: fix deadlock on lid open · 9f54d4bd
      Committed by Bjørn Mork
      commit e2c8b870 moved modeset locking inside resume/suspend
      functions, but missed a code path only executed on lid close/open
      on older hardware. The result was a deadlock when closing and
      opening the lid without suspending on such hardware:
      
       =============================================
       [ INFO: possible recursive locking detected ]
       4.6.0-rc1 #385 Not tainted
       ---------------------------------------------
       kworker/0:3/88 is trying to acquire lock:
        (&dev->mode_config.mutex){+.+.+.}, at: [<ffffffffa063e6a4>] intel_display_resume+0x4a/0x12f [i915]
      
       but task is already holding lock:
        (&dev->mode_config.mutex){+.+.+.}, at: [<ffffffffa02d0d4f>] drm_modeset_lock_all+0x3e/0xa6 [drm]
      
       other info that might help us debug this:
        Possible unsafe locking scenario:
      
              CPU0
              ----
         lock(&dev->mode_config.mutex);
         lock(&dev->mode_config.mutex);
      
        *** DEADLOCK ***
      
        May be due to missing lock nesting notation
      
       7 locks held by kworker/0:3/88:
        #0:  ("kacpi_notify"){++++.+}, at: [<ffffffff81068dfc>] process_one_work+0x14a/0x50b
        #1:  ((&dpc->work)#2){+.+.+.}, at: [<ffffffff81068dfc>] process_one_work+0x14a/0x50b
        #2:  ((acpi_lid_notifier).rwsem){++++.+}, at: [<ffffffff8106f874>] __blocking_notifier_call_chain+0x34/0x65
        #3:  (&dev_priv->modeset_restore_lock){+.+.+.}, at: [<ffffffffa0664cf6>] intel_lid_notify+0x3c/0xd9 [i915]
        #4:  (&dev->mode_config.mutex){+.+.+.}, at: [<ffffffffa02d0d4f>] drm_modeset_lock_all+0x3e/0xa6 [drm]
        #5:  (crtc_ww_class_acquire){+.+.+.}, at: [<ffffffffa02d0d59>] drm_modeset_lock_all+0x48/0xa6 [drm]
        #6:  (crtc_ww_class_mutex){+.+.+.}, at: [<ffffffffa02d0b2a>] modeset_lock+0x13c/0x1cd [drm]
      
       stack backtrace:
       CPU: 0 PID: 88 Comm: kworker/0:3 Not tainted 4.6.0-rc1 #385
       Hardware name: LENOVO 2776LEG/2776LEG, BIOS 6EET55WW (3.15 ) 12/19/2011
       Workqueue: kacpi_notify acpi_os_execute_deferred
        0000000000000000 ffff88022fd5f990 ffffffff8124af06 ffffffff825b39c0
        ffffffff825b39c0 ffff88022fd5fa60 ffffffff8108f547 ffff88022fd5fa70
        000000008108e817 ffff880230236cc0 0000000000000000 ffffffff825b39c0
       Call Trace:
        [<ffffffff8124af06>] dump_stack+0x67/0x90
        [<ffffffff8108f547>] __lock_acquire+0xdb5/0xf71
        [<ffffffff8108bd2c>] ? look_up_lock_class+0xbe/0x10a
        [<ffffffff8108fae2>] lock_acquire+0x137/0x1cb
        [<ffffffff8108fae2>] ? lock_acquire+0x137/0x1cb
        [<ffffffffa063e6a4>] ? intel_display_resume+0x4a/0x12f [i915]
        [<ffffffff8148202f>] mutex_lock_nested+0x7e/0x3a4
        [<ffffffffa063e6a4>] ? intel_display_resume+0x4a/0x12f [i915]
        [<ffffffffa063e6a4>] ? intel_display_resume+0x4a/0x12f [i915]
        [<ffffffffa02d0b2a>] ? modeset_lock+0x13c/0x1cd [drm]
        [<ffffffffa063e6a4>] intel_display_resume+0x4a/0x12f [i915]
        [<ffffffffa063e6a4>] ? intel_display_resume+0x4a/0x12f [i915]
        [<ffffffffa02d0b2a>] ? modeset_lock+0x13c/0x1cd [drm]
        [<ffffffffa02d0b2a>] ? modeset_lock+0x13c/0x1cd [drm]
        [<ffffffffa02d0bf7>] ? drm_modeset_lock+0x17/0x24 [drm]
        [<ffffffffa02d0c8b>] ? drm_modeset_lock_all_ctx+0x87/0xa1 [drm]
        [<ffffffffa0664d6a>] intel_lid_notify+0xb0/0xd9 [i915]
        [<ffffffff8106f4c6>] notifier_call_chain+0x4a/0x6c
        [<ffffffff8106f88d>] __blocking_notifier_call_chain+0x4d/0x65
        [<ffffffff8106f8b9>] blocking_notifier_call_chain+0x14/0x16
        [<ffffffffa0011215>] acpi_lid_send_state+0x83/0xad [button]
        [<ffffffffa00112a6>] acpi_button_notify+0x41/0x132 [button]
        [<ffffffff812b07df>] acpi_device_notify+0x19/0x1b
        [<ffffffff812c8570>] acpi_ev_notify_dispatch+0x49/0x64
        [<ffffffff812ab9fb>] acpi_os_execute_deferred+0x14/0x20
        [<ffffffff81068f17>] process_one_work+0x265/0x50b
        [<ffffffff810696f5>] worker_thread+0x1fc/0x2dd
        [<ffffffff810694f9>] ? rescuer_thread+0x309/0x309
        [<ffffffff810694f9>] ? rescuer_thread+0x309/0x309
        [<ffffffff8106e2d6>] kthread+0xe0/0xe8
        [<ffffffff8107bc47>] ? local_clock+0x19/0x22
        [<ffffffff81484f42>] ret_from_fork+0x22/0x40
        [<ffffffff8106e1f6>] ? kthread_create_on_node+0x1b5/0x1b5
      
      Fixes: e2c8b870 ("drm/i915: Use atomic helpers for suspend, v2.")
      Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Signed-off-by: Bjørn Mork <bjorn@mork.no>
      Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/1459328913-13719-1-git-send-email-bjorn@mork.no
      9f54d4bd
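The recursive locking pattern reported by lockdep above can be reproduced in userspace with a POSIX error-checking mutex, which returns `EDEADLK` instead of hanging when the owning thread locks it again. This is a sketch under assumptions: the function names only echo the roles seen in the trace (a notifier that takes the lock and then calls a resume routine that takes it too), the code is not the i915 implementation, and the "fixed" variant shows one possible resolution rather than the actual patch.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Stand-in for dev->mode_config.mutex; hypothetical userspace model. */
static pthread_mutex_t mode_config_mutex;

/* Create the mutex as error-checking so relocking reports EDEADLK. */
static int init_lock(void)
{
    pthread_mutexattr_t attr;
    int ret = pthread_mutexattr_init(&attr);

    if (ret)
        return ret;
    ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (!ret)
        ret = pthread_mutex_init(&mode_config_mutex, &attr);
    pthread_mutexattr_destroy(&attr);
    return ret;
}

/* Callee that takes the lock itself, as resume does after the move. */
static int display_resume(void)
{
    int ret = pthread_mutex_lock(&mode_config_mutex);

    if (ret)
        return ret; /* EDEADLK when the caller already holds the lock */
    /* ... restore display state here ... */
    ret = pthread_mutex_unlock(&mode_config_mutex);
    assert(ret == 0);
    return 0;
}

/* Missed code path: still takes the lock before calling the callee. */
static int lid_notify_buggy(void)
{
    int ret = pthread_mutex_lock(&mode_config_mutex);

    if (ret)
        return ret;
    ret = display_resume(); /* second acquisition by the same thread */
    pthread_mutex_unlock(&mode_config_mutex);
    return ret;
}

/* One possible resolution: leave all locking to the callee. */
static int lid_notify_fixed(void)
{
    return display_resume();
}
```

With a default (non-error-checking) mutex the buggy path would simply block forever, which corresponds to the deadlock described in the commit message.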
    • drm/i915: Update DRIVER_DATE to 20160330 · 68d4aee9
      Committed by Daniel Vetter
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      68d4aee9