1. 19 Dec, 2017 1 commit
  2. 18 Dec, 2017 1 commit
  3. 16 Dec, 2017 1 commit
  4. 14 Dec, 2017 2 commits
    • M
      drm/i915/guc: Extract guc_init from guc_init_hw · 61b5c158
      Michał Winiarski committed
      After GPU reset, GuC HW needs to be reinitialized (with FW reload).
      Unfortunately, we're doing some extra work there (mostly allocating stuff),
      work that can be moved to guc_init and called once at driver load time.
      
      As a side effect we're no longer hitting an assert in
      i915_ggtt_enable_guc on suspend/resume.
      
      v2: Do not duplicate disable_communication / reset_guc_interrupts
      v3: Add proper teardown after rebase
      
      References: 04f7b24e ("drm/i915/guc: Assert that we switch between known ggtt->invalidate functions")
      Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
      Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171213221352.7173-3-michal.winiarski@intel.com
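The split described in this commit (one-time setup at driver load, hardware re-initialisation after every GPU reset) can be illustrated with a small toy model. This is only a sketch: the class, its method names and the buffer labels are hypothetical and do not correspond to the driver's real data structures.

```python
class ToyGuc:
    """Toy model: one-time setup is kept apart from per-reset hardware init."""

    def __init__(self):
        self.allocations = 0   # how many times long-lived objects were allocated
        self.fw_loads = 0      # how many times the firmware was (re)loaded
        self.buffers = None

    def init(self):
        # One-time work at driver load: allocate long-lived objects.
        # The labels below are made up for illustration.
        self.buffers = ["buffer_a", "buffer_b"]
        self.allocations += 1

    def init_hw(self):
        # Repeated after every GPU reset: only reprogram the hardware.
        if self.buffers is None:
            raise RuntimeError("init() must run before init_hw()")
        self.fw_loads += 1

    def fini(self):
        # Teardown mirrors init().
        self.buffers = None


guc = ToyGuc()
guc.init()      # driver load
guc.init_hw()   # initial hardware bring-up
guc.init_hw()   # after a GPU reset: firmware reload, no new allocations
```

In this model a reset only increments the firmware load counter, which is the property the commit message is after.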
    • M
      drm/i915/guc: Move GuC workqueue allocations outside of the mutex · 3176ff49
      Michał Winiarski committed
      This gets rid of the following lockdep splat:
      
      ======================================================
      WARNING: possible circular locking dependency detected
      4.15.0-rc2-CI-Patchwork_7428+ #1 Not tainted
      ------------------------------------------------------
      debugfs_test/1351 is trying to acquire lock:
       (&dev->struct_mutex){+.+.}, at: [<000000009d90d1a3>] i915_mutex_lock_interruptible+0x47/0x130 [i915]
      
      but task is already holding lock:
       (&mm->mmap_sem){++++}, at: [<000000005df01c1e>] __do_page_fault+0x106/0x560
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #6 (&mm->mmap_sem){++++}:
             __might_fault+0x63/0x90
             _copy_to_user+0x1e/0x70
             filldir+0x8c/0xf0
             dcache_readdir+0xeb/0x160
             iterate_dir+0xe6/0x150
             SyS_getdents+0xa0/0x130
             entry_SYSCALL_64_fastpath+0x1c/0x89
      
      -> #5 (&sb->s_type->i_mutex_key#5){++++}:
             lockref_get+0x9/0x20
      
      -> #4 ((completion)&req.done){+.+.}:
             wait_for_common+0x54/0x210
             devtmpfs_create_node+0x130/0x150
             device_add+0x5ad/0x5e0
             device_create_groups_vargs+0xd4/0xe0
             device_create+0x35/0x40
             msr_device_create+0x22/0x40
             cpuhp_invoke_callback+0xc5/0xbf0
             cpuhp_thread_fun+0x167/0x210
             smpboot_thread_fn+0x17f/0x270
             kthread+0x173/0x1b0
             ret_from_fork+0x24/0x30
      
      -> #3 (cpuhp_state-up){+.+.}:
             cpuhp_issue_call+0x132/0x1c0
             __cpuhp_setup_state_cpuslocked+0x12f/0x2a0
             __cpuhp_setup_state+0x3a/0x50
             page_writeback_init+0x3a/0x5c
             start_kernel+0x393/0x3e2
             secondary_startup_64+0xa5/0xb0
      
      -> #2 (cpuhp_state_mutex){+.+.}:
             __mutex_lock+0x81/0x9b0
             __cpuhp_setup_state_cpuslocked+0x4b/0x2a0
             __cpuhp_setup_state+0x3a/0x50
             page_alloc_init+0x1f/0x26
             start_kernel+0x139/0x3e2
             secondary_startup_64+0xa5/0xb0
      
      -> #1 (cpu_hotplug_lock.rw_sem){++++}:
             cpus_read_lock+0x34/0xa0
             apply_workqueue_attrs+0xd/0x40
             __alloc_workqueue_key+0x2c7/0x4e1
             intel_guc_submission_init+0x10c/0x650 [i915]
             intel_uc_init_hw+0x29e/0x460 [i915]
             i915_gem_init_hw+0xca/0x290 [i915]
             i915_gem_init+0x115/0x3a0 [i915]
             i915_driver_load+0x9a8/0x16c0 [i915]
             i915_pci_probe+0x2e/0x90 [i915]
             pci_device_probe+0x9c/0x120
             driver_probe_device+0x2a3/0x480
             __driver_attach+0xd9/0xe0
             bus_for_each_dev+0x57/0x90
             bus_add_driver+0x168/0x260
             driver_register+0x52/0xc0
             do_one_initcall+0x39/0x150
             do_init_module+0x56/0x1ef
             load_module+0x231c/0x2d70
             SyS_finit_module+0xa5/0xe0
             entry_SYSCALL_64_fastpath+0x1c/0x89
      
      -> #0 (&dev->struct_mutex){+.+.}:
             lock_acquire+0xaf/0x200
             __mutex_lock+0x81/0x9b0
             i915_mutex_lock_interruptible+0x47/0x130 [i915]
             i915_gem_fault+0x201/0x760 [i915]
             __do_fault+0x15/0x70
             __handle_mm_fault+0x85b/0xe40
             handle_mm_fault+0x14f/0x2f0
             __do_page_fault+0x2d1/0x560
             page_fault+0x22/0x30
      
      other info that might help us debug this:
      
      Chain exists of:
        &dev->struct_mutex --> &sb->s_type->i_mutex_key#5 --> &mm->mmap_sem
      
       Possible unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        lock(&mm->mmap_sem);
                                     lock(&sb->s_type->i_mutex_key#5);
                                     lock(&mm->mmap_sem);
        lock(&dev->struct_mutex);
      
       *** DEADLOCK ***
      
      1 lock held by debugfs_test/1351:
       #0:  (&mm->mmap_sem){++++}, at: [<000000005df01c1e>] __do_page_fault+0x106/0x560
      
      stack backtrace:
      CPU: 2 PID: 1351 Comm: debugfs_test Not tainted 4.15.0-rc2-CI-Patchwork_7428+ #1
      Hardware name:                  /NUC6i5SYB, BIOS SYSKLi35.86A.0057.2017.0119.1758 01/19/2017
      Call Trace:
       dump_stack+0x5f/0x86
       print_circular_bug+0x230/0x3b0
       check_prev_add+0x439/0x7b0
       ? lockdep_init_map_crosslock+0x20/0x20
       ? unwind_get_return_address+0x16/0x30
       ? __lock_acquire+0x1385/0x15a0
       __lock_acquire+0x1385/0x15a0
       lock_acquire+0xaf/0x200
       ? i915_mutex_lock_interruptible+0x47/0x130 [i915]
       __mutex_lock+0x81/0x9b0
       ? i915_mutex_lock_interruptible+0x47/0x130 [i915]
       ? i915_mutex_lock_interruptible+0x47/0x130 [i915]
       ? i915_mutex_lock_interruptible+0x47/0x130 [i915]
       i915_mutex_lock_interruptible+0x47/0x130 [i915]
       ? __pm_runtime_resume+0x4f/0x80
       i915_gem_fault+0x201/0x760 [i915]
       __do_fault+0x15/0x70
       __handle_mm_fault+0x85b/0xe40
       handle_mm_fault+0x14f/0x2f0
       __do_page_fault+0x2d1/0x560
       page_fault+0x22/0x30
      RIP: 0033:0x7f98d6f49116
      RSP: 002b:00007ffd6ffc3278 EFLAGS: 00010283
      RAX: 00007f98d39a2bc0 RBX: 0000000000000000 RCX: 0000000000001680
      RDX: 0000000000001680 RSI: 00007ffd6ffc3400 RDI: 00007f98d39a2bc0
      RBP: 00007ffd6ffc33a0 R08: 0000000000000000 R09: 00000000000005a0
      R10: 000055e847c2a830 R11: 0000000000000002 R12: 0000000000000001
      R13: 000055e847c1d040 R14: 00007ffd6ffc3400 R15: 00007f98d6752ba0
      
      v2: Init preempt_work unconditionally (Chris)
      v3: Mention that we need the enable_guc=1 for lockdep splat (Chris)
      
      Testcase: igt/debugfs_test/read_all_entries # with i915.enable_guc=1
      Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171213221352.7173-2-michal.winiarski@intel.com
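The kind of check behind such a splat can be sketched as cycle detection in a directed "held lock -> acquired lock" graph. The lock names below are taken from the trace, but the edge list is a simplified reading of it and the function is a hypothetical illustration, not how lockdep is implemented.

```python
from collections import defaultdict


def find_cycle(edges):
    """Return one cycle in a directed lock-order graph, or None."""
    graph = defaultdict(list)
    for held, acquired in edges:
        graph[held].append(acquired)

    visiting, done = [], set()

    def visit(node):
        if node in done:
            return None
        if node in visiting:
            # Reached a lock that is already on the current path: a cycle.
            return visiting[visiting.index(node):] + [node]
        visiting.append(node)
        for nxt in graph[node]:
            cycle = visit(nxt)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for start in list(graph):
        cycle = visit(start)
        if cycle:
            return cycle
    return None


# Each pair reads "while holding the first lock, the second was taken".
edges = [
    ("struct_mutex", "cpu_hotplug_lock"),   # workqueue allocated under the mutex
    ("cpu_hotplug_lock", "cpuhp_state_mutex"),
    ("cpuhp_state_mutex", "cpuhp_state-up"),
    ("cpuhp_state-up", "completion"),
    ("completion", "i_mutex_key"),
    ("i_mutex_key", "mmap_sem"),
    ("mmap_sem", "struct_mutex"),           # page fault path
]
cycle = find_cycle(edges)
# Dropping the first edge models moving the allocation outside the mutex.
after_fix = find_cycle(edges[1:])
```

With the first edge removed, the graph no longer contains a path from struct_mutex back to itself, so no cycle is reported.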
  5. 13 Dec, 2017 1 commit
  6. 12 Dec, 2017 1 commit
  7. 06 Dec, 2017 1 commit
  8. 01 Dec, 2017 2 commits
  9. 28 Nov, 2017 2 commits
  10. 22 Nov, 2017 1 commit
    • T
      drm/i915/pmu: Expose a PMU interface for perf queries · b46a33e2
      Tvrtko Ursulin committed
      From: Chris Wilson <chris@chris-wilson.co.uk>
      From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      From: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
      
      The first goal is to be able to measure GPU (and individual ring) busyness
      without having to poll registers from userspace. (Which not only incurs
      holding the forcewake lock indefinitely, perturbing the system, but also
      runs the risk of hanging the machine.) As an alternative we can use the
      perf event counter interface to sample the ring registers periodically
      and send those results to userspace.
      
      Functionality we are exporting to userspace is via the existing perf PMU
      API and can be exercised via the existing tools. For example:
      
        perf stat -a -e i915/rcs0-busy/ -I 1000
      
      This prints the render engine busyness once per second. All the performance
      counters can be enumerated (perf list) and have their unit of measure
      correctly reported in sysfs.
      
      v1-v2 (Chris Wilson):
      
      v2: Use a common timer for the ring sampling.
      
      v3: (Tvrtko Ursulin)
       * Decouple uAPI from i915 engine ids.
       * Complete uAPI defines.
       * Refactor some code to helpers for clarity.
       * Skip sampling disabled engines.
       * Expose counters in sysfs.
       * Pass in fake regs to avoid null ptr deref in perf core.
       * Convert to class/instance uAPI.
       * Use shared driver code for rc6 residency, power and frequency.
      
      v4: (Dmitry Rogozhkin)
       * Register PMU with .task_ctx_nr=perf_invalid_context
       * Expose cpumask for the PMU with the single CPU in the mask
       * Properly support pmu->stop(): it should call pmu->read()
       * Properly support pmu->del(): it should call stop(event, PERF_EF_UPDATE)
       * Introduce refcounting of event subscriptions.
       * Make pmu.busy_stats a refcounter to avoid busy stats going away
         with some deleted event.
       * Expose cpumask for i915 PMU to avoid multiple events creation of
         the same type followed by counter aggregation by perf-stat.
       * Track CPUs getting online/offline to migrate perf context. If (likely)
         cpumask will initially set CPU0, CONFIG_BOOTPARAM_HOTPLUG_CPU0 will be
         needed to see effect of CPU status tracking.
       * End result is that only global events are supported and perf stat
         works correctly.
       * Deny perf driver level sampling - it is prohibited for uncore PMU.
      
      v5: (Tvrtko Ursulin)
      
       * Don't hardcode number of engine samplers.
       * Rewrite event ref-counting for correctness and simplicity.
       * Store initial counter value when starting already enabled events
         to correctly report values to all listeners.
       * Fix RC6 residency readout.
       * Comments, GPL header.
      
      v6:
       * Add missing entry to v4 changelog.
       * Fix accounting in CPU hotplug case by copying the approach from
         arch/x86/events/intel/cstate.c. (Dmitry Rogozhkin)
      
      v7:
       * Log failure message only on failure.
       * Remove CPU hotplug notification state on unregister.
      
      v8:
       * Fix error unwind on failed registration.
       * Checkpatch cleanup.
      
      v9:
       * Drop the energy metric, it is available via intel_rapl_perf.
         (Ville Syrjälä)
       * Use HAS_RC6(p). (Chris Wilson)
       * Handle unsupported non-engine events. (Dmitry Rogozhkin)
       * Rebase for intel_rc6_residency_ns needing caller managed
         runtime pm.
       * Drop HAS_RC6 checks from the read callback since creating those
         events will be rejected at init time already.
       * Add counter units to sysfs so perf stat output is nicer.
       * Cleanup the attribute tables for brevity and readability.
      
      v10:
       * Fixed queued accounting.
      
      v11:
       * Move intel_engine_lookup_user to intel_engine_cs.c
       * Commit update. (Joonas Lahtinen)
      
      v12:
       * More accurate sampling. (Chris Wilson)
       * Store and report frequency in MHz for better usability from
         perf stat.
       * Removed metrics: queued, interrupts, rc6 counters.
       * Sample engine busyness based on seqno difference only
         for less MMIO (and forcewake) on all platforms. (Chris Wilson)
      
      v13:
       * Comment spelling, use mul_u32_u32 to work around potential GCC
         issue and some code alignment changes. (Chris Wilson)
      
      v14:
       * Rebase.
      
      v15:
       * Rebase for RPS refactoring.
      
      v16:
       * Use the dynamic slot in the CPU hotplug state machine so that we are
         free to setup our state as multi-instance. Previously we were re-using
         the CPUHP_AP_PERF_X86_UNCORE_ONLINE slot which is neither used as
         multi-instance, nor owned by our driver to start with.
       * Register the CPU hotplug handlers after the PMU, otherwise the callback
         will get called before the PMU is initialized which can end up in
         perf_pmu_migrate_context with an un-initialized base.
       * Added workaround for a probable bug in cpuhp core.
      
      v17:
       * Remove workaround for the cpuhp bug.
      
      v18:
       * Rebase for drm_i915_gem_engine_class getting upstream before us.
      
      v19:
       * Rebase. (trivial)
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-2-tvrtko.ursulin@linux.intel.com
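The idea of estimating busyness from periodic samples, as described at the top of this commit message, can be sketched as follows. This is a toy estimator under stated assumptions: the 5 ms period and the boolean samples are invented for illustration, and the driver's actual sampling source and period are not modelled here.

```python
def sampled_busy_ns(samples, period_ns):
    """Estimate busy time: every busy sample accounts for one full period."""
    return sum(period_ns for busy in samples if busy)


# Assumed inputs: one boolean per timer tick, True if the engine looked busy.
samples = [True, True, False, True, False]
period_ns = 5_000_000  # hypothetical 5 ms sampling period

busy_ns = sampled_busy_ns(samples, period_ns)
busy_percent = 100.0 * busy_ns / (len(samples) * period_ns)
```

A consumer such as perf stat would see a monotonically growing busy-time counter and derive the percentage from the difference between two reads.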
  11. 21 Nov, 2017 5 commits
  12. 20 Nov, 2017 1 commit
  13. 18 Nov, 2017 1 commit
  14. 15 Nov, 2017 3 commits
  15. 14 Nov, 2017 1 commit
  16. 13 Nov, 2017 1 commit
  17. 12 Nov, 2017 1 commit
  18. 11 Nov, 2017 3 commits
    • C
      drm/i915/selftests: Yet another forgotten mock_i915->mm initialiser · 9c52d1c8
      Chris Wilson committed
      Move all of the i915->mm initialisation to a private function that can
      be reused by the mock i915 device to save forgetting any more steps.
      
      For example,
      <7>[ 1542.046332] [IGT] drv_selftest: starting subtest mock_objects
      <4>[ 1542.123924] Setting dangerous option mock_selftests - tainting kernel
      <6>[ 1542.167941] i915: Performing mock selftests with st_random_seed=0x246f5ab5 st_timeout=1000
      <4>[ 1542.178012] INFO: trying to register non-static key.
      <4>[ 1542.178027] the code is fine but needs lockdep annotation.
      <4>[ 1542.178032] turning off the locking correctness validator.
      <4>[ 1542.178041] CPU: 3 PID: 6008 Comm: kworker/3:7 Tainted: G     U          4.14.0-rc8-CI-CI_DRM_3332+ #1
      <4>[ 1542.178049] Hardware name:                  /NUC6CAYB, BIOS AYAPLCEL.86A.0040.2017.0619.1722 06/19/2017
      <4>[ 1542.178144] Workqueue: events __i915_gem_free_work [i915]
      <4>[ 1542.178152] Call Trace:
      <4>[ 1542.178163]  dump_stack+0x68/0x9f
      <4>[ 1542.178170]  register_lock_class+0x3fd/0x580
      <4>[ 1542.178177]  ? unwind_next_frame+0x14/0x20
      <4>[ 1542.178184]  ? __save_stack_trace+0x73/0xd0
      <4>[ 1542.178191]  __lock_acquire+0xa4/0x1b00
      <4>[ 1542.178254]  ? __i915_gem_free_work+0x28/0xa0 [i915]
      <4>[ 1542.178261]  ? __lock_acquire+0x4ab/0x1b00
      <4>[ 1542.178268]  lock_acquire+0xb0/0x200
      <4>[ 1542.178273]  ? lock_acquire+0xb0/0x200
      <4>[ 1542.178336]  ? __i915_gem_free_work+0x28/0xa0 [i915]
      <4>[ 1542.178344]  _raw_spin_lock+0x32/0x50
      <4>[ 1542.178405]  ? __i915_gem_free_work+0x28/0xa0 [i915]
      <4>[ 1542.178468]  __i915_gem_free_work+0x28/0xa0 [i915]
      <4>[ 1542.178476]  process_one_work+0x221/0x650
      <4>[ 1542.178483]  worker_thread+0x4e/0x3c0
      <4>[ 1542.178489]  kthread+0x114/0x150
      <4>[ 1542.178494]  ? process_one_work+0x650/0x650
      <4>[ 1542.178499]  ? kthread_create_on_node+0x40/0x40
      <4>[ 1542.178506]  ret_from_fork+0x27/0x40
      
      v2: Fish out i915->mm.object_stat_lock which was being inited over in
      i915_drv.c (Matthew)
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Matthew Auld <matthew.auld@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171110232447.21618-1-chris@chris-wilson.co.uk
      Reviewed-by: Matthew Auld <matthew.auld@intel.com>
    • C
      drm/i915: Record the default hw state after reset upon load · d2b4b979
      Chris Wilson committed
      Take a copy of the HW state after a reset upon module loading by
      executing a context switch from a blank context to the kernel context,
      thus saving the default hw state over the blank context image.
      We can then use the default hw state to initialise any future context,
      ensuring that each starts with the default view of hw state.
      
      v2: Unmap our default state from the GTT after stealing it from the
      context. This should stop us from accidentally overwriting it via the
      GTT (and frees up some precious GTT space).
      
      Testcase: igt/gem_ctx_isolation
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171110142634.10551-7-chris@chris-wilson.co.uk
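The "record once, copy for every new context" idea in this commit can be modelled in a few lines. This is a hedged sketch: the class, the register names and the dictionary representation are hypothetical stand-ins for the hardware context image.

```python
class ToyDevice:
    """Toy model of capturing a default hw state and seeding contexts from it."""

    def __init__(self, hw_registers):
        self._hw = dict(hw_registers)   # state of the hardware after reset
        self.default_state = None

    def record_default_state(self):
        # Models the context switch that saves the post-reset hw state
        # into a blank context image at module load.
        self.default_state = dict(self._hw)

    def create_context(self):
        # Every new context starts from its own copy of the default state.
        if self.default_state is None:
            raise RuntimeError("default state has not been recorded yet")
        return dict(self.default_state)


# Register names and values are invented for illustration.
dev = ToyDevice({"REG_A": 0x10, "REG_B": 0x0})
dev.record_default_state()
ctx1 = dev.create_context()
ctx1["REG_A"] = 0xFF            # one client changes its own view
ctx2 = dev.create_context()     # a later context still sees the default
```

Because each context receives a copy, changes made by one client do not leak into contexts created afterwards.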
    • C
      drm/i915: Inline intel_modeset_gem_init() · d378a3ef
      Chris Wilson committed
      intel_modeset_gem_init() now only sets up the legacy overlay, so let's
      remove the function and call the setup directly during driver load. This
      should help us find a better point in the initialisation sequence for it
      later.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171110142634.10551-5-chris@chris-wilson.co.uk
  19. 06 Nov, 2017 1 commit
  20. 02 Nov, 2017 1 commit
    • M
      drm/i915/guc: Add support for reset engine using GuC commands · 6acbea89
      Michel Thierry committed
      This patch adds per engine reset and recovery (TDR) support when GuC is
      used to submit workloads to GPU.
      
      When i915 submits directly to ELSP, the driver manages hang
      detection, recovery and resubmission. With GuC submission these tasks
      are shared between driver and GuC. i915 is still responsible for detecting
      a hang, and when it does it only requests GuC to reset that Engine. GuC
      internally manages acquiring forcewake and idling the engine before
      resetting it.
      
      Once the reset is successful, i915 takes over again and handles the
      resubmission. The scheduler in i915 knows which requests are pending so
      after resetting an engine, pending workloads/requests are resubmitted
      again.
      
      v2: s/i915_guc_request_engine_reset/i915_guc_reset_engine/ to match the
      non-guc function names.
      
      v3: Removed debug message about engine restarting from which request,
      since the new baseline does it regardless of submission mode. (Chris)
      
      v4: Rebase.
      
      v5: Do not pass unnecessary reporting flags to the fw (Jeff);
      tasklet_schedule(&execlists->irq_tasklet) handles the resubmit; rebase.
      
      v6: Rename the existing reset engine function and share a similar
      interface between guc and non-guc paths (Chris).
      Signed-off-by: Michel Thierry <michel.thierry@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171031225309.10888-1-michel.thierry@intel.com
      Reviewed-by: Jeff McGee <jeff.mcgee@intel.com>
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
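The division of labour described in this commit (i915 detects the hang, GuC performs the reset, i915 resubmits) can be sketched as a short sequence. Everything here is a toy model: the function names, the log strings and the request labels are hypothetical, and failure handling is reduced to returning an empty list.

```python
def toy_guc_reset(engine, log):
    # Stand-in for the firmware side: forcewake, idle, then reset.
    log.append("guc: acquire forcewake")
    log.append(f"guc: idle {engine}")
    log.append(f"guc: reset {engine}")
    return True


def recover_engine(engine, pending, guc_reset, log):
    # Driver side: hang detection stays here, the reset itself is delegated.
    log.append(f"i915: hang detected on {engine}")
    if not guc_reset(engine, log):
        log.append("i915: engine reset failed")  # fallback path not modelled
        return []
    # After a successful reset the driver resubmits what was pending.
    log.append("i915: resubmit pending requests")
    return list(pending)


log = []
resubmitted = recover_engine("rcs0", ["req-1", "req-2"], toy_guc_reset, log)
```

The log records the hand-over points: one driver step, three firmware steps, then the driver takes over again for resubmission.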
  21. 27 Oct, 2017 2 commits
    • M
      drm/i915/guc: Preemption! With GuC · c41937fd
      Michał Winiarski committed
      Pretty similar to what we have on execlists.
      We're reusing most of the GEM code, however, due to GuC quirks we need a
      couple of extra bits.
      Preemption is implemented as GuC action, and actions can be pretty slow.
      Because of that, we're using a mutex to serialize them. Since we're
      requesting preemption from the tasklet, the task of creating a workitem
      and wrapping it in GuC action is delegated to a worker.
      
      To distinguish that preemption has finished, we're using additional
      piece of HWSP, and since we're not getting context switch interrupts,
      we're also adding a user interrupt.
      
      The fact that our special preempt context has completed unfortunately
      doesn't mean that we're ready to submit new work. We also need to wait
      for GuC to finish its own processing.
      
      v2: Don't compile out the wait for GuC, handle workqueue flush on reset,
      no need for ordered workqueue, put on a reviewer hat when looking at my own
      patches (Chris)
      Move struct work around in intel_guc, move user interrupt outside of
      conditional (Michał)
      Keep ring around rather than chase through intel_context
      
      v3: Extract WA for flushing ggtt writes to a helper (Chris)
      Keep work_struct in intel_guc rather than engine (Michał)
      Use ordered workqueue for inject_preempt worker to avoid GuC quirks.
      
      v4: Drop now unused INTEL_GUC_PREEMPT_OPTION_IMMEDIATE (Daniele)
      Drop stray newlines, use container_of for intel_guc in worker,
      check for presence of workqueue when flushing it, rather than
      enable_guc_submission modparam, reorder preempt postprocessing (Chris)
      
      v5: Make wq NULL after destroying it
      
      v6: Swap struct guc_preempt_work members (Michał)
      Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Jeff McGee <jeff.mcgee@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Oscar Mateo <oscar.mateo@intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171026133558.19580-1-michal.winiarski@intel.com
    • M
      drm/i915: Rename helpers used for unwinding, use macro for can_preempt · a4598d17
      Michał Winiarski committed
      We would also like to make use of execlist_cancel_port_requests and
      unwind_incomplete_requests in GuC preemption backend.
      Let's rename the functions to use the correct prefixes, so that we can
      simply add the declarations in the following patch.
      A similar thing applies to can_preempt, except we're introducing the
      HAS_LOGICAL_RING_PREEMPTION macro instead, converting other users that
      were previously touching device info directly.
      
      v2: s/intel_engine/execlists and pass execlists to unwind (Chris)
      v3: use locked version for exporting, drop const qual (Chris)
      Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171025200020.16636-11-michal.winiarski@intel.com
  22. 12 Oct, 2017 1 commit
    • C
      drm/i915/userptr: Drop struct_mutex before cleanup · 7c781423
      Chris Wilson committed
      Purely to silence lockdep, as we know that no bo can exist at this time
      and so the inversion is impossible. Nevertheless, lockdep currently
      warns on unload:
      
      [  137.522565] WARNING: possible circular locking dependency detected
      [  137.522568] 4.14.0-rc4-CI-CI_DRM_3209+ #1 Tainted: G     U
      [  137.522570] ------------------------------------------------------
      [  137.522572] drv_module_relo/1532 is trying to acquire lock:
      [  137.522574]  ("i915-userptr-acquire"){+.+.}, at: [<ffffffff8109a831>] flush_workqueue+0x91/0x540
      [  137.522581]
                     but task is already holding lock:
      [  137.522583]  (&dev->struct_mutex){+.+.}, at: [<ffffffffa014fb3f>] i915_gem_fini+0x3f/0xc0 [i915]
      [  137.522605]
                     which lock already depends on the new lock.
      
      [  137.522608]
                     the existing dependency chain (in reverse order) is:
      [  137.522611]
                     -> #3 (&dev->struct_mutex){+.+.}:
      [  137.522615]        __lock_acquire+0x1420/0x15e0
      [  137.522618]        lock_acquire+0xb0/0x200
      [  137.522621]        __mutex_lock+0x86/0x9b0
      [  137.522623]        mutex_lock_interruptible_nested+0x1b/0x20
      [  137.522640]        i915_mutex_lock_interruptible+0x51/0x130 [i915]
      [  137.522657]        i915_gem_fault+0x20b/0x720 [i915]
      [  137.522660]        __do_fault+0x1e/0x80
      [  137.522662]        __handle_mm_fault+0xa08/0xed0
      [  137.522664]        handle_mm_fault+0x156/0x300
      [  137.522666]        __do_page_fault+0x2c5/0x570
      [  137.522668]        do_page_fault+0x28/0x250
      [  137.522671]        page_fault+0x22/0x30
      [  137.522672]
                     -> #2 (&mm->mmap_sem){++++}:
      [  137.522677]        __lock_acquire+0x1420/0x15e0
      [  137.522679]        lock_acquire+0xb0/0x200
      [  137.522682]        down_read+0x3e/0x70
      [  137.522699]        __i915_gem_userptr_get_pages_worker+0x141/0x240 [i915]
      [  137.522701]        process_one_work+0x233/0x660
      [  137.522704]        worker_thread+0x4e/0x3b0
      [  137.522706]        kthread+0x152/0x190
      [  137.522708]        ret_from_fork+0x27/0x40
      [  137.522710]
                     -> #1 ((&work->work)){+.+.}:
      [  137.522714]        __lock_acquire+0x1420/0x15e0
      [  137.522717]        lock_acquire+0xb0/0x200
      [  137.522719]        process_one_work+0x206/0x660
      [  137.522721]        worker_thread+0x4e/0x3b0
      [  137.522723]        kthread+0x152/0x190
      [  137.522725]        ret_from_fork+0x27/0x40
      [  137.522727]
                     -> #0 ("i915-userptr-acquire"){+.+.}:
      [  137.522731]        check_prev_add+0x430/0x840
      [  137.522733]        __lock_acquire+0x1420/0x15e0
      [  137.522735]        lock_acquire+0xb0/0x200
      [  137.522738]        flush_workqueue+0xb4/0x540
      [  137.522740]        drain_workqueue+0xd4/0x1b0
      [  137.522742]        destroy_workqueue+0x1c/0x200
      [  137.522758]        i915_gem_cleanup_userptr+0x15/0x20 [i915]
      [  137.522770]        i915_gem_fini+0x5f/0xc0 [i915]
      [  137.522782]        i915_driver_unload+0x122/0x180 [i915]
      [  137.522794]        i915_pci_remove+0x19/0x30 [i915]
      [  137.522797]        pci_device_remove+0x39/0xb0
      [  137.522800]        device_release_driver_internal+0x15d/0x220
      [  137.522803]        driver_detach+0x40/0x80
      [  137.522805]        bus_remove_driver+0x58/0xd0
      [  137.522807]        driver_unregister+0x2c/0x40
      [  137.522809]        pci_unregister_driver+0x36/0xb0
      [  137.522828]        i915_exit+0x1a/0x8b [i915]
      [  137.522831]        SyS_delete_module+0x18c/0x1e0
      [  137.522834]        entry_SYSCALL_64_fastpath+0x1c/0xb1
      [  137.522835]
                     other info that might help us debug this:
      
      [  137.522838] Chain exists of:
                       "i915-userptr-acquire" --> &mm->mmap_sem --> &dev->struct_mutex
      
      [  137.522844]  Possible unsafe locking scenario:
      
      [  137.522846]        CPU0                    CPU1
      [  137.522848]        ----                    ----
      [  137.522850]   lock(&dev->struct_mutex);
      [  137.522852]                                lock(&mm->mmap_sem);
      [  137.522854]                                lock(&dev->struct_mutex);
      [  137.522857]   lock("i915-userptr-acquire");
      [  137.522859]
                      *** DEADLOCK ***
      
      [  137.522862] 3 locks held by drv_module_relo/1532:
      [  137.522864]  #0:  (&dev->mutex){....}, at: [<ffffffff8161d47b>] device_release_driver_internal+0x2b/0x220
      [  137.522869]  #1:  (&dev->mutex){....}, at: [<ffffffff8161d489>] device_release_driver_internal+0x39/0x220
      [  137.522873]  #2:  (&dev->struct_mutex){+.+.}, at: [<ffffffffa014fb3f>] i915_gem_fini+0x3f/0xc0 [i915]
      [  137.522888]
                     stack backtrace:
      [  137.522891] CPU: 0 PID: 1532 Comm: drv_module_relo Tainted: G     U          4.14.0-rc4-CI-CI_DRM_3209+ #1
      [  137.522894] Hardware name:                  /NUC7i5BNB, BIOS BNKBL357.86A.0048.2017.0704.1415 07/04/2017
      [  137.522897] Call Trace:
      [  137.522900]  dump_stack+0x68/0x9f
      [  137.522902]  print_circular_bug+0x235/0x3c0
      [  137.522905]  ? lockdep_init_map_crosslock+0x20/0x20
      [  137.522908]  check_prev_add+0x430/0x840
      [  137.522919]  ? i915_gem_fini+0x5f/0xc0 [i915]
      [  137.522922]  ? __kernel_text_address+0x12/0x40
      [  137.522925]  ? __save_stack_trace+0x66/0xd0
      [  137.522928]  __lock_acquire+0x1420/0x15e0
      [  137.522930]  ? __lock_acquire+0x1420/0x15e0
      [  137.522933]  ? lockdep_init_map_crosslock+0x20/0x20
      [  137.522936]  ? __this_cpu_preempt_check+0x13/0x20
      [  137.522938]  lock_acquire+0xb0/0x200
      [  137.522940]  ? flush_workqueue+0x91/0x540
      [  137.522943]  flush_workqueue+0xb4/0x540
      [  137.522945]  ? flush_workqueue+0x91/0x540
      [  137.522948]  ? __mutex_unlock_slowpath+0x43/0x2c0
      [  137.522951]  ? trace_hardirqs_on_caller+0xe3/0x1b0
      [  137.522954]  drain_workqueue+0xd4/0x1b0
      [  137.522956]  ? drain_workqueue+0xd4/0x1b0
      [  137.522958]  destroy_workqueue+0x1c/0x200
      [  137.522975]  i915_gem_cleanup_userptr+0x15/0x20 [i915]
      [  137.522987]  i915_gem_fini+0x5f/0xc0 [i915]
      [  137.523000]  i915_driver_unload+0x122/0x180 [i915]
      [  137.523015]  i915_pci_remove+0x19/0x30 [i915]
      [  137.523018]  pci_device_remove+0x39/0xb0
      [  137.523021]  device_release_driver_internal+0x15d/0x220
      [  137.523023]  driver_detach+0x40/0x80
      [  137.523026]  bus_remove_driver+0x58/0xd0
      [  137.523028]  driver_unregister+0x2c/0x40
      [  137.523030]  pci_unregister_driver+0x36/0xb0
      [  137.523049]  i915_exit+0x1a/0x8b [i915]
      [  137.523052]  SyS_delete_module+0x18c/0x1e0
      [  137.523055]  entry_SYSCALL_64_fastpath+0x1c/0xb1
      [  137.523057] RIP: 0033:0x7f7bd0609287
      [  137.523059] RSP: 002b:00007ffef694bc18 EFLAGS: 00000246 ORIG_RAX: 00000000000000b0
      [  137.523062] RAX: ffffffffffffffda RBX: ffffffff81493f33 RCX: 00007f7bd0609287
      [  137.523065] RDX: 0000000000000001 RSI: 0000000000000800 RDI: 0000564f999f9fc8
      [  137.523067] RBP: ffffc90005c4ff88 R08: 0000000000000000 R09: 0000000000000080
      [  137.523069] R10: 00007f7bd20ef8c0 R11: 0000000000000246 R12: 0000000000000000
      [  137.523072] R13: 00007ffef694be00 R14: 0000000000000000 R15: 0000000000000000
      [  137.523075]  ? __this_cpu_preempt_check+0x13/0x20
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171011141857.14161-1-chris@chris-wilson.co.uk
      Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      7c781423
  23. 11 Oct, 2017 4 commits
  24. 05 Oct, 2017 2 commits
    • C
      drm/i915/scheduler: Support user-defined priorities · ac14fbd4
      Committed by Chris Wilson
      Use a priority stored in the context as the initial value when
      submitting a request. This allows us to change the default priority on a
      per-context basis, allowing different contexts to be favoured with GPU
      time at the expense of lower importance work. The user can adjust the
      context's priority via I915_CONTEXT_PARAM_PRIORITY, with more positive
      values being higher priority (they will be serviced earlier, after their
      dependencies have been resolved). Any prerequisite work for an execbuf
      will have its priority raised to match the new request as required.
      
      Normal users can specify any value in the range of -1023 to 0 [default],
      i.e. they can reduce the priority of their workloads (and temporarily
      boost it back to normal if so desired).
      
      Privileged users can specify any value in the range of -1023 to 1023
      [default is 0], i.e. they can raise their priority above all others and
      so potentially starve the system.
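
The two ranges above can be illustrated with a small range check. This is a minimal sketch, not the driver's code: the macro names, `check_user_priority()` and the `privileged` flag (standing in for a capability check such as CAP_SYS_NICE) are hypothetical, and the choice of error codes is an assumption made for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical names; the values are the limits quoted in the commit message. */
#define MIN_USER_PRIORITY     (-1023)
#define DEFAULT_USER_PRIORITY 0
#define MAX_USER_PRIORITY     1023

/*
 * Return 0 if the caller may use this priority, or a negative errno.
 * Error codes are an assumption: -EINVAL for a value outside the overall
 * range, -EPERM for a value above the default without privilege.
 */
static int check_user_priority(int priority, bool privileged)
{
    if (priority < MIN_USER_PRIORITY || priority > MAX_USER_PRIORITY)
        return -EINVAL;

    if (priority > DEFAULT_USER_PRIORITY && !privileged)
        return -EPERM;

    return 0;
}
```

With this check, an unprivileged caller may lower its priority and later restore it to 0, but any positive value is refused.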
      
      Note that the existing schedulers are neither fair nor load balancing:
      execution is strictly by priority on a first-come, first-served basis,
      and the driver may choose to boost some requests above the range
      available to users.
      
      This priority was originally based around nice(2), but evolved to allow
      clients to adjust their priority within a small range, and allow for a
      privileged high priority range.
      
      For example, this can be used to implement EGL_IMG_context_priority
      https://www.khronos.org/registry/egl/extensions/IMG/EGL_IMG_context_priority.txt
      
      	"EGL_CONTEXT_PRIORITY_LEVEL_IMG determines the priority level of
      	the context to be created. This attribute is a hint, as an
      	implementation may not support multiple contexts at some
      	priority levels and system policy may limit access to high
      	priority contexts to appropriate system privilege level. The
      	default value for EGL_CONTEXT_PRIORITY_LEVEL_IMG is
      	EGL_CONTEXT_PRIORITY_MEDIUM_IMG."
      
      so we can map
      
      	PRIORITY_HIGH -> 1023 [privileged, will fall back to 0]
      	PRIORITY_MED -> 0 [default]
      	PRIORITY_LOW -> -1023
      
      They also map onto the priorities used by VkQueue (and a VkQueue is
      essentially a timeline, our i915_gem_context under full-ppgtt).
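
The mapping listed above might be expressed in a userspace driver roughly as follows. This is a sketch under stated assumptions: the enum, `level_to_priority()` and the `privileged` flag are hypothetical names introduced for illustration, and the fallback of the high level to 0 for unprivileged callers follows the note in the list rather than any particular implementation.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the three EGL_IMG_context_priority levels. */
enum ctx_priority_level {
    CTX_PRIORITY_LOW,
    CTX_PRIORITY_MED,
    CTX_PRIORITY_HIGH,
};

/* Map a priority level to the numeric value used in the commit message. */
static int level_to_priority(enum ctx_priority_level level, bool privileged)
{
    switch (level) {
    case CTX_PRIORITY_HIGH:
        /* Only privileged callers get the top value; others fall back to 0. */
        return privileged ? 1023 : 0;
    case CTX_PRIORITY_LOW:
        return -1023;
    case CTX_PRIORITY_MED:
    default:
        return 0;
    }
}
```

Since the EGL attribute is only a hint, silently falling back to the default for an unprivileged high-priority request is consistent with the quoted extension text.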
      
      v2: s/CAP_SYS_ADMIN/CAP_SYS_NICE/
      v3: Report min/max user priorities as defines in the uapi, and rebase
      internal priorities on the exposed values.
      
      Testcase: igt/gem_exec_schedule
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171003203453.15692-9-chris@chris-wilson.co.uk
      ac14fbd4
    • C
      drm/i915/execlists: Preemption! · beecec90
      Committed by Chris Wilson
      When we write to ELSP, it triggers a context preemption at the earliest
      arbitration point (3DPRIMITIVE, some PIPECONTROLs, a few other
      operations and the explicit MI_ARB_CHECK). If this is to the same
      context, it triggers a LITE_RESTORE where the RING_TAIL is merely
      updated (used currently to chain requests from the same context
      together, avoiding bubbles). However, if it is to a different context, a
      full context-switch is performed and it will start to execute the new
      context saving the image of the old for later execution.
      
      Previously we avoided preemption by only submitting a new context when
      the old was idle. But now we wish to embrace it, and if the new request has
      a higher priority than the currently executing request, we write to the
      ELSP regardless, thus triggering preemption, but we tell the GPU to
      switch to our special preemption context (not the target). In the
      context-switch interrupt handler, we know that the previous contexts
      have finished execution and so can unwind all the incomplete requests
      and compute the new highest priority request to execute.
      
      It would be feasible to avoid the switch-to-idle intermediate by
      programming the ELSP with the target context. The difficulty is in
      tracking which request that should be whilst maintaining the dependency
      chain; the error comes in with coalesced requests. As we only track the
      most recent request and its priority, we may run into the issue of being
      tricked into preempting a high priority request that was followed by a
      low priority request from the same context (e.g. for PI); worse still,
      that earlier request may be our own dependency and the order then broken
      by preemption. By injecting the switch-to-idle and then recomputing the
      priority queue, we avoid the issue with tracking in-flight coalesced
      requests. Having tried the preempt-to-busy approach, and failed to find
      a way around the coalesced priority issue, Michal's original proposal to
      inject an idle context (based on handling GuC preemption) succeeds.
      
      The current heuristic is to preempt only if the new request is of
      higher priority than the executing one and has a privileged priority,
      i.e. greater than 0. Note that the scheduler remains unfair!
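
The heuristic described above reduces to a two-part comparison. The following is a minimal sketch, assuming plain integer priorities; `need_preempt()` and its parameter names are hypothetical and are not claimed to match the driver's own helper.

```c
#include <stdbool.h>

/*
 * Decide whether a newly queued request should preempt the one executing.
 * Both conditions from the text must hold: the new request has a strictly
 * higher priority, and that priority lies in the privileged range (> 0).
 */
static bool need_preempt(int new_priority, int active_priority)
{
    return new_priority > active_priority && new_priority > 0;
}
```

Under this rule a default-priority (0) request never preempts lower-priority work already running; it is only ordered ahead of it in the queue.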
      
      v2: Disable for gen8 (bdw/bsw) as we need additional w/a for GPGPU.
      Since the feature is now conditional and not always available when we
      have a scheduler, make it known via the HAS_SCHEDULER GETPARAM (now a
      capability mask).
      v3: Stylistic tweaks.
      v4: Appease Joonas with a snippet of kerneldoc, only to add fuel to the
      fire of the preempt vs preempting debate.

      Suggested-by: Michal Winiarski <michal.winiarski@intel.com>
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Michal Winiarski <michal.winiarski@intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
      Cc: Mika Kuoppala <mika.kuoppala@intel.com>
      Cc: Ben Widawsky <benjamin.widawsky@intel.com>
      Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
      Cc: Zhi Wang <zhi.a.wang@intel.com>
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171003203453.15692-8-chris@chris-wilson.co.uk
      beecec90