1. 08 Sep, 2022 1 commit
  2. 05 Apr, 2022 1 commit
    • drm/i915: Explicitly track DRM clients · 5f0d4d14
      Committed by Tvrtko Ursulin
      Tracking DRM clients more explicitly will allow later patches to
      accumulate past and current GPU usage in a centralised place and also
      consolidate access to owning task pid/name.
      
      Unique client id is also assigned for the purpose of distinguishing/
      consolidating between multiple file descriptors owned by the same process.
      
      v2:
       Chris Wilson:
       * Enclose new members into dedicated structs.
       * Protect against failed sysfs registration.
      
      v3:
       * sysfs_attr_init.
      
      v4:
       * Fix for internal clients.
      
      v5:
       * Use cyclic ida for client id. (Chris)
       * Do not leak pid reference. (Chris)
       * Tidy code with some locals.
      
      v6:
       * Use xa_alloc_cyclic to simplify locking. (Chris)
       * No need to unregister individual sysfs files. (Chris)
       * Rebase on top of fpriv kref.
       * Track client closed status and reflect in sysfs.
      
      v7:
       * Make drm_client more standalone concept.
      
      v8:
       * Simplify sysfs show. (Chris)
       * Always track name and pid.
      
      v9:
       * Fix cyclic id assignment.
      
      v10:
       * No need for a mutex around xa_alloc_cyclic.
       * Refactor sysfs into own function.
       * Unregister sysfs before freeing pid and name.
       * Move clients setup into own function.
      
      v11:
       * Call clients init directly from driver init. (Chris)
      
      v12:
       * Do not fail client add on id wrap. (Maciej)
      
      v13 (Lucas): Rebase.
      
      v14:
       * Dropped sysfs bits.
      
      v15:
       * Dropped tracking of pid and name.
       * Dropped RCU freeing of the client object.
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> # v11
      Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com> # v11
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20220401142205.3123159-2-tvrtko.ursulin@linux.intel.com
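The changelog above mentions cyclic id assignment (v5, v6, v9) and not failing the client add when the id wraps (v12). A minimal userspace sketch of that idea follows. It is not the driver code: `client_table`, `client_alloc_cyclic`, `client_add` and `MAX_CLIENTS` are hypothetical names, and the return convention (0 on success, 1 on success after wrapping, negative on failure) is only modelled on the one documented for `xa_alloc_cyclic`.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_CLIENTS 4 /* tiny on purpose so wrap-around is easy to reach */

/* Hypothetical stand-in for the driver's client table. */
struct client_table {
	bool used[MAX_CLIENTS];
	unsigned int next; /* cursor: where the next search starts */
};

/* Return the first free slot in [lo, hi), or -1 if there is none. */
static int find_free(const struct client_table *t, unsigned int lo,
		     unsigned int hi)
{
	for (unsigned int i = lo; i < hi; i++)
		if (!t->used[i])
			return (int)i;
	return -1;
}

/*
 * Allocate the lowest free id at or above the cursor; if none is free
 * there, wrap and search below the cursor. Returns 0 on success, 1 on
 * success after wrapping, -1 when every id is in use.
 */
static int client_alloc_cyclic(struct client_table *t, unsigned int *id)
{
	int wrapped = 0;
	int slot = find_free(t, t->next, MAX_CLIENTS);

	if (slot < 0) {
		slot = find_free(t, 0, t->next);
		wrapped = 1;
	}
	if (slot < 0)
		return -1;

	t->used[slot] = true;
	t->next = (unsigned int)slot + 1;
	*id = (unsigned int)slot;
	return wrapped;
}

static void client_free(struct client_table *t, unsigned int id)
{
	assert(id < MAX_CLIENTS);
	t->used[id] = false;
}

/* Caller side: only a negative return is an error; a wrap (1) is fine. */
static bool client_add(struct client_table *t, unsigned int *id)
{
	return client_alloc_cyclic(t, id) >= 0;
}
```

The point of the `>= 0` check in `client_add` is that a wrapped allocation is still a valid allocation, so treating any non-zero return as an error would reject clients for no reason.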
  3. 21 Mar, 2022 1 commit
    • drm/i915/gem: Don't evict unmappable VMAs when pinning with PIN_MAPPABLE (v2) · 230523ba
      Committed by Vivek Kasireddy
      On platforms capable of allowing 8K (7680 x 4320) modes, pinning 2 or
      more framebuffers/scanout buffers results in only one that is mappable/
      fenceable. Therefore, pageflipping between these 2 FBs where only one
      is mappable/fenceable creates latencies large enough to miss alternate
      vblanks thereby producing less optimal framerate.
      
      This mainly happens because when i915_gem_object_pin_to_display_plane()
      is called to pin one of the FB objs, the associated vma is identified
      as misplaced -- because there is no space for it in the aperture --
      and therefore i915_vma_unbind() is called which unbinds and evicts it.
      This misplaced vma gets subsequently pinned only when
      i915_gem_object_ggtt_pin_ww() is called without PIN_MAPPABLE. This whole
      thing results in a latency of ~10ms and happens every other repaint cycle.
      Therefore, to fix this issue, we just ensure that the misplaced VMA
      does not get evicted when we try to pin it with PIN_MAPPABLE -- by
      returning early if the mappable/fenceable flag is not set.
      
      Testcase:
      Running Weston and weston-simple-egl on an Alderlake_S (ADLS) platform
      with an 8K@60 mode results in only ~40 FPS (compared to ~59 FPS with
      this patch). Since upstream Weston submits a frame ~7ms before the
      next vblank, the latencies seen between atomic commit and flip event
      are 7, 24 (7 + 16.66), 7, 24..... suggesting that it misses the
      vblank every other frame.
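The figures in the testcase are consistent with each other, which a short calculation can show. The sketch below assumes a 60 Hz refresh and the 7 ms submit lead quoted above; the function names are illustrative only.

```c
#include <assert.h>

/* Latency from atomic commit to flip event, in milliseconds. */
static double flip_latency_ms(double submit_lead_ms, double refresh_hz,
			      int missed_vblanks)
{
	return submit_lead_ms + missed_vblanks * (1000.0 / refresh_hz);
}

/*
 * Average frame rate when every other frame misses exactly one vblank:
 * two frames then consume three vblank periods instead of two.
 */
static double fps_with_alternate_misses(double refresh_hz)
{
	double vblank_ms = 1000.0 / refresh_hz;
	double two_frames_ms = 3.0 * vblank_ms;

	return 2.0 * 1000.0 / two_frames_ms;
}
```

At 60 Hz a vblank period is about 16.67 ms, so a missed flip takes roughly 7 + 16.67 ms (the "24" above), and two frames per three vblank periods works out to about 40 FPS.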
      
      Here is the ftrace snippet that shows the source of the ~10ms latency:
                    i915_gem_object_pin_to_display_plane() {
      0.102 us   |    i915_gem_object_set_cache_level();
                      i915_gem_object_ggtt_pin_ww() {
      0.390 us   |      i915_vma_instance();
      0.178 us   |      i915_vma_misplaced();
                        i915_vma_unbind() {
                        __i915_active_wait() {
      0.082 us   |        i915_active_acquire_if_busy();
      0.475 us   |      }
                        intel_runtime_pm_get() {
      0.087 us   |        intel_runtime_pm_acquire();
      0.259 us   |      }
                        __i915_active_wait() {
      0.085 us   |        i915_active_acquire_if_busy();
      0.240 us   |      }
                        __i915_vma_evict() {
                          ggtt_unbind_vma() {
                            gen8_ggtt_clear_range() {
      10507.255 us |        }
      10507.689 us |      }
      10508.516 us |   }
      
      v2:
      - Expand the code comments to describe the ping-pong issue.
      
      Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20220321005431.1113890-1-vivek.kasireddy@intel.com
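The fix described in this commit can be read as a small decision rule: a misplaced VMA is no longer evicted when the caller asked for PIN_MAPPABLE and the VMA is not currently mappable/fenceable. The sketch below models only that rule as stated in the commit message. The types, the flag value and the function name are hypothetical, and the actual patch may apply further conditions.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_PIN_MAPPABLE 0x1u /* hypothetical flag value, not the kernel's */

enum pin_action {
	PIN_IN_PLACE,         /* vma is fine where it is */
	PIN_RETURN_EARLY,     /* leave the vma bound; caller retries without MAPPABLE */
	PIN_EVICT_AND_REBIND, /* unbind the vma and bind it somewhere else */
};

/* Hypothetical, reduced view of the vma state the decision depends on. */
struct vma_state {
	bool misplaced;
	bool map_and_fenceable;
};

static enum pin_action choose_pin_action(const struct vma_state *vma,
					 unsigned int flags)
{
	if (!vma->misplaced)
		return PIN_IN_PLACE;

	/* Avoid the evict/rebind ping-pong: do not unbind in this case. */
	if ((flags & SKETCH_PIN_MAPPABLE) && !vma->map_and_fenceable)
		return PIN_RETURN_EARLY;

	return PIN_EVICT_AND_REBIND;
}
```

Returning early keeps the existing binding intact, so the follow-up pin without PIN_MAPPABLE does not have to pay for the roughly 10 ms unbind seen in the ftrace output.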
  4. 07 Mar, 2022 3 commits
  5. 14 Feb, 2022 2 commits
  6. 11 Feb, 2022 1 commit
  7. 18 Jan, 2022 1 commit
  8. 11 Jan, 2022 1 commit
    • drm/i915: Use vma resources for async unbinding · 2f6b90da
      Committed by Thomas Hellström
      Implement async (non-blocking) unbinding by not syncing the vma before
      calling unbind on the vma_resource.
      Add the resulting unbind fence to the object's dma_resv from where it is
      picked up by the ttm migration code.
      Ideally these unbind fences should be coalesced with the migration blit
      fence to avoid stalling the migration blit waiting for unbind, as they
      can certainly go on in parallel, but since we don't yet have a
      reasonable data structure to use to coalesce fences and attach the
      resulting fence to a timeline, we defer that for now.
      
      Note that with async unbinding, even while the unbind waits for the
      preceding bind to complete before unbinding, the vma itself might have been
      destroyed in the process, clearing the vma pages. Therefore we can
      only allow async unbinding if we have a refcounted sg-list and keep a
      refcount on that for the vma resource pages to stay intact until
      binding occurs. If this condition is not met, a request for an async
      unbind is diverted to a sync unbind.
      
      v2:
      - Use a separate kmem_cache for vma resources for now to isolate their
        memory allocation and aid debugging.
      - Move the check for vm closed to the actual unbinding thread. Regardless
        of whether the vm is closed, we need the unbind fence to properly wait
        for capture.
      - Clear vma_res::vm on unbind and update its documentation.
      v4:
      - Take cache coloring into account when searching for vma resources
        pending unbind. (Matthew Auld)
      v5:
      - Fix timeout and error check in i915_vma_resource_bind_dep_await().
      - Avoid taking a reference on the object for async binding if
        async unbind capable.
      - Fix braces around a single-line if statement.
      v6:
      - Fix up the cache coloring adjustment. (Kernel test robot <lkp@intel.com>)
      - Don't allow async unbinding if the vma_res pages are not the same as
        the object pages. (Matthew Auld)
      v7:
      - s/unsigned long/u64/ in a number of places (Matthew Auld)
      Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
      Reviewed-by: Matthew Auld <matthew.auld@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20220110172219.107131-5-thomas.hellstrom@linux.intel.com
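The commit message explains that async unbinding is only allowed when the pages are held through a refcounted sg-list, and that otherwise the request falls back to a sync unbind. A minimal userspace sketch of that lifetime rule follows. `page_list`, `choose_unbind_mode` and the counter `live_page_lists` are hypothetical stand-ins, not the driver's data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

static int live_page_lists; /* page lists allocated and not yet freed */

/* Hypothetical stand-in for a refcounted sg-list. */
struct page_list {
	int refcount;
};

static struct page_list *page_list_create(void)
{
	struct page_list *p = malloc(sizeof(*p));

	assert(p);
	p->refcount = 1; /* reference owned by the creator (the vma) */
	live_page_lists++;
	return p;
}

static struct page_list *page_list_get(struct page_list *p)
{
	p->refcount++;
	return p;
}

static void page_list_put(struct page_list *p)
{
	if (--p->refcount == 0) {
		free(p);
		live_page_lists--;
	}
}

enum unbind_mode { UNBIND_SYNC, UNBIND_ASYNC };

/*
 * Async unbind is only safe if the pages are refcounted and are the same
 * pages the object uses; otherwise the request is diverted to sync.
 */
static enum unbind_mode choose_unbind_mode(bool want_async,
					   bool pages_refcounted,
					   bool pages_match_object)
{
	if (want_async && pages_refcounted && pages_match_object)
		return UNBIND_ASYNC;
	return UNBIND_SYNC;
}
```

In this model the vma resource takes its own reference with `page_list_get()`, so destroying the vma (one `page_list_put()`) leaves the pages alive until the resource drops its reference once the unbind has finished.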
  9. 10 Jan, 2022 1 commit
  10. 06 Jan, 2022 1 commit
  11. 20 Dec, 2021 1 commit
  12. 18 Dec, 2021 1 commit
  13. 10 Nov, 2021 1 commit
  14. 02 Nov, 2021 1 commit
  15. 20 Sep, 2021 1 commit
  16. 17 Jun, 2021 3 commits
  17. 11 Jun, 2021 1 commit
  18. 07 Jun, 2021 1 commit
  19. 02 Jun, 2021 1 commit
  20. 19 May, 2021 1 commit
  21. 17 May, 2021 1 commit
  22. 30 Apr, 2021 1 commit
  23. 29 Apr, 2021 1 commit
    • drm/i915: Use trylock in shrinker for ggtt on bsw vt-d and bxt, v2. · bc6f80cc
      Committed by Maarten Lankhorst
      The stop_machine() lock may allocate memory, but is called inside
      vm->mutex, which is taken in the shrinker. This will cause a lockdep
      splat, as can be seen below:
      
      <4>[  462.585762] ======================================================
      <4>[  462.585768] WARNING: possible circular locking dependency detected
      <4>[  462.585773] 5.12.0-rc5-CI-Trybot_7644+ #1 Tainted: G     U
      <4>[  462.585779] ------------------------------------------------------
      <4>[  462.585783] i915_selftest/5540 is trying to acquire lock:
      <4>[  462.585788] ffffffff826440b0 (cpu_hotplug_lock){++++}-{0:0}, at: stop_machine+0x12/0x30
      <4>[  462.585814]
                        but task is already holding lock:
      <4>[  462.585818] ffff888125369c70 (&vm->mutex/1){+.+.}-{3:3}, at: i915_vma_pin_ww+0x38e/0xb40 [i915]
      <4>[  462.586301]
                        which lock already depends on the new lock.
      
      <4>[  462.586305]
                        the existing dependency chain (in reverse order) is:
      <4>[  462.586309]
                        -> #2 (&vm->mutex/1){+.+.}-{3:3}:
      <4>[  462.586323]        i915_gem_shrinker_taints_mutex+0x2d/0x50 [i915]
      <4>[  462.586719]        i915_address_space_init+0x12d/0x130 [i915]
      <4>[  462.587092]        ppgtt_init+0x4e/0x80 [i915]
      <4>[  462.587467]        gen8_ppgtt_create+0x3e/0x5c0 [i915]
      <4>[  462.587828]        i915_ppgtt_create+0x28/0xf0 [i915]
      <4>[  462.588203]        intel_gt_init+0x123/0x370 [i915]
      <4>[  462.588572]        i915_gem_init+0x129/0x1f0 [i915]
      <4>[  462.588971]        i915_driver_probe+0x753/0xd80 [i915]
      <4>[  462.589320]        i915_pci_probe+0x43/0x1d0 [i915]
      <4>[  462.589671]        pci_device_probe+0x9e/0x110
      <4>[  462.589680]        really_probe+0xea/0x410
      <4>[  462.589690]        driver_probe_device+0xd9/0x140
      <4>[  462.589697]        device_driver_attach+0x4a/0x50
      <4>[  462.589704]        __driver_attach+0x83/0x140
      <4>[  462.589711]        bus_for_each_dev+0x75/0xc0
      <4>[  462.589718]        bus_add_driver+0x14b/0x1f0
      <4>[  462.589724]        driver_register+0x66/0xb0
      <4>[  462.589731]        i915_init+0x70/0x87 [i915]
      <4>[  462.590053]        do_one_initcall+0x56/0x2e0
      <4>[  462.590061]        do_init_module+0x55/0x200
      <4>[  462.590068]        load_module+0x2703/0x2990
      <4>[  462.590074]        __do_sys_finit_module+0xad/0x110
      <4>[  462.590080]        do_syscall_64+0x33/0x80
      <4>[  462.590089]        entry_SYSCALL_64_after_hwframe+0x44/0xae
      <4>[  462.590096]
                        -> #1 (fs_reclaim){+.+.}-{0:0}:
      <4>[  462.590109]        fs_reclaim_acquire+0x9f/0xd0
      <4>[  462.590118]        kmem_cache_alloc_trace+0x3d/0x430
      <4>[  462.590126]        intel_cpuc_prepare+0x3b/0x1b0
      <4>[  462.590133]        cpuhp_invoke_callback+0x9e/0x890
      <4>[  462.590141]        _cpu_up+0xa4/0x130
      <4>[  462.590147]        cpu_up+0x82/0x90
      <4>[  462.590153]        bringup_nonboot_cpus+0x4a/0x60
      <4>[  462.590159]        smp_init+0x21/0x5c
      <4>[  462.590167]        kernel_init_freeable+0x8a/0x1b7
      <4>[  462.590175]        kernel_init+0x5/0xff
      <4>[  462.590181]        ret_from_fork+0x22/0x30
      <4>[  462.590187]
                        -> #0 (cpu_hotplug_lock){++++}-{0:0}:
      <4>[  462.590199]        __lock_acquire+0x1520/0x2590
      <4>[  462.590207]        lock_acquire+0xd1/0x3d0
      <4>[  462.590213]        cpus_read_lock+0x39/0xc0
      <4>[  462.590219]        stop_machine+0x12/0x30
      <4>[  462.590226]        bxt_vtd_ggtt_insert_entries__BKL+0x36/0x50 [i915]
      <4>[  462.590601]        ggtt_bind_vma+0x5d/0x80 [i915]
      <4>[  462.590970]        i915_vma_bind+0xdc/0x1c0 [i915]
      <4>[  462.591374]        i915_vma_pin_ww+0x435/0xb40 [i915]
      <4>[  462.591779]        make_obj_busy+0xcb/0x330 [i915]
      <4>[  462.592170]        igt_mmap_offset_exhaustion+0x45f/0x4c0 [i915]
      <4>[  462.592562]        __i915_subtests.cold.7+0x42/0x92 [i915]
      <4>[  462.592995]        __run_selftests.part.3+0x10d/0x172 [i915]
      <4>[  462.593428]        i915_live_selftests.cold.5+0x1f/0x47 [i915]
      <4>[  462.593860]        i915_pci_probe+0x93/0x1d0 [i915]
      <4>[  462.594210]        pci_device_probe+0x9e/0x110
      <4>[  462.594217]        really_probe+0xea/0x410
      <4>[  462.594226]        driver_probe_device+0xd9/0x140
      <4>[  462.594233]        device_driver_attach+0x4a/0x50
      <4>[  462.594240]        __driver_attach+0x83/0x140
      <4>[  462.594247]        bus_for_each_dev+0x75/0xc0
      <4>[  462.594254]        bus_add_driver+0x14b/0x1f0
      <4>[  462.594260]        driver_register+0x66/0xb0
      <4>[  462.594267]        i915_init+0x70/0x87 [i915]
      <4>[  462.594586]        do_one_initcall+0x56/0x2e0
      <4>[  462.594592]        do_init_module+0x55/0x200
      <4>[  462.594599]        load_module+0x2703/0x2990
      <4>[  462.594605]        __do_sys_finit_module+0xad/0x110
      <4>[  462.594612]        do_syscall_64+0x33/0x80
      <4>[  462.594618]        entry_SYSCALL_64_after_hwframe+0x44/0xae
      <4>[  462.594625]
                        other info that might help us debug this:
      
      <4>[  462.594629] Chain exists of:
                          cpu_hotplug_lock --> fs_reclaim --> &vm->mutex/1
      
      <4>[  462.594645]  Possible unsafe locking scenario:
      
      <4>[  462.594648]        CPU0                    CPU1
      <4>[  462.594652]        ----                    ----
      <4>[  462.594655]   lock(&vm->mutex/1);
      <4>[  462.594664]                                lock(fs_reclaim);
      <4>[  462.594671]                                lock(&vm->mutex/1);
      <4>[  462.594679]   lock(cpu_hotplug_lock);
      <4>[  462.594686]
                         *** DEADLOCK ***
      
      <4>[  462.594690] 4 locks held by i915_selftest/5540:
      <4>[  462.594696]  #0: ffff888100fbc240 (&dev->mutex){....}-{3:3}, at: device_driver_attach+0x18/0x50
      <4>[  462.594715]  #1: ffffc900006cb9a0 (reservation_ww_class_acquire){+.+.}-{0:0}, at: make_obj_busy+0x81/0x330 [i915]
      <4>[  462.595118]  #2: ffff88812a6081e8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: make_obj_busy+0x21f/0x330 [i915]
      <4>[  462.595519]  #3: ffff888125369c70 (&vm->mutex/1){+.+.}-{3:3}, at: i915_vma_pin_ww+0x38e/0xb40 [i915]
      <4>[  462.595934]
                        stack backtrace:
      <4>[  462.595939] CPU: 0 PID: 5540 Comm: i915_selftest Tainted: G     U            5.12.0-rc5-CI-Trybot_7644+ #1
      <4>[  462.595947] Hardware name: GOOGLE Kefka/Kefka, BIOS MrChromebox 02/04/2018
      <4>[  462.595952] Call Trace:
      <4>[  462.595961]  dump_stack+0x7f/0xad
      <4>[  462.595974]  check_noncircular+0x12e/0x150
      <4>[  462.595982]  ? save_stack.isra.17+0x3f/0x70
      <4>[  462.595991]  ? drm_mm_insert_node_in_range+0x34a/0x5b0
      <4>[  462.596000]  ? i915_vma_pin_ww+0x9ec/0xb40 [i915]
      <4>[  462.596410]  __lock_acquire+0x1520/0x2590
      <4>[  462.596419]  ? do_init_module+0x55/0x200
      <4>[  462.596429]  lock_acquire+0xd1/0x3d0
      <4>[  462.596435]  ? stop_machine+0x12/0x30
      <4>[  462.596445]  ? gen8_ggtt_insert_entries+0xf0/0xf0 [i915]
      <4>[  462.596816]  cpus_read_lock+0x39/0xc0
      <4>[  462.596824]  ? stop_machine+0x12/0x30
      <4>[  462.596831]  stop_machine+0x12/0x30
      <4>[  462.596839]  bxt_vtd_ggtt_insert_entries__BKL+0x36/0x50 [i915]
      <4>[  462.597210]  ggtt_bind_vma+0x5d/0x80 [i915]
      <4>[  462.597580]  i915_vma_bind+0xdc/0x1c0 [i915]
      <4>[  462.597986]  i915_vma_pin_ww+0x435/0xb40 [i915]
      <4>[  462.598395]  ? make_obj_busy+0xcb/0x330 [i915]
      <4>[  462.598786]  make_obj_busy+0xcb/0x330 [i915]
      <4>[  462.599180]  ? 0xffffffff81000000
      <4>[  462.599187]  ? debug_mutex_unlock+0x50/0xa0
      <4>[  462.599198]  igt_mmap_offset_exhaustion+0x45f/0x4c0 [i915]
      <4>[  462.599592]  __i915_subtests.cold.7+0x42/0x92 [i915]
      <4>[  462.600026]  ? i915_perf_selftests+0x20/0x20 [i915]
      <4>[  462.600422]  ? __i915_nop_setup+0x10/0x10 [i915]
      <4>[  462.600820]  __run_selftests.part.3+0x10d/0x172 [i915]
      <4>[  462.601253]  i915_live_selftests.cold.5+0x1f/0x47 [i915]
      <4>[  462.601686]  i915_pci_probe+0x93/0x1d0 [i915]
      <4>[  462.602037]  ? _raw_spin_unlock_irqrestore+0x3d/0x60
      <4>[  462.602047]  pci_device_probe+0x9e/0x110
      <4>[  462.602057]  really_probe+0xea/0x410
      <4>[  462.602067]  driver_probe_device+0xd9/0x140
      <4>[  462.602075]  device_driver_attach+0x4a/0x50
      <4>[  462.602084]  __driver_attach+0x83/0x140
      <4>[  462.602091]  ? device_driver_attach+0x50/0x50
      <4>[  462.602099]  ? device_driver_attach+0x50/0x50
      <4>[  462.602107]  bus_for_each_dev+0x75/0xc0
      <4>[  462.602116]  bus_add_driver+0x14b/0x1f0
      <4>[  462.602124]  driver_register+0x66/0xb0
      <4>[  462.602133]  i915_init+0x70/0x87 [i915]
      <4>[  462.602453]  ? 0xffffffffa0606000
      <4>[  462.602458]  do_one_initcall+0x56/0x2e0
      <4>[  462.602466]  ? kmem_cache_alloc_trace+0x374/0x430
      <4>[  462.602476]  do_init_module+0x55/0x200
      <4>[  462.602484]  load_module+0x2703/0x2990
      <4>[  462.602500]  ? __do_sys_finit_module+0xad/0x110
      <4>[  462.602507]  __do_sys_finit_module+0xad/0x110
      <4>[  462.602519]  do_syscall_64+0x33/0x80
      <4>[  462.602527]  entry_SYSCALL_64_after_hwframe+0x44/0xae
      <4>[  462.602535] RIP: 0033:0x7fab69d8d89d
      
      Changes since v1:
      - Add lockdep annotations during init, to ensure that lockdep is primed.
        This also fixes a false positive when reading /proc/lockdep_stats
        during module reload.
      Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20210426102351.921874-1-maarten.lankhorst@linux.intel.com
      Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
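The title of this commit names the remedy: the shrinker uses a trylock instead of blocking on the lock. A minimal userspace sketch of that pattern follows, using a C11 `atomic_flag` as a stand-in for the mutex. `struct vm`, `vm_trylock` and `shrink_vm` are hypothetical names; this illustrates the general technique, not the driver's implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical address space: a lock plus a count of reclaimable objects. */
struct vm {
	atomic_flag locked;
	int bound_objects;
};

/* Returns true if the lock was taken, false if someone else holds it. */
static bool vm_trylock(struct vm *vm)
{
	return !atomic_flag_test_and_set(&vm->locked);
}

static void vm_unlock(struct vm *vm)
{
	atomic_flag_clear(&vm->locked);
}

/*
 * Shrinker path: reclaim up to nr_to_scan objects, but never wait for the
 * lock. If the lock is held, report that nothing was reclaimed.
 */
static int shrink_vm(struct vm *vm, int nr_to_scan)
{
	int freed;

	if (!vm_trylock(vm))
		return 0; /* contended: skip rather than block in reclaim */

	freed = nr_to_scan < vm->bound_objects ? nr_to_scan : vm->bound_objects;
	vm->bound_objects -= freed;
	vm_unlock(vm);
	return freed;
}
```

Because the shrinker never waits on the lock, reclaim cannot end up blocked behind a holder that is itself waiting for a memory allocation, which is the cycle the lockdep report describes.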
  24. 25 Mar, 2021 5 commits
  25. 24 Mar, 2021 1 commit
  26. 18 Mar, 2021 1 commit
  27. 09 Feb, 2021 1 commit
  28. 21 Jan, 2021 2 commits
  29. 15 Jan, 2021 1 commit
  30. 16 Dec, 2020 1 commit