1. 10 1月, 2022 1 次提交
  2. 06 1月, 2022 1 次提交
  3. 21 12月, 2021 2 次提交
  4. 20 12月, 2021 1 次提交
  5. 18 12月, 2021 1 次提交
  6. 25 11月, 2021 2 次提交
  7. 22 10月, 2021 3 次提交
    • M
      drm/i915/ttm: move shrinker management into adjust_lru · ebd4a8ec
      Matthew Auld 提交于
      We currently just evict lmem objects to system memory when under memory
      pressure. For this case we might lack the usual object mm.pages, which
      effectively hides the pages from the i915-gem shrinker, until we
      actually "attach" the TT to the object, or in the case of lmem-only
      objects it just gets migrated back to lmem when touched again.
      
      For all cases we can just adjust the i915 shrinker LRU each time we also
      adjust the TTM LRU. The two cases we care about are:
      
        1) When something is moved by TTM, including when initially populating
           an object. Importantly this covers the case where TTM moves something from
           lmem <-> smem, outside of the normal get_pages() interface, which
           should still ensure the shmem pages underneath are reclaimable.
      
        2) When calling into i915_gem_object_unlock(). The unlock should
           ensure the object is removed from the shinker LRU, if it was indeed
           swapped out, or just purged, when the shrinker drops the object lock.
      
      v2(Thomas):
        - Handle managing the shrinker LRU in adjust_lru, where it is always
          safe to touch the object.
      v3(Thomas):
        - Pretty much a re-write. This time piggy back off the shrink_pin
          stuff, which actually seems to fit quite well for what we want here.
      v4(Thomas):
        - Just use a simple boolean for tracking ttm_shrinkable.
      v5:
        - Ensure we call adjust_lru when faulting the object, to ensure the
          pages are visible to the shrinker, if needed.
        - Add back the adjust_lru when in i915_ttm_move (Thomas)
      v6(Reported-by: kernel test robot <lkp@intel.com>):
        - Remove unused i915_tt
      Signed-off-by: NMatthew Auld <matthew.auld@intel.com>
      Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
      Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> #v4
      Link: https://patchwork.freedesktop.org/patch/msgid/20211018091055.1998191-6-matthew.auld@intel.com
      ebd4a8ec
    • M
      drm/i915: add some kernel-doc for shrink_pin and friends · e25d1ea4
      Matthew Auld 提交于
      Attempt to document shrink_pin and the other relevant interfaces that
      interact with it, before we start messing with it.
      Signed-off-by: NMatthew Auld <matthew.auld@intel.com>
      Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
      Reviewed-by: NThomas Hellström <thomas.hellstrom@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20211018091055.1998191-5-matthew.auld@intel.com
      e25d1ea4
    • M
      drm/i915/ttm: add tt shmem backend · 7ae03459
      Matthew Auld 提交于
      For cached objects we can allocate our pages directly in shmem. This
      should make it possible(in a later patch) to utilise the existing
      i915-gem shrinker code for such objects. For now this is still disabled.
      
      v2(Thomas):
        - Add optional try_to_writeback hook for objects. Importantly we need
          to check if the object is even still shrinkable; in between us
          dropping the shrinker LRU lock and acquiring the object lock it could for
          example have been moved. Also we need to differentiate between
          "lazy" shrinking and the immediate writeback mode. Also later we need to
          handle objects which don't even have mm.pages, so bundling this into
          put_pages() would require somehow handling that edge case, hence
          just letting the ttm backend handle everything in try_to_writeback
          doesn't seem too bad.
      v3(Thomas):
        - Likely a bad idea to touch the object from the unpopulate hook,
          since it's not possible to hold a reference, without also creating
          circular dependency, so likely this is too fragile. For now just
          ensure we at least mark the pages as dirty/accessed when called from the
          shrinker on WILLNEED objects.
        - s/try_to_writeback/shrinker_release_pages, since this can do more
          than just writeback.
        - Get rid of do_backup boolean and just set the SWAPPED flag prior to
          calling unpopulate.
        - Keep shmem_tt as lowest priority for the TTM LRU bo_swapout walk, since
          these just get skipped anyway. We can try to come up with something
          better later.
      v4(Thomas):
        - s/PCI_DMA/DMA/. Also drop NO_KERNEL_MAPPING and NO_WARN, which
          apparently doesn't do anything with streaming mappings.
        - Just pass along the error for ->truncate, and assume nothing.
      Signed-off-by: NMatthew Auld <matthew.auld@intel.com>
      Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
      Cc: Christian König <christian.koenig@amd.com>
      Cc: Oak Zeng <oak.zeng@intel.com>
      Reviewed-by: NThomas Hellström <thomas.hellstrom@linux.intel.com>
      Acked-by: NOak Zeng <oak.zeng@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20211018091055.1998191-2-matthew.auld@intel.com
      7ae03459
  8. 05 10月, 2021 1 次提交
  9. 29 9月, 2021 1 次提交
  10. 12 7月, 2021 1 次提交
  11. 29 4月, 2021 1 次提交
    • M
      drm/i915: Use trylock in shrinker for ggtt on bsw vt-d and bxt, v2. · bc6f80cc
      Maarten Lankhorst 提交于
      The stop_machine() lock may allocate memory, but is called inside
      vm->mutex, which is taken in the shrinker. This will cause a lockdep
      splat, as can be seen below:
      
      <4>[  462.585762] ======================================================
      <4>[  462.585768] WARNING: possible circular locking dependency detected
      <4>[  462.585773] 5.12.0-rc5-CI-Trybot_7644+ #1 Tainted: G     U
      <4>[  462.585779] ------------------------------------------------------
      <4>[  462.585783] i915_selftest/5540 is trying to acquire lock:
      <4>[  462.585788] ffffffff826440b0 (cpu_hotplug_lock){++++}-{0:0}, at: stop_machine+0x12/0x30
      <4>[  462.585814]
                        but task is already holding lock:
      <4>[  462.585818] ffff888125369c70 (&vm->mutex/1){+.+.}-{3:3}, at: i915_vma_pin_ww+0x38e/0xb40 [i915]
      <4>[  462.586301]
                        which lock already depends on the new lock.
      
      <4>[  462.586305]
                        the existing dependency chain (in reverse order) is:
      <4>[  462.586309]
                        -> #2 (&vm->mutex/1){+.+.}-{3:3}:
      <4>[  462.586323]        i915_gem_shrinker_taints_mutex+0x2d/0x50 [i915]
      <4>[  462.586719]        i915_address_space_init+0x12d/0x130 [i915]
      <4>[  462.587092]        ppgtt_init+0x4e/0x80 [i915]
      <4>[  462.587467]        gen8_ppgtt_create+0x3e/0x5c0 [i915]
      <4>[  462.587828]        i915_ppgtt_create+0x28/0xf0 [i915]
      <4>[  462.588203]        intel_gt_init+0x123/0x370 [i915]
      <4>[  462.588572]        i915_gem_init+0x129/0x1f0 [i915]
      <4>[  462.588971]        i915_driver_probe+0x753/0xd80 [i915]
      <4>[  462.589320]        i915_pci_probe+0x43/0x1d0 [i915]
      <4>[  462.589671]        pci_device_probe+0x9e/0x110
      <4>[  462.589680]        really_probe+0xea/0x410
      <4>[  462.589690]        driver_probe_device+0xd9/0x140
      <4>[  462.589697]        device_driver_attach+0x4a/0x50
      <4>[  462.589704]        __driver_attach+0x83/0x140
      <4>[  462.589711]        bus_for_each_dev+0x75/0xc0
      <4>[  462.589718]        bus_add_driver+0x14b/0x1f0
      <4>[  462.589724]        driver_register+0x66/0xb0
      <4>[  462.589731]        i915_init+0x70/0x87 [i915]
      <4>[  462.590053]        do_one_initcall+0x56/0x2e0
      <4>[  462.590061]        do_init_module+0x55/0x200
      <4>[  462.590068]        load_module+0x2703/0x2990
      <4>[  462.590074]        __do_sys_finit_module+0xad/0x110
      <4>[  462.590080]        do_syscall_64+0x33/0x80
      <4>[  462.590089]        entry_SYSCALL_64_after_hwframe+0x44/0xae
      <4>[  462.590096]
                        -> #1 (fs_reclaim){+.+.}-{0:0}:
      <4>[  462.590109]        fs_reclaim_acquire+0x9f/0xd0
      <4>[  462.590118]        kmem_cache_alloc_trace+0x3d/0x430
      <4>[  462.590126]        intel_cpuc_prepare+0x3b/0x1b0
      <4>[  462.590133]        cpuhp_invoke_callback+0x9e/0x890
      <4>[  462.590141]        _cpu_up+0xa4/0x130
      <4>[  462.590147]        cpu_up+0x82/0x90
      <4>[  462.590153]        bringup_nonboot_cpus+0x4a/0x60
      <4>[  462.590159]        smp_init+0x21/0x5c
      <4>[  462.590167]        kernel_init_freeable+0x8a/0x1b7
      <4>[  462.590175]        kernel_init+0x5/0xff
      <4>[  462.590181]        ret_from_fork+0x22/0x30
      <4>[  462.590187]
                        -> #0 (cpu_hotplug_lock){++++}-{0:0}:
      <4>[  462.590199]        __lock_acquire+0x1520/0x2590
      <4>[  462.590207]        lock_acquire+0xd1/0x3d0
      <4>[  462.590213]        cpus_read_lock+0x39/0xc0
      <4>[  462.590219]        stop_machine+0x12/0x30
      <4>[  462.590226]        bxt_vtd_ggtt_insert_entries__BKL+0x36/0x50 [i915]
      <4>[  462.590601]        ggtt_bind_vma+0x5d/0x80 [i915]
      <4>[  462.590970]        i915_vma_bind+0xdc/0x1c0 [i915]
      <4>[  462.591374]        i915_vma_pin_ww+0x435/0xb40 [i915]
      <4>[  462.591779]        make_obj_busy+0xcb/0x330 [i915]
      <4>[  462.592170]        igt_mmap_offset_exhaustion+0x45f/0x4c0 [i915]
      <4>[  462.592562]        __i915_subtests.cold.7+0x42/0x92 [i915]
      <4>[  462.592995]        __run_selftests.part.3+0x10d/0x172 [i915]
      <4>[  462.593428]        i915_live_selftests.cold.5+0x1f/0x47 [i915]
      <4>[  462.593860]        i915_pci_probe+0x93/0x1d0 [i915]
      <4>[  462.594210]        pci_device_probe+0x9e/0x110
      <4>[  462.594217]        really_probe+0xea/0x410
      <4>[  462.594226]        driver_probe_device+0xd9/0x140
      <4>[  462.594233]        device_driver_attach+0x4a/0x50
      <4>[  462.594240]        __driver_attach+0x83/0x140
      <4>[  462.594247]        bus_for_each_dev+0x75/0xc0
      <4>[  462.594254]        bus_add_driver+0x14b/0x1f0
      <4>[  462.594260]        driver_register+0x66/0xb0
      <4>[  462.594267]        i915_init+0x70/0x87 [i915]
      <4>[  462.594586]        do_one_initcall+0x56/0x2e0
      <4>[  462.594592]        do_init_module+0x55/0x200
      <4>[  462.594599]        load_module+0x2703/0x2990
      <4>[  462.594605]        __do_sys_finit_module+0xad/0x110
      <4>[  462.594612]        do_syscall_64+0x33/0x80
      <4>[  462.594618]        entry_SYSCALL_64_after_hwframe+0x44/0xae
      <4>[  462.594625]
                        other info that might help us debug this:
      
      <4>[  462.594629] Chain exists of:
                          cpu_hotplug_lock --> fs_reclaim --> &vm->mutex/1
      
      <4>[  462.594645]  Possible unsafe locking scenario:
      
      <4>[  462.594648]        CPU0                    CPU1
      <4>[  462.594652]        ----                    ----
      <4>[  462.594655]   lock(&vm->mutex/1);
      <4>[  462.594664]                                lock(fs_reclaim);
      <4>[  462.594671]                                lock(&vm->mutex/1);
      <4>[  462.594679]   lock(cpu_hotplug_lock);
      <4>[  462.594686]
                         *** DEADLOCK ***
      
      <4>[  462.594690] 4 locks held by i915_selftest/5540:
      <4>[  462.594696]  #0: ffff888100fbc240 (&dev->mutex){....}-{3:3}, at: device_driver_attach+0x18/0x50
      <4>[  462.594715]  #1: ffffc900006cb9a0 (reservation_ww_class_acquire){+.+.}-{0:0}, at: make_obj_busy+0x81/0x330 [i915]
      <4>[  462.595118]  #2: ffff88812a6081e8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: make_obj_busy+0x21f/0x330 [i915]
      <4>[  462.595519]  #3: ffff888125369c70 (&vm->mutex/1){+.+.}-{3:3}, at: i915_vma_pin_ww+0x38e/0xb40 [i915]
      <4>[  462.595934]
                        stack backtrace:
      <4>[  462.595939] CPU: 0 PID: 5540 Comm: i915_selftest Tainted: G     U            5.12.0-rc5-CI-Trybot_7644+ #1
      <4>[  462.595947] Hardware name: GOOGLE Kefka/Kefka, BIOS MrChromebox 02/04/2018
      <4>[  462.595952] Call Trace:
      <4>[  462.595961]  dump_stack+0x7f/0xad
      <4>[  462.595974]  check_noncircular+0x12e/0x150
      <4>[  462.595982]  ? save_stack.isra.17+0x3f/0x70
      <4>[  462.595991]  ? drm_mm_insert_node_in_range+0x34a/0x5b0
      <4>[  462.596000]  ? i915_vma_pin_ww+0x9ec/0xb40 [i915]
      <4>[  462.596410]  __lock_acquire+0x1520/0x2590
      <4>[  462.596419]  ? do_init_module+0x55/0x200
      <4>[  462.596429]  lock_acquire+0xd1/0x3d0
      <4>[  462.596435]  ? stop_machine+0x12/0x30
      <4>[  462.596445]  ? gen8_ggtt_insert_entries+0xf0/0xf0 [i915]
      <4>[  462.596816]  cpus_read_lock+0x39/0xc0
      <4>[  462.596824]  ? stop_machine+0x12/0x30
      <4>[  462.596831]  stop_machine+0x12/0x30
      <4>[  462.596839]  bxt_vtd_ggtt_insert_entries__BKL+0x36/0x50 [i915]
      <4>[  462.597210]  ggtt_bind_vma+0x5d/0x80 [i915]
      <4>[  462.597580]  i915_vma_bind+0xdc/0x1c0 [i915]
      <4>[  462.597986]  i915_vma_pin_ww+0x435/0xb40 [i915]
      <4>[  462.598395]  ? make_obj_busy+0xcb/0x330 [i915]
      <4>[  462.598786]  make_obj_busy+0xcb/0x330 [i915]
      <4>[  462.599180]  ? 0xffffffff81000000
      <4>[  462.599187]  ? debug_mutex_unlock+0x50/0xa0
      <4>[  462.599198]  igt_mmap_offset_exhaustion+0x45f/0x4c0 [i915]
      <4>[  462.599592]  __i915_subtests.cold.7+0x42/0x92 [i915]
      <4>[  462.600026]  ? i915_perf_selftests+0x20/0x20 [i915]
      <4>[  462.600422]  ? __i915_nop_setup+0x10/0x10 [i915]
      <4>[  462.600820]  __run_selftests.part.3+0x10d/0x172 [i915]
      <4>[  462.601253]  i915_live_selftests.cold.5+0x1f/0x47 [i915]
      <4>[  462.601686]  i915_pci_probe+0x93/0x1d0 [i915]
      <4>[  462.602037]  ? _raw_spin_unlock_irqrestore+0x3d/0x60
      <4>[  462.602047]  pci_device_probe+0x9e/0x110
      <4>[  462.602057]  really_probe+0xea/0x410
      <4>[  462.602067]  driver_probe_device+0xd9/0x140
      <4>[  462.602075]  device_driver_attach+0x4a/0x50
      <4>[  462.602084]  __driver_attach+0x83/0x140
      <4>[  462.602091]  ? device_driver_attach+0x50/0x50
      <4>[  462.602099]  ? device_driver_attach+0x50/0x50
      <4>[  462.602107]  bus_for_each_dev+0x75/0xc0
      <4>[  462.602116]  bus_add_driver+0x14b/0x1f0
      <4>[  462.602124]  driver_register+0x66/0xb0
      <4>[  462.602133]  i915_init+0x70/0x87 [i915]
      <4>[  462.602453]  ? 0xffffffffa0606000
      <4>[  462.602458]  do_one_initcall+0x56/0x2e0
      <4>[  462.602466]  ? kmem_cache_alloc_trace+0x374/0x430
      <4>[  462.602476]  do_init_module+0x55/0x200
      <4>[  462.602484]  load_module+0x2703/0x2990
      <4>[  462.602500]  ? __do_sys_finit_module+0xad/0x110
      <4>[  462.602507]  __do_sys_finit_module+0xad/0x110
      <4>[  462.602519]  do_syscall_64+0x33/0x80
      <4>[  462.602527]  entry_SYSCALL_64_after_hwframe+0x44/0xae
      <4>[  462.602535] RIP: 0033:0x7fab69d8d89d
      
      Changes since v1:
      - Add lockdep annotations during init, to ensure that lockdep is primed.
        This also fixes a false positive when reading /proc/lockdep_stats
        during module reload.
      Signed-off-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20210426102351.921874-1-maarten.lankhorst@linux.intel.comReviewed-by: NThomas Hellström <thomas.hellstrom@linux.intel.com>
      bc6f80cc
  12. 26 4月, 2021 1 次提交
  13. 23 4月, 2021 1 次提交
  14. 25 3月, 2021 2 次提交
  15. 24 12月, 2020 1 次提交
  16. 09 7月, 2020 1 次提交
    • C
      drm/i915/gem: Unpin idle contexts from kswapd reclaim · 09137e94
      Chris Wilson 提交于
      We removed retiring requests from the shrinker in order to decouple the
      mutexes from reclaim in preparation for unravelling the struct_mutex.
      The impact of not retiring is that we are much less agressive in making
      global objects available for shrinking, as such objects remain pinned
      until they are flushed by a heartbeat pulse following the last retired
      request along their timeline. In order to ensure that pulse occurs in
      time for memory reclamation, we should kick it from kswapd.
      
      The catch is that we have added some flush_work() into the retirement
      phase (to ensure that we reach a global idle in a timely manner), but
      these flush_work() are not eligible (i.e do not belong to WQ_MEM_RELCAIM)
      for use from inside kswapd. To avoid flushing those workqueues, we teach
      the retirer not to do so unless we are actually waiting, and only do the
      plain retire from inside the shrinker.
      
      Note that for execlists, we already retire completed contexts as they
      are scheduled out, so it should not be keeping global state
      unnecessarily pinned. The legacy ringbuffer however...
      
      References: 9e953980 ("drm/i915: Remove waiting & retiring from shrinker paths")
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20200708173748.32734-1-chris@chris-wilson.co.uk
      09137e94
  17. 03 7月, 2020 1 次提交
  18. 02 4月, 2020 1 次提交
  19. 27 2月, 2020 1 次提交
  20. 26 2月, 2020 1 次提交
    • C
      drm/i915: Avoid recursing onto active vma from the shrinker · 23873426
      Chris Wilson 提交于
      We mark the vma as active while binding it in order to protect outselves
      from being shrunk under mempressure. This only works if we are strict in
      not attempting to shrink active objects.
      
      <6> [472.618968] Workqueue: events_unbound fence_work [i915]
      <4> [472.618970] Call Trace:
      <4> [472.618974]  ? __schedule+0x2e5/0x810
      <4> [472.618978]  schedule+0x37/0xe0
      <4> [472.618982]  schedule_preempt_disabled+0xf/0x20
      <4> [472.618984]  __mutex_lock+0x281/0x9c0
      <4> [472.618987]  ? mark_held_locks+0x49/0x70
      <4> [472.618989]  ? _raw_spin_unlock_irqrestore+0x47/0x60
      <4> [472.619038]  ? i915_vma_unbind+0xae/0x110 [i915]
      <4> [472.619084]  ? i915_vma_unbind+0xae/0x110 [i915]
      <4> [472.619122]  i915_vma_unbind+0xae/0x110 [i915]
      <4> [472.619165]  i915_gem_object_unbind+0x1dc/0x400 [i915]
      <4> [472.619208]  i915_gem_shrink+0x328/0x660 [i915]
      <4> [472.619250]  ? i915_gem_shrink_all+0x38/0x60 [i915]
      <4> [472.619282]  i915_gem_shrink_all+0x38/0x60 [i915]
      <4> [472.619325]  vm_alloc_page.constprop.25+0x1aa/0x240 [i915]
      <4> [472.619330]  ? rcu_read_lock_sched_held+0x4d/0x80
      <4> [472.619363]  ? __alloc_pd+0xb/0x30 [i915]
      <4> [472.619366]  ? module_assert_mutex_or_preempt+0xf/0x30
      <4> [472.619368]  ? __module_address+0x23/0xe0
      <4> [472.619371]  ? is_module_address+0x26/0x40
      <4> [472.619374]  ? static_obj+0x34/0x50
      <4> [472.619376]  ? lockdep_init_map+0x4d/0x1e0
      <4> [472.619407]  setup_page_dma+0xd/0x90 [i915]
      <4> [472.619437]  alloc_pd+0x29/0x50 [i915]
      <4> [472.619470]  __gen8_ppgtt_alloc+0x443/0x6b0 [i915]
      <4> [472.619503]  gen8_ppgtt_alloc+0xd7/0x300 [i915]
      <4> [472.619535]  ppgtt_bind_vma+0x2a/0xe0 [i915]
      <4> [472.619577]  __vma_bind+0x26/0x40 [i915]
      <4> [472.619611]  fence_work+0x1c/0x90 [i915]
      <4> [472.619617]  process_one_work+0x26a/0x620
      
      Fixes: 2850748e ("drm/i915: Pull i915_vma_pin under the vm->mutex")
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20200221221818.2861432-1-chris@chris-wilson.co.uk
      (cherry picked from commit 6f24e410)
      Signed-off-by: NJani Nikula <jani.nikula@intel.com>
      23873426
  21. 22 2月, 2020 1 次提交
    • C
      drm/i915: Avoid recursing onto active vma from the shrinker · 6f24e410
      Chris Wilson 提交于
      We mark the vma as active while binding it in order to protect outselves
      from being shrunk under mempressure. This only works if we are strict in
      not attempting to shrink active objects.
      
      <6> [472.618968] Workqueue: events_unbound fence_work [i915]
      <4> [472.618970] Call Trace:
      <4> [472.618974]  ? __schedule+0x2e5/0x810
      <4> [472.618978]  schedule+0x37/0xe0
      <4> [472.618982]  schedule_preempt_disabled+0xf/0x20
      <4> [472.618984]  __mutex_lock+0x281/0x9c0
      <4> [472.618987]  ? mark_held_locks+0x49/0x70
      <4> [472.618989]  ? _raw_spin_unlock_irqrestore+0x47/0x60
      <4> [472.619038]  ? i915_vma_unbind+0xae/0x110 [i915]
      <4> [472.619084]  ? i915_vma_unbind+0xae/0x110 [i915]
      <4> [472.619122]  i915_vma_unbind+0xae/0x110 [i915]
      <4> [472.619165]  i915_gem_object_unbind+0x1dc/0x400 [i915]
      <4> [472.619208]  i915_gem_shrink+0x328/0x660 [i915]
      <4> [472.619250]  ? i915_gem_shrink_all+0x38/0x60 [i915]
      <4> [472.619282]  i915_gem_shrink_all+0x38/0x60 [i915]
      <4> [472.619325]  vm_alloc_page.constprop.25+0x1aa/0x240 [i915]
      <4> [472.619330]  ? rcu_read_lock_sched_held+0x4d/0x80
      <4> [472.619363]  ? __alloc_pd+0xb/0x30 [i915]
      <4> [472.619366]  ? module_assert_mutex_or_preempt+0xf/0x30
      <4> [472.619368]  ? __module_address+0x23/0xe0
      <4> [472.619371]  ? is_module_address+0x26/0x40
      <4> [472.619374]  ? static_obj+0x34/0x50
      <4> [472.619376]  ? lockdep_init_map+0x4d/0x1e0
      <4> [472.619407]  setup_page_dma+0xd/0x90 [i915]
      <4> [472.619437]  alloc_pd+0x29/0x50 [i915]
      <4> [472.619470]  __gen8_ppgtt_alloc+0x443/0x6b0 [i915]
      <4> [472.619503]  gen8_ppgtt_alloc+0xd7/0x300 [i915]
      <4> [472.619535]  ppgtt_bind_vma+0x2a/0xe0 [i915]
      <4> [472.619577]  __vma_bind+0x26/0x40 [i915]
      <4> [472.619611]  fence_work+0x1c/0x90 [i915]
      <4> [472.619617]  process_one_work+0x26a/0x620
      
      Fixes: 2850748e ("drm/i915: Pull i915_vma_pin under the vm->mutex")
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20200221221818.2861432-1-chris@chris-wilson.co.uk
      6f24e410
  22. 22 1月, 2020 1 次提交
  23. 07 11月, 2019 1 次提交
    • D
      drm/i915: Switch obj->mm.lock lockdep annotations on its head · f86dbacb
      Daniel Vetter 提交于
      The trouble with having a plain nesting flag for locks which do not
      naturally nest (unlike block devices and their partitions, which is
      the original motivation for nesting levels) is that lockdep will
      never spot a true deadlock if you screw up.
      
      This patch is an attempt at trying better, by highlighting a bit more
      of the actual nature of the nesting that's going on. Essentially we
      have two kinds of objects:
      
      - objects without pages allocated, which cannot be on any lru and are
        hence inaccessible to the shrinker.
      
      - objects which have pages allocated, which are on an lru, and which
        the shrinker can decide to throw out.
      
      For the former type of object, memory allocations while holding
      obj->mm.lock are permissible. For the latter they are not. And
      get/put_pages transitions between the two types of objects.
      
      This is still not entirely fool-proof since the rules might change.
      But as long as we run such a code ever at runtime lockdep should be
      able to observe the inconsistency and complain (like with any other
      lockdep class that we've split up in multiple classes). But there are
      a few clear benefits:
      
      - We can drop the nesting flag parameter from
        __i915_gem_object_put_pages, because that function by definition is
        never going allocate memory, and calling it on an object which
        doesn't have its pages allocated would be a bug.
      
      - We strictly catch more bugs, since there's not only one place in the
        entire tree which is annotated with the special class. All the
        other places that had explicit lockdep nesting annotations we're now
        going to leave up to lockdep again.
      
      - Specifically this catches stuff like calling get_pages from
        put_pages (which isn't really a good idea, if we can call get_pages
        so could the shrinker). I've seen patches do exactly that.
      
      Of course I fully expect CI will show me for the fool I am with this
      one here :-)
      
      v2: There can only be one (lockdep only has a cache for the first
      subclass, not for deeper ones, and we don't want to make these locks
      even slower). Still separate enums for better documentation.
      
      Real fix: don't forget about phys objs and pin_map(), and fix the
      shrinker to have the right annotations ... silly me.
      
      v3: Forgot usertptr too ...
      
      v4: Improve comment for pages_pin_count, drop the IMPORTANT comment
      and instead prime lockdep (Chris).
      
      v5: Appease checkpatch, no double empty lines (Chris)
      
      v6: More rebasing over selftest changes. Also somehow I forgot to
      push this patch :-/
      
      Also format comments consistently while at it.
      
      v7: Fix typo in commit message (Joonas)
      
      Also drop the priming, with the lmem merge we now have allocations
      while holding the lmem lock, which wreaks the generic priming I've
      done in earlier patches. Should probably be resurrected when lmem is
      fixed. See
      
      commit 232a6eba
      Author: Matthew Auld <matthew.auld@intel.com>
      Date:   Tue Oct 8 17:01:14 2019 +0100
      
          drm/i915: introduce intel_memory_region
      
      I'm keeping the priming patch locally so it wont get lost.
      
      Cc: Matthew Auld <matthew.auld@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: "Tang, CQ" <cq.tang@intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v5)
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v6)
      Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20191105090148.30269-1-daniel.vetter@ffwll.ch
      [mlankhorst: Fix commit typos pointed out by Michael Ruhl]
      f86dbacb
  24. 09 10月, 2019 1 次提交
    • Q
      locking/lockdep: Remove unused @nested argument from lock_release() · 5facae4f
      Qian Cai 提交于
      Since the following commit:
      
        b4adfe8e ("locking/lockdep: Remove unused argument in __lock_release")
      
      @nested is no longer used in lock_release(), so remove it from all
      lock_release() calls and friends.
      Signed-off-by: NQian Cai <cai@lca.pw>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: NWill Deacon <will@kernel.org>
      Acked-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: airlied@linux.ie
      Cc: akpm@linux-foundation.org
      Cc: alexander.levin@microsoft.com
      Cc: daniel@iogearbox.net
      Cc: davem@davemloft.net
      Cc: dri-devel@lists.freedesktop.org
      Cc: duyuyang@gmail.com
      Cc: gregkh@linuxfoundation.org
      Cc: hannes@cmpxchg.org
      Cc: intel-gfx@lists.freedesktop.org
      Cc: jack@suse.com
      Cc: jlbec@evilplan.or
      Cc: joonas.lahtinen@linux.intel.com
      Cc: joseph.qi@linux.alibaba.com
      Cc: jslaby@suse.com
      Cc: juri.lelli@redhat.com
      Cc: maarten.lankhorst@linux.intel.com
      Cc: mark@fasheh.com
      Cc: mhocko@kernel.org
      Cc: mripard@kernel.org
      Cc: ocfs2-devel@oss.oracle.com
      Cc: rodrigo.vivi@intel.com
      Cc: sean@poorly.run
      Cc: st@kernel.org
      Cc: tj@kernel.org
      Cc: tytso@mit.edu
      Cc: vdavydov.dev@gmail.com
      Cc: vincent.guittot@linaro.org
      Cc: viro@zeniv.linux.org.uk
      Link: https://lkml.kernel.org/r/1568909380-32199-1-git-send-email-cai@lca.pwSigned-off-by: NIngo Molnar <mingo@kernel.org>
      5facae4f
  25. 04 10月, 2019 1 次提交
    • C
      drm/i915: Pull i915_vma_pin under the vm->mutex · 2850748e
      Chris Wilson 提交于
      Replace the struct_mutex requirement for pinning the i915_vma with the
      local vm->mutex instead. Note that the vm->mutex is tainted by the
      shrinker (we require unbinding from inside fs-reclaim) and so we cannot
      allocate while holding that mutex. Instead we have to preallocate
      workers to do allocate and apply the PTE updates after we have we
      reserved their slot in the drm_mm (using fences to order the PTE writes
      with the GPU work and with later unbind).
      
      In adding the asynchronous vma binding, one subtle requirement is to
      avoid coupling the binding fence into the backing object->resv. That is
      the asynchronous binding only applies to the vma timeline itself and not
      to the pages as that is a more global timeline (the binding of one vma
      does not need to be ordered with another vma, nor does the implicit GEM
      fencing depend on a vma, only on writes to the backing store). Keeping
      the vma binding distinct from the backing store timelines is verified by
      a number of async gem_exec_fence and gem_exec_schedule tests. The way we
      do this is quite simple, we keep the fence for the vma binding separate
      and only wait on it as required, and never add it to the obj->resv
      itself.
      
      Another consequence in reducing the locking around the vma is the
      destruction of the vma is no longer globally serialised by struct_mutex.
      A natural solution would be to add a kref to i915_vma, but that requires
      decoupling the reference cycles, possibly by introducing a new
      i915_mm_pages object that is own by both obj->mm and vma->pages.
      However, we have not taken that route due to the overshadowing lmem/ttm
      discussions, and instead play a series of complicated games with
      trylocks to (hopefully) ensure that only one destruction path is called!
      
      v2: Add some commentary, and some helpers to reduce patch churn.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-4-chris@chris-wilson.co.uk
      2850748e
  26. 11 9月, 2019 1 次提交
  27. 03 9月, 2019 1 次提交
    • C
      drm/i915: Replace obj->pin_global with obj->frontbuffer · 5a90606d
      Chris Wilson 提交于
      obj->pin_global was originally used as a means to keep the shrinker off
      the active scanout, but we use the vma->pin_count itself for that and
      the obj->frontbuffer to delay shrinking active framebuffers. The other
      role that obj->pin_global gained was for spotting display objects inside
      GEM and working harder to keep those coherent; for which we can again
      simply inspect obj->frontbuffer directly.
      
      Coming up next, we will want to manipulate the pin_global counter
      outside of the principle locks, so would need to make pin_global atomic.
      However, since obj->frontbuffer is already managed atomically, it makes
      sense to use that the primary key for display objects instead of having
      pin_global.
      
      Ville pointed out the principle difference is that obj->frontbuffer is
      set for as long as an intel_framebuffer is attached to an object, but
      obj->pin_global was only raised for as long as the object was active. In
      practice, this means that we consider the object as being on the scanout
      for longer than is strictly required, causing us to be more proactive in
      flushing -- though it should be true that we would have flushed
      eventually when the back became the front, except that on the flip path
      that flush is async but when hit from another ioctl it will be
      synchronous.
      
      v2: i915_gem_object_is_framebuffer()
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190902040303.14195-5-chris@chris-wilson.co.uk
      5a90606d
  28. 06 8月, 2019 1 次提交
  29. 03 8月, 2019 1 次提交
    • C
      drm/i915: Hide unshrinkable context objects from the shrinker · 1aff1903
      Chris Wilson 提交于
      The shrinker cannot touch objects used by the contexts (logical state
      and ring). Currently we mark those as "pin_global" to let the shrinker
      skip over them, however, if we remove them from the shrinker lists
      entirely, we don't event have to include them in our shrink accounting.
      
      By keeping the unshrinkable objects in our shrinker tracking, we report
      a large number of objects available to be shrunk, and leave the shrinker
      deeply unsatisfied when we fail to reclaim those. The shrinker will
      persist in trying to reclaim the unavailable objects, forcing the system
      into a livelock (not even hitting the dread oomkiller).
      
      v2: Extend unshrinkable protection for perma-pinned scratch and guc
      allocations (Tvrtko)
      v3: Notice that we should be pinned when marking unshrinkable and so the
      link cannot be empty; merge duplicate paths.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190802212137.22207-1-chris@chris-wilson.co.uk
      1aff1903
  30. 03 7月, 2019 1 次提交
  31. 22 6月, 2019 2 次提交
  32. 19 6月, 2019 1 次提交
  33. 15 6月, 2019 1 次提交
    • C
      drm/i915: Keep contexts pinned until after the next kernel context switch · ce476c80
      Chris Wilson 提交于
      We need to keep the context image pinned in memory until after the GPU
      has finished writing into it. Since it continues to write as we signal
      the final breadcrumb, we need to keep it pinned until the request after
      it is complete. Currently we know the order in which requests execute on
      each engine, and so to remove that presumption we need to identify a
      request/context-switch we know must occur after our completion. Any
      request queued after the signal must imply a context switch, for
      simplicity we use a fresh request from the kernel context.
      
      The sequence of operations for keeping the context pinned until saved is:
      
       - On context activation, we preallocate a node for each physical engine
         the context may operate on. This is to avoid allocations during
         unpinning, which may be from inside FS_RECLAIM context (aka the
         shrinker)
      
       - On context deactivation on retirement of the last active request (which
         is before we know the context has been saved), we add the
         preallocated node onto a barrier list on each engine
      
       - On engine idling, we emit a switch to kernel context. When this
         switch completes, we know that all previous contexts must have been
         saved, and so on retiring this request we can finally unpin all the
         contexts that were marked as deactivated prior to the switch.
      
      We can enhance this in future by flushing all the idle contexts on a
      regular heartbeat pulse of a switch to kernel context, which will also
      be used to check for hung engines.
      
      v2: intel_context_active_acquire/_release
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190614164606.15633-1-chris@chris-wilson.co.uk
      ce476c80
  34. 14 6月, 2019 1 次提交