1. 06 Jun 2021 (2 commits)
  2. 05 Jun 2021 (1 commit)
  3. 04 Jun 2021 (2 commits)
  4. 25 Nov 2020 (1 commit)
  5. 08 Oct 2020 (1 commit)
  6. 17 Sep 2020 (1 commit)
  7. 29 Jul 2020 (2 commits)
  8. 21 Jul 2020 (1 commit)
      dma-fence: prime lockdep annotations · d0b9a9ae
      Committed by Daniel Vetter
      Two in one go:
      - it is allowed to call dma_fence_wait() while holding a
        dma_resv_lock(). This is fundamental to how eviction works with ttm,
        so required.
      
      - it is allowed to call dma_fence_wait() from memory reclaim contexts,
        specifically from shrinker callbacks (which i915 does), and from mmu
        notifier callbacks (which amdgpu does, and which i915 sometimes also
        does, and probably always should, but that's kinda a debate). Also
        for stuff like HMM we really need to be able to do this, or things
        get real dicey.
      
      Consequence is that any critical path necessary to get to a
      dma_fence_signal for a fence must never a) call dma_resv_lock nor b)
      allocate memory with GFP_KERNEL. Also by implication of
      dma_resv_lock(), no userspace faulting allowed. Those are some supremely
      obnoxious limitations, which is why we need to sprinkle the right
      annotations over all relevant paths.
      
      The one big locking context we're leaving out here is mmu notifiers,
      added in
      
      commit 23b68395
      Author: Daniel Vetter <daniel.vetter@ffwll.ch>
      Date:   Mon Aug 26 22:14:21 2019 +0200
      
          mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end
      
      that one covers a lot of other callsites, and it's also allowed to
      wait on dma-fences from mmu notifiers. But there's no ready-made
      functions exposed to prime this, so I've left it out for now.
      
      v2: Also track against mmu notifier context.
      
      v3: kerneldoc to spec the cross-driver contract. Note that currently
      i915 throws in a hard-coded 10s timeout on foreign fences (not sure
      why that was done, but it's there), which is why that rule is worded
      with SHOULD instead of MUST.
      
      Also some of the mmu_notifier/shrinker rules might surprise SoC
      drivers, I haven't fully audited them all. Which is infeasible anyway,
      we'll need to run them with lockdep and dma-fence annotations and see
      what goes boom.
      
      v4: A spelling fix from Mika
      
      v5: #ifdef for CONFIG_MMU_NOTIFIER. Reported by 0day. Unfortunately
      this means lockdep enforcement is slightly inconsistent, it won't spot
      GFP_NOIO and GFP_NOFS allocations in the wrong spot if
      CONFIG_MMU_NOTIFIER is disabled in the kernel config. Oh well.
      
      v6: Note that only drivers/gpu has a reasonable (or at least
      historical) excuse to use dma_fence_wait() from shrinker and mmu
      notifier callbacks. Everyone else should either have a better memory
      manager model, or better hardware. This reflects discussions with
      Jason Gunthorpe.
      
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: Felix Kuehling <Felix.Kuehling@amd.com>
      Cc: kernel test robot <lkp@intel.com>
      Acked-by: Christian König <christian.koenig@amd.com>
      Acked-by: Dave Airlie <airlied@redhat.com>
      Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> (v4)
      Cc: Mika Kuoppala <mika.kuoppala@intel.com>
      Cc: Thomas Hellstrom <thomas.hellstrom@intel.com>
      Cc: linux-media@vger.kernel.org
      Cc: linaro-mm-sig@lists.linaro.org
      Cc: linux-rdma@vger.kernel.org
      Cc: amd-gfx@lists.freedesktop.org
      Cc: intel-gfx@lists.freedesktop.org
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Cc: Christian König <christian.koenig@amd.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20200707201229.472834-3-daniel.vetter@ffwll.ch
  9. 10 Jun 2020 (1 commit)
  10. 21 Nov 2019 (1 commit)
      dma-resv: Also prime acquire ctx for lockdep · fedf7a44
      Committed by Daniel Vetter
      Semantically it really doesn't matter where we grab the ticket. But
      since the ticket is a fake lockdep lock, it matters for lockdep
      validation purposes.
      
      This means stuff like grabbing a ticket and then doing
      copy_from/to_user isn't allowed anymore. This is a change compared to
      the current ttm fault handler, which doesn't bother with having a full
      reservation. Since I'm looking into fixing the TODO entry in
      ttm_mem_evict_wait_busy() I think that'll have to change sooner or
      later anyway, better get started. A bit more context on why I'm
      looking into this: For backwards compat with existing i915 gem code I
      think we'll have to do full slowpath locking in the i915 equivalent of
      the eviction code. And with dynamic dma-buf that will leak across
      drivers, so another thing we need to standardize and make sure it's
      done the same way everywhere.
      
      Unfortunately this means another full audit of all drivers:
      
      - gem helpers: acquire_init is done right before taking locks, so no
        problem. Same for acquire_fini and unlocking, which means nothing
        that's not already covered by the dma_resv_lock rules will be caught
        with this extension here to the acquire_ctx.
      
      - etnaviv: An absolute massive amount of code is run between the
        acquire_init and the first lock acquisition in submit_lock_objects.
        But nothing that would touch user memory and could cause a fault.
        Furthermore nothing that uses the ticket, so even if I missed
        something, it would be easy to fix by pushing the acquire_init right
        before the first use. Similar on the unlock/acquire_fini side.
      
      - i915: Right now (and this will likely change a lot rsn) the acquire
        ctx and actual locks are right next to each another. No problem.
      
      - msm has a problem: submit_create calls acquire_init, but then
        submit_lookup_objects() has a bunch of copy_from_user to do the
        object lookups. That's the only such thing before submit_lock_objects
        calls dma_resv_lock(). Despite all the copypasta shared with etnaviv,
        etnaviv does not have this issue since it copies all the userspace structs
        earlier. submit_cleanup does not have any such issues.
      
        With the prep patch to pull out the acquire_ctx and reorder it, msm
        is going to be safe too.
      
      - nouveau: acquire_init is right next to ttm_bo_reserve, so all good.
        Similar on the acquire_fini/ttm_bo_unreserve side.
      
      - ttm execbuf utils: acquire context and locking are even in the same
        functions here (one function to reserve everything, the other to
        unreserve), so all good.
      
      - vc4: Another case where acquire context and locking are handled in
        the same functions (one function to lock everything, the other to
        unlock).
      
      Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Christian König <christian.koenig@amd.com>
      Cc: Sumit Semwal <sumit.semwal@linaro.org>
      Cc: linux-media@vger.kernel.org
      Cc: linaro-mm-sig@lists.linaro.org
      Cc: Huang Rui <ray.huang@amd.com>
      Cc: Eric Anholt <eric@anholt.net>
      Cc: Ben Skeggs <bskeggs@redhat.com>
      Cc: Alex Deucher <alexander.deucher@amd.com>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Lucas Stach <l.stach@pengutronix.de>
      Cc: Russell King <linux+etnaviv@armlinux.org.uk>
      Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
      Cc: Rob Clark <robdclark@gmail.com>
      Cc: Sean Paul <sean@poorly.run>
      Acked-by: Christian König <christian.koenig@amd.com>
      Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20191119210844.16947-3-daniel.vetter@ffwll.ch
  11. 20 Nov 2019 (1 commit)
  12. 06 Nov 2019 (1 commit)
      dma_resv: prime lockdep annotations · b2a8116e
      Committed by Daniel Vetter
      Full audit of everyone:
      
      - i915, radeon, amdgpu should be clean per their maintainers.
      
      - vram helpers should be fine, they don't do command submission, so
        really no business holding struct_mutex while doing copy_*_user. But
        I haven't checked them all.
      
      - panfrost seems to dma_resv_lock only in panfrost_job_push, which
        looks clean.
      
      - v3d holds dma_resv locks in the tail of its v3d_submit_cl_ioctl(),
        copying from/to userspace happens all in v3d_lookup_bos which is
        outside of the critical section.
      
      - vmwgfx has a bunch of ioctls that do their own copy_*_user:
        - vmw_execbuf_process: First this does some copies in
          vmw_execbuf_cmdbuf() and also in the vmw_execbuf_process() itself.
          Then comes the usual ttm reserve/validate sequence, then actual
          submission/fencing, then unreserving, and finally some more
          copy_to_user in vmw_execbuf_copy_fence_user. Glossing over tons of
          details, but looks all safe.
        - vmw_fence_event_ioctl: No ttm_reserve/dma_resv_lock anywhere to be
          seen, seems to only create a fence and copy it out.
        - a pile of smaller ioctl in vmwgfx_ioctl.c, no reservations to be
          found there.
        Summary: vmwgfx seems to be fine too.
      
      - virtio: There's virtio_gpu_execbuffer_ioctl, which does all the
        copying from userspace before even looking up objects through their
        handles, so safe. Plus the getparam/getcaps ioctl, also both safe.
      
      - qxl only has qxl_execbuffer_ioctl, which calls into
        qxl_process_single_command. There's a lovely comment before the
        __copy_from_user_inatomic that the slowpath should be copied from
        i915, but I guess that never happened. Try not to be unlucky and get
        your CS data evicted between when it's written and the kernel tries
        to read it. The only other copy_from_user is for relocs, but those
        are done before qxl_release_reserve_list(), which seems to be the
        only thing reserving buffers (in the ttm/dma_resv sense) in that
        code. So looks safe.
      
      - A debugfs file in nouveau_debugfs_pstate_set() and the usif ioctl in
        usif_ioctl() look safe. nouveau_gem_ioctl_pushbuf() otoh breaks this
        everywhere and needs to be fixed up.
      
      v2: Thomas pointed out that vmwgfx calls dma_resv_init while it holds a
      dma_resv lock of a different object already. Christian mentioned that
      ttm core does this too for ghost objects. intel-gfx-ci highlighted
      that i915 has similar issues.
      
      Unfortunately we can't do this in the usual module init functions,
      because kernel threads don't have an ->mm - we have to wait around for
      some user thread to do this.
      
      Solution is to spawn a worker (but only once). It's horrible, but it
      works.
      
      v3: We can allocate mm! (Chris). Horrible worker hack out, clean
      initcall solution in.
      
      v4: Annotate with __init (Rob Herring)
      
      Cc: Rob Herring <robh@kernel.org>
      Cc: Alex Deucher <alexander.deucher@amd.com>
      Cc: Christian König <christian.koenig@amd.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Thomas Zimmermann <tzimmermann@suse.de>
      Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
      Cc: Eric Anholt <eric@anholt.net>
      Cc: Dave Airlie <airlied@redhat.com>
      Cc: Gerd Hoffmann <kraxel@redhat.com>
      Cc: Ben Skeggs <bskeggs@redhat.com>
      Cc: "VMware Graphics" <linux-graphics-maintainer@vmware.com>
      Cc: Thomas Hellstrom <thellstrom@vmware.com>
      Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20191104173801.2972-1-daniel.vetter@ffwll.ch
  13. 10 Oct 2019 (1 commit)
  14. 22 Sep 2019 (1 commit)
  15. 16 Aug 2019 (1 commit)
  16. 13 Aug 2019 (2 commits)
  17. 10 Aug 2019 (2 commits)
  18. 07 Aug 2019 (3 commits)
  19. 17 Jul 2019 (1 commit)
  20. 15 Jul 2019 (1 commit)
  21. 28 Jun 2019 (1 commit)
  22. 05 Jun 2019 (1 commit)
  23. 28 Feb 2019 (1 commit)
  24. 26 Oct 2018 (1 commit)
      dma-buf: Update reservation shared_count after adding the new fence · a590d0fd
      Committed by Chris Wilson
      We need to serialise the addition of a new fence into the shared list
      such that the fence is visible before we claim it is there. Otherwise a
      concurrent reader of the shared fence list will see an uninitialised
      fence slot before it is set.
      
        <4> [109.613162] general protection fault: 0000 [#1] PREEMPT SMP PTI
        <4> [109.613177] CPU: 1 PID: 1357 Comm: gem_busy Tainted: G     U            4.19.0-rc8-CI-CI_DRM_5035+ #1
        <4> [109.613189] Hardware name: Dell Inc. XPS 8300  /0Y2MRG, BIOS A06 10/17/2011
        <4> [109.613252] RIP: 0010:i915_gem_busy_ioctl+0x146/0x380 [i915]
        <4> [109.613261] Code: 0b 43 04 49 83 c6 08 4d 39 e6 89 43 04 74 6d 4d 8b 3e e8 5d 54 f4 e0 85 c0 74 0d 80 3d 08 71 1d 00 00
        0f 84 bb 00 00 00 31 c0 <49> 81 7f 08 20 3a 2c a0 75 cc 41 8b 97 50 02 00 00 49 8b 8f a8 00
        <4> [109.613283] RSP: 0018:ffffc9000044bcf8 EFLAGS: 00010246
        <4> [109.613292] RAX: 0000000000000000 RBX: ffffc9000044bdc0 RCX: 0000000000000001
        <4> [109.613302] RDX: 0000000000000000 RSI: 00000000ffffffff RDI: ffffffff822474a0
        <4> [109.613311] RBP: ffffc9000044bd28 R08: ffff88021e158680 R09: 0000000000000001
        <4> [109.613321] R10: 0000000000000040 R11: 0000000000000000 R12: ffff88021e1641b8
        <4> [109.613331] R13: 0000000000000003 R14: ffff88021e1641b0 R15: 6b6b6b6b6b6b6b6b
        <4> [109.613341] FS:  00007f9c9fc84980(0000) GS:ffff880227a40000(0000) knlGS:0000000000000000
        <4> [109.613352] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        <4> [109.613360] CR2: 00007f9c9fcb8000 CR3: 00000002247d4005 CR4: 00000000000606e0
      
      Fixes: 27836b64 ("dma-buf: remove shared fence staging in reservation object")
      Testcase: igt/gem_busy/close-race
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Christian König <christian.koenig@amd.com>
      Cc: Junwei Zhang <Jerry.Zhang@amd.com>
      Cc: Huang Rui <ray.huang@amd.com>
      Cc: Sumit Semwal <sumit.semwal@linaro.org>
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20181026080302.11507-1-chris@chris-wilson.co.uk
  25. 25 Oct 2018 (2 commits)
  26. 16 Jul 2018 (1 commit)
      dma-buf: Move BUG_ON from _add_shared_fence to _add_shared_inplace · 7f43ef9f
      Committed by Michel Dänzer
      Fixes the BUG_ON spuriously triggering under the following
      circumstances:
      
      * reservation_object_reserve_shared is called with shared_count ==
        shared_max - 1, so obj->staged is freed in preparation of an in-place
        update.
      
      * reservation_object_add_shared_fence is called with the first fence,
        after which shared_count == shared_max.
      
      * reservation_object_add_shared_fence is called with a follow-up fence
        from the same context.
      
      In the second reservation_object_add_shared_fence call, the BUG_ON
      triggers. However, nothing bad would happen in
      reservation_object_add_shared_inplace, since both fences are from the
      same context, so they only occupy a single slot.
      
      Prevent this by moving the BUG_ON to where an overflow would actually
      happen (e.g. if a buggy caller didn't call
      reservation_object_reserve_shared before).
      
      v2:
      * Fix description of breaking scenario (Christian König)
      * Add bugzilla reference
      
      Cc: stable@vger.kernel.org
      Bugzilla: https://bugs.freedesktop.org/106418
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> # v1
      Reviewed-by: Christian König <christian.koenig@amd.com> # v1
      Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
      Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180704151405.10357-1-michel@daenzer.net
  27. 03 Jul 2018 (1 commit)
      locking: Implement an algorithm choice for Wound-Wait mutexes · 08295b3b
      Committed by Thomas Hellstrom
      The current Wound-Wait mutex algorithm is actually not Wound-Wait but
      Wait-Die. Also implement Wound-Wait as a per-ww-class choice. Wound-Wait
      is, contrary to Wait-Die, a preemptive algorithm and is known to generate
      fewer backoffs. Testing reveals that this is true if the
      number of simultaneously contending transactions is small.
      As the number of simultaneously contending threads increases, Wound-Wait
      becomes inferior to Wait-Die in terms of elapsed time,
      possibly due to the larger number of locks held by sleeping transactions.
      
      Update documentation and callers.
      
      Timings using git://people.freedesktop.org/~thomash/ww_mutex_test
      tag patch-18-06-15
      
      Each thread runs 100000 batches of lock / unlock 800 ww mutexes randomly
      chosen out of 100000. Four core Intel x86_64:
      
      Algorithm    #threads       Rollbacks  time
      Wound-Wait   4              ~100       ~17s.
      Wait-Die     4              ~150000    ~19s.
      Wound-Wait   16             ~360000    ~109s.
      Wait-Die     16             ~450000    ~82s.
      
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Gustavo Padovan <gustavo@padovan.org>
      Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Cc: Sean Paul <seanpaul@chromium.org>
      Cc: David Airlie <airlied@linux.ie>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Kate Stewart <kstewart@linuxfoundation.org>
      Cc: Philippe Ombredanne <pombredanne@nexb.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: linux-doc@vger.kernel.org
      Cc: linux-media@vger.kernel.org
      Cc: linaro-mm-sig@lists.linaro.org
      Co-authored-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Ingo Molnar <mingo@kernel.org>
  28. 24 Jan 2018 (1 commit)
  29. 23 Jan 2018 (1 commit)
  30. 15 Nov 2017 (2 commits)
  31. 10 Nov 2017 (1 commit)