- 25 Mar, 2021: 2 commits
-
-
Committed by Maarten Lankhorst
Instead of doing what we do currently, which will never work with PROVE_LOCKING, do the same as AMD does, and something similar to the relocation slowpath. When all locks are dropped, we acquire the pages for pinning. When the locks are taken, we transfer those pages in .get_pages() to the bo. As a final check before installing the fences, we ensure that the mmu notifier was not called; if it was, we return -EAGAIN to userspace to signal it has to start over.
Changes since v1:
- Unbinding is done in submit_init only. submit_begin() removed.
- MMU_NOTFIER -> MMU_NOTIFIER
Changes since v2:
- Make i915->mm.notifier a spinlock.
Changes since v3:
- Add WARN_ON if there are any page references left, should have been 0.
- Return 0 on success in submit_init(), bug from spinlock conversion.
- Release pvec outside of notifier_lock (Thomas).
Changes since v4:
- Mention why we're clearing eb->[i + 1].vma in the code. (Thomas)
- Actually check all invalidations in eb_move_to_gpu. (Thomas)
- Do not wait when process is exiting to fix gem_ctx_persistence.userptr.
Changes since v5:
- Clarify why the check on PF_EXITING is (temporarily) required.
Changes since v6:
- Ensure userptr validity is checked in set_domain through a special path.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Acked-by: Dave Airlie <airlied@redhat.com>
[danvet: s/kfree/kvfree/ in i915_gem_object_userptr_drop_ref, which was requested in the previous review round but got lost. The other open questions around page refcount are imo better discussed in a separate series, with amdgpu folks involved.]
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-17-maarten.lankhorst@linux.intel.com
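For illustration, the revalidation scheme described above follows the usual mmu_interval_notifier pattern. This is only a sketch: the eb_work struct and the eb_*() helpers are placeholders, not the driver's actual functions; only the mmu_interval_* calls are real kernel API.

    /* Sketch: work->notifier is assumed to be a struct mmu_interval_notifier. */
    static int eb_userptr_submit(struct eb_work *work)
    {
        unsigned long seq;
        int err;

        /* Sample the notifier sequence and pin pages with no locks held. */
        seq = mmu_interval_read_begin(&work->notifier);
        err = eb_pin_userptr_pages(work);
        if (err)
            return err;

        eb_lock_and_bind(work);        /* dma-resv locks taken here */

        /* Final check before any fences are installed. */
        if (mmu_interval_read_retry(&work->notifier, seq)) {
            eb_unbind(work);
            return -EAGAIN;            /* userspace restarts the ioctl */
        }

        return eb_install_fences(work);
    }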
-
Committed by Maarten Lankhorst
Now that unsynchronized mappings are removed, the only time userptr works is when the MMU notifier is enabled. Put all of the userptr code behind an MMU notifier ifdef.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-16-maarten.lankhorst@linux.intel.com
-
- 24 Mar, 2021: 3 commits
-
-
Committed by Maarten Lankhorst
As soon as we install fences, we should stop allocating memory in order to prevent any potential deadlocks. This is required later on, when we start adding support for dma-fence annotations.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-11-maarten.lankhorst@linux.intel.com
-
Committed by Maarten Lankhorst
i915_vma_pin may fail with -EDEADLK when we start locking page tables, so ensure we handle this correctly.
Changes since v1:
- Drop the -EDEADLK todo, this commit handles it.
- Change eb_pin_vma from sort-of-bool + -EDEADLK to a proper int. (Matt)
Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-5-maarten.lankhorst@linux.intel.com
-
Committed by Maarten Lankhorst
We need to get rid of allocations in the cmd parser because it needs to be called from a signaling context, so first move all pinning to execbuf, where we already hold all locks. Allocate jump_whitelist in the execbuffer, and add annotations around intel_engine_cmd_parser(), to ensure we only call the command parser without allocating any memory or taking any locks we're not supposed to. Because i915_gem_object_get_page() may also allocate memory, add a path to i915_gem_object_get_sg() that prevents memory allocations, and walk the sg list manually. It should be similarly fast. This has the added benefit of being able to catch all memory allocation errors before the point of no return, and return -ENOMEM safely to the execbuf submitter.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-4-maarten.lankhorst@linux.intel.com
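A manual, allocation-free walk of the backing store looks roughly like the fragment below. The scatterlist iteration API is real; the function itself is only an illustration of the technique, not the driver's implementation.

    /* Sketch: copy a batch out of an object by iterating its sg_table directly. */
    static void copy_batch_pages(struct sg_table *pages, void *dst, size_t len)
    {
        struct sg_page_iter iter;
        size_t copied = 0;

        /* No radix-tree page cache lookup, hence nothing is allocated here. */
        for_each_sg_page(pages->sgl, &iter, pages->orig_nents, 0) {
            size_t n = min_t(size_t, len - copied, PAGE_SIZE);
            void *src = kmap_atomic(sg_page_iter_page(&iter));

            memcpy(dst + copied, src, n);
            kunmap_atomic(src);

            copied += n;
            if (copied == len)
                break;
        }
    }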
-
- 18 Mar, 2021: 2 commits
-
-
Committed by Jason Ekstrand
The Vulkan driver in Mesa for Intel hardware never uses relocations if it's running on a version of i915 that supports at least softpin, which all versions of i915 supporting Gen12 do. On the OpenGL side, Gen12+ is only supported by iris, which never uses relocations. The older i965 driver in Mesa does use relocations, but it only supports Intel hardware through Gen11 and has been deprecated for all hardware Gen9+. The compute driver also never uses relocations. This only leaves the media driver, which is supposed to be switching to softpin going forward. Making softpin a requirement for all future hardware seems reasonable.
There is one piece of hardware enabled by default in i915: RKL, which was enabled by e22fa6f0 and has not yet landed in drm-next, so this is almost, but not really, a userspace API change for RKL. If it becomes a problem, we can always add !IS_ROCKETLAKE(eb->i915) to the condition.
Rejecting relocations starting with the newer Gen12 platforms has the benefit that we don't have to bother supporting them on platforms with local memory. Given how much CPU touching of memory is required for relocations, not having to do so on platforms where not all memory is directly CPU-accessible carries significant advantages.
v2 (Jason Ekstrand):
- Allow TGL-LP platforms as they've already shipped
v3 (Jason Ekstrand):
- WARN_ON platforms with LMEM support in case the check is wrong
v4 (Jason Ekstrand):
- Call out Rocket Lake in the commit message
v5 (Jason Ekstrand):
- Drop the HAS_LMEM check as it's already covered by the version check
v6 (Jason Ekstrand):
- Move the check to eb_validate_vma() with all the other exec_object validation checks
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210317234014.2271006-3-jason@jlekstrand.net
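The resulting test is small; a hedged sketch of the kind of check added to eb_validate_vma() is shown below. The exact macro spelling in the tree of that era may differ.

    /* Illustrative: refuse relocations on Gen12+ parts other than TGL-LP. */
    if (entry->relocation_count &&
        INTEL_GEN(eb->i915) >= 12 && !IS_TIGERLAKE(eb->i915))
        return -EINVAL;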
-
Committed by Jason Ekstrand
libdrm has supported the newer execbuffer2 ioctl, and has used it by default when available, since libdrm commit b50964027bef, which landed on Mar 2, 2010. The i915 and i965 drivers in Mesa at the time both used libdrm, and so did the Intel X11 back-end. The SNA back-end for X11 has always used execbuffer2.
v2 (Jason Ekstrand):
- Add a comment saying what Linux version it's being removed in.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Keith Packard <keithp@keithp.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210317234014.2271006-2-jason@jlekstrand.net
-
- 20 Jan, 2021: 1 commit
-
-
Committed by Matthew Auld
In a few places we always end up mapping the pool object with the FORCE constraint (to prevent hitting -EBUSY), which will destroy the cached mapping if it has a different type. As a simple first step, make the mapping type part of the pool interface, where the behaviour is to only give out pool objects which match the requested mapping type.
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20210119133106.66294-4-matthew.auld@intel.com
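From a caller's point of view the reworked interface looks roughly like the fragment below; the exact signature and field names in the driver may differ, this only sketches the idea that the mapping type is requested up front.

    /* Ask the pool for an object whose cached mapping already matches. */
    pool = intel_gt_get_buffer_pool(gt, size, I915_MAP_WC);
    if (IS_ERR(pool))
        return PTR_ERR(pool);

    /* No FORCE needed: the mapping type is guaranteed by the pool. */
    cmd = i915_gem_object_pin_map(pool->obj, I915_MAP_WC);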
-
- 09 Jan, 2021: 1 commit
-
-
Committed by Chris Wilson
If a request is submitted and known to require no preemption, disable arbitration around the batch, which prevents the HW from handling a preemption request during the payload.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210108204026.20682-7-chris@chris-wilson.co.uk
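In ring-emission terms this brackets the payload with arbitration off/on. A rough sketch using the standard MI_ARB_ON_OFF encoding follows; the helper name and its exact placement in the driver are illustrative.

    static int emit_arb_off(struct i915_request *rq)
    {
        u32 *cs;

        cs = intel_ring_begin(rq, 2);
        if (IS_ERR(cs))
            return PTR_ERR(cs);

        /* Keep the HW from acting on preemption requests inside the batch. */
        *cs++ = MI_ARB_ON_OFF | MI_ARB_DISABLE;
        *cs++ = MI_NOOP;
        intel_ring_advance(rq, cs);

        /* MI_ARB_ON_OFF | MI_ARB_ENABLE is emitted again after the batch. */
        return 0;
    }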
-
- 05 Jan, 2021: 1 commit
-
-
Committed by Matthew Auld
The reloc batch is short-lived but can exist in the user-visible ppGTT, and since it's backed by an internal object, which lacks page clearing, we should take care to clear it upfront.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201224151358.401345-2-matthew.auld@intel.com
Cc: stable@vger.kernel.org
(cherry picked from commit 26ebc511)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
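The clearing itself amounts to zeroing the freshly pinned map before any commands are written; a small fragment along these lines, with flag and variable spelling illustrative rather than the exact hunk:

    u32 *cmd;

    cmd = i915_gem_object_pin_map(pool->obj, I915_MAP_FORCE_WB);
    if (IS_ERR(cmd))
        return PTR_ERR(cmd);

    /* Internal objects are not zeroed on allocation, so do it here. */
    memset32(cmd, 0, pool->obj->base.size / sizeof(u32));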
-
- 03 Jan, 2021: 1 commit
-
-
Committed by Arnd Bergmann
Randconfig builds on 32-bit machines show lots of warnings for the i915 driver for passing a 32-bit value into __const_hweight64():

    drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:2584:9: error: shift count >= width of type [-Werror,-Wshift-count-overflow]
            return hweight64(VDBOX_MASK(&i915->gt));
                   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    include/asm-generic/bitops/const_hweight.h:29:49: note: expanded from macro 'hweight64'
    #define hweight64(w) (__builtin_constant_p(w) ? __const_hweight64(w) : __arch_hweight64(w))

Change it to hweight_long() to avoid the warning.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20210103135158.3591442-1-arnd@kernel.org
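The change is a one-liner: VDBOX_MASK() evaluates to an unsigned long, so the 64-bit helper trips the shift-count warning when unsigned long is only 32 bits wide.

    /* before: warns on 32-bit builds */
    return hweight64(VDBOX_MASK(&i915->gt));

    /* after: matches the width of the mask on every architecture */
    return hweight_long(VDBOX_MASK(&i915->gt));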
-
- 24 Dec, 2020: 1 commit
-
-
Committed by Matthew Auld
The reloc batch is short-lived but can exist in the user-visible ppGTT, and since it's backed by an internal object, which lacks page clearing, we should take care to clear it upfront.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201224151358.401345-2-matthew.auld@intel.com
Cc: stable@vger.kernel.org
-
- 18 Dec, 2020: 1 commit
-
-
Committed by Chris Wilson
When inserting a VMA, we restrict the placement to the low 4G unless the caller opts into using the full range. This was done to allow userspace the opportunity to transition slowly from a 32b address space, and to avoid breaking inherent 32b assumptions of some commands. However, for insert we limited ourselves to 4G-4K, while on verification we allowed the full 4G. This causes some attempts to bind a new buffer to sporadically fail with -ENOSPC, but at other times be bound successfully. Commit 48ea1e32 ("drm/i915/gen9: Set PIN_ZONE_4G end to 4GB - 1 page") suggests that there is a genuine problem with stateless addressing that cannot utilize the last page in 4G, and so we purposefully excluded it. This means that the quick pin pass may cause us to utilize a buggy placement.
Reported-by: CQ Tang <cq.tang@intel.com>
Testcase: igt/gem_exec_params/larger-than-life-batch
Fixes: 48ea1e32 ("drm/i915/gen9: Set PIN_ZONE_4G end to 4GB - 1 page")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: CQ Tang <cq.tang@intel.com>
Reviewed-by: CQ Tang <cq.tang@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Cc: <stable@vger.kernel.org> # v4.5+
Link: https://patchwork.freedesktop.org/patch/msgid/20201216092951.7124-1-chris@chris-wilson.co.uk
(cherry picked from commit 5f22cc0b)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
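In other words, the verification bound has to match the insertion bound. A hedged sketch of the consistency fix is below; the macro and variable names are illustrative, not the literal hunk.

    /* One page short of 4GiB, matching what insertion already enforces. */
    #define ZONE_4G_END ((1ULL << 32) - I915_GTT_PAGE_SIZE)

    /* Insertion limit ... */
    if (flags & PIN_ZONE_4G)
        end = min_t(u64, end, ZONE_4G_END);

    /* ... and the quick verification pass must use the same limit. */
    if (vma->node.start + vma->node.size > ZONE_4G_END)
        return false;    /* misplaced: rebind into a valid spot */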
-
- 16 Dec, 2020: 2 commits
-
-
Committed by Chris Wilson
Reduce the pollution of intel_engine.h by moving gen8_emit_pipe_control and friends to gen8_engine_cs.h.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201216135452.6063-1-chris@chris-wilson.co.uk
-
Committed by Chris Wilson
When inserting a VMA, we restrict the placement to the low 4G unless the caller opts into using the full range. This was done to allow userspace the opportunity to transition slowly from a 32b address space, and to avoid breaking inherent 32b assumptions of some commands. However, for insert we limited ourselves to 4G-4K, while on verification we allowed the full 4G. This causes some attempts to bind a new buffer to sporadically fail with -ENOSPC, but at other times be bound successfully. Commit 48ea1e32 ("drm/i915/gen9: Set PIN_ZONE_4G end to 4GB - 1 page") suggests that there is a genuine problem with stateless addressing that cannot utilize the last page in 4G, and so we purposefully excluded it. This means that the quick pin pass may cause us to utilize a buggy placement.
Reported-by: CQ Tang <cq.tang@intel.com>
Testcase: igt/gem_exec_params/larger-than-life-batch
Fixes: 48ea1e32 ("drm/i915/gen9: Set PIN_ZONE_4G end to 4GB - 1 page")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: CQ Tang <cq.tang@intel.com>
Reviewed-by: CQ Tang <cq.tang@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Cc: <stable@vger.kernel.org> # v4.5+
Link: https://patchwork.freedesktop.org/patch/msgid/20201216092951.7124-1-chris@chris-wilson.co.uk
-
- 08 Dec, 2020: 2 commits
-
-
Committed by Chris Wilson
In the course of discovering and closing many races with context closure and execbuf submission, since commit 61231f6b ("drm/i915/gem: Check that the context wasn't closed during setup") we started checking that the context was not closed by another userspace thread during the execbuf ioctl. In doing so we cancelled the in-flight request (by telling it to be skipped), but kept reporting success since we do submit a request, albeit one that doesn't execute. As the error is known before we return from the ioctl, we can report the error we detect immediately, rather than leave it on the fence status. With the immediate propagation of the error, it is easier for userspace to handle.
Fixes: 61231f6b ("drm/i915/gem: Check that the context wasn't closed during setup")
Testcase: igt/gem_ctx_exec/basic-close-race
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v5.7+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201203103432.31526-1-chris@chris-wilson.co.uk
(cherry picked from commit ba38b79e)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Committed by Chris Wilson
Closed vma are protected by the GT wakeref held as we look up the vma, so we know that the vma will not be freed as we process it for the execbuf. Instead we expect to catch the closed status of the context, and simply allow the close-race on an individual vma to be washed away. Longer term, the GT wakeref protection will be removed by explicit vma.kref tracking.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2245
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201207193824.18114-1-chris@chris-wilson.co.uk
-
- 04 Dec, 2020: 1 commit
-
-
Committed by Chris Wilson
In the course of discovering and closing many races with context closure and execbuf submission, since commit 61231f6b ("drm/i915/gem: Check that the context wasn't closed during setup") we started checking that the context was not closed by another userspace thread during the execbuf ioctl. In doing so we cancelled the in-flight request (by telling it to be skipped), but kept reporting success since we do submit a request, albeit one that doesn't execute. As the error is known before we return from the ioctl, we can report the error we detect immediately, rather than leave it on the fence status. With the immediate propagation of the error, it is easier for userspace to handle.
Fixes: 61231f6b ("drm/i915/gem: Check that the context wasn't closed during setup")
Testcase: igt/gem_ctx_exec/basic-close-race
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v5.7+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201203103432.31526-1-chris@chris-wilson.co.uk
-
- 20 Oct, 2020: 1 commit
-
-
Committed by Chris Wilson
Matthew Auld noted that on more recent systems (such as the parser for gen9) we may have objects that are larger than expected by the GEM uAPI (i.e. greater than u32). These objects would have incorrect implicit batch lengths, causing the parser to reject them for being incomplete, or worse. Based on a patch by Matthew Auld.
Reported-by: Matthew Auld <matthew.auld@intel.com>
Fixes: 435e8fc0 ("drm/i915: Allow parsing of unsized batches")
Testcase: igt/gem_exec_params/larger-than-life-batch
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Cc: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20201015115954.871-1-chris@chris-wilson.co.uk
(cherry picked from commit 57b2d834)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
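The essence of the fix is keeping the implicit length computation in 64 bits; a sketch with approximate field names (the real execbuf struct members may differ):

    /* A u32 here would silently truncate for objects larger than 4GiB. */
    u64 batch_length = eb->batch_len;

    if (!batch_length)    /* implicit length: the remainder of the object */
        batch_length = eb->batch->vma->size - eb->batch_start_offset;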
-
- 16 Oct, 2020: 1 commit
-
-
Committed by Chris Wilson
Matthew Auld noted that on more recent systems (such as the parser for gen9) we may have objects that are larger than expected by the GEM uAPI (i.e. greater than u32). These objects would have incorrect implicit batch lengths, causing the parser to reject them for being incomplete, or worse. Based on a patch by Matthew Auld.
Reported-by: Matthew Auld <matthew.auld@intel.com>
Fixes: 435e8fc0 ("drm/i915: Allow parsing of unsized batches")
Testcase: igt/gem_exec_params/larger-than-life-batch
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Cc: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20201015115954.871-1-chris@chris-wilson.co.uk
-
- 01 Oct, 2020: 1 commit
-
-
Committed by Chris Wilson
Be consistent and use unsigned long throughout the chunk copies to avoid the inherent clumsiness of mixing integer types of different widths and signs. Failing to take account of a wider unsigned type when using min_t can lead to treating it as negative, only for it to flip back to a large unsigned value after passing a boundary check.
Fixes: ed13033f ("drm/i915/cmdparser: Only cache the dst vmap")
Testcase: igt/gen9_exec_parse/bb-large
Reported-by: "Candelaria, Jared" <jared.candelaria@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: "Candelaria, Jared" <jared.candelaria@intel.com>
Cc: "Bloomfield, Jon" <jon.bloomfield@intel.com>
Cc: <stable@vger.kernel.org> # v4.9+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200928215942.31917-1-chris@chris-wilson.co.uk
(cherry picked from commit b7eeb2b4)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
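The hazard is easy to see in isolation; keeping the chunked copy entirely in unsigned long avoids the sign flip. This is an illustrative fragment of the pattern, not the driver's actual loop; copy_chunk() is a placeholder.

    unsigned long offset = 0, remain = length;    /* length can exceed 4GiB */

    while (remain) {
        /*
         * min_t(int, ...) would truncate a large 'remain' and could be
         * treated as negative; unsigned long keeps the full width of
         * both operands in the comparison.
         */
        unsigned long chunk = min_t(unsigned long, remain, SZ_4M);

        copy_chunk(dst + offset, src + offset, chunk);    /* placeholder */
        offset += chunk;
        remain -= chunk;
    }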
-
- 29 Sep, 2020: 1 commit
-
-
Committed by Chris Wilson
Be consistent and use unsigned long throughout the chunk copies to avoid the inherent clumsiness of mixing integer types of different widths and signs. Failing to take account of a wider unsigned type when using min_t can lead to treating it as negative, only for it to flip back to a large unsigned value after passing a boundary check.
Fixes: ed13033f ("drm/i915/cmdparser: Only cache the dst vmap")
Testcase: igt/gen9_exec_parse/bb-large
Reported-by: "Candelaria, Jared" <jared.candelaria@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: "Candelaria, Jared" <jared.candelaria@intel.com>
Cc: "Bloomfield, Jon" <jon.bloomfield@intel.com>
Cc: <stable@vger.kernel.org> # v4.9+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200928215942.31917-1-chris@chris-wilson.co.uk
-
- 10 Sep, 2020: 1 commit
-
-
Committed by Maarten Lankhorst
This function should be an int, not a bool. Presumably because we had the same 2 reverts in a slightly different way, git got confused. Thanks to Dan for reporting. :)
The conflict is between the 3 reverts in drm-fixes:
4993a8a3 ("Revert "drm/i915: Remove i915_gem_object_get_dirty_page()"")
ad5d95e4 ("Revert "drm/i915/gem: Async GPU relocations only"")
20561da3 ("Revert "drm/i915/gem: Delete unused code"")
and the slightly different combined revert in drm-intel-gt-next, with the same goal:
102a0a90 ("Revert "drm/i915/gem: Async GPU relocations only"")
In the merge commit 1f4b2aca ("Merge tag 'drm-intel-gt-next-2020-09-07' of git://anongit.freedesktop.org/drm/drm-intel into drm-next") things went wrong, but the merge commit view now doesn't show any conflict anymore (as git tends to do when the resolution picks one or the other branch). The need to handle other than just true/false error codes in __reloc_entry_gpu was added in the dma_resv locking changes in c43ce123 ("drm/i915: Use per object locking in execbuf, v12.").
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Dave Airlie <airlied@redhat.com>
[danvet: Explain this entire saga a lot better, adding tons of commit references. Also note that this was merged before full intel-gfx-CI results, only after BAT, since the breakage at the BAT run is already severe enough to block all pre-merge testing.]
Fixes: 1f4b2aca ("Merge tag 'drm-intel-gt-next-2020-09-07' of git://anongit.freedesktop.org/drm/drm-intel into drm-next")
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20200910111225.2184193-1-maarten.lankhorst@linux.intel.com
-
- 08 Sep, 2020: 2 commits
-
-
Committed by Dave Airlie
These commits caused a regression on a Lenovo T520 Sandybridge machine belonging to the reporter. We are reverting them for 5.10 for other reasons, so just do it for 5.9 as well.
This reverts commit 7ac2d253.
Reported-by: Harald Arnesen <harald@skogtun.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
-
Committed by Dave Airlie
These commits caused a regression on a Lenovo T520 Sandybridge machine belonging to the reporter. We are reverting them for 5.10 for other reasons, so just do it for 5.9 as well.
This reverts commit 9e0f9464.
Reported-by: Harald Arnesen <harald@skogtun.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
-
- 07 Sep, 2020: 14 commits
-
-
Committed by Maarten Lankhorst
As a preparation step for full object locking and wait/wound handling during pin and object mapping, ensure that we always pass the ww context in i915_gem_execbuffer.c to i915_vma_pin, and use lockdep to ensure this happens. This also requires changing the order of eb_parse slightly, to ensure we pass ww at a point where we could still handle -EDEADLK safely.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-15-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
Committed by Maarten Lankhorst
We want to lock all gem objects, including the engine context objects, so rework the throttling to ensure that we can do this. Now we only throttle once, but can take eb_pin_engine while acquiring objects. This means we will have to drop the lock to wait. If we don't have to throttle we can still take the fastpath; if not, we will take the slowpath and wait for the throttle request while unlocked. The engine has to be pinned as the first step, otherwise gpu relocations won't work.
Changes since v1:
- Only need to get a throttled request in the fastpath, no need for a global flag any more.
- Always free the waited request correctly.
Changes since v2:
- Use intel_engine_pm_get()/put() to keep the engine pool alive during EDEADLK handling.
Changes since v3:
- Fix small rq leak.
Changes since v4:
- Use a single reloc_context, for intel_context_pin_ww().
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-13-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
Committed by Maarten Lankhorst
Those arguments are already set as eb.file and eb.args, so kill off the extra arguments. This will allow us to move eb_pin_engine() to after we have reserved all BOs.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-12-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
Committed by Maarten Lankhorst
Now that we have changed execbuf submission slightly to allow us to do all pinning in one place, we can simply add ww versions on top of struct_mutex. All we have to do is add a separate path for -EDEADLK handling, which needs to unpin all gem bo's before dropping the lock, then start over. This finally allows us to do parallel submission, but because not all of the pinning code uses the ww ctx yet, we cannot completely drop struct_mutex yet.
Changes since v1:
- Keep struct_mutex for now. :(
Changes since v2:
- Make sure we always lock the ww context in the slowpath.
Changes since v3:
- Don't call __eb_unreserve_vma in eb_move_to_gpu now; this can be done on the normal unlock path.
- Unconditionally release vmas and context.
Changes since v4:
- Rebased on top of the struct_mutex reduction.
Changes since v5:
- Remove training wheels.
Changes since v6:
- Fix accidentally broken -ENOSPC handling.
Changes since v7:
- Handle the gt buffer pool better.
Changes since v8:
- Properly clear variables, to make -EDEADLK handling not BUG.
Changes since v9:
- Fix unpinning the fence on pnv and below.
Changes since v10:
- Make relocation gpu chaining work again.
Changes since v11:
- Remove relocation chaining, pain to make it work.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-9-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
Committed by Maarten Lankhorst
We want to introduce backoff logic, but we need to lock the pool object as well for command parsing. Because of this, we will need backoff logic for the engine pool obj too. Move the batch validation up slightly to eb_lookup_vmas, and do the actual command parsing in a separate function which can be called from both the execbuf relocation fastpath and slowpath.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-8-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
Committed by Maarten Lankhorst
Execbuffer submission will perform its own WW locking, and we cannot rely on the implicit lock there. This also makes it clear that the GVT code will get a lockdep splat when multiple batchbuffer shadows need to be performed in the same instance, so fix that up.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-7-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
Committed by Maarten Lankhorst
i915_gem_ww_ctx is used to lock all gem bo's for pinning and memory eviction. We don't use it yet, but let's start by adding the definition first. To use it, we have to pass a non-NULL ww to gem_object_lock, and must not unlock directly; that is done in i915_gem_ww_ctx_fini.
Changes since v1:
- Change ww_ctx and obj order in locking functions (Jonas Lahtinen)
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-6-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
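A minimal sketch of the usage contract this series introduces, for a single object: lock under a ww context, never unlock by hand, back off and retry on -EDEADLK, and let i915_gem_ww_ctx_fini() drop everything. The wrapper function name is a placeholder; the i915_gem_* calls are the API added here.

    static int lock_obj_under_ww(struct drm_i915_gem_object *obj)
    {
        struct i915_gem_ww_ctx ww;
        int err;

        i915_gem_ww_ctx_init(&ww, true);    /* true = interruptible waits */
    retry:
        err = i915_gem_object_lock(obj, &ww);
        if (!err) {
            /* obj is locked here: pin, map, bind, ... */
        }
        if (err == -EDEADLK) {
            err = i915_gem_ww_ctx_backoff(&ww);
            if (!err)
                goto retry;
        }
        i915_gem_ww_ctx_fini(&ww);    /* drops every lock taken under ww */
        return err;
    }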
-
Committed by Maarten Lankhorst
This reverts commit 0f1dd022 ("drm/i915/gem: Split eb_vma into its own allocation") and also moves all unreserving to a single place at the end, which is a minor simplification. With the WW locking, we will drop all references only at the end when unlocking, so refcounting can now be removed.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-5-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
Committed by Maarten Lankhorst
This reverts commit 7dc8f114 ("drm/i915/gem: Drop relocation slowpath"). We need the slowpath relocation for taking the ww-mutex inside the page fault handler, and we will take this mutex when pinning all objects. We also functionally revert ef398881 ("drm/i915/gem: Limit struct_mutex to eb_reserve"), as we need the struct_mutex in the slowpath as well, and a tiny part of 003d8b91 ("drm/i915/gem: Only call eb_lookup_vma once during execbuf ioctl"). Specifically, we make the -EAGAIN handling part of the fallback to the slowpath again. With this, we have a properly working slowpath again, which will allow us to do fault handling with WW locks held.
[mlankhorst: Adjusted for reloc_gpu_flush() changes]
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
[mlankhorst: Removed extra reloc_gpu_flush()]
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-4-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
Committed by Maarten Lankhorst
This reverts commit 964a9b0f ("drm/i915/gem: Use chained reloc batches") and commit 0e97fbb0 ("drm/i915/gem: Use a single chained reloc batches for a single execbuf"). When adding ww locking to execbuf, it's hard enough to deal with a single BO that is part of relocation execution. Chaining is hard to get right, and with GPU relocation deprecated, it's best to drop this altogether, instead of trying to fix something we will remove. This is not a completely 1:1 revert; we reset rq_size to 0 in reloc_cache_init, which came from e3d29130 ("drm/i915/gem: Implement legacy MI_STORE_DATA_IMM"), because we don't want to break the selftests. (Daniel)
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-3-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
Committed by Maarten Lankhorst
This reverts commit 9e0f9464 ("drm/i915/gem: Async GPU relocations only"), and the related commit 7ac2d253 ("drm/i915/gem: Delete unused code"). Async GPU relocations are not the path forward; we want to remove GPU-accelerated relocation support eventually when userspace is fixed to use VM_BIND, and this is the first step towards that. We will keep async gpu relocations around for now, until userspace is fixed. Relocation support will be disabled completely on platforms where there was never any userspace that depends on it, as the hardware doesn't require it from at least gen9+ onward. For older platforms, the plan is to use cpu relocations only. The igt side is fixed in igt commit 39e9aa1032a4e ("tests/i915: Remove subtests that rely on async relocation behavior").
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-2-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
Committed by Chris Wilson
If dma_fence_chain_find_seqno() reports an error, it does so in its preamble before it disposes of the input fence. On handling the error, we need to drop the reference to the fence.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2292
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 13149e8b ("drm/i915: add syncobj timeline support")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200806161056.17593-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
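The error path is easiest to see as a fragment: on failure the input fence reference is still owned by the caller and must be dropped. This is a sketch of the pattern, not the exact driver hunk, and the error code for a missing fence is illustrative.

    struct dma_fence *fence;
    int err;

    fence = drm_syncobj_fence_get(syncobj);
    if (!fence)
        return -EINVAL;

    err = dma_fence_chain_find_seqno(&fence, point);
    if (err) {
        /* The helper bails out before consuming the input fence. */
        dma_fence_put(fence);
        return err;
    }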
-
Committed by Chris Wilson
Sometimes we have to be very careful not to allocate underneath a mutex (or spinlock) and yet still want to track activity. Enter i915_active_acquire_for_context(). This raises the activity counter on i915_active prior to use and ensures that the fence-tree contains a slot for the context.
v2: Refactor active_lookup() so it can be called again before/after locking to resolve contention. Since we protect the rbtree until we idle, we can do a lock-free lookup, with the caveat that if another thread performs a concurrent insertion, the rotations from the insert may cause us to not find our target. A second pass holding the treelock will find the target if it exists, or the place to perform our insertion.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200731085015.32368-3-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
Committed by Chris Wilson
I915_GEM_THROTTLE dates back to the time before contexts, when there was just a single engine, and therefore a single timeline and request list globally. That request list was in execution/retirement order, so walking it to find a particular aged request made sense and could be split per file. That is no more. We now have many timelines within a file, as many as the user wants to construct (essentially per-engine, per-context). Each of those runs independently, which makes the single list futile. Remove the disordered list, and iterate over all the timelines to find a request to wait on in each, to satisfy the criterion that the CPU is no more than 20ms ahead of its oldest request. It should go without saying that the I915_GEM_THROTTLE ioctl is no longer used as the primary means of throttling, so it makes sense to push the complication into the ioctl, where it only impacts its few irregular users, rather than into execbuf/retire, where everybody has to pay the cost. Fortunately, the few users do not create vast numbers of contexts, so the loops over contexts/engines should be concise.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200728152010.30701-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
- 18 Aug, 2020: 1 commit
-
-
Committed by Lionel Landwerlin
This introduces new parameters to execbuf so that we can specify syncobj handles as well as timeline points.
v2: Reuse i915_user_extension_fn
v3: Check that the chained extension is only present once (Chris)
v4: Check that dma_fence_chain_find_seqno returns a non-NULL fence (Lionel)
v5: Use BIT_ULL (Chris)
v6: Fix issue with already signaled timeline points, dma_fence_chain_find_seqno() setting fence to NULL (Chris)
v7: Report ENOENT with invalid syncobj handle (Lionel)
v8: Check for out-of-order timeline point insertion (Chris)
v9: After the explanations on https://lists.freedesktop.org/archives/dri-devel/2019-August/229287.html drop the ordering check from v8 (Lionel)
v10: Set first extension enum item to 1 (Jason)
v11: Rebase
v12: Allow multiple extension nodes of timeline syncobj (Chris)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Co-authored-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v11)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200804085954.350343-3-lionel.g.landwerlin@intel.com
Link: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2901
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
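From userspace, the extension chains off execbuffer2 roughly as below. The struct layout and field names are quoted from memory, so treat them as an approximation and check include/uapi/drm/i915_drm.h for the authoritative definition; nfences, fences and points are caller-provided arrays.

    struct drm_i915_gem_execbuffer_ext_timeline_fences ext = {
        .base.name   = DRM_I915_GEM_EXECBUFFER_EXT_TIMELINE_FENCES,
        .fence_count = nfences,
        .handles_ptr = (uintptr_t)fences,    /* struct drm_i915_gem_exec_fence[] */
        .values_ptr  = (uintptr_t)points,    /* u64 timeline points, 0 = binary */
    };

    execbuf.flags |= I915_EXEC_USE_EXTENSIONS;
    execbuf.cliprects_ptr = (uintptr_t)&ext;    /* reused as the extension chain */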
-