提交 · 8f4faed01e3015955801c8ef066ec7fd7a8b3902 · openeuler / Kernel

03 4月, 2019 1 次提交

i915, uaccess: Fix redundant CLAC · 8f4faed0

由 Peter Zijlstra 提交于 2月 28, 2019

New tooling noticed this:

 drivers/gpu/drm/i915/i915_gem_execbuffer.o: warning: objtool: .altinstr_replacement+0x3c: redundant UACCESS disable
 drivers/gpu/drm/i915/i915_gem_execbuffer.o: warning: objtool: .altinstr_replacement+0x66: redundant UACCESS disable

You don't need user_access_end() if user_access_begin() fails.
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: NIngo Molnar <mingo@kernel.org>

8f4faed0

08 2月, 2019 1 次提交

drm/i915: Hack and slash, throttle execbuffer hogs · d6f328bf

由 Chris Wilson 提交于 2月 07, 2019

Apply backpressure to hogs that emit requests faster than the GPU can
process them by waiting for their ring to be less than half-full before
proceeding with taking the struct_mutex.

This is a gross hack to apply throttling backpressure, the long term
goal is to remove the struct_mutex contention so that each client
naturally waits, preferably in an asynchronous, nonblocking fashion
(pipelined operations for the win), for their own resources and never
blocks another client within the driver at least. (Realtime priority
goals would extend to ensuring that resource contention favours high
priority clients as well.)

This patch only limits excessive request production and does not attempt
to throttle clients that block waiting for eviction (either global GTT or
system memory) or any other global resources, see above for the long term
goal.

No microbenchmarks are harmed (to the best of my knowledge).

Testcase: igt/gem_exec_schedule/pi-ringfull-*
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190207071829.5574-1-chris@chris-wilson.co.uk

d6f328bf

30 1月, 2019 1 次提交

drm/i915: Identify active requests · 85474441

由 Chris Wilson 提交于 1月 29, 2019

To allow requests to forgo a common execution timeline, one question we
need to be able to answer is "is this request running?". To track
whether a request has started on HW, we can emit a breadcrumb at the
beginning of the request and check its timeline's HWSP to see if the
breadcrumb has advanced past the start of this request. (This is in
contrast to the global timeline where we need only ask if we are on the
global timeline and if the timeline has advanced past the end of the
previous request.)

There is still confusion from a preempted request, which has already
started but relinquished the HW to a high priority request. For the
common case, this discrepancy should be negligible. However, for
identification of hung requests, knowing which one was running at the
time of the hang will be much more important.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190129185452.20989-2-chris@chris-wilson.co.uk

85474441

15 1月, 2019 2 次提交

drm/i915/gem: Track the rpm wakerefs · 538ef96b

由 Chris Wilson 提交于 1月 14, 2019

Keep track of the temporary rpm wakerefs used for user access to the
device, so that we can cancel them upon release and clearly identify any
leaks.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-10-chris@chris-wilson.co.uk

538ef96b

drm/i915: Markup paired operations on wakerefs · 16e4dd03

由 Chris Wilson 提交于 1月 14, 2019

The majority of runtime-pm operations are bounded and scoped within a
function; these are easy to verify that the wakeref are handled
correctly. We can employ the compiler to help us, and reduce the number
of wakerefs tracked when debugging, by passing around cookies provided
by the various rpm_get functions to their rpm_put counterpart. This
makes the pairing explicit, and given the required wakeref cookie the
compiler can verify that we pass an initialised value to the rpm_put
(quite handy for double checking error paths).

For regular builds, the compiler should be able to eliminate the unused
local variables and the program growth should be minimal. Fwiw, it came
out as a net improvement as gcc was able to refactor rpm_get and
rpm_get_if_in_use together,

v2: Just s/rpm_put/rpm_put_unchecked/ everywhere, leaving the manual
mark up for smaller more targeted patches.
v3: Mention the cookie in Returns
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-2-chris@chris-wilson.co.uk

16e4dd03

09 1月, 2019 1 次提交

drm/i915: drop all drmP.h includes · 2f80d7bd

由 Jani Nikula 提交于 1月 08, 2019

Needs just a few additional includes here and there.

Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: NSam Ravnborg <sam@ravnborg.org>
Signed-off-by: NJani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190108082709.3748-1-jani.nikula@intel.com

2f80d7bd

05 1月, 2019 2 次提交

make 'user_access_begin()' do 'access_ok()' · 594cc251

由 Linus Torvalds 提交于 1月 04, 2019

Originally, the rule used to be that you'd have to do access_ok()
separately, and then user_access_begin() before actually doing the
direct (optimized) user access.

But experience has shown that people then decide not to do access_ok()
at all, and instead rely on it being implied by other operations or
similar.  Which makes it very hard to verify that the access has
actually been range-checked.

If you use the unsafe direct user accesses, hardware features (either
SMAP - Supervisor Mode Access Protection - on x86, or PAN - Privileged
Access Never - on ARM) do force you to use user_access_begin().  But
nothing really forces the range check.

By putting the range check into user_access_begin(), we actually force
people to do the right thing (tm), and the range check vill be visible
near the actual accesses.  We have way too long a history of people
trying to avoid them.
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

594cc251

i915: fix missing user_access_end() in page fault exception case · 0b2c8f8b

由 Linus Torvalds 提交于 1月 04, 2019

When commit fddcd00a ("drm/i915: Force the slow path after a
user-write error") unified the error handling for various user access
problems, it didn't do the user_access_end() that is needed for the
unsafe_put_user() case.

It's not a huge deal: a missed user_access_end() will only mean that
SMAP protection isn't active afterwards, and for the error case we'll be
returning to user mode soon enough anyway.  But it's wrong, and adding
the proper user_access_end() is trivial enough (and doing it for the
other error cases where it isn't needed doesn't hurt).

I noticed it while doing the same prep-work for changing
user_access_begin() that precipitated the access_ok() changes in commit
96d4f267 ("Remove 'type' argument from access_ok() function").

Fixes: fddcd00a ("drm/i915: Force the slow path after a user-write error")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: stable@kernel.org # v4.20
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

0b2c8f8b

04 1月, 2019 1 次提交

Remove 'type' argument from access_ok() function · 96d4f267

由 Linus Torvalds 提交于 1月 03, 2019

Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument
of the user address range verification function since we got rid of the
old racy i386-only code to walk page tables by hand.

It existed because the original 80386 would not honor the write protect
bit when in kernel mode, so you had to do COW by hand before doing any
user access.  But we haven't supported that in a long time, and these
days the 'type' argument is a purely historical artifact.

A discussion about extending 'user_access_begin()' to do the range
checking resulted this patch, because there is no way we're going to
move the old VERIFY_xyz interface to that model.  And it's best done at
the end of the merge window when I've done most of my merges, so let's
just get this done once and for all.

This patch was mostly done with a sed-script, with manual fix-ups for
the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form.

There were a couple of notable cases:

 - csky still had the old "verify_area()" name as an alias.

 - the iter_iov code had magical hardcoded knowledge of the actual
   values of VERIFY_{READ,WRITE} (not that they mattered, since nothing
   really used it)

 - microblaze used the type argument for a debug printout

but other than those oddities this should be a total no-op patch.

I tried to fix up all architectures, did fairly extensive grepping for
access_ok() uses, and the changes are trivial, but I may have missed
something.  Any missed conversion should be trivially fixable, though.
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

96d4f267

13 12月, 2018 1 次提交

drm/i915: replace IS_GEN<N> with IS_GEN(..., N) · cf819eff

由 Lucas De Marchi 提交于 12月 12, 2018

Define IS_GEN() similarly to our IS_GEN_RANGE(). but use gen instead of
gen_mask to do the comparison. Now callers can pass then gen as a parameter,
so we don't require one macro for each gen.

The following spatch was used to convert the users of these macros:

@@
expression e;
@@
(
- IS_GEN2(e)
+ IS_GEN(e, 2)
|
- IS_GEN3(e)
+ IS_GEN(e, 3)
|
- IS_GEN4(e)
+ IS_GEN(e, 4)
|
- IS_GEN5(e)
+ IS_GEN(e, 5)
|
- IS_GEN6(e)
+ IS_GEN(e, 6)
|
- IS_GEN7(e)
+ IS_GEN(e, 7)
|
- IS_GEN8(e)
+ IS_GEN(e, 8)
|
- IS_GEN9(e)
+ IS_GEN(e, 9)
|
- IS_GEN10(e)
+ IS_GEN(e, 10)
|
- IS_GEN11(e)
+ IS_GEN(e, 11)
)

v2: use IS_GEN rather than GT_GEN and compare to info.gen rather than
    using the bitmask
Signed-off-by: NLucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: NJani Nikula <jani.nikula@intel.com>
Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181212181044.15886-2-lucas.demarchi@intel.com

cf819eff

12 12月, 2018 1 次提交

drm/i915: Flush GPU relocs harder for gen3 · 5b2e3120

由 Chris Wilson 提交于 12月 07, 2018

Adding an extra MI_STORE_DWORD_IMM to the gpu relocation path for gen3
was good, but still not good enough. To survive 24+ hours under test we
needed to perform not one, not two but three extra store-dw. Doing so
for each GPU relocation was a little unsightly and since we need to
worry about userspace hitting the same issues, we should apply the dummy
store-dw into the EMIT_FLUSH.

Fixes: 7dd4f672 ("drm/i915: Async GPU relocation processing")
References: 7fa28e14 ("drm/i915: Write GPU relocs harder with gen3")
Testcase: igt/gem_tiled_fence_blits # blb/pnv
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181207134037.11848-1-chris@chris-wilson.co.uk
(cherry picked from commit a889580c)
Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>

5b2e3120

07 12月, 2018 1 次提交

drm/i915: Flush GPU relocs harder for gen3 · a889580c

由 Chris Wilson 提交于 12月 07, 2018

Adding an extra MI_STORE_DWORD_IMM to the gpu relocation path for gen3
was good, but still not good enough. To survive 24+ hours under test we
needed to perform not one, not two but three extra store-dw. Doing so
for each GPU relocation was a little unsightly and since we need to
worry about userspace hitting the same issues, we should apply the dummy
store-dw into the EMIT_FLUSH.

Fixes: 7dd4f672 ("drm/i915: Async GPU relocation processing")
References: 7fa28e14 ("drm/i915: Write GPU relocs harder with gen3")
Testcase: igt/gem_tiled_fence_blits # blb/pnv
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181207134037.11848-1-chris@chris-wilson.co.uk

a889580c

05 12月, 2018 1 次提交

drm: revert "expand replace_fence to support timeline point v2" · 0b258ed1

由 Christian König 提交于 11月 14, 2018

This reverts commit 9a09a423.

The whole interface isn't thought through. Since this function can't
fail we actually can't allocate an object to store the sync point.

Sorry, I should have taken the lead on this from the very beginning and
reviewed it more thoughtfully. Going to propose a new interface as a
follow up change.
Signed-off-by: NChristian König <christian.koenig@amd.com>
Reviewed-by: NChunming Zhou <david1.zhou@amd.com>
Link: https://patchwork.freedesktop.org/patch/265580/

0b258ed1

21 11月, 2018 1 次提交

drm/i915: Write GPU relocs harder with gen3 · f8577fb3

由 Chris Wilson 提交于 11月 19, 2018

Under moderate amounts of GPU stress, we can observe on Bearlake and
Pineview (later gen3 models) that we execute the following batch buffer
before the write into the batch is coherent. Adding extra (tested with
upto 32x) MI_FLUSH to either the invalidation, flush or both phases does
not solve the incoherency issue with the relocations, but emitting the
MI_STORE_DWORD_IMM twice does. So be it.

Fixes: 7dd4f672 ("drm/i915: Async GPU relocation processing")
Testcase: igt/gem_tiled_fence_blits # blb/pnv
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181119154153.15327-1-chris@chris-wilson.co.uk
(cherry picked from commit 7fa28e14)
Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>

f8577fb3

20 11月, 2018 1 次提交

drm/i915: Write GPU relocs harder with gen3 · 7fa28e14

由 Chris Wilson 提交于 11月 19, 2018

Under moderate amounts of GPU stress, we can observe on Bearlake and
Pineview (later gen3 models) that we execute the following batch buffer
before the write into the batch is coherent. Adding extra (tested with
upto 32x) MI_FLUSH to either the invalidation, flush or both phases does
not solve the incoherency issue with the relocations, but emitting the
MI_STORE_DWORD_IMM twice does. So be it.

Fixes: 7dd4f672 ("drm/i915: Async GPU relocation processing")
Testcase: igt/gem_tiled_fence_blits # blb/pnv
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181119154153.15327-1-chris@chris-wilson.co.uk

7fa28e14

12 11月, 2018 2 次提交

drm/syncobj: Fix compilation following partial revert · 91324069

由 Chris Wilson 提交于 11月 12, 2018

We need to include the revert of commit 783195ec ("drm/syncobj:
disable the timeline UAPI for now v2") along with undoing the change to
drm/i915.

Fixes: 131280a1 ("drm: Revert syncobj timeline changes.")
Cc: Christian König <christian.koenig@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Chunming Zhou <david1.zhou@amd.com>
Cc: Eric Anholt <eric@anholt.net>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <maxime.ripard@bootlin.com>
Cc: Sean Paul <sean@poorly.run>
Cc: David Airlie <airlied@linux.ie>
Reviewed-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NSean Paul <seanpaul@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20181112152130.12275-1-chris@chris-wilson.co.uk

91324069

iommu/vtd: Cleanup dma_remapping.h header · daedaa33

由 Lu Baolu 提交于 11月 12, 2018

Commit e61d98d8 ("x64, x2apic/intr-remap: Intel vt-d, IOMMU
code reorganization") moved dma_remapping.h from drivers/pci/ to
current place. It is entirely VT-d specific, but uses a generic
name. This merges dma_remapping.h with include/linux/intel-iommu.h
and removes dma_remapping.h as the result.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Sohil Mehta <sohil.mehta@intel.com>
Suggested-by: NChristoph Hellwig <hch@infradead.org>
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NLiu, Yi L <yi.l.liu@intel.com>
Signed-off-by: NJoerg Roedel <jroedel@suse.de>

daedaa33

06 11月, 2018 1 次提交

drm/i915: Compare user's 64b GTT offset even on 32b · 08560328

由 Chris Wilson 提交于 10月 25, 2018

Beware mixing unsigned long constants and 64b values, as on 32b the
constant will be zero extended and discard the high 32b when used as
a mask!
Reported-by: NSergii Romantsov <sergii.romantsov@globallogic.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108282Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181025091823.20571-2-chris@chris-wilson.co.uk
(cherry picked from commit 6fc4e48f)
Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>

08560328

26 10月, 2018 1 次提交

drm/i915: Compare user's 64b GTT offset even on 32b · 6fc4e48f

由 Chris Wilson 提交于 10月 25, 2018

Beware mixing unsigned long constants and 64b values, as on 32b the
constant will be zero extended and discard the high 32b when used as
a mask!
Reported-by: NSergii Romantsov <sergii.romantsov@globallogic.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108282Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181025091823.20571-2-chris@chris-wilson.co.uk

6fc4e48f

18 10月, 2018 1 次提交

drm: add syncobj timeline support v9 · 48197bc5

由 Chunming Zhou 提交于 10月 18, 2018

This patch is for VK_KHR_timeline_semaphore extension, semaphore is called syncobj in kernel side:
This extension introduces a new type of syncobj that has an integer payload
identifying a point in a timeline. Such timeline syncobjs support the
following operations:
   * CPU query - A host operation that allows querying the payload of the
     timeline syncobj.
   * CPU wait - A host operation that allows a blocking wait for a
     timeline syncobj to reach a specified value.
   * Device wait - A device operation that allows waiting for a
     timeline syncobj to reach a specified value.
   * Device signal - A device operation that allows advancing the
     timeline syncobj to a specified value.

v1:
Since it's a timeline, that means the front time point(PT) always is signaled before the late PT.
a. signal PT design:
Signal PT fence N depends on PT[N-1] fence and signal opertion fence, when PT[N] fence is signaled,
the timeline will increase to value of PT[N].
b. wait PT design:
Wait PT fence is signaled by reaching timeline point value, when timeline is increasing, will compare
wait PTs value with new timeline value, if PT value is lower than timeline value, then wait PT will be
signaled, otherwise keep in list. syncobj wait operation can wait on any point of timeline,
so need a RB tree to order them. And wait PT could ahead of signal PT, we need a sumission fence to
perform that.

v2:
1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian)
2. move unexposed denitions to .c file. (Daniel Vetter)
3. split up the change to drm_syncobj_find_fence() in a separate patch. (Christian)
4. split up the change to drm_syncobj_replace_fence() in a separate patch.
5. drop the submission_fence implementation and instead use wait_event() for that. (Christian)
6. WARN_ON(point != 0) for NORMAL type syncobj case. (Daniel Vetter)

v3:
1. replace normal syncobj with timeline implemenation. (Vetter and Christian)
    a. normal syncobj signal op will create a signal PT to tail of signal pt list.
    b. normal syncobj wait op will create a wait pt with last signal point, and this wait PT is only signaled by related signal point PT.
2. many bug fix and clean up
3. stub fence moving is moved to other patch.

v4：
1. fix RB tree loop with while(node=rb_first(...)). (Christian)
2. fix syncobj lifecycle. (Christian)
3. only enable_signaling when there is wait_pt. (Christian)
4. fix timeline path issues.
5. write a timeline test in libdrm

v5: (Christian)
1. semaphore is called syncobj in kernel side.
2. don't need 'timeline' characters in some function name.
3. keep syncobj cb.

v6: (Christian)
1. merge syncobj_timeline to syncobj structure.
2. simplify some check sentences.
3. some misc change.
4. fix CTS failed issue.

v7: (Christian)
1. error handling when creating signal pt.
2. remove timeline naming in func.
3. export flags in find_fence.
4. allow reset timeline.

v8:
1. use wait_event_interruptible without timeout
2. rename _TYPE_INDIVIDUAL to _TYPE_BINARY

v9:
1. rename signal_pt->base to signal_pt->fence_array to avoid misleading
2. improve kerneldoc

individual syncobj is tested by ./deqp-vk -n dEQP-VK*semaphore*
timeline syncobj is tested by ./amdgpu_test -s 9
Signed-off-by: NChunming Zhou <david1.zhou@amd.com>
Signed-off-by: NChristian König <christian.koenig@amd.com>
Cc: Christian Konig <christian.koenig@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Daniel Rakos <Daniel.Rakos@amd.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: NChristian König <christian.koenig@amd.com>
Acked-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/257258/

48197bc5

12 9月, 2018 2 次提交

drm/i915: Reorder execobject[] to insert non-48b objects into the low 4G · 35e882a4

由 Chris Wilson 提交于 9月 12, 2018

If the caller supplies more than 4G of objects and than one that has to
be in the low 4G, it is possible for the low 4G to be full before we
attempt to find room for the last object that must be there. As we don't
reorder the two types, every pass hits the same problem and we fail with
ENOSPC. However, if we impose a little bit of ordering between the two
classes of objects, on the second pass we will be able to fit the
special object as we do it first. For setups that only use !48b objects,
we now reverse the order between passes, hopefully making the subsequent
passes more likely to succeed given that we are trying a different
order (rather than repeating the previous pass!)

v2: Quick one line explanation for the relative priorities given to
reservations.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180912101133.31377-1-chris@chris-wilson.co.uk

35e882a4

drm/i915: Nuke struct_mutex from context_setparam · d3f3e5e4

由 Chris Wilson 提交于 9月 11, 2018

Userspace should be free to race against itself and shoot itself in
the foot if it so desires to adjust a parameter at the same time as
submitting a batch to that context. As such, the struct_mutex in context
setparam is only being used to serialise userspace against itself and
not for any protection of internal structs and so is superfluous.

v2: Separate user_flags from internal flags to reduce chance of
interference; and use locked bit ops for user updates.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180911132206.23032-1-chris@chris-wilson.co.uk

d3f3e5e4

06 9月, 2018 1 次提交

drm: expand replace_fence to support timeline point v2 · 9a09a423

由 Chunming Zhou 提交于 8月 30, 2018

we can place a fence to a timeline point after expanded.
v2: change func parameter order
Signed-off-by: NChunming Zhou <david1.zhou@amd.com>
Reviewed-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NChristian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/246543/

9a09a423

04 9月, 2018 1 次提交

drm/i915: Fix up FORCE_GPU_RELOC (debug) to flush CPU write domains · 46223993

由 Chris Wilson 提交于 9月 03, 2018

We currently assert that if the target is in a CPU write domain, we use
a CPU reloc path rather than the GPU reloc path. However, we have a debug
override to force the GPU path and that unfortunately hits the assert.
Include the async clflush under the debug option to ensure correct
behaviour even when debugging, and strict when not.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180903150216.19965-1-chris@chris-wilson.co.uk

46223993

03 9月, 2018 2 次提交

drm/i915: Force the slow path after a user-write error · fddcd00a

由 Chris Wilson 提交于 9月 03, 2018

If we fail to write the user relocation back when it is changed, force
ourselves to take the slow relocation path where we can handle faults in
the write path. There is still an element of dubiousness as having
patched up the batch to use the correct offset, it no longer matches the
presumed_offset in the relocation, so a second pass may miss any changes
in layout.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180903083337.13134-3-chris@chris-wilson.co.uk

fddcd00a

drm/i915: Determine uses-full-ppgtt from context for execbuf · 4f2c7337

由 Chris Wilson 提交于 9月 01, 2018

Rather than inspect the global module parameter for whether full-ppgtt
maybe enabled, we can inspect the context directly as to whether it has
its own vm.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Bob Paauwe <bob.j.paauwe@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180901092451.7233-1-chris@chris-wilson.co.uk

4f2c7337

07 8月, 2018 2 次提交

drm/i915: kill resource streamer support · 08e3e21a

由 Lucas De Marchi 提交于 8月 03, 2018

After disabling resource streamer on ICL (due to it actually not
existing there), I got feedback that there have been some experimental
patches for mesa to use RS years ago, but nothing ever landed or shipped
because there was no performance improvement.

This removes it from kernel keeping the uapi defines around for
compatibility.

v2: - re-add the inadvertent removal of CTX_CTRL_INHIBIT_SYN_CTX_SWITCH
    - don't bother trying to document removed params on uapi header:
      applications should know that from the query.
      (from Chris)

v3: - disable CTX_CTRL_RS_CTX_ENABLE istead of removing it
    - reword commit message after Daniele confirmed no performance
      regression on his machine
    - reword commit message to make clear RS is being removed due to
      never been used
v4: - move I915_EXEC_RESOURCE_STREAMER to __I915_EXEC_ILLEGAL_FLAGS so
      the check on ioctl() is made much earlier by
      i915_gem_check_execbuffer() (suggested by Tvrtko)
Signed-off-by: NLucas De Marchi <lucas.demarchi@intel.com>
Acked-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180803232443.17193-1-lucas.demarchi@intel.com

08e3e21a

drm/i915/icl: move has_resource_streamer to GEN11_FEATURES · 48928d4b

由 Lucas De Marchi 提交于 7月 19, 2018

Resource streamer has been removed on GEN11 so move it to the FEATURES
macro.
Signed-off-by: NLucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180719170557.10729-1-lucas.demarchi@intel.com

48928d4b

07 7月, 2018 3 次提交

drm/i915: Move i915_vma_move_to_active() to i915_vma.c · e6bb1d7f

由 Chris Wilson 提交于 7月 06, 2018

i915_vma_move_to_active() has grown beyond its execbuf origins, and
should take its rightful place in i915_vma.c as a method for i915_vma!
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180706103947.15919-4-chris@chris-wilson.co.uk

e6bb1d7f

drm/i915: Start returning an error from i915_vma_move_to_active() · a5236978

由 Chris Wilson 提交于 7月 06, 2018

Handling such a late error in request construction is tricky, but to
accommodate future patches which may allocate here, we potentially could
err. To handle the error after already adjusting global state to track
the new request, we must finish and submit the request. But we don't
want to use the request as not everything is being tracked by it, so we
opt to cancel the commands inside the request.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180706103947.15919-3-chris@chris-wilson.co.uk

a5236978

drm/i915: Refactor export_fence() after i915_vma_move_to_active() · da99fe5f

由 Chris Wilson 提交于 7月 06, 2018

Currently all callers are responsible for adding the vma to the active
timeline and then exporting its fence. Combine the two operations into
i915_vma_move_to_active() to move all the extra handling from the
callers to the single site.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180706103947.15919-1-chris@chris-wilson.co.uk

da99fe5f

21 6月, 2018 2 次提交

drm/i915: Redefine EINVAL for debugging · d20ac620

由 Chris Wilson 提交于 6月 21, 2018

To aide debugging spurious EINVALs, include a debug message every time
we emit one from execbuf.

References: https://bugs.freedesktop.org/show_bug.cgi?id=106744Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180621080150.8110-1-chris@chris-wilson.co.uk

d20ac620

drm/i915: Ignore applying the self-relocation BIAS if no relocations · 827db9d8

由 Chris Wilson 提交于 6月 21, 2018

We only need to apply the BIAS for self-relocations into the batchbuffer
iff the execobject has any relocations.

This suppresses some warnings we may get with a full gtt (so the batch
object has wound up at 0 from a previous invocation), but doesn't fix
the underlying problem of how we tried to move a pinned batch vma (how
we have a pinned user vma outside of execbuf, I do not know, though this
being on an aliasing ppgtt means it could be a spurious pinning via the
global gtt). One step at a time...

References: https://bugs.freedesktop.org/show_bug.cgi?id=106744#c1
Testcase: igt/gem_exec_gttfill # byt (sporadic)
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180621073205.26701-1-chris@chris-wilson.co.uk

827db9d8

19 6月, 2018 1 次提交

drm/i915: Apply batch location restrictions before pinning · 7ba33e1c

由 Chris Wilson 提交于 6月 10, 2018

We special case the position of the batch within the GTT to prevent
negative self-relocation deltas from underflowing. However, that
restriction is being applied after a trial pin of the batch in its
current position. Thus we are not rejecting an invalid location if the
batch has been used before, leading to an assertion if we happen to need
to rearrange the entire payload. In the worst case, this may cause a GPU
hang on gen7 or perhaps missing state.

References: https://bugs.freedesktop.org/show_bug.cgi?id=105720
Fixes: 2889caa9 ("drm/i915: Eliminate lots of iterations over the execobjects array")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Martin Peres <martin.peres@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180610194325.13467-2-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
(cherry picked from commit 746c8f14)
Signed-off-by: NJani Nikula <jani.nikula@intel.com>

7ba33e1c

14 6月, 2018 1 次提交

drm/i915: Make closing request flush mandatory · 697b9a87

由 Chris Wilson 提交于 6月 12, 2018

For symmetry, simplicity and ensuring the request is always truly idle
upon its completion, always emit the closing flush prior to emitting the
request breadcrumb. Previously, we would only emit the flush if we had
started a user batch, but this just leaves all the other paths open to
speculation (do they affect the GPU caches or not?) With mm switching, a
key requirement is that the GPU is flushed and invalidated before hand,
so for absolute safety, we want that closing flush be mandatory.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180612105135.4459-1-chris@chris-wilson.co.uk

697b9a87

11 6月, 2018 1 次提交

drm/i915: Apply batch location restrictions before pinning · 746c8f14

由 Chris Wilson 提交于 6月 10, 2018

We special case the position of the batch within the GTT to prevent
negative self-relocation deltas from underflowing. However, that
restriction is being applied after a trial pin of the batch in its
current position. Thus we are not rejecting an invalid location if the
batch has been used before, leading to an assertion if we happen to need
to rearrange the entire payload. In the worst case, this may cause a GPU
hang on gen7 or perhaps missing state.

References: https://bugs.freedesktop.org/show_bug.cgi?id=105720
Fixes: 2889caa9 ("drm/i915: Eliminate lots of iterations over the execobjects array")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Martin Peres <martin.peres@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180610194325.13467-2-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>

746c8f14

06 6月, 2018 1 次提交

drm/i915/gtt: Rename i915_hw_ppgtt base member · 82ad6443

由 Chris Wilson 提交于 6月 05, 2018

In the near future, I want to subclass gen6_hw_ppgtt as it contains a
few specialised members and I wish to add more. To avoid the ugliness of
using ppgtt->base.base, rename the i915_hw_ppgtt base member
(i915_address_space) as vm, which is our common shorthand for an
i915_address_space local.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180605153758.18422-1-chris@chris-wilson.co.uk

82ad6443

04 5月, 2018 1 次提交

drm/i915: Lazily unbind vma on close · 3365e226

由 Chris Wilson 提交于 5月 03, 2018

When userspace is passing around swapbuffers using DRI, we frequently
have to open and close the same object in the foreign address space.
This shows itself as the same object being rebound at roughly 30fps
(with a second object also being rebound at 30fps), which involves us
having to rewrite the page tables and maintain the drm_mm range manager
every time.

However, since the object still exists and it is only the local handle
that disappears, if we are lazy and do not unbind the VMA immediately
when the local user closes the object but defer it until the GPU is
idle, then we can reuse the same VMA binding. We still have to be
careful to mark the handle and lookup tables as closed to maintain the
uABI, just allowing the underlying VMA to be resurrected if the user is
able to access the same object from the same context again.

If the object itself is destroyed (neither userspace keeping a handle to
it), the VMA will be reaped immediately as usual.

In the future, this will be even more useful as instantiating a new VMA
for use on the GPU will become heavier. A nuisance indeed, so nip it in
the bud.

v2: s/__i915_vma_final_close/i915_vma_destroy/ etc.
v3: Leave a hint as to why we deferred the unbind on close.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180503195115.22309-1-chris@chris-wilson.co.uk

3365e226

18 4月, 2018 1 次提交

drm/i915: Do no use kfree() to free a kmem_cache_alloc() return value · fcf1fadf

由 Xidong Wang 提交于 4月 04, 2018

Along the eb_lookup_vmas() error path, the return value from
kmem_cache_alloc() was freed using kfree(). Fix it to use the proper
kmem_cache_free() instead.

Fixes: d1b48c1e ("drm/i915: Replace execbuf vma ht with an idr")
Signed-off-by: NXidong Wang <wangxidong_97@163.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: <stable@vger.kernel.org> # v4.14+
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180404093824.9313-1-chris@chris-wilson.co.uk
(cherry picked from commit 6be1187d)
Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>

fcf1fadf

06 4月, 2018 1 次提交

drm/i915: Describe the bottom of stack in processing a batchbuffer · 99d7e4ee

由 Kevin Rogovin 提交于 4月 06, 2018

Now that "DOC: User command execution" of i915_gem_execbuffer.c is included
in the i915.rst, it is benecifial (for new developers) to read what happens
at the bottom of the driver stack (in terms of bytes written to be read
by the GPU) when processing a user-space batchbuffer.

v5:
  Typo correction of lacking double tick.
Signed-off-by: NKevin Rogovin <kevin.rogovin@intel.com>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
[Joonas: correcting the patch title]
Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1523001957-6427-4-git-send-email-kevin.rogovin@intel.com

99d7e4ee

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功