提交 · 8c45cec48e5871f93e56650f7e476d4ea7174a0e · openanolis / cloud-kernel

Now that drm_[cm]alloc* helpers are simple one line wrappers around
kvmalloc_array and drm_free_large is just kvfree alias we can drop
them and replace by their native forms.

This shouldn't introduce any functional change.

Changes since v1
- fix typo in drivers/gpu//drm/etnaviv/etnaviv_gem.c - noticed by 0day
  build robot
Suggested-by: NDaniel Vetter <daniel@ffwll.ch>
Signed-off-by: Michal Hocko <mhocko@suse.com>drm: drop drm_[cm]alloc* helpers
[danvet: Fixup vgem which grew another user very recently.]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: NChristian König <christian.koenig@amd.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170517122312.GK18247@dhcp22.suse.cz

2098105e

15 4月, 2017 1 次提交

drm/i915: Copy user requested buffers into the error state · b0fd47ad

由 Chris Wilson 提交于 4月 15, 2017

Introduce a new execobject.flag (EXEC_OBJECT_CAPTURE) that userspace may
use to indicate that it wants the contents of this buffer preserved in
the error state (/sys/class/drm/cardN/error) following a GPU hang
involving this batch.

Use this at your discretion, the contents of the error state. although
compressed, are allocated with GFP_ATOMIC (i.e. limited) and kept for all
eternity (until the error state is destroyed).

Based on an earlier patch by Ben Widawsky <ben@bwidawsk.net>
Testcase: igt/gem_exec_capture
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Matt Turner <mattst88@gmail.com>
Acked-by: NBen Widawsky <ben@bwidawsk.net>
Acked-by: NMatt Turner <mattst88@gmail.com>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170415093902.22581-1-chris@chris-wilson.co.uk

b0fd47ad

29 3月, 2017 1 次提交

drm/i915: Align "unfenced" tiled access on gen2, early gen3 · 9e176430

由 Chris Wilson 提交于 3月 25, 2017

Old devices have quite severe restrictions for using fences, and unlike
more recent device (anything from Pineview onwards) we need to enforce
those restrictions even for unfenced tiled access from the render
pipeline.

Fixes: 944397f0 ("drm/i915: Store required fence size/alignment for GGTT vma")
Reported-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.11-rc1+
Link: http://patchwork.freedesktop.org/patch/msgid/20170325113243.16438-1-chris@chris-wilson.co.ukReviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Tested-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
(cherry picked from commit f4ce766f)
Signed-off-by: NJani Nikula <jani.nikula@intel.com>

9e176430

27 3月, 2017 1 次提交

drm/i915: Align "unfenced" tiled access on gen2, early gen3 · f4ce766f

由 Chris Wilson 提交于 3月 25, 2017

Old devices have quite severe restrictions for using fences, and unlike
more recent device (anything from Pineview onwards) we need to enforce
those restrictions even for unfenced tiled access from the render
pipeline.

Fixes: 944397f0 ("drm/i915: Store required fence size/alignment for GGTT vma")
Reported-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.11-rc1+
Link: http://patchwork.freedesktop.org/patch/msgid/20170325113243.16438-1-chris@chris-wilson.co.ukReviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Tested-by: NVille Syrjälä <ville.syrjala@linux.intel.com>

f4ce766f

14 3月, 2017 1 次提交

drm/i915: Drop support for I915_EXEC_CONSTANTS_* execbuf parameters. · 0f5418e5

由 Kenneth Graunke 提交于 3月 13, 2017

This patch makes the I915_PARAM_HAS_EXEC_CONSTANTS getparam return 0
(indicating the optional feature is not supported), and makes execbuf
always return -EINVAL if the flags are used.

Apparently, no userspace ever shipped which used this optional feature:
I checked the git history of Mesa, xf86-video-intel, libva, and Beignet,
and there were zero commits showing a use of these flags.  Kernel commit
72bfa19c apparently introduced the feature prematurely.  According
to Chris, the intention was to use this in cairo-drm, but "the use was
broken for gen6", so I don't think it ever happened.

'relative_constants_mode' has always been tracked per-device, but this
has actually been wrong ever since hardware contexts were introduced, as
the INSTPM register is saved (and automatically restored) as part of the
render ring context. The software per-device value could therefore get
out of sync with the hardware per-context value.  This meant that using
them is actually unsafe: a client which tried to use them could damage
the state of other clients, causing the GPU to interpret their BO
offsets as absolute pointers, leading to bogus memory reads.

These flags were also never ported to execlist mode, making them no-ops
on Gen9+ (which requires execlists), and Gen8 in the default mode.

On Gen8+, userspace can write these registers directly, achieving the
same effect.  On Gen6-7.5, it likely makes sense to extend the command
parser to support them.  I don't think anyone wants this on Gen4-5.

Based on a patch by Dave Gordon.

v3: Return -ENODEV for the getparam, as this is what we do for other
    obsolete features.  Suggested by Chris Wilson.

Cc: stable@vger.kernel.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92448Signed-off-by: NKenneth Graunke <kenneth@whitecape.org>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170215093446.21291-1-kenneth@whitecape.orgAcked-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170313170433.26843-1-chris@chris-wilson.co.uk
(cherry picked from commit ef0f411f)
Signed-off-by: NJani Nikula <jani.nikula@intel.com>

0f5418e5

03 3月, 2017 1 次提交

drm/i915: Drop spinlocks around adding to the client request list · c8659efa

由 Chris Wilson 提交于 3月 02, 2017

Adding to the tail of the client request list as the only other user is
in the throttle ioctl that iterates forwards over the list. It only
needs protection against deletion of a request as it reads it, it simply
won't see a new request added to the end of the list, or it would be too
early and rejected. We can further reduce the number of spinlocks
required when throttling by removing stale requests from the client_list
as we throttle.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170302122525.19675-1-chris@chris-wilson.co.uk

c8659efa

25 2月, 2017 1 次提交

drm/i915: Drop support for I915_EXEC_CONSTANTS_* execbuf parameters. · ef0f411f

由 Kenneth Graunke 提交于 2月 15, 2017

This patch makes the I915_PARAM_HAS_EXEC_CONSTANTS getparam return 0
(indicating the optional feature is not supported), and makes execbuf
always return -EINVAL if the flags are used.

Apparently, no userspace ever shipped which used this optional feature:
I checked the git history of Mesa, xf86-video-intel, libva, and Beignet,
and there were zero commits showing a use of these flags.  Kernel commit
72bfa19c apparently introduced the feature prematurely.  According
to Chris, the intention was to use this in cairo-drm, but "the use was
broken for gen6", so I don't think it ever happened.

'relative_constants_mode' has always been tracked per-device, but this
has actually been wrong ever since hardware contexts were introduced, as
the INSTPM register is saved (and automatically restored) as part of the
render ring context. The software per-device value could therefore get
out of sync with the hardware per-context value.  This meant that using
them is actually unsafe: a client which tried to use them could damage
the state of other clients, causing the GPU to interpret their BO
offsets as absolute pointers, leading to bogus memory reads.

These flags were also never ported to execlist mode, making them no-ops
on Gen9+ (which requires execlists), and Gen8 in the default mode.

On Gen8+, userspace can write these registers directly, achieving the
same effect.  On Gen6-7.5, it likely makes sense to extend the command
parser to support them.  I don't think anyone wants this on Gen4-5.

Based on a patch by Dave Gordon.

v3: Return -ENODEV for the getparam, as this is what we do for other
    obsolete features.  Suggested by Chris Wilson.

Cc: stable@vger.kernel.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92448Signed-off-by: NKenneth Graunke <kenneth@whitecape.org>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170215093446.21291-1-kenneth@whitecape.orgAcked-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>

ef0f411f

22 2月, 2017 2 次提交

drm/i915: Perform object clflushing asynchronously · 57822dc6

由 Chris Wilson 提交于 2月 22, 2017

Flushing the cachelines for an object is slow, can be as much as 100ms
for a large framebuffer. We currently do this under the struct_mutex BKL
on execution or on pageflip. But now with the ability to add fences to
obj->resv for both flips and execbuf (and we naturally wait on the fence
before CPU access), we can move the clflush operation to a workqueue and
signal a fence for completion, thereby doing the work asynchronously and
not blocking the driver or its clients.

v2: Introduce i915_gem_clflush.h and use a new name, split out some
extras into separate patches.
Suggested-by: NAkash Goel <akash.goel@intel.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170222114049.28456-5-chris@chris-wilson.co.uk

57822dc6

drm/i915: Remove change_domain tracepoint · 208b84a3

由 Chris Wilson 提交于 2月 22, 2017

The change_domain tracepoint has been inaccurate for a few years - it
doesn't fully capture the domains, especially with userspace bypassing
them. It is defunct, misleading and time to be removed.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170222114049.28456-1-chris@chris-wilson.co.uk

208b84a3

21 2月, 2017 2 次提交

drm/i915/tracepoints: Adjust i915_gem_ring_dispatch · 1cce8922

由 Tvrtko Ursulin 提交于 2月 21, 2017

Rename it to i915_gem_request_queue and fix the logged info
equivalent to the i915_gem_request even class. Also moved it
a bit further apart from the i915_gem_request_add tracepoint
since they otherwise provide similar information too close in
time.

v2: Remove sw fence singalling. We will rely on the soon to
    come GuC scheduling backend to enable that. (Chris Wilson)

v3: Log hex with 0x prefix for clarity.
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>

1cce8922

drm/i915: Use reservation_object_lock() · e2989f14

由 Chris Wilson 提交于 2月 21, 2017

Replace the calls to ww_mutex_lock(&resv->lock) with the helper
reservation_object_lock(resv) and similarly for unlock.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170221091723.6219-1-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>

e2989f14

14 2月, 2017 1 次提交

drm/i915: Emit to ringbuffer directly · 73dec95e

由 Tvrtko Ursulin 提交于 2月 14, 2017

This removes the usage of intel_ring_emit in favour of
directly writing to the ring buffer.

intel_ring_emit was preventing the compiler for optimising
fetch and increment of the current ring buffer pointer and
therefore generating very verbose code for every write.

It had no useful purpose since all ringbuffer operations
are started and ended with intel_ring_begin and
intel_ring_advance respectively, with no bail out in the
middle possible, so it is fine to increment the tail in
intel_ring_begin and let the code manage the pointer
itself.

Useless instruction removal amounts to approximately
two and half kilobytes of saved text on my build.

Not sure if this has any measurable performance
implications but executing a ton of useless instructions
on fast paths cannot be good.

v2:
 * Change return from intel_ring_begin to error pointer by
   popular demand.
 * Move tail increment to intel_ring_advance to enable some
   error checking.

v3:
 * Move tail advance back into intel_ring_begin.
 * Rebase and tidy.

v4:
 * Complete rebase after a few months since v3.

v5:
 * Remove unecessary cast and fix !debug compile. (Chris Wilson)

v6:
 * Make intel_ring_offset take request as well.
 * Fix recording of request postfix plus a sprinkle of asserts.
   (Chris Wilson)

v7:
 * Use intel_ring_offset to get the postfix. (Chris Wilson)
 * Convert GVT code as well.

v8:
 * Rename *out++ to *cs++.

v9:
 * Fix GVT out to cs conversion in GVT.

v10:
 * Rebase for new intel_ring_begin in selftests.
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Acked-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170214113242.29241-1-tvrtko.ursulin@linux.intel.com

73dec95e

08 2月, 2017 2 次提交

drm/i915: Always convert incoming exec offsets to non-canonical · 6e7eb178

由 Michał Winiarski 提交于 2月 07, 2017

We're using non-canonical addresses in drm_mm, and we're making sure that
userspace is using canonical addressing - both in case of softpin
(verifying incoming offset) and when relocating (converting to canonical
when updating offset returned to userspace).
Unfortunately when considering the need for relocations, we're comparing
offset from userspace (in canonical form) with drm_mm node (in
non-canonical form), and as a result, we end up always relocating if our
offsets are in the "problematic" range.
Let's always convert the offsets to avoid the performance impact of
relocations.

Fixes: a5f0edf6 ("drm/i915: Avoid writing relocs with addresses in non-canonical form")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Reported-by: NMichał Pyrzowski <michal.pyrzowski@intel.com>
Signed-off-by: NMichał Winiarski <michal.winiarski@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170207195559.18798-1-michal.winiarski@intel.comReviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
(cherry picked from commit 038c95a3)
Signed-off-by: NJani Nikula <jani.nikula@intel.com>

6e7eb178

drm/i915: Always convert incoming exec offsets to non-canonical · 038c95a3

由 Michał Winiarski 提交于 2月 07, 2017

We're using non-canonical addresses in drm_mm, and we're making sure that
userspace is using canonical addressing - both in case of softpin
(verifying incoming offset) and when relocating (converting to canonical
when updating offset returned to userspace).
Unfortunately when considering the need for relocations, we're comparing
offset from userspace (in canonical form) with drm_mm node (in
non-canonical form), and as a result, we end up always relocating if our
offsets are in the "problematic" range.
Let's always convert the offsets to avoid the performance impact of
relocations.

Fixes: a5f0edf6 ("drm/i915: Avoid writing relocs with addresses in non-canonical form")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Reported-by: NMichał Pyrzowski <michal.pyrzowski@intel.com>
Signed-off-by: NMichał Winiarski <michal.winiarski@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170207195559.18798-1-michal.winiarski@intel.comReviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>

038c95a3

04 2月, 2017 1 次提交

drm/i915: fix pm refcounting on fence error in execbuf · 4a04e371

由 Daniele Ceraolo Spurio 提交于 2月 03, 2017

Fences are creted/checked before the pm ref is taken, so if we jump to
pre_mutex_err we will uncorrectly call intel_runtime_pm_put.

v2: Massage unwind error paths

Fixes: fec0445c (drm/i915: Support explicit fencing for execbuf)
Testcase: igt/gem_exec_params
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1486161930-11764-1-git-send-email-daniele.ceraolospurio@intel.comReviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>

4a04e371

03 2月, 2017 1 次提交

drm: Improve drm_mm search (and fix topdown allocation) with rbtrees · 4e64e553

由 Chris Wilson 提交于 2月 02, 2017

The drm_mm range manager claimed to support top-down insertion, but it
was neither searching for the top-most hole that could fit the
allocation request nor fitting the request to the hole correctly.

In order to search the range efficiently, we create a secondary index
for the holes using either their size or their address. This index
allows us to find the smallest hole or the hole at the bottom or top of
the range efficiently, whilst keeping the hole stack to rapidly service
evictions.

v2: Search for holes both high and low. Rename flags to mode.
v3: Discover rb_entry_safe() and use it!
v4: Kerneldoc for enum drm_mm_insert_mode.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Russell King <rmk+kernel@armlinux.org.uk>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Sean Paul <seanpaul@chromium.org>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: Stephen Warren <swarren@wwwdotorg.org>
Cc: Alexandre Courbot <gnurou@gmail.com>
Cc: Eric Anholt <eric@anholt.net>
Cc: Sinclair Yeh <syeh@vmware.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com> # vmwgfx
Reviewed-by: Lucas Stach <l.stach@pengutronix.de> #etnaviv
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170202210438.28702-1-chris@chris-wilson.co.uk

4e64e553

28 1月, 2017 2 次提交

drm/i915: Support explicit fencing for execbuf · fec0445c

由 Chris Wilson 提交于 1月 27, 2017

Now that the user can opt-out of implicit fencing, we need to give them
back control over the fencing. We employ sync_file to wrap our
drm_i915_gem_request and provide an fd that userspace can merge with
other sync_file fds and pass back to the kernel to wait upon before
future execution.

Testcase: igt/gem_exec_fence
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: NChad Versace <chadversary@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20170127094008.27489-2-chris@chris-wilson.co.uk

fec0445c

drm/i915: Enable userspace to opt-out of implicit fencing · 77ae9957

由 Chris Wilson 提交于 1月 27, 2017

Userspace is faced with a dilemma. The kernel requires implicit fencing
to manage resource usage (we always must wait for the GPU to finish
before releasing its PTE) and for third parties. However, userspace may
wish to avoid this serialisation if it is either using explicit fencing
between parties and wants more fine-grained access to buffers (e.g. it
may partition the buffer between uses and track fences on ranges rather
than the implicit fences tracking the whole object). It follows that
userspace needs a mechanism to avoid the kernel's serialisation on its
implicit fences before execbuf execution.

The next question is whether this is an object, execbuf or context flag.
Hybrid users (such as using explicit EGL_ANDROID_native_sync fencing on
shared winsys buffers, but implicit fencing on internal surfaces)
require a per-object level flag. Given that this flag need to be only
set once for the lifetime of the object, this reduces the convenience of
having an execbuf or context level flag (and avoids having multiple
pieces of uABI controlling the same feature).

Incorrect use of this flag will result in rendering corruption and GPU
hangs - but will not result in use-after-free or similar resource
tracking issues.

Serious caveat: write ordering is not strictly correct after setting
this flag on a render target on multiple engines. This affects all
subsequent GEM operations (execbuf, set-domain, pread) and shared
dma-buf operations. A fix is possible - but costly (both in terms of
further ABI changes and runtime overhead).

Testcase: igt/gem_exec_async
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: NChad Versace <chadversary@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20170127094008.27489-1-chris@chris-wilson.co.uk

77ae9957

19 1月, 2017 1 次提交

drm/i915: Rename some warts in the VMA API · 718659a6

由 Chris Wilson 提交于 1月 16, 2017

Whilst writing testcases to exercise the VMA API, some oddities came to
light, such as i915_gem_obj_lookup_or_create(). Joonas suggested
i915_vma_instance() as a neat replacement, so rename them, move them to
i915_vma.c and add some kerneldoc as a sugary bonus.

s/i915_gem_obj_to_vma/i915_vma_lookup/
s/i915_gem_obj_lookup_or_create_vma/i915_vma_instance/
Suggested-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170116152131.18089-2-chris@chris-wilson.co.uk

718659a6

11 1月, 2017 1 次提交

drm/i915: Replace 4096 with PAGE_SIZE or I915_GTT_PAGE_SIZE · f51455d4

由 Chris Wilson 提交于 1月 10, 2017

Start converting over from the byte count to its semantic macro, either
we want to allocate the size of a physical page in main memory or we
want the size of a virtual page in the GTT. 4096 could mean either, but
PAGE_SIZE and I915_GTT_PAGE_SIZE are explicit and should help improve
code comprehension and future changes. In the future, we may want to use
variable GTT page sizes and so have the challenge of knowing which
hardcoded values were used to represent a physical page vs the virtual
page.

v2: Look for a few more 4096s to convert, discover IS_ALIGNED().
v3: 4096ul paranoia, make fence alignment a distinct value of 4096, keep
bdw stolen w/a as 4096 until we know better.
v4: Add asserts that i915_vma_insert() start/end are aligned to GTT page
sizes.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170110144734.26052-1-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>

f51455d4

31 12月, 2016 1 次提交

drm/i915: Complete kerneldoc for struct i915_gem_context · 6095868a

由 Chris Wilson 提交于 12月 31, 2016

The existing kerneldoc was outdated, so time for a refresh.

v2: Use single line kdoc, mention functions for manipulation
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20161231112012.29263-3-chris@chris-wilson.co.uk

6095868a

19 12月, 2016 1 次提交

drm/i915: Add a reminder that i915_vma_move_to_active() requires struct_mutex · 81147b07

由 Chris Wilson 提交于 12月 18, 2016

i915_vma_move_to_active() requires the struct_mutex for serialisation
with retirement, so mark it up with lockdep_assert_held().
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161218153724.8439-1-chris@chris-wilson.co.uk

81147b07

06 12月, 2016 2 次提交

drm/i915: Fix i915_gem_evict_for_vma (soft-pinning) · 172ae5b4

由 Chris Wilson 提交于 12月 05, 2016

Soft-pinning depends upon being able to check for availabilty of an
interval and evict overlapping object from a drm_mm range manager very
quickly. Currently it uses a linear list, and so performance is dire and
not suitable as a general replacement. Worse, the current code will oops
if it tries to evict an active buffer.

It also helps if the routine reports the correct error codes as expected
by its callers and emits a tracepoint upon use.

For posterity since the wrong patch was pushed (i.e. that missed these
key points and had known bugs), this is the changelog that should have
been on commit 506a8e87 ("drm/i915: Add soft-pinning API for
execbuffer"):

Userspace can pass in an offset that it presumes the object is located
at. The kernel will then do its utmost to fit the object into that
location. The assumption is that userspace is handling its own object
locations (for example along with full-ppgtt) and that the kernel will
rarely have to make space for the user's requests.

This extends the DRM_IOCTL_I915_GEM_EXECBUFFER2 to do the following:
* if the user supplies a virtual address via the execobject->offset
*and* sets the EXEC_OBJECT_PINNED flag in execobject->flags, then
that object is placed at that offset in the address space selected
by the context specifier in execbuffer.
* the location must be aligned to the GTT page size, 4096 bytes
* as the object is placed exactly as specified, it may be used by this
execbuffer call without relocations pointing to it

It may fail to do so if:
* EINVAL is returned if the object does not have a 4096 byte aligned
address
* the object conflicts with another pinned object (either pinned by
hardware in that address space, e.g. scanouts in the aliasing ppgtt)
or within the same batch.
EBUSY is returned if the location is pinned by hardware
EINVAL is returned if the location is already in use by the batch
* EINVAL is returned if the object conflicts with its own alignment (as meets
the hardware requirements) or if the placement of the object does not fit
within the address space

All other execbuffer errors apply.

Presence of this execbuf extension may be queried by passing
I915_PARAM_HAS_EXEC_SOFTPIN to DRM_IOCTL_I915_GETPARAM and checking for
a reported value of 1 (or greater).

v2: Combine the hole/adjusted-hole ENOSPC checks
v3: More color, more splitting, more blurb.

Fixes: 506a8e87 ("drm/i915: Add soft-pinning API for execbuffer")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161205142941.21965-2-chris@chris-wilson.co.uk

172ae5b4

drm/i915: Mark all non-vma being inserted into the address spaces · 85fd4f58

由 Chris Wilson 提交于 12月 05, 2016

We need to distinguish between full i915_vma structs and simple
drm_mm_nodes when considering eviction (i.e. we must be careful not to
treat a mere drm_mm_node as a much larger i915_vma causing memory
corruption, if we are lucky). To do this, color these not-a-vma with -1
(I915_COLOR_UNEVICTABLE).

v2...v200: New name for -1.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161205142941.21965-1-chris@chris-wilson.co.uk

85fd4f58

24 11月, 2016 1 次提交

drm/i915: Use the precomputed value for whether to enable command parsing · 41736a8e

由 Chris Wilson 提交于 11月 24, 2016

As i915.enable_cmd_parser is an unsafe option, make it read-only at
runtime. Now that it is constant, we can use the value determined during
initialisation as to whether we need the cmdparser at execbuffer time.

v2: Remove the inline for its single user, it is clear enough (and
shorter) without!
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161124125851.6615-1-chris@chris-wilson.co.uk

41736a8e

21 11月, 2016 1 次提交

drm/i915: Wipe hang stats as an embedded struct · bc1d53c6

由 Mika Kuoppala 提交于 11月 16, 2016

Bannable property, banned status, guilty and active counts are
properties of i915_gem_context. Make them so.

v2: rebase

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NMika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1479309634-28574-1-git-send-email-mika.kuoppala@intel.com

bc1d53c6

18 11月, 2016 1 次提交

drm/i915: Move frontbuffer CS write tracking from ggtt vma to object · 5b8c8aec

由 Chris Wilson 提交于 11月 16, 2016

I tried to avoid having to track the write for every VMA by only
tracking writes to the ggtt. However, for the purposes of frontbuffer
tracking this is insufficient as we need to invalidate around writes not
just to the the ggtt but all aliased ppgtt views of the framebuffer. By
moving the critical section to the object and only doing so for
framebuffer writes we can reduce the tracking even further by only
watching framebuffers and not vma.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161116190704.5293-1-chris@chris-wilson.co.ukTested-by: NPaulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>

5b8c8aec

17 11月, 2016 1 次提交
- T
  drm/i915: Use dev_priv in INTEL_INFO in i915_gem_execbuffer.c · f0836b72
  由 Tvrtko Ursulin 提交于 11月 16, 2016
```
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
```
  f0836b72
11 11月, 2016 3 次提交

drm/i915: Further assorted dev_priv cleanups · 4805fe82

由 Tvrtko Ursulin 提交于 11月 04, 2016

A small selection of macros which can only accept dev_priv from
now on and a resulting trickle of fixups.
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NDavid Weinehall <david.weinehall@linux.intel.com>

4805fe82

drm/i915: Assorted dev_priv cleanups · 0031fb96

由 Tvrtko Ursulin 提交于 11月 04, 2016

A small selection of macros which can only accept dev_priv from
now on and a resulting trickle of fixups.
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NDavid Weinehall <david.weinehall@linux.intel.com>

0031fb96

drm/i915: Mark CPU cache as dirty when used for rendering · 48004881

由 Chris Wilson 提交于 11月 07, 2016

On LLC, or even snooped, machines rendering via the GPU ends up in the CPU
cache. This cacheline dirt also needs to be flushed to main memory when
moving to an incoherent domain, such as the display's scanout engine.
Mostly, this happens because either the object is marked as dirty from
its first use or is avoided by setting the object into the display
domain from the start.

v2: Treat WT as not requiring a clflush prior to use on the display
engine as well.

Fixes: 0f71979a ("drm/i915: Performed deferred clflush inside set-cache-level")
References: https://bugs.freedesktop.org/show_bug.cgi?id=95414Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v4.0+
Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161107165204.7008-1-chris@chris-wilson.co.uk
(cherry picked from commit 7aa6ca61)
Signed-off-by: NJani Nikula <jani.nikula@intel.com>

48004881

08 11月, 2016 1 次提交

drm/i915: Mark CPU cache as dirty when used for rendering · 7aa6ca61

由 Chris Wilson 提交于 11月 07, 2016

On LLC, or even snooped, machines rendering via the GPU ends up in the CPU
cache. This cacheline dirt also needs to be flushed to main memory when
moving to an incoherent domain, such as the display's scanout engine.
Mostly, this happens because either the object is marked as dirty from
its first use or is avoided by setting the object into the display
domain from the start.

v2: Treat WT as not requiring a clflush prior to use on the display
engine as well.

Fixes: 0f71979a ("drm/i915: Performed deferred clflush inside set-cache-level")
References: https://bugs.freedesktop.org/show_bug.cgi?id=95414Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v4.0+
Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161107165204.7008-1-chris@chris-wilson.co.uk

7aa6ca61

03 11月, 2016 1 次提交

drm/i915: Introduce HAS_64BIT_RELOC · dfc5148f

由 Joonas Lahtinen 提交于 11月 03, 2016

Move has_64bit_reloc into dev_priv->info. This will make it visible
in the feature listing debug output.

v2:
- Keep the struct member to keep GCC fragile but happy (Chris)
v3:
- More detailed commit message (Chris)
- Include forgotten CHV and BXT (Chris)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1478162386-5018-1-git-send-email-joonas.lahtinen@linux.intel.com

dfc5148f

29 10月, 2016 4 次提交

drm/i915: Move GEM activity tracking into a common struct reservation_object · d07f0e59

由 Chris Wilson 提交于 10月 28, 2016

In preparation to support many distinct timelines, we need to expand the
activity tracking on the GEM object to handle more than just a request
per engine. We already use the struct reservation_object on the dma-buf
to handle many fence contexts, so integrating that into the GEM object
itself is the preferred solution. (For example, we can now share the same
reservation_object between every consumer/producer using this buffer and
skip the manual import/export via dma-buf.)

v2: Reimplement busy-ioctl (by walking the reservation object), postpone
the ABI change for another day. Similarly use the reservation object to
find the last_write request (if active and from i915) for choosing
display CS flips.

Caveats:

* busy-ioctl: busy-ioctl only reports on the native fences, it will not
warn of stalls (in set-domain-ioctl, pread/pwrite etc) if the object is
being rendered to by external fences. It also will not report the same
busy state as wait-ioctl (or polling on the dma-buf) in the same
circumstances. On the plus side, it does retain reporting of which
*i915* engines are engaged with this object.

* non-blocking atomic modesets take a step backwards as the wait for
render completion blocks the ioctl. This is fixed in a subsequent
patch to use a fence instead for awaiting on the rendering, see
"drm/i915: Restore nonblocking awaits for modesetting"

* dynamic array manipulation for shared-fences in reservation is slower
than the previous lockless static assignment (e.g. gem_exec_lut_handle
runtime on ivb goes from 42s to 66s), mainly due to atomic operations
(maintaining the fence refcounts).

* loss of object-level retirement callbacks, emulated by VMA retirement
tracking.

* minor loss of object-level last activity information from debugfs,
could be replaced with per-vma information if desired
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-21-chris@chris-wilson.co.uk

d07f0e59

drm/i915: Refactor object page API · a4f5ea64

由 Chris Wilson 提交于 10月 28, 2016

The plan is to make obtaining the backing storage for the object avoid
struct_mutex (i.e. use its own locking). The first step is to update the
API so that normal users only call pin/unpin whilst working on the
backing storage.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-12-chris@chris-wilson.co.uk

a4f5ea64

drm/i915: Defer active reference until required · f8a7fde4

由 Chris Wilson 提交于 10月 28, 2016

We only need the active reference to keep the object alive after the
handle has been deleted (so as to prevent a synchronous gem_close). Why
then pay the price of a kref on every execbuf when we can insert that
final active ref just in time for the handle deletion?
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-6-chris@chris-wilson.co.uk

f8a7fde4

drm/i915: Support asynchronous waits on struct fence from i915_gem_request · b52992c0

由 Chris Wilson 提交于 10月 28, 2016

We will need to wait on DMA completion (as signaled via struct fence)
before executing our i915_gem_request. Therefore we want to expose a
method for adding the await on the fence itself to the request.

v2: Add a comment detailing a failure to handle a signal-on-any
fence-array.
v3: Pretend that magic numbers don't exist.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-1-chris@chris-wilson.co.uk

b52992c0

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功