提交 · 506a8e87d8d2746b9e9d2433503fe237c54e4750 · openeuler / Kernel

09 12月, 2015 1 次提交

drm/i915: Add soft-pinning API for execbuffer · 506a8e87

由 Chris Wilson 提交于 12月 08, 2015

Userspace can pass in an offset that it presumes the object is located
at. The kernel will then do its utmost to fit the object into that
location. The assumption is that userspace is handling its own object
locations (for example along with full-ppgtt) and that the kernel will
rarely have to make space for the user's requests.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>

v2: Fixed incorrect eviction found by Michal Winiarski - fix suggested by Chris
Wilson.  Fixed incorrect error paths causing crash found by Michal Winiarski.
(Not published externally)

v3: Rebased because of trivial conflict in object_bind_to_vm.  Fixed eviction
to allow eviction of soft-pinned objects when another soft-pinned object used
by a subsequent execbuffer overlaps reported by Michal Winiarski.
(Not published externally)

v4: Moved soft-pinned objects to the front of ordered_vmas so that they are
pinned first after an address conflict happens to avoid repeated conflicts in
rare cases (Suggested by Chris Wilson).  Expanded comment on
drm_i915_gem_exec_object2.offset to cover this new API.

v5: Added I915_PARAM_HAS_EXEC_SOFTPIN parameter for detecting this capability
(Kristian). Added check for multiple pinnings on eviction (Akash). Made sure
buffers are not considered misplaced without the user specifying
EXEC_OBJECT_SUPPORTS_48B_ADDRESS.  User must assume responsibility for any
addressing workarounds.  Updated object2.offset field comment again to clarify
NO_RELOC case (Chris).  checkpatch cleanup.

v6: Trivial rebase on latest drm-intel-nightly

v7: Catch attempts to pin above the max virtual address size and return
EINVAL (Tvrtko). Decouple EXEC_OBJECT_SUPPORTS_48B_ADDRESS and
EXEC_OBJECT_PINNED flags, user must pass both flags in any attempt to pin
something at an offset above 4GB (Chris, Daniel Vetter).

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Akash Goel <akash.goel@intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Zou Nanhai <nanhai.zou@intel.com>
Cc: Kristian Høgsberg <hoegsberg@gmail.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: NMichel Thierry <michel.thierry@intel.com>
Acked-by: PDT
Signed-off-by: NThomas Daniel <thomas.daniel@intel.com>
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1449575707-20933-1-git-send-email-thomas.daniel@intel.com

506a8e87

18 11月, 2015 1 次提交

drm/i915: Add functions to emit register offsets to the ring · f92a9162

由 Ville Syrjälä 提交于 11月 04, 2015

When register type safety happens, we can't just try to emit the
register itself to the ring. Instead we'll need to extract the
offset from it first. Add some convenience functions that will do
that.

v2: Convert MOCS setup too
Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1446672017-24497-20-git-send-email-ville.syrjala@linux.intel.comReviewed-by: NChris Wilson <chris@chris-wilson.co.uk>

f92a9162

07 10月, 2015 1 次提交

drm/i915: Kill DRI1 cliprects · 2f5945bc

由 Chris Wilson 提交于 10月 06, 2015

Passing cliprects into the kernel for it to re-execute the batch buffer
with different CMD_DRAWRECT died out long ago. As DRI1 support has been
removed from the kernel, we can now simply reject any execbuf trying to
use this "feature".

To keep Daniel happy with the prospect of being able to reuse these
fields in the next decade, continue to ensure that current userspace is
not passing garbage in through the dead fields.

v2: Fix the cliprects_ptr check
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NDave Gordon <david.s.gordon@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

2f5945bc

02 10月, 2015 1 次提交

drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset · 101b506a

由 Michel Thierry 提交于 10月 01, 2015

There are some allocations that must be only referenced by 32-bit
offsets. To limit the chances of having the first 4GB already full,
objects not requiring this workaround use DRM_MM_SEARCH_BELOW/
DRM_MM_CREATE_TOP flags

In specific, any resource used with flat/heapless (0x00000000-0xfffff000)
General State Heap (GSH) or Instruction State Heap (ISH) must be in a
32-bit range, because the General State Offset and Instruction State
Offset are limited to 32-bits.

Objects must have EXEC_OBJECT_SUPPORTS_48B_ADDRESS flag to indicate if
they can be allocated above the 32-bit address range. To limit the
chances of having the first 4GB already full, objects will use
DRM_MM_SEARCH_BELOW + DRM_MM_CREATE_TOP flags when possible.

The libdrm user of the EXEC_OBJECT_SUPPORTS_48B_ADDRESS flag is here:
http://lists.freedesktop.org/archives/intel-gfx/2015-September/075836.html

v2: Changed flag logic from neeeds_32b, to supports_48b.
v3: Moved 48-bit support flag back to exec_object. (Chris, Daniel)
v4: Split pin flags into PIN_ZONE_4G and PIN_HIGH; update PIN_OFFSET_MASK
to use last PIN_ defined instead of hard-coded value; use correct limit
check in eb_vma_misplaced. (Chris)
v5: Don't touch PIN_OFFSET_MASK and update workaround comment (Chris)
v6: Apply pin-high for ggtt too (Chris)
v7: Handle simultaneous pin-high and pin-mappable end correctly (Akash)
    Fix check for entries currently using +4GB addresses, use min_t and
    other polish in object_bind_to_vm (Chris)
v8: Commit message updated to point to libdrm patch.
v9: vmas are allocated in the correct ozone, so only check flag when the
    vma has not been allocated. (Chris)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v4)
Signed-off-by: NMichel Thierry <michel.thierry@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

101b506a

14 9月, 2015 1 次提交

drm/i915: Split alloc from init for lrc · e84fe803

由 Nick Hoath 提交于 9月 11, 2015

Extend init/init_hw split to context init.
   - Move context initialisation in to i915_gem_init_hw
   - Move one off initialisation for render ring to
        i915_gem_validate_context
   - Move default context initialisation to logical_ring_init

Rename intel_lr_context_deferred_create to
intel_lr_context_deferred_alloc, to reflect reduced functionality &
alloc/init split.

This patch is intended to split out the allocation of resources &
initialisation to allow easier reuse of code for resume/gpu reset.

v2: Removed function ptr wrapping of do_switch_context (Daniel Vetter)
    Left ->init_context int intel_lr_context_deferred_alloc
    (Daniel Vetter)
    Remove unnecessary init flag & ring type test. (Daniel Vetter)
    Improve commit message (Daniel Vetter)
v3: On init/reinit, set the hw next sequence number to the sw next
    sequence number. This is set to 1 at driver load time. This prevents
    the seqno being reset on reinit (Chris Wilson)
v4: Set seqno back to ~0 - 0x1000 at start-of-day, and increment by 0x100
    on reset.
    This makes it obvious which bbs are which after a reset. (David Gordon
    & John Harrison)
    Rebase.
v5: Rebase. Fixed rebase breakage. Put context pinning in separate
    function. Removed code churn. (Thomas Daniel)
v6: Cleanup up issues introduced in v2 & v5 (Thomas Daniel)

Issue: VIZ-4798
Signed-off-by: NNick Hoath <nicholas.hoath@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: John Harrison <john.c.harrison@intel.com>
Cc: David Gordon <david.s.gordon@intel.com>
Cc: Thomas Daniel <thomas.daniel@intel.com>
Reviewed-by: NThomas Daniel <thomas.daniel@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

e84fe803

02 9月, 2015 1 次提交

drm/i915: Always mark the object as dirty when used by the GPU · 51bc1404

由 Chris Wilson 提交于 8月 31, 2015

There have been many hard to track down bugs whereby userspace forgot to
flag a write buffer and then cause graphics corruption or a hung GPU
when that buffer was later purged under memory pressure (as the buffer
appeared clean, its pages would have been evicted rather than preserved
and any changes more recent than in the backing storage would be lost).
In retrospect this is a rare optimisation against memory pressure,
already the slow path. If we always mark the buffer as dirty when
accessed by the GPU, anything not used can still be evicted cheaply
(ideal behaviour for mark-and-sweep eviction) but we do not run the risk
of corruption. For correct read serialisation, userspace still has to
notify when the GPU writes to an object. However, there are certain
situations under which userspace may wish to tell white lies to the
kernel...
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Kristian Høgsberg <krh@bitplanet.net>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: "Goel, Akash" <akash.goel@intel.co>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: NJani Nikula <jani.nikula@intel.com>

51bc1404

06 7月, 2015 1 次提交

drm/i915: Expose I915_EXEC_RESOURCE_STREAMER flag and getparam · a9ed33ca

由 Abdiel Janulgue 提交于 7月 01, 2015

Ensures that the batch buffer is executed by the resource streamer.
And will let userspace know whether Resource Streamer is supported in
the kernel.

v2: Don't skip 1<<15 for the exec flags (Jani Nikula)
v3: Use HAS_RESOURCE_STREAMER macro for execbuf validation (Chris Wilson)

(from getparam patch)

v2: Update I915_PARAM_HAS_RESOURCE_STREAMER so it's after
    I915_PARAM_HAS_GPU_RESET.
v3: Only advertise RS support for hardware that supports it.
v4: Add HAS_RESOURCE_STREAMER() macro (Chris)

Testcase: igt/gem_exec_params
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NAbdiel Janulgue <abdiel.janulgue@linux.intel.com>
[danvet: squash in getparam patch since it'd break bisect, suggested
by Chris.]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

a9ed33ca

23 6月, 2015 19 次提交

drm/i915: Move the request/file and request/pid association to creation time · fcfa423c

由 John Harrison 提交于 5月 29, 2015

In _i915_add_request(), the request is associated with a userland client.
Specifically it is linked to the 'file' structure and the current user process
is recorded. One problem here is that the current user process is not
necessarily the same as when the request was submitted to the driver. This is
especially true when the GPU scheduler arrives and decouples driver submission
from hardware submission. Note also that it is only in the case where the add
request comes from an execbuff call that there is a client to associate. Any
other add request call is kernel only so does not need to do it.

This patch moves the client association into a separate function. This is then
called from the execbuffer code path itself at a sensible time. It also removes
the now redundant 'file' pointer from the add request parameter list.

An extra cleanup of the client association is also added to the request clean up
code for the eventuality where the request is killed after association but
before being submitted (e.g. due to out of memory error somewhere). Once the
submission has happened, the request is on the request list and the regular
request list removal will clear the association. Note that this still needs to
happen at this point in time because the request might be kept floating around
much longer (due to someone holding a reference count) and the client should not
be worrying about this request after it has been retired.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

fcfa423c

drm/i915: Remove the now obsolete 'outstanding_lazy_request' · bccca494

由 John Harrison 提交于 5月 29, 2015

The outstanding_lazy_request is no longer used anywhere in the driver.
Everything that was looking at it now has a request explicitly passed in from on
high. Everything that was relying upon it behind the scenes is now explicitly
creating/passing/submitting its own private request. Thus the OLR can be
removed.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

bccca494

drm/i915: Update intel_ring_begin() to take a request structure · 5fb9de1a

由 John Harrison 提交于 5月 29, 2015

Now that everything above has been converted to use requests, intel_ring_begin()
can be updated to take a request instead of a ring. This also means that it no
longer needs to lazily allocate a request if no-one happens to have done it
earlier.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

5fb9de1a

drm/i915: Update ring->dispatch_execbuffer() to take a request structure · 53fddaf7

由 John Harrison 提交于 5月 29, 2015

Updated the various ring->dispatch_execbuffer() implementations to take a
request instead of a ring.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

53fddaf7

drm/i915: Update a bunch of execbuffer helpers to take request structures · 2f20055d

由 John Harrison 提交于 5月 29, 2015

Updated *_ring_invalidate_all_caches(), i915_reset_gen7_sol_offsets() and
i915_emit_box() to take request structures instead of ring or ringbuf/context
pairs.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

2f20055d

drm/i915: Update [vma|object]_move_to_active() to take request structures · b2af0376

由 John Harrison 提交于 5月 29, 2015

Now that everything above has been converted to use request structures, it is
possible to update the lower level move_to_active() functions to be request
based as well.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

b2af0376

drm/i915: Update add_request() to take a request structure · 75289874

由 John Harrison 提交于 5月 29, 2015

Now that all callers of i915_add_request() have a request pointer to hand, it is
possible to update the add request function to take a request pointer rather
than pulling it out of the OLR.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

75289874

drm/i915: Update i915_gem_object_sync() to take a request structure · 91af127f

由 John Harrison 提交于 6月 18, 2015

The plan is to pass requests around as the basic submission tracking structure
rather than rings and contexts. This patch updates the i915_gem_object_sync()
code path.

v2: Much more complex patch to share a single request between the sync and the
page flip. The _sync() function now supports lazy allocation of the request
structure. That is, if one is passed in then that will be used. If one is not,
then a request will be allocated and passed back out. Note that the _sync() code
does not necessarily require a request. Thus one will only be created until
certain situations. The reason the lazy allocation must be done within the
_sync() code itself is because the decision to need one or not is not really
something that code above can second guess (except in the case where one is
definitely not required because no ring is passed in).

The call chains above _sync() now support passing a request through which most
callers passing in NULL and assuming that no request will be required (because
they also pass in NULL for the ring and therefore can't be generating any ring
code).

The exeception is intel_crtc_page_flip() which now supports having a request
returned from _sync(). If one is, then that request is shared by the page flip
(if the page flip is of a type to need a request). If _sync() does not generate
a request but the page flip does need one, then the page flip path will create
its own request.

v3: Updated comment description to be clearer about 'to_req' parameter (Tomas
Elf review request). Rebased onto newer tree that significantly changed the
synchronisation code.

v4: Updated comments from review feedback (Tomas Elf)

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

91af127f

drm/i915: Update i915_switch_context() to take a request structure · ba01cc93

由 John Harrison 提交于 5月 29, 2015

Now that the request is guaranteed to specify the context, it is possible to
update the context switch code to use requests rather than ring and context
pairs. This patch updates i915_switch_context() accordingly.

Also removed the warning that the request's context must match the last context
switch's context. As the context switch now gets the context object from the
request structure, there is no longer any scope for the two to become out of
step.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

ba01cc93

drm/i915: Add flag to i915_add_request() to skip the cache flush · 5b4a60c2

由 John Harrison 提交于 5月 29, 2015

In order to explcitly track all GPU work (and completely remove the outstanding
lazy request), it is necessary to add extra i915_add_request() calls to various
places. Some of these do not need the implicit cache flush done as part of the
standard batch buffer submission process.

This patch adds a flag to _add_request() to specify whether the flush is
required or not.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

5b4a60c2

drm/i915: Update execbuffer_move_to_active() to take a request structure · 8a8edb59

由 John Harrison 提交于 5月 29, 2015

The plan is to pass requests around as the basic submission tracking structure
rather than rings and contexts. This patch updates the
execbuffer_move_to_active() code path.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

8a8edb59

drm/i915: Update move_to_gpu() to take a request structure · 535fbe82

由 John Harrison 提交于 5月 29, 2015

The plan is to pass requests around as the basic submission tracking structure
rather than rings and contexts. This patch updates the move_to_gpu() code paths.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

535fbe82

drm/i915: Update the dispatch tracepoint to use params->request · 95c24161

由 John Harrison 提交于 5月 29, 2015

Updated a couple of trace points to use the now cached request pointer rather
than extracting it from the ring.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

95c24161

drm/i915: Add request to execbuf params and add explicit cleanup · 6a6ae79a

由 John Harrison 提交于 5月 29, 2015

Rather than just having a local request variable in the execbuff code, the
request pointer is now stored in the execbuff params structure. Also added
explicit cleanup of the request (plus wiping the OLR to match) in the error
case. This means that the execbuff code is no longer dependent upon the OLR
keeping track of the request so as to not leak it when things do go wrong. Note
that in the success case, the i915_add_request() at the end of the submission
function will tidy up the request and clear the OLR.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

6a6ae79a

drm/i915: Update alloc_request to return the allocated request · 217e46b5

由 John Harrison 提交于 5月 29, 2015

The alloc_request() function does not actually return the newly allocated
request. Instead, it must be pulled from ring->outstanding_lazy_request. This
patch fixes this so that code can create a request and start using it knowing
exactly which request it actually owns.

v2: Updated for new i915_gem_request_alloc() scheme.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

217e46b5

drm/i915: Simplify i915_gem_execbuffer_retire_commands() parameters · adeca76d

由 John Harrison 提交于 5月 29, 2015

Shrunk the parameter list of i915_gem_execbuffer_retire_commands() to a single
structure as everything it requires is available in the execbuff_params object.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

adeca76d

drm/i915: Merged the many do_execbuf() parameters into a structure · 5f19e2bf

由 John Harrison 提交于 5月 29, 2015

The do_execbuf() function takes quite a few parameters. The actual set of
parameters is going to change with the conversion to passing requests around.
Further, it is due to grow massively with the arrival of the GPU scheduler.

This patch simplifies the prototype by passing a parameter structure instead.
Changing the parameter set in the future is then simply a matter of
adding/removing items to the structure.

Note that the structure does not contain absolutely everything that is passed
in. This is because the intention is to use this structure more extensively
later in this patch series and more especially in the GPU scheduler that is
coming soon. The latter requires hanging on to the structure as the final
hardware submission can be delayed until long after the execbuf IOCTL has
returned to user land. Thus it is unsafe to put anything in the structure that
is local to the IOCTL call itself - such as the 'args' parameter. All entries
must be copies of data or pointers to structures that are reference counted in
some way and guaranteed to exist for the duration of the batch buffer's life.

v2: Rebased to newer tree and updated for changes to the command parser.
Specifically, a code shuffle has required saving the batch start address in the
params structure.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

5f19e2bf

drm/i915: Early alloc request in execbuff · 0c8dac88

由 John Harrison 提交于 5月 29, 2015

Start of explicit request management in the execbuffer code path. This patch
adds a call to allocate a request structure before all the actual hardware work
is done. Thus guaranteeing that all that work is tagged by a known request. At
present, nothing further is done with the request, the rest comes later in the
series.

The only noticable change is that failure to get a request (e.g. due to lack of
memory) will be caught earlier in the sequence. It now occurs right at the start
before any un-undoable work has been done.

v2: Simplified the error handling path.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

0c8dac88

drm/i915: i915_add_request must not fail · bf7dc5b7

由 John Harrison 提交于 5月 29, 2015

The i915_add_request() function is called to keep track of work that has been
written to the ring buffer. It adds epilogue commands to track progress (seqno
updates and such), moves the request structure onto the right list and other
such house keeping tasks. However, the work itself has already been written to
the ring and will get executed whether or not the add request call succeeds. So
no matter what goes wrong, there isn't a whole lot of point in failing the call.

At the moment, this is fine(ish). If the add request does bail early on and not
do the housekeeping, the request will still float around in the
ring->outstanding_lazy_request field and be picked up next time. It means
multiple pieces of work will be tagged as the same request and driver can't
actually wait for the first piece of work until something else has been
submitted. But it all sort of hangs together.

This patch series is all about removing the OLR and guaranteeing that each piece
of work gets its own personal request. That means that there is no more
'hoovering up of forgotten requests'. If the request does not get tracked then
it will be leaked. Thus the add request call _must_ not fail. The previous patch
should have already ensured that it _will_ not fail by removing the potential
for running out of ring space. This patch enforces the rule by actually removing
the early exit paths and the return code.

Note that if something does manage to fail and the epilogue commands don't get
written to the ring, the driver will still hang together. The request will be
added to the tracking lists. And as in the old case, any subsequent work will
generate a new seqno which will suffice for marking the old one as complete.

v2: Improved WARNings (Tomas Elf review request).

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

bf7dc5b7

22 6月, 2015 2 次提交

drm/i915: Enforce execobject.alignment to be a power-of-two · 55a9785d

由 Chris Wilson 提交于 6月 19, 2015

Internal requirement for the alignment is that it must be a
power-of-two, so enforce rejection at the user interface to execbuffer
(which allows the caller to specify a stricter-than-expected alignment
criterion).
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

55a9785d

drm/i915: Remove unused ring argument from frontbuffer invalidate and busy functions. · 77a0d1ca

由 Rodrigo Vivi 提交于 6月 18, 2015

This patch doesn't have any functional change, but organize fruntbuffer
invalidate and busy by removing unecesarry signature argument for ring.

It was unsed on mark_fb_busy and only used on fb_obj_invalidate for the
same ORIGIN_CS usage. So let's clean it a bit

Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

77a0d1ca

29 5月, 2015 1 次提交

drm/i915: add a context parameter to {en, dis}able zero address mapping · b1b38278

由 David Weinehall 提交于 5月 20, 2015

Export a new context parameter that can be set/queried through the
context_{get,set}param ioctls.  This parameter is passed as a context
flag and decides whether or not a GPU address mapping is allowed to
be made at address zero.  The default is to allow such mappings.
Signed-off-by: NDavid Weinehall <david.weinehall@intel.com>
Acked-by: N"Zou, Nanhai" <nanhai.zou@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

b1b38278

21 5月, 2015 1 次提交

drm/i915: Inline check required for object syncing prior to execbuf · 03ade511

由 Chris Wilson 提交于 4月 27, 2015

This trims a little overhead from the common case of not needing to
synchronize between rings.

v2: execlists is special and likes to duplicate code.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

03ade511

19 5月, 2015 1 次提交

mm/fault, drm/i915: Use pagefault_disabled() to check for disabled pagefaults · 32d82067

由 David Hildenbrand 提交于 5月 11, 2015

Now that the pagefault disabled counter is in place, we can replace
the in_atomic() check by a pagefault_disabled() checks.
Reviewed-and-tested-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: David.Laight@ACULAB.COM
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: airlied@linux.ie
Cc: akpm@linux-foundation.org
Cc: benh@kernel.crashing.org
Cc: bigeasy@linutronix.de
Cc: borntraeger@de.ibm.com
Cc: daniel.vetter@intel.com
Cc: heiko.carstens@de.ibm.com
Cc: herbert@gondor.apana.org.au
Cc: hocko@suse.cz
Cc: hughd@google.com
Cc: mst@redhat.com
Cc: paulus@samba.org
Cc: ralf@linux-mips.org
Cc: schwidefsky@de.ibm.com
Cc: yang.shi@windriver.com
Link: http://lkml.kernel.org/r/1431359540-32227-8-git-send-email-dahi@linux.vnet.ibm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

32d82067

08 5月, 2015 2 次提交

drm/i915: Fix possible security hole in command parsing · c7c7372e

由 Rebecca N. Palmer 提交于 5月 08, 2015

i915_parse_cmds returns -EACCES on chained batches, which "tells the
caller to abort and dispatch the workload as a non-secure batch",
but the mechanism implementing that was broken when
flags |= I915_DISPATCH_SECURE was moved from i915_gem_execbuffer_parse
to i915_gem_do_execbuffer (17cabf57):
i915_gem_execbuffer_parse returns the original batch_obj in this case,
and i915_gem_do_execbuffer doesn't check for that.

Don't set the secure bit in this case to make sure such batches don't
run with elevated priviledges.
Signed-off-by: NRebecca Palmer <rebecca_palmer@zoho.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
[danvet: Stitch together commit message. Also remove a comment as
suggested by Mika. And style-align the comment while at it.]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

c7c7372e

drm/i915: Simplify cmd-parser DISPATCH_SECURE check · c6b8a4bc

由 Daniel Vetter 提交于 5月 04, 2015

i915_needs_cmd_parser already checks that for us.
Suggested-by: NMika Kuoppala <mika.kuoppala@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com>

c6b8a4bc

30 4月, 2015 1 次提交

drm/i915: Enable cmd parser to do secure batch promotion for aliasing ppgtt · 245054a1

由 Daniel Vetter 提交于 4月 14, 2015

With the binding regression from the original full ppgtt patches
fixed we can throw the switch. Yay!

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90190Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
[Jani: tweaked commit title per Chris' suggestion]
Signed-off-by: NJani Nikula <jani.nikula@intel.com>

245054a1

24 4月, 2015 3 次提交

drm/i915: Fix up the vma aliasing ppgtt binding · 0875546c

由 Daniel Vetter 提交于 4月 20, 2015

Currently we have the problem that the decision whether ptes need to
be (re)written is splattered all over the codebase. Move all that into
i915_vma_bind. This needs a few changes:
- Just reuse the PIN_* flags for i915_vma_bind and do the conversion
  to vma->bound in there to avoid duplicating the conversion code all
  over.
- We need to make binding for EXECBUF (i.e. pick aliasing ppgtt if
  around) explicit, add PIN_USER for that.
- Two callers want to update ptes, give them a PIN_UPDATE for that.

Of course we still want to avoid double-binding, but that should be
taken care of:
- A ppgtt vma will only ever see PIN_USER, so no issue with
  double-binding.
- A ggtt vma with aliasing ppgtt needs both types of binding, and we
  track that properly now.
- A ggtt vma without aliasing ppgtt could be bound twice. In the
  lower-level ->bind_vma functions hence unconditionally set
  GLOBAL_BIND when writing the ggtt ptes.

There's still a bit room for cleanup, but that's for follow-up
patches.

v2: Fixup fumbles.

v3: s/PIN_EXECBUF/PIN_USER/ for clearer meaning, suggested by Chris.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com>

0875546c

drm/i915: Don't use atomics for pg_dirty_rings · 9258811c

由 Daniel Vetter 提交于 4月 14, 2015

It's already protected by the bkl^Wdev->struct_mutex. While at it
realign some related code.
Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com>

9258811c

drm/i915: Don't look at pg_dirty_rings for aliasing ppgtt · 71b7e54f

由 Daniel Vetter 提交于 4月 14, 2015

We load the ppgtt ptes once per gpu reset/driver load/resume and
that's all that's needed. Note that this only blows up when we're
using the allocate_va_range funcs and not the special-purpose ones
used. With this change we can get rid of that duplication.
Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com>

71b7e54f

20 4月, 2015 1 次提交

drm/i915: Dont clear PIN_GLOBAL in the execbuf pinning fallback · 0229da32

由 Daniel Vetter 提交于 4月 14, 2015

PIN_GLOBAL is set only when userspace asked for it, and that
is only the case for the gen6 PIPE_CONTROL workaround. We're not
allowed to just clear this.

The important part of the fallback is to drop the restriction to
the mappable range.

This issue has been introduced in

commit edf4427b
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Jan 14 11:20:56 2015 +0000

    drm/i915: Fallback to using CPU relocations for large batch buffers

v2: Chris pointed out that we also miss to set PIN_GLOBAL when the
buffer is already bound. Fix this up too.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

0229da32

10 4月, 2015 2 次提交

drm/i915: Split the batch pool by engine · 06fbca71

由 Chris Wilson 提交于 4月 07, 2015

I woke up one morning and found 50k objects sitting in the batch pool
and every search seemed to iterate the entire list... Painting the
screen in oils would provide a more fluid display.

One issue with the current design is that we only check for retirements
on the current ring when preparing to submit a new batch. This means
that we can have thousands of "active" batches on another ring that we
have to walk over. The simplest way to avoid that is to split the pools
per ring and then our LRU execution ordering will also ensure that the
inactive buffers remain at the front.

v2: execlists still requires duplicate code.
v3: execlists requires more duplicate code
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

06fbca71

drm/i915: Tidy batch pool logic · de4e783a

由 Chris Wilson 提交于 4月 07, 2015

Move the madvise logic out of the execbuffer main path into the
relatively rare allocation path, making the execbuffer manipulation less
fragile.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

de4e783a

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功