提交 · 217e46b576ef0d5eed10ddfeb2b29bd3de289e95 · openeuler / raspberrypi-kernel

23 6月, 2015 5 次提交

drm/i915: Update alloc_request to return the allocated request · 217e46b5

由 John Harrison 提交于 5月 29, 2015

The alloc_request() function does not actually return the newly allocated
request. Instead, it must be pulled from ring->outstanding_lazy_request. This
patch fixes this so that code can create a request and start using it knowing
exactly which request it actually owns.

v2: Updated for new i915_gem_request_alloc() scheme.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

217e46b5

drm/i915: Simplify i915_gem_execbuffer_retire_commands() parameters · adeca76d

由 John Harrison 提交于 5月 29, 2015

Shrunk the parameter list of i915_gem_execbuffer_retire_commands() to a single
structure as everything it requires is available in the execbuff_params object.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

adeca76d

drm/i915: Merged the many do_execbuf() parameters into a structure · 5f19e2bf

由 John Harrison 提交于 5月 29, 2015

The do_execbuf() function takes quite a few parameters. The actual set of
parameters is going to change with the conversion to passing requests around.
Further, it is due to grow massively with the arrival of the GPU scheduler.

This patch simplifies the prototype by passing a parameter structure instead.
Changing the parameter set in the future is then simply a matter of
adding/removing items to the structure.

Note that the structure does not contain absolutely everything that is passed
in. This is because the intention is to use this structure more extensively
later in this patch series and more especially in the GPU scheduler that is
coming soon. The latter requires hanging on to the structure as the final
hardware submission can be delayed until long after the execbuf IOCTL has
returned to user land. Thus it is unsafe to put anything in the structure that
is local to the IOCTL call itself - such as the 'args' parameter. All entries
must be copies of data or pointers to structures that are reference counted in
some way and guaranteed to exist for the duration of the batch buffer's life.

v2: Rebased to newer tree and updated for changes to the command parser.
Specifically, a code shuffle has required saving the batch start address in the
params structure.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

5f19e2bf

drm/i915: Early alloc request in execbuff · 0c8dac88

由 John Harrison 提交于 5月 29, 2015

Start of explicit request management in the execbuffer code path. This patch
adds a call to allocate a request structure before all the actual hardware work
is done. Thus guaranteeing that all that work is tagged by a known request. At
present, nothing further is done with the request, the rest comes later in the
series.

The only noticable change is that failure to get a request (e.g. due to lack of
memory) will be caught earlier in the sequence. It now occurs right at the start
before any un-undoable work has been done.

v2: Simplified the error handling path.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

0c8dac88

drm/i915: i915_add_request must not fail · bf7dc5b7

由 John Harrison 提交于 5月 29, 2015

The i915_add_request() function is called to keep track of work that has been
written to the ring buffer. It adds epilogue commands to track progress (seqno
updates and such), moves the request structure onto the right list and other
such house keeping tasks. However, the work itself has already been written to
the ring and will get executed whether or not the add request call succeeds. So
no matter what goes wrong, there isn't a whole lot of point in failing the call.

At the moment, this is fine(ish). If the add request does bail early on and not
do the housekeeping, the request will still float around in the
ring->outstanding_lazy_request field and be picked up next time. It means
multiple pieces of work will be tagged as the same request and driver can't
actually wait for the first piece of work until something else has been
submitted. But it all sort of hangs together.

This patch series is all about removing the OLR and guaranteeing that each piece
of work gets its own personal request. That means that there is no more
'hoovering up of forgotten requests'. If the request does not get tracked then
it will be leaked. Thus the add request call _must_ not fail. The previous patch
should have already ensured that it _will_ not fail by removing the potential
for running out of ring space. This patch enforces the rule by actually removing
the early exit paths and the return code.

Note that if something does manage to fail and the epilogue commands don't get
written to the ring, the driver will still hang together. The request will be
added to the tracking lists. And as in the old case, any subsequent work will
generate a new seqno which will suffice for marking the old one as complete.

v2: Improved WARNings (Tomas Elf review request).

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

bf7dc5b7

22 6月, 2015 2 次提交

drm/i915: Enforce execobject.alignment to be a power-of-two · 55a9785d

由 Chris Wilson 提交于 6月 19, 2015

Internal requirement for the alignment is that it must be a
power-of-two, so enforce rejection at the user interface to execbuffer
(which allows the caller to specify a stricter-than-expected alignment
criterion).
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

55a9785d

drm/i915: Remove unused ring argument from frontbuffer invalidate and busy functions. · 77a0d1ca

由 Rodrigo Vivi 提交于 6月 18, 2015

This patch doesn't have any functional change, but organize fruntbuffer
invalidate and busy by removing unecesarry signature argument for ring.

It was unsed on mark_fb_busy and only used on fb_obj_invalidate for the
same ORIGIN_CS usage. So let's clean it a bit

Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

77a0d1ca

29 5月, 2015 1 次提交

drm/i915: add a context parameter to {en, dis}able zero address mapping · b1b38278

由 David Weinehall 提交于 5月 20, 2015

Export a new context parameter that can be set/queried through the
context_{get,set}param ioctls.  This parameter is passed as a context
flag and decides whether or not a GPU address mapping is allowed to
be made at address zero.  The default is to allow such mappings.
Signed-off-by: NDavid Weinehall <david.weinehall@intel.com>
Acked-by: N"Zou, Nanhai" <nanhai.zou@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

b1b38278

21 5月, 2015 1 次提交

drm/i915: Inline check required for object syncing prior to execbuf · 03ade511

由 Chris Wilson 提交于 4月 27, 2015

This trims a little overhead from the common case of not needing to
synchronize between rings.

v2: execlists is special and likes to duplicate code.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

03ade511

08 5月, 2015 2 次提交

drm/i915: Fix possible security hole in command parsing · c7c7372e

由 Rebecca N. Palmer 提交于 5月 08, 2015

i915_parse_cmds returns -EACCES on chained batches, which "tells the
caller to abort and dispatch the workload as a non-secure batch",
but the mechanism implementing that was broken when
flags |= I915_DISPATCH_SECURE was moved from i915_gem_execbuffer_parse
to i915_gem_do_execbuffer (17cabf57):
i915_gem_execbuffer_parse returns the original batch_obj in this case,
and i915_gem_do_execbuffer doesn't check for that.

Don't set the secure bit in this case to make sure such batches don't
run with elevated priviledges.
Signed-off-by: NRebecca Palmer <rebecca_palmer@zoho.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
[danvet: Stitch together commit message. Also remove a comment as
suggested by Mika. And style-align the comment while at it.]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

c7c7372e

drm/i915: Simplify cmd-parser DISPATCH_SECURE check · c6b8a4bc

由 Daniel Vetter 提交于 5月 04, 2015

i915_needs_cmd_parser already checks that for us.
Suggested-by: NMika Kuoppala <mika.kuoppala@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com>

c6b8a4bc

30 4月, 2015 1 次提交

drm/i915: Enable cmd parser to do secure batch promotion for aliasing ppgtt · 245054a1

由 Daniel Vetter 提交于 4月 14, 2015

With the binding regression from the original full ppgtt patches
fixed we can throw the switch. Yay!

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90190Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
[Jani: tweaked commit title per Chris' suggestion]
Signed-off-by: NJani Nikula <jani.nikula@intel.com>

245054a1

24 4月, 2015 3 次提交

drm/i915: Fix up the vma aliasing ppgtt binding · 0875546c

由 Daniel Vetter 提交于 4月 20, 2015

Currently we have the problem that the decision whether ptes need to
be (re)written is splattered all over the codebase. Move all that into
i915_vma_bind. This needs a few changes:
- Just reuse the PIN_* flags for i915_vma_bind and do the conversion
  to vma->bound in there to avoid duplicating the conversion code all
  over.
- We need to make binding for EXECBUF (i.e. pick aliasing ppgtt if
  around) explicit, add PIN_USER for that.
- Two callers want to update ptes, give them a PIN_UPDATE for that.

Of course we still want to avoid double-binding, but that should be
taken care of:
- A ppgtt vma will only ever see PIN_USER, so no issue with
  double-binding.
- A ggtt vma with aliasing ppgtt needs both types of binding, and we
  track that properly now.
- A ggtt vma without aliasing ppgtt could be bound twice. In the
  lower-level ->bind_vma functions hence unconditionally set
  GLOBAL_BIND when writing the ggtt ptes.

There's still a bit room for cleanup, but that's for follow-up
patches.

v2: Fixup fumbles.

v3: s/PIN_EXECBUF/PIN_USER/ for clearer meaning, suggested by Chris.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com>

0875546c

drm/i915: Don't use atomics for pg_dirty_rings · 9258811c

由 Daniel Vetter 提交于 4月 14, 2015

It's already protected by the bkl^Wdev->struct_mutex. While at it
realign some related code.
Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com>

9258811c

drm/i915: Don't look at pg_dirty_rings for aliasing ppgtt · 71b7e54f

由 Daniel Vetter 提交于 4月 14, 2015

We load the ppgtt ptes once per gpu reset/driver load/resume and
that's all that's needed. Note that this only blows up when we're
using the allocate_va_range funcs and not the special-purpose ones
used. With this change we can get rid of that duplication.
Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com>

71b7e54f

20 4月, 2015 1 次提交

drm/i915: Dont clear PIN_GLOBAL in the execbuf pinning fallback · 0229da32

由 Daniel Vetter 提交于 4月 14, 2015

PIN_GLOBAL is set only when userspace asked for it, and that
is only the case for the gen6 PIPE_CONTROL workaround. We're not
allowed to just clear this.

The important part of the fallback is to drop the restriction to
the mappable range.

This issue has been introduced in

commit edf4427b
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Jan 14 11:20:56 2015 +0000

    drm/i915: Fallback to using CPU relocations for large batch buffers

v2: Chris pointed out that we also miss to set PIN_GLOBAL when the
buffer is already bound. Fix this up too.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

0229da32

10 4月, 2015 2 次提交

drm/i915: Split the batch pool by engine · 06fbca71

由 Chris Wilson 提交于 4月 07, 2015

I woke up one morning and found 50k objects sitting in the batch pool
and every search seemed to iterate the entire list... Painting the
screen in oils would provide a more fluid display.

One issue with the current design is that we only check for retirements
on the current ring when preparing to submit a new batch. This means
that we can have thousands of "active" batches on another ring that we
have to walk over. The simplest way to avoid that is to split the pools
per ring and then our LRU execution ordering will also ensure that the
inactive buffers remain at the front.

v2: execlists still requires duplicate code.
v3: execlists requires more duplicate code
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

06fbca71

drm/i915: Tidy batch pool logic · de4e783a

由 Chris Wilson 提交于 4月 07, 2015

Move the madvise logic out of the execbuffer main path into the
relatively rare allocation path, making the execbuffer manipulation less
fragile.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

de4e783a

01 4月, 2015 1 次提交

drm/i915: Rename 'do_execbuf' to 'execbuf_submit' · f3dc74c0

由 John Harrison 提交于 3月 19, 2015

The submission portion of the execbuffer code path was abstracted into a
function pointer indirection as part of the legacy vs execlist work. The two
implementation functions are called 'i915_gem_ringbuffer_submission' and
'intel_execlists_submission' but the pointer was called 'do_execbuf'. There is
already a 'i915_gem_do_execbuffer' function (which is what calls the pointer
indirection). The name of the pointer is therefore considered to be backwards
and should be changed.

This patch renames it to 'execbuf_submit' which is hopefully a bit clearer.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

f3dc74c0

30 3月, 2015 1 次提交

drm/i915: Skip allocating shadow batch for 0-length batches · ee73c61c

由 Chris Wilson 提交于 3月 27, 2015

Since

commit 17cabf57
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Jan 14 11:20:57 2015 +0000

    drm/i915: Trim the command parser allocations

we may then try to allocate a zero-sized object and attempt to extract
its pages. Understandably this fails.

Note that the real offender seems to be

commit b9ffd80e
Author: Brad Volkin <bradley.d.volkin@intel.com>
Date:   Thu Dec 11 12:13:10 2014 -0800

    drm/i915: Use batch length instead of object size in command parser

Testcase: igt/gem_exec_nop #ivb,byt,hsw
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
[cherry picked from commit 743e78c1
from drm-intel-next because 4.0 seems to be affected by this too,
despite that the obvious culprit is definitely not in 4.0. Whatever,
if fixes a bug.
Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com>

ee73c61c

27 3月, 2015 1 次提交

drm/i915: Skip allocating shadow batch for 0-length batches · 743e78c1

由 Chris Wilson 提交于 3月 27, 2015

Since

commit 17cabf57
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Jan 14 11:20:57 2015 +0000

    drm/i915: Trim the command parser allocations

we may then try to allocate a zero-sized object and attempt to extract
its pages. Understandably this fails.

Testcase: igt/gem_exec_nop #ivb,byt,hsw
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

743e78c1

20 3月, 2015 2 次提交

drm/i915: Track page table reload need · 563222a7

由 Ben Widawsky 提交于 3月 19, 2015

This patch was formerly known as, "Force pd restore when PDEs change,
gen6-7." I had to change the name because it is needed for GEN8 too.

The real issue this is trying to solve is when a new object is mapped
into the current address space. The GPU does not snoop the new mapping
so we must do the gen specific action to reload the page tables.

GEN8 and GEN7 do differ in the way they load page tables for the RCS.
GEN8 does so with the context restore, while GEN7 requires the proper
load commands in the command streamer. Non-render is similar for both.

Caveat for GEN7
The docs say you cannot change the PDEs of a currently running context.
We never map new PDEs of a running context, and expect them to be
present - so I think this is okay. (We can unmap, but this should also
be okay since we only unmap unreferenced objects that the GPU shouldn't
be tryingto va->pa xlate.) The MI_SET_CONTEXT command does have a flag
to signal that even if the context is the same, force a reload. It's
unclear exactly what this does, but I have a hunch it's the right thing
to do.

The logic assumes that we always emit a context switch after mapping new
PDEs, and before we submit a batch. This is the case today, and has been
the case since the inception of hardware contexts. A note in the comment
let's the user know.

It's not just for gen8. If the current context has mappings change, we
need a context reload to switch

v2: Rebased after ppgtt clean up patches. Split the warning for aliasing
and true ppgtt options. And do not break aliasing ppgtt, where to->ppgtt
is always null.

v3: Invalidate PPGTT TLBs inside alloc_va_range.

v4: Rename ppgtt_invalidate_tlbs to mark_tlbs_dirty and move
pd_dirty_rings from i915_address_space to i915_hw_ppgtt. Fixes when
neither ctx->ppgtt and aliasing_ppgtt exist.

v5: Removed references to teardown_va_range.

v6: Updated needs_pd_load_pre/post.

v7: Fix pd_dirty_rings check in needs_pd_load_post, and update/move
comment about updated PDEs to object_pin/bind (Mika).

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

563222a7

drm/i915: Fallback to using CPU relocations for large batch buffers · edf4427b

由 Chris Wilson 提交于 1月 14, 2015

If the batch buffer is too large to fit into the aperture and we need a
GTT mapping for relocations, we currently fail. This only applies to a
subset of machines for a subset of environments, quite undesirable. We
can simply check after failing to insert the batch into the GTT as to
whether we only need a mappable binding for relocation and, if so, we can
revert to using a non-mappable binding and an alternate relocation
method. However, using relocate_entry_cpu() is excruciatingly slow for
large buffers on non-LLC as the entire buffer requires clflushing before
and after the relocation handling. Alternatively, we can implement a
third relocation method that only clflushes around the relocation entry.
This is still slower than updating through the GTT, so we prefer using
the GTT where possible, but is orders of magnitude faster as we
typically do not have to then clflush the entire buffer.

An alternative idea of using a temporary WC mapping of the backing store
is promising (it should be faster than using the GTT itself), but
requires fairly extensive arch/x86 support - along the lines of
kmap_atomic_prof_pfn() (which is not universally implemented even for
x86).

Testcase: igt/gem_exec_big #pnv,byt
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88392Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
[danvet: Add a WARN_ONCE for the impossible reloc case and explain in
a short comment why we want to avoid ping-pong.]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

edf4427b

18 3月, 2015 2 次提交

drm/i915: pass which operation triggered the frontbuffer tracking · a4001f1b

由 Paulo Zanoni 提交于 2月 13, 2015

We want to port FBC to the frontbuffer tracking infrastructure, but
for that we need to know what caused the object invalidation so
we can react accordingly: CPU mmaps need manual, GTT mmaps and
flips don't need handling and ring rendering needs nukes.

v2: - s/ORIGIN_RENDER/ORIGIN_CS/ (Daniel, Rodrigo)
    - Fix copy/pasted wrong documentation
    - Rebase
v3: - Rebase
v4: - Don't pass the operation to flushes (Daniel).
Signed-off-by: NPaulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

a4001f1b

drm/i915: Fix trivial typos in comments and warning message · fd0753cf

由 Yannick Guerrini 提交于 2月 28, 2015

Change 'mutliple' to 'multiple'
Change 'mutlipler' to 'multiplier'
Change 'Haswel' to 'Haswell'
Signed-off-by: NYannick Guerrini <yguerrini@tomshardware.fr>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

fd0753cf

26 2月, 2015 1 次提交

drm/i915: Rename 'flags' to 'dispatch_flags' for better code reading · 8e004efc

由 John Harrison 提交于 2月 13, 2015

There is a flags word that is passed through the execbuffer code path all the
way from initial decoding of the user parameters down to the very final dispatch
buffer call. It is simply called 'flags'. Unfortuantely, there are many other
flags words floating around in the same blocks of code. Even more once the GPU
scheduler arrives.

This patch makes it more obvious exactly which flags word is which by renaming
'flags' to 'dispatch_flags'. Note that the bit definitions for this flags word
already have an 'I915_DISPATCH_' prefix on them and so are not quite so
ambiguous.

OTC-Jira: VIZ-1587
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
[danvet: Resolve conflict with Chris' rework of the bb parsing.]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

8e004efc

24 2月, 2015 1 次提交

drm/i915: Trim the command parser allocations · 17cabf57

由 Chris Wilson 提交于 1月 14, 2015

Currently, the command parser tries to create a secondary batch exactly
as large as the original, and vmap both. This is open to abuse by
userspace using extremely large batch objects, but only executing very
short batches. For example, this would be if userspace were to implement
a command submission ringbuffer. However, we only need to allocate pages
for just the contents of the command sequence in the batch - all
relocations copied to the secondary batch will reference the original
batch and so there can be no access to the secondary batch outside of
the explicit execution region.

Testcase: igt/gem_exec_big #ivb,byt,hsw
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88308Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJohn Harrison <John.C.Harrison@Intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

17cabf57

27 1月, 2015 1 次提交

drm/i915: Specify bsd rings through exec flag · 8d360dff

由 Zhipeng Gong 提交于 1月 13, 2015

On Skylake GT3 we have 2 Video Command Streamers (VCS), which is asymmetrical.
For example, HEVC GPU commands can be only dispatched to VCS1 ring.
But userspace has no control when using VCS1 or VCS2. This patch introduces
a mechanism to avoid the default ping-pong mode and use one specific ring
through execution flag. This mechanism is usable for all the platforms
with 2 VCS rings.

The open source usage is from these two commits in vaapi/intel:
	commit 702050f04131a44ef8ac16651708ce8a8d98e4b8
	Author: Zhao, Yakui <yakui.zhao@intel.com>
	Date:   Mon Nov 17 12:44:19 2014 +0800

	    Allow the batchbuffer to be submitted with override flag

	commit a56efcdf27d11ad9b21664b4a2cda72d7f90f5a8
	Author: Zhao Yakui <yakui.zhao@intel.com>
	Date:   Mon Nov 17 12:44:22 2014 +0800

	    Add the override flag to assure that HEVC video command
		always uses BSD ring0 for SKL GT3 machine

v2: fix whitespace (Rodrigo)
v3: remove incorrect chunk that came on -collector rebase. (Rodrigo)
v4: change the comment (Zhipeng)
v5: address Daniel's comment (Zhipeng)
Signed-off-by: NZhipeng Gong <zhipeng.gong@intel.com>
Reviewed-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

8d360dff

08 1月, 2015 1 次提交

drm/i915: Reserve shadow batch VMA analogue to others · 7226572d

由 Tvrtko Ursulin 提交于 1月 07, 2015

If not pinned VMA can become an eviction target just before it needs to be
executed which breaks the internal object lifetime rules.
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87399Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

7226572d

24 12月, 2014 1 次提交

Revert "drm/gem: Warn on illegal use of the dumb buffer interface v2" · da6b51d0

由 Dave Airlie 提交于 12月 24, 2014

This reverts commit 355a7018.

This had some bad side effects under normal operation, and should
have been dropped earlier.
Signed-off-by: NDave Airlie <airlied@redhat.com>

da6b51d0

16 12月, 2014 4 次提交

drm/i915: Tidy up execbuffer command parsing code · 71745376

由 Brad Volkin 提交于 12月 11, 2014

Move it to a separate function since the main do_execbuffer function
already has so much going on.

v2:
- Move pin/unpin calls inside i915_parse_cmds() (Chris W, v4 7/7
  feedback)

Issue: VIZ-4719
Signed-off-by: NBrad Volkin <bradley.d.volkin@intel.com>
Reviewed-By: NJon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

71745376

drm/i915: Mark shadow batch buffers as purgeable · 0079a7df

由 Brad Volkin 提交于 12月 11, 2014

By adding a new exec_entry flag, we cleanly mark the shadow objects
as purgeable after they are on the active list.

v2:
- Move 'shadow_batch_obj->madv = I915_MADV_WILLNEED' inside _get
  fnc (danvet, from v4 6/7 feedback)

v3:
- Remove duplicate 'madv = I915_MADV_WILLNEED' (danvet, from v6 4/5)

Issue: VIZ-4719
Signed-off-by: NBrad Volkin <bradley.d.volkin@intel.com>
Reviewed-By: NJon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

0079a7df

drm/i915: Use batch length instead of object size in command parser · b9ffd80e

由 Brad Volkin 提交于 12月 11, 2014

Previously we couldn't trust the user-supplied batch length because
it came directly from userspace (i.e. untrusted code). It would have
affected what commands software parsed without regard to what hardware
would actually execute, leaving a potential hole.

With the parser now copying the user supplied batch buffer and writing
MI_NOP commands to any space after the copied region, we can safely use
the batch length input. This should be a performance win as the actual
batch length is frequently much smaller than the allocated object size.

v2: Fix handling of non-zero batch_start_offset

Issue: VIZ-4719
Signed-off-by: NBrad Volkin <bradley.d.volkin@intel.com>
Reviewed-By: NJon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

b9ffd80e

drm/i915: Use batch pools with the command parser · 78a42377

由 Brad Volkin 提交于 12月 11, 2014

This patch sets up all of the tracking and copying necessary to
use batch pools with the command parser and dispatches the copied
(shadow) batch to the hardware.

After this patch, the parser is in 'enabling' mode.

Note that performance takes a hit from the copy in some cases
and will likely need some work. At a rough pass, the memcpy
appears to be the bottleneck. Without having done a deeper
analysis, two ideas that come to mind are:
1) Copy sections of the batch at a time, as they are reached
   by parsing. Might improve cache locality.
2) Copy only up to the userspace-supplied batch length and
   memset the rest of the buffer. Reduces the number of reads.

v2:
- Remove setting the capacity of the pool
- One global pool instead of per-ring pools
- Replace batch_obj with shadow_batch_obj and hook into eb->vmas
- Memset any space in the shadow batch beyond what gets copied
- Rebased on execlist prep refactoring

v3:
- Rebase on chained batch handling
- Squash in setting the secure dispatch flag
- Add a note about the interaction w/secure dispatch pinning
- Check for request->batch_obj == NULL in i915_gem_free_request

v4:
- Fix read domains for shadow_batch_obj
- Remove the set_to_gtt_domain call from i915_parse_cmds
- ggtt_pin/unpin in the parser block to simplify error handling
- Check USES_FULL_PPGTT before setting DISPATCH_SECURE flag
- Remove i915_gem_batch_pool_put calls

v5:
- Move 'pending_read_domains |= I915_GEM_DOMAIN_COMMAND' after
  the parser (danvet, from v4 0/7 feedback)

Issue: VIZ-4719
Signed-off-by: NBrad Volkin <bradley.d.volkin@intel.com>
Reviewed-By: NJon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

78a42377

15 12月, 2014 1 次提交

drm/i915: Infrastructure for supporting different GGTT views per object · fe14d5f4

由 Tvrtko Ursulin 提交于 12月 10, 2014

Things like reliable GGTT mappings and mirrored 2d-on-3d display will need
to map objects into the same address space multiple times.

Added a GGTT view concept and linked it with the VMA to distinguish between
multiple instances per address space.

New objects and GEM functions which do not take this new view as a parameter
assume the default of zero (I915_GGTT_VIEW_NORMAL) which preserves the
previous behaviour.

This now means that objects can have multiple VMA entries so the code which
assumed there will only be one also had to be modified.

Alternative GGTT views are supposed to borrow DMA addresses from obj->pages
which is DMA mapped on first VMA instantiation and unmapped on the last one
going away.

v2:
    * Removed per view special casing in i915_gem_ggtt_prepare /
      finish_object in favour of creating and destroying DMA mappings
      on first VMA instantiation and last VMA destruction. (Daniel Vetter)
    * Simplified i915_vma_unbind which does not need to count the GGTT views.
      (Daniel Vetter)
    * Also moved obj->map_and_fenceable reset under the same check.
    * Checkpatch cleanups.

v3:
    * Only retire objects once the last VMA is unbound.

v4:
    * Keep scatter-gather table for alternative views persistent for the
      lifetime of the VMA.
    * Propagate binding errors to callers and handle appropriately.

v5:
    * Explicitly look for normal GGTT view in i915_gem_obj_bound to align
      usage in i915_gem_object_ggtt_unpin. (Michel Thierry)
    * Change to single if statement in i915_gem_obj_to_ggtt. (Michel Thierry)
    * Removed stray semi-colon in i915_gem_object_set_cache_level.

For: VIZ-4544
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: NMichel Thierry <michel.thierry@intel.com>
[danvet: Drop hunk from i915_gem_shrink since it's just prettification
but upsets a __must_check warning.]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

fe14d5f4

03 12月, 2014 4 次提交

drm/i915: Convert trace functions from seqno to request · 74328ee5

由 John Harrison 提交于 11月 24, 2014

All the code above is now using requests not seqnos so it is possible to convert
the trace functions across. Note that rather than get into problematic reference
counting issues, the trace code only saves the seqno and ring values from the
request structure not the structure pointer itself.

For: VIZ-4377
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NThomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

74328ee5

drm/i915: Remove obsolete seqno parameter from 'i915_add_request' · 9400ae5c

由 John Harrison 提交于 11月 24, 2014

There is no longer any need to retrieve a seqno value from an i915_add_request()
call. The calling code already knows which request structure is being processed
(it can only be ring->OLR). And as the request itself is now used in preference
to the basic seqno value, the latter is now redundant in this situation.

For: VIZ-4377
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NThomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

9400ae5c

drm/i915: Remove 'outstanding_lazy_seqno' · 6259cead

由 John Harrison 提交于 11月 24, 2014

The OLS value is now obsolete. Exactly the same value is guarateed to be always
available as PLR->seqno. Thus it is safe to remove the OLS completely. And also
to rename the PLR to OLR to keep the 'outstanding lazy ...' naming convention
valid.

For: VIZ-4377
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NThomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

6259cead

drm/i915: Replace last_[rwf]_seqno with last_[rwf]_req · 97b2a6a1

由 John Harrison 提交于 11月 24, 2014

The object structure contains the last read, write and fenced seqno values for
use in syncrhonisation operations. These have now been replaced with their
request structure counterparts.

Note that to ensure that objects do not end up with dangling pointers, the
assignments of last_*_req include reference count updates. Thus a request cannot
be freed if an object is still hanging on to it for any reason.

v2: Corrected 'last_rendering_' to 'last_read_' in a number of comments that did
not get updated when 'last_rendering_seqno' became 'last_read|write_seqno'
several millenia ago.

For: VIZ-4377
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NThomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

97b2a6a1

21 11月, 2014 1 次提交

drm/gem: Warn on illegal use of the dumb buffer interface v2 · 355a7018

由 Thomas Hellstrom 提交于 11月 20, 2014

It happens on occasion that developers of generic user-space applications
abuse the dumb buffer API to get hold of drm buffers that they can both
mmap() and use for GPU acceleration, using the assumptions that dumb buffers
and buffers available for GPU are
a) The same type and can be aribtrarily type-casted.
b) fully coherent.

This patch makes the most widely used drivers warn nicely when that happens,
the next step will be to fail.

v2: Move drmP.h changes to drm_gem.h. Fix Radeon dumb mmap breakage.
Signed-off-by: NThomas Hellstrom <thellstrom@vmware.com>
Acked-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NDave Airlie <airlied@redhat.com>

355a7018