提交 · e1fee72c2ea2e9c0c6e6743d32a6832f21337d6c · openeuler / Kernel

15 8月, 2014 7 次提交

drm/i915/bdw: Avoid non-lite-restore preemptions · e1fee72c

由 Oscar Mateo 提交于 7月 24, 2014

In the current Execlists feeding mechanism, full preemption is not
supported yet: only lite-restores are allowed (this is: the GPU
simply samples a new tail pointer for the context currently in
execution).

But we have identified an scenario in which a full preemption occurs:
1) We submit two contexts for execution (A & B).
2) The GPU finishes with the first one (A), switches to the second one
(B) and informs us.
3) We submit B again (hoping to cause a lite restore) together with C,
but in the time we spend writing to the ELSP, the GPU finishes B.
4) The GPU start executing B again (since we told it so).
5) We receive a B finished interrupt and, mistakenly, we submit C (again)
and D, causing a full preemption of B.

The race is avoided by keeping track of how many times a context has been
submitted to the hardware and by better discriminating the received context
switch interrupts: in the example, when we have submitted B twice, we won´t
submit C and D as soon as we receive the notification that B is completed
because we were expecting to get a LITE_RESTORE and we didn´t, so we know a
second completion will be received shortly.

Without this explicit checking, somehow, the batch buffer execution order
gets messed with. This can be verified with the IGT test I sent together with
the series. I don´t know the exact mechanism by which the pre-emption messes
with the execution order but, since other people is working on the Scheduler
+ Preemption on Execlists, I didn´t try to fix it. In these series, only Lite
Restores are supported (other kind of preemptions WARN).

v2: elsp_submitted belongs in the new intel_ctx_submit_request. Several
rebase changes.

v3: Clarify how the race is avoided, as requested by Daniel.
Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com>
[danvet: Align function parameters ...]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

e1fee72c

drm/i915/bdw: Handle context switch events · e981e7b1

由 Thomas Daniel 提交于 7月 24, 2014

Handle all context status events in the context status buffer on every
context switch interrupt. We only remove work from the execlist queue
after a context status buffer reports that it has completed and we only
attempt to schedule new contexts on interrupt when a previously submitted
context completes (unless no contexts are queued, which means the GPU is
free).

We canot call intel_runtime_pm_get() in an interrupt (or with a spinlock
grabbed, FWIW), because it might sleep, which is not a nice thing to do.
Instead, do the runtime_pm get/put together with the create/destroy request,
and handle the forcewake get/put directly.
Signed-off-by: NThomas Daniel <thomas.daniel@intel.com>

v2: Unreferencing the context when we are freeing the request might free
the backing bo, which requires the struct_mutex to be grabbed, so defer
unreferencing and freeing to a bottom half.

v3:
- Ack the interrupt inmediately, before trying to handle it (fix for
missing interrupts by Bob Beckett <robert.beckett@intel.com>).
- Update the Context Status Buffer Read Pointer, just in case (spotted
by Damien Lespiau).

v4: New namespace and multiple rebase changes.

v5: Squash with "drm/i915/bdw: Do not call intel_runtime_pm_get() in an
interrupt", as suggested by Daniel.
Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com>
[danvet: Checkpatch ...]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

e981e7b1

drm/i915/bdw: Two-stage execlist submit process · acdd884a

由 Michel Thierry 提交于 7月 24, 2014

Context switch (and execlist submission) should happen only when
other contexts are not active, otherwise pre-emption occurs.

To assure this, we place context switch requests in a queue and those
request are later consumed when the right context switch interrupt is
received (still TODO).

v2: Use a spinlock, do not remove the requests on unqueue (wait for
context switch completion).
Signed-off-by: NThomas Daniel <thomas.daniel@intel.com>

v3: Several rebases and code changes. Use unique ID.

v4:
- Move the queue/lock init to the late ring initialization.
- Damien's kmalloc review comments: check return, use sizeof(*req),
do not cast.

v5:
- Do not reuse drm_i915_gem_request. Instead, create our own.
- New namespace.

Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v1)
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> (v2-v5)
Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com>
[davnet: Checkpatch + wash-up s/BUG_ON/WARN_ON/.]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

acdd884a

drm/i915/bdw: Write the tail pointer, LRC style · ae1250b9

由 Oscar Mateo 提交于 7月 24, 2014

Each logical ring context has the tail pointer in the context object,
so update it before submission.

v2: New namespace.
Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

ae1250b9

drm/i915/bdw: Implement context switching (somewhat) · 84b790f8

由 Ben Widawsky 提交于 7月 24, 2014

A context switch occurs by submitting a context descriptor to the
ExecList Submission Port. Given that we can now initialize a context,
it's possible to begin implementing the context switch by creating the
descriptor and submitting it to ELSP (actually two, since the ELSP
has two ports).

The context object must be mapped in the GGTT, which means it must exist
in the 0-4GB graphics VA range.
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>

v2: This code has changed quite a lot in various rebases. Of particular
importance is that now we use the globally unique Submission ID to send
to the hardware. Also, context pages are now pinned unconditionally to
GGTT, so there is no need to bind them.

v3: Use LRCA[31:12] as hwCtxId[19:0]. This guarantees that the HW context
ID we submit to the ELSP is globally unique and != 0 (Bspec requirements
of the software use-only bits of the Context ID in the Context Descriptor
Format) without the hassle of the previous submission Id construction.
Also, re-add the ELSP porting read (it was dropped somewhere during the
rebases).

v4:
- Squash with "drm/i915/bdw: Add forcewake lock around ELSP writes" (BSPEC
  says: "SW must set Force Wakeup bit to prevent GT from entering C6 while
  ELSP writes are in progress") as noted by Thomas Daniel
  (thomas.daniel@intel.com).
- Rename functions and use an execlists/intel_execlists_ namespace.
- The BUG_ON only checked that the LRCA was <32 bits, but it didn't make
  sure that it was properly aligned. Spotted by Alistair Mcaulay
  <alistair.mcaulay@intel.com>.

v5:
- Improved source code comments as suggested by Chris Wilson.
- No need to abstract submit_ctx away, as pointed by Brad Volkin.
Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com>
[danvet: Checkpatch. Sigh.]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

84b790f8

drm/i915/bdw: Emission of requests with logical rings · 48e29f55

由 Oscar Mateo 提交于 7月 24, 2014

On a previous iteration of this patch, I created an Execlists
version of __i915_add_request and asbtracted it away as a
vfunc. Daniel Vetter wondered then why that was needed:

"with the clean split in command submission I expect every
function to know wether it'll submit to an lrc (everything in
intel_lrc.c) or wether it'll submit to a legacy ring (existing
code), so I don't see a need for an add_request vfunc."

The honest, hairy truth is that this patch is the glue keeping
the whole logical ring puzzle together:

- i915_add_request is used by intel_ring_idle, which in turn is
  used by i915_gpu_idle, which in turn is used in several places
  inside the eviction and gtt codes.
- Also, it is used by i915_gem_check_olr, which is littered all
  over i915_gem.c
- ...

If I were to duplicate all the code that directly or indirectly
uses __i915_add_request, I'll end up creating a separate driver.

To show the differences between the existing legacy version and
the new Execlists one, this time I have special-cased
__i915_add_request instead of adding an add_request vfunc. I
hope this helps to untangle this Gordian knot.
Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com>
[danvet: Adjust to ringbuf->FIXME_lrc_ctx per the discussion with
Thomas Daniel.]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

48e29f55

drm/i915: Add temporary ring->ctx backpointer · 582d67f0

由 Oscar Mateo 提交于 7月 24, 2014

The execlist patches have a bit a convoluted and long history and due
to that have the actual submission still misplaced deeply burried in
the low-level ringbuffer handling code. This design goes back to the
legacy ringbuffer code with its tricky lazy request and simple work
submissiion using ring tail writes. For that reason they need a
ring->ctx backpointer.

The goal is to unburry that code and move it up into a level where the
full execlist context is available so that we can ditch this
backpointer. Until that's done make it really obvious that there's
work still to be done.

Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Thomas Daniel <thomas.daniel@intel.com>
Acked-by: NThomas Daniel <thomas.daniel@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

582d67f0

14 8月, 2014 3 次提交

drm/i915: Print captured bo for all VM in error state · 3a448734

由 Chris Wilson 提交于 8月 12, 2014

The current error state harks back to the era of just a single VM. For
full-ppgtt, we capture every bo on every VM. It behoves us to then print
every bo for every VM, which we currently fail to do and so miss vital
information in the error state.

v2: Use the vma address rather than -1!
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

3a448734

drm/i915: Sharing platform specific sequence between runtime and system suspend/ resume paths · 016970be

由 Sagar Kamble 提交于 8月 13, 2014

On VLV, post S0i3 during i915_drm_thaw following issue is observed during ring
initialization.

[ 335.604039] [drm:stop_ring] ERROR render ring :timed out trying to stop ring
[ 336.607340] [drm:stop_ring] ERROR render ring :timed out trying to stop ring
[ 336.607345] [drm:init_ring_common] ERROR failed to set render ring head to zero ctl 00000000 head 00000000 tail 00000000 start 00000000
[ 337.610645] [drm:stop_ring] ERROR bsd ring :timed out trying to stop ring
[ 338.613952] [drm:stop_ring] ERROR bsd ring :timed out trying to stop ring
[ 338.613956] [drm:init_ring_common] ERROR failed to set bsd ring head to zero ctl 00000000 head 00000000 tail 00000000 start 00000000
[ 339.617256] [drm:stop_ring] ERROR render ring :timed out trying to stop ring
[ 339.617258] -----------[ cut here ]-----------
[ 339.617267] WARNING: CPU: 0 PID: 6 at drivers/gpu/drm/i915/intel_ringbuffer.c:1666 intel_cleanup_ring+0xe6/0xf0()
[ 339.617396] --[ end trace 5ef5ed1a3c92e2a6 ]--
[ 339.617428] [drm:__i915_drm_thaw] ERROR failed to re-initialize GPU, declaring wedged!

This is happening since wake is not enabled and Gunit registers are not restored.
For this system suspend/resume paths need to follow save/restore and additional
platform specific setup in suspend_complete and resume_prepare.

suspend_complete is shared unconditionaly for VLV, HSW, BDW. resume_prepare for
HSW and BDW has pc8 disabling which is needed during thaw_early so sharing
uncondtionally. For VLV and SNB runtime resume specific sequence exists.

Cc: Imre Deak <imre.deak@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Goel, Akash <akash.goel@intel.com>
Signed-off-by: NSagar Kamble <sagar.a.kamble@intel.com>
Reviewed-by: NImre Deak <imre.deak@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

016970be

drm/i915: Created common handler for platform specific suspend/resume · ebc32824

由 Sagar Kamble 提交于 8月 13, 2014

With this change, intel_runtime_suspend and intel_runtime_resume functions
become completely platform agnostic. Platform specific suspend/resume
changes are moved to intel_suspend_complete and intel_resume_prepare.

Cc: Imre Deak <imre.deak@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Goel, Akash <akash.goel@intel.com>
Signed-off-by: NSagar Kamble <sagar.a.kamble@intel.com>
Reviewed-by: NImre Deak <imre.deak@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

ebc32824

13 8月, 2014 18 次提交

drm/i915: Localise the fbdev console lock frobbing · 82e3b8c1

由 Chris Wilson 提交于 8月 13, 2014

Rather than take and release the console_lock() around a non-existent
DRM_I915_FBDEV, move the lock acquisation into the callee where it will
be compiled out by the config option entirely. This includes moving the
deferred fb_set_suspend() dance and encapsulating it entirely within
intel_fbdev.c.

v2: Use an integral work item so that we can explicitly flush the work
upon suspend/unload.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
[danvet: Add the flush_work in fbdev_fini per the mailing list
discussion. And s/BUG_ON/WARN_ON/ because.]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

82e3b8c1

drm/i915: Replace __I915__ with typesafe variant · 7312e2dd

由 Chris Wilson 提交于 8月 13, 2014

Ville pointed out the GCCism __builtin_types_compatible_p() that we
could use to replace our heavily casted presumption __I915__ macro that
was based on comparing struct sizes.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

7312e2dd

drm/i915: Add support for variable cursor size on 845/865 · dc41c154

由 Ville Syrjälä 提交于 8月 13, 2014

845/865 support different cursor sizes as well, albeit a bit differently
than later platforms. Add the necessary code to make them work.

Untested due to lack of hardware.

v2: Warn but accept invalid stride (Chris)
    Rewrite the cursor size checks for other platforms (Chris)
v3: More polish and magic to the cursor size checks (Chris)
v4: Moar polish and a comment (Chris)
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

dc41c154

drm/i915: Unify ivb_update_cursor() and i9xx_update_cursor() · 8ac54669

由 Ville Syrjälä 提交于 8月 12, 2014

Ever since
 commit 5efb3e28
 Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
 Date:   Wed Apr 9 13:28:53 2014 +0300

    drm/i915/chv: Add cursor pipe offsets

the only difference between i9xx_update_cursor() and ivb_update_cursor()
was the hsw+ pipe csc handling. Let's unify them and we can rid
outselves of some duplicated code.
Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

8ac54669

drm/i915: Move CURSIZE setup to i845_update_cursor() · d7ce484e

由 Ville Syrjälä 提交于 8月 12, 2014

CURSIZE register exists on 845/865 only, so move it to
i845_update_cursor(). Changes to cursor size must be done only when the
cursor is disabled, so do the write just before enabling the cursor.
Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

d7ce484e

drm/i915: Don't try to enable cursor from setplane when crtc is disabled · a08a42ad

由 Ville Syrjälä 提交于 8月 12, 2014

Make sure the cursor gets fully clipped when enabling it on a disabled
crtc via setplane. This will prevent the lower level code from
attempting to enable the cursor in hardware.

Cc: Paulo Zanoni <przanoni@gmail.com>
Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

a08a42ad

drm/i915: Cleanup aliasging ppgtt alongside the global gtt · 70e32544

由 Daniel Vetter 提交于 8月 06, 2014

Also remove related WARN_ONs which seem to have been hit since a rather
long time. But apperently no one noticed since our module reload is
already WARNING-infested :(
Reviewed-by: NMichel Thierry <michel.thierry@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

70e32544

drm/i915: Extract commmon global gtt cleanup code · 90d0a0e8

由 Daniel Vetter 提交于 8月 06, 2014

We want to move the aliasing ppgtt cleanup back into the global
gtt cleanup code for symmetry, but first we need to create such
a place.
Reviewed-by: NMichel Thierry <michel.thierry@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

90d0a0e8

drm/i915: Extract common cleanup into i915_ppgtt_release · 19dd120c

由 Daniel Vetter 提交于 8月 06, 2014

Address space cleanup isn't really a job for the low-level cleanup
callbacks. Without this change we can't reuse the low-level cleanup
callback for the aliasing ppgtt cleanup.
Reviewed-by: NMichel Thierry <michel.thierry@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

19dd120c

drm/i915: Drop create_vm argument to i915_gem_create_context · d624d86e

由 Daniel Vetter 提交于 8月 06, 2014

Now that all the flow is streamlined the rule is simple: We create
a new ppgtt for a new context when we have full ppgtt enabled.
Reviewed-by: NMichel Thierry <michel.thierry@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

d624d86e

drm/i915: Only track real ppgtt for a context · ae6c4806

由 Daniel Vetter 提交于 8月 06, 2014

There's a bit a confusion since we track the global gtt,
the aliasing and real ppgtt in the ctx->vm pointer. And not
all callers really bother to check for the different cases and just
presume that it points to a real ppgtt.

Now looking closely we don't actually need ->vm to always point at an
address space - the only place that cares actually has fixup code
already to decide whether to look at the per-proces or the global
address space.

So switch to just tracking the ppgtt directly and ditch all the
extraneous code.

v2: Fixup the ppgtt debugfs file to not oops on a NULL ctx->ppgtt.
Also drop the early exit - without aliasing ppgtt we want to dump all
the ppgtts of the contexts if we have full ppgtt.

v3: Actually git add the compile fix.
Reviewed-by: NMichel Thierry <michel.thierry@intel.com>
Cc: "Thierry, Michel" <michel.thierry@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
OTC-Jira: VIZ-3724
[danvet: Resolve conflicts with execlist patches while applying.]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

ae6c4806

drm/i915: Initialize the aliasing ppgtt as part of global gtt · fa76da34

由 Daniel Vetter 提交于 8月 06, 2014

Stuffing this into the context setup code doesn't make a lot of sense.
Also reusing the real ppgtt setup code makes even less sense since the
aliasing ppgtt isn't a real address space. Leaving all that stuff
unitialized will make sure that we catch any abusers promptly.

This is also a prep work to clean up the context->ppgtt link.

v2: Fix up the logic fail, I've fumbled it so badly to completely
disable ppgtt on gen6. Spotted by Ville and Michel. Also move around
the pde write into the gen6 init function, since otherwise it won't
work at all.

v3: Only initialize the aliasing ppgtt when we actually enable it.

Cc: "Thierry, Michel" <michel.thierry@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: NMichel Thierry <michel.thierry@intel.com>
[danvet: Squash in fixup from Fengguang Wu.]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

fa76da34

drm/i915: Rework ppgtt init to no require an aliasing ppgtt · 82460d97

由 Daniel Vetter 提交于 8月 06, 2014

Currently we abuse the aliasing ppgtt to set up the ppgtt support in
general. Which is a bit backwards since with full ppgtt we don't ever
need the aliasing ppgtt.

So untangle this and separate the ppgtt init from the aliasing
ppgtt. While at it drag it out of the context enabling (which just
does a switch to the default context).

Note that we still have the differentiation between synchronous and
asynchronous ppgtt setup, but that will soon vanish. So also correctly
wire up the return value handling to be prepared for when ->switch_mm
drops the synchronous parameter and could start to fail.
Reviewed-by: NMichel Thierry <michel.thierry@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

82460d97

drm/i915: Fix up checks for aliasing ppgtt · 896ab1a5

由 Daniel Vetter 提交于 8月 06, 2014

A subsequent patch will no longer initialize the aliasing ppgtt if we
have full ppgtt enabled, since we simply don't need that any more.

Unfortunately a few places check for the aliasing ppgtt instead of
checking for ppgtt in general. Fix them up.

One special case are the gtt offset and size macros, which have some
code to remap the aliasing ppgtt to the global gtt. The aliasing ppgtt
is _not_ a logical address space, so passing that in as the vm is
plain and simple a bug. So just WARN about it and carry on - we have a
gracefully fall-through anyway if we can't find the vma.
Reviewed-by: NMichel Thierry <michel.thierry@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

896ab1a5

drm/i915: Allow i915_gem_setup_global_gtt to fail · 6c5566a8

由 Daniel Vetter 提交于 8月 06, 2014

We already needs this just as a safety check in case the preallocation
reservation dance fails. But we definitely need this to be able to
move tha aliasing ppgtt setup back out of the context code to this
place, where it belongs.
Reviewed-by: NMichel Thierry <michel.thierry@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

6c5566a8

drm/i915: Add proper prefix to obj_to_ggtt · 5dc383b0

由 Daniel Vetter 提交于 8月 06, 2014

Stuff in headers really aught to have this.
Reviewed-by: NMichel Thierry <michel.thierry@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

5dc383b0

drm/i915: Only refcount ppgtt if it actually is one · 841cd773

由 Daniel Vetter 提交于 8月 06, 2014

This essentially unbreaks non-ppgtt operation where we'd scribble over
random memory.

While at it give the vm_to_ppgtt function a proper prefix and make it
a bit more paranoid.
Reviewed-by: NMichel Thierry <michel.thierry@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

841cd773

drm/i915: Track file_priv, not ctx in the ppgtt structure · 4d884705

由 Daniel Vetter 提交于 8月 06, 2014

Hardware contexts reference a ppgtt, not the other way round. And the
only user of this (in debugfs) actually only cares about which file
the ppgtt is associated with. So give it what it wants.

While at it give the ppgtt create function a proper name&place.
Reviewed-by: NMichel Thierry <michel.thierry@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

4d884705

12 8月, 2014 11 次提交

drm/i915: Some cleanups for the ppgtt lifetime handling · ee960be7

由 Daniel Vetter 提交于 8月 06, 2014

So when reviewing Michel's patch I've noticed a few things and cleaned
them up:
- The early checks in ppgtt_release are now redundant: The inactive
  list should always be empty now, so we can ditch these checks. Even
  for the aliasing ppgtt (though that's a different confusion) since
  we tear that down after all the objects are gone.
- The ppgtt handling functions are splattered all over. Consolidate
  them in i915_gem_gtt.c, give them OCD prefixes and add wrappers for
  get/put.
- There was a bit a confusion in ppgtt_release about whether it cares
  about the active or inactive list. It should care about them both,
  so augment the WARNINGs to check for both.

There's still create_vm_for_ctx left to do, put that is blocked on the
removal of ppgtt->ctx. Once that's done we can rename it to
i915_ppgtt_create and move it to its siblings for handling ppgtts.

v2: Move the ppgtt checks into the inline get/put functions as
suggested by Chris.

v3: Inline the now redundant ppgtt local variable.

Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMichel Thierry <michel.thierry@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

ee960be7

drm/i915: vma/ppgtt lifetime rules · b9d06dd9

由 Michel Thierry 提交于 8月 06, 2014

VMAs should take a reference of the address space they use.

Now, when the fd is closed, it will release the ref that the context was
holding, but it will still be referenced by any vmas that are still
active.

ppgtt_release() should then only be called when the last thing referencing
it releases the ref, and it can just call the base cleanup and free the
ppgtt.

Note that with this we will extend the lifetime of ppgtts which
contain shared objects. But all the non-shared objects will get
removed as soon as they drop of the active list and for the shared
ones the shrinker can eventually reap them. Since we currently can't
evict ppgtt pagetables either I don't think that temporary leak is
important.
Signed-off-by: NMichel Thierry <michel.thierry@intel.com>
[danvet: Add note about potential ppgtt leak with this approach.]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

b9d06dd9

drm/i915/bdw: Always use MMIO flips with Execlists · 14bf993e

由 Oscar Mateo 提交于 7月 24, 2014

The normal flip function places things in the ring in the legacy
way, so we either fix that or force MMIO flips always as we do in
this patch.
Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com>
[danvet: Checkpatch. Fucking again.]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

14bf993e

drm/i915/bdw: Workload submission mechanism for Execlists · ba8b7ccb

由 Oscar Mateo 提交于 7月 24, 2014

This is what i915_gem_do_execbuffer calls when it wants to execute some
worload in an Execlists world.

v2: Check arguments before doing stuff in intel_execlists_submission. Also,
get rel_constants parsing right.
Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com>
[danvet: Drop the chipset flush, that's pre-gen6. And appease
checkpatch a bit .... again!]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

ba8b7ccb

drm/i915/bdw: GEN-specific logical ring emit batchbuffer start · 15648585

由 Oscar Mateo 提交于 7月 24, 2014

Dispatch_execbuffer's evil twin.
Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com>
[danvet: Ditch the check for aliasing ppgtt. It'll break soon and
execlists requires full ppgtt anyway.]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

15648585

drm/i915/bdw: Interrupts with logical rings · 73d477f6

由 Oscar Mateo 提交于 7月 24, 2014

We need to attend context switch interrupts from all rings. Also, fixed writing
IMR/IER and added HWSTAM at ring init time.

Notice that, if added to irq_enable_mask, the context switch interrupts would
be incorrectly masked out when the user interrupts are due to no users waiting
on a sequence number. Therefore, this commit adds a bitmask of interrupts to
be kept unmasked at all times.

v2: Disable HWSTAM, as suggested by Damien (nobody listens to these interrupts,
anyway).

v3: Add new get/put_irq functions.

Signed-off-by: Thomas Daniel <thomas.daniel@intel.com> (v1)
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> (v2 & v3)
Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com>
[danvet: Drop the GEN8_ prefix from the context switch interrupt
define and move it to its brethren.]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

73d477f6

drm/i915/bdw: Ring idle and stop with logical rings · 9832b9da

由 Oscar Mateo 提交于 7月 24, 2014

This is a hard one, since there is no direct hardware ring to
control when in Execlists.

We reuse intel_ring_idle here, but it should be fine as long
as i915_add_request does the ring thing.
Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

9832b9da

drm/i915/bdw: GEN-specific logical ring emit flush · 4712274c

由 Oscar Mateo 提交于 7月 24, 2014

Same as the legacy-style ring->flush.

v2: The BSD invalidate bit still exists in GEN8! Add it for the VCS
rings (but still consolidate the blt and bsd ring flushes into one).
This was noticed by Brad Volkin.

v3: The command for BSD and for other rings is slightly different:
get it exactly the same as in gen6_ring_flush + gen6_bsd_ring_flush
Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com>
[danvet: Checkpatch.]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

4712274c

drm/i915/bdw: GEN-specific logical ring emit request · 4da46e1e

由 Oscar Mateo 提交于 7月 24, 2014

Very similar to the legacy add_request, only modified to account for
logical ringbuffer.

v2: Use MI_GLOBAL_GTT, as suggested by Brad Volkin.

v3: Unify render and non-render in the same function, as noticed by
Brad Volkin.
Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

4da46e1e

drm/i915/bdw: New logical ring submission mechanism · 82e104cc

由 Oscar Mateo 提交于 7月 24, 2014

Well, new-ish: if all this code looks familiar, that's because it's
a clone of the existing submission mechanism (with some modifications
here and there to adapt it to LRCs and Execlists).

And why did we do this instead of reusing code, one might wonder?
Well, there are some fears that the differences are big enough that
they will end up breaking all platforms.

Also, Execlists offer several advantages, like control over when the
GPU is done with a given workload, that can help simplify the
submission mechanism, no doubt. I am interested in getting Execlists
to work first and foremost, but in the future this parallel submission
mechanism will help us to fine tune the mechanism without affecting
old gens.

v2: Pass the ringbuffer only (whenever possible).
Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com>
[danvet: Appease checkpatch. Again. And drop the legacy sarea gunk
that somehow crept in.]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

82e104cc

drm/i915: Make hpd debug messages less cryptic · 26fbb774

由 Ville Syrjälä 提交于 8月 11, 2014

Don't print raw numbers, use port_name() and tell the user whether it's
long or short without having to figure out what the other magic number
means.
Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

26fbb774

11 8月, 2014 1 次提交

drm/i915/bdw: GEN-specific logical ring set/get seqno · e94e37ad

由 Oscar Mateo 提交于 7月 24, 2014

No mistery here: the seqno is still retrieved from the engine's
HW status page (the one in the default context. For the moment,
I see no reason to worry about other context's HWS page).
Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

e94e37ad

openeuler / Kernel 接近 2 年 前同步成功

openeuler / Kernel
接近 2 年前同步成功