Commits · 99be1dfe06e56b6e32f522979e9cf354dad5dc2e · openeuler / raspberrypi-kernel

03 Dec, 2014 10 commits

drm/i915: Move intel_init_pipe_control out of engine->init_hw · 99be1dfe

Committed by Daniel Vetter on Nov 20, 2014

With this all the ->init_hw hooks really only set up hw state needed
to start the ring, all the software state setup and memory/buffer
allocations happen beforehand.

v2: We need to call intel_init_pipe_control after the ring init since
otherwise engine->dev is NULL and it falls over. Currently that's
now after the hw ring is enabled but a) we'll be fine as long as no
one submits a batch b) this will change soon.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: s/init()/init_hw()/ in intel_engine_cs · ecfe00d8

Committed by Daniel Vetter on Nov 20, 2014

This is (mostly, some exceptions that need fixing) the hw setup
function which starts the ring. And not the function which allocates
all the resources.

Make this clear by giving it a better name.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Consolidate ring freespace calculations · ebd0fd4b

Committed by Dave Gordon on Nov 27, 2014

There are numerous places in the code where the driver's idea of
how much space is left in a ring is updated using the driver's
latest notions of the positions of 'head' and 'tail' for the ring.
Among them are some that update one or both of these values before
(re)doing the calculation. In particular, there are four different
places in the code where 'last_retired_head' is copied to 'head'
and then set to -1; and two of these do not have a guard to check
that it has actually been updated since last time it was consumed,
leaving the possibility that the dummy -1 can be transferred from
'last_retired_head' to 'head', causing the space calculation to
produce 'impossible' results (previously seen on Android/VLV).

This code therefore consolidates all the calculation and updating of
these values, such that there is only one place where the ring space
is updated, and it ALWAYS uses (and consumes) 'last_retired_head' if
(and ONLY if) it has been updated since the last call.
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
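
The consolidation described in this commit can be pictured with a small user-space sketch. It assumes a simplified ring structure; the names `ringbuf_sketch`, `sketch_ring_space`, `sketch_update_space` and the 64-byte reserve are illustrative assumptions, not the driver's actual definitions.

```c
/* Bytes kept free between tail and head; the value is an assumption. */
#define SKETCH_RING_RESERVED 64

struct ringbuf_sketch {
    int head;               /* consumer position last seen by the driver */
    int tail;               /* producer position */
    int size;               /* ring size in bytes */
    int space;              /* cached free space */
    int last_retired_head;  /* -1 means "not updated since last consumed" */
};

/* Cyclic distance from tail to head, minus the reserved gap. */
int sketch_ring_space(int head, int tail, int size)
{
    int space = head - tail;
    if (space <= 0)
        space += size;
    return space - SKETCH_RING_RESERVED;
}

/* The single place where the cached free space is recomputed. */
void sketch_update_space(struct ringbuf_sketch *rb)
{
    /* Consume last_retired_head only if it was updated; never copy -1. */
    if (rb->last_retired_head != -1) {
        rb->head = rb->last_retired_head;
        rb->last_retired_head = -1;
    }
    rb->space = sketch_ring_space(rb->head, rb->tail, rb->size);
}
```

Because the guard lives in one function, a second call without a new retirement leaves `head` untouched instead of overwriting it with the dummy -1.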

drm/i915: Make ring freespace calculation more robust · 4f54741e

Committed by Dave Gordon on Nov 27, 2014

The used space in a ring is given by the cyclic distance from the
consumer (HEAD) to the producer (TAIL), i.e. ((tail-head) MOD size);
conversely, the available space in a ring is the cyclic distance
from the producer to the consumer, MINUS the amount reserved for a
"gap" that is supposed to guarantee that the producer never catches
up with or overruns the consumer. Note that some GEN h/w requires
that TAIL never approach to within one cacheline of HEAD, so the gap
is usually set to twice the cacheline size to ensure this.

While the existing code gives the correct answer for correct inputs,
if the producer HAS overrun into the reserved space, the result can
be a value larger than the maximum valid value (size-reserved). We
can improve this by reorganising the calculation, so that in the
event of overrun the result will be negative rather than over-large.

This means that the commonly-used test (available >= required)
will then reject further writes into the ring after an overrun,
giving some chance that we can recover from or at least diagnose
the original problem; whereas allowing more writes would likely both
confuse the h/w and destroy the evidence of what went wrong.
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
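
To make the difference concrete, here is a minimal sketch of two ways to organise the calculation. The function names and the 64-byte reserve are assumptions chosen for illustration; the first function is one plausible shape of a "subtract the reserve, then wrap" calculation, the second wraps first and subtracts the reserve last, as the message describes.

```c
#define SKETCH_RESERVED 64   /* gap size; the value is an assumption */

/* Subtract the reserve before wrapping: an overrun wraps to a huge value. */
int space_before(int head, int tail, int size)
{
    int space = head - (tail + SKETCH_RESERVED);
    if (space < 0)
        space += size;
    return space;
}

/* Wrap first, subtract the reserve last: an overrun yields a negative value. */
int space_after(int head, int tail, int size)
{
    int space = head - tail;
    if (space <= 0)
        space += size;
    return space - SKETCH_RESERVED;
}
```

With a 4096-byte ring, head at 128 and tail at 100 (the producer is only 28 bytes behind the consumer, inside the gap), `space_before` returns 4060, which exceeds the maximum valid value of 4032, while `space_after` returns -36, so a test of the form (available >= required) fails for any non-negative request. For inputs that respect the gap, both functions agree.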

drm/i915: Connect requests to rings at creation not submission · ff79e857

Committed by John Harrison on Nov 24, 2014

It makes a lot more sense (and makes future seqno -> request conversion patches
simpler) to fill in the 'ring' field of the request structure at the point of
creation rather than submission. Given that the request structure is assigned by
ring specific code and thus is locked to a ring from the start, there really is
no reason to defer this assignment.

For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Remove obsolete seqno parameter from 'i915_add_request' · 9400ae5c

Committed by John Harrison on Nov 24, 2014

There is no longer any need to retrieve a seqno value from an i915_add_request()
call. The calling code already knows which request structure is being processed
(it can only be ring->OLR). And as the request itself is now used in preference
to the basic seqno value, the latter is now redundant in this situation.

For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Convert i915_wait_seqno to i915_wait_request · a4b3a571

Committed by Daniel Vetter on Nov 26, 2014

Updated i915_wait_seqno() to take a request structure instead of a seqno value
and renamed it accordingly. Internally, it just pulls the seqno out of the
request and calls on to __wait_seqno() as before. However, all the code further
up the stack is now simplified as it can just pass the request object straight
through without having to peek inside.

For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
[danvet: Squash in hunk from an earlier patch which was rebased
wrongly.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Remove 'outstanding_lazy_seqno' · 6259cead

Committed by John Harrison on Nov 24, 2014

The OLS value is now obsolete. Exactly the same value is guaranteed to be always
available as PLR->seqno. Thus it is safe to remove the OLS completely. And also
to rename the PLR to OLR to keep the 'outstanding lazy ...' naming convention
valid.

For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Add reference count to request structure · abfe262a

Committed by John Harrison on Nov 24, 2014

The plan is to use request structures everywhere that seqno values were
previously used. This means saving pointers to structures in places that used to
be simple integers. In turn, that means that the target structure now needs much
more stringent lifetime tracking. That is, it must not be freed while some other
random object still holds a pointer to it.

To achieve this tracking, a reference count needs to be added. Whenever a
pointer to the structure is saved away, the count must be incremented and the
free must only occur when all references have been released.

For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
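
The lifetime rule in this message (increment on every saved pointer, free only when the last reference is dropped) can be sketched in plain user-space C as follows. All names here are hypothetical, the counter is deliberately non-atomic, and `requests_freed` exists only so the behaviour can be observed; kernel code would normally build on the existing `struct kref` helpers (`kref_get()`/`kref_put()`) rather than a hand-rolled integer.

```c
#include <stdlib.h>

struct request_sketch {
    int refcount;        /* number of outstanding references */
    unsigned int seqno;
};

int requests_freed;      /* counts frees, for illustration only */

struct request_sketch *request_alloc(unsigned int seqno)
{
    struct request_sketch *req = malloc(sizeof(*req));
    if (!req)
        return NULL;
    req->refcount = 1;   /* the creator holds the first reference */
    req->seqno = seqno;
    return req;
}

/* Call whenever a pointer to the request is saved away. */
struct request_sketch *request_get(struct request_sketch *req)
{
    req->refcount++;
    return req;
}

/* Call when a saved pointer is dropped; frees on the last release. */
void request_put(struct request_sketch *req)
{
    if (--req->refcount == 0) {
        free(req);
        requests_freed++;
    }
}
```

A holder that calls `request_get()` can keep using the request after the creator has called `request_put()`; the memory is released only by the final `request_put()`.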

drm/i915: Ensure OLS & PLR are always in sync · 9eba5d4a

Committed by John Harrison on Nov 24, 2014

The aim is to replace seqno values with request structures. A step along the way
is to switch to using the PLR in preference to the OLS. That requires the PLR to
be valid if and only if the OLS is also valid. I.e., the two must be
kept in lock step. Then, code which was using the OLS can be safely switched
over to using the PLR instead.

For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

20 Nov, 2014 2 commits

drm/i915: Remove DRI1 ring accessors and API · 5c6c6003

Committed by Chris Wilson on Sep 06, 2014

With the deprecation of UMS, and by association DRI1, we have a tough
choice when updating the ring access routines. We either rewrite the
DRI1 routines blindly without testing (so likely to be broken) or take
the liberty of declaring them no longer supported and remove them
entirely. This takes the latter approach.

v2: Also remove the DRI1 sarea updates
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Fix rebase conflicts.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/bdw: Pin the ringbuffer backing object to GGTT on-demand · 7ba717cf

Committed by Thomas Daniel on Nov 13, 2014

Same as with the context, pinning to GGTT regardless is harmful (it
badly fragments the GGTT and can even exhaust it).

Unfortunately, this case is also more complex than the previous one
because we need to map and access the ringbuffer in several places
along the execbuffer path (and we cannot make do by leaving the
default ringbuffer pinned, as before). Also, the context object
itself contains a pointer to the ringbuffer address that we have to
keep updated if we are going to allow the ringbuffer to move around.

v2: Same as with the context pinning, we cannot really do it during
an interrupt. Also, pin the default ringbuffer objects regardless
(makes error capture a lot easier).

v3: Rebased. Take a pin reference of the ringbuffer for each item
in the execlist request queue because the hardware may still be using
the ringbuffer after the MI_USER_INTERRUPT to notify the seqno update
is executed.  The ringbuffer must remain pinned until the context save
is complete.  No longer pin and unpin ringbuffer in
populate_lr_context() - this transient address is meaningless and the
pinning can cause a sleep while atomic.

v4: Moved ringbuffer pin and unpin into the lr_context_pin functions.
Downgraded pinning check BUG_ONs to WARN_ONs.

v5: Reinstated WARN_ONs for unexpected execlist states.  Removed unused
variable.

Issue: VIZ-4277
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
Reviewed-by: Akash Goel <akash.goels@gmail.com>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

14 Nov, 2014 4 commits

drm/i915: Initialize workarounds in logical ring mode too · 771b9a53

Committed by Michel Thierry on Nov 11, 2014

Following the legacy ring submission example, update the
ring->init_context() hook to support the execlist submission mode.

v2: update to use the new workaround macros and cleanup unused code.
This takes care of both bdw and chv workarounds.

v2.1: Add missing call to init_context() during deferred context creation.

v3: Split init_context (emit) in legacy/lrc modes. For lrc, get the ringbuf
from the context (Mika/Daniel).

v4: Merge init_context interfaces back, the legacy mode only needs the ring,
but the lrc mode needs the ring and context (Mika).

Issue: VIZ-4092
Issue: GMIN-3475
Change-Id: Ie3d093b2542ab0e2a44b90460533e2f979788d6c
Cc: Deepak S <deepak.s@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
[danvet: Align function parameter lists properly.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/chv: Add new workarounds for chv · 95289009

Committed by Arun Siluvery on Oct 28, 2014

+WaForceEnableNonCoherent:chv
+WaHdcDisableFetchWhenMasked:chv

For: VIZ-4090
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/chv: Combine GEN8_ROW_CHICKEN w/a · 605f1433

Committed by Arun Siluvery on Oct 28, 2014

WaDisablePartialInstShootdown:chv and
WaDisableThreadStallDopClockGating:chv are related to the same
register so combine them.
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/chv: Remove pre-production workarounds · 3e470eaa

Committed by Arun Siluvery on Oct 28, 2014

-WaDisableDopClockGating:chv
-WaDisableSamplerPowerBypass:chv
-WaDisableGunitClockGating:chv
-WaDisableFfDopClockGating:chv
-WaDisableDopClockGating:chv

v2: Remove pre-production WA instead of restricting them
based on revision id (Ville)

For: VIZ-4090
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

05 Nov, 2014 1 commit

drm/i915: Fix null pointer dereference in ring cleanup code · 6402c330

Committed by John Harrison on Oct 31, 2014

If a ring failed to initialise for any reason then the error path would try to
clean up all rings including those that had not yet been allocated. The ring
clean up code did a check that the ring was valid before starting its work.
Unfortunately, that was after it had already dereferenced the ring to obtain a
dev_private pointer.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
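
The ordering problem described here (validating a ring only after a pointer stored in it has already been followed) can be shown with a small sketch. The structures and function names below are hypothetical stand-ins, not the driver's types; the point is only that the validity check comes before any dereference through the ring.

```c
#include <stddef.h>

struct device_sketch {
    int rings_cleaned;          /* bookkeeping for the example only */
};

struct ring_sketch {
    struct device_sketch *dev;  /* NULL until the ring has been set up */
};

int ring_initialized(const struct ring_sketch *ring)
{
    return ring->dev != NULL;
}

void ring_cleanup(struct ring_sketch *ring)
{
    struct device_sketch *dev;

    /* Validate first ... */
    if (!ring_initialized(ring))
        return;

    /* ... and only then follow pointers stored in the ring. */
    dev = ring->dev;
    dev->rings_cleaned++;
    ring->dev = NULL;
}
```

An error path can then call `ring_cleanup()` on every slot, including slots that were never set up, without touching a NULL `dev`.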

24 Oct, 2014 3 commits

drm/i915: Emit even number of dwords when emitting LRIs · 22a916aa

Committed by Arun Siluvery on Oct 22, 2014

The number of DWords should be even when doing ring emits as
command sequences require QWord alignment.

There was some discussion about the maximum length of the MI_LRI
command. Quoting Mika

"I did some test with bdw:

"The maximum is 128 writes, resulting the 8 bit length
field of the command being 0xff, thus following the spec.
The 128'th write went through.

"Perhaps the max command length is then less in older gens?

"Perhaps WARN_ON(x > 128) in MI_LOAD_REGISTER_IMM would be in place
but one needs minor tweak to command parser a bit also then.

	#define I915_MAX_WA_REGS 16

keeps us safe for now at least."

Ville commented that on pre-gen6 the length field seems to be
restricted to 0x3f though. So for all cases we should be ok.

v2: Use the LRI variant that can write multiple regs in one go (Damien).
We can simply insert one NOP at the end instead of one per register write.

Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
[danvet: Add a summary of the MI_LRI length discussion.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
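
The dword arithmetic behind v2 can be sketched as follows: one multi-register LRI is a header dword plus an (address, value) pair per register, which is always an odd count, so a single trailing NOP makes it even. The `SK_`-prefixed macros, the structure and the function are assumptions local to this sketch; the length encoding of 2n - 1 is chosen to match the quoted observation that 128 writes give a length field of 0xff.

```c
#include <stddef.h>
#include <stdint.h>

/* Local encodings for this sketch; treat the exact values as assumptions. */
#define SK_MI_INSTR(opcode, flags)  (((uint32_t)(opcode) << 23) | (uint32_t)(flags))
#define SK_MI_NOOP                  SK_MI_INSTR(0, 0)
#define SK_MI_LOAD_REGISTER_IMM(n)  SK_MI_INSTR(0x22, 2 * (n) - 1)

struct wa_reg_sketch {
    uint32_t addr;
    uint32_t value;
};

/*
 * Writes one LRI covering `count` registers, followed by one NOP.
 * `out` must have room for 2 * count + 2 dwords.
 * Returns the number of dwords written (always even), or 0 if count is 0.
 */
size_t emit_wa_lri(uint32_t *out, const struct wa_reg_sketch *regs, size_t count)
{
    size_t n = 0;

    if (count == 0)
        return 0;

    out[n++] = SK_MI_LOAD_REGISTER_IMM(count);
    for (size_t i = 0; i < count; i++) {
        out[n++] = regs[i].addr;
        out[n++] = regs[i].value;
    }
    out[n++] = SK_MI_NOOP;  /* 1 + 2 * count is odd; one NOP makes it even */
    return n;
}
```

Emitting a separate single-register LRI per write would instead need one NOP per register, which is the overhead v2 avoids.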

drm/i915: Build workaround list in ring initialization · 7225342a

Committed by Mika Kuoppala on Oct 07, 2014

If we build the workaround list in ring initialization
and decouple it from the actual writing of values, we
gain the ability to decide where and how we want to apply
the values.

The advantage of this will become more clear when
we need to initialize workarounds on older gens where
it is not possible to write all the registers through ring
LRIs.

v2: rebase on newest bdw workarounds

Cc: Arun Siluvery <arun.siluvery@linux.intel.com>
Cc: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com>
[danvet: Resolve tiny conflict in comments and ocd alignments a bit.]
[danvet2: Remove bogus force_wake_get call spotted by Paulo and QA.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
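
The decoupling argued for in this message, recording the workarounds once and choosing the write mechanism later, might look roughly like the sketch below. The entry type, the callback signature and the fake register file are all hypothetical; they only illustrate that the same list can be handed to different writers (for example one that emits LRIs and one that writes registers directly).

```c
#include <stddef.h>
#include <stdint.h>

struct wa_entry_sketch {
    uint32_t addr;
    uint32_t value;
};

/* A writer decides how a recorded (addr, value) pair actually reaches the hw. */
typedef void (*wa_writer_fn)(void *ctx, uint32_t addr, uint32_t value);

/* Applies a previously built list through whichever writer the caller picks. */
void wa_apply(const struct wa_entry_sketch *list, size_t count,
              wa_writer_fn write, void *ctx)
{
    for (size_t i = 0; i < count; i++)
        write(ctx, list[i].addr, list[i].value);
}

/* Example writer: stores values into a fake register file indexed by addr / 4. */
void fake_mmio_write(void *ctx, uint32_t addr, uint32_t value)
{
    uint32_t *regs = ctx;
    regs[addr / 4] = value;
}
```

Building the list does not depend on any writer, so a second writer can be added later without touching the code that decides which workarounds apply.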

drm/i915/bdw: Remove BDW preproduction W/As until C stepping. · 101b376d

Committed by Rodrigo Vivi on Oct 09, 2014

Let's clean this a bit

v2: Rebase after Mika's other patch that removed some BDW production workarounds.
v3: Removed stepping info.
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

30 Sep, 2014 1 commit

drm/i915/bdw: WaDisableFenceDestinationToSLM · da09654d

Committed by Rodrigo Vivi on Sep 19, 2014

This WA affects BDW GT3 pre-production steppings.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
[danvet: Don't mention steppings ...]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

29 Sep, 2014 1 commit

drm/i915: Minimize the huge amount of unecessary fbc sw cache clean. · 1d73c2a8

Committed by Rodrigo Vivi on Sep 24, 2014

The sw cache clean on BDW is a temporary workaround because we cannot
set cache clean on the blt ring without risking hangs. So we are doing the
cache clean in sw. However, we are doing much more than needed, and not only
when using the blt ring. So, with this extra w/a we minimize the amount of
cache cleans and call it only in the same cases where it was being called on gen7.

The traditional FBC cache clean happens over LRI on the BLT ring when there is a
frontbuffer touch happening. Frontbuffer tracking sets the fbc_dirty variable
to let the BLT flush know that it must clean the FBC cache.

fbc.need_sw_cache_clean works in the opposite information direction
of ring->fbc_dirty, telling the software frontbuffer tracking to perform
the cache clean on the sw side.

v2: Clean it a little bit and fully check for Broadwell instead of gen8.

v3: Rebase after frontbuffer organization.

v4: Wiggle confused me. So fixing v3!

Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

24 Sep, 2014 2 commits

drm/i915/skl: don't set the AsyncFlip performance mode for Gen9+ · fbdcb068

Committed by Imre Deak on Feb 13, 2013

The following sets the AsyncFlip performance mode for everything above
Gen6:

commit 4790cb36b3eede8fb0cca529dc1d31b9936fa24b
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Sun Jan 20 16:11:20 2013 +0000

    drm/i915: Disable AsyncFlip performance optimisations

Starting from Gen9 the MI_MODE register layout changes and doesn't
include the above bit.
Reviewed-by: Thomas Wood <thomas.wood@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/bdw: Cleanup pre prod workarounds · d37cf5f7

Committed by Mika Kuoppala on Sep 19, 2014

as these have been fixed in production hw and hurt performance
if applied.

v2: adjust requested ring space (Ville)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83482
Tested-by: zhoujian <jianx.zhou@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

19 Sep, 2014 2 commits

drm/i915: Fix irq checks in ring->irq_get/put functions · 7cd512f1

Committed by Daniel Vetter on Sep 15, 2014

Yet another place that wasn't properly transformed when implementing
S0ix. While at it convert the checks to WARN_ON on gen5+ (since we
don't have UMS potentially doing stupid things on those platforms).
And also add the corresponding checks to the put functions (again with
a WARN_ON) for gen5+.

v2: Drop the WARNINGS in the irq_put functions (including the existing
one for vebox), Chris convinced me that they're not that terribly
useful.

v3: Don't forget about execlist code.

Cc: Imre Deak <imre.deak@intel.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: "Volkin, Bradley D" <bradley.d.volkin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>

drm/i915: HSW always use GGTT selector for secure batches · 77072258

Committed by Chris Wilson on Sep 10, 2014

gen6 and earlier conflate address space selection (ppgtt vs ggtt) with
the security bit (i.e. only privileged batches were allowed to run from
ggtt). From Haswell only, you are able to select the security bit
separate from the address space - and we always requested to use ppgtt.
This breaks the golden render state batch execution with full-ppgtt as
that is only present in the global GTT and more generally any secure
batch that is not colocated in the ppgtt and ggtt. So we need to
disable the use of the ppgtt selector bit for secure batches, or else we
hang immediately upon boot and thence after every GPU reset...

v2: Only HSW differentiates between secure dispatch and ggtt, so simply
ignore the differentiation and always use secure==ggtt.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
[danvet: Rectify commit message as noted by Chris.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

15 Sep, 2014 1 commit

drm/i915: Fix SRC_COPY width on 830/845g · 611a7a4f

Committed by Chris Wilson on Sep 12, 2014

One small change I forgot to make in

commit c4d69da1
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Sep 8 14:25:41 2014 +0100

    drm/i915: Evict CS TLBs between batches

was to update the copy width for the compact BLT copy instruction.
Reported-by: Thomas Richter <thor@math.tu-berlin.de>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Thomas Richter <thor@math.tu-berlin.de>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: stable@vger.kernel.org
Tested-by: Thomas Richter <thor@math.tu-berlin.de>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

08 Sep, 2014 1 commit

drm/i915: Evict CS TLBs between batches · c4d69da1

Committed by Chris Wilson on Sep 08, 2014

Running igt, I was encountering the invalid TLB bug on my 845g, despite
that it was using the CS workaround. Examining the w/a buffer in the
error state, showed that the copy from the user batch into the
workaround itself was suffering from the invalid TLB bug (the first
cacheline was broken with the first two words reversed). Time to try a
fresh approach. This extends the workaround to write into each page of
our scratch buffer in order to overflow the TLB and evict the invalid
entries. This could be refined to only do so after we update the GTT,
but for simplicity, we do it before each batch.

I suspect this supersedes our current workaround, but for safety keep
doing both.

v2: The magic number shall be 2.

This doesn't conclusively prove that it is the mythical TLB bug we've
been trying to work around for so long. That it requires touching a number
of pages to prevent the corruption indicates to me that it is TLB
related, but the corruption (the reversed cacheline) is more subtle than
a TLB bug, where we would expect it to read the wrong page entirely.

Oh well, it prevents a reliable hang for me and so probably for others
as well.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

04 Sep, 2014 1 commit

drm/i915: Reset the HEAD pointer for the ring after writing START · 95468892

Committed by Chris Wilson on Aug 07, 2014

Ville found an old w/a documented for g4x that suggested that we need to
reset the HEAD after writing START. This is a useful fixup for some of
the g4x ring initialisation woes, but as usual, not all.

v2: Do the rewrite unconditionally anyway

References: https://bugs.freedesktop.org/show_bug.cgi?id=76554
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

03 Sep, 2014 7 commits

drm/i915: Remove unneeded brackets · b07ba1dc

Committed by Damien Lespiau on Aug 30, 2014

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Don't silently discard workarounds · 04ad2dc7

Committed by Damien Lespiau on Aug 30, 2014

If we happen to emit more than I915_MAX_WA_REGS workarounds, we will
currently discard them, not even emit the LRI. Not really what we want,
so warn loudly.
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Don't overrun the intel_wa_regs array · 55820e1e

Committed by Damien Lespiau on Aug 30, 2014

When entering intel_ring_emit_wa() with num_wa_regs equal to
I915_MAX_WA_REGS, we end up indexing the intel_wa_regs array beyond its
allocation.

Fix the check then.
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
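
An off-by-one of the kind this message describes is easy to see in a stand-alone sketch. The table type, the limit of 16 and the function name are assumptions for illustration; the message does not show the original check, so the comment only explains why the comparison has to reject a count equal to the limit.

```c
#include <stdint.h>

#define SK_MAX_WA_REGS 16

struct wa_table_sketch {
    struct {
        uint32_t addr;
        uint32_t value;
    } regs[SK_MAX_WA_REGS];
    unsigned int num;   /* number of entries currently stored */
};

/* Returns 0 on success, -1 if the table is already full. */
int wa_table_add(struct wa_table_sketch *t, uint32_t addr, uint32_t value)
{
    /*
     * Valid indices are 0 .. SK_MAX_WA_REGS - 1, so a full table has
     * num == SK_MAX_WA_REGS. A test using '>' instead of '>=' would let
     * that case through and write one element past the end of regs[].
     */
    if (t->num >= SK_MAX_WA_REGS)
        return -1;

    t->regs[t->num].addr = addr;
    t->regs[t->num].value = value;
    t->num++;
    return 0;
}
```

Returning an error rather than dropping the entry quietly also gives the caller a place to warn about it.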

drm/i915: Init some CHV workarounds via LRIs in ring->init_context() · 00e1e623

Committed by Ville Syrjälä on Aug 27, 2014

Follow the BDW example and apply the workarounds touching registers
which are saved in the context image through LRIs in the new
ring->init_context() hook.

This makes Mesa much happier and eg. glxgears doesn't hang after
the first frame.

Cc: Arun Siluvery <arun.siluvery@linux.intel.com>
Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
[danvet: Add missing wa table initialization to avoid a functional
conflict with Arun's wa table debugfs support.]
Reviewed-by: "Barbalho, Rafael" <rafael.barbalho@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/bdw: Export workaround data to debugfs · 888b5995

Committed by Arun Siluvery on Aug 26, 2014

The workarounds that are applied are exported to a debugfs file;
this is used to verify their state after the test case (reset or
suspend/resume etc). This patch is only required to support i-g-t.
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/bdw: Apply workarounds in render ring init function · 86d7f238

Committed by Arun Siluvery on Aug 26, 2014

For BDW, workarounds are currently initialized in init_clock_gating(), but
they are lost during reset, suspend/resume etc. This patch moves the WAs
that are part of the register state context to the render ring init function;
otherwise the default context ends up with incorrect values, as they don't
get initialized until init_clock_gating().

v2: Add workarounds to golden render state
This method has its own issues, first of all this is different for
each gen and it is generated using a tool so adding new workaround
and maintaining them across gens is not a straightforward process.

v3: Use LRIs to emit these workarounds (Ville)
Instead of modifying the golden render state the same LRIs are
emitted from within the driver.

v4: Use abstract name when exporting gen specific routines (Chris)

For: VIZ-4092
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: FBC flush nuke for BDW · c5ad011d

Committed by Rodrigo Vivi on Aug 04, 2014

According to spec FBC on BDW and HSW are identical without any gaps.
So let's copy the nuke and let FBC really start compressing stuff.

Without this patch we can verify with false color that nothing is being
compressed. With the nuke in place and false color it is possible
to see false color debugs.

Unfortunately, on some rings like BCS on BDW we have to avoid Bits 22:18 on
LRIs due to a high risk of hangs. So, when using the Blt ring for frontbuffer
rendering, the cache would never be cleaned and FBC would stop compressing the buffer.
One alternative is to cache clean on software frontbuffer tracking.

v2: Fix rebase conflict.
v3: Do not clean cache on BCS ring. Instead use sw frontbuffer tracking.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

20 Aug, 2014 1 commit

drm/i915/bdw: Make sure gpu reset still works with Execlists · cc9130be

Committed by Oscar Mateo on Jul 24, 2014

If we reset a ring after a hang, we have to make sure that we clear
out all queued Execlists requests.

v2: The ring is, at this point, already being correctly re-programmed
for Execlists, and the hangcheck counters cleared.

v3: Daniel suggests to drop the "if (execlists)" because the Execlists
queue should be empty in legacy mode (which is true, if we do the
INIT_LIST_HEAD).

v4: Do the pending intel_runtime_pm_put
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

13 Aug, 2014 1 commit

drm/i915: Fix up checks for aliasing ppgtt · 896ab1a5

Committed by Daniel Vetter on Aug 06, 2014

A subsequent patch will no longer initialize the aliasing ppgtt if we
have full ppgtt enabled, since we simply don't need that any more.

Unfortunately a few places check for the aliasing ppgtt instead of
checking for ppgtt in general. Fix them up.

One special case are the gtt offset and size macros, which have some
code to remap the aliasing ppgtt to the global gtt. The aliasing ppgtt
is _not_ a logical address space, so passing that in as the vm is
plain and simple a bug. So just WARN about it and carry on - we have a
graceful fall-through anyway if we can't find the vma.
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

12 Aug, 2014 2 commits

drm/i915/bdw: GEN-specific logical ring emit flush · 4712274c

Committed by Oscar Mateo on Jul 24, 2014

Same as the legacy-style ring->flush.

v2: The BSD invalidate bit still exists in GEN8! Add it for the VCS
rings (but still consolidate the blt and bsd ring flushes into one).
This was noticed by Brad Volkin.

v3: The command for BSD and for other rings is slightly different:
get it exactly the same as in gen6_ring_flush + gen6_bsd_ring_flush
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
[danvet: Checkpatch.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/bdw: New logical ring submission mechanism · 82e104cc

Committed by Oscar Mateo on Jul 24, 2014

Well, new-ish: if all this code looks familiar, that's because it's
a clone of the existing submission mechanism (with some modifications
here and there to adapt it to LRCs and Execlists).

And why did we do this instead of reusing code, one might wonder?
Well, there are some fears that the differences are big enough that
they will end up breaking all platforms.

Also, Execlists offer several advantages, like control over when the
GPU is done with a given workload, that can help simplify the
submission mechanism, no doubt. I am interested in getting Execlists
to work first and foremost, but in the future this parallel submission
mechanism will help us to fine tune the mechanism without affecting
old gens.

v2: Pass the ringbuffer only (whenever possible).
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
[danvet: Appease checkpatch. Again. And drop the legacy sarea gunk
that somehow crept in.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>