1. 20 Feb, 2013 2 commits
  2. 23 Jan, 2013 2 commits
  3. 22 Jan, 2013 1 commit
  4. 20 Jan, 2013 2 commits
  5. 18 Jan, 2013 1 commit
  6. 19 Dec, 2012 2 commits
  7. 18 Dec, 2012 1 commit
    • drm/i915: Implement workaround for broken CS tlb on i830/845 · b45305fc
      Committed by Daniel Vetter
      Now that Chris Wilson has demonstrated that the key to stability on early
      gen 2 is to simply _never_ exchange the physical backing storage of
      batch buffers, I've taken a stab at a kernel solution. It doesn't look too
      nefarious imho, now that I no longer try to be too clever for my own
      good.
      
      v2: After discussing the various techniques, we've decided to always blit
      batches on the suspect devices, but allow userspace to opt out of the
      kernel workaround and assume full responsibility for providing coherent
      batches. The principal reason is that avoiding the blit does improve
      performance in a few key microbenchmarks and also in cairo-trace
      replays.
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      [danvet:
      - Drop the hunk which uses HAS_BROKEN_CS_TLB to implement the ring
        wrap w/a. Suggested by Chris Wilson.
      - Also add the ACTHD check from Chris Wilson for the error state
        dumping, so that we still catch batches when userspace opts out of
        the w/a.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
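      
      A minimal sketch of the dispatch idea described above; every name here
      (types, helpers, fields) is an illustrative placeholder rather than the
      driver's real code. By default the kernel blits the user batch into one
      pinned scratch buffer whose backing pages never change, and only runs
      the batch in place when userspace has opted out of the workaround:
      
      /* Hypothetical sketch of the i830/845 CS TLB workaround path. */
      #include <stdbool.h>
      #include <stdint.h>
      
      struct example_ring {
              uint32_t scratch_offset;  /* fixed GTT offset of the pinned scratch bo */
      };
      
      /* Assumed emit helpers, declared only so the sketch is self-contained. */
      extern int example_emit_blit(struct example_ring *ring, uint32_t dst,
                                   uint32_t src, uint32_t len);
      extern int example_emit_batch_start(struct example_ring *ring, uint32_t offset);
      
      static int i830_dispatch_batch(struct example_ring *ring, uint32_t batch_offset,
                                     uint32_t batch_len, bool userspace_opted_out)
      {
              int ret;
      
              if (userspace_opted_out)
                      /* Userspace promised coherent batches: run in place. */
                      return example_emit_batch_start(ring, batch_offset);
      
              /* Workaround: copy the batch into the scratch bo and run the copy,
               * so the CS never sees the backing storage of a batch change. */
              ret = example_emit_blit(ring, ring->scratch_offset,
                                      batch_offset, batch_len);
              if (ret)
                      return ret;
      
              return example_emit_batch_start(ring, ring->scratch_offset);
      }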
  8. 11 Dec, 2012 1 commit
  9. 06 Dec, 2012 2 commits
  10. 04 Dec, 2012 1 commit
  11. 01 Dec, 2012 1 commit
  12. 29 Nov, 2012 2 commits
    • drm/i915: Rearrange code to only have a single method for waiting upon the ring · 3e960501
      Committed by Chris Wilson
      Replace the wait for the ring to be clear with the more common wait for
      the ring to be idle. The principal advantage is one less exported
      intel_ring_wait function, and the removal of a hardcoded value.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
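      
      As a rough illustration of "idle" versus "clear" (names below are
      hypothetical, not the exported driver API): idling the ring means
      waiting for the last request it emitted to complete, instead of waiting
      for some hardcoded amount of free ring space.
      
      #include <stdint.h>
      
      struct example_ring {
              uint32_t last_emitted_seqno;
      };
      
      /* Assumed wait primitive. */
      extern int example_wait_seqno(struct example_ring *ring, uint32_t seqno);
      
      /* Idle: everything submitted to this ring so far has completed. */
      static int example_ring_idle(struct example_ring *ring)
      {
              return example_wait_seqno(ring, ring->last_emitted_seqno);
      }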
    • drm/i915: Preallocate next seqno before touching the ring · 9d773091
      Committed by Chris Wilson
      Based on the work by Mika Kuoppala, we realised that we need to handle
      seqno wraparound prior to committing our changes to the ring. The most
      obvious point then is to grab the seqno inside intel_ring_begin(), and
      then to reuse that seqno for all ring operations until the next request.
      As intel_ring_begin() can fail, the callers must already be prepared to
      handle such failure and so we can safely add further checks.
      
      This patch looks like it should be split up into the interface
      changes and the tweaks to move seqno wrapping from the execbuffer into
      the core seqno increment. However, I found no easy way to break it into
      incremental steps without introducing further broken behaviour.
      
      v2: Mika found a silly mistake and a subtle error in the existing code;
      inside i915_gem_retire_requests() we were resetting the sync_seqno of
      the target ring based on the seqno from this ring - but the two are
      related only by the order of their allocation, not retirement. Hence we
      were applying the "rings already synchronised" optimisation too early;
      fortunately the only real casualty there is the handling of seqno
      wrapping.
      
      v3: Do not forget to reset the sync_seqno upon module reinitialisation,
      e.g. on resume.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@intel.com>
      Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=863861
      Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> [v2]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
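      
      A simplified sketch of the flow described above, with illustrative names
      only: the next seqno is reserved (and wraparound handled) as part of the
      ring_begin step, while failure is still safe, and that seqno is then
      reused for every ring operation up to the next request.
      
      #include <stdint.h>
      
      struct example_device {
              uint32_t next_seqno;            /* 0 means "about to wrap" */
      };
      
      struct example_ring {
              uint32_t outstanding_seqno;     /* 0 means "none reserved yet" */
      };
      
      /* Assumed helper that copes with wraparound (e.g. resetting sync state). */
      extern int example_handle_seqno_wrap(struct example_device *dev);
      
      static int example_ring_begin(struct example_device *dev,
                                    struct example_ring *ring)
      {
              if (ring->outstanding_seqno == 0) {
                      if (dev->next_seqno == 0) {
                              int ret = example_handle_seqno_wrap(dev);
                              if (ret)
                                      return ret;
                              dev->next_seqno = 1;
                      }
                      ring->outstanding_seqno = dev->next_seqno++;
              }
              return 0;   /* ...followed by the usual wait-for-ring-space logic */
      }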
  13. 22 Nov, 2012 1 commit
  14. 16 Nov, 2012 1 commit
  15. 12 Nov, 2012 4 commits
  16. 18 Oct, 2012 1 commit
    • drm/i915: Allow DRM_ROOT_ONLY|DRM_MASTER to submit privileged batchbuffers · d7d4eedd
      Committed by Chris Wilson
      With the introduction of per-process GTT space, the hardware designers
      thought it wise to also limit the ability to write to MMIO space to only
      a "secure" batch buffer. The ability to rewrite registers is the only
      way to program the hardware to perform certain operations like scanline
      waits (required for tear-free windowed updates). So we have a choice:
      either add an interface to perform those synchronized updates inside
      the kernel, or permit certain processes to write to the "safe"
      registers from within their own command streams. This patch exposes
      the ability to submit a SECURE batch buffer to DRM_ROOT_ONLY|DRM_MASTER
      processes.
      
      v2: Haswell split up bit8 into a ppgtt bit (still bit8) and a security
      bit (bit 13, accidentally not set). Also add a comment explaining why
      secure batches need a global gtt binding.
      
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
      [danvet: added hsw fixup.]
      Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
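      
      For illustration, the userspace side looks roughly like this: a DRM
      master (or root-only) process sets I915_EXEC_SECURE on execbuffer2. The
      ioctl, struct and flags are the real uapi; the surrounding setup of the
      fd and exec object list is assumed and omitted.
      
      #include <stdint.h>
      #include <string.h>
      #include <sys/ioctl.h>
      #include <drm/i915_drm.h>
      
      static int submit_secure_batch(int fd, struct drm_i915_gem_exec_object2 *objects,
                                     unsigned int count, unsigned int batch_len)
      {
              struct drm_i915_gem_execbuffer2 execbuf;
      
              memset(&execbuf, 0, sizeof(execbuf));
              execbuf.buffers_ptr = (uintptr_t)objects;
              execbuf.buffer_count = count;
              execbuf.batch_len = batch_len;
              execbuf.flags = I915_EXEC_RENDER | I915_EXEC_SECURE;
      
              /* Only DRM_MASTER (or root-only) callers may set I915_EXEC_SECURE;
               * everyone else gets -EPERM. */
              return ioctl(fd, DRM_IOCTL_I915_GEM_EXECBUFFER2, &execbuf);
      }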
  17. 03 Oct, 2012 2 commits
  18. 20 Sep, 2012 1 commit
    • drm/i915: Replace the array of pages with a scatterlist · 9da3da66
      Committed by Chris Wilson
      Rather than have multiple data structures for describing our page layout
      in conjunction with the array of pages, we can migrate all users over to
      a scatterlist.
      
      One major advantage this offers, other than unifying the page tracking
      structures, is that we replace the vmalloc'ed array (which can be up to
      a megabyte in size) with a chain of individual pages, which helps
      reduce memory pressure.
      
      The disadvantage is that we then do not have a simple array to iterate,
      or to access randomly. The common case for this is in the relocation
      processing, which will typically fit within a single scatterlist page
      and so be almost the same cost as the simple array. For iterating over
      the array, the extra function call could be optimised away, but in
      reality it is an insignificant cost next to either binding the pages
      or performing the pwrite/pread.
      
      v2: Fix drm_clflush_sg() to not invoke wbinvd as well! And fix the
      trivial compile error from rebasing.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
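      
      The gist of the migration, as a small sketch against the kernel's
      standard scatterlist API (the function name and the source of the page
      array are placeholders): the vmalloc'ed struct page *[] becomes an
      sg_table that callers walk with for_each_sg().
      
      #include <linux/err.h>
      #include <linux/mm.h>
      #include <linux/scatterlist.h>
      #include <linux/slab.h>
      
      static struct sg_table *example_build_sg(struct page **pages, unsigned int n)
      {
              struct sg_table *st;
              struct scatterlist *sg;
              unsigned int i;
      
              st = kmalloc(sizeof(*st), GFP_KERNEL);
              if (!st)
                      return ERR_PTR(-ENOMEM);
      
              if (sg_alloc_table(st, n, GFP_KERNEL)) {
                      kfree(st);
                      return ERR_PTR(-ENOMEM);
              }
      
              /* One page per entry; a real implementation may coalesce runs of
               * physically contiguous pages into larger segments. */
              for_each_sg(st->sgl, sg, n, i)
                      sg_set_page(sg, pages[i], PAGE_SIZE, 0);
      
              return st;
      }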
  19. 03 Sep, 2012 3 commits
    • drm/i915: add workarounds to gen7_render_ring_flush · f3987631
      Committed by Paulo Zanoni
      From Bspec, Vol 2a, Section 1.9.3.4 "PIPE_CONTROL", intro section
      detailing the various workarounds:
      
      "[DevIVB {W/A}, DevHSW {W/A}]: Pipe_control with CS-stall bit
      set must be issued before a pipe-control command that has the State
      Cache Invalidate bit set."
      
      Note that the public Bspec has different numbering; it's Vol2Part1,
      Section 1.10.4.1 "PIPE_CONTROL" there.
      
      There's also a second workaround for the PIPE_CONTROL command itself:
      
      "[DevIVB, DevVLV, DevHSW] {WA}: Every 4th PIPE_CONTROL command, not
      counting the PIPE_CONTROL with only read-cache-invalidate bit(s) set,
      must have a CS_STALL bit set"
      
      For simplicity we simply set the CS_STALL bit on every pipe_control on
      gen7+.
      
      Note that this massively helps on some hsw machines; together with the
      following patch to unconditionally set the CS_STALL bit on every
      pipe_control, it prevents a gpu hang every few seconds.
      
      This is a regression that was introduced in the pipe_control
      cleanup:
      
      commit 6c6cf5aa
      Author: Chris Wilson <chris@chris-wilson.co.uk>
      Date:   Fri Jul 20 18:02:28 2012 +0100
      
          drm/i915: Only apply the SNB pipe control w/a to gen6
      
      It looks like the massive snb pipe_control workaround also papered
      over any issues on ivb and hsw.
      Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
      [danvet: squashed both workarounds together, pimped the commit message
      with Bspec citations and the regression commit citation, and changed the
      comment in the code a bit to clarify that we unconditionally set
      CS_STALL to avoid being hurt by trying to be clever.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
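      
      A condensed sketch of the two workarounds together (the emit helper and
      flag macros below are illustrative stand-ins for the driver's own): any
      flush that invalidates the state cache is preceded by a CS-stall-only
      PIPE_CONTROL, and CS_STALL is set unconditionally so the "every 4th
      PIPE_CONTROL" rule can never be violated.
      
      #include <stdint.h>
      
      #define EX_PIPE_CONTROL_CS_STALL               (1u << 20)
      #define EX_PIPE_CONTROL_STALL_AT_SCOREBOARD    (1u << 1)
      #define EX_PIPE_CONTROL_STATE_CACHE_INVALIDATE (1u << 2)
      
      struct example_ring;
      extern int example_emit_pipe_control(struct example_ring *ring, uint32_t flags);
      
      static int example_gen7_render_flush(struct example_ring *ring, uint32_t flags)
      {
              /* W/A: a PIPE_CONTROL with CS_STALL set must be issued before any
               * PIPE_CONTROL that sets State Cache Invalidate. */
              if (flags & EX_PIPE_CONTROL_STATE_CACHE_INVALIDATE) {
                      int ret = example_emit_pipe_control(ring,
                                      EX_PIPE_CONTROL_CS_STALL |
                                      EX_PIPE_CONTROL_STALL_AT_SCOREBOARD);
                      if (ret)
                              return ret;
              }
      
              /* W/A: rather than count PIPE_CONTROLs, always set CS_STALL. */
              return example_emit_pipe_control(ring, flags | EX_PIPE_CONTROL_CS_STALL);
      }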
    • drm/i915: add workarounds directly to gen6_render_ring_flush · b3111509
      Committed by Paulo Zanoni
      The workarounds can now go directly into gen6_render_ring_flush, since
      gen 7+ now runs the new gen7_render_ring_flush function.
      Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: add gen7_render_ring_flush · 4772eaeb
      Committed by Paulo Zanoni
      For now, just a copy of gen6_render_ring_flush. Different gens have
      different workarounds, so we want different functions.
      Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
  20. 24 Aug, 2012 1 commit
  21. 14 Aug, 2012 1 commit
  22. 10 Aug, 2012 1 commit
    • drm/i915: Lazily apply the SNB+ seqno w/a · b2eadbc8
      Committed by Chris Wilson
      Avoid the forcewake overhead when simply retiring requests, as often the
      last seen seqno is good enough to satisfy the retirement process and it
      will be promptly re-run in any case. Only force the coherent seqno read
      when we are explicitly waiting upon a completion event, to be sure that
      none go missing, and also when we are reporting seqno values in case of
      error or debugging.
      
      This greatly reduces the load for userspace using the busy-ioctl to
      track active buffers, for instance halving the CPU used by X in pushing
      the pixels from a software render (flash). The effect will be even more
      magnified with userptr providing a zero-copy upload path in that
      instance, or in similar instances where X is simply compositing DRI
      buffers.
      
      v2: Reverse the polarity of the tachyon stream. Daniel suggested that
      'force' was too generic for the parameter name and that 'lazy_coherency'
      better encapsulated the semantics of it being an optimization and its
      purpose. Also notice that gen6_get_seqno() is only used by gen6/7
      chipsets and so the test for IS_GEN6 || IS_GEN7 is redundant in that
      function.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
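      
      A sketch of the lazy_coherency split with hypothetical helper names: the
      cheap status-page read is enough for routine request retirement, and the
      expensive coherent read (forcing the seqno write to land, e.g. via a
      posting read of a CS register) only happens when a caller is explicitly
      waiting on, or reporting, a seqno.
      
      #include <stdbool.h>
      #include <stdint.h>
      
      struct example_ring;
      
      /* Cheap read of the seqno last written by the GPU to the status page. */
      extern uint32_t example_read_status_page_seqno(struct example_ring *ring);
      /* Expensive read of a CS register, flushing any pending seqno write. */
      extern void example_force_seqno_coherency(struct example_ring *ring);
      
      static uint32_t example_get_seqno(struct example_ring *ring, bool lazy_coherency)
      {
              /* Retirement can tolerate a slightly stale value; explicit waits
               * and error/debug reporting cannot. */
              if (!lazy_coherency)
                      example_force_seqno_coherency(ring);
      
              return example_read_status_page_seqno(ring);
      }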
  23. 08 Aug, 2012 2 commits
  24. 26 Jul, 2012 3 commits
  25. 20 Jul, 2012 1 commit