提交 · cc889e0f6ce6a63c62db17d702ecfed86d58083f · openeuler / raspberrypi-kernel

20 6月, 2012 1 次提交

drm/i915: disable flushing_list/gpu_write_list · cc889e0f

由 Daniel Vetter 提交于 6月 13, 2012

This is just the minimal patch to disable all this code so that we can
do decent amounts of QA before we rip it all out.

The complicating thing is that we need to flush the gpu caches after
the batchbuffer is emitted. Which is past the point of no return where
execbuffer can't fail any more (otherwise we risk submitting the same
batch multiple times).

Hence we need to add a flag to track whether any caches associated
with that ring are dirty. And emit the flush in add_request if that's
the case.

Note that this has a quite a few behaviour changes:
- Caches get flushed/invalidated unconditionally.
- Invalidation now happens after potential inter-ring sync.

I've bantered around a bit with Chris on irc whether this fixes
anything, and it might or might not. The only thing clear is that with
these changes it's much easier to reason about correctness.

Also rip out a lone get_next_request_seqno in the execbuffer
retire_commands function. I've dug around and I couldn't figure out
why that is still there, with the outstanding lazy request stuff it
shouldn't be necessary.

v2: Chris Wilson complained that I also invalidate the read caches
when flushing after a batchbuffer. Now optimized.

v3: Added some comments to explain the new flushing behaviour.

Cc: Eric Anholt <eric@anholt.net>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-Off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

cc889e0f

14 6月, 2012 1 次提交

drm/i915/context: switch contexts with execbuf2 · 6e0a69db

由 Ben Widawsky 提交于 6月 04, 2012

Use the rsvd1 field in execbuf2 to specify the context ID associated
with the workload. This will allow the driver to do the proper context
switch when/if needed.

v2: Add checks for context switches on rings not supporting contexts.
Before the code would silently ignore such requests.
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>

6e0a69db

20 5月, 2012 1 次提交

drm/i915: Check whether the ring is initialised prior to dispatch · a15817cf

由 Chris Wilson 提交于 5月 11, 2012

Rather than use the magic feature tests HAS_BLT/HAS_BSD just check
whether the ring we are about to dispatch the execbuffer on is
initialised.

v2: Use intel_ring_initialized()
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NBen Widawsky <ben@bwidawsk.net>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

a15817cf

08 5月, 2012 1 次提交

drm/i915: Limit calling mark-busy only for potential scanouts · acb87dfb

由 Chris Wilson 提交于 5月 03, 2012

The principle of intel_mark_busy() is that we want to spot the
transition of when the display engine is being used in order to bump
powersaving modes and increase display clocks. As such it is only
important when the display is changing, i.e. when rendering to the
scanout or other sprite/plane, and these are characterised by being
pinned.

v2: Mark the whole device as busy on execbuffer and pageflips as well
and rebase against dinq for the minor bug fix to be immediately
applicable.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
[danvet: fix compile fail.]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

acb87dfb

03 5月, 2012 2 次提交

drm/i915: disallow clip rects on gen5+ · 6ebebc92

由 Daniel Vetter 提交于 4月 26, 2012

Unfortunately there has been dri1 userspace that used gem to manage
the gtt and hence also needed cliprects in the execbuf ioctl. So
we can't ever remove that code without breaking the ioctl abi.

But at least we can disable it on gen5+, because these horrible
versions of mesa have not supported these chips.
Acked-by: NJesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

6ebebc92

drm/i915: remove do_retire from i915_wait_request · b2da9fe5

由 Ben Widawsky 提交于 4月 26, 2012

This originates from a hack by me to quickly fix a bug in an earlier
patch where we needed control over whether or not waiting on a seqno
actually did any retire list processing. Since the two operations aren't
clearly related, we should pull the parameter out of the wait function,
and make the caller responsible for retiring if the action is desired.

The only function call site which did not get an explicit retire_request call
(on purpose) is i915_gem_inactive_shrink(). That code was already calling
retire_request a second time.

v2: don't modify any behavior excepit i915_gem_inactive_shrink(Daniel)
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

b2da9fe5

24 4月, 2012 2 次提交

drm/i915: fix integer overflow in i915_gem_do_execbuffer() · 44afb3a0

由 Xi Wang 提交于 4月 23, 2012

On 32-bit systems, a large args->num_cliprects from userspace via ioctl
may overflow the allocation size, leading to out-of-bounds access.

This vulnerability was introduced in commit 432e58ed ("drm/i915: Avoid
allocation for execbuffer object list").
Signed-off-by: NXi Wang <xi.wang@gmail.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

44afb3a0

drm/i915: fix integer overflow in i915_gem_execbuffer2() · ed8cd3b2

由 Xi Wang 提交于 4月 23, 2012

On 32-bit systems, a large args->buffer_count from userspace via ioctl
may overflow the allocation size, leading to out-of-bounds access.

This vulnerability was introduced in commit 8408c282 ("drm/i915:
First try a normal large kmalloc for the temporary exec buffers").
Signed-off-by: NXi Wang <xi.wang@gmail.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

ed8cd3b2

18 4月, 2012 2 次提交

drm/i915: Remove the pipelined parameter from get_fence() · 06d98131

由 Chris Wilson 提交于 4月 17, 2012

We never succeeded in getting pipelined fencing to work (unresolved
spurious GPU hangs), so begin the process of dismantling and removal
the broken code.

Step 1 is the removal of the pipeline parameter to get_fence().
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

06d98131

drm/i915: Always flush tiling changes before accessing through the GTT · 7b09638f

由 Chris Wilson 提交于 4月 14, 2012

As we defer updating the fence register from set-tiling to the point of
use, we need to declare every access through the GTT as either fenced or
unfenced.

This patches fixes an old bug in the execbuffer relocation processing
which could conceivably be hit by a pathological userspace.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

7b09638f

13 4月, 2012 2 次提交

drm/i915: use semaphores for the display plane · 2911a35b

由 Ben Widawsky 提交于 4月 05, 2012

In theory this will have performance and power improvements. Performance
because we don't need to stall when the scanout BO is busy, and power
because we don't have to stall when the BO is busy (and the ring can
even go to sleep if the HW supports it).

v2:
squash 2 patches into 1 (me)
un-inline the enable_semaphores function (Daniel)
remove comment about SNB hangs from i915_gem_object_sync (Chris)
rename intel_enable_semaphores to i915_semaphore_is_enabled (me)
removed page flip comment; "no why" (Chris)

To address other comments from Daniel (irc):
update the comment to say 'vt-d is crap, don't enable semaphores'
  - I think you misinterpreted Chris' comment, it already exists.
checking out whether we can pageflip on the render ring on ivb (didn't
work on early silicon)
  - We don't want to enable workarounds for early silicon unless we have
    to.
  - I can't find any references in the docs about this.
optionally use it if the fb is already busy on the render ring
  - This should be how the code already worked, unless I am
    misunderstanding your meaning.
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

2911a35b

drm/i915: Reorganise rules for get_fence/put_fence · 9a5a53b3

由 Chris Wilson 提交于 3月 22, 2012

By simplifying the rules to calling get_fence when writing to the
through the GTT in a tiled manner, and calling put_fence before writing
to the object through the GTT in a linear manner, the code becomes
clearer and there is less chance of making a mistake.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
[danvet: fixed up conflict with ppgtt code and spelling in a new
comment.]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

9a5a53b3

01 4月, 2012 1 次提交

drm/i915: Mark untiled BLT commands as fenced on gen2/3 · 7dd49065

由 Chris Wilson 提交于 3月 21, 2012

The BLT commands on gen2/3 utilize the fence registers and so we cannot
modify any fences for the object whilst those commands are in flight.
Currently we marked tiled commands as occupying a fence, but forgot to
restrict the untiled commands from preventing a fence being assigned
before they were completed.

One side-effect is that we ten have to double check that a fence was
allocated for a fenced buffer during move-to-active.
Reported-by: NJiri Slaby <jirislaby@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=43427
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47990Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Testcase: i-g-t/tests/gem_tiled_after_untiled_blt
Tested-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: stable@kernel.org
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

7dd49065

27 3月, 2012 2 次提交

mm: extend prefault helpers to fault in more than PAGE_SIZE · f56f821f

由 Daniel Vetter 提交于 3月 25, 2012

drm/i915 wants to read/write more than one page in its fastpath
and hence needs to prefault more than PAGE_SIZE bytes.

Add new functions in filemap.h to make that possible.

Also kill a copy&pasted spurious space in both functions while at it.

v2: As suggested by Andrew Morton, add a multipage parameter to both
functions to avoid the additional branch for the pagemap.c hotpath.
My gcc 4.6 here seems to dtrt and indeed reap these branches where not
needed.

v3: Becaus I couldn't find a way around adding a uaddr += PAGE_SIZE to
the filemap.c hotpaths (that the compiler couldn't remove again),
let's go with separate new functions for the multipage use-case.

v4: Adjust comment to CodingStlye and fix spelling.
Acked-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

f56f821f

drm/i915: Avoid using mappable space for relocation processing through the CPU · dabdfe02

由 Chris Wilson 提交于 3月 26, 2012

We try to avoid writing the relocations through the uncached GTT, if the
buffer is currently in the CPU write domain and so will be flushed out to
main memory afterwards anyway. Also on SandyBridge we can safely write
to the pages in cacheable memory, so long as the buffer is LLC mapped.
In either of these cases, we therefore do not need to force the
reallocation of the buffer into the mappable region of the GTT, reducing
the aperture pressure.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

dabdfe02

26 3月, 2012 1 次提交

drm/i915: Batch copy_from_user for relocation processing · 1d83f442

由 Chris Wilson 提交于 3月 24, 2012

Originally the code tried to allocate a large enough array to perform
the copy using vmalloc, performance wasn't great and throughput was
improved by processing each individual relocation entry separately.
This too is not as efficient as one would desire. A compromise would be
to allocate a single page, or to allocate a few entries on the stack,
and process the copy in batches. The latter gives simpler code and more
consistent performance due to a lack of heuristic.

x11perf -copywinwin10: n450/pnv i3-330m i5-2520m (cpu)
before: 249000 785000 1280000 (80%)
page: 264000 896000 1280000 (65%)
on-stack: 264000 902000 1280000 (67%)

v2: Use 512-bytes of stack for batching rather than allocate a page.
v3: Tidy the code slightly with more descriptive variable names
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

1d83f442

21 3月, 2012 1 次提交

drm/i915: implement SNB workaround for lazy global gtt · 149c8407

由 Daniel Vetter 提交于 2月 15, 2012

PIPE_CONTROL on snb needs global gtt mappings in place to workaround a
hw gotcha. No other commands need such a workaround. Luckily we can
detect a PIPE_CONTROL commands easily because they have a write_domain
= I915_GEM_DOMAIN_INSTRUCTION (and nothing else has that).

v2: Binding the target of such a reloc into the global gtt actually
works instead of binding the source, which is rather pointless ...

v3: Kill a superflous has_global_gtt_mapping assignement noticed by
Chris Wilson.
Reviewed-and-tested-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

149c8407

10 2月, 2012 1 次提交

drm/i915: ppgtt binding/unbinding support · 7bddb01f

由 Daniel Vetter 提交于 2月 09, 2012

This adds support to bind/unbind objects and wires it up. Objects are
only put into the ppgtt when necessary, i.e. at execbuf time.

Objects are still unconditionally put into the global gtt.

v2: Kill the quick hack and explicitly pass cache_level to ppgtt_bind
like for the global gtt function. Noticed by Chris Wilson.
Reviewed-by: NBen Widawsky <ben@bwidawsk.net>
Tested-by: NChris Wilson <chris@chris-wilson.co.uk>
Tested-by: NEugeni Dodonov <eugeni.dodonov@intel.com>
Reviewed-by: NEugeni Dodonov <eugeni.dodonov@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

7bddb01f

09 2月, 2012 1 次提交

drm/i915: s/DRM_ERROR/DRM_DEBUG in i915_gem_execbuffer.c · ff240199

由 Daniel Vetter 提交于 1月 31, 2012

These are all user-trigerable, so tune down their loudness a notch.
For some of these we have i-g-t tests (because they prevent
newly-discovered bugs), without this patches running the test suite
leaves behind a dirty dmesg.
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-Off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

ff240199

30 1月, 2012 3 次提交

drm/i915: reject GTT domain in relocations · 4ca4a250

由 Daniel Vetter 提交于 12月 14, 2011

This confuses our domain tracking and can (for gtt write domains) lead
to a subsequent oops.

Tested by tests/gem_exec_bad_domains from i-g-t.
Reviewed-by: NEric Anholt <eric@anholt.net>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-Off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

4ca4a250

drm/i915: Separate fence pin counting from normal bind pin counting · 1690e1eb

由 Chris Wilson 提交于 12月 14, 2011

In order to correctly account for reserving space in the GTT and fences
for a batch buffer, we need to independently track whether the fence is
pinned due to a fenced GPU access in the batch or whether the buffer is
pinned in the aperture. Currently we count the fenced as pinned if the
buffer has already been seen in the execbuffer. This leads to a false
accounting of available fence registers, causing frequent mass evictions.
Worse, if coupled with the change to make i915_gem_object_get_fence()
report EDADLK upon fence starvation, the batchbuffer can fail with only
one fence required...

Fixes intel-gpu-tools/tests/gem_fenced_exec_thrash

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38735Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Tested-by: NPaul Neumann <paul104x@yahoo.de>
[danvet: Resolve the functional conflict with Jesse Barnes sprite
patches, acked by Chris Wilson on irc.]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

1690e1eb

drm/i915: switch ring->id to be a real id · 96154f2f

由 Daniel Vetter 提交于 12月 14, 2011

... and add a helpr function for the places where we want a flag.

This way we can use ring->id to index into arrays.

v2: Resurrect the missing beautification-space Chris Wilson noted.
I'm moving this space around because I'll reuse ring_str in the next
patch.
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NBen Widawsky <ben@bwidawsk.net>
Reviewed-by: NEugeni Dodonov <eugeni.dodonov@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

96154f2f

26 1月, 2012 1 次提交

drm/i915: argument to control retiring behavior · b93f9cf1

由 Ben Widawsky 提交于 1月 25, 2012

Sometimes it may be the case when we idle the gpu or wait on something
we don't actually want to process the retiring list. This patch allows
callers to choose the behavior.
Reviewed-by: NKeith Packard <keithp@keithp.com>
Reviewed-by: NEugeni Dodonov <eugeni.dodonov@intel.com>
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

b93f9cf1

04 1月, 2012 3 次提交

drm/i915: Add support for resetting the SO write pointers on gen7. · ae662d31

由 Eric Anholt 提交于 1月 03, 2012

These registers are automatically incremented by the hardware during
transform feedback to track where the next streamed vertex output
should go.  Unlike the previous generation, which had a packet for
setting the corresponding registers to a defined value, gen7 only has
MI_LOAD_REGISTER_IMM to do so.  That's a secure packet (since it loads
an arbitrary register), so we need to do it from the kernel, and it
needs to be settable atomically with the batchbuffer execution so that
two clients doing transform feedback don't stomp on each others'
state.

Instead of building a more complicated interface involcing setting the
registers to a specific value, just set them to 0 when asked and
userland can tweak its pointers accordingly.
Signed-off-by: NEric Anholt <eric@anholt.net>
Reviewed-by: NEugeni Dodonov <eugeni.dodonov@intel.com>
Reviewed-by: NKenneth Graunke <kenneth@whitecape.org>
Signed-off-by: NKeith Packard <keithp@keithp.com>

ae662d31

drm/i915: Force sync command ordering (Gen6+) · 84f9f938

由 Ben Widawsky 提交于 12月 12, 2011

The docs say this is required for Gen7, and since the bit was added for
Gen6, we are also setting it there pit pf paranoia. Particularly as
Chris points out, if PIPE_CONTROL counts as a 3d state packet.

This was found through doc inspection by Ken and applies to Gen6+;
Reported-by: NKenneth Graunke <kenneth@whitecape.org>
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: NEric Anholt <eric@anholt.net>
Signed-off-by: NKeith Packard <keithp@keithp.com>

84f9f938

drm/i915: relative_constants_mode race fix · e2971bda

由 Ben Widawsky 提交于 12月 12, 2011

dev_priv keeps track of the current addressing mode that gets set at
execbuffer time. Unfortunately the existing code was doing this before
acquiring struct_mutex which leaves a race with another thread also
doing an execbuffer. If that wasn't bad enough, relocate_slow drops
struct_mutex which opens a much more likely error where another thread
comes in and modifies the state while relocate_slow is being slow.

The solution here is to just defer setting this state until we
absolutely need it, and we know we'll have struct_mutex for the
remainder of our code path.

v2: Keith noticed a bug in the original patch.
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: NKeith Packard <keithp@keithp.com>

e2971bda

27 12月, 2011 1 次提交

drm/i915: Disable semaphores by default on SNB · ebbd857e

由 Keith Packard 提交于 12月 26, 2011

Semaphores still cause problems on some machines:

> From Udo Steinberg:
>
> With Linux-3.2-rc6 I'm frequently seeing GPU hangs when large amounts of
> text scroll in an xterm, such as when extracting a tar archive. Such as this
> one (note the timestamps):
>
>  I can reproduce it fairly easily with something
>  as simple as:
>
>	  while true; do dmesg; done

This patch turns them off on SNB while leaving them on for IVB.
Reported-by: NUdo Steinberg <udo@hypervisor.org>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Eugeni Dodonov <eugeni@dodonov.net>
Signed-off-by: NKeith Packard <keithp@keithp.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

ebbd857e

17 12月, 2011 1 次提交

drm/i915: enable semaphores on per-device defaults · f45b5557

由 Eugeni Dodonov 提交于 12月 09, 2011

This adds a default setting for semaphores parameter, and enables
semaphores by default on IVB.

For now, as semaphores interaction with VTd causes random issues on
SNB, we do not enable them by default. But they can still be enabled
via the semaphores=1 kernel parameter.

v2: enables semaphores on SNB when IO remapping is disabled, with base
on Keith Packard patch.

CC: Daniel Vetter <daniel.vetter@ffwll.ch>
CC: Ben Widawsky <ben@bwidawsk.net>
CC: Keith Packard <keithp@keithp.com>
CC: Jesse Barnes <jbarnes@virtuousgeek.org>
CC: Chris Wilson <chris@chris-wilson.co.uk>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=42696
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=40564
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41353
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38862Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: NEugeni Dodonov <eugeni.dodonov@intel.com>
Signed-off-by: NKeith Packard <keithp@keithp.com>

f45b5557

22 9月, 2011 1 次提交

drm/i915: Dumb down the semaphore logic · c8c99b0f

由 Ben Widawsky 提交于 9月 14, 2011

While I think the previous code is correct, it was hard to follow and
hard to debug. Since we already have a ring abstraction, might as well
use it to handle the semaphore updates and compares.

I don't expect this code to make semaphores better or worse, but you
never know...

v2:
Remove magic per Keith's suggestions.
Ran Daniel's gem_ring_sync_loop test on this.

v3:
Ignored one of Keith's suggestions.

v4:
Removed some bloat per Daniel's recommendation.

Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Keith Packard <keithp@keithp.com>
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Signed-off-by: NKeith Packard <keithp@keithp.com>

c8c99b0f

22 6月, 2011 1 次提交

Revert "drm/i915: Kill GTT mappings when moving from GTT domain" · e92d03bf

由 Eric Anholt 提交于 6月 14, 2011

This reverts commit 4a684a41.
Userland has always been required to set the object's domain to GTT
before using it through a GTT mapping, it's not something that the
kernel is supposed to enforce.  (The pagefault support is so that we
can handle multiple mappings without userland having to pin across
them, not so that userland can use GTT after GPU domains without
telling the kernel).

Fixes 19.2% +/- 0.8% (n=6) performance regression in cairo-gl
firefox-talos-gfx on my T420 latop.
Signed-off-by: NKeith Packard <keithp@keithp.com>

e92d03bf

23 3月, 2011 1 次提交

drm/i915: Disable pagefaults along execbuffer relocation fast path · d4aeee77

由 Chris Wilson 提交于 3月 14, 2011

Along the fast path for relocation handling, we attempt to copy directly
from the user data structures whilst holding our mutex. This causes
lockdep to warn about circular lock dependencies if we need to pagefault
the user pages. [Since when handling a page fault on a mmapped bo, we
need to acquire the struct mutex whilst already holding the mm
semaphore, it is then verboten to acquire the mm semaphore when already
holding the struct mutex. The likelihood of the user passing in the
relocations contained in a GTT mmaped bo is low, but conceivable for
extreme pathology.] In order to force the mm to return EFAULT rather
than handle the pagefault, we therefore need to disable pagefaults
across the relocation fast path.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: stable@kernel.org
Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

d4aeee77

07 3月, 2011 2 次提交

drm/i915: Only wait on a pending flip if we intend to write to the buffer · c59a333f

由 Chris Wilson 提交于 3月 06, 2011

... as if we are only reading from it, we can do that concurrently with
the queue flip.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>

c59a333f

drm/i915: Disable GPU semaphores by default · a1656b90

由 Chris Wilson 提交于 3月 04, 2011

Andi Kleen narrowed his GPU hangs on his Sugar Bay (SNB desktop) rev 09
down to the use of GPU semaphores, and we already know that they appear
broken up to Huron River (mobile) rev 08. (I'm optimistic that disabling
GPU semaphores is simply hiding another bug by the latency and
side-effects of the additional device interaction it introduces...)

However, use of semaphores is a massive performance improvement... Only
as long as the system remains stable. Enable at your peril.
Reported-by: NAndi Kleen <andi-fd@firstfloor.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=33921Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>

a1656b90

02 3月, 2011 2 次提交

drm/i915: Re-enable GPU semaphores for SandyBridge mobile · e8b2c3c4

由 Chris Wilson 提交于 3月 01, 2011

This seems to be running stably on my test laptop, so hopefully the
reported hangs where just symptoms of other bugs.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>

e8b2c3c4

drm/i915: Allow relocation deltas outside of target bo · 271d81b8

由 Chris Wilson 提交于 3月 01, 2011

Userspace has a legitimate requirement to use a delta that points to
outside of the target bo, and so we need to enable this. (As this is an
abi break, albeit a relaxation of the current restrictions, mark the change
with a new flag.)
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>

271d81b8

22 2月, 2011 3 次提交

drm/i915: Use a device flag for non-interruptible phases · ce453d81

由 Chris Wilson 提交于 2月 21, 2011

The code paths for modesetting are growing in complexity as we may need
to move the buffers around in order to fit the scanout in the aperture.
Therefore we face a choice as to whether to thread the interruptible status
through the entire pinning and unbinding code paths or to add a flag to
the device when we may not be interrupted by a signal. This does the
latter and so fixes a few instances of modesetting failures under stress.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>

ce453d81

drm/i915: First try a normal large kmalloc for the temporary exec buffers · 8408c282

由 Chris Wilson 提交于 2月 21, 2011

As we just need a temporary array whilst performing the relocations for
the execbuffer, first attempt to allocate using kmalloc even if it is
not of order page-0. This avoids the overhead of remapping the
discontiguous array and so gives a moderate boost to execution
throughput.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>

8408c282

drm/i915: Protect against drm_gem_object not being the first member · c8725226

由 Chris Wilson 提交于 2月 19, 2011

Dave Airlie spotted that we had a potential bug should we ever rearrange
the drm_i915_gem_object so not the base drm_gem_object was not its first
member. He noticed that we often convert the return of
drm_gem_object_lookup() immediately into drm_i915_gem_object and then
check the result for nullity. This is only valid when the base object is
the first member and so the superobject has the same address. Play safe
instead and use the compiler to convert back to the original return
address for sanity testing.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>

c8725226

07 2月, 2011 1 次提交

drm/i915: Refine tracepoints · db53a302

由 Chris Wilson 提交于 2月 03, 2011

A lot of minor tweaks to fix the tracepoints, improve the outputting for
ftrace, and to generally make the tracepoints useful again. It is a start
and enough to begin identifying performance issues and gaps in our
coverage.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>

db53a302

23 1月, 2011 1 次提交

drm/i915: Fix use of invalid array size for ring->sync_seqno · 076e2c0e

由 Chris Wilson 提交于 1月 21, 2011

There are I915_NUM_RINGS-1 inter-ring synchronisation counters, but we
were clearing I915_NUM_RINGS of them. Oops.
Reported-by: NJiri Slaby <jirislaby@gmail.com>
Tested-by: NJiri Slaby <jirislaby@gmail.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>

076e2c0e