提交 · 437c30874cb90537e6275f9ed6f6b88307a906a1 · openeuler / Kernel

06 8月, 2016 4 次提交

drm/i915: Update comment before i915_spin_request · 437c3087

由 Daniel Vetter 提交于 8月 05, 2016

~jiffie and a few usecs is 3 orders of magnitude different. A bit
much. This was changed in

commit ca5b721e
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Dec 11 11:32:58 2015 +0000

    drm/i915: Limit the busy wait on requests to 5us not 10ms!

But probably missed the comment since the change was non-local to the
comment.

v2: Polish comment more (Chris).

Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470413484-23775-1-git-send-email-daniel.vetter@ffwll.ch

437c3087

drm/i915: Use drm official vblank_no_hw_counter callback. · 4194c088

由 Rodrigo Vivi 提交于 8月 03, 2016

No functional change. Instead of defining a new empty function
let's use what is available on drm.

It gets cleaner, and easy to read, and understand.
Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com>

4194c088

drm/i915: Fix copy_to_user usage for pipe_crc · 4e9121e6

由 Rodrigo Vivi 提交于 8月 03, 2016

Copy to user return the number of bytes it couldn't write
and zero on success. So any number different than 0 should
be considered a fault, not only when it doesn't write
the full size.

v2: fixed the inverted logic. (Ville)

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com>

4e9121e6

Revert "drm/i915: Track active streams also for DP SST" · 19e0b4ca

由 Ville Syrjälä 提交于 8月 05, 2016

This reverts commit f64425a8.

active_streams will get totally out of whack with SST unless we
sync up with the hw state at readout, obviously! We don't yet
do that, so now the WARNs fire all the time. Let's revert :(
Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470413142-26402-1-git-send-email-ville.syrjala@linux.intel.com
References: https://bugs.freedesktop.org/show_bug.cgi?id=95472#c14Acked-by: NChris Wilson <chris@chris-wilson.co.uk>

19e0b4ca

05 8月, 2016 36 次提交

drm/i915: fix WaInsertDummyPushConstPs · 575e3ccb

由 Matthew Auld 提交于 8月 02, 2016

As pointed out by Chris Harris, we are using the wrong WA name, it
should in fact be WaToEnableHwFixForPushConstHWBug, also it should be
applied from C0 onwards for both BXT and KBL.

Fixes: 7b9005cd ("drm/i915: Add WaInsertDummyPushConstP for bxt and kbl")

Cc: Chris Harris <chris.harris@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reported-by: NChris Harris <chris.harris@intel.com>
Signed-off-by: NMatthew Auld <matthew.auld@intel.com>
Reviewed-by: NArun Siluvery <arun.siluvery@linux.intel.com>
Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470127013-29653-1-git-send-email-matthew.auld@intel.com

575e3ccb

drm/i915: Assert that the request hasn't been retired · 209b3f7e

由 Chris Wilson 提交于 8月 05, 2016

With all callers now not playing tricks with dropping the struct_mutex
between waiting and retiring, we can assert that the request is ready to
be retired.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-19-git-send-email-chris@chris-wilson.co.uk

209b3f7e

drm/i915: Repack fence tiling mode and stride into a single integer · 3e510a8e

由 Chris Wilson 提交于 8月 05, 2016

In the previous commit, we moved the obj->tiling_mode out of a bitfield
and into its own integer so that we could safely use READ_ONCE(). Let us
now repair some of that damage by sharing the tiling_mode with its
companion, the fence stride.

v2: New magic
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-18-git-send-email-chris@chris-wilson.co.uk

3e510a8e

drm/i915: Document and reject invalid tiling modes · deeb1519

由 Chris Wilson 提交于 8月 05, 2016

Through the GTT interface to the fence registers, we can only handle
linear, X and Y tiling. The more esoteric tiling patterns are ignored.
Document that the tiling ABI only supports upto Y tiling, and reject any
attempts to set a tiling mode other than NONE, X or Y.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-17-git-send-email-chris@chris-wilson.co.uk

deeb1519

drm/i915: Remove locking for get_tiling · 9ad36761

由 Chris Wilson 提交于 8月 05, 2016

Since we are not concerned with userspace racing itself with set-tiling
(the order is indeterminant even if we take a lock), then we can safely
read back the single obj->tiling_mode and do the static lookup of
swizzle mode without having to take a lock.

get-tiling is reasonably frequent due to the back-channel passing around
of tiling parameters in DRI2/DRI3.

v2: Make tiling_mode a full unsigned int so that we can trivially use it
with READ_ONCE(). Separating it out into manual control over the flags
field was too noisy for a simple patch. Note that we could use the lower
bits of obj->stride for the tiling mode.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-16-git-send-email-chris@chris-wilson.co.uk

9ad36761

drm/i915: Remove pinned check from madvise ioctl · e883d735

由 Chris Wilson 提交于 8月 05, 2016

We don't need to incur the overhead of checking whether the object is
pinned prior to changing its madvise. If the object is pinned, the
madvise will not take effect until it is unpinned and so we cannot free
the pages being pointed at by hardware. Marking a pinned object with
allocated pages as DONTNEED will not trigger any undue warnings. The check
is therefore superfluous, and by removing it we can remove a linear walk
over all the vma the object has.

Still despite it being an overzealous check, that error code is part of
the current ABI and so we must proceed with caution.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-15-git-send-email-chris@chris-wilson.co.uk

e883d735

drm/i915: Reduce locking inside swfinish ioctl · c21724cc

由 Chris Wilson 提交于 8月 05, 2016

We only need to take the struct_mutex if the object is pinned to the
display engine and so requires checking for clflush. (The race with
userspace pinning the object to a framebuffer is irrelevant.)

v2: Use access once for compiler hints (or not as it is a bitfield)
v3: READ_ONCE, obj->pin_display is not a bitfield anymore
v4: Don't be creative with goto.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-14-git-send-email-chris@chris-wilson.co.uk

c21724cc

drm/i915: Remove (struct_mutex) locking for busy-ioctl · 3fdc13c7

由 Chris Wilson 提交于 8月 05, 2016

By applying the same logic as for wait-ioctl, we can query whether a
request has completed without holding struct_mutex. The biggest impact
system-wide is removing the flush_active and the contention that causes.

Testcase: igt/gem_busy
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Akash Goel <akash.goel@intel.com>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-13-git-send-email-chris@chris-wilson.co.uk

3fdc13c7

drm/i915: Remove (struct_mutex) locking for wait-ioctl · 033d549b

由 Chris Wilson 提交于 8月 05, 2016

With a bit of care (and leniency) we can iterate over the object and
wait for previous rendering to complete with judicial use of atomic
reference counting. The ABI requires us to ensure that an active object
is eventually flushed (like the busy-ioctl) which is guaranteed by our
management of requests (i.e. everything that is submitted to hardware is
flushed in the same request). All we have to do is ensure that we can
detect when the requests are complete for reporting when the object is
idle (without triggering ETIME), locklessly - this is handled by
i915_gem_active_wait_unlocked().

The impact of this is actually quite small - the return to userspace
following the wait was already lockless and so we don't see much gain in
latency improvement upon completing the wait. What we do achieve here is
completing an already finished wait without hitting the struct_mutex,
our hold is quite short and so we are typically just a victim of
contention rather than a cause - but it is still one less contention
point!

v2: Break up a long line.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-12-git-send-email-chris@chris-wilson.co.uk

033d549b

drm/i915: Do a nonblocking wait first in pread/pwrite · 258a5ede

由 Chris Wilson 提交于 8月 05, 2016

If we try and read or write to an active request, we first must wait
upon the GPU completing that request. Let's do that without holding the
mutex (and so allow someone else to access the GPU whilst we wait). Upon
completion, we will acquire the mutex and only then start the operation
(i.e. we do not rely on state from before the initial wait).

v2: Repaint the goto labels
v3: Move the tracepoints back to the start of the ioctls
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-11-git-send-email-chris@chris-wilson.co.uk

258a5ede

drm/i915: Remove unused no-shrinker-steal · 3b4e896f

由 Chris Wilson 提交于 8月 05, 2016

After removing the user of this wart, we can remove the wart entirely.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-10-git-send-email-chris@chris-wilson.co.uk

3b4e896f

drm/i915: Tidy generation of the GTT mmap offset · f3f6184c

由 Chris Wilson 提交于 8月 05, 2016

If we make the observation that mmap-offsets are only released when we
free an object, we can then deduce that the shrinker only creates free
space in the mmap arena indirectly by flushing the request list and
freeing expired objects. If we combine this with the lockless
vma-manager and lockless idling, we can avoid taking our big struct_mutex
until we need to actually free the requests.

One side-effect is that we defer the madvise checking until we need the
pages (i.e. the fault handler). This brings us into line with the other
delayed checks (and madvise in general).

v2: s/ret/err/ and use if (!err) rather than if (ret == 0)
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-9-git-send-email-chris@chris-wilson.co.uk

f3f6184c

drm/i915/shrinker: Wait before acquiring struct_mutex under oom · 5cba5be6

由 Chris Wilson 提交于 8月 05, 2016

We can now wait for the GPU (all engines) to become idle without
requiring the struct_mutex. Inside the shrinker, we need to currently
take the struct_mutex in order to purge objects and to purge the objects
we need the GPU to be idle - causing a stall whilst we hold the
struct_mutex. We can hide most of that stall by performing the wait
before taking the struct_mutex and only doing essential waits for
new rendering on objects to be freed.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-8-git-send-email-chris@chris-wilson.co.uk

5cba5be6

drm/i915: Simplify do_idling() (Ironlake vt-d w/a) · 307dc25b

由 Chris Wilson 提交于 8月 05, 2016

Now that we pass along the expected interruptible nature for the
wait-for-idle, we do not need to modify the global
i915->mm.interruptible for this single call. (Only the immediate call to
i915_gem_wait_for_idle() takes the interruptible status as the other
action, dma_map_sg(), is independent of i915.ko)
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-7-git-send-email-chris@chris-wilson.co.uk

307dc25b

drm/i915: Enable i915_gem_wait_for_idle() without holding struct_mutex · dcff85c8

由 Chris Wilson 提交于 8月 05, 2016

The principal motivation for this was to try and eliminate the
struct_mutex from i915_gem_suspend - but we still need to hold the mutex
current for the i915_gem_context_lost(). (The issue there is that there
may be an indirect lockdep cycle between cpu_hotplug (i.e. suspend) and
struct_mutex via the stop_machine().) For the moment, enabling last
request tracking for the engine, allows us to do busyness checking and
waiting without requiring the struct_mutex - which is useful in its own
right.

As a side-effect of having a robust means for tracking engine busyness,
we can replace our other busyness heuristic, that of comparing against
the last submitted seqno. For paranoid reasons, we have a semi-ordered
check of that seqno inside the hangchecker, which we can now improve to
an ordered check of the engine's busyness (removing a locked xchg in the
process).

v2: Pass along "bool interruptible" as being unlocked we cannot rely on
i915->mm.interruptible being stable or even under our control.
v3: Replace check Ironlake i915_gpu_busy() with the common precalculated value
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-6-git-send-email-chris@chris-wilson.co.uk

dcff85c8

drm/i915: Remove forced stop ring on suspend/unload · 90f4fcd5

由 Chris Wilson 提交于 8月 05, 2016

Before suspending (or unloading), we would first wait upon all rendering
to be completed and then disable the rings. This later step is a remanent
from DRI1 days when we did not use request tracking for all operations
upon the ring. Now that we are sure we are waiting upon the very last
operation by the engine, we can forgo clobbering the ring registers,
though we do keep the assert that the engine is indeed idle before
sleeping.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-5-git-send-email-chris@chris-wilson.co.uk

90f4fcd5

drm/i915/userptr: Remove superfluous interruptible=false on waiting · f826ee21

由 Chris Wilson 提交于 8月 05, 2016

Inside the kthread context, we can't be interrupted by signals so
touching the mm.interruptible flag is pointless and wait-request now
consumes EIO itself.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-4-git-send-email-chris@chris-wilson.co.uk

f826ee21

drm/i915: Convert non-blocking userptr waits for requests over to using RCU · 8a3b3d57

由 Chris Wilson 提交于 8月 05, 2016

We can completely avoid taking the struct_mutex around the non-blocking
waits by switching over to the RCU request management (trading the mutex
for a RCU read lock and some complex atomic operations). The improvement
is that we gain further contention reduction, and overall the code
become simpler due to the reduced mutex dancing.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-3-git-send-email-chris@chris-wilson.co.uk

8a3b3d57

drm/i915: Convert non-blocking waits for requests over to using RCU · b8f9096d

由 Chris Wilson 提交于 8月 05, 2016

v2: Move i915_gem_fault tracepoint back to the start of the function,
before the unlocked wait.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-2-git-send-email-chris@chris-wilson.co.uk

b8f9096d

drm/i915: Introduce i915_gem_active_wait_unlocked() · 2467658e

由 Chris Wilson 提交于 8月 05, 2016

It is useful to be able to wait on pending rendering without grabbing
the struct_mutex. We can do this by using the i915_gem_active_get_rcu()
primitive to acquire a reference to the pending request without
requiring struct_mutex, just the RCU read lock, and then call
i915_wait_request().

v2: Rebase onto new i915_gem_active_get_unlocked() semantics that take
the RCU read lock on behalf of the caller.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-1-git-send-email-chris@chris-wilson.co.uk

2467658e

Merge remote-tracking branch 'airlied/drm-next' into drm-intel-next-queued · 94558e26

由 Daniel Vetter 提交于 8月 05, 2016

Backmerge the 4.8 pull request state from Dave - conflicts were
getting out of hand, and Chris has some patches which outright don't
apply without everything merged together again.
Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com>

94558e26

drm/i915: Fix iboost setting for SKL Y/U DP DDI buffer translation entry 2 · 5ac90567

由 Ville Syrjälä 提交于 8月 02, 2016

The spec was recently fixed to have the correct iboost setting for the
SKL Y/U DP DDI buffer translation table entry 2. Update our tables
to match.

Cc: David Weinehall <david.weinehall@linux.intel.com>
Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470140517-13011-1-git-send-email-ville.syrjala@linux.intel.com
Cc: stable@vger.kernel.org
Reviewed-by: NDavid Weinehall <david.weinehall@linux.intel.com>

5ac90567

drm/i915/gen9: Give one extra block per line for SKL plane WM calculations · 055c3ff6

由 Matt Roper 提交于 8月 04, 2016

The bspec was updated a couple weeks ago to add an extra block per line
to plane watermark calculations for linear pixel formats.

Bspec update 115327 description:
  "Gen9+ - Updated the plane blocks per line calculation for linear
  cases. Adds +1 for all linear cases to handle the non-block aligned
  stride cases."

Cc: Lyude <cpaul@redhat.com>
Cc: drm-intel-fixes@lists.freedesktop.org
Signed-off-by: NMatt Roper <matthew.d.roper@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470344880-27394-1-git-send-email-matthew.d.roper@intel.comReviewed-by: NLyude <cpaul@redhat.com>

055c3ff6

drm/i915: Export our request as a dma-buf fence on the reservation object · ad778f89

由 Chris Wilson 提交于 8月 04, 2016

If the GEM objects being rendered with in this request have been
exported via dma-buf to a third party, hook ourselves into the dma-buf
reservation object so that the third party can serialise with our
rendering via the dma-buf fences.

Testcase: igt/prime_busy
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-26-git-send-email-chris@chris-wilson.co.uk

ad778f89

drm/i915: Enable lockless lookup of request tracking via RCU · 0eafec6d

由 Chris Wilson 提交于 8月 04, 2016

If we enable RCU for the requests (providing a grace period where we can
inspect a "dead" request before it is freed), we can allow callers to
carefully perform lockless lookup of an active request.

However, by enabling deferred freeing of requests, we can potentially
hog a lot of memory when dealing with tens of thousands of requests per
second - with a quick insertion of a synchronize_rcu() inside our
shrinker callback, that issue disappears.

v2: Currently, it is our responsibility to handle reclaim i.e. to avoid
hogging memory with the delayed slab frees. At the moment, we wait for a
grace period in the shrinker, and block for all RCU callbacks on oom.
Suggested alternatives focus on flushing our RCU callback when we have a
certain number of outstanding request frees, and blocking on that flush
after a second high watermark. (So rather than wait for the system to
run out of memory, we stop issuing requests - both are nondeterministic.)

Paul E. McKenney wrote:

Another approach is synchronize_rcu() after some largish number of
requests.  The advantage of this approach is that it throttles the
production of callbacks at the source.  The corresponding disadvantage
is that it slows things up.

Another approach is to use call_rcu(), but if the previous call_rcu()
is still in flight, block waiting for it.  Yet another approach is
the get_state_synchronize_rcu() / cond_synchronize_rcu() pair.  The
idea is to do something like this:

        cond_synchronize_rcu(cookie);
        cookie = get_state_synchronize_rcu();

You would of course do an initial get_state_synchronize_rcu() to
get things going.  This would not block unless there was less than
one grace period's worth of time between invocations.  But this
assumes a busy system, where there is almost always a grace period
in flight.  But you can make that happen as follows:

        cond_synchronize_rcu(cookie);
        cookie = get_state_synchronize_rcu();
        call_rcu(&my_rcu_head, noop_function);

Note that you need additional code to make sure that the old callback
has completed before doing a new one.  Setting and clearing a flag
with appropriate memory ordering control suffices (e.g,. smp_load_acquire()
and smp_store_release()).

v3: More comments on compiler and processor order of operations within
the RCU lookup and discover we can use rcu_access_pointer() here instead.

v4: Wrap i915_gem_active_get_rcu() to take the rcu_read_lock itself.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: "Goel, Akash" <akash.goel@intel.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-25-git-send-email-chris@chris-wilson.co.uk

0eafec6d

drm/i915: Move i915_gem_object_wait_rendering() · 00e60f26

由 Chris Wilson 提交于 8月 04, 2016

Just move it earlier so that we can use the companion nonblocking
version in a couple of more callsites without having to add a forward
declaration.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-24-git-send-email-chris@chris-wilson.co.uk

00e60f26

drm/i915: Move obj->active:5 to obj->flags · 573adb39

由 Chris Wilson 提交于 8月 04, 2016

We are motivated to avoid using a bitfield for obj->active for a couple
of reasons. Firstly, we wish to document our lockless read of obj->active
using READ_ONCE inside i915_gem_busy_ioctl() and that requires an
integral type (i.e. not a bitfield). Secondly, gcc produces abysmal code
when presented with a bitfield and that shows up high on the profiles of
request tracking (mainly due to excess memory traffic as it converts
the bitfield to a register and back and generates frequent AGI in the
process).

v2: BIT, break up a long line in compute the other engines, new paint
for i915_gem_object_is_active (now i915_gem_object_get_active).
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-23-git-send-email-chris@chris-wilson.co.uk

573adb39

drm/i915: Use dev_priv consistently through the intel_frontbuffer interface · 5748b6a1

由 Chris Wilson 提交于 8月 04, 2016

Rather than a mismash of struct drm_device *dev and struct
drm_i915_private *dev_priv being used freely within a function, be
consistent and only pass along dev_priv.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-22-git-send-email-chris@chris-wilson.co.uk

5748b6a1

drm/i915: Use atomics to manipulate obj->frontbuffer_bits · faf5bf0a

由 Chris Wilson 提交于 8月 04, 2016

The individual bits inside obj->frontbuffer_bits are protected by each
plane->mutex, but the whole bitfield may be accessed by multiple KMS
operations simultaneously and so the RMW need to be under atomics.
However, for updating the single field we do not need to mandate that it
be under the struct_mutex, one more step towards its removal as the de
facto BKL.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-21-git-send-email-chris@chris-wilson.co.uk

faf5bf0a

drm/i915: Make fb_tracking.lock a spinlock · b5add959

由 Chris Wilson 提交于 8月 04, 2016

We only need a very lightweight mechanism here as the locking is only
used for co-ordinating a bitfield.

v2: Move the cheap unlikely tests into the caller
v3: Move the kerneldoc into the header (now separated out into
intel_fronbuffer.h for better kerneldoc and readability)
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtien <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-20-git-send-email-chris@chris-wilson.co.uk

b5add959

drm/i915: Separate intel_frontbuffer into its own header · 5d723d7a

由 Chris Wilson 提交于 8月 04, 2016

In view of adding inline functions into the intel_frontbuffer section,
we first split the header into its own file so that we can integrate it
more easily with kerneldoc.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-19-git-send-email-chris@chris-wilson.co.uk

5d723d7a

drm/i915: Remove highly confusing i915_gem_obj_ggtt_pin() · de895082

由 Chris Wilson 提交于 8月 04, 2016

Since i915_gem_obj_ggtt_pin() is an idiom breaking curry function for
i915_gem_object_ggtt_pin(), spare us the confusion and remove it.
Removing it now simplifies later patches to change the i915_vma_pin()
(and friends) interface.

v2: Add a redundant GEM_BUG_ON(!view) to
i915_gem_obj_lookup_or_create_ggtt_vma()
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-18-git-send-email-chris@chris-wilson.co.uk

de895082

drm/i915: Make i915_vma_pin() small and inline · 305bc234

由 Chris Wilson 提交于 8月 04, 2016

Not only is i915_vma_pin() called for every single object on every single
execbuf, it is usually a simple increment as the VMA is already bound for
execution by the GPU. Rearrange the tests for unbound and pin_count
overflow so that we can do the increment and test very cheaply and
compact enough to inline the operation into execbuf. The trick used is
to note that we can check for an overflow bit (keeping space available
for it inside the flags) at the same time as checking the binding bits.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-17-git-send-email-chris@chris-wilson.co.uk

305bc234

drm/i915: Combine all i915_vma bitfields into a single set of flags · 3272db53

由 Chris Wilson 提交于 8月 04, 2016

In preparation to perform some magic to speed up i915_vma_pin(), which
is among the hottest of hot paths in execbuf, refactor all the bitfields
accessed by i915_vma_pin() into a single unified set of flags.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-16-git-send-email-chris@chris-wilson.co.uk

3272db53

drm/i915: Start passing around i915_vma from execbuffer · 59bfa124

由 Chris Wilson 提交于 8月 04, 2016

During execbuffer we look up the i915_vma in order to reserve them in
the VM. However, we then do a double lookup of the vma in order to then
pin them, all because we lack the necessary interfaces to operate on
i915_vma - so introduce i915_vma_pin()!

v2: Tidy parameter lists to remove one level of redirection in the hot
path.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-15-git-send-email-chris@chris-wilson.co.uk

59bfa124

drm/i915: Wrap vma->pin_count accessors with small inline helpers · 20dfbde4

由 Chris Wilson 提交于 8月 04, 2016

In the next few patches, the VMA pinning API is overhauled and to reduce
the churn we pull out the update to the accessors into a prep patch.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-14-git-send-email-chris@chris-wilson.co.uk

20dfbde4

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功