提交 · 3089c6f239d7d2c4cb2dd5c353e8984cf79af1d7 · openeuler / Kernel

06 8月, 2013 5 次提交

drm/i915: make caching operate on all address spaces · 3089c6f2

由 Ben Widawsky 提交于 7月 31, 2013

For now, objects will maintain the same cache levels amongst all address
spaces. This is to limit the risk of bugs, as playing with cacheability
in the different domains can be very error prone.

In the future, it may be optimal to allow setting domains per VMA (ie.
an object bound into an address space).
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

3089c6f2

drm/i915: Add VM to pin · c37e2204

由 Ben Widawsky 提交于 7月 31, 2013

To verbalize it, one can say, "pin an object into the given address
space." The semantics of pinning remain the same otherwise.

Certain objects will always have to be bound into the global GTT.
Therefore, global GTT is a special case, and keep a special interface
around for it (i915_gem_obj_ggtt_pin).

v2: s/i915_gem_ggtt_pin/i915_gem_obj_ggtt_pin
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

c37e2204

drm/i915: Use bound list for inactive shrink · fcb4a578

由 Ben Widawsky 提交于 7月 31, 2013

Do to the move active/inactive lists, it no longer makes sense to use
them for shrinking, since shrinking isn't VM specific (such a need may
also exist, but doesn't yet).

What we can do instead is use the global bound list to find all objects
which aren't active.
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

fcb4a578

drm/i915: Make proper functions for VMs · a70a3148

由 Ben Widawsky 提交于 7月 31, 2013

Earlier in the conversion sequence we attempted to quickly wedge in the
transitional interface as static inlines.

Now that we're sure these interfaces are sane, for easier debug and to
decrease code size (since many of these functions may be called quite a
bit), make them real functions

While at it, kill off the set_color interface. We'll always have the
VMA, or easily get to it.
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

a70a3148

drm/i915: Create an init vm · fc8c067e

由 Ben Widawsky 提交于 7月 31, 2013

Move all the similar address space (VM) initialization code to one
function. Until we have multiple VMs, there should only ever be 1 VM.
The aliasing ppgtt is a special case without it's own VM (since it
doesn't need it's own address space management).
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

fc8c067e

25 7月, 2013 2 次提交

drm/i915: fix the racy object accounting · c20e8355

由 Daniel Vetter 提交于 7月 24, 2013

Just use a spinlock to protect them.

v2: Rebase onto the new object create refcount fix patch.

v3: Don't kill dev_priv->mm.object_memory as requested by Chris and
hence just use a spinlock instead of atomic_t.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67287Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

c20e8355

drm/i915: fix reference counting in i915_gem_create · d861e338

由 Daniel Vetter 提交于 7月 24, 2013

This function is called without the dev->struct_mutex held, hence we
need to use the _unlocked unreference variants.

As soon as the object is registered userspace can sneak in here with a
gem_close ioctl call, so the object can (and with my new evil tests
actually does) get the final unreference in this place. The lack of
locking then results in hilarity and some good leakage.

To fix this we simply need to revert

Chris Wilson <chris@chris-wilson.co.uk>

v2: We need to make the trace call _before_ we drop our ref - the
object might very well be gone by then already.

v3: Just revert the original patch as suggested by Chris Wilson.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
[danvet: Remove the added white line again to tighten the return
block, requested by Chris.]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

d861e338

24 7月, 2013 1 次提交

drm/i915: fix up error cleanup in i915_gem_object_bind_to_gtt · bc6bc15b

由 Daniel Vetter 提交于 7月 22, 2013

This has been broken in

commit 2f633156
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Wed Jul 17 12:19:03 2013 -0700

    drm/i915: Create VMAs

which resulted in an OOPS the first time around we've hit -ENOSPC.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67156
Cc: Imre Deak <imre.deak@intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Tested-by: Nmeng <mengmeng.meng@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NBen Widawsky <ben@bwidawsk.net>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

bc6bc15b

19 7月, 2013 4 次提交

drm/i915: add prefault_disable module option · 0b74b508

由 Xiong Zhang 提交于 7月 19, 2013

prefault is stll enabled by default which prevent most of pwrite/pread/reloc
from running slow path, in order to verify these slow pathes, prefault need
to be disabled.
Signed-off-by: NXiong Zhang <xiong.y.zhang@intel.com>
[danvet: Make checkpatch happy and bikeshed the module option help
text a bit.]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

0b74b508

drm/i915: use after free on error path · 6286ef9b

由 Dan Carpenter 提交于 7月 19, 2013

i915_gem_vma_destroy() frees its argument so we have to move the
drm_mm_remove_node() call up a few lines.
Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

6286ef9b

drm/i915: checking for NULL instead of IS_ERR() · db473b36

由 Dan Carpenter 提交于 7月 19, 2013

i915_gem_vma_create() returns and ERR_PTR() or a valid pointer, it never
returns NULL.
Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

db473b36

drm/i915: correctly restore fences with objects attached · 94a335db

由 Daniel Vetter 提交于 7月 17, 2013

To avoid stalls we delay tiling changes and especially hold of
committing the new fence state for as long as possible.
Synchronization points are in the execbuf code and in our gtt fault
handler.

Unfortunately we've missed that tricky detail when adding proper fence
restore code in

commit 19b2dbde
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Jun 12 10:15:12 2013 +0100

    drm/i915: Restore fences after resume and GPU resets

The result was that we've restored fences for objects with no tiling,
since the object<->fence link still existed after resume. Now that
wouldn't have been too bad since any subsequent access would have
fixed things up, but if we've changed from tiled to untiled real havoc
happened:

The tiling stride is stored -1 in the fence register, so a stride of 0
resulted in all 1s in the top 32bits, and so a completely bogus fence
spanning everything from the start of the object to the top of the
GTT. The tell-tale in the register dumps looks like:

                 FENCE START 2: 0x0214d001
                 FENCE END 2: 0xfffff3ff

Bit 11 isn't set since the hw doesn't store it, even when writing all
1s (at least on my snb here).

To prevent such a gaffle in the future add a sanity check for fences
with an untiled object attached in i915_gem_write_fence.

v2: Fix the WARN, spotted by Chris.

v3: Trying to reuse get_fences looked ugly and obfuscated the code.
Instead reuse update_fence and to make it really dtrt also move the
fence dirty state clearing into update_fence.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Stéphane Marchesin <marcheu@chromium.org>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=60530
Cc: stable@vger.kernel.org (for 3.10 only)
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Tested-by: NMatthew Garrett <matthew.garrett@nebula.com>
Tested-by: NBjörn Bidar <theodorstormgrade@gmail.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

94a335db

18 7月, 2013 4 次提交

drm/i915: Create VMAs · 2f633156

由 Ben Widawsky 提交于 7月 17, 2013

Formerly: "drm/i915: Create VMAs (part 1)"

In a previous patch, the notion of a VM was introduced. A VMA describes
an area of part of the VM address space. A VMA is similar to the concept
in the linux mm. However, instead of representing regular memory, a VMA
is backed by a GEM BO. There may be many VMAs for a given object, one
for each VM the object is to be used in. This may occur through flink,
dma-buf, or a number of other transient states.

Currently the code depends on only 1 VMA per object, for the global GTT
(and aliasing PPGTT). The following patches will address this and make
the rest of the infrastructure more suited

v2: s/i915_obj/i915_gem_obj (Chris)

v3: Only move an object to the now global unbound list if there are no
more VMAs for the object which are bound into a VM (ie. the list is
empty).

v4: killed obj->gtt_space
some reworks due to rebase

v5: Free vma on error path (Imre)

v6: Another missed vma free in i915_gem_object_bind_to_gtt error path
(Imre)
Fixed vma freeing in stolen preallocation (Imre)
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Reviewed-by: NImre Deak <imre.deak@intel.com>
[danvet: Squash in fixup from Ben to not deref a non-existing vma in
set_cache_level, reported by Chris.]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

2f633156

drm/i915: Move active/inactive lists to new mm · 5cef07e1

由 Ben Widawsky 提交于 7月 16, 2013

Shamelessly manipulated out of Daniel :-)
"When moving the lists around explain that the active/inactive stuff is
used by eviction when we run out of address space, so needs to be
per-vma and per-address space. Bound/unbound otoh is used by the
shrinker which only cares about the amount of memory used and not one
bit about in which address space this memory is all used in. Of course
to actual kick out an object we need to unbind it from every address
space, but for that we have the per-object list of vmas."

v2: Leave the bound list as a global one. (Chris, indirectly)

v3: Rebased with no i915_gtt_vm. In most places I added a new *vm local,
since it will eventually be replaces by a vm argument.
Put comment back inline, since it no longer makes sense to do otherwise.

v4: Rebased on hangcheck/error state movement
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Reviewed-by: NImre Deak <imre.deak@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

5cef07e1

drm/i915: Put the mm in the parent address space · 93bd8649

由 Ben Widawsky 提交于 7月 16, 2013

Every address space should support object allocation. It therefore makes
sense to have the allocator be part of the "superclass" which GGTT and
PPGTT will derive.

Since our maximum address space size is only 2GB we're not yet able to
avoid doing allocation/eviction; but we'd hope one day this becomes
almost irrelvant.

v2: Rebased
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Reviewed-by: NImre Deak <imre.deak@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

93bd8649

drm/i915: Move gtt and ppgtt under address space umbrella · 853ba5d2

由 Ben Widawsky 提交于 7月 16, 2013

The GTT and PPGTT can be thought of more generally as GPU address
spaces. Many of their actions (insert entries), state (LRU lists), and
many of their characteristics (size) can be shared. Do that.

The change itself doesn't actually impact most of the VMA/VM rework
coming up, it just fits in with the grand scheme of abstracting the GPU
VM operations. GGTT will usually be a special case where we either know
an object must be in the GGTT (dislay engine, workarounds, etc.).

The scratch page is left as part of the VM (even though it's currently
shared with the ppgtt code) because in the future when we have Full
PPGTT, I intend to create a separate scratch page for each.

v2: Drop usage of i915_gtt_vm (Daniel)
Make cleanup also part of the parent class (Ben)
Modified commit msg
Rebased

v3: Properly share scratch page (Imre)
Finish commit message (Daniel, Imre)
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Reviewed-by: NImre Deak <imre.deak@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

853ba5d2

16 7月, 2013 3 次提交

drm/i915: introduce i915_queue_hangcheck · 10cd45b6

由 Mika Kuoppala 提交于 7月 03, 2013

To run hangcheck in near future.
Signed-off-by: NMika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: NBen Widawsky <ben@bwidawsk.net>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

10cd45b6

drm/i915: store eLLC size · 59124506

由 Ben Widawsky 提交于 7月 04, 2013

The eLLC cannot be determined by PCIID because as far as we know, even
machines supporting eLLC may not have it enabled, or fused off or
whatever. It's possible this isn't actually true, and at that point we
can switch to a DEV_INFO flag instead.

I've defined everything where the docs are clear, and left the rest as
magic.

But we need it before we set the pte_encode function pointers, which
happens really early, in gtt_init.

The problem with just doing the normal sequence earlier is we don't have
the ability to use forcewake until after the pte functions have been set
up.

Since all solutions are somewhat ugly (barring rewriting all the init
ordering), I've opted to do the detection really early, and the enabling
later - since the register to detect doesn't require forcewake.
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
[danvet: Move dev_priv->ellc_size away from the dri1 dungeon to a nice
place right next to the l3 parity stuff. Also squash in the follow-up
commit to read out the eLLC size a bit earlier.]
Reviewed-by: NRodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

59124506

drm/i915: Define some of the eLLC magic · 05e21cc4

由 Ben Widawsky 提交于 7月 04, 2013

The EDRAM present register isn't really defined in the docs. It just
says check to see if it's set to 1. So I haven't defined the 1 value not
knowing what it actually means.
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

05e21cc4

10 7月, 2013 4 次提交

Revert "drm/i915: Workaround incoherence between fences and LLC across multiple CPUs" · 46a0b638

由 Chris Wilson 提交于 7月 10, 2013

This reverts commit 25ff1195 and the follow on for Valleyview commit 2dc8aae0.

commit 25ff1195
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Apr 4 21:31:03 2013 +0100

    drm/i915: Workaround incoherence between fences and LLC across multiple CPUs

commit 2dc8aae0
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed May 22 17:08:06 2013 +0100

    drm/i915: Workaround incoherence with fence updates on Valleyview

Jon Bloomfield came up with a plausible explanation and cheap fix
(drm/i915: Fix incoherence with fence updates on Sandybridge+) for the
race condition, so lets run with it.

This is a candidate for stable as the old workaround incurs a
significant cost (calling wbinvd on all CPUs before performing the
register write) for some workloads as noted by Carsten Emde.

Link: http://lists.freedesktop.org/archives/intel-gfx/2013-June/028819.html
References: https://www.osadl.org/?id=1543#c7602
References: https://bugs.freedesktop.org/show_bug.cgi?id=63825Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Carsten Emde <C.Emde@osadl.org>
Cc: stable@vger.kernel.org
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

46a0b638

drm/i915: Fix incoherence with fence updates on Sandybridge+ · d18b9619

由 Chris Wilson 提交于 7月 10, 2013

This hopefully fixes the root cause behind the workaround added in

commit 25ff1195
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Apr 4 21:31:03 2013 +0100

    drm/i915: Workaround incoherence between fences and LLC across multiple CPUs

Thanks to further investigation by Jon Bloomfield, he realised that
the 64-bit register might be broken up by the hardware into two 32-bit
writes (a problem we have encountered elsewhere). This non-atomicity
would then cause an issue where a second thread would see an
intermediate register state (new high dword, old low dword), and this
register would randomly be used in preference to its own thread register.
This would cause the second thread to read from and write into a fairly
random tiled location.  Breaking the operation into 3 explicit 32-bit
updates (first disable the fence, poke the upper bits, then poke the lower
bits and enable) ensures that, given proper serialisation between the
32-bit register write and the memory transfer, that the fence value is
always consistent.

Armed with this knowledge, we can explain how the previous workaround
work. The key to the corruption is that a second thread sees an
erroneous fence register that conflicts and overrides its own. By
serialising the fence update across all CPUs, we have a small window
where no GTT access is occurring and so hide the potential corruption.
This also leads to the conclusion that the earlier workaround was
incomplete.

v2: Be overly paranoid about the order in which fence updates become
visible to the GPU to make really sure that we turn the fence off before
doing the update, and then only switch the fence on afterwards.
Signed-off-by: NJon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Carsten Emde <C.Emde@osadl.org>
Cc: stable@vger.kernel.org
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

d18b9619

drm/i915: don't frob mm.suspended when not using ums · db1b76ca

由 Daniel Vetter 提交于 7月 09, 2013

In kernel modeset driver mode we're in full control of the chip,
always. So there's no need at all to set mm.suspended in
i915_gem_idle. Hence move that out into the leavevt ioctl. Since
i915_gem_idle doesn't suspend gem any more we can also drop the
re-enabling for KMS in the thaw function.

Also clean up the handling of mm.suspend at driver load by coalescing
all the assignments.

Stumbled over while reading through our resume code for unrelated
reasons.

v2: Shovel mm.suspended into the (newly created) ums dungeon as
suggested by Chris Wilson. The plan is that once we've completely
stopped relying on the register save/restore code we could shovel even
that in there.

v3: Improve the locking for the entervt/leavevt ioctls a bit by moving
the dev->struct_mutex locking outside of i915_gem_idle. Also don't
clear dev_priv->ums.mm_suspended for the kms case, we allocate it with
kzalloc. Both suggested by Chris Wilson.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

db1b76ca

drm/i915: Fix write-read race with multiple rings · 02978ff5

由 Chris Wilson 提交于 7月 09, 2013

Daniel noticed a problem where is we wrote to an object with ring A in
the middle of a very long running batch, then executed a quick batch on
ring B before a batch that reads from the same object, its obj->ring would
now point to ring B, but its last_write_seqno would be still relative to
ring A. This would allow for the user to read from the object before the
GPU had completed the write, as set_domain would only check that ring B
had passed the last_write_seqno.

To fix this simply (and inelegantly), we bump the last_write_seqno when
switching rings so that the last_write_seqno is always relative to the
current obj->ring.

This fixes igt/tests/gem_write_read_ring_switch.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: stable@vger.kernel.org
[danvet: Add note about the newly created igt which exercises this
bug.]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

02978ff5

09 7月, 2013 4 次提交

drm/i915: Correct obj->mm_list link to dev_priv->dev_priv->mm.inactive_list · 06755608

由 Xiong Zhang 提交于 7月 05, 2013

obj->mm_list link to dev_priv->mm.inactive_list/active_list
obj->global_list link to dev_priv->mm.unbound_list/bound_list

This regression has been introduced in

commit 93927ca5
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Thu Jan 10 18:03:00 2013 +0100

    drm/i915: Revert shrinker changes from "Track unbound pages"

Cc: stable@vger.kernel.org
Signed-off-by: NXiong Zhang <xiong.y.zhang@intel.com>
[danvet: Add regression notice.]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

06755608

drm/i915: Embed drm_mm_node in i915 gem obj · c6cfb325

由 Ben Widawsky 提交于 7月 05, 2013

Embedding the node in the obj is more natural in the transition to VMAs
which will also have embedded nodes. This change also helps transition
away from put_block to remove node.

Though it's quite an uncommon occurrence, it's somewhat convenient to not
fail at bind time because we cannot allocate the node. Though in
practice there are other allocations (like the request structure) which
would probably make this point not terribly useful.

Quoting Daniel:
Note that the only difference between put_block and remove_node is
that the former fills up the preallocation cache. Which we don't need
anyway and hence is just wasted space.

v2: Clean up the stolen preallocation code.
Rebased on the reserve_node patches
renames ggtt_ stuff to gtt_ stuff
WARN_ON if the object is already bound (which doesn't mean it's in the
bound list, tricky)
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

c6cfb325

drm/i915: Kill obj->gtt_offset · edd41a87

由 Ben Widawsky 提交于 7月 05, 2013

With the getters in place from the previous patch this members serves no
purpose other than saving one spare pointer chase, which will be killed
in the next patch anyway.

Moving to VMAs, this members adds unnecessary confusion since an object
may exist at different offsets in different VMs.

v2: Properly preserve the stolen offset. This code is a bit hacky but it
all goes away when we embed the drm_mm_node and removes the need for the
incorrect patch I submitted previously: "Use gtt_space->start for stolen
reservation"
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

edd41a87

drm/i915: Getter/setter for object attributes · f343c5f6

由 Ben Widawsky 提交于 7月 05, 2013

Soon we want to gut a lot of our existing assumptions how many address
spaces an object can live in, and in doing so, embed the drm_mm_node in
the object (and later the VMA).

It's possible in the future we'll want to add more getter/setter
methods, but for now this is enough to enable the VMAs.

v2: Reworked commit message (Ben)
Added comments to the main functions (Ben)
sed -i "s/i915_gem_obj_set_color/i915_gem_obj_ggtt_set_color/" drivers/gpu/drm/i915/*.[ch]
sed -i "s/i915_gem_obj_bound/i915_gem_obj_ggtt_bound/" drivers/gpu/drm/i915/*.[ch]
sed -i "s/i915_gem_obj_size/i915_gem_obj_ggtt_size/" drivers/gpu/drm/i915/*.[ch]
sed -i "s/i915_gem_obj_offset/i915_gem_obj_ggtt_offset/" drivers/gpu/drm/i915/*.[ch]
(Daniel)

v3: Rebased on new reserve_node patch
Changed DRM_DEBUG_KMS to actually work (will need fixing later)
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

f343c5f6

01 7月, 2013 4 次提交

drm/i915: Refactor the wait_rendering completion into a common routine · d26e3af8

由 Chris Wilson 提交于 6月 29, 2013

Harmonise the completion logic between the non-blocking and normal
wait_rendering paths, and move that logic into a common function.

In the process, we note that the last_write_seqno is by definition the
earlier of the two read/write seqnos and so all successful waits will
have passed the last_write_seqno. Therefore we can unconditionally clear
the write seqno and its domains in the completion logic.

v2: Add the missing ring parameter, because sometimes it is good to have
things compile.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

d26e3af8

drm/i915: Only clear write-domains after a successful wait-seqno · daa13e1c

由 Chris Wilson 提交于 6月 28, 2013

In the introduction of the non-blocking wait, I cut'n'pasted the wait
completion code from normal locked path. Unfortunately, this neglected
that the normal path returned early if the wait returned early. The
result is that read-only waits may return whilst the GPU is still
writing to the bo.

Fixes regression from
commit 3236f57a [v3.7]
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Aug 24 09:35:09 2012 +0100

    drm/i915: Use a non-blocking wait for set-to-domain ioctl
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66163
Cc: stable@vger.kernel.org
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

daa13e1c

drm/i915: fix build warning on format specifier mismatch · 3765f304

由 Jani Nikula 提交于 6月 07, 2013

drivers/gpu/drm/i915/i915_gem.c: In function ‘i915_gem_object_bind_to_gtt’:
drivers/gpu/drm/i915/i915_gem.c:3002:3: warning: format ‘%ld’ expects
argument of type ‘long int’, but argument 5 has type ‘size_t’ [-Wformat]

v2: Use %zu instead of %d. Two char patch, and 100% wrong. (Ville)
Signed-off-by: NJani Nikula <jani.nikula@intel.com>
Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

3765f304

drm/i915: make compact dma scatter lists creation work with SWIOTLB backend. · 1625e7e5

由 Konrad Rzeszutek Wilk 提交于 6月 24, 2013

Git commit 90797e6d
("drm/i915: create compact dma scatter lists for gem objects") makes
certain assumptions about the under laying DMA API that are not always
correct.

On a ThinkPad X230 with an Intel HD 4000 with Xen during the bootup
I see:

[drm:intel_pipe_set_base] *ERROR* pin & fence failed
[drm:intel_crtc_set_config] *ERROR* failed to set mode on [CRTC:3], err = -28

Bit of debugging traced it down to dma_map_sg failing (in
i915_gem_gtt_prepare_object) as some of the SG entries were huge (3MB).

That unfortunately are sizes that the SWIOTLB is incapable of handling -
the maximum it can handle is a an entry of 512KB of virtual contiguous
memory for its bounce buffer. (See IO_TLB_SEGSIZE).

Previous to the above mention git commit the SG entries were of 4KB, and
the code introduced by above git commit squashed the CPU contiguous PFNs
in one big virtual address provided to DMA API.

This patch is a simple semi-revert - were we emulate the old behavior
if we detect that SWIOTLB is online. If it is not online then we continue
on with the new compact scatter gather mechanism.

An alternative solution would be for the the '.get_pages' and the
i915_gem_gtt_prepare_object to retry with smaller max gap of the
amount of PFNs that can be combined together - but with this issue
discovered during rc7 that might be too risky.
Reported-and-Tested-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: Chris Wilson <chris@chris-wilson.co.uk>
CC: Imre Deak <imre.deak@intel.com>
CC: Daniel Vetter <daniel.vetter@ffwll.ch>
CC: David Airlie <airlied@linux.ie>
CC: <dri-devel@lists.freedesktop.org>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

1625e7e5

25 6月, 2013 1 次提交

drm/i915: make compact dma scatter lists creation work with SWIOTLB backend. · 426729dc

由 Konrad Rzeszutek Wilk 提交于 6月 24, 2013

Git commit 90797e6d
("drm/i915: create compact dma scatter lists for gem objects") makes
certain assumptions about the under laying DMA API that are not always
correct.

On a ThinkPad X230 with an Intel HD 4000 with Xen during the bootup
I see:

[drm:intel_pipe_set_base] *ERROR* pin & fence failed
[drm:intel_crtc_set_config] *ERROR* failed to set mode on [CRTC:3], err = -28

Bit of debugging traced it down to dma_map_sg failing (in
i915_gem_gtt_prepare_object) as some of the SG entries were huge (3MB).

That unfortunately are sizes that the SWIOTLB is incapable of handling -
the maximum it can handle is a an entry of 512KB of virtual contiguous
memory for its bounce buffer. (See IO_TLB_SEGSIZE).

Previous to the above mention git commit the SG entries were of 4KB, and
the code introduced by above git commit squashed the CPU contiguous PFNs
in one big virtual address provided to DMA API.

This patch is a simple semi-revert - were we emulate the old behavior
if we detect that SWIOTLB is online. If it is not online then we continue
on with the new compact scatter gather mechanism.

An alternative solution would be for the the '.get_pages' and the
i915_gem_gtt_prepare_object to retry with smaller max gap of the
amount of PFNs that can be combined together - but with this issue
discovered during rc7 that might be too risky.
Reported-and-Tested-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: Chris Wilson <chris@chris-wilson.co.uk>
CC: Imre Deak <imre.deak@intel.com>
CC: Daniel Vetter <daniel.vetter@ffwll.ch>
CC: David Airlie <airlied@linux.ie>
CC: <dri-devel@lists.freedesktop.org>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: NDave Airlie <airlied@redhat.com>

426729dc

16 6月, 2013 1 次提交

drm/i915: Restore fences after resume and GPU resets · 19b2dbde

由 Chris Wilson 提交于 6月 12, 2013

Stéphane Marchesin found that fences for pinned objects (i.e. the
scanout) were not being restored upon resume, leading to corruption on
the display and reference counting issues. This is due to a bug in

commit 312817a3 [2.6.38]
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Nov 22 11:50:11 2010 +0000

    drm/i915: Only save and restore fences for UMS

that zapped the pinned fences even though they were in use.
Fortuitously, whilst we forced a VT switch during suspend and resume,
no fences were ever pinned at the time. However, we now can do
switchless S3 transitions and so the old bug finally surfaces.
Reported-by: NStéphane Marchesin <marcheu@chromium.org>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

19b2dbde

13 6月, 2013 3 次提交

drm/i915: find guilty batch buffer on ring resets · aa60c664

由 Mika Kuoppala 提交于 6月 12, 2013

After hang check timer has declared gpu to be hung,
rings are reset. In ring reset, when clearing
request list, do post mortem analysis to find out
the guilty batch buffer.

Select requests for further analysis by inspecting
the completed sequence number which has been updated
into the HWS page. If request was completed, it can't
be related to the hang.

For noncompleted requests mark the batch as guilty
if the ring was not waiting and the ring head was
stuck inside the buffer object or in the flush region
right after the batch. For everything else, mark
them as innocents.

v2: Fixed a typo in commit message (Ville Syrjälä)

v3: - more descriptive function parameters (Chris Wilson)
    - use masked head address when inspecting if request is in ring
    - s/hangcheck.last_action/hangcheck.action
    - added comment about unmasked head hitting batch_obj range
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NMika Kuoppala <mika.kuoppala@intel.com>
Acked-by: NBen Widawsky <ben@bwidawsk.net>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

aa60c664

drm/i915: add batch bo to i915_add_request() · 7d736f4f

由 Mika Kuoppala 提交于 6月 12, 2013

In order to track down a batch buffer and context which
caused the ring to hang, store reference to bo into the request struct.
Request can also cause gpu to hang after the batch in the flush section
in the ring. To detect this add start of the flush portion offset into the
request.

v2: Included comment about request vs batch_obj lifetimes (Chris Wilson)
Signed-off-by: NMika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Acked-by: NBen Widawsky <ben@bwidawsk.net>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

7d736f4f

drm/i915: change i915_add_request to macro · 0025c077

由 Mika Kuoppala 提交于 6月 12, 2013

Only execbuffer needed all the parameters on i915_add_request().
By putting __i915_add_request behind macro, all current callsites
become cleaner. Following patch will introduce a new parameter
for __i915_add_request. With this patch, only the relevant callsite
will reflect the change making commit smaller and easier to understand.

v2: _i915_add_request as function name (Chris Wilson)

v3: change name __i915_add_request and fix ordering of params (Ben Widawsky)
Signed-off-by: NMika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Acked-by: NBen Widawsky <ben@bwidawsk.net>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

0025c077

03 6月, 2013 4 次提交

drm/i915: Fix spurious -EIO/SIGBUS on wedged gpus · 7abb690a

由 Daniel Vetter 提交于 5月 24, 2013

Chris Wilson noticed that since

commit 1f83fee0 [v3.9]
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Thu Nov 15 17:17:22 2012 +0100

    drm/i915: clear up wedged transitions

X can again get -EIO when it does not expect it. And even worse score
a SIGBUS when accessing gtt mmaps. The established ABI is that we
_only_ return an -EIO from execbuf - all other ioctls should just
work. And since the reset code moves all bos out of gpu domains and
clears out all the last_seqno/ring tracking there really shouldn't be
any reason for non-execbuf code to ever touch the hw and see an -EIO.

After some extensive discussions we've noticed that these spurios -EIO
are caused by i915_gem_wait_for_error:

http://www.mail-archive.com/intel-gfx@lists.freedesktop.org/msg20540.html

That is easy to fix by returning 0 instead of -EIO, since grabbing the
dev->struct_mutex does not yet mean that we actually want to touch the
hw. And so there is no reason at all to fail with -EIO.

But that's not the entire since, since often (at least it's easily
googleable) dmesg indicates that the reset fails and we declare the
gpu wedged. Then, quite a bit later X wakes up with the "Timed out
waiting for the gpu reset to complete" DRM_ERROR message in
wait_for_errror and brings down the desktop with an -EIO/SIGBUS.

So clearly we're missing a wakeup somewhere, since the gpu reset just
doesn't take 10 seconds to complete. And indeed we're do handle the
terminally wedged state wrong.

Fix this all up.

References: https://bugs.freedesktop.org/show_bug.cgi?id=63921
References: https://bugs.freedesktop.org/show_bug.cgi?id=64073
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Damien Lespiau <damien.lespiau@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

7abb690a

drm/i915: Rename the gtt_list to global_list · 35c20a60

由 Ben Widawsky 提交于 5月 31, 2013

Since it will be used for the global bound/unbound list with full PPGTT,
this helps clarify things for upcoming code rework.
Recommended-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

35c20a60

drm/i915: unpin pages at unbind · 401c29f6

由 Ben Widawsky 提交于 5月 31, 2013

If we properly keep track of the pages_pin_count, then when we later add
multiple address spaces, the put_pages doesn't need any special checks
to be able to perform it's job.

CC: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
[danvet: Rebased on top of the fix for stolen memory pinning.]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

401c29f6

drm/i915: Unpin stolen pages · 1d64ae71

由 Ben Widawsky 提交于 5月 31, 2013

The way the stolen handling works is we take a pin on the backing pages,
but we never actually get a reference to the bo. On freeing objects
allocated with stolen memory, the final unref will end up freeing the
object with pinned pages count left. To enable an assertion to catch
bugs in this code path, this patch cleans up that remaining pin.
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

1d64ae71

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功