提交 · da51a1e7e398129d9fddd4b26b8469145dd4fd08 · openeuler / Kernel

11 8月, 2014 3 次提交

drm/i915: Fix secure dispatch with full ppgtt · da51a1e7

由 Daniel Vetter 提交于 8月 11, 2014

Based upon a hunk from a patch from Chris Wilson, but augmented to:
- Process the batch in the full ppgtt vm so that self-relocations
  match again with userspace's expectations..
- Add a comment why plain pin for the global gtt binding is safe at
  that point.

v2: Drop local bind_vm variable (Chris).

v3: Explain why this works despite the lack of proper active tracking
for the ggtt batch vma.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

da51a1e7

drm/i915: Remove fenced_gpu_access and pending_fenced_gpu_access · 82b6b6d7

由 Chris Wilson 提交于 8月 09, 2014

This migrates the fence tracking onto the existing seqno
infrastructure so that the later conversion to tracking via requests is
simplified.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

82b6b6d7

drm/i915: Force CPU relocations if not GTT mapped · e6a84468

由 Chris Wilson 提交于 8月 11, 2014

Move the decision on whether we need to have a mappable object during
execbuffer to the fore and then reuse that decision by propagating the
flag through to reservation. As a corollary, before doing the actual
relocation through the GTT, we can make sure that we do have a GTT
mapping through which to operate.

Note that the key to make this work is to ditch the
obj->map_and_fenceable unbind optimization - with full ppgtt it
doesn't make a lot of sense any more anyway.

v2: Revamp and resend to ease future patches.
v3: Refresh patch rationale

References: https://bugs.freedesktop.org/show_bug.cgi?id=81094Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
[danvet: Explain why obj->map_and_fenceable is key and split out the
secure batch fix.]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

e6a84468

08 7月, 2014 2 次提交

drm/i915: Extract the actual workload submission mechanism from execbuffer · 78382593

由 Oscar Mateo 提交于 7月 03, 2014

So that we isolate the legacy ringbuffer submission mechanism, which becomes
a good candidate to be abstracted away. This is prep-work for Execlists (which
will its own workload submission mechanism).

No functional changes.
Reviewed-by: NJesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

78382593

drm/i915: Emphasize that ctx->id is merely a user handle · 821d66dd

由 Oscar Mateo 提交于 7月 03, 2014

This is an Execlists preparatory patch, since they make context ID become an
overloaded term:

- In the software, it was used to distinguish which context userspace was
  trying to use.
- In the BSpec, the term is used to describe the 20-bits long field the
  hardware uses to it to discriminate the contexts that are submitted to
  the ELSP and inform the driver about their current status (via Context
  Switch Interrupts and Context Status Buffers).

Initially, I tried to make the different meanings converge, but it proved
impossible:

- The software ctx->id is per-filp, while the hardware one needs to be
  globally unique.
- Also, we multiplex several backing states objects per intel_context,
  and all of them need unique HW IDs.
- I tried adding a per-filp ID and then composing the HW context ID as:
  ctx->id + file_priv->id + ring->id, but the fact that the hardware only
  uses 20-bits means we have to artificially limit the number of filps or
  contexts the userspace can create.

The ctx->user_handle renaming bits are done with this Cocci patch (plus
manual frobbing of the struct declaration):

    @@
    struct intel_context c;
    @@
    - (c).id
    + c.user_handle

    @@
    struct intel_context *c;
    @@
    - (c)->id
    + c->user_handle

Also, while we are at it, s/DEFAULT_CONTEXT_ID/DEFAULT_CONTEXT_HANDLE and
change the type to unsigned 32 bits.

v2: s/handle/user_handle and change the type to uint32_t as suggested by
Chris Wilson.

Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> (v1)
Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

821d66dd

20 6月, 2014 1 次提交

drm/i915: Track frontbuffer invalidation/flushing · f99d7069

由 Daniel Vetter 提交于 6月 19, 2014

So these are the guts of the new beast. This tracks when a frontbuffer
gets invalidated (due to frontbuffer rendering) and hence should be
constantly scaned out, and when it's flushed again and can be
compressed/one-shot-upload.

Rules for flushing are simple: The frontbuffer needs one more full
upload starting from the next vblank. Which means that the flushing
can _only_ be called once the frontbuffer update has been latched.

But this poses a problem for pageflips: We can't just delay the
flushing until the pageflip is latched, since that would pose the risk
that we override frontbuffer rendering that has been scheduled
in-between the pageflip ioctl and the actual latching.

To handle this track asynchronous invalidations (and also pageflip)
state per-ring and delay any in-between flushing until the rendering
has completed. And also cancel any delayed flushing if we get a new
invalidation request (whether delayed or not).

Also call intel_mark_fb_busy in both cases in all cases to make sure
that we keep the screen at the highest refresh rate both on flips,
synchronous plane updates and for frontbuffer rendering.

v2: Lots of improvements

Suggestions from Chris:
- Move invalidate/flush in flush_*_domain and set_to_*_domain.
- Drop the flush in busy_ioctl since it's redundant. Was a leftover
  from an earlier concept to track flips/delayed flushes.
- Don't forget about the initial modeset enable/final disable.
  Suggested by Chris.

Track flips accurately, too. Since flips complete independently of
rendering we need to track pending flips in a separate mask. Again if
an invalidate happens we need to cancel the evenutal flush to avoid
races.

v3:
Provide correct header declarations for flip functions. Currently not
needed outside of intel_display.c, but part of the proper interface.

v4: Add proper domain management to fbcon so that the fbcon buffer is
also tracked correctly.

v5: Fixup locking around the fbcon set_to_gtt_domain call.

v6: More comments from Chris:
- Split out fbcon changes.
- Drop superflous checks for potential scanout before calling intel_fb
  functions - we can micro-optimize this later.
- s/intel_fb_/intel_fb_obj_/ to make it clear that this deals in gem
  object. We already have precedence for fb_obj in the pin_and_fence
  functions.

v7: Clarify the semantics of the flip flush handling by renaming
things a bit:
- Don't go through a gem object but take the relevant frontbuffer bits
  directly. These functions center on the plane, the actual object is
  irrelevant - even a flip to the same object as already active should
  cause a flush.
- Add a new intel_frontbuffer_flip for synchronous plane updates. It
  currently just calls intel_frontbuffer_flush since the implemenation
  differs.

This way we achieve a clear split between one-shot update events on
one side and frontbuffer rendering with potentially a very long delay
between the invalidate and flush.

Chris and I also had some discussions about mark_busy and whether it
is appropriate to call from flush. But mark busy is a state which
should be derived from the 3 events (invalidate, flush, flip) we now
have by the users, like psr does by tracking relevant information in
psr.busy_frontbuffer_bits. DRRS (the only real use of mark_busy for
frontbuffer) needs to have similar logic. With that the overall
mark_busy in the core could be removed.

v8: Only when retiring gpu buffers only flush frontbuffer bits we
actually invalidated in a batch. Just for safety since before any
additional usage/invalidate we should always retire current rendering.
Suggested by Chris Wilson.

v9: Actually use intel_frontbuffer_flip in all appropriate places.
Spotted by Chris.

v10: Address more comments from Chris:
- Don't call _flip in set_base when the crtc is inactive, avoids redunancy
  in the modeset case with the initial enabling of all planes.
- Add comments explaining that the initial/final plane enable/disable
  still has work left to do before it's fully generic.

v11: Only invalidate for gtt/cpu access when writing. Spotted by Chris.

v12: s/_flush/_flip/ in intel_overlay.c per Chris' comment.

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

f99d7069

13 6月, 2014 1 次提交

drm/i915: Fix __user sparse warning · d593d992

由 Ville Syrjälä 提交于 6月 13, 2014

CHECK linux/drivers/gpu/drm/i915/i915_gem_execbuffer.c
linux/drivers/gpu/drm/i915/i915_gem_execbuffer.c:1529:47: warning: incorrect type in initializer (different address spaces)
linux/drivers/gpu/drm/i915/i915_gem_execbuffer.c:1529:47: expected struct drm_i915_gem_exec_object2 *user_exec_list
linux/drivers/gpu/drm/i915/i915_gem_execbuffer.c:1529:47: got void [noderef] <asn:1>*
linux/drivers/gpu/drm/i915/i915_gem_execbuffer.c:1533:61: warning: incorrect type in argument 1 (different address spaces)
linux/drivers/gpu/drm/i915/i915_gem_execbuffer.c:1533:61: expected void [noderef] <asn:1>*dst
linux/drivers/gpu/drm/i915/i915_gem_execbuffer.c:1533:61: got unsigned long long *<noident>
Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

d593d992

27 5月, 2014 2 次提交

drm/i915: Prevent negative relocation deltas from wrapping · d23db88c

由 Chris Wilson 提交于 5月 23, 2014

This is pure evil. Userspace, I'm looking at you SNA, repacks batch
buffers on the fly after generation as they are being passed to the
kernel for execution. These batches also contain self-referenced
relocations as a single buffer encompasses the state commands, kernels,
vertices and sampler. During generation the buffers are placed at known
offsets within the full batch, and then the relocation deltas (as passed
to the kernel) are tweaked as the batch is repacked into a smaller buffer.
This means that userspace is passing negative relocations deltas, which
subsequently wrap to large values if the batch is at a low address. The
GPU hangs when it then tries to use the large value as a base for its
address offsets, rather than wrapping back to the real value (as one
would hope). As the GPU uses positive offsets from the base, we can
treat the relocation address as the minimum address read by the GPU.
For the upper bound, we trust that userspace will not read beyond the
end of the buffer.

So, how do we fix negative relocations from wrapping? We can either
check that every relocation looks valid when we write it, and then
position each object such that we prevent the offset wraparound, or we
just special-case the self-referential behaviour of SNA and force all
batches to be above 256k. Daniel prefers the latter approach.

This fixes a GPU hang when it tries to use an address (relocation +
offset) greater than the GTT size. The issue would occur quite easily
with full-ppgtt as each fd gets its own VM space, so low offsets would
often be handed out. However, with the rearrangement of the low GTT due
to capturing the BIOS framebuffer, it is already affecting kernels 3.15
onwards. I think only IVB+ is susceptible to this bug, but the workaround
should only kick in rarely, so it seems sensible to always apply it.

v3: Use a bias for batch buffers to prevent small negative delta relocations
from wrapping.

v4 from Daniel:
- s/BIAS/BATCH_OFFSET_BIAS/
- Extract eb_vma_misplaced/i915_vma_misplaced since the conditions
  were growing rather cumbersome.
- Add a comment to eb_get_batch explaining why we do this.
- Apply the batch offset bias everywhere but mention that we've only
  observed it on gen7 gpus.
- Drop PIN_OFFSET_FIX for now, that slipped in from a feature patch.

v5: Add static to eb_get_batch, spotted by 0-day tester.

Testcase: igt/gem_bad_reloc
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78533
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v3)
Cc: stable@vger.kernel.org
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

d23db88c

drm/i915: Only copy back the modified fields to userspace from execbuffer · 9aab8bff

由 Chris Wilson 提交于 5月 23, 2014

We only want to modifiy a single field in the userspace view of the
execbuffer command buffer, so explicitly change that rather than copy
everything back again.

This serves two purposes:

1. The single fields are much cheaper to copy (constant size so the
copy uses special case code) and much smaller than the whole array.

2. We modify the array for internal use that need to be masked from
the user.

Note: We need this backported since without it the next bugfix will
blow up when userspace recycles batchbuffers and relocations.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: stable@vger.kernel.org
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

9aab8bff

23 5月, 2014 2 次提交

drm/i915: s/i915_hw_context/intel_context · 273497e5

由 Oscar Mateo 提交于 5月 22, 2014

Up until now, contexts had one (and only one) backing object that was
used by the hardware to save/restore render ring contexts (via the
MI_SET_CONTEXT command). Other rings did not have or need this, so
our i915_hw_context struct had a 1:1 relationship with a a real HW
context.

With Logical Ring Contexts and Execlists, this is not possible anymore:
all rings need a backing object, and it cannot be reused. To prepare
for that, rename our contexts to the more generic term intel_context.

No functional changes.
Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

273497e5

drm/i915: s/intel_ring_buffer/intel_engine_cs · a4872ba6

由 Oscar Mateo 提交于 5月 22, 2014

In the upcoming patches we plan to break the correlation between
engine command streamers (a.k.a. rings) and ringbuffers, so it
makes sense to refactor the code and make the change obvious.

No functional changes.
Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

a4872ba6

22 5月, 2014 1 次提交

drm/i915: move bsd dispatch index somewhere better · bdf1e7e3

由 Daniel Vetter 提交于 5月 21, 2014

Adding stuff at the bottom is really no how this should be done, since
that's the place for ums/dri dungeons.

This was added in

commit a8ebba75
Author: Zhao Yakui <yakui.zhao@intel.com>
Date:   Thu Apr 17 10:37:40 2014 +0800

    drm/i915: Use the coarse ping-pong mechanism based on drm fd to dispatch the BSD command on BDW GT3

Also add a note to prevent this from happening again - people really
should be less lazy and take more time to look for a good home of
their new driver-global state.

Cc: Imre Deak <imre.deak@intel.com>
Cc: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

bdf1e7e3

19 5月, 2014 1 次提交

drm/i915: Retire requests before creating a new one · 227f782e

由 Chris Wilson 提交于 5月 15, 2014

More fallout from

commit c8725f3d
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Mar 17 12:21:55 2014 +0000

    drm/i915: Do not call retire_requests from wait_for_rendering

is that we can completely fill all of memory using small objects, such
that we exhaust the filp space, and spend all of our time evicting
objects from the aperture. As such, we never fill the ring, and never
trigger the last resort flushing in

commit 1cf0ba14
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon May 5 09:07:33 2014 +0100

    drm/i915: Flush request queue when waiting for ring space

and so all the requests are left active and the objects keep that last
active reference. Eventually the system comes to a halt as it runs out
of memory.

The impact is mainly limited to test cases as regular userspace will
trigger retirement by manually checking whether an object is active.

Testcase: igt/gem_lut_handle
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78724Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Tested-by: NGuo Jinxian <jinxianx.guo@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

227f782e

13 5月, 2014 1 次提交

drm/i915: Work-around garbage DR4 from UXA · ffd93f24

由 Daniel Vetter 提交于 5月 13, 2014

Somehow UXA submits a completely bogus DR4 value since essentially
forever. It was originally introduced in

commit bade7d7d2505a10a8a7d24b084aff9742e2d6d64
Author: Eric Anholt <eric@anholt.net>
Date:   Fri Jun 6 14:03:25 2008 -0700

    Use the DRM for submitting batchbuffers when available.

and dutifully copied around ever since. Since we want to keep the
general dirt catching around just special case the UXA value.

This regression was introduced in

commit 9cb34664
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Thu Apr 24 08:09:11 2014 +0200

    drm/i915: Catch dirt in unused execbuffer fields

Comment from Chris' review:

"To be fair, it is a sensible value if one supposes a Region style API to
cliprects. Under that API, DR[14] define the extents of the clip region,
and ((0,0), (0,0)) [DR1==DR4==0] would mean all clipped, do not draw
anything."

v2: Pimp commit message a bit and remove the double space.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78494
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jörg Otte <jrg.otte@gmail.com>
Acked-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

ffd93f24

05 5月, 2014 8 次提交

drm/i915: Support 64b relocations · d9ceb957

由 Ben Widawsky 提交于 4月 28, 2014

All the rest of the code to enable this is in my branch. Without my
branch, hitting > 32b offsets is impossible. The code has always
"supported" 64b, but it's never actually been run of tested. This change
doesn't actually fix anything. [1] I am not sure why X won't work yet. I
do not get hangs or obvious errors.

There are 3 fixes grouped together here. First is to remove the
hardcoded 0 for the upper dword of the relocation. The next fix is to
use a 64b value for target_offset. The final fix is to not directly
apply target_offset to reloc->delta. reloc->delta is part of ABI, and so
we cannot change it. As it stands, 32b is enough to represent everything
we're interested in representing anyway. The main problem is, we cannot
add greater than 32b values to it directly.

[1] Almost all of intel-gpu-tools is not yet ready to test 64b
relocations. There are a few places that expect 32b values for offsets
and these all won't work.

Cc: Rafael Barbalho <rafael.barbalho@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

d9ceb957

drm/i915: Support 64b execbuf · 9bcb144c

由 Ben Widawsky 提交于 4月 28, 2014

Previously, our code only had a 32b offset value for where the
batchbuffer starts. With full PPGTT, and 64b canonical GPU address
space, that is an insufficient value. The code to expand is pretty
straight forward, and only one platform needs to do anything with the
extra bits.
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NRafael Barbalho <rafael.barbalho@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

9bcb144c

drm/i915: Do not call retire_requests from wait_for_rendering · c8725f3d

由 Chris Wilson 提交于 3月 17, 2014

A common issue we have is that retiring requests causes recursion
through GTT manipulation or page table manipulation which we can only
handle at very specific points. However, to maintain internal
consistency (enforced through our sanity checks on write_domain at
various points in the GEM object lifecycle) we do need to retire the
object prior to marking it with a new write_domain, and also clear the
write_domain for the implicit flush following a batch.

Note that this then allows the unbound objects to still be on the active
lists, and so care must be taken when removing objects from unbound lists
(similar to the caveats we face processing the bound lists).

v2: Fix i915_gem_shrink_all() to handle updated object lifetime rules,
by refactoring it to call into __i915_gem_shrink().

v3: Missed an object-retire prior to changing cache domains in
i915_gem_object_set_cache_leve()

v4: Rebase
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Tested-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: NBrad Volkin <bradley.d.volkin@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

c8725f3d

drm/i915: Use the coarse ping-pong mechanism based on drm fd to dispatch the BSD command on BDW GT3 · a8ebba75

由 Zhao Yakui 提交于 4月 17, 2014

The BDW GT3 has two independent BSD rings, which can be used to process the
video commands. To be simpler, it is transparent to user-space driver/middle.
Instead the kernel driver will decide which ring is to dispatch the BSD video
command.

As every BSD ring is powerful, it is enough to dispatch the BSD video command
based on the drm fd. In such case it can play back video stream while encoding
another video stream. The coarse ping-pong mechanism is used to determine
which BSD ring is used to dispatch the BSD video command.

V1->V2: Follow Daniel's comment and use the simple ping-pong mechanism.
This is only to add the support of dual BSD rings on BDW GT3 machine.
The further optimization will be considered in another patch set.

V2->V3: Follow Daniel's comment to use the struct_mutext instead of
atomic_t during determining which ring can be used to dispatch Video command.
Reviewed-by: NImre Deak <imre.deak@intel.com>
Signed-off-by: NZhao Yakui <yakui.zhao@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

a8ebba75

drm/i915: Update the restrict check to filter out wrong Ring ID passed by user-space · b1a93306

由 Zhao Yakui 提交于 4月 17, 2014

Signed-off-by: NZhao Yakui <yakui.zhao@intel.com>
Reviewed-by: NImre Deak <imre.deak@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

b1a93306

drm/i915: Catch dirt in unused execbuffer fields · 9cb34664

由 Daniel Vetter 提交于 4月 24, 2014

We need to make sure that userspace keeps on following the contract,
otherwise we won't be able to use the reserved fields at all.

v2: Add DRM_DEBUG (Chris)

Testcase: igt/gem_exec_params/*-dirt
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

9cb34664

drm/i915: Catch abuse of I915_EXEC_CONSTANTS_* · c0f5b82c

由 Daniel Vetter 提交于 4月 24, 2014

A bit tricky since 0 is also a valid constant ...

v2: Add DRM_DEBUG (Chris)

Testcase: igt/gem_exec_params/rel-constants-*
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

c0f5b82c

drm/i915: Catch abuse of I915_EXEC_GEN7_SOL_RESET · 9d662da8

由 Daniel Vetter 提交于 4月 24, 2014

Currently we catch it, but silently succeed. Our userspace is
better than this.

v2: Add DRM_DEBUG (Chris)

Testcase: igt/gem_exec_params/sol-reset-*
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

9d662da8

11 4月, 2014 1 次提交

drm/i915: Always use kref tracking for all contexts. · 691e6415

由 Chris Wilson 提交于 4月 09, 2014

If we always initialize kref for the context, even if we are using fake
contexts for hangstats when there is no hw support, we can forgo the
dance to dereference the ctx->obj and inspect whether we are permitted
to use kref inside i915_gem_context_reference() and _unreference().

My ulterior motive here is to improve the debugging of a use-after-free
of ctx->obj. This patch avoids the dereference here and instead forces
the assertion checks associated with kref.

v2: Refactor the fake contexts to being even more like the real
contexts, so that there is much less duplicated and special case code.

v3: Tweaks.
v4: Tweaks, minor.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76671Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Tested-by: Nlu hua <huax.lu@intel.com>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: NBen Widawsky <ben@bwidawsk.net>
[Jani: tiny change to backport to drm-intel-fixes.]
Signed-off-by: NJani Nikula <jani.nikula@intel.com>

691e6415

09 4月, 2014 1 次提交

drm/i915: Unref context on failed eb_create · 935f38d6

由 Ben Widawsky 提交于 4月 04, 2014

I opted to do this instead of grabbing the context reference after
eb_create since eb_create can potentially call the shrinker, and that
makes things very complicated. This simple patch balances the ref count
without requiring a great deal of review to make sure the shrinker path
is safe.

Theoretically (by design) the shrinker can end up destroying a context,
which enforces the reasoning for doing the fix this way instead of
moving the reference to later in the function.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

935f38d6

31 3月, 2014 1 次提交

drm/i915: prefer struct drm_i915_private to drm_i915_private_t · 50227e1c

由 Jani Nikula 提交于 3月 31, 2014

Remove the rest of the references to drm_i915_private_t. No functional
changes.
Signed-off-by: NJani Nikula <jani.nikula@intel.com>
[danvet: Drop hunk in i915_cmd_parser.c]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

50227e1c

08 3月, 2014 1 次提交

drm/i915: Implement command buffer parsing logic · 351e3db2

由 Brad Volkin 提交于 2月 18, 2014

The command parser scans batch buffers submitted via execbuffer ioctls before
the driver submits them to hardware. At a high level, it looks for several
things:

1) Commands which are explicitly defined as privileged or which should only be
   used by the kernel driver. The parser generally rejects such commands, with
   the provision that it may allow some from the drm master process.
2) Commands which access registers. To support correct/enhanced userspace
   functionality, particularly certain OpenGL extensions, the parser provides a
   whitelist of registers which userspace may safely access (for both normal and
   drm master processes).
3) Commands which access privileged memory (i.e. GGTT, HWS page, etc). The
   parser always rejects such commands.

See the overview comment in the source for more details.

This patch only implements the logic. Subsequent patches will build the tables
that drive the parser.

v2: Don't set the secure bit if the parser succeeds
Fail harder during init
Makefile cleanup
Kerneldoc cleanup
Clarify module param description
Convert ints to bools in a few places
Move client/subclient defs to i915_reg.h
Remove the bits_count field

OTC-Tracker: AXIA-4631
Change-Id: I50b98c71c6655893291c78a2d1b8954577b37a30
Signed-off-by: NBrad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: NJani Nikula <jani.nikula@intel.com>
[danvet: Appease checkpatch.]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

351e3db2

14 2月, 2014 3 次提交

drm/i915: Only bind each object rather than for every execbuffer · 8ea99c92

由 Daniel Vetter 提交于 2月 14, 2014

One side-effect of the introduction of ppgtt was that we needed to
rebind the object into the appropriate vm (and global gtt in some
peculiar cases). For simplicity this was done twice for every object on
every call to execbuffer. However, that adds a tremendous amount of CPU
overhead (rewriting all the PTE for all objects into WC memory) per
draw. The fix is to push all the decision about which vm to bind into
and when down into the low-level bind routines through hints rather than
performing the bind unconditionally in the execbuffer routine.

Note that this is a regression introduced in the full ppgtt feature
branch, before this we've only done re-bound objects when the relevant
has_(aliasing_ppgtt|global_gtt)_mapping flag was clear. But since
that's per-object and not per-vma that optimization broke.

v2: Split out prep work and unrelated changes.

v3: Bring back functional change around PIN_GLOBAL that I've
accidentally split out.

v4: Remove the temporary hack for the old binding logic to avoid
bisection issues.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72906
Tested-by: jianx.zhou@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: NBen Widawsky <ben@bwidawsk.net>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

8ea99c92

drm/i915: split PIN_GLOBAL out from PIN_MAPPABLE · bf3d149b

由 Daniel Vetter 提交于 2月 14, 2014

With abitrary pin flags it makes sense to split out a "please bind
this into global gtt" from the "please allocate in the mappable
range".

Use this unconditionally in our global gtt pin helper since this is
what its callers want. Later patches will drop PIN_MAPPABLE where it's
not strictly needed.
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

bf3d149b

drm/i915: Consolidate binding parameters into flags · 1ec9e26d

由 Daniel Vetter 提交于 2月 14, 2014

Anything more than just one bool parameter is just a pain to read,
symbolic constants are much better.

Split out from Chris' vma-binding rework patch.

v2: Undo the behaviour change in object_pin that Chris spotted.

v3: Split out misplaced hunk to handle set_cache_level errors,
spotted by Jani.

v4: Keep the current over-zealous binding logic in the execbuffer code
working with a quick hack while the overall binding code gets shuffled
around.

v5: Reorder the PIN_ flags for more natural patch splitup.

v6: Pull out the PIN_GLOBAL split-up again.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

1ec9e26d

28 1月, 2014 1 次提交

drm/i915: move module parameters into a struct, in a new file · d330a953

由 Jani Nikula 提交于 1月 21, 2014

With 20+ module parameters, I think referring to them via a struct
improves clarity over just having a bunch of globals. While at it, move
the parameter initialization and definitions into a new file
i915_params.c to reduce clutter in i915_drv.c.

Apart from the ill-named i915_enable_rc6, i915_enable_fbc and
i915_enable_ppgtt parameters, for which we lose the "i915_" prefix
internally, the module parameters now look the same both on the kernel
command line and in code. For example, "i915.modeset".

The downsides of the change are losing static on a couple of variables
and not having the initialization and module_param_named() right next to
each other. On the other hand, all module parameters are now defined in
one place at i915_params.c. Plus you can do this to find all module
parameter references:

$ git grep "i915\." -- drivers/gpu/drm/i915

v2:
- move the definitions into a new file
- s/i915_params/i915/
- make i915_try_reset i915.reset, for consistency
Signed-off-by: NJani Nikula <jani.nikula@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

d330a953

27 1月, 2014 1 次提交

drm: dp helper: Add DP test sink CRC definition. · a25eebb0

由 Rodrigo Vivi 提交于 1月 14, 2014

This address will be used to verify panel CRC for test and
validation purposes.
Signed-off-by: NRodrigo Vivi <rodrigo.vivi@gmail.com>
[danvet: Fix whitespace fail.]
Acked-by: NDave Airlie <airlied@gmail.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

a25eebb0

22 1月, 2014 1 次提交

drm/i915: Clarify relocation errnos · 8b78f0e5

由 Ben Widawsky 提交于 12月 26, 2013

While trying to find a random -EINVAL from a failing test, I noticed we
had a few hard to follow return values.

The first two hunks in this patch replace completely useless
initialization of ret. The last several hunks help to distinguish
between altering 'return ret' and 'return <ERROR>'
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Reviewed-by: NPaulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

8b78f0e5

07 1月, 2014 1 次提交

drm/i915/ppgtt: Fix ioctl errno for "no such context" · 72ad5c45

由 Ben Widawsky 提交于 1月 02, 2014

Without this fix the ioctls silently succeeded (but actually did
nothing).

It makes all the code which calls into this function way too confusing.

v2: Fix destroy IOCTL as well

v3: Clarify the other two callers of i915_gem_context_get() to never
check for NULL. (Mika)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72903Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Testcase: igt/gem_ctx_exec/basic
[danvet: Fix up the commit message and actually bother to mention the
testcase this fixes.]
Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

72ad5c45

19 12月, 2013 3 次提交

drm/i915: Don't check for NEEDS_GTT when deciding the address space · a7c1d426

由 Daniel Vetter 提交于 12月 18, 2013

This means something different and is only relevant for gen6 and the
reason why we cant use anything else than aliasing ppgtt there.

Note that the currently implemented logic for secure batches is
broken: Userspace wants the buffer both in ppgtt (for self-referencing
relocations) and in ggtt (for priveledge operations).

This is the same issue the command parser is also facing.
Unfortunately our coverage for corner-cases of self-referencing
batches is spotty.

Note that this will break vsync'ed Xv and DRI2 copies.
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

a7c1d426

drm/i915: Reject NEEDS_GTT relocations with full ppgtt · 2c9f8d56

由 Daniel Vetter 提交于 12月 18, 2013

Doesn't make sense. Spotted while fixing an issue Chris
noticed in the same area.
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

2c9f8d56

drm/i915: Reject non-default contexts on non-render again · 7c9c4b8f

由 Daniel Vetter 提交于 12月 18, 2013

This reverts the abi-change from

commit 67e3d297
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Fri Dec 6 14:11:01 2013 -0800

    drm/i915: Permit contexts on all rings

We don't actually need this, only the internal changes to allow
contexts on all rings for the purpose of ppgtt switching are required.
And I'm not sure whether this is the right thing to do given some of
the hw features in the pipeline.

Also, new abi needs userspace patches as a proof-of-need, which is
completely lacking here.
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

7c9c4b8f

18 12月, 2013 4 次提交

drm/i915: Use multiple VMs -- the point of no return · 7e0d96bc

由 Ben Widawsky 提交于 12月 06, 2013

As with processes which run on the CPU, the goal of multiple VMs is to
provide process isolation. Specific to GEN, there is also the ability to
map more objects per process (2GB each instead of 2Gb-2k total).

For the most part, all the pipes have been laid, and all we need to do
is remove asserts and actually start changing address spaces with the
context switch. Since prior to this we've converted the setting of the
page tables to a streamed version, this is quite easy.

One important thing to point out (since it'd been hotly contested) is
that with this patch, every context created will have it's own address
space (provided the HW can do it).

v2: Disable BDW on rebase

NOTE: I tried to make this commit as small as possible. I needed one
place where I could "turn everything on" and that is here. It could be
split into finer commits, but I didn't really see much point.

Cc: Eric Anholt <eric@anholt.net>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

7e0d96bc

drm/i915: Get context early in execbuf · 41bde553

由 Ben Widawsky 提交于 12月 06, 2013

We need to have the address space when reserving space for the objects.
Since the address space and context are tied together, and reserve
occurs before context switch (for good reason), we must lookup our
context earlier in the process.

This leaves some room for optimizations where we no longer need to use
ctx_id in certain places. This will be addressed in a subsequent patch.

Important tricky bit:
Because slow relocations during execbuffer drop struct_mutex

Perhaps it would be best to acquire the reference when we get the
context, but I'll save that for another day (note I have written the
patch before, and I found the changes required to be uglier than this).

Note that since we currently access everything via context id, and not
the data structure this is fine, though not desirable. The next change
attempts to get the context only once via the context ID idr lookup, and
as such, the following can happen:

CTX-A is created, refcount = 1
CTX-A execbuf, mutex dropped
close IOCTL called on CTX-A, refcount = 0
CTX-A resumes in execbuf.

v2: Rebased on top of
commit b6359918
Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Date:   Wed Oct 30 15:44:16 2013 +0200

    drm/i915: add i915_get_reset_stats_ioctl

v3: Rebased on top of
commit 25b3dfc8
Author: Mika Westerberg <mika.westerberg@linux.intel.com>
Date:   Tue Nov 12 11:57:30 2013 +0200

Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Date:   Tue Nov 26 16:14:33 2013 +0200

    drm/i915: check context reset stats before relocations
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

41bde553

drm/i915: Permit contexts on all rings · 67e3d297

由 Ben Widawsky 提交于 12月 06, 2013

If we want to use contexts in more abstract terms (specifically with
PPGTT in mind), we need to allow them to be specified for any ring.

Since the upcoming patches will bring about the use of multiple address
spaces, and each ring needs to have an address space programmed (which
we intend to do at context switch time), we can no longer only use RCS.

With multiple rings having a last context, we must now unreference these
contexts.

NOTE: This commit requires an update to intel-gpu-tools to make it not
fail.

v2: Rebased with some logical conflicts.
Squashed in the context fini refcount patch
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

67e3d297

drm/i915: Simplify ring handling in execbuf · ca01b12b

由 Ben Widawsky 提交于 12月 06, 2013

Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

ca01b12b

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功