提交 · 57007df74a547c68b3373861aa15efb861af6efc · openanolis / cloud-kernel

07 8月, 2014 1 次提交

drm/i915: work around warning in i915_gem_gtt · 57007df7

由 Pavel Machek 提交于 7月 28, 2014

Gcc warns that addr might be used uninitialized. It may not, but I see
why gcc gets confused.

Additionally, hiding code with side-effects inside WARN_ON() argument
seems uncool, so I moved it outside.
Signed-off-by: NPavel Machek <pavel@ucw.cz>
[danvet: Add obligatory /* shuts up gcc */ comment.]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

57007df7

11 7月, 2014 1 次提交

drm/i915: Don't disable PPGTT for CHV based in PCI rev · ca2aed6c

由 Ville Syrjälä 提交于 6月 28, 2014

In
 commit 62942ed7
 Author: Jesse Barnes <jbarnes@virtuousgeek.org>
 Date:   Fri Jun 13 09:28:33 2014 -0700

    drm/i915/vlv: disable PPGTT on early revs v3

we forgot about CHV. IS_VALLEYVIEW() is true for CHV, so we need to
explicitly avoid disabling PPGTT on CHV.
Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: NDeepak S <deepak.s@linux.intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

ca2aed6c

17 6月, 2014 1 次提交

drm/i915: Added write-enable pte bit supportt · 24f3a8cf

由 Akash Goel 提交于 6月 17, 2014

This adds support for a write-enable bit in the entry of GTT.
This is handled via a read-only flag in the GEM buffer object which
is then used to see how to set the bit when writing the GTT entries.
Currently by default the Batch buffer & Ring buffers are marked as read only.

v2: Moved the pte override code for read-only bit to 'byt_pte_encode'. (Chris)
Fixed the issue of leaving 'gt_old_ro' as unused. (Chris)

v3: Removed the 'gt_old_ro' field, now setting RO bit only for Ring Buffers(Daniel).

v4: Added a new 'flags' parameter to all the pte(gen6) encode & insert_entries functions,
in lieu of overloading the cache_level enum (Daniel).

v5: Removed the superfluous VLV check & changed the definition location of PTE_READ_ONLY flag (Imre)
Reviewed-by: NImre Deak <imre.deak@intel.com>
Signed-off-by: NAkash Goel <akash.goel@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

24f3a8cf

14 6月, 2014 1 次提交

drm/i915/vlv: disable PPGTT on early revs v3 · 62942ed7

由 Jesse Barnes 提交于 6月 13, 2014

Early revs didn't have PPGTT support, so disable there.

v2: add debug msg when disabling on early stepping
v3: enable on other B3 packages as well (untested) (Ville)

References: https://bugs.freedesktop.org/show_bug.cgi?id=79669
References: https://bugs.freedesktop.org/show_bug.cgi?id=79670Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
Acked-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

62942ed7

07 6月, 2014 1 次提交

drm/i915: Fixup global gtt cleanup · 4c2e0990

由 Daniel Vetter 提交于 6月 05, 2014

The global gtt is setup up in 2 parts, so we need to be careful
with the cleanup. For consistency shovel it all into the ->cleanup
callback, like with ppgtt.

Noticed because it blew up in the out_gtt: cleanup code while
fiddling with the vgacon code.
Reviewed-by: NImre Deak <imre.deak@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

4c2e0990

05 6月, 2014 1 次提交

drm/i915/bdw: Only use 2g GGTT for 32b platforms · 562d55d9

由 Ben Widawsky 提交于 5月 27, 2014

Daniel requested in the bug that I use a 3GB fallback size. Since this
is not in the spec as a valid size, I decided against it. We could
potentially add a patch to bump it to 3GB on top of this one.

This probably should be CC: stable - but I'll let the powers that be
decide that one.

Regression from a revert of the revert:
commit 7907f45b
Author: Ben Widawsky <benjamin.widawsky@intel.com>
Date:   Wed Feb 19 22:05:46 2014 -0800

    Revert "drm/i915/bdw: Limit GTT to 2GB"

v2: Change ifdef to 32b, instead of ifndef
update comment

v3. Update comment to not wrap (Daniel).
Update commit message

v4: s/CONFIG_32/CONFIG_X86_32 (Jani).

v5: s/CONFIG_x86_32BIT/CONFIG_x86_32, as meant in v4
s/32B/32b (chris)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76619
Cc: stable@vger.kernel.org
Signed-off-by: NRodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Tested-by: N"Yang, Guang A" <guang.a.yang@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

562d55d9

27 5月, 2014 1 次提交

drm/i915: Prevent negative relocation deltas from wrapping · d23db88c

由 Chris Wilson 提交于 5月 23, 2014

This is pure evil. Userspace, I'm looking at you SNA, repacks batch
buffers on the fly after generation as they are being passed to the
kernel for execution. These batches also contain self-referenced
relocations as a single buffer encompasses the state commands, kernels,
vertices and sampler. During generation the buffers are placed at known
offsets within the full batch, and then the relocation deltas (as passed
to the kernel) are tweaked as the batch is repacked into a smaller buffer.
This means that userspace is passing negative relocations deltas, which
subsequently wrap to large values if the batch is at a low address. The
GPU hangs when it then tries to use the large value as a base for its
address offsets, rather than wrapping back to the real value (as one
would hope). As the GPU uses positive offsets from the base, we can
treat the relocation address as the minimum address read by the GPU.
For the upper bound, we trust that userspace will not read beyond the
end of the buffer.

So, how do we fix negative relocations from wrapping? We can either
check that every relocation looks valid when we write it, and then
position each object such that we prevent the offset wraparound, or we
just special-case the self-referential behaviour of SNA and force all
batches to be above 256k. Daniel prefers the latter approach.

This fixes a GPU hang when it tries to use an address (relocation +
offset) greater than the GTT size. The issue would occur quite easily
with full-ppgtt as each fd gets its own VM space, so low offsets would
often be handed out. However, with the rearrangement of the low GTT due
to capturing the BIOS framebuffer, it is already affecting kernels 3.15
onwards. I think only IVB+ is susceptible to this bug, but the workaround
should only kick in rarely, so it seems sensible to always apply it.

v3: Use a bias for batch buffers to prevent small negative delta relocations
from wrapping.

v4 from Daniel:
- s/BIAS/BATCH_OFFSET_BIAS/
- Extract eb_vma_misplaced/i915_vma_misplaced since the conditions
  were growing rather cumbersome.
- Add a comment to eb_get_batch explaining why we do this.
- Apply the batch offset bias everywhere but mention that we've only
  observed it on gen7 gpus.
- Drop PIN_OFFSET_FIX for now, that slipped in from a feature patch.

v5: Add static to eb_get_batch, spotted by 0-day tester.

Testcase: igt/gem_bad_reloc
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78533
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v3)
Cc: stable@vger.kernel.org
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

d23db88c

23 5月, 2014 1 次提交

drm/i915: s/intel_ring_buffer/intel_engine_cs · a4872ba6

由 Oscar Mateo 提交于 5月 22, 2014

In the upcoming patches we plan to break the correlation between
engine command streamers (a.k.a. rings) and ringbuffers, so it
makes sense to refactor the code and make the change obvious.

No functional changes.
Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

a4872ba6

13 5月, 2014 1 次提交

drm/i915/chv: Implement stolen memory size detection · d7f25f23

由 Damien Lespiau 提交于 5月 08, 2014

CHV uses the same bits as SNB/VLV to code the Graphics Mode Select field
(GFX stolen memory size) with the addition of finer granularity modes:
4MB increments from 0x11 (8MB) to 0x1d.

Values strictly above 0x1d are either reserved or not supported.

v2: 4MB increments, not 8MB. 32MB has been omitted from the list of new
    values (Ville Syrjälä)

v3: Also correctly interpret GGMS (GTT Graphics Memory Size) (Ville
    Syrjälä)

v4: Don't assign a value that needs 20bits or more to a u16 (Rafael
    Barbalho)

[vsyrjala: v5: Split the early quirks to another patch]
Reviewed-by: NJani Nikula <jani.nikula@intel.com>
Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: NRafael Barbalho <rafael.barbalho@intel.com>
Tested-by: NRafael Barbalho <rafael.barbalho@intel.com>
Signed-off-by: NDamien Lespiau <damien.lespiau@intel.com>
Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

d7f25f23

07 5月, 2014 3 次提交

drm/i915: Use topdown allocation for PPGTT PDEs on gen6/7 · 3e8b5ae9

由 Ben Widawsky 提交于 5月 06, 2014

It was always the intention to do the topdown allocation for context
objects (Chris' idea originally). Unfortunately, I never managed to land
the patch, but someone else did, so now we can use it.

As a reminder, hardware contexts never need to be in the precious GTT
aperture space - which is what is what happens with the normal bottom up
allocation we do today. Doing a top down allocation increases the odds
that the HW contexts can get out of the way, especially with per FD
contexts as is done in full PPGTT
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

3e8b5ae9

drm/i915/chv: Flush caches when programming page tables · fd1ab8f4

由 Rafael Barbalho 提交于 4月 09, 2014

Page table updates were getting stuck in the CPU cache on chv causing
spurious page faults and strange behaviour.
Signed-off-by: NRafael Barbalho <rafael.barbalho@intel.com>
[vsyrjala: Add !HAS_LLC checks]
Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

fd1ab8f4

drm/i915/chv: PPAT setup for Cherryview · ee0ce478

由 Ville Syrjälä 提交于 4月 09, 2014

Ignore the cache bits in PPAT and just set the snoop bit where
appropriate. BDW WB is mapped to snooped access, while all other
modes are mapped to non-snooped access.

The hardware supposedly ignores everything except the snoop bit
in the PPAT entries.

Additionally the hardware actually enforces snooping for all
page table accesses, and thus the snoop bit is ignored for PDEs.

v2: Rebased on top of the bdw resume fix to reload the ppat entries.

v3: Rebase on top of the i915_gem_gtt.h header extraction.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> (v1)
Acked-by: NBen Widawsky <ben@bwidawsk.net>
Reviewed-by: NRafael Barbalho <rafael.barbalho@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

ee0ce478

05 5月, 2014 1 次提交

drm/i915/bdw: Add WT caching ability · 63c42e56

由 Ben Widawsky 提交于 4月 18, 2014

I don't have any insight on what parts can do what. The docs do seem to
suggest WT caching works in at least the same manner as it does on
Haswell.

The addr = 0  is to shut up GCC:
drivers/gpu/drm/i915/i915_gem_gtt.c:80:7: warning: 'addr' may be used
uninitialized in this function [-Wmaybe-uninitialized]
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Signed-off-by: NRodrigo Vivi <rodrigo.vivi@gmail.com>
Reviewed-by: NBrad Volkin <bradley.d.volkin@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

63c42e56

29 4月, 2014 1 次提交

drm/i915: Sanitize the enable_ppgtt module option once · cfa7c862

由 Daniel Vetter 提交于 4月 29, 2014

Otherwise we'll end up spamming dmesg on every context creation on snb
with vt-d enabled. This regression was introduced in

commit 246cbfb5
Author: Ben Widawsky <benjamin.widawsky@intel.com>
Date:   Fri Dec 6 14:11:14 2013 -0800

    drm/i915: Reorganize intel_enable_ppgtt

As the i915.enable_ppgtt is read-only it cannot be changed after the
module is loaded and so we can perform an early sanitization of the
values.

v2:
- Add comment and pimp commit message (Chris)
- Use the param consistently (Jani)

v3:
- Fix init sequence on pre-gen6 by moving the sanitize_ppgtt call to
  gtt_init. Fixes boot hangs on pre-gen6.
- Add a debug output for the sanitize ppgtt mode.

References: https://lkml.org/lkml/2014/4/17/599
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77916
Cc: Alessandro Suardi <alessandro.suardi@gmail.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: NJani Nikula <jani.nikula@intel.com>

cfa7c862

24 4月, 2014 1 次提交

drm/i915: Allow full PPGTT with param override · 0f9dc59d

由 Ben Widawsky 提交于 3月 24, 2014

When PPGTT was disabled by default, the patch also prevented the user
from overriding this behavior via module parameter. Being able to test
this on arbitrary kernels is extremely beneficial to track down the
remaining bugs. The patch that prevented this was:

commit 93a25a9e
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Thu Mar 6 09:40:43 2014 +0100

    drm/i915: Disable full ppgtt by default

By default PPGTT is set to -1. 0 means off, 1 means aliasing only, 2
means full, all other values are reserved.
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: NJani Nikula <jani.nikula@intel.com>

0f9dc59d

04 4月, 2014 1 次提交

drm: Add support for two-ended allocation, v3 · 62347f9e

由 Lauri Kasanen 提交于 4月 02, 2014

Clients like i915 need to segregate cache domains within the GTT which
can lead to small amounts of fragmentation. By allocating the uncached
buffers from the bottom and the cacheable buffers from the top, we can
reduce the amount of wasted space and also optimize allocation of the
mappable portion of the GTT to only those buffers that require CPU
access through the GTT.

For other drivers, allocating small bos from one end and large ones
from the other helps improve the quality of fragmentation.

Based on drm_mm work by Chris Wilson.

v3: Changed to use a TTM placement flag
v2: Updated kerneldoc

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Christian König <deathsimple@vodafone.de>
Signed-off-by: NLauri Kasanen <cand@gmx.com>
Signed-off-by: NDavid Airlie <airlied@redhat.com>

62347f9e

03 4月, 2014 1 次提交

drm/i915: dmesg output for VT-d testing · 5db6c735

由 Daniel Vetter 提交于 3月 31, 2014

Our validation guys want to have a positive proof that the gfx driver
is indeed using VT-d, since setting up a gfx stack, especially in
early bring-up and by people not versed in linux gfx is a bit tricky.
So provide just that.

Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

5db6c735

02 4月, 2014 2 次提交

drm/i915: Allow full PPGTT with param override · 8d214b7d

由 Ben Widawsky 提交于 3月 24, 2014

When PPGTT was disabled by default, the patch also prevented the user
from overriding this behavior via module parameter. Being able to test
this on arbitrary kernels is extremely beneficial to track down the
remaining bugs. The patch that prevented this was:

commit 93a25a9e
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Thu Mar 6 09:40:43 2014 +0100

    drm/i915: Disable full ppgtt by default

By default PPGTT is set to -1. 0 means off, 1 means aliasing only, 2
means full, all other values are reserved.
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

8d214b7d

drm/i915: Split out GTT specific header file · 0260c420

由 Ben Widawsky 提交于 3月 22, 2014

This file contains all necessary defines, prototypes and typesdefs for
manipulating GEN graphics address translation (this does not include the
legacy AGP driver)

Reiterating the comment in the header,
"Please try to maintain the following order within this file unless it
makes sense to do otherwise. From top to bottom:
1. typedefs
2. #defines, and macros
3. structure definitions
4. function prototypes

Within each section, please try to order by generation in ascending
order, from top to bottom (ie. GEN6 on the top, GEN8 on the bottom)."

I've made some minor cleanups, and fixed a couple of typos while here -
but there should be no functional changes.

The purpose of the patch is to reduce clutter in our main header file,
making room for new growth, and make documentation of our interfaces
easier by splitting things out.

With a little more work, like making i915_gtt a pointer, we could
potentially completely isolate this header from i915_drv.h. At the
moment however, I don't think it's worth the effort.

Personally, I would have liked to put the PTE encoding functions in this
file too, but I didn't want to rock the boat too much.

A similar patch has been in use on my machine for some time. This exact
patch though has only been compile tested.
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

0260c420

31 3月, 2014 1 次提交

drm/i915: prefer struct drm_i915_private to drm_i915_private_t · 50227e1c

由 Jani Nikula 提交于 3月 31, 2014

Remove the rest of the references to drm_i915_private_t. No functional
changes.
Signed-off-by: NJani Nikula <jani.nikula@intel.com>
[danvet: Drop hunk in i915_cmd_parser.c]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

50227e1c

29 3月, 2014 1 次提交

drm/i915: Undo gtt scratch pte unmapping again · e568af1c

由 Daniel Vetter 提交于 3月 26, 2014

It apparently blows up on some machines. This functionally reverts

commit 828c7908
Author: Ben Widawsky <benjamin.widawsky@intel.com>
Date:   Wed Oct 16 09:21:30 2013 -0700

    drm/i915: Disable GGTT PTEs on GEN6+ suspend

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=64841Reported-and-Tested-by: NBrad  Jackson <bjackson0971@gmail.com>
Cc: stable@vger.kernel.org
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Todd Previte <tprevite@gmail.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

e568af1c

28 3月, 2014 1 次提交

drm/i915: Undo gtt scratch pte unmapping again · 8ee661b5

由 Daniel Vetter 提交于 3月 26, 2014

It apparently blows up on some machines. This functionally reverts

commit 828c7908
Author: Ben Widawsky <benjamin.widawsky@intel.com>
Date:   Wed Oct 16 09:21:30 2013 -0700

    drm/i915: Disable GGTT PTEs on GEN6+ suspend

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=64841Reported-and-Tested-by: NBrad  Jackson <bjackson0971@gmail.com>
Cc: stable@vger.kernel.org
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Todd Previte <tprevite@gmail.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: NDave Airlie <airlied@redhat.com>

8ee661b5

19 3月, 2014 1 次提交

drm/i915/bdw: Restore PPAT on thaw · a2319c08

由 Ben Widawsky 提交于 3月 18, 2014

Apparently it is wiped out from under us, and we get some really fun
caching artifacts upon resume (it seems to be WB for all types by
default).
Reported-by: NJames Ausmus <james.ausmus@intel.com>
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Tested-by: NJames Ausmus <james.ausmus@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76113Tested-by: NTimo Aaltonen <timo.aaltonen@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

a2319c08

13 3月, 2014 1 次提交

drm/i915: Drop WARN_ON(flags) from ppgtt_bind_vma() · c5139450

由 Ville Syrjälä 提交于 3月 12, 2014

We will call ppgtt_bind_vma() with flags != 0, so the WARN_ON(flags)
is bogus. Kill it.
Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

c5139450

12 3月, 2014 2 次提交

drm/i915: Correct PPGTT total size · 5a6c93fe

由 Ben Widawsky 提交于 3月 08, 2014

Our code allows have a PPGTT that is smaller than the maximum size for
GEN6-GEN7. Though I don't think this actually ever occurs, the code may
as well work properly and more importantly look correct by using the
variable size instead of the HW max.
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

5a6c93fe

drm/i915/bdw: Use scratch page table for GEN8 PPGTT · 8407bb91

由 Ben Widawsky 提交于 3月 08, 2014

I'm not clear if the hardware is still subject to the same prefetching
issues that made us use a scratch page in the first place. In either
case, we're using garbage with the current code (we will end up using
offset 0).

This may be the cause of our current gem_cpu_reloc regression with
PPGTT. I cannot test it at the moment.
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

8407bb91

08 3月, 2014 1 次提交

drm/i915: Disable full ppgtt by default · 93a25a9e

由 Daniel Vetter 提交于 3月 06, 2014

There are too many oustanding issues:

- Fence handling in the current code is broken. There's a patch series
  from me, but it's blocked on and extended review (which includes
  writing the testcases).

- IOMMU mapping handling is broken, we need to properly refcount it -
  currently it gets destroyed when the first vma is unbound, so way
  too early.

- There's a pending reset issue on snb. Since Mika's reset work and
  full ppgtt have been pulled in in separate branches and ended up
  intermittingly breaking each another it's unclear who's the exact
  culprit here.

- We still have persistent evidince of crazy recursion bugs through
  vma_unbind and ppgtt_relase, e.g.

  https://bugs.freedesktop.org/show_bug.cgi?id=73383

  This issue (and a few others meanwhile resolved) have blocked our
  performance measuring/tuning group since 3 months.

- Secure batch dispatching is broken. This is blocking Brad Volkin's
  command checker work since 3 months.

All these issues are confirmed to only happen when full ppgtt is
enabled, falling back to aliasing ppgtt resolves them. But even
aliasing ppgtt itself still has a regression:

- We currently unconditionally bind objects into the aliasing ppgtt,
  which means all priviledged objects like ringbuffers are visible to
  unpriviledged access again. On top of that this also breaks the
  command checker for aliasing ppgtt, since it can't hide the
  validated batch any more.

Furthermore topic/full-ppgtt has never been reviewed:

- Lifetime rules around vma unbinding/release are unclear, resulting
  into this awesome hack called ppgtt_release. Which seems to take the
  blame for most of the recursion fallout.

- Context/ring init works different on gpu reset than anywhere else.
  Such differeneces have in the past always lead to really hard to
  track down bugs.

- Aliasing ppgtt is treated in a bunch of places as a real address
  space, but it isn't - the real address space is always the global
  gtt in that case. This results in a bit a mess between contexts and
  ppgtt object, further complication the context/ppgtt/vma lifetime
  rules.

- We don't have any docs describing the overall concepts introduced
  with full ppgtt. A short, concise overview describing vmas and some
  of the strange bits around them (like the unbound vmas used by
  execbuf, or the new binding rules) really is needed.

Note that a lot of the post topic/full-ppgtt merge fallout has already
been addressed, this entire list here of 10 issues really only contains
the still outstanding issues.

Finally the 3.15 merge window is approaching and I think we need to
use the remaining time to ensure that our fallback option of using
aliasing ppgtt is in solid shape. Hence I think it's time to throw the
switch. While at it demote the helper from static inline status
because really.

Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Dave Airlie <airlied@gmail.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

93a25a9e

06 3月, 2014 6 次提交

drm/i915/bdw: Kill ppgtt->num_pt_pages · 5abbcca3

由 Ben Widawsky 提交于 2月 21, 2014

With the original PPGTT implementation if the number of PDPs was not a
power of two, the number of pages for the page tables would end up being
rounded up. The code actually had a bug here afaict, but this is a
theoretical bug as I don't believe this can actually occur with the
current code/HW..

With the rework of the page table allocations, there is no longer a
distinction between number of page table pages, and number of page
directory entries. To avoid confusion, kill the redundant (and newer)
struct member.

Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Reviewed-by: NImre Deak <imre.deak@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

5abbcca3

drm/i915: Split GEN6 PPGTT initialization up · b146520f

由 Ben Widawsky 提交于 2月 19, 2014

Simply to match the GEN8 style of PPGTT initialization, split up the
allocations and mappings. Unlike GEN8, we skip a separate dma_addr_t
allocation function, as it is much simpler pre-gen8.

With this code it would be easy to make a more general PPGTT
initialization function with per GEN alloc/map/etc. or use a common
helper, similar to the ringbuffer code. I don't see a benefit to doing
this just yet, but who knows...
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

b146520f

drm/i915: Split GEN6 PPGTT cleanup · a00d825d

由 Ben Widawsky 提交于 2月 19, 2014

This cleanup is similar to the GEN8 cleanup (though less necessary).
Having everything split will make cleaning the initialization path error
paths easier to understand.
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

a00d825d

drm/i915: Update i915_gem_gtt.c copyright · c4ac524c

由 Ben Widawsky 提交于 2月 19, 2014

I keep meaning to do this... by now almost the entire file has been
written by an Intel employee (including Daniel post-2010).
Reviewed-by: NImre Deak <imre.deak@intel.com>
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

c4ac524c

Revert "drm/i915/bdw: Limit GTT to 2GB" · 7907f45b

由 Ben Widawsky 提交于 2月 19, 2014

This reverts commit 3a2ffb65.

Now that the code is fixed to use smaller allocations, it should be safe
to let the full GGTT be used on BDW.

The testcase for this is anything which uses more than half of the GTT,
thus eclipsing the old limit.
Reviewed-by: NImre Deak <imre.deak@intel.com>
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

7907f45b

drm/i915/bdw: Reorganize PT allocations · 7ad47cf2

由 Ben Widawsky 提交于 2月 20, 2014

The previous allocation mechanism would get 2 contiguous allocations,
one for the page directories, and one for the page tables. As each page
table is 1 page, and there are 512 of these per page directory, this
goes to 2MB. An unfriendly request at best. Worse still, our HW now
supports 4 page directories, and a 2MB allocation is not allowed.

In order to fix this, this patch attempts to split up each page table
allocation into a single, discrete allocation. There is nothing really
fancy about the patch itself, it just has to manage an extra pointer
indirection, and have a fancier bit of logic to free up the pages.

To accommodate some of the added complexity, two new helpers are
introduced to allocate, and free the page table pages.

NOTE: I really wanted to split the way we do allocations, and the way in
which we identify the page table/page directory being used. I found
splitting this functionality up to be too unwieldy. I apologize in
advance to the reviewer. I'd recommend looking at the result, rather
than the diff.

v2/NOTE2: This patch predated commit:
6f1cc993
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Dec 31 15:50:31 2013 +0000

    drm/i915: Avoid dereference past end of page arr

It fixed the same issue as that patch, but because of the limbo state of
PPGTT, Chris patch was merged instead. The excess churn is a result of
my using my original patch, which has my preferred naming. Primarily
act_* is changed to which_*, but it's mostly the same otherwise. I've
kept the convention Chris used for the pte wrap (I had something
slightly different, and broken - but fixable)

v3: Rename which_p[..]e to drop which_ (Chris)
Remove BUG_ON in inner loop (Chris)
Redo the pde/pdpe wrap logic (Chris)

v4: s/1MB/2MB in commit message (Imre)
Plug leaking gen8_pt_pages in both the error path, as well as general
free case (Imre)

v5: Rename leftover "which_" variables (Imre)
Add the pde = 0 wrap that was missed from v3 (Imre)
Reviewed-by: NImre Deak <imre.deak@intel.com>
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
[danvet: Squash in fixup from Ben.]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

7ad47cf2

04 3月, 2014 4 次提交

drm/i915: Make clear/insert vfuncs args absolute · 782f1495

由 Ben Widawsky 提交于 2月 20, 2014

This patch converts insert_entries and clear_range, both functions which
are specific to the VM. These functions tend to encapsulate the gen
specific PTE writes. Passing absolute addresses to the insert_entries,
and clear_range will help make the logic clearer within the functions as
to what's going on. Currently, all callers simply do the appropriate
page shift, which IMO, ends up looking weird with an upcoming change for
the gen8 page table allocations.

Up until now, the PPGTT was a funky 2 level page table. GEN8 changes
this to look more like a 3 level page table, and to that extent we need
a significant amount more memory simply for the page tables. To address
this, the allocations will be split up in finer amounts.

v2: Replace size_t with uint64_t (Chris, Imre)

v3: Fix size in gen8_ppgtt_init (Ben)
Fix Size in i915_gem_suspend_gtt_mappings/restore (Imre)

Reviewed-by: Imre Deak <imre.deak@intel.com> (v2)
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

782f1495

drm/i915/bdw: Split ppgtt initialization up · bf2b4ed2

由 Ben Widawsky 提交于 2月 19, 2014

Like cleanup in an earlier patch, the code becomes much more readable,
and easier to extend if we extract out helper functions for the various
stages of init.

Note that with this patch it becomes really simple, and tempting to begin
using the 'goto out' idiom with explicit free/fini semantics. I've
kept the error path as similar as possible to the cleanup() function to
make sure cleanup is as robust as possible

v2: Remove comment "NB:From here on, ppgtt->base.cleanup() should
function properly"
Update commit message to reflect above

v3: Rebased on top of bugfixes found in the previous patch by Imre
Moved number of pd pages assertion to the proper place (Imre)

v4:
Allocate dma address space for num_pd_pages, not num_pd_entries (Ben)
Don't use gen8_pt_dma_addr after free on error path (Imre)
With new fix from v4 of the previous patch.
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Reviewed-by: NImre Deak <imre.deak@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

bf2b4ed2

drm/i915/bdw: Reorganize PPGTT init · f3a964b9

由 Ben Widawsky 提交于 2月 19, 2014

Create 3 clear stages in PPGTT init. This will help with upcoming
changes be more readable. The 3 stages are, allocation, dma mapping, and
writing the P[DT]Es

One nice benefit to the patches is that it makes 2 very clear error
points, allocation, and mapping, and avoids having to do any handling
after writing PTEs (something which was likely buggy before). This
simplified error handling I suspect will be helpful when we move to
deferred/dynamic page table allocation and mapping.

The patches also attempts to break up some of the steps into more
logical reviewable chunks, particularly when we free.

v2: Don't call cleanup on the error path since that takes down the
drm_mm and list entry, which aren't setup at this point.

v3: Fixes addressing Imre's comments from:
<1392821989.19792.13.camel@intelbox>

Don't do dynamic allocation for the page table DMA addresses. I can't
remember why I did it in the first place. This addresses one of Imre's
other issues.

Fix error path leak of page tables.

v4: Fix the fix of the error path leak. Original fix still leaked page
tables. (Imre)
Reviewed-by: NImre Deak <imre.deak@intel.com>
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

f3a964b9

drm/i915/bdw: Free PPGTT struct · b18b6bde

由 Ben Widawsky 提交于 2月 20, 2014

GEN8 never freed the PPGTT struct. As GEN8 doesn't use full PPGTT, the
leak is small and only found on a module reload. ie. I don't think this
needs to go to stable.

v2: The very naive, kfree in gen8 ppgtt cleanup, is subject to a double
free on PPGTT initialization failure. (Spotted by Imre). Instead this
patch pulls the ppgtt struct freeing out of the cleanup and leaves it to
the allocators/callers or the one doing the last kref_put as in standard
convention
Reported-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Reviewed-by: NImre Deak <imre.deak@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

b18b6bde

14 2月, 2014 2 次提交

drm/i915: Allow blocking in the PDE alloc when running low on gtt space · d47c3ea2

由 Daniel Vetter 提交于 2月 14, 2014

There's no need not to, really.

Split out from Chris vma-bind rework.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

d47c3ea2

drm/i915: Consolidate binding parameters into flags · 1ec9e26d

由 Daniel Vetter 提交于 2月 14, 2014

Anything more than just one bool parameter is just a pain to read,
symbolic constants are much better.

Split out from Chris' vma-binding rework patch.

v2: Undo the behaviour change in object_pin that Chris spotted.

v3: Split out misplaced hunk to handle set_cache_level errors,
spotted by Jani.

v4: Keep the current over-zealous binding logic in the execbuffer code
working with a quick hack while the overall binding code gets shuffled
around.

v5: Reorder the PIN_ flags for more natural patch splitup.

v6: Pull out the PIN_GLOBAL split-up again.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

1ec9e26d

13 2月, 2014 1 次提交

drm/i915/bdw: Split up PPGTT cleanup · b45a6715

由 Ben Widawsky 提交于 2月 12, 2014

This will make the code more readable, and extensible which is needed
for upcoming feature work. Eventually, we'll do the same for init.
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

b45a6715

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功