提交 · ddbb271aea87fc6004d3c8bcdb0710e980c7ec85 · openanolis / cloud-kernel

18 11月, 2016 1 次提交

drm/i915: add i915_address_space_fini · ed9724dd

由 Matthew Auld 提交于 11月 17, 2016

We already have an i915_address_space_init, so for symmetry we should
also have a _fini, plus we already open code it twice. This then also
fixes a bug where we leak the timeline for the ggtt vm.

v2: don't forget about the struct_mutex for the ggtt path.

Fixes: 80b204bc ("drm/i915: Enable multiple timelines")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NMatthew Auld <matthew.auld@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20161117210411.14044-1-chris@chris-wilson.co.uk

ed9724dd

17 11月, 2016 3 次提交

drm/i915: dev_priv cleanup in i915_gem_stolen.c · 7ace3d30

由 Tvrtko Ursulin 提交于 11月 16, 2016

And a little bit of cascaded function prototype changes.
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>

7ace3d30

drm/i915: dev_priv cleanup in i915_gem_gtt.c · 275a991c

由 Tvrtko Ursulin 提交于 11月 16, 2016

Started with removing INTEL_INFO(dev) and cascaded into a quite
big trickle of function prototype changes. Still, I think it is
for the better.
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>

275a991c

T
drm/i915: dev_priv and a small cascade of cleanups in i915_gem.c · c6be607a
由 Tvrtko Ursulin 提交于 11月 16, 2016
```
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
```
c6be607a

11 11月, 2016 2 次提交

drm/i915: Pass dev_priv to INTEL_INFO everywhere apart from the gen use · b7f05d4a

由 Tvrtko Ursulin 提交于 11月 09, 2016

After this patch only conversion of INTEL_INFO(p)->gen to
INTEL_GEN(dev_priv) remains before the __I915__ macro can
be removed.

v2: Tidy vlv_compute_wm. (David Weinehall)
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: David Weinehall <david.weinehall@linux.intel.com>
Reviewed-by: NDavid Weinehall <david.weinehall@linux.intel.com>

b7f05d4a

drm/i915: Split out i915_vma.c · b42fe9ca

由 Joonas Lahtinen 提交于 11月 11, 2016

As a side product, had to split two other files;
- i915_gem_fence_reg.h
- i915_gem_object.h (only parts that needed immediate untanglement)

I tried to move code in as big chunks as possible, to make review
easier. i915_vma_compare was moved to a header temporarily.

v2:
- Use i915_gem_fence_reg.{c,h}

v3:
- Rebased

v4:
- Fix building when DEBUG_GEM is enabled by reordering a bit.

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1478861034-30643-1-git-send-email-joonas.lahtinen@linux.intel.com

b42fe9ca

07 11月, 2016 1 次提交

drm/i915: Remove the vma from the object list upon close · dfd2812e

由 Chris Wilson 提交于 11月 04, 2016

Currently, the vma is being unlink from the object lookup on destroy.
However, we are meant to be decoupling it upon close so that the user
cannot access the closed vma whilst it remains active on the GPU.

[   34.074858] kernel BUG at drivers/gpu/drm/i915/i915_gem_gtt.c:3561!
[   34.074875] invalid opcode: 0000 [#1] PREEMPT SMP
[   34.074888] Modules linked in: snd_hda_intel i915 x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel lpc_ich mei_me mei snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_codec snd_hwdep snd_hda_core i2c_designware_platform i2c_designware_core snd_pcm e1000e ptp pps_core sdhci_acpi sdhci mmc_core i2c_hid [last unloaded: i915]
[   34.075010] CPU: 1 PID: 6224 Comm: gem_close_race Tainted: G     U          4.9.0-rc3-CI-CI_DRM_1800+ #1
[   34.075034] Hardware name:                  /NUC5i7RYB, BIOS RYBDWi35.86A.0355.2016.0224.1501 02/24/2016
[   34.075057] task: ffff8802459a8040 task.stack: ffffc90000524000
[   34.075074] RIP: 0010:[<ffffffffa0392cbc>]  [<ffffffffa0392cbc>] i915_gem_obj_lookup_or_create_vma+0x8c/0xc0 [i915]
[   34.075118] RSP: 0018:ffffc90000527b68  EFLAGS: 00010202
[   34.075135] RAX: ffff8802426c5e40 RBX: 0000000000000000 RCX: ffff8802447fc2a8
[   34.075158] RDX: 0000000000000000 RSI: ffff8802447fc2a8 RDI: ffff880248a4a880
[   34.075181] RBP: ffffc90000527b88 R08: 0000000000000008 R09: 0000000000000000
[   34.075203] R10: 0000000000000001 R11: 0000000000000000 R12: ffff880248a4a880
[   34.075225] R13: ffff8802447fc2a8 R14: ffff880243e9afa8 R15: ffff880248a4a9c8
[   34.075248] FS:  00007f9b43e59740(0000) GS:ffff880256c80000(0000) knlGS:0000000000000000
[   34.075273] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   34.075292] CR2: 00007f9b43419140 CR3: 000000024455d000 CR4: 00000000003406e0
[   34.075314] Stack:
[   34.075323]  0000000000000000 ffffc90000527bd0 ffff880243cb8008 ffff880243e9afa8
[   34.075353]  ffffc90000527c08 ffffffffa03874c7 ffffc90000527bb8 ffff880243e9afa8
[   34.075383]  ffff880243e9afb0 ffffc90000527e10 ffff8802447fc2a8 ffff880243cb8040
[   34.075414] Call Trace:
[   34.075435]  [<ffffffffa03874c7>] eb_lookup_vmas.isra.7+0x247/0x330 [i915]
[   34.075468]  [<ffffffffa0388c34>] i915_gem_do_execbuffer.isra.15+0x604/0x1a10 [i915]
[   34.075507]  [<ffffffffa039c957>] ? i915_gem_object_get_sg+0x347/0x380 [i915]
[   34.075532]  [<ffffffff811a69ce>] ? __might_fault+0x3e/0x90
[   34.075562]  [<ffffffffa038a430>] i915_gem_execbuffer2+0xc0/0x250 [i915]
[   34.075585]  [<ffffffff81552926>] drm_ioctl+0x1f6/0x480
[   34.075604]  [<ffffffff8100107a>] ? trace_hardirqs_on_thunk+0x1a/0x1c
[   34.075635]  [<ffffffffa038a370>] ? i915_gem_execbuffer+0x330/0x330 [i915]
[   34.075658]  [<ffffffff81202d2e>] do_vfs_ioctl+0x8e/0x690
[   34.075677]  [<ffffffff8181582d>] ? _raw_spin_unlock_irqrestore+0x3d/0x60
[   34.075700]  [<ffffffff810fcd51>] ? SyS_timer_settime+0x141/0x1e0
[   34.075721]  [<ffffffff810d6de2>] ? trace_hardirqs_on_caller+0x122/0x1b0
[   34.075742]  [<ffffffff8120336c>] SyS_ioctl+0x3c/0x70
[   34.075760]  [<ffffffff8181602e>] entry_SYSCALL_64_fastpath+0x1c/0xb1
[   34.075781] Code: 44 a0 48 c7 c2 9a 7e 43 a0 be e0 0d 00 00 48 c7 c7 a0 45 44 a0 e8 55 b8 ce e0 48 85 db 74 a3 49 83 bd f8 03 00 00 00 74 99 0f 0b <0f> 0b 48 89 da 4c 89 ee 4c 89 e7 e8 04 a9 ff ff 48 89 da 49 89
[   34.075955] RIP  [<ffffffffa0392cbc>] i915_gem_obj_lookup_or_create_vma+0x8c/0xc0 [i915]
[   34.075994]  RSP <ffffc90000527b68>

Testcase: igt/gem_close_race/basic-threads
Fixes: db6c2b41 ("drm/i915: Store the vma in an rbtree...")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161104161241.25871-1-chris@chris-wilson.co.ukReviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>

dfd2812e

04 11月, 2016 2 次提交

drm/i915: Fix pages pin counting around swizzle quirk · 2c3a3f44

由 Chris Wilson 提交于 11月 04, 2016

commit bc0629a7 ("drm/i915: Track pages pinned due to swizzling
quirk") fixed one problem, but revealed a whole lot more. The root cause
of the pin count mismatch for the swizzle quirk (for L-shaped memory on
gen3/4) was that we were incrementing the pages_pin_count upon getting
the backing pages but then overwriting the pages_pin_count to set it to
1 afterwards. With a little bit of adjustment to satisfy the GEM_BUG_ON
sanitychecks, the fix is to replace the explicit atomic_set with an
atomic_inc.

v2: Consistently use atomics (not mix atomics and helpers) within the
lowlevel get_pages routines. This makes the atomic operations much
clearer.

Fixes: 1233e2db ("drm/i915: Move object backing storage manipulation")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161104103001.27643-1-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>

2c3a3f44

drm/i915: Fix test on inputs for vma_compare() · a44342ac

由 Chris Wilson 提交于 11月 03, 2016

When supplying a view to vma_compare() it is required that the supplied
i915_address_space is the global GTT. I tested the VMA instead (which is
the current position in the rbtree and maybe from any address space).
Reported-by: NMatthew Auld <matthew.auld@intel.com>
Tested-by: NMatthew Auld <matthew.auld@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98579
Fixes: db6c2b41 ("drm/i915: Store the vma in an rbtree...")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161103200852.23431-1-chris@chris-wilson.co.ukReviewed-by: NMatthew Auld <matthew.auld@intel.com>

a44342ac

02 11月, 2016 3 次提交

drm/i915: Unify global_list into global_link · 56cea323

由 Joonas Lahtinen 提交于 11月 02, 2016

$ sed -i -r 's/\bglobal_list\b/global_link/g' *.c *.h

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1478081764-8058-1-git-send-email-joonas.lahtinen@linux.intel.com

56cea323

drm/i915/gtt: Mark tlbs dirty on clear · fce93755

由 Mika Kuoppala 提交于 10月 31, 2016

Now when clearing ptes can modify upper level pdp's,
we need to mark them dirty so that they will be flushed
correctly.
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NMika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1478006856-8313-1-git-send-email-mika.kuoppala@intel.com

fce93755

drm/i915/gtt: Fix pte clear range · 37c63934

由 Mika Kuoppala 提交于 11月 01, 2016

Comparing pte index to a number of entries is wrong
when clearing a range of pte entries. Use end marker
of 'one past' to correctly point adequate number of
ptes to the scratch page.

v2: assert early instead of warning late (Chris)
v3: removed consts (Joonas)

Fixes: d209b9c3 ("drm/i915/gtt: Split gen8_ppgtt_clear_pte_range")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98282
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reported-by: NMike Lothian <mike@fireburn.co.uk>
Signed-off-by: NMika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Tested-by: NMike Lothian <mike@fireburn.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: NMika Kuoppala <mika.kuoppala@intel.com>

37c63934

01 11月, 2016 1 次提交

drm/i915: Store the vma in an rbtree under the object · db6c2b41

由 Chris Wilson 提交于 11月 01, 2016

With full-ppgtt one of the main bottlenecks is the lookup of the VMA
underneath the object. For execbuf there is merit in having a very fast
direct lookup of ctx:handle to the vma using a hashtree, but that still
leaves a large number of other lookups. One way to speed up the lookup
would be to use a rhashtable, but that requires extra allocations and
may exhibit poor worse case behaviour. An alternative is to use an
embedded rbtree, i.e. no extra allocations and deterministic behaviour,
but at the slight cost of O(lgN) lookups (instead of O(1) for
rhashtable). The major of such tree will be very shallow and so not much
slower, and still scales much, much better than the current unsorted
list.

v2: Bump vma_compare() to return a long, as we return the result of
comparing two pointers.

References: https://bugs.freedesktop.org/show_bug.cgi?id=87726Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161101115400.15647-1-chris@chris-wilson.co.uk

db6c2b41

29 10月, 2016 7 次提交

drm/i915: Enable multiple timelines · 80b204bc

由 Chris Wilson 提交于 10月 28, 2016

With the infrastructure converted over to tracking multiple timelines in
the GEM API whilst preserving the efficiency of using a single execution
timeline internally, we can now assign a separate timeline to every
context with full-ppgtt.

v2: Add a comment to indicate the xfer between timelines upon submission.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-35-chris@chris-wilson.co.uk

80b204bc

drm/i915: Move GEM activity tracking into a common struct reservation_object · d07f0e59

由 Chris Wilson 提交于 10月 28, 2016

In preparation to support many distinct timelines, we need to expand the
activity tracking on the GEM object to handle more than just a request
per engine. We already use the struct reservation_object on the dma-buf
to handle many fence contexts, so integrating that into the GEM object
itself is the preferred solution. (For example, we can now share the same
reservation_object between every consumer/producer using this buffer and
skip the manual import/export via dma-buf.)

v2: Reimplement busy-ioctl (by walking the reservation object), postpone
the ABI change for another day. Similarly use the reservation object to
find the last_write request (if active and from i915) for choosing
display CS flips.

Caveats:

* busy-ioctl: busy-ioctl only reports on the native fences, it will not
warn of stalls (in set-domain-ioctl, pread/pwrite etc) if the object is
being rendered to by external fences. It also will not report the same
busy state as wait-ioctl (or polling on the dma-buf) in the same
circumstances. On the plus side, it does retain reporting of which
*i915* engines are engaged with this object.

* non-blocking atomic modesets take a step backwards as the wait for
render completion blocks the ioctl. This is fixed in a subsequent
patch to use a fence instead for awaiting on the rendering, see
"drm/i915: Restore nonblocking awaits for modesetting"

* dynamic array manipulation for shared-fences in reservation is slower
than the previous lockless static assignment (e.g. gem_exec_lut_handle
runtime on ivb goes from 42s to 66s), mainly due to atomic operations
(maintaining the fence refcounts).

* loss of object-level retirement callbacks, emulated by VMA retirement
tracking.

* minor loss of object-level last activity information from debugfs,
could be replaced with per-vma information if desired
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-21-chris@chris-wilson.co.uk

d07f0e59

drm/i915: Pass around sg_table to get_pages/put_pages backend · 03ac84f1

由 Chris Wilson 提交于 10月 28, 2016

The plan is to move obj->pages out from under the struct_mutex into its
own per-object lock. We need to prune any assumption of the struct_mutex
from the get_pages/put_pages backends, and to make it easier we pass
around the sg_table to operate on rather than indirectly via the obj.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-13-chris@chris-wilson.co.uk

03ac84f1

drm/i915: Refactor object page API · a4f5ea64

由 Chris Wilson 提交于 10月 28, 2016

The plan is to make obtaining the backing storage for the object avoid
struct_mutex (i.e. use its own locking). The first step is to update the
API so that normal users only call pin/unpin whilst working on the
backing storage.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-12-chris@chris-wilson.co.uk

a4f5ea64

drm/i915: Use radixtree to jump start intel_partial_pages() · d2a84a76

由 Chris Wilson 提交于 10月 28, 2016

We can use the radixtree index of the obj->pages to find the start
position of the desired partial range.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-11-chris@chris-wilson.co.uk

d2a84a76

drm/i915: Markup GEM API with lockdep asserts · 4c7d62c6

由 Chris Wilson 提交于 10月 28, 2016

Add lockdep_assert_held(struct_mutex) to the API preamble of the
internal GEM interfaces.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-9-chris@chris-wilson.co.uk

4c7d62c6

drm/i915: Defer active reference until required · f8a7fde4

由 Chris Wilson 提交于 10月 28, 2016

We only need the active reference to keep the object alive after the
handle has been deleted (so as to prevent a synchronous gem_close). Why
then pay the price of a kref on every execbuf when we can insert that
final active ref just in time for the handle deletion?
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-6-chris@chris-wilson.co.uk

f8a7fde4

24 10月, 2016 2 次提交

drm/i915: Remove RPM sequence checking · 2eedfc7d

由 Chris Wilson 提交于 10月 24, 2016

We only used the RPM sequence checking inside the lowlevel GTT
accessors, when we had to rely on callers taking the wakeref on our
behalf. Now that we take the RPM wakeref inside the GTT management
routines themselves, we can forgo the sanitycheck of the callers.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: NImre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161024124218.18252-4-chris@chris-wilson.co.uk

2eedfc7d

drm/i915: Use RPM as the barrier for controlling user mmap access · 9c870d03

由 Chris Wilson 提交于 10月 24, 2016

We can remove the false coupling between RPM and struct mutex by the
observation that we can use the RPM wakeref as the barrier around user
mmap access. That is as we tear down the user's PTE atomically from
within rpm suspend and then to fault in new PTE requires the rpm
wakeref, means that no user access is possible through those PTE without
RPM being awake. Having made that observation, we can then remove the
presumption of having to take rpm outside of struct_mutex and so allow
fine grained acquisition of a wakeref around hw access rather than
having to remember to acquire the wakeref early on.

v2: Rejig placement of the new intel_runtime_pm_get() to be as tight
as possible around the GTT pread/pwrite.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: NDaniel Vetter <daniel@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20161024124218.18252-2-chris@chris-wilson.co.uk

9c870d03

14 10月, 2016 10 次提交

drm/i915/gtt: Free unused lower-level page tables · 2ce5179f

由 Michał Winiarski 提交于 10月 13, 2016

Since "Dynamic page table allocations" were introduced, our page tables
can grow (being dynamically allocated) with address space range usage.
Unfortunately, their lifetime is bound to vm. This is not a huge problem
when we're not using softpin - drm_mm is creating an upper bound on used
range by causing addresses for our VMAs to eventually be reused.

With softpin, long lived contexts can drain the system out of memory
even with a single "small" object. For example:

bo = bo_alloc(size);
while(true)
    offset += size;
    exec(bo, offset);

Will cause us to create new allocations until all memory in the system
is used for tracking GPU pages (even though almost all PTEs in this vm
are pointing to scratch).

Let's free unused page tables in clear_range to prevent this - if no
entries are used, we can safely free it and return this information to
the caller (so that higher-level entry is pointing to scratch).

v2: Document return value and free semantics (Joonas)
v3: No newlines in vars block (Joonas)
v4: Drop redundant local 'reduce' variable
v5: Handle CI fail with enable_ppgtt=2

Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: NMichał Winiarski <michal.winiarski@intel.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1476360162-24062-3-git-send-email-michal.winiarski@intel.com

2ce5179f

drm/i915/gtt: Split gen8_ppgtt_clear_pte_range · d209b9c3

由 Michał Winiarski 提交于 10月 13, 2016

Let's use more top-down approach, where each gen8_ppgtt_clear_* function
is responsible for clearing the struct passed as an argument and calling
relevant clear_range functions on lower-level tables.
Doing this rather than operating on PTE ranges makes the implementation
of shrinking page tables quite simple.

v2: Drop min when calculating num_entries, no negation in 48b ppgtt
check, no newlines in vars block (Joonas)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: NMichał Winiarski <michal.winiarski@intel.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1476360162-24062-2-git-send-email-michal.winiarski@intel.com

d209b9c3

drm/i915: Remove unused "valid" parameter from pte_encode · 4fb84d99

由 Michał Winiarski 提交于 10月 13, 2016

We never used any invalid ptes, those were put in place for
a possibility of doing gpu faults. However our batchbuffers are not
restricted in length, so everything needs to be pointing to something
and thus out-of-bounds is pointing to scratch.

Remove the valid flag as it is always true.

v2: Expand commit msg, patch reorder (Mika)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: NMichał Winiarski <michal.winiarski@intel.com>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1476360162-24062-1-git-send-email-michal.winiarski@intel.com

4fb84d99

drm/i915: Make IS_GEN macros only take dev_priv · 5db94019

由 Tvrtko Ursulin 提交于 10月 13, 2016

Saves 1416 bytes of .rodata strings.

v2: Add parantheses around dev_priv. (Ville Syrjala)
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NDavid Weinehall <david.weinehall@linux.intel.com>
Acked-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: NJani Nikula <jani.nikula@linux.intel.com>
Acked-by: NChris Wilson <chris@chris-wilson.co.uk>
Acked-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1476352990-2504-1-git-send-email-tvrtko.ursulin@linux.intel.com

5db94019

drm/i915: Make IS_CHERRYVIEW only take dev_priv · 920a14b2

由 Tvrtko Ursulin 提交于 10月 14, 2016

Saves 864 bytes of .rodata strings and ~100 of .text.

v2: Add parantheses around dev_priv. (Ville Syrjala)
v3: Rebase.
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NDavid Weinehall <david.weinehall@linux.intel.com>
Acked-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: NJani Nikula <jani.nikula@linux.intel.com>
Acked-by: NChris Wilson <chris@chris-wilson.co.uk>
Acked-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com>

920a14b2

drm/i915: Make IS_BROXTON only take dev_priv · e2d214ae

由 Tvrtko Ursulin 提交于 10月 13, 2016

Saves 1392 bytes of .rodata strings.

Also change a few function/macro prototypes in i915_gem_gtt.c
from dev to dev_priv where it made more sense to do so.

v2: Add parantheses around dev_priv. (Ville Syrjala)
v3: Mention function prototype changes. (David Weinehall)
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: David Weinehall <david.weinehall@linux.intel.com>
Acked-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: NJani Nikula <jani.nikula@linux.intel.com>
Acked-by: NChris Wilson <chris@chris-wilson.co.uk>
Acked-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: NDavid Weinehall <david.weinehall@linux.intel.com>

e2d214ae

drm/i915: Make IS_SKYLAKE only take dev_priv · d9486e65

由 Tvrtko Ursulin 提交于 10月 13, 2016

Saves 1016 bytes of .rodata strings and couple hundred of .text.

v2: Add parantheses around dev_priv. (Ville Syrjala)
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NDavid Weinehall <david.weinehall@linux.intel.com>
Acked-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: NJani Nikula <jani.nikula@linux.intel.com>
Acked-by: NChris Wilson <chris@chris-wilson.co.uk>
Acked-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com>

d9486e65

drm/i915: Make IS_HASWELL only take dev_priv · 772c2a51

由 Tvrtko Ursulin 提交于 10月 13, 2016

Saves 2432 bytes of .rodata strings.

v2: Add parantheses around dev_priv. (Ville Syrjala)
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NDavid Weinehall <david.weinehall@linux.intel.com>
Acked-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: NJani Nikula <jani.nikula@linux.intel.com>
Acked-by: NChris Wilson <chris@chris-wilson.co.uk>
Acked-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com>

772c2a51

drm/i915: Make IS_BROADWELL only take dev_priv · 8652744b

由 Tvrtko Ursulin 提交于 10月 13, 2016

Saves 1808 bytes of .rodata strings.

v2: Add parantheses around dev_priv. (Ville Syrjala)
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NDavid Weinehall <david.weinehall@linux.intel.com>
Acked-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: NJani Nikula <jani.nikula@linux.intel.com>
Acked-by: NChris Wilson <chris@chris-wilson.co.uk>
Acked-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com>

8652744b

drm/i915: Allocate intel_engine_cs structure only for the enabled engines · 3b3f1650

由 Akash Goel 提交于 10月 13, 2016

With the possibility of addition of many more number of rings in future,
the drm_i915_private structure could bloat as an array, of type
intel_engine_cs, is embedded inside it.
	struct intel_engine_cs engine[I915_NUM_ENGINES];
Though this is still fine as generally there is only a single instance of
drm_i915_private structure used, but not all of the possible rings would be
enabled or active on most of the platforms. Some memory can be saved by
allocating intel_engine_cs structure only for the enabled/active engines.
Currently the engine/ring ID is kept static and dev_priv->engine[] is simply
indexed using the enums defined in intel_engine_id.
To save memory and continue using the static engine/ring IDs, 'engine' is
defined as an array of pointers.
	struct intel_engine_cs *engine[I915_NUM_ENGINES];
dev_priv->engine[engine_ID] will be NULL for disabled engine instances.

There is a text size reduction of 928 bytes, from 1028200 to 1027272, for
i915.o file (but for i915.ko file text size remain same as 1193131 bytes).

v2:
- Remove the engine iterator field added in drm_i915_private structure,
  instead pass a local iterator variable to the for_each_engine**
  macros. (Chris)
- Do away with intel_engine_initialized() and instead directly use the
  NULL pointer check on engine pointer. (Chris)

v3:
- Remove for_each_engine_id() macro, as the updated macro for_each_engine()
  can be used in place of it. (Chris)
- Protect the access to Render engine Fault register with a NULL check, as
  engine specific init is done later in Driver load sequence.

v4:
- Use !!dev_priv->engine[VCS] style for the engine check in getparam. (Chris)
- Kill the superfluous init_engine_lists().

v5:
- Cleanup the intel_engines_init() & intel_engines_setup(), with respect to
  allocation of intel_engine_cs structure. (Chris)

v6:
- Rebase.

v7:
- Optimize the for_each_engine_masked() macro. (Chris)
- Change the type of 'iter' local variable to enum intel_engine_id. (Chris)
- Rebase.

v8: Rebase.

v9: Rebase.

v10:
- For index calculation use engine ID instead of pointer based arithmetic in
  intel_engine_sync_index() as engine pointers are not contiguous now (Chris)
- For appropriateness, rename local enum variable 'iter' to 'id'. (Joonas)
- Use for_each_engine macro for cleanup in intel_engines_init() and remove
  check for NULL engine pointer in cleanup() routines. (Joonas)

v11: Rebase.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NAkash Goel <akash.goel@intel.com>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1476378888-7372-1-git-send-email-akash.goel@intel.com

3b3f1650

12 10月, 2016 1 次提交

drm/i915: Always use the GTT for error capture · 95374d75

由 Chris Wilson 提交于 10月 12, 2016

Since the GTT provides universal access to any GPU page, we can use it
to reduce our plethora of read methods to just one. It also has the
important characteristic of being exactly what the GPU sees - if there
are incoherency problems, seeing the batch as executed (rather than as
trapped inside the cpu cache) is important.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161012090522.367-4-chris@chris-wilson.co.uk

95374d75

12 9月, 2016 1 次提交

drm/i915: remove writeq ifdeffery · 82daabae

由 Matthew Auld 提交于 9月 09, 2016

drm already provides fallback versions of readq and writeq.
Signed-off-by: NMatthew Auld <matthew.auld@intel.com>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1473451373-9852-1-git-send-email-matthew.auld@intel.com

82daabae

10 9月, 2016 1 次提交

drm/i915: Flush to GTT domain all GGTT bound objects after hibernation · fbb30a5c

由 Chris Wilson 提交于 9月 09, 2016

Recently I have been applying an optimisation to avoid stalling and
clflushing GGTT objects based on their current binding. That is we only
set-to-gtt-domain upon first bind. However, on hibernation the objects
remain bound, but they are in the CPU domain. Currently (since commit
975f7ff4 ("drm/i915: Lazily migrate the objects after hibernation"))
we only flush scanout objects as all other objects are expected to be
flushed prior to use. That breaks down in the face of the runtime
optimisation above - and we need to flush all GGTT pinned objects
(essentially ringbuffers).

To reduce the burden of extra clflushes, we only flush those objects we
cannot discard from the GGTT. Everything pinned to the scanout, or
current contexts or ringbuffers will be flushed and rebound. Other
objects, such as inactive contexts, will be left unbound and in the CPU
domain until first use after resuming.

Fixes: 7abc98fa ("drm/i915: Only change the context object's domain...")
Fixes: 57e88531 ("drm/i915: Use VMA for ringbuffer tracking")
References: https://bugs.freedesktop.org/show_bug.cgi?id=94722Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: David Weinehall <david.weinehall@intel.com>
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160909201957.2499-1-chris@chris-wilson.co.uk

fbb30a5c

09 9月, 2016 2 次提交

drm/i915: Mark up all locked waiters · 22dd3bb9

由 Chris Wilson 提交于 9月 09, 2016

In the next patch we want to handle reset directly by a locked waiter in
order to avoid issues with returning before the reset is handled. To
handle the reset, we must first know whether we hold the struct_mutex.
If we do not hold the struct_mtuex we can not perform the reset, but we do
not block the reset worker either (and so we can just continue to wait for
request completion) - otherwise we must relinquish the mutex.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160909131201.16673-10-chris@chris-wilson.co.uk

22dd3bb9

drm/i915: Expand bool interruptible to pass flags to i915_wait_request() · ea746f36

由 Chris Wilson 提交于 9月 09, 2016

We need finer control over wakeup behaviour during i915_wait_request(),
so expand the current bool interruptible to a bitmask.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160909131201.16673-9-chris@chris-wilson.co.uk

ea746f36

06 9月, 2016 2 次提交

drm/i915: disable 48bit full PPGTT when vGPU is active · 557b1a8c

由 Zhi Wang 提交于 9月 06, 2016

Disable 48bit full PPGTT on vGPU too for now.
Signed-off-by: NZhi Wang <zhi.a.wang@intel.com>
Signed-off-by: NZhenyu Wang <zhenyuw@linux.intel.com>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: drm-intel-fixes@lists.freedesktop.org
Signed-off-by: NJani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160906040412.1274-3-zhenyuw@linux.intel.com
(cherry picked from commit e320d400)
Signed-off-by: NJani Nikula <jani.nikula@intel.com>

557b1a8c

drm/i915: disable 48bit full PPGTT when vGPU is active · e320d400

由 Zhi Wang 提交于 9月 06, 2016

Disable 48bit full PPGTT on vGPU too for now.
Signed-off-by: NZhi Wang <zhi.a.wang@intel.com>
Signed-off-by: NZhenyu Wang <zhenyuw@linux.intel.com>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: drm-intel-fixes@lists.freedesktop.org
Signed-off-by: NJani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160906040412.1274-3-zhenyuw@linux.intel.com

e320d400

22 8月, 2016 1 次提交

drm/i915: Allow DMA pagetables to use highmem · bb8f9cff

由 Chris Wilson 提交于 8月 22, 2016

As we never need to directly access the pages we allocate for scratch and
the pagetables, and always remap them into the GTT through the dma
remapper, we do not need to limit the allocations to lowmem i.e. we can
pass in the __GFP_HIGHMEM flag to the page allocation.

For backwards compatibility, e.g. certain old GPUs not liking highmem
for certain functions that may be accidentally mapped to the scratch
page by userspace, keep the GMCH probe as only allocating from DMA32.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20160822074431.26872-3-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>

bb8f9cff

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功