- 18 11月, 2016 1 次提交
-
-
由 Matthew Auld 提交于
We already have an i915_address_space_init, so for symmetry we should also have a _fini, plus we already open code it twice. This then also fixes a bug where we leak the timeline for the ggtt vm. v2: don't forget about the struct_mutex for the ggtt path. Fixes: 80b204bc ("drm/i915: Enable multiple timelines") Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NMatthew Auld <matthew.auld@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20161117210411.14044-1-chris@chris-wilson.co.uk
-
- 17 11月, 2016 3 次提交
-
-
由 Tvrtko Ursulin 提交于
And a little bit of cascaded function prototype changes. Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
-
由 Tvrtko Ursulin 提交于
Started with removing INTEL_INFO(dev) and cascaded into a quite big trickle of function prototype changes. Still, I think it is for the better. Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
-
由 Tvrtko Ursulin 提交于
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
-
- 11 11月, 2016 2 次提交
-
-
由 Tvrtko Ursulin 提交于
After this patch only conversion of INTEL_INFO(p)->gen to INTEL_GEN(dev_priv) remains before the __I915__ macro can be removed. v2: Tidy vlv_compute_wm. (David Weinehall) Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: David Weinehall <david.weinehall@linux.intel.com> Reviewed-by: NDavid Weinehall <david.weinehall@linux.intel.com>
-
由 Joonas Lahtinen 提交于
As a side product, had to split two other files; - i915_gem_fence_reg.h - i915_gem_object.h (only parts that needed immediate untanglement) I tried to move code in as big chunks as possible, to make review easier. i915_vma_compare was moved to a header temporarily. v2: - Use i915_gem_fence_reg.{c,h} v3: - Rebased v4: - Fix building when DEBUG_GEM is enabled by reordering a bit. Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1478861034-30643-1-git-send-email-joonas.lahtinen@linux.intel.com
-
- 07 11月, 2016 1 次提交
-
-
由 Chris Wilson 提交于
Currently, the vma is being unlink from the object lookup on destroy. However, we are meant to be decoupling it upon close so that the user cannot access the closed vma whilst it remains active on the GPU. [ 34.074858] kernel BUG at drivers/gpu/drm/i915/i915_gem_gtt.c:3561! [ 34.074875] invalid opcode: 0000 [#1] PREEMPT SMP [ 34.074888] Modules linked in: snd_hda_intel i915 x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel lpc_ich mei_me mei snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_codec snd_hwdep snd_hda_core i2c_designware_platform i2c_designware_core snd_pcm e1000e ptp pps_core sdhci_acpi sdhci mmc_core i2c_hid [last unloaded: i915] [ 34.075010] CPU: 1 PID: 6224 Comm: gem_close_race Tainted: G U 4.9.0-rc3-CI-CI_DRM_1800+ #1 [ 34.075034] Hardware name: /NUC5i7RYB, BIOS RYBDWi35.86A.0355.2016.0224.1501 02/24/2016 [ 34.075057] task: ffff8802459a8040 task.stack: ffffc90000524000 [ 34.075074] RIP: 0010:[<ffffffffa0392cbc>] [<ffffffffa0392cbc>] i915_gem_obj_lookup_or_create_vma+0x8c/0xc0 [i915] [ 34.075118] RSP: 0018:ffffc90000527b68 EFLAGS: 00010202 [ 34.075135] RAX: ffff8802426c5e40 RBX: 0000000000000000 RCX: ffff8802447fc2a8 [ 34.075158] RDX: 0000000000000000 RSI: ffff8802447fc2a8 RDI: ffff880248a4a880 [ 34.075181] RBP: ffffc90000527b88 R08: 0000000000000008 R09: 0000000000000000 [ 34.075203] R10: 0000000000000001 R11: 0000000000000000 R12: ffff880248a4a880 [ 34.075225] R13: ffff8802447fc2a8 R14: ffff880243e9afa8 R15: ffff880248a4a9c8 [ 34.075248] FS: 00007f9b43e59740(0000) GS:ffff880256c80000(0000) knlGS:0000000000000000 [ 34.075273] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 34.075292] CR2: 00007f9b43419140 CR3: 000000024455d000 CR4: 00000000003406e0 [ 34.075314] Stack: [ 34.075323] 0000000000000000 ffffc90000527bd0 ffff880243cb8008 ffff880243e9afa8 [ 34.075353] ffffc90000527c08 ffffffffa03874c7 ffffc90000527bb8 ffff880243e9afa8 [ 34.075383] ffff880243e9afb0 ffffc90000527e10 ffff8802447fc2a8 ffff880243cb8040 [ 34.075414] Call Trace: [ 34.075435] [<ffffffffa03874c7>] eb_lookup_vmas.isra.7+0x247/0x330 [i915] [ 34.075468] [<ffffffffa0388c34>] i915_gem_do_execbuffer.isra.15+0x604/0x1a10 [i915] [ 34.075507] [<ffffffffa039c957>] ? i915_gem_object_get_sg+0x347/0x380 [i915] [ 34.075532] [<ffffffff811a69ce>] ? __might_fault+0x3e/0x90 [ 34.075562] [<ffffffffa038a430>] i915_gem_execbuffer2+0xc0/0x250 [i915] [ 34.075585] [<ffffffff81552926>] drm_ioctl+0x1f6/0x480 [ 34.075604] [<ffffffff8100107a>] ? trace_hardirqs_on_thunk+0x1a/0x1c [ 34.075635] [<ffffffffa038a370>] ? i915_gem_execbuffer+0x330/0x330 [i915] [ 34.075658] [<ffffffff81202d2e>] do_vfs_ioctl+0x8e/0x690 [ 34.075677] [<ffffffff8181582d>] ? _raw_spin_unlock_irqrestore+0x3d/0x60 [ 34.075700] [<ffffffff810fcd51>] ? SyS_timer_settime+0x141/0x1e0 [ 34.075721] [<ffffffff810d6de2>] ? trace_hardirqs_on_caller+0x122/0x1b0 [ 34.075742] [<ffffffff8120336c>] SyS_ioctl+0x3c/0x70 [ 34.075760] [<ffffffff8181602e>] entry_SYSCALL_64_fastpath+0x1c/0xb1 [ 34.075781] Code: 44 a0 48 c7 c2 9a 7e 43 a0 be e0 0d 00 00 48 c7 c7 a0 45 44 a0 e8 55 b8 ce e0 48 85 db 74 a3 49 83 bd f8 03 00 00 00 74 99 0f 0b <0f> 0b 48 89 da 4c 89 ee 4c 89 e7 e8 04 a9 ff ff 48 89 da 49 89 [ 34.075955] RIP [<ffffffffa0392cbc>] i915_gem_obj_lookup_or_create_vma+0x8c/0xc0 [i915] [ 34.075994] RSP <ffffc90000527b68> Testcase: igt/gem_close_race/basic-threads Fixes: db6c2b41 ("drm/i915: Store the vma in an rbtree...") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161104161241.25871-1-chris@chris-wilson.co.ukReviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
-
- 04 11月, 2016 2 次提交
-
-
由 Chris Wilson 提交于
commit bc0629a7 ("drm/i915: Track pages pinned due to swizzling quirk") fixed one problem, but revealed a whole lot more. The root cause of the pin count mismatch for the swizzle quirk (for L-shaped memory on gen3/4) was that we were incrementing the pages_pin_count upon getting the backing pages but then overwriting the pages_pin_count to set it to 1 afterwards. With a little bit of adjustment to satisfy the GEM_BUG_ON sanitychecks, the fix is to replace the explicit atomic_set with an atomic_inc. v2: Consistently use atomics (not mix atomics and helpers) within the lowlevel get_pages routines. This makes the atomic operations much clearer. Fixes: 1233e2db ("drm/i915: Move object backing storage manipulation") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161104103001.27643-1-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
由 Chris Wilson 提交于
When supplying a view to vma_compare() it is required that the supplied i915_address_space is the global GTT. I tested the VMA instead (which is the current position in the rbtree and maybe from any address space). Reported-by: NMatthew Auld <matthew.auld@intel.com> Tested-by: NMatthew Auld <matthew.auld@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98579 Fixes: db6c2b41 ("drm/i915: Store the vma in an rbtree...") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161103200852.23431-1-chris@chris-wilson.co.ukReviewed-by: NMatthew Auld <matthew.auld@intel.com>
-
- 02 11月, 2016 3 次提交
-
-
由 Joonas Lahtinen 提交于
$ sed -i -r 's/\bglobal_list\b/global_link/g' *.c *.h Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1478081764-8058-1-git-send-email-joonas.lahtinen@linux.intel.com
-
由 Mika Kuoppala 提交于
Now when clearing ptes can modify upper level pdp's, we need to mark them dirty so that they will be flushed correctly. Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NMika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1478006856-8313-1-git-send-email-mika.kuoppala@intel.com
-
由 Mika Kuoppala 提交于
Comparing pte index to a number of entries is wrong when clearing a range of pte entries. Use end marker of 'one past' to correctly point adequate number of ptes to the scratch page. v2: assert early instead of warning late (Chris) v3: removed consts (Joonas) Fixes: d209b9c3 ("drm/i915/gtt: Split gen8_ppgtt_clear_pte_range") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98282 Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Reported-by: NMike Lothian <mike@fireburn.co.uk> Signed-off-by: NMika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Tested-by: NMike Lothian <mike@fireburn.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: NMika Kuoppala <mika.kuoppala@intel.com>
-
- 01 11月, 2016 1 次提交
-
-
由 Chris Wilson 提交于
With full-ppgtt one of the main bottlenecks is the lookup of the VMA underneath the object. For execbuf there is merit in having a very fast direct lookup of ctx:handle to the vma using a hashtree, but that still leaves a large number of other lookups. One way to speed up the lookup would be to use a rhashtable, but that requires extra allocations and may exhibit poor worse case behaviour. An alternative is to use an embedded rbtree, i.e. no extra allocations and deterministic behaviour, but at the slight cost of O(lgN) lookups (instead of O(1) for rhashtable). The major of such tree will be very shallow and so not much slower, and still scales much, much better than the current unsorted list. v2: Bump vma_compare() to return a long, as we return the result of comparing two pointers. References: https://bugs.freedesktop.org/show_bug.cgi?id=87726Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161101115400.15647-1-chris@chris-wilson.co.uk
-
- 29 10月, 2016 7 次提交
-
-
由 Chris Wilson 提交于
With the infrastructure converted over to tracking multiple timelines in the GEM API whilst preserving the efficiency of using a single execution timeline internally, we can now assign a separate timeline to every context with full-ppgtt. v2: Add a comment to indicate the xfer between timelines upon submission. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-35-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
In preparation to support many distinct timelines, we need to expand the activity tracking on the GEM object to handle more than just a request per engine. We already use the struct reservation_object on the dma-buf to handle many fence contexts, so integrating that into the GEM object itself is the preferred solution. (For example, we can now share the same reservation_object between every consumer/producer using this buffer and skip the manual import/export via dma-buf.) v2: Reimplement busy-ioctl (by walking the reservation object), postpone the ABI change for another day. Similarly use the reservation object to find the last_write request (if active and from i915) for choosing display CS flips. Caveats: * busy-ioctl: busy-ioctl only reports on the native fences, it will not warn of stalls (in set-domain-ioctl, pread/pwrite etc) if the object is being rendered to by external fences. It also will not report the same busy state as wait-ioctl (or polling on the dma-buf) in the same circumstances. On the plus side, it does retain reporting of which *i915* engines are engaged with this object. * non-blocking atomic modesets take a step backwards as the wait for render completion blocks the ioctl. This is fixed in a subsequent patch to use a fence instead for awaiting on the rendering, see "drm/i915: Restore nonblocking awaits for modesetting" * dynamic array manipulation for shared-fences in reservation is slower than the previous lockless static assignment (e.g. gem_exec_lut_handle runtime on ivb goes from 42s to 66s), mainly due to atomic operations (maintaining the fence refcounts). * loss of object-level retirement callbacks, emulated by VMA retirement tracking. * minor loss of object-level last activity information from debugfs, could be replaced with per-vma information if desired Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-21-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
The plan is to move obj->pages out from under the struct_mutex into its own per-object lock. We need to prune any assumption of the struct_mutex from the get_pages/put_pages backends, and to make it easier we pass around the sg_table to operate on rather than indirectly via the obj. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-13-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
The plan is to make obtaining the backing storage for the object avoid struct_mutex (i.e. use its own locking). The first step is to update the API so that normal users only call pin/unpin whilst working on the backing storage. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-12-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
We can use the radixtree index of the obj->pages to find the start position of the desired partial range. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-11-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Add lockdep_assert_held(struct_mutex) to the API preamble of the internal GEM interfaces. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-9-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
We only need the active reference to keep the object alive after the handle has been deleted (so as to prevent a synchronous gem_close). Why then pay the price of a kref on every execbuf when we can insert that final active ref just in time for the handle deletion? Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-6-chris@chris-wilson.co.uk
-
- 24 10月, 2016 2 次提交
-
-
由 Chris Wilson 提交于
We only used the RPM sequence checking inside the lowlevel GTT accessors, when we had to rely on callers taking the wakeref on our behalf. Now that we take the RPM wakeref inside the GTT management routines themselves, we can forgo the sanitycheck of the callers. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Imre Deak <imre.deak@intel.com> Reviewed-by: NImre Deak <imre.deak@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161024124218.18252-4-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
We can remove the false coupling between RPM and struct mutex by the observation that we can use the RPM wakeref as the barrier around user mmap access. That is as we tear down the user's PTE atomically from within rpm suspend and then to fault in new PTE requires the rpm wakeref, means that no user access is possible through those PTE without RPM being awake. Having made that observation, we can then remove the presumption of having to take rpm outside of struct_mutex and so allow fine grained acquisition of a wakeref around hw access rather than having to remember to acquire the wakeref early on. v2: Rejig placement of the new intel_runtime_pm_get() to be as tight as possible around the GTT pread/pwrite. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Imre Deak <imre.deak@intel.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: NDaniel Vetter <daniel@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20161024124218.18252-2-chris@chris-wilson.co.uk
-
- 14 10月, 2016 10 次提交
-
-
由 Michał Winiarski 提交于
Since "Dynamic page table allocations" were introduced, our page tables can grow (being dynamically allocated) with address space range usage. Unfortunately, their lifetime is bound to vm. This is not a huge problem when we're not using softpin - drm_mm is creating an upper bound on used range by causing addresses for our VMAs to eventually be reused. With softpin, long lived contexts can drain the system out of memory even with a single "small" object. For example: bo = bo_alloc(size); while(true) offset += size; exec(bo, offset); Will cause us to create new allocations until all memory in the system is used for tracking GPU pages (even though almost all PTEs in this vm are pointing to scratch). Let's free unused page tables in clear_range to prevent this - if no entries are used, we can safely free it and return this information to the caller (so that higher-level entry is pointing to scratch). v2: Document return value and free semantics (Joonas) v3: No newlines in vars block (Joonas) v4: Drop redundant local 'reduce' variable v5: Handle CI fail with enable_ppgtt=2 Cc: Michel Thierry <michel.thierry@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: NMichał Winiarski <michal.winiarski@intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1476360162-24062-3-git-send-email-michal.winiarski@intel.com
-
由 Michał Winiarski 提交于
Let's use more top-down approach, where each gen8_ppgtt_clear_* function is responsible for clearing the struct passed as an argument and calling relevant clear_range functions on lower-level tables. Doing this rather than operating on PTE ranges makes the implementation of shrinking page tables quite simple. v2: Drop min when calculating num_entries, no negation in 48b ppgtt check, no newlines in vars block (Joonas) Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: NMichał Winiarski <michal.winiarski@intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1476360162-24062-2-git-send-email-michal.winiarski@intel.com
-
由 Michał Winiarski 提交于
We never used any invalid ptes, those were put in place for a possibility of doing gpu faults. However our batchbuffers are not restricted in length, so everything needs to be pointing to something and thus out-of-bounds is pointing to scratch. Remove the valid flag as it is always true. v2: Expand commit msg, patch reorder (Mika) Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michel Thierry <michel.thierry@intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: NMichał Winiarski <michal.winiarski@intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1476360162-24062-1-git-send-email-michal.winiarski@intel.com
-
由 Tvrtko Ursulin 提交于
Saves 1416 bytes of .rodata strings. v2: Add parantheses around dev_priv. (Ville Syrjala) Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NDavid Weinehall <david.weinehall@linux.intel.com> Acked-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Acked-by: NJani Nikula <jani.nikula@linux.intel.com> Acked-by: NChris Wilson <chris@chris-wilson.co.uk> Acked-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1476352990-2504-1-git-send-email-tvrtko.ursulin@linux.intel.com
-
由 Tvrtko Ursulin 提交于
Saves 864 bytes of .rodata strings and ~100 of .text. v2: Add parantheses around dev_priv. (Ville Syrjala) v3: Rebase. Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NDavid Weinehall <david.weinehall@linux.intel.com> Acked-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Acked-by: NJani Nikula <jani.nikula@linux.intel.com> Acked-by: NChris Wilson <chris@chris-wilson.co.uk> Acked-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com>
-
由 Tvrtko Ursulin 提交于
Saves 1392 bytes of .rodata strings. Also change a few function/macro prototypes in i915_gem_gtt.c from dev to dev_priv where it made more sense to do so. v2: Add parantheses around dev_priv. (Ville Syrjala) v3: Mention function prototype changes. (David Weinehall) Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: David Weinehall <david.weinehall@linux.intel.com> Acked-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Acked-by: NJani Nikula <jani.nikula@linux.intel.com> Acked-by: NChris Wilson <chris@chris-wilson.co.uk> Acked-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: NDavid Weinehall <david.weinehall@linux.intel.com>
-
由 Tvrtko Ursulin 提交于
Saves 1016 bytes of .rodata strings and couple hundred of .text. v2: Add parantheses around dev_priv. (Ville Syrjala) Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NDavid Weinehall <david.weinehall@linux.intel.com> Acked-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Acked-by: NJani Nikula <jani.nikula@linux.intel.com> Acked-by: NChris Wilson <chris@chris-wilson.co.uk> Acked-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com>
-
由 Tvrtko Ursulin 提交于
Saves 2432 bytes of .rodata strings. v2: Add parantheses around dev_priv. (Ville Syrjala) Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NDavid Weinehall <david.weinehall@linux.intel.com> Acked-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Acked-by: NJani Nikula <jani.nikula@linux.intel.com> Acked-by: NChris Wilson <chris@chris-wilson.co.uk> Acked-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com>
-
由 Tvrtko Ursulin 提交于
Saves 1808 bytes of .rodata strings. v2: Add parantheses around dev_priv. (Ville Syrjala) Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NDavid Weinehall <david.weinehall@linux.intel.com> Acked-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Acked-by: NJani Nikula <jani.nikula@linux.intel.com> Acked-by: NChris Wilson <chris@chris-wilson.co.uk> Acked-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com>
-
由 Akash Goel 提交于
With the possibility of addition of many more number of rings in future, the drm_i915_private structure could bloat as an array, of type intel_engine_cs, is embedded inside it. struct intel_engine_cs engine[I915_NUM_ENGINES]; Though this is still fine as generally there is only a single instance of drm_i915_private structure used, but not all of the possible rings would be enabled or active on most of the platforms. Some memory can be saved by allocating intel_engine_cs structure only for the enabled/active engines. Currently the engine/ring ID is kept static and dev_priv->engine[] is simply indexed using the enums defined in intel_engine_id. To save memory and continue using the static engine/ring IDs, 'engine' is defined as an array of pointers. struct intel_engine_cs *engine[I915_NUM_ENGINES]; dev_priv->engine[engine_ID] will be NULL for disabled engine instances. There is a text size reduction of 928 bytes, from 1028200 to 1027272, for i915.o file (but for i915.ko file text size remain same as 1193131 bytes). v2: - Remove the engine iterator field added in drm_i915_private structure, instead pass a local iterator variable to the for_each_engine** macros. (Chris) - Do away with intel_engine_initialized() and instead directly use the NULL pointer check on engine pointer. (Chris) v3: - Remove for_each_engine_id() macro, as the updated macro for_each_engine() can be used in place of it. (Chris) - Protect the access to Render engine Fault register with a NULL check, as engine specific init is done later in Driver load sequence. v4: - Use !!dev_priv->engine[VCS] style for the engine check in getparam. (Chris) - Kill the superfluous init_engine_lists(). v5: - Cleanup the intel_engines_init() & intel_engines_setup(), with respect to allocation of intel_engine_cs structure. (Chris) v6: - Rebase. v7: - Optimize the for_each_engine_masked() macro. (Chris) - Change the type of 'iter' local variable to enum intel_engine_id. (Chris) - Rebase. v8: Rebase. v9: Rebase. v10: - For index calculation use engine ID instead of pointer based arithmetic in intel_engine_sync_index() as engine pointers are not contiguous now (Chris) - For appropriateness, rename local enum variable 'iter' to 'id'. (Joonas) - Use for_each_engine macro for cleanup in intel_engines_init() and remove check for NULL engine pointer in cleanup() routines. (Joonas) v11: Rebase. Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NAkash Goel <akash.goel@intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1476378888-7372-1-git-send-email-akash.goel@intel.com
-
- 12 10月, 2016 1 次提交
-
-
由 Chris Wilson 提交于
Since the GTT provides universal access to any GPU page, we can use it to reduce our plethora of read methods to just one. It also has the important characteristic of being exactly what the GPU sees - if there are incoherency problems, seeing the batch as executed (rather than as trapped inside the cpu cache) is important. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161012090522.367-4-chris@chris-wilson.co.uk
-
- 12 9月, 2016 1 次提交
-
-
由 Matthew Auld 提交于
drm already provides fallback versions of readq and writeq. Signed-off-by: NMatthew Auld <matthew.auld@intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1473451373-9852-1-git-send-email-matthew.auld@intel.com
-
- 10 9月, 2016 1 次提交
-
-
由 Chris Wilson 提交于
Recently I have been applying an optimisation to avoid stalling and clflushing GGTT objects based on their current binding. That is we only set-to-gtt-domain upon first bind. However, on hibernation the objects remain bound, but they are in the CPU domain. Currently (since commit 975f7ff4 ("drm/i915: Lazily migrate the objects after hibernation")) we only flush scanout objects as all other objects are expected to be flushed prior to use. That breaks down in the face of the runtime optimisation above - and we need to flush all GGTT pinned objects (essentially ringbuffers). To reduce the burden of extra clflushes, we only flush those objects we cannot discard from the GGTT. Everything pinned to the scanout, or current contexts or ringbuffers will be flushed and rebound. Other objects, such as inactive contexts, will be left unbound and in the CPU domain until first use after resuming. Fixes: 7abc98fa ("drm/i915: Only change the context object's domain...") Fixes: 57e88531 ("drm/i915: Use VMA for ringbuffer tracking") References: https://bugs.freedesktop.org/show_bug.cgi?id=94722Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: David Weinehall <david.weinehall@intel.com> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160909201957.2499-1-chris@chris-wilson.co.uk
-
- 09 9月, 2016 2 次提交
-
-
由 Chris Wilson 提交于
In the next patch we want to handle reset directly by a locked waiter in order to avoid issues with returning before the reset is handled. To handle the reset, we must first know whether we hold the struct_mutex. If we do not hold the struct_mtuex we can not perform the reset, but we do not block the reset worker either (and so we can just continue to wait for request completion) - otherwise we must relinquish the mutex. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160909131201.16673-10-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
We need finer control over wakeup behaviour during i915_wait_request(), so expand the current bool interruptible to a bitmask. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160909131201.16673-9-chris@chris-wilson.co.uk
-
- 06 9月, 2016 2 次提交
-
-
由 Zhi Wang 提交于
Disable 48bit full PPGTT on vGPU too for now. Signed-off-by: NZhi Wang <zhi.a.wang@intel.com> Signed-off-by: NZhenyu Wang <zhenyuw@linux.intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: drm-intel-fixes@lists.freedesktop.org Signed-off-by: NJani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160906040412.1274-3-zhenyuw@linux.intel.com (cherry picked from commit e320d400) Signed-off-by: NJani Nikula <jani.nikula@intel.com>
-
由 Zhi Wang 提交于
Disable 48bit full PPGTT on vGPU too for now. Signed-off-by: NZhi Wang <zhi.a.wang@intel.com> Signed-off-by: NZhenyu Wang <zhenyuw@linux.intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: drm-intel-fixes@lists.freedesktop.org Signed-off-by: NJani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160906040412.1274-3-zhenyuw@linux.intel.com
-
- 22 8月, 2016 1 次提交
-
-
由 Chris Wilson 提交于
As we never need to directly access the pages we allocate for scratch and the pagetables, and always remap them into the GTT through the dma remapper, we do not need to limit the allocations to lowmem i.e. we can pass in the __GFP_HIGHMEM flag to the page allocation. For backwards compatibility, e.g. certain old GPUs not liking highmem for certain functions that may be accidentally mapped to the scratch page by userspace, keep the GMCH probe as only allocating from DMA32. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20160822074431.26872-3-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-