- 10 April 2015, 8 commits
-
-
Committed by Chris Wilson

Requests are allocated even more frequently than objects and benefit equally from having a dedicated slab.

v2: Rebase

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
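To make the pattern concrete, here is a minimal kernel-style sketch of a dedicated slab for request objects. The struct and function names (sketch_request etc.) are hypothetical, not the actual i915 symbols; only the kmem_cache_* calls are the real slab API.

```c
#include <linux/errno.h>
#include <linux/list.h>
#include <linux/slab.h>

/* Hypothetical request type; only the kmem_cache_* calls are real API. */
struct sketch_request {
	unsigned int seqno;
	struct list_head link;
};

static struct kmem_cache *sketch_request_cache;

static int sketch_requests_init(void)
{
	/* A slab sized exactly for the hot request struct is cheaper than
	 * generic kmalloc and keeps the allocations densely packed. */
	sketch_request_cache = kmem_cache_create("sketch_requests",
						 sizeof(struct sketch_request),
						 0, SLAB_HWCACHE_ALIGN, NULL);
	return sketch_request_cache ? 0 : -ENOMEM;
}

static struct sketch_request *sketch_request_alloc(void)
{
	return kmem_cache_zalloc(sketch_request_cache, GFP_KERNEL);
}

static void sketch_request_free(struct sketch_request *rq)
{
	kmem_cache_free(sketch_request_cache, rq);
}
```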
-
Committed by Chris Wilson

When we submit a request to the GPU, we first take the rpm wakelock, and only release it once the GPU has been idle for a small period of time after all requests have completed. This means we can be sure that no new interrupt can arrive whilst we do not hold the rpm wakelock, and so we can drop the individual get/put around every single request inside execlists.

Note: to close one potential issue we should mark the GPU as busy earlier in __i915_add_request.

To elaborate: the issue is that we emit the irq signalling sequence before we grab the rpm reference, which means we could miss the resulting interrupt (since that's not set up when suspended). The only bad side effect is a missed interrupt; gt mmio writes automatically wake up the hw itself. But otoh we have an umbrella rpm reference for the entirety of execbuf, and as long as that's there we're covered.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Explain a bit more about the add_request issue, which after some irc chatting with Chris turns out to not be an issue really.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Chris Wilson

Now with the trimmed memcpy before the command parser, we try to allocate many different sizes of batches, predominantly one or two pages. We can therefore speed up the search for a good-sized batch by keeping the objects in buckets of roughly the same size.

v2: Add a comment about bucket sizes

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
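A rough userspace model of the bucketing idea: sizes are folded into a small number of power-of-two page-count buckets, so a search only walks objects of roughly the right size. The bucket count and rounding policy here are assumptions for illustration, not the driver's values.

```c
#include <stdio.h>

#define PAGE_SHIFT 12
#define NUM_BUCKETS 4 /* assumed, not the i915 value */

/* Fold an allocation size into a power-of-two page-count bucket. */
static unsigned int size_to_bucket(unsigned long size)
{
	unsigned long pages = (size + (1UL << PAGE_SHIFT) - 1) >> PAGE_SHIFT;
	unsigned int bucket = 0;

	while (pages > 1 && bucket < NUM_BUCKETS - 1) {
		pages >>= 1;
		bucket++;
	}
	return bucket;
}

int main(void)
{
	unsigned long sizes[] = { 4096, 8192, 16384, 1UL << 20 };

	for (unsigned int i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++)
		printf("size %lu -> bucket %u\n", sizes[i],
		       size_to_bucket(sizes[i]));
	return 0;
}
```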
-
Committed by Chris Wilson

At runtime, this helps ensure that the batch pools are kept trim and fast. At suspend, this releases memory that we do not need to restore. It also ties into the oom-notifier to ensure that we recover as much kernel memory as possible during OOM.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Chris Wilson

I woke up one morning and found 50k objects sitting in the batch pool and every search seemed to iterate the entire list... Painting the screen in oils would provide a more fluid display.

One issue with the current design is that we only check for retirements on the current ring when preparing to submit a new batch. This means that we can have thousands of "active" batches on another ring that we have to walk over. The simplest way to avoid that is to split the pools per ring; our LRU execution ordering will then also ensure that the inactive buffers remain at the front.

v2: execlists still requires duplicate code.
v3: execlists requires more duplicate code

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Chris Wilson

This reverts

    commit ec5cc0f9
    Author: Chris Wilson <chris@chris-wilson.co.uk>
    Date:   Thu Jun 12 10:28:55 2014 +0100

        drm/i915: Restrict GPU boost to the RCS engine

The premise that media/blitter workloads are not affected by boosting is patently false with a trip through igt. The question that remains is what exactly is going wrong with the media workload that prompted this? Hopefully that would be fixed by the missing aggressive downclocking, in addition to the extra restrictions imposed on how frequently a process is allowed to boost.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Deepak S <deepak.s@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Chris Wilson

With boosting for missed pageflips, we have a much stronger indication of when we need to (temporarily) boost GPU frequency to ensure smooth delivery of frames. So now only allow each client to perform one RPS boost in each period of GPU activity due to stalling on results.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Deepak S <deepak.s@linux.intel.com>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
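The policy reads like a simple per-client latch, modelled below in plain C. All names are hypothetical, not the driver's.

```c
#include <stdbool.h>
#include <stdio.h>

/* One boost per period of GPU activity, per client. */
struct client {
	bool boosted_this_period; /* set on first boost, cleared on idle */
};

static int boosts;

static void client_wait_stalled(struct client *c)
{
	/* Only the first stall in a busy period raises the frequency;
	 * later stalls from the same client are ignored. */
	if (!c->boosted_this_period) {
		c->boosted_this_period = true;
		boosts++;
	}
}

static void gpu_idle(struct client *c)
{
	c->boosted_this_period = false; /* new period, boost allowed again */
}

int main(void)
{
	struct client c = { false };

	client_wait_stalled(&c);
	client_wait_stalled(&c); /* suppressed */
	gpu_idle(&c);
	client_wait_stalled(&c); /* allowed again */
	printf("boosts granted: %d\n", boosts); /* prints 2 */
	return 0;
}
```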
-
Committed by Chris Wilson

The biggest user of i915_gem_object_get_page() is the relocation processing during execbuffer. Typically userspace passes in a set of relocations in sorted order. Sadly, we alternate between relocations increasing from the start of the buffers, and relocations decreasing from the end. However, the majority of consecutive lookups will still be in the same page. We could cache the start of the last sg chain, but for most callers the entire sgl is inside a single chain, and so we see no improvement from the extra layer of caching.

v2: Avoid the double increment inside unlikely()

References: https://bugs.freedesktop.org/show_bug.cgi?id=88308
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
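The caching idea can be modelled in userspace as remembering where the last search ended, so the mostly-sequential lookups hit a fast path. The real code walks struct scatterlist; this sketch substitutes a plain array of page extents.

```c
#include <stddef.h>
#include <stdio.h>

struct extent { size_t first_page, npages; };

struct cached_iter { size_t idx; /* extent index of the last hit */ };

static const struct extent *lookup_page(struct cached_iter *it,
					const struct extent *tbl, size_t n,
					size_t page)
{
	const struct extent *e = &tbl[it->idx];

	/* Fast path: consecutive lookups usually land in the same extent. */
	if (page >= e->first_page && page < e->first_page + e->npages)
		return e;

	/* Slow path: rescan from the start, then refresh the cache. */
	for (size_t i = 0; i < n; i++) {
		e = &tbl[i];
		if (page >= e->first_page && page < e->first_page + e->npages) {
			it->idx = i;
			return e;
		}
	}
	return NULL;
}

int main(void)
{
	const struct extent tbl[] = { { 0, 4 }, { 4, 2 }, { 6, 8 } };
	struct cached_iter it = { 0 };

	printf("%zu\n", lookup_page(&it, tbl, 3, 5)->first_page); /* 4 */
	printf("%zu\n", lookup_page(&it, tbl, 3, 4)->first_page); /* cached: 4 */
	return 0;
}
```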
-
- 01 April 2015, 2 commits
-
-
Committed by John Harrison

The request allocation code is largely duplicated between legacy mode and execlist mode. The actual difference between the two versions of the code is pretty minimal. This patch moves the common code out into a separate function, which is then called by the execution-specific version prior to setting up the one value that differs.

For: VIZ-5190
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by John Harrison

The submission portion of the execbuffer code path was abstracted behind a function pointer indirection as part of the legacy vs execlist work. The two implementation functions are called 'i915_gem_ringbuffer_submission' and 'intel_execlists_submission', but the pointer was called 'do_execbuf'. There is already an 'i915_gem_do_execbuffer' function (which is what calls the pointer indirection). The name of the pointer is therefore considered backwards and should be changed. This patch renames it to 'execbuf_submit', which is hopefully a bit clearer.

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 30 March 2015, 1 commit
-
-
Committed by Chris Wilson

We were missing a convenience stub to acquire the right mutex whilst dropping the request, so add it.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
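A kernel-style sketch of such a stub, modelled on the i915 naming convention (a "__unlocked" suffix) with hypothetical types. Note the lock pointer is read before the unreference, since dropping the last reference may free the request.

```c
#include <linux/kref.h>
#include <linux/mutex.h>
#include <linux/slab.h>

/* Hypothetical request type for illustration. */
struct sketch_request {
	struct kref ref;
	struct mutex *lock; /* lock guarding the request lists */
};

static void sketch_request_release(struct kref *ref)
{
	kfree(container_of(ref, struct sketch_request, ref));
}

static void sketch_request_unreference(struct sketch_request *rq)
{
	kref_put(&rq->ref, sketch_request_release);
}

/* The convenience stub: grab the lock, drop the reference, release it.
 * The lock pointer is saved first because the unref may free rq. */
static void sketch_request_unreference__unlocked(struct sketch_request *rq)
{
	struct mutex *lock = rq->lock;

	mutex_lock(lock);
	sketch_request_unreference(rq);
	mutex_unlock(lock);
}
```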
-
- 27 March 2015, 2 commits
-
-
Committed by Joonas Lahtinen

To allow for views that are not fully described by the view type alone (as the stereo or 90-degree rotated views are), change the semantics to require the whole view structure for comparison when we match a GGTT view. This allows parameters such as an offset to be included in the view, which is useful for e.g. partial views.

v3:
- Rely on ggtt_view type being 0 for non-GGTT vmas, which equals I915_GGTT_VIEW_NORMAL. (Daniel Vetter)
- Do not use a potentially slower comparison when we only want to know if something is or is not a normal view.
- Rebase on top of rotated view patches. Add rotated view singleton.
- If one view is missing in the comparison, they're equal only if both are missing.

v4:
- Use the comparison helper in obj_to_ggtt_view too. (Tvrtko Ursulin)
- Do WARN_ON if one view is NULL. (Tvrtko Ursulin)

Cc: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
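A sketch of what whole-structure matching might look like, with assumed (not the actual i915) types and fields: comparing field-by-field distinguishes parameterised views while sidestepping memcmp over struct padding.

```c
#include <stdbool.h>
#include <stddef.h>

enum view_type { VIEW_NORMAL = 0, VIEW_ROTATED, VIEW_PARTIAL };

struct ggtt_view_sketch {
	enum view_type type;
	unsigned long offset, size; /* only meaningful for VIEW_PARTIAL */
};

static bool view_equal(const struct ggtt_view_sketch *a,
		       const struct ggtt_view_sketch *b)
{
	if (!a || !b)
		return a == b; /* equal only if both are missing */
	if (a->type != b->type)
		return false;
	/* Parameterised views must also match on their parameters. */
	if (a->type == VIEW_PARTIAL)
		return a->offset == b->offset && a->size == b->size;
	return true;
}
```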
-
Committed by Michel Thierry

Traces for page directory and table allocation and map.

v2: Removed references to teardown.
v3: bitmap_scnprintf has been deprecated.
v4: Replace bitmap_scnprintf with scnprintf correctly, and get the right range lengths. (Mika)

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 26 March 2015, 1 commit
-
-
Committed by Chris Wilson

If we retire requests last, we may use a later seqno and so clear the request lists without clearing the active list, leading to confusion. Hence we should retire requests first for consistency with the early return. The order used to be important, as the lifecycle of an object on the active list was determined by request->seqno. However, the requests themselves are now reference counted, removing the constraint from the order of retirement.

Fixes a regression from

    commit 1b5a433a
    Author: John Harrison <John.C.Harrison@Intel.com>
    Date:   Mon Nov 24 18:49:42 2014 +0000

        drm/i915: Convert 'i915_seqno_passed' calls into 'i915_gem_request_completed'

and a

    WARNING: CPU: 0 PID: 1383 at drivers/gpu/drm/i915/i915_gem_evict.c:279 i915_gem_evict_vm+0x10c/0x140()
    WARN_ON(!list_empty(&vm->active_list))

Identified by updating WATCH_LISTS:

    [drm:i915_verify_lists] *ERROR* blitter ring: active list not empty, but no requests
    WARNING: CPU: 0 PID: 681 at drivers/gpu/drm/i915/i915_gem.c:2751 i915_gem_retire_requests_ring+0x149/0x230()
    WARN_ON(i915_verify_lists(ring->dev))

Note that this is only a problem in evict_vm, where the following happens after a retire_request has cleaned out all requests but not all active bo:
- intel_ring_idle, called from i915_gpu_idle, notices that no requests are outstanding and immediately returns.
- i915_gem_retire_requests_ring, called from i915_gem_retire_requests, also immediately returns when there's no request, still leaving the bo on the active list.
- evict_vm hits the WARN_ON(!list_empty(&vm->active_list)) after evicting all active objects, as there's still stuff left that shouldn't be there.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
-
- 23 March 2015, 2 commits
-
-
Committed by Tvrtko Ursulin

90/270 degree rotated scanout needs a rotated GTT view of the framebuffer. This is put in a separate VMA with a dedicated ggtt view, and wired up so that it is created when a framebuffer is pinned to a 90/270 rotated plane. Rotation is only possible with Yb/Yf buffers, and an error is propagated to user space in case of a mismatch. The special rotated page view is constructed at VMA creation time by borrowing the DMA addresses from obj->pages.

v2:
* Do not bother with pages for the rotated sg list, just populate the DMA addresses. (Daniel Vetter)
* Checkpatch cleanup.

v3:
* Rebased on top of new plane handling (create the rotated mapping when setting the rotation property).
* Unpin the rotated VMA on unpinning from the display plane.
* Simplify the rotation check using bitwise AND. (Chris Wilson)

v4:
* Fix unpinning of the optional rotated mapping so it is really considered to be optional.

v5:
* Rebased for fb modifier changes.
* Rebased for atomic commit.
* Only pin the needed view for display. (Ville Syrjälä, Daniel Vetter)

v6:
* Rebased after preparatory work has been extracted out. (Daniel Vetter)

v7:
* Slightly simplified tiling geometry calculation.
* Moved the rotated GGTT view implementation into i915_gem_gtt.c. (Daniel Vetter)

v8:
* Do not use i915_gem_obj_size to get the object size, since that actually returns the size of a VMA which may not exist.
* Rebased for ggtt view changes.

v9:
* Rebased after code review changes on the preceding patches.
* Tidy function definitions. (Joonas Lahtinen)

For: VIZ-4726
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com> (v4)
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
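As a toy illustration of building a rotated mapping by borrowing addresses from the linear layout, here is a userspace model; the exact index convention is an assumption for illustration, not the driver's.

```c
#include <stdio.h>

/* Remap a width x height grid of page "addresses" into a 90-degree
 * rotated order, walking the source column by column so each output
 * row of the rotated view is contiguous. */
static void rotate_pages(const unsigned long *src, unsigned long *dst,
			 unsigned int width, unsigned int height)
{
	for (unsigned int col = 0; col < width; col++)
		for (unsigned int row = 0; row < height; row++)
			dst[col * height + row] =
				src[(height - 1 - row) * width + col];
}

int main(void)
{
	unsigned long src[6] = { 0, 1, 2, 3, 4, 5 }; /* 3x2 page grid */
	unsigned long dst[6];

	rotate_pages(src, dst, 3, 2);
	for (int i = 0; i < 6; i++)
		printf("%lu ", dst[i]); /* 3 0 4 1 5 2 */
	printf("\n");
	return 0;
}
```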
-
Committed by Tvrtko Ursulin

To support frame buffer rotation we need to be able to pass on the information about what kind of GGTT view is required for display. This patch just adds the parameter and makes all the callers default to the normal view.

v2: Rebased for ggtt view changes.
v3: Don't limit PIN_MAPPABLE to normal views just yet. (Joonas Lahtinen)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v3)
[danvet: s/BUG/WARN/ in the patch hunk, at least where the BUG_ON isn't fatal right away.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 20 March 2015, 5 commits
-
-
Committed by Ben Widawsky

This patch was formerly known as "Force pd restore when PDEs change, gen6-7". I had to change the name because it is needed for GEN8 too.

The real issue this is trying to solve is when a new object is mapped into the current address space. The GPU does not snoop the new mapping, so we must do the gen-specific action to reload the page tables.

GEN8 and GEN7 differ in the way they load page tables for the RCS: GEN8 does so with the context restore, while GEN7 requires the proper load commands in the command streamer. Non-render is similar for both.

Caveat for GEN7: the docs say you cannot change the PDEs of a currently running context. We never map new PDEs of a running context and expect them to be present, so I think this is okay. (We can unmap, but this should also be okay since we only unmap unreferenced objects that the GPU shouldn't be trying to va->pa xlate.) The MI_SET_CONTEXT command does have a flag to signal that, even if the context is the same, a reload should be forced. It's unclear exactly what this does, but I have a hunch it's the right thing to do.

The logic assumes that we always emit a context switch after mapping new PDEs and before we submit a batch. This is the case today, and has been the case since the inception of hardware contexts. A note in the comment lets the user know.

It's not just for gen8: if the current context has its mappings changed, we need a context reload to switch.

v2: Rebased after ppgtt clean up patches. Split the warning for aliasing and true ppgtt options. And do not break aliasing ppgtt, where to->ppgtt is always null.
v3: Invalidate PPGTT TLBs inside alloc_va_range.
v4: Rename ppgtt_invalidate_tlbs to mark_tlbs_dirty and move pd_dirty_rings from i915_address_space to i915_hw_ppgtt. Fixes when neither ctx->ppgtt nor aliasing_ppgtt exist.
v5: Removed references to teardown_va_range.
v6: Updated needs_pd_load_pre/post.
v7: Fix the pd_dirty_rings check in needs_pd_load_post, and update/move the comment about updated PDEs to object_pin/bind (Mika).

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
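The "reload once per ring after PDEs change" logic can be modelled with a dirty-rings bitmask, in the spirit of the mark_tlbs_dirty / pd_dirty_rings rename mentioned in v4. The sketch below is illustrative, not the driver code.

```c
#include <stdbool.h>
#include <stdio.h>

#define NUM_RINGS 5 /* illustrative */

static unsigned int pd_dirty_rings;

static void mark_tlbs_dirty(void)
{
	pd_dirty_rings = (1u << NUM_RINGS) - 1; /* all rings must reload */
}

static bool needs_pd_load(unsigned int ring)
{
	return pd_dirty_rings & (1u << ring);
}

static void submit_on_ring(unsigned int ring)
{
	if (needs_pd_load(ring)) {
		printf("ring %u: emitting PD reload before batch\n", ring);
		pd_dirty_rings &= ~(1u << ring); /* one reload suffices */
	}
}

int main(void)
{
	mark_tlbs_dirty(); /* e.g. after alloc_va_range maps new PDEs */
	submit_on_ring(0); /* reloads */
	submit_on_ring(0); /* no reload needed now */
	return 0;
}
```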
-
Committed by Ben Widawsky

Instead of implementing the full tracking + dynamic allocation, this patch does a bit less than half of the work, by tracking and warning on unexpected conditions. The tracking itself follows which PTEs within a page table are currently being used for objects. The next patch will modify this to actually allocate the page tables only when necessary.

With the current patch there isn't much in the way of making a gen-agnostic range allocation function. However, in the next patch we'll add more specificity, which makes having separate functions a bit easier to manage.

One important change introduced here is that DMA mappings are created/destroyed at the same time as the page directories/tables are allocated/deallocated.

Notice that aliasing PPGTT is not managed here. The patch which actually begins dynamic allocation/teardown explains the reasoning for this.

v2: s/pdp.page_directory/pdp.page_directories
Make a scratch page allocation helper

v3: Rebase and expand commit message.

v4: Allocate required pagetables only when needed, _bind_to_vm instead of bind_vma (Daniel).

v5: Rebased to remove the unnecessary noise in the diff, also:
- PDE mask is GEN agnostic, renamed GEN6_PDE_MASK to I915_PDE_MASK.
- Removed unnecessary checks in gen6_alloc_va_range.
- Changed map/unmap_px_single macros to use dma functions directly and be part of a static inline function instead.
- Moved drm_device plumbing through page table operations to its own patch.
- Moved allocate/teardown_va_range calls until they are fully implemented (in a subsequent patch).
- Merged pt and scratch_pt unmap_and_free paths.
- Moved the scratch page allocator helper to the patch that will use it.

v6: Reduce complexity by not tearing down pagetables dynamically; the same can be achieved while freeing empty vms. (Daniel)

v7: s/i915_dma_map_px_single/i915_dma_map_single
s/gen6_write_pdes/gen6_write_pde
Prevent a NULL case when only GGTT is available. (Mika)

v8: Rebased after s/page_tables/page_table/.

v9: Reworked i915_pte_index and i915_pte_count. Also exercise bitmap allocation here (gen6_alloc_va_range) and fix the incorrect write_page_range in i915_gem_restore_gtt_mappings (Mika).

Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v3+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
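A kernel-style sketch of per-table PTE tracking with the kernel bitmap API; the table size and names are illustrative, not the gen-specific values.

```c
#include <linux/bitmap.h>
#include <linux/bug.h>

#define PTES_PER_TABLE 512 /* illustrative; gen-specific in reality */

struct sketch_page_table {
	DECLARE_BITMAP(used_ptes, PTES_PER_TABLE);
};

/* Mark a PTE range as used, warning if any entry was already mapped. */
static void sketch_mark_ptes(struct sketch_page_table *pt,
			     unsigned int first, unsigned int count)
{
	unsigned int i;

	for (i = first; i < first + count; i++)
		WARN_ON(test_bit(i, pt->used_ptes)); /* double mapping? */

	bitmap_set(pt->used_ptes, first, count);
}

/* An all-clear bitmap means the page table itself could be freed. */
static bool sketch_table_empty(const struct sketch_page_table *pt)
{
	return bitmap_empty(pt->used_ptes, PTES_PER_TABLE);
}
```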
-
Committed by Daniel Vetter

Two code changes:
- Extract i915_gem_shrinker_init.
- Inline i915_gem_object_is_purgeable, since we open-code it everywhere else too.

This already has the benefit of pulling all the shrinker code together; the next patch adds a bit of kerneldoc.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Tvrtko Ursulin

This makes the interface consistent with the old i915_gem_obj_ggtt_pin.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Joonas Lahtinen

GGTT views are only applicable when dealing with GGTT. Change the code to reject ggtt_view where it should not be used and require it where it should be.

v2:
- Dropped _ppgtt_ infixes, allow both types to be passed
- Disregard all but the normal view when no view is specified
- More checks that valid parameters are passed
- More readable error checking

v3:
- Prefer WARN_ONCE over BUG_ON when there is a code path for failure

Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
[danvet: Drop unnecessary forward decl from earlier patch iterations.]
[danvet: Remove unused variable spotted by Tvrtko.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 18 March 2015, 2 commits
-
-
Committed by Paulo Zanoni

We need this for FBC, and possibly for PSR too.

v2: Don't only flush: invalidate too (Daniel).

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Paulo Zanoni

We want to port FBC to the frontbuffer tracking infrastructure, but for that we need to know what caused the object invalidation so we can react accordingly: CPU mmaps need manual handling, GTT mmaps and flips don't need handling, and ring rendering needs nukes.

v2:
- s/ORIGIN_RENDER/ORIGIN_CS/ (Daniel, Rodrigo)
- Fix copy/pasted wrong documentation
- Rebase

v3:
- Rebase

v4:
- Don't pass the operation to flushes (Daniel).

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
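A small sketch of dispatching on the origin: the enum names follow the ones mentioned in the message (ORIGIN_CS after the v2 rename), while the handler bodies merely paraphrase the stated policy; the function itself is hypothetical.

```c
#include <stdio.h>

enum fb_op_origin {
	ORIGIN_GTT,  /* GTT mmap writes: no handling needed */
	ORIGIN_CPU,  /* CPU mmap writes: need manual handling */
	ORIGIN_CS,   /* command-streamer rendering: needs a nuke */
	ORIGIN_FLIP, /* page flips: no handling needed */
};

static void fbc_react(enum fb_op_origin origin)
{
	switch (origin) {
	case ORIGIN_CPU:
		printf("manual FBC handling\n");
		break;
	case ORIGIN_CS:
		printf("FBC nuke\n");
		break;
	default:
		break; /* GTT mmaps and flips need nothing */
	}
}

int main(void)
{
	fbc_react(ORIGIN_CS);
	return 0;
}
```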
-
- 10 March 2015, 2 commits
-
-
Committed by Chris Wilson

Long ago I found that I was getting sporadic errors when booting SNB, with the symptom being that the first batch died with IPEHR != *ACTHD, typically caused by the TLB being invalid. These errors magically disappeared if I held forcewake during the entire ring initialisation sequence. (It could probably be shortened to a small critical section, but the whole initialisation is full of register writes, so we would be taking and releasing forcewake almost continually; holding it over the entire sequence will probably be a net win!)

Note that some of the kernels on which I encountered the issue already had the deferred forcewake release, so it is still relevant.

I know that there have been a few other reports with similar failure conditions on SNB, I think such as the bug referenced below.

References: https://bugs.freedesktop.org/show_bug.cgi?id=80913

v2: Wrap i915_gem_init_hw() with its own security blanket, as we take that path following resume and reset.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: stable@vger.kernel.org
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
-
Committed by Chris Wilson

This fixes a regression from

    commit 5ed0bdf2
    Author: Thomas Gleixner <tglx@linutronix.de>
    Date:   Wed Jul 16 21:05:06 2014 +0000

        drm: i915: Use nsec based interfaces

that made a negative timeout return immediately, rather than the previously defined behaviour of waiting indefinitely.

Testcase: igt/gem_wait
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89494
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Cc: Kristian Høgsberg <krh@bitplanet.net>
Cc: stable@vger.kernel.org
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
[Jani: fixed a checkpatch complaint about whitespace.]
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
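The restored semantics are easy to model: a negative timeout never expires. A toy version with a fake tick clock, all names illustrative:

```c
#include <stdio.h>

/* timeout < 0: wait indefinitely; timeout == 0: poll once;
 * timeout > 0: wait that many ticks. */
static int sketch_wait(long timeout, long completes_at_tick)
{
	for (long tick = 0; ; tick++) {
		if (tick >= completes_at_tick)
			return 0;                  /* request completed */
		if (timeout >= 0 && tick >= timeout)
			return -1;                 /* timed out */
	}
}

int main(void)
{
	printf("%d\n", sketch_wait(-1, 1000)); /* 0: waited indefinitely */
	printf("%d\n", sketch_wait(10, 1000)); /* -1: bounded wait expired */
	return 0;
}
```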
-
- 28 February 2015, 1 commit
-
-
Committed by Chris Wilson

For an object right on the boundary of mappable space, as the fenceable size is strictly greater than the actual size, its fence region may extend out of mappable space. Note that only pnv/g33 has fence_size > obj.size and an unmappable range in the gtt, and there the alignment constraints prevent bad things from happening.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Clarify why this shouldn't change anything, as per the discussion on intel-gfx.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 27 February 2015, 1 commit
-
-
Committed by Daniel Vetter

Hooray!

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 26 February 2015, 1 commit
-
-
Committed by John Harrison

In execlist mode, the ringbuf is a function of the ring and context, whereas in legacy mode it is derived from the ring alone. Thus the calculation required to determine the ringbuf pointer from the ring (and context) also needs to test whether execlist mode is in use. This is messy.

Further, the request structure holds a pointer to both the ring and the context for which it was created. Thus, given a request, it is possible to derive the ringbuf in either legacy or execlist mode. Hence it is necessary to pass just the request in to all the low level functions, rather than some combination of request, ring, context and ringbuf. However, rather than recalculating it each time, it is much simpler to just cache the ringbuf pointer in the request structure itself. Caching the pointer means the calculation is done once at request creation time, and all further code can simply read it directly from the request structure.

OTC-Jira: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
[danvet: Drop contentless comment in lrc alloc request entirely. And spelling fix in the commit message.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 24 February 2015, 1 commit
-
-
Committed by Nick Hoath

When converting from implicitly tracked execlist queue items to ref-counted requests, not all frees of requests were replaced with unrefs, and extraneous refs/unrefs of contexts were added. Correct the unbalanced refcount and replace the frees. Remove a noisy warning when hitting the request creation path.

drm_i915_gem_request and intel_context are both kref reference-counted structures. Upon allocation, the drm_i915_gem_request's ref count should be bumped using kref_init. When a context is assigned to the request, the context's reference count should be bumped using i915_gem_context_reference. i915_gem_request_unreference will reduce the context reference count when the request is freed.

Problem introduced in

    commit 6d3d8274
    Author: Nick Hoath <nicholas.hoath@intel.com>
    AuthorDate: Thu Jan 15 13:10:39 2015 +0000

        drm/i915: Subsume intel_ctx_submit_request in to drm_i915_gem_request

v2: Added comments explaining how the ctx pointer and the request object should be ref-counted. Removed noisy warning.

v3: Cleaned up the language used in the commit & the header description (Thanks David Gordon)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88652
Signed-off-by: Nick Hoath <nicholas.hoath@intel.com>
Reviewed-by: Thomas Daniel <thomas.daniel@intel.com>
Reviewed-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
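A minimal kernel-style sketch of the ref-counting rules the message spells out, with hypothetical types: kref_init on allocation, a context reference taken when the context is assigned, and both dropped in the release path. The kref_* calls are the real kernel API.

```c
#include <linux/kref.h>
#include <linux/slab.h>

struct sketch_ctx { struct kref ref; };

struct sketch_gem_request {
	struct kref ref;
	struct sketch_ctx *ctx;
};

static void sketch_ctx_release(struct kref *ref)
{
	kfree(container_of(ref, struct sketch_ctx, ref));
}

static void sketch_gem_request_release(struct kref *ref)
{
	struct sketch_gem_request *rq =
		container_of(ref, struct sketch_gem_request, ref);

	/* Freeing the request drops the context reference it held. */
	kref_put(&rq->ctx->ref, sketch_ctx_release);
	kfree(rq);
}

static struct sketch_gem_request *sketch_request_alloc(struct sketch_ctx *ctx)
{
	struct sketch_gem_request *rq = kzalloc(sizeof(*rq), GFP_KERNEL);

	if (!rq)
		return NULL;
	kref_init(&rq->ref); /* request starts with a single reference */
	kref_get(&ctx->ref); /* the request pins its context */
	rq->ctx = ctx;
	return rq;
}
```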
-
- 14 February 2015, 2 commits
-
-
Committed by Mika Kuoppala

We use the pid of the process which opened our device when we track who was the culprit of a gpu hang. But as that file descriptor might get inherited, we might blame the wrong process when we record the error state. Track process identifiers in requests to always find the correct offender.

v2: Track only user processes (Chris)

Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
[danvet: drop NULL check before put_pid as suggested by Chris.]
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
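A kernel-style sketch of pinning the submitter's pid in the request; the struct and function names are hypothetical, while get_pid/task_pid/put_pid are the real kernel API.

```c
#include <linux/pid.h>
#include <linux/sched.h>

/* Hypothetical request-identity struct for illustration. */
struct sketch_request_id {
	struct pid *pid; /* who actually submitted this request */
};

static void sketch_request_set_pid(struct sketch_request_id *rq)
{
	/* Pin the *current* task's pid at submission time, rather than
	 * trusting whichever process originally opened the fd. */
	rq->pid = get_pid(task_pid(current));
}

static void sketch_request_put_pid(struct sketch_request_id *rq)
{
	put_pid(rq->pid); /* put_pid() tolerates NULL */
	rq->pid = NULL;
}
```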
-
Committed by Yu Zhang

With Intel GVT-g, the fence registers are partitioned among multiple vGPU instances in different VMs. Routine i915_gem_load() is modified to reset num_fence_regs when the driver detects it's running in a VM. Accesses to the fence registers from a vGPU will be trapped and remapped by the host side, and the allocated fence number is provided in the PV INFO page structure. For now the fence number is fixed, but in the future we can relax this limitation and allocate the fence registers dynamically from the host side.

Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Eddie Dong <eddie.dong@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 29 January 2015, 2 commits
-
-
Committed by Chris Wilson

The core fix was applied in

    commit a63b03e2
    Author: Chris Wilson <chris@chris-wilson.co.uk>
    Date:   Tue Jan 6 10:29:35 2015 +0000

        mutex: Always clear owner field upon mutex_unlock()

(note the absence of the stable@ tag), so we can now revert our band-aid commit 226e5ae9 for -next.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Chris Wilson

When run as a timer, i915_hangcheck_elapsed() must adhere to all the rules of running in softirq context. This is advantageous to us, as we want to minimise the risk that a driver bug will prevent us from detecting a hung GPU. However, that is irrelevant if the driver bug prevents us from resetting and recovering. Still, it is prudent not to rely on mutexes inside the checker, but given the coarseness of dev->struct_mutex, doing so is extremely hard. Give in and run from a work queue, i.e. outside of softirq.

v2: Use own workqueue to avoid deadlocks (Daniel)
Clean up commit msg and add a comment to i915_queue_hangcheck() (Chris)

Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
[danvet: Remove accidental kerneldoc comment starter, to appease the 0-day builder.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
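A kernel-style sketch of moving the checker onto its own ordered workqueue; the names and period are illustrative, while the workqueue calls themselves are the real kernel API.

```c
#include <linux/errno.h>
#include <linux/jiffies.h>
#include <linux/workqueue.h>

#define SKETCH_HANGCHECK_PERIOD_MS 1500 /* illustrative period */

static struct workqueue_struct *sketch_hangcheck_wq;
static void sketch_hangcheck_fn(struct work_struct *work);
static DECLARE_DELAYED_WORK(sketch_hangcheck_work, sketch_hangcheck_fn);

static void sketch_queue_hangcheck(void)
{
	/* A dedicated queue avoids deadlocking against the shared one. */
	queue_delayed_work(sketch_hangcheck_wq, &sketch_hangcheck_work,
			   round_jiffies_up_relative(
				msecs_to_jiffies(SKETCH_HANGCHECK_PERIOD_MS)));
}

static void sketch_hangcheck_fn(struct work_struct *work)
{
	/* Process context: may sleep and take mutexes while inspecting
	 * the rings; re-arm with sketch_queue_hangcheck() if still busy. */
}

static int sketch_hangcheck_init(void)
{
	sketch_hangcheck_wq = alloc_ordered_workqueue("sketch-hangcheck", 0);
	return sketch_hangcheck_wq ? 0 : -ENOMEM;
}
```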
-
- 27 January 2015, 6 commits
-
-
Committed by Daniel Vetter

We can push the decision whether to force flushing down into the implementation, since in all places that matter obj->pin_display is already accurate. The only place where the optimization really matters is the sw_finish_ioctl, and that already checks for obj->pin_display on its own. I suspect that this was simply an artifact of how

    commit 2c22569b
    Author: Chris Wilson <chris@chris-wilson.co.uk>
    Date:   Fri Aug 9 12:26:45 2013 +0100

        drm/i915: Update rules for writing through the LLC with the cpu

evolved - only v2 added the pin_display tracking. Note that we still retain the gist of this logic from the above commit with the explicit force argument for the low-level clflush function.

Ville noted in his review that there's a slight behavioural change in the set_to_gtt_domain function, which now will also flush display plane data. This opens up the potential for userspace to start doing buggy things by omitting the sw_finish_ioctl, which is why I rejected a functionally equivalent patch from Ville a while ago:

http://lists.freedesktop.org/archives/intel-gfx/2013-November/036421.html

But on second consideration it's not that evil, and in any case the justification here is more clarity, not allowing crazy userspace.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Chris Wilson

Currently we are hitting the WARN inside i915_gem_object_set_cache_level(), as we can now have an unbound object in the GTT write domain (due to 43566ded "drm/i915: Broaden application of set-domain(GTT)"). To avoid the warning, we need to track when we elided the clflush on a cacheable object, and then evict the cache for the object when we move it out of a cacheable domain.

Reported-by: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Jani Nikula <jani.nikula@intel.com>
Testcase: igt/gem_mmap_wc/set-cache-level
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88607
Tested-by: huax.lu@intel.com
[danvet: Split if into nested if, as per the discussion on the m-l.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Mika Kuoppala

We pin when we submit to the execlist queue. Balance the pinning when the submitted queue is cleaned up on reset.

Cc: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Thomas Daniel <thomas.daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Nick Hoath

Move all remaining elements that were unique to execlist queue items into the associated request.

Issue: VIZ-4274

v2: Rebase. Fixed issue of overzealous freeing of request.
v3: Removed re-addition of cleanup work queue (found by Daniel Vetter)
v4: Rebase.
v5: Actual removal of intel_ctx_submit_request. Update both tail and postfix pointers in __i915_add_request (found by Thomas Daniel)
v6: Removed unrelated changes

Signed-off-by: Nick Hoath <nicholas.hoath@intel.com>
Reviewed-by: Thomas Daniel <thomas.daniel@intel.com>
[danvet: Reformat comment with strange linebreaks.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Nick Hoath

The first-pass implementation of execlists required a backpointer to the context to be held in the intel_ringbuffer. However, the context pointer is available higher in the call stack. Remove the backpointer from the ring buffer structure and instead pass it down through the call stack.

v2: Integrate this changeset with the removal of duplicate request/execlist queue item members.
v3: Rebase
v4: Rebase. Remove passing of context when the request is passed.

Signed-off-by: Nick Hoath <nicholas.hoath@intel.com>
Reviewed-by: Thomas Daniel <thomas.daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Nick Hoath

Where there were duplicate variables for the tail, context and ring (engine) in the gem request and the execlist queue item, use the one from the request and remove the duplicate from the execlist queue item.

Issue: VIZ-4274

v1: Rebase
v2: Fixed build issues. Keep separate postfix & tail pointers, as these are used in different ways. Reinserted missing full tail pointer update.

Signed-off-by: Nick Hoath <nicholas.hoath@intel.com>
Reviewed-by: Thomas Daniel <thomas.daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 26 January 2015, 1 commit
-
-
Committed by Bob Paauwe

When creating a fence for a tiled object, only fence the area that makes up the actual tiles. The object may be larger than the tiled area, and if we allow those extra addresses to be fenced, they'll get converted to addresses beyond where the object is mapped. This opens up the possibility of writes beyond the end of the object. To prevent this, we adjust the size of the fence to only encompass the area that makes up the actual tiles. The extra space is considered un-tiled and now behaves as if it were a linear object.

Testcase: igt/gem_tiled_fence_overflow
Reported-by: Dan Hettena <danh@ghs.com>
Signed-off-by: Bob Paauwe <bob.j.paauwe@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: stable@vger.kernel.org
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
-