- 23 6月, 2015 5 次提交
-
-
由 John Harrison 提交于
The alloc_request() function does not actually return the newly allocated request. Instead, it must be pulled from ring->outstanding_lazy_request. This patch fixes this so that code can create a request and start using it knowing exactly which request it actually owns. v2: Updated for new i915_gem_request_alloc() scheme. For: VIZ-5115 Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com> Reviewed-by: NTomas Elf <tomas.elf@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 John Harrison 提交于
Shrunk the parameter list of i915_gem_execbuffer_retire_commands() to a single structure as everything it requires is available in the execbuff_params object. For: VIZ-5115 Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com> Reviewed-by: NTomas Elf <tomas.elf@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 John Harrison 提交于
The do_execbuf() function takes quite a few parameters. The actual set of parameters is going to change with the conversion to passing requests around. Further, it is due to grow massively with the arrival of the GPU scheduler. This patch simplifies the prototype by passing a parameter structure instead. Changing the parameter set in the future is then simply a matter of adding/removing items to the structure. Note that the structure does not contain absolutely everything that is passed in. This is because the intention is to use this structure more extensively later in this patch series and more especially in the GPU scheduler that is coming soon. The latter requires hanging on to the structure as the final hardware submission can be delayed until long after the execbuf IOCTL has returned to user land. Thus it is unsafe to put anything in the structure that is local to the IOCTL call itself - such as the 'args' parameter. All entries must be copies of data or pointers to structures that are reference counted in some way and guaranteed to exist for the duration of the batch buffer's life. v2: Rebased to newer tree and updated for changes to the command parser. Specifically, a code shuffle has required saving the batch start address in the params structure. For: VIZ-5115 Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com> Reviewed-by: NTomas Elf <tomas.elf@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 John Harrison 提交于
Start of explicit request management in the execbuffer code path. This patch adds a call to allocate a request structure before all the actual hardware work is done. Thus guaranteeing that all that work is tagged by a known request. At present, nothing further is done with the request, the rest comes later in the series. The only noticable change is that failure to get a request (e.g. due to lack of memory) will be caught earlier in the sequence. It now occurs right at the start before any un-undoable work has been done. v2: Simplified the error handling path. For: VIZ-5115 Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com> Reviewed-by: NTomas Elf <tomas.elf@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 John Harrison 提交于
The i915_add_request() function is called to keep track of work that has been written to the ring buffer. It adds epilogue commands to track progress (seqno updates and such), moves the request structure onto the right list and other such house keeping tasks. However, the work itself has already been written to the ring and will get executed whether or not the add request call succeeds. So no matter what goes wrong, there isn't a whole lot of point in failing the call. At the moment, this is fine(ish). If the add request does bail early on and not do the housekeeping, the request will still float around in the ring->outstanding_lazy_request field and be picked up next time. It means multiple pieces of work will be tagged as the same request and driver can't actually wait for the first piece of work until something else has been submitted. But it all sort of hangs together. This patch series is all about removing the OLR and guaranteeing that each piece of work gets its own personal request. That means that there is no more 'hoovering up of forgotten requests'. If the request does not get tracked then it will be leaked. Thus the add request call _must_ not fail. The previous patch should have already ensured that it _will_ not fail by removing the potential for running out of ring space. This patch enforces the rule by actually removing the early exit paths and the return code. Note that if something does manage to fail and the epilogue commands don't get written to the ring, the driver will still hang together. The request will be added to the tracking lists. And as in the old case, any subsequent work will generate a new seqno which will suffice for marking the old one as complete. v2: Improved WARNings (Tomas Elf review request). For: VIZ-5115 Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com> Reviewed-by: NTomas Elf <tomas.elf@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 22 6月, 2015 2 次提交
-
-
由 Chris Wilson 提交于
Internal requirement for the alignment is that it must be a power-of-two, so enforce rejection at the user interface to execbuffer (which allows the caller to specify a stricter-than-expected alignment criterion). Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Rodrigo Vivi 提交于
This patch doesn't have any functional change, but organize fruntbuffer invalidate and busy by removing unecesarry signature argument for ring. It was unsed on mark_fb_busy and only used on fb_obj_invalidate for the same ORIGIN_CS usage. So let's clean it a bit Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 29 5月, 2015 1 次提交
-
-
由 David Weinehall 提交于
Export a new context parameter that can be set/queried through the context_{get,set}param ioctls. This parameter is passed as a context flag and decides whether or not a GPU address mapping is allowed to be made at address zero. The default is to allow such mappings. Signed-off-by: NDavid Weinehall <david.weinehall@intel.com> Acked-by: N"Zou, Nanhai" <nanhai.zou@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 21 5月, 2015 1 次提交
-
-
由 Chris Wilson 提交于
This trims a little overhead from the common case of not needing to synchronize between rings. v2: execlists is special and likes to duplicate code. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 08 5月, 2015 2 次提交
-
-
由 Rebecca N. Palmer 提交于
i915_parse_cmds returns -EACCES on chained batches, which "tells the caller to abort and dispatch the workload as a non-secure batch", but the mechanism implementing that was broken when flags |= I915_DISPATCH_SECURE was moved from i915_gem_execbuffer_parse to i915_gem_do_execbuffer (17cabf57): i915_gem_execbuffer_parse returns the original batch_obj in this case, and i915_gem_do_execbuffer doesn't check for that. Don't set the secure bit in this case to make sure such batches don't run with elevated priviledges. Signed-off-by: NRebecca Palmer <rebecca_palmer@zoho.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com> [danvet: Stitch together commit message. Also remove a comment as suggested by Mika. And style-align the comment while at it.] Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Daniel Vetter 提交于
i915_needs_cmd_parser already checks that for us. Suggested-by: NMika Kuoppala <mika.kuoppala@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com>
-
- 30 4月, 2015 1 次提交
-
-
由 Daniel Vetter 提交于
With the binding regression from the original full ppgtt patches fixed we can throw the switch. Yay! Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90190Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com> [Jani: tweaked commit title per Chris' suggestion] Signed-off-by: NJani Nikula <jani.nikula@intel.com>
-
- 24 4月, 2015 3 次提交
-
-
由 Daniel Vetter 提交于
Currently we have the problem that the decision whether ptes need to be (re)written is splattered all over the codebase. Move all that into i915_vma_bind. This needs a few changes: - Just reuse the PIN_* flags for i915_vma_bind and do the conversion to vma->bound in there to avoid duplicating the conversion code all over. - We need to make binding for EXECBUF (i.e. pick aliasing ppgtt if around) explicit, add PIN_USER for that. - Two callers want to update ptes, give them a PIN_UPDATE for that. Of course we still want to avoid double-binding, but that should be taken care of: - A ppgtt vma will only ever see PIN_USER, so no issue with double-binding. - A ggtt vma with aliasing ppgtt needs both types of binding, and we track that properly now. - A ggtt vma without aliasing ppgtt could be bound twice. In the lower-level ->bind_vma functions hence unconditionally set GLOBAL_BIND when writing the ggtt ptes. There's still a bit room for cleanup, but that's for follow-up patches. v2: Fixup fumbles. v3: s/PIN_EXECBUF/PIN_USER/ for clearer meaning, suggested by Chris. Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com>
-
由 Daniel Vetter 提交于
It's already protected by the bkl^Wdev->struct_mutex. While at it realign some related code. Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com>
-
由 Daniel Vetter 提交于
We load the ppgtt ptes once per gpu reset/driver load/resume and that's all that's needed. Note that this only blows up when we're using the allocate_va_range funcs and not the special-purpose ones used. With this change we can get rid of that duplication. Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com>
-
- 20 4月, 2015 1 次提交
-
-
由 Daniel Vetter 提交于
PIN_GLOBAL is set only when userspace asked for it, and that is only the case for the gen6 PIPE_CONTROL workaround. We're not allowed to just clear this. The important part of the fallback is to drop the restriction to the mappable range. This issue has been introduced in commit edf4427b Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Wed Jan 14 11:20:56 2015 +0000 drm/i915: Fallback to using CPU relocations for large batch buffers v2: Chris pointed out that we also miss to set PIN_GLOBAL when the buffer is already bound. Fix this up too. Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 10 4月, 2015 2 次提交
-
-
由 Chris Wilson 提交于
I woke up one morning and found 50k objects sitting in the batch pool and every search seemed to iterate the entire list... Painting the screen in oils would provide a more fluid display. One issue with the current design is that we only check for retirements on the current ring when preparing to submit a new batch. This means that we can have thousands of "active" batches on another ring that we have to walk over. The simplest way to avoid that is to split the pools per ring and then our LRU execution ordering will also ensure that the inactive buffers remain at the front. v2: execlists still requires duplicate code. v3: execlists requires more duplicate code Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Chris Wilson 提交于
Move the madvise logic out of the execbuffer main path into the relatively rare allocation path, making the execbuffer manipulation less fragile. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 01 4月, 2015 1 次提交
-
-
由 John Harrison 提交于
The submission portion of the execbuffer code path was abstracted into a function pointer indirection as part of the legacy vs execlist work. The two implementation functions are called 'i915_gem_ringbuffer_submission' and 'intel_execlists_submission' but the pointer was called 'do_execbuf'. There is already a 'i915_gem_do_execbuffer' function (which is what calls the pointer indirection). The name of the pointer is therefore considered to be backwards and should be changed. This patch renames it to 'execbuf_submit' which is hopefully a bit clearer. For: VIZ-5115 Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com> Reviewed-by: NTomas Elf <tomas.elf@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 30 3月, 2015 1 次提交
-
-
由 Chris Wilson 提交于
Since commit 17cabf57 Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Wed Jan 14 11:20:57 2015 +0000 drm/i915: Trim the command parser allocations we may then try to allocate a zero-sized object and attempt to extract its pages. Understandably this fails. Note that the real offender seems to be commit b9ffd80e Author: Brad Volkin <bradley.d.volkin@intel.com> Date: Thu Dec 11 12:13:10 2014 -0800 drm/i915: Use batch length instead of object size in command parser Testcase: igt/gem_exec_nop #ivb,byt,hsw Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch> [cherry picked from commit 743e78c1 from drm-intel-next because 4.0 seems to be affected by this too, despite that the obvious culprit is definitely not in 4.0. Whatever, if fixes a bug. Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com>
-
- 27 3月, 2015 1 次提交
-
-
由 Chris Wilson 提交于
Since commit 17cabf57 Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Wed Jan 14 11:20:57 2015 +0000 drm/i915: Trim the command parser allocations we may then try to allocate a zero-sized object and attempt to extract its pages. Understandably this fails. Testcase: igt/gem_exec_nop #ivb,byt,hsw Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 20 3月, 2015 2 次提交
-
-
由 Ben Widawsky 提交于
This patch was formerly known as, "Force pd restore when PDEs change, gen6-7." I had to change the name because it is needed for GEN8 too. The real issue this is trying to solve is when a new object is mapped into the current address space. The GPU does not snoop the new mapping so we must do the gen specific action to reload the page tables. GEN8 and GEN7 do differ in the way they load page tables for the RCS. GEN8 does so with the context restore, while GEN7 requires the proper load commands in the command streamer. Non-render is similar for both. Caveat for GEN7 The docs say you cannot change the PDEs of a currently running context. We never map new PDEs of a running context, and expect them to be present - so I think this is okay. (We can unmap, but this should also be okay since we only unmap unreferenced objects that the GPU shouldn't be tryingto va->pa xlate.) The MI_SET_CONTEXT command does have a flag to signal that even if the context is the same, force a reload. It's unclear exactly what this does, but I have a hunch it's the right thing to do. The logic assumes that we always emit a context switch after mapping new PDEs, and before we submit a batch. This is the case today, and has been the case since the inception of hardware contexts. A note in the comment let's the user know. It's not just for gen8. If the current context has mappings change, we need a context reload to switch v2: Rebased after ppgtt clean up patches. Split the warning for aliasing and true ppgtt options. And do not break aliasing ppgtt, where to->ppgtt is always null. v3: Invalidate PPGTT TLBs inside alloc_va_range. v4: Rename ppgtt_invalidate_tlbs to mark_tlbs_dirty and move pd_dirty_rings from i915_address_space to i915_hw_ppgtt. Fixes when neither ctx->ppgtt and aliasing_ppgtt exist. v5: Removed references to teardown_va_range. v6: Updated needs_pd_load_pre/post. v7: Fix pd_dirty_rings check in needs_pd_load_post, and update/move comment about updated PDEs to object_pin/bind (Mika). Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: NBen Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Chris Wilson 提交于
If the batch buffer is too large to fit into the aperture and we need a GTT mapping for relocations, we currently fail. This only applies to a subset of machines for a subset of environments, quite undesirable. We can simply check after failing to insert the batch into the GTT as to whether we only need a mappable binding for relocation and, if so, we can revert to using a non-mappable binding and an alternate relocation method. However, using relocate_entry_cpu() is excruciatingly slow for large buffers on non-LLC as the entire buffer requires clflushing before and after the relocation handling. Alternatively, we can implement a third relocation method that only clflushes around the relocation entry. This is still slower than updating through the GTT, so we prefer using the GTT where possible, but is orders of magnitude faster as we typically do not have to then clflush the entire buffer. An alternative idea of using a temporary WC mapping of the backing store is promising (it should be faster than using the GTT itself), but requires fairly extensive arch/x86 support - along the lines of kmap_atomic_prof_pfn() (which is not universally implemented even for x86). Testcase: igt/gem_exec_big #pnv,byt Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88392Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> [danvet: Add a WARN_ONCE for the impossible reloc case and explain in a short comment why we want to avoid ping-pong.] Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 18 3月, 2015 2 次提交
-
-
由 Paulo Zanoni 提交于
We want to port FBC to the frontbuffer tracking infrastructure, but for that we need to know what caused the object invalidation so we can react accordingly: CPU mmaps need manual, GTT mmaps and flips don't need handling and ring rendering needs nukes. v2: - s/ORIGIN_RENDER/ORIGIN_CS/ (Daniel, Rodrigo) - Fix copy/pasted wrong documentation - Rebase v3: - Rebase v4: - Don't pass the operation to flushes (Daniel). Signed-off-by: NPaulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: NRodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Yannick Guerrini 提交于
Change 'mutliple' to 'multiple' Change 'mutlipler' to 'multiplier' Change 'Haswel' to 'Haswell' Signed-off-by: NYannick Guerrini <yguerrini@tomshardware.fr> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 26 2月, 2015 1 次提交
-
-
由 John Harrison 提交于
There is a flags word that is passed through the execbuffer code path all the way from initial decoding of the user parameters down to the very final dispatch buffer call. It is simply called 'flags'. Unfortuantely, there are many other flags words floating around in the same blocks of code. Even more once the GPU scheduler arrives. This patch makes it more obvious exactly which flags word is which by renaming 'flags' to 'dispatch_flags'. Note that the bit definitions for this flags word already have an 'I915_DISPATCH_' prefix on them and so are not quite so ambiguous. OTC-Jira: VIZ-1587 Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com> [danvet: Resolve conflict with Chris' rework of the bb parsing.] Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 24 2月, 2015 1 次提交
-
-
由 Chris Wilson 提交于
Currently, the command parser tries to create a secondary batch exactly as large as the original, and vmap both. This is open to abuse by userspace using extremely large batch objects, but only executing very short batches. For example, this would be if userspace were to implement a command submission ringbuffer. However, we only need to allocate pages for just the contents of the command sequence in the batch - all relocations copied to the secondary batch will reference the original batch and so there can be no access to the secondary batch outside of the explicit execution region. Testcase: igt/gem_exec_big #ivb,byt,hsw Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88308Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJohn Harrison <John.C.Harrison@Intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 27 1月, 2015 1 次提交
-
-
由 Zhipeng Gong 提交于
On Skylake GT3 we have 2 Video Command Streamers (VCS), which is asymmetrical. For example, HEVC GPU commands can be only dispatched to VCS1 ring. But userspace has no control when using VCS1 or VCS2. This patch introduces a mechanism to avoid the default ping-pong mode and use one specific ring through execution flag. This mechanism is usable for all the platforms with 2 VCS rings. The open source usage is from these two commits in vaapi/intel: commit 702050f04131a44ef8ac16651708ce8a8d98e4b8 Author: Zhao, Yakui <yakui.zhao@intel.com> Date: Mon Nov 17 12:44:19 2014 +0800 Allow the batchbuffer to be submitted with override flag commit a56efcdf27d11ad9b21664b4a2cda72d7f90f5a8 Author: Zhao Yakui <yakui.zhao@intel.com> Date: Mon Nov 17 12:44:22 2014 +0800 Add the override flag to assure that HEVC video command always uses BSD ring0 for SKL GT3 machine v2: fix whitespace (Rodrigo) v3: remove incorrect chunk that came on -collector rebase. (Rodrigo) v4: change the comment (Zhipeng) v5: address Daniel's comment (Zhipeng) Signed-off-by: NZhipeng Gong <zhipeng.gong@intel.com> Reviewed-by: NRodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 08 1月, 2015 1 次提交
-
-
由 Tvrtko Ursulin 提交于
If not pinned VMA can become an eviction target just before it needs to be executed which breaks the internal object lifetime rules. Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87399Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 24 12月, 2014 1 次提交
-
-
由 Dave Airlie 提交于
This reverts commit 355a7018. This had some bad side effects under normal operation, and should have been dropped earlier. Signed-off-by: NDave Airlie <airlied@redhat.com>
-
- 16 12月, 2014 4 次提交
-
-
由 Brad Volkin 提交于
Move it to a separate function since the main do_execbuffer function already has so much going on. v2: - Move pin/unpin calls inside i915_parse_cmds() (Chris W, v4 7/7 feedback) Issue: VIZ-4719 Signed-off-by: NBrad Volkin <bradley.d.volkin@intel.com> Reviewed-By: NJon Bloomfield <jon.bloomfield@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Brad Volkin 提交于
By adding a new exec_entry flag, we cleanly mark the shadow objects as purgeable after they are on the active list. v2: - Move 'shadow_batch_obj->madv = I915_MADV_WILLNEED' inside _get fnc (danvet, from v4 6/7 feedback) v3: - Remove duplicate 'madv = I915_MADV_WILLNEED' (danvet, from v6 4/5) Issue: VIZ-4719 Signed-off-by: NBrad Volkin <bradley.d.volkin@intel.com> Reviewed-By: NJon Bloomfield <jon.bloomfield@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Brad Volkin 提交于
Previously we couldn't trust the user-supplied batch length because it came directly from userspace (i.e. untrusted code). It would have affected what commands software parsed without regard to what hardware would actually execute, leaving a potential hole. With the parser now copying the user supplied batch buffer and writing MI_NOP commands to any space after the copied region, we can safely use the batch length input. This should be a performance win as the actual batch length is frequently much smaller than the allocated object size. v2: Fix handling of non-zero batch_start_offset Issue: VIZ-4719 Signed-off-by: NBrad Volkin <bradley.d.volkin@intel.com> Reviewed-By: NJon Bloomfield <jon.bloomfield@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Brad Volkin 提交于
This patch sets up all of the tracking and copying necessary to use batch pools with the command parser and dispatches the copied (shadow) batch to the hardware. After this patch, the parser is in 'enabling' mode. Note that performance takes a hit from the copy in some cases and will likely need some work. At a rough pass, the memcpy appears to be the bottleneck. Without having done a deeper analysis, two ideas that come to mind are: 1) Copy sections of the batch at a time, as they are reached by parsing. Might improve cache locality. 2) Copy only up to the userspace-supplied batch length and memset the rest of the buffer. Reduces the number of reads. v2: - Remove setting the capacity of the pool - One global pool instead of per-ring pools - Replace batch_obj with shadow_batch_obj and hook into eb->vmas - Memset any space in the shadow batch beyond what gets copied - Rebased on execlist prep refactoring v3: - Rebase on chained batch handling - Squash in setting the secure dispatch flag - Add a note about the interaction w/secure dispatch pinning - Check for request->batch_obj == NULL in i915_gem_free_request v4: - Fix read domains for shadow_batch_obj - Remove the set_to_gtt_domain call from i915_parse_cmds - ggtt_pin/unpin in the parser block to simplify error handling - Check USES_FULL_PPGTT before setting DISPATCH_SECURE flag - Remove i915_gem_batch_pool_put calls v5: - Move 'pending_read_domains |= I915_GEM_DOMAIN_COMMAND' after the parser (danvet, from v4 0/7 feedback) Issue: VIZ-4719 Signed-off-by: NBrad Volkin <bradley.d.volkin@intel.com> Reviewed-By: NJon Bloomfield <jon.bloomfield@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 15 12月, 2014 1 次提交
-
-
由 Tvrtko Ursulin 提交于
Things like reliable GGTT mappings and mirrored 2d-on-3d display will need to map objects into the same address space multiple times. Added a GGTT view concept and linked it with the VMA to distinguish between multiple instances per address space. New objects and GEM functions which do not take this new view as a parameter assume the default of zero (I915_GGTT_VIEW_NORMAL) which preserves the previous behaviour. This now means that objects can have multiple VMA entries so the code which assumed there will only be one also had to be modified. Alternative GGTT views are supposed to borrow DMA addresses from obj->pages which is DMA mapped on first VMA instantiation and unmapped on the last one going away. v2: * Removed per view special casing in i915_gem_ggtt_prepare / finish_object in favour of creating and destroying DMA mappings on first VMA instantiation and last VMA destruction. (Daniel Vetter) * Simplified i915_vma_unbind which does not need to count the GGTT views. (Daniel Vetter) * Also moved obj->map_and_fenceable reset under the same check. * Checkpatch cleanups. v3: * Only retire objects once the last VMA is unbound. v4: * Keep scatter-gather table for alternative views persistent for the lifetime of the VMA. * Propagate binding errors to callers and handle appropriately. v5: * Explicitly look for normal GGTT view in i915_gem_obj_bound to align usage in i915_gem_object_ggtt_unpin. (Michel Thierry) * Change to single if statement in i915_gem_obj_to_ggtt. (Michel Thierry) * Removed stray semi-colon in i915_gem_object_set_cache_level. For: VIZ-4544 Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: NMichel Thierry <michel.thierry@intel.com> [danvet: Drop hunk from i915_gem_shrink since it's just prettification but upsets a __must_check warning.] Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 03 12月, 2014 4 次提交
-
-
由 John Harrison 提交于
All the code above is now using requests not seqnos so it is possible to convert the trace functions across. Note that rather than get into problematic reference counting issues, the trace code only saves the seqno and ring values from the request structure not the structure pointer itself. For: VIZ-4377 Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com> Reviewed-by: NThomas Daniel <Thomas.Daniel@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 John Harrison 提交于
There is no longer any need to retrieve a seqno value from an i915_add_request() call. The calling code already knows which request structure is being processed (it can only be ring->OLR). And as the request itself is now used in preference to the basic seqno value, the latter is now redundant in this situation. For: VIZ-4377 Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com> Reviewed-by: NThomas Daniel <Thomas.Daniel@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 John Harrison 提交于
The OLS value is now obsolete. Exactly the same value is guarateed to be always available as PLR->seqno. Thus it is safe to remove the OLS completely. And also to rename the PLR to OLR to keep the 'outstanding lazy ...' naming convention valid. For: VIZ-4377 Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com> Reviewed-by: NThomas Daniel <Thomas.Daniel@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 John Harrison 提交于
The object structure contains the last read, write and fenced seqno values for use in syncrhonisation operations. These have now been replaced with their request structure counterparts. Note that to ensure that objects do not end up with dangling pointers, the assignments of last_*_req include reference count updates. Thus a request cannot be freed if an object is still hanging on to it for any reason. v2: Corrected 'last_rendering_' to 'last_read_' in a number of comments that did not get updated when 'last_rendering_seqno' became 'last_read|write_seqno' several millenia ago. For: VIZ-4377 Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com> Reviewed-by: NThomas Daniel <Thomas.Daniel@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 21 11月, 2014 1 次提交
-
-
由 Thomas Hellstrom 提交于
It happens on occasion that developers of generic user-space applications abuse the dumb buffer API to get hold of drm buffers that they can both mmap() and use for GPU acceleration, using the assumptions that dumb buffers and buffers available for GPU are a) The same type and can be aribtrarily type-casted. b) fully coherent. This patch makes the most widely used drivers warn nicely when that happens, the next step will be to fail. v2: Move drmP.h changes to drm_gem.h. Fix Radeon dumb mmap breakage. Signed-off-by: NThomas Hellstrom <thellstrom@vmware.com> Acked-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NDave Airlie <airlied@redhat.com>
-