- 11 7月, 2017 1 次提交
-
-
由 Gustavo A. R. Silva 提交于
The right variable to check here is port, not dp. This issue was detected using Coccinelle and the following semantic patch: @@ expression x; identifier fld; @@ * x = devm_kzalloc(...); ... when != x == NULL x->fld Signed-off-by: NGustavo A. R. Silva <garsilva@embeddedor.com> Acked-by: NMark Yao <mark.yao@rock-chips.com> Signed-off-by: NSean Paul <seanpaul@chromium.org> Link: http://patchwork.freedesktop.org/patch/msgid/20170706215833.GA25411@embeddedgus
-
- 03 7月, 2017 1 次提交
-
-
由 Maarten Lankhorst 提交于
All atomic state should be cleared when drm_modeset_backoff() is called, because it drops all locks and the state becomes invalid. The call to drm_atomic_state_clear was missing in atomic_remove_fb, so add the missing call there. Signed-off-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170629115954.26029-1-maarten.lankhorst@linux.intel.comReviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Fixes: db8f6403 ("drm: Convert drm_framebuffer_remove to atomic, v4.") Cc: stable@vger.kernel.org # v4.12-rc1+
-
- 29 6月, 2017 1 次提交
-
-
由 Laurent Pinchart 提交于
Commit 3fcdcb27 ("drm/vblank: Switch to bool in_vblank_irq in get_vblank_timestamp") inverted a condition by mistake that resulted in vblank timestamps always being 0 on hardware without a vblank counter. Fix it. Fixes: 3fcdcb27 ("drm/vblank: Switch to bool in_vblank_irq in get_vblank_timestamp") Suggested-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: NLaurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20170629123720.27173-1-laurent.pinchart+renesas@ideasonboard.com
-
- 21 6月, 2017 10 次提交
-
-
由 Eric Anholt 提交于
ERR_PTR() needs a negative errno argument. Signed-off-by: NEric Anholt <eric@anholt.net> Signed-off-by: NArchit Taneja <architt@codeaurora.org> Link: http://patchwork.freedesktop.org/patch/msgid/20170615175423.17954-1-eric@anholt.net
-
由 Arnd Bergmann 提交于
The last rework left behind two unused variables: drm/arm/hdlcd_crtc.c: In function 'hdlcd_plane_atomic_update': drm/arm/hdlcd_crtc.c:264:13: warning: unused variable 'src_y' [-Wunused-variable] drm/arm/hdlcd_crtc.c:264:6: warning: unused variable 'src_x' [-Wunused-variable] This removes them. Fixes: b2ae06ae9834 ("drm/arm: hdlcd: Use CMA helper for plane buffer address calculation") Signed-off-by: NArnd Bergmann <arnd@arndb.de> Acked-by: NLiviu Dudau <liviu.dudau@arm.com> Signed-off-by: NLiviu Dudau <liviu.dudau@arm.com>
-
由 Liviu Dudau 提交于
CMA has gained a recent helper function for calculating the start of the plane buffer's physical address. Use that instead of the hand rolled version. Signed-off-by: NLiviu Dudau <liviu.dudau@arm.com>
-
由 Liviu Dudau 提交于
The component-based encoder(s) used by HDLCD expect the CRTC port to be set before binding in order to find the right endpoint. Without this patch, the TDA19988 encoder driver prints a warning "Falling back to first CRTC". Signed-off-by: NLiviu Dudau <Liviu.Dudau@arm.com>
-
由 Daniel Vetter 提交于
In commit 91eefc05 Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Wed Dec 14 00:08:10 2016 +0100 drm: Tighten locking in drm_mode_getconnector I reordered the logic a bit in that IOCTL, but that broke userspace since it'll get the new mode list, but not the new property values. Fix that again. v2: Fix up the error path handling when copy_to_user for the modes failes (Dhinakaran). Fixes: 91eefc05 ("drm: Tighten locking in drm_mode_getconnector") Cc: Sean Paul <seanpaul@chromium.org> Cc: Daniel Vetter <daniel.vetter@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: David Airlie <airlied@linux.ie> Cc: dri-devel@lists.freedesktop.org Reported-by: N"H.J. Lu" <hjl.tools@gmail.com> Tested-by: N"H.J. Lu" <hjl.tools@gmail.com> Cc: <stable@vger.kernel.org> # v4.11+ Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100576 Cc: "H.J. Lu" <hjl.tools@gmail.com> Cc: "Pandiyan, Dhinakaran" <dhinakaran.pandiyan@intel.com> Reviewed-by: NSean Paul <seanpaul@chromium.org> Reviewed-by: NDhinakaran Pandiyan <dhinakaran.pandiyan@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170620202837.1701-1-daniel.vetter@ffwll.ch
-
由 Dave Airlie 提交于
This was from a merge I did incorrectly. Signed-off-by: NDave Airlie <airlied@redhat.com>
-
由 Alex Deucher 提交于
Fixes resume from suspend. bug: https://bugzilla.kernel.org/show_bug.cgi?id=196121Reported-by: NPrzemek <soprwa@gmail.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
-
由 Alex Deucher 提交于
Disable PX on these systems. bug: https://bugs.freedesktop.org/show_bug.cgi?id=101491Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Alex Deucher 提交于
Increase the default display clock on newer asics to accomodate some high res modes with really high refresh rates. bug: https://bugs.freedesktop.org/show_bug.cgi?id=93826Acked-by: NChunming Zhou <david1.zhou@amd.com> Acked-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
-
由 Alex Deucher 提交于
We were using the wrong structure which lead to an overflow on some boards. bug: https://bugs.freedesktop.org/show_bug.cgi?id=101387Acked-by: NChunming Zhou <david1.zhou@amd.com> Acked-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
-
- 20 6月, 2017 5 次提交
-
-
由 Kasin Li 提交于
In function submit_create, if nr_cmds or nr_bos is assigned with negative value, the allocated buffer may be small than intended. Using this buffer will lead to buffer overflow issue. Signed-off-by: NKasin Li <donglil@codeaurora.org> Signed-off-by: NJordan Crouse <jcrouse@codeaurora.org> Signed-off-by: NRob Clark <robdclark@gmail.com>
-
由 Alex Xie 提交于
In original function amdgpu_bo_list_get, the waiting for result->lock can be quite long while mutex bo_list_lock was holding. It can make other tasks waiting for bo_list_lock for long period. Secondly, this patch allows several tasks(readers of idr) to proceed at the same time. v2: use rcu and kref (Dave Airlie and Christian König) v3: update v1 commit message (Michel Dänzer) v4: rebase on upstream (Alex Deucher) Signed-off-by: NAlex Xie <AlexBin.Xie@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com>
-
由 Alex Xie 提交于
v2: Remove duplication of zeroing of bo list (Christian König) Move idr_alloc function to end of ioctl (Christian König) Call kfree bo_list when amdgpu_bo_list_set return error. Combine the previous two patches into this patch. Add amdgpu_bo_list_set function prototype. Signed-off-by: NAlex Xie <AlexBin.Xie@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com>
-
由 Junshan Fang 提交于
Signed-off-by: NJunshan Fang <Junshan.Fang@amd.com> Reviewed-by: NRoger.He <Hongbo.He@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Dhinakaran Pandiyan 提交于
Maarten and Ville noticed that we are enabling backlight via DP aux very early in the modeset_init path via the intel_dp_aux_setup_backlight() function, since commit e7156c83 ("drm/i915: Add Backlight Control using DPCD for eDP connectors (v9)"). Looks like all we need to do during _setup_backlight() is read the current brightness state instead of modifying it. v2: Rewrote commit message. Cc: Ville Syrjala <ville.syrjala@linux.intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Yetunde Adebisi <yetundex.adebisi@intel.com> Signed-off-by: NDhinakaran Pandiyan <dhinakaran.pandiyan@intel.com> Reviewed-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com> Acked-by: NJani Nikula <jani.nikula@intel.com> Tested-by: NPuthikorn Voravootivat <puthik@chromium.org> Fixes: e7156c83 ("drm/i915: Add Backlight Control using DPCD for eDP connectors (v9)") Link: http://patchwork.freedesktop.org/patch/msgid/1497384239-2965-1-git-send-email-dhinakaran.pandiyan@intel.comSigned-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com> (cherry picked from commit f6262bda) Signed-off-by: NJani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1497895708-19422-1-git-send-email-dhinakaran.pandiyan@intel.com
-
- 19 6月, 2017 6 次提交
-
-
由 Ville Syrjälä 提交于
If intel_crtc_disable_noatomic() were to ever get called during resume we'd end up deadlocking since resume has its own acqcuire_ctx but intel_crtc_disable_noatomic() still tries to use the mode_config.acquire_ctx. Pass down the correct acquire ctx from the top. Cc: stable@vger.kernel.org Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Fixes: e2c8b870 ("drm/i915: Use atomic helpers for suspend, v2.") Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170601143619.27840-3-ville.syrjala@linux.intel.comReviewed-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com> (cherry picked from commit da1d0e26) Signed-off-by: NJani Nikula <jani.nikula@intel.com>
-
由 Ville Syrjälä 提交于
Pass down the correct acquire context to the pipe A quirk load detect hack during display resume. Avoids deadlocking the entire thing. Cc: stable@vger.kernel.org Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Fixes: e2c8b870 ("drm/i915: Use atomic helpers for suspend, v2.") Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170601143619.27840-2-ville.syrjala@linux.intel.comReviewed-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com> (cherry picked from commit aecd36b8) Signed-off-by: NJani Nikula <jani.nikula@intel.com>
-
由 Chris Wilson 提交于
I tried __GFP_NORETRY in the belief that __GFP_RECLAIM was effective. It struggles with handling reclaim of our dirty buffers and relies on reclaim via kswapd. As a result, a single pass of direct reclaim is unreliable when i915 occupies the majority of available memory, and the only means of effectively waiting on kswapd to amke progress is by not setting the __GFP_NORETRY flag and lopping. That leaves us with the dilemma of invoking the oomkiller instead of propagating the allocation failure back to userspace where it can be handled more gracefully (one hopes). In the future we may have __GFP_MAYFAIL to allow repeats up until we genuinely run out of memory and the oomkiller would have been invoked. Until then, let the oomkiller wreck havoc. v2: Stop playing with side-effects of gfp flags and await __GFP_MAYFAIL v3: Update comments that direct reclaim only appears to be ignoring our dirty buffers! Fixes: 24f8e00a ("drm/i915: Prefer to report ENOMEM rather than incur the oom for gfx allocations") Testcase: igt/gem_tiled_swapping Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Michal Hocko <mhocko@suse.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170609110350.1767-2-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> (cherry picked from commit eaf41801) Signed-off-by: NJani Nikula <jani.nikula@intel.com>
-
由 Chris Wilson 提交于
Commit 24f8e00a ("drm/i915: Prefer to report ENOMEM rather than incur the oom for gfx allocations") made the bold decision to try and avoid the oomkiller by reporting -ENOMEM to userspace if our allocation failed after attempting to free enough buffer objects. In short, it appears we were giving up too easily (even before we start wondering if one pass of reclaim is as strong as we would like). Part of the problem is that if we only shrink just enough pages for our expected allocation, the likelihood of those pages becoming available to us is less than 100% To counter-act that we ask for twice the number of pages to be made available. Furthermore, we allow the shrinker to pull pages from the active list in later passes. v2: Be a little more cautious in paging out gfx buffers, and leave that to a more balanced approach from shrink_slab(). Important when combined with "drm/i915: Start writeback from the shrinker" as anything shrunk is immediately swapped out and so should be more conservative. Fixes: 24f8e00a ("drm/i915: Prefer to report ENOMEM rather than incur the oom for gfx allocations") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170609110350.1767-1-chris@chris-wilson.co.uk (cherry picked from commit 4846bf0c) Signed-off-by: NJani Nikula <jani.nikula@intel.com>
-
由 Chris Wilson 提交于
We need to keep track of the last location we ask the hw to read up to (RING_TAIL) separately from our last write location into the ring, so that in the event of a GPU reset we do not tell the HW to proceed into a partially written request (which can happen if that request is waiting for an external signal before being executed). v2: Refactor intel_ring_reset() (Mika) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100144 Testcase: igt/gem_exec_fence/await-hang Fixes: 821ed7df ("drm/i915: Update reset path to fix incomplete requests") Fixes: d55ac5bf ("drm/i915: Defer transfer onto execution timeline to actual hw submission") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170425130049.26147-1-chris@chris-wilson.co.ukReviewed-by: NMika Kuoppala <mika.kuoppala@intel.com> (cherry picked from commit e6ba9992) Signed-off-by: NJani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170615131129.3061-1-chris@chris-wilson.co.uk
-
由 Daniel Vetter 提交于
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 17 6月, 2017 7 次提交
-
-
由 Sushmita Susheelendra 提交于
Buffer object specific resources like pages, domains, sg list need not be protected with struct_mutex. They can be protected with a buffer object level lock. This simplifies locking and makes it easier to avoid potential recursive locking scenarios for SVM involving mmap_sem and struct_mutex. This also removes unnecessary serialization when creating buffer objects, and also between buffer object creation and GPU command submission. Signed-off-by: NSushmita Susheelendra <ssusheel@codeaurora.org> [robclark: squash in handling new locking for shrinker] Signed-off-by: NRob Clark <robdclark@gmail.com>
-
由 Ben Skeggs 提交于
Signed-off-by: NBen Skeggs <bskeggs@redhat.com>
-
由 Rodrigo Vivi 提交于
Coffee Lake inherit most of Kabylake production workarounds. v2: Fix typo on commit message and remove WaDisableKillLogic and GEN9_DISABLE_OCL_OOB_SUPPRESS_LOGIC, since as Mika pointed out they shouldn't be here for cfl according to BSpec. Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com> Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1497653398-15722-1-git-send-email-rodrigo.vivi@intel.com
-
由 Dave Airlie 提交于
This creates a new command submission chunk for amdgpu to add in and out sync objects around the submission. Sync objects are managed via the drm syncobj ioctls. The command submission interface is enhanced with two new chunks, one for syncobj pre submission dependencies, and one for post submission sync obj signalling, and just takes a list of handles for each. This is based on work originally done by David Zhou at AMD, with input from Christian Konig on what things should look like. In theory VkFences could be backed with sync objects and just get passed into the cs as syncobj handles as well. NOTE: this interface addition needs a version bump to expose it to userspace. TODO: update to dep_sync when rebasing onto amdgpu master. (with this - r-b from Christian) v1.1: keep file reference on import. v2: move to using syncobjs v2.1: change some APIs to just use p pointer. v3: make more robust against CS failures, we now add the wait sems but only remove them once the CS job has been submitted. v4: rewrite names of API and base on new syncobj code. v5: move post deps earlier, rename some apis v6: lookup post deps earlier, and just replace fences in post deps stage (Christian) Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NDave Airlie <airlied@redhat.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Dave Airlie 提交于
This just splits out the fence depenency checking into it's own function to make it easier to add semaphore dependencies. v2: rebase onto other changes. v1-Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: NDave Airlie <airlied@redhat.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Alex Deucher 提交于
Avoids printing spurious messages like this: [ 3.102059] amdgpu 0000:01:00.0: VM size (-1) must be a power of 2 Reviewed-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NMichel Dänzer <michel.daenzer@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Dhinakaran Pandiyan 提交于
Although we use 9 bits of Device ID for identifying PCH, only 8 bits are stored in dev_priv->pch_id. This makes HAS_PCH_CNP_LP() and HAS_PCH_SPT_LP() incorrect. Fix this by storing all the 9 bits for the platforms with LP PCH. v2: Drop PCH_LPT_LP change (Imre) Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Imre Deak <imre.deak@intel.com> Fixes: commit ec7e0bb3 ("drm/i915/cnp: Add PCI ID for Cannonpoint LP PCH") Reported-by: NImre Deak <imre.deak@intel.com> Reviewed-by: NImre Deak <imre.deak@intel.com> Signed-off-by: NDhinakaran Pandiyan <dhinakaran.pandiyan@intel.com> Signed-off-by: NImre Deak <imre.deak@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1497641774-29104-1-git-send-email-dhinakaran.pandiyan@intel.com
-
- 16 6月, 2017 9 次提交
-
-
由 Chris Wilson 提交于
During execbuf, a mandatory step is that we add this request (this fence) to each object's reservation_object. Inside execbuf, we track the vma, and to add the fence to the reservation_object then means having to first chase the obj, incurring another cache miss. We can reduce the number of cache misses by stashing a pointer to the reservation_object in the vma itself. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170616140525.6394-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
If the user requires patching of their batch or auxiliary buffers, we currently make the alterations on the cpu. If they are active on the GPU at the time, we wait under the struct_mutex for them to finish executing before we rewrite the contents. This happens if shared relocation trees are used between different contexts with separate address space (and the buffers then have different addresses in each), the 3D state will need to be adjusted between execution on each context. However, we don't need to use the CPU to do the relocation patching, as we could queue commands to the GPU to perform it and use fences to serialise the operation with the current activity and future - so the operation on the GPU appears just as atomic as performing it immediately. Performing the relocation rewrites on the GPU is not free, in terms of pure throughput, the number of relocations/s is about halved - but more importantly so is the time under the struct_mutex. v2: Break out the request/batch allocation for clearer error flow. v3: A few asserts to ensure rq ordering is maintained Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
由 Chris Wilson 提交于
Currently, the last object in the execlist is the always the batch. However, when building the batch buffer we often know the batch object first and if we can use the first slot in the execlist we can emit relocation instructions relative to it immediately and avoid a separate pass to adjust the relocations to point to the last execlist slot. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
由 Chris Wilson 提交于
This simply hides the EAGAIN caused by userptr when userspace causes resource contention. However, it is quite beneficial with highly contended userptr users as we avoid repeating the setup costs and kernel-user context switches. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMichał Winiarski <michal.winiarski@intel.com>
-
由 Chris Wilson 提交于
When choosing a slot for an execbuffer, we ideally want to use the same address as last time (so that we don't have to rebind it) and the same address as expected by the user (so that we don't have to fixup any relocations pointing to it). If we first try to bind the incoming execbuffer->offset from the user, or the currently bound offset that should hopefully achieve the goal of avoiding the rebind cost and the relocation penalty. However, if the object is not currently bound there we don't want to arbitrarily unbind an object in our chosen position and so choose to rebind/relocate the incoming object instead. After we report the new position back to the user, on the next pass the relocations should have settled down. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtien@linux.intel.com>
-
由 Chris Wilson 提交于
If we take a reference to the object/vma when it is first used in an execbuf, we can keep that reference until the object's file-local handle is closed. Thereby saving a frequent ref/unref pair. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
由 Chris Wilson 提交于
The major scaling bottleneck in execbuffer is the processing of the execobjects. Creating an auxiliary list is inefficient when compared to using the execobject array we already have allocated. Reservation is then split into phases. As we lookup up the VMA, we try and bind it back into active location. Only if that fails, do we add it to the unbound list for phase 2. In phase 2, we try and add all those objects that could not fit into their previous location, with fallback to retrying all objects and evicting the VM in case of severe fragmentation. (This is the same as before, except that phase 1 is now done inline with looking up the VMA to avoid an iteration over the execobject array. In the ideal case, we eliminate the separate reservation phase). During the reservation phase, we only evict from the VM between passes (rather than currently as we try to fit every new VMA). In testing with Unreal Engine's Atlantis demo which stresses the eviction logic on gen7 class hardware, this speed up the framerate by a factor of 2. The second loop amalgamation is between move_to_gpu and move_to_active. As we always submit the request, even if incomplete, we can use the current request to track active VMA as we perform the flushes and synchronisation required. The next big advancement is to avoid copying back to the user any execobjects and relocations that are not changed. v2: Add a Theory of Operation spiel. v3: Fall back to slow relocations in preparation for flushing userptrs. v4: Document struct members, factor out eb_validate_vma(), add a few more comments to explain some magic and hide other magic behind macros. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
由 Chris Wilson 提交于
If we write a relocation into the buffer, we require our own implicit synchronisation added after the start of the execbuf, outside of the user's control. As we may end up clflushing, or doing the patch itself on the GPU, asynchronously we need to look at the implicit serialisation on obj->resv and hence need to disable EXEC_OBJECT_ASYNC for this object. If the user does trigger a stall for relocations, we make sure the stall is complete enough so that the batch is not submitted before we complete those relocations. Fixes: 77ae9957 ("drm/i915: Enable userspace to opt-out of implicit fencing") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
由 Chris Wilson 提交于
We can simplify our tracking of pending writes in an execbuf to the single bit in the vma->exec_entry->flags, but that requires the relocation function knowing the object's vma. Pass it along. Note we have only been using a single bit to track flushing since commit cc889e0f Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Wed Jun 13 20:45:19 2012 +0200 drm/i915: disable flushing_list/gpu_write_list unconditionally flushed all render caches before the breadcrumb and commit 6ac42f41 Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Sat Jul 21 12:25:01 2012 +0200 drm/i915: Replace the complex flushing logic with simple invalidate/flush all did away with the explicit GPU domain tracking. This was then codified into the ABI with NO_RELOC in commit ed5982e6 Author: Daniel Vetter <daniel.vetter@ffwll.ch> # Oi! Patch stealer! Date: Thu Jan 17 22:23:36 2013 +0100 drm/i915: Allow userspace to hint that the relocations were known Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-