- 15 2月, 2017 20 次提交
-
-
由 Chris Wilson 提交于
We flush the entire page every time we update a few bytes, making the update of a page table many, many times slower than is required. If we create a WC map of the page for our updates, we can avoid the clflush but incur additional cost for creating the pagetable. We amoritize that cost by reusing page vmappings, and only changing the page protection in batches. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
-
由 Chris Wilson 提交于
Similar to how we already split the bind_vma for ggtt/aliasing_gtt, also split up the unbind for symmetry. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170215084357.19977-5-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
The aliasing_ppgtt is a regular ppgtt, and we can use the regular i915_ppgtt_put() to properly tear it down. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170215084357.19977-4-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Improve the sg iteration and in hte process eliminate a bug in miscomputing the pml4 length as orig_nents<<PAGE_SHIFT is no longer the full length of the sg table. v2: Check for the end of the fourth level page table (the final pdpe) and move onto the next. v3: Assert that 3lvl insert_pte_entries doesn't overflow its smaller set of PDP. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170215084357.19977-3-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Inline the address computation to avoid the vfunc call for every page. We still have to pay the high overhead of sg_page_iter_next(), but now at least GCC can optimise the inner most loop, giving a significant boost to some thrashing Unreal Engine workloads. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170215084357.19977-2-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
The predominant VMA class is normal GTT, so allow gcc to emphasize that path and avoid unnecessary stack movement. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170215084357.19977-1-chris@chris-wilson.co.uk
-
由 Hans de Goede 提交于
If there is no OPREGION_ASLE_EXT then a VBT stored in mailbox #4 may use the ASLE_EXT parts of the opregion. Adjust the vbt_size calculation for a vbt in mailbox #4 for this. This fixes the driver not finding the VBT on a jumper ezpad mini3 cherrytrail tablet and on a ACER SW5_017 machine. Cc: stable@vger.kernel.org Signed-off-by: NHans de Goede <hdegoede@redhat.com> Signed-off-by: NJani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1487088758-30050-1-git-send-email-jani.nikula@intel.com
-
由 Tvrtko Ursulin 提交于
intel_ring_workarounds_emit is exactly the same code. Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170214150017.16058-1-tvrtko.ursulin@linux.intel.com
-
由 Chris Wilson 提交于
Currently we apply the jump to rpe if we are below it and the GPU needs more power. For some GPUs, the rpe is 75% of the maximum range causing us to dramatically overshoot low power applications *and* unable to reach the low frequency that can most efficiently deliver their workload. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170210150348.22146-3-chris@chris-wilson.co.ukReviewed-by: NRadoslaw Szwichtenberg <radoslaw.szwichtenberg@intel.com>
-
由 Chris Wilson 提交于
If we receive a DOWN_TIMEOUT rps interrupt, we respond by reducing the GPU clocks significantly. Before we do, double check that the frequency we pick is actually a decrease. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170210150348.22146-2-chris@chris-wilson.co.ukReviewed-by: NRadoslaw Szwichtenberg <radoslaw.szwichtenberg@intel.com>
-
由 Chris Wilson 提交于
When the RPS tuning was applied to Baytrail, in commit 8fb55197 ("drm/i915: Agressive downclocking on Baytrail"), concern was given that it might cause Cherryview excess wakeups of the common power well. However, the static thresholds perform poorly for Kodi, and the GPU is unable to deliver the video frames on time. Enabling the dynamic, finer thresholds used on all other platforms (including Skylake and Broxton that also have the same multiple powerwell concerns) allows the GPU to pick a more appropriate frequency and not drop frames. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170210150348.22146-1-chris@chris-wilson.co.ukReviewed-by: NRadoslaw Szwichtenberg <radoslaw.szwichtenberg@intel.com>
-
由 Chris Wilson 提交于
Once upon a time before we had automated GPU state capture upon hangs, we had intel_gpu_dump. Now we come almost full circle and reinstate that view of the current GPU queues and registers by using the error capture facility to snapshot the GPU state when debugfs/.../i915_gpu_info is opened - which should provided useful debugging to both the error capture routines (without having to cause a hang and avoid the error state being eaten by igt) and generally. v2: Rename drm_i915_error_state to i915_gpu_state to alleviate some name collisions between the error state dump and inspecting the gpu state. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170214164611.11381-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
We no longer need to take the struct_mutex for freeing objects, and on the finalisation paths here the mutex is not been used for serialisation of the pointer access, so remove the BKL wart. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170214133420.7977-1-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
由 Chris Wilson 提交于
In general, the compiler should not be able to detect if we do any passes through the test loops: In file included from drivers/gpu/drm/i915/i915_gem.c:5029: drivers/gpu/drm/i915/selftests/i915_gem_coherency.c: In function 'igt_gem_coherency': drivers/gpu/drm/i915/selftests/i915_gem_coherency.c:274: error: 'err' may be used uninitialized in this function Reported-by: Nkbuild test robot <fengguang.wu@intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170214143509.15719-1-chris@chris-wilson.co.ukReviewed-by: NMatthew Auld <matthew.auld@intel.com>
-
由 Chris Wilson 提交于
gcc-4.7 spotted that In file included from drivers/gpu/drm/i915/i915_gem_gtt.c:3791:0: drivers/gpu/drm/i915/selftests/i915_gem_gtt.c: In function ‘pot_hole’: drivers/gpu/drm/i915/selftests/i915_gem_gtt.c:594:6: error: ‘err’ may be used uninitialized in this function [-Werror=maybe-uninitialized] So set it to 0 should we ever skip over a hole smaller than a few pages. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170214113756.27834-1-chris@chris-wilson.co.ukReviewed-by: NMatthew Auld <matthew.auld@intel.com>
-
由 Chris Wilson 提交于
When using the mock_ppgtt selftest, the GTT is large enough to cause an overflow in pot_hole() when adding 2 pages to the address. Avoid the overflow by computing the final valid address and iterating up to that address. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170214092344.12330-1-chris@chris-wilson.co.ukReviewed-by: NMatthew Auld <matthew.auld@intel.com>
-
由 Deepak S 提交于
With latest Punit FW, vgg input voltag drop falling to minimum is fixed. So reverting the WA patch & moving to turbo freq opreation range to [RPn -> RP0] This is not a 1:1 revert of the commit 5b7c91b7. You can refer to commit 5b5929cb ("drm/i915/chv: remove pre-production hardware workarounds") as the reason for the discrepancy commit 5b7c91b7 Author: Deepak S <deepak.s@linux.intel.com> Date: Sat May 9 18:15:46 2015 +0530 drm/i915/chv: Set min freq to efficient frequency on chv v2: Fix inconsistent return type. (Chris) v3: drop pre-production hw case (Ville) Acked-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NDeepak S <deepak.s@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1471007801-86075-1-git-send-email-deepak.s@linux.intel.comReviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
-
由 Ville Syrjälä 提交于
Dump out more of the DSI configuration details during init. This includes pclk, burst_mode_ratio, lane_count, pixel_overlap, video_mode_format and reset_timer_val. v2: Dump more info (Chris) v3: Use the VIDEO_MODE_ defines for consistency (Chris) Dump dphy_reg too (Chris) Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20161221143114.23530-1-ville.syrjala@linux.intel.com
-
由 Robert Bragg 提交于
This workaround for BDW was incomplete as it also requires EUTC clock gating to be disabled via UCGCTL1. v2: read modify write UCGTL1 in broadwell_init_clock_gating (Ville) Signed-off-by: NRobert Bragg <robert@sixbynine.org> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170212133252.20990-1-robert@sixbynine.orgReviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
-
由 Tvrtko Ursulin 提交于
For some reason my compiler (and CI as well) failed to spot the uninitialized ret in mi_set_context. Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: 73dec95e ("drm/i915: Emit to ringbuffer directly") Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170214152901.20361-1-tvrtko.ursulin@linux.intel.com
-
- 14 2月, 2017 20 次提交
-
-
由 Tvrtko Ursulin 提交于
This removes the usage of intel_ring_emit in favour of directly writing to the ring buffer. intel_ring_emit was preventing the compiler for optimising fetch and increment of the current ring buffer pointer and therefore generating very verbose code for every write. It had no useful purpose since all ringbuffer operations are started and ended with intel_ring_begin and intel_ring_advance respectively, with no bail out in the middle possible, so it is fine to increment the tail in intel_ring_begin and let the code manage the pointer itself. Useless instruction removal amounts to approximately two and half kilobytes of saved text on my build. Not sure if this has any measurable performance implications but executing a ton of useless instructions on fast paths cannot be good. v2: * Change return from intel_ring_begin to error pointer by popular demand. * Move tail increment to intel_ring_advance to enable some error checking. v3: * Move tail advance back into intel_ring_begin. * Rebase and tidy. v4: * Complete rebase after a few months since v3. v5: * Remove unecessary cast and fix !debug compile. (Chris Wilson) v6: * Make intel_ring_offset take request as well. * Fix recording of request postfix plus a sprinkle of asserts. (Chris Wilson) v7: * Use intel_ring_offset to get the postfix. (Chris Wilson) * Convert GVT code as well. v8: * Rename *out++ to *cs++. v9: * Fix GVT out to cs conversion in GVT. v10: * Rebase for new intel_ring_begin in selftests. Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Zhi Wang <zhi.a.wang@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Acked-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170214113242.29241-1-tvrtko.ursulin@linux.intel.com
-
I screwed up the rebase of commit d8fc70b7 ("drm/i915: Make power domain masks 64 bit long") before sending v2, causing a couple of conversions from 32 to 64 bit masks to be lost. Fixes: d8fc70b7 ("drm/i915: Make power domain masks 64 bit long") Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Cc: Daniel Vetter <daniel.vetter@intel.com> Cc: intel-gfx@lists.freedesktop.org Signed-off-by: NAnder Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170213145733.8779-1-ander.conselvan.de.oliveira@intel.com
-
由 Chris Wilson 提交于
The i915_gem_object_wait_fence() uses an incoming timeout=0 to query whether the current fence is busy or idle, without waiting. This can be used by the wait-ioctl to implement a busy query. Fixes: e95433c7 ("drm/i915: Rearrange i915_wait_request() accounting with callers") Testcase: igt/gem_wait/basic-busy-write-all Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.william.auld@gmail.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.10-rc1+ Cc: stable@vger.kernel.org Link: http://patchwork.freedesktop.org/patch/msgid/20170212215344.16600-1-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
由 Chris Wilson 提交于
Explicitly disable stolen memory when running as a guest in a virtual machine, since the memory is not mediated between clients and reserved entirely for the host. The actual size should be reported as zero, but like every other quirk we want to tell the user what is happening. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99028Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161109103905.17860-1-chris@chris-wilson.co.ukReviewed-by: NZhenyu Wang <zhenyuw@linux.intel.com> Cc: stable@vger.kernel.org
-
由 Chris Wilson 提交于
Check that we can reset the GPU and continue executing from the next request. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-47-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
As the page-table trees within the GTT are naturally aligned to power-of-two boundaries, by inserting an object that crosses a power-of-two (and the power-of-two intervals) we can quickly check the code for errors in switching between levels in the tree. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-46-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Move a single page of an object around within the GGTT and check coherency of writes and reads. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-45-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Use the live tests against the mock ppgtt for quick testing on all platforms of the VMA layer. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-44-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
i915_gem_gtt_insert should allocate from the available free space in the GTT, evicting as necessary to create space. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-43-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
i915_gem_gtt_reserve should put the node exactly as requested in the GTT, evicting as required. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-42-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Very simple tests to just ask eviction to find some free space in a full GTT and one with some available space. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-41-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Ensure that we minimally exercise the aliasing_ppgtt, even on a full-ppgtt, by allocating one and similarly creating a context to use it. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-40-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
In order to force testing of the aliasing ppgtt, extract its initialisation function. v2: Also extract the cleanup function for symmetry. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-39-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Check we can create and execution within a context. v2: Write one set of dwords through each context/engine to exercise more contexts within the same time period. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-38-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Mock testing to ensure we can create and lookup partial VMA. v2: Named phases Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-37-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Exercise creating rotated VMA and checking the page order within. v2..v3: Be more creative in rotated params Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-36-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
High-level testing of the struct drm_mm by verifying our handling of weird requests to i915_vma_pin. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-35-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Simple test to exercise creation and lookup of VMA within an object. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-34-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
It is possible whilst allocating the page-directory tree for a ppgtt bind that the shrinker may run and reap unused parts of the tree. If the shrinker happens to remove a chunk of the tree that the allocate_va_range has already processed, we may then try to insert into the dangling tree. This test uses the fault-injection framework to force the shrinker to be invoked before we allocate new pages, i.e. new chunks of the PD tree. References: https://bugs.freedesktop.org/show_bug.cgi?id=99295Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
-
由 Chris Wilson 提交于
Directly test allocating the va range and clearing it, this bypasses the use of i915_vma_bind() and inserting the pages to focus on testing of the pagetables. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-32-chris@chris-wilson.co.uk
-