- 14 2月, 2015 11 次提交
-
-
由 Damien Lespiau 提交于
This function is only used in intel_ringbuffer.c, so restrict it to that file. The function was moved around to avoid a forward declaration and group it with its user. Signed-off-by: NDamien Lespiau <damien.lespiau@intel.com> [danvet: Squash in fixup from Wu Fengguang.] Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Damien Lespiau 提交于
v2: Reorder defines (Ben) v3: More bikesheds, this time re-ordering comments! (Chris) Reviewed-by: NBen Widawsky <ben@bwidawsk.net> Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: NDamien Lespiau <damien.lespiau@intel.com> [danvet: Resolve conflict.] Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Nick Hoath 提交于
Fixed the stepping check on WaDisableDgMirrorFixInHalfSliceChicken5 to be for the correct SOC (Skylake) Signed-off-by: NNick Hoath <nicholas.hoath@intel.com> Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Hoath, Nicholas 提交于
v2: Don't add WaHdcDisableFetchWhenMasked. Add stepping check for WaForceEnableNonCoherent Signed-off-by: NNick Hoath <nicholas.hoath@intel.com> Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Hoath, Nicholas 提交于
Move Wa4x4STCOptimizationDisable to gen9_init_workarounds v2: rebase Signed-off-by: NNick Hoath <nicholas.hoath@intel.com> Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Nick Hoath 提交于
Move WaEnableYV12BugFixInHalfSliceChicken7 to gen9_init_workarounds v2: Add stepping check. Signed-off-by: NNick Hoath <nicholas.hoath@intel.com> Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Nick Hoath 提交于
This one doesn't have one of these nice cryptic names unfortunately. v2: Added missing register bitmap Signed-off-by: NNick Hoath <nicholas.hoath@intel.com> Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Nick Hoath 提交于
Move WaDisableDgMirrorFixInHalfSliceChicken5 to gen9_init_workarounds v2: Added stepping check v3: Removed unused register bitmap Signed-off-by: NNick Hoath <nicholas.hoath@intel.com> [danvet: Bikesheds.] Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Hoath, Nicholas 提交于
v2: Dont add WaDisableThreadStallDopClockGating as not SKL WA. (Found by Damien Lespiau) Signed-off-by: NNick Hoath <nicholas.hoath@intel.com> Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com> [danvet: Bikeshed commit message a bit as per Damien's suggestions.] Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Hoath, Nicholas 提交于
Add framework for gen 9 HW WAs v1: Changed SOC specific WA function to gen 9 common function (Req: Damien Lespiau) Signed-off-by: NNick Hoath <nicholas.hoath@intel.com> Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Damien Lespiau 提交于
We already track this in the intel_info struct. Signed-off-by: NDamien Lespiau <damien.lespiau@intel.com> Reviewed-by: NRodrigo Vivi <rodrigo.vivi@intel.com> [danvet: Make the commit message a bit less terse.] Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 27 1月, 2015 4 次提交
-
-
由 Ville Syrjälä 提交于
I ran a few tests with xonotic and synmark2 trying out the different WIZ hashing modes on CHV. The results seem to match the results I got with IVB/HSW when I did the similar tests on them in the past. That is 16x4 is generally the fastest mode, 8x8 comes next and finally 8x4. On CHV the difference between the modes is at most ~1% in most tests. IIRC on IVB/HSW the difference was a little bigger, but as there doesn't seem to be any real downside to 16x4 let's use it by default. Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: NArun Siluvery <arun.siluvery@linux.intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Ville Syrjälä 提交于
Wa4x4STCOptimizationDisable got only implemented for BDW, but according to the w/a database CHV needs it too, so add it. Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: NArun Siluvery <arun.siluvery@linux.intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Mika Kuoppala 提交于
We have multiple forcewake domains now on recent gens. Change the function naming to reflect this. v2: More verbose names (Chris) v3: Rebase v4: Rebase v5: Add documentation for forcewake_get/put Signed-off-by: NMika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Deepak S <deepak.s@linux.intel.com> (v2) Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Nick Hoath 提交于
Where there were duplicate variables for the tail, context and ring (engine) in the gem request and the execlist queue item, use the one from the request and remove the duplicate from the execlist queue item. Issue: VIZ-4274 v1: Rebase v2: Fixed build issues. Keep separate postfix & tail pointers as these are used in different ways. Reinserted missing full tail pointer update. Signed-off-by: NNick Hoath <nicholas.hoath@intel.com> Reviewed-by: NThomas Daniel <thomas.daniel@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 17 1月, 2015 2 次提交
-
-
由 Kenneth Graunke 提交于
This is an important optimization for avoiding read-after-write (RAW) stalls in the HiZ buffer. Certain workloads would run very slowly with HiZ enabled, but run much faster with the "hiz=false" driconf option. With this patch, they run at full speed even with HiZ. Increases performance in OglVSInstancing by about 2.7x on Braswell. Signed-off-by: NKenneth Graunke <kenneth@whitecape.org> Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Kenneth Graunke 提交于
This is an important optimization for avoiding read-after-write (RAW) stalls in the HiZ buffer. Certain workloads would run very slowly with HiZ enabled, but run much faster with the "hiz=false" driconf option. With this patch, they run at full speed even with HiZ. Improves performance in OglVSInstancing by 3.2x on Broadwell GT3e (Iris Pro 6200). Thanks to Jesse Barnes and Ben Widawsky for their help in tracking this down. Thanks to Chris Wilson for showing me the new workarounds system. Signed-off-by: NKenneth Graunke <kenneth@whitecape.org> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 13 1月, 2015 1 次提交
-
-
由 Kenneth Graunke 提交于
Found by reading the HIZ_CHICKEN documentation. Improves performance in a HiZ microbenchmark by around 50%. Improves performance in OglZBuffer by around 18%. Thanks to Chris Wilson for helping me figure out where to put this. Signed-off-by: NKenneth Graunke <kenneth@whitecape.org> Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 16 12月, 2014 3 次提交
-
-
由 Chris Wilson 提交于
In order to act as a full command barrier by itself, we need to tell the pipecontrol to actually stall the command streamer while the flush runs. We require the full command barrier before operations like MI_SET_CONTEXT, which currently rely on a prior invalidate flush. References: https://bugs.freedesktop.org/show_bug.cgi?id=83677 Cc: Simon Farnsworth <simon@farnz.org.uk> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org Signed-off-by: NJani Nikula <jani.nikula@intel.com>
-
由 Chris Wilson 提交于
In the gen7 pipe control there is an extra bit to flush the media caches, so let's set it during cache invalidation flushes. v2: Rename to MEDIA_STATE_CLEAR to be more inline with spec. Cc: Simon Farnsworth <simon@farnz.org.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Cc: stable@vger.kernel.org Signed-off-by: NJani Nikula <jani.nikula@intel.com>
-
由 Michel Thierry 提交于
Otherwise, new platforms without workarounds will hit this warning for every new context created. Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: NMichel Thierry <michel.thierry@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 11 12月, 2014 1 次提交
-
-
由 Michel Thierry 提交于
We already implement this workaround, but it was missing its name. Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: NMichel Thierry <michel.thierry@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 10 12月, 2014 3 次提交
-
-
由 Damien Lespiau 提交于
We may be hidding bugs by doing that, so let remove it and have the actual mask value shine through, for better or worse. Signed-off-by: NDamien Lespiau <damien.lespiau@intel.com> Signed-off-by: NJani Nikula <jani.nikula@intel.com>
-
由 Damien Lespiau 提交于
While trying to unify the order of those arguments throughout the driver, Daniel noticed what we were inverting them in this part of the code. Suggested-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: NDamien Lespiau <damien.lespiau@intel.com> Signed-off-by: NJani Nikula <jani.nikula@intel.com>
-
由 Damien Lespiau 提交于
I was playing with clang and oh surprise! a warning trigerred by -Wshift-overflow (gcc doesn't have this one): WA_SET_BIT_MASKED(GEN7_GT_MODE, GEN6_WIZ_HASHING_MASK | GEN6_WIZ_HASHING_16x4); drivers/gpu/drm/i915/intel_ringbuffer.c:786:2: warning: signed shift result (0x28002000000) requires 43 bits to represent, but 'int' only has 32 bits [-Wshift-overflow] WA_SET_BIT_MASKED(GEN7_GT_MODE, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/i915/intel_ringbuffer.c:737:15: note: expanded from macro 'WA_SET_BIT_MASKED' WA_REG(addr, _MASKED_BIT_ENABLE(mask), (mask) & 0xffff) Turned out GEN6_WIZ_HASHING_MASK was already shifted by 16, and we were trying to shift it a bit more. The other thing is that it's not the usual case of setting WA bits here, we need to have separate mask and value. To fix this, I've introduced a new _MASKED_FIELD() macro that takes both the (unshifted) mask and the desired value and the rest of the patch ripples through from it. This bug was introduced when reworking the WA emission in: Commit 7225342a Author: Mika Kuoppala <mika.kuoppala@linux.intel.com> Date: Tue Oct 7 17:21:26 2014 +0300 drm/i915: Build workaround list in ring initialization v2: Invert the order of the mask and value arguments (Daniel Vetter) Rewrite _MASKED_BIT_ENABLE() and _MASKED_BIT_DISABLE() with _MASKED_FIELD() (Jani Nikula) Make sure we only evaluate 'a' once in _MASKED_BIT_ENABLE() (Dave Gordon) Add check to ensure the value is within the mask boundaries (Chris Wilson) v3: Ensure the the value and mask are 16 bits (Dave Gordon) Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Arun Siluvery <arun.siluvery@linux.intel.com> Signed-off-by: NDamien Lespiau <damien.lespiau@intel.com> Signed-off-by: NJani Nikula <jani.nikula@intel.com>
-
- 08 12月, 2014 1 次提交
-
-
由 Daniel Vetter 提交于
Similar to a patch from Thomas Daniel for lrc contexts. This keeps both sides somewhat in sync and should make Dave Gordon happy. Note that both the wa and the golden context init code suffer a bit from an inssuficient split into driver load and hw init code. Which means we have a bunch of tests all over the place to check whether the one-time initialization has been done already or not. All that one-tim code should be moved into the one-time ring setup code, but that's work for later. Cc: Dave Gordon <david.s.gordon@intel.com> Cc: Thomas Daniel <thomas.daniel@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com> Reviewed-by: NDave Gordon <david.s.gordon@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 06 12月, 2014 2 次提交
-
-
由 John Harrison 提交于
For debugging purposes, it is useful to be able to uniquely identify a given request structure as it works its way through the system. This becomes especially tricky once the seqno value is lazily allocated as then the request has nothing but its pointer to identify it for much of its life. Change-Id: Ie76b2268b940467f4cdf5a4ba6f5a54cbb96445d For: VIZ-4377 Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com> Reviewed-by: NThomas Daniel <Thomas.Daniel@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 John Harrison 提交于
There is a general theory that kzmalloc is better/safer than kmalloc, especially for interesting data structures. This change updates the request structure allocation to be zero filled. This also fixes crashes in the reset code. Quoting Mika's patch: "Clean the request structure on alloc. Otherwise we might end up referencing uninitialized fields. This is apparent when we try to cleanup the preallocated request on ring reset, before any request has been submitted to the ring. The request->ctx is foobar and we end up freeing the foobarness." Note that this fixes a regression introduced in commit 9eba5d4a Author: John Harrison <John.C.Harrison@Intel.com> Date: Mon Nov 24 18:49:23 2014 +0000 drm/i915: Ensure OLS & PLR are always in sync References: https://bugs.freedesktop.org/show_bug.cgi?id=86959 References: https://bugs.freedesktop.org/show_bug.cgi?id=86962 References: https://bugs.freedesktop.org/show_bug.cgi?id=86992 Change-Id: I68715ef758025fab8db763941ef63bf60d7031e2 For: VIZ-4377 Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com> Reviewed-by: NThomas Daniel <Thomas.Daniel@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 05 12月, 2014 1 次提交
-
-
由 Michel Thierry 提交于
We already have it for chv, but was missing for bdw. Signed-off-by: NMichel Thierry <michel.thierry@intel.com> Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 03 12月, 2014 11 次提交
-
-
由 Daniel Vetter 提交于
Now that sanity prevails and we have the clean split between software init and starting the engines we can drop all the "have we allocate this struct already?" nonsense. Execlist code could benefit quite a bit more still, but that's for another patch. Reviewed-by: NDave Gordon <david.s.gordon@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com>
-
由 Daniel Vetter 提交于
We can do this. And now there's finally the clean split between software setup and hardware setup I kinda wanted since multi-ring support was merged aeons ago. It only took almost 5 years. Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com> Reviewed-by: NDave Gordon <david.s.gordon@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Daniel Vetter 提交于
With this all the ->init_hw hooks really only set up hw state needed to start the ring, all the software state setup and memory/buffer allocations happen beforehand. v2: We need to call intel_init_pipe_control after the ring init since otherwise engine->dev is NULL and it falls over. Currently that's now after the hw ring is enabled but a) we'll be fine as long as no one submits a batch b) this will change soon. Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com> Reviewed-by: NDave Gordon <david.s.gordon@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Daniel Vetter 提交于
This is (mostly, some exceptions that need fixing) the hw setup function which starts the ring. And not the function which allocates all the resources. Make this clear by giving it a better name. Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com> Reviewed-by: NDave Gordon <david.s.gordon@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Dave Gordon 提交于
There are numerous places in the code where the driver's idea of how much space is left in a ring is updated using the driver's latest notions of the positions of 'head' and 'tail' for the ring. Among them are some that update one or both of these values before (re)doing the calculation. In particular, there are four different places in the code where 'last_retired_head' is copied to 'head' and then set to -1; and two of these do not have a guard to check that it has actually been updated since last time it was consumed, leaving the possibility that the dummy -1 can be transferred from 'last_retired_head' to 'head', causing the space calculation to produce 'impossible' results (previously seen on Android/VLV). This code therefore consolidates all the calculation and updating of these values, such that there is only one place where the ring space is updated, and it ALWAYS uses (and consumes) 'last_retired_head' if (and ONLY if) it has been updated since the last call. Signed-off-by: NDave Gordon <david.s.gordon@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Dave Gordon 提交于
The used space in a ring is given by the cyclic distance from the consumer (HEAD) to the producer (TAIL), i.e. ((tail-head) MOD size); conversely, the available space in a ring is the cyclic distance from the producer to the consumer, MINUS the amount reserved for a "gap" that is supposed to guarantee that the producer never catches up with or overruns the consumer. Note that some GEN h/w requires that TAIL never approach to within one cacheline of HEAD, so the gap is usually set to twice the cacheline size to ensure this. While the existing code gives the correct answer for correct inputs, if the producer HAS overrun into the reserved space, the result can be a value larger than the maximum valid value (size-reserved). We can improve this by reorganising the calculation, so that in the event of overrun the result will be negative rather than over-large. This means that the commonly-used test (available >= required) will then reject further writes into the ring after an overrun, giving some chance that we can recover from or at least diagnose the original problem; whereas allowing more writes would likely both confuse the h/w and destroy the evidence of what went wrong. Signed-off-by: NDave Gordon <david.s.gordon@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 John Harrison 提交于
It makes a lot more sense (and makes future seqno -> request conversion patches simpler) to fill in the 'ring' field of the request structure at the point of creation rather than submission. Given that the request structure is assigned by ring specific code and thus is locked to a ring from the start, there really is no reason to defer this assignment. For: VIZ-4377 Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com> Reviewed-by: NThomas Daniel <Thomas.Daniel@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 John Harrison 提交于
There is no longer any need to retrieve a seqno value from an i915_add_request() call. The calling code already knows which request structure is being processed (it can only be ring->OLR). And as the request itself is now used in preference to the basic seqno value, the latter is now redundant in this situation. For: VIZ-4377 Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com> Reviewed-by: NThomas Daniel <Thomas.Daniel@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Daniel Vetter 提交于
Updated i915_wait_seqno() to take a request structure instead of a seqno value and renamed it accordingly. Internally, it just pulls the seqno out of the request and calls on to __wait_seqno() as before. However, all the code further up the stack is now simplified as it can just pass the request object straight through without having to peek inside. For: VIZ-4377 Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com> Reviewed-by: NThomas Daniel <Thomas.Daniel@intel.com> [danvet: Squash in hunk from an earlier patch which was rebased wrongly.] Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 John Harrison 提交于
The OLS value is now obsolete. Exactly the same value is guarateed to be always available as PLR->seqno. Thus it is safe to remove the OLS completely. And also to rename the PLR to OLR to keep the 'outstanding lazy ...' naming convention valid. For: VIZ-4377 Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com> Reviewed-by: NThomas Daniel <Thomas.Daniel@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 John Harrison 提交于
The plan is to use request structures everywhere that seqno values were previously used. This means saving pointers to structures in places that used to be simple integers. In turn, that means that the target structure now needs much more stringent lifetime tracking. That is, it must not be freed while some other random object still holds a pointer to it. To achieve this tracking, a reference count needs to be added. Whenever a pointer to the structure is saved away, the count must be incremented and the free must only occur when all references have been released. For: VIZ-4377 Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com> Reviewed-by: NThomas Daniel <Thomas.Daniel@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-