- 10 4月, 2015 22 次提交
-
-
由 Chris Wilson 提交于
I woke up one morning and found 50k objects sitting in the batch pool and every search seemed to iterate the entire list... Painting the screen in oils would provide a more fluid display. One issue with the current design is that we only check for retirements on the current ring when preparing to submit a new batch. This means that we can have thousands of "active" batches on another ring that we have to walk over. The simplest way to avoid that is to split the pools per ring and then our LRU execution ordering will also ensure that the inactive buffers remain at the front. v2: execlists still requires duplicate code. v3: execlists requires more duplicate code Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Chris Wilson 提交于
Move the madvise logic out of the execbuffer main path into the relatively rare allocation path, making the execbuffer manipulation less fragile. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Chris Wilson 提交于
In the next patch, I want to use the structure elsewhere and so require it defined earlier. Rather than move the definition to an earlier location where it feels very odd, place it in its own header file. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Chris Wilson 提交于
This reverts commit ec5cc0f9 Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Thu Jun 12 10:28:55 2014 +0100 drm/i915: Restrict GPU boost to the RCS engine The premise that media/blitter workloads are not affected by boosting is patently false with a trip through igt. The question that remains is what exactly is going wrong with the media workload that prompted this? Hopefully that would be fixed by the missing agressive downclocking, in addition to the extra restrictions imposed on how frequent a process is allowed to boost. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Deepak S <deepak.s@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll> Acked-by: NDeepak S <deepak.s@linux.intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Chris Wilson 提交于
With boosting for missed pageflips, we have a much stronger indication of when we need to (temporarily) boost GPU frequency to ensure smooth delivery of frames. So now only allow each client to perform one RPS boost in each period of GPU activity due to stalling on results. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Deepak S <deepak.s@linux.intel.com> Reviewed-by: NDeepak S <deepak.s@linux.intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Chris Wilson 提交于
If we hit a vblank and see that have a pageflip queue but not yet processed, ensure that the GPU is running at maximum in order to clear the backlog. Pageflips are only queued for the following vblank, if we miss it, there will be a visible stutter. Boosting the GPU frequency doesn't prevent us from missing the target vblank, but it should help the subsequent frames hitting theirs. v2: Reorder vblank vs flip-complete so that we only check for a missed flip after processing the completion events, and avoid spurious boosts. v3: Rename missed_vblank v4: Rebase v5: Cancel the outstanding work in runtime suspend v6: Rebase v7: Rebase required fixing Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Deepak S<deepak.s@linux.intel.com> Reviewed-by: Deepak S<deepak.s@linux.intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Chris Wilson 提交于
The issue is that by computing the last_adj value after applying the clamping, we can end up with a bogus value for feeding into the next RPS autotuning step. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Deepak S <deepak.s@linux.intel.com> Reviewed-by: NDeepak S <deepak.s@linux.intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Chris Wilson 提交于
Reuse the same reclocking strategy for Baytail as on its bigger brethren, Sandybridge and Ivybridge. In particular, this makes the device quicker to reclock (both up and down) though the tendency now is to downclock more aggressively to compensate for the RPS boosts. v2: Rebase v3: Exclude Cherrytrail as Deepak was concerned that the increased number of register writes would wake the common powerwell too often. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Deepak S <deepak.s@linux.intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: NDeepak S <deepak.s@linux.intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Chris Wilson 提交于
Currently we emit semaphore synchronisation as if we were going to flip using the target CS engine, but we then change our minds and do the flip using the CPU. Consequently we write instructions to the ring but never use them - even to the point of filling that ring up entirely and never submitting a request. The wrinkle in the ointment is that we have to tell a white lie to pin-to-display for it to skip the synchronisation for mmioflips as we will create a task specifically for that slow synchronisation. An oddity of note is the discrepancy in requests that we tell to pin-display to serialise to and that we then eventually wait upon. This is due to a limitation in the i915_gem_object_sync() routine that will be lifted later. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Chris Wilson 提交于
The biggest user of i915_gem_object_get_page() is the relocation processing during execbuffer. Typically userspace passes in a set of relocations in sorted order. Sadly, we alternate between relocations increasing from the start of the buffers, and relocations decreasing from the end. However the majority of consecutive lookups will still be in the same page. We could cache the start of the last sg chain, however for most callers, the entire sgl is inside a single chain and so we see no improve from the extra layer of caching. v2: Avoid the double increment inside unlikely() References: https://bugs.freedesktop.org/show_bug.cgi?id=88308Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Damien Lespiau 提交于
Signed-off-by: NDamien Lespiau <damien.lespiau@intel.com> Reviewed-by: NRodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Damien Lespiau 提交于
Both WaDisableSDEUnitClockGating and WaSetGAPSunitClckGateDisable are needed on B0 as well. Signed-off-by: NDamien Lespiau <damien.lespiau@intel.com> Reviewed-by: NRodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Arun Siluvery 提交于
According to Spec this is a reserved bit for Gen9+ and should not be set. Change-Id: I0215fb7057b94139b7a2f90ecc7a0201c0c93ad4 Signed-off-by: NArun Siluvery <arun.siluvery@linux.intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Maarten Lankhorst 提交于
Signed-off-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
For the conversion to atomic. The pre_enable() hooks are called as part of the crtc enable sequence, at which point the staged config was already made effective. Furthermore, the function actually changes hardware state, so it should anyway deal with current and not staged config. Signed-off-by: NAnder Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
Reduce dependency on the staged config by using the atomic state instead. Signed-off-by: NAnder Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
Reduce dependency on the staged config by using the atomic state instead. Signed-off-by: NAnder Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
It's not needed anymore, now that all the users were converted to using an atomic state. Signed-off-by: NAnder Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
Move towards atomic by using the atomic state instead. Signed-off-by: NAnder Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
Now that we use a drm atomic state for the legacy modeset, it is possible to get rid of the usage of intel_crtc->new_config in the function intel_mode_max_pixclk(). Signed-off-by: NAnder Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Ville Syrjälä 提交于
../drivers/gpu/drm/i915/intel_pm.c:3185:45: warning: Initializer entry defined twice ../drivers/gpu/drm/i915/intel_pm.c:3185:52: also defined here Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Chris Wilson 提交于
Sometimes userspace wants a true overlay that is never clipped. In such cases, we need to disable the destination colorkey. However, it is currently unconditionally enabled in the overlay with no means of disabling. So rectify that by always default to on, and extending the UPDATE_ATTR ioctl to support explicit disabling of the colorkey. This is contrast to the spite code which requires explicit enabling of either the destination or source colorkey. Handling source colorkey is still todo for the overlay. (Of course it may be worth migrating overlay to sprite before then.) Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 07 4月, 2015 3 次提交
-
-
由 Jani Nikula 提交于
Occasionally it would be interesting to read some of the DPCD registers for debug purposes, without having to resort to logging. Add an i915 specific i915_dpcd debugfs file for DP and eDP connectors to dump parts of the DPCD. Currently the DPCD addresses to be dumped are statically configured, and more can be added trivially. The implementation also makes it relatively easy to add other i915 and connector specific debugfs files in the future, as necessary. This is currently i915 specific just because there's no generic way to do AUX transactions given just a drm_connector. However it's all pretty straightforward to port to other drivers. v2: Add more DPCD registers to dump. Signed-off-by: NJani Nikula <jani.nikula@intel.com> Reviewed-by: NBob Paauwe <bob.j.paauwe@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Rodrigo Vivi 提交于
Program the default initial value of the L3SqcReg1 on BDW for performance v2: Default confirmed and using intel_ring_emit_wa as Mika pointed out. v3: Spec shows now a different value. It tells us to set to 0x784000 instead the 0x610000 that is there already. Also rebased after a long time so using WA_WRITE now. Cc: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Sonika Jindal 提交于
We make use of HW tracking for Selective update region and enable frame sync on sink. We use hardware's hardcoded data values for frame sync and GTC. v2: Add 3200x2000 resolution restriction with PSR2, move psr2_support to i915_psr struct, add aux_frame_sync to independently control aux frame sync, rename the TP2 TIME macro for 2500us (Rodrigo, Siva) v3: Moving the resolution restriction to intel_psr_enable so that we check it only once(Durga) Cc: Durgadoss R <durgadoss.r@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: NSonika Jindal <sonika.jindal@intel.com> Reviewed-by: NDurgadoss R <durgadoss.r@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 01 4月, 2015 10 次提交
-
-
由 Chris Wilson 提交于
Count the number of requests in a ring for the user and show who submitted them. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
The best_encoder field of connector_state wasn't properly set when a connector was being disabled, leading to an incosistent atomic state. For now, this doesn't cause anything to blow up, because everywhere we're using connector_state->best_encoder there is a check for connector_state->crtc which is properly initialized. I reached the issue while testing some patches I haven't sent out yet, that remove the usage of intel_connector->new_encoder from check_digital_port_conflicts(). In that case, it would be possible to trigger the converted version of the WARN in that function. Signed-off-by: NAnder Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> [danvet: Add commit message augmentation Ander supplied.] Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Jani Nikula 提交于
This will be helpful for adding future platforms. It is better to keep the information in the single point of truth (the table) instead of duplicating it into the validity function. While at it, add dev_priv parameter to the function, also to prepare for adding future platform support. Signed-off-by: NJani Nikula <jani.nikula@intel.com> Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Jani Nikula 提交于
Index the gmbus tables directly using the pin instead of having a confusing "port = i + 1" mapping. This finishes off removing the "gmbus port" as a notion, and leaves us with just the "gmbus pin". As pin 0 is invalid by definition and the gmbus tables will have a gap at that index, add pin validity check to all the loops. This will be benefitial for supporting platforms that have different numbers of pins, or gaps. v2: s/GMBUS_PIN_MAX/GMBUS_NUM_PINS/ (Ville, Daniel) Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: NJani Nikula <jani.nikula@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Jani Nikula 提交于
Rename intel_gmbus_is_port_valid to intel_gmbus_is_valid_pin, and rename port parameters to pin as well. This matches usage all around, as usually a pin is passed to the validity check function. No functional changes. Signed-off-by: NJani Nikula <jani.nikula@intel.com> Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Jani Nikula 提交于
The specs refer to pin pairs. Start moving towards using pin rather than port all around to avoid confusion. No functional changes. Signed-off-by: NJani Nikula <jani.nikula@intel.com> Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 John Harrison 提交于
The legacy and LRC code paths have an almost identical procedure for waiting for space in the ring buffer. They both search for a request in the free list that will advance the tail to a point where sufficient space is available. They then wait for that request, retire it and recalculate the free space value. Unfortunately, a bug in the LRC side meant that the resulting free space might not be as large as expected and indeed, might not be sufficient. This is because it was testing against the value of request->tail not request->postfix. Whereas, when a request is retired, ringbuf->tail is updated to req->postfix not req->tail. Another significant difference between the two is that the LRC one did not trust the wait for request to work! It redid the is there enough space available test and would fail the call if insufficient. Whereas, the legacy version just said 'return 0' - it assumed the preceeding code works. This difference meant that the LRC version still worked even with the bug - it just fell back to the polling wait path. For: VIZ-5115 Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com> Reviewed-by: NThomas Daniel <thomas.daniel@intel.com> Reviewed-by: NTomas Elf <tomas.elf@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 John Harrison 提交于
The request allocation code is largely duplicated between legacy mode and execlist mode. The actual difference between the two versions of the code is pretty minimal. This patch moves the common code out into a separate function. This is then called by the execution specific version prior to setting up the one different value. For: VIZ-5190 Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com> Reviewed-by: NTomas Elf <tomas.elf@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 John Harrison 提交于
The only usage of intel_logical_ring_begin() is within intel_lrc.c so it can be made static. To avoid a forward declaration at the top of the file, it and bunch of other functions have been shuffled upwards. For: VIZ-5115 Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com> Reviewed-by: NTomas Elf <tomas.elf@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 John Harrison 提交于
The submission portion of the execbuffer code path was abstracted into a function pointer indirection as part of the legacy vs execlist work. The two implementation functions are called 'i915_gem_ringbuffer_submission' and 'intel_execlists_submission' but the pointer was called 'do_execbuf'. There is already a 'i915_gem_do_execbuffer' function (which is what calls the pointer indirection). The name of the pointer is therefore considered to be backwards and should be changed. This patch renames it to 'execbuf_submit' which is hopefully a bit clearer. For: VIZ-5115 Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com> Reviewed-by: NTomas Elf <tomas.elf@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 31 3月, 2015 5 次提交
-
-
由 Ville Syrjälä 提交于
Unify the HSW/BDW/SKL cdclk extraction code to conform to the same .get_display_clock_speed() mold that all the other platforms use. v2: Update due to SKL code getting added v3: Rebase on top of -nightly (introduction of intel_audio.c) (Mika Kahola) Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: NMika Kahola <mika.kahola@intel.com> Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com> [danvet: Add v3 note as suggested by Damien.] Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Ville Syrjälä 提交于
Now that we are "extracting" the cdclk frequency on ILK-IVB we can also simplify ilk_get_aux_clock_divider() to calculate the divider based on cdclk instead of hardcoding the values. Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: NMika Kahola <mika.kahola@intel.com> Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Ville Syrjälä 提交于
We don't currently have cdclk extraction code for 965g,snb,ivb. Let's assume 400 MHz until we know better. That seems to match hints in various vague documents. Whether that's good enough is not entirely clear. Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: NMika Kahola <mika.kahola@intel.com> Acked-by: NDamien Lespiau <damien.lespiau@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Ville Syrjälä 提交于
Based on the BIOS DP A AUX 2x clock divider the cdclk frequency on ILK is 450Mhz. At least that holds on my ILK and it matches how we program the divider. Supposedly cdclk is 400MHz on SNB and IVB, again based on the AUX 2x clock divider. Note that I don't have a SNB or IVB machine with eDP so I couldn't verify what the BIOS used, so this notion is purely based on our current code, Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: NMika Kahola <mika.kahola@intel.com> Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Ville Syrjälä 提交于
Fill out the lower three digits for gen2 and gen3 cdclk frqeuncy. It's not clear if these are accurate frquencies or just in the ballpark, but without docs this is the best we can do. Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: NMika Kahola <mika.kahola@intel.com> Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-