- 03 Sep, 2012 8 commits
-
-
Committed by Daniel Vetter
<ickle> danvet: in the force wake, both DRM_ERRORs have the same string.
<ickle> useful for .txt shrinkage, horrible for debugging

Acked-by: Paul Menzel <paulepanter@users.sourceforge.net>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Daniel Vetter
For some odd reason, the vlv forcewake code is rather different from all other platforms, with no clear justification. Adjust things:
- Don't check whether the gt is awake already (and bail out early); we need to grab a forcewake anyway. Otherwise the chip might go to sleep too early. And this would also screw up our forcewake accounting.
- Like all other platforms, check whether the gt has cleared the forcewake bit in the _ACK register before setting it again.
- Use the _MASKED_BIT_ENABLE/DISABLE macros.
- Only use bit0 of the forcewake reg, not all 16 bits.
- Check the gtfifodb reg like on all other platforms in _put.
- Drop the POSTING_READs for consistency.

v2: Failure to git add ... again.
v3: Fixup the spelling fail a bit.

Tested-by: "Purushothaman, Vijay A" <vijay.a.purushothaman@intel.com>
Tested-by: "Widawsky, Benjamin" <benjamin.widawsky@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
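The masked-bit convention mentioned in the list above can be sketched in plain C. This is a minimal userspace model, not driver code: the macro names drop the driver's leading underscore, `FORCEWAKE_KERNEL` is a name chosen here for "bit0 of the forcewake reg", and `masked_write()` is a hypothetical stand-in for how the hardware interprets such a write.

```c
#include <stdint.h>

/* Same shape as the driver's masked-bit helpers: the upper 16 bits of the
 * written value select which of the lower 16 bits get updated. */
#define MASKED_BIT_ENABLE(a)  (((uint32_t)(a) << 16) | (uint32_t)(a))
#define MASKED_BIT_DISABLE(a) ((uint32_t)(a) << 16)

/* Assumed name for illustration: only bit 0 is used, not all 16 bits. */
#define FORCEWAKE_KERNEL 0x1u

/* Hypothetical software model of a masked register write: a bit in the low
 * half changes only if its mask bit (bit n + 16) is set in the value. */
static uint32_t masked_write(uint32_t reg, uint32_t val)
{
	uint32_t mask = val >> 16;
	uint32_t bits = val & 0xffffu;

	return (reg & ~mask) | (bits & mask);
}
```

Because the mask travels with the value, a write that sets or clears bit 0 leaves the other 15 bits alone without a read-modify-write cycle, which is presumably why restricting the code to bit0 is safe.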
-
Committed by Daniel Vetter
We've had and still have too many issues where the gpu turbo doesn't quite do what it's supposed to do (or what we want it to do). Adding a tracepoint to track when the desired gpu frequency changes should help a lot in characterizing and understanding problematic workloads.

Also, this should be fairly interesting for power tuning (and especially noticing when the gpu is stuck in high frequencies, as has happened in the past) and hence for integration into powertop and similar tools.

Cc: Arjan van de Ven <arjan@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Paul Menzel <paulepanter@users.sourceforge.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Daniel Vetter
Like with the equivalent change for gen6+ rps state, this helps in clarifying the code (and in fixing a few places that have fallen through the cracks in the locking review).

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Paulo Zanoni
From Bspec, Vol 2a, Section 1.9.3.4 "PIPE_CONTROL", intro section detailing the various workarounds:

"[DevIVB {W/A}, DevHSW {W/A}]: Pipe_control with CS-stall bit set must be issued before a pipe-control command that has the State Cache Invalidate bit set."

Note that public Bspec has different numbering, it's Vol2Part1, Section 1.10.4.1 "PIPE_CONTROL" there.

There's also a second workaround for the PIPE_CONTROL command itself:

"[DevIVB, DevVLV, DevHSW] {WA}: Every 4th PIPE_CONTROL command, not counting the PIPE_CONTROL with only read-cache-invalidate bit(s) set, must have a CS_STALL bit set"

For simplicity we simply set the CS_STALL bit on every pipe_control on gen7+.

Note that this massively helps on some hsw machines: together with the following patch to unconditionally set the CS_STALL bit on every pipe_control, it prevents a gpu hang every few seconds.

This is a regression that has been introduced in the pipe_control cleanup:

commit 6c6cf5aa
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Fri Jul 20 18:02:28 2012 +0100

drm/i915: Only apply the SNB pipe control w/a to gen6

It looks like the massive snb pipe_control workaround also papered over any issues on ivb and hsw.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
[danvet: squashed both workarounds together, pimped commit message with Bspec citations, regression commit citation and changed the comment in the code a bit to clarify that we unconditionally set CS_STALL to avoid being hurt by trying to be clever.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
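The two workarounds quoted above can be modelled with a small sketch. Everything here is an assumption made for illustration: the `PC_*` flag values are invented and do not match the real PIPE_CONTROL bit layout, and `emit_gen7_flush()` is a hypothetical helper that writes flag words into an array instead of a ring buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values, NOT the real PIPE_CONTROL bit positions. */
#define PC_CS_STALL               (1u << 0)
#define PC_STATE_CACHE_INVALIDATE (1u << 1)
#define PC_RENDER_TARGET_FLUSH    (1u << 2)

/* Writes one flag word per emitted PIPE_CONTROL into out[] (room for two
 * entries is required) and returns the number of commands emitted. */
static size_t emit_gen7_flush(uint32_t *out, uint32_t requested)
{
	size_t n = 0;
	/* Second workaround: stalling on every command trivially satisfies
	 * the "every 4th PIPE_CONTROL" rule without any counting. */
	uint32_t flags = requested | PC_CS_STALL;

	/* First workaround: a stall-only command must precede any command
	 * that has the State Cache Invalidate bit set. */
	if (flags & PC_STATE_CACHE_INVALIDATE)
		out[n++] = PC_CS_STALL;

	out[n++] = flags;
	return n;
}
```

Setting the stall bit unconditionally trades a little throughput for not having to track how many commands were issued, which matches the "avoid being hurt by trying to be clever" remark in the commit message.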
-
Committed by Paulo Zanoni
Since gen 7+ now run the new gen7_render_ring_flush function.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Paulo Zanoni
For now, just a copy of gen6_render_ring_flush. Different gens have different workarounds, so we want different functions.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Daniel Vetter
Otherwise it just won't compile ...

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@gmail.com>
-
- 27 Aug, 2012 2 commits
-
-
Committed by Sedat Dilek
When I pulled today's drm-intel-next into linux-next (next-20120824) I saw this build breakage:

drivers/gpu/drm/i915/i915_gem.c: In function 'i915_gem_object_get_pages_gtt':
drivers/gpu/drm/i915/i915_gem.c:1778:40: error: '__GFP_NO_KSWAPD' undeclared (first use in this function)
drivers/gpu/drm/i915/i915_gem.c:1778:40: note: each undeclared identifier is reported only once for each function it appears in

This is caused by commit ba099ef165f8 ("mm: remove __GFP_NO_KSWAPD") and commit b6beae2c2014 ("mm: remove __GFP_NO_KSWAPD fixes") in linux-next (next-20120824).

Fix this by removing __GFP_NO_KSWAPD from the drm/i915 driver.

Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Daniel Vetter
It blows up. And hopefully this is the root cause of the mysterious rc6-related hang on ilk.

For reference, the commit that enabled rc6 on ilk again is:

commit 456470eb
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Wed Aug 8 23:35:40 2012 +0200

drm/i915: enable rc6 on ilk again

Reported-by: Dave Airlie <airlied@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 25 Aug, 2012 1 commit
-
-
Committed by Chris Wilson
If we need to stall in order to complete the pin_and_fence operation during execbuffer reservation, there is a high likelihood that the operation will be interrupted by a signal (thanks X!). In order to simplify the cleanup along that error path, the object was unconditionally unbound and the error propagated. However, being interrupted here is far more common than I would like and so we can strive to avoid the extra work by eliminating the forced unbind.

v2: In discussion over the indecent colour of the new functions and unwind path, we realised that we can use the new unreserve function to clean up the code even further.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 24 Aug, 2012 12 commits
-
-
Committed by Ben Widawsky
Using the extracted INSTDONE reading, and our new register definitions, update our hangcheck detection and error collection to use it. This primarily means changing == to memcmp, and changing = to memcpy. Hopefully this will give more info on error dump, and provide more accurate hangcheck detection (both are actually TBD).

Also, remove the reading of instdone1 from the ring error collection function, and just crap everything in capture_error_state (that could be split into a separate patch if it wasn't so trivial).

v2: Now assuming i915_get_extra_instdone does the memset we can clean up the code a bit (Jani)
v3: use ARRAY_SIZE as requested earlier by Jani (didn't change sizeof). Updated commit msg.

Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
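The "== to memcmp, = to memcpy" change described above follows from INSTDONE becoming an array of registers rather than a single value. A minimal sketch of that comparison, assuming an array length of 4 and using hypothetical names (`hangcheck_state`, `instdone_stuck`) that are not taken from the driver:

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NUM_INSTDONE_REG 4 /* assumed array length for this sketch */

struct hangcheck_state {
	uint32_t prev_instdone[NUM_INSTDONE_REG];
};

/* Returns true when the sample is identical to the previous one (no
 * visible progress); otherwise stores the new sample and returns false. */
static bool instdone_stuck(struct hangcheck_state *hc,
			   const uint32_t cur[NUM_INSTDONE_REG])
{
	/* Array contents cannot be compared with ==, hence memcmp. */
	if (memcmp(hc->prev_instdone, cur, sizeof(hc->prev_instdone)) == 0)
		return true;

	/* Arrays cannot be assigned with =, hence memcpy. */
	memcpy(hc->prev_instdone, cur, sizeof(hc->prev_instdone));
	return false;
}
```

A caller would sample the registers periodically and treat repeated `true` results as a sign that the GPU may be hung.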
-
Committed by Ben Widawsky
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Ben Widawsky
INSTDONE is used in many places, and it varies from generation to generation. This provides a good reason for us to extract the logic to read the relevant information.

The patch has no functional change. It's prep for some new stuff.

v2: move the memset inside of i915_get_extra_instdone (Jani)
v3,4: bugs caught by Jani

Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Chris Wilson
The principal use for set-to-domain is for userspace to serialise operations with a particular buffer, for example to maintain coherency with a CPU map or to ratelimit its rendering by waiting on all previous operations before continuing. As such we tend to hold the struct_mutex for long periods during the synchronisation and so cause contention issues with other users of the graphics device, even for independent operations such as memory management. An example is the contention between compiz and X which causes jitter in the display and a drop in peak throughput.

The ultimate solution would be a set of fine-grained locks and lockless operations, but an intermediate step is to first attempt the synchronisation for set-to-domain without holding the mutex. This introduces a number of race conditions, so we limit its use to the ioctl periphery where we have no dependent state and can safely complete with a locked synchronisation afterwards.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Chris Wilson
Move the wait-for-rendering logic around in the file so that we can group it together with the subsequent variations. The general goal is to have the lower level routines clustered together and then the higher level logic building upon those low level routines that came before.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Chris Wilson
This prevents the case of unbinding the object in order to process the relocations through the GTT and then rebinding it only to then proceed to use cpu relocations as the object is now in the CPU write domain. By choosing to use cpu relocations up front, we can therefore avoid the rebind penalty.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Chris Wilson
As we wish to create specialised object constructions in the near future that share the same basic GEM object struct, export the default initializer.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Chris Wilson
If the object has no backing shmemfs filp, then we obviously cannot perform a truncation operation upon it.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Chris Wilson
Avoid stalling and waiting for the GPU by checking to see if there is sufficient inactive space in the aperture for us to bind the buffer prior to writing through the GTT. If there is inadequate space we will have to stall waiting for the GPU, and incur overheads moving objects about. Instead, only incur the clflush overhead on the target object by writing through shmem.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Jani Nikula
Neither the drm core nor any of the drivers really need the raw_edid field of struct drm_display_info for anything. Instead of being useful, it creates confusion about who is responsible for freeing the memory it points to and setting the field to NULL afterwards, leading to memory leaks and dangling pointers.

Remove the raw_edid field, and fix drivers as necessary.

Reported-by: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
-
Committed by Jani Nikula
The EDID returned by drm_get_edid() was never freed.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
-
Committed by Tejun Heo
This is an equivalent conversion and will ease scheduled removal of WQ_NON_REENTRANT.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 23 Aug, 2012 2 commits
-
-
Committed by Ben Widawsky
ERR_INT on HSW will display unclaimed MMIO accesses. This can be either the result of a driver bug writing to an invalid address, or the result of RC6.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Tested-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Antti Koskipaa <antti.koskipaa@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Ben Widawsky
ERR_INT can generate interrupts. However, since most of the conditions seem quite fatal, the patch opts to simply report it in error state instead of adding more complexity to the interrupt handler for little gain (the bits are sticky anyway).

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Tested-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Antti Koskipaa <antti.koskipaa@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 22 Aug, 2012 1 commit
-
-
Committed by Chris Wilson
This addresses WaPruneModeWithIncorrectHsyncOffset.

Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=50236
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 21 Aug, 2012 7 commits
-
-
Committed by Xu, Anhua
In intel_dp_mode_set we OR in the exact same bits twice at the same spot. Kill one of the redundant assignments.

This little regression was introduced by:

commit 417e822d
Author: Keith Packard <keithp@keithp.com>
Date: Tue Nov 1 19:54:11 2011 -0700

drm/i915: Treat PCH eDP like DP in most places

PCH eDP has many of the same needs as regular PCH DP connections, including the DP_CTl bit settings, the TRANS_DP_CTL register.

The reachable tag for this commit is: v3.1-5461-g417e822d

Signed-off-by: Anhua Xu <anhua.xu@intel.com>
[danvet: Improved the commit message somewhat and ensured the diff is clearer.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Chris Wilson
Given the persistence of an offset for the lifetime of an object, it is easy to contemplate how the mmap space becomes badly fragmented to the point that further allocations fail with ENOSPC. Our only recourse at this point is to try to purge the objects to release some space and reattempt the allocation.

References: https://bugs.freedesktop.org/show_bug.cgi?id=39552
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
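The purge-and-retry idea described above can be sketched as a small userspace model. All names here (`mmap_space`, `alloc_offset`, `purge_objects`, `create_mmap_offset`) are hypothetical and the slot counters only stand in for real address-space bookkeeping; the sketch shows the control flow, not the driver's implementation.

```c
#include <errno.h>

/* Toy model: free_slots is the space available now, purgeable_slots is
 * the space that purging objects would release. */
struct mmap_space {
	int free_slots;
	int purgeable_slots;
};

static int alloc_offset(struct mmap_space *s)
{
	if (s->free_slots == 0)
		return -ENOSPC;
	s->free_slots--;
	return 0;
}

static void purge_objects(struct mmap_space *s)
{
	s->free_slots += s->purgeable_slots;
	s->purgeable_slots = 0;
}

/* Try once; on ENOSPC purge and reattempt exactly once more. */
static int create_mmap_offset(struct mmap_space *s)
{
	int ret = alloc_offset(s);

	if (ret != -ENOSPC)
		return ret;

	purge_objects(s);
	return alloc_offset(s);
}
```

Retrying only once keeps the failure path bounded: if purging frees nothing, the caller still sees ENOSPC.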
-
Committed by Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Chris Wilson
A pair of universally true checks that just need to be put in the right place depending on where in the patch sequence you go. Note that i915_gem_object_put_pages_gtt() already gains the BUG_ON(obj->gtt_space), but on reflection that needed to migrate to put_pages().

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Chris Wilson
When dealing with a working set larger than the GATT, or even the mappable aperture when touching through the GTT, we end up evicting objects only to rebind them at a new offset again later. Moving an object into and out of the GTT requires clflushing the pages, thus causing a double-clflush penalty for rebinding.

To avoid having to clflush on rebinding, we can track the pages as they are evicted from the GTT and only relinquish those pages on memory pressure.

As usual, if it were not for the handling of the out-of-memory condition and having to manually shrink our own bo caches, it would be a net reduction of code. Alas.

Note: The patch also contains a few changes to the last-hope evict_everything logic in i915_gem_execbuffer.c - we no longer try to only evict the purgeable stuff in a first try (since that's superfluous and only helps in OOM corner-cases, not fragmented-gtt thrashing situations).

Also, the extraction of the get_pages retry loop from bind_to_gtt (and other callsites) to get_pages should imo have been a separate patch.

v2: Ditch the newly added put_pages (for unbound objects only) in i915_gem_reset. A quick irc discussion hasn't revealed any important reason for this, so if we need this, I'd like to have a git blame'able explanation for it.
v3: Undo the s/drm_malloc_ab/kmalloc/ in get_pages that Chris noticed.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Split out code movements and rant a bit in the commit message with a few Notes. Done v2]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Daniel Vetter
James Bottomley reported [1] a massive power regression, due to the enabling of semaphores by default in 3.5. A workaround for him is to again disable semaphores. And indeed, his system has a very hard time entering rc6 with semaphores enabled.

Ben Widawsky ran around with a kill-a-watt a lot and noticed:
- There are indeed a few rare systems that seem to have a hard time entering rc6 when desktop-idle.
- One machine, The Indestructible Toshiba, regressed in this behaviour between 3.5 and 3.6 in a merge commit! So rc6 behaviour with the current setting seems to be highly timing dependent and not robust at all.
- The behaviour James reported wrt semaphores seems to be a freak timing thing that only happens on his specific machine, confirming that enabling semaphores shouldn't reduce rc6 residency.

Now furthermore the Google ChromeOS guys reported [2] a while ago that at least on some machines simply a blinking cursor can keep the gpu turbo at the highest frequency. This is because the current rps limits used on snb/ivb are highly asymmetric.

On the theory that gpu turbo and rc6 tuning values are related, we've tried whether the much saner looking (since much less asymmetric) rps tuning values used for hsw would also help entering rc6 more robustly.

And it seems to mostly work, and we don't really have the resources to thoroughly tune things in any better way: The values from the ChromeOS ppl seem to fare a bit worse for James' machine, so I guess we better stick with something vpg (the gpu hw/windows group) provided, hoping that they've done their jobs.

Reference[1]: http://lists.freedesktop.org/archives/dri-devel/2012-July/025675.html
Reference[2]: http://lists.freedesktop.org/archives/intel-gfx/2012-July/018692.html
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53393
Tested-by: Ben Widawsky <ben@bwidawsk.net>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 20 Aug, 2012 1 commit
-
-
Committed by Daniel Vetter
Prep work to make Chris Wilson's unbound tracking patch a bit easier to read. Alas, I'd have preferred that moving the page allocation retry loop from bind to get_pages would have been a separate patch, too. But that looks like real work ;-)

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 17 Aug, 2012 6 commits
-
-
Committed by Wang Xingchao
Added new haswell_write_eld() to initialize Haswell HDMI audio registers to generate an unsolicited response to the audio controller driver to indicate that the controller sequence should start.

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Wang Xingchao <xingchao.wang@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Dave Airlie
In order for udl vmap to work properly, we need to push the object into the CPU domain before we start copying the data to the USB device.

This, along with the udl change, avoids the need for userspace to use explicit mapping.

v2: add a flag for userspace to query to know if the Intel kernel driver can deal with the vmap flushing properly. In theory udl would need a flag also, but I intend to push the patches very close to each other and other drivers should do the right thing from the start.

I've added a test to my intel-gpu-tools prime branch; however, testing this is a bit messy since the only way to get udl to vmap is to render something. I've tested this with real code as well to make sure it works.

Signed-off-by: Dave Airlie <airlied@redhat.com>
[danvet: resolved conflict, which required reallocating the PARAM number to 21.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Keith Packard
This is left over from the old PLL sharing code and isn't useful now that PLLs are shared when possible.

Signed-off-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Damien Lespiau
New-ish devices have 3 pipes, so let's not just hardcode 2 but use the for_each_pipe() macro and make sure struct intel_display_error_state is big enough.

V2: Also add the number of pipes emitted (Chris Wilson)

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
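The change from a hardcoded pipe count to an iteration macro might look roughly like the sketch below. This is an illustration under stated assumptions: `for_each_pipe_n()` is a hypothetical two-argument variant (the driver's `for_each_pipe()` is not reproduced here), and `display_error_state`, `pipe_conf` and `MAX_PIPES` are names invented for the example.

```c
#include <stdint.h>

#define MAX_PIPES 3 /* assumed upper bound for this sketch */

/* Hypothetical variant that takes the pipe count explicitly. */
#define for_each_pipe_n(p, num_pipes) \
	for ((p) = 0; (p) < (num_pipes); (p)++)

struct display_error_state {
	int num_pipes;                 /* recorded so a dump can report it */
	uint32_t pipe_conf[MAX_PIPES]; /* sized for the largest device */
};

/* Copies one register value per pipe; num_pipes must not exceed
 * MAX_PIPES. */
static void capture_display_state(struct display_error_state *err,
				  const uint32_t *pipe_regs, int num_pipes)
{
	int p;

	err->num_pipes = num_pipes;
	for_each_pipe_n(p, num_pipes)
		err->pipe_conf[p] = pipe_regs[p];
}
```

Sizing the array for the maximum and storing the actual count alongside it means a two-pipe device and a three-pipe device can share the same structure and dump code.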
-
Committed by Wang Xingchao
Use the _PIPE macro to get the correct register definition for IBX/CPT, and discard the old variable "i" approach.

Signed-off-by: Wang Xingchao <xingchao.wang@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
[danvet: Added the DIP_PORT_SEL #define from a preceding patch in the series that needs more work.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Wang Xingchao
HDMI audio related registers will be configured in the write_eld callback.

Signed-off-by: Wang Xingchao <xingchao.wang@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-