- 15 1月, 2013 1 次提交
-
-
由 Chris Wilson 提交于
These are useful for investigating hangs involving WAIT_FOR_EVENT. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> [danvet: Apply a droplet of Future-Proof in the if-ladder.] Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 06 12月, 2012 1 次提交
-
-
由 Chris Wilson 提交于
Before queuing the flip but crucially after attaching the unpin-work to the crtc, we continue to setup the unpin-work. However, should the hardware fire early, we see the connected unpin-work and queue the task. The task then promptly runs and unpins the fb before we finish taking the required references or even pinning it... Havoc. To close the race, we use the flip-pending atomic to indicate when the flip is finally setup and enqueued. So during the flip-done processing, we can check more accurately whether the flip was expected. v2: Add the appropriate mb() to ensure that the writes to the page-flip worker are complete prior to marking it active and emitting the MI_FLIP. On the read side, the mb should be enforced by the spinlocks. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org [danvet: Review the barriers a bit, we need a write barrier both before and after updating ->pending. Similarly we need a read barrier in the interrupt handler both before and after reading ->pending. With well-ordered irqs only one barrier in each place should be required, but since this patch explicitly sets out to combat spurious interrupts with is staged activation of the unpin work we need to go full-bore on the barriers, too. Discussed with Chris Wilson on irc and changes acked by him.] Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 29 11月, 2012 1 次提交
-
-
由 Chris Wilson 提交于
Should be useful to know what the driver thought the other ring's seqno was when it last used a semaphore. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 12 11月, 2012 3 次提交
-
-
由 Ben Widawsky 提交于
Fixes a WARN_ON in igt/tests/debugfs_reader CC: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: NBen Widawsky <ben@bwidawsk.net> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Jesse Barnes 提交于
This allows the power related code to run independently of the rest of the pipeline, extending the resume and init time improvements into userspace, which would otherwise have been blocked on the struct mutex if we were doing PCU communication. v2: Also convert the locking for the rps sysfs interface. Suggested-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> (v1) Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Daniel Vetter 提交于
Reviewed-by: NJesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 16 10月, 2012 2 次提交
-
-
由 Ben Widawsky 提交于
CC: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: NBen Widawsky <ben@bwidawsk.net> Reviewed-by: NJesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Ben Widawsky 提交于
There is a special mechanism for communicating with the PCU already being used for the ring frequency stuff. As we'll be needing this for other commands, extract it now to make future code less error prone and the current code more reusable. I'm not entirely sure if this code matches 1:1 with the previous code behaviorally. Functionally however, it should be the same. CC: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: NBen Widawsky <ben@bwidawsk.net> Reviewed-by: NJesse Barnes <jbarnes@virtuousgeek.org> [danvet: Fixup compile fail reported by Wu Fengguang.] Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 03 10月, 2012 2 次提交
-
-
由 David Howells 提交于
Convert #include "..." to #include <path/...> in drivers/gpu/. Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: NDave Airlie <airlied@redhat.com> Acked-by: NArnd Bergmann <arnd@arndb.de> Acked-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: NDave Jones <davej@redhat.com>
-
由 David Howells 提交于
Remove redundant DRM UAPI header #inclusions from drivers/gpu/. Remove redundant #inclusions of core DRM UAPI headers (drm.h, drm_mode.h and drm_sarea.h). They are now #included via drmP.h and drm_crtc.h via a preceding patch. Without this patch and the patch to make include the UAPI headers from the core headers, after the UAPI split, the DRM C sources cannot find these UAPI headers because the DRM code relies on specific -I flags to make #include "..." work on headers in include/drm/ - but that does not work after the UAPI split without adding more -I flags. Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: NDave Airlie <airlied@redhat.com> Acked-by: NArnd Bergmann <arnd@arndb.de> Acked-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: NDave Jones <davej@redhat.com>
-
- 20 9月, 2012 1 次提交
-
-
由 Ben Widawsky 提交于
Magic numbers are bad mmmkay. In this case in particular the value is especially weird because the docs say multiple things. We'll need this value for sysfs, so extracting it is useful for that as well. Reviewed-by: NJesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NBen Widawsky <ben@bwidawsk.net> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 03 9月, 2012 1 次提交
-
-
由 Chris Wilson 提交于
For code consolidation and future maintainability. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-and-Tested-by: NBen Widawsky <ben@bwidawsk.net> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 24 8月, 2012 1 次提交
-
-
由 Ben Widawsky 提交于
Using the extracted INSTDONE reading, and our new register definitions, update our hangcheck detection and error collection to use it. This primarily means changing == to memcmp, and changing = to memcpy. Hopefully this will give more info on error dump, and provide more accurate hangcheck detection (both are actually TBD). Also, remove the reading of instdone1 from the ring error collection function, and just crap everything in capture_error_state (that could be split into a separate patch if it wasn't so trivial). v2: Now assuming i915_get_extra_instdone does the memset we can clean up the code a bit (Jani) v3: use ARRAY_SIZE as requested earlier by Jani (didn't change sizeof) Updated commit msg Cc: Jani Nikula <jani.nikula@intel.com> Signed-off-by: NBen Widawsky <ben@bwidawsk.net> Reviewed-by: NJani Nikula <jani.nikula@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 23 8月, 2012 1 次提交
-
-
由 Ben Widawsky 提交于
ERR_INT can generate interrupts. However since most of the conditions seem quite fatal the patch opts to simply report it in error state instead of adding more complexity to the interrupt handler for little gain (the bits are sticky anyway). Signed-off-by: NBen Widawsky <ben@bwidawsk.net> Reviewed-by: NPaulo Zanoni <paulo.r.zanoni@intel.com> Tested-by: NPaulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: NAntti Koskipaa <antti.koskipaa@linux.intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 21 8月, 2012 3 次提交
-
-
由 Chris Wilson 提交于
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Chris Wilson 提交于
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Chris Wilson 提交于
When dealing with a working set larger than the GATT, or even the mappable aperture when touching through the GTT, we end up with evicting objects only to rebind them at a new offset again later. Moving an object into and out of the GTT requires clflushing the pages, thus causing a double-clflush penalty for rebinding. To avoid having to clflush on rebinding, we can track the pages as they are evicted from the GTT and only relinquish those pages on memory pressure. As usual, if it were not for the handling of out-of-memory condition and having to manually shrink our own bo caches, it would be a net reduction of code. Alas. Note: The patch also contains a few changes to the last-hope evict_everything logic in i916_gem_execbuffer.c - we no longer try to only evict the purgeable stuff in a first try (since that's superflous and only helps in OOM corner-cases, not fragmented-gtt trashing situations). Also, the extraction of the get_pages retry loop from bind_to_gtt (and other callsites) to get_pages should imo have been a separate patch. v2: Ditch the newly added put_pages (for unbound objects only) in i915_gem_reset. A quick irc discussion hasn't revealed any important reason for this, so if we need this, I'd like to have a git blame'able explanation for it. v3: Undo the s/drm_malloc_ab/kmalloc/ in get_pages that Chris noticed. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> [danvet: Split out code movements and rant a bit in the commit message with a few Notes. Done v2] Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 12 8月, 2012 1 次提交
-
-
由 Daniel Vetter 提交于
Since forcewake is now protected by a spinlock, we don't need to grab dev->struct_mutex any more. This way we can also get rid of a stale comment, noticed by Ben Widawsky while reviewing some locking changes. v2: Kill the unused variable ret, noticed by Fengguang Wu. Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 10 8月, 2012 5 次提交
-
-
由 Chris Wilson 提交于
Avoid the forcewake overhead when simply retiring requests, as often the last seen seqno is good enough to satisfy the retirment process and will be promptly re-run in any case. Only ensure that we force the coherent seqno read when we are explicitly waiting upon a completion event to be sure that none go missing, and also for when we are reporting seqno values in case of error or debugging. This greatly reduces the load for userspace using the busy-ioctl to track active buffers, for instance halving the CPU used by X in pushing the pixels from a software render (flash). The effect will be even more magnified with userptr and so providing a zero-copy upload path in that instance, or in similar instances where X is simply compositing DRI buffers. v2: Reverse the polarity of the tachyon stream. Daniel suggested that 'force' was too generic for the parameter name and that 'lazy_coherency' better encapsulated the semantics of it being an optimization and its purpose. Also notice that gen6_get_seqno() is only used by gen6/7 chipsets and so the test for IS_GEN6 || IS_GEN7 is redundant in that function. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Daniel Vetter 提交于
This way it's easier so see what belongs together, and what is used by the ilk ips code. Also add some comments that explain the locking. Note that (cur|min|max)_delay need to be duplicated, because they're also used by the ips code. v2: Missed one place that the dev_priv->ips change caught ... Reviewed-by: NBen Widawsky <ben@bwidawsk.net> Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Daniel Vetter 提交于
It's no fun if your shell hangs when the driver has gone on vacation and you want to know why ... Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com> Signed-Off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Daniel Vetter 提交于
- Take the dev->struct_mutex around access the corresponding state (and adjusting the rps hw state). - Add an assert to gen6_set_rps to ensure we don't forget about this in the future. - Don't set up the min/max_freq files if it doesn't apply to the hw. And do the same for the gen6+ cache sharing file while at it. v2: Move the gen6+ checks into the read/write callbacks. Thanks to the awesome drm midlayer we can't check that when registering the debugfs files, because the driver is not yet fully set up, specifically the ->load callback hasn't run yet. Oh how I despise this disaster ... v3: Also add a WARN_ON(mutex_is_locked) in set_rps to check the locking. v4: Use mutex_lock_interruptible, suggested by Chris Wilson. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> (for v2) Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Daniel Vetter 提交于
Handy for lazy people like me, or when people forget to add the output of lspci -nn. v2: Chris Wilson noticed that we have this duplicated already in the i915_capabilites debugfs file. But there \n as separator looks better, which would be a bit verbose in dmesg. Abuse the preprocessor to extract this all. Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 26 7月, 2012 2 次提交
-
-
由 Chris Wilson 提交于
As we guarantee to emit a flush before emitting the breadcrumb or the next batchbuffer, there is no further need for the flushing list. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Chris Wilson 提交于
As we always flush the GPU cache prior to emitting the breadcrumb, we no longer have to worry about the deferred flush causing the pending_gpu_write to be delayed. So we can instead utilize the known last_write_seqno to hopefully minimise the wait times. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 20 7月, 2012 1 次提交
-
-
由 Chris Wilson 提交于
Having had to dive into the bspec to understand what each stage of the workaround meant, and how that the ring broadcasting IDLE corresponded with the GT powering down the ring (i.e. rc6) add comments to aide the next reader. And since the register "is used to control all aspects of PSMI and power saving functions" that makes it quite interesting to inspect with regards to RC6 hangs, so add it to the error-state. v2: Rediscover the piece of magic, set the RNCID to 0 before waiting for the ring to wake up. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 14 6月, 2012 1 次提交
-
-
由 Ben Widawsky 提交于
Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
-
- 05 6月, 2012 1 次提交
-
-
由 Jesse Barnes 提交于
This makes for easier benchmarking and testing. One can set a fixed frequency by setting min and max to the same value. v2: fix whitespace & comment (Eugeni) Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: NEugeni Dodonov <eugeni@dodonov.net> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 22 5月, 2012 1 次提交
-
-
由 Daniel Vetter 提交于
We need to remove the debugfs file. Regression introduce in commit d5442303 Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Fri Apr 27 15:17:40 2012 +0200 drm/i915: allow the existing error_state to be destroyed Reported-and-Tested-by: NJesse Barnes <jbarnes@virtuousgeek.org> Signed-Off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 20 5月, 2012 1 次提交
-
-
由 Chris Wilson 提交于
In many places we wish to iterate over the rings associated with the GPU, so refactor them to use a common macro. Along the way, there are a few code removals that should be side-effect free and some rearrangement which should only have a cosmetic impact, such as error-state. Note that this slightly changes the semantics in the hangcheck code: We now always cycle through all enabled rings instead of short-circuiting the logic. v2: Pull in a couple of suggestions from Ben and Daniel for intel_ring_initialized() and not removing the warning (just moving them to a new home, closer to the error). Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NBen Widawsky <ben@bwidawsk.net> [danvet: Added note to commit message about the small behaviour change, suggested by Ben Widawsky.] Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 06 5月, 2012 3 次提交
-
-
由 Daniel Vetter 提交于
... by writing (anything) to i915_error_state. This way we can simulate a bunch of gpu hangs and run the error_state capture code every time (without the need to reload the module). To make that happen we need to abandon the simple seq_file wrappers provided by the drm core. While at it put the new error_state refcounting to some good use and associated the error_state to the debugfs when opening the file. Otherwise the error_state could change while someone is reading it. This should help greatly when we finally get around to split up the giant single seq_file block that the error_state file currently is into smaller parts. v2: Actually squash all the fixes into the patch ... Reviewed-by: NEugeni Dodonov <eugeni.dodonov@intel.com> Signed-Off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Daniel Vetter 提交于
- reduce the irq disabled section, even for a debugfs file this was way too long. - always disable irqs when taking the lock. v2: Thou shalt not mistake locking for reference counting, so: - reference count the error_state to protect from concurent freeeing. This will be only really used in the next patch. Reviewed-by: NEugeni Dodonov <eugeni.dodonov@intel.com> Signed-Off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Daniel Vetter 提交于
gpu reset is a very important piece of our infrastructure. Unfortunately we only really it test by actually hanging the gpu, which often has bad side-effects for the entire system. And the gpu hang handling code is one of the rather complicated pieces of code we have, consisting of - hang detection - error capture - actual gpu reset - reset of all the gem bookkeeping - reinitialition of the entire gpu This patch adds a debugfs to selectively stopping rings by ceasing to update the hw tail pointer, which will result in the gpu no longer updating it's head pointer and eventually to the hangcheck firing. This way we can exercise the gpu hang code under controlled conditions without a dying gpu taking down the entire systems. Patch motivated by me forgetting to properly reinitialize ppgtt after a gpu reset. Usage: echo $((1 << $ringnum)) > i915_ring_stop # stops one ring echo 0xffffffff > i915_ring_stop # stops all, future-proof version then run whatever testload is desired. i915_ring_stop automatically resets after a gpu hang is detected to avoid hanging the gpu to fast and declaring it wedged. v2: Incorporate feedback from Chris Wilson. v3: Add the missing cleanup. v4: Fix up inconsistent size of ring_stop_read vs _write, noticed by Eugeni Dodonov. Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NEugeni Dodonov <eugeni.dodonov@intel.com> Signed-Off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 03 5月, 2012 7 次提交
-
-
由 Chris Wilson 提交于
On SandyBridge IPS was entirely implemented in hardware and not reliant on the driver monitoring power consumption and feeding back desired run states, so the hardware is able to adapt quicker and more flexibly. Which is a huge relief for us as we no longer have to carry empirically derived magic algorithms. Yet despite the advance in technology, the driver was still doing its IPS polling on all machines. Restrict it to the only supported hardware, Clarkdale/Arrandale. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJesse Barnes <jbarnes@virtuousgeek.org> Signed-Off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Chris Wilson 提交于
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NBen Widawsky <ben@bwidawsk.net> Signed-Off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Ben Widawsky 提交于
The waiting_seqno is not terribly useful, and as such we can remove it so that we'll be able to extract lockless code. v2: Keep the information for error_state (Chris) Check if ring is initialized in hangcheck (Chris) Capture the waiting ring (Chris) Signed-off-by: NBen Widawsky <ben@bwidawsk.net> [danvet: add some bikeshed to clarify a comment.] Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Ben Widawsky 提交于
This extra bit of interrupt enabling code doesn't belong in the wait seqno function. If anything we should pull it out to a helper so the throttle code can also use it. The history is a bit vague, but I am going to attempt to just dump it, unless someone can argue otherwise. Removing this allows for a shared lock free wait seqno function. To keep tabs on this issue though, the IER value is stored on error capture (recommended by Chris Wilson) v2: fixed typo EIR->IER (Ben) Fix some white space (Ben) Move IER capture to globally instead of per ring (Ben) Signed-off-by: NBen Widawsky <ben@bwidawsk.net> [danvet: ier is a 16 bit reg on gen2!] Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Chris Wilson 提交于
The use of the mm_list by deferred-free breaks the following patches to extend the range of objects tracked. We can simplify things if we just make the unbind during free uninterrutible. Note that unbinding should never fail, because we hold an additional reference on every active object. Only the ilk vt-d workaround breaks this, but already takes care of not failing by waiting for the gpu to quiescent non-interruptible. But the existence of the deferred free list casted some doubts on this theory, hence WARN if the unbind fails and only then retry non-interruptible. We can kill this additional code after a release in case the theory is indeed right and no one has hit that WARN. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Chris Wilson 提交于
Simplify object tracking by removing the inactive but pinned list. The only place where this was used is for counting the available memory, which is just as easy performed by checking all objects on the rare occasions it is required (application startup). For ease of debugging, we keep the reporting of pinned objects through the error-state and debugfs. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Daniel Vetter 提交于
These were mostly straight forward. No forced casting needed. Signed-off-by: NBen Widawsky <benjamin.widawsky@intel.com> [danvet: fix conflict with ringbuffer_data removal and drop the hunk about the status page - that needs more care to fix up.] Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-