- 07 12月, 2010 1 次提交
-
-
由 Chris Wilson 提交于
In order for bos to retire eventually, a request must be sent down the ring. This is expected, for example, by occlusion queries for which mesa will wait upon (whilst running glean) before issuing more batches and so the normal activity upon the ring is suspended and we need to emit a request to clear the idle ring. Reported-by: NJinjin, Wang <jinjin.wang@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=30380Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
- 28 11月, 2010 1 次提交
-
-
由 Daniel Vetter 提交于
We don't track gpu flush request in any special way. So even with obj->write_domain == 0, a gpu flush might be outstanding but no yet executed. Even worse, the latest request might use the object only for reading. So and unconditional call to object_wait_rendering is needed for !pipelined. Hence revert that patch fully and untangle the flushing from the synchronization again. Reported-by: NKeith Packard <keithp@keithp.com> Tested-by: NKeith Packard <keithp@keithp.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
- 24 11月, 2010 1 次提交
-
-
由 Chris Wilson 提交于
Currently if we hit a pagefault when applying a user relocation for the execbuffer, we bail and return EFAULT to the application. Instead, we need to unwind, drop the dev->struct_mutex, copy all the relocation entries to a vmalloc array (to avoid any potential circular deadlocks when resolving the pagefault), retake the mutex and then apply the relocations. Afterwards, we need to again drop the lock and copy the vmalloc array back to userspace. v2: Incorporate feedback from Daniel Vetter. Reported-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 21 11月, 2010 1 次提交
-
-
由 Chris Wilson 提交于
Commit 2549d6c2 removed the vmalloc used for temporary storage of the relocation lists used during execbuffer. However, our use of vmalloc was being protected by an integer overflow check which we do want to preserve! Reported-by: NDan Carpenter <error27@gmail.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
- 19 11月, 2010 1 次提交
-
-
由 Chris Wilson 提交于
Linus Torvalds found that it was rather trivial to trigger a system freeze: In fact, with lockdep, I don't even need to do the sysrq-d thing: it shows the bug as it happens. It's the X server taking the same lock recursively. Here's the problem: ============================================= [ INFO: possible recursive locking detected ] 2.6.37-rc2-00012-gbdbd01ac #7 --------------------------------------------- Xorg/2816 is trying to acquire lock: (&dev->struct_mutex){+.+.+.}, at: [<ffffffff812c626c>] i915_gem_fault+0x50/0x17e but task is already holding lock: (&dev->struct_mutex){+.+.+.}, at: [<ffffffff812c403b>] i915_mutex_lock_interruptible+0x28/0x4a other info that might help us debug this: 2 locks held by Xorg/2816: #0: (&dev->struct_mutex){+.+.+.}, at: [<ffffffff812c403b>] i915_mutex_lock_interruptible+0x28/0x4a #1: (&mm->mmap_sem){++++++}, at: [<ffffffff81022d4f>] page_fault+0x156/0x37b This recursion was introduced by rearranging the locking to avoid the double locking on the fast path (4f27b5d and fbd5a26d) and the introduction of the prefault to encourage the fast paths (b5e4f2b). In order to undo the problem, we rearrange the code to perform the access validation upfront, attempt to prefault and then fight for control of the mutex. the best case scenario where the mutex is uncontended the prefaulting is not wasted. Reported-and-tested-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
- 13 11月, 2010 1 次提交
-
-
由 Chris Wilson 提交于
An old and oft reported bug, is that of the GPU hanging on a MI_WAIT_FOR_EVENT following a mode switch. The cause is that the GPU is waiting on a scanline counter on an inactive pipe, and so waits for a very long time until eventually the user reboots his machine. We can prevent this either by moving the WAIT into the kernel and thereby incurring considerable cost on every swapbuffers, or by waiting for the GPU to retire the last batch that accesses the framebuffer before installing a new one. As mode switches are much rarer than swap buffers, this looks like an easy choice. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=28964 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=29252Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org
-
- 09 11月, 2010 1 次提交
-
-
由 Joe Perches 提交于
Coalesce long formats. Align arguments. Add missing newlines. Signed-off-by: NJoe Perches <joe@perches.com> Signed-off-by: NDave Airlie <airlied@redhat.com>
-
- 08 11月, 2010 1 次提交
-
-
由 Chris Wilson 提交于
... and so prevent a potential circular reference: [ INFO: possible circular locking dependency detected ] 2.6.37-rc1-uwe1+ #4 ------------------------------------------------------- Xorg/1401 is trying to acquire lock: (&mm->mmap_sem){++++++}, at: [<c01e4ddb>] might_fault+0x4b/0xa0 but task is already holding lock: (&dev->struct_mutex){+.+.+.}, at: [<f869c3ac>] i915_mutex_lock_interruptible+0x3c/0x60 [i915] which lock already depends on the new lock. When the locking around the pwrite ioctl was simplified, I did not spot that the phys path never took any locks and so we introduced this potential circular reference. Reported-by: NUwe Helm <uwe.helm@googlemail.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
- 01 11月, 2010 1 次提交
-
-
由 Chris Wilson 提交于
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org
-
- 29 10月, 2010 1 次提交
-
-
由 Chris Wilson 提交于
It is possible for the active list to only contain a read-only buffer so that the ring->gpu_write_list remains entry. This leads to an inconsistency between i915_gpu_is_active() and i915_gpu_idle() causing an infinite spin during the shrinker and an assertion failure that i915_gpu_idle() does indeed flush all buffers from the active lists. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
- 27 10月, 2010 1 次提交
-
-
由 Peter Zijlstra 提交于
Keep the current interface but ignore the KM_type and use a stack based approach. The advantage is that we get rid of crappy code like: #define __KM_PTE \ (in_nmi() ? KM_NMI_PTE : \ in_irq() ? KM_IRQ_PTE : \ KM_PTE0) and in general can stop worrying about what context we're in and what kmap slots might be appropriate for that. The downside is that FRV kmap_atomic() gets more expensive. For now we use a CPP trick suggested by Andrew: #define kmap_atomic(page, args...) __kmap_atomic(page) to avoid having to touch all kmap_atomic() users in a single patch. [ not compiled on: - mn10300: the arch doesn't actually build with highmem to begin with ] [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: fix up drivers/gpu/drm/i915/intel_overlay.c] Acked-by: NRik van Riel <riel@redhat.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: NChris Metcalf <cmetcalf@tilera.com> Cc: David Howells <dhowells@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: David Miller <davem@davemloft.net> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Dave Airlie <airlied@linux.ie> Cc: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 25 10月, 2010 1 次提交
-
-
由 Chris Wilson 提交于
... to prevent flush processing of an idle (or even absent) ring. This fixes a regression during suspend from 87acb0a5. Reported-and-tested-by: NAlexey Fisher <bug-track@fisher-privat.net> Tested-by: NPeter Clifton <pcjc2@cam.ac.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
- 23 10月, 2010 1 次提交
-
-
由 Chris Wilson 提交于
When the object has been written to by the gpu it remains on the ring until its flush has been retired. However, when the object is moving to the ring and the associated cache needs to be invalidated, we need to perform the flush on the target ring, not the one it came from (which is NULL in the reported case and so the flush was entirely absent). Reported-by: NPeter Clifton <pcjc2@cam.ac.uk> Reported-and-tested-by: NAlexey Fisher <bug-track@fisher-privat.net> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
- 22 10月, 2010 2 次提交
-
-
由 Chris Wilson 提交于
Whilst moving the code around in 9af90d19, I dropped the or'ing in of new write domains which would zero out the write domain for a render target if later reused as a source later in the batch. This meant that we might drop a required flush before reading from the render target. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=31043 Reported-by: xunx.fang@intel.com Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
由 Chris Wilson 提交于
Based on an original patch by Zhenyu Wang, this initializes the BLT ring for SandyBridge and enables support for user execbuffers. Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
- 21 10月, 2010 1 次提交
-
-
由 Chris Wilson 提交于
If the userspace driver is using a constant relocation array with a static buffer, they will pass the same relocation array back to the kernel. So we *do* need to update the presumed offset value in those relocations to reflect the current object so that they remain correct with future batchbuffers and we avoid the necessity of having to suspend execution and perform redundant relocations. Fixes the regression introduced by 12f889c for applications using absolute addressing on trees of buffer (i.e. the current consumers of libdrm_intel.so). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=30996Reported-by: NWang, Jinjin <jinjin.wang@intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
- 20 10月, 2010 3 次提交
-
-
由 Chris Wilson 提交于
To handle retirements, we need per-ring tracking of active objects. To handle evictions, we need global tracking of active objects. As we enable more rings, rebuilding the global list from the individual per-ring lists quickly grows tiresome and overly complicated. Tracking the active objects in two lists is the lesser of two evils. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
由 Chris Wilson 提交于
... by always initialising the empty ringbuffer it is always then safe to check whether it is active. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
由 Chris Wilson 提交于
The most frequent relocation within a batchbuffer is a contiguous sequence of vertex buffer relocations, for which we can virtually eliminate the drm_gem_object_lookup() overhead by caching the last handle to object translation. In doing so we refactor the pin and relocate retry loop out of do_execbuffer into its own helper function and so improve the error paths. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
- 19 10月, 2010 7 次提交
-
-
由 Chris Wilson 提交于
One of the primarily consumers of the i915 driver is X, a large signal driven application. Frequently when writing into the buffers, there is a pending signal which causes us not to take the interruptible lock but then we need to take that same lock around the object unreference. By rearranging the code to do the interruptible lock as the first check, we can avoid the frequent additional locking around the unreference. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
由 Chris Wilson 提交于
... to avoid the double acquisition along fast[er] paths. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
由 Chris Wilson 提交于
... to avoid reacquiring it to drop the object reference count on exit. Note we have to make sure we now drop (and reacquire) the lock around acquiring the mm semaphore on the slow paths. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
由 Chris Wilson 提交于
... in the hope that it makes the atomic fast paths more likely. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
由 Chris Wilson 提交于
After allocation a handle for the fresh object, we know that we can safely drop the refcnt without triggering a free so we do not need the mutex. Strangely, this mutex acquisition is the one that appears on driver profiles. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
由 Chris Wilson 提交于
Avoid an early eviction of the batch buffer into the uncached GTT domain, and so do the relocation fixup in cacheable memory. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
由 Chris Wilson 提交于
... perform an access validation check up front instead and copy them in on-demand, during i915_gem_object_pin_and_relocate(). As around 20% of the CPU overhead may be spent inside vmalloc for the relocation entries when submitting an execbuffer [for x11perf -aa10text], the savings are considerable and result in around a 10% throughput increase [for glyphs]. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
- 08 10月, 2010 1 次提交
-
-
由 Chris Wilson 提交于
Currently, if a batch buffer refers to an object with a pending flip, then we sleep until that pending flip is completed (unpinned and signalled). This is so that a flip can be queued and the user can continue rendering to the backbuffer oblivious to whether the buffer is still pinned as the scan out. (The kernel arbitrating at the last moment to stall the batch and wait until the buffer is unpinned and replaced as the front buffer.) As we only have a queue depth of 1, we can simply wait for the current pending flip to complete and continue rendering. We can achieve this with a single WAIT_FOR_EVENT command inserted into the ring buffer prior to executing the batch, *without* stalling the client. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
- 04 10月, 2010 1 次提交
-
-
由 Chris Wilson 提交于
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
- 03 10月, 2010 2 次提交
-
-
由 Chris Wilson 提交于
... and do the same for pread. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org
-
由 Chris Wilson 提交于
Move the access control up from the fast paths, which are no longer universally taken first, up into the caller. This then duplicates some sanity checking along the slow paths, but is much simpler. Tracked as CVE-2010-2962. Reported-by: NKees Cook <kees@ubuntu.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org
-
- 02 10月, 2010 2 次提交
-
-
由 Julia Lawall 提交于
Extend the error handling code with operations found in other nearby error handling code A simplified version of the sematic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @r exists@ @r@ statement S1,S2,S3; constant C1,C2,C3; @@ *if (...) {... S1 return -C1;} ... *if (...) {... when != S1 return -C2;} ... *if (...) {... S1 return -C3;} // </smpl> Signed-off-by: NJulia Lawall <julia@diku.dk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org
-
由 Chris Wilson 提交于
The return from move_to_gtt_domain() may indicate a pending signal which needs to handled as opposed to an actual error, for instance, so report the original return value rather than forcing an EINVAL. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
- 01 10月, 2010 4 次提交
-
-
由 Chris Wilson 提交于
When the GPU is reset, the fence registers are invalidated, so release the objects and clear them out. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
由 Chris Wilson 提交于
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=30083Reported-by: NSitsofe Wheeler <sitsofe@yahoo.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
由 Chris Wilson 提交于
Only drm/i915 does the bookkeeping that makes the information useful, and the information maintained is driver specific, so move it out of the core and into its single user. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Dave Airlie <airlied@redhat.com>
-
由 Dave Airlie 提交于
There were lots of places being inconsistent since handle count looked like a kref but it really wasn't. Fix this my just making handle count an atomic on the object, and have it increase the normal object kref. Now i915/radeon/nouveau drivers can drop the normal reference on userspace object creation, and have the handle hold it. This patch fixes a memory leak or corruption on unload, because the driver had no way of knowing if a handle had been actually added for this object, and the fbcon object needed to know this to clean itself up properly. Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NDave Airlie <airlied@redhat.com>
-
- 30 9月, 2010 3 次提交
-
-
由 Chris Wilson 提交于
At that point as the object is no longer in any GPU write domain it must not be on the list, so the list_del() is redundant. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
由 Chris Wilson 提交于
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
由 Chris Wilson 提交于
... and check more regularly. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
- 29 9月, 2010 1 次提交
-
-
由 Chris Wilson 提交于
Just reschedule the retire requests again if the device is currently busy. The request list will be pruned along other paths so will never grow unbounded and so we can afford to miss the occasional pruning. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-