- 09 Feb, 2012 1 commit
-
-
Committed by Daniel Vetter
We have to do this manually. Somebody had a Great Idea. I've measured speed-ups just a few percent above the noise level (below 5% for the best case), but no slowdowns. Chris Wilson measured quite a bit more (10-20% above the usual snb variance) on a more recent and better tuned version of sna, but also recorded a few slow-downs on benchmarks known for uglier amounts of snb-induced variance. v2: Incorporate Ben Widawsky's preliminary review comments and elaborate a bit about the performance impact in the changelog. v3: Add a comment as to why we don't need to check the 3rd memory channel. v4: Fixup whitespace. Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 01 Feb, 2012 1 commit
-
-
Committed by Chris Wilson
As the buffer is not necessarily accessible through the GTT at the time of a GPU hang, and capturing some of its contents is far more valuable than skipping it, provide a clflushed fallback read path. We still prefer to read through the GTT as that is more consistent with the GPU access of the same buffer. For example, it will demonstrate any erroneous tiling or swizzling of the command buffer as seen by the GPU. This becomes necessary with use of CPU relocations and lazy GTT binding, but could potentially happen anyway as a result of a pathological error. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 31 Jan, 2012 6 commits
-
-
Committed by Daniel Vetter
Like for shmem_pwrite_slow. The only difference is that because we read data, we can leave the fetched cachelines in the cpu: In the case that the object isn't in the cpu read domain anymore, the clflush for the next cpu read domain invalidation will simply drop these cachelines. slow_shmem_bit17_copy is now unused, so kill it. With this patch tests/gem_mmap_gtt now actually works. v2: add __ to copy_to_user_swizzled as suggested by Chris Wilson. v3: Fixup the swizzling logic, it swizzled the wrong pages. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38115 Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
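For illustration only, here is a minimal userspace sketch of a cacheline-wise swizzled copy of the kind this commit refers to. It assumes that bit17 swizzling swaps the two 64-byte cachelines of each 128-byte pair on affected pages; the function name copy_from_swizzled and the use of memcpy in place of a user-copy helper are hypothetical and not taken from the driver.

```c
#include <stddef.h>
#include <string.h>

#define CACHELINE 64 /* assumed swizzle granularity */

/*
 * Hypothetical stand-in for a swizzled read: copy `length` bytes starting
 * at `offset` inside `page` into `dst`, reading every cacheline from its
 * partner in the 128-byte pair (offset XOR 64).
 */
static void copy_from_swizzled(char *dst, const char *page,
                               size_t offset, size_t length)
{
    size_t copied = 0;

    while (length > 0) {
        /* End of the cacheline that contains `offset`. */
        size_t cacheline_end = (offset / CACHELINE + 1) * CACHELINE;
        size_t chunk = cacheline_end - offset;
        /* Flipping bit 6 selects the partner cacheline. */
        size_t swizzled = offset ^ CACHELINE;

        if (chunk > length)
            chunk = length;

        memcpy(dst + copied, page + swizzled, chunk);

        copied += chunk;
        offset += chunk;
        length -= chunk;
    }
}
```

Copying one cacheline at a time keeps every chunk inside a single swizzled cacheline, so an unaligned start or end needs no special case.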
-
Committed by Daniel Vetter
... instead of get_user_pages, because that fails on non page-backed user addresses like e.g. a gtt mapping of a bo. To get there essentially copy the vfs read path into pagecache. We can't call that right away because we have to take care of bit17 swizzling. To not deadlock with our own pagefault handler we need to completely drop struct_mutex, reducing the atomicity guarantees of our userspace abi. Implications for racing with other gem ioctls:
  - execbuf, pwrite, pread: Due to -EFAULT fallback to slow paths there's already the risk of the pwrite call not being atomic, no degradation.
  - read/write access to mmaps: already fully racy, no degradation.
  - set_tiling: Calling set_tiling while reading/writing is already pretty much undefined, now it just got a bit worse. set_tiling is only called by libdrm on unused/new bos, so no problem.
  - set_domain: When changing to the gtt domain while copying (without any read/write access, e.g. for synchronization), we might leave unflushed data in the cpu caches. The clflush_object at the end of pwrite_slow takes care of this problem.
  - truncating of purgeable objects: the shmem_read_mapping_page call could reinstate backing storage for truncated objects. The check at the end of pwrite_slow takes care of this.
v2:
  - add missing intel_gtt_chipset_flush
  - add __ to copy_from_user_swizzled as suggested by Chris Wilson.
v3: Fixup bit17 swizzling, it swizzled the wrong pages.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Daniel Vetter
The gtt_pwrite slowpath grabs the userspace memory with get_user_pages. This will not work for non-page backed memory, like a gtt mmapped gem object. Hence fall through to the shmem paths if we hit -EFAULT in the gtt paths. Now the shmem paths have exactly the same problem, but this way we only need to rearrange the code in one write path. v2: v1 accidentally falls back to shmem pwrite for phys objects. Fixed. v3: Make the codeflow around phys_pwrite clearer as suggested by Chris Wilson. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
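As a rough sketch of the control flow described in this commit, the fragment below tries a fast write path and falls through to a second path only on -EFAULT. All names here (write_path_fn, pwrite_with_fallback and the three stub paths) are hypothetical and exist only to make the example self-contained; they are not the driver's functions.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical signature shared by both write paths. */
typedef int (*write_path_fn)(const void *src, size_t len);

/* Try the fast path first; fall through to the other path only on -EFAULT. */
static int pwrite_with_fallback(write_path_fn fast, write_path_fn fallback,
                                const void *src, size_t len)
{
    int ret = fast(src, len);

    if (ret == -EFAULT)
        ret = fallback(src, len);

    return ret;
}

/* Stub paths for demonstration only. */
static int path_faults(const void *src, size_t len)
{
    (void)src; (void)len;
    return -EFAULT; /* e.g. the source is not page-backed memory */
}

static int path_ok(const void *src, size_t len)
{
    (void)src; (void)len;
    return 0;
}

static int path_nospc(const void *src, size_t len)
{
    (void)src; (void)len;
    return -ENOSPC; /* any other error is reported unchanged */
}
```

Errors other than -EFAULT are returned as they are, so the fallback does not hide unrelated failures.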
-
Committed by Daniel Vetter
This will also come in handy for the gen6+ swizzling support, where the driver is supposed to control swizzling depending upon dram configuration. v2: CxDRB3 are 16 bit regs! Noticed by Chris Wilson. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Daniel Vetter
It looks like the desktop variants of i915 and i945 also have the DCC register to control dram channel interleave and cpu side bit6 swizzling. Unfortunately internal Cspec/ConfigDB documentation for these ancient chips has already been dropped and there seem to be no archives. Also somebody thought the swizzling behaviour is surely a worthy secret to keep and redacted any mention of these fields from the published Intel datasheets. I suspect the hw engineers were really proud of the page coloring they've achieved in their first dual channel dram controller with bit17 - after all Bspec explains at great length the optimal layout of page frame numbers modulo 4 for the color and depth buffers, too. Later on when they've started to work on VT-d they shamefully discovered their stupidity and tried to cover the tracks ... Tested-by: Daniel Vetter <daniel.vetter@ffwll.ch> (i915g) Tested-by: Pavel Ondračka <pavel.ondracka@email.cz> (i945g) Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=42625 Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Chris Wilson
The original intention of comparing the bo against the mappable GTT limits was to prevent a subsequent faulting of the bo into the GTT from clearing the entire GTT in vain. However, that was clearly a cut'n'paste mistake as a CPU mapping never binds the bo into the aperture. Whilst there may be some merit to limiting the maximum size of the bo to something that can be utilized by the GPU, that limit itself does not belong as a safeguard to mmapping the bo, so remove the check entirely. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 30 Jan, 2012 12 commits
-
-
Committed by Daniel Vetter
This was pretty handy when figuring out what exactly went wrong with ppgtt and it might also be useful when we stop filling the entire gart with scratch page entries. Also add the gen6+ DONE reg while at it. v2: Chris Wilson suggested to allocate the error_state with kzalloc for better paranoia. Also kill existing spurious clears of the error_state while at it. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Daniel Vetter
This confuses our domain tracking and can (for gtt write domains) lead to a subsequent oops. Tested by tests/gem_exec_bad_domains from i-g-t. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Daniel Vetter
With the error_state facility in place, this has outlived its usefulness. It also oopses with the latest llc-reloc patches because it directly accesses objects through the gtt without any checks. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Daniel Vetter
For quite a while we have also included the basic output configuration in the error_state, so it should contain enough information to diagnose these MI_WAIT hangs. Reviewed-and-tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Daniel Vetter
All r/w debugfs files are created equal. v2: Add some newlines to make the code easier on the eyes as requested by Ben Widawsky. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Daniel Vetter
Only forcewake has an open with special semantics, the other r/w debugfs files only assign the file private pointer. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Daniel Vetter
With the fence accounting fixed up in the previous commit, not finding enough fences is a fatal error and a userspace bug. Trashing the entire gtt is not gonna turn up that missing fence, so avoid doing that by returning an error other than ENOSPC. This has the added benefit that it's easier to distinguish fence accounting errors from gtt space accounting issues. TTM serves as precedent for the EDEADLK error code - it returns it when the reservation code needs resources already blocked by the current reservation. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Chris Wilson
In order to correctly account for reserving space in the GTT and fences for a batch buffer, we need to independently track whether the fence is pinned due to a fenced GPU access in the batch or whether the buffer is pinned in the aperture. Currently we count the fence as pinned if the buffer has already been seen in the execbuffer. This leads to a false accounting of available fence registers, causing frequent mass evictions. Worse, if coupled with the change to make i915_gem_object_get_fence() report EDEADLK upon fence starvation, the batchbuffer can fail with only one fence required... Fixes intel-gpu-tools/tests/gem_fenced_exec_thrash Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38735 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Tested-by: Paul Neumann <paul104x@yahoo.de> [danvet: Resolve the functional conflict with Jesse Barnes' sprite patches, acked by Chris Wilson on irc.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Daniel Vetter
This was just to facilitate product enablement with pre-production hw. Allows us to kill quite a bit of cruft. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Daniel Vetter
Based on a patch by Ben Widawsky, but with different colors for the bikeshed. In contrast to Ben's patch this one doesn't add the fault regs. Afaics they're for the optional page fault support which - we're not enabling - and which seems to be unsupported by the hw team. Recent bspec lacks tons of information about this that the public docs released half a year back still contain. Also dump ring HEAD/TAIL registers - I've recently seen a few error_state where just guessing these is not good enough. v2: Also dump INSTPM for every ring. v3: Fix a few really silly goof-ups spotted by Chris Wilson. Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Daniel Vetter
The code already got unwieldy and we want to dump more per-ring registers. Only functional change is that we now also capture the video ring registers on ilk. v2: fixup a refactor fumble spotted by Chris Wilson. Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Daniel Vetter
... and add a helper function for the places where we want a flag. This way we can use ring->id to index into arrays. v2: Resurrect the missing beautification-space Chris Wilson noted. I'm moving this space around because I'll reuse ring_str in the next patch. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
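A small sketch of the idea, assuming the ring id becomes a plain zero-based index and a helper derives the bit flag from it. The enum values, the ring_flag helper and the ring_names table are hypothetical names chosen for this example, not the driver's definitions.

```c
#include <string.h>

/* Hypothetical ring ids: plain indices starting at zero. */
enum ring_id {
    RING_RENDER = 0,
    RING_VIDEO,
    RING_BLIT,
    NUM_RINGS
};

/* Helper for the places that still want a bit flag. */
static unsigned int ring_flag(enum ring_id id)
{
    return 1u << id;
}

/* Because ids are indices, they can address per-ring tables directly. */
static const char *const ring_names[NUM_RINGS] = {
    "render",
    "video",
    "blit",
};
```

Keeping the id as an index and deriving the flag on demand avoids having to convert a flag back into an array position.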
-
- 29 Jan, 2012 1 commit
-
-
Committed by Wu Fengguang
It should be programmed to "0" for HDMI or "1" for DisplayPort. This enables DisplayPort audio for - HP EliteBook 8460p (whose BIOS does not set the N_value_index bit for us) - DisplayPort monitor hot plugged after boot (otherwise most BIOS will fill the N_value_index bit for us) Tested-by: Robert Lemaire <rlemaire@suse.com> Reviewed-by: Keith Packard <keithp@keithp.com> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 26 Jan, 2012 4 commits
-
-
Committed by Ben Widawsky
This is only relevant when using module unloading, and really only helps get rid of a probably benign warning. I can't remember if I sent this out already, but it's not turning up in any of my searches. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Keith Packard <keithp@keithp.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Ben Widawsky
After the ILK vt-d workaround patches it became clear that we had introduced a bug. Chris Wilson tracked down the issue to recursive calls to unmap. This happens because we try to optimize waiting on requests by calling retire requests after the wait, which may drop the last reference on an object and end up freeing the object (and then unmap the object from the gtt). After the last patch we can now choose to defer processing the retire list. Kudos to Chris Wilson for tracking this one down. This patch fixes gem_unref_active_buffers from i-g-t. It was tested by forcing do_idle_maps to true. This also fixes tests/gem_linear_blits in intel-gpu-tools. Reported-by: guang.a.yang@intel.com Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=42180 Reviewed-by: Keith Packard <keithp@keithp.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Ben Widawsky
Sometimes when we idle the gpu or wait on something, we don't actually want to process the retiring list. This patch allows callers to choose the behavior. Reviewed-by: Keith Packard <keithp@keithp.com> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Eric Anholt
Older specs claimed this was bit 11, but newer specs and the actual simulator code say it was bit 12. Regardless, we don't use MI_FLUSH, or try to enable it any more. Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> [danvet: Anyone trying to use this bit, please read all the relevant discussions, it's epic.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 25 Jan, 2012 2 commits
-
-
Committed by Eric Anholt
We have always been using the wrong bit -- it's bit 12. However, the bit also doesn't do anything -- hardware has always accepted the MI_FLUSH command even when it was specced not to. Given that there is only one MI_FLUSH emitted in all of the driver stack on gen6+ (in i965_video.c of the 2d driver, and it should be using other code to do its flush instead), just remove the MI_FLUSH enable instead of trying to fix it. Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Daniel Vetter
This was completely spamming dmesg on my i855gm. This issue was only recently introduced with: commit 931872fc Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Mon Jan 16 23:01:13 2012 +0000 drm/i915: Check that plane/pipe is disabled before removing the fb Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 22 Jan, 2012 3 commits
-
-
Committed by Eugeni Dodonov
Otherwise, we are left with a pretty bogus message saying that the pixel format is not supported while leaving the details to telepathic powers. v2: use DRM_DEBUG_KMS instead of DRM_ERROR Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Jesse Barnes
Now that we're using the sprite WM fields, we need to take care not to clobber them in the main update_wm functions. While we're at it, make sure we mask out the old sprite wm value before or'ing in the new one when the sprite wm is updated. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Keith Packard <keithp@keithp.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
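The "mask out the old value before or'ing in the new one" step can be sketched as a generic read-modify-write helper. The helper name set_field and its mask and shift parameters are hypothetical; the real register layout is not given in the commit message and is not assumed here.

```c
#include <stdint.h>

/*
 * Replace one bit field inside a register value while leaving every other
 * field untouched: clear the old field first, then OR in the new value.
 */
static uint32_t set_field(uint32_t reg, uint32_t mask, unsigned int shift,
                          uint32_t val)
{
    reg &= ~mask;                 /* drop the stale field contents */
    reg |= (val << shift) & mask; /* insert the new value, clipped to the field */
    return reg;
}
```

Skipping the clearing step would leave stale bits set whenever the new value has fewer bits set than the old one.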
-
Committed by Daniel Vetter
I've reviewed gen2 pageflip code to hunt down multiple prepare pageflip issues. The only thing I've found is a slight but functionally meaningless confusion about the length of the mi cmd. Fix it up and add a comment about what this dword should be (according to docs at least). Reviewed-by: Eric Anholt <eric@anholt.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 18 Jan, 2012 3 commits
-
-
Committed by Eugeni Dodonov
LLC is not SNB/IVB-specific, so we should check for it in a more generic way. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Daniel Vetter
Some decent history digging indicates that this was to be used for the GLX_MESA_allocate_memory extension but never actually implemented for any released i915 userspace code. So just rip it out. v2: Fixup the Makefile. Acked-by: Dave Airlie <airlied@gmail.com> Cc: Keith Whitwell <keithw@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Thomas Meyer
The advantage of kcalloc is that it prevents integer overflows which could result from the multiplication of the number of elements and the element size, and it is also a bit nicer to read. The semantic patch that makes this change is available in https://lkml.org/lkml/2011/11/25/107 Signed-off-by: Thomas Meyer <thomas@m3y3r.de> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
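To illustrate the overflow the commit message refers to, here is a userspace sketch of an array allocator that rejects element counts whose byte size would wrap around. The name array_zalloc is hypothetical and the code uses the C library's calloc; it is not the kernel's kcalloc implementation.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Allocate a zeroed array of `n` elements of `size` bytes each.
 * Returns NULL instead of a too-small buffer when n * size would
 * overflow size_t.
 */
static void *array_zalloc(size_t n, size_t size)
{
    if (size != 0 && n > SIZE_MAX / size)
        return NULL; /* n * size does not fit in size_t */

    return calloc(n, size);
}
```

With an open-coded multiplication passed to a plain allocator, a wrapped product yields a short buffer that later writes overrun; checking the bound with a division avoids computing the overflowing product at all.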
-
- 17 Jan, 2012 7 commits
-
-
Committed by Adam Jackson
This is paranoid, but I am entirely willing to believe the hardware could come up with a condition where I get a status with both the 'done' and 'receive error' bits set. Signed-off-by: Adam Jackson <ajax@redhat.com> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Adam Jackson
The default in the Sandybridge docs is 5, as on Ironlake, and I have no reason to believe 3 would work any better. Signed-off-by: Adam Jackson <ajax@redhat.com> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Adam Jackson
Matches the advice in the Sandybridge documentation. Signed-off-by: Adam Jackson <ajax@redhat.com> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Adam Jackson
Signed-off-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Adam Jackson
Signed-off-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Adam Jackson
Signed-off-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Committed by Simon Que
There is an error in i915_read_blc_pwm_ctl, where the register values are not being copied correctly. BLC_PWM_CTL and BLC_PWM_CTL2 are getting mixed up. This patch fixes that so that saveBLC_PWM_CTL2 and not saveBLC_PWM_CTL is copied to the BLC_PWM_CTL2 register. Signed-off-by: Simon Que <sque@chromium.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-