- 09 4月, 2011 5 次提交
-
-
由 Alex Deucher 提交于
Switch some errors to debug output. These are generally harmless and tend to confuse users. Signed-off-by: NAlex Deucher <alexdeucher@gmail.com> Signed-off-by: NDave Airlie <airlied@redhat.com>
-
由 Alex Deucher 提交于
Prefer minm over maxp. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=35994Signed-off-by: NAlex Deucher <alexdeucher@gmail.com> Cc: stable@kernel.org Signed-off-by: NDave Airlie <airlied@redhat.com>
-
由 Michel Dänzer 提交于
Signed-off-by: NMichel Dänzer <daenzer@vmware.com> Signed-off-by: NDave Airlie <airlied@redhat.com>
-
由 Michel Dänzer 提交于
This is necessary even with PCI(e) GART, and it makes writeback work even with AGP on my PowerBook. Might still be unreliable with older revisions of UniNorth and other AGP bridges though. Signed-off-by: NMichel Dänzer <daenzer@vmware.com> Reviewed-by: NAlex Deucher <alex.deucher@gmail.com> Signed-off-by: NDave Airlie <airlied@redhat.com>
-
由 Dave Airlie 提交于
This has always used a big hammer, but that hammer is probably too big, I'm also not sure its necessary but at least this should be safe. Should fix: https://bugzilla.kernel.org/show_bug.cgi?id=23592Signed-off-by: NDave Airlie <airlied@redhat.com>
-
- 06 4月, 2011 2 次提交
-
-
由 Chris Wilson 提交于
This is a revert of 428d2e82. This is broken in the same manner as for VGA: trying to write to an invalid address on the (currently 7-bit) i2c bus. One notable failure appears to be for MacBooks. The scary part was that it gave the appearance of working (i.e. reporting the absence of the panel) on various all-in-one machines with ghost LVDS panels and not failing for laptops. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Acked-by: NDave Airlie <airlied@linux.ie> Signed-off-by: NKeith Packard <keithp@keithp.com>
-
由 Chris Wilson 提交于
This is a moral revert of 6ec3d0c0. Following the fix to reset the GMBUS controller after a NAK, we finally utilize the 0xa0 probe for a CRT connection. And discover that the code is broken. Shock. There are a number of issues, but following a key insight from Dave Airlie, that 0xA0 is an invalid address on a 7-bit bus (though not if we were to enable 10-bit addressing), and would look like the EDID port 0x50, it is possible to see where the confusion starts. In short, a write to 0xA0 is accepted by the GMBUS controller which we interpreted as meaning the existence of a connection (a slave on the other end of the wire ACKing the write). That was false. During testing with a broken GMBUS implementation, which never reset an earlier NAK, this test always reported a NAK and so we proceeded on to the next test. Reported-and-tested-by: NSitsofe Wheeler <sitsofe@yahoo.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=35904Reported-and-tested-by: NRiccardo Magliocchetti <riccardo.magliocchetti@gmail.com> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=32612Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Acked-by: NDave Airlie <airlied@linux.ie> Signed-off-by: NKeith Packard <keithp@keithp.com>
-
- 05 4月, 2011 10 次提交
-
-
由 Ben Skeggs 提交于
Signed-off-by: NBen Skeggs <bskeggs@redhat.com>
-
由 Ben Skeggs 提交于
Not sure how this snuck in... Signed-off-by: NBen Skeggs <bskeggs@redhat.com>
-
由 Ben Skeggs 提交于
It has been reported that this greatly improves (and possibly fixes completely) the stability of NVA3+ chipsets. In traces of my NVA8, NVIDIA now appear to be doing this too. The most recent traces of 0x50 and 0xac I could find don't show NVIDIA checking PGRAPH status on these flushes, so for now, we won't either. Signed-off-by: NBen Skeggs <bskeggs@redhat.com>
-
由 Ben Skeggs 提交于
Signed-off-by: NBen Skeggs <bskeggs@redhat.com>
-
由 David Dillow 提交于
Signed-off-by: NBen Skeggs <bskeggs@redhat.com>
-
由 Ben Skeggs 提交于
Signed-off-by: NBen Skeggs <bskeggs@redhat.com>
-
由 Marcin Slusarz 提交于
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=35135 BUG: unable to handle kernel NULL pointer dereference at 000002d8 IP: [<f83694af>] nv04_dfp_restore+0x7f/0xd0 [nouveau] (...) Call Trace: [<f8372208>] nv04_display_destroy+0xa8/0x140 [nouveau] [<f830344a>] nouveau_unload+0x2a/0x160 [nouveau] [<f80d98fb>] drm_put_dev+0xbb/0x1b0 [drm] [<f8301025>] nouveau_pci_remove+0x15/0x20 [nouveau] [<c1292ad4>] pci_device_remove+0x44/0xf0 [<c13339d1>] __device_release_driver+0x51/0xb0 [<c133401f>] driver_detach+0x8f/0xa0 [<c13338a3>] bus_remove_driver+0x63/0xa0 [<c13340a9>] driver_unregister+0x49/0x80 [<c1182f84>] ? sysfs_remove_file+0x14/0x20 [<c1292bb2>] pci_unregister_driver+0x32/0x90 [<c109b1da>] ? __stop_machine+0x5a/0x70 [<f80d3f93>] drm_exit+0x83/0x90 [drm] [<f837875d>] nouveau_exit+0x1b/0x8be [nouveau] [<c1087b5b>] sys_delete_module+0x13b/0x1f0 [<c1104c3e>] ? do_munmap+0x1fe/0x280 [<c1104780>] ? arch_unmap_area_topdown+0x0/0x20 [<c15096f4>] syscall_call+0x7/0xb Reported-by: NFrancesco Marella <francesco.marella@gmail.com> Tested-by: NFrancesco Marella <francesco.marella@gmail.com> Signed-off-by: NMarcin Slusarz <marcin.slusarz@gmail.com> [ currojerez@riseup.net: No need to spam the logs in that case, an unbound LVDS encoder is not an error. ] Signed-off-by: NFrancisco Jerez <currojerez@riseup.net> Signed-off-by: NBen Skeggs <bskeggs@redhat.com>
-
由 Emil Velikov 提交于
Perf tables v 1.2 and 1.3 (seen on Geforce FX/ 5) are not long enough to store the voltage label/id v2 - Remove comment from the code Signed-off-by: NEmil Velikov <emil.l.velikov@gmail.com> Signed-off-by: NBen Skeggs <bskeggs@redhat.com>
-
由 Roy Spliet 提交于
In line with envytools, verified on 4 or 5 BIOS'es. Signed-off-by: NBen Skeggs <bskeggs@redhat.com> Signed-off-by: NRoy Spliet <r.spliet@student.tudelft.nl>
-
由 Jan Engelhardt 提交于
Signed-off-by: NJan Engelhardt <jengelh@medozas.de> Signed-off-by: NDave Airlie <airlied@redhat.com>
-
- 04 4月, 2011 2 次提交
-
-
由 Alex Deucher 提交于
Avoid touching the flip setup regs while acceleration is running. Set them at modeset rather than during pageflip. Touching these regs while acceleration is active caused hangs on pre-avivo chips. These chips do not seem to be affected, but better safe than sorry, plus it avoids repeatedly reprogramming the regs every flip. Signed-off-by: NAlex Deucher <alexdeucher@gmail.com> Signed-off-by: NDave Airlie <airlied@redhat.com>
-
由 Alex Deucher 提交于
Signed-off-by: NAlex Deucher <alexdeucher@gmail.com> Signed-off-by: NDave Airlie <airlied@redhat.com>
-
- 01 4月, 2011 2 次提交
-
-
由 Ben Skeggs 提交于
Nouveau needs access to this structure to build an ELD block for use by the HDA audio codec. Signed-off-by: NBen Skeggs <bskeggs@redhat.com> Signed-off-by: NDave Airlie <airlied@redhat.com>
-
由 John Lindgren 提交于
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=35502 agd5f: also add sanity check to connector records. v2: fix one more case. Signed-off-by: NAlex Deucher <alexdeucher@gmail.com> Cc: stable@kernel.org Signed-off-by: NDave Airlie <airlied@redhat.com>
-
- 31 3月, 2011 3 次提交
-
-
由 Lucas De Marchi 提交于
Fixes generated by 'codespell' and manually reviewed. Signed-off-by: NLucas De Marchi <lucas.demarchi@profusion.mobi>
-
由 Chris Wilson 提交于
Once a NAK has been asserted by the slave, we need to reset the GMBUS controller in order to continue. This is done by asserting the Software Clear Interrupt bit and then clearing it again to restore operations. If we don't clear the NAK, then all future GMBUS xfers will fail, including DDC probes and EDID retrieval. v2: Add some comments as suggested by Keith Packard. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=35781Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NKeith Packard <keithp@keithp.com> Tested-by: NJesse Barnes <jbarnes@virtuousgeek.org> Tested-by: N"Mengmeng Meng" <mengmeng.meng@intel.com>
-
由 Chris Wilson 提交于
During modesetting, we need to wait for the hardware to report readiness by polling the registers. Normally, we call msleep() between reads, because some state changes may take a whole vblank or more to complete. However during a panic, we are in an atomic context and cannot sleep. Instead, busy spin polling the termination condition. References: https://bugzilla.kernel.org/show_bug.cgi?id=31772Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NKeith Packard <keithp@keithp.com> Reviewed-by: NJesse Barnes <jbarnes@virtuousgeek.org>
-
- 25 3月, 2011 1 次提交
-
-
由 Chris Wilson 提交于
The LVDS connector should default to connected. We tried our best to verify the claims of the BIOS that the hardware exists during init(), and then during detect() we then try to verify that the panel is open. In the event of an unsuccessful query, we should then always report that the LVDS panel is connected. This was only the case for gen2/3, later generations leaked the return value from the panel probe instead. Reported-and-tested-by: NAlessandro Suardi <alessandro.suardi@gmail.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NKeith Packard <keithp@keithp.com>
-
- 24 3月, 2011 5 次提交
-
-
由 Dave Airlie 提交于
This makes the interface a bit cleaner by leaving a single gap in the vblank bit space instead of creating two gaps. Suggestions from Michel on mailing list/irc. Reviewed-by: NMichel Dänzer <michel@daenzer.net> Signed-off-by: NDave Airlie <airlied@redhat.com>
-
由 Thomas Renninger 提交于
Throw an error if someone tries to fill this with wrong data, instead of simply ignoring the input. Now you get: echo hello >/sys/../power_method -bash: echo: write error: Invalid argument Signed-off-by: NThomas Renninger <trenn@suse.de> CC: Alexander.Deucher@amd.com CC: dri-devel@lists.freedesktop.org Reviewed-by: NAlex Deucher <alexdeucher@gmail.com> Signed-off-by: NDave Airlie <airlied@redhat.com>
-
由 Alex Deucher 提交于
On some servers there is a hardcoded EDID provided in the vbios so that the driver will always see a display connected even if something like a KVM prevents traditional means like DDC or load detection from working properly. Also most server boards with DVI are not actually DVI, but DVO connected to a virtual KVM service processor. If we fail to detect a monitor via DDC or load detection and a hardcoded EDID is available, use it. Additionally, when using the hardcoded EDID, use a copy of it rather than the actual one stored in the driver as the detect() and get_modes() functions may free it if DDC is successful. This fixes the virtual KVM on several internal servers. Signed-off-by: NAlex Deucher <alexdeucher@gmail.com> Cc: stable@kernel.org Signed-off-by: NDave Airlie <airlied@redhat.com>
-
由 Chris Wilson 提交于
This reverts commit a7a75c8f. There are two different variations on how Intel hardware addresses the "Hardware Status Page". One as a location in physical memory and the other as an offset into the virtual memory of the GPU, used in more recent chipsets. (The HWS itself is a cacheable region of memory which the GPU can write to without requiring CPU synchronisation, used for updating various details of hardware state, such as the position of the GPU head in the ringbuffer, the last breadcrumb seqno, etc). These two types of addresses were updated in different locations of code - one inline with the ringbuffer initialisation, and the other during device initialisation. (The HWS page is logically associated with the rings, and there is one HWS page per ring.) During resume, only the ringbuffers were being re-initialised along with the virtual HWS page, leaving the older physical address HWS untouched. This then caused a hang on the older gen3/4 (915GM, 945GM, 965GM) the first time we tried to synchronise the GPU as the breadcrumbs were never being updated. Reported-and-tested-by: NLinus Torvalds <torvalds@linux-foundation.org> Reported-by: NJan Niehusmann <jan@gondor.com> Reported-and-tested-by: NJustin P. Mattock <justinmattock@gmail.com> Reported-and-tested-by: NMichael "brot" Groh <brot@minad.de> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Acked-by: NZhenyu Wang <zhenyuw@linux.intel.com>
-
由 Chris Wilson 提交于
This reverts commit a7a75c8f. There are two different variations on how Intel hardware addresses the "Hardware Status Page". One as a location in physical memory and the other as an offset into the virtual memory of the GPU, used in more recent chipsets. (The HWS itself is a cacheable region of memory which the GPU can write to without requiring CPU synchronisation, used for updating various details of hardware state, such as the position of the GPU head in the ringbuffer, the last breadcrumb seqno, etc). These two types of addresses were updated in different locations of code - one inline with the ringbuffer initialisation, and the other during device initialisation. (The HWS page is logically associated with the rings, and there is one HWS page per ring.) During resume, only the ringbuffers were being re-initialised along with the virtual HWS page, leaving the older physical address HWS untouched. This then caused a hang on the older gen3/4 (915GM, 945GM, 965GM) the first time we tried to synchronise the GPU as the breadcrumbs were never being updated. Reported-and-tested-by: NLinus Torvalds <torvalds@linux-foundation.org> Reported-by: NJan Niehusmann <jan@gondor.com> Reported-by: NJustin P. Mattock <justinmattock@gmail.com> Reported-and-tested-by: NMichael "brot" Groh <brot@minad.de> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 23 3月, 2011 10 次提交
-
-
由 Chris Wilson 提交于
Found by gem_stress. As we perform retirement from a workqueue, it is possible for us to free and unbind objects after the last close on the device, and so after the address space has been torn down and reset to NULL: BUG: unable to handle kernel NULL pointer dereference at 00000054 IP: [<c1295a20>] mutex_lock+0xf/0x27 *pde = 00000000 Oops: 0002 [#1] SMP last sysfs file: /sys/module/vt/parameters/default_utf8 Pid: 5, comm: kworker/u:0 Not tainted 2.6.38+ #214 EIP: 0060:[<c1295a20>] EFLAGS: 00010206 CPU: 1 EIP is at mutex_lock+0xf/0x27 EAX: 00000054 EBX: 00000054 ECX: 00000000 EDX: 00012fff ESI: 00000028 EDI: 00000000 EBP: f706fe20 ESP: f706fe18 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 Process kworker/u:0 (pid: 5, ti=f706e000 task=f7060d00 task.ti=f706e000) Stack: f5aa3c60 00000000 f706fe74 c107e7df 00000246 dea55380 00000054 f5aa3c60 f706fe44 00000061 f70b4000 c13fff84 00000008 f706fe54 00000000 00000000 00012f00 00012fff 00000028 c109e575 f6b36700 00100000 00000000 f706fe90 Call Trace: [<c107e7df>] unmap_mapping_range+0x7d/0x1e6 [<c109e575>] ? mntput_no_expire+0x52/0xb6 [<c11c12f6>] i915_gem_release_mmap+0x49/0x58 [<c11c3449>] i915_gem_object_unbind+0x4c/0x125 [<c11c353f>] i915_gem_free_object_tail+0x1d/0xdb [<c11c55a2>] i915_gem_free_object+0x3d/0x41 [<c11a6be2>] ? drm_gem_object_free+0x0/0x27 [<c11a6c07>] drm_gem_object_free+0x25/0x27 [<c113c3ca>] kref_put+0x39/0x42 [<c11c0a59>] drm_gem_object_unreference+0x16/0x18 [<c11c0b15>] i915_gem_object_move_to_inactive+0xba/0xbe [<c11c0c87>] i915_gem_retire_requests_ring+0x16e/0x1a5 [<c11c3645>] i915_gem_retire_requests+0x48/0x63 [<c11c36ac>] i915_gem_retire_work_handler+0x4c/0x117 [<c10385d1>] process_one_work+0x140/0x21b [<c103734c>] ? __need_more_worker+0x13/0x2a [<c10373b1>] ? need_to_create_worker+0x1c/0x35 [<c11c3660>] ? i915_gem_retire_work_handler+0x0/0x117 [<c1038faf>] worker_thread+0xd4/0x14b [<c1038edb>] ? worker_thread+0x0/0x14b [<c103be1b>] kthread+0x68/0x6d [<c103bdb3>] ? kthread+0x0/0x6d [<c12970f6>] kernel_thread_helper+0x6/0x10 Code: 00 e8 98 fe ff ff 5d c3 55 89 e5 3e 8d 74 26 00 ba 01 00 00 00 e8 84 fe ff ff 5d c3 55 89 e5 53 8d 64 24 fc 3e 8d 74 26 00 89 c3 <f0> ff 08 79 05 e8 ab ff ff ff 89 e0 25 00 e0 ff ff 89 43 10 58 EIP: [<c1295a20>] mutex_lock+0xf/0x27 SS:ESP 0068:f706fe18 CR2: 0000000000000054 Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NKeith Packard <keithp@keithp.com>
-
由 Chris Wilson 提交于
Detected by scripts/coccinelle/free/kfree.cocci. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NKeith Packard <keithp@keithp.com>
-
由 Chris Wilson 提交于
We always skipped flushing the BLT ring if the request flush did not include the RENDER domain. However, this neglects that we try to flush the COMMAND domain after every batch and before the breadcrumb interrupt (to make sure the batch is indeed completed prior to the interrupt firing and so insuring CPU coherency). As a result of the missing flush, incoherency did indeed creep in, most notable when using lots of command buffers and so potentially rewritting an active command buffer (i.e. the GPU was still executing from it even though the following interrupt had already fired and the request/buffer retired). As all ring->flush routines now have the same preconditions, de-duplicate and move those checks up into i915_gem_flush_ring(). Fixes gem_linear_blit. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=35284Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Tested-by: mengmeng.meng@intel.com
-
由 Chris Wilson 提交于
Along the fast path for relocation handling, we attempt to copy directly from the user data structures whilst holding our mutex. This causes lockdep to warn about circular lock dependencies if we need to pagefault the user pages. [Since when handling a page fault on a mmapped bo, we need to acquire the struct mutex whilst already holding the mm semaphore, it is then verboten to acquire the mm semaphore when already holding the struct mutex. The likelihood of the user passing in the relocations contained in a GTT mmaped bo is low, but conceivable for extreme pathology.] In order to force the mm to return EFAULT rather than handle the pagefault, we therefore need to disable pagefaults across the relocation fast path. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Chris Wilson 提交于
Cc: Dave Airlie <airlied@linux.ie> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
由 Jesse Barnes 提交于
Fix up the debug file to report the right frequencies. On SNB, we program the PCU with a frequency ratio, which is multiplied by 100MHz on the CPU side. But GFX only runs at half that, so report it as such to avoid confusion. Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NKeith Packard <keithp@keithp.com>
-
由 Takashi Iwai 提交于
The order of the calls does matter indeed. Swapping the call order of intel_dp_destroy() and intel_dp_encoder_destroy() fixes the problem. This is because i2c_del_adapter unregisters the device which parent is intel_connector, and connectors are removed in intel_dp_destroy(). Thus intel_dp_encoder_destroy() must be called before intel_dp_destroy(). Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=24822Signed-off-by: NTakashi Iwai <tiwai@suse.de> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NKeith Packard <keithp@keithp.com>
-
由 Chris Wilson 提交于
... even though it was disabled. A mistake in the handling of fence reuse caused us to skip the vital delay of waiting for the object to finish rendering before changing the register. This resulted in us changing the fence register whilst the bo was active and so causing the blits to complete using the wrong stride or even the wrong tiling. (Visually the effect is that small blocks of the screen look like they have been interlaced). The fix is to wait for the GPU to finish using the memory region pointed to by the fence before changing it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34584 Cc: Andy Whitcroft <apw@canonical.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch> [Note for 2.6.38-stable, we need to reintroduce the interruptible passing] Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Tested-by: NDave Airlie <airlied@linux.ie>
-
由 Yuanhan Liu 提交于
A broken implementation of is_pot() prevented the detection of when a singular pipe was enabled. Eric Anholt pointed out the existence of is_power_of_2() so use that instead of our broken code! Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=35402Signed-off-by: NYuanhan Liu <yuanhan.liu@intel.com> Tested-by: xunx.fang@intel.com Reviewed-by: NKeith Packard <keithp@keithp.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
When i915_gem_retire_requests_ring calls i915_gem_request_remove_from_client, the client_list for that request may already be removed in i915_gem_release. So we may call twice list_del(&request->client_list), resulting in an oops like this report: [126167.230394] BUG: unable to handle kernel paging request at 00100104 [126167.230699] IP: [<f8c2ce44>] i915_gem_retire_requests_ring+0xd4/0x240 [i915] [126167.231042] *pdpt = 00000000314c1001 *pde = 0000000000000000 [126167.231314] Oops: 0002 [#1] SMP [126167.231471] last sysfs file: /sys/devices/LNXSYSTM:00/device:00/PNP0C0A:00/power_supply/BAT1/current_now [126167.231901] Modules linked in: snd_seq_dummy nls_utf8 isofs btrfs zlib_deflate libcrc32c ufs qnx4 hfsplus hfs minix ntfs vfat msdos fat jfs xfs exportfs reiserfs cryptd aes_i586 aes_generic binfmt_misc vboxnetadp vboxnetflt vboxdrv parport_pc ppdev snd_hda_codec_hdmi snd_hda_codec_conexant snd_hda_intel snd_hda_codec snd_hwdep arc4 snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq uvcvideo videodev snd_timer snd_seq_device joydev iwlagn iwlcore mac80211 snd cfg80211 soundcore i915 drm_kms_helper snd_page_alloc psmouse drm serio_raw i2c_algo_bit video lp parport usbhid hid sky2 sdhci_pci ahci sdhci libahci [126167.232018] [126167.232018] Pid: 1101, comm: Xorg Not tainted 2.6.38-6-generic-pae #34-Ubuntu Gateway MC7833U / [126167.232018] EIP: 0060:[<f8c2ce44>] EFLAGS: 00213246 CPU: 0 [126167.232018] EIP is at i915_gem_retire_requests_ring+0xd4/0x240 [i915] [126167.232018] EAX: 00200200 EBX: f1ac25b0 ECX: 00000040 EDX: 00100100 [126167.232018] ESI: f1a2801c EDI: e87fc060 EBP: ef4d7dd8 ESP: ef4d7db0 [126167.232018] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 [126167.232018] Process Xorg (pid: 1101, ti=ef4d6000 task=f1ba6500 task.ti=ef4d6000) [126167.232018] Stack: [126167.232018] f1a28000 f1a2809c f1a28094 0058bd97 f1aa2400 f1a2801c 0058bd7b 0058bd85 [126167.232018] f1a2801c f1a28000 ef4d7e38 f8c2e995 ef4d7e30 ef4d7e60 c14d1ebc f6b3a040 [126167.232018] f1522cc0 000000db 00000000 f1ba6500 ffffffa1 00000000 00000001 f1a29214 [126167.232018] Call Trace: Unfortunately the call trace reported was cut, but looking at debug symbols the crash is at __list_del, when probably list_del is called twice on the same request->client_list, as the dereferenced value is LIST_POISON1 + 4, and by looking more at the debug symbols before list_del call it should have being called by i915_gem_request_remove_from_client And as I can see in the code, it seems we indeed have the possibility to remove a request->client_list twice, which would cause the above, because we do list_del(&request->client_list) on both i915_gem_request_remove_from_client and i915_gem_release As Chris Wilson pointed out, it's indeed the case: "(...) I had thought that the actual insertion/deletion was serialised under the struct mutex and the intention of the spinlock was to protect the unlocked list traversal during throttling. However, I missed that i915_gem_release() is also called without struct mutex and so we do need the double check for i915_gem_request_remove_from_client()." This change does the required check to avoid the duplicate remove of request->client_list. Bugzilla: http://bugs.launchpad.net/bugs/733780 Cc: stable@kernel.org # 2.6.38 Signed-off-by: NHerton Ronaldo Krzesinski <herton.krzesinski@canonical.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-