- 18 10月, 2013 4 次提交
-
-
由 Daniel Vetter 提交于
hw designers decided to change the CRC registers and coalesce them all into one. Otherwise nothing changed. I've opted for a new hsw_ version to grab the crc sample since hsw+1 will have the same crc registers, but different interrupt source registers. So this little helper function will come handy there. Also refactor the display error handler with a neat pipe loop. v2: Use for_each_pipe. Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Daniel Vetter 提交于
Suggested by Ville. Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Daniel Vetter 提交于
We enable the interrupt unconditionally and only control it through the enable bit in the CRC control register. v2: Extract per-platform helpers to compute the register values. Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Daniel Vetter 提交于
The ringbuffer update logic should always be the same, but different platforms have different amounts of CRC registers. Hence extract it. Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 16 10月, 2013 6 次提交
-
-
由 Daniel Vetter 提交于
Also use #ifdef to keep consistent with all other such cases. Cc: Damien Lespiau <damien.lespiau@intel.com> Acked-by: NDamien Lespiau <damien.lespiau@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Damien Lespiau 提交于
seq_file is not quite the right interface for these ones. We have a circular buffer with a new entry per vblank on one side and a process wanting to dequeue the CRC with a read(). It's quite racy to wait for vblank in user land and then try to read a pipe_crc file, sometimes the CRC interrupt hasn't been fired and we end up with an EOF. So, let's have the read on the pipe_crc file block until the interrupt gives us a new entry. At that point we can wake the reading process. Signed-off-by: NDamien Lespiau <damien.lespiau@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Damien Lespiau 提交于
This shouldn't happen as the buffer is freed after disable pipe CRCs, but better be safe than sorry. Signed-off-by: NDamien Lespiau <damien.lespiau@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Damien Lespiau 提交于
Signed-off-by: NDamien Lespiau <damien.lespiau@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Damien Lespiau 提交于
There are a few good properties to a circular buffer, for instance it has a number of entries (before we were always dumping the full buffer). Signed-off-by: NDamien Lespiau <damien.lespiau@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Shuang He 提交于
There are several points in the display pipeline where CRCs can be computed on the bits flowing there. For instance, it's usually possible to compute the CRCs of the primary plane, the sprite plane or the CRCs of the bits after the panel fitter (collectively called pipe CRCs). v2: Quite a bit of rework here and there (Damien) Signed-off-by: NShuang He <shuang.he@intel.com> Signed-off-by: NDamien Lespiau <damien.lespiau@intel.com> [danvet: Fix intermediate compile file reported by Wu Fengguang's kernel builder.] Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 14 10月, 2013 5 次提交
-
-
由 Ville Syrjälä 提交于
Gen2 doesn't have a hardware frame counter that can be read out. Just provide a stub .get_vblank_counter() that always returns 0 instead of trying to read non-existing registers. Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Ville Syrjälä 提交于
Gen2 doesn't have the pixelcount register that gen3 and gen4 have. Instead we must use the scanline counter like we do for ctg+. Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Ville Syrjälä 提交于
The DSL register increments at the start of horizontal sync, so it manages to miss the entire active portion of the current line. Improve the get_scanoutpos accuracy a bit when the scanout position is close to the start or end of vblank. We can do that by double checking the DSL value against the vblank status bit from ISR. Cc: Mario Kleiner <mario.kleiner@tuebingen.mpg.de> Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: mario.kleiner.de@gmail.com Tested-by: mario.kleiner.de@gmail.com Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Ville Syrjälä 提交于
The reported scanout position must be relative to the end of vblank. Currently we manage to fumble that in a few ways. First we don't consider the case when vtotal != vbl_end. While that isn't very common (happens maybe only w/ old panel fitting hardware), we can fix it easily enough. The second issue is that on pre-CTG hardware we convert the pixel count to horizontal/vertical components at the very beginning, and then forget to adjust the horizontal component to be relative to vbl_end. So instead we should keep our numbers in the pixel count domain while we're adjusting the position to be relative to vbl_end. Then when we do the conversion in the end, both vertical _and_ horizontal components will come out correct. v2: Change position to int from u32 to avoid sign issues Cc: Mario Kleiner <mario.kleiner@tuebingen.mpg.de> Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: mario.kleiner.de@gmail.com Tested-by: mario.kleiner.de@gmail.com Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Ville Syrjälä 提交于
We have all the information we need in the mode structure, so going and reading it from the hardware is pointless, and slower. We never populated ->get_vblank_timestamp() in the UMS case, and as that is the only way we'd ever call ->get_scanout_position(), we can completely ignore UMS in i915_get_crtc_scanoutpos(). Also reorganize intel_irq_init() a bit to clarify the KMS vs. UMS situation. v2: Drop UMS code Cc: Mario Kleiner <mario.kleiner@tuebingen.mpg.de> Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: mario.kleiner.de@gmail.com Tested-by: mario.kleiner.de@gmail.com Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 12 10月, 2013 1 次提交
-
-
由 Ville Syrjälä 提交于
The old style frame counter increments at the start of active video. However for i915_get_vblank_counter() we want a counter that increments at the start of vblank. Fortunately the low frame counter register also contains the pixel counter for the current frame. We can can compare that against the vblank start pixel count to determine if we need to increment the frame counter by 1 to get the correct answer. Also reorganize the function pointer assignments in intel_irq_init() a bit to avoid confusing people. Cc: Mario Kleiner <mario.kleiner@tuebingen.mpg.de> Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: NMario Kleiner <mario.kleiner@tuebingen.mpg.de> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 10 10月, 2013 1 次提交
-
-
由 Chris Wilson 提交于
We lost the ability to capture the first error for a stuck ring in the recent hangcheck robustification. Whilst both error states are interesting (why does the GPU not recover is also essential to debug), our primary goal is to fix the initial hang and so we need to capture the first error state upon taking hangcheck action. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 04 10月, 2013 3 次提交
-
-
由 Chris Wilson 提交于
After applying wait-boost we often find ourselves stuck at higher clocks than required. The current threshold value requires the GPU to be continuously and completely idle for 313ms before it is dropped by one bin. Conversely, we require the GPU to be busy for an average of 90% over a 84ms period before we upclock. So the current thresholds almost never downclock the GPU, and respond very slowly to sudden demands for more power. It is easy to observe that we currently lock into the wrong bin and both underperform in benchmarks and consume more power than optimal (just by repeating the task and measuring the different results). An alternative approach, as discussed in the bspec, is to use a continuous threshold for upclocking, and an average value for downclocking. This is good for quickly detecting and reacting to state changes within a frame, however it fails with the common throttling method of waiting upon the outstanding frame - at least it is difficult to choose a threshold that works well at 15,000fps and at 60fps. So continue to use average busy/idle loads to determine frequency change. v2: Use 3 power zones to keep frequencies low in steady-state mostly idle (e.g. scrolling, interactive 2D drawing), and frequencies high for demanding games. In between those end-states, we use a fast-reclocking algorithm to converge more quickly on the desired bin. v3: Bug fixes - make sure we reset adj after switching power zones. v4: Tune - drop the continuous busy thresholds as it prevents us from choosing the right frequency for glxgears style swap benchmarks. Instead the goal is to be able to find the right clocks irrespective of the wait-boost. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: Stéphane Marchesin <stephane.marchesin@gmail.com> Cc: Owen Taylor <otaylor@redhat.com> Cc: "Meng, Mengmeng" <mengmeng.meng@intel.com> Cc: "Zhuang, Lena" <lena.zhuang@intel.com> Reviewed-by: NJesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Chris Wilson 提交于
If we encounter a situation where the CPU blocks waiting for results from the GPU, give the GPU a kick to boost its the frequency. This should work to reduce user interface stalls and to quickly promote mesa to high frequencies - but the cost is that our requested frequency stalls high (as we do not idle for long enough before rc6 to start reducing frequencies, nor are we aggressive at down clocking an underused GPU). However, this should be mitigated by rc6 itself powering off the GPU when idle, and that energy use is dependent upon the workload of the GPU in addition to its frequency (e.g. the math or sampler functions only consume power when used). Still, this is likely to adversely affect light workloads. In particular, this nearly eliminates the highly noticeable wake-up lag in animations from idle. For example, expose or workspace transitions. (However, given the situation where we fail to downclock, our requested frequency is almost always the maximum, except for Baytrail where we manually downclock upon idling. This often masks the latency of upclocking after being idle, so animations are typically smooth - at the cost of increased power consumption.) Stéphane raised the concern that this will punish good applications and reward bad applications - but due to the nature of how mesa performs its client throttling, I believe all mesa applications will be roughly equally affected. To address this concern, and to prevent applications like compositors from permanently boosting the RPS state, we ratelimit the frequency of the wait-boosts each client recieves. Unfortunately, this techinique is ineffective with Ironlake - which also has dynamic render power states and suffers just as dramatically. For Ironlake, the thermal/power headroom is shared with the CPU through Intelligent Power Sharing and the intel-ips module. This leaves us with no GPU boost frequencies available when coming out of idle, and due to hardware limitations we cannot change the arbitration between the CPU and GPU quickly enough to be effective. v2: Limit each client to receiving a single boost for each active period. Tested by QA to only marginally increase power, and to demonstrably increase throughput in games. No latency measurements yet. v3: Cater for front-buffer rendering with manual throttling. v4: Tidy up. v5: Sadly the compositor needs frequent boosts as it may never idle, but due to its picking mechanism (using ReadPixels) may require frequent waits. Those waits, along with the waits for the vrefresh swap, conspire to keep the GPU at low frequencies despite the interactive latency. To overcome this we ditch the one-boost-per-active-period and just ratelimit the number of wait-boosts each client can receive. Reported-and-tested-by: NPaul Neumann <paul104x@yahoo.de> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68716Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: Stéphane Marchesin <stephane.marchesin@gmail.com> Cc: Owen Taylor <otaylor@redhat.com> Cc: "Meng, Mengmeng" <mengmeng.meng@intel.com> Cc: "Zhuang, Lena" <lena.zhuang@intel.com> Reviewed-by: NJesse Barnes <jbarnes@virtuousgeek.org> [danvet: No extern for function prototypes in headers.] Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Chris Wilson 提交于
When we switched to always using a timeout in conjunction with wait_seqno, we lost the ability to detect missed interrupts. Since, we have had issues with interrupts on a number of generations, and they are required to be delivered in a timely fashion for a smooth UX, it is important that we do log errors found in the wild and prevent the display stalling for upwards of 1s every time the seqno interrupt is missed. Rather than continue to fix up the timeouts to work around the interface impedence in wait_event_*(), open code the combination of wait_event[_interruptible][_timeout], and use the exposed timer to poll for seqno should we detect a lost interrupt. v2: In order to satisfy the debug requirement of logging missed interrupts with the real world requirments of making machines work even if interrupts are hosed, we revert to polling after detecting a missed interrupt. v3: Throw in a debugfs interface to simulate broken hw not reporting interrupts. v4: s/EGAIN/EAGAIN/ (Imre) Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NImre Deak <imre.deak@intel.com> [danvet: Don't use the struct typedef in new code.] Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 01 10月, 2013 1 次提交
-
-
由 Chris Wilson 提交于
We only wish to know the value of seqno when emitting the tracepoint, so move the query from a parameter to the macro to inside the conditional macro body so that the query is only evaluated when required. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NRodrigo Vivi <rodrigo.vivi@gmail.com> Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 20 9月, 2013 3 次提交
-
-
由 Paulo Zanoni 提交于
We currently disable the ERR_INT interrupts while running the IRQ handler because we fear that if we do an unclaimed register access from inside the IRQ handler we'll keep triggering the IRQ handler forever. The problem is that since we always disable the ERR_INT interrupts at the IRQ handler, when we get a FIFO underrun we'll always print both messages: - "uncleared fifo underrun on pipe A" - "Pipe A FIFO underrun" Because the "was_enabled" variable from ivybridge_set_fifo_underrun_reporting will always be false (since we disable ERR int at the IRQ handler!). Instead of actually fixing ivybridge_set_fifo_underrun_reporting, let's just remove the "disable ERR_INT during the IRQ handler" code. As far as we know we shouldn't really be triggering ERR_INT interrupts from the IRQ handler, so if we ever get stuck in the endless loop of interrupts we can git-bisect and revert (and we can even bisect and revert this patch in case I'm just wrong). As a bonus, our IRQ handler is now simpler and a few nanoseconds faster. Signed-off-by: NPaulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Ben Widawsky 提交于
We'd only ever used this define to denote whether or not we have the dynamic parity feature (DPF) and never to determine whether or not L3 exists. Baytrail is a good example of where L3 exists, and not DPF. This patch provides clarify in the code for future use cases which might want to actually query whether or not L3 exists. v2: Add /* DPF == dynamic parity feature */ Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: NBen Widawsky <ben@bwidawsk.net> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Ben Widawsky 提交于
Certain HSW SKUs have a second bank of L3. This L3 remapping has a separate register set, and interrupt from the first "slice". A slice is simply a term to define some subset of the GPU's l3 cache. This patch implements both the interrupt handler, and ability to communicate with userspace about this second slice. v2: Remove redundant check about non-existent slice. Change warning about interrupts of unknown slices to WARN_ON_ONCE Handle the case where we get 2 slice interrupts concurrently, and switch the tracking of interrupts to be non-destructive (all Ville) Don't enable/mask the second slice parity interrupt for ivb/vlv (even though all docs I can find claim it's rsvd) (Ville + Bryan) Keep BYT excluded from L3 parity v3: Fix the slice = ffs to be decremented by one (found by Ville). When I initially did my testing on the series, I was using 1-based slice counting, so this code was correct. Not sure why my simpler tests that I've been running since then didn't pick it up sooner. Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: NBen Widawsky <ben@bwidawsk.net> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 17 9月, 2013 1 次提交
-
-
由 Jani Nikula 提交于
This reduces dmesg noise when there's a glitch on the hpd line, or there are more than one connectors on the same hpd line and only one of them changes. While at it, switch to use the friendly status names instead of numbers. Signed-off-by: NJani Nikula <jani.nikula@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 09 9月, 2013 1 次提交
-
-
由 Daniel Vetter 提交于
My g33 here seems to be shockingly good at hitting them all. This time around kms_flip/flip-vs-panning-vs-hang blows up: intel_crtc_wait_for_pending_flips correctly checks for gpu hangs and if a gpu hang is pending aborts the wait for outstanding flips so that the setcrtc call will succeed and release the crtc mutex. And the gpu hang handler needs that lock in intel_display_handle_reset to be able to complete outstanding flips. The problem is that we can race in two ways: - Waiters on the dev_priv->pending_flip_queue aren't woken up after we've the reset as pending, but before we actually start the reset work. This means that the waiter doesn't notice the pending reset and hence will keep on hogging the locks. Like with dev->struct_mutex and the ring->irq_queue wait queues we there need to wake up everyone that potentially holds a lock which the reset handler needs. - intel_display_handle_reset was called _after_ we've already signalled the completion of the reset work. Which means a waiter could sneak in, grab the lock and never release it (since the pageflips won't ever get released). Similar to resetting the gem state all the reset work must complete before we update the reset counter. Contrary to the gem reset we don't need to have a second explicit wake up call since that will have happened already when completing the pageflips. We also don't have any issues that the completion happens while the reset state is still pending - wait_for_pending_flips is only there to ensure we display the right frame. After a gpu hang&reset events such guarantees are out the window anyway. This is in contrast to the gem code where too-early wake-up would result in unnecessary restarting of ioctls. Also, since we've gotten these various deadlocks and ordering constraints wrong so often throw copious amounts of comments at the code. This deadlock regression has been introduced in the commit which added the pageflip reset logic to the gpu hang work: commit 96a02917 Author: Ville Syrjälä <ville.syrjala@linux.intel.com> Date: Mon Feb 18 19:08:49 2013 +0200 drm/i915: Finish page flips and update primary planes after a GPU reset v2: - Add comments to explain how the wake_up serves as memory barriers for the atomic_t reset counter. - Improve the comments a bit as suggested by Chris Wilson. - Extract the wake_up calls before/after the reset into a little i915_error_wake_up and unconditionally wake up the pending_flip_queue waiters, again as suggested by Chris Wilson. v3: Throw copious amounts of comments at i915_error_wake_up as suggested by Chris Wilson. Cc: stable@vger.kernel.org Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 06 9月, 2013 1 次提交
-
-
由 Mika Kuoppala 提交于
Score and action reveals what all the rings were doing and why hang was declared. Add idle state so that we can distinguish between waiting and idle ring. v2: - add idle as a hangcheck action - consensed hangcheck status to single line (Chris) - mark active explicitly when we are making progress (Chris) Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NMika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 05 9月, 2013 1 次提交
-
-
由 Daniel Vetter 提交于
Since we've started to clean up pending flips when the gpu hangs in commit 96a02917 Author: Ville Syrjälä <ville.syrjala@linux.intel.com> Date: Mon Feb 18 19:08:49 2013 +0200 drm/i915: Finish page flips and update primary planes after a GPU reset the gpu reset work now also grabs modeset locks. But since work items on our private work queue are not allowed to do that due to the flush_workqueue from the pageflip code this results in a neat deadlock: INFO: task kms_flip:14676 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kms_flip D ffff88019283a5c0 0 14676 13344 0x00000004 ffff88018e62dbf8 0000000000000046 ffff88013bdb12e0 ffff88018e62dfd8 ffff88018e62dfd8 00000000001d3b00 ffff88019283a5c0 ffff88018ec21000 ffff88018f693f00 ffff88018eece000 ffff88018e62dd60 ffff88018eece898 Call Trace: [<ffffffff8138ee7b>] schedule+0x60/0x62 [<ffffffffa046c0dd>] intel_crtc_wait_for_pending_flips+0xb2/0x114 [i915] [<ffffffff81050ff4>] ? finish_wait+0x60/0x60 [<ffffffffa0478041>] intel_crtc_set_config+0x7f3/0x81e [i915] [<ffffffffa031780a>] drm_mode_set_config_internal+0x4f/0xc6 [drm] [<ffffffffa0319cf3>] drm_mode_setcrtc+0x44d/0x4f9 [drm] [<ffffffff810e44da>] ? might_fault+0x38/0x86 [<ffffffffa030d51f>] drm_ioctl+0x2f9/0x447 [drm] [<ffffffff8107a722>] ? trace_hardirqs_off+0xd/0xf [<ffffffffa03198a6>] ? drm_mode_setplane+0x343/0x343 [drm] [<ffffffff8112222f>] ? mntput_no_expire+0x3e/0x13d [<ffffffff81117f33>] vfs_ioctl+0x18/0x34 [<ffffffff81118776>] do_vfs_ioctl+0x396/0x454 [<ffffffff81396b37>] ? sysret_check+0x1b/0x56 [<ffffffff81118886>] SyS_ioctl+0x52/0x7d [<ffffffff81396b12>] system_call_fastpath+0x16/0x1b 2 locks held by kms_flip/14676: #0: (&dev->mode_config.mutex){+.+.+.}, at: [<ffffffffa0316545>] drm_modeset_lock_all+0x22/0x59 [drm] #1: (&crtc->mutex){+.+.+.}, at: [<ffffffffa031656b>] drm_modeset_lock_all+0x48/0x59 [drm] INFO: task kworker/u8:4:175 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kworker/u8:4 D ffff88018de9a5c0 0 175 2 0x00000000 Workqueue: i915 i915_error_work_func [i915] ffff88018e37dc30 0000000000000046 ffff8801938ab8a0 ffff88018e37dfd8 ffff88018e37dfd8 00000000001d3b00 ffff88018de9a5c0 ffff88018ec21018 0000000000000246 ffff88018e37dca0 000000005a865a86 ffff88018de9a5c0 Call Trace: [<ffffffff8138ee7b>] schedule+0x60/0x62 [<ffffffff8138f23d>] schedule_preempt_disabled+0x9/0xb [<ffffffff8138d0cd>] mutex_lock_nested+0x205/0x3b1 [<ffffffffa0477094>] ? intel_display_handle_reset+0x7e/0xbd [i915] [<ffffffffa0477094>] ? intel_display_handle_reset+0x7e/0xbd [i915] [<ffffffffa0477094>] intel_display_handle_reset+0x7e/0xbd [i915] [<ffffffffa044e0a2>] i915_error_work_func+0x128/0x147 [i915] [<ffffffff8104a89a>] process_one_work+0x1d4/0x35a [<ffffffff8104a821>] ? process_one_work+0x15b/0x35a [<ffffffff8104b4a5>] worker_thread+0x144/0x1f0 [<ffffffff8104b361>] ? rescuer_thread+0x275/0x275 [<ffffffff8105076d>] kthread+0xac/0xb4 [<ffffffff81059d30>] ? finish_task_switch+0x3b/0xc0 [<ffffffff810506c1>] ? __kthread_parkme+0x60/0x60 [<ffffffff81396a6c>] ret_from_fork+0x7c/0xb0 [<ffffffff810506c1>] ? __kthread_parkme+0x60/0x60 3 locks held by kworker/u8:4/175: #0: (i915){.+.+.+}, at: [<ffffffff8104a821>] process_one_work+0x15b/0x35a #1: ((&dev_priv->gpu_error.work)){+.+.+.}, at: [<ffffffff8104a821>] process_one_work+0x15b/0x35a #2: (&crtc->mutex){+.+.+.}, at: [<ffffffffa0477094>] intel_display_handle_reset+0x7e/0xbd [i915] This blew up while running kms_flip/flip-vs-panning-vs-hang-interruptible on one of my older machines. Unfortunately (despite the proper lockdep annotations for flush_workqueue) lockdep still doesn't detect this correctly, so we need to rely on chance to discover these bugs. Apply the usual bugfix and schedule the reset work on the system workqueue to keep our own driver workqueue free of any modeset lock grabbing. Note that this is not a terribly serious regression since before the offending commit we'd simply have stalled userspace forever due to failing to abort all outstanding pageflips. v2: Add a comment as requested by Chris. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 04 9月, 2013 1 次提交
-
-
由 Daniel Vetter 提交于
Historically we've run our own driver hotplug handling in our own work-queue, which then launched the drm core hotplug handling in the system workqueue. This is important since we flush our own driver workqueue in the pageflip code while hodling modeset locks, and only the drm hotplug code grabbed these locks. But with commit 69787f7d Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Tue Oct 23 18:23:34 2012 +0000 drm: run the hpd irq event code directly this was changed and now we could deadlock in our flip handler if there's a hotplug work blocking the progress of the crucial unpin works. So this broke the careful deadlock avoidance implemented in commit b4a98e57 Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Thu Nov 1 09:26:26 2012 +0000 drm/i915: Flush outstanding unpin tasks before pageflipping Since the rule thus far has been that work items on our own workqueue may never grab modeset locks simply restore that rule again. v2: Add a comment to the declaration of dev_priv->wq to warn readers about the tricky implications of using it. Suggested by Chris Wilson. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Stuart Abercrombie <sabercrombie@chromium.org> Reported-by: NStuart Abercrombie <sabercrombie@chromium.org> References: http://permalink.gmane.org/gmane.comp.freedesktop.xorg.drivers.intel/26239 Cc: stable@vger.kernel.org Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> [danvet: Squash in a comment at the place where we schedule the work. Requested after-the-fact by Chris on irc since the hpd work isn't the only place we botch this.] Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 03 9月, 2013 1 次提交
-
-
由 Daniel Vetter 提交于
We already have a big splashing *ERROR* for all the relevant cases of hangs, so this one here is redudant. And it results in an unclean dmesg when running with simulated hangs. Regression has been introduced in commit 05407ff8 Author: Mika Kuoppala <mika.kuoppala@linux.intel.com> Date: Thu May 30 09:04:29 2013 +0300 drm/i915: detect hang using per ring hangcheck_score Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68641Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 23 8月, 2013 9 次提交
-
-
由 Paulo Zanoni 提交于
This patch allows PC8+ states on Haswell. These states can only be reached when all the display outputs are disabled, and they allow some more power savings. The fact that the graphics device is allowing PC8+ doesn't mean that the machine will actually enter PC8+: all the other devices also need to allow PC8+. For now this option is disabled by default. You need i915.allow_pc8=1 if you want it. This patch adds a big comment inside i915_drv.h explaining how it works and how it tracks things. Read it. v2: (this is not really v2, many previous versions were already sent, but they had different names) - Use the new functions to enable/disable GTIMR and GEN6_PMIMR - Rename almost all variables and functions to names suggested by Chris - More WARNs on the IRQ handling code - Also disable PC8 when there's GPU work to do (thanks to Ben for the help on this), so apps can run caster - Enable PC8 on a delayed work function that is delayed for 5 seconds. This makes sure we only enable PC8+ if we're really idle - Make sure we're not in PC8+ when suspending v3: - WARN if IRQs are disabled on __wait_seqno - Replace some DRM_ERRORs with WARNs - Fix calls to restore GT and PM interrupts - Use intel_mark_busy instead of intel_ring_advance to disable PC8 v4: - Use the force_wake, Luke! v5: - Remove the "IIR is not zero" WARNs - Move the force_wake chunk to its own patch - Only restore what's missing from RC6, not everything Signed-off-by: NPaulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Paulo Zanoni 提交于
Because hsw_pm_irq_handler does exactly what gen6_rps_irq_handler does and also processes the 2 additional VEBOX bits. So merge those functions and wrap the VEBOX bits on a HAS_VEBOX check. This check isn't really necessary since the bits are reserved on SNB/IVB/VLV, but it's a good documentation on who uses them. v2: - Change IS_HASWELL check to HAS_VEBOX Signed-off-by: NPaulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: NBen Widawsky <ben@bwidawsk.net> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Paulo Zanoni 提交于
It seems we've been doing this ever since we started processing the RPS events on a work queue, on commit "drm/i915: move gen6 rps handling to workqueue", 4912d041. The problem is: when we add work to the queue, instead of just masking the bits we queued and leaving all the others on their current state, we mask the bits we queued and unmask all the others. This basically means we'll be unmasking a bunch of interrupts we're not going to process. And if you look at gen6_pm_rps_work, we unmask back only GEN6_PM_RPS_EVENTS, which means the bits we unmasked when adding work to the queue will remain unmasked after we process the queue. Notice that even though we unmask those unrelated interrupts, we never enable them on IER, so they don't fire our interrupt handler, they just stay there on IIR waiting to be cleared when something else triggers the interrupt handler. So this patch does what seems to make more sense: mask only the bits we add to the queue, without unmasking anything else, and so we'll unmask them after we process the queue. As a side effect we also have to remove that WARN, because it is not only making sure we don't mask useful interrupts, it is also making sure we do unmask useless interrupts! That piece of code should not be responsible for knowing which bits should be unmasked, so just don't assert anything, and trust that snb_disable_pm_irq should be doing the right thing. With i915.enable_pc8=1 I was getting ocasional "GEN6_PMIIR is not 0" error messages due to the fact that we unmask those unrelated interrupts but don't enable them. Note: if bugs start bisecting to this patch, then it probably means someone was relying on the fact that we unmask everything by accident, then we should fix gen5_gt_irq_postinstall or whoever needs the accidentally unmasked interrupts. Or maybe I was just wrong and we need to revert this patch :) Note: This started to be a more real issue with the addition of the VEBOX support since now we do enable more than just the minimal set of RPS interrupts in the IER register. Which means after the first rps interrupt has happened we will never mask the VEBOX user interrupts again and so will blow through cpu time needlessly when running video workloads. Signed-off-by: NPaulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: NBen Widawsky <ben@bwidawsk.net> [danvet: Add note that this started to matter with VEBOX much more.] Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Paulo Zanoni 提交于
On SNB/IVB/VLV we only call gen6_rps_irq_handler if one of the IIR bits set is part of GEN6_PM_RPS_EVENTS, but at gen6_rps_irq_handler we add all the enabled IIR bits to the work queue, not only the ones that are part of GEN6_PM_RPS_EVENTS. But then gen6_pm_rps_work only processes GEN6_PM_RPS_EVENTS, so it's useless to add anything that's not GEN6_PM_RPS_EVENTS to the work queue. As a bonus, gen6_rps_irq_handler looks more similar to hsw_pm_irq_handler, so we may be able to merge them in the future. v2: - Add a WARN in case we queued something we're not going to process. Signed-off-by: NPaulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> (v1) Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Paulo Zanoni 提交于
If the error interrupts are already disabled, don't disable and reenable them. This is going to be needed when we're in PC8+, where all the interrupts are disabled so we won't risk re-enabling DE_ERR_INT_IVB. v2: Use dev_priv->irq_mask (Chris) Signed-off-by: NPaulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: NRodrigo Vivi <rodrigo.vivi@gmail.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Paulo Zanoni 提交于
Just like irq_mask and gt_irq_mask, use it to track the status of GEN6_PMIMR so we don't need to read it again every time we call snb_update_pm_irq. Signed-off-by: NPaulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: NRodrigo Vivi <rodrigo.vivi@gmail.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Paulo Zanoni 提交于
I did some brief tests and the "new_val = pmimr" condition usually happens a few times after exiting games. Note: This is also prep work to track the GEN6_PMIMR register state in dev_priv->pm_imr. This happens in the next patch. Signed-off-by: NPaulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: NRodrigo Vivi <rodrigo.vivi@gmail.com> [danvet: Add note to explain why we want this, as per the discussion between Chris and Paulo.] Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Paulo Zanoni 提交于
Just like we're doing with the other IMR changes. One of the functional changes is that not every caller was doing the POSTING_READ. Signed-off-by: NPaulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: NRodrigo Vivi <rodrigo.vivi@gmail.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Paulo Zanoni 提交于
Just like the functions that touch DEIMR and SDEIMR, but for GTIMR. The new functions contain a POSTING_READ(GTIMR) which was not present at the 2 callers inside i915_irq.c. The implementation is based on ibx_display_interrupt_update. Signed-off-by: NPaulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: NRodrigo Vivi <rodrigo.vivi@gmail.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 22 8月, 2013 1 次提交
-
-
由 Jani Nikula 提交于
Although I could not reproduce this (different compiler version, perhaps), reportedly we get: drivers/gpu/drm/i915/i915_irq.c:1943:27: warning: ‘score’ may be used uninitialized in this function [-Wuninitialized] Drop the 'score' variable altogether as it's not really needed. Reported-by: NKees Cook <keescook@chromium.org> Signed-off-by: NJani Nikula <jani.nikula@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-