- 25 4月, 2020 3 次提交
-
-
由 Chris Wilson 提交于
For many configuration details within RC6 and RPS we are programming intervals for the internal clocks. From gen11, these clocks are configuration via the RPM_CONFIG and so for convenience, we would like to convert to/from more natural units (ns). Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NAndi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200424162805.25920-2-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Add tracek to the RPS events (interrupts, worker, enabling, threshold selection, frequency setting), so that if we have to debug reticent HW we have some traces to start from. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NAndi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200424162805.25920-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
The RPS DOWN_TIMEOUT interrupt is signaled after a period of rc6, and upon receipt of that interrupt we reprogram the GPU clocks down to the next idle notch [to help convserve power during rc6]. However, on execlists, we benefit from soft-rc6 immediately parking the GPU and setting idle frequencies upon idling [within a jiffie], and here the interrupt prevents us from restarting from our last frequency. In the process, we can simply opt for a static pm_events mask and rely on the enable/disable interrupts to flush the worker on parking. This will reduce the amount of oscillation observed during steady workloads with microsleeps, as each time the rc6 timeout occurs we immediately follow with a waitboost for a dropped frame. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NAndi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422001703.1697-1-chris@chris-wilson.co.uk
-
- 24 4月, 2020 3 次提交
-
-
由 Chris Wilson 提交于
The history of i915_vma_close() is confusing, as is its use. As the lifetime of the i915_vma is currently bounded by the object it is attached to, we needed a means of identify when a vma was no longer in use by userspace (via the user's fd). This is further complicated by that only ppgtt vma should be closed at the user's behest, as the ggtt were always shared. Now that we attach the vma to a lut on the user's context, the open count does indicate how many unique and open context/vm are referencing this vma from the user. As such, we can and should just use the open_count to track when the vma is still in use by userspace. It's a poor man's replacement for reference counting. Closes: https://gitlab.freedesktop.org/drm/intel/issues/1193Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NAndi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422190558.30509-1-chris@chris-wilson.co.uk
-
由 Mika Kuoppala 提交于
More often than not, we need a byte offset into lrc register state from the start of the hw state. Make it so. Signed-off-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200423182355.21837-3-mika.kuoppala@linux.intel.com
-
由 Mika Kuoppala 提交于
Add per ctx bb and indirect ctx bb register locations to live_lrc_fixed for verification. Signed-off-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200423224159.22078-1-chris@chris-wilson.co.uk
-
- 23 4月, 2020 3 次提交
-
-
由 Chris Wilson 提交于
intel_gt_wait_for_idle() tries to wait until all the outstanding requests are retired and the GPU is idle. As a side effect of retiring requests, we may submit more work to flush any pm barriers, and so the wait-for-idle tries to flush the background pm work and catch the new requests. However, if the work completed in the background before we were able to flush, it would queue the extra barrier request without us noticing -- and so we would return from wait-for-idle with one request remaining. (This breaks e.g. record_default_state where we need to wait until that barrier is retired, and it may slow suspend down by causing us to wait on the background retirement worker as opposed to immediately retiring the barrier.) However, since we track if there has been a submission since the engine pm barrier, we can very quickly detect if the idle barrier is still outstanding. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1763Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200423085940.28168-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
During the virtual engine's submission tasklet, we take the request and insert into the submission queue on each of our siblings. This seems quite simply, and so no problems with ordering. However, the sibling execlists' submission tasklets may run concurrently with the virtual engine's tasklet, submitting the request to HW before the virtual finishes its task of telling all the siblings. If this happens, the sibling tasklet may *reorder* the ve->sibling[] array that the virtual engine tasklet is processing. This can *only* reorder within the elements already processed by the virtual engine, nevertheless the race is detected by KCSAN: [ 185.580014] BUG: KCSAN: data-race in execlists_dequeue [i915] / virtual_submission_tasklet [i915] [ 185.580054] [ 185.580076] write to 0xffff8881f1919860 of 8 bytes by interrupt on cpu 2: [ 185.580553] execlists_dequeue+0x6ad/0x1600 [i915] [ 185.581044] __execlists_submission_tasklet+0x48/0x60 [i915] [ 185.581517] execlists_submission_tasklet+0xd3/0x170 [i915] [ 185.581554] tasklet_action_common.isra.0+0x42/0x90 [ 185.581585] __do_softirq+0xc8/0x206 [ 185.581613] run_ksoftirqd+0x15/0x20 [ 185.581641] smpboot_thread_fn+0x15a/0x270 [ 185.581669] kthread+0x19a/0x1e0 [ 185.581695] ret_from_fork+0x1f/0x30 [ 185.581717] [ 185.581736] read to 0xffff8881f1919860 of 8 bytes by interrupt on cpu 0: [ 185.582231] virtual_submission_tasklet+0x10e/0x5c0 [i915] [ 185.582265] tasklet_action_common.isra.0+0x42/0x90 [ 185.582291] __do_softirq+0xc8/0x206 [ 185.582315] run_ksoftirqd+0x15/0x20 [ 185.582340] smpboot_thread_fn+0x15a/0x270 [ 185.582368] kthread+0x19a/0x1e0 [ 185.582395] ret_from_fork+0x1f/0x30 [ 185.582417] We can prevent this race by checking for the ve->request after looking up the sibling array. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200423115315.26825-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
When we migrated to execlists, one of the conditions we wanted to test for was whether the breadcrumb seqno was being written before the breadcumb interrupt was delivered. This was following on from issues observed on previous generations which were not so strongly ordered. With the removal of the missed interrupt detection, we have not reliable means of detecting the out-of-order seqno/interrupt but instead tried to assert that the relationship between the CS event interrupt and the breadwrite should be strongly ordered. However, Icelake proves it is possible for the HW implementation to forget about minor little details such as write ordering and so the order between *processing* the CS event and the breadcrumb is unreliable. Remove the unreliable assertion, but leave a debug telltale in case we have reason to suspect. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1658Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422141749.28709-1-chris@chris-wilson.co.uk
-
- 22 4月, 2020 3 次提交
-
-
由 Chris Wilson 提交于
Since batch buffers dominant execution time, most preemption requests should naturally occur during execution of a batch buffer. We wish to verify that should a preemption occur within a batch buffer, when we come to restart that batch buffer, it occurs at the interrupted instruction and most importantly does not rollback to an earlier point. v2: Do not clear the GPR at the start of the batch, but rely on them being clear for new contexts. Suggested-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422100903.25216-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
For verifying reciving the EI interrupts, we need to hold the GPU in very precise conditions (in terms of C0 cycles during the EI). If we preempt the busy load to handle the heartbeat, this may perturb the busy load causing us to miss the interrupt. The other tests, while not as time sensitive, may also be slightly perturbed, so apply the heartbeat protection across all the measurements. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422083855.26842-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Having noticed that MI_BB_START is incurring a memory stall (see the correlation with uncore frequency), we have to unroll the loop in order to diminish the impact of the MI_BB_START on the instruction throughput. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200421171351.19575-1-chris@chris-wilson.co.uk
-
- 21 4月, 2020 10 次提交
-
-
由 Chris Wilson 提交于
Since we may lose the content of any buffer when we relinquish control of the system (e.g. suspend/resume), we have to be careful not to rely on regaining control. A good method to detect when we might be using garbage is by always injecting that garbage prior to first use on load/resume/etc. v2: Drop sanitize callback on cleanup v3: Move seqno reset to timeline enter, so we reset all timelines. However, this is done on every activation during runtime and not reset. The similar level of paranoia we apply to correcting context state after a period of inactivity. Suggested-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200421092504.7416-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Let's isolate the impact of cpu frequency selection on determing the GPU throughput in response to selection of RPS frequencies. For real systems, we do have to be concerned with the impact of integrating c-states, p-states and rp-states, but for the sake of proving whether or not RPS works, one baby step at a time. For the record, as one would hope, it does not seem to impact on the measured performance, but we do it anyway to reduce the number of variables. Later, we can extend the testing to encourage the the cpu/pkg to try and sleep while the GPU is busy. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200421142236.8614-1-chris@chris-wilson.co.uk Link: https://patchwork.freedesktop.org/patch/msgid/20200421142236.8614-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
If we detect that the RPS end points do not scale perfectly, take the time to measure all the in between values as well. We are aborting the test, so we might as well spend the available time gathering critical debug information instead. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200421124636.22554-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
We want to see the pstate limits whenever we fail to set the minimum frequency as that may help for debugging. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200420203040.8984-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
After having testing all the RPS controls individually, we need to take a step back and check how our RPS worker integrates them to perform dynamic GPU reclocking. So do that by submitting a spinner and wait and see what happens. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200420172739.11620-6-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
If we encounter an error while scaling, read back the frequency tables from the pcu. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200420172739.11620-5-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Split the frequency measurement into two modes, so that we can judge the impact of the llc setup on top of the pure CS frequency scaling. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200420172739.11620-4-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Check that the GPU does respond to our RPS frequency requests by setting our desired frequency. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200420172739.11620-3-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
If we can not manipulate the frequency with RPS, then comparing min/max power consumption is pointless / misleading. We will leave the warning about not being able to control the frequency selection via RPS to other tests so as not to confuse this more specialised check. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200420172739.11620-2-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
One of the core tenents of reclocking the GPU is that its throughput scales with the clock frequency. We can observe this by incrementing a loop counter on the GPU, and compare the different execution rates at the notional RPS frequencies. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200420172739.11620-1-chris@chris-wilson.co.uk
-
- 20 4月, 2020 1 次提交
-
-
由 Chris Wilson 提交于
Avoid flushing the submission queue (of others) under the client's timeline lock, but instead move it to the end so that we may catch more. References: https://gitlab.freedesktop.org/drm/intel/-/issues/1066Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200420125356.26614-2-chris@chris-wilson.co.uk
-
- 18 4月, 2020 3 次提交
-
-
由 Michael J. Ruhl 提交于
DMA_MASK bit values are different for different generations. This will become more difficult to manage over time with the open coded usage of different versions of the device. Fix by: disallow setting of dma mask in AGP path (< GEN(5) for i915, add dma_mask_size to the device info configuration, updating open code call sequence to the latest interface, refactoring into a common function for setting the dma segment and mask info Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NMichael J. Ruhl <michael.j.ruhl@intel.com> cc: Brian Welty <brian.welty@intel.com> cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200417195107.68732-1-michael.j.ruhl@intel.com
-
由 Chris Wilson 提交于
A basic premise of RPS is that at lower frequencies, not only do we run slower, but we save power compared to higher frequencies. For example, when idle, we set the minimum frequency just in case there is some residual current. Since the power curve should be a physical relationship, if we find no power saving it's likely that we've broken our frequency handling, so test! Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: NAndi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200417152018.13079-2-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Move the handy utility to measure the GPU energy consumption using RAPL msr into a common lib so that it can be reused easily. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NAndi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200417152018.13079-1-chris@chris-wilson.co.uk
-
- 17 4月, 2020 3 次提交
-
-
由 Chris Wilson 提交于
Since we are touching the device to read the registers, we are required to ensure the device is awake at the time. Currently, we believe ourselves to be inside the active request [thus an active engine wakeref], but since that may be retired in the background, we can spontaneously lose the wakeref and the ability to probe the HW. <4> [379.686703] RPM wakelock ref not held during HW access <4> [379.686805] WARNING: CPU: 7 PID: 4869 at ./drivers/gpu/drm/i915/intel_runtime_pm.h:115 gen12_fwtable_read32+0x233/0x300 [i915] <4> [379.686808] Modules linked in: i915(+) vgem snd_hda_codec_hdmi mei_hdcp x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul ax88179_178a usbnet mii ghash_clmulni_intel snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core snd_pcm e1000e mei_me ptp mei pps_core intel_lpss_pci prime_numbers [last unloaded: i915] <4> [379.686827] CPU: 7 PID: 4869 Comm: i915_selftest Tainted: G U 5.7.0-rc1-CI-CI_DRM_8313+ #1 <4> [379.686830] Hardware name: Intel Corporation Tiger Lake Client Platform/TigerLake U DDR4 SODIMM RVP, BIOS TGLSFWI1.R00.2457.A13.1912190237 12/19/2019 <4> [379.686883] RIP: 0010:gen12_fwtable_read32+0x233/0x300 [i915] <4> [379.686887] Code: d8 ea e0 0f 0b e9 19 fe ff ff 80 3d ad 12 2d 00 00 0f 85 17 fe ff ff 48 c7 c7 b0 32 3e a0 c6 05 99 12 2d 00 01 e8 2d d8 ea e0 <0f> 0b e9 fd fd ff ff 8b 05 c4 75 56 e2 85 c0 0f 85 84 00 00 00 48 <4> [379.686889] RSP: 0018:ffffc90000727970 EFLAGS: 00010286 <4> [379.686892] RAX: 0000000000000000 RBX: ffff88848cc20ee8 RCX: 0000000000000001 <4> [379.686894] RDX: 0000000080000001 RSI: ffff88843b1f0900 RDI: 00000000ffffffff <4> [379.686896] RBP: 0000000000000000 R08: ffff88843b1f0900 R09: 0000000000000000 <4> [379.686898] R10: 0000000000000000 R11: 0000000000000000 R12: 000000000000a058 <4> [379.686900] R13: 0000000000000001 R14: ffff88848cc2bf30 R15: 00000000ffffffea <4> [379.686902] FS: 00007f7d63f5e300(0000) GS:ffff8884a0180000(0000) knlGS:0000000000000000 <4> [379.686904] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4> [379.686907] CR2: 000055e5c30f4988 CR3: 000000042e190002 CR4: 0000000000760ee0 <4> [379.686910] PKRU: 55555554 <4> [379.686911] Call Trace: <4> [379.686986] live_rps_interrupt+0xb14/0xc10 [i915] <4> [379.687051] ? intel_rps_unpark+0xb0/0xb0 [i915] <4> [379.687057] ? __trace_bprintk+0x57/0x80 <4> [379.687143] __i915_subtests+0xb8/0x210 [i915] <4> [379.687222] ? __i915_live_teardown+0x50/0x50 [i915] <4> [379.687291] ? __intel_gt_live_setup+0x30/0x30 [i915] <4> [379.687361] __run_selftests+0x112/0x170 [i915] <4> [379.687431] i915_live_selftests+0x2c/0x60 [i915] <4> [379.687491] i915_pci_probe+0x93/0x1b0 [i915] Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NAndi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200417093928.17822-2-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
It seems that although (perhaps because of the memory stall?) the spinner has signaled that it has started, it still takes some time to spin up to 100% utilisation of the HW. Since the test depends on the full utilisation of the HW to trigger the RPS interrupt, wait a little bit and flush the interrupt status to be sure that the event we see if from the spinner. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NAndi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200417093928.17822-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Before we resume, we reset the HW so we restart from a known good state. However, as a part of the reset process, we drain our pending CS event queue -- and if we are resuming that does not correspond to internal state. On setup, we are scrubbing the CS pointers, but alas only on setup. Apply the sanitization not just to setup, but to all resumes. Reported-by: NVenkata Ramana Nayana <venkata.ramana.nayana@intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200416114117.3460-1-chris@chris-wilson.co.uk
-
- 16 4月, 2020 3 次提交
-
-
由 Chris Wilson 提交于
If we use a non-forcewaked write to PMINTRMSK, it does not take effect until much later, if at all, causing a loss of RPS interrupts and no GPU reclocking, leaving the GPU running at the wrong frequency for long periods of time. Reported-by: NFrancisco Jerez <currojerez@riseup.net> Suggested-by: NFrancisco Jerez <currojerez@riseup.net> Fixes: 35cc7f32 ("drm/i915/gt: Use non-forcewake writes for RPS") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Francisco Jerez <currojerez@riseup.net> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NAndi Shyti <andi.shyti@intel.com> Reviewed-by: NFrancisco Jerez <currojerez@riseup.net> Cc: <stable@vger.kernel.org> # v5.6+ Link: https://patchwork.freedesktop.org/patch/msgid/20200415170318.16771-2-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Since we depend upon RPS generating interrupts after evaluation intervals to determine when to up/down clock the GPU, it is imperative that we successfully enable interrupt generation! Verify that we do see an interrupt if we keep the GPU busy for an entire EI. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NAndi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200415170318.16771-1-chris@chris-wilson.co.uk
-
由 Matt Roper 提交于
Even though the bspec is missing gen12 register details for the MCR selector register (0xFDC), this is confirmed by hardware folks to be a mistake; the register does exist and we do indeed need to steer multicast register reads to an appropriate instance the same as we did on gen11. Note that despite the lack of documentation we were still using the MCR selector to read INSTDONE and such in read_subslice_reg() too. Cc: Matt Atwood <matthew.s.atwood@intel.com> Signed-off-by: NMatt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200414211118.2787489-4-matthew.d.roper@intel.comReviewed-by: NJosé Roberto de Souza <jose.souza@intel.com>
-
- 10 4月, 2020 1 次提交
-
-
由 Chris Wilson 提交于
With timeslice yielding on a semaphore, we may complete timeslices much faster than we were expecting and already have yielded the stuck request. Before complaining that timeslicing is not enabled, check that we haven't already applied the switch. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NAndi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200410081638.19893-1-chris@chris-wilson.co.uk
-
- 08 4月, 2020 7 次提交
-
-
由 Chris Wilson 提交于
Since we are peeking into the batch object of the request, it is beholden on us to hold a reference to it. Closes: https://gitlab.freedesktop.org/drm/intel/issues/1634Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200408091723.28937-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
We control b->irq_enabled inside the b->irq_lock, but we check before entering the spinlock whether or not the interrupt is currently unmasked. [ 1511.735208] BUG: KCSAN: data-race in __intel_breadcrumbs_disarm_irq [i915] / intel_engine_disarm_breadcrumbs [i915] [ 1511.735231] [ 1511.735242] write to 0xffff8881f75fc214 of 1 bytes by interrupt on cpu 2: [ 1511.735440] __intel_breadcrumbs_disarm_irq+0x4b/0x160 [i915] [ 1511.735635] signal_irq_work+0x337/0x710 [i915] [ 1511.735652] irq_work_run_list+0xd7/0x110 [ 1511.735666] irq_work_run+0x1d/0x50 [ 1511.735681] smp_irq_work_interrupt+0x21/0x30 [ 1511.735701] irq_work_interrupt+0xf/0x20 [ 1511.735722] __do_softirq+0x6f/0x206 [ 1511.735736] irq_exit+0xcd/0xe0 [ 1511.735756] do_IRQ+0x44/0xc0 [ 1511.735773] ret_from_intr+0x0/0x1c [ 1511.735787] schedule+0x0/0xb0 [ 1511.735803] worker_thread+0x194/0x670 [ 1511.735823] kthread+0x19a/0x1e0 [ 1511.735837] ret_from_fork+0x1f/0x30 [ 1511.735848] [ 1511.735867] read to 0xffff8881f75fc214 of 1 bytes by task 432 on cpu 1: [ 1511.736068] intel_engine_disarm_breadcrumbs+0x22/0x80 [i915] [ 1511.736263] __engine_park+0x107/0x5d0 [i915] [ 1511.736453] ____intel_wakeref_put_last+0x44/0x90 [i915] [ 1511.736648] __intel_wakeref_put_last+0x5a/0x70 [i915] [ 1511.736842] intel_context_exit_engine+0xf2/0x100 [i915] [ 1511.737044] i915_request_retire+0x6b2/0x770 [i915] [ 1511.737244] retire_requests+0x7a/0xd0 [i915] [ 1511.737438] intel_gt_retire_requests_timeout+0x3a7/0x6f0 [i915] [ 1511.737633] i915_drop_caches_set+0x1e7/0x260 [i915] [ 1511.737650] simple_attr_write+0xfa/0x110 [ 1511.737665] full_proxy_write+0x94/0xc0 [ 1511.737679] __vfs_write+0x4b/0x90 [ 1511.737697] vfs_write+0xfc/0x280 [ 1511.737718] ksys_write+0x78/0x100 [ 1511.737732] __x64_sys_write+0x44/0x60 [ 1511.737751] do_syscall_64+0x6e/0x2c0 [ 1511.737769] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200408092916.5355-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
The intel_ring.head is updated as the requests are retired, but is sampled at any time as we submit requests. Furthermore, it tracks RING_HEAD which is inherently asynchronous. [ 148.630314] BUG: KCSAN: data-race in execlists_dequeue [i915] / i915_request_retire [i915] [ 148.630349] [ 148.630374] write to 0xffff8881f4e28ddc of 4 bytes by task 90 on cpu 2: [ 148.630752] i915_request_retire+0xed/0x770 [i915] [ 148.631123] retire_requests+0x7a/0xd0 [i915] [ 148.631491] engine_retire+0xa6/0xe0 [i915] [ 148.631523] process_one_work+0x3af/0x640 [ 148.631552] worker_thread+0x80/0x670 [ 148.631581] kthread+0x19a/0x1e0 [ 148.631609] ret_from_fork+0x1f/0x30 [ 148.631629] [ 148.631652] read to 0xffff8881f4e28ddc of 4 bytes by task 14288 on cpu 3: [ 148.632019] execlists_dequeue+0x1300/0x1680 [i915] [ 148.632384] __execlists_submission_tasklet+0x48/0x60 [i915] [ 148.632770] execlists_submit_request+0x38e/0x3c0 [i915] [ 148.633146] submit_notify+0x8f/0xc0 [i915] [ 148.633512] __i915_sw_fence_complete+0x5d/0x3e0 [i915] [ 148.633875] i915_sw_fence_complete+0x58/0x80 [i915] [ 148.634238] i915_sw_fence_commit+0x16/0x20 [i915] [ 148.634613] __i915_request_queue+0x60/0x70 [i915] [ 148.634985] i915_gem_do_execbuffer+0x2de0/0x42b0 [i915] [ 148.635366] i915_gem_execbuffer2_ioctl+0x2ab/0x580 [i915] [ 148.635400] drm_ioctl_kernel+0xe9/0x130 [ 148.635429] drm_ioctl+0x27d/0x45e [ 148.635456] ksys_ioctl+0x89/0xb0 [ 148.635482] __x64_sys_ioctl+0x42/0x60 [ 148.635510] do_syscall_64+0x6e/0x2c0 [ 148.635542] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 645.071436] BUG: KCSAN: data-race in gen8_emit_fini_breadcrumb [i915] / i915_request_retire [i915] [ 645.071456] [ 645.071467] write to 0xffff8881efe403dc of 4 bytes by task 14668 on cpu 3: [ 645.071647] i915_request_retire+0xed/0x770 [i915] [ 645.071824] i915_request_create+0x6c/0x160 [i915] [ 645.072000] i915_gem_do_execbuffer+0x206d/0x42b0 [i915] [ 645.072177] i915_gem_execbuffer2_ioctl+0x2ab/0x580 [i915] [ 645.072194] drm_ioctl_kernel+0xe9/0x130 [ 645.072208] drm_ioctl+0x27d/0x45e [ 645.072222] ksys_ioctl+0x89/0xb0 [ 645.072235] __x64_sys_ioctl+0x42/0x60 [ 645.072248] do_syscall_64+0x6e/0x2c0 [ 645.072263] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 645.072275] [ 645.072285] read to 0xffff8881efe403dc of 4 bytes by interrupt on cpu 2: [ 645.072458] gen8_emit_fini_breadcrumb+0x158/0x300 [i915] [ 645.072636] __i915_request_submit+0x204/0x430 [i915] [ 645.072809] execlists_dequeue+0x8e1/0x1680 [i915] [ 645.072982] __execlists_submission_tasklet+0x48/0x60 [i915] [ 645.073154] execlists_submit_request+0x38e/0x3c0 [i915] [ 645.073330] submit_notify+0x8f/0xc0 [i915] [ 645.073499] __i915_sw_fence_complete+0x5d/0x3e0 [i915] [ 645.073668] i915_sw_fence_wake+0xc2/0x130 [i915] [ 645.073836] __i915_sw_fence_complete+0x2cf/0x3e0 [i915] [ 645.074006] i915_sw_fence_complete+0x58/0x80 [i915] [ 645.074175] dma_i915_sw_fence_wake+0x3e/0x80 [i915] [ 645.074344] signal_irq_work+0x62f/0x710 [i915] [ 645.074360] irq_work_run_list+0xd7/0x110 [ 645.074373] irq_work_run+0x1d/0x50 [ 645.074386] smp_irq_work_interrupt+0x21/0x30 [ 645.074400] irq_work_interrupt+0xf/0x20 [ 645.074414] _raw_spin_unlock_irqrestore+0x34/0x40 [ 645.074585] execlists_submission_tasklet+0xde/0x170 [i915] [ 645.074602] tasklet_action_common.isra.0+0x42/0x90 [ 645.074617] __do_softirq+0xc8/0x206 [ 645.074629] irq_exit+0xcd/0xe0 [ 645.074642] do_IRQ+0x44/0xc0 [ 645.074654] ret_from_intr+0x0/0x1c [ 645.074667] finish_task_switch+0x73/0x230 [ 645.074679] __schedule+0x1c5/0x4c0 [ 645.074691] schedule+0x45/0xb0 [ 645.074704] worker_thread+0x194/0x670 [ 645.074716] kthread+0x19a/0x1e0 [ 645.074729] ret_from_fork+0x1f/0x30 Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200407221832.15465-1-chris@chris-wilson.co.uk
-
由 Jani Nikula 提交于
Prefer struct drm_device based logging over struct device based logging. No functional changes. Cc: Wambui Karuga <wambui.karugax@gmail.com> Reviewed-by: NWambui Karuga <wambui.karugax@gmail.com> Signed-off-by: NJani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200402114819.17232-17-jani.nikula@intel.com
-
由 Jani Nikula 提交于
Prefer struct drm_device based logging over struct device based logging. No functional changes. Cc: Wambui Karuga <wambui.karugax@gmail.com> Reviewed-by: NWambui Karuga <wambui.karugax@gmail.com> Signed-off-by: NJani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200402114819.17232-16-jani.nikula@intel.com
-
由 Jani Nikula 提交于
Prefer struct drm_device based logging over struct device based logging. No functional changes. Cc: Wambui Karuga <wambui.karugax@gmail.com> Reviewed-by: NWambui Karuga <wambui.karugax@gmail.com> Signed-off-by: NJani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200402114819.17232-10-jani.nikula@intel.com
-
由 Chris Wilson 提交于
Since the semaphore interrupt may cause us to yield the timeslice immediately, we may cancel the timer before we notice the submission is complete. The assertion is no longer valid due to the race with the interrupt. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200407222625.15542-1-chris@chris-wilson.co.uk
-