- 09 7月, 2015 1 次提交
-
-
由 Mario Kleiner 提交于
Trying to resolve issues with missed vblanks and impossible values inside delivered kms pageflip completion events showed that radeon's irq handling sometimes doesn't handle valid irqs, but silently skips them. This was observed for vblank interrupts. Although those irqs have corresponding events queued in the gpu's irq ring at time of interrupt, and therefore the corresponding handling code gets triggered by these events, the handling code sometimes silently skipped processing the irq. The reason for those skips is that the handling code double-checks for each irq event if the corresponding irq status bits in the irq status registers are set. Sometimes those bits are not set at time of check for valid irqs, maybe due to some hardware race on some setups? The problem only seems to happen on some machine + card combos sometimes, e.g., never happened during my testing of different PC cards of the DCE-2/3/4 generation a year ago, but happens consistently now on two different Apple Mac cards (RV730, DCE-3, Apple iMac and Evergreen JUNIPER, DCE-4 in a Apple MacPro). It also doesn't happen at each interrupt but only occassionally every couple of hundred or thousand vblank interrupts. This results in XOrg warning messages like "[ 7084.472] (WW) RADEON(0): radeon_dri2_flip_event_handler: Pageflip completion event has impossible msc 420120 < target_msc 420121" as well as skipped frames and problems for applications that use kms pageflip events or vblank events, e.g., users of DRI2 and DRI3/Present, Waylands Weston compositor, etc. See also https://bugs.freedesktop.org/show_bug.cgi?id=85203 After some talking to Alex and Michel, we decided to fix this by turning the double-check for asserted irq status bits into a warning. Whenever a irq event is queued in the IH ring, always execute the corresponding interrupt handler. Still check the irq status bits, but only to log a DRM_DEBUG message on a mismatch. This fixed the problems reliably on both previously failing cards, RV-730 dual-head tested on both crtcs (pipes D1 and D2) and a triple-output Juniper HD-5770 card tested on all three available crtcs (D1/D2/D3). The r600 and evergreen irq handling is therefore tested, but the cik an si handling is only compile tested due to lack of hw. Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NMario Kleiner <mario.kleiner.de@gmail.com> CC: Michel Dänzer <michel.daenzer@amd.com> CC: Alex Deucher <alexander.deucher@amd.com> CC: <stable@vger.kernel.org> # v3.16+ Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 29 6月, 2015 1 次提交
-
-
由 Jérôme Glisse 提交于
In order for hibernation to reliably work we need to cleanup more thoroughly the compute ring. Hibernation is different from suspend resume as when we resume from hibernation the hardware is first fully initialize by regular kernel then freeze callback happens (which correspond to a suspend inside the radeon kernel driver) and turn off each of the block. It turns out we were not cleanly shutting down the compute ring. This patch fix that. Hibernation and suspend to ram were tested (several times) on : Bonaire Hawaii Mullins Kaveri Kabini Changed since v1: - Factor the ring stop logic into a function taking ring as arg. Cc: stable@vger.kernel.org Signed-off-by: NJérôme Glisse <jglisse@redhat.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 29 5月, 2015 1 次提交
-
-
由 Denys Vlasenko 提交于
This patch deinlines indirect register accessor functions. These functions perform two mmio accesses, framed by spin lock/unlock. Spin lock/unlock by itself takes more than 50 cycles in ideal case (if lock is exclusively cached on current CPU). With this .config: http://busybox.net/~vda/kernel_config, after uninlining these functions have sizes and callsite counts as follows: r600_uvd_ctx_rreg: 111 bytes, 4 callsites r600_uvd_ctx_wreg: 113 bytes, 5 callsites eg_pif_phy0_rreg: 106 bytes, 13 callsites eg_pif_phy0_wreg: 108 bytes, 13 callsites eg_pif_phy1_rreg: 107 bytes, 13 callsites eg_pif_phy1_wreg: 108 bytes, 13 callsites rv370_pcie_rreg: 111 bytes, 21 callsites rv370_pcie_wreg: 113 bytes, 24 callsites r600_rcu_rreg: 111 bytes, 16 callsites r600_rcu_wreg: 113 bytes, 25 callsites cik_didt_rreg: 106 bytes, 10 callsites cik_didt_wreg: 107 bytes, 10 callsites tn_smc_rreg: 106 bytes, 126 callsites tn_smc_wreg: 107 bytes, 116 callsites eg_cg_rreg: 107 bytes, 20 callsites eg_cg_wreg: 108 bytes, 52 callsites Functions r100_mm_rreg() and r100_mm_rreg() have a fast path and a locked (slow) path. This patch deinlines only slow path. r100_mm_rreg_slow: 78 bytes, 2083 callsites r100_mm_wreg_slow: 81 bytes, 3570 callsites Reduction in code size is more than 65,000 bytes: text data bss dec hex filename 85740176 22294680 20627456 128662312 7ab3b28 vmlinux.before 85674192 22294776 20627456 128598664 7aa4288 vmlinux Signed-off-by: NDenys Vlasenko <dvlasenk@redhat.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 28 5月, 2015 1 次提交
-
-
由 Christian König 提交于
We have that bug for years and some users report side effects when fixing it on older hardware. So revert it for VM_CONTEXT0_PAGE_TABLE_END_ADDR, but keep it for VM 1-15. Signed-off-by: NChristian König <christian.koenig@amd.com> CC: stable@vger.kernel.org Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 12 5月, 2015 1 次提交
-
-
由 Christian König 提交于
The mapping range is inclusive between starting and ending addresses. Signed-off-by: NChristian König <christian.koenig@amd.com> CC: stable@vger.kernel.org Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 20 3月, 2015 2 次提交
-
-
由 Alex Deucher 提交于
This adds support to process short HPD irqs on CIK gpus. Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Alex Deucher 提交于
Registers that can be fetched from the info ioctl. Tested-by: NMarek Olšák <marek.olsak@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 04 3月, 2015 1 次提交
-
-
由 Alex Deucher 提交于
To make sure the writes go through the pci bridge. bug: https://bugzilla.kernel.org/show_bug.cgi?id=90741Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
-
- 26 2月, 2015 1 次提交
-
-
由 Leo Liu 提交于
v2: disable it on suspend Signed-off-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 12 2月, 2015 2 次提交
-
-
由 Alex Deucher 提交于
Enable at init and disable on fini. Workaround for hardware problems. v2 (chk): extend commit message v3: add new function Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> (v2) Cc: stable@vger.kernel.org
-
由 Christian König 提交于
Emit the EOP twice to avoid cache flushing problems. Signed-off-by: NChristian König <christian.koenig@amd.com> Cc: stable@vger.kernel.org Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 22 1月, 2015 2 次提交
-
-
由 Slava Grigorev 提交于
Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NSlava Grigorev <slava.grigorev@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Slava Grigorev 提交于
Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NSlava Grigorev <slava.grigorev@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 08 1月, 2015 1 次提交
-
-
由 Alex Deucher 提交于
We need to wait for the GPUVM flush to complete. There was some confusion as to how this mechanism was supposed to work. The operation is not atomic. For GPU initiated invalidations you need to read back a VM register to introduce enough latency for the update to complete. v2: drop gart changes v3: just read back rather than polling Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
-
- 21 11月, 2014 4 次提交
-
-
由 Christian König 提交于
Use multiple VMIDs for each VM, one for each ring. That allows us to execute flushes separately on each ring, still not ideal cause in a lot of cases rings can share IDs. Signed-off-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Previously we just allocated space for four hardware semaphores in each software semaphore object. Make software semaphore objects represent only one hardware semaphore address again by splitting the sync code into it's own object. v2: fix typo in comment Signed-off-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Use ring structure instead of index and provide vm_id and pd_addr separately. Signed-off-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Alex Deucher 提交于
Always need to set bit 0 of RLC_CGTT_MGCG_OVERRIDE to avoid unreliable doorbell updates in some cases. Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
-
- 13 11月, 2014 1 次提交
-
-
由 Alex Deucher 提交于
Workaround for memory link training on certain variants of 0x6649. Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 07 11月, 2014 2 次提交
-
-
由 Alex Deucher 提交于
The power management code calls into the display code for certain things. If certain power management sysfs attributes are called before the driver has finished initializing all of the hardware we can run into problems with uninitialized modesetting state. Add a check to make sure modesetting init has completed to the bandwidth update callbacks to fix this. Can be triggered by the tlp and laptop start up scripts depending on the timing. bugs: https://bugzilla.kernel.org/show_bug.cgi?id=83611 https://bugs.freedesktop.org/show_bug.cgi?id=85771Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
-
由 Jammy Zhou 提交于
CE ram size is 32k/0k/0k for GFX/CS0/CS1 with CIK Ported from amdgpu driver. Signed-off-by: NJammy Zhou <Jammy.Zhou@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
-
- 03 10月, 2014 2 次提交
-
-
由 Maarten Lankhorst 提交于
Adds an extra argument to radeon_bo_create, which is only used in radeon_prime.c. Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NMaarten Lankhorst <maarten.lankhorst@canonical.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Maarten Lankhorst 提交于
Not the whole world is a radeon! :-) Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NMaarten Lankhorst <maarten.lankhorst@canonical.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 01 10月, 2014 1 次提交
-
-
由 Alex Deucher 提交于
Helpful for debugging as the version shows up in a register dump. Cc: Jay Cornwall <jay.cornwall@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 23 9月, 2014 4 次提交
-
-
由 Alex Deucher 提交于
Otherwise we may fail to init the second compute ring. Noticed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
-
由 Michel Dänzer 提交于
This might decrease the chance of IH ring buffer overflows. Signed-off-by: NMichel Dänzer <michel.daenzer@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Michel Dänzer 提交于
Use the same format for all ring indices, and fix the calculation of the post-overflow RPTR. Signed-off-by: NMichel Dänzer <michel.daenzer@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Michel Dänzer 提交于
Otherwise the bit remains set in rdev->ih.rptr, so the wptr can never match that and we still have an infinite loop. This fix allows me to successfully recover from an IH ring buffer overflow. Signed-off-by: NMichel Dänzer <michel.daenzer@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 11 9月, 2014 1 次提交
-
-
由 Christian König 提交于
This allows us to specify if we want to sync to the shared fences of a reservation object or not. Signed-off-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 28 8月, 2014 2 次提交
-
-
由 Alex Williamson 提交于
If we assign a Radeon device to a virtual machine, we can no longer assume a fixed hardware topology, like the GPU having a parent device. This patch simply adds a few pci_is_root_bus() tests to avoid passing a NULL pointer to PCI access functions, allowing the radeon driver to work in a QEMU 440FX machine with an assigned HD8570 on the emulated PCI root bus. Signed-off-by: NAlex Williamson <alex.williamson@redhat.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Blocking completely innocent processes with a GPU reset is a pretty bad idea. Just set needs_reset and let the next command submission or fence wait do the job. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NMaarten Lankhorst <maarten.lankhorst@canonical.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 27 8月, 2014 1 次提交
-
-
由 Christian König 提交于
This fixes a problem with GPU resets and TLB flushes on SI/CIK. Signed-off-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 22 8月, 2014 1 次提交
-
-
由 Alex Deucher 提交于
bug: https://bugs.freedesktop.org/show_bug.cgi?id=82912Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
-
- 20 8月, 2014 2 次提交
-
-
由 Alex Deucher 提交于
Need to initialize the mask to 0 on init, otherwise it keeps increasing. bug: https://bugzilla.kernel.org/show_bug.cgi?id=82581 v2: also fix cu count v3: split count fix into separate patch Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NMichel Dänzer <michel.daenzer@amd.com> Cc: stable@vger.kernel.org
-
由 Alex Deucher 提交于
This fixes the CU count reported to userspace for OpenCL. bug: https://bugzilla.kernel.org/show_bug.cgi?id=82581Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NMichel Dänzer <michel.daenzer@amd.com> Cc: stable@vger.kernel.org
-
- 19 8月, 2014 2 次提交
-
-
由 Christian König 提交于
Fixes lockups due to CP read GPUVM faults when running piglit on Cape Verde. v2 (chk): apply the fix to R600+ as well, on CIK only the GFX CP has a PFP, add more comments to R600 code, enable flushing again v3: (agd5f): only apply to 7xx+. r6xx does not have the packet. v4: (agd5f): split flush change into a separate patch, fix formatting Signed-off-by: NMichel Dänzer <michel.daenzer@amd.com> Signed-off-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Tested-by: NMichel Dänzer <michel.daenzer@amd.com>
-
由 Michel Dänzer 提交于
It isn't necessary for command streams generated by the kernel (at least not while we aren't storing ring or indirect buffers in VRAM). Signed-off-by: NMichel Dänzer <michel.daenzer@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 03 1月, 2015 1 次提交
-
-
由 Ben Goz 提交于
This patch moves to radeon the initialization of compute vmid. That initializations was done in kfd-->kgd interface, but doing it in radeon as part of radeon's H/W initialization routines is more appropriate. In addition, this simplifies the kfd-->kgd interface. The patch removes the function from the interface file and from the interface declaration file. The function initializes memory apertures to fixed base/limit address and non cached memory types. Signed-off-by: NBen Goz <ben.goz@amd.com> Signed-off-by: NOded Gabbay <oded.gabbay@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 15 8月, 2014 1 次提交
-
-
由 Alex Deucher 提交于
May fix hangs in some cases. Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 05 8月, 2014 1 次提交
-
-
由 Mario Kleiner 提交于
Skip the "manual" pageflip completion checks via polling and guessing in the vblank handler radeon_crtc_handle_vblank() on asics which are known to reliably support hw pageflip completion irqs. Those pflip irqs are a more reliable and race-free method of handling pageflip completion detection, whereas the "classic" polling method has some small races in combination with dpm on, and with the reworked pageflip implementation since Linux 3.16. On old asics without pflip irqs, the classic method is used. On asics with known good pflip irqs, only pflip irqs are used by default, but a new module parameter "use_pflipirqs" allows to override this in case we encounter asics in the wild with unreliable or faulty pflip irqs. A module parameter of 0 allows to use the classic method only in such a case. A parameter of 1 allows to use both classic method and pflip irqs as additional band-aid to avoid some small races which could happen with the classic method alone. The setting 1 gives Linux 3.16 behaviour. Hw pflip irqs are available since R600. Tested on DCE-4, AMD Cedar - FirePro 2270. v2: agd5f: only enable pflip interrupts on DCE4+ as they are not reliable on older asics. Signed-off-by: NMario Kleiner <mario.kleiner.de@gmail.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-