- 18 12月, 2013 1 次提交
-
-
由 Daniel Vetter 提交于
Less yelling ftw! Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: NDave Airlie <airlied@redhat.com>
-
- 03 12月, 2013 1 次提交
-
-
由 Christian König 提交于
Also rename the function to better reflect what it is doing. agd5f: fix argument size warning Signed-off-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 16 11月, 2013 1 次提交
-
-
由 Christian König 提交于
To workaround bugs and/or certain limits it's sometimes useful to fall back to waiting on fences. Signed-off-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
-
- 02 11月, 2013 1 次提交
-
-
由 Maarten Lankhorst 提交于
op 08-10-13 18:58, Thomas Hellstrom schreef: > On 10/08/2013 06:47 PM, Jerome Glisse wrote: >> On Tue, Oct 08, 2013 at 06:29:35PM +0200, Thomas Hellstrom wrote: >>> On 10/08/2013 04:55 PM, Jerome Glisse wrote: >>>> On Tue, Oct 08, 2013 at 04:45:18PM +0200, Christian König wrote: >>>>> Am 08.10.2013 16:33, schrieb Jerome Glisse: >>>>>> On Tue, Oct 08, 2013 at 04:14:40PM +0200, Maarten Lankhorst wrote: >>>>>>> Allocate and copy all kernel memory before doing reservations. This prevents a locking >>>>>>> inversion between mmap_sem and reservation_class, and allows us to drop the trylocking >>>>>>> in ttm_bo_vm_fault without upsetting lockdep. >>>>>>> >>>>>>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> >>>>>> I would say NAK. Current code only allocate temporary page in AGP case. >>>>>> So AGP case is userspace -> temp page -> cs checker -> radeon ib. >>>>>> >>>>>> Non AGP is directly memcpy to radeon IB. >>>>>> >>>>>> Your patch allocate memory memcpy userspace to it and it will then be >>>>>> memcpy to IB. Which means you introduce an extra memcpy in the process >>>>>> not something we want. >>>>> Totally agree. Additional to that there is no good reason to provide >>>>> anything else than anonymous system memory to the CS ioctl, so the >>>>> dependency between the mmap_sem and reservations are not really >>>>> clear to me. >>>>> >>>>> Christian. >>>> I think is that in other code path you take mmap_sem first then reserve >>>> bo. But here we reserve bo and then we take mmap_sem because of copy >>> >from user. >>>> Cheers, >>>> Jerome >>>> >>> Actually the log message is a little confusing. I think the mmap_sem >>> locking inversion problem is orthogonal to what's being fixed here. > >>> This patch fixes the possible recursive bo::reserve caused by > >>> malicious user-space handing a pointer to ttm memory so that the ttm > >>> fault handler is called when bos are already reserved. That may > >>> cause a (possibly interruptible) livelock. >>> Once that is fixed, we are free to choose the mmap_sem -> >>> bo::reserve locking order. Currently it's bo::reserve->mmap_sem(), >>> but the hack required in the ttm fault handler is admittedly a bit >>> ugly. The plan is to change the locking order to >>> mmap_sem->bo::reserve > >>> I'm not sure if it applies to this particular case, but it should be > >>> possible to make sure that copy_from_user_inatomic() will always > >>> succeed, by making sure the pages are present using > >>> get_user_pages(), and release the pages after > >>> copy_from_user_inatomic() is done. That way there's no need for a > >>> double memcpy slowpath, but if the copied data is very fragmented I > >>> guess the resulting code may look ugly. The get_user_pages() > >>> function will return an error if it hits TTM pages. >>> /Thomas >> get_user_pages + copy_from_user_inatomic is overkill. We should just >> do get_user_pages which fails with ttm memory and then use copy_highpage >> helper. >> >> Cheers, >> Jerome > Yeah, it may well be that that's the preferred solution. > > /Thomas > I still disagree, and shuffled radeon_ib_get around to be called sooner. How does the patch below look? 8<------- Allocate and copy all kernel memory before doing reservations. This prevents a locking inversion between mmap_sem and reservation_class, and allows us to drop the trylocking in ttm_bo_vm_fault without upsetting lockdep. Changes since v1: - Kill extra memcpy for !AGP case. Signed-off-by: NMaarten Lankhorst <maarten.lankhorst@canonical.com> Reviewed-by: NJerome Glisse <jglisse@redhat.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 19 10月, 2013 1 次提交
-
-
由 Christian König 提交于
This only seem to work for H.264 but not for VC-1 streams. Need to investigate further why exactly. This reverts commit 4b40e592. Signed-off-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 23 9月, 2013 1 次提交
-
-
由 Christian König 提交于
Starting with UVD3 message and feedback buffers have their own 256MB segment, so no need to force them into VRAM any more. Signed-off-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 21 9月, 2013 1 次提交
-
-
由 Alex Deucher 提交于
If the user has forced the driver to use the internal GPU gart rather than AGP on an AGP card, force the buffers to vram as well. Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Tested-by: NDieter Nützel <Dieter@nuetzel-hh.de> Cc: stable@vger.kernel.org
-
- 16 9月, 2013 1 次提交
-
-
由 Christian König 提交于
Putting everything into VRAM seems to help. Signed-off-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
-
- 11 9月, 2013 1 次提交
-
-
由 Christian König 提交于
Neither complete nor perfect, but solves my problem at hand and might be useful in the future. Signed-off-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 31 8月, 2013 2 次提交
-
-
由 Christian König 提交于
Give the ring functions a separate structure and let the asic structure point to the ring specific functions. This simplifies the code and allows us to make changes at only one point. No change in functionality. Signed-off-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Alex Deucher 提交于
Use the UVD handle information to determine which which power states to select when using UVD. For example, decoding a single SD stream requires much lower clocks than multiple HD streams. v2: switch to a cleaner dpm/uvd interface v3: change the uvd power state while streams are active if need be Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 28 6月, 2013 2 次提交
-
-
由 Maarten Lankhorst 提交于
This commit converts the source of the val_seq counter to the ww_mutex api. The reservation objects are converted later, because there is still a lockdep splat in nouveau that has to resolved first. Signed-off-by: NMaarten Lankhorst <maarten.lankhorst@canonical.com> Reviewed-by: NJerome Glisse <jglisse@redhat.com> Signed-off-by: NDave Airlie <airlied@redhat.com>
-
由 Alex Deucher 提交于
When using UVD, the driver must switch to a special UVD power state. In the CS ioctl, switch to the power state and schedule work to change the power state back, when the work comes up, check if uvd is still busy and if not, switch back to the user state, otherwise, reschedule the work. Note: We really need some better way to decide when to switch out of the uvd power state. Switching power states while playback is active make uvd angry. V2: fix locking. V3: switch from timer to delayed work V4: check fence driver for UVD jobs, reduce timeout to 1 second and rearm timeout on activity v5: rebase on new dpm tree v6: rebase on interim uvd on demand changes v7: fix UVD when DPM is disabled v8: unify non-DPM and DPM UVD handling v9: remove leftover idle work struct Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NChristian König <deathsimple@vodafone.de>
-
- 27 6月, 2013 1 次提交
-
-
由 Alex Deucher 提交于
On CIK, the compute rings work slightly differently than on previous asics, however the basic concepts are the same. The main differences: - New MEC engines for compute queues - Multiple queues per MEC: - CI/KB: 1 MEC, 4 pipes per MEC, 8 queues per pipe = 32 queues - KV: 2 MEC, 4 pipes per MEC, 8 queues per pipe = 64 queues - Queues can be allocated and scheduled by another queue - New doorbell aperture allows you to assign space in the aperture for the wptr which allows for userspace access to queues v2: add wptr shadow, fix eop setup v3: fix comment v4: switch to new callback method Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NJerome Glisse <jglisse@redhat.com>
-
- 26 6月, 2013 1 次提交
-
-
由 Alex Deucher 提交于
Sets up the GFX ring and loads ucode for GFX and Compute. Todo: - handle compute queue setup. v2: add documentation v3: integrate with latest reset changes v4: additional init fixes v5: scratch reg write back no longer supported on CIK v6: properly set CP_RB0_BASE_HI v7: rebase Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 24 4月, 2013 1 次提交
-
-
由 Christian König 提交于
That not only saves some power, but also solves problems with older chips where an idle UVD block on higher clocks can cause problems. Signed-off-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 09 4月, 2013 3 次提交
-
-
由 Christian König 提交于
Just everything needed to decode videos using UVD. v6: just all the bugfixes and support for R7xx-SI merged in one patch v7: UVD_CGC_GATE is a write only register, lockup detection fix v8: split out VRAM fallback changes, remove support for RV770, add support for HEMLOCK, add buffer sizes checks Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NJerome Glisse <jglisse@redhat.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Let the CS module decide if we can fall back to VRAM or not. v2: remove unintended change Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NJerome Glisse <jglisse@redhat.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
v2: update error message and comment Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NJerome Glisse <jglisse@redhat.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 02 2月, 2013 1 次提交
-
-
由 Alex Deucher 提交于
For very large page table updates, we can exceed the size of the ring. To avoid this, use an IB to perform the page table update. v2(ck): cleanup the IB infrastructure and the use it instead of filling the struct ourself. Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NChristian König <christian.koenig@amd.com>
-
- 01 2月, 2013 6 次提交
-
-
由 Ilija Hadzic 提交于
next_reloc function does the same thing in all ASICs with the exception of R600 which has a special case in legacy mode. Pull out the common function in preparation for refactoring. Signed-off-by: NIlija Hadzic <ihadzic@research.bell-labs.com> Reviewed-by: NMarek Olšák <maraeo@gmail.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Ilija Hadzic 提交于
This function is not limited to r100, but it can dump a (raw) packet for any ASIC. Rename it accordingly and move its declaration to radeon.h Signed-off-by: NIlija Hadzic <ihadzic@research.bell-labs.com> Reviewed-by: NMarek Olšák <maraeo@gmail.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Ilija Hadzic 提交于
Once we factored out radeon_cs_packet_parse function, evergreen_cs_next_is_pkt3_nop and r600_cs_next_is_pkt3_nop functions became identical, so they can be factored out into a common function. Signed-off-by: NIlija Hadzic <ihadzic@research.bell-labs.com> Reviewed-by: NMarek Olšák <maraeo@gmail.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Ilija Hadzic 提交于
CS packet parse functions have a lot of in common across all ASICs. Implement a common function and take care of small differences between families inside the function. This patch is a prep for major refactoring and consolidation of CS parsing code. Signed-off-by: NIlija Hadzic <ihadzic@research.bell-labs.com> Reviewed-by: NMarek Olšák <maraeo@gmail.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Ilija Hadzic 提交于
Signed-off-by: NIlija Hadzic <ihadzic@research.bell-labs.com> Reviewed-by: NMarek Olšák <maraeo@gmail.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Ilija Hadzic 提交于
length_dw field was assigned twice. While at it, move user_ptr assignment together with all other assignments to p->chunks[i] structure to make the code more readable. Signed-off-by: NIlija Hadzic <ihadzic@research.bell-labs.com> Reviewed-by: NMarek Olšák <maraeo@gmail.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 24 1月, 2013 1 次提交
-
-
由 Ilija Hadzic 提交于
If one (but not both) allocations of p->chunks[].kpage[] in radeon_cs_parser_init fail, the error path will free the successfully allocated page, but leave a stale pointer value in the kpage[] field. This will later cause a double-free when radeon_cs_parser_fini is called. This patch fixes the issue by forcing both pointers to NULL after kfree in the error path. The circumstances under which the problem happens are very rare. The card must be AGP and the system must run out of kmalloc area just at the right time so that one allocation succeeds, while the other fails. Signed-off-by: NIlija Hadzic <ihadzic@research.bell-labs.com> Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
-
- 11 1月, 2013 2 次提交
-
-
由 Ilija Hadzic 提交于
Index into chunks[] array doesn't look right. Signed-off-by: NIlija Hadzic <ihadzic@research.bell-labs.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Ilija Hadzic 提交于
In UMS mode parser->rdev is NULL, so dereferencing will cause an oops. Signed-off-by: NIlija Hadzic <ihadzic@research.bell-labs.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 14 12月, 2012 2 次提交
-
-
由 Alex Deucher 提交于
This enables the functionality added in the previous patches. Userspace acceleration drivers can use the CS ioctl to submit command buffers to the async DMA rings. Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Alex Deucher 提交于
Allows us to use the DMA ring from userspace. DMA doesn't have a good NOP packet in which to embed the reloc idx, so userspace has to add a reloc for each buffer used and order them to match the command stream. v2: fix address bounds checking, reloc indexing Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 16 10月, 2012 1 次提交
-
-
由 Christian König 提交于
Make it possible to allocate a persistent page table. Signed-off-by: NChristian König <deathsimple@vodafone.de> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 03 10月, 2012 1 次提交
-
-
由 David Howells 提交于
Convert #include "..." to #include <path/...> in drivers/gpu/. Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: NDave Airlie <airlied@redhat.com> Acked-by: NArnd Bergmann <arnd@arndb.de> Acked-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: NDave Jones <davej@redhat.com>
-
- 21 9月, 2012 7 次提交
-
-
由 Christian König 提交于
When a VM is used on more than one ring we need to sync to the last user. Signed-off-by: NChristian König <deathsimple@vodafone.de> Reviewed-by: NJerome Glisse <jglisse@redhat.com>
-
由 Lauri Kasanen 提交于
Let's allow GCC to optimize better. This exposed some five unused functions, but this patch doesn't remove them. Signed-off-by: NLauri Kasanen <cand@gmx.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Jerome Glisse 提交于
Make sure that the ib bo is bound and is page table is up to date in the virtual address space. Signed-off-by: NJerome Glisse <jglisse@redhat.com> Reviewed-by: NChristian König <christian.koenig@amd.com>
-
由 Christian König 提交于
Removing the need to wait for anything. Still not ideal, since we need to free pt on va remove. Signed-off-by: NChristian König <deathsimple@vodafone.de> Reviewed-by: NJerome Glisse <jglisse@redhat.com>
-
由 Christian König 提交于
Move binding onto the ring, simplifying handling a bit. Signed-off-by: NChristian König <deathsimple@vodafone.de> Reviewed-by: NJerome Glisse <jglisse@redhat.com>
-
由 Christian König 提交于
Move flushing the VMs as function into the rings. First step to make VM operations async. Signed-off-by: NChristian König <deathsimple@vodafone.de> Reviewed-by: NJerome Glisse <jglisse@redhat.com>
-
由 Christian König 提交于
Signed-off-by: NChristian König <deathsimple@vodafone.de> Reviewed-by: NJerome Glisse <jglisse@redhat.com>
-