- 19 6月, 2015 1 次提交
-
-
由 Michel Dänzer 提交于
This tells userspace that it's safe to use the RADEON_VA_UNMAP operation of the DRM_RADEON_GEM_VA ioctl. Cc: stable@vger.kernel.org (NOTE: Backporting this commit requires at least backports of commits 26d4d129, 48afbd70 and c29c0876 as well, otherwise using RADEON_VA_UNMAP runs into trouble) Signed-off-by: NMichel Dänzer <michel.daenzer@amd.com> Signed-off-by: NChristian König <christian.koenig@amd.com>
-
- 20 3月, 2015 3 次提交
-
-
由 Alex Deucher 提交于
This allows us to query certain registers from userspace for profiling and harvest configuration. E.g., it can be used by the GALLIUM_HUD for profiling the status of various gfx blocks. Tested-by: NMarek Olšák <marek.olsak@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Alex Deucher 提交于
Allow the UMDs to query the current sclk/mclk for profiling, etc. Tested-by: NMarek Olšák <marek.olsak@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Alex Deucher 提交于
Useful for profiling. Tested-by: NMarek Olšák <marek.olsak@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 10 9月, 2014 2 次提交
-
-
由 Alex Deucher 提交于
Allows pinning of buffers in the non-CPU visible portion of vram. v2: incorporate Michel's comments. v3: rebase on Michel's patch v4: rebase on Michel's v2 patch Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NMichel Dänzer <michel.daenzer@amd.com>
-
由 Michel Dänzer 提交于
This flag is a hint that userspace expects the BO to be accessed by the CPU. We can use that hint to prevent such BOs from ever being stored in the CPU inaccessible part of VRAM. Signed-off-by: NMichel Dänzer <michel.daenzer@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 19 8月, 2014 1 次提交
-
-
由 Christian König 提交于
Instead of hard coding the value properly document that this is an userspace interface. No intended functional change. Signed-off-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
-
- 11 8月, 2014 4 次提交
-
-
由 Christian König 提交于
Whenever userspace mapping related to our userptr change we wait for it to become idle and unmap it from GTT. v2: rebased, fix mutex unlock in error path v3: improve commit message Signed-off-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
This way we test userptr availability at BO creation time instead of first use. Signed-off-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Avoid problems with writeback by limiting userptr to anonymous memory. v2: add commit and code comments Signed-off-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
This patch adds an IOCTL for turning a pointer supplied by userspace into a buffer object. It imposes several restrictions upon the memory being mapped: 1. It must be page aligned (both start/end addresses, i.e ptr and size). 2. It must be normal system memory, not a pointer into another map of IO space (e.g. it must not be a GTT mmapping of another object). 3. The BO is mapped into GTT, so the maximum amount of memory mapped at all times is still the GTT limit. 4. The BO is only mapped readonly for now, so no write support. 5. List of backing pages is only acquired once, so they represent a snapshot of the first use. Exporting and sharing as well as mapping of buffer objects created by this function is forbidden and results in an -EPERM. v2: squash all previous changes into first public version v3: fix tabs, map readonly, don't use MM callback any more v4: set TTM_PAGE_FLAG_SG so that TTM never messes with the pages, pin/unpin pages on bind/unbind instead of populate/unpopulate v5: rebased on 3.17-wip, IOCTL renamed to userptr, reject any unknown flags, better handle READONLY flag, improve permission check v6: fix ptr cast warning, use set_page_dirty/mark_page_accessed on unpin v7: add warning about it's availability in the API definition v8: drop access_ok check, fix VM mapping bits Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v4) Reviewed-by: Jérôme Glisse <jglisse@redhat.com> (v4) Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 05 8月, 2014 1 次提交
-
-
由 Michel Dänzer 提交于
Signed-off-by: NMichel Dänzer <michel.daenzer@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 10 6月, 2014 1 次提交
-
-
由 Alex Deucher 提交于
Query to find out how many compute units on a GPU. Useful for OpenCL usermode drivers. Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 03 3月, 2014 2 次提交
-
-
由 Marek Olšák 提交于
The statistics are: - VRAM usage in bytes - GTT usage in bytes - number of bytes moved by TTM The last one is actually a counter, so you need to sample it before and after command submission and take the difference. This is useful for finding performance bottlenecks. Userspace queries are also added. v2: use atomic64_t Signed-off-by: NMarek Olšák <marek.olsak@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com>
-
由 Marek Olšák 提交于
When passing buffers between processes, the receiving process needs to know the original buffer domain, so that it doesn't accidentally move the buffer. v2: reserve the buffer Signed-off-by: NMarek Olšák <marek.olsak@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com>
-
- 18 2月, 2014 2 次提交
-
-
由 Christian König 提交于
Also make the result available to userspace. Signed-off-by: NChristian König <christian.koenig@amd.com>
-
由 Christian König 提交于
Only VCE 2.0 support so far. v2: squashing multiple patches into this one v3: add IRQ support for CIK, major cleanups, basic code documentation v4: remove HAINAN from chipset list Signed-off-by: NChristian König <christian.koenig@amd.com>
-
- 21 1月, 2014 1 次提交
-
-
由 Alex Deucher 提交于
This is needed for reporting the max GPU engine clock in OpenCL. This just reports the max possible engine clock, it does not take into account current conditions that may limit that clock. v2: fix query number for merge with 3.13 Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 23 12月, 2013 1 次提交
-
-
由 Marek Olšák 提交于
This will allow userspace to correctly program the PA_SC_RASTER_CONFIG register, so it can be considered a fix. Signed-off-by: NMarek Olšák <marek.olsak@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
-
- 18 11月, 2013 1 次提交
-
-
由 Michel Dänzer 提交于
This is required to properly calculate the tiling parameters in userspace. Signed-off-by: NMichel Dänzer <michel.daenzer@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 21 9月, 2013 1 次提交
-
-
由 Michel Dänzer 提交于
CIK uses a different index for 1D DST surfaces compared to SI. Expose the new index so libdrm_radeon can use it properly for userspace drivers. Signed-off-by: NMichel Dänzer <michel.daenzer@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 31 8月, 2013 1 次提交
-
-
由 Tom Stellard 提交于
Also add a new RADEON_INFO query to check that CP DMA packets are supported on the compute ring. CP DMA has been supported since the 3.8 kernel, but due to an oversight we forgot to teach the CS checker that the CP DMA packet was legal for the compute ring on Southern Islands GPUs. This patch fixes a bug where the radeon driver will incorrectly reject a legal CP DMA packet from user space. I would like to have the patch backported to stable so that we don't have to require Mesa users to use a bleeding edge kernel in order to take advantage of this feature which is already present in the stable kernels (3.8 and newer). v2: - Don't bump kms version, so this patch can be backported to stable kernels. Cc: stable@vger.kernel.org Signed-off-by: NTom Stellard <thomas.stellard@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 11 4月, 2013 2 次提交
-
-
由 Jerome Glisse 提交于
Allow userspace to query for the tile mode array so userspace can properly compute surface pitch and alignment requirement depending on tiling. v2: Make strict aliasing safer by casting to char when copying v3: merge fix from Christian Signed-off-by: NJerome Glisse <jglisse@redhat.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Add new ioctl option and bumb minor version number. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NJerome Glisse <jglisse@redhat.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 09 4月, 2013 2 次提交
-
-
由 Christian König 提交于
Just everything needed to decode videos using UVD. v6: just all the bugfixes and support for R7xx-SI merged in one patch v7: UVD_CGC_GATE is a write only register, lockup detection fix v8: split out VRAM fallback changes, remove support for RV770, add support for HEMLOCK, add buffer sizes checks Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NJerome Glisse <jglisse@redhat.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Samuel Li 提交于
This patch allows the CPU to map the stolen vram segment directly rather than going through the PCI BAR. This significantly improves performance for certain workloads with a properly patched ddx. Use radeon.fastfb=1 to enable it (disabled by default). Currently only supported on RS690, but support for RS780/880 and newer APUs may be added eventually. Signed-off-by: NSamuel Li <samuel.li@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 14 12月, 2012 1 次提交
-
-
由 Alex Deucher 提交于
This enables the functionality added in the previous patches. Userspace acceleration drivers can use the CS ioctl to submit command buffers to the async DMA rings. Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 08 12月, 2012 2 次提交
-
-
由 Alex Deucher 提交于
Add requests to get the number of shader engines (SE) and the number of SH per SE. These are needed for geometry and tesselation shaders in the 3D driver as well as setting up PA_SC_RASTER_CONFIG on SI asics. Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Marek Olšák 提交于
No version bump is required because setting the flag on older DRM has no effect. This only reserves the bit and doesn't use it. I assume we will use it for buffer eviction heuristics. Signed-off-by: NMarek Olšák <maraeo@gmail.com>
-
- 05 10月, 2012 1 次提交
-
-
由 David Howells 提交于
Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: NArnd Bergmann <arnd@arndb.de> Acked-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NMichael Kerrisk <mtk.manpages@gmail.com> Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: NDave Jones <davej@redhat.com>
-
- 03 10月, 2012 1 次提交
-
-
由 David Howells 提交于
Convert #include "..." to #include <path/...> in kernel system headers. Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: NArnd Bergmann <arnd@arndb.de> Acked-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: NDave Jones <davej@redhat.com>
-
- 13 8月, 2012 1 次提交
-
-
由 Marek Olšák 提交于
Returns a snapshot of the GPU clock counter. Needed for certain OpenGL extensions. v2: agd5f - address Jerome's comments - add function documentation Signed-off-by: NMarek Olšák <maraeo@gmail.com> Reviewed-by: NJerome Glisse <jglisse@redhat.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 10 5月, 2012 1 次提交
-
-
由 Jerome Glisse 提交于
We need to sync with the GFX ring as ttm might have schedule bo move on it and new command scheduled for other ring need to wait for bo data to be in place. Signed-off-by: NJerome Glisse <jglisse@redhat.com> Reviewed by: Christian König <christian.koenig@amd.com> Signed-off-by: NDave Airlie <airlied@redhat.com>
-
- 21 3月, 2012 2 次提交
-
-
由 Alex Deucher 提交于
This adds a new chunk id to the CS ioctl to support the INDIRECT_BUFFER_CONST packet. On SI, the CP adds a new engine called the CE (Constant Engine) which runs simulatenously with the DE (Drawing Engine, formerly called the ME). This allows the CP to process two related IBs simultaneously. The CE is tasked with loading the constant data (constant buffers, resource descriptors, samplers, etc.) while the DE loads context register state and issues drawing commands. It's up to the userspace application to sychronize the CE and the DE using special synchronization packets. Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NDave Airlie <airlied@redhat.com>
-
由 Tom Stellard 提交于
The maximum number of pipes is needed by the user space compute driver to calculate the number of wavefronts per thread group. Signed-off-by: NTom Stellard <thomas.stellard@amd.com> Signed-off-by: NDave Airlie <airlied@redhat.com>
-
- 13 2月, 2012 1 次提交
-
-
由 Jerome Glisse 提交于
evergreen and northern island gpu needs more informations for 2D tiling than previous r6xx/r7xx. Add field to tiling ioctl to allow userspace to provide those. The v8 cs checking change to track color view on r6xx/r7xx doesn't affect old userspace as old userspace always emited 0 for this register. v2 fix r6xx/r7xx 2D tiling computation v3 fix r6xx/r7xx height align for untiled surface & add support for tile split on evergreen and newer v4 improve tiling debugging output v5 fix tile split code for evergreen and newer v6 set proper tile split for crtc register v7 fix tile split limit value v8 add COLOR_VIEW checking to r6xx/r7xx checker, add evergreen cs checking, update safe reg for r600, evergreen and cayman. Evergreen checking need some work around for stencil alignment issues v9 fix tile split value range, fix compressed texture handling and mipmap calculation, allow evergreen check to be silencious in front of current broken userspace (depth/stencil alignment issue) v10 fix eg 3d texture and compressed texture, fix r600 depth array, fix r600 color view computation, add support for evergreen stencil split v11 more verbose debugging in some case Signed-off-by: NJerome Glisse <jglisse@redhat.com> Signed-off-by: NDave Airlie <airlied@redhat.com>
-
- 09 1月, 2012 1 次提交
-
-
由 Alex Deucher 提交于
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Cc: Christian König <deathsimple@vodafone.de> Signed-off-by: NDave Airlie <airlied@redhat.com>
-
- 06 1月, 2012 2 次提交
-
-
由 Christian König 提交于
Use semaphores to sync buffers across rings in the CS ioctl. Add a reloc flag to allow userspace to skip sync for buffers. agd5f: port to latest CS ioctl changes. v2: add ring lock/unlock to make sure changes hit the ring. Signed-off-by: NChristian König <deathsimple@vodafone.de> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NDave Airlie <airlied@redhat.com>
-
由 Jerome Glisse 提交于
Virtual address space are per drm client (opener of /dev/drm). Client are in charge of virtual address space, they need to map bo into it by calling DRM_RADEON_GEM_VA ioctl. First 16M of virtual address space is reserved by the kernel. Once using 2 level page table we should be able to have a small vram memory footprint for each pt (there would be one pt for all gart, one for all vram and then one first level for each virtual address space). Plan include using the sub allocator for a common vm page table area and using memcpy to copy vm page table in & out. Or use a gart object and copy things in & out using dma. v2: agd5f fixes: - Add vram base offset for vram pages. The GPU physical address of a vram page is FB_OFFSET + page offset. FB_OFFSET is 0 on discrete cards and the physical bus address of the stolen memory on integrated chips. - VM_CONTEXT1_PROTECTION_FAULT_DEFAULT_ADDR covers all vmid's >= 1 v3: agd5f: - integrate with the semaphore/multi-ring stuff v4: - rebase on top ttm dma & multi-ring stuff - userspace is now in charge of the address space - no more specific cs vm ioctl, instead cs ioctl has a new chunk v5: - properly handle mem == NULL case from move_notify callback - fix the vm cleanup path v6: - fix update of page table to only happen on valid mem placement v7: - add tlb flush for each vm context - add flags to define mapping property (readable, writeable, snooped) - make ring id implicit from ib->fence->ring, up to each asic callback to then do ring specific scheduling if vm ib scheduling function v8: - add query for ib limit and kernel reserved virtual space - rename vm->size to max_pfn (maximum number of page) - update gem_va ioctl to also allow unmap operation - bump kernel version to allow userspace to query for vm support v9: - rebuild page table only when bind and incrementaly depending on bo referenced by cs and that have been moved - allow virtual address space to grow - use sa allocator for vram page table - return invalid when querying vm limit on non cayman GPU - dump vm fault register on lockup v10: agd5f: - Move the vm schedule_ib callback to a standalone function, remove the callback and use the existing ib_execute callback for VM IBs. v11: - rebase on top of lastest Linus v12: agd5f: - remove spurious backslash - set IB vm_id to 0 in radeon_ib_get() v13: agd5f: - fix handling of RADEON_CHUNK_ID_FLAGS v14: - fix va destruction - fix suspend resume - forbid bo to have several different va in same vm v15: - rebase v16: - cleanup left over of vm init/fini v17: agd5f: - cs checker v18: agd5f: - reworks the CS ioctl to better support multiple rings and VM. Rather than adding a new chunk id for VM, just re-use the IB chunk id and add a new flags for VM mode. Also define additional dwords for the flags chunk id to define the what ring we want to use (gfx, compute, uvd, etc.) and the priority. v19: - fix cs fini in weird case of no ib - semi working flush fix for ni - rebase on top of sa allocator changes v20: agd5f: - further CS ioctl cleanups from Christian's comments v21: agd5f: - integrate CS checker improvements v22: agd5f: - final cleanups for release, only allow VM CS on cayman Signed-off-by: NJerome Glisse <jglisse@redhat.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NDave Airlie <airlied@redhat.com>
-
- 20 11月, 2011 1 次提交
-
-
由 Marek Olšák 提交于
This adds a new optional chunk to the CS ioctl that specifies optional flags to the CS parser. Why this is useful is explained below. Note that some regs no longer need the NOP relocation packet if this feature is enabled. Tested on r300g and r600g with this flag disabled and enabled. Assume there are two contexts sharing the same mipmapped tiled texture. One context wants to render into the first mipmap and the other one wants to render into the last mipmap. As you probably know, the hardware has a MACRO_SWITCH feature, which turns off macro tiling for small mipmaps, but that only applies to samplers. (at least on r300-r500, though later hardware likely behaves the same) So we want to just re-set the tiling flags before rendering (writing packets), right? ... No. The contexts run in parallel, so they may set the tiling flags simultaneously and then fire their command streams also simultaneously. The last one setting the flags wins, the other one loses. Another problem is when one context wants to render into the first and the last mipmap in one CS. Impossible. It must flush before changing tiling flags and do the rendering into the smaller mipmaps in another CS. Yet another problem is that writing copy_blit in userspace would be a mess involving re-setting tiling flags to please the kernel, and causing races with other contexts at the same time. The only way out of this is to send tiling flags with each CS, ideally with each relocation. But we already do that through the registers. So let's just use what we have in the registers. Signed-off-by: NMarek Olšák <maraeo@gmail.com> Signed-off-by: NDave Airlie <airlied@redhat.com>
-