- 15 8月, 2017 1 次提交
-
-
由 Lucas Stach 提交于
The stub functions returns -ENODEV when trying to register the cooling device, thus failing the GPU bind, rendering the GPU subsystem unusable when CONFIG_THERMAL isn't enabled. Signed-off-by: NLucas Stach <l.stach@pengutronix.de>
-
- 05 5月, 2017 2 次提交
-
-
由 Lucas Stach 提交于
GPU cores with the DYNAMIC_FREQUENCY_SCALING feature bit set expect the platform to provide the clock scaling and ignore any requests to use the internal FSCALE divider. Writes to this register still work, but don't have any effect on the GPU clock frequency. Save the initial core and shader clock frequency and ask the platform to provide a slower clock when cooling is requested. Signed-off-by: NLucas Stach <l.stach@pengutronix.de>
-
由 Lucas Stach 提交于
PA clock gating can be enabled when the right bugfix bit is present. There are broken revs of GC4000 and GC2000, which need TX clock gating to be disabled. Signed-off-by: NLucas Stach <l.stach@pengutronix.de>
-
- 12 4月, 2017 1 次提交
-
-
由 Wei Yongjun 提交于
Add the missing unlock before return from function etnaviv_gpu_submit() in the error handling case. lst: fixed label name. Fixes: f3cd1b06 ("drm/etnaviv: (re-)protect fence allocation with GPU mutex") CC: stable@vger.kernel.org #4.9+ Signed-off-by: NWei Yongjun <weiyongjun1@huawei.com> Signed-off-by: NLucas Stach <l.stach@pengutronix.de>
-
- 29 3月, 2017 5 次提交
-
-
由 Lucas Stach 提交于
The next patch will need the complete dma_fence, instead of just the seqno, to create the sync_file in etnaviv_ioctl_gem_submit, in case an out_fence_fd is requested. The submit needs to hold a reference to the dma_fence, to avoid raceing with the GPU completing the fence. Signed-off-by: NLucas Stach <l.stach@pengutronix.de> Tested-by: NPhilipp Zabel <p.zabel@pengutronix.de> --- New patch in v3.
-
由 Philipp Zabel 提交于
Loosely based on commit f0a42bb5 ("drm/msm: submit support for in-fences"). Unfortunately, struct drm_etnaviv_gem_submit doesn't have a flags field yet, so we have to extend the structure and trust that drm_ioctl will clear the flags for us if an older userspace only submits part of the struct. Signed-off-by: NPhilipp Zabel <p.zabel@pengutronix.de> Reviewed-by: NGustavo Padovan <gustavo.padovan@collabora.com> Reviewed-by: NSumit Semwal <sumit.semwal@linaro.org> Reviewed-by: NLucas Stach <l.stach@pengutronix.de> Signed-off-by: NLucas Stach <l.stach@pengutronix.de>
-
由 Russell King 提交于
Each Vivante GPU contains a clock divider which can divide the GPU clock by 2^n, which can lower the power dissipation from the GPU. It has been suggested that the GC600 on Dove is responsible for 20-30% of the power dissipation from the SoC, so lowering the GPU clock rate provides a way to throttle the power dissiptation, and reduce the temperature when the SoC gets hot. This patch hooks the Etnaviv driver into the kernel's thermal management to allow the GPUs to be throttled when necessary, allowing a reduction in GPU clock rate from /1 to /64 in power of 2 steps. Signed-off-by: NRussell King <rmk+kernel@armlinux.org.uk> Reviewed-by: NLucas Stach <l.stach@pengutronix.de> Signed-off-by: NLucas Stach <l.stach@pengutronix.de>
-
由 Lucas Stach 提交于
Make sure the GPU lock is taken, so that fence completion order matches seqno order. Signed-off-by: NLucas Stach <l.stach@pengutronix.de>
-
由 Lucas Stach 提交于
The fence allocation needs to be protected by the GPU mutex, otherwise the fence seqnos of concurrent submits might not match the insertion order of the jobs in the kernel ring. This breaks the assumption that jobs complete with monotonically increasing fence seqnos. Fixes: d9853490 (drm/etnaviv: take GPU lock later in the submit process) CC: stable@vger.kernel.org #4.9+ Signed-off-by: NLucas Stach <l.stach@pengutronix.de>
-
- 02 2月, 2017 3 次提交
-
-
由 Lucas Stach 提交于
There are 3 big benefits to suballocating a single big DMA buffer for command submission: 1. Avoid hammering CMA. The old way of allocating and freeing a DMA buffer for each submission was hitting some of the real slow pathes in CMA, as this allocator was not designed for a concurrent small buffers load. 2. Less TLB flushes on IOMMUv2. If a new command buffer is mapped into the GPU address space the MMU TLBs need to be flushed. By having one big buffer statically mapped to the GPU, a lot of those flushes can be avoided. 3. No funky workarounds for GC3000. The FE TLB flush on GC3000 isn't reliable. To work around that we tried to lay out the cmdbufs in the GPU address space in a way to avoid this issue. This hasn't always worked if the address space is crowded. A single statically mapped buffer avoids the erratum completely. Signed-off-by: NLucas Stach <l.stach@pengutronix.de> Reviewed-by: NChristian Gmeiner <christian.gmeiner@gmail.com>
-
由 Lucas Stach 提交于
Don't call the IOMMU directly, but go through the new cmdbuf abstraction. Signed-off-by: NLucas Stach <l.stach@pengutronix.de> Reviewed-by: NChristian Gmeiner <christian.gmeiner@gmail.com>
-
由 Lucas Stach 提交于
This will get more complex with the following changes, so move it into its own place. Signed-off-by: NLucas Stach <l.stach@pengutronix.de> Reviewed-by: NChristian Gmeiner <christian.gmeiner@gmail.com>
-
- 30 1月, 2017 1 次提交
-
-
由 Wladimir J. van der Laan 提交于
Set up the PULSE_EATER register (0x0010C) in etnaviv_gpu_hw_init. This ports three mostly undocumented model/revision-specific register overrides from the Vivante kernel driver. This is relevant as at least the "disable internal DFS" for revisions > 0x5420 has shown to have a huge impact on shader performance (sped up memory read performance by 7.5x and write performance by 1.5x) on an affected GPU. Signed-off-by: NWladimir J. van der Laan <laanwj@gmail.com> Signed-off-by: NLucas Stach <l.stach@pengutronix.de>
-
- 03 12月, 2016 1 次提交
-
-
由 Lucas Stach 提交于
On i.MX6SX the physical memory is placed above the 2GB mark, so the GPU linear window has to be moved for the GPU to work at all. This doesn't mix with the FAST_CLEAR feature, as the TS unit doesn't take the linear window offset into account and will corrupt memory when used with a non-zero offset. Move the linear window if it's necessary for the GPU to work, but avoid announcing FAST_CLEAR support to userspace in this case. Signed-off-by: NLucas Stach <l.stach@pengutronix.de> Tested-by: NMarek Vasut <marex@denx.de>
-
- 25 10月, 2016 1 次提交
-
-
由 Chris Wilson 提交于
I plan to usurp the short name of struct fence for a core kernel struct, and so I need to rename the specialised fence/timeline for DMA operations to make room. A consensus was reached in https://lists.freedesktop.org/archives/dri-devel/2016-July/113083.html that making clear this fence applies to DMA operations was a good thing. Since then the patch has grown a bit as usage increases, so hopefully it remains a good thing! (v2...: rebase, rerun spatch) v3: Compile on msm, spotted a manual fixup that I broke. v4: Try again for msm, sorry Daniel coccinelle script: @@ @@ - struct fence + struct dma_fence @@ @@ - struct fence_ops + struct dma_fence_ops @@ @@ - struct fence_cb + struct dma_fence_cb @@ @@ - struct fence_array + struct dma_fence_array @@ @@ - enum fence_flag_bits + enum dma_fence_flag_bits @@ @@ ( - fence_init + dma_fence_init | - fence_release + dma_fence_release | - fence_free + dma_fence_free | - fence_get + dma_fence_get | - fence_get_rcu + dma_fence_get_rcu | - fence_put + dma_fence_put | - fence_signal + dma_fence_signal | - fence_signal_locked + dma_fence_signal_locked | - fence_default_wait + dma_fence_default_wait | - fence_add_callback + dma_fence_add_callback | - fence_remove_callback + dma_fence_remove_callback | - fence_enable_sw_signaling + dma_fence_enable_sw_signaling | - fence_is_signaled_locked + dma_fence_is_signaled_locked | - fence_is_signaled + dma_fence_is_signaled | - fence_is_later + dma_fence_is_later | - fence_later + dma_fence_later | - fence_wait_timeout + dma_fence_wait_timeout | - fence_wait_any_timeout + dma_fence_wait_any_timeout | - fence_wait + dma_fence_wait | - fence_context_alloc + dma_fence_context_alloc | - fence_array_create + dma_fence_array_create | - to_fence_array + to_dma_fence_array | - fence_is_array + dma_fence_is_array | - trace_fence_emit + trace_dma_fence_emit | - FENCE_TRACE + DMA_FENCE_TRACE | - FENCE_WARN + DMA_FENCE_WARN | - FENCE_ERR + DMA_FENCE_ERR ) ( ... ) Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NGustavo Padovan <gustavo.padovan@collabora.co.uk> Acked-by: NSumit Semwal <sumit.semwal@linaro.org> Acked-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20161025120045.28839-1-chris@chris-wilson.co.uk
-
- 15 9月, 2016 16 次提交
-
-
由 Lucas Stach 提交于
If we reset the GPU to get it back into a usable state we lose all context, not just the MMU one. Mark the whole context as lost to trigger a restore of the exec and MMU state. Signed-off-by: NLucas Stach <l.stach@pengutronix.de>
-
由 Lucas Stach 提交于
GC2000+ on the i.MX6QP is just a re-branded GC3000, lets call it by its real name to avoid confusion in other parts of the driver. Signed-off-by: NLucas Stach <l.stach@pengutronix.de>
-
由 Lucas Stach 提交于
Bit 30 of the interrupt status signals an MMU exception. Handle this condition properly and dump some useful registers. Signed-off-by: NLucas Stach <l.stach@pengutronix.de>
-
由 Lucas Stach 提交于
With MMUv2 all buffers need to be mapped through the MMU once it is enabled. Align the buffer size to 4K, as the MMU is only able to map page aligned buffers. Signed-off-by: NLucas Stach <l.stach@pengutronix.de>
-
由 Lucas Stach 提交于
Split out into a new externally visible function, as the IOMMUv2 code needs this functionality, too. Signed-off-by: NLucas Stach <l.stach@pengutronix.de>
-
由 Lucas Stach 提交于
Split out into a new externally visible function, as the IOMMUv2 code needs this functionality, too. Signed-off-by: NLucas Stach <l.stach@pengutronix.de>
-
由 Lucas Stach 提交于
The GPU virtual address for the command buffers differs depending on the IOMMU version. Move the calculation of the iova into etnaviv mmu, to enable proper dispatch. Signed-off-by: NLucas Stach <l.stach@pengutronix.de>
-
由 Lucas Stach 提交于
The GPU code doesn't need to deal with the IOMMU directly, instead it can all be hidden behind the etnaviv mmu interface. Move the last remaining part into etnaviv mmu. Signed-off-by: NLucas Stach <l.stach@pengutronix.de>
-
由 Lucas Stach 提交于
So we can call the v2 restore code once it is there. Signed-off-by: NLucas Stach <l.stach@pengutronix.de>
-
由 Lucas Stach 提交于
It is only relevant for the V1 MMU, so we should not do this in the common code. Signed-off-by: NLucas Stach <l.stach@pengutronix.de>
-
由 Lucas Stach 提交于
This function has external visibility and only handles the Vivant IOMMU version 1. Rename to make this more clear and allow a clear separation of the different IOMMU versions. Also drop the domain parameter, as we can infer it from the GPU we are dealing with. Signed-off-by: NLucas Stach <l.stach@pengutronix.de>
-
由 Lucas Stach 提交于
There is no linear window on MMUv2 and the FE can access the full 4GB address space either directly (as long as the MMU isn't configured) or through the MMU, once it is up. Signed-off-by: NLucas Stach <l.stach@pengutronix.de>
-
由 Lucas Stach 提交于
The driver doesn't ever enable individual clocks alone, so there is no need to scatter the clock enable/disable sequences through multiple functions. Fold them into the top one. Signed-off-by: NLucas Stach <l.stach@pengutronix.de>
-
由 Fabio Estevam 提交于
There is no need to initialize variable 'err' with 0 because it will be properly assigned later on. Signed-off-by: NFabio Estevam <festevam@gmail.com>
-
由 Fabio Estevam 提交于
In the etnaviv_gpu_platform_probe() error path the 'fail' label is used to just return the error code. This can be simplified by returning the error code immediately, so get rid of the unneeded 'fail' label. Signed-off-by: NFabio Estevam <festevam@gmail.com>
-
由 Fabio Estevam 提交于
clk_prepare_enable() may fail, so we should better check for its return value and propagate it in the case of failure. Signed-off-by: NFabio Estevam <festevam@gmail.com>
-
- 15 8月, 2016 1 次提交
-
-
由 Lucas Stach 提交于
Both the fence and event alloc are safe to be done without holding the GPU lock, as they either don't need any locking (fences) or are protected by their own lock (events). This solves a bad locking interaction between the submit path and the recover worker. If userspace manages to exhaust all available events while the GPU is hung, the submit will wait for events to become available holding the GPU lock. The recover worker waits for this lock to become available before trying to recover the GPU which frees up the allocated events. Essentially both paths are deadlocked until the submit path times out waiting for available events, failing the submit that could otherwise be handled just fine if the recover worker had the chance to bring the GPU back in a working state. Signed-off-by: NLucas Stach <l.stach@pengutronix.de> Reviewed-by: NChristian Gmeiner <christian.gmeiner@gmail.com>
-
- 05 7月, 2016 2 次提交
-
-
由 Lucas Stach 提交于
Print error messages that mention the exact cause of the failure on all paths which may fail the GPU init. Signed-off-by: NLucas Stach <l.stach@pengutronix.de> Reviewed-by: NChristian Gmeiner <christian.gmeiner@gmail.com>
-
由 Russell King 提交于
Enable GPU module level hardware clock gating, using the conditions found in the galcore v5 driver. v2 lst: Split out clock gating enable into separate function, as there might be more conditions needed for new hardware. Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk> Reviewed-by: NChristian Gmeiner <christian.gmeiner@gmail.com> Signed-off-by: NLucas Stach <l.stach@pengutronix.de>
-
- 06 5月, 2016 1 次提交
-
-
由 Lucas Stach 提交于
The hangcheck handler is already running with very coarse timeouts, so it doesn't hurt to combine this timer with other wakeups in the system. Signed-off-by: NLucas Stach <l.stach@pengutronix.de>
-
- 28 4月, 2016 1 次提交
-
-
由 Masanari Iida 提交于
This patch fix spelling typos in printk from various part of the codes. Signed-off-by: NMasanari Iida <standby24x7@gmail.com> Acked-by: NRandy Dunlap <rdunlap@infradead.org> Signed-off-by: NJiri Kosina <jkosina@suse.cz>
-
- 21 4月, 2016 1 次提交
-
-
由 Lucas Stach 提交于
On cores with MC1.0 the memory window offset is not properly respected by all engines in the core, leading to different views of the memory if the offset in non-zero. This causes relocs for those engines to be wrong and might lead to other subtile problems. Rather than trying to work around this, just disable the linear memory window offset for those cores. Suggested-by: NRussell King <linux@arm.linux.org.uk> Signed-off-by: NLucas Stach <l.stach@pengutronix.de>
-
- 09 3月, 2016 1 次提交
-
-
由 Luis R. Rodriguez 提交于
Rename dma_*_writecombine() to dma_*_wc(), so that the naming is coherent across the various write-combining APIs. Keep the old names for compatibility for a while, these can be removed at a later time. A guard is left to enable backporting of the rename, and later remove of the old mapping defines seemlessly. Build tested successfully with allmodconfig. The following Coccinelle SmPL patch was used for this simple transformation: @ rename_dma_alloc_writecombine @ expression dev, size, dma_addr, gfp; @@ -dma_alloc_writecombine(dev, size, dma_addr, gfp) +dma_alloc_wc(dev, size, dma_addr, gfp) @ rename_dma_free_writecombine @ expression dev, size, cpu_addr, dma_addr; @@ -dma_free_writecombine(dev, size, cpu_addr, dma_addr) +dma_free_wc(dev, size, cpu_addr, dma_addr) @ rename_dma_mmap_writecombine @ expression dev, vma, cpu_addr, dma_addr, size; @@ -dma_mmap_writecombine(dev, vma, cpu_addr, dma_addr, size) +dma_mmap_wc(dev, vma, cpu_addr, dma_addr, size) We also keep the old names as compatibility helpers, and guard against their definition to make backporting easier. Generated-by: Coccinelle SmPL Suggested-by: NIngo Molnar <mingo@kernel.org> Signed-off-by: NLuis R. Rodriguez <mcgrof@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: airlied@linux.ie Cc: akpm@linux-foundation.org Cc: benh@kernel.crashing.org Cc: bhelgaas@google.com Cc: bp@suse.de Cc: dan.j.williams@intel.com Cc: daniel.vetter@ffwll.ch Cc: dhowells@redhat.com Cc: julia.lawall@lip6.fr Cc: konrad.wilk@oracle.com Cc: linux-fbdev@vger.kernel.org Cc: linux-pci@vger.kernel.org Cc: luto@amacapital.net Cc: mst@redhat.com Cc: tomi.valkeinen@ti.com Cc: toshi.kani@hp.com Cc: vinod.koul@intel.com Cc: xen-devel@lists.xensource.com Link: http://lkml.kernel.org/r/1453516462-4844-1-git-send-email-mcgrof@do-not-panic.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 07 3月, 2016 2 次提交
-
-
由 Russell King 提交于
Currently, we scan the list of mappings each time we want to operate on the vram_mapping struct. Rather than repeatedly scanning these, look them up once in the submission path, and then use _reference and _unreference methods as necessary to manage this object. Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: NLucas Stach <l.stach@pengutronix.de>
-
由 Russell King 提交于
Add tracking of the current execution state (iow, active GPU pipe). Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: NLucas Stach <l.stach@pengutronix.de>
-