- 03 Jan, 2018 14 commits
-
-
Committed by Lucas Stach
As long as there is an active submit, we want the GPU to stay awake. This is slightly complicated by the fact that we really want to wake the GPU at the last possible moment to achieve maximum power savings. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-
Committed by Lucas Stach
The active count is used to check if the BO is idle, where idle is defined as not active on the GPU and all VM mappings and reference counts dropped to the initial state. As the idling of the mappings and references now only happens in the submit cleanup, the active state handling must be moved to the same location in order to keep the userspace semantics. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-
Committed by Lucas Stach
Fewer dynamic allocations, and slims down the cmdbuf object to only the required information, as everything else is already available in the submit object. This also simplifies buffer and mappings lifetime management, as they are now exclusively attached to the submit object and not additionally to the cmdbuf. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-
Committed by Lucas Stach
The GPU exec state may have changed at the time when the perfmon sampling is done, as it reflects the state of the last submission, not the current GPU execution state. So for proper sampling we must use the submit exec_state. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-
Committed by Lucas Stach
We'll need this in some places where only the submit is available. Also this is a first step toward slimming down the cmdbuf object. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-
Committed by Lucas Stach
To make them available to the event worker even after the actual command stream execution has finished. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-
Committed by Lucas Stach
This is the fence passed out on a successful GPU submit. Make the name clearer. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-
Committed by Lucas Stach
The object fencing has nothing to do with the actual GPU buffer submit, so move it to the gem submit path to have a cleaner split. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
-
Committed by Lucas Stach
Inserting the END command when suspending the GPU changes the command buffer state, which requires the GPU lock to be held. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
-
Committed by Lucas Stach
While the etnaviv workqueue needs to be ordered, as we rely on work items being executed in queuing order, this is only true for a single GPU. Having a shared workqueue for all GPUs in the system limits concurrency artificially. Giving each GPU its own ordered workqueue still meets our ordering expectations and enables retire workers to run concurrently. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
-
Committed by Lucas Stach
There is no need to store this in the gpu struct. MMU flushes are triggered correctly in reaction to MMU maps and unmaps, independent of the current ctx. Any required pipe switches can be inferred from the current and the desired GPU exec state. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
-
Committed by Lucas Stach
There is no need to synchronize with outstanding retire jobs if the object has gone idle. Retire jobs only ever change the object state from active to idle, not the other way around. The IOVA put race is uncritical, as the GEM_WAIT ioctl itself is holding a reference to the GEM object, so the retire worker will not pull the object into the CPU domain, which is the thing we are trying to guard against with etnaviv_gpu_wait_obj_inactive. The ordering of the various counts and waits may change a bit, but the userspace visible behavior at the bounds of the syscall is unchanged. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
-
Committed by Lucas Stach
Flush and prefetch are properly handled in the buffer code; data endianness would need much wider changes than adding something to this single function. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
-
Committed by Lucas Stach
If the FE is restarted before the sync point event is cleared, the GPU might trigger a completion IRQ for the next sync point, corrupting the state of the currently running worker. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
-
- 02 Dec, 2017 1 commit
-
-
Committed by Philipp Zabel
The etnaviv driver causes a link failure if it is built-in but THERMAL is built as a module:
  drivers/gpu/drm/etnaviv/etnaviv_gpu.o: In function `etnaviv_gpu_bind':
  etnaviv_gpu.c:(.text+0x4c4): undefined reference to `thermal_of_cooling_device_register'
  etnaviv_gpu.c:(.text+0x600): undefined reference to `thermal_cooling_device_unregister'
  drivers/gpu/drm/etnaviv/etnaviv_gpu.o: In function `etnaviv_gpu_unbind':
  etnaviv_gpu.c:(.text+0x2aac): undefined reference to `thermal_cooling_device_unregister'
Adding a Kconfig dependency on THERMAL || !THERMAL to avoid this causes a dependency loop on x86_64:
  drivers/gpu/drm/tve200/Kconfig:1:error: recursive dependency detected!
  For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations"
  drivers/gpu/drm/tve200/Kconfig:1: symbol DRM_TVE200 depends on CMA
  For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations"
  mm/Kconfig:489: symbol CMA is selected by DRM_ETNAVIV
  For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations"
  drivers/gpu/drm/etnaviv/Kconfig:2: symbol DRM_ETNAVIV depends on THERMAL
  For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations"
  drivers/thermal/Kconfig:5: symbol THERMAL is selected by ACPI_VIDEO
  For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations"
  drivers/acpi/Kconfig:189: symbol ACPI_VIDEO is selected by BACKLIGHT_CLASS_DEVICE
  For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations"
  drivers/video/backlight/Kconfig:158: symbol BACKLIGHT_CLASS_DEVICE is selected by DRM_PARADE_PS8622
  For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations"
  drivers/gpu/drm/bridge/Kconfig:62: symbol DRM_PARADE_PS8622 depends on DRM_BRIDGE
  For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations"
  drivers/gpu/drm/bridge/Kconfig:1: symbol DRM_BRIDGE is selected by DRM_TVE200
To work around this, add a new option DRM_ETNAVIV_THERMAL to optionally enable thermal throttling support and make DRM_ETNAVIV select THERMAL at the same time.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-
- 03 Nov, 2017 1 commit
-
-
Committed by Kees Cook
In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Cc: Lucas Stach <l.stach@pengutronix.de> Cc: Russell King <linux+etnaviv@armlinux.org.uk> Cc: Christian Gmeiner <christian.gmeiner@gmail.com> Cc: David Airlie <airlied@linux.ie> Cc: etnaviv@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Kees Cook <keescook@chromium.org>
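The timer conversion described above relies on recovering the enclosing object from the timer pointer handed to the callback. The following is a minimal userspace sketch of that pattern, not the driver code: `struct timer_list` here is a simplified stand-in, and `struct demo_gpu`, `demo_timer_setup()`, `demo_timer_fire()` and `demo_hangcheck()` are hypothetical names used only for illustration.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's struct timer_list (assumption). */
struct timer_list {
    void (*function)(struct timer_list *t);
};

/* Userspace equivalent of container_of(): map a member pointer back
 * to the structure that embeds it. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical driver object that embeds its timer. */
struct demo_gpu {
    int hangcheck_count;
    struct timer_list hangcheck_timer;
};

/* New-style callback: it receives the timer itself and derives the
 * owning object from it, instead of getting an opaque data argument. */
void demo_hangcheck(struct timer_list *t)
{
    struct demo_gpu *gpu = container_of(t, struct demo_gpu, hangcheck_timer);

    gpu->hangcheck_count++;
}

/* Simplified stand-in for timer_setup(): only records the callback. */
void demo_timer_setup(struct timer_list *t,
                      void (*fn)(struct timer_list *))
{
    t->function = fn;
}

/* Simulates the timer core invoking an expired timer. */
void demo_timer_fire(struct timer_list *t)
{
    t->function(t);
}
```

Because the callback only needs the timer pointer, no separate data cookie has to be stored next to the timer.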
-
- 10 Oct, 2017 12 commits
-
-
Committed by Philipp Zabel
There is no reason to wait for clock stabilization here, as the clock framework guarantees that PLL clock sources are stable before clk_enable returns. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-
Committed by Philipp Zabel
After reset assertion, we only have to wait for the reset signals to propagate through the GPU before deasserting the reset again. A few hundred clock cycles should be more than enough. Replace the msleep(1), which can actually take about 30 ms on i.MX6Q in some configurations, with a usleep_range of a few microseconds. If the delay was too short, the FE would not be idle afterwards, and the reset would be retried. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-
Committed by Lucas Stach
This comment is outdated, as the driver has been taking care of clock gating and the pulse eater for quite some time already. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
-
Committed by Christian Gmeiner
Some performance registers are debug registers and they need to be enabled in order to be functional. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-
Committed by Christian Gmeiner
As done by the Vivante kernel driver. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-
Committed by Christian Gmeiner
With 'sync points' we can sample the requested performance signals before and/or after the submitted command buffer.
Changes v2 -> v3:
- fixed indentation and init nr_events to 1
Changes v4 -> v5:
- simplify logic around fence handling.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-
Committed by Christian Gmeiner
Results in less code as the users do not set every struct member to 0/NULL. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-
Committed by Christian Gmeiner
In order to support performance counters in a sane way we need to provide a method to sync the GPU with the CPU. The GPU can process multiple command buffers/events per irq. With the help of a 'sync point' we can trigger an event and stop the GPU/FE immediately. When the CPU is done with its processing it simply needs to restart the FE and the GPU will process the command stream.
Changes from v1 -> v2:
- process sync point with a work item to keep irq as fast as possible
Changes from v4 -> v5:
- renamed pmrs_* to sync_point_*
- call event_free(..) in sync_point_worker(..)
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-
Committed by Christian Gmeiner
This commit extends etnaviv_gpu_cmdbuf_new(..) to define the number of struct etnaviv_perfmon elements that get used.
Changes from v1 -> v2:
- make use of goto as requested by Lucas
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-
Committed by Christian Gmeiner
This makes it possible to allocate multiple events under the event spinlock. This change is needed to support 'sync'-points.
Changes v2 -> v3:
- wait for the completion of all events
- use 10sec timeout regardless of the number of events
- removed validation if there are enough free events
- fixed return value evaluation of event_alloc(..) in etnaviv_gpu_submit(..)
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
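Allocating several events inside one critical section can be sketched as below. This is only an illustration under stated assumptions: the pool size, the `evpool_*` names and the C11 `atomic_flag` spinlock are all hypothetical, and the sketch fails immediately when too few events are free, whereas the commit message indicates the driver waits with a timeout.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define EVPOOL_NR_EVENTS 30 /* hypothetical pool size */

struct evpool {
    atomic_flag lock;            /* minimal spinlock stand-in */
    bool used[EVPOOL_NR_EVENTS]; /* true while an event is handed out */
};

void evpool_init(struct evpool *p)
{
    atomic_flag_clear(&p->lock);
    for (unsigned int i = 0; i < EVPOOL_NR_EVENTS; i++)
        p->used[i] = false;
}

static void evpool_lock(struct evpool *p)
{
    while (atomic_flag_test_and_set(&p->lock))
        ; /* spin until the flag was previously clear */
}

static void evpool_unlock(struct evpool *p)
{
    atomic_flag_clear(&p->lock);
}

/* Claim nr events while holding the lock once. Returns 0 and fills
 * events[] on success; returns -1 and leaves the pool unchanged if
 * fewer than nr events are free. */
int evpool_alloc(struct evpool *p, unsigned int nr, unsigned int *events)
{
    unsigned int got = 0;
    int ret = 0;

    evpool_lock(p);
    for (unsigned int i = 0; i < EVPOOL_NR_EVENTS && got < nr; i++) {
        if (!p->used[i]) {
            p->used[i] = true;
            events[got++] = i;
        }
    }
    if (got < nr) {
        /* Roll back the partial allocation before dropping the lock. */
        while (got > 0)
            p->used[events[--got]] = false;
        ret = -1;
    }
    evpool_unlock(p);

    return ret;
}

void evpool_free(struct evpool *p, unsigned int event)
{
    evpool_lock(p);
    p->used[event] = false;
    evpool_unlock(p);
}
```

Taking the lock once for the whole group means a concurrent submitter can never observe, or steal from, a half-completed allocation.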
-
Committed by Christian Gmeiner
This is prep work to be able to allocate multiple events in one go. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-
Committed by Lucas Stach
The reset path wants to initialize the clock control register regardless of the DYNAMIC_FREQUENCY_SCALING feature, so don't call clock update, but explicitly load the register. Also disabling of the debug registers is moved into the reset function, so we always get to the same state after a GPU reset. This means the clock update function should not touch the bits already set in the clock control register, but instead only update the scaling bits. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
-
- 15 Aug, 2017 1 commit
-
-
Committed by Lucas Stach
The stub function returns -ENODEV when trying to register the cooling device, thus failing the GPU bind and rendering the GPU subsystem unusable when CONFIG_THERMAL isn't enabled. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-
- 05 May, 2017 2 commits
-
-
Committed by Lucas Stach
GPU cores with the DYNAMIC_FREQUENCY_SCALING feature bit set expect the platform to provide the clock scaling and ignore any requests to use the internal FSCALE divider. Writes to this register still work, but don't have any effect on the GPU clock frequency. Save the initial core and shader clock frequency and ask the platform to provide a slower clock when cooling is requested. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-
Committed by Lucas Stach
PA clock gating can be enabled when the right bugfix bit is present. There are broken revs of GC4000 and GC2000, which need TX clock gating to be disabled. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-
- 12 Apr, 2017 1 commit
-
-
Committed by Wei Yongjun
Add the missing unlock before return from function etnaviv_gpu_submit() in the error handling case. lst: fixed label name. Fixes: f3cd1b06 ("drm/etnaviv: (re-)protect fence allocation with GPU mutex") CC: stable@vger.kernel.org #4.9+ Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
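The class of bug fixed here, returning from an error path while a lock is still held, is commonly avoided by routing every exit through a single unlock label. The sketch below illustrates that shape only; `struct unlock_demo_lock`, `unlock_demo_submit()` and the `fence_alloc_fails` parameter are hypothetical, and the lock is modelled as a plain counter so the imbalance would be observable.

```c
#include <errno.h>

/* Hypothetical lock modelled as a hold counter (0 means released). */
struct unlock_demo_lock {
    int held;
};

void unlock_demo_acquire(struct unlock_demo_lock *l)
{
    l->held++;
}

void unlock_demo_release(struct unlock_demo_lock *l)
{
    l->held--;
}

/* Every path after the acquire leaves through out_unlock, so an early
 * failure cannot return with the lock still held. */
int unlock_demo_submit(struct unlock_demo_lock *lock, int fence_alloc_fails)
{
    int ret = 0;

    unlock_demo_acquire(lock);

    if (fence_alloc_fails) {
        ret = -ENOMEM;
        goto out_unlock; /* a bare "return ret;" here would leak the lock */
    }

    /* ... the job would be queued here ... */

out_unlock:
    unlock_demo_release(lock);
    return ret;
}
```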
-
- 29 Mar, 2017 5 commits
-
-
Committed by Lucas Stach
The next patch will need the complete dma_fence, instead of just the seqno, to create the sync_file in etnaviv_ioctl_gem_submit, in case an out_fence_fd is requested. The submit needs to hold a reference to the dma_fence, to avoid racing with the GPU completing the fence. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Tested-by: Philipp Zabel <p.zabel@pengutronix.de> --- New patch in v3.
-
Committed by Philipp Zabel
Loosely based on commit f0a42bb5 ("drm/msm: submit support for in-fences"). Unfortunately, struct drm_etnaviv_gem_submit doesn't have a flags field yet, so we have to extend the structure and trust that drm_ioctl will clear the flags for us if an older userspace only submits part of the struct. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Gustavo Padovan <gustavo.padovan@collabora.com> Reviewed-by: Sumit Semwal <sumit.semwal@linaro.org> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
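The compatibility argument above (an older userspace passes a shorter struct, and the newly appended fields read as zero) can be illustrated with a small copy-in helper. This is a sketch of the general zero-extension idea, not the drm_ioctl implementation: the struct layouts and `ioctl_demo_copy_in()` are hypothetical and do not mirror struct drm_etnaviv_gem_submit.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical old layout known to an older userspace. */
struct ioctl_demo_submit_v1 {
    uint32_t fence;
    uint32_t pipe;
};

/* Hypothetical extended layout: new fields are appended at the end. */
struct ioctl_demo_submit_v2 {
    uint32_t fence;
    uint32_t pipe;
    uint32_t flags;    /* new */
    int32_t fence_fd;  /* new */
};

/* Copy at most sizeof(*k) bytes from the caller and zero whatever the
 * caller did not provide, so appended fields default to 0. */
void ioctl_demo_copy_in(struct ioctl_demo_submit_v2 *k,
                        const void *user, size_t user_size)
{
    size_t n = user_size < sizeof(*k) ? user_size : sizeof(*k);

    memset(k, 0, sizeof(*k));
    memcpy(k, user, n);
}
```

With this scheme a flags value of 0 has to mean "old behaviour", which is why new fields are only ever appended and never reinterpreted.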
-
Committed by Russell King
Each Vivante GPU contains a clock divider which can divide the GPU clock by 2^n, which can lower the power dissipation from the GPU. It has been suggested that the GC600 on Dove is responsible for 20-30% of the power dissipation from the SoC, so lowering the GPU clock rate provides a way to throttle the power dissipation, and reduce the temperature when the SoC gets hot. This patch hooks the Etnaviv driver into the kernel's thermal management to allow the GPUs to be throttled when necessary, allowing a reduction in GPU clock rate from /1 to /64 in power-of-2 steps. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
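The /1 to /64 range in power-of-2 steps corresponds to seven throttling levels. A minimal sketch of that arithmetic follows, assuming a cooling state n in 0..6 maps to a divider of 2^n; the `thermal_demo_*` names are hypothetical, and the actual register encoding used by the driver is not shown.

```c
/* Highest throttling level: 2^6 = 64 (taken from the /1 to /64 range). */
#define THERMAL_DEMO_MAX_STATE 6UL

/* Clamp a requested cooling state to the supported range. */
unsigned long thermal_demo_clamp(unsigned long state)
{
    return state > THERMAL_DEMO_MAX_STATE ? THERMAL_DEMO_MAX_STATE : state;
}

/* Divider for a cooling state: 0 -> /1, 1 -> /2, ..., 6 -> /64. */
unsigned int thermal_demo_divider(unsigned long state)
{
    return 1u << thermal_demo_clamp(state);
}

/* Effective clock rate for a base rate and a cooling state. */
unsigned long thermal_demo_rate(unsigned long base_hz, unsigned long state)
{
    return base_hz >> thermal_demo_clamp(state);
}
```

For example, a hypothetical 800 MHz base clock at state 3 would run at 100 MHz.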
-
Committed by Lucas Stach
Make sure the GPU lock is taken, so that fence completion order matches seqno order. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-
Committed by Lucas Stach
The fence allocation needs to be protected by the GPU mutex, otherwise the fence seqnos of concurrent submits might not match the insertion order of the jobs in the kernel ring. This breaks the assumption that jobs complete with monotonically increasing fence seqnos. Fixes: d9853490 (drm/etnaviv: take GPU lock later in the submit process) CC: stable@vger.kernel.org #4.9+ Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
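The ordering requirement can be illustrated by handing out the seqno and inserting the job into the ring within the same critical section. The sketch below is an assumption-laden model, not the driver: `struct fence_ring`, its slot count and the `atomic_flag` spinlock stand in for the GPU mutex and the kernel ring.

```c
#include <stdatomic.h>
#include <stdint.h>

#define FENCE_RING_SLOTS 8 /* hypothetical ring capacity */

struct fence_ring {
    atomic_flag lock;                 /* stand-in for the GPU mutex */
    uint32_t next_seqno;              /* next fence seqno to hand out */
    uint32_t slots[FENCE_RING_SLOTS]; /* seqnos in insertion order */
    unsigned int count;
};

void fence_ring_init(struct fence_ring *r)
{
    atomic_flag_clear(&r->lock);
    r->next_seqno = 1;
    r->count = 0;
}

/* Allocate the seqno and queue the job under one lock hold. If the two
 * steps used separate critical sections, a concurrent submitter could
 * take a later seqno yet be inserted into the ring first. */
int fence_ring_submit(struct fence_ring *r, uint32_t *seqno_out)
{
    int ret = -1;

    while (atomic_flag_test_and_set(&r->lock))
        ; /* spin */

    if (r->count < FENCE_RING_SLOTS) {
        uint32_t seqno = r->next_seqno++;

        r->slots[r->count++] = seqno;
        *seqno_out = seqno;
        ret = 0;
    }

    atomic_flag_clear(&r->lock);
    return ret;
}
```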
-
- 02 Feb, 2017 3 commits
-
-
Committed by Lucas Stach
There are 3 big benefits to suballocating a single big DMA buffer for command submission:
1. Avoid hammering CMA. The old way of allocating and freeing a DMA buffer for each submission was hitting some of the really slow paths in CMA, as this allocator was not designed for a load of many small concurrent buffers.
2. Fewer TLB flushes on IOMMUv2. If a new command buffer is mapped into the GPU address space the MMU TLBs need to be flushed. By having one big buffer statically mapped to the GPU, a lot of those flushes can be avoided.
3. No funky workarounds for GC3000. The FE TLB flush on GC3000 isn't reliable. To work around that we tried to lay out the cmdbufs in the GPU address space in a way to avoid this issue. This hasn't always worked if the address space is crowded. A single statically mapped buffer avoids the erratum completely.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
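The suballocation idea can be sketched as a first-fit allocator that hands out granule-aligned offsets within one large, pre-mapped buffer. This is an illustrative model only: the 256 KiB buffer size, the 4 KiB granule and the `suballoc_demo_*` names are assumptions, and the backing DMA buffer itself is not modelled.

```c
#include <stdbool.h>
#include <stddef.h>

#define SUBALLOC_DEMO_SIZE     (256 * 1024) /* assumed buffer size */
#define SUBALLOC_DEMO_GRANULE  4096         /* assumed granule size */
#define SUBALLOC_DEMO_GRANULES (SUBALLOC_DEMO_SIZE / SUBALLOC_DEMO_GRANULE)

struct suballoc_demo {
    bool used[SUBALLOC_DEMO_GRANULES]; /* one flag per granule */
};

static size_t suballoc_demo_granules(size_t size)
{
    return (size + SUBALLOC_DEMO_GRANULE - 1) / SUBALLOC_DEMO_GRANULE;
}

/* First-fit search for a run of free granules. Returns the byte offset
 * into the big buffer, or -1 if no run of the needed length is free. */
long suballoc_demo_new(struct suballoc_demo *s, size_t size)
{
    size_t need = suballoc_demo_granules(size);
    size_t run = 0;

    if (need == 0 || need > SUBALLOC_DEMO_GRANULES)
        return -1;

    for (size_t i = 0; i < SUBALLOC_DEMO_GRANULES; i++) {
        run = s->used[i] ? 0 : run + 1;
        if (run == need) {
            size_t start = i + 1 - need;

            for (size_t j = start; j <= i; j++)
                s->used[j] = true;
            return (long)(start * SUBALLOC_DEMO_GRANULE);
        }
    }

    return -1;
}

/* Release a range previously returned by suballoc_demo_new(). */
void suballoc_demo_free(struct suballoc_demo *s, long offset, size_t size)
{
    size_t start = (size_t)offset / SUBALLOC_DEMO_GRANULE;
    size_t n = suballoc_demo_granules(size);

    for (size_t j = start; j < start + n; j++)
        s->used[j] = false;
}
```

Because the big buffer stays mapped, allocating and freeing a command buffer only flips bookkeeping flags and never touches the GPU page tables.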
-
Committed by Lucas Stach
Don't call the IOMMU directly, but go through the new cmdbuf abstraction. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
-
Committed by Lucas Stach
This will get more complex with the following changes, so move it into its own place. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
-