- 07 10月, 2016 1 次提交
-
-
由 Masahiro Yamada 提交于
The combo of list_empty() check and return list_first_entry() can be replaced with list_first_entry_or_null(). Signed-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: NEric Anholt <eric@anholt.net>
-
- 20 8月, 2016 1 次提交
-
-
由 Eric Anholt 提交于
Overflow memory handling is tricky: While it's still referenced by the BPO registers, we want to keep it from being freed. When we are putting a new set of overflow memory in the registers, we need to assign the old one to the last rendering job using it. We were looking at "what's currently running in the binner", but since the bin/render submission split, we may end up with the binner completing and having no new job while the renderer is still processing. So, if we don't find a bin job at all, look at the highest-seqno (last) render job to attach our overflow to. Signed-off-by: NEric Anholt <eric@anholt.net> Fixes: ca26d28b ("drm/vc4: improve throughput by pipelining binning and rendering jobs") Cc: stable@vger.kernel.org
-
- 16 7月, 2016 1 次提交
-
-
由 Eric Anholt 提交于
We're already checking that branch instructions are between the start of the shader and the proper PROG_END sequence. The other thing we need to make branching safe is to verify that the shader doesn't read past the end of the uniforms stream. To do that, we require that at any basic block reading uniforms have the following instructions: load_imm temp, <next offset within uniform stream> add unif_addr, temp, unif The instructions are generated by userspace, and the kernel verifies that the load_imm is of the expected offset, and that the add adds it to a uniform. We track which uniform in the stream that is, and at draw call time fix up the uniform stream to have the address of the start of the shader's uniforms at that location. Signed-off-by: NEric Anholt <eric@anholt.net>
-
- 12 7月, 2016 1 次提交
-
-
由 Mario Kleiner 提交于
Precise vblank timestamping is implemented via the usual scanout position based method. On VC4 the pixelvalves PV do not have a scanout position register. Only the hardware video scaler HVS has a similar register which describes which scanline for the output is currently composited and stored in the HVS fifo for later consumption by the PV. This causes a problem in that the HVS runs at a much faster clock (system clock / audio gate) than the PV which runs at video mode dot clock, so the unless the fifo between HVS and PV is full, the HVS will progress faster in its observable read line position than video scan rate, so the HVS position reading can't be directly translated into a scanout position for timestamp correction. Additionally when the PV is in vblank, it doesn't consume from the fifo, so the fifo gets full very quickly and then the HVS stops compositing until the PV enters active scanout and starts consuming scanlines from the fifo again, making new space for the HVS to composite. Therefore a simple translation of HVS read position into elapsed time since (or to) start of active scanout does not work, but for the most interesting cases we can still get useful and sufficiently accurate results: 1. The PV enters active scanout of a new frame with the fifo of the HVS completely full, and the HVS can refill any fifo line which gets consumed and thereby freed up by the PV during active scanout very quickly. Therefore the PV and HVS work effectively in lock-step during active scanout with the fifo never having more than 1 scanline freed up by the PV before it gets refilled. The PV's real scanout position is therefore trailing the HVS compositing position as scanoutpos = hvspos - fifosize and we can get the true scanoutpos as HVS readpos minus fifo size, so precise timestamping works while in active scanout, except for the last few scanlines of the frame, when the HVS reaches end of frame, stops compositing and the PV catches up and drains the fifo. This special case would only introduce minor errors though. 2. If we are in vblank, then we can only guess something reasonable. If called from vblank irq, we assume the irq is usually dispatched with minimum delay, so we can take a timestamp taken at entry into the vblank irq handler as a baseline and then add a full vblank duration until the guessed start of active scanout. As irq dispatch is usually pretty low latency this works with relatively low jitter and good results. If we aren't called from vblank then we could be anywhere within the vblank interval, so we return a neutral result, simply the current system timestamp, and hope for the best. Measurement shows the generated timestamps to be rather precise, and at least never off more than 1 vblank duration worst-case. Limitations: Doesn't work well yet for interlaced video modes, therefore disabled in interlaced mode for now. v2: Use the DISPBASE registers to determine the FIFO size (changes by anholt) Signed-off-by: NMario Kleiner <mario.kleiner.de@gmail.com> Signed-off-by: NEric Anholt <eric@anholt.net> Reviewed-and-tested-by: Mario Kleiner <mario.kleiner.de@gmail.com> (v2)
-
- 02 6月, 2016 1 次提交
-
-
由 Daniel Vetter 提交于
... and use it in msm&vc4. Again just want to encapsulate drm_atomic_state internals a bit. The const threading is a bit awkward in vc4 since C sucks, but I still think it's worth to enforce this. Eventually I want to make all the obj->state pointers const too, but that's a lot more work ... v2: Provide safe macro to wrap up the unsafe helper better, suggested by Maarten. v3: Fixup subject (Maarten) and spelling fixes (Eric Engestrom). Cc: Eric Anholt <eric@anholt.net> Cc: Rob Clark <robdclark@gmail.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1464877304-4213-1-git-send-email-daniel.vetter@ffwll.ch
-
- 15 4月, 2016 1 次提交
-
-
由 Eric Anholt 提交于
The DPI interface involves taking a ton of our GPIOs to be used as outputs, and routing display signals over them in parallel. v2: Use display_info.bus_formats[] to replace our custom DT properties. v3: Rebase on V3D documentation changes. v4: Fix rebase detritus from V3D documentation changes. Signed-off-by: NEric Anholt <eric@anholt.net> Acked-by: NRob Herring <robh@kernel.org>
-
- 14 3月, 2016 1 次提交
-
-
由 Varad Gautam 提交于
The hardware provides us with separate threads for binning and rendering, and the existing model waits for them both to complete before submitting the next job. Splitting the binning and rendering submissions reduces idle time and gives us approx 20-30% speedup with some x11perf tests such as -line10 and -tilerect1. Improves openarena performance by 1.01897% +/- 0.247857% (n=16). Thanks to anholt for suggesting this. v2: Rebase on the spurious resets fix (change by anholt). Signed-off-by: NVarad Gautam <varadgautam@gmail.com> Reviewed-by: NEric Anholt <eric@anholt.net> Signed-off-by: NEric Anholt <eric@anholt.net>
-
- 17 2月, 2016 5 次提交
-
-
由 Eric Anholt 提交于
This gets us functional GPU reset again, like we had until a refactor at merge time. Tested with a little patch to stuff in a broken binner job every 100 frames. Signed-off-by: NEric Anholt <eric@anholt.net>
-
由 Eric Anholt 提交于
This may actually get us a feature that the closed driver didn't have: turning off the GPU in between rendering jobs, while the V3D device is still opened by the client. There may be some tuning to be applied here to use autosuspend so that we don't bounce the device's power so much, but in steady-state GPU-bound rendering we keep the power on (since we keep multiple jobs outstanding) and even if we power cycle on every job we can still manage at least 680 fps. More importantly, though, runtime PM will allow us to power off the device to do a GPU reset. v2: Switch #ifdef to CONFIG_PM not CONFIG_PM_SLEEP (caught by kbuild test robot) Signed-off-by: NEric Anholt <eric@anholt.net>
-
由 Eric Anholt 提交于
We were tracking the "where are the head pointers pointing" globally, so if another job reused the same BOs and execution was at the same point as last time we checked, we'd stop and trigger a reset even though the GPU had made progress. Signed-off-by: NEric Anholt <eric@anholt.net>
-
由 Eric Anholt 提交于
This implements a simple policy for choosing scaling modes (trapezoidal for decimation, PPF for magnification), and a single PPF filter (Mitchell/Netravali's recommendation). Signed-off-by: NEric Anholt <eric@anholt.net>
-
由 Eric Anholt 提交于
So far, we've only ever lit up one CRTC, so this has been fine. To extend to more displays or more planes, we need to make sure we don't run our display lists into each other. Signed-off-by: NEric Anholt <eric@anholt.net>
-
- 08 2月, 2016 1 次提交
-
-
由 Daniel Vetter 提交于
Again since the drm core takes care of event unlinking/disarming this is now just needless code. v2: Fixup misplaced hunk. Cc: Eric Anholt <eric@anholt.net> Acked-by: NDaniel Stone <daniels@collabora.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1) Acked-by: NEric Anholt <eric@anholt.net> Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1453756616-28942-14-git-send-email-daniel.vetter@ffwll.ch
-
- 08 12月, 2015 7 次提交
-
-
由 Eric Anholt 提交于
This can be parsed with vc4-gpu-tools tools for trying to figure out what was going on. v2: Use __u32-style types. Signed-off-by: NEric Anholt <eric@anholt.net>
-
由 Eric Anholt 提交于
An async pageflip stores the modeset to be done and executes it once the BOs are ready to be displayed. This gets us about 3x performance in full screen rendering with pageflipping. Signed-off-by: NEric Anholt <eric@anholt.net>
-
由 Eric Anholt 提交于
The user submission is basically a pointer to a command list and a pointer to uniforms. We copy those in to the kernel, validate and relocate them, and store the result in a GPU BO which we queue for execution. v2: Drop support for NV shader recs (not necessary for GL), simplify vc4_use_bo(), improve bin flush/semaphore checks, use __u32 style types. Signed-off-by: NEric Anholt <eric@anholt.net>
-
由 Eric Anholt 提交于
This is the component of the GPU that does 3D rendering. Signed-off-by: NEric Anholt <eric@anholt.net>
-
由 Eric Anholt 提交于
Since we have no MMU, the kernel needs to validate that the submitted shader code won't make any accesses to memory that the user doesn't control, which involves banning some operations (general purpose DMA writes), and tracking where we need to write out pointers for other operations (texture sampling). Once it's validated, we return a GEM BO containing the shader, which doesn't allow mapping for write or exporting to other subsystems. v2: Use __u32-style types. Signed-off-by: NEric Anholt <eric@anholt.net>
-
由 Eric Anholt 提交于
While there exist dumb APIs for creating and mapping BOs, one of the rules is that drivers doing 3D acceleration have to provide their own APIs for buffer allocation (besides, the pitch/height parameters of the dumb alloc don't really make sense for a lot of 3D allocations). v2: Use __u32-style types, use "drm.h" instead of <drm/drm.h>. Signed-off-by: NEric Anholt <eric@anholt.net>
-
由 Eric Anholt 提交于
We need to allocate new BOs in the kernel as part of each frame, but the CMA allocator is way too slow for that. As an optimization, keep track of recently-freed BOs and reuse them, with a 1 second timeout to fully free them back to the system. This improves 3D performance by about 15%. Signed-off-by: NEric Anholt <eric@anholt.net>
-
- 21 10月, 2015 2 次提交
-
-
由 Derek Foreman 提交于
Keep the fbdev_cma pointer around so we can use it on hotplog and close to ensure the frame buffer console is in a useful state. Signed-off-by: NDerek Foreman <derekf@osg.samsung.com> Signed-off-by: NEric Anholt <eric@anholt.net>
-
由 Eric Anholt 提交于
This is enough for fbcon and bringing up X using xf86-video-modesetting. It doesn't support the 3D accelerator or power management yet. v2: Drop FB_HELPER select thanks to Archit's patches. Do manual init ordering instead of using the .load hook. Structure registration more like tegra's, but still using the typical "component" code. Drop no-op hooks for atomic_begin and mode_fixup() now that they're optional. Drop sentinel in Makefile. Fix minor style nits I noticed on another reread. v3: Use the new bcm2835 clk driver to manage pixel/HSM clocks instead of having a fixed video mode. Use exynos-style component driver matching instead of devicetree nodes to list the component driver instances. Rename compatibility strings to say bcm2835, and distinguish pv0/1/2. Clean up some h/vsync code, and add in interlaced mode setup. Fix up probe/bind error paths. Use bitops.h macros for vc4_regs.h v4: Include i2c.h, allow building under COMPILE_TEST, drop msleep now that other bugs have been fixed, add timeouts to cpu_relax() loops, rename hpd-gpio to hpd-gpios. Signed-off-by: NEric Anholt <eric@anholt.net> Acked-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-