Commits · 57b9f569447c4ab7b2a7e34a13e468311db4cd64 · openanolis / cloud-kernel

07 Oct, 2016 1 commit

drm/vc4: cleanup with list_first_entry_or_null() · 57b9f569

Authored by Masahiro Yamada on Sep 13, 2016

The combo of list_empty() check and return list_first_entry()
can be replaced with list_first_entry_or_null().

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
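A minimal userspace sketch of the pattern this cleanup describes. The list helpers below are simplified stand-ins for the kernel's `<linux/list.h>` definitions, and `struct demo_job`, `first_job_old()` and `first_job_new()` are hypothetical names used only for illustration, not code from the vc4 driver.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail(struct list_head *item, struct list_head *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

static int list_empty(const struct list_head *head)
{
	return head->next == head;
}

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define list_first_entry(head, type, member) \
	container_of((head)->next, type, member)

/* Simplified: yields NULL for an empty list, else the first entry. */
#define list_first_entry_or_null(head, type, member) \
	(list_empty(head) ? NULL : list_first_entry(head, type, member))

/* Hypothetical job type, for illustration only. */
struct demo_job {
	int seqno;
	struct list_head head;
};

/* Before: explicit emptiness check followed by list_first_entry(). */
static struct demo_job *first_job_old(struct list_head *queue)
{
	if (list_empty(queue))
		return NULL;
	return list_first_entry(queue, struct demo_job, head);
}

/* After: one call expresses the same thing. */
static struct demo_job *first_job_new(struct list_head *queue)
{
	return list_first_entry_or_null(queue, struct demo_job, head);
}
```

Both helpers return the same pointer for any list state, which is why the replacement is a pure cleanup.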

20 Aug, 2016 1 commit

drm/vc4: Fix overflow mem unreferencing when the binner runs dry. · 9326e6f2

Authored by Eric Anholt on Jul 26, 2016

Overflow memory handling is tricky: While it's still referenced by the
BPO registers, we want to keep it from being freed.  When we are
putting a new set of overflow memory in the registers, we need to
assign the old one to the last rendering job using it.

We were looking at "what's currently running in the binner", but since
the bin/render submission split, we may end up with the binner
completing and having no new job while the renderer is still
processing.  So, if we don't find a bin job at all, look at the
highest-seqno (last) render job to attach our overflow to.

Signed-off-by: Eric Anholt <eric@anholt.net>
Fixes: ca26d28b ("drm/vc4: improve throughput by pipelining binning and rendering jobs")
Cc: stable@vger.kernel.org
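The fallback described above can be sketched roughly as follows. This is a simplified model, not the driver code: `struct demo_job`, `struct demo_queue` and `demo_attach_old_overflow()` are hypothetical names, and the queues are plain arrays ordered by seqno.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model of a submitted job. */
struct demo_job {
	uint64_t seqno;
	int holds_old_overflow; /* job keeps the old overflow memory alive */
};

/* Jobs are assumed to be stored oldest first (lowest seqno first). */
struct demo_queue {
	struct demo_job *jobs;
	size_t count;
};

static struct demo_job *demo_first_job(struct demo_queue *q)
{
	return q->count ? &q->jobs[0] : NULL;
}

static struct demo_job *demo_last_job(struct demo_queue *q)
{
	return q->count ? &q->jobs[q->count - 1] : NULL;
}

/*
 * Pick the job that must keep the old overflow memory referenced:
 * the currently binning job if there is one, otherwise the
 * highest-seqno (last) render job. Returns NULL when neither queue
 * has a job, in which case nothing still uses the old memory.
 */
static struct demo_job *demo_attach_old_overflow(struct demo_queue *bin,
						 struct demo_queue *render)
{
	struct demo_job *job = demo_first_job(bin);

	if (!job)
		job = demo_last_job(render);
	if (job)
		job->holds_old_overflow = 1;
	return job;
}
```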

16 Jul, 2016 1 commit

drm/vc4: Add support for branching in shader validation. · 6d45c81d

Authored by Eric Anholt on Jul 02, 2016

We're already checking that branch instructions are between the start
of the shader and the proper PROG_END sequence.  The other thing we
need to make branching safe is to verify that the shader doesn't read
past the end of the uniforms stream.

To do that, we require that any basic block reading uniforms have
the following instructions:

load_imm temp, <next offset within uniform stream>
add unif_addr, temp, unif

The instructions are generated by userspace, and the kernel verifies
that the load_imm is of the expected offset, and that the add adds it
to a uniform.  We track which uniform in the stream that is, and at
draw call time fix up the uniform stream to have the address of the
start of the shader's uniforms at that location.

Signed-off-by: Eric Anholt <eric@anholt.net>
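A rough sketch of the check on the two-instruction sequence, under heavy simplification. Real QPU instructions are packed 64-bit words; the decoded `struct demo_inst`, the enums and `demo_check_uniform_address_reset()` below are hypothetical and only model the rule stated in the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded form of an instruction, for illustration only. */
enum demo_op { DEMO_OP_LOAD_IMM, DEMO_OP_ADD, DEMO_OP_OTHER };
enum demo_reg { DEMO_REG_TEMP, DEMO_REG_UNIF, DEMO_REG_UNIF_ADDR, DEMO_REG_OTHER };

struct demo_inst {
	enum demo_op op;
	enum demo_reg dst;
	enum demo_reg src_a;
	enum demo_reg src_b;
	uint32_t imm; /* only meaningful for DEMO_OP_LOAD_IMM */
};

/*
 * Accept only the pair
 *     load_imm temp, <expected_offset>
 *     add unif_addr, temp, unif
 * where expected_offset is the next offset within the uniform stream
 * as computed by the validator, not by userspace.
 */
static bool demo_check_uniform_address_reset(const struct demo_inst pair[2],
					     uint32_t expected_offset)
{
	const struct demo_inst *load = &pair[0];
	const struct demo_inst *add = &pair[1];

	if (load->op != DEMO_OP_LOAD_IMM || load->dst != DEMO_REG_TEMP)
		return false;
	if (load->imm != expected_offset)
		return false;
	if (add->op != DEMO_OP_ADD || add->dst != DEMO_REG_UNIF_ADDR)
		return false;
	/* The immediate must be added to a value read from the uniform stream. */
	return add->src_a == DEMO_REG_TEMP && add->src_b == DEMO_REG_UNIF;
}
```

The sketch is strict about operand order because that is the form shown in the message; whether the real validator accepts other encodings is not stated here.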

12 Jul, 2016 1 commit

drm/vc4: Implement precise vblank timestamping. · 1bf59f1d

Authored by Mario Kleiner on Jun 23, 2016

Precise vblank timestamping is implemented via the
usual scanout position based method. On VC4 the
pixelvalves PV do not have a scanout position
register. Only the hardware video scaler HVS has a
similar register which describes which scanline for
the output is currently composited and stored in the
HVS fifo for later consumption by the PV.

This causes a problem in that the HVS runs at a much
faster clock (system clock / audio gate) than the PV
which runs at video mode dot clock, so unless the
fifo between HVS and PV is full, the HVS will progress
faster in its observable read line position than video
scan rate, so the HVS position reading can't be directly
translated into a scanout position for timestamp correction.

Additionally when the PV is in vblank, it doesn't consume
from the fifo, so the fifo gets full very quickly and then
the HVS stops compositing until the PV enters active scanout
and starts consuming scanlines from the fifo again, making
new space for the HVS to composite.

Therefore a simple translation of HVS read position into
elapsed time since (or to) start of active scanout does
not work, but for the most interesting cases we can still
get useful and sufficiently accurate results:

1. The PV enters active scanout of a new frame with the
   fifo of the HVS completely full, and the HVS can refill
   any fifo line which gets consumed and thereby freed up by
   the PV during active scanout very quickly. Therefore the
   PV and HVS work effectively in lock-step during active
   scanout with the fifo never having more than 1 scanline
   freed up by the PV before it gets refilled. The PV's
   real scanout position is therefore trailing the HVS
   compositing position as scanoutpos = hvspos - fifosize
   and we can get the true scanoutpos as HVS readpos minus
   fifo size, so precise timestamping works while in active
   scanout, except for the last few scanlines of the frame,
   when the HVS reaches end of frame, stops compositing and
   the PV catches up and drains the fifo. This special case
   would only introduce minor errors though.

2. If we are in vblank, then we can only guess something
   reasonable. If called from vblank irq, we assume the irq is
   usually dispatched with minimum delay, so we can take a
   timestamp taken at entry into the vblank irq handler as a
   baseline and then add a full vblank duration until the
   guessed start of active scanout. As irq dispatch is usually
   pretty low latency this works with relatively low jitter and
   good results.

   If we aren't called from vblank then we could be anywhere
   within the vblank interval, so we return a neutral result,
   simply the current system timestamp, and hope for the best.

Measurement shows the generated timestamps to be rather precise,
and at least never off more than 1 vblank duration worst-case.

Limitations: Doesn't work well yet for interlaced video modes,
             therefore disabled in interlaced mode for now.

v2: Use the DISPBASE registers to determine the FIFO size (changes
    by anholt)

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-and-tested-by: Mario Kleiner <mario.kleiner.de@gmail.com> (v2)
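The two estimates described in cases 1 and 2 can be written down as plain arithmetic. This is only a sketch of the formulas in the message: the function names, the nanosecond units and the per-line duration parameter are assumptions, and the end-of-frame and interlaced cases are deliberately not handled.

```c
#include <stdint.h>

/*
 * Case 1, active scanout: the PV trails the HVS by the FIFO depth,
 * so scanoutpos = hvspos - fifosize.
 */
static int32_t demo_scanout_line(int32_t hvs_line, int32_t fifo_lines)
{
	return hvs_line - fifo_lines;
}

/* Time elapsed since the start of active scanout, given the line duration. */
static int64_t demo_ns_since_active_start(int32_t hvs_line, int32_t fifo_lines,
					  int64_t line_duration_ns)
{
	return (int64_t)demo_scanout_line(hvs_line, fifo_lines) * line_duration_ns;
}

/*
 * Case 2, called from the vblank irq: take the timestamp sampled at
 * irq entry as the baseline and add one full vblank duration to guess
 * the start of active scanout.
 */
static int64_t demo_guess_active_start_ns(int64_t irq_entry_ns,
					  int32_t vblank_lines,
					  int64_t line_duration_ns)
{
	return irq_entry_ns + (int64_t)vblank_lines * line_duration_ns;
}
```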

02 Jun, 2016 1 commit

drm/atomic: Add drm_atomic_crtc_state_for_each_plane_state · 2f196b7c

Authored by Daniel Vetter on Jun 02, 2016

... and use it in msm&vc4. Again just want to encapsulate
drm_atomic_state internals a bit.

The const threading is a bit awkward in vc4 since C sucks, but I still
think it's worth enforcing this. Eventually I want to make all the
obj->state pointers const too, but that's a lot more work ...

v2: Provide safe macro to wrap up the unsafe helper better, suggested
by Maarten.

v3: Fixup subject (Maarten) and spelling fixes (Eric Engestrom).

Cc: Eric Anholt <eric@anholt.net>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1464877304-4213-1-git-send-email-daniel.vetter@ffwll.ch

15 Apr, 2016 1 commit

drm/vc4: Add DPI driver · 08302c35

Authored by Eric Anholt on Feb 10, 2016

The DPI interface involves taking a ton of our GPIOs to be used as
outputs, and routing display signals over them in parallel.

v2: Use display_info.bus_formats[] to replace our custom DT
    properties.
v3: Rebase on V3D documentation changes.
v4: Fix rebase detritus from V3D documentation changes.

Signed-off-by: Eric Anholt <eric@anholt.net>
Acked-by: Rob Herring <robh@kernel.org>

14 Mar, 2016 1 commit

drm/vc4: improve throughput by pipelining binning and rendering jobs · ca26d28b

Authored by Varad Gautam on Feb 17, 2016

The hardware provides us with separate threads for binning and
rendering, and the existing model waits for them both to complete
before submitting the next job.

Splitting the binning and rendering submissions reduces idle time and
gives us approx 20-30% speedup with some x11perf tests such as -line10
and -tilerect1.  Improves openarena performance by 1.01897% +/-
0.247857% (n=16).

Thanks to anholt for suggesting this.

v2: Rebase on the spurious resets fix (change by anholt).

Signed-off-by: Varad Gautam <varadgautam@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Eric Anholt <eric@anholt.net>
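A toy timing model of why splitting the two submissions reduces idle time. It is not derived from the driver: `struct demo_job` and both functions are hypothetical, the durations are arbitrary units, and the model assumes one binner thread and one render thread with no other overhead.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-job durations in arbitrary time units. */
struct demo_job {
	uint64_t bin_time;
	uint64_t render_time;
};

/* Old model: wait for bin and render of a job before submitting the next. */
static uint64_t demo_serial_time(const struct demo_job *jobs, size_t n)
{
	uint64_t total = 0;

	for (size_t i = 0; i < n; i++)
		total += jobs[i].bin_time + jobs[i].render_time;
	return total;
}

/*
 * Pipelined model: the binner starts the next job as soon as it is
 * free, and each render starts once its own binning is done and the
 * previous render has finished.
 */
static uint64_t demo_pipelined_time(const struct demo_job *jobs, size_t n)
{
	uint64_t bin_done = 0;
	uint64_t render_done = 0;

	for (size_t i = 0; i < n; i++) {
		uint64_t start;

		bin_done += jobs[i].bin_time;
		start = bin_done > render_done ? bin_done : render_done;
		render_done = start + jobs[i].render_time;
	}
	return render_done;
}
```

In this model the saving comes entirely from binning of job N+1 overlapping with rendering of job N; the measured speedups quoted in the message depend on the real workloads.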

17 Feb, 2016 5 commits

drm/vc4: Use runtime PM to power cycle the device when the GPU hangs. · 36cb6253

Authored by Eric Anholt on Feb 08, 2016

This gets us functional GPU reset again, like we had until a refactor
at merge time.  Tested with a little patch to stuff in a broken binner
job every 100 frames.

Signed-off-by: Eric Anholt <eric@anholt.net>

drm/vc4: Enable runtime PM. · 001bdb55

Authored by Eric Anholt on Feb 05, 2016

This may actually get us a feature that the closed driver didn't have:
turning off the GPU in between rendering jobs, while the V3D device is
still opened by the client.

There may be some tuning to be applied here to use autosuspend so that
we don't bounce the device's power so much, but in steady-state
GPU-bound rendering we keep the power on (since we keep multiple jobs
outstanding) and even if we power cycle on every job we can still
manage at least 680 fps.

More importantly, though, runtime PM will allow us to power off the
device to do a GPU reset.

v2: Switch #ifdef to CONFIG_PM not CONFIG_PM_SLEEP (caught by kbuild
    test robot)

Signed-off-by: Eric Anholt <eric@anholt.net>

drm/vc4: Fix spurious GPU resets due to BO reuse. · c4ce60dc

Authored by Eric Anholt on Feb 08, 2016

We were tracking the "where are the head pointers pointing" globally,
so if another job reused the same BOs and execution was at the same
point as last time we checked, we'd stop and trigger a reset even
though the GPU had made progress.

Signed-off-by: Eric Anholt <eric@anholt.net>
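A small sketch of the idea of tracking the last observed head pointer per job rather than globally. The names `struct demo_job` and `demo_hangcheck_progressed()` are hypothetical, and the sketch assumes a freshly submitted job starts with a recorded address of 0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-job hangcheck state; a new job starts zeroed. */
struct demo_job {
	uint32_t last_addr; /* head pointer seen at the previous check */
};

/*
 * Returns true if the GPU made progress on this job since the last
 * check, and records the new position. Because the state lives in
 * the job, a different job that reuses the same BOs (and therefore
 * the same addresses) is not mistaken for a stalled one.
 */
static bool demo_hangcheck_progressed(struct demo_job *job, uint32_t current_addr)
{
	if (current_addr != job->last_addr) {
		job->last_addr = current_addr;
		return true;
	}
	return false;
}
```

With a single global record, the second job in the usage below would compare against the first job's address and be treated as hung.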

drm/vc4: Add support for scaling of display planes. · 21af94cf

Authored by Eric Anholt on Oct 20, 2015

This implements a simple policy for choosing scaling modes
(trapezoidal for decimation, PPF for magnification), and a single PPF
filter (Mitchell/Netravali's recommendation).

Signed-off-by: Eric Anholt <eric@anholt.net>
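The policy as worded above could be sketched like this. The enum and `demo_choose_scaling_mode()` are hypothetical names, and the exact thresholds the driver uses may differ from this literal reading of the message.

```c
#include <stdint.h>

/* Hypothetical names for the scaling modes mentioned in the message. */
enum demo_scaling_mode {
	DEMO_SCALING_NONE, /* source and destination sizes match */
	DEMO_SCALING_TPZ,  /* trapezoidal, used for decimation */
	DEMO_SCALING_PPF,  /* polyphase filter, used for magnification */
};

/* Choose a mode for one axis from the source and destination sizes. */
static enum demo_scaling_mode demo_choose_scaling_mode(uint32_t src, uint32_t dst)
{
	if (src == dst)
		return DEMO_SCALING_NONE;
	if (dst < src)
		return DEMO_SCALING_TPZ;
	return DEMO_SCALING_PPF;
}
```

Applying the function once per axis allows, for example, horizontal magnification combined with vertical decimation.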

drm/vc4: Make the CRTCs cooperate on allocating display lists. · d8dbf44f

Authored by Eric Anholt on Dec 28, 2015

So far, we've only ever lit up one CRTC, so this has been fine.  To
extend to more displays or more planes, we need to make sure we don't
run our display lists into each other.

Signed-off-by: Eric Anholt <eric@anholt.net>

08 Feb, 2016 1 commit

drm/vc4: Nuke preclose hook · 32a3dbeb

Authored by Daniel Vetter on Jan 25, 2016

Again since the drm core takes care of event unlinking/disarming this
is now just needless code.

v2: Fixup misplaced hunk.

Cc: Eric Anholt <eric@anholt.net>
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1)
Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1453756616-28942-14-git-send-email-daniel.vetter@ffwll.ch

08 Dec, 2015 7 commits

drm/vc4: Add an interface for capturing the GPU state after a hang. · 21461365

Authored by Eric Anholt on Oct 30, 2015

This can be parsed with vc4-gpu-tools tools for trying to figure out
what was going on.

v2: Use __u32-style types.

Signed-off-by: Eric Anholt <eric@anholt.net>

drm/vc4: Add support for async pageflips. · b501bacc

Authored by Eric Anholt on Nov 30, 2015

An async pageflip stores the modeset to be done and executes it once
the BOs are ready to be displayed.  This gets us about 3x performance
in full screen rendering with pageflipping.

Signed-off-by: Eric Anholt <eric@anholt.net>

drm/vc4: Add support for drawing 3D frames. · d5b1a78a

Authored by Eric Anholt on Nov 30, 2015

The user submission is basically a pointer to a command list and a
pointer to uniforms.  We copy those in to the kernel, validate and
relocate them, and store the result in a GPU BO which we queue for
execution.

v2: Drop support for NV shader recs (not necessary for GL), simplify
    vc4_use_bo(), improve bin flush/semaphore checks, use __u32 style
    types.

Signed-off-by: Eric Anholt <eric@anholt.net>

drm/vc4: Bind and initialize the V3D engine. · d3f5168a

Authored by Eric Anholt on Mar 02, 2015

This is the component of the GPU that does 3D rendering.

Signed-off-by: Eric Anholt <eric@anholt.net>

drm/vc4: Add an API for creating GPU shaders in GEM BOs. · 463873d5

Authored by Eric Anholt on Nov 30, 2015

Since we have no MMU, the kernel needs to validate that the submitted
shader code won't make any accesses to memory that the user doesn't
control, which involves banning some operations (general purpose DMA
writes), and tracking where we need to write out pointers for other
operations (texture sampling).  Once it's validated, we return a GEM
BO containing the shader, which doesn't allow mapping for write or
exporting to other subsystems.

v2: Use __u32-style types.

Signed-off-by: Eric Anholt <eric@anholt.net>

drm/vc4: Add create and map BO ioctls. · d5bc60f6

Authored by Eric Anholt on Jan 18, 2015

While there exist dumb APIs for creating and mapping BOs, one of the
rules is that drivers doing 3D acceleration have to provide their own
APIs for buffer allocation (besides, the pitch/height parameters of
the dumb alloc don't really make sense for a lot of 3D allocations).

v2: Use __u32-style types, use "drm.h" instead of <drm/drm.h>.

Signed-off-by: Eric Anholt <eric@anholt.net>

drm/vc4: Add a BO cache. · c826a6e1

Authored by Eric Anholt on Oct 09, 2015

We need to allocate new BOs in the kernel as part of each frame, but
the CMA allocator is way too slow for that.  As an optimization, keep
track of recently-freed BOs and reuse them, with a 1 second timeout to
fully free them back to the system.

This improves 3D performance by about 15%.

Signed-off-by: Eric Anholt <eric@anholt.net>
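A minimal sketch of a free-buffer cache with a one second expiry, as described above. It is not the driver's data structure: the fixed-size slot array, the integer buffer ids, the exact-size matching and the explicit `now_ms` clock parameter are all assumptions made to keep the example self-contained.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CACHE_SLOTS 8
#define DEMO_CACHE_TIMEOUT_MS 1000 /* the 1 second timeout from the message */

/* Hypothetical cache entry; id stands in for a real buffer object. */
struct demo_cached_bo {
	int id;
	size_t size;
	uint64_t freed_at_ms;
	bool used;
};

struct demo_bo_cache {
	struct demo_cached_bo slots[DEMO_CACHE_SLOTS];
};

/* Called instead of freeing: park the buffer. Returns false if the cache is full. */
static bool demo_cache_put(struct demo_bo_cache *c, int id, size_t size, uint64_t now_ms)
{
	for (size_t i = 0; i < DEMO_CACHE_SLOTS; i++) {
		if (!c->slots[i].used) {
			c->slots[i] = (struct demo_cached_bo){ id, size, now_ms, true };
			return true;
		}
	}
	return false;
}

/* Called before allocating: reuse a parked buffer of the same size, or return -1. */
static int demo_cache_get(struct demo_bo_cache *c, size_t size)
{
	for (size_t i = 0; i < DEMO_CACHE_SLOTS; i++) {
		if (c->slots[i].used && c->slots[i].size == size) {
			c->slots[i].used = false;
			return c->slots[i].id;
		}
	}
	return -1;
}

/* Drop entries parked for at least the timeout; returns how many were released. */
static size_t demo_cache_purge(struct demo_bo_cache *c, uint64_t now_ms)
{
	size_t released = 0;

	for (size_t i = 0; i < DEMO_CACHE_SLOTS; i++) {
		if (c->slots[i].used &&
		    now_ms - c->slots[i].freed_at_ms >= DEMO_CACHE_TIMEOUT_MS) {
			c->slots[i].used = false;
			released++;
		}
	}
	return released;
}
```

A caller would fall back to the slow allocator only when `demo_cache_get()` returns -1, and run `demo_cache_purge()` periodically so idle memory returns to the system.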

21 Oct, 2015 2 commits

drm/vc4: Use the fbdev_cma helpers · 48666d56

Authored by Derek Foreman on Jul 02, 2015

Keep the fbdev_cma pointer around so we can use it on hotplug and close
to ensure the frame buffer console is in a useful state.

Signed-off-by: Derek Foreman <derekf@osg.samsung.com>
Signed-off-by: Eric Anholt <eric@anholt.net>

drm/vc4: Add KMS support for Raspberry Pi. · c8b75bca

Authored by Eric Anholt on Mar 02, 2015

This is enough for fbcon and bringing up X using
xf86-video-modesetting.  It doesn't support the 3D accelerator or
power management yet.

v2: Drop FB_HELPER select thanks to Archit's patches.  Do manual init
    ordering instead of using the .load hook.  Structure registration
    more like tegra's, but still using the typical "component" code.
    Drop no-op hooks for atomic_begin and mode_fixup() now that
    they're optional.  Drop sentinel in Makefile.  Fix minor style
    nits I noticed on another reread.

v3: Use the new bcm2835 clk driver to manage pixel/HSM clocks instead
    of having a fixed video mode.  Use exynos-style component driver
    matching instead of devicetree nodes to list the component driver
    instances.  Rename compatibility strings to say bcm2835, and
    distinguish pv0/1/2.  Clean up some h/vsync code, and add in
    interlaced mode setup.  Fix up probe/bind error paths.  Use
    bitops.h macros for vc4_regs.h

v4: Include i2c.h, allow building under COMPILE_TEST, drop msleep now
    that other bugs have been fixed, add timeouts to cpu_relax()
    loops, rename hpd-gpio to hpd-gpios.

Signed-off-by: Eric Anholt <eric@anholt.net>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
