• J
    libobs: Full-screen triangle format conversions · aa22b61e
    James Park 提交于
    The cache coherency of rasterization for full-screen passes is better
    using an oversized triangle that is clipped rather than two triangles.
    Traversal order of rasterization is GPU-specific, but will almost
    certainly be better using an undivided primitive.
    
    A smaller benefit is that quads along the diagonal are not evaluated
    multiple times, but that's minor in comparison.
    
    Redo format shaders to bypass vertex buffer, and input layout. Add
    global shader bool "obs_glsl_compile" to make API-specific decisions,
    i.e. handle upside-down UVs. gl_ortho is not needed for format
    conversion because the vertex shader does not use ViewProj anymore.
    
    This can be applied to more situations, but start small first.
    
    Testbed full screen passes, Intel HD Graphics 530:
    RGBA -> UYVX: 467 -> 439 us, ~6% savings
    UYVX -> uv: 295 -> 239 us, ~19% savings
    aa22b61e
obs-video.c 24.6 KB