1. 22 May, 2019 5 commits
  2. 20 Apr, 2019 3 commits
  3. 17 Apr, 2019 1 commit
  4. 13 Apr, 2019 2 commits
    • R
      drm/panfrost: Add initial panfrost driver · f3ba9122
      Rob Herring committed on
      This adds the initial driver for panfrost which supports Arm Mali
      Midgard and Bifrost family of GPUs. Currently, only the T860 and
      T760 Midgard GPUs have been tested.
      
      v2:
      - Add GPU reset on job hangs (Tomeu)
      - Add RuntimePM and devfreq support (Tomeu)
      - Fix T760 support (Tomeu)
      - Add a TODO file (Rob, Tomeu)
      - Support multiple in fences (Tomeu)
      - Drop support for shared fences (Tomeu)
      - Fill in MMU de-init (Rob)
      - Move register definitions back to single header (Rob)
      - Clean-up hardcoded job submit todos (Rob)
      - Implement feature setup based on features/issues (Rob)
      - Add remaining Midgard DT compatible strings (Rob)
      
      v3:
      - Add support for reset lines (Neil)
      - Add a MAINTAINERS entry (Rob)
      - Call dma_set_mask_and_coherent (Rob)
      - Do MMU invalidate on map and unmap. Restructure to do a single
        operation per map/unmap call. (Rob)
      - Add a missing explicit padding to struct drm_panfrost_create_bo (Rob)
      - Fix 0-day error: "panfrost_devfreq.c:151:9-16: ERROR: PTR_ERR applied after initialization to constant on line 150"
      - Drop HW_FEATURE_AARCH64_MMU conditional (Rob)
      - s/DRM_PANFROST_PARAM_GPU_ID/DRM_PANFROST_PARAM_GPU_PROD_ID/ (Rob)
      - Check drm_gem_shmem_prime_import_sg_table() error code (Rob)
      - Re-order power on sequence (Rob)
      - Move panfrost_acquire_object_fences() before scheduling job (Rob)
      - Add NULL checks on array pointers in job clean-up (Rob)
      - Rework devfreq (Tomeu)
      - Fix devfreq init with no regulator (Rob)
      - Various WS and comments clean-up (Rob)
      
      Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Cc: Maxime Ripard <maxime.ripard@bootlin.com>
      Cc: Sean Paul <sean@poorly.run>
      Cc: David Airlie <airlied@linux.ie>
      Cc: Daniel Vetter <daniel@ffwll.ch>
      Cc: Lyude Paul <lyude@redhat.com>
      Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
      Reviewed-by: Eric Anholt <eric@anholt.net>
      Reviewed-by: Steven Price <steven.price@arm.com>
      Signed-off-by: Marty E. Plummer <hanetzer@startmail.com>
      Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
      Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
      Signed-off-by: Rob Herring <robh@kernel.org>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190409205427.6943-4-robh@kernel.org
      f3ba9122
    • C
      drm/amdgpu: add timeline support in amdgpu CS v3 · 2624dd15
      Chunming Zhou committed on
      Syncobj wait/signal operations are appended to command submission.
      v2: separate to two kinds in/out_deps functions
      v3: fix checking for timeline syncobj
      Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
      Cc: Tobias Hector <Tobias.Hector@amd.com>
      Cc: Jason Ekstrand <jason@jlekstrand.net>
      Cc: Dave Airlie <airlied@redhat.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
      Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      2624dd15
  5. 04 Apr, 2019 1 commit
  6. 02 Apr, 2019 1 commit
    • Q
      drm/lima: driver for ARM Mali4xx GPUs · a1d2a633
      Qiang Yu committed on
      - Mali 4xx GPUs have two kinds of processors, GP and PP. GP is for
        OpenGL vertex shader processing and PP is for fragment shader
        processing. Each processor has its own MMU, so the processors work
        in a virtual address space.
      - There's only one GP but multiple PPs (max 4 for Mali 400 and 8
        for Mali 450) in the same Mali 4xx GPU. All PPs are grouped
        together to handle a single fragment shader task divided by
        FB output tiled pixels. The Mali 400 user space driver is
        responsible for assigning target tiled pixels to each PP, but Mali
        450 has a HW module called DLBU to dynamically balance each
        PP's load.
      - The user space driver allocates a buffer object and maps it into
        the GPU virtual address space, uploads the command stream and draw
        data through a CPU mmap of the buffer object, then submits a task
        to GP/PP with a register frame indicating where the command stream
        is, plus misc settings.
      - There's no command stream validation/relocation because each user
        process has its own GPU virtual address space. The GP/PP MMU
        switches virtual address space between tasks from different
        user processes. Erroneous or malicious user space code just gets
        an MMU fault or GP/PP error IRQ, after which the HW/SW is recovered.
      - Use GEM+shmem for MM. Currently memory is just allocated and pinned
        at GEM object creation. The GPU VM map of the buffer is also done
        in the alloc stage in kernel space. We may delay the memory
        allocation and real GPU VM map to the command submission stage in
        the future as an improvement.
      - Use drm_sched for GPU task scheduling. Each OpenGL context should
        have a lima context object in the kernel to distinguish tasks
        from different users. drm_sched gets tasks from each lima context
        in a fair way.
      
      Until it is upstreamed, the Mesa driver can be found here:
      https://gitlab.freedesktop.org/lima/mesa
      
      v8:
      - add comments for in_sync
      - fix ctx free miss mutex unlock
      
      v7:
      - remove lima_fence_ops with default value
      - move fence slab create to device probe
      - check pad ioctl args to be zero
      - add comments for user/kernel interface
      
      v6:
      - fix comments by checkpatch.pl
      
      v5:
      - export gp/pp version to userspace
      - rebase on drm-misc-next
      
      v4:
      - use get param interface to get info
      - separate context create/free ioctl
      - remove unused max sched task param
      - update copyright time
      - use xarray instead of idr
      - stop using drmP.h
      
      v3:
      - fix comments from kbuild robot
      - restrict supported arch to tested ones
      
      v2:
      - fix syscall argument check
      - fix job finish fence leak since kernel 5.0
      - use drm syncobj to replace native fence
      - move buffer object GPU va map into kernel
      - reserve syscall argument space for future info
      - remove kernel gem modifier
      - switch TTM back to GEM+shmem MM
      - use time based io poll
      - use whole register name
      - adopt gem reservation obj integration
      - use drm_timeout_abs_to_jiffies
      
      Cc: Eric Anholt <eric@anholt.net>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Christian König <ckoenig.leichtzumerken@gmail.com>
      Cc: Daniel Vetter <daniel@ffwll.ch>
      Cc: Alex Deucher <alexdeucher@gmail.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Rob Clark <robdclark@gmail.com>
      Cc: Dave Airlie <airlied@gmail.com>
      Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
      Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
      Signed-off-by: Heiko Stuebner <heiko@sntech.de>
      Signed-off-by: Marek Vasut <marex@denx.de>
      Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
      Signed-off-by: Simon Shields <simon@lineageos.org>
      Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
      Signed-off-by: Qiang Yu <yuq825@gmail.com>
      Reviewed-by: Eric Anholt <eric@anholt.net>
      Reviewed-by: Rob Herring <robh@kerrnel.org>
      Signed-off-by: Eric Anholt <eric@anholt.net>
      Link: https://patchwork.freedesktop.org/patch/291200/
      a1d2a633
  7. 01 Apr, 2019 4 commits
  8. 27 Mar, 2019 2 commits
  9. 22 Mar, 2019 4 commits
    • C
      drm/i915: Allow contexts to share a single timeline across all engines · ea593dbb
      Chris Wilson committed on
      Previously, our view has been always to run the engines independently
      within a context. (Multiple engines happened before we had contexts and
      timelines, so they always operated independently and that behaviour
      persisted into contexts.) However, at the user level the context often
      represents a single timeline (e.g. GL contexts) and userspace must
      ensure that the individual engines are serialised to present that
      ordering to the client (or forget about this detail entirely and hope no
      one notices - a fair ploy if the client can only directly control one
      engine themselves ;)
      
      In the next patch, we will want to construct a set of engines that
      operate as one, that have a single timeline interwoven between them, to
      present a single virtual engine to the user. (They submit to the virtual
      engine, then we decide which engine to execute on based.)
      
      To that end, we want to be able to create contexts which have a single
      timeline (fence context) shared between all engines, rather than multiple
      timelines.
      
      v2: Move the specialised timeline ordering to its own function.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190322092325.5883-4-chris@chris-wilson.co.uk
      ea593dbb
    • C
      drm/i915: Extend CONTEXT_CREATE to set parameters upon construction · b9171541
      Chris Wilson committed on
      It can be useful to have a single ioctl to create a context with all
      the initial parameters instead of a series of create + setparam + setparam
      ioctls. This extension to create context allows any of the parameters
      to be passed in as a linked list to be applied to the newly constructed
      context.
      
      v2: Make a local copy of user setparam (Tvrtko)
      v3: Use flags to detect availability of extension interface
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190322092325.5883-3-chris@chris-wilson.co.uk
      b9171541
    • C
      drm/i915: Create/destroy VM (ppGTT) for use with contexts · e0695db7
      Chris Wilson committed on
      In preparation to making the ppGTT binding for a context explicit (to
      facilitate reusing the same ppGTT between different contexts), allow the
      user to create and destroy named ppGTT.
      
      v2: Replace global barrier for swapping over the ppgtt and tlbs with a
      local context barrier (Tvrtko)
      v3: serialise with struct_mutex; it's lazy but required dammit
      v4: Rewrite igt_ctx_shared_exec to be more different (aimed to be more
      similarly, turned out different!)
      
      v5: Fix up test unwind for aliasing-ppgtt (snb)
      v6: Tighten language for uapi struct drm_i915_gem_vm_control.
      v7: Patch the context image for runtime ppgtt switching!
      
      Testcase: igt/gem_vm_create
      Testcase: igt/gem_ctx_param/vm
      Testcase: igt/gem_ctx_clone/vm
      Testcase: igt/gem_ctx_shared
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190322092325.5883-2-chris@chris-wilson.co.uk
      e0695db7
    • C
      drm/i915: Introduce the i915_user_extension_method · 9d1305ef
      Chris Wilson committed on
      An idea for extending uABI inspired by Vulkan's extension chains.
      Instead of expanding the data struct for each ioctl every time we need
      to add a new feature, define an extension chain instead. As we add
      optional interfaces to control the ioctl, we define a new extension
      struct that can be linked into the ioctl data only when required by the
      user. The key advantage being able to ignore large control structs for
      optional interfaces/extensions, while being able to process them in a
      consistent manner.
      
      In comparison to other extensible ioctls, the key difference is the
      use of a linked chain of extension structs vs an array of tagged
      pointers. For example,
      
      struct drm_amdgpu_cs_chunk {
              __u32           chunk_id;
              __u32           length_dw;
              __u64           chunk_data;
      };
      
      struct drm_amdgpu_cs_in {
              __u32           ctx_id;
              __u32           bo_list_handle;
              __u32           num_chunks;
              __u32           _pad;
              __u64           chunks;
      };
      
      allows userspace to pass in an array of pointers to extension structs, but
      must therefore keep constructing that array alongside the command stream.
      In dynamic situations like that, a linked list is preferred and does not
      suffer from extra cache line misses as the extension structs themselves
      must still be loaded separately from the chunks array.
      
      v2: Apply the tail call optimisation directly to nip the worry of stack
      overflow in the bud.
      v3: Defend against recursion.
      v4: Fixup local types to match new uabi
      
      Opens:
      - do we include the result as an out-field in each chain?
      struct i915_user_extension {
      	__u64 next_extension;
      	__u64 name;
      	__s32 result;
      	__u32 mbz; /* reserved for future use */
      };
      * Undecided, so provision some room for future expansion.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190322092325.5883-1-chris@chris-wilson.co.uk
      9d1305ef
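The extension-chain idea in the commit above can be sketched in plain C. This is a minimal userspace illustration, not the i915 implementation: the struct mirrors the layout quoted under "Opens:" (which the message itself marks as undecided), while `walk_extensions`, `ext_handler_fn`, `count_ext`, and the `MAX_CHAIN` cap are hypothetical names introduced here. A real kernel would copy each link from user memory rather than dereference it directly.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the struct quoted in the commit message; not the final uAPI. */
struct user_extension {
	uint64_t next_extension; /* address of the next link, 0 terminates */
	uint64_t name;           /* index into the handler table */
	int32_t  result;
	uint32_t mbz;            /* reserved, must be zero */
};

typedef int (*ext_handler_fn)(struct user_extension *ext, void *data);

#define MAX_CHAIN 512u /* hypothetical cap so a looping chain terminates */

/* Walk the chain iteratively, dispatching each link by its name. */
static int walk_extensions(uint64_t head, const ext_handler_fn *tbl,
			   size_t count, void *data)
{
	unsigned int budget = MAX_CHAIN;

	assert(tbl != NULL);
	while (head) {
		struct user_extension *ext =
			(struct user_extension *)(uintptr_t)head;
		int err;

		if (budget-- == 0)
			return -E2BIG;  /* chain too long or cyclic */
		if (ext->mbz)
			return -EINVAL; /* reserved field must stay zero */
		if (ext->name >= count || !tbl[ext->name])
			return -EINVAL; /* unknown extension */

		err = tbl[ext->name](ext, data);
		if (err)
			return err;
		head = ext->next_extension;
	}
	return 0;
}

/* Example handler: counts how many links were visited. */
static int count_ext(struct user_extension *ext, void *data)
{
	(void)ext;
	(*(int *)data)++;
	return 0;
}
```

The loop form reflects the v2/v3 notes (no recursion, bounded work); the specific bound and error codes are assumptions of this sketch.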
  10. 21 Mar, 2019 1 commit
    • M
      drm/fourcc: Fix conflicting Y41x definitions · ff01e697
      Maarten Lankhorst committed on
      There has unfortunately been a conflict with the following 3 commits:
      
      commit e9961ab9
      Author: Ayan Kumar Halder <ayan.halder@arm.com>
      Date:   Fri Nov 9 17:21:12 2018 +0000
          drm: Added a new format DRM_FORMAT_XVYU2101010
      
      commit 7ba0fee2
      Author: Brian Starkey <brian.starkey@arm.com>
      Date:   Fri Oct 5 10:27:00 2018 +0100
      
          drm/fourcc: Add AFBC yuv fourccs for Mali
      
      and
      
      commit 50bf5d7d
      Author: Swati Sharma <swati2.sharma@intel.com>
      Date:   Mon Mar 4 17:26:33 2019 +0530
      
          drm: Add Y2xx and Y4xx (xx:10/12/16) format definitions and fourcc
      
      Unfortunately gcc didn't warn about the redefinitions, because the
      double defines were set to the same value, and gcc apparently no longer
      warns about that.
      
      Fix this by using new XYVU for i915, without alpha, and making the
      Y41x definitions match msdn, with alpha.
      
      Fortunately we caught it early, and the conflict hasn't even landed in
      drm-next yet.
      Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Cc: Brian Starkey <Brian.Starkey@arm.com>
      Cc: Swati Sharma <swati2.sharma@intel.com>
      Cc: Ayan Kumar Halder <ayan.halder@arm.com>
      Cc: malidp@foss.arm.com
      Cc: Daniel Vetter <daniel@ffwll.ch>
      Cc: Maxime Ripard <maxime.ripard@bootlin.com>
      Cc: Sean Paul <sean@poorly.run>
      Cc: Dave Airlie <airlied@linux.ie>
      Cc: Liviu Dudau <Liviu.Dudau@arm.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190319121702.6814-1-maarten.lankhorst@linux.intel.com
      Acked-by: Jani Nikula <jani.nikula@intel.com> #irc
      Acked-by: Sean Paul <sean@poorly.run>
      Reviewed-by: Ayan Kumar halder <ayan.halder@arm.com>
      ff01e697
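Why the compiler stayed silent can be shown with a small sketch. C permits a macro to be redefined when the replacement is token-for-token identical, so two patches adding the same definition merge without a diagnostic. The `fourcc_code()` construction below follows the macro in drm_fourcc.h; the `EXAMPLE_` names and the `example_y410()` helper are hypothetical, chosen so the sketch does not claim to reproduce the header.

```c
#include <assert.h>
#include <stdint.h>

/* Same construction as fourcc_code() in drm_fourcc.h. */
#define fourcc_code(a, b, c, d) \
	((uint32_t)(a) | ((uint32_t)(b) << 8) | \
	 ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* Two patches adding the same name with identical tokens: accepted quietly. */
#define EXAMPLE_FORMAT_Y410 fourcc_code('Y', '4', '1', '0')
#define EXAMPLE_FORMAT_Y410 fourcc_code('Y', '4', '1', '0')

/* A second, distinct format code for the collision check below. */
#define EXAMPLE_FORMAT_XVYU2101010 fourcc_code('X', 'V', '3', '0')

/* A compile-time guard that two different names never share one code. */
static_assert(EXAMPLE_FORMAT_Y410 != EXAMPLE_FORMAT_XVYU2101010,
	      "two formats must not share a fourcc code");

static uint32_t example_y410(void)
{
	return EXAMPLE_FORMAT_Y410;
}
```

A guard like the `static_assert` catches two names mapping to one value, which is a different failure from the duplicate definitions described above; it is shown only as one possible safety net.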
  11. 20 Mar, 2019 3 commits
  12. 13 Mar, 2019 3 commits
    • K
      drm/fourcc: Add 64 bpp half float formats · 88ab9c76
      Kevin Strasser committed on
      Add 64 bpp 16:16:16:16 half float pixel formats. Each 16 bit component is
      formatted in IEEE-754 half-precision float (binary16) 1:5:10
      MSb-sign:exponent:fraction form.
      
      This patch attempts to address the feedback provided when 2 of these
      formats were previously proposed:
        https://patchwork.kernel.org/patch/10072545/
      
      v2:
      - Fixed cpp (Ville)
      - Added detail pixel formatting (Ville)
      - Ordered formats in header (Ville)
      
      v5:
      - .depth should be 0 for new formats (Maarten)
      
      Cc: Tina Zhang <tina.zhang@intel.com>
      Cc: Uma Shankar <uma.shankar@intel.com>
      Cc: Shashank Sharma <shashank.sharma@intel.com>
      Cc: David Airlie <airlied@linux.ie>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: dri-devel@lists.freedesktop.org
      Signed-off-by: Kevin Strasser <kevin.strasser@intel.com>
      Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Reviewed-by: Adam Jackson <ajax@redhat.com>
      Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/1552437513-22648-2-git-send-email-kevin.strasser@intel.com
      88ab9c76
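The 1:5:10 sign:exponent:fraction layout mentioned in the commit above can be decoded as follows. This is a minimal sketch of IEEE-754 binary16 to binary32 conversion, assuming `float` is a 32-bit IEEE-754 type; `half_to_float` is a name introduced here and is not part of the DRM headers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Convert one binary16 component (1 sign, 5 exponent, 10 fraction bits). */
static float half_to_float(uint16_t h)
{
	uint32_t sign = (uint32_t)(h >> 15) << 31;
	uint32_t exp  = (h >> 10) & 0x1fu;
	uint32_t frac = h & 0x3ffu;
	uint32_t bits;
	float f;

	if (exp == 0x1fu) {
		/* Infinity or NaN: all-ones exponent carries over. */
		bits = sign | 0x7f800000u | (frac << 13);
	} else if (exp != 0) {
		/* Normal number: rebias exponent from 15 to 127. */
		bits = sign | ((exp + 112u) << 23) | (frac << 13);
	} else if (frac == 0) {
		/* Signed zero. */
		bits = sign;
	} else {
		/* Subnormal: shift until the implicit leading 1 appears. */
		uint32_t shift = 0;

		while (!(frac & 0x400u)) {
			frac <<= 1;
			shift++;
		}
		frac &= 0x3ffu;
		bits = sign | ((113u - shift) << 23) | (frac << 13);
	}

	memcpy(&f, &bits, sizeof(f));
	return f;
}
```

For example, `half_to_float(0x3c00)` yields 1.0f and `half_to_float(0x7bff)` yields 65504.0f, the largest finite half value.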
    • A
      drm: Added a new format DRM_FORMAT_XVYU2101010 · e9961ab9
      Ayan Kumar Halder committed on
      This new format is supported by DP550 and DP650
      
      Changes since v3 (series):
      - Added the ack
      - Rebased on the latest drm-misc-next
      Signed-off-by: Ayan Kumar halder <ayan.halder@arm.com>
      Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
      Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
      Link: https://patchwork.freedesktop.org/patch/291758/?series=57895&rev=1
      e9961ab9
    • B
      drm/fourcc: Add AFBC yuv fourccs for Mali · 7ba0fee2
      Brian Starkey committed on
      As we look to enable AFBC using DRM format modifiers, we run into
      problems which we've historically handled via vendor-private details
      (i.e. gralloc, on Android).
      
      AFBC (as an encoding) is fully flexible, and for example YUV data can
      be encoded into 1, 2 or 3 encoded "planes", much like the linear
      equivalents. Component order is also meaningful, as AFBC doesn't
      necessarily care about what each "channel" of the data it encodes
      contains. Therefore ABGR8888 and RGBA8888 can be encoded in AFBC with
      different representations. Similarly, 'X' components may be encoded
      into AFBC streams in cases where a decoder expects to decode a 4th
      component.
      
      In addition, AFBC is a licensable IP, meaning that to support the
      ecosystem we need to ensure that _all_ AFBC users are able to describe
      the encodings that they need. This is much better achieved by
      preserving meaning in the fourcc codes when they are combined with an
      AFBC modifier.
      
      In essence, we want to use the modifier to describe the parameters of
      the AFBC encode/decode, and use the fourcc code to describe the data
      being encoded/decoded.
      
      To do anything different would be to introduce redundancy - we would
      need to duplicate in the modifier information which is _already_
      conveyed clearly and unambiguously by a fourcc code.
      
      I hope that for RGB this is non-controversial.
      (BGRA8888 + MODIFIER_AFBC) is a different format from
      (RGBA8888 + MODIFIER_AFBC).
      
      Possibly more controversial is that (XBGR8888 + MODIFIER_AFBC)
      is different from (BGR888 + MODIFIER_AFBC). I understand that in some
      schemes it is not the case - but in AFBC it is so.
      
      Where we run into problems is where there are not already fourcc codes
      which represent the data which the AFBC encoder/decoder is processing.
      To that end, we want to introduce new fourcc codes to describe the
      data being encoded/decoded, in the places where none of the existing
      fourcc codes are applicable.
      
      Where we don't support an equivalent non-compressed layout, or where
      no "obvious" linear layout exists, we are proposing adding fourcc
      codes which have no associated linear layout - because any layout we
      proposed would be completely arbitrary.
      
      Some formats are following the naming conventions from [2].
      
      The summary of the new formats is:
       DRM_FORMAT_VUY888 - Packed 8-bit YUV 444. Y followed by U then V.
       DRM_FORMAT_VUY101010 - Packed 10-bit YUV 444. Y followed by U then
                              V. No defined linear encoding.
       DRM_FORMAT_Y210 - Packed 10-bit YUV 422. Y followed by U (then Y)
                         then V. 10-bit samples in 16-bit words.
       DRM_FORMAT_Y410 - Packed 10-bit YUV 444, with 2-bit alpha.
       DRM_FORMAT_P210 - Semi-planar 10-bit YUV 422. Y plane, followed by
                         interleaved U-then-V plane. 10-bit samples in
                         16-bit words.
       DRM_FORMAT_YUV420_8BIT - Packed 8-bit YUV 420. Y followed by U then
                                V. No defined linear encoding
       DRM_FORMAT_YUV420_10BIT - Packed 10-bit YUV 420. Y followed by U
                                 then V. No defined linear encoding
      
      Please also note that in the absence of AFBC, we would still need to
      add Y410, Y210 and P210.
      
      Full rationale follows:
      
      YUV 444 8-bit, 1-plane
      ----------------------
       The currently defined AYUV format encodes a 4th alpha component,
       which makes it unsuitable for representing a 3-component YUV 444
       AFBC stream.
      
       The proposed[1] XYUV format which is supported by Mali-DP in linear
       layout is also unsuitable, because the component order is the
       opposite of the AFBC version, and it encodes a 4th 'X' component.
      
       DRM_FORMAT_VUY888 is the "obvious" format for a 3-component, packed,
       YUV 444 8-bit format, with the component order which our HW expects to
       encode/decode. It conforms to the same naming convention as the
       existing packed YUV 444 format.
       The naming here is meant to be consistent with DRM_FORMAT_AYUV and
       DRM_FORMAT_XYUV[1]
      
      YUV 444 10-bit, 1-plane
      -----------------------
       There is no currently-defined YUV 444 10-bit format in
       drm_fourcc.h, irrespective of number of planes.
      
       The proposed[1] XVYU2101010 format which is supported by Mali-DP in
       linear layout uses the wrong component order, and also encodes a 4th
       'X' component, which doesn't match the AFBC version of YUV 444
       10-bit which we support.
      
       DRM_FORMAT_Y410 is the same layout as XVYU2101010, but with 2 bits of
       alpha.  This format is supported with linear layout by Mali GPUs. The
       naming follows[2].
      
       There is no "obvious" linear encoding for a 3-component 10:10:10
       packed format, and so DRM_FORMAT_VUY101010 defines a component
       order, but not a bit encoding. Again, the naming is meant to be
       consistent with DRM_FORMAT_AYUV.
      
      YUV 422 8-bit, 1-plane
      ----------------------
       The existing DRM_FORMAT_YUYV (and the other component orders) are
       single-planar YUV 422 8-bit formats. Following the convention of
       the component orders of the RGB formats, YUYV has the correct
       component order for our AFBC encoding (Y followed by U followed by
       V). We can use YUYV for AFBC YUV 422 8-bit.
      
      YUV 422 10-bit, 1-plane
      -----------------------
       There is no currently-defined YUV 422 10-bit format in drm_fourcc.h
      
       DRM_FORMAT_Y210 is analogous to YUYV, but with 10-bits per sample
       packed into the upper 10-bits of 16-bit samples. This format is
       supported in both linear and AFBC by Mali GPUs.
      
      YUV 422 10-bit, 2-plane
      -----------------------
       The recently defined DRM_FORMAT_P010 format is a 10-bit semi-planar
       YUV 420 format, which has the correct component ordering for an AFBC
       2-plane YUV 420 buffer. The linear layout contains meaningless padding
       bits, which will not be encoded in an AFBC stream.
      
      YUV 420 8-bit, 1-plane
      ----------------------
       There is no currently defined single-planar YUV 420, 8-bit format
       in drm_fourcc.h. There's differing opinions on whether using the
       existing fourcc-implied n_planes where possible is a good idea or
       not when using modifiers.
      
       For me, it's much more "obvious" to use NV12 for 2-plane AFBC and
       YUV420 for 3-plane AFBC. This keeps the aforementioned separation
       between the AFBC codec settings (in the modifier) and the pixel data
       format (in the fourcc). With different vendors using AFBC, this helps
       to ensure that there is no confusion in interoperation. It also
       ensures that the AFBC modifiers describe AFBC itself (which is a
       licensable component), and not implementation details which are not
       defined by AFBC.
      
       The proposed[1] X0L0 format which Mali-DP supports with Linear layout
       is unsuitable, as it contains a 4th 'X' component, and our AFBC
       decoder expects only 3 components.
      
       To that end, we propose a new YUV 420 8-bit format. There is no
       "obvious" linear encoding for a 3-component 8:8:8, 420, packed format,
       and so DRM_FORMAT_YUV420_8BIT defines a component order, but not a
       bit encoding. I'm happy to hear different naming suggestions.
      
      YUV 420 8-bit, 2-, 3-plane
      --------------------------
       These already exist, we can use NV12 and YUV420.
      
      YUV 420 10-bit, 1-plane
      -----------------------
       As above, no current definition exists, and X0L2 encodes a 4th 'X'
       channel.
      
       Analogous to DRM_FORMAT_YUV420_8BIT, we define DRM_FORMAT_YUV420_10BIT.
      
      [1] https://lists.freedesktop.org/archives/dri-devel/2018-July/184598.html
      [2] https://docs.microsoft.com/en-us/windows/desktop/medfound/10-bit-and-16-bit-yuv-video-formats
      
      Changes since RFC v1:
       - Fix confusing subsampling vs bit-depth X:X:X notation in
         descriptions (danvet)
       - Rename DRM_FORMAT_AVYU1101010 to DRM_FORMAT_Y410 (Lisa Wu)
       - Add drm_format_info structures for the new formats, using the
         new 'bpp' field for those with non-integer bytes-per-pixel
       - Rebase, including Juha-Pekka Heikkila's format definitions
      
      Changes since RFC v2:
      - Rebase on top of latest changes in drm-misc-next
      - Change the description of DRM_FORMAT_P210 in __drm_format_info and
      drm_fourcc.h so as to make it consistent with other DRM_FORMAT_PXXX
      formats.
      
      Changes since v3:
      - Added the ack
      - Rebased on the latest drm-misc-next
      Signed-off-by: Brian Starkey <brian.starkey@arm.com>
      Signed-off-by: Ayan Kumar Halder <ayan.halder@arm.com>
      Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
      Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
      Link: https://patchwork.freedesktop.org/patch/291759/?series=57895&rev=1
      7ba0fee2
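The Y210 description in the commit above ("10-bits per sample packed into the upper 10-bits of 16-bit samples") amounts to a shift by six. A minimal sketch, with `y210_pack_sample` and `y210_unpack_sample` as hypothetical helper names; the low six bits are written as zero here, which is an assumption of this example rather than something the message states.

```c
#include <assert.h>
#include <stdint.h>

/* Place a 10-bit sample in the upper 10 bits of a 16-bit word. */
static inline uint16_t y210_pack_sample(uint16_t v10)
{
	assert(v10 <= 0x3ffu); /* caller must pass a 10-bit value */
	return (uint16_t)(v10 << 6);
}

/* Recover the 10-bit sample, discarding the low padding bits. */
static inline uint16_t y210_unpack_sample(uint16_t word)
{
	return (uint16_t)(word >> 6);
}
```

Packing the maximum sample `0x3ff` gives the word `0xffc0`, and unpacking any word ignores whatever sits in its low six bits.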
  13. 06 Mar, 2019 1 commit
  14. 05 Mar, 2019 1 commit
  15. 02 Mar, 2019 2 commits
    • C
      drm/i915: Fix I915_EXEC_RING_MASK · d90c06d5
      Chris Wilson committed on
      This was supposed to be a mask of all known rings, but it is being used
      by execbuffer to filter out invalid rings, and so is instead mapping high
      unused values onto valid rings. Instead of a mask of all known rings,
      we need it to be the mask of all possible rings.
      
      Fixes: 549f7365 ("drm/i915: Enable SandyBridge blitter ring")
      Fixes: de1add36 ("drm/i915: Decouple execbuf uAPI from internal implementation")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: <stable@vger.kernel.org> # v4.6+
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190301140404.26690-21-chris@chris-wilson.co.uk
      d90c06d5
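The aliasing problem described in the commit above can be reproduced with a few lines of C. The mask widths, the ring count, and the `select_ring` helper are illustrative assumptions, not values taken from the i915 headers: a mask covering only the known rings folds out-of-range selectors onto valid ones, while a mask covering the whole selector field lets the range check reject them.

```c
#include <assert.h>
#include <stdint.h>

#define KNOWN_RING_MASK 0x7u  /* illustrative: covers only known ring ids */
#define FIELD_RING_MASK 0x3fu /* illustrative: covers every possible id */
#define NUM_RINGS       5u    /* hypothetical: ids 0..4 are valid */

static_assert(NUM_RINGS <= FIELD_RING_MASK + 1u, "ids must fit the field");

/* Return the selected ring id, or -1 if the selector is invalid. */
static int select_ring(uint64_t flags, uint64_t mask)
{
	uint64_t id = flags & mask;

	return id < NUM_RINGS ? (int)id : -1;
}
```

With the narrow mask, selector 9 is silently treated as ring 1 (`9 & 7 == 1`); with the wide mask it stays 9 and is rejected.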
    • C
      drm/i915: Use HW semaphores for inter-engine synchronisation on gen8+ · e8861964
      Chris Wilson committed on
      Having introduced per-context seqno, we now have a means to identify
      progress across the system without fear of rollback as befell the
      global_seqno. That is we can program a MI_SEMAPHORE_WAIT operation in
      advance of submission safe in the knowledge that our target seqno and
      address is stable.
      
      However, since we are telling the GPU to busy-spin on the target address
      until it matches the signaling seqno, we only want to do so when we are
      sure that busy-spin will be completed quickly. To achieve this we only
      submit the request to HW once the signaler is itself executing (modulo
      preemption causing us to wait longer), and we only do so for default and
      above priority requests (so that idle priority tasks never themselves
      hog the GPU waiting for others).
      
      As might be reasonably expected, HW semaphores excel in inter-engine
      synchronisation microbenchmarks (where the 3x reduced latency / increased
      throughput more than offset the power cost of spinning on a second ring)
      and have significant improvement (can be up to ~10%, most see no change)
      for single clients that utilize multiple engines (typically media players
      and transcoders), without regressing multiple clients that can saturate
      the system or changing the power envelope dramatically.
      
      v3: Drop the older NEQ branch, now we pin the signaler's HWSP anyway.
      v4: Tell the world and include it as part of scheduler caps.
      
      Testcase: igt/gem_exec_whisper
      Testcase: igt/benchmarks/gem_wsim
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190301170901.8340-3-chris@chris-wilson.co.uk
      e8861964
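As a rough CPU-side analogy for the busy-spin described in the commit above, the waiter polls a memory location until the signaler's seqno reaches the target. This is only a sketch using C11 atomics: the real wait is performed by the GPU via MI_SEMAPHORE_WAIT, and the wrap-safe "greater or equal" comparison, the spin budget, and the names `seqno_passed` and `spin_until_signaled` are assumptions of this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Wrap-safe check that 'current' has reached or passed 'target'. */
static bool seqno_passed(uint32_t current, uint32_t target)
{
	return (int32_t)(current - target) >= 0;
}

/*
 * Poll the signaler's status word a bounded number of times.
 * Returns true once the target seqno has been reached.
 */
static bool spin_until_signaled(_Atomic uint32_t *status, uint32_t target,
				unsigned int max_spins)
{
	assert(status != NULL);
	while (max_spins--) {
		uint32_t now = atomic_load_explicit(status,
						    memory_order_acquire);

		if (seqno_passed(now, target))
			return true;
	}
	return false; /* budget exhausted; caller should fall back */
}
```

The bounded budget stands in for the commit's concern that spinning is only worthwhile when the wait is expected to be short.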
  16. 20 Feb, 2019 2 commits
  17. 19 Feb, 2019 1 commit
  18. 18 Feb, 2019 1 commit
  19. 16 Feb, 2019 1 commit
  20. 09 Feb, 2019 1 commit