1. 01 6月, 2013 1 次提交
  2. 18 1月, 2013 2 次提交
  3. 18 12月, 2012 1 次提交
    • D
      drm/i915: Implement workaround for broken CS tlb on i830/845 · b45305fc
      Daniel Vetter 提交于
      Now that Chris Wilson demonstrated that the key for stability on early
      gen 2 is to simple _never_ exchange the physical backing storage of
      batch buffers I've tried a stab at a kernel solution. Doesn't look too
      nefarious imho, now that I don't try to be too clever for my own good
      any more.
      
      v2: After discussing the various techniques, we've decided to always blit
      batches on the suspect devices, but allow userspace to opt out of the
      kernel workaround assume full responsibility for providing coherent
      batches. The principal reason is that avoiding the blit does improve
      performance in a few key microbenchmarks and also in cairo-trace
      replays.
      Signed-Off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      [danvet:
      - Drop the hunk which uses HAS_BROKEN_CS_TLB to implement the ring
        wrap w/a. Suggested by Chris Wilson.
      - Also add the ACTHD check from Chris Wilson for the error state
        dumping, so that we still catch batches when userspace opts out of
        the w/a.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      b45305fc
  4. 05 10月, 2012 1 次提交
  5. 03 10月, 2012 1 次提交
  6. 26 9月, 2012 1 次提交
  7. 20 9月, 2012 1 次提交
  8. 17 8月, 2012 1 次提交
    • D
      drm/i915: implement dma buf begin_cpu_access (v2) · ec6f1bb9
      Dave Airlie 提交于
      In order for udl vmap to work properly, we need to push the object
      into the CPU domain before we start copying the data to the USB device.
      
      This along with the udl change avoids userspace explicit mapping to
      be used.
      
      v2: add a flag for userspace to query to know if Intel kernel driver can
      deal with the vmap flushing properly. In theory udl would need a flag also,
      but I intend to push the patches very close to each other and other drivers
      should do the right thing from the start.
      
      I've added a test to my intel-gpu-tools prime branch, however testing
      this is a bit messy since the only way to get udl to vmap is to rendering
      something. I've tested this with real code as well to make sure it works.
      Signed-off-by: NDave Airlie <airlied@redhat.com>
      [danvet: resolved conflict, which required reallocating the PARAM
      number to 21.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      ec6f1bb9
  9. 08 8月, 2012 1 次提交
    • C
      drm/i915: Add I915_GEM_PARAM_HAS_SEMAPHORES · 2fedbff9
      Chris Wilson 提交于
      Userspace tries to estimate the cost of ring switching based on whether
      the GPU and GEM supports semaphores. (If we have multiple rings and no
      semaphores, userspace assumes that the cost of switching rings between
      batches is exorbitant and will endeavour to keep the next batch on the
      active ring - as a coarse approximation to tracking both destination and
      source surfaces.) Currently userspace has to guess whether semaphores
      exist based on the chipset generation and the module parameter,
      i915.semaphores. This is a crude and inaccurate guess as the defaults
      internally depend upon other chipset features being enabled or disabled,
      nor does it extend well into the future. By exporting a HAS_SEMAPHORES
      parameter, we can easily query the driver and obtain an accurate answer.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      2fedbff9
  10. 26 7月, 2012 4 次提交
    • C
      drm/i915: Export ability of changing cache levels to userspace · e6994aee
      Chris Wilson 提交于
      By selecting the cache level (essentially whether or not the CPU snoops
      any updates to the bo, and on more recent machines whether it resides
      inside the CPU's last-level-cache) a userspace driver is able to then
      manage all of its memory within buffer objects, if it so desires. This
      enables the userspace driver to accelerate uploads and more importantly
      downloads from the GPU and to able to mix CPU and GPU rendering/activity
      efficiently.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      [danvet: Added code comment about where we plan to stuff platform
      specific cacheing control bits in the ioctl struct.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      e6994aee
    • C
      drm/i915: Return a mask of the active rings in the high word of busy_ioctl · e9808edd
      Chris Wilson 提交于
      The intention is to help select which engine to use for copies with
      interoperating clients - such as a GL client making a request to the X
      server to perform a SwapBuffers, which may require copying from the
      active GL back buffer to the X front buffer.
      
      We choose to report a mask of the active rings to future proof the
      interface against any changes which may allow for the object to reside
      upon multiple rings.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      [danvet: bikeshed away the write ring mask and add the explanation
      Chris sent in a follow-up mail why we decided to use masks.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      e9808edd
    • B
      drm/i915: add register read IOCTL · c0c7babc
      Ben Widawsky 提交于
      The interface's immediate purpose is to do synchronous timestamp queries
      as required by GL_TIMESTAMP. The GPU has a register for reading the
      timestamp but because that would normally require root access through
      libpciaccess, the IOCTL can provide this service instead.
      
      Currently the implementation whitelists only the render ring timestamp
      register, because that is the only thing we need to expose at this time.
      
      v2: make size implicit based on the register offset
      Add a generation check
      Reviewed-by: NEric Anholt <eric@anholt.net>
      Cc: Jacek Lawrynowicz <jacek.lawrynowicz@intel.com>
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      [danvet: fixup the ioctl numerb:]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      c0c7babc
    • D
      drm/i915: Reserve ioctl numbers for set/get_caching · 2b860db6
      Daniel Vetter 提交于
      I'm planing to merge this next week for 3.7, but I'd like to avoid
      stupid conflicts with the exsting userspace when merging the new
      reg_read ioctl (which doesn't have userspace yet, but this caching
      interface has).
      
      Header extracted from Chris Wilson's patch, but fix up the copy&pasted
      comment in the interface struct.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      2b860db6
  11. 14 6月, 2012 2 次提交
    • B
      drm/i915/context: switch contexts with execbuf2 · 6e0a69db
      Ben Widawsky 提交于
      Use the rsvd1 field in execbuf2 to specify the context ID associated
      with the workload. This will allow the driver to do the proper context
      switch when/if needed.
      
      v2: Add checks for context switches on rings not supporting contexts.
      Before the code would silently ignore such requests.
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      6e0a69db
    • B
      drm/i915/context: create & destroy ioctls · 84624813
      Ben Widawsky 提交于
      Add the interfaces to allow user space to create and destroy contexts.
      Contexts are destroyed automatically if the file descriptor for the dri
      device is closed.
      
      Following convention as usual here causes checkpatch warnings.
      
      v2: with is_initialized, no longer need to init at create
      drop the context switch on create (daniel)
      
      v3: Use interruptible lock (Chris)
      return -ENODEV in !GEM case (Chris)
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      84624813
  12. 06 6月, 2012 2 次提交
  13. 25 5月, 2012 1 次提交
    • B
      drm/i915: wait render timeout ioctl · 23ba4fd0
      Ben Widawsky 提交于
      This helps implement GL_ARB_sync but stops short of allowing full blown
      sync objects. Finally we can use the new timed seqno waiting function
      to allow userspace to wait on a buffer object with a timeout. This
      implements that interface.
      
      The IOCTL will take as input a buffer object handle, and a timeout in
      nanoseconds (flags is currently optional but will likely be used for
      permutations of flush operations). Users may specify 0 nanoseconds to
      instantly check.
      
      The wait ioctl with a timeout of 0 reimplements the busy ioctl. With any
      non-zero timeout parameter the wait ioctl will wait for the given number
      of nanoseconds on an object becoming unbusy. Since the wait itself does
      so holding struct_mutex the object may become re-busied before this
      completes. A similar but shorter race condition exists in the busy
      ioctl.
      
      v2: ETIME/ERESTARTSYS instead of changing to EBUSY, and EGAIN (Chris)
      Flush the object from the gpu write domain (Chris + Daniel)
      Fix leaked refcount in good case (Chris)
      Naturally align ioctl struct (Chris)
      
      v3: Drop lock after getting seqno to avoid ugly dance (Chris)
      
      v4: check for 0 timeout after olr check to allow polling (Chris)
      
      v5: Updated the comment. (Chris)
      
      v6: Return -ETIME instead of -EBUSY when timeout_ns is 0 (Daniel)
      Fix the commit message comment to be less ugly (Ben)
      Add a warning to check the return timespec (Ben)
      
      v7: Use DRM_AUTH for the ioctl. (Eugeni)
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      23ba4fd0
  14. 21 3月, 2012 1 次提交
  15. 18 1月, 2012 1 次提交
  16. 04 1月, 2012 2 次提交
    • E
      drm/i915: Add support for resetting the SO write pointers on gen7. · ae662d31
      Eric Anholt 提交于
      These registers are automatically incremented by the hardware during
      transform feedback to track where the next streamed vertex output
      should go.  Unlike the previous generation, which had a packet for
      setting the corresponding registers to a defined value, gen7 only has
      MI_LOAD_REGISTER_IMM to do so.  That's a secure packet (since it loads
      an arbitrary register), so we need to do it from the kernel, and it
      needs to be settable atomically with the batchbuffer execution so that
      two clients doing transform feedback don't stomp on each others'
      state.
      
      Instead of building a more complicated interface involcing setting the
      registers to a specific value, just set them to 0 when asked and
      userland can tweak its pointers accordingly.
      Signed-off-by: NEric Anholt <eric@anholt.net>
      Reviewed-by: NEugeni Dodonov <eugeni.dodonov@intel.com>
      Reviewed-by: NKenneth Graunke <kenneth@whitecape.org>
      Signed-off-by: NKeith Packard <keithp@keithp.com>
      ae662d31
    • J
      drm/i915: add color key support v4 · 8ea30864
      Jesse Barnes 提交于
      Add new ioctls for getting and setting the current destination color
      key.  This allows for simple overlay display control by matching a color
      key value in the primary plane before blending the overlay on top.
      
      v2: remove unnecessary mutex acquire/release around reg accesses
      v3: add support for full color key management
      v4: fix copy & paste bug in snb_get_colorkey
          don't bother checking min/max values against docs as the docs are likely
          wrong (how could we handle 10bpc surface formats?)
      Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      8ea30864
  17. 23 7月, 2011 1 次提交
  18. 02 3月, 2011 1 次提交
  19. 20 12月, 2010 1 次提交
  20. 05 12月, 2010 1 次提交
  21. 29 10月, 2010 1 次提交
    • C
      drm/i915: Only enforce fence limits inside the GTT. · a00b10c3
      Chris Wilson 提交于
      So long as we adhere to the fence registers rules for alignment and no
      overlaps (including with unfenced accesses to linear memory) and account
      for the tiled access in our size allocation, we do not have to allocate
      the full fenced region for the object. This allows us to fight the bloat
      tiling imposed on pre-i965 chipsets and frees up RAM for real use. [Inside
      the GTT we still suffer the additional alignment constraints, so it doesn't
      magic allow us to render larger scenes without stalls -- we need the
      expanded GTT and fence pipelining to overcome those...]
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      a00b10c3
  22. 22 10月, 2010 1 次提交
  23. 17 8月, 2010 1 次提交
    • D
      drm: block userspace under allocating buffer and having drivers overwrite it (v2) · 1b2f1489
      Dave Airlie 提交于
      With the current screwed but its ABI, ioctls for the drm, Linus pointed out that we could allow userspace to specify the allocation size, but we pass it to the driver which then uses it blindly to store a struct. Now if userspace specifies the allocation size as smaller than the driver needs, the driver can possibly overwrite memory.
      
      This patch restructures the driver ioctls so we store the structure size we are expecting, and make sure we allocate at least that size. The copy from/to userspace are still restricted to the size the user specifies, this allows ioctl structs to grow on both sides of the equation.
      
      Up until now we didn't really use the DRM_IOCTL defines in the kernel, so this cleans them up and adds them for nouveau.
      
      v2:
      fix nouveau pushbuf arg (thanks to Ben for pointing it out)
      Reported-by: NLinus Torvalds <torvalds@linuxfoundation.org>
      Signed-off-by: NDave Airlie <airlied@redhat.com>
      1b2f1489
  24. 03 8月, 2010 1 次提交
    • J
      x86 platform driver: intelligent power sharing driver · aa7ffc01
      Jesse Barnes 提交于
      Intel Core i3/5 platforms with integrated graphics support both CPU and
      GPU turbo mode.  CPU turbo mode is opportunistic: the CPU will use any
      available power to increase core frequencies if thermal headroom is
      available.  The GPU side is more manual however; the graphics driver
      must monitor GPU power and temperature and coordinate with a core
      thermal driver to take advantage of available thermal and power headroom
      in the package.
      
      The intelligent power sharing (IPS) driver is intended to coordinate
      this activity by monitoring MCP (multi-chip package) temperature and
      power, allowing the CPU and/or GPU to increase their power consumption,
      and thus performance, when possible.  The goal is to maximize
      performance within a given platform's TDP (thermal design point).
      Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: NMatthew Garrett <mjg@redhat.com>
      aa7ffc01
  25. 02 6月, 2010 1 次提交
  26. 27 5月, 2010 1 次提交
  27. 07 1月, 2010 1 次提交
  28. 04 12月, 2009 1 次提交
  29. 02 12月, 2009 2 次提交
  30. 06 11月, 2009 1 次提交
    • D
      drm/i915: implement drmmode overlay support v4 · 02e792fb
      Daniel Vetter 提交于
      This implements intel overlay support for kms via a device-specific
      ioctl. Thomas Hellstrom brought up the idea of a general ioctl (on
      dri-devel). We've reached the conclusion that such an infrastructure
      only makes sense when multiple kms overlay implementations exists,
      which atm don't (and it doesn't look like this is gonna change).
      
      Open issues:
      - Runs in sync with the gpu, i.e. unnecessary waiting. I've decided
        to wait on this because the hw tends to hang when changing something
        in this area. I left some dummy functions as infrastructure.
      - polyphase filtering uses a static table.
      - uses uninterruptible sleeps. Unfortunately the alternatives may
        unnecessarily wedged the hw if/when we timeout too early (and
        userspace only overloaded the batch buffers with stuff worth a few
        secs of gpu time).
      
      Changes since v1:
      - fix off-by-one misconception on my side. This fixes fullscreen
        playback.
      Changes since v2:
      - add underrun detection as spec'ed for i965.
      - flush caches properly, fixing visual corruptions.
      Changes since v4:
      - fix up cache flushing of overlay memory regs.
      - killed require_pipe_a logic - it hangs the chip.
      
      Tested-By: diego.abelenda@gmail.com (on a 865G)
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      [anholt: Resolved against the MADVISE ioctl going in before this one]
      Signed-off-by: NEric Anholt <eric@anholt.net>
      02e792fb
  31. 23 9月, 2009 1 次提交
  32. 18 9月, 2009 1 次提交