1. 07 Aug, 2014 1 commit
  2. 08 Jul, 2014 3 commits
  3. 11 Jun, 2014 1 commit
  4. 23 May, 2014 5 commits
    • drm/i915: s/i915_hw_context/intel_context · 273497e5
      Oscar Mateo committed
      Up until now, contexts had one (and only one) backing object that was
      used by the hardware to save/restore render ring contexts (via the
      MI_SET_CONTEXT command). Other rings did not have or need this, so
      our i915_hw_context struct had a 1:1 relationship with a real HW
      context.
      
      With Logical Ring Contexts and Execlists, this is not possible anymore:
      all rings need a backing object, and it cannot be reused. To prepare
      for that, rename our contexts to the more generic term intel_context.
      
      No functional changes.
      Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: Split the ringbuffers from the rings (3/3) · 93b0a4e0
      Oscar Mateo committed
      Manual cleanup after the previous Coccinelle script.
      
      Yes, I could write another Coccinelle script to do this but I
      don't want labor-replacing robots making an honest programmer's
      work obsolete (also, I'm lazy).
      Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: Split the ringbuffers from the rings (2/3) · ee1b1e5e
      Oscar Mateo committed
      This refactoring has been performed using the following Coccinelle
      semantic script:
      
          @@
          struct intel_engine_cs r;
          @@
          (
          - (r).obj
          + r.buffer->obj
          |
          - (r).virtual_start
          + r.buffer->virtual_start
          |
          - (r).head
          + r.buffer->head
          |
          - (r).tail
          + r.buffer->tail
          |
          - (r).space
          + r.buffer->space
          |
          - (r).size
          + r.buffer->size
          |
          - (r).effective_size
          + r.buffer->effective_size
          |
          - (r).last_retired_head
          + r.buffer->last_retired_head
          )
      
          @@
          struct intel_engine_cs *r;
          @@
          (
          - (r)->obj
          + r->buffer->obj
          |
          - (r)->virtual_start
          + r->buffer->virtual_start
          |
          - (r)->head
          + r->buffer->head
          |
          - (r)->tail
          + r->buffer->tail
          |
          - (r)->space
          + r->buffer->space
          |
          - (r)->size
          + r->buffer->size
          |
          - (r)->effective_size
          + r->buffer->effective_size
          |
          - (r)->last_retired_head
          + r->buffer->last_retired_head
          )
      
          @@
          expression E;
          @@
          (
          - LP_RING(E)->obj
          + LP_RING(E)->buffer->obj
          |
          - LP_RING(E)->virtual_start
          + LP_RING(E)->buffer->virtual_start
          |
          - LP_RING(E)->head
          + LP_RING(E)->buffer->head
          |
          - LP_RING(E)->tail
          + LP_RING(E)->buffer->tail
          |
          - LP_RING(E)->space
          + LP_RING(E)->buffer->space
          |
          - LP_RING(E)->size
          + LP_RING(E)->buffer->size
          |
          - LP_RING(E)->effective_size
          + LP_RING(E)->buffer->effective_size
          |
          - LP_RING(E)->last_retired_head
          + LP_RING(E)->buffer->last_retired_head
          )
      
      Note: On top of this, this patch also removes the now unused ringbuffer
      fields in intel_engine_cs.
      Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
      [danvet: Add note about fixup patch included here.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
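
To make the effect of the semantic patch above concrete, here is a minimal sketch of what its pointer rule does to a caller. The struct and function names (`ringbuffer_sketch`, `engine_sketch`, `engine_free_bytes`) are hypothetical stand-ins rather than the driver's definitions, and the free-space arithmetic is deliberately simplified.

```c
/* Hypothetical, trimmed-down stand-ins for the driver structs. */
struct ringbuffer_sketch {
    unsigned int head;
    unsigned int tail;
    int size;
};

struct engine_sketch {
    const char *name;
    struct ringbuffer_sketch *buffer; /* ringbuffer fields now live behind this pointer */
};

/*
 * Before the rewrite this function would have read ring->head, ring->tail
 * and ring->size directly. The rule for "struct intel_engine_cs *r" turns
 * each of those accesses into ring->buffer->FIELD, as below.
 */
static int engine_free_bytes(const struct engine_sketch *ring)
{
    int space = (int)ring->buffer->head - (int)ring->buffer->tail;

    /* Simplified wraparound handling; no reserved gap is modelled. */
    if (space <= 0)
        space += ring->buffer->size;
    return space;
}
```

Because the fields only move behind one extra pointer, the computed values are unchanged, which is consistent with the series being a pure refactoring.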
    • drm/i915: Split the ringbuffers from the rings (1/3) · 8ee14975
      Oscar Mateo committed
      As advanced by the previous patch, the ringbuffers and the engine
      command streamers belong in different structs. This is so because,
      while they used to be tightly coupled together, the new Logical
      Ring Contexts (LRC for short) have a ringbuffer each.
      
      In legacy code, we will use the buffer* pointer inside each ring
      to get to the pertaining ringbuffer (the actual switch will be
      done in the next patch). In the new Execlists code, this pointer
      will be NULL and we will instead use the one inside the
      context.
      Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: s/intel_ring_buffer/intel_engine_cs · a4872ba6
      Oscar Mateo committed
      In the upcoming patches we plan to break the correlation between
      engine command streamers (a.k.a. rings) and ringbuffers, so it
      makes sense to refactor the code and make the change obvious.
      
      No functional changes.
      Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
  5. 13 May, 2014 1 commit
    • drm/i915: Use hash tables for the command parser · 44e895a8
      Brad Volkin committed
      For clients that submit large batch buffers the command parser has
      a substantial impact on performance. On my HSW ULT system performance
      drops as much as ~20% on some tests. Most of the time is spent in the
      command lookup code. Converting that from the current naive search to
      a hash table lookup reduces the performance drop to ~10%.
      
      The choice of value for I915_CMD_HASH_ORDER allows all commands
      currently used in the parser tables to hash to their own bucket (except
      for one collision on the render ring). The tradeoff is that it wastes
      memory. Because the opcodes for the commands in the tables are not
      particularly well distributed, reducing the order still leaves many
      buckets empty. The increased collisions don't seem to have a huge
      impact on the performance gain, but for now anyhow, the parser trades
      memory for performance.
      
      NB: Ville noticed that the error paths through the ring init code
      will leak memory. I've not addressed that here. We can do a follow
      up pass to handle all of the leaks.
      
      v2: improved comment describing selection of hash key mask (Damien)
      replace a BUG_ON() with an error return (Tvrtko, Ville)
      commit message improvements
      Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
      Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
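
The lookup change described in this commit can be illustrated with a small chained hash table keyed on a command's opcode bits. This is only a sketch under stated assumptions: `SKETCH_HASH_ORDER`, `SKETCH_OPCODE_SHIFT`, the struct layouts and all function names are hypothetical, and the real parser's key mask and table layout may differ.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_HASH_ORDER   4                        /* hypothetical: 16 buckets */
#define SKETCH_HASH_SIZE    (1u << SKETCH_HASH_ORDER)
#define SKETCH_OPCODE_SHIFT 23                       /* assumed opcode position */

struct cmd_desc {
    uint32_t opcode;        /* value of the opcode field */
    const char *name;
    struct cmd_desc *next;  /* chain of descriptors sharing a bucket */
};

struct cmd_table {
    struct cmd_desc *buckets[SKETCH_HASH_SIZE];
};

static uint32_t cmd_opcode(uint32_t header)
{
    return header >> SKETCH_OPCODE_SHIFT;
}

static uint32_t cmd_hash(uint32_t opcode)
{
    return opcode & (SKETCH_HASH_SIZE - 1);
}

static void cmd_table_add(struct cmd_table *table, struct cmd_desc *desc)
{
    uint32_t bucket = cmd_hash(desc->opcode);

    desc->next = table->buckets[bucket];
    table->buckets[bucket] = desc;
}

/* Walk only the one bucket the opcode hashes to, not the whole table. */
static const struct cmd_desc *cmd_table_find(const struct cmd_table *table,
                                             uint32_t header)
{
    uint32_t opcode = cmd_opcode(header);
    const struct cmd_desc *desc;

    for (desc = table->buckets[cmd_hash(opcode)]; desc; desc = desc->next)
        if (desc->opcode == opcode)
            return desc;
    return NULL;
}
```

A larger order gives more buckets and shorter chains at the cost of more empty slots, which is the memory-for-performance tradeoff the message describes.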
  6. 05 May, 2014 7 commits
  7. 25 Apr, 2014 1 commit
  8. 03 Apr, 2014 2 commits
    • drm/i915: Move all ring resets before setting the HWS page · 9991ae78
      Chris Wilson committed
      In commit a51435a3
      Author: Naresh Kumar Kachhi <naresh.kumar.kachhi@intel.com>
      Date:   Wed Mar 12 16:39:40 2014 +0530
      
          drm/i915: disable rings before HW status page setup
      
      we reordered stopping the rings to do so before we set the HWS register.
      However, there is an extra workaround for g45 to reset the rings twice,
      and for consistency we should apply that workaround before setting the
      HWS to be sure that the rings are truly stopped.
      
      Cc: Naresh Kumar Kachhi <naresh.kumar.kachhi@intel.com>
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: Invariably invalidate before ctx switch · 057f6a8a
      Ben Widawsky committed
      We have been setting the bit which was originally BIOS dependent since:
      commit f05bb0c7
      Author: Chris Wilson <chris@chris-wilson.co.uk>
      Date:   Sun Jan 20 16:33:32 2013 +0000
      
          drm/i915: GFX_MODE Flush TLB Invalidate Mode must be '1' for scanline waits
      
      Therefore, we do not need to try to figure it out dynamically and we can
      just always invalidate the TLBs.
      
      It's a partial revert of:
      commit 12b0286f
      Author: Ben Widawsky <ben@bwidawsk.net>
      Date:   Mon Jun 4 14:42:50 2012 -0700
      
          drm/i915: possibly invalidate TLB before context switch
      
      The original commit attempted to only invalidate when necessary
      (very much a relic from the old days). Now, we can just always invalidate.
      
      I guess the old TODO still exists. Since we seem to have abandoned ILK
      contexts however, there isn't much point in even remembering.
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
  9. 29 Mar, 2014 1 commit
    • drm/i915: Broadwell expands ACTHD to 64bit · 50877445
      Chris Wilson committed
      As Broadwell has an increased virtual address size, it requires more
      than 32 bits to store offsets into its address space. This includes the
      debug registers to track the current HEAD of the individual rings, which
      may be anywhere within the per-process address spaces. In order to find
      the full location, we need to read the high bits from a second register.
      We then also need to expand our storage to keep track of the larger
      address.
      
      v2: Carefully read the two registers to catch wraparound between
          the reads.
      v3: Use a WARN_ON rather than loop indefinitely on an unstable
          register read.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Ben Widawsky <benjamin.widawsky@intel.com>
      Cc: Timo Aaltonen <tjaalton@ubuntu.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
      Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
      [danvet: Drop spurious hunk which conflicted.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
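
The v2 and v3 notes above describe reading a 64-bit value from two 32-bit registers without tearing, and giving up instead of looping forever. A minimal sketch of that pattern follows. The register pair is simulated (`fake_acthd` and `fake_read` are hypothetical test scaffolding), the retry bound of two is an assumption, and the driver's actual register accessors are not shown.

```c
#include <stdbool.h>
#include <stdint.h>

enum acthd_half { ACTHD_LO, ACTHD_HI };

/* Simulated register pair whose 64-bit value advances after every read. */
struct fake_acthd {
    uint64_t value;
    uint64_t step;
};

static uint32_t fake_read(struct fake_acthd *regs, enum acthd_half half)
{
    uint32_t v = (half == ACTHD_HI) ? (uint32_t)(regs->value >> 32)
                                    : (uint32_t)regs->value;

    regs->value += regs->step;
    return v;
}

/*
 * Read high, low, then high again. If the high half changed, the low
 * half may belong to either value, so retry a bounded number of times.
 * Returns false when the reads never settle; a caller could warn there.
 */
static bool read_acthd64(struct fake_acthd *regs, uint64_t *out)
{
    uint32_t hi = fake_read(regs, ACTHD_HI);
    int tries;

    for (tries = 0; tries < 2; tries++) {
        uint32_t lo = fake_read(regs, ACTHD_LO);
        uint32_t hi_again = fake_read(regs, ACTHD_HI);

        if (hi_again == hi) {
            *out = ((uint64_t)hi << 32) | lo;
            return true;
        }
        hi = hi_again;
    }
    return false;
}
```

Returning a failure flag rather than spinning mirrors the v3 decision to warn on an unstable read.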
  10. 12 Mar, 2014 1 commit
  11. 08 Mar, 2014 1 commit
    • drm/i915: Implement command buffer parsing logic · 351e3db2
      Brad Volkin committed
      The command parser scans batch buffers submitted via execbuffer ioctls before
      the driver submits them to hardware. At a high level, it looks for several
      things:
      
      1) Commands which are explicitly defined as privileged or which should only be
         used by the kernel driver. The parser generally rejects such commands, with
         the provision that it may allow some from the drm master process.
      2) Commands which access registers. To support correct/enhanced userspace
         functionality, particularly certain OpenGL extensions, the parser provides a
         whitelist of registers which userspace may safely access (for both normal and
         drm master processes).
      3) Commands which access privileged memory (i.e. GGTT, HWS page, etc). The
         parser always rejects such commands.
      
      See the overview comment in the source for more details.
      
      This patch only implements the logic. Subsequent patches will build the tables
      that drive the parser.
      
      v2: Don't set the secure bit if the parser succeeds
      Fail harder during init
      Makefile cleanup
      Kerneldoc cleanup
      Clarify module param description
      Convert ints to bools in a few places
      Move client/subclient defs to i915_reg.h
      Remove the bits_count field
      
      OTC-Tracker: AXIA-4631
      Change-Id: I50b98c71c6655893291c78a2d1b8954577b37a30
      Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
      Reviewed-by: Jani Nikula <jani.nikula@intel.com>
      [danvet: Appease checkpatch.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
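
The three checks listed in this message can be read as a small decision function. The sketch below is one possible reading, not the parser's implementation: the flag names, `parsed_cmd`, `cmd_allowed` and the split into a user whitelist and a master-only whitelist are all hypothetical, and the register offsets used with it are made up.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-command properties a table lookup might yield. */
enum {
    CMD_PRIVILEGED       = 1u << 0, /* kernel-only by default */
    CMD_MASTER_OK        = 1u << 1, /* exception: allowed from the drm master */
    CMD_TOUCHES_REG      = 1u << 2, /* accesses a register */
    CMD_TOUCHES_PRIV_MEM = 1u << 3, /* accesses GGTT, HWS page, etc. */
};

struct parsed_cmd {
    uint32_t flags;
    uint32_t reg;   /* register offset, meaningful with CMD_TOUCHES_REG */
};

static bool reg_in_list(const uint32_t *list, size_t count, uint32_t reg)
{
    size_t i;

    for (i = 0; i < count; i++)
        if (list[i] == reg)
            return true;
    return false;
}

static bool cmd_allowed(const struct parsed_cmd *cmd, bool is_master,
                        const uint32_t *user_regs, size_t n_user,
                        const uint32_t *master_regs, size_t n_master)
{
    /* 3) Privileged memory access is always rejected. */
    if (cmd->flags & CMD_TOUCHES_PRIV_MEM)
        return false;

    /* 1) Privileged commands are rejected unless the master may use them. */
    if ((cmd->flags & CMD_PRIVILEGED) &&
        !(is_master && (cmd->flags & CMD_MASTER_OK)))
        return false;

    /* 2) Register access must hit a whitelist. */
    if (cmd->flags & CMD_TOUCHES_REG) {
        if (reg_in_list(user_regs, n_user, cmd->reg))
            return true;
        return is_master && reg_in_list(master_regs, n_master, cmd->reg);
    }
    return true;
}
```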
  12. 12 Feb, 2014 1 commit
  13. 04 Feb, 2014 1 commit
  14. 10 Sep, 2013 1 commit
    • drm/i915: Write RING_TAIL once per-request · 09246732
      Chris Wilson committed
      Ignoring the legacy DRI1 code, and a couple of special cases (to be
      discussed later), all access to the ring is mediated through requests.
      The first write to a ring will grab a seqno and mark the ring as having
      an outstanding_lazy_request. Either through explicitly adding a request
      after an execbuffer or through an implicit wait (either by the CPU or by
      a semaphore), that sequence of writes will be terminated with a request.
      So we can elide all the intervening writes to the tail register and
      send the entire command stream to the GPU at once. This will reduce the
      number of *serialising* writes to the tail register by a factor of 3-5
      (depending upon architecture and number of workarounds, context
      switches, etc involved). This becomes even more noticeable when the
      register write is overloaded with a number of debugging tools. The
      astute reader will wonder if it is then possible to overflow the ring
      with a single command. It is not. When we start a command sequence to
      the ring, we check for available space and issue a wait in case we have
      not. The ring wait will in this case be forced to flush the outstanding
      register write and then poll the ACTHD for sufficient space to continue.
      
      The exception to the rule where everything is inside a request are a few
      initialisation cases where we may want to write GPU commands via the CS
      before userspace wakes up and page flips.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
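
The idea of deferring the tail register write until a request is added can be sketched with a simulated ring. Everything here is hypothetical scaffolding (`ring_sketch`, `ring_emit`, `ring_flush_tail`, `ring_add_request`, the 16-byte request size), and the space check with its forced flush, which the message also describes, is left out for brevity.

```c
#include <stdbool.h>
#include <stdint.h>

struct ring_sketch {
    uint32_t size;          /* ring size in bytes */
    uint32_t tail;          /* software tail, advanced on every emit */
    uint32_t hw_tail;       /* last value written to the simulated RING_TAIL */
    unsigned int hw_writes; /* number of simulated register writes */
    bool tail_dirty;        /* software tail is ahead of hw_tail */
};

/* Queue commands: only the software tail moves, no register write yet. */
static void ring_emit(struct ring_sketch *ring, uint32_t bytes)
{
    ring->tail = (ring->tail + bytes) % ring->size;
    ring->tail_dirty = true;
}

/* Perform the single serialising write, if anything is pending. */
static void ring_flush_tail(struct ring_sketch *ring)
{
    if (!ring->tail_dirty)
        return;
    ring->hw_tail = ring->tail;
    ring->hw_writes++;
    ring->tail_dirty = false;
}

/* Terminate the sequence of writes with a request and publish the tail. */
static void ring_add_request(struct ring_sketch *ring)
{
    ring_emit(ring, 16); /* hypothetical size of the request's own commands */
    ring_flush_tail(ring);
}
```

With three command sequences followed by one request, this model performs one register write instead of four, which is the kind of reduction the message reports.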
  15. 06 Sep, 2013 1 commit
  16. 05 Sep, 2013 2 commits
  17. 04 Sep, 2013 1 commit
  18. 23 Aug, 2013 1 commit
  19. 22 Aug, 2013 1 commit
  20. 11 Jul, 2013 2 commits
    • drm/i915: unify ring irq refcounts (again) · c7113cc3
      Daniel Vetter committed
      With the simplified locking there's no reason any more to keep the
      refcounts separate.
      
      v2: Readd the lost comment that ring->irq_refcount is protected by
      dev_priv->irq_lock.
      Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: kill dev_priv->rps.lock · 59cdb63d
      Daniel Vetter committed
      Now that the rps interrupt locking isn't clearly separated (at least
      conceptually) from all the other interrupt locking, having a different
      lock stopped making sense: It protects much more than just the rps
      workqueue it started out with. But with the addition of VECS the
      separation started to blur and resulted in some more complex locking
      for the ring interrupt refcount.

      With this we can (again) unify the ringbuffer irq refcounts without
      causing a massive confusion, but that's for the next patch.
      
      v2: Explain better why the rps.lock once made sense and why no longer,
      requested by Ben.
      Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
  21. 13 Jun, 2013 1 commit
  22. 11 Jun, 2013 1 commit
    • drm/i915: Don't count semaphore waits towards a stuck ring · 6274f212
      Chris Wilson committed
      If we detect a ring is in a valid wait for another, just let it be.
      Eventually it will either begin to progress again, or the entire system
      will come grinding to a halt and then hangcheck will fire as soon as the
      deadlock is detected.
      
      This error was foretold by Ben in
      commit 05407ff8
      Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Date:   Thu May 30 09:04:29 2013 +0300
      
          drm/i915: detect hang using per ring hangcheck_score
      
      "If ring B is waiting on ring A via semaphore, and ring A is making
      progress, albeit slowly - the hangcheck will fire. The check will
      determine that A is moving, however ring B will appear hung because
      the ACTHD doesn't move. I honestly can't say if that's actually a
      realistic problem to hit it probably implies the timeout value is too
      low."
      
      v2: Make sure we don't even incur the KICK cost whilst waiting.
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65394
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: Ben Widawsky <ben@bwidawsk.net>
      Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
  23. 07 Jun, 2013 1 commit
  24. 03 Jun, 2013 1 commit
    • drm/i915: detect hang using per ring hangcheck_score · 05407ff8
      Mika Kuoppala committed
      Keep track of ring seqno progress and, if no progress is
      detected, declare a hang. Use the actual head (acthd)
      to distinguish between a stuck ring and a looping batchbuffer.
      A stuck ring will be kicked to trigger progress.
      
      This commit adds a hard limit for batchbuffer completion time.
      If batchbuffer completion time is more than 4.5 seconds,
      the gpu will be declared hung.
      
      Review comment from Ben which nicely clarifies the semantic change:
      
      "Maybe I'm just stating the functional changes of the patch, but in case
      they were unintended here is what I see as potential issues:
      
      1. "If ring B is waiting on ring A via semaphore, and ring A is making
         progress, albeit slowly - the hangcheck will fire. The check will
         determine that A is moving, however ring B will appear hung because
         the ACTHD doesn't move. I honestly can't say if that's actually a
         realistic problem to hit it probably implies the timeout value is too
         low.
      
      2. "There's also another corner case on the kick. If the seqno = 2
         (though not stuck), and on the 3rd hangcheck, the ring is stuck, and
         we try to kick it... we don't actually try to find out if the kick
         helped"
      
      v2: use acthd to detect stuck ring from loop (Ben Widawsky)
      
      v3: Use acthd to check when ring needs kicking.
      Declare hang on third time in order to give time for
      kick_ring to take effect.
      
      v4: Update commit msg
      Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
      Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
      [danvet: Paste in Ben's review comment.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
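
The scoring scheme described in this commit can be approximated as a per-ring sampler that compares the current seqno and ACTHD against the previous sample. This is a simplified sketch: the names, the uniform one-point score and `SKETCH_HANG_THRESHOLD` are assumptions chosen to match "declare hang on third time", and the relationship to the 4.5 second limit is not modelled.

```c
#include <stdint.h>

#define SKETCH_HANG_THRESHOLD 3 /* hypothetical: third sample without progress */

enum ring_action {
    RING_ACTIVE,  /* seqno advanced since the last sample */
    RING_LOOPING, /* no seqno progress, but the head is still moving */
    RING_KICKED,  /* no seqno progress and the head is stuck: kick the ring */
    RING_HUNG,    /* score reached the threshold */
};

struct hangcheck_sketch {
    uint32_t seqno; /* seqno seen at the previous sample */
    uint64_t acthd; /* head position seen at the previous sample */
    int score;      /* consecutive samples without seqno progress */
};

static enum ring_action hangcheck_sample(struct hangcheck_sketch *hc,
                                         uint32_t seqno, uint64_t acthd)
{
    enum ring_action action;

    if (seqno != hc->seqno) {
        hc->score = 0;
        action = RING_ACTIVE;
    } else {
        hc->score++;
        if (hc->score >= SKETCH_HANG_THRESHOLD)
            action = RING_HUNG;
        else if (acthd != hc->acthd)
            action = RING_LOOPING;
        else
            action = RING_KICKED;
    }

    hc->seqno = seqno;
    hc->acthd = acthd;
    return action;
}
```

A ring blocked on a semaphore would look stuck to this sampler even when the ring it waits on is progressing, which is the first corner case raised in the quoted review.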
  25. 01 Jun, 2013 1 commit