1. 14 11月, 2014 1 次提交
    • M
      drm/i915: Initialize workarounds in logical ring mode too · 771b9a53
      Michel Thierry 提交于
      Following the legacy ring submission example, update the
      ring->init_context() hook to support the execlist submission mode.
      
      v2: update to use the new workaround macros and cleanup unused code.
      This takes care of both bdw and chv workarounds.
      
      v2.1: Add missing call to init_context() during deferred context creation.
      
      v3: Split init_context (emit) in legacy/lrc modes. For lrc, get the ringbuf
      from the context (Mika/Daniel).
      
      v4: Merge init_context interfaces back, the legacy mode only needs the ring,
      but the lrc mode needs the ring and context (Mika).
      
      Issue: VIZ-4092
      Issue: GMIN-3475
      Change-Id: Ie3d093b2542ab0e2a44b90460533e2f979788d6c
      Cc: Deepak S <deepak.s@intel.com>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: Daniel Vetter <daniel@ffwll.ch>
      Signed-off-by: NMichel Thierry <michel.thierry@intel.com>
      Signed-off-by: NArun Siluvery <arun.siluvery@linux.intel.com>
      Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
      [danvet: Align function paramater lists properly.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      771b9a53
  2. 03 9月, 2014 1 次提交
    • A
      drm/i915/bdw: Apply workarounds in render ring init function · 86d7f238
      Arun Siluvery 提交于
      For BDW workarounds are currently initialized in init_clock_gating() but
      they are lost during reset, suspend/resume etc; this patch moves the WAs
      that are part of register state context to render ring init fn otherwise
      default context ends up with incorrect values as they don't get initialized
      until init_clock_gating fn.
      
      v2: Add workarounds to golden render state
      This method has its own issues, first of all this is different for
      each gen and it is generated using a tool so adding new workaround
      and mainitaining them across gens is not a straightforward process.
      
      v3: Use LRIs to emit these workarounds (Ville)
      Instead of modifying the golden render state the same LRIs are
      emitted from within the driver.
      
      v4: Use abstract name when exporting gen specific routines (Chris)
      
      For: VIZ-4092
      Signed-off-by: NArun Siluvery <arun.siluvery@linux.intel.com>
      Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      86d7f238
  3. 15 8月, 2014 3 次提交
    • T
      drm/i915/bdw: Handle context switch events · e981e7b1
      Thomas Daniel 提交于
      Handle all context status events in the context status buffer on every
      context switch interrupt. We only remove work from the execlist queue
      after a context status buffer reports that it has completed and we only
      attempt to schedule new contexts on interrupt when a previously submitted
      context completes (unless no contexts are queued, which means the GPU is
      free).
      
      We canot call intel_runtime_pm_get() in an interrupt (or with a spinlock
      grabbed, FWIW), because it might sleep, which is not a nice thing to do.
      Instead, do the runtime_pm get/put together with the create/destroy request,
      and handle the forcewake get/put directly.
      Signed-off-by: NThomas Daniel <thomas.daniel@intel.com>
      
      v2: Unreferencing the context when we are freeing the request might free
      the backing bo, which requires the struct_mutex to be grabbed, so defer
      unreferencing and freeing to a bottom half.
      
      v3:
      - Ack the interrupt inmediately, before trying to handle it (fix for
      missing interrupts by Bob Beckett <robert.beckett@intel.com>).
      - Update the Context Status Buffer Read Pointer, just in case (spotted
      by Damien Lespiau).
      
      v4: New namespace and multiple rebase changes.
      
      v5: Squash with "drm/i915/bdw: Do not call intel_runtime_pm_get() in an
      interrupt", as suggested by Daniel.
      Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
      Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com>
      [danvet: Checkpatch ...]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      e981e7b1
    • M
      drm/i915/bdw: Two-stage execlist submit process · acdd884a
      Michel Thierry 提交于
      Context switch (and execlist submission) should happen only when
      other contexts are not active, otherwise pre-emption occurs.
      
      To assure this, we place context switch requests in a queue and those
      request are later consumed when the right context switch interrupt is
      received (still TODO).
      
      v2: Use a spinlock, do not remove the requests on unqueue (wait for
      context switch completion).
      Signed-off-by: NThomas Daniel <thomas.daniel@intel.com>
      
      v3: Several rebases and code changes. Use unique ID.
      
      v4:
      - Move the queue/lock init to the late ring initialization.
      - Damien's kmalloc review comments: check return, use sizeof(*req),
      do not cast.
      
      v5:
      - Do not reuse drm_i915_gem_request. Instead, create our own.
      - New namespace.
      
      Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v1)
      Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> (v2-v5)
      Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com>
      [davnet: Checkpatch + wash-up s/BUG_ON/WARN_ON/.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      acdd884a
    • O
      drm/i915: Add temporary ring->ctx backpointer · 582d67f0
      Oscar Mateo 提交于
      The execlist patches have a bit a convoluted and long history and due
      to that have the actual submission still misplaced deeply burried in
      the low-level ringbuffer handling code. This design goes back to the
      legacy ringbuffer code with its tricky lazy request and simple work
      submissiion using ring tail writes. For that reason they need a
      ring->ctx backpointer.
      
      The goal is to unburry that code and move it up into a level where the
      full execlist context is available so that we can ditch this
      backpointer. Until that's done make it really obvious that there's
      work still to be done.
      
      Cc: Oscar Mateo <oscar.mateo@intel.com>
      Cc: Thomas Daniel <thomas.daniel@intel.com>
      Acked-by: NThomas Daniel <thomas.daniel@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      582d67f0
  4. 12 8月, 2014 5 次提交
  5. 11 8月, 2014 4 次提交
  6. 07 8月, 2014 1 次提交
  7. 08 7月, 2014 3 次提交
  8. 11 6月, 2014 1 次提交
  9. 23 5月, 2014 5 次提交
    • O
      drm/i915: s/i915_hw_context/intel_context · 273497e5
      Oscar Mateo 提交于
      Up until now, contexts had one (and only one) backing object that was
      used by the hardware to save/restore render ring contexts (via the
      MI_SET_CONTEXT command). Other rings did not have or need this, so
      our i915_hw_context struct had a 1:1 relationship with a a real HW
      context.
      
      With Logical Ring Contexts and Execlists, this is not possible anymore:
      all rings need a backing object, and it cannot be reused. To prepare
      for that, rename our contexts to the more generic term intel_context.
      
      No functional changes.
      Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      273497e5
    • O
      drm/i915: Split the ringbuffers from the rings (3/3) · 93b0a4e0
      Oscar Mateo 提交于
      Manual cleanup after the previous Coccinelle script.
      
      Yes, I could write another Coccinelle script to do this but I
      don't want labor-replacing robots making an honest programmer's
      work obsolete (also, I'm lazy).
      Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      93b0a4e0
    • O
      drm/i915: Split the ringbuffers from the rings (2/3) · ee1b1e5e
      Oscar Mateo 提交于
      This refactoring has been performed using the following Coccinelle
      semantic script:
      
          @@
          struct intel_engine_cs r;
          @@
          (
          - (r).obj
          + r.buffer->obj
          |
          - (r).virtual_start
          + r.buffer->virtual_start
          |
          - (r).head
          + r.buffer->head
          |
          - (r).tail
          + r.buffer->tail
          |
          - (r).space
          + r.buffer->space
          |
          - (r).size
          + r.buffer->size
          |
          - (r).effective_size
          + r.buffer->effective_size
          |
          - (r).last_retired_head
          + r.buffer->last_retired_head
          )
      
          @@
          struct intel_engine_cs *r;
          @@
          (
          - (r)->obj
          + r->buffer->obj
          |
          - (r)->virtual_start
          + r->buffer->virtual_start
          |
          - (r)->head
          + r->buffer->head
          |
          - (r)->tail
          + r->buffer->tail
          |
          - (r)->space
          + r->buffer->space
          |
          - (r)->size
          + r->buffer->size
          |
          - (r)->effective_size
          + r->buffer->effective_size
          |
          - (r)->last_retired_head
          + r->buffer->last_retired_head
          )
      
          @@
          expression E;
          @@
          (
          - LP_RING(E)->obj
          + LP_RING(E)->buffer->obj
          |
          - LP_RING(E)->virtual_start
          + LP_RING(E)->buffer->virtual_start
          |
          - LP_RING(E)->head
          + LP_RING(E)->buffer->head
          |
          - LP_RING(E)->tail
          + LP_RING(E)->buffer->tail
          |
          - LP_RING(E)->space
          + LP_RING(E)->buffer->space
          |
          - LP_RING(E)->size
          + LP_RING(E)->buffer->size
          |
          - LP_RING(E)->effective_size
          + LP_RING(E)->buffer->effective_size
          |
          - LP_RING(E)->last_retired_head
          + LP_RING(E)->buffer->last_retired_head
          )
      
      Note: On top of this this patch also removes the now unused ringbuffer
      fields in intel_engine_cs.
      Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
      [danvet: Add note about fixup patch included here.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      ee1b1e5e
    • O
      drm/i915: Split the ringbuffers from the rings (1/3) · 8ee14975
      Oscar Mateo 提交于
      As advanced by the previous patch, the ringbuffers and the engine
      command streamers belong in different structs. This is so because,
      while they used to be tightly coupled together, the new Logical
      Ring Contexts (LRC for short) have a ringbuffer each.
      
      In legacy code, we will use the buffer* pointer inside each ring
      to get to the pertaining ringbuffer (the actual switch will be
      done in the next patch). In the new Execlists code, this pointer
      will be NULL and we will use instead the one inside the context
      instead.
      Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      8ee14975
    • O
      drm/i915: s/intel_ring_buffer/intel_engine_cs · a4872ba6
      Oscar Mateo 提交于
      In the upcoming patches we plan to break the correlation between
      engine command streamers (a.k.a. rings) and ringbuffers, so it
      makes sense to refactor the code and make the change obvious.
      
      No functional changes.
      Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      a4872ba6
  10. 13 5月, 2014 1 次提交
    • B
      drm/i915: Use hash tables for the command parser · 44e895a8
      Brad Volkin 提交于
      For clients that submit large batch buffers the command parser has
      a substantial impact on performance. On my HSW ULT system performance
      drops as much as ~20% on some tests. Most of the time is spent in the
      command lookup code. Converting that from the current naive search to
      a hash table lookup reduces the performance drop to ~10%.
      
      The choice of value for I915_CMD_HASH_ORDER allows all commands
      currently used in the parser tables to hash to their own bucket (except
      for one collision on the render ring). The tradeoff is that it wastes
      memory. Because the opcodes for the commands in the tables are not
      particularly well distributed, reducing the order still leaves many
      buckets empty. The increased collisions don't seem to have a huge
      impact on the performance gain, but for now anyhow, the parser trades
      memory for performance.
      
      NB: Ville noticed that the error paths through the ring init code
      will leak memory. I've not addressed that here. We can do a follow
      up pass to handle all of the leaks.
      
      v2: improved comment describing selection of hash key mask (Damien)
      replace a BUG_ON() with an error return (Tvrtko, Ville)
      commit message improvements
      Signed-off-by: NBrad Volkin <bradley.d.volkin@intel.com>
      Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com>
      Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      44e895a8
  11. 05 5月, 2014 7 次提交
  12. 25 4月, 2014 1 次提交
  13. 03 4月, 2014 2 次提交
    • C
      drm/i915: Move all ring resets before setting the HWS page · 9991ae78
      Chris Wilson 提交于
      In commit a51435a3
      Author: Naresh Kumar Kachhi <naresh.kumar.kachhi@intel.com>
      Date:   Wed Mar 12 16:39:40 2014 +0530
      
          drm/i915: disable rings before HW status page setup
      
      we reordered stopping the rings to do so before we set the HWS register.
      However, there is an extra workaround for g45 to reset the rings twice,
      and for consistency we should apply that workaround before setting the
      HWS to be sure that the rings are truly stopped.
      
      Cc: Naresh Kumar Kachhi <naresh.kumar.kachhi@intel.com>
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      9991ae78
    • B
      drm/i915: Invariably invalidate before ctx switch · 057f6a8a
      Ben Widawsky 提交于
      We have been setting the bit which was originally BIOS dependent since:
      commit f05bb0c7
      Author: Chris Wilson <chris@chris-wilson.co.uk>
      Date:   Sun Jan 20 16:33:32 2013 +0000
      
          drm/i915: GFX_MODE Flush TLB Invalidate Mode must be '1' for scanline waits
      
      Therefore, we do not need to try to figure it out dynamically and we can
      just always invalidate the TLBs.
      
      It's a partial revert of:
      commit 12b0286f
      Author: Ben Widawsky <ben@bwidawsk.net>
      Date:   Mon Jun 4 14:42:50 2012 -0700
      
          drm/i915: possibly invalidate TLB before context switch
      
      The original commit attempted to only invalidate when necessary
      (very much a relic from the old days). Now, we can just always invalidate.
      
      I guess the old TODO still exists. Since we seem to have abandoned ILK
      contexts however, there isn't much point in even remembering.
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      057f6a8a
  14. 29 3月, 2014 1 次提交
    • C
      drm/i915: Broadwell expands ACTHD to 64bit · 50877445
      Chris Wilson 提交于
      As Broadwell has an increased virtual address size, it requires more
      than 32 bits to store offsets into its address space. This includes the
      debug registers to track the current HEAD of the individual rings, which
      may be anywhere within the per-process address spaces. In order to find
      the full location, we need to read the high bits from a second register.
      We then also need to expand our storage to keep track of the larger
      address.
      
      v2: Carefully read the two registers to catch wraparound between
          the reads.
      v3: Use a WARN_ON rather than loop indefinitely on an unstable
          register read.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Ben Widawsky <benjamin.widawsky@intel.com>
      Cc: Timo Aaltonen <tjaalton@ubuntu.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
      Reviewed-by: NBen Widawsky <ben@bwidawsk.net>
      [danvet: Drop spurious hunk which conflicted.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      50877445
  15. 12 3月, 2014 1 次提交
  16. 08 3月, 2014 1 次提交
    • B
      drm/i915: Implement command buffer parsing logic · 351e3db2
      Brad Volkin 提交于
      The command parser scans batch buffers submitted via execbuffer ioctls before
      the driver submits them to hardware. At a high level, it looks for several
      things:
      
      1) Commands which are explicitly defined as privileged or which should only be
         used by the kernel driver. The parser generally rejects such commands, with
         the provision that it may allow some from the drm master process.
      2) Commands which access registers. To support correct/enhanced userspace
         functionality, particularly certain OpenGL extensions, the parser provides a
         whitelist of registers which userspace may safely access (for both normal and
         drm master processes).
      3) Commands which access privileged memory (i.e. GGTT, HWS page, etc). The
         parser always rejects such commands.
      
      See the overview comment in the source for more details.
      
      This patch only implements the logic. Subsequent patches will build the tables
      that drive the parser.
      
      v2: Don't set the secure bit if the parser succeeds
      Fail harder during init
      Makefile cleanup
      Kerneldoc cleanup
      Clarify module param description
      Convert ints to bools in a few places
      Move client/subclient defs to i915_reg.h
      Remove the bits_count field
      
      OTC-Tracker: AXIA-4631
      Change-Id: I50b98c71c6655893291c78a2d1b8954577b37a30
      Signed-off-by: NBrad Volkin <bradley.d.volkin@intel.com>
      Reviewed-by: NJani Nikula <jani.nikula@intel.com>
      [danvet: Appease checkpatch.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      351e3db2
  17. 12 2月, 2014 1 次提交
  18. 04 2月, 2014 1 次提交