1. 07 1月, 2016 1 次提交
  2. 05 1月, 2016 1 次提交
  3. 05 12月, 2015 1 次提交
  4. 03 12月, 2015 1 次提交
    • N
      drm/i915: Extend LRC pinning to cover GPU context writeback · 6d65ba94
      Nick Hoath 提交于
      Use the first retired request on a new context to unpin
      the old context. This ensures that the hw context remains
      bound until it has been written back to by the GPU.
      Now that the context is pinned until later in the request/context
      lifecycle, it no longer needs to be pinned from context_queue to
      retire_requests.
      This fixes an issue with GuC submission where the GPU might not
      have finished writing back the context before it is unpinned. This
      results in a GPU hang.
      
      v2: Moved the new pin to cover GuC submission (Alex Dai)
          Moved the new unpin to request_retire to fix coverage leak
      v3: Added switch to default context if freeing a still pinned
          context just in case the hw was actually still using it
      v4: Unwrapped context unpin to allow calling without a request
      v5: Only create a switch to idle context if the ring doesn't
          already have a request pending on it (Alex Dai)
          Rename unsaved to dirty to avoid double negatives (Dave Gordon)
          Changed _no_req postfix to __ prefix for consistency (Dave Gordon)
          Split out per engine cleanup from context_free as it
          was getting unwieldy
          Corrected locking (Dave Gordon)
      v6: Removed some bikeshedding (Mika Kuoppala)
          Added explanation of the GuC hang that this fixes (Daniel Vetter)
      v7: Removed extra per request pinning from ring reset code (Alex Dai)
          Added forced ring unpin/clean in error case in context free (Alex Dai)
      Signed-off-by: NNick Hoath <nicholas.hoath@intel.com>
      Issue: VIZ-4277
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: David Gordon <david.s.gordon@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Alex Dai <yu.dai@intel.com>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Reviewed-by: NAlex Dai <yu.dai@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      6d65ba94
  5. 18 11月, 2015 2 次提交
  6. 28 9月, 2015 1 次提交
    • M
      drm/i915: Consider HW CSB write pointer before resetting the sw read pointer · dfc53c5e
      Michel Thierry 提交于
      A previous commit resets the Context Status Buffer (CSB) read pointer in
      ring init
          commit c0a03a2e ("drm/i915: Reset CSB read pointer in ring init")
      
      This is generally correct, but this pointer is not reset after
      suspend/resume in some platforms (cht). In this case, the driver should
      read the register value instead of resetting the sw read counter to 0.
      Otherwise we process old events, leading to unwanted pre-emptions or
      something worse.
      
      But in other platforms (bdw) and also during GPU reset or power up, the
      CSBWP is reset to 0x7 (an invalid number), and in this case the read
      pointer should be set to 5 (the interrupt code will increment this
      counter one more time, and will start reading from CSB[0]).
      
      v2: When the CSB registers are reset, the read pointer needs to be set
      to 5, otherwise the first write (CSB[0]) won't be read (Mika).
      Replace magic numbers with GEN8_CSB_ENTRIES (6) and GEN8_CSB_PTR_MASK
      (0x07).
      
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: stable@vger.kernel.org # v4.0+
      Signed-off-by: NLei Shen <lei.shen@intel.com>
      Signed-off-by: NDeepak S <deepak.s@intel.com>
      Signed-off-by: NMichel Thierry <michel.thierry@intel.com>
      Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
      Signed-off-by: NJani Nikula <jani.nikula@intel.com>
      dfc53c5e
  7. 23 9月, 2015 1 次提交
  8. 14 9月, 2015 1 次提交
    • N
      drm/i915: Split alloc from init for lrc · e84fe803
      Nick Hoath 提交于
      Extend init/init_hw split to context init.
         - Move context initialisation in to i915_gem_init_hw
         - Move one off initialisation for render ring to
              i915_gem_validate_context
         - Move default context initialisation to logical_ring_init
      
      Rename intel_lr_context_deferred_create to
      intel_lr_context_deferred_alloc, to reflect reduced functionality &
      alloc/init split.
      
      This patch is intended to split out the allocation of resources &
      initialisation to allow easier reuse of code for resume/gpu reset.
      
      v2: Removed function ptr wrapping of do_switch_context (Daniel Vetter)
          Left ->init_context int intel_lr_context_deferred_alloc
          (Daniel Vetter)
          Remove unnecessary init flag & ring type test. (Daniel Vetter)
          Improve commit message (Daniel Vetter)
      v3: On init/reinit, set the hw next sequence number to the sw next
          sequence number. This is set to 1 at driver load time. This prevents
          the seqno being reset on reinit (Chris Wilson)
      v4: Set seqno back to ~0 - 0x1000 at start-of-day, and increment by 0x100
          on reset.
          This makes it obvious which bbs are which after a reset. (David Gordon
          & John Harrison)
          Rebase.
      v5: Rebase. Fixed rebase breakage. Put context pinning in separate
          function. Removed code churn. (Thomas Daniel)
      v6: Cleanup up issues introduced in v2 & v5 (Thomas Daniel)
      
      Issue: VIZ-4798
      Signed-off-by: NNick Hoath <nicholas.hoath@intel.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: John Harrison <john.c.harrison@intel.com>
      Cc: David Gordon <david.s.gordon@intel.com>
      Cc: Thomas Daniel <thomas.daniel@intel.com>
      Reviewed-by: NThomas Daniel <thomas.daniel@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      e84fe803
  9. 15 8月, 2015 2 次提交
    • A
      drm/i915: Integrate GuC-based command submission · d1675198
      Alex Dai 提交于
      GuC-based submission is mostly the same as execlist mode, up to
      intel_logical_ring_advance_and_submit(), where the context being
      dispatched would be added to the execlist queue; at this point
      we submit the context to the GuC backend instead.
      
      There are, however, a few other changes also required, notably:
      1.  Contexts must be pinned at GGTT addresses accessible by the GuC
          i.e. NOT in the range [0..WOPCM_SIZE), so we have to add the
          PIN_OFFSET_BIAS flag to the relevant GGTT-pinning calls.
      
      2.  The GuC's TLB must be invalidated after a context is pinned at
          a new GGTT address.
      
      3.  GuC firmware uses the one page before Ring Context as shared data.
          Therefore, whenever driver wants to get base address of LRC, we
          will offset one page for it. LRC_PPHWSP_PN is defined as the page
          number of LRCA.
      
      4.  In the work queue used to pass requests to the GuC, the GuC
          firmware requires the ring-tail-offset to be represented as an
          11-bit value, expressed in QWords. Therefore, the ringbuffer
          size must be reduced to the representable range (4 pages).
      
      v2:
          Defer adding #defines until needed [Chris Wilson]
          Rationalise type declarations [Chris Wilson]
      
      v4:
          Squashed kerneldoc patch into here [Daniel Vetter]
      
      v5:
          Update request->tail in code common to both GuC and execlist modes.
          Add a private version of lr_context_update(), as sharing the
              execlist version leads to race conditions when the CPU and
              the GuC both update TAIL in the context image.
          Conversion of error-captured HWS page to string must account
              for offset from start of object to actual HWS (LRC_PPHWSP_PN).
      
      Issue: VIZ-4884
      Signed-off-by: NAlex Dai <yu.dai@intel.com>
      Signed-off-by: NDave Gordon <david.s.gordon@intel.com>
      Reviewed-by: NTom O'Rourke <Tom.O'Rourke@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      d1675198
    • D
      drm/i915: Expose one LRC function for GuC submission mode · 919f1f55
      Dave Gordon 提交于
      GuC submission is basically execlist submission, but with the GuC
      handling the actual writes to the ELSP and the resulting context
      switch interrupts.  So to describe a context for submission via
      the GuC, we need one of the same functions used in execlist mode.
      This commit exposes one such function, changing its name to better
      describe what it does (it's related to logical ring contexts rather
      than to execlists per se).
      
      v2:
          Replaces previous "drm/i915: Move execlists defines from .c to .h"
      
      v3:
          Incorporates a change to one of the functions exposed here that was
              previously part of an internal patch, but which was omitted from
              the version recently committed to drm-intel-nightly:
      	    7a01a0a2 drm/i915/lrc: Update PDPx registers with lri commands
              So we reinstate this change here.
      
      v4:
          Drop v3 change, update function parameters due to collision with
              8ee36152 drm/i915: Convert execlists_ctx_descriptor() for requests
      
      v5:
          Don't expose execlists_update_context() after all. The current
              version is no longer compatible with GuC submission; trying to
              share the execlist version of this function results in both GuC
              and CPU updating TAIL in the context image, with bad results when
              they get out of step. The GuC submission path now has its own
              private version that just updates the ringbuffer start address,
              and not TAIL or PDPx.
      
      v6:
          Rebased
      
      Issue: VIZ-4884
      Signed-off-by: NDave Gordon <david.s.gordon@intel.com>
      Reviewed-by: NTom O'Rourke <Tom.O'Rourke@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      919f1f55
  10. 14 7月, 2015 1 次提交
    • P
      drm/i915: Added Programming of the MOCS · 3bbaba0c
      Peter Antoine 提交于
      This change adds the programming of the MOCS registers to the gen 9+
      platforms. The set of MOCS configuration entries introduced by this
      patch is intended to be minimal but sufficient to cover the needs of
      current userspace - i.e. a good set of defaults. It is expected to be
      extended in the future to provide further default values or to allow
      userspace to redefine its private MOCS tables based on its demand for
      additional caching configurations. In this setup, userspace should
      only utilize the first N entries, higher entries are reserved for
      future use.
      
      It creates a fixed register set that is programmed across the different
      engines so that all engines have the same table. This is done as the
      main RCS context only holds the registers for itself and the shared
      L3 values. By trying to keep the registers consistent across the
      different engines it should make the programming for the registers
      consistent.
      
      v2:
      -'static const' for private data structures and style changes.(Matt Turner)
      v3:
      - Make the tables "slightly" more readable. (Damien Lespiau)
      - Updated tables fix performance regression.
      v4:
      - Code formatting. (Chris Wilson)
      - re-privatised mocs code. (Daniel Vetter)
      v5:
      - Changed the name of a function. (Chris Wilson)
      v6:
      - re-based
      - Added Mesa table entry (skylake & broxton) (Francisco Jerez)
      - Tidied up the readability defines (Francisco Jerez)
      - NUMBER of entries defines wrong. (Jim Bish)
      - Added comments to clear up the meaning of the tables (Jim Bish)
      Signed-off-by: NPeter Antoine <peter.antoine@intel.com>
      
      v7 (Francisco Jerez):
      - Don't write L3-specific MOCS_ESC/SCC values into the e/LLC control
        tables.  Prefix L3-specific defines consistently with L3_ and
        e/LLC-specific defines with LE_ to avoid this kind of confusion in
        the future.
      - Change L3CC WT define back to RESERVED (matches my hardware
        documentation and the original patch, probably a misunderstanding
        of my own previous comment).
      - Drop Android tables, define new minimal tables more suitable for the
        open source stack.
      - Add comment that the MOCS tables are part of the kernel ABI.
      - Move intel_logical_ring_begin() and _advance() calls one level down
        (Chris Wilson).
      - Minor formatting and style fixes.
      v8 (Francisco Jerez):
      - Add table size sanity check to emit_mocs_control/l3cc_table() (Chris
        Wilson).
      - Add comment about undefined entries being implicitly set to uncached
        for forwards compatibility.
      v9 (Francisco Jerez):
      - Minor style fixes.
      Signed-off-by: NFrancisco Jerez <currojerez@riseup.net>
      Acked-by: NDamien Lespiau <damien.lespiau@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      3bbaba0c
  11. 06 7月, 2015 2 次提交
  12. 23 6月, 2015 4 次提交
    • J
      drm/i915: Add *_ring_begin() to request allocation · ccd98fe4
      John Harrison 提交于
      Now that the *_ring_begin() functions no longer call the request allocation
      code, it is finally safe for the request allocation code to call *_ring_begin().
      This is important to guarantee that the space reserved for the subsequent
      i915_add_request() call does actually get reserved.
      
      v2: Renamed functions according to review feedback (Tomas Elf).
      
      For: VIZ-5115
      Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      ccd98fe4
    • J
      drm/i915: Update flush_all_caches() to take request structures · 4866d729
      John Harrison 提交于
      Updated the *_ring_flush_all_caches() functions to take requests instead of
      rings or ringbuf/context pairs.
      
      For: VIZ-5115
      Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
      Reviewed-by: NTomas Elf <tomas.elf@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      4866d729
    • J
      drm/i915: Merged the many do_execbuf() parameters into a structure · 5f19e2bf
      John Harrison 提交于
      The do_execbuf() function takes quite a few parameters. The actual set of
      parameters is going to change with the conversion to passing requests around.
      Further, it is due to grow massively with the arrival of the GPU scheduler.
      
      This patch simplifies the prototype by passing a parameter structure instead.
      Changing the parameter set in the future is then simply a matter of
      adding/removing items to the structure.
      
      Note that the structure does not contain absolutely everything that is passed
      in. This is because the intention is to use this structure more extensively
      later in this patch series and more especially in the GPU scheduler that is
      coming soon. The latter requires hanging on to the structure as the final
      hardware submission can be delayed until long after the execbuf IOCTL has
      returned to user land. Thus it is unsafe to put anything in the structure that
      is local to the IOCTL call itself - such as the 'args' parameter. All entries
      must be copies of data or pointers to structures that are reference counted in
      some way and guaranteed to exist for the duration of the batch buffer's life.
      
      v2: Rebased to newer tree and updated for changes to the command parser.
      Specifically, a code shuffle has required saving the batch start address in the
      params structure.
      
      For: VIZ-5115
      Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
      Reviewed-by: NTomas Elf <tomas.elf@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      5f19e2bf
    • J
      drm/i915: Set context in request from creation even in legacy mode · 40e895ce
      John Harrison 提交于
      In execlist mode, the context object pointer is written in to the request
      structure (and reference counted) at the point of request creation. In legacy
      mode, this only happens inside i915_add_request().
      
      This patch updates the legacy code path to match the execlist version. This
      allows all the intermediate code between request creation and request submission
      to get at the context object given only a request structure. Thus negating the
      need to pass context pointers here, there and everywhere.
      
      v2: Moved the context reference so it does not need to be undone if the
      get_seqno() fails.
      
      v3: Fixed execlist mode always hitting a warning about invalid last_contexts
      (which don't exist in execlist mode).
      
      v4: Updated for new i915_gem_request_alloc() scheme.
      
      For: VIZ-5115
      Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
      Reviewed-by: NTomas Elf <tomas.elf@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      40e895ce
  13. 01 4月, 2015 2 次提交
  14. 26 2月, 2015 1 次提交
    • J
      drm/i915: Rename 'flags' to 'dispatch_flags' for better code reading · 8e004efc
      John Harrison 提交于
      There is a flags word that is passed through the execbuffer code path all the
      way from initial decoding of the user parameters down to the very final dispatch
      buffer call. It is simply called 'flags'. Unfortuantely, there are many other
      flags words floating around in the same blocks of code. Even more once the GPU
      scheduler arrives.
      
      This patch makes it more obvious exactly which flags word is which by renaming
      'flags' to 'dispatch_flags'. Note that the bit definitions for this flags word
      already have an 'I915_DISPATCH_' prefix on them and so are not quite so
      ambiguous.
      
      OTC-Jira: VIZ-1587
      Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
      [danvet: Resolve conflict with Chris' rework of the bb parsing.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      8e004efc
  15. 24 2月, 2015 1 次提交
  16. 14 2月, 2015 3 次提交
  17. 27 1月, 2015 4 次提交
  18. 15 12月, 2014 1 次提交
  19. 20 11月, 2014 2 次提交
    • O
      drm/i915/bdw: Pin the context backing objects to GGTT on-demand · dcb4c12a
      Oscar Mateo 提交于
      Up until now, we have pinned every logical ring context backing object
      during creation, and left it pinned until destruction. This made my life
      easier, but it's a harmful thing to do, because we cause fragmentation
      of the GGTT (and, eventually, we would run out of space).
      
      This patch makes the pinning on-demand: the backing objects of the two
      contexts that are written to the ELSP are pinned right before submission
      and unpinned once the hardware is done with them. The only context that
      is still pinned regardless is the global default one, so that the HWS can
      still be accessed in the same way (ring->status_page).
      
      v2: In the early version of this patch, we were pinning the context as
      we put it into the ELSP: on the one hand, this is very efficient because
      only a maximum two contexts are pinned at any given time, but on the other
      hand, we cannot really pin in interrupt time :(
      
      v3: Use a mutex rather than atomic_t to protect pin count to avoid races.
      Do not unpin default context in free_request.
      
      v4: Break out pin and unpin into functions.  Fix style problems reported
      by checkpatch
      
      v5: Remove unpin_lock as all pinning and unpinning is done with the struct
      mutex already locked.  Add WARN_ONs to make sure this is the case in future.
      
      Issue: VIZ-4277
      Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
      Signed-off-by: NThomas Daniel <thomas.daniel@intel.com>
      Reviewed-by: NAkash Goel <akash.goels@gmail.com>
      Reviewed-by: Deepak S<deepak.s@linux.intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      dcb4c12a
    • T
      drm/i915/bdw: Clean up execlist queue items in retire_work · c86ee3a9
      Thomas Daniel 提交于
      No longer create a work item to clean each execlist queue item.
      Instead, move retired execlist requests to a queue and clean up the
      items during retire_requests.
      
      v2: Fix legacy ring path broken during overzealous cleanup
      
      v3: Update idle detection to take execlists queue into account
      
      v4: Grab execlist lock when checking queue state
      
      v5: Fix leaking requests by freeing in execlists_retire_requests.
      
      Issue: VIZ-4274
      Signed-off-by: NThomas Daniel <thomas.daniel@intel.com>
      Reviewed-by: NDeepak S <deepak.s@linux.intel.com>
      Reviewed-by: NAkash Goel <akash.goels@gmail.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      c86ee3a9
  20. 03 9月, 2014 1 次提交
    • O
      drm/i915/bdw: Render state init for Execlists · 564ddb2f
      Oscar Mateo 提交于
      The batchbuffer that sets the render context state is submitted
      in a different way, and from different places.
      
      We needed to make both the render state preparation and free functions
      outside accesible, and namespace accordingly. This mess is so that all
      LR, LRC and Execlists functionality can go together in intel_lrc.c: we
      can fix all of this later on, once the interfaces are clear.
      
      v2: Create a separate ctx->rcs_initialized for the Execlists case, as
      suggested by Chris Wilson.
      Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
      
      v3: Setup ring status page in lr_context_deferred_create when the
      default context is being created. This means that the render state
      init for the default context is no longer a special case.  Execute
      deferred creation of the default context at the end of
      logical_ring_init to allow the render state commands to be submitted.
      Fix style errors reported by checkpatch. Rebased.
      Signed-off-by: NThomas Daniel <thomas.daniel@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      564ddb2f
  21. 20 8月, 2014 2 次提交
  22. 15 8月, 2014 5 次提交
    • O
      drm/i915/bdw: Avoid non-lite-restore preemptions · e1fee72c
      Oscar Mateo 提交于
      In the current Execlists feeding mechanism, full preemption is not
      supported yet: only lite-restores are allowed (this is: the GPU
      simply samples a new tail pointer for the context currently in
      execution).
      
      But we have identified an scenario in which a full preemption occurs:
      1) We submit two contexts for execution (A & B).
      2) The GPU finishes with the first one (A), switches to the second one
      (B) and informs us.
      3) We submit B again (hoping to cause a lite restore) together with C,
      but in the time we spend writing to the ELSP, the GPU finishes B.
      4) The GPU start executing B again (since we told it so).
      5) We receive a B finished interrupt and, mistakenly, we submit C (again)
      and D, causing a full preemption of B.
      
      The race is avoided by keeping track of how many times a context has been
      submitted to the hardware and by better discriminating the received context
      switch interrupts: in the example, when we have submitted B twice, we won´t
      submit C and D as soon as we receive the notification that B is completed
      because we were expecting to get a LITE_RESTORE and we didn´t, so we know a
      second completion will be received shortly.
      
      Without this explicit checking, somehow, the batch buffer execution order
      gets messed with. This can be verified with the IGT test I sent together with
      the series. I don´t know the exact mechanism by which the pre-emption messes
      with the execution order but, since other people is working on the Scheduler
      + Preemption on Execlists, I didn´t try to fix it. In these series, only Lite
      Restores are supported (other kind of preemptions WARN).
      
      v2: elsp_submitted belongs in the new intel_ctx_submit_request. Several
      rebase changes.
      
      v3: Clarify how the race is avoided, as requested by Daniel.
      Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
      Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com>
      [danvet: Align function parameters ...]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      e1fee72c
    • T
      drm/i915/bdw: Handle context switch events · e981e7b1
      Thomas Daniel 提交于
      Handle all context status events in the context status buffer on every
      context switch interrupt. We only remove work from the execlist queue
      after a context status buffer reports that it has completed and we only
      attempt to schedule new contexts on interrupt when a previously submitted
      context completes (unless no contexts are queued, which means the GPU is
      free).
      
      We canot call intel_runtime_pm_get() in an interrupt (or with a spinlock
      grabbed, FWIW), because it might sleep, which is not a nice thing to do.
      Instead, do the runtime_pm get/put together with the create/destroy request,
      and handle the forcewake get/put directly.
      Signed-off-by: NThomas Daniel <thomas.daniel@intel.com>
      
      v2: Unreferencing the context when we are freeing the request might free
      the backing bo, which requires the struct_mutex to be grabbed, so defer
      unreferencing and freeing to a bottom half.
      
      v3:
      - Ack the interrupt inmediately, before trying to handle it (fix for
      missing interrupts by Bob Beckett <robert.beckett@intel.com>).
      - Update the Context Status Buffer Read Pointer, just in case (spotted
      by Damien Lespiau).
      
      v4: New namespace and multiple rebase changes.
      
      v5: Squash with "drm/i915/bdw: Do not call intel_runtime_pm_get() in an
      interrupt", as suggested by Daniel.
      Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
      Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com>
      [danvet: Checkpatch ...]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      e981e7b1
    • M
      drm/i915/bdw: Two-stage execlist submit process · acdd884a
      Michel Thierry 提交于
      Context switch (and execlist submission) should happen only when
      other contexts are not active, otherwise pre-emption occurs.
      
      To assure this, we place context switch requests in a queue and those
      request are later consumed when the right context switch interrupt is
      received (still TODO).
      
      v2: Use a spinlock, do not remove the requests on unqueue (wait for
      context switch completion).
      Signed-off-by: NThomas Daniel <thomas.daniel@intel.com>
      
      v3: Several rebases and code changes. Use unique ID.
      
      v4:
      - Move the queue/lock init to the late ring initialization.
      - Damien's kmalloc review comments: check return, use sizeof(*req),
      do not cast.
      
      v5:
      - Do not reuse drm_i915_gem_request. Instead, create our own.
      - New namespace.
      
      Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v1)
      Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> (v2-v5)
      Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com>
      [davnet: Checkpatch + wash-up s/BUG_ON/WARN_ON/.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      acdd884a
    • B
      drm/i915/bdw: Implement context switching (somewhat) · 84b790f8
      Ben Widawsky 提交于
      A context switch occurs by submitting a context descriptor to the
      ExecList Submission Port. Given that we can now initialize a context,
      it's possible to begin implementing the context switch by creating the
      descriptor and submitting it to ELSP (actually two, since the ELSP
      has two ports).
      
      The context object must be mapped in the GGTT, which means it must exist
      in the 0-4GB graphics VA range.
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      
      v2: This code has changed quite a lot in various rebases. Of particular
      importance is that now we use the globally unique Submission ID to send
      to the hardware. Also, context pages are now pinned unconditionally to
      GGTT, so there is no need to bind them.
      
      v3: Use LRCA[31:12] as hwCtxId[19:0]. This guarantees that the HW context
      ID we submit to the ELSP is globally unique and != 0 (Bspec requirements
      of the software use-only bits of the Context ID in the Context Descriptor
      Format) without the hassle of the previous submission Id construction.
      Also, re-add the ELSP porting read (it was dropped somewhere during the
      rebases).
      
      v4:
      - Squash with "drm/i915/bdw: Add forcewake lock around ELSP writes" (BSPEC
        says: "SW must set Force Wakeup bit to prevent GT from entering C6 while
        ELSP writes are in progress") as noted by Thomas Daniel
        (thomas.daniel@intel.com).
      - Rename functions and use an execlists/intel_execlists_ namespace.
      - The BUG_ON only checked that the LRCA was <32 bits, but it didn't make
        sure that it was properly aligned. Spotted by Alistair Mcaulay
        <alistair.mcaulay@intel.com>.
      
      v5:
      - Improved source code comments as suggested by Chris Wilson.
      - No need to abstract submit_ctx away, as pointed by Brad Volkin.
      Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
      Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com>
      [danvet: Checkpatch. Sigh.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      84b790f8
    • O
      drm/i915/bdw: Emission of requests with logical rings · 48e29f55
      Oscar Mateo 提交于
      On a previous iteration of this patch, I created an Execlists
      version of __i915_add_request and asbtracted it away as a
      vfunc. Daniel Vetter wondered then why that was needed:
      
      "with the clean split in command submission I expect every
      function to know wether it'll submit to an lrc (everything in
      intel_lrc.c) or wether it'll submit to a legacy ring (existing
      code), so I don't see a need for an add_request vfunc."
      
      The honest, hairy truth is that this patch is the glue keeping
      the whole logical ring puzzle together:
      
      - i915_add_request is used by intel_ring_idle, which in turn is
        used by i915_gpu_idle, which in turn is used in several places
        inside the eviction and gtt codes.
      - Also, it is used by i915_gem_check_olr, which is littered all
        over i915_gem.c
      - ...
      
      If I were to duplicate all the code that directly or indirectly
      uses __i915_add_request, I'll end up creating a separate driver.
      
      To show the differences between the existing legacy version and
      the new Execlists one, this time I have special-cased
      __i915_add_request instead of adding an add_request vfunc. I
      hope this helps to untangle this Gordian knot.
      Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
      Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com>
      [danvet: Adjust to ringbuf->FIXME_lrc_ctx per the discussion with
      Thomas Daniel.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      48e29f55