1. 20 Aug, 2014 2 commits
  2. 15 Aug, 2014 5 commits
    • O
      drm/i915/bdw: Avoid non-lite-restore preemptions · e1fee72c
      Oscar Mateo committed
      In the current Execlists feeding mechanism, full preemption is not
      supported yet: only lite-restores are allowed (this is: the GPU
      simply samples a new tail pointer for the context currently in
      execution).
      
      But we have identified a scenario in which a full preemption occurs:
      1) We submit two contexts for execution (A & B).
      2) The GPU finishes with the first one (A), switches to the second one
      (B) and informs us.
      3) We submit B again (hoping to cause a lite restore) together with C,
      but in the time we spend writing to the ELSP, the GPU finishes B.
      4) The GPU starts executing B again (since we told it so).
      5) We receive a B finished interrupt and, mistakenly, we submit C (again)
      and D, causing a full preemption of B.
      
      The race is avoided by keeping track of how many times a context has been
      submitted to the hardware and by better discriminating the received context
      switch interrupts: in the example, when we have submitted B twice, we won't
      submit C and D as soon as we receive the notification that B is completed
      because we were expecting to get a LITE_RESTORE and we didn't, so we know a
      second completion will be received shortly.
      
      Without this explicit checking, somehow, the batch buffer execution order
      gets messed up. This can be verified with the IGT test I sent together with
      the series. I don't know the exact mechanism by which the preemption messes
      with the execution order but, since other people are working on the Scheduler
      + Preemption on Execlists, I didn't try to fix it. In this series, only Lite
      Restores are supported (other kinds of preemption WARN).
      
      v2: elsp_submitted belongs in the new intel_ctx_submit_request. Several
      rebase changes.
      
      v3: Clarify how the race is avoided, as requested by Daniel.
      Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
      Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
      [danvet: Align function parameters ...]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      e1fee72c
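The bookkeeping described in the e1fee72c message can be sketched in C roughly as follows. This is a simplified, hypothetical model and not the driver's code: only the `elsp_submitted` counter name comes from the commit message, while `struct submit_request`, the `ctx_event` values and both helper functions are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical event kinds reported by a context switch interrupt. */
enum ctx_event { CTX_LITE_RESTORE, CTX_COMPLETE };

/* Hypothetical, minimal stand-in for a queued submission request. */
struct submit_request {
	int ctx_id;          /* which context this request belongs to */
	int elsp_submitted;  /* ELSP writes not yet accounted for by an event */
};

/* Call each time the request's context is written to the ELSP. */
static void note_elsp_submit(struct submit_request *req)
{
	req->elsp_submitted++;
}

/*
 * Account for one context switch event. Returns true only when no
 * submission of this context is still outstanding, i.e. when it is safe
 * to retire the request and feed new contexts to the hardware.
 */
static bool account_ctx_event(struct submit_request *req, enum ctx_event ev)
{
	assert(req->elsp_submitted > 0);
	req->elsp_submitted--;

	/* A lite restore only means the resubmission was absorbed. */
	if (ev == CTX_LITE_RESTORE)
		return false;

	return req->elsp_submitted == 0;
}
```

In the racy scenario, B has been submitted twice, so the first completion event leaves the counter at one and the function returns false: C and D are held back until the second completion arrives.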
    • T
      drm/i915/bdw: Handle context switch events · e981e7b1
      Thomas Daniel committed
      Handle all context status events in the context status buffer on every
      context switch interrupt. We only remove work from the execlist queue
      after a context status buffer reports that it has completed and we only
      attempt to schedule new contexts on interrupt when a previously submitted
      context completes (unless no contexts are queued, which means the GPU is
      free).
      
      We cannot call intel_runtime_pm_get() in an interrupt (or with a spinlock
      grabbed, FWIW), because it might sleep, which is not a nice thing to do.
      Instead, do the runtime_pm get/put together with the create/destroy request,
      and handle the forcewake get/put directly.
      Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
      
      v2: Unreferencing the context when we are freeing the request might free
      the backing bo, which requires the struct_mutex to be grabbed, so defer
      unreferencing and freeing to a bottom half.
      
      v3:
      - Ack the interrupt immediately, before trying to handle it (fix for
      missing interrupts by Bob Beckett <robert.beckett@intel.com>).
      - Update the Context Status Buffer Read Pointer, just in case (spotted
      by Damien Lespiau).
      
      v4: New namespace and multiple rebase changes.
      
      v5: Squash with "drm/i915/bdw: Do not call intel_runtime_pm_get() in an
      interrupt", as suggested by Daniel.
      Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
      Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
      [danvet: Checkpatch ...]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      e981e7b1
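Handling every event on each interrupt, as the e981e7b1 message describes, amounts to walking a circular status buffer from the last entry already handled up to the last entry written. The sketch below illustrates only that walk. The buffer size `CSB_ENTRIES`, the `EVENT_COMPLETE` bit and the function name are assumptions for this example, not values taken from the commit or from the hardware documentation.

```c
#include <assert.h>
#include <stdint.h>

#define CSB_ENTRIES    6u    /* assumed buffer size for this sketch */
#define EVENT_COMPLETE 0x1u  /* hypothetical "context completed" status bit */

/*
 * Process every status entry written since the previous interrupt.
 * *read_ptr is the index of the last entry already handled and write_ptr
 * the index of the last entry written; both wrap around the buffer.
 * Returns how many of the newly handled entries reported a completion,
 * and leaves *read_ptr equal to write_ptr.
 */
static unsigned drain_status_buffer(const uint32_t csb[CSB_ENTRIES],
				    unsigned *read_ptr, unsigned write_ptr)
{
	unsigned completed = 0;

	assert(*read_ptr < CSB_ENTRIES && write_ptr < CSB_ENTRIES);

	while (*read_ptr != write_ptr) {
		*read_ptr = (*read_ptr + 1) % CSB_ENTRIES;
		if (csb[*read_ptr] & EVENT_COMPLETE)
			completed++;
	}
	return completed;
}
```

A caller would attempt to schedule new contexts only when the returned count is non-zero, which mirrors the rule that new work is submitted on interrupt when a previously submitted context completes.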
    • M
      drm/i915/bdw: Two-stage execlist submit process · acdd884a
      Michel Thierry committed
      Context switch (and execlist submission) should happen only when
      other contexts are not active, otherwise pre-emption occurs.
      
      To ensure this, we place context switch requests in a queue, and those
      requests are later consumed when the right context switch interrupt is
      received (still TODO).
      
      v2: Use a spinlock, do not remove the requests on unqueue (wait for
      context switch completion).
      Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
      
      v3: Several rebases and code changes. Use unique ID.
      
      v4:
      - Move the queue/lock init to the late ring initialization.
      - Damien's kmalloc review comments: check return, use sizeof(*req),
      do not cast.
      
      v5:
      - Do not reuse drm_i915_gem_request. Instead, create our own.
      - New namespace.
      
      Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v1)
      Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> (v2-v5)
      Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
      [danvet: Checkpatch + wash-up s/BUG_ON/WARN_ON/.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      acdd884a
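The two-stage process from the acdd884a message (queue first, submit later, remove only on completion) might look like the following minimal sketch. It is single-threaded and leaves out the spinlock mentioned in v2. The structure, the capacity `QUEUE_MAX` and all function names are hypothetical; only the two-port ELSP comes from the commit messages in this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_MAX 8  /* arbitrary capacity for this sketch */
#define NUM_PORTS 2  /* the ELSP has two ports */

/* Hypothetical FIFO of pending context switch requests. */
struct execlist_queue {
	int ctx_ids[QUEUE_MAX];
	size_t count;
};

/* Stage 1: queue a context switch request. Returns false if full. */
static bool queue_request(struct execlist_queue *q, int ctx_id)
{
	if (q->count == QUEUE_MAX)
		return false;
	q->ctx_ids[q->count++] = ctx_id;
	return true;
}

/*
 * Stage 2: copy up to two requests from the head of the queue into the
 * ports. The requests stay queued until the hardware reports completion.
 * Returns the number of ports filled.
 */
static size_t unqueue_to_ports(const struct execlist_queue *q,
			       int ports[NUM_PORTS])
{
	size_t n = q->count < NUM_PORTS ? q->count : NUM_PORTS;

	for (size_t i = 0; i < n; i++)
		ports[i] = q->ctx_ids[i];
	return n;
}

/* Remove the head request once its context is reported complete. */
static bool complete_head(struct execlist_queue *q, int ctx_id)
{
	if (q->count == 0 || q->ctx_ids[0] != ctx_id)
		return false;
	for (size_t i = 1; i < q->count; i++)
		q->ctx_ids[i - 1] = q->ctx_ids[i];
	q->count--;
	return true;
}
```

Keeping requests in the queue after `unqueue_to_ports` is what lets a later completion event be matched against the head before anything new is written to the hardware.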
    • B
      drm/i915/bdw: Implement context switching (somewhat) · 84b790f8
      Ben Widawsky committed
      A context switch occurs by submitting a context descriptor to the
      ExecList Submission Port. Given that we can now initialize a context,
      it's possible to begin implementing the context switch by creating the
      descriptor and submitting it to ELSP (actually two, since the ELSP
      has two ports).
      
      The context object must be mapped in the GGTT, which means it must exist
      in the 0-4GB graphics VA range.
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      
      v2: This code has changed quite a lot in various rebases. Of particular
      importance is that now we use the globally unique Submission ID to send
      to the hardware. Also, context pages are now pinned unconditionally to
      GGTT, so there is no need to bind them.
      
      v3: Use LRCA[31:12] as hwCtxId[19:0]. This guarantees that the HW context
      ID we submit to the ELSP is globally unique and != 0 (Bspec requirements
      of the software use-only bits of the Context ID in the Context Descriptor
      Format) without the hassle of the previous submission Id construction.
      Also, re-add the ELSP posting read (it was dropped somewhere during the
      rebases).
      
      v4:
      - Squash with "drm/i915/bdw: Add forcewake lock around ELSP writes" (BSPEC
        says: "SW must set Force Wakeup bit to prevent GT from entering C6 while
        ELSP writes are in progress") as noted by Thomas Daniel
        (thomas.daniel@intel.com).
      - Rename functions and use an execlists/intel_execlists_ namespace.
      - The BUG_ON only checked that the LRCA was <32 bits, but it didn't make
        sure that it was properly aligned. Spotted by Alistair Mcaulay
        <alistair.mcaulay@intel.com>.
      
      v5:
      - Improved source code comments as suggested by Chris Wilson.
      - No need to abstract submit_ctx away, as pointed out by Brad Volkin.
      Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
      Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
      [danvet: Checkpatch. Sigh.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      84b790f8
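The v3 and v4 notes of 84b790f8 describe deriving the hardware context ID from LRCA[31:12] and checking both the width and the alignment of the LRCA. A minimal sketch of that arithmetic follows. The function names are hypothetical, and 4 KiB alignment (bits 11:0 clear) is an assumption inferred from the bit range, not a statement copied from the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if the LRCA fits in 32 bits and is 4 KiB aligned (bits 11:0 clear). */
static bool lrca_is_valid(uint64_t lrca)
{
	return lrca <= UINT32_MAX && (lrca & 0xfffu) == 0;
}

/* hwCtxId[19:0] = LRCA[31:12]; the caller must pass a valid LRCA. */
static uint32_t hw_ctx_id_from_lrca(uint64_t lrca)
{
	assert(lrca_is_valid(lrca));
	return (uint32_t)(lrca >> 12) & 0xfffffu;
}
```

Because two distinct page-aligned addresses below 4 GB always differ in bits 31:12, the resulting 20-bit IDs are distinct as long as the context objects do not share a GGTT page.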
    • O
      drm/i915/bdw: Emission of requests with logical rings · 48e29f55
      Oscar Mateo committed
      On a previous iteration of this patch, I created an Execlists
      version of __i915_add_request and abstracted it away as a
      vfunc. Daniel Vetter wondered then why that was needed:
      
      "with the clean split in command submission I expect every
      function to know whether it'll submit to an lrc (everything in
      intel_lrc.c) or whether it'll submit to a legacy ring (existing
      code), so I don't see a need for an add_request vfunc."
      
      The honest, hairy truth is that this patch is the glue keeping
      the whole logical ring puzzle together:
      
      - i915_add_request is used by intel_ring_idle, which in turn is
        used by i915_gpu_idle, which in turn is used in several places
        inside the eviction and gtt codes.
      - Also, it is used by i915_gem_check_olr, which is littered all
        over i915_gem.c
      - ...
      
      If I were to duplicate all the code that directly or indirectly
      uses __i915_add_request, I would end up creating a separate driver.
      
      To show the differences between the existing legacy version and
      the new Execlists one, this time I have special-cased
      __i915_add_request instead of adding an add_request vfunc. I
      hope this helps to untangle this Gordian knot.
      Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
      Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
      [danvet: Adjust to ringbuf->FIXME_lrc_ctx per the discussion with
      Thomas Daniel.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      48e29f55
  3. 12 Aug, 2014 1 commit
    • O
      drm/i915/bdw: New logical ring submission mechanism · 82e104cc
      Oscar Mateo committed
      Well, new-ish: if all this code looks familiar, that's because it's
      a clone of the existing submission mechanism (with some modifications
      here and there to adapt it to LRCs and Execlists).
      
      And why did we do this instead of reusing code, one might wonder?
      Well, there are some fears that the differences are big enough that
      they will end up breaking all platforms.
      
      Also, Execlists offer several advantages, like control over when the
      GPU is done with a given workload, that can help simplify the
      submission mechanism, no doubt. I am interested in getting Execlists
      to work first and foremost, but in the future this parallel submission
      mechanism will help us to fine tune the mechanism without affecting
      old gens.
      
      v2: Pass the ringbuffer only (whenever possible).
      Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
      Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
      [danvet: Appease checkpatch. Again. And drop the legacy sarea gunk
      that somehow crept in.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      82e104cc
  4. 11 Aug, 2014 4 commits
    • O
      drm/i915/bdw: Skeleton for the new logical rings submission path · 454afebd
      Oscar Mateo committed
      Execlists are indeed a brave new world with respect to workload
      submission to the GPU.
      
      In previous versions of this series, I tried to impact the
      legacy ringbuffer submission path as little as possible (mostly,
      passing the context around and using the correct ringbuffer when I
      needed one) but Daniel is afraid (probably with good reason) that
      these changes and, especially, future ones, will end up breaking
      older gens.
      
      This commit and some others coming next will try to limit the
      damage by creating an alternative path for workload submission.
      The first step is here: laying out a new ring init/fini.
      Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
      Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      454afebd
    • O
      drm/i915/bdw: Initialization for Logical Ring Contexts · ede7d42b
      Oscar Mateo committed
      For the moment this is just a placeholder, but it shows one of the
      main differences between the good ol' HW contexts and the shiny
      new Logical Ring Contexts: LR contexts allocate and free their
      own backing objects. Another difference is that the allocation is
      deferred (as the create function name suggests), but that does not
      happen in this patch yet, because for the moment we are only dealing
      with the default context.
      
      Early in the series we had our own gen8_gem_context_init/fini
      functions, but the truth is they now look almost the same as the
      legacy hw context init/fini functions. We can always split them
      later if this ceases to be the case.
      
      Also, we do not fall back to legacy ringbuffers when logical ring
      context initialization fails (not very likely to happen and, even
      if it does, hw contexts would probably fail as well).
      
      v2: Daniel says "explain, do not showcase".
      Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
      Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
      [danvet: s/BUG_ON/WARN_ON/.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      ede7d42b
    • O
      drm/i915/bdw: Macro for LRCs and module option for Execlists · 127f1003
      Oscar Mateo committed
      GEN8 brings an expansion of the HW contexts: "Logical Ring Contexts".
      These expanded contexts enable a number of new abilities, especially
      "Execlists".
      
      The macro evaluates to off until enough is in place for Execlists to
      have a chance of working.
      
      v2: Rename "advanced contexts" to the more correct "logical ring
      contexts".
      
      v3: Add a module parameter to enable execlists. Execlists are relatively
      new, and so it'd be wise to be able to switch back to ring submission
      to debug subtle problems that will inevitably arise.
      
      v4: Add an intel_enable_execlists function.
      
      v5: Sanitize early, as suggested by Daniel. Remove lrc_enabled.
      
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
      Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> (v3)
      Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> (v2, v4 & v5)
      Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      127f1003
    • O
      drm/i915/bdw: New source and header file for LRs, LRCs and Execlists · b20385f1
      Oscar Mateo committed
      Some legacy HW context code assumptions don't make sense for this new
      submission method, so we will place this stuff in a separate file.
      
      Note for reviewers: I've carefully considered the best name for this file
      and this was my best option (other possibilities were intel_lr_context.c
      or intel_execlist.c). I am open to a certain bikeshedding on this matter,
      anyway.
      
      At some point in time, it would be a good idea to split intel_lrc.c/.h
      even further, but for the moment just shove everything together.
      
      v2: Change to intel_lrc.c
      
      v3: Squash together with the header file addition
      Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
      Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      b20385f1
  5. 21 Mar, 2012 1 commit
  6. 26 May, 2011 1 commit
  7. 03 Mar, 2011 1 commit
  8. 06 Oct, 2010 1 commit
  9. 13 Sep, 2010 1 commit
  10. 08 Sep, 2009 1 commit
    • J
      drm/radeon/kms: add r600 KMS support · 3ce0a23d
      Jerome Glisse committed
      This adds the r600 KMS + CS support to the Linux kernel.
      
      The r600 TTM support is quite basic and still needs more
      work, especially around using interrupts, but the polled fencing
      should work okay for now.
      
      Also currently TTM is using memcpy to do VRAM moves,
      the code is here to use a 3D blit to do this, but
      isn't fully debugged yet.
      
      Authors:
      Alex Deucher <alexdeucher@gmail.com>
      Dave Airlie <airlied@redhat.com>
      Jerome Glisse <jglisse@redhat.com>
      Signed-off-by: Jerome Glisse <jglisse@redhat.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      3ce0a23d