1. 31 3月, 2014 1 次提交
  2. 08 3月, 2014 1 次提交
    • B
      drm/i915: Implement command buffer parsing logic · 351e3db2
      Brad Volkin 提交于
      The command parser scans batch buffers submitted via execbuffer ioctls before
      the driver submits them to hardware. At a high level, it looks for several
      things:
      
      1) Commands which are explicitly defined as privileged or which should only be
         used by the kernel driver. The parser generally rejects such commands, with
         the provision that it may allow some from the drm master process.
      2) Commands which access registers. To support correct/enhanced userspace
         functionality, particularly certain OpenGL extensions, the parser provides a
         whitelist of registers which userspace may safely access (for both normal and
         drm master processes).
      3) Commands which access privileged memory (i.e. GGTT, HWS page, etc). The
         parser always rejects such commands.
      
      See the overview comment in the source for more details.
      
      This patch only implements the logic. Subsequent patches will build the tables
      that drive the parser.
      
      v2: Don't set the secure bit if the parser succeeds
      Fail harder during init
      Makefile cleanup
      Kerneldoc cleanup
      Clarify module param description
      Convert ints to bools in a few places
      Move client/subclient defs to i915_reg.h
      Remove the bits_count field
      
      OTC-Tracker: AXIA-4631
      Change-Id: I50b98c71c6655893291c78a2d1b8954577b37a30
      Signed-off-by: NBrad Volkin <bradley.d.volkin@intel.com>
      Reviewed-by: NJani Nikula <jani.nikula@intel.com>
      [danvet: Appease checkpatch.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      351e3db2
  3. 14 2月, 2014 3 次提交
    • D
      drm/i915: Only bind each object rather than for every execbuffer · 8ea99c92
      Daniel Vetter 提交于
      One side-effect of the introduction of ppgtt was that we needed to
      rebind the object into the appropriate vm (and global gtt in some
      peculiar cases). For simplicity this was done twice for every object on
      every call to execbuffer. However, that adds a tremendous amount of CPU
      overhead (rewriting all the PTE for all objects into WC memory) per
      draw. The fix is to push all the decision about which vm to bind into
      and when down into the low-level bind routines through hints rather than
      performing the bind unconditionally in the execbuffer routine.
      
      Note that this is a regression introduced in the full ppgtt feature
      branch, before this we've only done re-bound objects when the relevant
      has_(aliasing_ppgtt|global_gtt)_mapping flag was clear. But since
      that's per-object and not per-vma that optimization broke.
      
      v2: Split out prep work and unrelated changes.
      
      v3: Bring back functional change around PIN_GLOBAL that I've
      accidentally split out.
      
      v4: Remove the temporary hack for the old binding logic to avoid
      bisection issues.
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72906
      Tested-by: jianx.zhou@intel.com
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
      Cc: Ben Widawsky <benjamin.widawsky@intel.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Acked-by: NBen Widawsky <ben@bwidawsk.net>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      8ea99c92
    • D
      drm/i915: split PIN_GLOBAL out from PIN_MAPPABLE · bf3d149b
      Daniel Vetter 提交于
      With abitrary pin flags it makes sense to split out a "please bind
      this into global gtt" from the "please allocate in the mappable
      range".
      
      Use this unconditionally in our global gtt pin helper since this is
      what its callers want. Later patches will drop PIN_MAPPABLE where it's
      not strictly needed.
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      bf3d149b
    • D
      drm/i915: Consolidate binding parameters into flags · 1ec9e26d
      Daniel Vetter 提交于
      Anything more than just one bool parameter is just a pain to read,
      symbolic constants are much better.
      
      Split out from Chris' vma-binding rework patch.
      
      v2: Undo the behaviour change in object_pin that Chris spotted.
      
      v3: Split out misplaced hunk to handle set_cache_level errors,
      spotted by Jani.
      
      v4: Keep the current over-zealous binding logic in the execbuffer code
      working with a quick hack while the overall binding code gets shuffled
      around.
      
      v5: Reorder the PIN_ flags for more natural patch splitup.
      
      v6: Pull out the PIN_GLOBAL split-up again.
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Ben Widawsky <benjamin.widawsky@intel.com>
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      1ec9e26d
  4. 28 1月, 2014 1 次提交
    • J
      drm/i915: move module parameters into a struct, in a new file · d330a953
      Jani Nikula 提交于
      With 20+ module parameters, I think referring to them via a struct
      improves clarity over just having a bunch of globals. While at it, move
      the parameter initialization and definitions into a new file
      i915_params.c to reduce clutter in i915_drv.c.
      
      Apart from the ill-named i915_enable_rc6, i915_enable_fbc and
      i915_enable_ppgtt parameters, for which we lose the "i915_" prefix
      internally, the module parameters now look the same both on the kernel
      command line and in code. For example, "i915.modeset".
      
      The downsides of the change are losing static on a couple of variables
      and not having the initialization and module_param_named() right next to
      each other. On the other hand, all module parameters are now defined in
      one place at i915_params.c. Plus you can do this to find all module
      parameter references:
      
      $ git grep "i915\." -- drivers/gpu/drm/i915
      
      v2:
      - move the definitions into a new file
      - s/i915_params/i915/
      - make i915_try_reset i915.reset, for consistency
      Signed-off-by: NJani Nikula <jani.nikula@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      d330a953
  5. 27 1月, 2014 1 次提交
  6. 22 1月, 2014 1 次提交
  7. 07 1月, 2014 1 次提交
  8. 19 12月, 2013 3 次提交
  9. 18 12月, 2013 7 次提交
    • B
      drm/i915: Use multiple VMs -- the point of no return · 7e0d96bc
      Ben Widawsky 提交于
      As with processes which run on the CPU, the goal of multiple VMs is to
      provide process isolation. Specific to GEN, there is also the ability to
      map more objects per process (2GB each instead of 2Gb-2k total).
      
      For the most part, all the pipes have been laid, and all we need to do
      is remove asserts and actually start changing address spaces with the
      context switch. Since prior to this we've converted the setting of the
      page tables to a streamed version, this is quite easy.
      
      One important thing to point out (since it'd been hotly contested) is
      that with this patch, every context created will have it's own address
      space (provided the HW can do it).
      
      v2: Disable BDW on rebase
      
      NOTE: I tried to make this commit as small as possible. I needed one
      place where I could "turn everything on" and that is here. It could be
      split into finer commits, but I didn't really see much point.
      
      Cc: Eric Anholt <eric@anholt.net>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      7e0d96bc
    • B
      drm/i915: Get context early in execbuf · 41bde553
      Ben Widawsky 提交于
      We need to have the address space when reserving space for the objects.
      Since the address space and context are tied together, and reserve
      occurs before context switch (for good reason), we must lookup our
      context earlier in the process.
      
      This leaves some room for optimizations where we no longer need to use
      ctx_id in certain places. This will be addressed in a subsequent patch.
      
      Important tricky bit:
      Because slow relocations during execbuffer drop struct_mutex
      
      Perhaps it would be best to acquire the reference when we get the
      context, but I'll save that for another day (note I have written the
      patch before, and I found the changes required to be uglier than this).
      
      Note that since we currently access everything via context id, and not
      the data structure this is fine, though not desirable. The next change
      attempts to get the context only once via the context ID idr lookup, and
      as such, the following can happen:
      
      CTX-A is created, refcount = 1
      CTX-A execbuf, mutex dropped
      close IOCTL called on CTX-A, refcount = 0
      CTX-A resumes in execbuf.
      
      v2: Rebased on top of
      commit b6359918
      Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Date:   Wed Oct 30 15:44:16 2013 +0200
      
          drm/i915: add i915_get_reset_stats_ioctl
      
      v3: Rebased on top of
      commit 25b3dfc8
      Author: Mika Westerberg <mika.westerberg@linux.intel.com>
      Date:   Tue Nov 12 11:57:30 2013 +0200
      
      Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Date:   Tue Nov 26 16:14:33 2013 +0200
      
          drm/i915: check context reset stats before relocations
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      41bde553
    • B
      drm/i915: Permit contexts on all rings · 67e3d297
      Ben Widawsky 提交于
      If we want to use contexts in more abstract terms (specifically with
      PPGTT in mind), we need to allow them to be specified for any ring.
      
      Since the upcoming patches will bring about the use of multiple address
      spaces, and each ring needs to have an address space programmed (which
      we intend to do at context switch time), we can no longer only use RCS.
      
      With multiple rings having a last context, we must now unreference these
      contexts.
      
      NOTE: This commit requires an update to intel-gpu-tools to make it not
      fail.
      
      v2: Rebased with some logical conflicts.
      Squashed in the context fini refcount patch
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      67e3d297
    • B
      drm/i915: Simplify ring handling in execbuf · ca01b12b
      Ben Widawsky 提交于
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      ca01b12b
    • B
      drm/i915: Remove vm arg from relocate entry · 3e7a0322
      Ben Widawsky 提交于
      The only place we were using it was for GEN6, which won't have PPGTT
      support anyway (ie. the VM is always the same). To clear things up,
      (it only added confusion for me since it doesn't allow us to assert
      vma->vm is what we always want, when just looking at the code).
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      3e7a0322
    • B
      drm/i915: Create bind/unbind abstraction for VMAs · 6f65e29a
      Ben Widawsky 提交于
      To sum up what goes on here, we abstract the vma binding, similarly to
      the previous object binding. This helps for distinguishing legacy
      binding, versus modern binding. To keep the code churn as minimal as
      possible, I am leaving in insert_entries(). It serves as the per
      platform pte writing basically. bind_vma and insert_entries do share a
      lot of similarities, and I did have designs to combine the two, but as
      mentioned already... too much churn in an already massive patchset.
      
      What follows are the 3 commits which existed discretely in the original
      submissions. Upon rebasing on Broadwell support, it became clear that
      separation was not good, and only made for more error prone code. Below
      are the 3 commit messages with all their history.
      
      drm/i915: Add bind/unbind object functions to VMA
      drm/i915: Use the new vm [un]bind functions
      drm/i915: reduce vm->insert_entries() usage
      
      drm/i915: Add bind/unbind object functions to VMA
      
      As we plumb the code with more VM information, it has become more
      obvious that the easiest way to deal with bind and unbind is to simply
      put the function pointers in the vm, and let those choose the correct
      way to handle the page table updates. This change allows many places in
      the code to simply be vm->bind, and not have to worry about
      distinguishing PPGTT vs GGTT.
      
      Notice that this patch has no impact on functionality. I've decided to
      save the actual change until the next patch because I think it's easier
      to review that way. I'm happy to squash the two, or let Daniel do it on
      merge.
      
      v2:
      Make ggtt handle the quirky aliasing ppgtt
      Add flags to bind object to support above
      Don't ever call bind/unbind directly for PPGTT until we have real, full
      PPGTT (use NULLs to assert this)
      Make sure we rebind the ggtt if there already is a ggtt binding.  This
      happens on set cache levels.
      Use VMA for bind/unbind (Daniel, Ben)
      
      v3: Reorganize ggtt_vma_bind to be more concise and easier to read
      (Ville). Change logic in unbind to only unbind ggtt when there is a
      global mapping, and to remove a redundant check if the aliasing ppgtt
      exists.
      
      v4: Make the bind function a bit smarter about the cache levels to avoid
      unnecessary multiple remaps. "I accept it is a wart, I think unifying
      the pin_vma / bind_vma could be unified later" (Chris)
      Removed the git notes, and put version info here. (Daniel)
      
      v5: Update the comment to not suck (Chris)
      
      v6:
      Move bind/unbind to the VMA. It makes more sense in the VMA structure
      (always has, but I was previously lazy). With this change, it will allow
      us to keep a distinct insert_entries.
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      
      drm/i915: Use the new vm [un]bind functions
      
      Building on the last patch which created the new function pointers in
      the VM for bind/unbind, here we actually put those new function pointers
      to use.
      
      Split out as a separate patch to aid in review. I'm fine with squashing
      into the previous patch if people request it.
      
      v2: Updated to address the smart ggtt which can do aliasing as needed
      Make sure we bind to global gtt when mappable and fenceable. I thought
      we could get away without this initialy, but we cannot.
      
      v3: Make the global GTT binding explicitly use the ggtt VM for
      bind_vma(). While at it, use the new ggtt_vma helper (Chris)
      
      At this point the original mailing list thread diverges. ie.
      
      v4^:
      use target_obj instead of obj for gen6 relocate_entry
      vma->bind_vma() can be called safely during pin. So simply do that
      instead of the complicated conditionals.
      Don't restore PPGTT bound objects on resume path
      Bug fix in resume path for globally bound Bos
      Properly handle secure dispatch
      Rebased on vma bind/unbind conversion
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      
      drm/i915: reduce vm->insert_entries() usage
      
      FKA: drm/i915: eliminate vm->insert_entries()
      
      With bind/unbind function pointers in place, we no longer need
      insert_entries. We could, and want, to remove clear_range, however it's
      not totally easy at this point. Since it's used in a couple of place
      still that don't only deal in objects: setup, ppgtt init, and restore
      gtt mappings.
      
      v2: Don't actually remove insert_entries, just limit its usage. It will
      be useful when we introduce gen8. It will always be called from the vma
      bind/unbind.
      
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      6f65e29a
    • B
      drm/i915: Make pin count per VMA · d7f46fc4
      Ben Widawsky 提交于
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      d7f46fc4
  10. 12 12月, 2013 1 次提交
    • C
      drm/i915: Prevent double unref following alloc failure during execbuffer · 9ae9ab52
      Chris Wilson 提交于
      Whilst looking up the objects required for an execbuffer, an untimely
      allocation failure in creating the vma results in the object being
      unreferenced from two lists. The ownership during the lookup is meant to
      be moved from the list of objects being looked to the vma, and this
      double unreference upon error results in a use-after-free.
      
      Fixes regression from
      commit 27173f1f
      Author: Ben Widawsky <ben@bwidawsk.net>
      Date:   Wed Aug 14 11:38:36 2013 +0200
      
          drm/i915: Convert execbuf code to use vmas
      
      Based on the fix by Ben Widawsky.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Ben Widawsky <ben@bwidawsk.net>
      Cc: stable@vger.kernel.org
      [danvet: Bikeshed the crucial comment above the ownership transfer as
      discussed on irc.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      9ae9ab52
  11. 11 12月, 2013 1 次提交
  12. 04 12月, 2013 1 次提交
  13. 27 11月, 2013 1 次提交
  14. 26 11月, 2013 2 次提交
  15. 09 11月, 2013 2 次提交
    • B
      drm/i915/bdw: unleash PPGTT · 28cf5415
      Ben Widawsky 提交于
      v2: Squash in fix from Ben: Set PPGTT batches as necessary
      
      This fixes the regression in the last couple of days when we enabled
      PPGTT.
      
      v3: Squash in fixup to still use GTT for secure batches from Ville:
      
      BDW doesn't have a separate secure vs. non-secure bit in
      MI_BATCH_BUFFER_START. So for secure batches we have to simply
      leave the PPGTT bit unset. Fortunately older generations (except
      HSW) had similar limitations so execbuffer already creates a GTT
      mapping for all secure batches.
      
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      Reviewed-by: NImre Deak <imre.deak@intel.com>
      Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      28cf5415
    • B
      drm/i915/bdw: Support 64b relocations · 3c94ceee
      Ben Widawsky 提交于
      We don't actually return any to userspace yet, however we can pretend
      like we do now so userspace will support it when it happens.
      
      This is just to please Chris as the code itself isn't ready for > 64b
      relocations.
      
      v2: Rebase on top of the refactored relocate_entry_gtt|cpu functions.
      
      v3: Squash in fixup from Rafal Barbalho for 64 byte relocs using cpu
      relocs and those crossing a page boundary.
      
      v4: Squash in a fixup for the fixup from Rafael.
      
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
      Signed-off-by: NBarbalho, Rafael <rafael.barbalho@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      3c94ceee
  16. 01 10月, 2013 3 次提交
    • B
      drm/i915: Convert active API to VMA · e2d05a8b
      Ben Widawsky 提交于
      Even though we track object activity and not VMA, because we have the
      active_list be based on the VM, it makes the most sense to use VMAs in
      the APIs.
      
      NOTE: Daniel intends to eventually rip out active/inactive LRUs, but for
      now, leave them be.
      
      v2: Remove leftover hunk from the previous patch which didn't keep
      i915_gem_object_move_to_active. That patch had to rely on the ring to
      get the dev instead of the obj. (Chris)
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      e2d05a8b
    • D
      drm/i915: Use unsigned for overflow checks in execbuf · b205ca57
      Daniel Vetter 提交于
      There's actually no real risk since we already check for stricter
      constraints earlier (using UINT_MAX / sizeof (struct
      drm_i915_gem_exec_object2) as the limit). But in eb_create we use
      signed integers, which steals a factor of 2. Luckily struct
      drm_i915_gem_exec_object2 for this to not matter.
      
      Still, be consistent and use unsigned integers.
      
      Similar use unsinged integers when checking for overflows in the
      relocation entry processing.
      
      I've also added a new subtests to igt/gem_reloc_overflow to also
      test for overflowing args->buffer_count values.
      
      v2: Give the variables again tighter scope to make it clear that the
      computation is purely local and doesn't leak out to the 2nd block.
      Requested by Chris Wilson.
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      b205ca57
    • D
      drm/i915: Use kcalloc more · a1e22653
      Daniel Vetter 提交于
      No buffer overflows here, but better safe than sorry.
      
      v2:
      - Fixup the sizeof conversion, I've missed the pointer deref (Jani).
      - Drop the redundant GFP_ZERO, kcalloc alreads memsets (Jani).
      - Use kmalloc_array for the execbuf fastpath to avoid the memset
        (Chris). I've opted to leave all other conversions as-is since they
        aren't in a fastpath and dealing with cleared memory instead of
        random garbage is just generally nicer.
      
      Cc: Jani Nikula <jani.nikula@linux.intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NJani Nikula <jani.nikula@intel.com>
      [danvet: Drop the contentious kmalloc_array hunk in execbuf.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      a1e22653
  17. 13 9月, 2013 1 次提交
  18. 06 9月, 2013 1 次提交
    • M
      drm/i915: ban badly behaving contexts · be62acb4
      Mika Kuoppala 提交于
      Now when we have mechanism in place to track which context
      was guilty of hanging the gpu, it is possible to punish
      for bad behaviour.
      
      If context has recently submitted a faulty batchbuffers guilty of
      gpu hang and submits another batch which hangs gpu in quick
      succession, ban it permanently. If ctx is banned, no more
      batchbuffers will be queued for execution.
      
      There is no need for global wedge machinery anymore and
      it would be unwise to wedge the whole gpu if we have multiple
      hanging batches queued for execution. Instead just ban
      the guilty ones and carry on.
      
      v2: Store guilty ban status bool in gpu_error instead of pointers
          that might become danling before hang is declared.
      
      v3: Use return value for banned status instead of stashing state
          into gpu_error (Chris Wilson)
      
      v4: - rebase on top of fixed hang stats api
          - add define for ban period
          - rename commit and improve commit msg
      
      v5: - rely context banning instead of wedging the gpu
          - beautification and fix for ban calculation (Chris)
      Signed-off-by: NMika Kuoppala <mika.kuoppala@intel.com>
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      be62acb4
  19. 04 9月, 2013 4 次提交
    • C
      drm/i915: Always prefer CPU relocations with LLC · 2cc86b82
      Chris Wilson 提交于
      A follow-on to the update of the LLC coherency logic is that we can rely
      on the LLC being coherent with the CS for rewriting batchbuffers
      irrespective of their cache domain. (This should have no effect
      currently as all the batch buffers are expected to be I915_CACHE_LLC and
      so using the cpu relocation path anyway.)
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      2cc86b82
    • D
      drm/i915: inline vma_create into lookup_or_create_vma · e656a6cb
      Daniel Vetter 提交于
      In the execbuf code we don't clean up any vmas which ended up not
      getting bound for code simplicity. To make sure that we don't end up
      creating multiple vma for the same vm kill the somewhat dangerous
      vma_create function and inline it into lookup_or_create.
      
      This is just a safety measure to prevent surprises in the future.
      
      Also update the somewhat confused comment in the execbuf code and
      clarify what kind of magic is going on with a new one.
      
      v2: Keep the function separate as requested by Chris. But give it a __
      prefix for paranoia and move it tighter together with the other vma
      stuff.
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Ben Widawsky <ben@bwidawsk.net>
      Acked-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      e656a6cb
    • B
      drm/i915: Convert execbuf code to use vmas · 27173f1f
      Ben Widawsky 提交于
      In order to transition more of our code over to using a VMA instead of
      an <OBJ, VM> pair - we must have the vma accessible at execbuf time. Up
      until now, we've only had a VMA when actually binding an object.
      
      The previous patch helped handle the distinction on bound vs. unbound.
      This patch will help us catch leaks, and other issues before we actually
      shuffle a bunch of stuff around.
      
      This attempts to convert all the execbuf code to speak in vmas. Since
      the execbuf code is very self contained it was a nice isolated
      conversion.
      
      The meat of the code is about turning eb_objects into eb_vma, and then
      wiring up the rest of the code to use vmas instead of obj, vm pairs.
      
      Unfortunately, to do this, we must move the exec_list link from the obj
      structure. This list is reused in the eviction code, so we must also
      modify the eviction code to make this work.
      
      WARNING: This patch makes an already hotly profiled path slower. The cost is
      unavoidable. In reply to this mail, I will attach the extra data.
      
      v2: Release table lock early, and two a 2 phase vma lookup to avoid
      having to use a GFP_ATOMIC. (Chris)
      
      v3: s/obj_exec_list/obj_exec_link/
      Updates to address
      commit 6d2b8885
      Author: Chris Wilson <chris@chris-wilson.co.uk>
      Date:   Wed Aug 7 18:30:54 2013 +0100
      
          drm/i915: List objects allocated from stolen memory in debugfs
      
      v4: Use obj = vma->obj for neatness in some places (Chris)
      need_reloc_mappable() should return false if ppgtt (Chris)
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      [danvet: Split out prep patches. Also remove a FIXME comment which is
      now taken care of.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      27173f1f
    • D
      drm/i915: fix up the relocate_entry refactoring · d4d36014
      Daniel Vetter 提交于
      Somehow we've lost the error handling in the patch split-up between
      the internal and external patch. This regression has been introduced
      in
      
      commit 5032d871
      Author: Rafael Barbalho <rafael.barbalho@intel.com>
      Date:   Wed Aug 21 17:10:51 2013 +0100
      
          drm/i915: Cleaning up the relocate entry function
      
      This bug is exercised by igt/gem_reloc_vs_gpu/interruptible.
      
      Cc: Rafael Barbalho <rafael.barbalho@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      d4d36014
  20. 23 8月, 2013 1 次提交
  21. 22 8月, 2013 1 次提交
  22. 10 8月, 2013 1 次提交
    • C
      drm/i915: Update rules for writing through the LLC with the cpu · 2c22569b
      Chris Wilson 提交于
      As mentioned in the previous commit, reads and writes from both the CPU
      and GPU go through the LLC. This gives us coherency between the CPU and
      GPU irrespective of the attribute settings either device sets. We can
      use to avoid having to clflush even uncached memory.
      
      Except for the scanout.
      
      The scanout resides within another functional block that does not use
      the LLC but reads directly from main memory. So in order to maintain
      coherency with the scanout, writes to uncached memory must be flushed.
      In order to optimize writes elsewhere, we start tracking whether an
      framebuffer is attached to an object.
      
      v2: Use pin_display tracking rather than fb_count (to ensure we flush
      cursors as well etc) and only force the clflush along explicit writes to
      the scanout paths (i.e. pin_to_display_plane and pwrite into scanout).
      
      v3: Force the flush after hitting the slowpath in pwrite, as after
      dropping the lock the object's cache domain may be invalidated. (Ville)
      
      Based on a patch by Ville Syrjälä.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      2c22569b
  23. 08 8月, 2013 1 次提交
    • B
      drm/i915: mm_list is per VMA · ca191b13
      Ben Widawsky 提交于
      formerly: "drm/i915: Create VMAs (part 5) - move mm_list"
      
      The mm_list is used for the active/inactive LRUs. Since those LRUs are
      per address space, the link should be per VMx .
      
      Because we'll only ever have 1 VMA before this point, it's not incorrect
      to defer this change until this point in the patch series, and doing it
      here makes the change much easier to understand.
      
      Shamelessly manipulated out of Daniel:
      "active/inactive stuff is used by eviction when we run out of address
      space, so needs to be per-vma and per-address space. Bound/unbound otoh
      is used by the shrinker which only cares about the amount of memory used
      and not one bit about in which address space this memory is all used in.
      Of course to actual kick out an object we need to unbind it from every
      address space, but for that we have the per-object list of vmas."
      
      v2: only bump GGTT LRU in i915_gem_object_set_to_gtt_domain (Chris)
      
      v3: Moved earlier in the series
      
      v4: Add dropped message from v3
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      [danvet: Frob patch to apply and use vma->node.size directly as
      discused with Ben. Also drop a needles BUG_ON before move_to_inactive,
      the function itself has the same check.]
      [danvet 2nd: Rebase on top of the lost "drm/i915: Cleanup more of VMA
      in destroy", specifically unlink the vma from the mm_list in
      vma_unbind (to keep it symmetric with bind_to_vm) instead of
      vma_destroy.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      ca191b13