1. 06 Aug, 2013 5 commits
  2. 25 Jul, 2013 2 commits
  3. 24 Jul, 2013 1 commit
  4. 19 Jul, 2013 4 commits
  5. 18 Jul, 2013 4 commits
    • drm/i915: Create VMAs · 2f633156
      Ben Widawsky committed
      Formerly: "drm/i915: Create VMAs (part 1)"
      
      In a previous patch, the notion of a VM was introduced. A VMA describes
      an area of the VM address space. A VMA is similar to the concept
      in the linux mm. However, instead of representing regular memory, a VMA
      is backed by a GEM BO. There may be many VMAs for a given object, one
      for each VM the object is to be used in. This may occur through flink,
      dma-buf, or a number of other transient states.
      
      Currently the code depends on only 1 VMA per object, for the global GTT
      (and aliasing PPGTT). The following patches will address this and make
      the rest of the infrastructure more suited
      
      v2: s/i915_obj/i915_gem_obj (Chris)
      
      v3: Only move an object to the now global unbound list if there are no
      more VMAs for the object which are bound into a VM (ie. the list is
      empty).
      
      v4: killed obj->gtt_space
      some reworks due to rebase
      
      v5: Free vma on error path (Imre)
      
      v6: Another missed vma free in i915_gem_object_bind_to_gtt error path
      (Imre)
      Fixed vma freeing in stolen preallocation (Imre)
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      Reviewed-by: Imre Deak <imre.deak@intel.com>
      [danvet: Squash in fixup from Ben to not deref a non-existing vma in
      set_cache_level, reported by Chris.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
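The relationship described in the commit message above (one object, one VMA per VM it is bound into) can be sketched roughly as follows. This is only an illustration: `struct address_space`, `struct vma`, `struct gem_object`, and the helpers `obj_to_vma`, `obj_add_vma`, and `obj_is_unbound` are hypothetical, simplified stand-ins and not the driver's actual definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the driver's real structures. */
struct address_space {
    int id;
};

struct vma {
    struct address_space *vm;   /* address space this mapping lives in */
    unsigned long start;        /* offset of the object inside that VM */
    unsigned long size;
    struct vma *next;           /* next VMA belonging to the same object */
};

struct gem_object {
    size_t size;
    struct vma *vma_list;       /* one entry per VM the object is bound into */
};

/* Return the object's VMA for the given VM, or NULL if it is not bound there. */
static struct vma *obj_to_vma(struct gem_object *obj, struct address_space *vm)
{
    for (struct vma *v = obj->vma_list; v; v = v->next)
        if (v->vm == vm)
            return v;
    return NULL;
}

/* Attach a caller-provided VMA to the object for the given VM. */
static void obj_add_vma(struct gem_object *obj, struct vma *v,
                        struct address_space *vm, unsigned long start)
{
    assert(obj && v && vm);
    v->vm = vm;
    v->start = start;
    v->size = obj->size;
    v->next = obj->vma_list;
    obj->vma_list = v;
}

/* Mirrors the v3 note: the object counts as unbound only when no VMA remains. */
static int obj_is_unbound(const struct gem_object *obj)
{
    return obj->vma_list == NULL;
}
```

Because the offset lives in the VMA rather than in the object, the same object can sit at different offsets in different address spaces.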
    • drm/i915: Move active/inactive lists to new mm · 5cef07e1
      Ben Widawsky committed
      Shamelessly manipulated out of Daniel :-)
      "When moving the lists around explain that the active/inactive stuff is
      used by eviction when we run out of address space, so needs to be
      per-vma and per-address space. Bound/unbound otoh is used by the
      shrinker which only cares about the amount of memory used and not one
      bit about in which address space this memory is all used in. Of course
      to actual kick out an object we need to unbind it from every address
      space, but for that we have the per-object list of vmas."
      
      v2: Leave the bound list as a global one. (Chris, indirectly)
      
      v3: Rebased with no i915_gtt_vm. In most places I added a new *vm local,
      since it will eventually be replaced by a vm argument.
      Put comment back inline, since it no longer makes sense to do otherwise.
      
      v4: Rebased on hangcheck/error state movement
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      Reviewed-by: Imre Deak <imre.deak@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: Put the mm in the parent address space · 93bd8649
      Ben Widawsky committed
      Every address space should support object allocation. It therefore makes
      sense to have the allocator be part of the "superclass" which GGTT and
      PPGTT will derive.
      
      Since our maximum address space size is only 2GB we're not yet able to
      avoid doing allocation/eviction; but we'd hope one day this becomes
      almost irrelevant.
      
      v2: Rebased
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      Reviewed-by: Imre Deak <imre.deak@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: Move gtt and ppgtt under address space umbrella · 853ba5d2
      Ben Widawsky committed
      The GTT and PPGTT can be thought of more generally as GPU address
      spaces. Many of their actions (insert entries), state (LRU lists), and
      many of their characteristics (size) can be shared. Do that.
      
      The change itself doesn't actually impact most of the VMA/VM rework
      coming up, it just fits in with the grand scheme of abstracting the GPU
      VM operations. GGTT will usually be a special case where we know
      an object must be in the GGTT (display engine, workarounds, etc.).
      
      The scratch page is left as part of the VM (even though it's currently
      shared with the ppgtt code) because in the future when we have Full
      PPGTT, I intend to create a separate scratch page for each.
      
      v2: Drop usage of i915_gtt_vm (Daniel)
      Make cleanup also part of the parent class (Ben)
      Modified commit msg
      Rebased
      
      v3: Properly share scratch page (Imre)
      Finish commit message (Daniel, Imre)
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      Reviewed-by: Imre Deak <imre.deak@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
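The "superclass" idea in the two commits above, a common address space that GGTT and PPGTT derive from, is usually expressed in C by embedding a base struct. The sketch below assumes hypothetical names (`struct address_space`, `struct ggtt`, `struct ppgtt`, `vm_end`, `vm_to_ggtt`) and fields; it is not the driver's actual layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical base "class": state and operations shared by all GPU VMs. */
struct address_space {
    unsigned long start;                        /* first usable offset */
    unsigned long total;                        /* size of the address space */
    void (*cleanup)(struct address_space *vm);  /* teardown is a parent-class hook */
};

/* Hypothetical derived types: each embeds the base as a member. */
struct ggtt {
    struct address_space base;
    unsigned long mappable_end;
};

struct ppgtt {
    struct address_space base;
    unsigned int num_pd_entries;
};

/* Generic code only ever needs the base pointer. */
static unsigned long vm_end(const struct address_space *vm)
{
    assert(vm != NULL);
    return vm->start + vm->total;
}

/* Recover the derived type from a base pointer (container_of-style). */
static struct ggtt *vm_to_ggtt(struct address_space *vm)
{
    return (struct ggtt *)((char *)vm - offsetof(struct ggtt, base));
}
```

Code that needs GGTT-specific state converts back explicitly, which keeps the special cases visible at the call site.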
  6. 16 Jul, 2013 3 commits
  7. 10 Jul, 2013 4 commits
    • Revert "drm/i915: Workaround incoherence between fences and LLC across multiple CPUs" · 46a0b638
      Chris Wilson committed
      This reverts commit 25ff1195 and the follow on for Valleyview commit 2dc8aae0.
      
      commit 25ff1195
      Author: Chris Wilson <chris@chris-wilson.co.uk>
      Date:   Thu Apr 4 21:31:03 2013 +0100
      
          drm/i915: Workaround incoherence between fences and LLC across multiple CPUs
      
      commit 2dc8aae0
      Author: Chris Wilson <chris@chris-wilson.co.uk>
      Date:   Wed May 22 17:08:06 2013 +0100
      
          drm/i915: Workaround incoherence with fence updates on Valleyview
      
      Jon Bloomfield came up with a plausible explanation and cheap fix
      (drm/i915: Fix incoherence with fence updates on Sandybridge+) for the
      race condition, so let's run with it.
      
      This is a candidate for stable as the old workaround incurs a
      significant cost (calling wbinvd on all CPUs before performing the
      register write) for some workloads as noted by Carsten Emde.
      
      Link: http://lists.freedesktop.org/archives/intel-gfx/2013-June/028819.html
      References: https://www.osadl.org/?id=1543#c7602
      References: https://bugs.freedesktop.org/show_bug.cgi?id=63825
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Jon Bloomfield <jon.bloomfield@intel.com>
      Cc: Carsten Emde <C.Emde@osadl.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: Fix incoherence with fence updates on Sandybridge+ · d18b9619
      Chris Wilson committed
      This hopefully fixes the root cause behind the workaround added in
      
      commit 25ff1195
      Author: Chris Wilson <chris@chris-wilson.co.uk>
      Date:   Thu Apr 4 21:31:03 2013 +0100
      
          drm/i915: Workaround incoherence between fences and LLC across multiple CPUs
      
      Thanks to further investigation by Jon Bloomfield, he realised that
      the 64-bit register might be broken up by the hardware into two 32-bit
      writes (a problem we have encountered elsewhere). This non-atomicity
      would then cause an issue where a second thread would see an
      intermediate register state (new high dword, old low dword), and this
      register would randomly be used in preference to its own thread register.
      This would cause the second thread to read from and write into a fairly
      random tiled location.  Breaking the operation into 3 explicit 32-bit
      updates (first disable the fence, poke the upper bits, then poke the lower
      bits and enable) ensures that, given proper serialisation between the
      32-bit register write and the memory transfer, that the fence value is
      always consistent.
      
      Armed with this knowledge, we can explain how the previous workaround
      worked. The key to the corruption is that a second thread sees an
      erroneous fence register that conflicts and overrides its own. By
      serialising the fence update across all CPUs, we have a small window
      where no GTT access is occurring and so hide the potential corruption.
      This also leads to the conclusion that the earlier workaround was
      incomplete.
      
      v2: Be overly paranoid about the order in which fence updates become
      visible to the GPU to make really sure that we turn the fence off before
      doing the update, and then only switch the fence on afterwards.
      Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Carsten Emde <C.Emde@osadl.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: don't frob mm.suspended when not using ums · db1b76ca
      Daniel Vetter committed
      In kernel modeset driver mode we're in full control of the chip,
      always. So there's no need at all to set mm.suspended in
      i915_gem_idle. Hence move that out into the leavevt ioctl. Since
      i915_gem_idle doesn't suspend gem any more we can also drop the
      re-enabling for KMS in the thaw function.
      
      Also clean up the handling of mm.suspend at driver load by coalescing
      all the assignments.
      
      Stumbled over while reading through our resume code for unrelated
      reasons.
      
      v2: Shovel mm.suspended into the (newly created) ums dungeon as
      suggested by Chris Wilson. The plan is that once we've completely
      stopped relying on the register save/restore code we could shovel even
      that in there.
      
      v3: Improve the locking for the entervt/leavevt ioctls a bit by moving
      the dev->struct_mutex locking outside of i915_gem_idle. Also don't
      clear dev_priv->ums.mm_suspended for the kms case, we allocate it with
      kzalloc. Both suggested by Chris Wilson.
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: Fix write-read race with multiple rings · 02978ff5
      Chris Wilson committed
      Daniel noticed a problem where if we wrote to an object with ring A in
      the middle of a very long running batch, then executed a quick batch on
      ring B before a batch that reads from the same object, its obj->ring would
      now point to ring B, but its last_write_seqno would be still relative to
      ring A. This would allow for the user to read from the object before the
      GPU had completed the write, as set_domain would only check that ring B
      had passed the last_write_seqno.
      
      To fix this simply (and inelegantly), we bump the last_write_seqno when
      switching rings so that the last_write_seqno is always relative to the
      current obj->ring.
      
      This fixes igt/tests/gem_write_read_ring_switch.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: stable@vger.kernel.org
      [danvet: Add note about the newly created igt which exercises this
      bug.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
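The fix described above, bumping last_write_seqno when the object switches rings, might look roughly like the following. The types and the helper `obj_move_to_active` are hypothetical simplifications, and the sketch assumes that the new batch has already been ordered after the pending write on the old ring by some cross-ring synchronisation that is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified ring: hands out increasing sequence numbers. */
struct ring {
    uint32_t next_seqno;
};

struct gem_object {
    struct ring *ring;          /* ring that last used the object */
    uint32_t last_read_seqno;
    uint32_t last_write_seqno;  /* 0 means no outstanding write */
};

/* Record that a batch on `ring` uses the object. */
static void obj_move_to_active(struct gem_object *obj, struct ring *ring,
                               int is_write)
{
    assert(obj && ring);
    uint32_t seqno = ring->next_seqno++;

    /* Switching rings with a write outstanding: keep the write seqno
     * relative to the ring that obj->ring is about to point at. */
    if (obj->ring != ring && obj->last_write_seqno)
        obj->last_write_seqno = seqno;

    obj->ring = ring;
    obj->last_read_seqno = seqno;
    if (is_write)
        obj->last_write_seqno = seqno;
}
```

After the switch, a check of the form "has obj->ring passed last_write_seqno" compares numbers from the same ring, which is the invariant the commit restores.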
  8. 09 Jul, 2013 4 commits
    • drm/i915: Correct obj->mm_list link to dev_priv->dev_priv->mm.inactive_list · 06755608
      Xiong Zhang committed
      obj->mm_list link to dev_priv->mm.inactive_list/active_list
      obj->global_list link to dev_priv->mm.unbound_list/bound_list
      
      This regression has been introduced in
      
      commit 93927ca5
      Author: Daniel Vetter <daniel.vetter@ffwll.ch>
      Date:   Thu Jan 10 18:03:00 2013 +0100
      
          drm/i915: Revert shrinker changes from "Track unbound pages"
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com>
      [danvet: Add regression notice.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: Embed drm_mm_node in i915 gem obj · c6cfb325
      Ben Widawsky committed
      Embedding the node in the obj is more natural in the transition to VMAs
      which will also have embedded nodes. This change also helps transition
      away from put_block to remove node.
      
      Though it's quite an uncommon occurrence, it's somewhat convenient to not
      fail at bind time because we cannot allocate the node. Though in
      practice there are other allocations (like the request structure) which
      would probably make this point not terribly useful.
      
      Quoting Daniel:
      Note that the only difference between put_block and remove_node is
      that the former fills up the preallocation cache. Which we don't need
      anyway and hence is just wasted space.
      
      v2: Clean up the stolen preallocation code.
      Rebased on the reserve_node patches
      renames ggtt_ stuff to gtt_ stuff
      WARN_ON if the object is already bound (which doesn't mean it's in the
      bound list, tricky)
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: Kill obj->gtt_offset · edd41a87
      Ben Widawsky committed
      With the getters in place from the previous patch this member serves no
      purpose other than saving one spare pointer chase, which will be killed
      in the next patch anyway.
      
      Moving to VMAs, this member adds unnecessary confusion since an object
      may exist at different offsets in different VMs.
      
      v2: Properly preserve the stolen offset. This code is a bit hacky but it
      all goes away when we embed the drm_mm_node and removes the need for the
      incorrect patch I submitted previously: "Use gtt_space->start for stolen
      reservation"
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: Getter/setter for object attributes · f343c5f6
      Ben Widawsky committed
      Soon we want to gut a lot of our existing assumptions about how many address
      spaces an object can live in, and in doing so, embed the drm_mm_node in
      the object (and later the VMA).
      
      It's possible in the future we'll want to add more getter/setter
      methods, but for now this is enough to enable the VMAs.
      
      v2: Reworked commit message (Ben)
      Added comments to the main functions (Ben)
      sed -i "s/i915_gem_obj_set_color/i915_gem_obj_ggtt_set_color/" drivers/gpu/drm/i915/*.[ch]
      sed -i "s/i915_gem_obj_bound/i915_gem_obj_ggtt_bound/" drivers/gpu/drm/i915/*.[ch]
      sed -i "s/i915_gem_obj_size/i915_gem_obj_ggtt_size/" drivers/gpu/drm/i915/*.[ch]
      sed -i "s/i915_gem_obj_offset/i915_gem_obj_ggtt_offset/" drivers/gpu/drm/i915/*.[ch]
      (Daniel)
      
      v3: Rebased on new reserve_node patch
      Changed DRM_DEBUG_KMS to actually work (will need fixing later)
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
  9. 01 Jul, 2013 4 commits
    • drm/i915: Refactor the wait_rendering completion into a common routine · d26e3af8
      Chris Wilson committed
      Harmonise the completion logic between the non-blocking and normal
      wait_rendering paths, and move that logic into a common function.
      
      In the process, we note that the last_write_seqno is by definition the
      earlier of the two read/write seqnos and so all successful waits will
      have passed the last_write_seqno. Therefore we can unconditionally clear
      the write seqno and its domains in the completion logic.
      
      v2: Add the missing ring parameter, because sometimes it is good to have
      things compile.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: Only clear write-domains after a successful wait-seqno · daa13e1c
      Chris Wilson committed
      In the introduction of the non-blocking wait, I cut'n'pasted the wait
      completion code from normal locked path. Unfortunately, this neglected
      that the normal path returned early if the wait returned early. The
      result is that read-only waits may return whilst the GPU is still
      writing to the bo.
      
      Fixes regression from
      commit 3236f57a [v3.7]
      Author: Chris Wilson <chris@chris-wilson.co.uk>
      Date:   Fri Aug 24 09:35:09 2012 +0100
      
          drm/i915: Use a non-blocking wait for set-to-domain ioctl
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66163
      Cc: stable@vger.kernel.org
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
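The bug described above comes down to a missing early return: the write tracking must be cleared only when the wait actually succeeded. A minimal sketch, in which `struct gem_object`, `wait_rendering`, and the two stub wait functions are hypothetical and exist only to illustrate the control flow:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified object state. */
struct gem_object {
    unsigned int last_write_seqno;
    unsigned int write_domain;
};

/* Stub waits for illustration: one succeeds, one is interrupted early. */
static int wait_ok(unsigned int seqno)
{
    (void)seqno;
    return 0;
}

static int wait_interrupted(unsigned int seqno)
{
    (void)seqno;
    return -EINTR;
}

/* Wait for outstanding rendering; clear write tracking only on success. */
static int wait_rendering(struct gem_object *obj,
                          int (*wait_seqno)(unsigned int))
{
    assert(obj && wait_seqno);
    int ret = wait_seqno(obj->last_write_seqno);
    if (ret)
        return ret;     /* wait ended early: the GPU may still be writing */

    obj->last_write_seqno = 0;
    obj->write_domain = 0;
    return 0;
}
```

Without the early return, an interrupted wait would still clear the write seqno, and a later read-only wait could return while the GPU is writing to the buffer.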
    • drm/i915: fix build warning on format specifier mismatch · 3765f304
      Jani Nikula committed
      drivers/gpu/drm/i915/i915_gem.c: In function ‘i915_gem_object_bind_to_gtt’:
      drivers/gpu/drm/i915/i915_gem.c:3002:3: warning: format ‘%ld’ expects
      argument of type ‘long int’, but argument 5 has type ‘size_t’ [-Wformat]
      
      v2: Use %zu instead of %d. Two char patch, and 100% wrong. (Ville)
      Signed-off-by: Jani Nikula <jani.nikula@intel.com>
      Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
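For reference, `%zu` is the standard C99 conversion for a `size_t` argument, which is what the v2 note above settles on. A small userspace sketch (the helper `format_size` is hypothetical and unrelated to the driver's debug macros):

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Format a size_t portably: %zu matches size_t on both 32-bit and 64-bit
 * builds, whereas %d or %ld only match on some of them. */
static int format_size(char *buf, size_t len, size_t size)
{
    assert(buf != NULL);
    return snprintf(buf, len, "size=%zu", size);
}
```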
    • drm/i915: make compact dma scatter lists creation work with SWIOTLB backend. · 1625e7e5
      Konrad Rzeszutek Wilk committed
      Git commit 90797e6d
      ("drm/i915: create compact dma scatter lists for gem objects") makes
      certain assumptions about the underlying DMA API that are not always
      correct.
      
      On a ThinkPad X230 with an Intel HD 4000 with Xen during the bootup
      I see:
      
      [drm:intel_pipe_set_base] *ERROR* pin & fence failed
      [drm:intel_crtc_set_config] *ERROR* failed to set mode on [CRTC:3], err = -28
      
      Bit of debugging traced it down to dma_map_sg failing (in
      i915_gem_gtt_prepare_object) as some of the SG entries were huge (3MB).
      
      Unfortunately, those are sizes that the SWIOTLB is incapable of handling -
      the maximum it can handle is an entry of 512KB of virtually contiguous
      memory for its bounce buffer. (See IO_TLB_SEGSIZE).
      
      Previous to the above-mentioned git commit the SG entries were of 4KB, and
      the code introduced by the above git commit squashed the CPU contiguous PFNs
      into one big virtual address provided to the DMA API.
      
      This patch is a simple semi-revert - where we emulate the old behavior
      if we detect that SWIOTLB is online. If it is not online then we continue
      on with the new compact scatter gather mechanism.
      
      An alternative solution would be for the '.get_pages' and the
      i915_gem_gtt_prepare_object to retry with smaller max gap of the
      amount of PFNs that can be combined together - but with this issue
      discovered during rc7 that might be too risky.
      Reported-and-Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      CC: Chris Wilson <chris@chris-wilson.co.uk>
      CC: Imre Deak <imre.deak@intel.com>
      CC: Daniel Vetter <daniel.vetter@ffwll.ch>
      CC: David Airlie <airlied@linux.ie>
      CC: <dri-devel@lists.freedesktop.org>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
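The semi-revert described above can be pictured as a choice of when to start a new scatter-gather entry: merge physically contiguous pages in the compact case, or use one entry per page when bounce buffering is active. The function `count_sg_entries` and its `swiotlb_active` flag below are hypothetical; the sketch only counts entries for an array of page frame numbers and does not build a real scatterlist.

```c
#include <assert.h>
#include <stddef.h>

/* Count how many scatter-gather entries a list of page frame numbers needs.
 * Compact mode merges runs of consecutive PFNs into one entry; with
 * swiotlb_active set, every page gets its own entry (the old behaviour). */
static size_t count_sg_entries(const unsigned long *pfns, size_t n,
                               int swiotlb_active)
{
    assert(pfns != NULL || n == 0);
    size_t entries = 0;

    for (size_t i = 0; i < n; i++) {
        /* New entry for the first page, after any gap in the PFNs,
         * or unconditionally when bounce buffering limits entry size. */
        if (i == 0 || swiotlb_active || pfns[i] != pfns[i - 1] + 1)
            entries++;
    }
    return entries;
}
```

For the PFNs 10, 11, 12, 20, 21 this gives two entries in compact mode and five single-page entries with the flag set, so no entry can exceed the bounce-buffer limit.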
  10. 25 Jun, 2013 1 commit
    • drm/i915: make compact dma scatter lists creation work with SWIOTLB backend. · 426729dc
      Konrad Rzeszutek Wilk committed
      Git commit 90797e6d
      ("drm/i915: create compact dma scatter lists for gem objects") makes
      certain assumptions about the underlying DMA API that are not always
      correct.
      
      On a ThinkPad X230 with an Intel HD 4000 with Xen during the bootup
      I see:
      
      [drm:intel_pipe_set_base] *ERROR* pin & fence failed
      [drm:intel_crtc_set_config] *ERROR* failed to set mode on [CRTC:3], err = -28
      
      Bit of debugging traced it down to dma_map_sg failing (in
      i915_gem_gtt_prepare_object) as some of the SG entries were huge (3MB).
      
      Unfortunately, those are sizes that the SWIOTLB is incapable of handling -
      the maximum it can handle is an entry of 512KB of virtually contiguous
      memory for its bounce buffer. (See IO_TLB_SEGSIZE).
      
      Previous to the above-mentioned git commit the SG entries were of 4KB, and
      the code introduced by the above git commit squashed the CPU contiguous PFNs
      into one big virtual address provided to the DMA API.
      
      This patch is a simple semi-revert - where we emulate the old behavior
      if we detect that SWIOTLB is online. If it is not online then we continue
      on with the new compact scatter gather mechanism.
      
      An alternative solution would be for the '.get_pages' and the
      i915_gem_gtt_prepare_object to retry with smaller max gap of the
      amount of PFNs that can be combined together - but with this issue
      discovered during rc7 that might be too risky.
      Reported-and-Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      CC: Chris Wilson <chris@chris-wilson.co.uk>
      CC: Imre Deak <imre.deak@intel.com>
      CC: Daniel Vetter <daniel.vetter@ffwll.ch>
      CC: David Airlie <airlied@linux.ie>
      CC: <dri-devel@lists.freedesktop.org>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
  11. 16 Jun, 2013 1 commit
    • drm/i915: Restore fences after resume and GPU resets · 19b2dbde
      Chris Wilson committed
      Stéphane Marchesin found that fences for pinned objects (i.e. the
      scanout) were not being restored upon resume, leading to corruption on
      the display and reference counting issues. This is due to a bug in
      
      commit 312817a3 [2.6.38]
      Author: Chris Wilson <chris@chris-wilson.co.uk>
      Date:   Mon Nov 22 11:50:11 2010 +0000
      
          drm/i915: Only save and restore fences for UMS
      
      that zapped the pinned fences even though they were in use.
      Fortuitously, whilst we forced a VT switch during suspend and resume,
      no fences were ever pinned at the time. However, we now can do
      switchless S3 transitions and so the old bug finally surfaces.
      Reported-by: Stéphane Marchesin <marcheu@chromium.org>
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Stéphane Marchesin <marcheu@chromium.org>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
  12. 13 Jun, 2013 3 commits
  13. 03 Jun, 2013 4 commits