1. 20 3月, 2015 3 次提交
  2. 18 3月, 2015 2 次提交
  3. 10 3月, 2015 2 次提交
  4. 28 2月, 2015 1 次提交
  5. 27 2月, 2015 1 次提交
  6. 26 2月, 2015 1 次提交
    • J
      drm/i915: Cache ringbuf pointer in request structure · 98e1bd4a
      John Harrison 提交于
      In execlist mode, the ringbuf is a function of the ring and context whereas in
      legacy mode, it is derived from the ring alone. Thus the calculation required to
      determine the ringbuf pointer from the ring (and context) also needs to test
      execlist mode or not. This is messy.
      
      Further, the request structure holds a pointer to both the ring and the context
      for which it was created. Thus, given a request, it is possible to derive the
      ringbuf in either legacy or execlist mode. Hence it is necessary to pass just
      the request in to all the low level functions rather than some combination of
      request, ring, context and ringbuf. However, rather than recalculating it each
      time, it is much simpler to just cache the ringbuf pointer in the request
      structure itself.
      
      Caching the pointer means the calculation is done once at request creation time
      and all further code and simply read it directly from the request structure.
      
      OTC-Jira: VIZ-5115
      Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
      [danvet: Drop contentless comment in lrc alloc request entirely. And
      spelling fix in the commit message.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      98e1bd4a
  7. 24 2月, 2015 1 次提交
    • N
      drm/i915: Fix a use after free, and unbalanced refcounting · b3a38998
      Nick Hoath 提交于
      When converting from implicitly tracked execlist queue items to ref counted
      requests, not all frees of requests were replaced with unrefs, and extraneous
      refs/unrefs of contexts were added.
      Correct the unbalanced refcount & replace the frees.
      Remove a noisy warning when hitting the request creation path.
      
      drm_i915_gem_request and intel_context are both kref reference counted
      structures. Upon allocation, drm_i915_gem_request's ref count should be
      bumped using kref_init. When a context is assigned to the request,
      the context's reference count should be bumped using i915_gem_context_reference.
      i915_gem_request_reference will reduce the context reference count when
      the request is freed.
      
      Problem introduced in
      commit 6d3d8274
      Author:     Nick Hoath <nicholas.hoath@intel.com>
      AuthorDate: Thu Jan 15 13:10:39 2015 +0000
      
           drm/i915: Subsume intel_ctx_submit_request in to drm_i915_gem_request
      
      v2: Added comments explaining how the ctx pointer and the request object should
      be ref-counted. Removed noisy warning.
      
      v3: Cleaned up the language used in the commit & the header
      description (Thanks David Gordon)
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88652Signed-off-by: NNick Hoath <nicholas.hoath@intel.com>
      Reviewed-by: NThomas Daniel <thomas.daniel@intel.com>
      Reviewed-by: NDaniel Vetter <daniel@ffwll.ch>
      Signed-off-by: NJani Nikula <jani.nikula@intel.com>
      b3a38998
  8. 14 2月, 2015 2 次提交
  9. 29 1月, 2015 2 次提交
  10. 27 1月, 2015 6 次提交
  11. 26 1月, 2015 2 次提交
    • B
      drm/i915: Only fence tiled region of object. · af1a7301
      Bob Paauwe 提交于
      When creating a fence for a tiled object, only fence the area that
      makes up the actual tiles.  The object may be larger than the tiled
      area and if we allow those extra addresses to be fenced, they'll
      get converted to addresses beyond where the object is mapped. This
      opens up the possiblity of writes beyond the end of object.
      
      To prevent this, we adjust the size of the fence to only encompass
      the area that makes up the actual tiles.  The extra space is considered
      un-tiled and now behaves as if it was a linear object.
      
      Testcase: igt/gem_tiled_fence_overflow
      Reported-by: NDan Hettena <danh@ghs.com>
      Signed-off-by: NBob Paauwe <bob.j.paauwe@intel.com>
      Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      Cc: stable@vger.kernel.org
      Signed-off-by: NJani Nikula <jani.nikula@intel.com>
      af1a7301
    • D
      drm/i915: Init PPGTT before context enable · f48a0165
      David Woodhouse 提交于
      Commit 82460d97 ("drm/i915: Rework ppgtt init to no require an aliasing
      ppgtt") introduced a regression on Broadwell, triggering the following
      IOMMU fault at startup:
      
        vgaarb: device changed decodes: PCI:0000:00:02.0,olddecodes=io+mem,decodes=io+mem:owns=io+mem
        dmar: DRHD: handling fault status reg 2
        dmar: DMAR:[DMA Write] Request device [00:02.0] fault addr 880000
        DMAR:[fault reason 23] Unknown
        fbcon: inteldrmfb (fb0) is primary device
      
      Further commentary from Daniel:
      
      I sugggested this change to David after staring at the offending patch
      for a while. I have no idea and theory whatsoever why this would upset
      the gpu less than the other way round. But it seems to work. David
      promised to chase hw people a bit more to get a more meaningful answer.
      
      Wrt the comment that this deletes: I've done some digging and afaict
      loading context before ppgtt enable was once required before our recent
      restructuring of the context/ppgtt init code: Before that context sw
      setup (i.e. allocating the default context) and hw setup was smashed
      together.  Also the setup of the default context was the bit that
      actually allocated the aliasing ppgtt structures. Which is the reason
      for the context before ppgtt depency.
      
      Or was, since with all the untangling there's no no real depency any
      more (functional, who knows what the hw is doing), so the comment is
      just stale.
      Signed-off-by: NDavid Woodhouse <David.Woodhouse@intel.com>
      Cc: stable@vger.kernel.org
      Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: NJani Nikula <jani.nikula@intel.com>
      f48a0165
  12. 12 1月, 2015 1 次提交
    • C
      drm/i915: Fix mutex->owner inspection race under DEBUG_MUTEXES · 226e5ae9
      Chris Wilson 提交于
      If CONFIG_DEBUG_MUTEXES is set, the mutex->owner field is only cleared
      if the mutex debugging is enabled which introduces a race in our
      mutex_is_locked_by() - i.e. we may inspect the old owner value before it
      is acquired by the new task.
      
      This is the root cause of this error:
      
       diff --git a/kernel/locking/mutex-debug.c b/kernel/locking/mutex-debug.c
       index 5cf6731..3ef3736 100644
       --- a/kernel/locking/mutex-debug.c
       +++ b/kernel/locking/mutex-debug.c
       @@ -80,13 +80,13 @@ void debug_mutex_unlock(struct mutex *lock)
       			DEBUG_LOCKS_WARN_ON(lock->owner != current);
      
       		DEBUG_LOCKS_WARN_ON(!lock->wait_list.prev && !lock->wait_list.next);
       -		mutex_clear_owner(lock);
       	}
      
       	/*
       	 * __mutex_slowpath_needs_to_unlock() is explicitly 0 for debug
       	 * mutexes so that we can do it here after we've verified state.
       	 */
       +	mutex_clear_owner(lock);
       	atomic_set(&lock->count, 1);
        }
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87955Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: stable@vger.kernel.org
      Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: NJani Nikula <jani.nikula@intel.com>
      226e5ae9
  13. 07 1月, 2015 1 次提交
  14. 06 1月, 2015 2 次提交
    • A
      drm/i915: Support creation of unbound wc user mappings for objects · 1816f923
      Akash Goel 提交于
      This patch provides support to create write-combining virtual mappings of
      GEM object. It intends to provide the same funtionality of 'mmap_gtt'
      interface without the constraints and contention of a limited aperture
      space, but requires clients handles the linear to tile conversion on their
      own. This is for improving the CPU write operation performance, as with such
      mapping, writes and reads are almost 50% faster than with mmap_gtt. Similar
      to the GTT mmapping, unlike the regular CPU mmapping, it avoids the cache
      flush after update from CPU side, when object is passed onto GPU.  This
      type of mapping is specially useful in case of sub-region update,
      i.e. when only a portion of the object is to be updated. Using a CPU mmap
      in such cases would normally incur a clflush of the whole object, and
      using a GTT mmapping would likely require eviction of an active object or
      fence and thus stall. The write-combining CPU mmap avoids both.
      
      To ensure the cache coherency, before using this mapping, the GTT domain
      has been reused here. This provides the required cache flush if the object
      is in CPU domain or synchronization against the concurrent rendering.
      Although the access through an uncached mmap should automatically
      invalidate the cache lines, this may not be true for non-temporal write
      instructions and also not all pages of the object may be updated at any
      given point of time through this mapping.  Having a call to get_pages in
      set_to_gtt_domain function, as added in the earlier patch 'drm/i915:
      Broaden application of set-domain(GTT)', would guarantee the clflush and
      so there will be no cachelines holding the data for the object before it
      is accessed through this map.
      
      The drm_i915_gem_mmap structure (for the DRM_I915_GEM_MMAP_IOCTL) has been
      extended with a new flags field (defaulting to 0 for existent users). In
      order for userspace to detect the extended ioctl, a new parameter
      I915_PARAM_MMAP_VERSION has been added for versioning the ioctl interface.
      
      v2: Fix error handling, invalid flag detection, renaming (ickle)
      
      v3: Rebase to latest drm-intel-nightly codebase
      
      The new mmapping is exercised by igt/gem_mmap_wc,
      igt/gem_concurrent_blit and igt/gem_gtt_speed.
      
      Change-Id: Ie883942f9e689525f72fe9a8d3780c3a9faa769a
      Signed-off-by: NAkash Goel <akash.goel@intel.com>
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      1816f923
    • C
      drm/i915: Broaden application of set-domain(GTT) · 43566ded
      Chris Wilson 提交于
      Previously, this was restricted to only operate on bound objects - to
      make pointer access through the GTT to the object coherent with writes
      to and from the GPU. A second usecase is drm_intel_bo_wait_rendering()
      which at present does not function unless the object also happens to
      be bound into the GGTT (on current systems that is becoming increasingly
      rare, especially for the typical requests from mesa). A third usecase is
      a future patch wishing to extend the coverage of the GTT domain to
      include objects not bound into the GGTT but still in its coherent cache
      domain. For the latter pair of requests, we need to operate on the
      object regardless of its bind state.
      
      v2: After discussion with Akash, we came to the conclusion that the
      get-pages was required in order for accurate domain tracking in the
      corner cases (like the shrinker) and also useful for ensuring memory
      coherency with earlier cached CPU mmaps in case userspace uses exotic
      cache bypass (non-temporal) instructions.
      
      v3: Fix the inactive object check.
      
      v4: Rebase to latest drm-intel-nightly codebase
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NAkash Goel <akash.goel@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      43566ded
  15. 24 12月, 2014 1 次提交
  16. 18 12月, 2014 2 次提交
  17. 16 12月, 2014 2 次提交
    • B
      drm/i915: Use batch pools with the command parser · 78a42377
      Brad Volkin 提交于
      This patch sets up all of the tracking and copying necessary to
      use batch pools with the command parser and dispatches the copied
      (shadow) batch to the hardware.
      
      After this patch, the parser is in 'enabling' mode.
      
      Note that performance takes a hit from the copy in some cases
      and will likely need some work. At a rough pass, the memcpy
      appears to be the bottleneck. Without having done a deeper
      analysis, two ideas that come to mind are:
      1) Copy sections of the batch at a time, as they are reached
         by parsing. Might improve cache locality.
      2) Copy only up to the userspace-supplied batch length and
         memset the rest of the buffer. Reduces the number of reads.
      
      v2:
      - Remove setting the capacity of the pool
      - One global pool instead of per-ring pools
      - Replace batch_obj with shadow_batch_obj and hook into eb->vmas
      - Memset any space in the shadow batch beyond what gets copied
      - Rebased on execlist prep refactoring
      
      v3:
      - Rebase on chained batch handling
      - Squash in setting the secure dispatch flag
      - Add a note about the interaction w/secure dispatch pinning
      - Check for request->batch_obj == NULL in i915_gem_free_request
      
      v4:
      - Fix read domains for shadow_batch_obj
      - Remove the set_to_gtt_domain call from i915_parse_cmds
      - ggtt_pin/unpin in the parser block to simplify error handling
      - Check USES_FULL_PPGTT before setting DISPATCH_SECURE flag
      - Remove i915_gem_batch_pool_put calls
      
      v5:
      - Move 'pending_read_domains |= I915_GEM_DOMAIN_COMMAND' after
        the parser (danvet, from v4 0/7 feedback)
      
      Issue: VIZ-4719
      Signed-off-by: NBrad Volkin <bradley.d.volkin@intel.com>
      Reviewed-By: NJon Bloomfield <jon.bloomfield@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      78a42377
    • B
      drm/i915: Implement a framework for batch buffer pools · 493018dc
      Brad Volkin 提交于
      This adds a small module for managing a pool of batch buffers.
      The only current use case is for the command parser, as described
      in the kerneldoc in the patch. The code is simple, but separating
      it out makes it easier to change the underlying algorithms and to
      extend to future use cases should they arise.
      
      The interface is simple: init to create an empty pool, fini to
      clean it up, get to obtain a new buffer. Note that all buffers are
      expected to be inactive before cleaning up the pool.
      
      Locking is currently based on the caller holding the struct_mutex.
      We already do that in the places where we will use the batch pool
      for the command parser.
      
      v2:
      - s/BUG_ON/WARN_ON/ for locking assertions
      - Remove the cap on pool size
      - Switch from alloc/free to init/fini
      
      v3:
      - Idiomatic looping structure in _fini
      - Correct handling of purged objects
      - Don't return a buffer that's too much larger than needed
      
      v4:
      - Rebased to latest -nightly
      
      v5:
      - Remove _put() function and clean up comments to match
      
      v6:
      - Move purged check inside the loop (danvet, from v4 1/7 feedback)
      
      v7:
      - Use single list instead of two. (Chris W)
      - s/active_list/cache_list
      - Squashed in debug patches (Chris W)
        drm/i915: Add a batch pool debugfs file
      
        It provides some useful information about the buffers in
        the global command parser batch pool.
      
        v2: rebase on global pool instead of per-ring pools
        v3: rebase
      
        drm/i915: Add batch pool details to i915_gem_objects debugfs
      
        To better account for the potentially large memory consumption
        of the batch pool.
      
      v8:
      - Keep cache in LRU order (danvet, from v6 1/5 feedback)
      
      Issue: VIZ-4719
      Signed-off-by: NBrad Volkin <bradley.d.volkin@intel.com>
      Reviewed-By: NJon Bloomfield <jon.bloomfield@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      493018dc
  18. 15 12月, 2014 1 次提交
    • T
      drm/i915: Infrastructure for supporting different GGTT views per object · fe14d5f4
      Tvrtko Ursulin 提交于
      Things like reliable GGTT mappings and mirrored 2d-on-3d display will need
      to map objects into the same address space multiple times.
      
      Added a GGTT view concept and linked it with the VMA to distinguish between
      multiple instances per address space.
      
      New objects and GEM functions which do not take this new view as a parameter
      assume the default of zero (I915_GGTT_VIEW_NORMAL) which preserves the
      previous behaviour.
      
      This now means that objects can have multiple VMA entries so the code which
      assumed there will only be one also had to be modified.
      
      Alternative GGTT views are supposed to borrow DMA addresses from obj->pages
      which is DMA mapped on first VMA instantiation and unmapped on the last one
      going away.
      
      v2:
          * Removed per view special casing in i915_gem_ggtt_prepare /
            finish_object in favour of creating and destroying DMA mappings
            on first VMA instantiation and last VMA destruction. (Daniel Vetter)
          * Simplified i915_vma_unbind which does not need to count the GGTT views.
            (Daniel Vetter)
          * Also moved obj->map_and_fenceable reset under the same check.
          * Checkpatch cleanups.
      
      v3:
          * Only retire objects once the last VMA is unbound.
      
      v4:
          * Keep scatter-gather table for alternative views persistent for the
            lifetime of the VMA.
          * Propagate binding errors to callers and handle appropriately.
      
      v5:
          * Explicitly look for normal GGTT view in i915_gem_obj_bound to align
            usage in i915_gem_object_ggtt_unpin. (Michel Thierry)
          * Change to single if statement in i915_gem_obj_to_ggtt. (Michel Thierry)
          * Removed stray semi-colon in i915_gem_object_set_cache_level.
      
      For: VIZ-4544
      Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Reviewed-by: NMichel Thierry <michel.thierry@intel.com>
      [danvet: Drop hunk from i915_gem_shrink since it's just prettification
      but upsets a __must_check warning.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      fe14d5f4
  19. 05 12月, 2014 3 次提交
  20. 04 12月, 2014 1 次提交
  21. 03 12月, 2014 3 次提交