1. 12 1月, 2015 1 次提交
    • C
      drm/i915: Fix mutex->owner inspection race under DEBUG_MUTEXES · 226e5ae9
      Chris Wilson 提交于
      If CONFIG_DEBUG_MUTEXES is set, the mutex->owner field is only cleared
      if the mutex debugging is enabled which introduces a race in our
      mutex_is_locked_by() - i.e. we may inspect the old owner value before it
      is acquired by the new task.
      
      This is the root cause of this error:
      
       diff --git a/kernel/locking/mutex-debug.c b/kernel/locking/mutex-debug.c
       index 5cf6731..3ef3736 100644
       --- a/kernel/locking/mutex-debug.c
       +++ b/kernel/locking/mutex-debug.c
       @@ -80,13 +80,13 @@ void debug_mutex_unlock(struct mutex *lock)
       			DEBUG_LOCKS_WARN_ON(lock->owner != current);
      
       		DEBUG_LOCKS_WARN_ON(!lock->wait_list.prev && !lock->wait_list.next);
       -		mutex_clear_owner(lock);
       	}
      
       	/*
       	 * __mutex_slowpath_needs_to_unlock() is explicitly 0 for debug
       	 * mutexes so that we can do it here after we've verified state.
       	 */
       +	mutex_clear_owner(lock);
       	atomic_set(&lock->count, 1);
        }
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87955Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: stable@vger.kernel.org
      Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: NJani Nikula <jani.nikula@intel.com>
      226e5ae9
  2. 07 1月, 2015 1 次提交
  3. 06 1月, 2015 2 次提交
    • A
      drm/i915: Support creation of unbound wc user mappings for objects · 1816f923
      Akash Goel 提交于
      This patch provides support to create write-combining virtual mappings of
      GEM object. It intends to provide the same funtionality of 'mmap_gtt'
      interface without the constraints and contention of a limited aperture
      space, but requires clients handles the linear to tile conversion on their
      own. This is for improving the CPU write operation performance, as with such
      mapping, writes and reads are almost 50% faster than with mmap_gtt. Similar
      to the GTT mmapping, unlike the regular CPU mmapping, it avoids the cache
      flush after update from CPU side, when object is passed onto GPU.  This
      type of mapping is specially useful in case of sub-region update,
      i.e. when only a portion of the object is to be updated. Using a CPU mmap
      in such cases would normally incur a clflush of the whole object, and
      using a GTT mmapping would likely require eviction of an active object or
      fence and thus stall. The write-combining CPU mmap avoids both.
      
      To ensure the cache coherency, before using this mapping, the GTT domain
      has been reused here. This provides the required cache flush if the object
      is in CPU domain or synchronization against the concurrent rendering.
      Although the access through an uncached mmap should automatically
      invalidate the cache lines, this may not be true for non-temporal write
      instructions and also not all pages of the object may be updated at any
      given point of time through this mapping.  Having a call to get_pages in
      set_to_gtt_domain function, as added in the earlier patch 'drm/i915:
      Broaden application of set-domain(GTT)', would guarantee the clflush and
      so there will be no cachelines holding the data for the object before it
      is accessed through this map.
      
      The drm_i915_gem_mmap structure (for the DRM_I915_GEM_MMAP_IOCTL) has been
      extended with a new flags field (defaulting to 0 for existent users). In
      order for userspace to detect the extended ioctl, a new parameter
      I915_PARAM_MMAP_VERSION has been added for versioning the ioctl interface.
      
      v2: Fix error handling, invalid flag detection, renaming (ickle)
      
      v3: Rebase to latest drm-intel-nightly codebase
      
      The new mmapping is exercised by igt/gem_mmap_wc,
      igt/gem_concurrent_blit and igt/gem_gtt_speed.
      
      Change-Id: Ie883942f9e689525f72fe9a8d3780c3a9faa769a
      Signed-off-by: NAkash Goel <akash.goel@intel.com>
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      1816f923
    • C
      drm/i915: Broaden application of set-domain(GTT) · 43566ded
      Chris Wilson 提交于
      Previously, this was restricted to only operate on bound objects - to
      make pointer access through the GTT to the object coherent with writes
      to and from the GPU. A second usecase is drm_intel_bo_wait_rendering()
      which at present does not function unless the object also happens to
      be bound into the GGTT (on current systems that is becoming increasingly
      rare, especially for the typical requests from mesa). A third usecase is
      a future patch wishing to extend the coverage of the GTT domain to
      include objects not bound into the GGTT but still in its coherent cache
      domain. For the latter pair of requests, we need to operate on the
      object regardless of its bind state.
      
      v2: After discussion with Akash, we came to the conclusion that the
      get-pages was required in order for accurate domain tracking in the
      corner cases (like the shrinker) and also useful for ensuring memory
      coherency with earlier cached CPU mmaps in case userspace uses exotic
      cache bypass (non-temporal) instructions.
      
      v3: Fix the inactive object check.
      
      v4: Rebase to latest drm-intel-nightly codebase
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NAkash Goel <akash.goel@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      43566ded
  4. 24 12月, 2014 1 次提交
  5. 18 12月, 2014 2 次提交
  6. 16 12月, 2014 2 次提交
    • B
      drm/i915: Use batch pools with the command parser · 78a42377
      Brad Volkin 提交于
      This patch sets up all of the tracking and copying necessary to
      use batch pools with the command parser and dispatches the copied
      (shadow) batch to the hardware.
      
      After this patch, the parser is in 'enabling' mode.
      
      Note that performance takes a hit from the copy in some cases
      and will likely need some work. At a rough pass, the memcpy
      appears to be the bottleneck. Without having done a deeper
      analysis, two ideas that come to mind are:
      1) Copy sections of the batch at a time, as they are reached
         by parsing. Might improve cache locality.
      2) Copy only up to the userspace-supplied batch length and
         memset the rest of the buffer. Reduces the number of reads.
      
      v2:
      - Remove setting the capacity of the pool
      - One global pool instead of per-ring pools
      - Replace batch_obj with shadow_batch_obj and hook into eb->vmas
      - Memset any space in the shadow batch beyond what gets copied
      - Rebased on execlist prep refactoring
      
      v3:
      - Rebase on chained batch handling
      - Squash in setting the secure dispatch flag
      - Add a note about the interaction w/secure dispatch pinning
      - Check for request->batch_obj == NULL in i915_gem_free_request
      
      v4:
      - Fix read domains for shadow_batch_obj
      - Remove the set_to_gtt_domain call from i915_parse_cmds
      - ggtt_pin/unpin in the parser block to simplify error handling
      - Check USES_FULL_PPGTT before setting DISPATCH_SECURE flag
      - Remove i915_gem_batch_pool_put calls
      
      v5:
      - Move 'pending_read_domains |= I915_GEM_DOMAIN_COMMAND' after
        the parser (danvet, from v4 0/7 feedback)
      
      Issue: VIZ-4719
      Signed-off-by: NBrad Volkin <bradley.d.volkin@intel.com>
      Reviewed-By: NJon Bloomfield <jon.bloomfield@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      78a42377
    • B
      drm/i915: Implement a framework for batch buffer pools · 493018dc
      Brad Volkin 提交于
      This adds a small module for managing a pool of batch buffers.
      The only current use case is for the command parser, as described
      in the kerneldoc in the patch. The code is simple, but separating
      it out makes it easier to change the underlying algorithms and to
      extend to future use cases should they arise.
      
      The interface is simple: init to create an empty pool, fini to
      clean it up, get to obtain a new buffer. Note that all buffers are
      expected to be inactive before cleaning up the pool.
      
      Locking is currently based on the caller holding the struct_mutex.
      We already do that in the places where we will use the batch pool
      for the command parser.
      
      v2:
      - s/BUG_ON/WARN_ON/ for locking assertions
      - Remove the cap on pool size
      - Switch from alloc/free to init/fini
      
      v3:
      - Idiomatic looping structure in _fini
      - Correct handling of purged objects
      - Don't return a buffer that's too much larger than needed
      
      v4:
      - Rebased to latest -nightly
      
      v5:
      - Remove _put() function and clean up comments to match
      
      v6:
      - Move purged check inside the loop (danvet, from v4 1/7 feedback)
      
      v7:
      - Use single list instead of two. (Chris W)
      - s/active_list/cache_list
      - Squashed in debug patches (Chris W)
        drm/i915: Add a batch pool debugfs file
      
        It provides some useful information about the buffers in
        the global command parser batch pool.
      
        v2: rebase on global pool instead of per-ring pools
        v3: rebase
      
        drm/i915: Add batch pool details to i915_gem_objects debugfs
      
        To better account for the potentially large memory consumption
        of the batch pool.
      
      v8:
      - Keep cache in LRU order (danvet, from v6 1/5 feedback)
      
      Issue: VIZ-4719
      Signed-off-by: NBrad Volkin <bradley.d.volkin@intel.com>
      Reviewed-By: NJon Bloomfield <jon.bloomfield@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      493018dc
  7. 15 12月, 2014 1 次提交
    • T
      drm/i915: Infrastructure for supporting different GGTT views per object · fe14d5f4
      Tvrtko Ursulin 提交于
      Things like reliable GGTT mappings and mirrored 2d-on-3d display will need
      to map objects into the same address space multiple times.
      
      Added a GGTT view concept and linked it with the VMA to distinguish between
      multiple instances per address space.
      
      New objects and GEM functions which do not take this new view as a parameter
      assume the default of zero (I915_GGTT_VIEW_NORMAL) which preserves the
      previous behaviour.
      
      This now means that objects can have multiple VMA entries so the code which
      assumed there will only be one also had to be modified.
      
      Alternative GGTT views are supposed to borrow DMA addresses from obj->pages
      which is DMA mapped on first VMA instantiation and unmapped on the last one
      going away.
      
      v2:
          * Removed per view special casing in i915_gem_ggtt_prepare /
            finish_object in favour of creating and destroying DMA mappings
            on first VMA instantiation and last VMA destruction. (Daniel Vetter)
          * Simplified i915_vma_unbind which does not need to count the GGTT views.
            (Daniel Vetter)
          * Also moved obj->map_and_fenceable reset under the same check.
          * Checkpatch cleanups.
      
      v3:
          * Only retire objects once the last VMA is unbound.
      
      v4:
          * Keep scatter-gather table for alternative views persistent for the
            lifetime of the VMA.
          * Propagate binding errors to callers and handle appropriately.
      
      v5:
          * Explicitly look for normal GGTT view in i915_gem_obj_bound to align
            usage in i915_gem_object_ggtt_unpin. (Michel Thierry)
          * Change to single if statement in i915_gem_obj_to_ggtt. (Michel Thierry)
          * Removed stray semi-colon in i915_gem_object_set_cache_level.
      
      For: VIZ-4544
      Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Reviewed-by: NMichel Thierry <michel.thierry@intel.com>
      [danvet: Drop hunk from i915_gem_shrink since it's just prettification
      but upsets a __must_check warning.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      fe14d5f4
  8. 05 12月, 2014 3 次提交
  9. 04 12月, 2014 1 次提交
  10. 03 12月, 2014 20 次提交
  11. 21 11月, 2014 1 次提交
  12. 20 11月, 2014 5 次提交