1. 20 May, 2014 4 commits
  2. 17 May, 2014 1 commit
    • drm/i915: Introduce mapping of user pages into video memory (userptr) ioctl · 5cc9ed4b
      Chris Wilson committed
      By exporting the ability to map user addresses and inserting PTEs
      representing their backing pages into the GTT, we can exploit UMA in order
      to utilize normal application data as a texture source or even as a
      render target (depending upon the capabilities of the chipset). This has
      a number of uses, with zero-copy downloads to the GPU and efficient
      readback making the intermixed streaming of CPU and GPU operations
      fairly efficient. This ability has many widespread implications from
      faster rendering of client-side software rasterisers (chromium),
      mitigation of stalls due to read back (firefox) and to faster pipelining
      of texture data (such as pixel buffer objects in GL or data blobs in CL).
      
      v2: Compile with CONFIG_MMU_NOTIFIER
      v3: We can sleep while performing invalidate-range, which we can utilise
      to drop our page references prior to the kernel manipulating the vma
      (for either discard or cloning) and so protect normal users.
      v4: Only run the invalidate notifier if the range intersects the bo.
      v5: Prevent userspace from attempting to GTT mmap non-page aligned buffers
      v6: Recheck after reacquire mutex for lost mmu.
      v7: Fix implicit padding of ioctl struct by rounding to next 64bit boundary.
      v8: Fix rebasing error after forward porting the back port.
      v9: Limit the userptr to page aligned entries. We now expect userspace
          to handle all the offset-in-page adjustments itself.
      v10: Prevent vma from being copied across fork to avoid issues with cow.
      v11: Drop vma behaviour changes -- locking is nigh on impossible.
           Use a worker to load user pages to avoid lock inversions.
      v12: Use get_task_mm()/mmput() for correct refcounting of mm.
      v13: Use a worker to release the mmu_notifier to avoid lock inversion
      v14: Decouple mmu_notifier from struct_mutex using a custom mmu_notifier
           with its own locking and tree of objects for each mm/mmu_notifier.
      v15: Prevent overlapping userptr objects, and invalidate all objects
           within the mmu_notifier range
      v16: Fix a typo for iterating over multiple objects in the range and
           rearrange error path to destroy the mmu_notifier locklessly.
           Also close a race between invalidate_range and the get_pages_worker.
      v17: Close a race between get_pages_worker/invalidate_range and fresh
           allocations of the same userptr range - and notice that
           struct_mutex was presumed to be held when during creation it wasn't.
      v18: Sigh. Fix the refactor of st_set_pages() to allocate enough memory
           for the struct sg_table and to clear it before reporting an error.
      v19: Always error out on read-only userptr requests as we don't have the
           hardware infrastructure to support them at the moment.
      v20: Refuse to implement read-only support until we have the required
           infrastructure - but reserve the bit in flags for future use.
      v21: use_mm() is not required for get_user_pages(). It is only meant to
           be used to fix up the kernel thread's current->mm for use with
           copy_user().
      v22: Use sg_alloc_table_from_pages for that chunky feeling
      v23: Export a function for sanity checking dma-buf rather than encode
           userptr details elsewhere, and clean up comments based on
           suggestions by Bradley.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
      Cc: "Gong, Zhipeng" <zhipeng.gong@intel.com>
      Cc: Akash Goel <akash.goel@intel.com>
      Cc: "Volkin, Bradley D" <bradley.d.volkin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
      Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com>
      [danvet: Frob ioctl allocation to pick the next one - will cause a bit
      of fuss with create2 apparently, but such are the rules.]
      [danvet2: oops, forgot to git add after manual patch application]
      [danvet3: Appease sparse.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
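      The v9 note in the userptr commit leaves the offset-in-page arithmetic
      to userspace. A minimal sketch of that arithmetic follows, assuming a
      power-of-two page size. The struct and the function
      sketch_align_to_pages() are hypothetical names used for illustration;
      they are not part of the i915 uAPI or of libdrm.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper for illustration; not an i915 or libdrm interface. */
struct sketch_aligned_range {
    uintptr_t base;   /* buffer start rounded down to a page boundary */
    size_t length;    /* page-multiple length that covers the whole buffer */
    size_t offset;    /* where the caller's data begins, relative to base */
};

/* page_size is assumed to be a power of two (for example 4096). */
static struct sketch_aligned_range
sketch_align_to_pages(uintptr_t addr, size_t size, size_t page_size)
{
    const uintptr_t mask = (uintptr_t)page_size - 1;
    const uintptr_t end = (addr + size + mask) & ~mask; /* round end up */
    struct sketch_aligned_range r;

    r.base = addr & ~mask;                  /* round start down */
    r.offset = (size_t)(addr - r.base);     /* offset-in-page of the data */
    r.length = (size_t)(end - r.base);      /* whole pages only */
    return r;
}
```

      Under these assumptions, userspace would hand base and length to the
      kernel and apply offset itself whenever it addresses the data.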
  3. 16 May, 2014 1 commit
    • drm/i915: Gracefully handle obj not bound to GGTT in is_pin_display · 19656430
      Oscar Mateo committed
      Otherwise, we do a NULL pointer dereference.
      
      I've seen this happen while handling an error in
      i915_gem_object_pin_to_display_plane():
      
      If i915_gem_object_set_cache_level() fails, we call is_pin_display()
      to handle the error. At this point, the object is still not pinned
      to GGTT and maybe not even bound, so we have to check before we
      dereference its GGTT vma.
      
      The IGT test kms_flip/bo-too-big exercises this bug.
      
      v2: Chris Wilson says restoring the old value is easier, but that
      is_pin_display is useful as a theory of operation. Take the solomonic
      decision: at least this way is_pin_display is a little more robust
      (until Chris can kill it off).
      
      v3: Chris suggests the WARN in i915_gem_obj_to_ggtt has outlived its
      usefulness: add a reminder to remove it.
      
      Issue: VIZ-3772
      Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
      Testcase: igt/kms_flip/bo-too-big
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
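      The guard described in the is_pin_display commit can be sketched as
      below. This is a simplified userspace model, not the driver's code:
      the struct layouts and the sketch_ names are hypothetical, and the
      real function works on the driver's own object and vma types.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the driver's types. */
struct sketch_vma {
    int pin_count;
};

struct sketch_obj {
    struct sketch_vma *ggtt_vma; /* NULL while the object is not bound */
};

/* Report "not pinned for display" instead of dereferencing a NULL vma. */
static bool sketch_is_pin_display(const struct sketch_obj *obj)
{
    if (obj->ggtt_vma == NULL)
        return false;
    return obj->ggtt_vma->pin_count > 0;
}
```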
  4. 15 May, 2014 2 commits
    • drm/i915: Only do gtt cleanup in vma_unbind for the global vma · 8b1bc9b4
      Daniel Vetter committed
      Otherwise we end up tearing down fences when e.g. the client quits
      way too early. Might or might not fix a fence pin_count BUG Ville has
      reported.
      
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: Don't drop pinned fences · aff10b30
      Daniel Vetter committed
      Userspace can currently provoke this when e.g. trying to use a pinned
      scanout as a cursor or overlay target. Later on that might lead to
      some fun fence pin count mayhem.
      
      Spurred by Ville's report that something goes wrong here and
      originally I've thought that this might slip through the pwrite gtt
      fastpath. But that one checks for obj tiling, so should be ok.
      
      But one thing that _does_ blow up is the vma unbinding with more than
      one address space. The next patch will fix this.
      
      v2: Use a WARN_ON - Chris pointed out that we already catch all cases
      so userspace can't provoke this like I've originally feared.
      
      While reviewing relevant code I've noticed a pile of DRM_ERROR in the
      overlay&cursor code which are all triggerable by userspace. Tune them
      down while at it.
      
      v3: Split out the DRM_ERROR->DRM_DEBUG_KMS change into a separate patch,
      as requested by Chris.
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
  5. 13 May, 2014 1 commit
    • drm/i915: WARN_ON fence pin leaks · d8ffa60b
      Daniel Vetter committed
      The fence pin count should always be <= the bo pin count. If that's
      not the case then we have a funny problem and are leaking references
      somewhere.
      
      Which means we can catch fence pin leaks by checking for the same
      upper limit as we do for the bo pin count. Inspired by a discussion
      with Ville about a fence leak igt testcase.
      
      v2: Also check for fence->pin_count <= ggtt_vma->pin_count, since that
      might catch a leak even quicker. Also de-inline them, they're getting
      too big.
      
      v3: Don't separately check for MAX_PIN_COUNT since the > vma->pin_count
      check will catch that already (Chris).
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
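      The invariant from the fence pin leak commit, fence pin count <= vma
      pin count, can be modelled in a few lines. This is a userspace sketch
      with hypothetical names; where the kernel would WARN_ON, the sketch
      returns false so the result can be checked.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's bookkeeping. */
struct sketch_pins {
    int vma_pin_count;   /* pins held on the global GTT binding */
    int fence_pin_count; /* pins held on the fence register */
};

/* True while fence pins do not exceed vma pins. */
static bool sketch_fence_pins_ok(const struct sketch_pins *p)
{
    return p->fence_pin_count <= p->vma_pin_count;
}

/* Take a fence pin and report whether the invariant still holds. */
static bool sketch_pin_fence(struct sketch_pins *p)
{
    p->fence_pin_count++;
    return sketch_fence_pins_ok(p);
}

/* Drop a fence pin; refuse to go below zero. */
static bool sketch_unpin_fence(struct sketch_pins *p)
{
    if (p->fence_pin_count <= 0)
        return false;
    p->fence_pin_count--;
    return sketch_fence_pins_ok(p);
}
```

      Because the vma pin count is itself bounded, this comparison also
      bounds the fence pin count, which is why v3 drops the separate
      MAX_PIN_COUNT check.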
  6. 08 May, 2014 1 commit
  7. 07 May, 2014 1 commit
    • drm/i915: Make aliasing a 2nd class VM · 6e7186af
      Ben Widawsky committed
      There is a good debate to be had about how best to fit the aliasing
      PPGTT into the code. However, as it stands right now, getting aliasing
      PPGTT bindings is a hack, and done through implicit arguments. To make
      this absolutely clear, WARN and return an error if a driver writer tries
      to do something they shouldn't.
      
      I have no issue with an eventual revert of this patch. It makes sense
      for what we have today.
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
  8. 05 May, 2014 6 commits
  9. 23 April, 2014 1 commit
  10. 22 April, 2014 1 commit
    • drm/irq: remove cargo-culted locking from irq_install/uninstall · e090c53b
      Daniel Vetter committed
      The dev->struct_mutex locking in drm_irq.c only protects
      dev->irq_enabled. Which isn't really much at all and only prevents
      especially nasty ums userspace from concurrently installing the
      interrupt handling a few times. Or at least trying.
      
      There are tons of unlocked readers of dev->irq_enabled in the vblank
      wait code (and by extension also in the pageflip code since that uses
      the same vblank timestamp engine).
      
      Real modesetting drivers should ensure that nothing can go haywire
      with a sane setup teardown sequence. So we only really need this for
      the drm_control ioctl, everywhere else this will just paper over
      nastiness.
      
      Note that drm/i915 is a bit special due to the gem+ums combination.
      So there we also need to properly protect the entervt and leavevt
      ioctls. But it's definitely saner to do everything in one go than to
      drop the lock in-between.
      
      Finally there's the gpu reset code in drm/i915. That one's just racy
      (concurrent userspace calls for vblank waits or pageflips could
      spuriously fail). So wrap it up with a nice comment since fixing
      this is more involved.
      
      v2: Rebase and fix commit message (Thierry)
      Reviewed-by: Thierry Reding <treding@nvidia.com>
      Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
  11. 11 April, 2014 1 commit
  12. 09 April, 2014 1 commit
  13. 04 April, 2014 1 commit
    • drm: Add support for two-ended allocation, v3 · 62347f9e
      Lauri Kasanen committed
      Clients like i915 need to segregate cache domains within the GTT which
      can lead to small amounts of fragmentation. By allocating the uncached
      buffers from the bottom and the cacheable buffers from the top, we can
      reduce the amount of wasted space and also optimize allocation of the
      mappable portion of the GTT to only those buffers that require CPU
      access through the GTT.
      
      For other drivers, allocating small bos from one end and large ones
      from the other helps improve the quality of fragmentation.
      
      Based on drm_mm work by Chris Wilson.
      
      v3: Changed to use a TTM placement flag
      v2: Updated kerneldoc
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Ben Widawsky <ben@bwidawsk.net>
      Cc: Christian König <deathsimple@vodafone.de>
      Signed-off-by: Lauri Kasanen <cand@gmx.com>
      Signed-off-by: David Airlie <airlied@redhat.com>
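      The idea of two-ended allocation can be illustrated with a toy
      allocator that hands out space from either end of one free range.
      This is only a sketch under simplifying assumptions (a single
      contiguous free range, no freeing, no alignment); the names are
      hypothetical and this is not the drm_mm or TTM interface.

```c
#include <stdbool.h>
#include <stdint.h>

/* Free space is the half-open interval [low, high). */
struct sketch_range {
    uint64_t low;
    uint64_t high;
};

enum sketch_end {
    SKETCH_BOTTOM_UP, /* e.g. one class of buffers */
    SKETCH_TOP_DOWN   /* e.g. the other class of buffers */
};

/* Carve size bytes from the chosen end; false if it does not fit. */
static bool sketch_alloc(struct sketch_range *r, uint64_t size,
                         enum sketch_end end, uint64_t *start)
{
    if (size == 0 || size > r->high - r->low)
        return false;

    if (end == SKETCH_BOTTOM_UP) {
        *start = r->low;
        r->low += size;
    } else {
        r->high -= size;
        *start = r->high;
    }
    return true;
}
```

      Keeping the two classes at opposite ends leaves the remaining free
      space as one block in the middle rather than interleaved holes.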
  14. 31 March, 2014 1 commit
  15. 21 March, 2014 1 commit
  16. 19 March, 2014 1 commit
  17. 16 March, 2014 1 commit
    • drm: use anon-inode instead of relying on cdevs · 6796cb16
      David Herrmann committed
      DRM drivers share a common address_space across all character-devices of a
      single DRM device. This allows simple buffer eviction and mapping-control.
      However, DRM core currently waits for the first ->open() on any char-dev
      to mark the underlying inode as backing inode of the device. This delayed
      initialization causes ugly conditions all over the place:
        if (dev->dev_mapping)
          do_sth();
      
      To avoid delayed initialization and to stop reusing the inode of the
      char-dev, we allocate an anonymous inode for each DRM device and reset
      filp->f_mapping to it on ->open().
      Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
  18. 13 March, 2014 1 commit
  19. 08 March, 2014 3 commits
  20. 06 March, 2014 6 commits
    • drm/i915: Make i915_gem_retire_requests_ring() static · cb216aa8
      Damien Lespiau committed
      Its last usage outside of i915_gem.c was removed in:
      
        commit 1f70999f
        Author: Chris Wilson <chris@chris-wilson.co.uk>
        Date:   Mon Jan 27 22:43:07 2014 +0000
      
           drm/i915: Prevent recursion by retiring requests when the ring is full
      Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: Record pid/comm of hanging task · ab0e7ff9
      Chris Wilson committed
      After finding the guilty batch and request, we can use it to find the
      process that submitted the batch and then add the culprit into the error
      state.
      
      This is a slightly different approach from Ben's in that instead of
      adding the extra information into the struct i915_hw_context, we use the
      information already captured in struct drm_file which is then referenced
      from the request.
      
      v2: Also capture the workaround buffer for gen2, so that we can compare
          its contents against the intended batch for the active request.
      
      v3: Rebase (Mika)
      v4: Check for null context (Chris)
          checkpatch warnings fixed
      
      Link: http://lists.freedesktop.org/archives/intel-gfx/2013-August/032280.html
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
      Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> (v4)
      Acked-by: Ben Widawsky <ben@bwidawsk.net>
      Cc: Ben Widawsky <ben@bwidawsk.net>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: Rely on accurate request tracking for finding hung batches · 8d9fc7fd
      Chris Wilson committed
      In the past, it was possible to have multiple batches per request due to
      a stray signal or ENOMEM. As a result we had to scan each active object
      (filtered by those having the COMMAND domain) for the one that contained
      the ACTHD pointer. This was then made more complicated by the
      introduction of ppgtt, whereby ACTHD then pointed into the address space
      of the context and so also needed to be taken into account.
      
      This is a fairly robust approach (though the implementation is a little
      fragile and depends upon the per-generation setup, registers and
      parameters). However, due to the requirements for hangstats, we needed a
      robust method for associating batches with a particular request and
      having that we can rely upon it for finding the associated batch object
      for error capture.
      
      If the batch buffer tracking is not robust enough, that should become
      apparent quite quickly through an erroneous error capture. That should
      also help to make sure that the runtime reporting to userspace is
      robust. It also means that we then report the oldest incomplete batch on
      each ring, which can be useful for determining the state of userspace at
      the time of a hang.
      
      v2: Use i915_gem_find_active_request (Mika)
      
      v3: remove check for ring->get_seqno, split long lines (Ben)
      
      v4: check that context is available (Chris)
          checkpatch warnings fixed
      
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
      Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> (v3)
      Cc: Ben Widawsky <benjamin.widawsky@intel.com>
      Reviewed-by: Ben Widawsky <ben@bwidawsk.net> (v3)
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
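      Reporting the oldest incomplete batch on a ring amounts to walking the
      requests in submission order and stopping at the first one whose seqno
      the hardware has not yet passed. A minimal sketch, assuming 32-bit
      seqnos that may wrap and an array ordered oldest first; the struct and
      function names are hypothetical and do not match the driver's types.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Wrap-safe test: has seqno `a` reached or passed seqno `b`? */
static bool sketch_seqno_passed(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) >= 0;
}

/* Hypothetical request record: a seqno plus the batch it submitted. */
struct sketch_request {
    uint32_t seqno;
    int batch_id;
};

/* Return the oldest request not yet completed, or NULL if all are done. */
static const struct sketch_request *
sketch_find_oldest_incomplete(const struct sketch_request *reqs, size_t count,
                              uint32_t completed_seqno)
{
    for (size_t i = 0; i < count; i++) {
        if (!sketch_seqno_passed(completed_seqno, reqs[i].seqno))
            return &reqs[i];
    }
    return NULL;
}
```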
    • drm/i915: Reset vma->mm_list after unbinding · 64bf9303
      Chris Wilson committed
      In place of true activity counting, we walk the list of vma associated
      with an object managing each on the vm's active/inactive list every time
      we call move-to-inactive. This depends upon the vma->mm_list being
      cleared after unbinding, or else we run into difficulty when tracking
      the object in multiple vm's - we see a use-after free and corruption of
      the mm_list.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Ben Widawsky <ben@bwidawsk.net>
      Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: Don't ban default context when stop_rings!=0 · ccc7bed0
      Ville Syrjälä committed
      If we've explicitly stopped the rings for testing purposes, don't ban
      the default context. Fixes kms_flip hang tests.
      Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Acked-by: Mika Kuoppala <mika.kuoppala@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: Accurately track when we mark the hardware as idle/busy · f62a0076
      Chris Wilson committed
      We currently call intel_mark_idle() too often, as we do so as a
      side-effect of processing the request queue. However, the calls to
      intel_mark_idle() are expected to be paired with a call to
      intel_mark_busy() (or else we try to idle the hardware by accessing
      registers that are already disabled). Make the idle/busy tracking
      explicit to prevent the multiple calls.
      
      v2: We can drop some of the complexity in __i915_add_request() as
      queue_delayed_work() already behaves as we want (not requeuing the item
      if it is already in the queue) and mark_busy/mark_idle imply that the
      idle task is inactive.
      
      v3: We do still need to cancel the pending idle task so that it is sent
      again after the current busy load completes (not in the middle of it).
      Reported-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
      Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
      Tested-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
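      Making the idle/busy tracking explicit can be sketched with a single
      flag that turns repeated calls into no-ops, so every idle transition
      is paired with an earlier busy transition. This is a userspace model
      with hypothetical names; the counters stand in for the register
      accesses the real functions would perform.

```c
#include <stdbool.h>

/* Hypothetical model of the hardware state tracked by the driver. */
struct sketch_hw {
    bool busy;
    int busy_transitions; /* stands in for waking the hardware */
    int idle_transitions; /* stands in for idling the hardware */
};

/* Only the first call after an idle period does any work. */
static void sketch_mark_busy(struct sketch_hw *hw)
{
    if (hw->busy)
        return;
    hw->busy = true;
    hw->busy_transitions++;
}

/* Never idle hardware that was not previously marked busy. */
static void sketch_mark_idle(struct sketch_hw *hw)
{
    if (!hw->busy)
        return;
    hw->busy = false;
    hw->idle_transitions++;
}
```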
  21. 14 February, 2014 4 commits