1. 08 May, 2014 1 commit
  2. 07 May, 2014 1 commit
    • drm/i915: Make aliasing a 2nd class VM · 6e7186af
      Ben Widawsky authored
      There is a good debate to be had about how best to fit the aliasing
      PPGTT into the code. However, as it stands right now, getting aliasing
      PPGTT bindings is a hack done through implicit arguments. To make
      this absolutely clear, WARN and return an error if a driver writer tries
      to do something they shouldn't.
      
      I have no issue with an eventual revert of this patch. It makes sense
      for what we have today.
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      6e7186af
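
      A minimal sketch of the guard pattern described above, with made-up names
      rather than the actual i915 entry points: the bind helper refuses to operate
      on the aliasing VM directly and warns loudly, so misuse fails visibly instead
      of silently taking the implicit-argument path.

        #include <stdio.h>
        #include <errno.h>

        struct vm { int is_aliasing; };

        /* Hypothetical bind helper: bindings into the aliasing PPGTT must come in
         * through the global-GTT path, so any direct attempt is rejected. */
        static int vm_bind_object(struct vm *vm)
        {
            if (vm->is_aliasing) {
                fprintf(stderr, "WARN: direct bind into aliasing PPGTT\n");
                return -EINVAL;
            }
            /* ... write the object's PTEs into vm here ... */
            return 0;
        }

        int main(void)
        {
            struct vm aliasing = { .is_aliasing = 1 };
            printf("bind returned %d\n", vm_bind_object(&aliasing)); /* -EINVAL */
            return 0;
        }
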
  3. 05 May, 2014 6 commits
  4. 23 Apr, 2014 1 commit
  5. 22 Apr, 2014 1 commit
    • drm/irq: remove cargo-culted locking from irq_install/uninstall · e090c53b
      Daniel Vetter authored
      The dev->struct_mutex locking in drm_irq.c only protects
      dev->irq_enabled, which isn't really much at all and only prevents
      especially nasty ums userspace from concurrently installing the
      interrupt handler a few times, or at least from trying.
      
      There are tons of unlocked readers of dev->irq_enabled in the vblank
      wait code (and by extension also in the pageflip code since that uses
      the same vblank timestamp engine).
      
      Real modesetting drivers should ensure that nothing can go haywire
      with a sane setup/teardown sequence. So we only really need this for
      the drm_control ioctl; everywhere else this will just paper over
      nastiness.
      
      Note that drm/i915 is a bit special due to the gem+ums combination.
      So there we also need to properly protect the entervt and leavevt
      ioctls. But it's definitely saner to do everything in one go than to
      drop the lock in-between.
      
      Finally there's the gpu reset code in drm/i915. That one's just racy
      (concurrent userspace calls for vblank waits or pageflips could
      spuriously fail), so wrap it up with a nice comment since fixing
      this is more involved.
      
      v2: Rebase and fix commit message (Thierry)
      Reviewed-by: Thierry Reding <treding@nvidia.com>
      Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      e090c53b
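
      The locking scope described above can be modelled in a few lines (a sketch
      with invented names, not the drm_irq.c code): the mutex guards only the
      enabled flag, so a second concurrent install attempt fails cleanly, while
      readers elsewhere check the flag without taking the lock.

        #include <stdio.h>
        #include <errno.h>
        #include <stdbool.h>
        #include <pthread.h>

        static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
        static bool irq_enabled;   /* the only state the lock protects */

        static int irq_install(void)
        {
            pthread_mutex_lock(&lock);
            if (irq_enabled) {                 /* concurrent second install */
                pthread_mutex_unlock(&lock);
                return -EBUSY;
            }
            irq_enabled = true;
            pthread_mutex_unlock(&lock);
            /* ... request the interrupt, set up vblank handling, ... */
            return 0;
        }

        int main(void)
        {
            int first = irq_install();
            int second = irq_install();
            printf("%d %d\n", first, second);  /* 0, then -EBUSY */
            return 0;
        }
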
  6. 11 Apr, 2014 1 commit
  7. 09 Apr, 2014 1 commit
  8. 04 Apr, 2014 1 commit
    • drm: Add support for two-ended allocation, v3 · 62347f9e
      Lauri Kasanen authored
      Clients like i915 need to segregate cache domains within the GTT which
      can lead to small amounts of fragmentation. By allocating the uncached
      buffers from the bottom and the cacheable buffers from the top, we can
      reduce the amount of wasted space and also optimize allocation of the
      mappable portion of the GTT to only those buffers that require CPU
      access through the GTT.
      
      For other drivers, allocating small bos from one end and large ones
      from the other helps reduce the impact of fragmentation.
      
      Based on drm_mm work by Chris Wilson.
      
      v3: Changed to use a TTM placement flag
      v2: Updated kerneldoc
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Ben Widawsky <ben@bwidawsk.net>
      Cc: Christian König <deathsimple@vodafone.de>
      Signed-off-by: Lauri Kasanen <cand@gmx.com>
      Signed-off-by: David Airlie <airlied@redhat.com>
      62347f9e
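
      The two-ended strategy can be illustrated with a toy range allocator (purely
      illustrative, not the drm_mm API): one class of buffers grows up from the
      bottom of the range while the other grows down from the top, so the two
      classes never interleave and fragment each other.

        #include <stdio.h>
        #include <stdbool.h>

        #define GTT_SIZE 1024

        static unsigned long bottom;             /* next free offset from the bottom */
        static unsigned long top = GTT_SIZE;     /* one past the last free offset    */

        /* Allocate 'size' units from the bottom or from the top of the range. */
        static long alloc_range(unsigned long size, bool from_top)
        {
            if (bottom + size > top)
                return -1;                       /* range exhausted */
            if (from_top) {
                top -= size;
                return top;
            }
            bottom += size;
            return bottom - size;
        }

        int main(void)
        {
            long uncached  = alloc_range(256, false);  /* e.g. uncached/large: bottom-up */
            long cacheable = alloc_range(16, true);    /* e.g. cacheable/small: top-down */
            printf("uncached at %ld, cacheable at %ld\n", uncached, cacheable);
            return 0;
        }
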
  9. 31 Mar, 2014 1 commit
  10. 21 Mar, 2014 1 commit
  11. 19 Mar, 2014 1 commit
  12. 16 Mar, 2014 1 commit
    • drm: use anon-inode instead of relying on cdevs · 6796cb16
      David Herrmann authored
      DRM drivers share a common address_space across all character-devices of a
      single DRM device. This allows simple buffer eviction and mapping-control.
      However, DRM core currently waits for the first ->open() on any char-dev
      to mark the underlying inode as backing inode of the device. This delayed
      initialization causes ugly conditions all over the place:
        if (dev->dev_mapping)
          do_sth();
      
      To avoid delayed initialization and to stop reusing the inode of the
      char-dev, we allocate an anonymous inode for each DRM device and reset
      filp->f_mapping to it on ->open().
      Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
      6796cb16
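
      A rough userspace model of the change (hypothetical types, not the kernel
      API): instead of lazily adopting whichever char-dev inode happens to be
      opened first, the device owns one mapping object from initialisation onwards
      and every open() simply points the file at it, so the "if (dev->dev_mapping)"
      checks go away.

        #include <stdio.h>
        #include <stdlib.h>

        struct address_space { int id; };
        struct drm_device    { struct address_space *anon_mapping; };
        struct file          { struct address_space *f_mapping; };

        /* Device init: allocate the shared mapping up front (the kernel patch
         * allocates an anonymous inode for this purpose). */
        static void drm_dev_init(struct drm_device *dev)
        {
            dev->anon_mapping = malloc(sizeof(*dev->anon_mapping));
            dev->anon_mapping->id = 42;
        }

        /* open(): every file handle shares the device's mapping from day one. */
        static void drm_open(struct drm_device *dev, struct file *filp)
        {
            filp->f_mapping = dev->anon_mapping;
        }

        int main(void)
        {
            struct drm_device dev;
            struct file a, b;
            drm_dev_init(&dev);
            drm_open(&dev, &a);
            drm_open(&dev, &b);
            printf("shared mapping: %d\n", a.f_mapping == b.f_mapping);  /* 1 */
            free(dev.anon_mapping);
            return 0;
        }
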
  13. 13 Mar, 2014 1 commit
  14. 08 Mar, 2014 3 commits
  15. 06 Mar, 2014 6 commits
    • drm/i915: Make i915_gem_retire_requests_ring() static · cb216aa8
      Damien Lespiau authored
      Its last usage outside of i915_gem.c was removed in:
      
        commit 1f70999f
        Author: Chris Wilson <chris@chris-wilson.co.uk>
        Date:   Mon Jan 27 22:43:07 2014 +0000
      
           drm/i915: Prevent recursion by retiring requests when the ring is full
      Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      cb216aa8
    • drm/i915: Record pid/comm of hanging task · ab0e7ff9
      Chris Wilson authored
      After finding the guilty batch and request, we can use it to find the
      process that submitted the batch and then add the culprit into the error
      state.
      
      This is a slightly different approach from Ben's in that instead of
      adding the extra information into the struct i915_hw_context, we use the
      information already captured in struct drm_file which is then referenced
      from the request.
      
      v2: Also capture the workaround buffer for gen2, so that we can compare
          its contents against the intended batch for the active request.
      
      v3: Rebase (Mika)
      v4: Check for null context (Chris)
          checkpatch warnings fixed
      
      Link: http://lists.freedesktop.org/archives/intel-gfx/2013-August/032280.html
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
      Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> (v4)
      Acked-by: Ben Widawsky <ben@bwidawsk.net>
      Cc: Ben Widawsky <ben@bwidawsk.net>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      ab0e7ff9
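
      Schematically (simplified structures, not the real i915 ones), the request
      keeps a reference to the file/client that submitted it, and the error-capture
      path copies the pid and comm from there into the error state:

        #include <stdio.h>
        #include <string.h>

        struct drm_client  { int pid; char comm[16]; };   /* stands in for drm_file    */
        struct request     { struct drm_client *file; };  /* submitter kept on request */
        struct error_state { int pid; char comm[16]; };

        /* On a hang, pull the culprit's identity out of the guilty request. */
        static void capture_culprit(struct error_state *e, const struct request *rq)
        {
            e->pid = rq->file ? rq->file->pid : -1;
            snprintf(e->comm, sizeof(e->comm), "%s",
                     rq->file ? rq->file->comm : "<none>");
        }

        int main(void)
        {
            struct drm_client hog = { .pid = 1234, .comm = "some-gl-app" };
            struct request rq = { .file = &hog };
            struct error_state err;
            capture_culprit(&err, &rq);
            printf("hanging task: %s [%d]\n", err.comm, err.pid);
            return 0;
        }
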
    • drm/i915: Rely on accurate request tracking for finding hung batches · 8d9fc7fd
      Chris Wilson authored
      In the past, it was possible to have multiple batches per request due to
      a stray signal or ENOMEM. As a result we had to scan each active object
      (filtered by those having the COMMAND domain) for the one that contained
      the ACTHD pointer. This was then made more complicated by the
      introduction of ppgtt, whereby ACTHD then pointed into the address space
      of the context and so also needed to be taken into account.
      
      This is a fairly robust approach (though the implementation is a little
      fragile and depends upon the per-generation setup, registers and
      parameters). However, due to the requirements for hangstats, we needed a
      robust method for associating batches with a particular request; having
      that, we can rely upon it for finding the associated batch object for
      error capture.
      
      If the batch buffer tracking is not robust enough, that should become
      apparent quite quickly through an erroneous error capture. That should
      also help to make sure that the runtime reporting to userspace is
      robust. It also means that we then report the oldest incomplete batch on
      each ring, which can be useful for determining the state of userspace at
      the time of a hang.
      
      v2: Use i915_gem_find_active_request (Mika)
      
      v3: remove check for ring->get_seqno, split long lines (Ben)
      
      v4: check that context is available (Chris)
          checkpatch warnings fixed
      
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
      Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> (v3)
      Cc: Ben Widawsky <benjamin.widawsky@intel.com>
      Reviewed-by: Ben Widawsky <ben@bwidawsk.net> (v3)
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      8d9fc7fd
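
      With one batch per request, the lookup described above reduces to walking the
      per-ring request list in submission order and returning the first request
      whose seqno the hardware has not yet completed (a sketch with simplified
      types):

        #include <stdio.h>
        #include <stddef.h>

        struct request { unsigned int seqno; struct request *next; };

        /* Oldest incomplete request: its batch is the error-capture candidate. */
        static struct request *find_active_request(struct request *oldest_first,
                                                   unsigned int completed_seqno)
        {
            for (struct request *rq = oldest_first; rq; rq = rq->next)
                if (rq->seqno > completed_seqno)
                    return rq;
            return NULL;  /* ring is idle */
        }

        int main(void)
        {
            struct request r3 = { 3, NULL }, r2 = { 2, &r3 }, r1 = { 1, &r2 };
            struct request *hung = find_active_request(&r1, 1);  /* seqno 1 done */
            printf("oldest incomplete seqno: %u\n", hung ? hung->seqno : 0);  /* 2 */
            return 0;
        }
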
    • drm/i915: Reset vma->mm_list after unbinding · 64bf9303
      Chris Wilson authored
      In place of true activity counting, we walk the list of vma associated
      with an object, managing each on the vm's active/inactive list every time
      we call move-to-inactive. This depends upon the vma->mm_list being
      cleared after unbinding, or else we run into difficulty when tracking
      the object in multiple vm's - we see a use-after-free and corruption of
      the mm_list.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Ben Widawsky <ben@bwidawsk.net>
      Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      64bf9303
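
      The failure mode is the classic stale-list-node problem. A minimal model
      (not the kernel list API) shows why the vma's node has to be reset after
      removal, so that a later "is it still tracked?" check does not chase
      pointers into a torn-down list:

        #include <stdio.h>
        #include <stdbool.h>

        struct node { struct node *prev, *next; };

        static bool node_on_list(const struct node *n) { return n->next != n; }

        /* Unlink and, crucially, re-point the node at itself so later checks see
         * "not on any list" instead of dangling pointers. */
        static void unlink_and_reset(struct node *n)
        {
            n->prev->next = n->next;
            n->next->prev = n->prev;
            n->next = n->prev = n;     /* the clearing step the fix is about */
        }

        int main(void)
        {
            struct node head = { &head, &head };
            struct node vma  = { &head, &head };
            head.next = head.prev = &vma;          /* list: head <-> vma */
            unlink_and_reset(&vma);
            printf("still tracked: %d\n", node_on_list(&vma));  /* 0 */
            return 0;
        }
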
    • drm/i915: Don't ban default context when stop_rings!=0 · ccc7bed0
      Ville Syrjälä authored
      If we've explicitly stopped the rings for testing purposes, don't ban
      the default context. Fixes kms_flip hang tests.
      Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Acked-by: Mika Kuoppala <mika.kuoppala@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      ccc7bed0
    • drm/i915: Accurately track when we mark the hardware as idle/busy · f62a0076
      Chris Wilson authored
      We currently call intel_mark_idle() too often, as we do so as a
      side-effect of processing the request queue. However, the calls to
      intel_mark_idle() are expected to be paired with a call to
      intel_mark_busy() (or else we try to idle the hardware by accessing
      registers that are already disabled). Make the idle/busy tracking
      explicit to prevent the multiple calls.
      
      v2: We can drop some of the complexity in __i915_add_request() as
      queue_delayed_work() already behaves as we want (not requeuing the item
      if it is already in the queue) and mark_busy/mark_idle imply that the
      idle task is inactive.
      
      v3: We do still need to cancel the pending idle task so that it is sent
      again after the current busy load completes (not in the middle of it).
      Reported-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
      Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
      Tested-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      f62a0076
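
      The pairing requirement can be captured with a single state bit so that the
      mark_idle/mark_busy transitions only ever happen once per busy period (a
      simplified sketch with hypothetical bookkeeping, not the i915 functions):

        #include <stdio.h>
        #include <stdbool.h>

        static bool gpu_busy;
        static int busy_transitions, idle_transitions;

        static void mark_busy(void)
        {
            if (gpu_busy)
                return;                 /* already busy: keep the calls paired */
            gpu_busy = true;
            busy_transitions++;         /* ... power up, raise GPU frequency ... */
        }

        static void mark_idle(void)
        {
            if (!gpu_busy)
                return;                 /* don't poke registers that are already off */
            gpu_busy = false;
            idle_transitions++;         /* ... drop frequency, schedule power-down ... */
        }

        int main(void)
        {
            mark_busy(); mark_busy();   /* second call is a no-op */
            mark_idle(); mark_idle();   /* ditto */
            printf("busy=%d idle=%d\n", busy_transitions, idle_transitions); /* 1 1 */
            return 0;
        }
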
  16. 14 Feb, 2014 5 commits
    • drm/i915: Only bind each object rather than for every execbuffer · 8ea99c92
      Daniel Vetter authored
      One side-effect of the introduction of ppgtt was that we needed to
      rebind the object into the appropriate vm (and global gtt in some
      peculiar cases). For simplicity this was done twice for every object on
      every call to execbuffer. However, that adds a tremendous amount of CPU
      overhead (rewriting all the PTEs for all objects into WC memory) per
      draw. The fix is to push all the decisions about which vm to bind into,
      and when, down into the low-level bind routines through hints rather than
      performing the bind unconditionally in the execbuffer routine.
      
      Note that this is a regression introduced in the full ppgtt feature
      branch; before this we only re-bound objects when the relevant
      has_(aliasing_ppgtt|global_gtt)_mapping flag was clear. But since
      that's per-object and not per-vma, that optimization broke.
      
      v2: Split out prep work and unrelated changes.
      
      v3: Bring back functional change around PIN_GLOBAL that I've
      accidentally split out.
      
      v4: Remove the temporary hack for the old binding logic to avoid
      bisection issues.
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72906
      Tested-by: jianx.zhou@intel.com
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
      Cc: Ben Widawsky <benjamin.widawsky@intel.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Acked-by: Ben Widawsky <ben@bwidawsk.net>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      8ea99c92
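
      The cost being removed is the unconditional PTE rewrite. In outline
      (hypothetical fields, not the real vma code), the bind helper becomes a
      no-op whenever the vma already carries the mappings the pin flags ask for:

        #include <stdio.h>

        #define PIN_GLOBAL 0x1                    /* also needs a global-GTT mapping */

        struct vma { int bound_ggtt; int bound_ppgtt; int pte_writes; };

        /* Only touch the page tables for mappings that are actually missing. */
        static void vma_bind(struct vma *vma, unsigned int flags)
        {
            if ((flags & PIN_GLOBAL) && !vma->bound_ggtt) {
                vma->bound_ggtt = 1;
                vma->pte_writes++;                /* expensive WC writes happen here */
            }
            if (!vma->bound_ppgtt) {
                vma->bound_ppgtt = 1;
                vma->pte_writes++;
            }
        }

        int main(void)
        {
            struct vma v = { 0, 0, 0 };
            for (int draw = 0; draw < 1000; draw++)   /* one execbuffer per draw call */
                vma_bind(&v, PIN_GLOBAL);
            printf("PTE rewrites: %d\n", v.pte_writes);  /* 2, not 2000 */
            return 0;
        }
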
    • drm/i915: Directly return the vma from bind_to_vm · 262de145
      Daniel Vetter authored
      This is prep work for reworking the object_pin logic. Atm
      it still does a (now redundant) lookup of the vma. The next
      patch will fix this.
      
      Split out from Chris' vma-bind rework.
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Ben Widawsky <benjamin.widawsky@intel.com>
      Reviewed-by: Jani Nikula <jani.nikula@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      262de145
    • drm/i915: Simplify i915_gem_object_ggtt_unpin · b287110e
      Daniel Vetter authored
      Split out from Chris' vma-bind rework.
      
      Jani wondered why this is safe, and the reason is that i915_vma_unbind
      does all these checks, too. So they're redundant.
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Ben Widawsky <benjamin.widawsky@intel.com>
      Cc: Jani Nikula <jani.nikula@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      b287110e
    • drm/i915: split PIN_GLOBAL out from PIN_MAPPABLE · bf3d149b
      Daniel Vetter authored
      With arbitrary pin flags it makes sense to split out a "please bind
      this into global gtt" from the "please allocate in the mappable
      range".
      
      Use this unconditionally in our global gtt pin helper since this is
      what its callers want. Later patches will drop PIN_MAPPABLE where it's
      not strictly needed.
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      bf3d149b
    • drm/i915: Consolidate binding parameters into flags · 1ec9e26d
      Daniel Vetter authored
      Anything more than just one bool parameter is just a pain to read;
      symbolic constants are much better.
      
      Split out from Chris' vma-binding rework patch.
      
      v2: Undo the behaviour change in object_pin that Chris spotted.
      
      v3: Split out misplaced hunk to handle set_cache_level errors,
      spotted by Jani.
      
      v4: Keep the current over-zealous binding logic in the execbuffer code
      working with a quick hack while the overall binding code gets shuffled
      around.
      
      v5: Reorder the PIN_ flags for more natural patch splitup.
      
      v6: Pull out the PIN_GLOBAL split-up again.
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Ben Widawsky <benjamin.widawsky@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      1ec9e26d
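
      The readability argument is easy to see side by side. A compressed sketch
      (the PIN_MAPPABLE/PIN_NONBLOCK names come from the patches above; the
      function itself is simplified): a call like object_pin(obj, 4096, true, false)
      says nothing, whereas flags document themselves at every call site.

        #include <stdio.h>

        #define PIN_MAPPABLE 0x1
        #define PIN_NONBLOCK 0x2

        static int object_pin(int obj, unsigned int alignment, unsigned int flags)
        {
            printf("pin obj %d align %u%s%s\n", obj, alignment,
                   (flags & PIN_MAPPABLE) ? " [mappable]" : "",
                   (flags & PIN_NONBLOCK) ? " [nonblock]" : "");
            return 0;
        }

        int main(void)
        {
            return object_pin(1, 4096, PIN_MAPPABLE | PIN_NONBLOCK);
        }
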
  17. 13 Feb, 2014 3 commits
    • drm/i915: Flush GPU rendering with a lockless wait during a pagefault · 6e4930f6
      Chris Wilson authored
      Arjan van de Ven reported that on his test machine he was seeing stalls
      of greater than 1 frame, greatly impacting the user experience. He
      tracked this down to the locked flush during a pagefault being the
      culprit hogging the struct_mutex and so blocking any other user from
      proceeding. Stalling on a pagefault is bad behaviour on userspace's
      part; for one, it means that they are ignoring the coherency rules on
      pointer access through the GTT. But fortunately we can apply the same
      trick as the set-to-domain ioctl to do a lightweight, nonblocking flush
      of outstanding rendering first.
      
      "Prior to the patch it looks like this
      (this one testrun does not show the 20ms+ I've seen occasionally)
      
         4.99 ms     2.36 ms    31360  __wait_seqno i915_wait_seqno i915_gem_object_wait_rendering i915_gem_object_set_to_gtt_domain i915_gem_fault __do_fault handle_pte_fault handle_mm_fault __do_page_fault do_page_fault page_fault
         4.99 ms     2.75 ms   107751  __wait_seqno i915_gem_wait_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret
         4.99 ms     1.63 ms     1666  i915_mutex_lock_interruptible i915_gem_fault __do_fault handle_pte_fault handle_mm_fault __do_page_fault do_page_fault page_fault
         4.93 ms     2.45 ms      980  i915_mutex_lock_interruptible intel_crtc_page_flip drm_mode_page_flip_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret
         4.89 ms     2.20 ms     3283  i915_mutex_lock_interruptible i915_gem_wait_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret
         4.34 ms     1.66 ms     1715  i915_mutex_lock_interruptible i915_gem_pwrite_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret
         3.73 ms     3.73 ms       49  i915_mutex_lock_interruptible i915_gem_set_domain_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret
         3.17 ms     0.33 ms      931  i915_mutex_lock_interruptible i915_gem_madvise_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret
         2.97 ms     0.43 ms     1029  i915_mutex_lock_interruptible i915_gem_busy_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret
         2.55 ms     0.51 ms      735  i915_gem_get_tiling drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret
      
      After the patch it looks like this:
      
         4.99 ms     2.14 ms    22212  __wait_seqno i915_gem_wait_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret
         4.86 ms     0.99 ms    14170  __wait_seqno i915_gem_object_wait_rendering__nonblocking i915_gem_fault __do_fault handle_pte_fault handle_mm_fault __do_page_fault do_page_fault page_fault
         3.59 ms     1.31 ms      325  i915_gem_get_tiling drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret
         3.37 ms     3.37 ms       65  i915_mutex_lock_interruptible i915_gem_wait_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret
         2.58 ms     2.58 ms       65  i915_mutex_lock_interruptible i915_gem_do_execbuffer.isra.23 i915_gem_execbuffer2 drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret
         2.19 ms     2.19 ms       65  i915_mutex_lock_interruptible intel_crtc_page_flip drm_mode_page_flip_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret
         2.18 ms     2.18 ms       65  i915_mutex_lock_interruptible i915_gem_busy_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret
         1.66 ms     1.66 ms       65  i915_gem_set_tiling drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret
      
      It may not look like it, but this is quite a large difference, and I've
      been unable to reproduce > 5 msec delays at all, while before they do
      happen (just not in the trace above)."
      
      gem_gtt_hog on an old Pineview (GMA3150),
      before: 4969.119ms
      after:  4122.749ms
      Reported-by: Arjan van de Ven <arjan.van.de.ven@intel.com>
      Testcase: igt/gem_gtt_hog
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
      Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      6e4930f6
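
      The shape of the trick, independent of the i915 details (hypothetical
      helpers): finish the potentially long GPU wait without holding the contended
      mutex, and only take the lock for the short bookkeeping that actually needs it.

        #include <stdio.h>
        #include <pthread.h>

        static pthread_mutex_t struct_mutex = PTHREAD_MUTEX_INITIALIZER;

        static void wait_for_rendering(void)   { /* may block for milliseconds */ }
        static void update_fault_mapping(void) { /* short, needs the lock      */ }

        static void handle_fault(void)
        {
            /* 1. Lockless wait: other ioctls keep making progress while the
             *    GPU catches up with outstanding rendering. */
            wait_for_rendering();

            /* 2. Now take the lock only for the cheap part. */
            pthread_mutex_lock(&struct_mutex);
            update_fault_mapping();
            pthread_mutex_unlock(&struct_mutex);
        }

        int main(void)
        {
            handle_fault();
            puts("fault handled");
            return 0;
        }
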
    • drm/i915: Downgrade *ERROR* message for invalid user input · bd9b6a4e
      Chris Wilson authored
      When we detect that the user passed along an invalid handle or object,
      we emit a warning as an aid for debugging. Since these are indeed only
      for debugging user-triggerable errors (and the errors are reported back
      to userspace by the errno), the messages should only be at the debug
      level and not claim that there is a catastrophic error in the
      driver/hardware.
      
      References: https://bugs.freedesktop.org/show_bug.cgi?id=74704
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      bd9b6a4e
    • drm/i915: Always use INTEL_INFO() to access the device_info structure · 3d13ef2e
      Damien Lespiau authored
      If we make sure that all the dev_priv->info usages are wrapped by
      INTEL_INFO(), we can easily modify the ->info field to be a structure and
      not a pointer while keeping the const protection in the INTEL_INFO()
      macro.
      
      v2: Rebased onto latest drm-nightly
      Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      3d13ef2e
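
      The macro indirection is what makes the later type change painless. A
      compressed model (INTEL_INFO is the real macro name; the surrounding types
      here are simplified): every reader goes through the macro, so flipping the
      field from a pointer to an embedded struct only requires updating the macro
      body, which can keep handing out a const pointer.

        #include <stdio.h>

        struct intel_device_info { int gen; };

        struct drm_i915_private {
            struct intel_device_info info;  /* was: const struct intel_device_info * */
        };

        /* Callers never touch ->info directly, so this is the only line that
         * needs to know whether the field is embedded or a pointer. */
        #define INTEL_INFO(dev_priv) \
            ((const struct intel_device_info *)&(dev_priv)->info)

        int main(void)
        {
            struct drm_i915_private dev_priv = { .info = { .gen = 7 } };
            printf("gen %d\n", INTEL_INFO(&dev_priv)->gen);
            return 0;
        }
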
  18. 05 Feb, 2014 2 commits
  19. 04 Feb, 2014 3 commits