1. 05 Aug 2016, 4 commits
    • drm/i915: Enable i915_gem_wait_for_idle() without holding struct_mutex · dcff85c8
      Authored by Chris Wilson
      The principal motivation for this was to try and eliminate the
      struct_mutex from i915_gem_suspend - but for now we still need to hold
      the mutex for i915_gem_context_lost(). (The issue there is that there
      may be an indirect lockdep cycle between cpu_hotplug (i.e. suspend) and
      struct_mutex via the stop_machine().) For the moment, enabling last
      request tracking for the engine, allows us to do busyness checking and
      waiting without requiring the struct_mutex - which is useful in its own
      right.
      
      As a side-effect of having a robust means for tracking engine busyness,
      we can replace our other busyness heuristic, that of comparing against
      the last submitted seqno. For paranoid reasons, we have a semi-ordered
      check of that seqno inside the hangchecker, which we can now improve to
      an ordered check of the engine's busyness (removing a locked xchg in the
      process).
      
      v2: Pass along "bool interruptible" since, being unlocked, we cannot rely on
      i915->mm.interruptible being stable or even under our control.
      v3: Replace the Ironlake i915_gpu_busy() check with the common precalculated value
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-6-git-send-email-chris@chris-wilson.co.uk
    • drm/i915: Enable lockless lookup of request tracking via RCU · 0eafec6d
      Authored by Chris Wilson
      If we enable RCU for the requests (providing a grace period where we can
      inspect a "dead" request before it is freed), we can allow callers to
      carefully perform lockless lookup of an active request.
      
      However, by enabling deferred freeing of requests, we can potentially
      hog a lot of memory when dealing with tens of thousands of requests per
      second - with a quick insertion of a synchronize_rcu() inside our
      shrinker callback, that issue disappears.
      
      v2: Currently, it is our responsibility to handle reclaim i.e. to avoid
      hogging memory with the delayed slab frees. At the moment, we wait for a
      grace period in the shrinker, and block for all RCU callbacks on oom.
      Suggested alternatives focus on flushing our RCU callback when we have a
      certain number of outstanding request frees, and blocking on that flush
      after a second high watermark. (So rather than wait for the system to
      run out of memory, we stop issuing requests - both are nondeterministic.)
      
      Paul E. McKenney wrote:
      
      Another approach is synchronize_rcu() after some largish number of
      requests.  The advantage of this approach is that it throttles the
      production of callbacks at the source.  The corresponding disadvantage
      is that it slows things up.
      
      Another approach is to use call_rcu(), but if the previous call_rcu()
      is still in flight, block waiting for it.  Yet another approach is
      the get_state_synchronize_rcu() / cond_synchronize_rcu() pair.  The
      idea is to do something like this:
      
              cond_synchronize_rcu(cookie);
              cookie = get_state_synchronize_rcu();
      
      You would of course do an initial get_state_synchronize_rcu() to
      get things going.  This would not block unless there was less than
      one grace period's worth of time between invocations.  But this
      assumes a busy system, where there is almost always a grace period
      in flight.  But you can make that happen as follows:
      
              cond_synchronize_rcu(cookie);
              cookie = get_state_synchronize_rcu();
              call_rcu(&my_rcu_head, noop_function);
      
      Note that you need additional code to make sure that the old callback
      has completed before doing a new one.  Setting and clearing a flag
      with appropriate memory ordering control suffices (e.g., smp_load_acquire()
      and smp_store_release()).
      
      v3: More comments on compiler and processor order of operations within
      the RCU lookup and discover we can use rcu_access_pointer() here instead.
      
      v4: Wrap i915_gem_active_get_rcu() to take the rcu_read_lock itself.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Cc: "Goel, Akash" <akash.goel@intel.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-25-git-send-email-chris@chris-wilson.co.uk
    • drm/i915: Move obj->active:5 to obj->flags · 573adb39
      Authored by Chris Wilson
      We are motivated to avoid using a bitfield for obj->active for a couple
      of reasons. Firstly, we wish to document our lockless read of obj->active
      using READ_ONCE inside i915_gem_busy_ioctl() and that requires an
      integral type (i.e. not a bitfield). Secondly, gcc produces abysmal code
      when presented with a bitfield and that shows up high on the profiles of
      request tracking (mainly due to excess memory traffic as it converts
      the bitfield to a register and back and generates frequent AGI in the
      process).
      
      v2: Use BIT(), break up a long line when computing the other engines, and
      rename i915_gem_object_is_active (now i915_gem_object_get_active).
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-23-git-send-email-chris@chris-wilson.co.uk
    • drm/i915: Combine all i915_vma bitfields into a single set of flags · 3272db53
      Authored by Chris Wilson
      In preparation to perform some magic to speed up i915_vma_pin(), which
      is among the hottest of hot paths in execbuf, refactor all the bitfields
      accessed by i915_vma_pin() into a single unified set of flags.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-16-git-send-email-chris@chris-wilson.co.uk
  2. 04 Aug 2016, 2 commits
  3. 26 Jul 2016, 1 commit
  4. 20 Jul 2016, 2 commits
  5. 14 Jul 2016, 1 commit
  6. 12 Jul 2016, 1 commit
  7. 05 Jul 2016, 1 commit
  8. 02 Jul 2016, 1 commit
  9. 24 Jun 2016, 1 commit
  10. 09 May 2016, 1 commit
  11. 02 May 2016, 2 commits
  12. 28 Apr 2016, 1 commit
    • drm/i915: Move ioremap_wc tracking onto VMA · 8ef8561f
      Authored by Chris Wilson
      By tracking the iomapping on the VMA itself, we can share that area
      between multiple users. Also by only revoking the iomapping upon
      unbinding from the mappable portion of the GGTT, we can keep that iomap
      across multiple invocations (e.g. execlists context pinning).
      
      Note that by moving the iounmap tracking to the VMA, we actually end up
      fixing a leak of the iomapping in intel_fbdev.
      
      v1.5: Rebase prompted by Tvrtko
      v2: Drop dev_priv parameter, we can recover the i915_ggtt from the vma.
      v3: Move handling of ioremap space exhaustion to vmap_purge and also
      allow vmallocs to recover old iomaps. Add Tvrtko's kerneldoc.
      v4: Fix a use-after-free in shrinker and rearrange i915_vma_iomap
      v5: Back to i915_vm_to_ggtt
      v6: Use i915_vma_pin_iomap and i915_vma_unpin_iomap to mark critical
      sections and ensure the VMA cannot be reaped whilst mapped.
      v7: Move i915_vma_iounmap so that consumers of the API are not tempted,
      and add iomem annotations
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-5-git-send-email-chris@chris-wilson.co.uk
  13. 20 Apr 2016, 3 commits
  14. 12 Apr 2016, 1 commit
  15. 05 Apr 2016, 3 commits
  16. 26 Feb 2016, 1 commit
  17. 27 Jan 2016, 1 commit
  18. 05 Jan 2016, 1 commit
  19. 13 Oct 2015, 1 commit
  20. 07 Oct 2015, 5 commits
    • drm/i915: Avoid GPU stalls from kswapd · 5763ff04
      Authored by Chris Wilson
      Exclude active GPU pages from the purview of the background shrinker
      (kswapd), as these cause uncontrollable GPU stalls. Given that the
      shrinker is rerun until the freelists are satisfied, we should have
      opportunity in subsequent passes to recover the pages once idle. If the
      machine does run out of memory entirely, we have the forced idling in the
      oom-notifier as a means of releasing all the pages we can before an oom
      is prematurely executed.
      
      Note that this relies upon an up-front retire_requests to keep the
      inactive list in shape, which was added in a previous patch, mostly as
      execlist ctx pinning band-aids.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
      [danvet: Add note about retire_requests.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: During shrink_all we only need to idle the GPU · c9c0f5ea
      Authored by Chris Wilson
      We can forgo an evict-everything here as the shrinker operation itself
      will unbind any vma as required. If we explicitly idle the GPU through a
      switch to the default context, we not only create a request in an
      illegal context (e.g. whilst shrinking during execbuf with a request
      already allocated), but switching to the default context will not free
      up the memory backing the active contexts - unless in the unlikely
      situation that the context had already been closed (and was just kept alive
      by being the current context). The saving is near zero and the danger real.
      
      To compensate for the loss of the forced retire, add a couple of
      retire-requests to i915_gem_shrink() - this should help free up any
      transitive cache from the requests.
      
      Note that the second retire_requests is for the benefit of the
      hand-rolled execlist ctx active tracking: We need to manually kick
      requests to get those unpinned again. Once that's fixed we can try to
      remove this again.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      [danvet: Add summary of why we need a pile of retire_requests.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: Add a tracepoint for the shrinker · 3abafa53
      Authored by Chris Wilson
      Often it is very useful to know why we suddenly purge vast tracts of
      memory, and yet, surprisingly, up until now we didn't even have a
      tracepoint for when we shrink our memory.
      
      Note that there are slab_start/end tracepoints already, but those
      don't cover the internal recursion when we directly call into our
      shrinker code. Hence a separate tracepoint seems justified. Also note
      that we don't really need a separate tracepoint for the actual amount
      of pages freed since we already have an unbind tracepoint for that.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      [danvet: Add a note that there's also slab_start/end and why they're
      insufficient.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: shrinker_control->nr_to_scan is now unsigned long · 14387540
      Authored by Chris Wilson
      As the shrinker_control now passes us unsigned long targets, update our
      shrinker functions to match.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: Fix kerneldoc for i915_gem_shrink_all · 1f2449cd
      Authored by Daniel Vetter
      I've botched this, so let's fix it.
      
      Botched in
      
      commit eb0b44ad
      Author: Daniel Vetter <daniel.vetter@ffwll.ch>
      Date:   Wed Mar 18 14:47:59 2015 +0100
      
          drm/i915: kerneldoc for i915_gem_shrinker.c
      
      v2: Be a good citizen^Wmaintainer and add the proper commit citation.
      Noticed by Jani.
      Reviewed-by: Jani Nikula <jani.nikula@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
  21. 10 Apr 2015, 1 commit
  22. 20 Mar 2015, 2 commits