1. 03 Sep, 2014 1 commit
  2. 20 Aug, 2014 1 commit
    • drm/i915/bdw: Don't write PDP in the legacy way when using LRCs · b7c71823
      Oscar Mateo committed
      This is mostly for correctness so that we know we are running the LR
      context correctly (that is, the PDPs are contained inside the context
      object).
      
      v2: Move the check to inside the enable PPGTT function. The switch
      happens in two places: the legacy context switch (that we won't hit
      when Execlists are enabled) and the PPGTT enable, which unfortunately
      we need. This would look much nicer if the ppgtt->enable was part of
      the ring init, where it logically belongs.
      
      v3: Move the check to the start of the enable PPGTT function.  None
      of the legacy PPGTT enabling is required when using LRCs as the
      PPGTT is enabled in the context descriptor and the PDPs are written
      in the LRC.
      
      v4: Clarify comment based on review feedback.
      Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
      Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
      Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
      [danvet: Resolve conflicts with ppgtt_enable rework.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
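
The v3 note amounts to an early return at the top of the PPGTT enable path. A minimal sketch of that shape follows; the names `sketch_device`, `sketch_ppgtt_enable` and `legacy_pdp_writes` are hypothetical stand-ins for illustration, not the i915 structures or functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the device state; not the real i915 struct. */
struct sketch_device {
    bool execlists_enabled;  /* true when logical ring contexts (LRCs) are used */
    int legacy_pdp_writes;   /* counts how often the legacy path ran */
};

/* Returns 0 on success, in the usual kernel style. */
int sketch_ppgtt_enable(struct sketch_device *dev)
{
    assert(dev != NULL);

    /*
     * With LRCs the PPGTT is enabled through the context descriptor and
     * the PDPs are written in the context object, so none of the legacy
     * enabling is needed. Bail out before touching anything.
     */
    if (dev->execlists_enabled)
        return 0;

    /* Placeholder for the legacy PDP register writes. */
    dev->legacy_pdp_writes++;
    return 0;
}
```

Placing the check first keeps the legacy body free of per-step conditionals, which matches the move described between v2 and v3.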
  3. 13 Aug, 2014 8 commits
  4. 12 Aug, 2014 2 commits
    • drm/i915: Some cleanups for the ppgtt lifetime handling · ee960be7
      Daniel Vetter committed
      So when reviewing Michel's patch I've noticed a few things and cleaned
      them up:
      - The early checks in ppgtt_release are now redundant: The inactive
        list should always be empty now, so we can ditch these checks. Even
        for the aliasing ppgtt (though that's a different confusion) since
        we tear that down after all the objects are gone.
      - The ppgtt handling functions are splattered all over. Consolidate
        them in i915_gem_gtt.c, give them OCD prefixes and add wrappers for
        get/put.
      - There was a bit of confusion in ppgtt_release about whether it cares
        about the active or inactive list. It should care about them both,
        so augment the WARNINGs to check for both.
      
      There's still create_vm_for_ctx left to do, but that is blocked on the
      removal of ppgtt->ctx. Once that's done we can rename it to
      i915_ppgtt_create and move it to its siblings for handling ppgtts.
      
      v2: Move the ppgtt checks into the inline get/put functions as
      suggested by Chris.
      
      v3: Inline the now redundant ppgtt local variable.
      
      Cc: Michel Thierry <michel.thierry@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Michel Thierry <michel.thierry@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: vma/ppgtt lifetime rules · b9d06dd9
      Michel Thierry committed
      VMAs should take a reference of the address space they use.
      
      Now, when the fd is closed, it will release the ref that the context was
      holding, but it will still be referenced by any vmas that are still
      active.
      
      ppgtt_release() should then only be called when the last thing referencing
      it releases the ref, and it can just call the base cleanup and free the
      ppgtt.
      
      Note that with this we will extend the lifetime of ppgtts which
      contain shared objects. But all the non-shared objects will get
      removed as soon as they drop off the active list and for the shared
      ones the shrinker can eventually reap them. Since we currently can't
      evict ppgtt pagetables either I don't think that temporary leak is
      important.
      Signed-off-by: Michel Thierry <michel.thierry@intel.com>
      [danvet: Add note about potential ppgtt leak with this approach.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
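
The lifetime rule described above is plain reference counting: the context holds one reference, each VMA takes another, and the release runs only when the last one is dropped. A minimal sketch under that reading follows. All `sketch_*` names are hypothetical, and the kernel would typically use `struct kref` rather than a bare integer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical address space with a bare reference count. */
struct sketch_ppgtt {
    int refcount;
    bool released;  /* set by the release callback, for illustration */
};

/* Hypothetical VMA that pins the address space it lives in. */
struct sketch_vma {
    struct sketch_ppgtt *vm;
};

static void sketch_ppgtt_release(struct sketch_ppgtt *ppgtt)
{
    /* Real code would run the base cleanup and free the ppgtt here. */
    ppgtt->released = true;
}

void sketch_ppgtt_get(struct sketch_ppgtt *ppgtt)
{
    assert(ppgtt != NULL && ppgtt->refcount > 0);
    ppgtt->refcount++;
}

void sketch_ppgtt_put(struct sketch_ppgtt *ppgtt)
{
    assert(ppgtt != NULL && ppgtt->refcount > 0);
    if (--ppgtt->refcount == 0)
        sketch_ppgtt_release(ppgtt);
}

/* A VMA takes a reference on the address space it uses... */
void sketch_vma_bind(struct sketch_vma *vma, struct sketch_ppgtt *vm)
{
    sketch_ppgtt_get(vm);
    vma->vm = vm;
}

/* ...and drops it when it goes away. */
void sketch_vma_unbind(struct sketch_vma *vma)
{
    struct sketch_ppgtt *vm = vma->vm;

    vma->vm = NULL;
    sketch_ppgtt_put(vm);
}
```

With this shape, closing the fd maps to one `sketch_ppgtt_put()` for the context's reference, and the address space survives until the last still-active VMA is unbound.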
  5. 08 Aug, 2014 1 commit
  6. 07 Aug, 2014 1 commit
  7. 11 Jul, 2014 1 commit
  8. 17 Jun, 2014 1 commit
    • drm/i915: Added write-enable pte bit support · 24f3a8cf
      Akash Goel committed
      This adds support for a write-enable bit in the entry of GTT.
      This is handled via a read-only flag in the GEM buffer object which
      is then used to see how to set the bit when writing the GTT entries.
      Currently by default the Batch buffer & Ring buffers are marked as read only.
      
      v2: Moved the pte override code for read-only bit to 'byt_pte_encode'. (Chris)
          Fixed the issue of leaving 'gt_old_ro' as unused. (Chris)
      
      v3: Removed the 'gt_old_ro' field, now setting RO bit only for Ring Buffers (Daniel).
      
      v4: Added a new 'flags' parameter to all the pte(gen6) encode & insert_entries functions,
          in lieu of overloading the cache_level enum (Daniel).
      
      v5: Removed the superfluous VLV check & changed the definition location of PTE_READ_ONLY flag (Imre)
      Reviewed-by: Imre Deak <imre.deak@intel.com>
      Signed-off-by: Akash Goel <akash.goel@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
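
As a rough illustration of the v4 design, where a separate `flags` parameter is passed to the PTE encoder instead of overloading the cache level, the sketch below sets a writeable bit unless a read-only flag is given. The bit positions and all `SKETCH_*`/`sketch_*` names are assumptions made for the example, not values taken from the hardware documentation or the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed PTE layout, for illustration only. */
#define SKETCH_PTE_VALID      (1u << 0)
#define SKETCH_PTE_WRITEABLE  (1u << 1)

/* Hypothetical encode flag: map the page read-only. */
#define SKETCH_FLAG_READ_ONLY (1u << 0)

/* Build a PTE for a 4 KiB-aligned page address. */
uint32_t sketch_pte_encode(uint32_t page_addr, uint32_t flags)
{
    uint32_t pte;

    /* The low 12 bits are reserved for control bits in this sketch. */
    assert((page_addr & 0xfffu) == 0);

    pte = page_addr | SKETCH_PTE_VALID;

    /* Writes are allowed unless the caller asked for read-only. */
    if (!(flags & SKETCH_FLAG_READ_ONLY))
        pte |= SKETCH_PTE_WRITEABLE;

    return pte;
}
```

Keeping the read-only request in its own `flags` argument leaves the cache-level enum free of unrelated meanings, which is the reason given for the v4 change.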
  9. 14 Jun, 2014 1 commit
  10. 07 Jun, 2014 1 commit
  11. 05 Jun, 2014 1 commit
  12. 27 May, 2014 1 commit
    • drm/i915: Prevent negative relocation deltas from wrapping · d23db88c
      Chris Wilson committed
      This is pure evil. Userspace, I'm looking at you SNA, repacks batch
      buffers on the fly after generation as they are being passed to the
      kernel for execution. These batches also contain self-referenced
      relocations as a single buffer encompasses the state commands, kernels,
      vertices and sampler. During generation the buffers are placed at known
      offsets within the full batch, and then the relocation deltas (as passed
      to the kernel) are tweaked as the batch is repacked into a smaller buffer.
      This means that userspace is passing negative relocation deltas, which
      subsequently wrap to large values if the batch is at a low address. The
      GPU hangs when it then tries to use the large value as a base for its
      address offsets, rather than wrapping back to the real value (as one
      would hope). As the GPU uses positive offsets from the base, we can
      treat the relocation address as the minimum address read by the GPU.
      For the upper bound, we trust that userspace will not read beyond the
      end of the buffer.
      
      So, how do we prevent negative relocations from wrapping? We can either
      check that every relocation looks valid when we write it, and then
      position each object such that we prevent the offset wraparound, or we
      just special-case the self-referential behaviour of SNA and force all
      batches to be above 256k. Daniel prefers the latter approach.
      
      This fixes a GPU hang when it tries to use an address (relocation +
      offset) greater than the GTT size. The issue would occur quite easily
      with full-ppgtt as each fd gets its own VM space, so low offsets would
      often be handed out. However, with the rearrangement of the low GTT due
      to capturing the BIOS framebuffer, it is already affecting kernels 3.15
      onwards. I think only IVB+ is susceptible to this bug, but the workaround
      should only kick in rarely, so it seems sensible to always apply it.
      
      v3: Use a bias for batch buffers to prevent small negative delta relocations
      from wrapping.
      
      v4 from Daniel:
      - s/BIAS/BATCH_OFFSET_BIAS/
      - Extract eb_vma_misplaced/i915_vma_misplaced since the conditions
        were growing rather cumbersome.
      - Add a comment to eb_get_batch explaining why we do this.
      - Apply the batch offset bias everywhere but mention that we've only
        observed it on gen7 gpus.
      - Drop PIN_OFFSET_FIX for now, that slipped in from a feature patch.
      
      v5: Add static to eb_get_batch, spotted by 0-day tester.
      
      Testcase: igt/gem_bad_reloc
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78533
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v3)
      Cc: stable@vger.kernel.org
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
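
The wraparound is ordinary unsigned arithmetic: a batch placed at a low offset plus a negative delta lands near the top of a 32-bit address space. The sketch below shows the effect and how a placement bias avoids it for small deltas. The 256k figure comes from the message above; the function names and the 32-bit address width are assumptions for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bias from the commit message: keep batches at or above 256k. */
#define SKETCH_BATCH_OFFSET_BIAS (256u * 1024u)

/* Address the GPU would be handed: base plus delta, modulo 2^32. */
uint32_t sketch_reloc_target(uint32_t batch_offset, int32_t delta)
{
    return batch_offset + (uint32_t)delta;
}

/* True when a negative delta reaches below offset 0 and so wraps. */
bool sketch_reloc_wraps(uint32_t batch_offset, int32_t delta)
{
    return delta < 0 && (uint64_t)(-(int64_t)delta) > batch_offset;
}

/* Move a requested placement up to the bias if it falls below it. */
uint32_t sketch_apply_bias(uint32_t requested_offset)
{
    uint32_t placed = requested_offset;

    if (placed < SKETCH_BATCH_OFFSET_BIAS)
        placed = SKETCH_BATCH_OFFSET_BIAS;

    assert(placed >= SKETCH_BATCH_OFFSET_BIAS);
    return placed;
}
```

For a batch at offset 0x1000 with a delta of -0x2000, `sketch_reloc_target()` yields 0xFFFFF000. After `sketch_apply_bias()` moves the batch to 0x40000, the same delta yields 0x3E000. The bias only covers deltas whose magnitude does not exceed 256k, which is why the message speaks of small negative deltas.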
  13. 23 May, 2014 1 commit
  14. 13 May, 2014 1 commit
  15. 07 May, 2014 3 commits
  16. 05 May, 2014 1 commit
  17. 29 Apr, 2014 1 commit
  18. 24 Apr, 2014 1 commit
  19. 04 Apr, 2014 1 commit
    • drm: Add support for two-ended allocation, v3 · 62347f9e
      Lauri Kasanen committed
      Clients like i915 need to segregate cache domains within the GTT which
      can lead to small amounts of fragmentation. By allocating the uncached
      buffers from the bottom and the cacheable buffers from the top, we can
      reduce the amount of wasted space and also optimize allocation of the
      mappable portion of the GTT to only those buffers that require CPU
      access through the GTT.
      
      For other drivers, allocating small bos from one end and large ones
      from the other helps improve the quality of fragmentation.
      
      Based on drm_mm work by Chris Wilson.
      
      v3: Changed to use a TTM placement flag
      v2: Updated kerneldoc
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Ben Widawsky <ben@bwidawsk.net>
      Cc: Christian König <deathsimple@vodafone.de>
      Signed-off-by: Lauri Kasanen <cand@gmx.com>
      Signed-off-by: David Airlie <airlied@redhat.com>
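
To make the two-ended idea concrete, here is a deliberately simplified sketch: a single free range that hands out space from the bottom for one class of buffer and from the top for another. It is a bump allocator with hypothetical names, not the drm_mm implementation, and it does not support freeing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One contiguous free range [low, high). */
struct sketch_range {
    uint64_t low;   /* first free byte when allocating from the bottom */
    uint64_t high;  /* one past the last free byte when allocating from the top */
};

/*
 * Carve `size` bytes from the bottom or the top of the range.
 * Returns true and stores the start offset in *start on success.
 */
bool sketch_alloc(struct sketch_range *r, uint64_t size, bool from_top,
                  uint64_t *start)
{
    assert(r != NULL && start != NULL && r->low <= r->high);

    if (size == 0 || size > r->high - r->low)
        return false;

    if (from_top) {
        r->high -= size;
        *start = r->high;
    } else {
        *start = r->low;
        r->low += size;
    }
    return true;
}
```

Because the two classes grow toward each other, the free space stays in one block in the middle instead of being split between interleaved buffers of different kinds.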
  20. 03 Apr, 2014 1 commit
  21. 02 Apr, 2014 2 commits
    • drm/i915: Allow full PPGTT with param override · 8d214b7d
      Ben Widawsky committed
      When PPGTT was disabled by default, the patch also prevented the user
      from overriding this behavior via module parameter. Being able to test
      this on arbitrary kernels is extremely beneficial to track down the
      remaining bugs. The patch that prevented this was:
      
      commit 93a25a9e
      Author: Daniel Vetter <daniel.vetter@ffwll.ch>
      Date:   Thu Mar 6 09:40:43 2014 +0100
      
          drm/i915: Disable full ppgtt by default
      
      By default PPGTT is set to -1. 0 means off, 1 means aliasing only, 2
      means full, all other values are reserved.
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
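
The parameter values listed above can be captured in a small decoder. This is only a sketch of the mapping stated in the message (-1 default, 0 off, 1 aliasing only, 2 full, everything else reserved); the enum and function names are hypothetical and how the driver resolves the default is not shown.

```c
/* Hypothetical names for the documented parameter values. */
enum sketch_ppgtt_mode {
    SKETCH_PPGTT_DEFAULT,   /* -1: let the driver decide */
    SKETCH_PPGTT_OFF,       /*  0: PPGTT disabled */
    SKETCH_PPGTT_ALIASING,  /*  1: aliasing PPGTT only */
    SKETCH_PPGTT_FULL,      /*  2: full PPGTT */
    SKETCH_PPGTT_RESERVED   /* any other value */
};

/* Map the raw module parameter value to a mode. */
enum sketch_ppgtt_mode sketch_ppgtt_mode_from_param(int value)
{
    switch (value) {
    case -1:
        return SKETCH_PPGTT_DEFAULT;
    case 0:
        return SKETCH_PPGTT_OFF;
    case 1:
        return SKETCH_PPGTT_ALIASING;
    case 2:
        return SKETCH_PPGTT_FULL;
    default:
        return SKETCH_PPGTT_RESERVED;
    }
}
```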
    • drm/i915: Split out GTT specific header file · 0260c420
      Ben Widawsky committed
      This file contains all necessary defines, prototypes and typedefs for
      manipulating GEN graphics address translation (this does not include the
      legacy AGP driver).
      
      Reiterating the comment in the header,
      "Please try to maintain the following order within this file unless it
      makes sense to do otherwise. From top to bottom:
      1. typedefs
      2. #defines, and macros
      3. structure definitions
      4. function prototypes
      
      Within each section, please try to order by generation in ascending
      order, from top to bottom (ie. GEN6 on the top, GEN8 on the bottom)."
      
      I've made some minor cleanups, and fixed a couple of typos while here -
      but there should be no functional changes.
      
      The purpose of the patch is to reduce clutter in our main header file,
      making room for new growth, and make documentation of our interfaces
      easier by splitting things out.
      
      With a little more work, like making i915_gtt a pointer, we could
      potentially completely isolate this header from i915_drv.h. At the
      moment however, I don't think it's worth the effort.
      
      Personally, I would have liked to put the PTE encoding functions in this
      file too, but I didn't want to rock the boat too much.
      
      A similar patch has been in use on my machine for some time. This exact
      patch though has only been compile tested.
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
  22. 31 Mar, 2014 1 commit
  23. 29 Mar, 2014 1 commit
  24. 28 Mar, 2014 1 commit
  25. 19 Mar, 2014 1 commit
  26. 13 Mar, 2014 1 commit
  27. 12 Mar, 2014 2 commits
  28. 08 Mar, 2014 1 commit
    • drm/i915: Disable full ppgtt by default · 93a25a9e
      Daniel Vetter committed
      There are too many outstanding issues:
      
      - Fence handling in the current code is broken. There's a patch series
        from me, but it's blocked on an extended review (which includes
        writing the testcases).
      
      - IOMMU mapping handling is broken, we need to properly refcount it -
        currently it gets destroyed when the first vma is unbound, so way
        too early.
      
      - There's a pending reset issue on snb. Since Mika's reset work and
        full ppgtt were pulled in via separate branches and ended up
        intermittently breaking one another, it's unclear which is the exact
        culprit here.
      
      - We still have persistent evidence of crazy recursion bugs through
        vma_unbind and ppgtt_release, e.g.
      
        https://bugs.freedesktop.org/show_bug.cgi?id=73383
      
        This issue (and a few others since resolved) has blocked our
        performance measuring/tuning group for 3 months.
      
      - Secure batch dispatching is broken. This has been blocking Brad Volkin's
        command checker work for 3 months.
      
      All these issues are confirmed to only happen when full ppgtt is
      enabled, falling back to aliasing ppgtt resolves them. But even
      aliasing ppgtt itself still has a regression:
      
      - We currently unconditionally bind objects into the aliasing ppgtt,
        which means all privileged objects like ringbuffers are visible to
        unprivileged access again. On top of that this also breaks the
        command checker for aliasing ppgtt, since it can't hide the
        validated batch any more.
      
      Furthermore topic/full-ppgtt has never been reviewed:
      
      - Lifetime rules around vma unbinding/release are unclear, resulting
        in this awesome hack called ppgtt_release, which seems to take the
        blame for most of the recursion fallout.
      
      - Context/ring init works differently on gpu reset than anywhere else.
        Such differences have in the past always led to really hard to
        track down bugs.
      
      - Aliasing ppgtt is treated in a bunch of places as a real address
        space, but it isn't - the real address space is always the global
        gtt in that case. This results in a bit of a mess between contexts and
        ppgtt objects, further complicating the context/ppgtt/vma lifetime
        rules.
      
      - We don't have any docs describing the overall concepts introduced
        with full ppgtt. A short, concise overview describing vmas and some
        of the strange bits around them (like the unbound vmas used by
        execbuf, or the new binding rules) really is needed.
      
      Note that a lot of the post topic/full-ppgtt merge fallout has already
      been addressed, this entire list here of 10 issues really only contains
      the still outstanding issues.
      
      Finally the 3.15 merge window is approaching and I think we need to
      use the remaining time to ensure that our fallback option of using
      aliasing ppgtt is in solid shape. Hence I think it's time to throw the
      switch. While at it demote the helper from static inline status
      because really.
      
      Cc: Ben Widawsky <ben@bwidawsk.net>
      Cc: Dave Airlie <airlied@gmail.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>