1. 12 1月, 2017 1 次提交
    • C
      drm/i915: Detect vma reserved for execbuf in evict-for-node · 16ee2061
      Chris Wilson 提交于
      The vma->exec_list is still the only means we have for both reserving an
      object in execbuf, and for constructing the eviction list. So during the
      construction of the eviction list, we must treat anything already on the
      exec_list as being pinned.
      
      Yes, this sharing of two semantically different lists will be fixed! But
      in the meantime, we have the issue that this is tripping up CI since we
      started using i915_gem_gtt_reserve_node() + i915_gem_evict_for_node()
      from the regular execbuf reservation path in commit 606fec95
      ("drm/i915: Prefer random replacement before eviction search"):
      
      [  108.424063] kernel BUG at drivers/gpu/drm/i915/i915_vma.h:254!
      [  108.424072] invalid opcode: 0000 [#1] PREEMPT SMP
      [  108.424079] Modules linked in: snd_hda_intel i915 intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec_hdmi snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core mei_me snd_pcm lpc_ich mei sdhci_pci sdhci mmc_core e1000e ptp pps_core [last unloaded: i915]
      [  108.424132] CPU: 1 PID: 6865 Comm: gem_cs_tlb Tainted: G     U          4.10.0-rc3-CI-CI_DRM_2049+ #1
      [  108.424143] Hardware name: Hewlett-Packard HP EliteBook 8440p/172A, BIOS 68CCU Ver. F.24 09/13/2013
      [  108.424154] task: ffff88012ae22600 task.stack: ffffc90000a14000
      [  108.424220] RIP: 0010:i915_gem_evict_for_node+0x237/0x410 [i915]
      [  108.424229] RSP: 0018:ffffc90000a17a58 EFLAGS: 00010202
      [  108.424237] RAX: 0000000000005871 RBX: ffff88012d1ad778 RCX: 0000000000000000
      [  108.424246] RDX: 000000007ffff000 RSI: ffffc90000a17a68 RDI: ffff880127e694d8
      [  108.424255] RBP: ffffc90000a17aa0 R08: ffffc90000a17a68 R09: 0000000000000000
      [  108.424264] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000080000000
      [  108.424273] R13: ffffc90000a17a68 R14: ffff880127e694d8 R15: ffffffffa0387330
      [  108.424283] FS:  00007f8236e3d8c0(0000) GS:ffff880137c40000(0000) knlGS:0000000000000000
      [  108.424293] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  108.424305] CR2: 00007f82347a2000 CR3: 000000012c866000 CR4: 00000000000006e0
      [  108.424317] Call Trace:
      [  108.424368]  i915_gem_gtt_reserve+0x67/0x80 [i915]
      [  108.424424]  __i915_vma_do_pin+0x248/0x620 [i915]
      [  108.424487]  ? __i915_vma_do_pin+0x162/0x620 [i915]
      [  108.424540]  i915_gem_execbuffer_reserve_vma.isra.8+0x153/0x1f0 [i915]
      [  108.424591]  i915_gem_execbuffer_reserve.isra.9+0x40e/0x440 [i915]
      [  108.424643]  i915_gem_do_execbuffer.isra.15+0x6d9/0x1b20 [i915]
      [  108.424696]  i915_gem_execbuffer2+0xc0/0x250 [i915]
      [  108.424712]  drm_ioctl+0x200/0x450
      [  108.424760]  ? i915_gem_execbuffer+0x330/0x330 [i915]
      [  108.424776]  do_vfs_ioctl+0x90/0x6e0
      [  108.424789]  ? up_read+0x1a/0x40
      [  108.424800]  ? trace_hardirqs_on_caller+0x122/0x1b0
      [  108.424813]  SyS_ioctl+0x3c/0x70
      [  108.424828]  entry_SYSCALL_64_fastpath+0x1c/0xb1
      [  108.424839] RIP: 0033:0x7f8235867357
      [  108.424848] RSP: 002b:00007ffdc14504c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
      [  108.424866] RAX: ffffffffffffffda RBX: 00007ffdc1450600 RCX: 00007f8235867357
      [  108.424878] RDX: 00007ffdc14505a0 RSI: 0000000040406469 RDI: 0000000000000003
      [  108.424890] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000022
      [  108.424903] R10: 0000000000000007 R11: 0000000000000246 R12: 0000000000000002
      [  108.424915] R13: 0000000000419101 R14: 00007ffdc1450600 R15: 00007ffdc14505f0
      [  108.424928] Code: 45 b8 8b 4d c0 4c 89 f2 48 89 de ff d0 49 8b 07 4c 8b 45 b8 48 85 c0 75 dd 65 ff 0d d4 a1 c8 5f 0f 84 47 01 00 00 e9 0d fe ff ff <0f> 0b 45 31 f6 4c 8b 65 c8 49 8b 04 24 4d 39 ec 49 8d 9c 24 28
      [  108.425055] RIP: i915_gem_evict_for_node+0x237/0x410 [i915] RSP: ffffc90000a17a58
      
      Fixes: 172ae5b4 ("drm/i915: Fix i915_gem_evict_for_vma (soft-pinning)")
      Fixes: 606fec95 ("drm/i915: Prefer random replacement before eviction search")
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170111182132.19174-1-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      16ee2061
  2. 11 1月, 2017 2 次提交
  3. 06 1月, 2017 1 次提交
  4. 28 12月, 2016 3 次提交
  5. 27 12月, 2016 1 次提交
  6. 12 12月, 2016 1 次提交
  7. 06 12月, 2016 1 次提交
    • C
      drm/i915: Fix i915_gem_evict_for_vma (soft-pinning) · 172ae5b4
      Chris Wilson 提交于
      Soft-pinning depends upon being able to check for availabilty of an
      interval and evict overlapping object from a drm_mm range manager very
      quickly. Currently it uses a linear list, and so performance is dire and
      not suitable as a general replacement. Worse, the current code will oops
      if it tries to evict an active buffer.
      
      It also helps if the routine reports the correct error codes as expected
      by its callers and emits a tracepoint upon use.
      
      For posterity since the wrong patch was pushed (i.e. that missed these
      key points and had known bugs), this is the changelog that should have
      been on commit 506a8e87 ("drm/i915: Add soft-pinning API for
      execbuffer"):
      
      Userspace can pass in an offset that it presumes the object is located
      at. The kernel will then do its utmost to fit the object into that
      location. The assumption is that userspace is handling its own object
      locations (for example along with full-ppgtt) and that the kernel will
      rarely have to make space for the user's requests.
      
      This extends the DRM_IOCTL_I915_GEM_EXECBUFFER2 to do the following:
      * if the user supplies a virtual address via the execobject->offset
        *and* sets the EXEC_OBJECT_PINNED flag in execobject->flags, then
        that object is placed at that offset in the address space selected
        by the context specifier in execbuffer.
      * the location must be aligned to the GTT page size, 4096 bytes
      * as the object is placed exactly as specified, it may be used by this
        execbuffer call without relocations pointing to it
      
      It may fail to do so if:
      * EINVAL is returned if the object does not have a 4096 byte aligned
        address
      * the object conflicts with another pinned object (either pinned by
        hardware in that address space, e.g. scanouts in the aliasing ppgtt)
        or within the same batch.
        EBUSY is returned if the location is pinned by hardware
        EINVAL is returned if the location is already in use by the batch
      * EINVAL is returned if the object conflicts with its own alignment (as meets
        the hardware requirements) or if the placement of the object does not fit
        within the address space
      
      All other execbuffer errors apply.
      
      Presence of this execbuf extension may be queried by passing
      I915_PARAM_HAS_EXEC_SOFTPIN to DRM_IOCTL_I915_GETPARAM and checking for
      a reported value of 1 (or greater).
      
      v2: Combine the hole/adjusted-hole ENOSPC checks
      v3: More color, more splitting, more blurb.
      
      Fixes: 506a8e87 ("drm/i915: Add soft-pinning API for execbuffer")
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20161205142941.21965-2-chris@chris-wilson.co.uk
      172ae5b4
  8. 29 11月, 2016 1 次提交
  9. 29 10月, 2016 2 次提交
  10. 24 10月, 2016 1 次提交
    • C
      drm/i915: Move user fault tracking to a separate list · 275f039d
      Chris Wilson 提交于
      We want to decouple RPM and struct_mutex, but currently RPM has to walk
      the list of bound objects and remove userspace mmapping before we
      suspend (otherwise userspace may continue to access the GTT whilst it is
      powered down). This currently requires the struct_mutex to walk the
      bound_list, but if we move that to a separate list and lock we can take
      the first step towards removing the struct_mutex.
      
      v2: Split runtime suspend unmapping vs regular unmapping, to make the
      locking (and barriers) clearer. Add the object to the userfault_list
      prior to inserting the first PTE, the race between add/revoke depends
      upon struct_mutex for regular unmappings and rpm for runtime-suspend.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> #v1
      Link: http://patchwork.freedesktop.org/patch/msgid/20161024124218.18252-1-chris@chris-wilson.co.uk
      275f039d
  11. 14 10月, 2016 1 次提交
    • A
      drm/i915: Allocate intel_engine_cs structure only for the enabled engines · 3b3f1650
      Akash Goel 提交于
      With the possibility of addition of many more number of rings in future,
      the drm_i915_private structure could bloat as an array, of type
      intel_engine_cs, is embedded inside it.
      	struct intel_engine_cs engine[I915_NUM_ENGINES];
      Though this is still fine as generally there is only a single instance of
      drm_i915_private structure used, but not all of the possible rings would be
      enabled or active on most of the platforms. Some memory can be saved by
      allocating intel_engine_cs structure only for the enabled/active engines.
      Currently the engine/ring ID is kept static and dev_priv->engine[] is simply
      indexed using the enums defined in intel_engine_id.
      To save memory and continue using the static engine/ring IDs, 'engine' is
      defined as an array of pointers.
      	struct intel_engine_cs *engine[I915_NUM_ENGINES];
      dev_priv->engine[engine_ID] will be NULL for disabled engine instances.
      
      There is a text size reduction of 928 bytes, from 1028200 to 1027272, for
      i915.o file (but for i915.ko file text size remain same as 1193131 bytes).
      
      v2:
      - Remove the engine iterator field added in drm_i915_private structure,
        instead pass a local iterator variable to the for_each_engine**
        macros. (Chris)
      - Do away with intel_engine_initialized() and instead directly use the
        NULL pointer check on engine pointer. (Chris)
      
      v3:
      - Remove for_each_engine_id() macro, as the updated macro for_each_engine()
        can be used in place of it. (Chris)
      - Protect the access to Render engine Fault register with a NULL check, as
        engine specific init is done later in Driver load sequence.
      
      v4:
      - Use !!dev_priv->engine[VCS] style for the engine check in getparam. (Chris)
      - Kill the superfluous init_engine_lists().
      
      v5:
      - Cleanup the intel_engines_init() & intel_engines_setup(), with respect to
        allocation of intel_engine_cs structure. (Chris)
      
      v6:
      - Rebase.
      
      v7:
      - Optimize the for_each_engine_masked() macro. (Chris)
      - Change the type of 'iter' local variable to enum intel_engine_id. (Chris)
      - Rebase.
      
      v8: Rebase.
      
      v9: Rebase.
      
      v10:
      - For index calculation use engine ID instead of pointer based arithmetic in
        intel_engine_sync_index() as engine pointers are not contiguous now (Chris)
      - For appropriateness, rename local enum variable 'iter' to 'id'. (Joonas)
      - Use for_each_engine macro for cleanup in intel_engines_init() and remove
        check for NULL engine pointer in cleanup() routines. (Joonas)
      
      v11: Rebase.
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NAkash Goel <akash.goel@intel.com>
      Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/1476378888-7372-1-git-send-email-akash.goel@intel.com
      3b3f1650
  12. 09 9月, 2016 2 次提交
  13. 19 8月, 2016 1 次提交
  14. 05 8月, 2016 5 次提交
  15. 04 8月, 2016 1 次提交
  16. 20 7月, 2016 2 次提交
  17. 16 7月, 2016 1 次提交
  18. 24 6月, 2016 2 次提交
  19. 09 5月, 2016 1 次提交
  20. 26 2月, 2016 1 次提交
  21. 09 12月, 2015 1 次提交
    • C
      drm/i915: Add soft-pinning API for execbuffer · 506a8e87
      Chris Wilson 提交于
      Userspace can pass in an offset that it presumes the object is located
      at. The kernel will then do its utmost to fit the object into that
      location. The assumption is that userspace is handling its own object
      locations (for example along with full-ppgtt) and that the kernel will
      rarely have to make space for the user's requests.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      
      v2: Fixed incorrect eviction found by Michal Winiarski - fix suggested by Chris
      Wilson.  Fixed incorrect error paths causing crash found by Michal Winiarski.
      (Not published externally)
      
      v3: Rebased because of trivial conflict in object_bind_to_vm.  Fixed eviction
      to allow eviction of soft-pinned objects when another soft-pinned object used
      by a subsequent execbuffer overlaps reported by Michal Winiarski.
      (Not published externally)
      
      v4: Moved soft-pinned objects to the front of ordered_vmas so that they are
      pinned first after an address conflict happens to avoid repeated conflicts in
      rare cases (Suggested by Chris Wilson).  Expanded comment on
      drm_i915_gem_exec_object2.offset to cover this new API.
      
      v5: Added I915_PARAM_HAS_EXEC_SOFTPIN parameter for detecting this capability
      (Kristian). Added check for multiple pinnings on eviction (Akash). Made sure
      buffers are not considered misplaced without the user specifying
      EXEC_OBJECT_SUPPORTS_48B_ADDRESS.  User must assume responsibility for any
      addressing workarounds.  Updated object2.offset field comment again to clarify
      NO_RELOC case (Chris).  checkpatch cleanup.
      
      v6: Trivial rebase on latest drm-intel-nightly
      
      v7: Catch attempts to pin above the max virtual address size and return
      EINVAL (Tvrtko). Decouple EXEC_OBJECT_SUPPORTS_48B_ADDRESS and
      EXEC_OBJECT_PINNED flags, user must pass both flags in any attempt to pin
      something at an offset above 4GB (Chris, Daniel Vetter).
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Akash Goel <akash.goel@intel.com>
      Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
      Cc: Michal Winiarski <michal.winiarski@intel.com>
      Cc: Zou Nanhai <nanhai.zou@intel.com>
      Cc: Kristian Høgsberg <hoegsberg@gmail.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
      Reviewed-by: NMichel Thierry <michel.thierry@intel.com>
      Acked-by: PDT
      Signed-off-by: NThomas Daniel <thomas.daniel@intel.com>
      Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/1449575707-20933-1-git-send-email-thomas.daniel@intel.com
      506a8e87
  22. 07 10月, 2015 1 次提交
  23. 20 3月, 2015 1 次提交
  24. 06 1月, 2015 2 次提交
  25. 19 9月, 2014 1 次提交
  26. 27 5月, 2014 1 次提交
    • C
      drm/i915: Prevent negative relocation deltas from wrapping · d23db88c
      Chris Wilson 提交于
      This is pure evil. Userspace, I'm looking at you SNA, repacks batch
      buffers on the fly after generation as they are being passed to the
      kernel for execution. These batches also contain self-referenced
      relocations as a single buffer encompasses the state commands, kernels,
      vertices and sampler. During generation the buffers are placed at known
      offsets within the full batch, and then the relocation deltas (as passed
      to the kernel) are tweaked as the batch is repacked into a smaller buffer.
      This means that userspace is passing negative relocations deltas, which
      subsequently wrap to large values if the batch is at a low address. The
      GPU hangs when it then tries to use the large value as a base for its
      address offsets, rather than wrapping back to the real value (as one
      would hope). As the GPU uses positive offsets from the base, we can
      treat the relocation address as the minimum address read by the GPU.
      For the upper bound, we trust that userspace will not read beyond the
      end of the buffer.
      
      So, how do we fix negative relocations from wrapping? We can either
      check that every relocation looks valid when we write it, and then
      position each object such that we prevent the offset wraparound, or we
      just special-case the self-referential behaviour of SNA and force all
      batches to be above 256k. Daniel prefers the latter approach.
      
      This fixes a GPU hang when it tries to use an address (relocation +
      offset) greater than the GTT size. The issue would occur quite easily
      with full-ppgtt as each fd gets its own VM space, so low offsets would
      often be handed out. However, with the rearrangement of the low GTT due
      to capturing the BIOS framebuffer, it is already affecting kernels 3.15
      onwards. I think only IVB+ is susceptible to this bug, but the workaround
      should only kick in rarely, so it seems sensible to always apply it.
      
      v3: Use a bias for batch buffers to prevent small negative delta relocations
      from wrapping.
      
      v4 from Daniel:
      - s/BIAS/BATCH_OFFSET_BIAS/
      - Extract eb_vma_misplaced/i915_vma_misplaced since the conditions
        were growing rather cumbersome.
      - Add a comment to eb_get_batch explaining why we do this.
      - Apply the batch offset bias everywhere but mention that we've only
        observed it on gen7 gpus.
      - Drop PIN_OFFSET_FIX for now, that slipped in from a feature patch.
      
      v5: Add static to eb_get_batch, spotted by 0-day tester.
      
      Testcase: igt/gem_bad_reloc
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78533
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v3)
      Cc: stable@vger.kernel.org
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      d23db88c
  27. 31 3月, 2014 1 次提交
  28. 14 2月, 2014 1 次提交
    • D
      drm/i915: Consolidate binding parameters into flags · 1ec9e26d
      Daniel Vetter 提交于
      Anything more than just one bool parameter is just a pain to read,
      symbolic constants are much better.
      
      Split out from Chris' vma-binding rework patch.
      
      v2: Undo the behaviour change in object_pin that Chris spotted.
      
      v3: Split out misplaced hunk to handle set_cache_level errors,
      spotted by Jani.
      
      v4: Keep the current over-zealous binding logic in the execbuffer code
      working with a quick hack while the overall binding code gets shuffled
      around.
      
      v5: Reorder the PIN_ flags for more natural patch splitup.
      
      v6: Pull out the PIN_GLOBAL split-up again.
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Ben Widawsky <benjamin.widawsky@intel.com>
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      1ec9e26d