1. 22 5月, 2019 5 次提交
  2. 17 4月, 2019 1 次提交
  3. 27 3月, 2019 1 次提交
  4. 22 3月, 2019 4 次提交
    • C
      drm/i915: Allow contexts to share a single timeline across all engines · ea593dbb
      Chris Wilson 提交于
      Previously, our view has been always to run the engines independently
      within a context. (Multiple engines happened before we had contexts and
      timelines, so they always operated independently and that behaviour
      persisted into contexts.) However, at the user level the context often
      represents a single timeline (e.g. GL contexts) and userspace must
      ensure that the individual engines are serialised to present that
      ordering to the client (or forgot about this detail entirely and hope no
      one notices - a fair ploy if the client can only directly control one
      engine themselves ;)
      
      In the next patch, we will want to construct a set of engines that
      operate as one, that have a single timeline interwoven between them, to
      present a single virtual engine to the user. (They submit to the virtual
      engine, then we decide which engine to execute on based.)
      
      To that end, we want to be able to create contexts which have a single
      timeline (fence context) shared between all engines, rather than multiple
      timelines.
      
      v2: Move the specialised timeline ordering to its own function.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190322092325.5883-4-chris@chris-wilson.co.uk
      ea593dbb
    • C
      drm/i915: Extend CONTEXT_CREATE to set parameters upon construction · b9171541
      Chris Wilson 提交于
      It can be useful to have a single ioctl to create a context with all
      the initial parameters instead of a series of create + setparam + setparam
      ioctls. This extension to create context allows any of the parameters
      to be passed in as a linked list to be applied to the newly constructed
      context.
      
      v2: Make a local copy of user setparam (Tvrtko)
      v3: Use flags to detect availability of extension interface
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190322092325.5883-3-chris@chris-wilson.co.uk
      b9171541
    • C
      drm/i915: Create/destroy VM (ppGTT) for use with contexts · e0695db7
      Chris Wilson 提交于
      In preparation to making the ppGTT binding for a context explicit (to
      facilitate reusing the same ppGTT between different contexts), allow the
      user to create and destroy named ppGTT.
      
      v2: Replace global barrier for swapping over the ppgtt and tlbs with a
      local context barrier (Tvrtko)
      v3: serialise with struct_mutex; it's lazy but required dammit
      v4: Rewrite igt_ctx_shared_exec to be more different (aimed to be more
      similarly, turned out different!)
      
      v5: Fix up test unwind for aliasing-ppgtt (snb)
      v6: Tighten language for uapi struct drm_i915_gem_vm_control.
      v7: Patch the context image for runtime ppgtt switching!
      
      Testcase: igt/gem_vm_create
      Testcase: igt/gem_ctx_param/vm
      Testcase: igt/gem_ctx_clone/vm
      Testcase: igt/gem_ctx_shared
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190322092325.5883-2-chris@chris-wilson.co.uk
      e0695db7
    • C
      drm/i915: Introduce the i915_user_extension_method · 9d1305ef
      Chris Wilson 提交于
      An idea for extending uABI inspired by Vulkan's extension chains.
      Instead of expanding the data struct for each ioctl every time we need
      to add a new feature, define an extension chain instead. As we add
      optional interfaces to control the ioctl, we define a new extension
      struct that can be linked into the ioctl data only when required by the
      user. The key advantage being able to ignore large control structs for
      optional interfaces/extensions, while being able to process them in a
      consistent manner.
      
      In comparison to other extensible ioctls, the key difference is the
      use of a linked chain of extension structs vs an array of tagged
      pointers. For example,
      
      struct drm_amdgpu_cs_chunk {
              __u32           chunk_id;
              __u32           length_dw;
              __u64           chunk_data;
      };
      
      struct drm_amdgpu_cs_in {
              __u32           ctx_id;
              __u32           bo_list_handle;
              __u32           num_chunks;
              __u32           _pad;
              __u64           chunks;
      };
      
      allows userspace to pass in array of pointers to extension structs, but
      must therefore keep constructing that array along side the command stream.
      In dynamic situations like that, a linked list is preferred and does not
      similar from extra cache line misses as the extension structs themselves
      must still be loaded separate to the chunks array.
      
      v2: Apply the tail call optimisation directly to nip the worry of stack
      overflow in the bud.
      v3: Defend against recursion.
      v4: Fixup local types to match new uabi
      
      Opens:
      - do we include the result as an out-field in each chain?
      struct i915_user_extension {
      	__u64 next_extension;
      	__u64 name;
      	__s32 result;
      	__u32 mbz; /* reserved for future use */
      };
      * Undecided, so provision some room for future expansion.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190322092325.5883-1-chris@chris-wilson.co.uk
      9d1305ef
  5. 06 3月, 2019 1 次提交
  6. 02 3月, 2019 2 次提交
    • C
      drm/i915: Fix I915_EXEC_RING_MASK · d90c06d5
      Chris Wilson 提交于
      This was supposed to be a mask of all known rings, but it is being used
      by execbuffer to filter out invalid rings, and so is instead mapping high
      unused values onto valid rings. Instead of a mask of all known rings,
      we need it to be the mask of all possible rings.
      
      Fixes: 549f7365 ("drm/i915: Enable SandyBridge blitter ring")
      Fixes: de1add36 ("drm/i915: Decouple execbuf uAPI from internal implementation")
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: <stable@vger.kernel.org> # v4.6+
      Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190301140404.26690-21-chris@chris-wilson.co.uk
      d90c06d5
    • C
      drm/i915: Use HW semaphores for inter-engine synchronisation on gen8+ · e8861964
      Chris Wilson 提交于
      Having introduced per-context seqno, we now have a means to identity
      progress across the system without feel of rollback as befell the
      global_seqno. That is we can program a MI_SEMAPHORE_WAIT operation in
      advance of submission safe in the knowledge that our target seqno and
      address is stable.
      
      However, since we are telling the GPU to busy-spin on the target address
      until it matches the signaling seqno, we only want to do so when we are
      sure that busy-spin will be completed quickly. To achieve this we only
      submit the request to HW once the signaler is itself executing (modulo
      preemption causing us to wait longer), and we only do so for default and
      above priority requests (so that idle priority tasks never themselves
      hog the GPU waiting for others).
      
      As might be reasonably expected, HW semaphores excel in inter-engine
      synchronisation microbenchmarks (where the 3x reduced latency / increased
      throughput more than offset the power cost of spinning on a second ring)
      and have significant improvement (can be up to ~10%, most see no change)
      for single clients that utilize multiple engines (typically media players
      and transcoders), without regressing multiple clients that can saturate
      the system or changing the power envelope dramatically.
      
      v3: Drop the older NEQ branch, now we pin the signaler's HWSP anyway.
      v4: Tell the world and include it as part of scheduler caps.
      
      Testcase: igt/gem_exec_whisper
      Testcase: igt/benchmarks/gem_wsim
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190301170901.8340-3-chris@chris-wilson.co.uk
      e8861964
  7. 19 2月, 2019 1 次提交
  8. 18 2月, 2019 1 次提交
  9. 05 2月, 2019 1 次提交
    • T
      drm/i915: Expose RPCS (SSEU) configuration to userspace (Gen11 only) · e46c2e99
      Tvrtko Ursulin 提交于
      We want to allow userspace to reconfigure the subslice configuration on a
      per context basis.
      
      This is required for the functional requirement of shutting down non-VME
      enabled sub-slices on Gen11 parts.
      
      To do so, we expose a context parameter to allow adjustment of the RPCS
      register stored within the context image (and currently not accessible via
      LRI).
      
      If the context is adjusted before first use or whilst idle, the adjustment
      is for "free"; otherwise if the context is active we queue a request to do
      so (using the kernel context), following all other activity by that
      context, which is also marked as barrier for all following submission
      against the same context.
      
      Since the overhead of device re-configuration during context switching can
      be significant, especially in multi-context workloads, we limit this new
      uAPI to only support the Gen11 VME use case. In this use case either the
      device is fully enabled, and exactly one slice and half of the subslices
      are enabled.
      
      Example usage:
      
      	struct drm_i915_gem_context_param_sseu sseu = { };
      	struct drm_i915_gem_context_param arg = {
      		.param = I915_CONTEXT_PARAM_SSEU,
      		.ctx_id = gem_context_create(fd),
      		.size = sizeof(sseu),
      		.value = to_user_pointer(&sseu)
      	};
      
      	/* Query device defaults. */
      	gem_context_get_param(fd, &arg);
      
      	/* Set VME configuration on a 1x6x8 part. */
      	sseu.slice_mask = 0x1;
      	sseu.subslice_mask = 0xe0;
      	gem_context_set_param(fd, &arg);
      
      v2: Fix offset of CTX_R_PWR_CLK_STATE in intel_lr_context_set_sseu()
          (Lionel)
      
      v3: Add ability to program this per engine (Chris)
      
      v4: Move most get_sseu() into i915_gem_context.c (Lionel)
      
      v5: Validate sseu configuration against the device's capabilities (Lionel)
      
      v6: Change context powergating settings through MI_SDM on kernel context
          (Chris)
      
      v7: Synchronize the requests following a powergating setting change using
          a global dependency (Chris)
          Iterate timelines through dev_priv.gt.active_rings (Tvrtko)
          Disable RPCS configuration setting for non capable users
          (Lionel/Tvrtko)
      
      v8: s/union intel_sseu/struct intel_sseu/ (Lionel)
          s/dev_priv/i915/ (Tvrtko)
          Change uapi class/instance fields to u16 (Tvrtko)
          Bump mask fields to 64bits (Lionel)
          Don't return EPERM when dynamic sseu is disabled (Tvrtko)
      
      v9: Import context image into kernel context's ppgtt only when
          reconfiguring powergated slice/subslices (Chris)
          Use aliasing ppgtt when needed (Michel)
      
      Tvrtko Ursulin:
      
      v10:
       * Update for upstream changes.
       * Request submit needs a RPM reference.
       * Reject on !FULL_PPGTT for simplicity.
       * Pull out get/set param to helpers for readability and less indent.
       * Use i915_request_await_dma_fence in add_global_barrier to skip waits
         on the same timeline and avoid GEM_BUG_ON.
       * No need to explicitly assign a NULL pointer to engine in legacy mode.
       * No need to move gen8_make_rpcs up.
       * Factored out global barrier as prep patch.
       * Allow to only CAP_SYS_ADMIN if !Gen11.
      
      v11:
       * Remove engine vfunc in favour of local helper. (Chris Wilson)
       * Stop retiring requests before updates since it is not needed
         (Chris Wilson)
       * Implement direct CPU update path for idle contexts. (Chris Wilson)
       * Left side dependency needs only be on the same context timeline.
         (Chris Wilson)
       * It is sufficient to order the timeline. (Chris Wilson)
       * Reject !RCS configuration attempts with -ENODEV for now.
      
      v12:
       * Rebase for make_rpcs.
      
      v13:
       * Centralize SSEU normalization to make_rpcs.
       * Type width checking (uAPI <-> implementation).
       * Gen11 restrictions uAPI checks.
       * Gen11 subslice count differences handling.
       Chris Wilson:
       * args->size handling fixes.
       * Update context image from GGTT.
       * Postpone context image update to pinning.
       * Use i915_gem_active_raw instead of last_request_on_engine.
      
      v14:
       * Add activity tracker on intel_context to fix the lifetime issues
         and simplify the code. (Chris Wilson)
      
      v15:
       * Fix context pin leak if no space in ring by simplifying the
         context pinning sequence.
      
      v16:
       * Rebase for context get/set param locking changes.
       * Just -ENODEV on !Gen11. (Joonas)
      
      v17:
       * Fix one Gen11 subslice enablement rule.
       * Handle error from i915_sw_fence_await_sw_fence_gfp. (Chris Wilson)
      
      v18:
       * Update commit message. (Joonas)
       * Restrict uAPI to VME use case. (Joonas)
      
      v19:
       * Rebase.
      
      v20:
       * Rebase for ce->active_tracker.
      
      v21:
       * Rebase for IS_GEN changes.
      
      v22:
       * Reserve uAPI for flags straight away. (Chris Wilson)
      
      v23:
       * Rebase for RUNTIME_INFO.
      
      v24:
       * Added some headline docs for the uapi usage. (Joonas/Chris)
      
      v25:
       * Renamed class/instance to engine_class/engine_instance to avoid clash
         with C++ keyword. (Tony Ye)
      
      v26:
       * Rebased for runtime pm api changes.
      
      v27:
       * Rebased for intel_context_init.
       * Wrap commit msg to 75.
      
      v28:
       (Chris Wilson)
       * Use i915_gem_ggtt.
       * Use i915_request_await_dma_fence to show a better example.
      
      v29:
       * i915_timeline_set_barrier can now fail. (Chris Wilson)
      
      v30:
       * Capture some acks.
      
      v31:
       * Drop the WARN_ON from use controllable paths. (Chris Wilson)
       * Use overflows_type for all checks.
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100899
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107634
      Issue: https://github.com/intel/media-driver/issues/267Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
      Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Zhipeng Gong <zhipeng.gong@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Tony Ye <tony.ye@intel.com>
      Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Acked-by: NTimo Aaltonen <timo.aaltonen@canonical.com>
      Acked-by: NTakashi Iwai <tiwai@suse.de>
      Acked-by: NStéphane Marchesin <marcheu@chromium.org>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190205095032.22673-4-tvrtko.ursulin@linux.intel.com
      e46c2e99
  10. 19 11月, 2018 1 次提交
  11. 23 10月, 2018 1 次提交
  12. 27 9月, 2018 1 次提交
  13. 20 7月, 2018 1 次提交
  14. 08 3月, 2018 2 次提交
    • L
      drm/i915: expose rcs topology through query uAPI · c822e059
      Lionel Landwerlin 提交于
      With the introduction of asymmetric slices in CNL, we cannot rely on
      the previous SUBSLICE_MASK getparam to tell userspace what subslices
      are available. Here we introduce a more detailed way of querying the
      Gen's GPU topology that doesn't aggregate numbers.
      
      This is essential for monitoring parts of the GPU with the OA unit,
      because counters need to be normalized to the number of
      EUs/subslices/slices. The current aggregated numbers like EU_TOTAL do
      not gives us sufficient information.
      
      The Mesa series making use of this API is :
      
          https://patchwork.freedesktop.org/series/38795/
      
      As a bonus we can draw representations of the GPU :
      
          https://imgur.com/a/vuqpa
      
      v2: Rename uapi struct s/_mask/_info/ (Tvrtko)
          Report max_slice/subslice/eus_per_subslice rather than strides (Tvrtko)
          Add uapi macros to read data from *_info structs (Tvrtko)
      
      v3: Use !!(v & DRM_I915_BIT()) for uapi macros instead of custom shifts (Tvrtko)
      
      v4: factorize query item writting (Tvrtko)
          tweak uapi struct/define names (Tvrtko)
      
      v5: Replace ALIGN() macro (Chris)
      
      v6: Updated uapi comments (Tvrtko)
          Moved flags != 0 checks into vfuncs (Tvrtko)
      
      v7: Use access_ok() before copying anything, to avoid overflows (Chris)
          Switch BUG_ON() to GEM_WARN_ON() (Tvrtko)
      
      v8: Tweak uapi comments style to match the coding style (Lionel)
      
      v9: Fix error in comment about computation of enabled subslice (Tvrtko)
      
      v10: Fix/update comments in uAPI (Sagar)
      
      v11: Drop drm_i915_query_(slice|subslice|eu)_info in favor of a single
           drm_i915_query_topology_info (Joonas)
      
      v12: Add subslice_stride/eu_stride in drm_i915_query_topology_info (Joonas)
      
      v13: Fix comment in uAPI (Joonas)
      Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
      Acked-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180306122857.27317-7-lionel.g.landwerlin@intel.com
      c822e059
    • L
      drm/i915: add query uAPI · a446ae2c
      Lionel Landwerlin 提交于
      There are a number of information that are readable from hardware
      registers and that we would like to make accessible to userspace. One
      particular example is the topology of the execution units (how are
      execution units grouped in subslices and slices and also which ones
      have been fused off for die recovery).
      
      At the moment the GET_PARAM ioctl covers some basic needs, but
      generally is only able to return a single value for each defined
      parameter. This is a bit problematic with topology descriptions which
      are array/maps of available units.
      
      This change introduces a new ioctl that can deal with requests to fill
      structures of potentially variable lengths. The user is expected fill
      a query with length fields set at 0 on the first call, the kernel then
      sets the length fields to the their expected values. A second call to
      the kernel with length fields at their expected values will trigger a
      copy of the data to the pointed memory locations.
      
      The scope of this uAPI is only to provide information to userspace,
      not to allow configuration of the device.
      
      v2: Simplify dispatcher code iteration (Tvrtko)
          Tweak uapi drm_i915_query_item structure (Tvrtko)
      
      v3: Rename pad fields into flags (Chris)
          Return error on flags field != 0 (Chris)
          Only copy length back to userspace in drm_i915_query_item (Chris)
      
      v4: Use array of functions instead of switch (Chris)
      
      v5: More comments in uapi (Tvrtko)
          Return query item errors in length field (All)
      
      v6: Tweak uapi comments style to match the coding style (Lionel)
      
      v7: Add i915_query.h (Joonas)
      
      v8: (Lionel) Change the behavior of the item iterator to report
          invalid queries into the query item rather than stopping the
          iteration. This enables userspace applications to query newer
          items on older kernels and only have failure on the items that are
          not supported.
      
      v9: Edit copyright headers (Joonas)
      
      v10: Typos & comments in uapi (Joonas)
      Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com>
      Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Acked-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180306122857.27317-6-lionel.g.landwerlin@intel.com
      a446ae2c
  15. 06 2月, 2018 1 次提交
  16. 25 11月, 2017 1 次提交
  17. 23 11月, 2017 1 次提交
  18. 22 11月, 2017 3 次提交
    • T
      drm/i915/pmu: Add RC6 residency metrics · 6060b6ae
      Tvrtko Ursulin 提交于
      For clients like intel-gpu-overlay it is easier to read the
      counters via the perf API than having to parse sysfs.
      Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-9-tvrtko.ursulin@linux.intel.com
      6060b6ae
    • T
      drm/i915/pmu: Add interrupt count metric · 0cd4684d
      Tvrtko Ursulin 提交于
      For clients like intel-gpu-overlay it is easier to read the
      count via the perf API than having to parse /proc.
      Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-7-tvrtko.ursulin@linux.intel.com
      0cd4684d
    • T
      drm/i915/pmu: Expose a PMU interface for perf queries · b46a33e2
      Tvrtko Ursulin 提交于
      From: Chris Wilson <chris@chris-wilson.co.uk>
      From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      From: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
      
      The first goal is to be able to measure GPU (and invidual ring) busyness
      without having to poll registers from userspace. (Which not only incurs
      holding the forcewake lock indefinitely, perturbing the system, but also
      runs the risk of hanging the machine.) As an alternative we can use the
      perf event counter interface to sample the ring registers periodically
      and send those results to userspace.
      
      Functionality we are exporting to userspace is via the existing perf PMU
      API and can be exercised via the existing tools. For example:
      
        perf stat -a -e i915/rcs0-busy/ -I 1000
      
      Will print the render engine busynnes once per second. All the performance
      counters can be enumerated (perf list) and have their unit of measure
      correctly reported in sysfs.
      
      v1-v2 (Chris Wilson):
      
      v2: Use a common timer for the ring sampling.
      
      v3: (Tvrtko Ursulin)
       * Decouple uAPI from i915 engine ids.
       * Complete uAPI defines.
       * Refactor some code to helpers for clarity.
       * Skip sampling disabled engines.
       * Expose counters in sysfs.
       * Pass in fake regs to avoid null ptr deref in perf core.
       * Convert to class/instance uAPI.
       * Use shared driver code for rc6 residency, power and frequency.
      
      v4: (Dmitry Rogozhkin)
       * Register PMU with .task_ctx_nr=perf_invalid_context
       * Expose cpumask for the PMU with the single CPU in the mask
       * Properly support pmu->stop(): it should call pmu->read()
       * Properly support pmu->del(): it should call stop(event, PERF_EF_UPDATE)
       * Introduce refcounting of event subscriptions.
       * Make pmu.busy_stats a refcounter to avoid busy stats going away
         with some deleted event.
       * Expose cpumask for i915 PMU to avoid multiple events creation of
         the same type followed by counter aggregation by perf-stat.
       * Track CPUs getting online/offline to migrate perf context. If (likely)
         cpumask will initially set CPU0, CONFIG_BOOTPARAM_HOTPLUG_CPU0 will be
         needed to see effect of CPU status tracking.
       * End result is that only global events are supported and perf stat
         works correctly.
       * Deny perf driver level sampling - it is prohibited for uncore PMU.
      
      v5: (Tvrtko Ursulin)
      
       * Don't hardcode number of engine samplers.
       * Rewrite event ref-counting for correctness and simplicity.
       * Store initial counter value when starting already enabled events
         to correctly report values to all listeners.
       * Fix RC6 residency readout.
       * Comments, GPL header.
      
      v6:
       * Add missing entry to v4 changelog.
       * Fix accounting in CPU hotplug case by copying the approach from
         arch/x86/events/intel/cstate.c. (Dmitry Rogozhkin)
      
      v7:
       * Log failure message only on failure.
       * Remove CPU hotplug notification state on unregister.
      
      v8:
       * Fix error unwind on failed registration.
       * Checkpatch cleanup.
      
      v9:
       * Drop the energy metric, it is available via intel_rapl_perf.
         (Ville Syrjälä)
       * Use HAS_RC6(p). (Chris Wilson)
       * Handle unsupported non-engine events. (Dmitry Rogozhkin)
       * Rebase for intel_rc6_residency_ns needing caller managed
         runtime pm.
       * Drop HAS_RC6 checks from the read callback since creating those
         events will be rejected at init time already.
       * Add counter units to sysfs so perf stat output is nicer.
       * Cleanup the attribute tables for brevity and readability.
      
      v10:
       * Fixed queued accounting.
      
      v11:
       * Move intel_engine_lookup_user to intel_engine_cs.c
       * Commit update. (Joonas Lahtinen)
      
      v12:
       * More accurate sampling. (Chris Wilson)
       * Store and report frequency in MHz for better usability from
         perf stat.
       * Removed metrics: queued, interrupts, rc6 counters.
       * Sample engine busyness based on seqno difference only
         for less MMIO (and forcewake) on all platforms. (Chris Wilson)
      
      v13:
       * Comment spelling, use mul_u32_u32 to work around potential GCC
         issue and somne code alignment changes. (Chris Wilson)
      
      v14:
       * Rebase.
      
      v15:
       * Rebase for RPS refactoring.
      
      v16:
       * Use the dynamic slot in the CPU hotplug state machine so that we are
         free to setup our state as multi-instance. Previously we were re-using
         the CPUHP_AP_PERF_X86_UNCORE_ONLINE slot which is neither used as
         multi-instance, nor owned by our driver to start with.
       * Register the CPU hotplug handlers after the PMU, otherwise the callback
         will get called before the PMU is initialized which can end up in
         perf_pmu_migrate_context with an un-initialized base.
       * Added workaround for a probable bug in cpuhp core.
      
      v17:
       * Remove workaround for the cpuhp bug.
      
      v18:
       * Rebase for drm_i915_gem_engine_class getting upstream before us.
      
      v19:
       * Rebase. (trivial)
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Signed-off-by: NDmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-2-tvrtko.ursulin@linux.intel.com
      b46a33e2
  19. 13 11月, 2017 1 次提交
  20. 11 11月, 2017 2 次提交
  21. 09 11月, 2017 1 次提交
  22. 03 11月, 2017 1 次提交
  23. 06 10月, 2017 1 次提交
  24. 05 10月, 2017 2 次提交
    • C
      drm/i915/scheduler: Support user-defined priorities · ac14fbd4
      Chris Wilson 提交于
      Use a priority stored in the context as the initial value when
      submitting a request. This allows us to change the default priority on a
      per-context basis, allowing different contexts to be favoured with GPU
      time at the expense of lower importance work. The user can adjust the
      context's priority via I915_CONTEXT_PARAM_PRIORITY, with more positive
      values being higher priority (they will be serviced earlier, after their
      dependencies have been resolved). Any prerequisite work for an execbuf
      will have its priority raised to match the new request as required.
      
      Normal users can specify any value in the range of -1023 to 0 [default],
      i.e. they can reduce the priority of their workloads (and temporarily
      boost it back to normal if so desired).
      
      Privileged users can specify any value in the range of -1023 to 1023,
      [default is 0], i.e. they can raise their priority above all overs and
      so potentially starve the system.
      
      Note that the existing schedulers are not fair, nor load balancing, the
      execution is strictly by priority on a first-come, first-served basis,
      and the driver may choose to boost some requests above the range
      available to users.
      
      This priority was originally based around nice(2), but evolved to allow
      clients to adjust their priority within a small range, and allow for a
      privileged high priority range.
      
      For example, this can be used to implement EGL_IMG_context_priority
      https://www.khronos.org/registry/egl/extensions/IMG/EGL_IMG_context_priority.txt
      
      	EGL_CONTEXT_PRIORITY_LEVEL_IMG determines the priority level of
              the context to be created. This attribute is a hint, as an
              implementation may not support multiple contexts at some
              priority levels and system policy may limit access to high
              priority contexts to appropriate system privilege level. The
              default value for EGL_CONTEXT_PRIORITY_LEVEL_IMG is
              EGL_CONTEXT_PRIORITY_MEDIUM_IMG."
      
      so we can map
      
      	PRIORITY_HIGH -> 1023 [privileged, will failback to 0]
      	PRIORITY_MED -> 0 [default]
      	PRIORITY_LOW -> -1023
      
      They also map onto the priorities used by VkQueue (and a VkQueue is
      essentially a timeline, our i915_gem_context under full-ppgtt).
      
      v2: s/CAP_SYS_ADMIN/CAP_SYS_NICE/
      v3: Report min/max user priorities as defines in the uapi, and rebase
      internal priorities on the exposed values.
      
      Testcase: igt/gem_exec_schedule
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171003203453.15692-9-chris@chris-wilson.co.uk
      ac14fbd4
    • C
      drm/i915: Expand I915_PARAM_HAS_SCHEDULER into a capability bitmask · bf64e0b0
      Chris Wilson 提交于
      In the next few patches, we wish to enable different features for the
      scheduler, some which may subtlety change ABI (e.g. allow requests to be
      reordered under different circumstances). So we need to make sure
      userspace is cognizant of the changes (if they care), by which we employ
      the usual method of a GETPARAM. We already have an
      I915_PARAM_HAS_SCHEDULER (which notes the existing ability to reorder
      requests to avoid bubbles), and now we wish to extend that to be a
      bitmask to describe the different capabilities implemented.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171003203453.15692-7-chris@chris-wilson.co.uk
      bf64e0b0
  25. 18 9月, 2017 1 次提交
  26. 14 9月, 2017 1 次提交
  27. 05 9月, 2017 1 次提交