1. 26 Feb, 2022 3 commits
  2. 09 Feb, 2022 2 commits
    • drm/i915/guc: Use a single pass to calculate regset · bf890040
      Lucas De Marchi committed
      The ADS initialization was using 2 passes to calculate the regset sent
      to GuC to initialize each engine: the first pass to just have the final
      object size and the second to set each register in place in the final
      gem object.
      
      However, in order to maintain an ordered set of registers to pass to GuC,
      each register needs to be added and moved in the final array. The second
      phase may actually happen in IO memory rather than system memory and
      accessing IO memory by simply dereferencing the pointer doesn't work on
      all architectures. Other places of the ADS initialization were
      converted to use the iosys_map API, but here there may be a lot more
      accesses to IO memory. So, instead of following that same approach,
      convert the regset initialization to calculate the final array in 1
      pass and in the second pass that array is just copied to its final
      location, updating the pointers for each engine written to the ADS blob.
      
      One important thing is that struct temp_regset now has
      different semantics: `registers` continues to track the registers of a
      single engine, however the other fields are updated together, according
      to the newly added `storage`, which tracks the memory allocated for
      all the registers. So rename some of these fields and add a
      __mmio_reg_add(): this function (possibly) allocates memory and operates
      on the storage pointer while guc_mmio_reg_add() continues to manage the
      registers pointer.
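
The split between a shared, growable storage array and a per-engine sorted slice can be sketched as below. This is a minimal userspace illustration, not the driver's code: the `sketch_` names, the `temp_regset_sketch` layout, the doubling growth policy, and the use of an index (`engine_start`) in place of the `registers` pointer are all assumptions made for the example.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the driver's types; names are illustrative. */
struct reg_entry {
	uint32_t offset;
	uint32_t flags;
};

struct temp_regset_sketch {
	struct reg_entry *storage; /* one allocation holding all engines' registers */
	size_t engine_start;       /* index where the current engine's slice begins */
	size_t storage_used;       /* entries in use across all engines */
	size_t storage_max;        /* entries allocated */
};

/* Append one slot to storage, growing it if needed. Returns 0 or -ENOMEM. */
static int sketch_storage_add(struct temp_regset_sketch *rs, struct reg_entry e)
{
	if (rs->storage_used == rs->storage_max) {
		size_t new_max = rs->storage_max ? rs->storage_max * 2 : 4;
		struct reg_entry *p = realloc(rs->storage, new_max * sizeof(*p));

		if (!p)
			return -ENOMEM;
		rs->storage = p;
		rs->storage_max = new_max;
	}
	rs->storage[rs->storage_used++] = e;
	return 0;
}

/* Start a new engine: its slice begins at the current end of storage. */
static void sketch_engine_begin(struct temp_regset_sketch *rs)
{
	rs->engine_start = rs->storage_used;
}

/* Add a register to the current engine, keeping its slice sorted by offset. */
static int sketch_reg_add(struct temp_regset_sketch *rs, uint32_t offset,
			  uint32_t flags)
{
	struct reg_entry e = { offset, flags };
	size_t i;
	int ret = sketch_storage_add(rs, e);

	if (ret)
		return ret; /* propagate the allocation failure to the caller */

	/* Move the new entry down until the slice is ordered again. */
	for (i = rs->storage_used - 1; i > rs->engine_start; i--) {
		struct reg_entry tmp;

		if (rs->storage[i - 1].offset <= rs->storage[i].offset)
			break;
		tmp = rs->storage[i - 1];
		rs->storage[i - 1] = rs->storage[i];
		rs->storage[i] = tmp;
	}
	return 0;
}

/*
 * Second pass: copy the current engine's finished slice to its destination.
 * Plain memcpy stands in for whatever IO-safe copy the real target needs.
 */
static size_t sketch_engine_copy(const struct temp_regset_sketch *rs,
				 struct reg_entry *dst)
{
	size_t n = rs->storage_used - rs->engine_start;

	if (n)
		memcpy(dst, rs->storage + rs->engine_start, n * sizeof(*dst));
	return n;
}
```

All sorting and shuffling happens in ordinary system memory; the destination only ever sees one bulk copy per engine, which is the property the commit message is after.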
      
      On a Tiger Lake system using enable_guc=3, the following log message is
      now seen:
      
      	[  187.334310] i915 0000:00:02.0: [drm:intel_guc_ads_create [i915]] Used 4 KB for temporary ADS regset
      
      This change has also been tested on an ARM64 host with DG2 and other
      discrete graphics cards.
      
      v2 (Daniele):
        - Fix leaking tempset on error path
        - Add comments on struct temp_regset to document the meaning of each
          field
      
      Cc: Matt Roper <matthew.d.roper@intel.com>
      Cc: John Harrison <John.C.Harrison@Intel.com>
      Cc: Matthew Brost <matthew.brost@intel.com>
      Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
      Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
      Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20220208070141.2095177-3-lucas.demarchi@intel.com
    • drm/i915/guc: Prepare for error propagation · f4044ca1
      Lucas De Marchi committed
      Currently guc_mmio_reg_add() relies on having enough memory available in
      the array to add a new slot. It uses
      `GEM_BUG_ON(count >= regset->size);` to guard against going above the
      threshold.
      
      In order to allow guc_mmio_reg_add() to handle the memory allocation by
      itself, it must return an error in case of failures.  Adjust return code
      so this error can be propagated to the callers of guc_mmio_reg_add() and
      guc_mmio_regset_init().
      
      No intended change in behavior.
      
      Cc: Matt Roper <matthew.d.roper@intel.com>
      Cc: John Harrison <John.C.Harrison@Intel.com>
      Cc: Matthew Brost <matthew.brost@intel.com>
      Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
      Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
      Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20220208070141.2095177-2-lucas.demarchi@intel.com
  3. 02 Feb, 2022 1 commit
  4. 12 Jan, 2022 2 commits
    • drm/i915/gt: Move engine registers to their own header · 202b1f4c
      Matt Roper committed
      Let's continue breaking up and cleaning up the massive i915_reg.h file
      by moving all registers that are defined in relation to an engine base
      to their own header.
      
      There are probably a bunch of other "engine registers" that we haven't
      moved yet (especially those that belong to the render engine in the
      0x2??? range), but this is a relatively straightforward first step.
      
      Cc: Jani Nikula <jani.nikula@linux.intel.com>
      Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
      Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20220111051600.3429104-8-matthew.d.roper@intel.com
    • drm/i915/guc: Update to GuC version 69.0.3 · 77b6f79d
      John Harrison committed
      Update to the latest GuC release.
      
      The latest GuC firmware introduces a number of interface changes:
      
      GuC may return NO_RESPONSE_RETRY message for requests sent over CTB.
      Add support for this reply and try resending the request again as a
      new CTB message.
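
The resend behaviour described above can be sketched as a small retry loop. Everything here is hypothetical: the `sketch_` names, the reply enum, the callback-based transport, the retry cap, and the error codes are assumptions for illustration and are not taken from the driver.

```c
#include <errno.h>

/* Hypothetical reply classes; the real protocol has its own message types. */
enum sketch_reply {
	SKETCH_REPLY_OK,
	SKETCH_REPLY_RETRY, /* stands in for NO_RESPONSE_RETRY */
	SKETCH_REPLY_FAIL,
};

/* Hypothetical transport hook: sends one fresh message, returns the reply. */
typedef enum sketch_reply (*sketch_send_fn)(void *ctx);

/*
 * Send a request, resending it as a new message each time the firmware
 * asks for a retry. The cap on attempts is an assumption of this sketch.
 */
static int sketch_send_with_retry(sketch_send_fn send, void *ctx, int max_tries)
{
	int i;

	for (i = 0; i < max_tries; i++) {
		enum sketch_reply reply = send(ctx);

		if (reply == SKETCH_REPLY_OK)
			return 0;
		if (reply == SKETCH_REPLY_FAIL)
			return -EIO;
		/* SKETCH_REPLY_RETRY: loop around and send again. */
	}
	return -ETIMEDOUT;
}

/* Fake transport for demonstration: asks for *ctx retries, then succeeds. */
static enum sketch_reply sketch_fake_send(void *ctx)
{
	int *retries_left = ctx;

	if (*retries_left > 0) {
		(*retries_left)--;
		return SKETCH_REPLY_RETRY;
	}
	return SKETCH_REPLY_OK;
}
```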
      
      A KLV (key-length-value) mechanism is now used for passing
      configuration data such as CTB management.
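
A key-length-value stream of this kind can be sketched as follows. The layout used here (a 32-bit header with the key in the upper 16 bits and the value length in dwords in the lower 16 bits, followed by the value dwords) is an assumption of this example, and the `sketch_` helpers and key numbers are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed header layout: key in bits 31:16, value length (dwords) in 15:0. */
#define SKETCH_KLV_HDR(key, len) \
	(((uint32_t)(key) << 16) | ((uint32_t)(len) & 0xffffu))

/*
 * Write one KLV at buf[pos]. Returns the number of dwords written
 * (header plus value), or 0 if the entry does not fit in cap dwords.
 */
static size_t sketch_klv_put(uint32_t *buf, size_t cap, size_t pos,
			     uint16_t key, const uint32_t *val, uint16_t len)
{
	size_t i;

	if (pos > cap || cap - pos < (size_t)len + 1)
		return 0;
	buf[pos] = SKETCH_KLV_HDR(key, len);
	for (i = 0; i < len; i++)
		buf[pos + 1 + i] = val[i];
	return (size_t)len + 1;
}

/*
 * Walk n dwords of KLVs looking for key. Returns a pointer to the value
 * and stores its length, or NULL if the key is absent or data is truncated.
 */
static const uint32_t *sketch_klv_find(const uint32_t *buf, size_t n,
				       uint16_t key, uint16_t *len)
{
	size_t pos = 0;

	while (pos < n) {
		uint16_t k = (uint16_t)(buf[pos] >> 16);
		uint16_t l = (uint16_t)(buf[pos] & 0xffffu);

		if (n - pos - 1 < l)
			return NULL; /* value runs past the end of the buffer */
		if (k == key) {
			*len = l;
			return &buf[pos + 1];
		}
		pos += (size_t)l + 1; /* skip header and value, unknown keys too */
	}
	return NULL;
}
```

Because every entry carries its own length, a reader can skip keys it does not recognise, which is what makes this style of interface easy to extend.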
      
      With the new KLV scheme, the old CTB management actions are no longer
      used and are removed.
      
      Register capture on hang is now supported by GuC. Full i915 support
      for this will be added by a later patch. A minimum support of
      providing capture memory and register lists is required though, so add
      that in.
      
      The device id of the current platform needs to be provided at init time.
      
      The 'poll CS' w/a (Wa_22012773006) was blanket enabled by previous
      versions of GuC. It must now be explicitly requested by the KMD. So,
      add in the code to turn it on when relevant.
      
      The GuC log entry format has changed. This requires adding a new field
      to the log header structure to mark the wrap point at the end of the
      buffer (as the buffer size is no longer a multiple of the log entry
      size).
      
      New CTB notification messages are now sent for some things that were
      previously only sent via MMIO notifications.
      
      Of these, the crash dump notification was not really being handled by
      i915. It called the log flush code but that only flushed the regular
      debug log and then only if relay logging was enabled. So just report
      an error message instead.
      
      The 'exception' notification was just being ignored completely. So add
      an error message for that as well.
      
      Note that in either the crash dump or the exception case, the GuC is
      basically dead. The KMD will detect this via the heartbeat and trigger
      both an error log (which will include the crash dump as part of the
      GuC log) and a GT reset. So no other processing is really required.
      
      Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
      Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
      Reviewed-by: Matthew Brost <matthew.brost@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20220107000622.292081-3-John.C.Harrison@Intel.com
  5. 29 Oct, 2021 1 commit
    • drm/i915/pmu: Connect engine busyness stats from GuC to pmu · 77cdd054
      Umesh Nerlige Ramappa committed
      With GuC handling scheduling, i915 is not aware of the time that a
      context is scheduled in and out of the engine. Since i915 pmu relies on
      this info to provide engine busyness to the user, GuC shares this info
      with i915 for all engines using shared memory. For each engine, this
      info contains:
      
      - total busyness: total time that the context was running (total)
      - id: id of the running context (id)
      - start timestamp: timestamp when the context started running (start)
      
      At the time (now) of sampling the engine busyness, if the id is valid
      (!= ~0), and start is non-zero, then the context is considered to be
      active and the engine busyness is calculated using the below equation
      
      	engine busyness = total + (now - start)
      
      All times are obtained from the gt clock base. For inactive contexts,
      engine busyness is just equal to the total.
      
      The start and total values provided by GuC are 32 bits and wrap around
      in a few minutes. Since perf pmu provides busyness as 64 bit
      monotonically increasing values, there is a need for this implementation
      to account for overflows and extend the time to 64 bits before returning
      busyness to the user. In order to do that, a worker runs periodically at
      frequency = 1/8th the time it takes for the timestamp to wrap. As an
      example, that would be once in 27 seconds for a gt clock frequency of
      19.2 MHz.
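
The busyness equation and the 32-bit to 64-bit extension can be sketched as below. This is an illustration under stated assumptions, not the driver's implementation: the `sketch_` names and the state struct are hypothetical, and locking, reset handling, and the actual worker are left out.

```c
#include <stdint.h>

#define SKETCH_INVALID_ID 0xffffffffu /* the "~0" id meaning no context */

/* Hypothetical per-engine state kept on the host side. */
struct sketch_busy_state {
	uint64_t total64;    /* total busyness extended to 64 bits */
	uint32_t last_total; /* last 32-bit total read from shared memory */
};

/*
 * Fold a new 32-bit total into the 64-bit one. Unsigned subtraction gives
 * the right delta across a single wrap, which is why the periodic worker
 * has to run well within one wrap period.
 */
static void sketch_update_total(struct sketch_busy_state *st, uint32_t total)
{
	st->total64 += (uint32_t)(total - st->last_total);
	st->last_total = total;
}

/* engine busyness = total + (now - start) if a context is active. */
static uint64_t sketch_busyness(const struct sketch_busy_state *st,
				uint32_t id, uint32_t start, uint32_t now)
{
	uint64_t busy = st->total64;

	if (id != SKETCH_INVALID_ID && start != 0)
		busy += (uint32_t)(now - start);
	return busy;
}

/* Worker period in whole seconds: 1/8th of the 32-bit wrap time. */
static uint64_t sketch_ping_period_s(uint64_t gt_clock_hz)
{
	return ((UINT64_C(1) << 32) / gt_clock_hz) / 8;
}
```

At 19.2 MHz the 32-bit counter wraps after roughly 223 seconds, so `sketch_ping_period_s(19200000)` evaluates to 27, matching the figure quoted above.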
      
      Note:
      There might be an over-accounting of busyness due to the fact that GuC
      may be updating the total and start values while kmd is reading them.
      (i.e. kmd may read the updated total and the stale start). In such a
      case, the user may see a higher busyness value followed by smaller ones,
      which would eventually catch up to the higher value.
      
      v2: (Tvrtko)
      - Include details in commit message
      - Move intel engine busyness function into execlist code
      - Use union inside engine->stats
      - Use natural type for ping delay jiffies
      - Drop active_work condition checks
      - Use for_each_engine if iterating all engines
      - Drop seq locking, use spinlock at GuC level to update engine stats
      - Document worker specific details
      
      v3: (Tvrtko/Umesh)
      - Demarcate GuC and execlist stat objects with comments
      - Document known over-accounting issue in commit
      - Provide a consistent view of GuC state
      - Add hooks to gt park/unpark for GuC busyness
      - Stop/start worker in gt park/unpark path
      - Drop inline
      - Move spinlock and worker inits to GuC initialization
      - Drop helpers that are called only once
      
      v4: (Tvrtko/Matt/Umesh)
      - Drop addressed opens from commit message
      - Get runtime pm in ping, remove from the park path
      - Use cancel_delayed_work_sync in disable_submission path
      - Update stats during reset prepare
      - Skip ping if reset in progress
      - Explicitly name execlists and GuC stats objects
      - Since disable_submission is called from many places, move resetting
        stats to intel_guc_submission_reset_prepare
      
      v5: (Tvrtko)
      - Add a trylock helper that does not sleep and synchronize PMU event
        callbacks and worker with gt reset
      
      v6: (CI BAT failures)
      - DUTs using execlist submission failed to boot since __gt_unpark is
        called during i915 load. This ends up calling the GuC busyness unpark
        hook and results in kick-starting an uninitialized worker. Let
        park/unpark hooks check if GuC submission has been initialized.
      - drop cant_sleep() from trylock helper since rcu_read_lock takes care
        of that.
      
      v7: (CI) Fix igt@i915_selftest@live@gt_engines
      - For GuC mode of submission the engine busyness is derived from gt time
        domain. Use gt time elapsed as reference in the selftest.
      - Increase busyness calculation to 10ms duration to ensure batch runs
        longer and falls within the busyness tolerances in selftest.
      
      v8:
      - Use ktime_get in selftest as before
      - intel_reset_trylock_no_wait results in a lockdep splat that is not
        trivial to fix since the PMU callback runs in irq context and the
        reset paths are tightly knit into the driver. The test that uncovers
        this is igt@perf_pmu@faulting-read. Drop intel_reset_trylock_no_wait,
        instead use the reset_count to synchronize with gt reset during pmu
        callback. For the ping, continue to use intel_reset_trylock since ping
        is not run in irq context.
      
      - GuC PM timestamp does not tick when GuC is idle. This can potentially
        result in wrong busyness values when a context is active on the
        engine, but GuC is idle. Use the RING TIMESTAMP as GPU timestamp to
        process the GuC busyness stats. This works since both GuC timestamp and
        RING timestamp are synced with the same clock.
      
      - The busyness stats may get updated after the batch starts running.
        This delay causes the busyness reported for 100us duration to fall
        below 95% in the selftest. The only option at this time is to wait for
        GuC busyness to change from idle to active before we sample busyness
        over a 100us period.
      
      Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
      Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
      Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Matthew Brost <matthew.brost@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20211027004821.66097-2-umesh.nerlige.ramappa@intel.com
  6. 16 Oct, 2021 1 commit
  7. 14 Sep, 2021 1 commit
  8. 28 Jul, 2021 6 commits
  9. 19 Jun, 2021 1 commit
  10. 06 Jun, 2021 1 commit
  11. 04 Jun, 2021 1 commit
  12. 21 Dec, 2020 1 commit
  13. 10 Dec, 2020 1 commit
  14. 29 Oct, 2020 2 commits
  15. 09 Jul, 2020 3 commits
  16. 10 Dec, 2019 1 commit
  17. 06 Dec, 2019 1 commit
  18. 12 Aug, 2019 1 commit
  19. 14 Jul, 2019 2 commits
  20. 01 Jul, 2019 1 commit
  21. 28 May, 2019 2 commits
  22. 08 Mar, 2019 2 commits
  23. 06 Mar, 2019 1 commit
  24. 24 Jul, 2018 1 commit
  25. 30 Apr, 2018 1 commit