1. 10 Nov 2021, 1 commit
  2. 06 Nov 2021, 1 commit
    •
      drm/amdkfd: avoid recursive lock in migrations back to RAM · a6283010
      Alex Sierra committed
      [Why]:
      When we call hmm_range_fault to map memory after a migration, we don't
      expect memory to be migrated again as a result of hmm_range_fault. The
      driver ensures that all memory is in GPU-accessible locations so that
      no migration should be needed. However, there is one corner case where
      hmm_range_fault can unexpectedly cause a migration from DEVICE_PRIVATE
      back to system memory due to a write-fault when a system memory page in
      the same range was mapped read-only (e.g. COW). Ranges with individual
      pages in different locations are usually the result of failed page
      migrations (e.g. page lock contention). The unexpected migration back
      to system memory causes a deadlock from recursive locking in our
      driver.
      
      [How]:
      Add a new task reference member to the svm_range_list struct and
      set it to the "current" task right before hmm_range_fault is
      called. The svm_migrate_to_ram callback checks this member
      against "current"; if they match, the migration is ignored (see
      the sketch after this entry).
      Signed-off-by: Alex Sierra <alex.sierra@amd.com>
      Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
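      Below is a minimal sketch of that mechanism in kernel-style C.
      The field name faulting_task, the helper svm_range_map, and the
      simplified signatures are assumptions for illustration; the
      actual amdkfd code is more involved.

          #include <linux/sched.h>    /* struct task_struct, current */
          #include <linux/mm_types.h> /* vm_fault_t, struct vm_fault */
          #include <linux/hmm.h>      /* hmm_range_fault() */

          /* Per-process SVM state: remember which task is running
           * hmm_range_fault on the driver's behalf. */
          struct svm_range_list {
                  /* ... existing members ... */
                  struct task_struct *faulting_task; /* assumed name */
          };

          /* Record the current task around the hmm_range_fault call. */
          static int svm_range_map(struct svm_range_list *svms,
                                   struct hmm_range *range)
          {
                  int r;

                  svms->faulting_task = current;
                  r = hmm_range_fault(range);
                  svms->faulting_task = NULL;
                  return r;
          }

          /* CPU fault callback for device-private pages: if the fault
           * came from our own hmm_range_fault call, skip the migration
           * to avoid taking the same driver locks recursively. */
          static vm_fault_t svm_migrate_to_ram(struct vm_fault *vmf,
                                               struct svm_range_list *svms)
          {
                  if (svms->faulting_task == current)
                          return 0; /* ignore the recursive fault */
                  /* ... normal migration back to system memory ... */
                  return 0;
          }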
  3. 29 Oct 2021, 3 commits
  4. 16 Sep 2021, 1 commit
  5. 15 Sep 2021, 1 commit
  6. 06 Aug 2021, 1 commit
  7. 23 Jul 2021, 1 commit
    •
      drm/amdkfd: Fix a concurrency issue during kfd recovery · 4f942aae
      Oak Zeng committed
      start_cpsch and stop_cpsch can be called during kfd device
      initialization or during gpu reset/recovery, so they can run
      concurrently. Currently in start_cpsch and stop_cpsch, pm_init
      and pm_uninit are not protected by the dqm lock. Imagine a case
      where a user submits a pm4 packet through the packet manager to
      hang hws (i.e. through the command
      cat /sys/class/kfd/kfd/topology/nodes/1/gpu_id | sudo tee
      /sys/kernel/debug/kfd/hang_hws) while the kfd device is under
      reset/recovery, so the packet manager may not be initialized.
      That leads to an unpredictable protection fault.
      
      This patch moves pm_init/uninit inside the dqm lock and checks
      that the packet manager is initialized before using packet
      manager functions (see the locking sketch after this entry).
      Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
      Acked-by: Christian König <christian.koenig@amd.com>
      Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
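      Below is a simplified sketch of the locking change in kernel-style
      C. The flag pm_initialized, the hang_hws helper, and the
      single-argument pm_init/pm_uninit signatures are assumptions for
      illustration; the real driver keeps this state inside its packet
      manager and device queue manager structures.

          #include <linux/mutex.h>
          #include <linux/errno.h>

          struct device_queue_manager {
                  struct mutex lock;      /* the dqm lock */
                  bool pm_initialized;    /* assumed flag */
                  /* ... packet manager and queue state ... */
          };

          int pm_init(struct device_queue_manager *dqm);    /* existing */
          void pm_uninit(struct device_queue_manager *dqm); /* existing */

          static int start_cpsch(struct device_queue_manager *dqm)
          {
                  int r;

                  mutex_lock(&dqm->lock);
                  r = pm_init(dqm);       /* now inside the dqm lock */
                  if (!r)
                          dqm->pm_initialized = true;
                  mutex_unlock(&dqm->lock);
                  return r;
          }

          static void stop_cpsch(struct device_queue_manager *dqm)
          {
                  mutex_lock(&dqm->lock);
                  pm_uninit(dqm);         /* now inside the dqm lock */
                  dqm->pm_initialized = false;
                  mutex_unlock(&dqm->lock);
          }

          /* debugfs hang_hws path: refuse to use the packet manager
           * while it is torn down during reset/recovery. */
          static int hang_hws(struct device_queue_manager *dqm)
          {
                  int r = -ENODEV;

                  mutex_lock(&dqm->lock);
                  if (dqm->pm_initialized)
                          r = 0; /* ... submit the pm4 packet ... */
                  mutex_unlock(&dqm->lock);
                  return r;
          }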
  8. 30 Jun 2021, 1 commit
  9. 16 Jun 2021, 1 commit
    •
      drm/amdkfd: Disable SVM per GPU, not per process · 5a75ea56
      Felix Kuehling committed
      When some GPUs don't support SVM, don't disable it for the entire
      process. That would be inconsistent with the information the
      process got from the topology, which indicates SVM support per
      GPU.
      
      Instead disable SVM support only for the unsupported GPUs. This is
      done by checking any per-device attributes against the bitmap of
      supported GPUs. Also use the supported GPU bitmap to initialize
      access bitmaps for new SVM address ranges (see the bitmap sketch
      after this entry).
      
      Don't handle recoverable page faults from unsupported GPUs. (I don't
      think there will be unsupported GPUs that can generate recoverable page
      faults. But better safe than sorry.)
      Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
      Reviewed-by: Philip Yang <philip.yang@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
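      Below is a small sketch of the bitmap approach in kernel-style C.
      The structure and function names, and the MAX_GPU_INSTANCE limit,
      are assumptions for illustration; the real driver stores the
      bitmap in its per-process SVM state.

          #include <linux/bitmap.h>
          #include <linux/bitops.h>
          #include <linux/errno.h>

          #define MAX_GPU_INSTANCE 64 /* illustrative limit */

          /* Per-process SVM state: one bit per GPU that supports SVM. */
          struct svm_process_info {
                  DECLARE_BITMAP(svm_supported, MAX_GPU_INSTANCE);
          };

          /* Reject per-device attributes that name an unsupported GPU,
           * instead of disabling SVM for the whole process. */
          static int svm_check_attr_gpu(struct svm_process_info *p,
                                        unsigned int gpu_idx)
          {
                  if (gpu_idx >= MAX_GPU_INSTANCE ||
                      !test_bit(gpu_idx, p->svm_supported))
                          return -EINVAL;
                  return 0;
          }

          /* New SVM ranges start out accessible only on supported GPUs. */
          static void svm_range_init_access(struct svm_process_info *p,
                                            unsigned long *access_bitmap)
          {
                  bitmap_copy(access_bitmap, p->svm_supported,
                              MAX_GPU_INSTANCE);
          }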
  10. 05 Jun 2021, 1 commit
  11. 20 May 2021, 1 commit
  12. 24 Apr 2021, 1 commit
  13. 21 Apr 2021, 9 commits
  14. 10 Apr 2021, 2 commits
  15. 01 Apr 2021, 1 commit
  16. 24 Mar 2021, 1 commit
  17. 06 Mar 2021, 1 commit
  18. 30 Oct 2020, 1 commit
  19. 01 Oct 2020, 1 commit
  20. 26 Sep 2020, 1 commit
  21. 23 Sep 2020, 1 commit
  22. 18 Sep 2020, 2 commits
  23. 01 Sep 2020, 1 commit
  24. 27 Aug 2020, 1 commit
    •
      drm/amdkfd: implement the dGPU fallback path for apu (v6) · 6127896f
      Huang Rui committed
      We still have a few IOMMU issues that need to be addressed, so
      force raven onto the "dgpu" path for the moment.
      
      This adds a fallback path to bypass the IOMMU if IOMMU v2 is
      disabled or the ACPI CRAT table is not correct (see the sketch
      after this entry).
      
      v2: Use the ignore_crat parameter to decide whether to go with
          IOMMUv2.
      v3: Align with the existing thunk; don't change the behavior of
          raven, only renoir will use the "dgpu" path by default.
      v4: Don't update the global ignore_crat in the driver, and revise
          the fallback function for the case where the CRAT is broken.
      v5: Refine the case where the ACPI CRAT is good but there is no
          IOMMU support, and rename the title.
      v6: Fix the issue of the dGPU being initialized first; just
          modify the reported value in node_show().
      Signed-off-by: Huang Rui <ray.huang@amd.com>
      Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
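      Below is a small sketch of the fallback policy in kernel-style C.
      The function name and boolean inputs are assumptions for
      illustration; the real driver derives them from the ignore_crat
      module parameter, the IOMMU v2 probe, and CRAT parsing.

          #include <linux/types.h>

          /* An APU falls back to the "dgpu" path (bypassing the IOMMU)
           * when the user asked to ignore the CRAT, IOMMU v2 is
           * unavailable, or the ACPI CRAT table cannot be trusted. */
          static bool kfd_use_dgpu_path(bool ignore_crat_param,
                                        bool iommu_v2_supported,
                                        bool crat_table_valid)
          {
                  if (ignore_crat_param)
                          return true;      /* user override */
                  if (!iommu_v2_supported)
                          return true;      /* no IOMMU v2: bypass it */
                  return !crat_table_valid; /* broken CRAT: bypass it */
          }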
  25. 16 Jul 2020, 2 commits
  26. 01 Jul 2020, 2 commits