1. 08 Feb, 2022 2 commits
    • drm/amdkfd: CRIU Implement KFD process_info ioctl · f185381b
      Rajneesh Bhardwaj committed
      This IOCTL op is expected to be called as a precursor to the actual
      Checkpoint operation. This does the basic discovery into the target
      process seized by CRIU and relays the information to the userspace that
      utilizes it to start the Checkpoint operation via another dedicated
      IOCTL op.
      
      The process_info IOCTL op determines the number of GPUs and buffer
      objects associated with the target process, and its process id in the
      caller's namespace. The pid is needed because the /proc/pid/mem
      interface may be used to drain the contents of the discovered buffer
      objects in userspace, and getpid returns the pid of the CRIU dumper
      process. Also, the pid of a process inside a container might differ
      from its global pid, so return the ns pid.
      Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
      Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
      Signed-off-by: David Yat Sin <david.yatsin@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
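The discovery step described in this commit message can be modelled in plain C. This is a minimal sketch and not the driver code: `mock_device`, `mock_process`, `process_info` and `get_process_info` are hypothetical names invented for illustration, and the real ioctl argument layout is not shown in the commit message. It only illustrates the idea that the call counts GPUs and buffer objects and reports the namespace-local pid instead of the global one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for driver state; not the real KFD structures. */
struct mock_device {
    uint32_t num_bos;        /* buffer objects the process holds on this GPU */
};

struct mock_process {
    const struct mock_device *devices;
    size_t num_devices;
    uint32_t global_pid;     /* pid in the initial namespace */
    uint32_t ns_pid;         /* pid as seen from the caller's namespace */
};

/* Hypothetical result record filled in by the discovery step. */
struct process_info {
    uint32_t num_devices;
    uint32_t num_bos;
    uint32_t pid;            /* namespace-local pid of the target process */
};

/* Discovery only: count GPUs and buffer objects, report the ns pid. */
int get_process_info(const struct mock_process *p, struct process_info *out)
{
    assert(out != NULL);
    if (p == NULL || (p->num_devices > 0 && p->devices == NULL))
        return -1;

    out->num_devices = (uint32_t)p->num_devices;
    out->num_bos = 0;
    for (size_t i = 0; i < p->num_devices; i++)
        out->num_bos += p->devices[i].num_bos;

    /* Assumption: the caller wants the pid valid in its own namespace. */
    out->pid = p->ns_pid;
    return 0;
}
```

A caller in this model would use `num_devices` and `num_bos` to size its buffers before issuing the later checkpoint call, and would use `pid` to open the target's /proc/pid/mem.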
    • drm/amdkfd: CRIU Introduce Checkpoint-Restore APIs · 36988070
      Rajneesh Bhardwaj committed
      Checkpoint-Restore in userspace (CRIU) is a powerful tool that can
      snapshot a running process and later restore it on the same or a
      remote machine. It expects that processes with an associated device
      file (e.g. GPU) have the necessary driver support to assist CRIU and
      its extensible plugin interface. Thus, in order to support the
      Checkpoint-Restore of any ROCm process, the AMD Radeon Open Compute
      Kernel driver needs to provide a set of new APIs that provide the
      necessary VRAM metadata and its contents to a userspace component
      (CRIU plugin) that can store it in the form of image files.
      
      This introduces some new ioctls which will be used to checkpoint and
      restore any KFD-bound user process. KFD only allows ioctl calls from
      the same process that opened the KFD file descriptor. These ioctls
      are expected to be called from a KFD CRIU plugin, which has elevated
      ptrace-attached privileges and CAP_CHECKPOINT_RESTORE capabilities
      attached with the file descriptors, so modify KFD to allow such calls.
      
      (API redesigned by David Yat Sin)
      Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
      Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
      Signed-off-by: David Yat Sin <david.yatsin@amd.com>
      Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
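The access rule described in this commit message (same-process callers are always allowed, while a CRIU plugin is allowed only for the new ioctls) can be sketched as a small decision function. This is a userspace model under stated assumptions, not the kernel implementation: `caller`, `kfd_file` and `check_ioctl_access` are hypothetical names, and the order of the checks and the choice of `EACCES` and `EBADF` as error codes are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of the calling process. */
struct caller {
    int tgid;                         /* thread-group id of the caller */
    bool ptrace_attached_to_target;   /* caller has ptrace-seized the owner */
    bool has_cap_checkpoint_restore;  /* caller holds CAP_CHECKPOINT_RESTORE */
};

/* Hypothetical description of an open KFD file descriptor. */
struct kfd_file {
    int owner_tgid;                   /* process that opened the descriptor */
};

/* Returns 0 if the call may proceed, a negative errno value otherwise. */
int check_ioctl_access(const struct kfd_file *f, const struct caller *c,
                       bool is_criu_op)
{
    assert(f != NULL && c != NULL);

    /* Assumption: checkpoint-restore ops always require the capability. */
    if (is_criu_op && !c->has_cap_checkpoint_restore)
        return -EACCES;

    /* The process that opened the descriptor may always use it. */
    if (c->tgid == f->owner_tgid)
        return 0;

    /* A foreign caller is accepted only for CRIU ops while ptrace-attached. */
    if (is_criu_op && c->ptrace_attached_to_target)
        return 0;

    return -EBADF;
}
```

In this model a plugin that is ptrace-attached but calls a regular ioctl is still rejected, which keeps the exception as narrow as the commit message suggests.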
  2. 28 Jan, 2022 2 commits
  3. 20 Jan, 2022 1 commit
  4. 18 Nov, 2021 7 commits
  5. 29 Oct, 2021 2 commits
  6. 14 Oct, 2021 3 commits
  7. 03 Aug, 2021 4 commits
  8. 29 Jul, 2021 3 commits
  9. 13 Jul, 2021 6 commits
  10. 16 Jun, 2021 1 commit
    • drm/amdkfd: Disable SVM per GPU, not per process · 5a75ea56
      Felix Kuehling committed
      When some GPUs don't support SVM, don't disable it for the entire process.
      That would be inconsistent with the information the process got from the
      topology, which indicates SVM support per GPU.
      
      Instead disable SVM support only for the unsupported GPUs. This is done
      by checking any per-device attributes against the bitmap of supported
      GPUs. Also use the supported GPU bitmap to initialize access bitmaps for
      new SVM address ranges.
      
      Don't handle recoverable page faults from unsupported GPUs. (I don't
      think there will be unsupported GPUs that can generate recoverable page
      faults. But better safe than sorry.)
      Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
      Reviewed-by: Philip Yang <philip.yang@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
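The per-GPU approach in this commit message can be illustrated with a plain bitmap model. This is a sketch, not the driver code: `svm_process`, `svm_range`, the function names and the 32-GPU limit are all hypothetical, and copying the supported bitmap verbatim into a new range is an assumption about what "initialize" means here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_GPUS 32u   /* illustrative limit for this sketch */

/* Hypothetical per-process state: bit i set means GPU i supports SVM. */
struct svm_process {
    uint32_t supported_gpus;
};

/* Hypothetical address range: bit i set means GPU i may access it. */
struct svm_range {
    uint32_t access;
};

bool svm_gpu_supported(const struct svm_process *p, unsigned gpu_idx)
{
    assert(p != NULL);
    return gpu_idx < MAX_GPUS && ((p->supported_gpus >> gpu_idx) & 1u) != 0;
}

/* A per-device attribute is rejected only for the GPU lacking support. */
int svm_range_set_access(const struct svm_process *p, struct svm_range *r,
                         unsigned gpu_idx)
{
    if (!svm_gpu_supported(p, gpu_idx))
        return -1;
    r->access |= 1u << gpu_idx;
    return 0;
}

/* Assumption: a new range starts out with the supported-GPU bitmap. */
void svm_range_init(const struct svm_process *p, struct svm_range *r)
{
    r->access = p->supported_gpus;
}

/* Recoverable page faults from unsupported GPUs are not handled. */
bool svm_should_handle_fault(const struct svm_process *p, unsigned gpu_idx)
{
    return svm_gpu_supported(p, gpu_idx);
}
```

With a supported bitmap of `0x5` (GPUs 0 and 2), a request naming GPU 1 fails on its own while requests for GPUs 0 and 2 keep working, which is the behaviour the commit message contrasts with disabling SVM for the whole process.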
  11. 12 Jun, 2021 1 commit
  12. 05 Jun, 2021 3 commits
  13. 21 Apr, 2021 5 commits