- 30 June 2021, 1 commit
Submitted by Philip Yang
This is part of the SVM profiling API: export per-process, per-GPU sysfs counters for VM retry faults and for pages migrated into and out of GPU VRAM. The counters are not updated in parallel by the GPU retry fault handler and the VRAM/RAM migration paths, so READ_ONCE is used on the reader side to avoid compiler optimization.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
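A minimal sketch of the pattern described above, with hypothetical names (not the driver's actual symbols): the single writer updates the counter with a plain store, and the sysfs reader wraps the load in READ_ONCE() so the compiler cannot elide, cache, or re-read it while formatting.

```c
#include <linux/compiler.h>
#include <linux/sysfs.h>
#include <linux/types.h>

struct svm_counters {
	u64 faults;	/* GPU retry faults handled */
	u64 pages_in;	/* pages migrated into VRAM */
	u64 pages_out;	/* pages migrated out of VRAM */
};

/* Writer side: the fault and migration paths never run concurrently
 * for the same counter, so a plain increment is sufficient. */
static void svm_account_fault(struct svm_counters *c)
{
	c->faults++;
}

/* Reader side: READ_ONCE() forces a single load with no lock taken
 * against the writer. */
static ssize_t svm_faults_show(struct svm_counters *c, char *buf)
{
	return sysfs_emit(buf, "%llu\n", READ_ONCE(c->faults));
}
```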
-
- 16 June 2021, 1 commit
Submitted by Felix Kuehling
When some GPUs don't support SVM, don't disable it for the entire process. That would be inconsistent with the information the process got from the topology, which indicates SVM support per GPU. Instead, disable SVM support only for the unsupported GPUs. This is done by checking any per-device attributes against the bitmap of supported GPUs. Also use the supported-GPU bitmap to initialize access bitmaps for new SVM address ranges. Don't handle recoverable page faults from unsupported GPUs. (I don't think there will be unsupported GPUs that can generate recoverable page faults. But better safe than sorry.)

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Yang <philip.yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
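A sketch of the two checks described; the bitmap field and helper names are illustrative, not the exact upstream symbols:

```c
#include <linux/bitmap.h>
#include <linux/errno.h>
#include <linux/types.h>

#define MAX_GPU_INSTANCES 64

/* Reject per-device attributes that target a GPU without SVM support. */
static int svm_validate_gpuidx(const unsigned long *svm_supported, u32 gpuidx)
{
	if (gpuidx >= MAX_GPU_INSTANCES || !test_bit(gpuidx, svm_supported))
		return -EINVAL;
	return 0;
}

/* New ranges start out accessible only on the SVM-capable GPUs. */
static void svm_range_init_access(unsigned long *access,
				  const unsigned long *svm_supported)
{
	bitmap_copy(access, svm_supported, MAX_GPU_INSTANCES);
}
```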
-
- 05 June 2021, 1 commit
Submitted by Eric Huang
Provide more TLB flush type options for different scenarios.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 20 May 2021, 1 commit
Submitted by Dennis Li
User applications may register for the KFD_EVENT_TYPE_HW_EXCEPTION and KFD_EVENT_TYPE_MEMORY events, so the driver can notify them when poisoned data is consumed. Besides that, some applications may register a SIGBUS signal handler. These applications handle poisoned data by themselves, exiting or re-creating their context to re-dispatch their work.

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 24 April 2021, 1 commit
Submitted by Jonathan Kim
In order to support multi-process debugging, the HWS PM4 packet MAP_PROCESS requires an extension of 5 DWORDs to support targeting of per-VMID SPI debug control registers as well as per-process watch points.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 21 April 2021, 9 commits
Submitted by Felix Kuehling
With XNACK on, the GPU VM fault handler decides the best restore location, then migrates the range to that location and updates the GPU mapping to recover from the GPU VM fault.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Submitted by Philip Yang
Register VRAM as a MEMORY_DEVICE_PRIVATE type resource, so VRAM backing pages can be allocated for page migration.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Submitted by Alex Sierra
XNACK mode controls the SQ RETRY_DISABLE setting that determines whether recoverable page faults can be supported on GFXv9 hardware. Only on Aldebaran can we support different processes running with different XNACK modes; on older chips all processes must use the same RETRY_DISABLE setting. However, processes not relying on recoverable page faults can work with RETRY enabled. This means XNACK off is always available as a fallback, so we can use the same mode on all GPUs in a process.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
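The fallback rule can be condensed into a small sketch; the helper and its predicate arguments are hypothetical stand-ins for the real ASIC checks, not the driver's actual code:

```c
#include <linux/types.h>

/*
 * XNACK off (RETRY disabled) is always safe. XNACK on per process
 * needs per-VMID RETRY control (Aldebaran); on older chips it only
 * works if every GPU in the process already runs with RETRY enabled.
 */
static bool xnack_mode_allowed(bool want_xnack_on, bool is_aldebaran,
			       bool all_gpus_retry_enabled)
{
	if (!want_xnack_on)
		return true;
	if (is_aldebaran)
		return true;
	return all_gpus_retry_enabled;
}
```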
-
Submitted by Felix Kuehling
The HMM interval notifier callback signals that the CPU page table will be updated. Stop the process queues if the updated address belongs to an SVM range registered in the process's svms object tree, and schedule restore work to update the GPU page table using the new page addresses in the updated SVM range. The restore worker flushes any deferred work to make sure it restores an up-to-date svm_range_list.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Submitted by Philip Yang
When the application explicitly calls unmap, or unmap happens from mmput when the application exits, the driver receives the MMU_NOTIFY_UNMAP event to first remove the SVM range from the process's svms object tree and list, then unmap it from GPUs (in the following patch). Split SVM ranges to handle partial unmapping of a range. To avoid deadlocks, updating MMU notifiers, range lists and interval trees is done in a deferred worker. New child ranges are attached to their parent range's child_list until the worker can update the svm_range_list. svm_range_set_attr flushes deferred work and takes the mmap_write_lock to guarantee that it has an up-to-date svm_range_list.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Submitted by Philip Yang
The svm range structure stores the range start address, size, attributes, flags, prefetch location, and a GPU bitmap indicating which GPUs the range maps to. The same virtual address is shared by the CPU and GPUs. Each process has an SVM range list that uses both an interval tree and a list to store all SVM ranges registered by the process. The interval tree is used by the GPU VM fault handler and the CPU page fault handler to find the svm range structure for a specific address; the list is used to scan all ranges in the eviction restore work. No overlapping [start, last] intervals exist in the svms object interval tree: if a process registers a new range that overlaps an old range, the old range is split into two ranges, depending on whether the overlap is at the head or the tail of the old range. Apply the attributes (preferred location, prefetch location, mapping flags, migration granularity) to the SVM range, and store the mapping GPU index in the bitmap.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
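The fault-handler lookup path can be illustrated with the kernel's generic interval-tree API; the container type and field names below are illustrative, only the interval_tree_* calls are the real kernel API:

```c
#include <linux/interval_tree.h>
#include <linux/kernel.h>
#include <linux/list.h>

struct svm_range {
	struct interval_tree_node it_node;	/* [start, last] interval */
	struct list_head list;			/* for eviction/restore scan */
	/* attributes, flags, prefetch location, GPU bitmap, ... */
};

/* Find the (at most one, since intervals never overlap) range
 * covering 'addr', as a GPU or CPU page fault handler would. */
static struct svm_range *svm_range_from_addr(struct rb_root_cached *objects,
					     unsigned long addr)
{
	struct interval_tree_node *node;

	node = interval_tree_iter_first(objects, addr, addr);
	if (!node)
		return NULL;
	return container_of(node, struct svm_range, it_node);
}
```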
-
Submitted by Philip Yang
Add the SVM (shared virtual memory) ioctl data structure and API definition. The SVM ioctl API is designed to be extensible in the future. All operations are provided by a single ioctl to preserve ioctl number space. The arguments structure ends with a variable-size array of attributes that can be used to set or get one or multiple attributes.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
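The shape of such an extensible argument struct is sketched below: a fixed header plus a variable-size attribute trailer. The layout follows the description and the upstream kfd_ioctl.h as I recall it, but field names should be verified against the uapi header rather than taken from here:

```c
#include <linux/types.h>

struct kfd_ioctl_svm_attribute {
	__u32 type;	/* which attribute to set or get */
	__u32 value;	/* attribute value */
};

struct kfd_ioctl_svm_args {
	__u64 start_addr;	/* range start address */
	__u64 size;		/* range size in bytes */
	__u32 op;		/* single op code: set or get attributes */
	__u32 nattr;		/* number of entries in attrs[] */
	/* flexible-array trailer keeps the single ioctl extensible */
	struct kfd_ioctl_svm_attribute attrs[];
};
```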
-
Submitted by Alex Sierra
The SVM range uses a GPU bitmap to store which GPUs the range maps to. Applications pass the driver's gpu id to specify a GPU, so a helper is needed to convert a gpu id to a GPU bitmap index. Access goes through the kfd_process_device pointer array in kfd_process.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
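A sketch of such a helper, assuming the pdd array layout introduced by the per_device_list removal patch below; the struct fields shown are illustrative stand-ins for the real ones:

```c
#include <linux/errno.h>
#include <linux/types.h>

#define MAX_GPU_INSTANCES 64

struct kfd_process_device { u32 gpu_id; /* ... */ };

struct kfd_process {
	u32 n_pdds;
	struct kfd_process_device *pdds[MAX_GPU_INSTANCES];
};

/* Map a user-visible gpu_id to its index in the pdd array; that
 * index doubles as the bit position in per-range GPU bitmaps. */
static int gpuidx_from_gpuid(struct kfd_process *p, u32 gpu_id)
{
	u32 i;

	for (i = 0; i < p->n_pdds; i++)
		if (p->pdds[i] && p->pdds[i]->gpu_id == gpu_id)
			return i;
	return -EINVAL;
}
```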
-
Submitted by Felix Kuehling
amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu needs the drm_priv to allow mmap to access the BO through the corresponding file descriptor. The VM can also be extracted from drm_priv, so drm_priv can replace the vm parameter in the kfd2kgd interface.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Yang <philip.yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 10 April 2021, 2 commits
Submitted by Qu Huang
The amdgpu driver uses a 4-byte data type for the DQM fence memory and transmits the GPU address of the fence memory to the microcode through the query status PM4 message. However, the query status PM4 message definition and the microcode processing both operate on 8 bytes. The fence memory only allocates 4 bytes, but the microcode writes 8 bytes, so there is memory corruption.

Changes since v1:
* Change dqm->fence_addr to a u64 pointer to fix this issue; also fix up query_status and amdkfd_fence_wait_timeout to use a 64-bit fence value to make them consistent.

Signed-off-by: Qu Huang <jinsdb@126.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
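The mismatch can be shown in a two-line sketch: the allocation was sized for the old type while the firmware's QUERY_STATUS write is always 8 bytes wide (declarations here are illustrative, not the driver's actual code):

```c
#include <linux/types.h>

/* Before: a 4-byte fence, but the microcode writes 8 bytes through
 * the GPU address, corrupting the 4 bytes after the allocation. */
static unsigned int *fence_addr_old;	/* kzalloc(sizeof(unsigned int), ...) */

/* After: a 64-bit fence matches the PM4 QUERY_STATUS definition, and
 * query_status()/amdkfd_fence_wait_timeout() compare u64 values. */
static u64 *fence_addr;			/* kzalloc(sizeof(u64), ...) */
```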
-
Submitted by Alex Sierra
Remove per_device_list from kfd_process and replace it with a kfd_process_device pointer array of MAX_GPU_INSTANCES size. This helps to manage the kfd_process_devices bound to a specific kfd_process. Also, the functions used by kfd_chardev to iterate over the list were removed, since they are no longer valid; they are replaced by local loops iterating over the array.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 01 April 2021, 1 commit
Submitted by Qu Huang
The amdgpu driver uses a 4-byte data type for the DQM fence memory and transmits the GPU address of the fence memory to the microcode through the query status PM4 message. However, the query status PM4 message definition and the microcode processing both operate on 8 bytes. The fence memory only allocates 4 bytes, but the microcode writes 8 bytes, so there is memory corruption.

Changes since v1:
* Change dqm->fence_addr to a u64 pointer to fix this issue; also fix up query_status and amdkfd_fence_wait_timeout to use a 64-bit fence value to make them consistent.

Signed-off-by: Qu Huang <jinsdb@126.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
-
- 24 March 2021, 1 commit
Submitted by Oak Zeng
Keep the wavefront context for debug purposes.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 06 March 2021, 1 commit
Submitted by Jay Cornwall
The trap handler is set per process, per device and is unrelated to queue management. Move the implementation closer to the TMA setup code.

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 30 October 2020, 1 commit
Submitted by Kent Russell
Since the unique_id is now obtained in amdgpu in smu_late_init, topology misses getting the value during KFD device initialization. To work around this, we use amdgpu_amdkfd_get_unique_id to get the unique_id at read time. Due to this, we can remove unique_id from the kfd_dev structure, since we only need it in the KFD node properties struct.

Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 01 October 2020, 1 commit
Submitted by Ramesh Errabolu
Expose the number of compute units that are in use.

[Why] Allow the user to know how many compute units (CUs) are in use at any given moment.

[How] Surface files in sysfs that allow the user to determine the number of compute units in use for a given process. One sysfs file is used per device.

Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 26 September 2020, 1 commit
Submitted by Alex Deucher
This will allow us to have different defaults per ASIC in a future patch.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 23 September 2020, 1 commit
Submitted by Mukul Joshi
Move doorbell allocation for a process into the kfd device and allocate doorbell space in each PDD during process creation. Currently, KFD manages its own doorbell space, but for some devices amdgpu would allocate the complete doorbell space instead of leaving a chunk of it for KFD to manage. In a system with a mix of such devices, KFD would need to request process doorbell space based on the type of device, either from amdgpu or from its own doorbell space.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 18 September 2020, 2 commits
Submitted by Philip Cox
Add per-process eviction counters to sysfs to keep track of how many eviction events have happened for each process.

v2: rename the stats dir, and track all evictions per process, per device.
v3: simplify the stats kobject handling and cleanup.
v4: more code cleanup.

Signed-off-by: Philip Cox <Philip.Cox@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Submitted by Fenghua Yu
PASID is defined as a few different types in iommu, including "int", "u32", and "unsigned int". To be consistent and to match the uapi definitions, define PASID and its variations (e.g. max PASID) as "u32". "u32" is also shorter and a little more explicit than "unsigned int". There is no PASID type change in uapi, although it defines PASID as __u64 in some places.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
Link: https://lkml.kernel.org/r/1600187413-163670-2-git-send-email-fenghua.yu@intel.com
-
- 01 September 2020, 1 commit
Submitted by Mukul Joshi
Add support for reporting GPU reset events through SMI. KFD reports both pre- and post-GPU-reset events.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 27 August 2020, 1 commit
Submitted by Huang Rui
We still have a few IOMMU issues that need to be addressed, so force Raven onto the "dgpu" path for the moment. This adds the fallback path to bypass the IOMMU if IOMMU v2 is disabled or the ACPI CRAT table is not correct.

v2: use the ignore_crat parameter to decide whether to go with IOMMUv2.
v3: align with the existing Thunk; don't change the way Raven works, only Renoir will use the "dgpu" path by default.
v4: don't update the global ignore_crat in the driver, and revise the fallback function if CRAT is broken.
v5: refine the "ACPI CRAT good but no IOMMU support" case, and rename the title.
v6: fix the issue of the dGPU being initialized first; just modify the reported value in node_show().

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 16 July 2020, 2 commits
Submitted by Amber Lin
When the compute is malfunctioning or performance drops, the system admin can use the SMI (System Management Interface) tool to monitor and diagnose what went wrong. This patch provides an event watch interface for user space to register devices and subscribe to events they are interested in. After registering, the user can use the anonymous file descriptor's poll function with a wait time specified to wait for events to happen. Once an event happens, the user can use read() to retrieve information related to the event. The VM fault event is done in this patch.

v2:
- remove UNREGISTER and add event ENABLE/DISABLE
- correct kfifo usage
- move event message API to kfd_ioctl.h
v3: send the event msg as text rather than binary.
v4: support multiple clients.
v5: move events enablement from ioctl to fd write.
v6: sparse fix.

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Submitted by Rajneesh Bhardwaj
- Fix some styling issues
- Fixes for kernel-doc types

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 01 July 2020, 2 commits
Submitted by Felix Kuehling
Use WARN to print messages with a backtrace when evictions are triggered. This can help determine the root cause of evictions, help spot driver bugs triggering evictions unintentionally, or help with performance tuning by avoiding conditions that cause evictions in a specific workload. The messages are controlled by a new module parameter that can be changed at runtime:

echo Y > /sys/module/amdgpu/parameters/debug_evictions
echo N > /sys/module/amdgpu/parameters/debug_evictions

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
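A minimal sketch of how a runtime-tunable parameter wires into WARN; the macro name is hypothetical, while module_param with mode 0644 is what makes the sysfs echo above work:

```c
#include <linux/bug.h>
#include <linux/module.h>

static bool debug_evictions;
module_param(debug_evictions, bool, 0644);
MODULE_PARM_DESC(debug_evictions, "enable eviction debug messages (N = default)");

/* WARN() prints the message plus a backtrace only when the first
 * argument is true, so evictions stay silent unless enabled. */
#define kfd_debug_evict_warn(fmt, ...) \
	WARN(debug_evictions, fmt, ##__VA_ARGS__)
```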
-
Submitted by Lorenz Brun
The existing code used the major version number of the DRM driver instead of the device major number of the DRM subsystem when validating access against the devices cgroup. This meant that accesses allowed by the devices cgroup weren't permitted, and certain accesses denied by the devices cgroup were permitted (if they matched the wrong major device number).

Signed-off-by: Lorenz Brun <lorenz@brun.one>
Fixes: 6b855f7b ("drm/amdkfd: Check against device cgroup")
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 18 June 2020, 1 commit
Submitted by Lorenz Brun
The existing code used the major version number of the DRM driver instead of the device major number of the DRM subsystem when validating access against the devices cgroup. This meant that accesses allowed by the devices cgroup weren't permitted, and certain accesses denied by the devices cgroup were permitted (if they matched the wrong major device number).

Signed-off-by: Lorenz Brun <lorenz@brun.one>
Fixes: 6b855f7b ("drm/amdkfd: Check against device cgroup")
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
-
- 29 May 2020, 1 commit
Submitted by Mukul Joshi
Track SDMA usage on a per-process basis and report it through sysfs. The value in the sysfs file indicates the amount of time SDMA has been in use by this process since the creation of the process, with microsecond granularity.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 01 May 2020, 1 commit
Submitted by Mukul Joshi
Track GPU VRAM usage on a per-process basis and report it through sysfs.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 29 April 2020, 3 commits
Submitted by Joseph Greathouse
The current GWS usage model only allows a single GWS-enabled process to be active on the GPU at once. This ensures that a barrier-using kernel gets a known amount of GPU hardware, to prevent deadlock due to inability to proceed beyond the GWS barrier. The HWS watches how many GWS entries are assigned to each process and goes into over-subscription mode when two processes need more than the 64 that are available. The current KFD method for working with this is to allocate all 64 GWS entries to each GWS-capable process. When more than one GWS-enabled process is in the runlist, we must make sure the runlist is in over-subscription mode, so that the HWS gets a chained RUN_LIST packet and continues scheduling kernels.

Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Submitted by Joseph Greathouse
Rather than only enabling GWS support based on the hws_gws_support module parameter, also check whether the GPU's HWS firmware supports GWS. Leave the old module parameter in place in case users want to test GWS on GPUs not yet in the support list.

v2: fix broken syntax from the first patch.

Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Submitted by Oak Zeng
Add a new KFD ioctl to allocate queue GWS. Queue GWS is released on queue destroy.

v2: re-introduce this API with the following fixes squashed in:
- drm/amdkfd: fix null pointer dereference on dev
- drm/amdkfd: Return proper error code for gws alloc API
- drm/amdkfd: Remove GPU ID in GWS queue creation

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 14 April 2020, 1 commit
Submitted by Odin Ugedal
The original cgroup v2 eBPF code for filtering device access made it possible to compile with CONFIG_CGROUP_DEVICE=n and still use the eBPF filtering. Commit 4b7d4d45 ("device_cgroup: Export devcgroup_check_permission") reverted this, making it required to set it to y. Since the device filtering (and all the docs) for cgroup v2 is no longer a "device controller" like it was in v1, someone might compile their kernel with CONFIG_CGROUP_DEVICE=n. Then (for Linux 5.5+) the eBPF filter will not be invoked, and all processes will be allowed access to all devices, no matter what the eBPF filter says.

Signed-off-by: Odin Ugedal <odin@ugedal.com>
Acked-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
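The gist of the fix, sketched from the description (the signature matches device_cgroup.h as I recall it, but treat the details as illustrative): compile the permission check whenever either config option is enabled, instead of only under CONFIG_CGROUP_DEVICE:

```c
#include <linux/types.h>

/* linux/device_cgroup.h (sketch) */
#if defined(CONFIG_CGROUP_DEVICE) || defined(CONFIG_CGROUP_BPF)
int devcgroup_check_permission(short type, u32 major, u32 minor,
			       short access);
#else
static inline int devcgroup_check_permission(short type, u32 major,
					     u32 minor, short access)
{
	return 0;	/* no filtering compiled in: allow */
}
#endif
```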
-
- 27 February 2020, 1 commit
Submitted by Divya Shikre
Devices from Arcturus onwards will have their UUID exposed to the Thunk. Add the necessary functions to the kernel to propagate the UUID.

Signed-off-by: Divya Shikre <DivyaUday.Shikre@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 13 February 2020, 1 commit
Submitted by Rajneesh Bhardwaj
So far the KFD driver implemented the same routines for runtime and system-wide suspend and resume (s2idle or mem). During system-wide suspend the KFD acquires an atomic lock that prevents any more user processes from creating queues and interacting with the KFD driver and the AMD GPU. This mechanism created a problem when the amdgpu device is runtime-suspended with BACO enabled: any application that relies on the KFD driver fails to load, because the driver reports a locked KFD device while the GPU is runtime-suspended. However, in an ideal case, when the GPU is runtime-suspended the KFD driver should be able to:
- auto-resume the amdgpu driver whenever a client requests compute service
- prevent runtime suspend for amdgpu while KFD is in use

This change refactors the amdgpu and amdkfd drivers to support BACO and runtime power management. A sketch of the mechanism follows below.

Reviewed-by: Oak Zeng <oak.zeng@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
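The two requirements map naturally onto the kernel's runtime-PM reference counting API; the call sites below are a hypothetical sketch, not the driver's actual functions:

```c
#include <linux/pm_runtime.h>

/* When a compute client binds: take a runtime-PM reference.
 * pm_runtime_get_sync() resumes a BACO-suspended GPU synchronously,
 * and the held reference blocks runtime suspend while KFD is in use. */
static int kfd_client_runtime_get(struct device *drm_dev)
{
	int ret = pm_runtime_get_sync(drm_dev);

	if (ret < 0) {
		pm_runtime_put_noidle(drm_dev);
		return ret;
	}
	return 0;
}

/* When the client goes away: drop the reference so the GPU may
 * runtime-suspend (and re-enter BACO) again. */
static void kfd_client_runtime_put(struct device *drm_dev)
{
	pm_runtime_mark_last_busy(drm_dev);
	pm_runtime_put_autosuspend(drm_dev);
}
```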
-