Commits · 359cecdd499783abcf4ba6589066db8d8cb58e88 · openeuler / Kernel

14 Jul, 2018 5 commits

drm/amdkfd: Optimize out some duplicated code in kfd_signal_iommu_event() · 359cecdd

Committed by Yong Zhao on Jul 13, 2018

memory_exception_data is already initialized for not-present faults.
It only needs to be overridden for permission faults.
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

359cecdd

drm/amdkfd: Workaround to accommodate Raven too many PPR issue · 8725aeca

Committed by Yong Zhao on Jul 13, 2018

On Raven multiple PPRs can be queued up by the hardware. When the
first of those requests is handled by the IOMMU driver, the memory
access succeeds. After that the application may be done with the
memory and unmap it. At that point the page table entries are
invalidated, but there are still outstanding duplicate PPRs for those
addresses. When the IOMMU driver processes those duplicate requests,
it finds invalid page table entries and triggers an invalid PPR fault.

As a workaround, don't signal invalid PPR faults on Raven to avoid
segfaulting applications that haven't done anything wrong. As a side
effect, real GPU memory access faults may go unnoticed by the
application.
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

8725aeca

drm/amdkfd: Avoid flooding dmesg on Raven due to IOMMU issues · eab69801

Committed by Yong Zhao on Jul 13, 2018

On Raven, invalid PPRs (peripheral page requests) can be reported
because multiple PPRs can still be queued when memory is freed.
Apply a rate limit to avoid flooding the log in this case.
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

eab69801
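
The rate limiting mentioned in eab69801 can be illustrated with a small user-space sketch. This is not the driver's code: `struct log_ratelimit` and `log_ratelimit_ok()` are hypothetical names, and the burst-per-interval policy is an assumption chosen because it resembles how kernel ratelimit helpers generally behave.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limiter: allow at most `burst` messages per `interval` ticks. */
struct log_ratelimit {
    uint64_t interval;      /* window length, in caller-defined time units */
    unsigned burst;         /* messages allowed per window */
    uint64_t window_start;  /* time at which the current window began */
    unsigned printed;       /* messages emitted in the current window */
    unsigned suppressed;    /* messages dropped in the current window */
};

/* Returns true if the caller may log a message at time `now`. */
static bool log_ratelimit_ok(struct log_ratelimit *rl, uint64_t now)
{
    if (now - rl->window_start >= rl->interval) {
        /* A new window starts: reset both counters. */
        rl->window_start = now;
        rl->printed = 0;
        rl->suppressed = 0;
    }
    if (rl->printed < rl->burst) {
        rl->printed++;
        return true;
    }
    rl->suppressed++;
    return false;
}
```

A caller would guard each invalid-PPR message with `log_ratelimit_ok()` and skip the print when it returns false, so a burst of duplicate faults produces only a bounded number of log lines.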

drm/amdkfd: Make SDMA engine number an ASIC-dependent variable · 98bb9222

Committed by Yong Zhao on Jul 13, 2018

On Raven there is only one SDMA engine instead of the previously assumed two,
so we need to adapt our code to this new scenario.
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

98bb9222
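
One way to picture an ASIC-dependent engine count, as described in 98bb9222, is a per-device table that the queue-sizing code consults instead of a global constant. The sketch below is illustrative only: `struct asic_info`, `lookup_asic()`, `max_sdma_queues()`, the `example_dgpu` entry and the value of `SDMA_QUEUES_PER_ENGINE` are all assumptions, not the driver's definitions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical number of queues exposed by each SDMA engine. */
#define SDMA_QUEUES_PER_ENGINE 2u

/* Hypothetical per-ASIC description. */
struct asic_info {
    const char *name;
    unsigned num_sdma_engines;
};

static const struct asic_info asic_table[] = {
    { "raven", 1 },         /* one engine, per the commit message */
    { "example_dgpu", 2 },  /* hypothetical entry with the formerly assumed count */
};

/* Returns the table entry for `name`, or NULL if the ASIC is unknown. */
static const struct asic_info *lookup_asic(const char *name)
{
    for (size_t i = 0; i < sizeof(asic_table) / sizeof(asic_table[0]); i++)
        if (strcmp(asic_table[i].name, name) == 0)
            return &asic_table[i];
    return NULL;
}

/* Total SDMA queues now depends on the device rather than on a constant. */
static unsigned max_sdma_queues(const struct asic_info *info)
{
    return info->num_sdma_engines * SDMA_QUEUES_PER_ENGINE;
}
```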

drm/amdkfd: Consolidate duplicate memory banks info in topology · f3ed5df8

Committed by Yong Zhao on Jul 13, 2018

If there are several memory banks that have the same properties in CRAT,
we aggregate them into one memory bank. This cleans up memory banks on
APUs (e.g. Raven) where the CRAT reports each memory channel as a
separate bank. Separate banks only confuse user mode, which deals only
with virtual memory.
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

f3ed5df8
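
The aggregation described in f3ed5df8 can be sketched as an in-place merge of entries whose properties match. The structure below is a simplified, hypothetical stand-in (it is not the CRAT or KFD topology layout), and summing the sizes of merged banks is an assumption about what "aggregate" means here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified memory bank descriptor. */
struct mem_bank {
    uint32_t heap_type;
    uint32_t flags;
    uint32_t width;
    uint32_t clk_max;
    uint64_t size_in_bytes;
};

/* Two banks are mergeable when everything except the size matches. */
static int same_properties(const struct mem_bank *a, const struct mem_bank *b)
{
    return a->heap_type == b->heap_type && a->flags == b->flags &&
           a->width == b->width && a->clk_max == b->clk_max;
}

/*
 * Merge banks in place: a bank whose properties match an earlier bank is
 * folded into it by adding its size. Returns the new number of banks.
 */
static size_t consolidate_banks(struct mem_bank *banks, size_t n)
{
    size_t out = 0;

    for (size_t i = 0; i < n; i++) {
        size_t j;

        for (j = 0; j < out; j++) {
            if (same_properties(&banks[j], &banks[i])) {
                banks[j].size_in_bytes += banks[i].size_in_bytes;
                break;
            }
        }
        if (j == out)            /* no match found: keep as a new bank */
            banks[out++] = banks[i];
    }
    return out;
}
```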

12 Jul, 2018 16 commits

drm/amdkfd: Clean up reference of radeon · e7016d8e

Committed by Yong Zhao on Jul 11, 2018

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

e7016d8e

drm/amdkfd: Replace mqd with mqd_mgr as the variable name for mqd_manager · 8d5f3552

Committed by Yong Zhao on Jul 11, 2018

This will make reading code much easier.
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

8d5f3552

drm/amdkfd: Use module parameters noretry as the internal variable name · 2b281977

Committed by Yong Zhao on Jul 11, 2018

This makes all module parameters use the same form. Meanwhile clean up
the surrounding code.
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

2b281977

drm/amdkfd: Introduce KFD module parameter halt_if_hws_hang · 0e9a860c

Committed by Yong Zhao on Jul 11, 2018

This avoids triggering a GPU reset or otherwise changing the HW
state. Instead KFD will hang, which allows HW debugging tools to
analyze the problem.
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

0e9a860c

drm/amdkfd: Add debugfs interface to trigger HWS hang · a29ec470

Committed by Shaoyun Liu on Jul 11, 2018

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

a29ec470

drm/amdkfd: Fix kernel queue 64 bit doorbell offset calculation · 951df6d9

Committed by Shaoyun Liu on Jul 11, 2018

The bitmap index calculation should reverse the logic used on allocation
so it will clear the same bit used on allocation.
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

951df6d9
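
The invariant stated in 951df6d9 is that the release path must apply the exact inverse of the index-to-offset mapping used at allocation time. A minimal user-space sketch follows, assuming a 64-entry bitmap and dword offsets; `struct doorbell_pool` and every function name here are hypothetical and do not reflect the driver's actual layout.

```c
#include <stdint.h>

#define MAX_DOORBELLS 64u

/* Hypothetical allocator state. */
struct doorbell_pool {
    uint64_t bitmap;         /* bit i set => doorbell i is in use */
    unsigned doorbell_size;  /* bytes per doorbell: 4 (32 bit) or 8 (64 bit) */
};

/* Dword offset of doorbell `inx` from the start of the aperture. */
static unsigned index_to_dword_off(const struct doorbell_pool *p, unsigned inx)
{
    return inx * (p->doorbell_size / (unsigned)sizeof(uint32_t));
}

/* Exact inverse of index_to_dword_off(). */
static unsigned dword_off_to_index(const struct doorbell_pool *p, unsigned off)
{
    return off / (p->doorbell_size / (unsigned)sizeof(uint32_t));
}

/* Returns the dword offset of a newly allocated doorbell, or -1 if full. */
static int doorbell_alloc(struct doorbell_pool *p)
{
    for (unsigned inx = 0; inx < MAX_DOORBELLS; inx++) {
        if (!(p->bitmap & (UINT64_C(1) << inx))) {
            p->bitmap |= UINT64_C(1) << inx;
            return (int)index_to_dword_off(p, inx);
        }
    }
    return -1;
}

/* Clears the same bit that doorbell_alloc() set for this offset. */
static void doorbell_release(struct doorbell_pool *p, unsigned dword_off)
{
    unsigned inx = dword_off_to_index(p, dword_off);

    if (inx < MAX_DOORBELLS)
        p->bitmap &= ~(UINT64_C(1) << inx);
}
```

With 8-byte doorbells the second allocation returns dword offset 2, and releasing offset 2 clears bit 1 again. Treating the offset as if it were the index would clear the wrong bit, which is the class of mistake the commit message describes.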

drm/amdkfd: Implement hang detection in KFD and call amdgpu · 73ea648d

Committed by Shaoyun Liu on Jul 11, 2018

The reset will be performed in a new hw_exception work thread to
handle HWS hang without blocking the thread that detected the hang.
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

73ea648d

drm/amdkfd: Implement GPU reset handlers in KFD · e42051d2

Committed by Shaoyun Liu on Jul 11, 2018

Lock KFD and evict existing queues on reset. Notify user mode by
signaling hw_exception events.
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

e42051d2

drm/amdkfd: Add gpu reset interface and place holder · e3b7a967

Committed by Shaoyun Liu on Jul 11, 2018

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

e3b7a967

drm/amdkfd: fix zero reading of VMID and PASID for Hawaii · 58e69886

Committed by Lan Xiao on Jul 11, 2018

Upon VM Fault, the VMID and PASID written by HW are zeros in
Hawaii. Instead of reading from ih_ring_entry, read directly
from the registers. This workaround fixes the soft hang issues
caused by mishandled VM Fault in Hawaii.
Signed-off-by: Lan Xiao <Lan.Xiao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

58e69886

drm/amdkfd: Handle VM faults in KFD · 2640c3fa

Committed by shaoyunl on Jul 11, 2018

1. Pre-GFX9 the amdgpu ISR saves the vm-fault status and address
   per-vmid. amdkfd needs to get the information from amdgpu through the
   new get_vm_fault_info interface. On GFX9 and later, all the required
   information is in the IH ring.
2. amdkfd unmaps all queues from the faulting process and creates a new
   run-list without the guilty process.
3. amdkfd notifies the runtime of the vm fault trap via EVENT_TYPE_MEMORY.
Signed-off-by: shaoyun liu <shaoyun.liu@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

2640c3fa

drm/amdkfd: send SIGSEGV to process upon KFD_EVENT_TYPE_MEMORY · 101fee63

Committed by Moses Reuben on Jul 11, 2018

Signed-off-by: Moses Reuben <moses.reuben@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

101fee63

drm/amdkfd: Fix error codes in kfd_get_process · e47cb828

Committed by Wei Lu on Jul 11, 2018

Return ERR_PTR(-EINVAL) if kfd_get_process fails to find the process.
This fixes kernel oopses when a child process calls KFD ioctls with
a file descriptor inherited from the parent process.
Signed-off-by: Wei Lu <wei.lu2@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

e47cb828
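
The error-pointer convention that e47cb828 relies on can be demonstrated outside the kernel. The helpers below are user-space stand-ins that mimic the kernel's `ERR_PTR()`, `PTR_ERR()` and `IS_ERR()`. `struct proc`, `proc_table` and `find_process()` are hypothetical and only illustrate a lookup that never returns NULL on failure.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* User-space stand-ins for the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error)
{
    return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* True for the top MAX_ERRNO addresses, which encode negative errno values. */
static inline int IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical process record and table. */
struct proc {
    int pid;
};

static struct proc proc_table[] = { { 100 }, { 200 } };

/* Hypothetical lookup: a miss is reported as ERR_PTR(-EINVAL), never NULL. */
static struct proc *find_process(int pid)
{
    for (size_t i = 0; i < sizeof(proc_table) / sizeof(proc_table[0]); i++)
        if (proc_table[i].pid == pid)
            return &proc_table[i];
    return ERR_PTR(-EINVAL);
}
```

Callers that test the result with `IS_ERR()` only behave correctly if the lookup follows this convention. A NULL return does not satisfy `IS_ERR()`, and `PTR_ERR(NULL)` evaluates to 0, so a NULL slipping through would be treated as a valid pointer or as success.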

drm/amdkfd: Fix race between scheduler and context restore · a60d811b

Committed by Jay Cornwall on Jul 11, 2018

The scheduler may raise SQ_WAVE_STATUS.SPI_PRIO via SQ_CMD before
context restore has completed. Restoring SPI_PRIO=0 after this point
may cause context save to fail as the lower priority wavefronts
are not selected for execution among spin-waiting wavefronts.

Leave SPI_PRIO at its SPI-initialized or scheduler-raised value.

v2: Also fix race with exception handler
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

a60d811b

drm/amdkfd: Stop using GFP_NOIO explicitly · 1cd106ec

Committed by Felix Kuehling on Jul 11, 2018

This is no longer needed with the memalloc_nofs_save/restore in
dqm_lock/unlock.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

1cd106ec

drm/amdkfd: Reliably prevent reclaim-FS while holding DQM lock · efeaed4d

Committed by Felix Kuehling on Jul 11, 2018

This is needed to prevent deadlocks when MMU notifiers run in
reclaim-FS context and take the DQM lock for userptr evictions.
Previously this was done by making all memory allocations under
DQM locks GFP_NOIO. This is error prone. Using
memalloc_nofs_save/restore will reliably affect all memory
allocations anywhere in the kernel while the DQM lock is held.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

efeaed4d

11 Jul, 2018 1 commit

drm/amdkfd: use modern ktime accessors · 0337976f

Committed by Arnd Bergmann on Jul 11, 2018

getrawmonotonic64() and get_monotonic_boottime64() are deprecated
because of the nonstandard naming.

The replacement functions ktime_get_raw_ns() and ktime_get_boot_ns()
also simplify the callers.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

0337976f

24 Apr, 2018 4 commits

drm/amdkfd: fix build, select MMU_NOTIFIER · 7bbc0b95

Committed by Randy Dunlap on Apr 13, 2018

When CONFIG_MMU_NOTIFIER is not enabled, struct mmu_notifier has an
incomplete type definition, which causes build errors.

../drivers/gpu/drm/amd/amdkfd/kfd_priv.h:607:22: error: field 'mmu_notifier' has incomplete type
../include/linux/kernel.h:979:32: error: dereferencing pointer to incomplete type
../include/linux/kernel.h:980:18: error: dereferencing pointer to incomplete type
../drivers/gpu/drm/amd/amdkfd/kfd_process.c:434:2: error: implicit declaration of function 'mmu_notifier_unregister_no_release' [-Werror=implicit-function-declaration]
../drivers/gpu/drm/amd/amdkfd/kfd_process.c:435:2: error: implicit declaration of function 'mmu_notifier_call_srcu' [-Werror=implicit-function-declaration]
../drivers/gpu/drm/amd/amdkfd/kfd_process.c:438:21: error: variable 'kfd_process_mmu_notifier_ops' has initializer but incomplete type
../drivers/gpu/drm/amd/amdkfd/kfd_process.c:439:2: error: unknown field 'release' specified in initializer
../drivers/gpu/drm/amd/amdkfd/kfd_process.c:439:2: warning: excess elements in struct initializer [enabled by default]
../drivers/gpu/drm/amd/amdkfd/kfd_process.c:439:2: warning: (near initialization for 'kfd_process_mmu_notifier_ops') [enabled by default]
../drivers/gpu/drm/amd/amdkfd/kfd_process.c:534:2: error: implicit declaration of function 'mmu_notifier_register' [-Werror=implicit-function-declaration]
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

7bbc0b95

drm/amdkfd: fix clock counter retrieval for node without GPU · 1cf6cc74

Committed by Andres Rodriguez on Apr 10, 2018

Currently if a user requests clock counters for a node without a GPU
resource we will always return EINVAL.

Instead if no GPU resource is attached, fill the gpu_clock_counter
argument with zeroes so that we may proceed and return valid CPU
counters.
Signed-off-by: Andres Rodriguez <andres.rodriguez@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

1cf6cc74
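
The behavior change in 1cf6cc74 amounts to treating a missing GPU as "report zero" rather than "fail the request". A minimal sketch of that control flow, with hypothetical types and names (`struct gpu_dev`, `struct clock_counters`, `get_clock_counters()`) that do not mirror the ioctl's real argument structure:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical GPU handle exposing a single clock value. */
struct gpu_dev {
    uint64_t clock;
};

/* Hypothetical result structure. */
struct clock_counters {
    uint64_t gpu_clock_counter;
    uint64_t cpu_clock_counter;
};

/*
 * Fills `out` for a node. `gpu` may be NULL for a CPU-only node, in which
 * case the GPU counter is reported as 0 and the CPU counter is still valid.
 * Returns 0 on success, -1 if `out` is NULL.
 */
static int get_clock_counters(const struct gpu_dev *gpu, uint64_t cpu_now,
                              struct clock_counters *out)
{
    if (!out)
        return -1;

    /* No GPU attached to this node: zero the field instead of failing. */
    out->gpu_clock_counter = gpu ? gpu->clock : 0;
    out->cpu_clock_counter = cpu_now;
    return 0;
}
```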

drm/amdkfd: Fix the error return code in kfd_ioctl_unmap_memory_from_gpu() · ded5e562

Committed by Wei Yongjun on Mar 30, 2018

Passing a NULL pointer to PTR_ERR will result in a return value of 0,
indicating success, which is clearly not what is intended here.
This patch returns -EINVAL instead.

v2: change ret code to -ENODEV

Fixes: 5ec7e028 ("drm/amdkfd: Add ioctls for GPUVM memory management")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

ded5e562

drm/amdkfd: kfd_dev_is_large_bar() can be static · a4efd3a4

Committed by kbuild test robot on Mar 28, 2018

Fixes: 5ec7e028 ("drm/amdkfd: Add ioctls for GPUVM memory management")
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

a4efd3a4

14 Apr, 2018 1 commit

drm/amdkfd: Remove vla · af47b390

Committed by Laura Abbott on Apr 13, 2018

There's an ongoing effort to remove VLAs[1] from the kernel to eventually
turn on -Wvla. Switch to a constant value that covers all hardware.

[1] https://lkml.org/lkml/2018/3/7/621
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

af47b390
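
The general shape of a VLA removal like af47b390 is to replace a runtime-sized stack array with one sized by a compile-time upper bound, plus a check that the runtime count fits. The sketch below shows that pattern only: `MAX_WATCH_POINTS` and `sum_watch_values()` are hypothetical names and the bound of 4 is an arbitrary assumption.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical upper bound, assumed to cover every supported device. */
#define MAX_WATCH_POINTS 4

/*
 * Copies up to MAX_WATCH_POINTS values into a fixed-size scratch buffer and
 * sums them. A VLA version would have declared `uint32_t scratch[n]`.
 * Returns 0 on success, -1 if n exceeds the compile-time bound.
 */
static int sum_watch_values(const uint32_t *vals, size_t n, uint64_t *sum)
{
    uint32_t scratch[MAX_WATCH_POINTS];
    uint64_t total = 0;

    if (n > MAX_WATCH_POINTS)
        return -1;

    memcpy(scratch, vals, n * sizeof(scratch[0]));
    for (size_t i = 0; i < n; i++)
        total += scratch[i];

    *sum = total;
    return 0;
}
```

The stack usage is now known at compile time, at the cost of reserving the maximum size even when fewer entries are needed.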

02 May, 2018 12 commits

drm/amdkfd: Add sanity checks in IRQ handlers · c129db12

Committed by Felix Kuehling on May 01, 2018

Only accept interrupts from KFD VMIDs. Just checking for a PASID may
not be enough because amdgpu started using PASIDs to map VM faults
to processes.

Warn if an IRQ doesn't have a valid PASID (indicating a firmware bug).
Suggested-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Suggested-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

c129db12
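
The sanity check described in c129db12 can be thought of as a two-stage filter: first the VMID must belong to the range reserved for KFD, then the PASID must be valid. The sketch below assumes a contiguous VMID range and treats PASID 0 as invalid. `struct kfd_vmid_range`, `irq_verdict()` and the verdict names are hypothetical and are not taken from the driver.

```c
/* Hypothetical inclusive range of VMIDs reserved for KFD. */
struct kfd_vmid_range {
    unsigned first_kfd_vmid;
    unsigned last_kfd_vmid;
};

enum irq_verdict {
    IRQ_IGNORE,     /* not a KFD VMID: leave it to the graphics driver */
    IRQ_BAD_PASID,  /* KFD VMID without a valid PASID: caller should warn */
    IRQ_ACCEPT      /* KFD VMID with a valid PASID */
};

/* Classifies an interrupt by VMID first, then by PASID. */
static enum irq_verdict irq_verdict(const struct kfd_vmid_range *r,
                                    unsigned vmid, unsigned pasid)
{
    if (vmid < r->first_kfd_vmid || vmid > r->last_kfd_vmid)
        return IRQ_IGNORE;
    if (pasid == 0)
        return IRQ_BAD_PASID;
    return IRQ_ACCEPT;
}
```

Checking the VMID before the PASID matters in this sketch because an interrupt with a non-zero PASID may still belong to a non-KFD VMID, which is the situation the commit message describes.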

drm/amdkfd: Remove queue node when destroy queue failed · 2533f074

Committed by Shaoyun Liu on May 01, 2018

HWS may hang in the middle of destroying a queue. Remove the queue from
the process queue list so it won't be freed again in the future.
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

2533f074

drm/amdkfd: Locking PM mutex while allocating IB buffer · bfdcbfd2

Committed by Ben Goz on May 01, 2018

Signed-off-by: Ben Goz <ben.goz@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

bfdcbfd2

drm/amdkfd: Remove initialization of cp_hqd_ib_control on CIK · ccb76b14

Committed by Felix Kuehling on May 01, 2018

The initialization is not necessary. amd-kfd-staging and ROCm
releases have worked without it for two years.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

ccb76b14

drm/amdkfd: Fix signal handling performance again · eeb27b7e

Committed by Felix Kuehling on May 01, 2018

It turns out that idr_for_each_entry is really slow compared to just
iterating over the slots. Based on measurements the difference is
estimated to be about a factor of 64. That means using idr_for_each_entry
is only worth it with very few allocated events.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

eeb27b7e

drm/amdkfd: Fix CP soft hang on APUs · f8ea72d0

Committed by Yong Zhao on May 01, 2018

The problem happens on Raven and Carrizo. The context save handler
should not clear the high bits of PC_HI before extracting the bits
of IB_STS.

The bug is not relevant to VEGA10 until we enable demand paging.
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

f8ea72d0

drm/amdkfd: Separate trap handler assembly code and its hex values · 0db54b24

Committed by Yong Zhao on May 01, 2018

Since the assembly code is inside "#if 0", it is ineffective. Despite that,
during debugging, we need to change the assembly code, extract it into
a separate file and compile the new file into hex values using sp3.
That process also requires us to remove "#if 0" and modify lines starting
with "#", so that sp3 can successfully compile the new file.

With this change, none of the above chores are needed any longer, and
cwsr_trap_handler_gfx*.asm can be directly used by sp3 to generate its
hex values.
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

0db54b24

drm/amdkfd: Remove redundant include of amd-iommu.h · a2e94158

Committed by Felix Kuehling on May 01, 2018

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

a2e94158

drm/amdkfd: use %px to print user space address instead of %p · fa7e6514

Committed by Philip Yang on May 01, 2018

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

fa7e6514

drm/amdkfd: Use volatile MTYPE in default/alternate apertures · 2774c63e

Committed by Jay Cornwall on May 01, 2018

MTYPE_NC_NV (0) marks scalar/vector L1 cache lines as non-volatile.
Cache lines loaded through these apertures are intended to be
invalidated before (and sometimes during) a dispatch. The non-volatile
qualifier prevents these cache lines from being distinguished from
those loaded through the private aperture.

Use MTYPE_NC (1) instead on both Gfx7 and Gfx8. This allows the
compiler to use the BUFFER_WBINVL1_VOL instruction and is a precursor
to automatic per-dispatch scalar/vector L1 volatile invalidation.
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

2774c63e

drm/amdkfd: Reduce priority of context-saving waves before spin-wait · 87e6d4e0

Committed by Jay Cornwall on May 01, 2018

Synchronization between context-saving wavefronts is achieved by
sending a SAVEWAVE message to the SPI and then spin-waiting for a
response. These spin-waiting wavefronts may inhibit the progress
of other wavefronts in the context save handler, leading to the
synchronization condition never being achieved.

Before spin-waiting, reduce the priority of each wavefront to
guarantee forward progress in the others.
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

87e6d4e0

drm/amdkfd: Dump HQD of HIQ · 24f48a42

Committed by Oak Zeng on May 01, 2018

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

24f48a42

24 Apr, 2018 1 commit

drm/amdkfd: Integer overflows in ioctl · 8feaccf7

Committed by Dan Carpenter on Apr 24, 2018

args->n_devices is a u32 that comes from the user.  The multiplication
could overflow on 32 bit systems possibly leading to privilege
escalation.

Fixes: 5ec7e028 ("drm/amdkfd: Add ioctls for GPUVM memory management")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

8feaccf7
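
The overflow described in 8feaccf7 arises when a user-supplied element count is multiplied by an element size before allocation. A user-space sketch of the usual guard follows. `size_mul_overflows()` and `alloc_device_ids()` are hypothetical helpers written for illustration. In the kernel, `kmalloc_array()` performs a comparable check internally.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* True if a * b does not fit in size_t. */
static bool size_mul_overflows(size_t a, size_t b)
{
    return b != 0 && a > SIZE_MAX / b;
}

/*
 * Allocates an array of `n_devices` 32-bit IDs. Returns NULL if the count
 * is zero, if the byte size would overflow size_t, or if malloc fails.
 * The caller releases the array with free().
 */
static uint32_t *alloc_device_ids(uint32_t n_devices)
{
    if (n_devices == 0)
        return NULL;
    if (size_mul_overflows(n_devices, sizeof(uint32_t)))
        return NULL;
    return malloc((size_t)n_devices * sizeof(uint32_t));
}
```

On a platform where `size_t` is 32 bits, a count such as 0x40000001 multiplied by 4 wraps around to 4, so an unchecked allocation would be far smaller than the number of entries later copied into it.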
