- 09 Dec, 2017 1 commit
-
-
Committed by Felix Kuehling
dGPUs work without IOMMUv2. Make IOMMUv2 initialization dependent on ASIC information. Also allow building KFD without IOMMUv2 support. This is still useful for dGPUs and prepares for enabling KFD on architectures that don't support AMD IOMMUv2. v2: * Centralize IOMMUv2 code to avoid #ifdefs in too many places v3: * Imply AMD_IOMMU_V2 in Kconfig Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian Konig <christian.koenig@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
- 05 Jan, 2018 3 commits
-
-
Committed by Felix Kuehling
v2: remove needs_iommu field as it doesn't exist CC: linux-pci@vger.kernel.org Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Committed by Felix Kuehling
Some dGPUs don't support HWS. Allow them to use a per-device sched_policy that may be different from the global default. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Committed by Felix Kuehling
This will be needed for most dGPUs. CC: linux-pci@vger.kernel.org Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
- 28 Nov, 2017 1 commit
-
-
Committed by Felix Kuehling
Allow HWS to execute multiple processes on the hardware concurrently. The number of concurrent processes is limited by the number of VMIDs allocated to the HWS. A module parameter can be used for limiting this further or turning it off altogether (mainly for debugging purposes). Signed-off-by: Yong Zhao <yong.zhao@amd.com> Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
- 15 Nov, 2017 1 commit
-
-
Committed by Felix Kuehling
This hardware feature allows the GPU to preempt shader execution in the middle of a compute wave, save the state and restore it later to resume execution. Memory for saving the state is allocated per queue in user mode, and the address and size are passed to the create_queue ioctl. The size depends on the number of waves that can be in flight simultaneously on a given ASIC. Signed-off-by: Shaoyun.liu <shaoyun.liu@amd.com> Signed-off-by: Yong Zhao <yong.zhao@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
- 28 Oct, 2017 1 commit
-
-
Committed by Andres Rodriguez
In systems under heavy load the IH work may experience significant scheduling delays. Under load + system workqueue: Max Latency: 7.023695 ms, Avg Latency: 0.263994 ms. Under load + high priority workqueue: Max Latency: 1.162568 ms, Avg Latency: 0.163213 ms. Further work is required to measure the impact of per-cpu settings on IH performance. Signed-off-by: Andres Rodriguez <andres.rodriguez@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
- 27 Sep, 2017 1 commit
-
-
Committed by Felix Kuehling
PASID management is moving into KGD. Limiting the PASID range to the number of doorbell pages is no longer practical. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 21 Sep, 2017 3 commits
-
-
Committed by Yong Zhao
The hard-coded values related to VMID were removed in KFD, as those values can be calculated in the KFD initialization function. v2: remove unnecessary local variable Signed-off-by: Yong Zhao <yong.zhao@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Committed by Yong Zhao
When we suspend/resume through "sudo pm-suspend" while there is HSA activity running, HWS hangs upon resume because of memory read/write failures. The root cause is that on suspend we neglected to unbind the pasid from the kfd device. Another major change is that binding/unbinding is now performed on a per-process basis, instead of depending on whether there are queues in dqm. v2: - free IOMMU device if kfd_bind_processes_to_device fails in kfd_resume - add comments to kfd_bind/unbind_processes_to/from_device - minor cleanups Signed-off-by: Yong Zhao <yong.zhao@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Committed by Yong Zhao
The idea is to let the kfd init and resume functions share the same code path as much as possible, rather than having two copies of almost identical code. This improves code readability and maintainability. Signed-off-by: Yong Zhao <yong.zhao@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
- 16 Aug, 2017 9 commits
-
-
Committed by Felix Kuehling
To match current firmware. The map process packet has been extended to support scratch. This is a non-backwards-compatible change and it's about two years old, so there is no point keeping the old version around conditionally. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Committed by Yong Zhao
v2: Turned WARN into dev_warn and made the message more helpful Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Committed by Felix Kuehling
In most cases, BUG_ONs can be replaced with WARN_ON with an error return. In some void functions just turn them into a WARN_ON and possibly an early exit. v2: * Cleaned up error handling in pm_send_unmap_queue * Removed redundant WARN_ON in kfd_process_destroy_delayed Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
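The BUG_ON to WARN_ON pattern described in this message can be illustrated with a minimal userspace sketch. It is not the driver's code: `WARN_ON_SKETCH`, `warn_on`, `struct queue_sketch` and both functions are hypothetical stand-ins, and the kernel's real `WARN_ON()` also prints a backtrace.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for the kernel's WARN_ON(): report the
 * failed condition and hand it back so the caller can branch on it. */
static bool warn_on(bool cond, const char *expr, const char *func)
{
	if (cond)
		fprintf(stderr, "WARNING: %s in %s\n", expr, func);
	return cond;
}
#define WARN_ON_SKETCH(cond) warn_on((cond), #cond, __func__)

/* Hypothetical queue state, for illustration only. */
struct queue_sketch {
	unsigned int id;
	bool mapped;
};

/* Where a BUG_ON(q->mapped) would have halted the machine, warn and
 * return an error that the caller can handle. */
static int map_queue_sketch(struct queue_sketch *q)
{
	if (WARN_ON_SKETCH(q->mapped))
		return -EINVAL;
	q->mapped = true;
	return 0;
}

/* A void function has no error to return, so warn and exit early. */
static void unmap_queue_sketch(struct queue_sketch *q)
{
	if (WARN_ON_SKETCH(!q->mapped))
		return;
	q->mapped = false;
}
```

The caller sees `-EINVAL` instead of a crash, which is the point of the conversion: an inconsistent driver state is logged without taking the whole system down.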
-
Committed by Felix Kuehling
gtt_sa_bitmap is accessed by bitmap functions, which operate on longs. Therefore the array should be allocated in long units. Also round up in case the number of bits is not a multiple of BITS_PER_LONG. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
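The rounding described here can be sketched in userspace C as follows. The names `BITS_PER_LONG_SKETCH`, `bitmap_longs` and `bitmap_alloc_sketch` are assumptions made for illustration and are not taken from the patch; in the kernel, the `BITS_TO_LONGS()` macro performs the same round-up.

```c
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace equivalent of the kernel's BITS_PER_LONG. */
#define BITS_PER_LONG_SKETCH (sizeof(unsigned long) * CHAR_BIT)

/* Number of unsigned longs needed to hold nbits bits, rounded up so a
 * partial trailing long is still allocated. */
static size_t bitmap_longs(size_t nbits)
{
	return (nbits + BITS_PER_LONG_SKETCH - 1) / BITS_PER_LONG_SKETCH;
}

/* Allocate a zeroed bitmap in long units rather than byte units, so
 * long-based bitmap helpers never touch memory past the allocation. */
static unsigned long *bitmap_alloc_sketch(size_t nbits)
{
	return calloc(bitmap_longs(nbits), sizeof(unsigned long));
}
```

Without the round-up, a bitmap of, say, `BITS_PER_LONG + 1` bits would get only one long, and any access to the last bit would land outside the array.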
-
Committed by Felix Kuehling
Handle errors in doorbell aperture initialization instead of BUG_ON. iounmap doorbell aperture during finalization. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Committed by Felix Kuehling
Remove BUG_ONs that check for NULL pointer arguments that are dereferenced in the same function. Dereferencing the NULL pointer will generate a BUG anyway, so the explicit check is redundant and unnecessary overhead. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Committed by Kent Russell
Upstream prefers the !x notation to x==NULL or x==false. Along those lines change the ==true or !=NULL references as well. Also make the references to !x the same, excluding () for readability. Signed-off-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Committed by Kent Russell
Consolidate log commands so that dev_info(NULL, "Error...") uses the more accurate pr_err, remove the module name from the log (can be seen via dynamic debugging with +m), and the function name (can be seen via dynamic debugging with +f). We also don't need debug messages saying what function we're in. Those can be added by devs when needed. Don't print vendor and device ID in error messages. They are typically the same for all GPUs in a multi-GPU system, so this doesn't add any value to the message. Lastly, remove parentheses around %d, %i and 0x%llX. According to kernel.org: "Printing numbers in parentheses (%d) adds no value and should be avoided." Signed-off-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Committed by Kent Russell
Using checkpatch.pl -f <file> showed a number of style issues. This patch addresses as many of them as possible. Some long lines have been left for readability, but attempts to minimize them have been made. v2: Broke long lines in gfx_v7 get_fw_version Signed-off-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
- 14 Jul, 2017 1 commit
-
-
Committed by Jay Cornwall
Dead code. Change-Id: Ic0bb1bcca87e96bc5e8fa9894727b0de152e8818 Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
- 01 Jun, 2017 1 commit
-
-
Committed by Andres Rodriguez
Update the KGD to KFD interface to allow sharing pipes with queue granularity instead of pipe granularity. This allows for more interesting pipe/queue splits. v2: fix overflow check for res.queue_mask v3: fix shift overflow when setting res.queue_mask v4: fix comment in is_pipeline_enabled() v5: clamp res.queue_mask to the first MEC only Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 20 Jul, 2015 1 commit
-
-
Committed by Ben Goz
This patch adds the PCI IDs of supported CZ devices to the supported_devices structure in amdkfd. That structure is used during the amdkfd probing stage, to check if the currently probed device is eligible to be handled by amdkfd. Signed-off-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
- 07 Jun, 2015 1 commit
-
-
Committed by Oded Gabbay
This patch adds two missing property initializations to the device info structure of CZ. As we don't have CZ support yet, it isn't critical, but it's important to fix this now instead of forgetting about it later. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
- 03 Jun, 2015 2 commits
-
-
Committed by Yair Shachar
This patch adds the skeleton H/W debugger module support. This code enables registration and unregistration of a single HSA process at a time. The module saves the process's pasid and uses it to verify that only the registered process is allowed to execute debugger operations through the kernel driver. v2: rename get_dbgmgr_mutex to kfd_get_dbgmgr_mutex to namespace it Signed-off-by: Yair Shachar <yair.shachar@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Committed by Yair Shachar
This patch adds support for static user-mode queues in QCM. Queues which are designated as static can NOT be preempted by the CP microcode when it is executing its scheduling algorithm. This is needed for supporting the debugger feature, because we can't allow the CP to preempt queues which are currently being debugged. The number of queues that can be designated as static is limited by the number of HQDs (Hardware Queue Descriptors). Signed-off-by: Yair Shachar <yair.shachar@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
- 19 May, 2015 3 commits
-
-
Committed by Alexey Skidanov
This patch adds Peripheral Page Request (PPR) failure processing and reporting. A bad address, or a pointer to a system memory block with inappropriate read/write permissions, causes such a PPR failure during user queue processing. PPR request handling is done by the IOMMU driver, which notifies the AMDKFD module on PPR failure. The process triggering a PPR failure will be notified by an appropriate event, or a SIGTERM signal will be sent to it. v3: - Change all bool fields in struct kfd_memory_exception_failure to uint32_t Signed-off-by: Alexey Skidanov <alexey.skidanov@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Committed by Andrew Lewycky
This patch adds the events module (kfd_events.c) and the interrupt handling module for Kaveri (cik_event_interrupt.c). The patch updates interrupt_is_wanted(), so that it now calls the interrupt isr function specific to the device that received the interrupt. That function (implemented in cik_event_interrupt.c) returns whether this interrupt is of interest to us or not. The patch also updates interrupt_wq(), so that it now calls the device-specific wq function, which checks the interrupt source and tries to signal relevant events. v2: Increase limit of signal events to 4096 per process; Remove bitfields from struct cik_ih_ring_entry; Rename radeon_kfd_event_mmap to kfd_event_mmap; Add debug prints to allocate_free_slot and allocate_signal_page; Make allocate_event_notification_slot return a correct value; Add warning prints to create_signal_event; Remove error print from IOCTL path; Reformatted debug prints in kfd_event_mmap; Map correct size (as received from mmap) in kfd_event_mmap v3: Reduce limit of signal events back to 256 per process; Fix allocation of kernel memory for signal events Signed-off-by: Andrew Lewycky <Andrew.Lewycky@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Committed by Andrew Lewycky
This patch adds the interrupt handling module, kfd_interrupt.c, and its related members in different data structures to the amdkfd driver. The amdkfd interrupt module maintains an internal interrupt ring per amdkfd device. The internal interrupt ring contains interrupts that need further handling. The extra handling is deferred to a later time through a workqueue. There's no acknowledgment for the interrupts we use. The hardware simply queues a new interrupt each time without waiting. The fixed-size internal queue means that it's possible for us to lose interrupts because we have no back-pressure to the hardware. However, only interrupts that are "wanted" by amdkfd are copied into the amdkfd s/w interrupt ring, in order to minimize the chances of the ring overflowing. Signed-off-by: Andrew Lewycky <Andrew.Lewycky@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
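The lossy behaviour of a fixed-size software ring with no back-pressure can be shown with a small single-threaded sketch. Everything here is an assumption for illustration (`struct ih_ring_sketch`, `RING_ENTRIES`, the 32-bit entry type, the lack of locking); it does not reproduce kfd_interrupt.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ring size; one slot stays empty to tell full from empty. */
#define RING_ENTRIES 4

struct ih_ring_sketch {
	uint32_t entries[RING_ENTRIES];
	size_t rptr; /* next entry the deferred worker reads */
	size_t wptr; /* next slot the interrupt path writes */
};

/* Called from the interrupt path for "wanted" interrupts only.
 * Returns false when the ring is full: the interrupt is simply lost,
 * because nothing can tell the hardware to slow down. */
static bool ring_enqueue(struct ih_ring_sketch *r, uint32_t entry)
{
	size_t next = (r->wptr + 1) % RING_ENTRIES;

	if (next == r->rptr)
		return false;
	r->entries[r->wptr] = entry;
	r->wptr = next;
	return true;
}

/* Called from the deferred worker. Returns false when the ring is empty. */
static bool ring_dequeue(struct ih_ring_sketch *r, uint32_t *entry)
{
	if (r->rptr == r->wptr)
		return false;
	*entry = r->entries[r->rptr];
	r->rptr = (r->rptr + 1) % RING_ENTRIES;
	return true;
}
```

Filtering before `ring_enqueue` is what keeps the drop path rare: entries that nobody will process never consume a slot.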
-
- 25 Mar, 2015 1 commit
-
-
Committed by Xihan Zhang
The current code can only support one kgd instance. We have to support multiple kgd instances in one system, i.e. two amdgpu, or two radeon, or one amdgpu + one radeon, or more than two kgd instances. Signed-off-by: Xihan Zhang <xihan.zhang@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
-
- 18 Jan, 2015 1 commit
-
-
Committed by Oded Gabbay
This patch replaces the two current amdkfd module parameters with a new one. The current parameters that are being replaced are: - Maximum number of HSA processes - Maximum number of queues per process The new parameter that replaces them is called "Maximum queues per device". This replacement achieves two goals: - Allows the user to have as many HSA processes as it wants (up to a maximum of 512 HSA processes in Kaveri). - Removes the limitation the user had on the maximum number of queues per HSA process. E.g. the user can now have processes which only have one queue and other processes which have hundreds of queues, while before the user couldn't have more than 128 queues per process (as default). The default value of the new parameter is 4096 (32 * 128, which were the defaults of the old parameters). There is almost no additional GART memory required for the default case. As a reminder, this number of queues requires a little bit below 4MB of GART memory. v2: In addition, this patch defines a new counter for queue accounting in the DQM structure. This is done because the current counter only counts active queues, which allows the user to create more queues than the max_num_of_queues_per_device module parameter allows. However, we need the current counter for the runlist packet build process, so the solution is to have a dedicated counter for this accounting. Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Ben Goz <ben.goz@amd.com>
-
- 10 Jan, 2015 4 commits
-
-
Committed by Oded Gabbay
This patch changes the calls to allocate the gart memory for amdkfd from the old interface (radeon_sa) to the new one (kfd_gtt_sa). The new gart sub-allocator is initialized with chunk size equal to 512 bytes. This is because the KV MQD is 512 bytes and most of the sub-allocations are MQDs. Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Oded Gabbay
This patch makes the gart's buffer size calculation more accurate. This buffer is needed per GPU. It takes into account the maximum number of MQDs, runlist packets and kernel queues, and reserves 512KB for other misc allocations. The total size is just shy of 4MB, for 32 processes and 128 queues per process, which are the defaults for amdkfd kernel module parameters. Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Oded Gabbay
This patch adds new kfd gtt sub-allocator functions that service the amdkfd driver when it wants to use gtt memory. The sub-allocator uses a bitmap to handle the memory area that was transferred to it during init. It divides the memory area into chunks, according to the chunk size parameter. The allocation function will allocate contiguous chunks from that memory area, according to the requested size. If the requested size is smaller than the chunk size, a single chunk will be allocated. v2: Do some more verifications on parameters that are passed into kfd_gtt_sa_init() Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
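A chunked, first-fit sub-allocator of the kind this message describes can be sketched in userspace C as below. This is an illustration under assumptions, not the driver's kfd_gtt_sa code: `struct gtt_sa_sketch`, `SA_MAX_CHUNKS`, the function names and the use of byte offsets are all hypothetical, and a `bool` per chunk stands in for the packed bitmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SA_MAX_CHUNKS 64 /* hypothetical cap for this sketch */

struct gtt_sa_sketch {
	size_t chunk_size;
	size_t num_chunks;
	bool used[SA_MAX_CHUNKS]; /* one flag per chunk; a real bitmap packs these into longs */
};

/* Divide a buffer of buf_size bytes into chunks of chunk_size bytes. */
static void sa_init(struct gtt_sa_sketch *sa, size_t buf_size, size_t chunk_size)
{
	assert(chunk_size > 0 && buf_size / chunk_size <= SA_MAX_CHUNKS);
	sa->chunk_size = chunk_size;
	sa->num_chunks = buf_size / chunk_size;
	for (size_t i = 0; i < sa->num_chunks; i++)
		sa->used[i] = false;
}

/* Chunks needed for size bytes, rounded up; small requests take one chunk. */
static size_t sa_chunks_for(const struct gtt_sa_sketch *sa, size_t size)
{
	return (size + sa->chunk_size - 1) / sa->chunk_size;
}

/* First-fit search for a run of contiguous free chunks.
 * Returns the byte offset of the run, or -1 if none is large enough. */
static long sa_alloc(struct gtt_sa_sketch *sa, size_t size)
{
	size_t need = sa_chunks_for(sa, size);
	size_t run = 0;

	if (size == 0)
		return -1;
	for (size_t i = 0; i < sa->num_chunks; i++) {
		run = sa->used[i] ? 0 : run + 1;
		if (run == need) {
			size_t start = i + 1 - need;

			for (size_t j = start; j <= i; j++)
				sa->used[j] = true;
			return (long)(start * sa->chunk_size);
		}
	}
	return -1;
}

/* Release the chunks covered by a previous allocation of size bytes. */
static void sa_free(struct gtt_sa_sketch *sa, long offset, size_t size)
{
	size_t start = (size_t)offset / sa->chunk_size;
	size_t need = sa_chunks_for(sa, size);

	for (size_t j = start; j < start + need && j < sa->num_chunks; j++)
		sa->used[j] = false;
}
```

With 512-byte chunks, a 100-byte request consumes one whole chunk, which wastes space for small objects but keeps the bookkeeping to one bit per chunk.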
-
Committed by Alexey Skidanov
This patch adds the number of watch points to the node capabilities in the topology module. Signed-off-by: Alexey Skidanov <Alexey.Skidanov@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
-
- 08 Jan, 2015 1 commit
-
-
Committed by Michel Dänzer
The work queue couldn't reliably prevent the SW ring buffer from overflowing, so dmesg was spammed by "kfd kfd: Interrupt ring overflow, dropping interrupt." messages when running e.g. the Atlantis Substance demo from https://wiki.unrealengine.com/Linux_Demos on Kaveri. Since the SW ring buffer doesn't actually do anything at this point, just remove it for now. When actual interrupt processing code is added to amdkfd, it should try to do things immediately and only defer to work queues when necessary. Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
-
- 12 Jan, 2015 1 commit
-
-
Committed by Oded Gabbay
This patch does some re-org on the device_queue_manager structure. It takes out all the function pointers from the structure and puts them in a new structure, called device_queue_manager_ops. Then, it puts an instance of that structure inside device_queue_manager. This re-org is done to prepare the DQM module to support more than one AMD APU (Kaveri). Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
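The shape of this re-org, function pointers grouped into an ops structure with an instance embedded in the manager, can be sketched as follows. The struct and member names below are hypothetical simplifications; the real device_queue_manager_ops has its own members and signatures, which this message does not list.

```c
#include <stddef.h>

struct dqm_sketch;

/* All function pointers live in one ops structure (hypothetical members). */
struct dqm_ops_sketch {
	int (*create_queue)(struct dqm_sketch *dqm);
	int (*destroy_queue)(struct dqm_sketch *dqm);
};

/* The manager embeds an instance of the ops structure next to its data. */
struct dqm_sketch {
	struct dqm_ops_sketch ops;
	int queue_count;
};

/* One possible per-ASIC implementation of the ops. */
static int create_queue_asic_a(struct dqm_sketch *dqm)
{
	dqm->queue_count++;
	return 0;
}

static int destroy_queue_asic_a(struct dqm_sketch *dqm)
{
	if (dqm->queue_count == 0)
		return -1; /* nothing to destroy */
	dqm->queue_count--;
	return 0;
}

/* Initialization picks the implementation; callers only ever go through
 * dqm->ops, so supporting another ASIC means filling in different pointers. */
static void dqm_init_sketch(struct dqm_sketch *dqm)
{
	dqm->ops.create_queue = create_queue_asic_a;
	dqm->ops.destroy_queue = destroy_queue_asic_a;
	dqm->queue_count = 0;
}
```

A caller would write `dqm.ops.create_queue(&dqm)` without knowing which ASIC-specific function is behind the pointer.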
-
- 13 Jan, 2015 1 commit
-
-
Committed by Oded Gabbay
Instead of creating a BUG if trying to free a NULL GART sub-allocation object, just return 0 (success). This is done to mirror the behavior of kfree. Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
-
- 10 Nov, 2014 1 commit
-
-
Committed by Oded Gabbay
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
-
- 01 Jan, 2015 1 commit
-
-
Committed by Ben Goz
This patch adds a new property to the kfd_device_info structure. That structure holds information that is H/W specific. The new property is called asic_family and its purpose is to distinguish between different asic families in amdkfd operations, mainly in QCM (queue control & management). This patch also adds a new enum, to select different ASICs. We set the current kfd_device_info instance as Kaveri and create a new instance which describes the new AMD APU, codenamed 'Carrizo'. Signed-off-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
-