- 22 Jan, 2015 2 commits
-
-
Committed by Oded Gabbay
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
-
Committed by Oded Gabbay
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
-
- 18 Jan, 2015 1 commit
-
-
Committed by Ben Goz
Signed-off-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
-
- 15 Jan, 2015 3 commits
-
-
Committed by Oded Gabbay
This patch completely removes sync_with_hw() because it was broken and there is no point in using it. The function was used to: (1) Make sure that the packet submitted to the HIQ (which is a kernel queue) was read by the CP. However, it was discovered that the method the function used for this (checking wptr == rptr) is not consistent with how the actual CP firmware works in all cases. (2) Make sure that the queue is empty before issuing the next packet. To achieve that, the function blocked amdkfd from continuing until the recently submitted packet was consumed. However, acquire_packet_buffer() already checks whether there is enough room for a new packet, so calling sync_with_hw() is redundant. Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Oded Gabbay
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Oded Gabbay
So as not to occupy the current core and thereby prevent it from servicing IOMMU PPR requests, this patch replaces the call to cpu_relax() in DQM with a call to schedule(). Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>
-
- 13 Jan, 2015 1 commit
-
-
Committed by Ben Goz
This patch fixes a minor bug in allocate_hqd(), where the loop ran from the next-to-allocate pipe up to the number of pipes. This is wrong because the next-to-allocate pipe may not be 0, in which case the for-loop checks only part of the pipes and doesn't wrap around, as it is supposed to do. Therefore, we add another counting variable to make sure we go over all the pipes, regardless of which pipe the first iteration of the loop starts from. This bug only affected non-HWS mode. In HWS mode, the CP fw is responsible for allocating the HQD. Signed-off-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>
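The wrap-around fix described above can be illustrated with a small user-space sketch. This is not the amdkfd code: the structure `hqd_pool`, the function `pick_hqd()`, and the pipe and queue counts are hypothetical names and values chosen for illustration. The point is only the separate counter (`tried`) that guarantees every pipe is visited once, whatever pipe the search starts from.

```c
#include <stdbool.h>

#define NUM_PIPES 4   /* hypothetical pipe count, not taken from the driver */

/* Hypothetical stand-in for the per-device HQD bookkeeping. */
struct hqd_pool {
    unsigned int next_pipe;               /* pipe to try first on the next call */
    unsigned char free_mask[NUM_PIPES];   /* bit set = HQD slot on that pipe is free */
};

/*
 * Visit all pipes exactly once, starting at next_pipe and wrapping around.
 * 'tried' counts visited pipes; 'pipe' is the wrapped index. A loop that only
 * ran 'pipe' from next_pipe to NUM_PIPES would skip the pipes below next_pipe.
 */
bool pick_hqd(struct hqd_pool *p, unsigned int *pipe_out, unsigned int *queue_out)
{
    unsigned int pipe = p->next_pipe;

    for (unsigned int tried = 0; tried < NUM_PIPES;
         tried++, pipe = (pipe + 1) % NUM_PIPES) {
        if (p->free_mask[pipe] == 0)
            continue;   /* no free slot on this pipe */

        unsigned int bit = 0;
        while (!(p->free_mask[pipe] & (1u << bit)))
            bit++;      /* lowest free slot on this pipe */

        p->free_mask[pipe] &= (unsigned char)~(1u << bit);
        *pipe_out = pipe;
        *queue_out = bit;
        p->next_pipe = (pipe + 1) % NUM_PIPES;
        return true;
    }
    return false;       /* every pipe was checked and none had a free slot */
}
```

With `next_pipe` set to 2 and the only free slot on pipe 0, the search still finds that slot, which is the case the original loop missed.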
-
- 10 Jan, 2015 11 commits
-
-
Committed by Oded Gabbay
This patch changes the calls throughout the amdkfd driver from the old kfd-->kgd interface to the new kfd gtt sa inside amdkfd. v2: change the new call in sdma code that appeared because of the sdma feature. Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Oded Gabbay
This patch changes the calls that allocate the gart memory for amdkfd from the old interface (radeon_sa) to the new one (kfd_gtt_sa). The new gart sub-allocator is initialized with a chunk size of 512 bytes. This is because the KV MQD is 512 bytes and most of the sub-allocations are MQDs. Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Oded Gabbay
This patch makes the gart's buffer size calculation more accurate. This buffer is needed per GPU. The calculation takes into account the maximum number of MQDs, runlist packets and kernel queues, and reserves 512KB for other misc allocations. The total size is just shy of 4MB, for 32 processes and 128 queues per process, which are the defaults for amdkfd kernel module parameters. Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Oded Gabbay
This patch adds new kfd gtt sub-allocator functions that service the amdkfd driver when it wants to use gtt memory. The sub-allocator uses a bitmap to handle the memory area that was transferred to it during init. It divides the memory area into chunks, according to the chunk size parameter. The allocation function will allocate contiguous chunks from that memory area, according to the requested size. If the requested size is smaller than the chunk size, a single chunk will be allocated. v2: Do some more verifications on the parameters that are passed into kfd_gtt_sa_init(). Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
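A minimal user-space sketch of a bitmap-based chunk allocator of the kind described above follows. It is an illustration under stated assumptions, not the kfd_gtt_sa implementation: the names `gtt_sa`, `sa_init()`, `sa_alloc()` and `sa_free()` are hypothetical, the bitmap is a single 64-bit word (so at most 64 chunks), and allocations are identified by chunk index rather than by address.

```c
#include <stddef.h>
#include <stdint.h>

#define SA_MAX_CHUNKS 64   /* limit of this sketch: one 64-bit bitmap word */

/* Hypothetical sub-allocator state. */
struct gtt_sa {
    size_t chunk_size;   /* bytes per chunk */
    size_t num_chunks;   /* chunks in the managed area */
    uint64_t bitmap;     /* bit i set = chunk i is in use */
};

/* Validate the parameters and mark every chunk free. Returns 0 or -1. */
int sa_init(struct gtt_sa *sa, size_t buf_size, size_t chunk_size)
{
    if (chunk_size == 0 || chunk_size > buf_size)
        return -1;
    if (buf_size / chunk_size > SA_MAX_CHUNKS)
        return -1;
    sa->chunk_size = chunk_size;
    sa->num_chunks = buf_size / chunk_size;
    sa->bitmap = 0;
    return 0;
}

/* Number of chunks needed for 'size' bytes, rounded up. */
static size_t chunks_for(const struct gtt_sa *sa, size_t size)
{
    return (size + sa->chunk_size - 1) / sa->chunk_size;
}

/* First-fit search for a contiguous run; returns its first chunk index or -1. */
long sa_alloc(struct gtt_sa *sa, size_t size)
{
    if (size == 0)
        return -1;

    size_t need = chunks_for(sa, size);
    size_t run = 0;   /* length of the current run of free chunks */

    for (size_t i = 0; i < sa->num_chunks; i++) {
        run = ((sa->bitmap >> i) & 1) ? 0 : run + 1;
        if (run == need) {
            size_t start = i + 1 - need;
            for (size_t j = start; j <= i; j++)
                sa->bitmap |= (uint64_t)1 << j;
            return (long)start;
        }
    }
    return -1;   /* no contiguous run large enough */
}

/* Release the chunks of an allocation made with the same 'size'. */
void sa_free(struct gtt_sa *sa, long start, size_t size)
{
    size_t need = chunks_for(sa, size);

    for (size_t j = (size_t)start; j < (size_t)start + need; j++)
        sa->bitmap &= ~((uint64_t)1 << j);
}
```

A request smaller than the chunk size rounds up to one chunk, and a request spanning several chunks succeeds only where that many free chunks are adjacent.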
-
Committed by Oded Gabbay
This patch adds new fields to the kfd_dev struct that are necessary for the new kfd gtt sa module. Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Ben Goz
This patch passes the correct queue type to pqm_create_queue() instead of a fixed KFD_QUEUE_TYPE_COMPUTE type. Signed-off-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Ben Goz
This patch adds a check to the create queue ioctl path, which identifies the SDMA queue type that is sent by userspace. Signed-off-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Ben Goz
This patch adds support for SDMA user-mode queues to the QCM - the Queue management system that manages queues-per-device and queues-per-process. v2: Remove calls to the interface function that initializes sdma engines. v3: Use the new names of some of the defines. Signed-off-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Ben Goz
This patch adds support for the SDMA mqd operations init_mqd_sdma, uninit_mqd_sdma, load_mqd_sdma, update_mqd_sdma, destroy_mqd_sdma and is_occupied_sdma. It also adds SDMA queue information to some private structures of amdkfd. v3: Use the new names of some of the defines. Signed-off-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Alexey Skidanov
This patch splits the current kfd_get_process_device_data() into two functions, one that specifically creates a pdd and another that just does a lookup. This is done to enhance the readability and maintainability of the code. Signed-off-by: Alexey Skidanov <Alexey.Skidanov@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
-
Committed by Alexey Skidanov
This patch adds the number of watch points to the node capabilities in the topology module. Signed-off-by: Alexey Skidanov <Alexey.Skidanov@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
-
- 08 Jan, 2015 2 commits
-
-
Committed by Oded Gabbay
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
-
Committed by Michel Dänzer
The work queue couldn't reliably prevent the SW ring buffer from overflowing, so dmesg was spammed by "kfd kfd: Interrupt ring overflow, dropping interrupt." messages when running e.g. the Atlantis Substance demo from https://wiki.unrealengine.com/Linux_Demos on Kaveri. Since the SW ring buffer doesn't actually do anything at this point, just remove it for now. When actual interrupt processing code is added to amdkfd, it should try to do things immediately and only defer to work queues when necessary. Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
-
- 07 Jan, 2015 3 commits
-
-
Committed by Oded Gabbay
This patch changes kfd_ioctl() to be very similar to drm_ioctl(). The patch defines an array, amdkfd_ioctls, which maps each IOCTL definition to its ioctl function. kfd_ioctl() uses that mapping to call the appropriate ioctl function through a function pointer. This patch also declares a new typedef for the ioctl function pointer. v2: Renamed KFD_COMMAND_(START|END) to AMDKFD_... Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Acked-by: Christian König <christian.koenig@amd.com>
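The table-driven dispatch described above can be sketched in user-space C as follows. Everything here is a hypothetical illustration rather than the amdkfd or DRM code: the command numbers, the handler names, the `ioctl_fn_t` signature and the `IOCTL_DEF` macro are assumptions made for the example.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical handler type: each handler receives the already-copied argument. */
typedef int ioctl_fn_t(void *data);

/* One table entry per command: number, handler and a printable name. */
struct ioctl_desc {
    unsigned int cmd;
    ioctl_fn_t *func;
    const char *name;
};

/* Hypothetical command numbers used only by this sketch. */
enum { CMD_GET_VERSION, CMD_CREATE_QUEUE };

static int do_get_version(void *data)
{
    *(int *)data = 1;   /* made-up version value */
    return 0;
}

static int do_create_queue(void *data)
{
    (void)data;         /* a real handler would validate and use the argument */
    return 0;
}

/* Place each entry at the index given by its command number. */
#define IOCTL_DEF(nr, fn) [nr] = { .cmd = nr, .func = fn, .name = #nr }

static const struct ioctl_desc ioctl_table[] = {
    IOCTL_DEF(CMD_GET_VERSION, do_get_version),
    IOCTL_DEF(CMD_CREATE_QUEUE, do_create_queue),
};

/* Look the command up in the table and call its handler through the pointer. */
int dispatch_ioctl(unsigned int nr, void *data)
{
    if (nr >= sizeof(ioctl_table) / sizeof(ioctl_table[0]))
        return -EINVAL;   /* command number outside the table */
    if (ioctl_table[nr].func == NULL)
        return -EINVAL;   /* hole in the table */
    return ioctl_table[nr].func(data);
}
```

Adding a command then means adding one handler and one table line, and the dispatcher itself stays unchanged.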
-
Committed by Oded Gabbay
This patch reformats the ioctl definitions in kfd_ioctl.h to be similar to the drm ioctls definition style. v2: Renamed KFD_COMMAND_(START|END) to AMDKFD_... Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Acked-by: Christian König <christian.koenig@amd.com>
-
Committed by Oded Gabbay
This patch moves the copy_to_user() and copy_from_user() calls from the different ioctl functions in amdkfd to the general kfd_ioctl() function, as this is common code for all ioctls. This was done following the example of drm_ioctl.c. Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>
-
- 05 Jan, 2015 1 commit
-
-
Committed by Ben Goz
This patch fixes a bug where deallocate_vmid() didn't actually unmap the VMID<-->PASID mapping (in the registers). That can cause undefined behavior. This bug only occurs in non-HWS mode. Signed-off-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>
-
- 29 Dec, 2014 1 commit
-
-
Committed by Sasha Levin
Commit "amdkfd: use sizeof(long) granularity for the pasid bitmask" calculated the number of longs it will need, but ended up allocating that number of bytes rather than longs. Fix that silly error and allocate the amount of data really required. Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
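The difference between "that number of bytes" and "that number of longs" is easy to see in a small user-space sketch. The helper names below (`longs_for_bits()`, `bitmask_bytes()`, `alloc_bitmask()`) are hypothetical, and `calloc()` stands in for the kernel allocator, so this only illustrates the arithmetic, not the driver code.

```c
#include <limits.h>
#include <stdlib.h>

/* Number of unsigned longs needed to hold nbits bits, rounded up. */
size_t longs_for_bits(size_t nbits)
{
    size_t bits_per_long = sizeof(unsigned long) * CHAR_BIT;

    return (nbits + bits_per_long - 1) / bits_per_long;
}

/* Size in bytes of a bitmask for nbits bits: count of longs times sizeof(long). */
size_t bitmask_bytes(size_t nbits)
{
    return longs_for_bits(nbits) * sizeof(unsigned long);
}

/*
 * Allocate a zeroed bitmask. The bug class described above corresponds to
 * passing longs_for_bits(nbits) as a byte count, i.e. forgetting the
 * multiplication by sizeof(unsigned long).
 */
unsigned long *alloc_bitmask(size_t nbits)
{
    return calloc(longs_for_bits(nbits), sizeof(unsigned long));
}
```

For a 40-bit mask on a platform with 64-bit longs, one long is needed: the correct allocation is 8 bytes, while the mistaken one would be 1 byte.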
-
- 05 Jan, 2015 1 commit
-
-
Committed by Ben Goz
This patch fixes a bug in DQM, where the MQD of a newly created compute queue is not loaded into an HQD slot. As a result, the CP never reads packets from this queue. This bug happens only in non-HWS (hardware scheduling) mode. In HWS mode, the CP is responsible for loading MQDs into HQD slots. Signed-off-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>
-
- 09 Dec, 2014 1 commit
-
-
Committed by Ben Goz
Signed-off-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
-
- 06 Dec, 2014 1 commit
-
-
Committed by Oded Gabbay
This patch checks whether the process that opens the /dev/kfd device is a 32-bit process. If so, it returns -EPERM and prints a warning message in dmesg. This is done to prevent 32-bit user processes from using amdkfd, and hence, HSA features. AMD's HSA userspace stack will also support only 64-bit processes on Linux. Reviewed-by: Alexey Skidanov <alexey.skidanov@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
-
- 05 Dec, 2014 1 commit
-
-
Committed by Oded Gabbay
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
-
- 04 Dec, 2014 1 commit
-
-
Committed by Oded Gabbay
The function acquire_packet_buffer() may return -ENOMEM. In that case, we should set *buffer_ptr to NULL, so that calling functions which check the *buffer_ptr value as a criterion for success will know that acquire_packet_buffer() failed. Reviewed-by: Alexey Skidanov <alexey.skidanov@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
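The convention described above, clearing the output pointer on the failure path, can be sketched in user-space C. The `ring` structure, its size and the function `acquire_buffer()` are hypothetical stand-ins and do not reproduce the real acquire_packet_buffer().

```c
#include <errno.h>
#include <stddef.h>

#define RING_DWORDS 16   /* hypothetical ring size for this sketch */

/* Hypothetical, simplified ring: space is handed out linearly, never reclaimed. */
struct ring {
    unsigned int buf[RING_DWORDS];
    size_t used;         /* dwords already handed out */
};

/*
 * Reserve 'dwords' entries. On failure the output pointer is set to NULL as
 * well as returning -ENOMEM, so callers that test only the pointer still
 * notice the failure instead of reusing a stale value.
 */
int acquire_buffer(struct ring *r, size_t dwords, unsigned int **buffer_ptr)
{
    if (dwords > RING_DWORDS - r->used) {
        *buffer_ptr = NULL;
        return -ENOMEM;
    }
    *buffer_ptr = &r->buf[r->used];
    r->used += dwords;
    return 0;
}
```

Without the NULL assignment, a caller that reuses the same pointer variable across calls would see the address from the previous successful call and could write into space it does not own.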
-
- 03 Dec, 2014 2 commits
-
-
Committed by Sasha Levin
SRCU callbacks run in atomic context, so we can't allocate using __GFP_WAIT. Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
-
Committed by Sasha Levin
All the bit operations (such as find_first_zero_bit()) read sizeof(long) bytes at a time. If we allocated fewer than sizeof(long) bytes for the bitmask we would be accessing invalid memory when working with the bitmask. Change the allocator to allocate sizeof(long) multiples for the bitmask. Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
-
- 02 Dec, 2014 1 commit
-
-
Committed by Ben Goz
The original code always sent 0 as the index number of the node. This patch fixes this bug by sending a variable which is incremented per node. Signed-off-by: Ben Goz <ben.goz@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
-
- 08 Dec, 2014 1 commit
-
-
Committed by Oded Gabbay
This patch fixes a device QCM bug, where the number of queues was not counted correctly for the update queue operation. The count was incorrect because the previous state of the queue was not taken into account. Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
-
- 02 Dec, 2014 1 commit
-
-
Committed by Ben Goz
This patch starts to add support for the VI APU in the KQ (kernel queue) module. Because most (more than 90%) of the KQ code is shared among AMD's APUs, we chose a design that performs most/all the code in the shared KQ file (kfd_kernel_queue.c). If there is H/W specific code to be executed, then it is written in an asic-specific extension function for that H/W. That asic-specific extension function is called from the shared function at the appropriate time. This requires that for every asic-specific extension function that is implemented in a specific ASIC, there will be an equivalent implementation in ALL ASICs, even if those implementations are just stubs. That way we achieve maintainability (with one copy of most of the code, bugs need fixing in only one location), readability (it is very clear what is shared code and what is done per ASIC) and extensibility (it is very easy to add new H/W specific files/functions). Signed-off-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
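The design described above, shared code that calls an asic-specific extension function with stubs where an ASIC has nothing to do, might look roughly like the following user-space sketch. All names (`kq`, `kq_init()`, `asic_init_kaveri()`, `asic_init_vi()`), the `eop_size` field and its value are hypothetical; which ASIC actually needs extra work is an assumption made only so the example has something to show.

```c
/* Hypothetical ASIC identifiers for this sketch. */
enum asic { ASIC_KAVERI, ASIC_VI };

/* Hypothetical queue object: shared fields plus one asic-specific hook. */
struct kq {
    enum asic asic;
    unsigned int ring_size;
    unsigned int eop_size;             /* left at 0 unless the ASIC hook sets it */
    void (*asic_init)(struct kq *q);   /* asic-specific extension function */
};

/* Stub: this ASIC needs no extra work, but the function must still exist. */
static void asic_init_kaveri(struct kq *q)
{
    (void)q;
}

/* Assumed asic-specific step, invented for the example. */
static void asic_init_vi(struct kq *q)
{
    q->eop_size = 4096;
}

/* Shared code: does the common setup, then calls the extension hook. */
int kq_init(struct kq *q, enum asic asic, unsigned int ring_size)
{
    q->asic = asic;
    q->ring_size = ring_size;
    q->eop_size = 0;

    switch (asic) {
    case ASIC_KAVERI:
        q->asic_init = asic_init_kaveri;
        break;
    case ASIC_VI:
        q->asic_init = asic_init_vi;
        break;
    default:
        return -1;   /* unknown ASIC */
    }

    q->asic_init(q);   /* always safe: every ASIC provides an implementation */
    return 0;
}
```

Because every ASIC supplies the hook, the shared function never has to test whether it exists before calling it.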
-
- 12 Jan, 2015 3 commits
-
-
Committed by Oded Gabbay
This patch reorganizes the kernel_queue structure. It takes all the function pointers out of the structure and puts them in a new structure, called kernel_queue_ops. Then, it puts an instance of that structure inside kernel_queue. This re-org is done to prepare the KQ module to support more than one AMD APU (not just Kaveri). Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
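As a rough illustration of that reorganization, the sketch below groups the function pointers into an ops structure that is embedded in the queue object. The names kernel_queue and kernel_queue_ops come from the commit message, but the members, the helper functions and `kernel_queue_setup()` are hypothetical and much simpler than the real structures.

```c
struct kernel_queue;

/* Function pointers gathered in one place (members are hypothetical). */
struct kernel_queue_ops {
    int (*acquire)(struct kernel_queue *kq, unsigned int dwords);
    void (*submit)(struct kernel_queue *kq);
};

/* The queue object embeds an instance of the ops structure. */
struct kernel_queue {
    struct kernel_queue_ops ops;
    unsigned int pending;     /* dwords reserved but not yet submitted */
    unsigned int submitted;   /* dwords handed to the hardware so far */
};

static int kq_acquire(struct kernel_queue *kq, unsigned int dwords)
{
    kq->pending += dwords;
    return 0;
}

static void kq_submit(struct kernel_queue *kq)
{
    kq->submitted += kq->pending;
    kq->pending = 0;
}

/* Fill in state and ops; another APU could install different functions here. */
void kernel_queue_setup(struct kernel_queue *kq)
{
    kq->pending = 0;
    kq->submitted = 0;
    kq->ops.acquire = kq_acquire;
    kq->ops.submit = kq_submit;
}
```

Callers go through `kq->ops`, so supporting another APU only requires installing a different set of functions at setup time.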
-
Committed by Ben Goz
This patch starts to add support for the VI APU in the DQM module. Because most (more than 90%) of the DQM code is shared among AMD's APUs, we chose a design that performs most/all the code in the shared DQM file (kfd_device_queue_manager.c). If there is H/W specific code to be executed, then it is written in an asic-specific extension function for that H/W. That asic-specific extension function is called from the shared function at the appropriate time. This requires that for every asic-specific extension function that is implemented in a specific ASIC, there will be an equivalent implementation in ALL ASICs, even if those implementations are just stubs. That way we achieve maintainability (with one copy of most of the code, bugs need fixing in only one location), readability (it is very clear what is shared code and what is done per ASIC) and extensibility (it is very easy to add new H/W specific files/functions). Signed-off-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Oded Gabbay
This patch reorganizes the device_queue_manager structure. It takes all the function pointers out of the structure and puts them in a new structure, called device_queue_manager_ops. Then, it puts an instance of that structure inside device_queue_manager. This re-org is done to prepare the DQM module to support more than one AMD APU (not just Kaveri). Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
-
- 13 Jan, 2015 1 commit
-
-
Committed by Oded Gabbay
Instead of triggering a BUG when trying to free a NULL GART sub-allocation object, just return 0 (success). This mirrors the behavior of kfree. Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
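A minimal user-space analogue of that behavior is shown below. The `mem_obj` type and `sa_free_obj()` are hypothetical names, and `free()` stands in for releasing the sub-allocation; the point is only that a NULL argument is accepted and reported as success, in the same way that `free(NULL)` and kfree(NULL) are no-ops.

```c
#include <stdlib.h>

/* Hypothetical sub-allocation descriptor. */
struct mem_obj {
    size_t start;
    size_t len;
};

/* Free a sub-allocation object; a NULL object is not an error. */
int sa_free_obj(struct mem_obj *obj)
{
    if (obj == NULL)
        return 0;   /* nothing to free: report success instead of aborting */

    free(obj);
    return 0;
}
```

This lets cleanup paths call the free function unconditionally, without checking first whether the allocation ever happened.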
-
- 26 Nov, 2014 1 commit
-
-
Committed by Dan Carpenter
This is dead code. We don't need to unbind here, we can just return directly. Reviewed-by: Oded Gabbay <oded.gabbay@amd.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
-