- 05 9月, 2019 11 次提交
-
-
由 Oded Gabbay 提交于
This patch re-factors the device_setup_cdev() function to make it more generic. It doesn't manipulate members of the driver's internal device structure but instead works only on the arguments that are sent to it. This is in preparation for using this function to create an additional char device per ASIC. Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com> Reviewed-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Oded Gabbay 提交于
This patch adds a new list to the driver's device structure. The list will keep the file private data structures that the driver creates when a user process opens the device. This change is needed because it is useless to try to count how many FD are open. Instead, track our own private data structure per open file and once it is released, remove it from the list. As long as the list is not empty, it means we have a user that can do something with our device. Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com> Reviewed-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Oded Gabbay 提交于
This patch renames the "user_ctx" field in the device structure to "compute_ctx". This better reflects the meaning of this context. In addition, we also check in the ctx_fini() that the debug mode should be disabled only if the context being destroyed is the compute context. This has no effect right now as we only have a single process and a single context, but this makes the code more ready for multiple process support. Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com> Reviewed-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Oded Gabbay 提交于
When the user query the dram usage of a context, show it the dram usage of its context, not the user context that is currently running on the device. This has no effect right now as we only have a single process and a single context, but this makes the code more ready for multiple process support. Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com> Reviewed-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Oded Gabbay 提交于
This patch calls the kill user process function after we rollback the in-flight CSs. This is because the user process can't be closed while there are open CSs. Therefore, there is no point of sending it a SIGKILL before we do the rollback CS part. Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com> Reviewed-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Oded Gabbay 提交于
This patch adds a field to the context's structure that will hold a unique handle for the context. This will be needed when the user will create the context. Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com> Reviewed-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Chuhong Yuan 提交于
Instead of using to_pci_dev + pci_get_drvdata, use dev_get_drvdata to make code simpler. Signed-off-by: NChuhong Yuan <hslester96@gmail.com> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Oded Gabbay 提交于
The ability of setting power management properties by the system administrator (through sysfs properties) is only relevant for the GOYA ASIC. Therefore, move the relevant sysfs properties to the GOYA sysfs specific file, to make the properties appear in sysfs only for GOYA cards. Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com> Reviewed-by: NOmer Shpigelman <oshpigelman@habana.ai>
-
由 Oded Gabbay 提交于
In the driver timeout functions, we give the simulator a factor of 10 in the timeout. This was necessary when the requested timeout is small but if it was a few seconds, this can result in a very large timeout which is unnecessary. This patch caps the maximum timeout of the simulator to 10 seconds, which is our largest timeout in the code. That is more then enough for anything the simulator is doing. Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com> Reviewed-by: NOmer Shpigelman <oshpigelman@habana.ai>
-
由 Oded Gabbay 提交于
When rejecting CS because of too many in-flight CS, print a debug message about it as it useful to know when the user is debugging (it indicates a back-pressure from the driver as the device is not fast enough to consume the CS) Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com> Reviewed-by: NOmer Shpigelman <oshpigelman@habana.ai>
-
由 Oded Gabbay 提交于
This property has attempted to show the number of open file descriptors on the device. This was a stupid and futile attempt so remove this property completely. Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
- 12 8月, 2019 6 次提交
-
-
由 Ben Segal 提交于
When unmasking IRQs inside the ASIC, the driver passes an array of all the IRQ to unmask. The ASIC's CPU is working in LE so when running in a BE host, the driver needs to do the proper endianness swapping when preparing this array. In addition, this patch also fixes the endianness of a couple of kernel log debug messages that print values of packets Signed-off-by: NBen Segal <bpsegal20@gmail.com> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Oded Gabbay 提交于
The PQs of internal H/W queues (QMANs) can be located in different memory areas for different ASICs. Therefore, when writing PQEs, we need to use the correct function according to the location of the PQ. e.g. if the PQ is located in the device's memory (SRAM or DRAM), we need to use memcpy_toio() so it would work in architectures that have separate address ranges for IO memory. This patch makes the code that writes the PQE to be ASIC-specific so we can handle this properly per ASIC. Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com> Tested-by: NBen Segal <bpsegal20@gmail.com>
-
由 Ben Segal 提交于
This patch fix the CQ irq handler to work in hosts with BE architecture. It adds the correct endian-swapping macros around the relevant memory accesses. Signed-off-by: NBen Segal <bpsegal20@gmail.com> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Ben Segal 提交于
Packets that arrive from the user and need to be parsed by the driver are assumed to be in LE format. This patch fix all the places where the code handles these packets and use the correct endianness macros to handle them, as the driver handles the packets in CPU format (LE or BE depending on the arch). Signed-off-by: NBen Segal <bpsegal20@gmail.com> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Tomer Tayar 提交于
The patch fix the DRAM usage accounting by adding a missing update of the DRAM memory consumption, when a context is being torn down without an organized release of the allocated memory. Signed-off-by: NTomer Tayar <ttayar@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Tomer Tayar 提交于
In case kernel context init fails during device initialization, both hl_ctx_put() and kfree() are called, ending with a double free of the kernel context. Calling kfree() is needed only when a failure happens between the allocation of the kernel context and its initialization, so move it to there and remove it from the error flow. Signed-off-by: NTomer Tayar <ttayar@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
- 29 7月, 2019 2 次提交
-
-
由 Ben Segal 提交于
This patch fix a bug in the host memory polling macro. The bug is that the memory being polled can be written by the device, which always writes it in LE. However, if the host is running Linux in BE mode, we need to convert the value that was written by the device before matching it to the required value that the caller has given to the macro. Signed-off-by: NBen Segal <bpsegal20@gmail.com> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Ben Segal 提交于
writeX macros might perform byte-swapping in BE architectures. As our F/W is in LE format, we need to make sure no byte-swapping will occur. There is a standard kernel function (called memcpy_toio) for copying data to I/O area which is used in a lot of drivers to download F/W to PCIe adapters. That function also makes sure the data is copied "as-is", without byte-swapping. This patch use that function to copy the F/W to the GOYA ASIC instead of writeX macros. Signed-off-by: NBen Segal <bpsegal20@gmail.com> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
- 01 7月, 2019 3 次提交
-
-
由 Tomer Tayar 提交于
The information which is currently provided as a response to the "HL_INFO_HW_IDLE" IOCTL is merely a general boolean value. This patch extends it and provides also a bitmask that indicates which of the device engines are busy. Signed-off-by: NTomer Tayar <ttayar@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Tomer Tayar 提交于
Command submissions sent to the device are composed of command buffers which are targeted to different device engines, like DMA and compute entities. When a command submission gets stuck, knowing in which engine the stuck is, is crucial for debugging. This patch adds a debugfs node that exports this information, by displaying the engines' various registers that assemble their idle/busy status. The information retrieval is based on the is_device_idle ASIC function. The printout in this function, of the first detected busy engine, is removed because it becomes redundant in the presence of the more elaborated info of the new debugfs node. Signed-off-by: NTomer Tayar <ttayar@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Tomer Tayar 提交于
The patch updates the device idle check: - Add reading the DMA core status register, because it is possible that a QMAN has finished its work but the DMA itself is still running. - Remove the MME shadow status check, as the MME ARCH status register includes the status of all MME shadows. Signed-off-by: NTomer Tayar <ttayar@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
- 27 6月, 2019 1 次提交
-
-
由 Oded Gabbay 提交于
VRHOT event from the F/W indicates the device has reached a temperature of 100 Celsius degrees. In this case, the driver should only print this information to the kernel log. The device will shutdown itself automatically when reaching 125 degrees. Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
- 08 7月, 2019 1 次提交
-
-
由 Arnd Bergmann 提交于
dma_addr_t might be different sizes depending on the configuration, so we cannot print it as %llx: drivers/misc/habanalabs/goya/goya.c: In function 'goya_sw_init': drivers/misc/habanalabs/goya/goya.c:698:21: error: format '%llx' expects argument of type 'long long unsigned int', but argument 4 has type 'dma_addr_t' {aka 'unsigned int'} [-Werror=format=] Use the special %pad format string. This requires passing the argument by reference. Fixes: 2a51558c ("habanalabs: remove DMA mask hack for Goya") Signed-off-by: NArnd Bergmann <arnd@arndb.de> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
- 20 6月, 2019 1 次提交
-
-
由 Arnd Bergmann 提交于
We cannot cast a 64-bit integer to a pointer on 32-bit architectures without a warning: drivers/misc/habanalabs/habanalabs_ioctl.c: In function 'debug_coresight': drivers/misc/habanalabs/habanalabs_ioctl.c:143:23: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast] input = memdup_user((const void __user *) args->input_ptr, Use the macro that was defined for this purpose. Fixes: 315bc055 ("habanalabs: add new IOCTL for debug, tracing and profiling") Signed-off-by: NArnd Bergmann <arnd@arndb.de> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
- 16 6月, 2019 1 次提交
-
-
由 Tomer Tayar 提交于
Allows using the addr/data32 debugfs nodes to access a device VA of a host mapped memory when the IOMMU is disabled. Due to the possible large amount of a user host mapped memory, the driver doesn't maintain a database with the host addresses per device VA. When the IOMMU is disabled, this missing info is being overcome by simply using phys_to_virt(). However, this is not useful when the IOMMU is enabled, and thus the enforced limitation. Signed-off-by: NTomer Tayar <ttayar@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
- 04 6月, 2019 1 次提交
-
-
由 Tomer Tayar 提交于
The trace buffer address is 40 bits wide. The end of the buffer is set in the RWP register (lower 32 bits), and in the RWPHI register (upper 8 bits). Currently only the lower 32 bits are read, and this patch fixes it and concatenates the upper 8 bits to the output address. Signed-off-by: NTomer Tayar <ttayar@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
- 03 6月, 2019 1 次提交
-
-
由 Tomer Tayar 提交于
The debugfs interface for accessing DRAM virtual addresses currently uses the 12 LSBs of a virtual address as an offset. However, it should use the 20 LSBs in case the device MMU page size is 2MB instead of 4KB. This patch fixes the offset calculation to be based on the page size. Signed-off-by: NTomer Tayar <ttayar@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
- 31 5月, 2019 1 次提交
-
-
由 Oded Gabbay 提交于
This patch checks if an MMU mapping is erroneous in that the physical address that is being mapped is NOT divisible by the page size. If that thing happens, then the H/W will issue a transaction which will be translated to a wrong address, because part of the address will not be taken (the remainder of address/page size). Because the physical address is being handled by the driver, a WARN is suitable here as it implies a bug in the driver code itself and not a user bug. Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
- 29 5月, 2019 6 次提交
-
-
由 Oded Gabbay 提交于
This patch removes the non-standard DMA mask setting for Goya. Now that the device CPU goes through the MMU, we are not limited to allocating the CPU accessible memory area in the address space of under 39 bits. Therefore, we don't need to set the DMA masking twice during initialization, a practice that is not working on POWER architecture. The patch sets the DMA mask to 48 bits once during the initialization. The address of the CPU accessible memory area is configured to the MMU and the matching VA is given to the device CPU. Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Oded Gabbay 提交于
This patch configures the Goya CPU to actually go through the MMU for translation. The configuration is done after the configuration of the relevant MMU mappings. Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Oded Gabbay 提交于
This patch adds the necessary MMU mappings for the Goya CPU to access the device DRAM and the host memory. The first 256MB of the device DRAM is being mapped. That's where the F/W is running. The 2MB area located on the host memory for the purpose of communication between the driver and the device CPU is also being mapped. Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Oded Gabbay 提交于
This patch initializes the MMU structures for the kernel context. This is needed before we can configure mappings for the kernel context. Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Oded Gabbay 提交于
This patch initializes the MMU S/W structures before the VM S/W structures, instead of doing that as part of the VM S/W initialization. This is done because we need to configure some MMU mappings for the kernel context, before the VM is initialized. The VM initialization can't be moved earlier because it depends on the size of the DRAM, which is retrieved from the device CPU. Communication with the device CPU will require the MMU mappings to be configured and hence the de-coupling. Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Oded Gabbay 提交于
This patch changes the order of H/W IP initializations. The MMU needs to be initialized before the device CPU queues, because the CPU will go through the ASIC MMU in order to reach the host memory (where the queues are located). Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
- 06 6月, 2019 1 次提交
-
-
由 Oded Gabbay 提交于
This patch changes the print of an error message about mis-configuration of the debug infrastructure to be rate-limited, to prevent flooding of kernel log, as these configuration requests can come at a high rate. Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
- 04 6月, 2019 1 次提交
-
-
由 Oded Gabbay 提交于
This patch removes two code sections in the common code that contain code which is only relevant for simulator support (which is not upstreamed). This removal saves the need to update this code upstream, which is not needed anyway. Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
- 30 5月, 2019 3 次提交
-
-
由 Dalit Ben Zoor 提交于
unsecured registers can be changed by the user, and hence should be restored to their default values in context switch Signed-off-by: NDalit Ben Zoor <dbenzoor@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Dalit Ben Zoor 提交于
On context switch we need to ensure that each user is not be affected by other user, so we need to clear sync objects and monitors in context switch instead of in restore_phase_topology function. Signed-off-by: NDalit Ben Zoor <dbenzoor@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Dalit Ben Zoor 提交于
Set protection bits for some tpc registers that should to be secured. Signed-off-by: NDalit Ben Zoor <dbenzoor@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-