- 30 11月, 2020 1 次提交
-
-
由 Oded Gabbay 提交于
No need to print when the driver starts to initialize the H/W. Drivers should be silent when everything is OK. Signed-off-by: NOded Gabbay <ogabbay@kernel.org>
-
- 04 11月, 2020 1 次提交
-
-
由 Arnd Bergmann 提交于
All throughout the driver, normal kernel pointers are stored as 'u64' struct members, which is kind of silly and requires casting through a uintptr_t to void* every time they are used. There is one line that missed the intermediate uintptr_t case, which leads to a compiler warning: drivers/misc/habanalabs/common/command_buffer.c: In function 'hl_cb_mmap': drivers/misc/habanalabs/common/command_buffer.c:512:44: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] 512 | rc = hdev->asic_funcs->cb_mmap(hdev, vma, (void *) cb->kernel_address, Rather than adding one more cast, just fix the type and remove all the other casts. Fixes: 0db57535 ("habanalabs: make use of dma_mmap_coherent") Signed-off-by: NArnd Bergmann <arnd@arndb.de> Acked-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <ogabbay@kernel.org>
-
- 22 9月, 2020 13 次提交
-
-
由 Oded Gabbay 提交于
Future F/W versions will have enhanced security measures and the driver won't be able to do certain configurations that it always did and those configurations will be done by the firmware. We use the firmware's preboot version to determine whether security measures are enabled or not. Because we need this very early in our code, the read of the preboot version is moved to the earliest possible place, right after the device's PCI initialization. Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Tomer Tayar 提交于
There are cases in which the device should access the host memory of a CB through the device MMU, and thus this memory should be mapped. The patch adds a flag to the CB IOCTL, in which a user can ask the driver to perform the mapping when creating a CB. The mapping is allowed only if a dedicated VA range was allocated for the specific ASIC. Signed-off-by: NTomer Tayar <ttayar@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Tomer Tayar 提交于
Future changes require using a context while handling a command buffer, and thus need to save the context in the command buffer object. Signed-off-by: NTomer Tayar <ttayar@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Moti Haimovski 提交于
This commit adds the number of HOPs supported by the device to the device MMU properties. Signed-off-by: NMoti Haimovski <mhaimovski@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Oded Gabbay 提交于
For consistency with GAUDI code, add check of the relevant flag in the device structure before resetting the GOYA device in case of firmware event. Reviewed-by: NTomer Tayar <ttayar@habana.ai> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Oded Gabbay 提交于
Old function pointer that was left when the call to this function pointer was removed. Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Oded Gabbay 提交于
There were a couple of comments where the name ArmCP was still used. Rename it to CPU-CP. In addition, rename ArmCP or ARM in log messages to "device CPU". Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Hillf Danton 提交于
Add dma_mmap_coherent() for goya and gaudi to match their use of dma_alloc_coherent(), see the Link tag for why. Link: https://lore.kernel.org/lkml/20200609091727.GA23814@lst.de/ Cc: Christoph Hellwig <hch@lst.de> Cc: Zhang Li <li.zhang@bitmain.com> Cc: Ding Z Nan <oshack@hotmail.com> Signed-off-by: NHillf Danton <hdanton@sina.com> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Oded Gabbay 提交于
When shifting a boolean variable by more than 31 bits and putting the result into a u64 variable, we need to cast the boolean into unsigned 64 bits to prevent possible overflow. Reported-by: Nkernel test robot <lkp@intel.com> Reported-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Oded Gabbay 提交于
ArmCP mandates that the device CPU is always an ARM processor, which might be wrong in the future. Most of this change is an internal renaming of variables, functions and defines but there are two entries in sysfs which have armcp in their names. Add identical cpucp entries but don't remove yet the armcp entries. Those will be deprecated next year. Add the documentation about it in sysfs documentation. Signed-off-by: NMoti Haimovski <mhaimovski@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 farah kassabri 提交于
change busy engines bitmask to 64 bits in order to represent more engines, needed for future ASIC support. Signed-off-by: Nfarah kassabri <fkassabri@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Dotan Barak 提交于
If there is a failure during the testing of a queue, to ease up debugging - print the queue id. Signed-off-by: NDotan Barak <dbarak@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Ofir Bitton 提交于
Update firmware header with new API for getting pcie info such as tx/rx throughput and replay counter. These counters are needed by customers for monitor and maintenance of multiple devices. Add new opcodes to the INFO ioctl to retrieve these counters. Signed-off-by: NOfir Bitton <obitton@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
- 22 8月, 2020 1 次提交
-
-
由 Ofir Bitton 提交于
During command buffer parsing, driver extracts packet id from user buffer. Driver must validate this packet id, since it is being used in order to extract information from internal structures. Signed-off-by: NOfir Bitton <obitton@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
- 29 7月, 2020 2 次提交
-
-
由 kernel test robot 提交于
Signed-off-by: Nkernel test robot <lkp@intel.com> Link: https://lore.kernel.org/r/20200729000313.GA14680@e442e3f624c4Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Greg Kroah-Hartman 提交于
There's no need to try to be cute with the include file locations in the Makefile, so just specify exactly where the files are. Bonus is this fixes the problem of building with O= as well as trying to just build the subdirectory alone. Reported-by: NStephen Rothwell <sfr@canb.auug.org.au> Cc: Oded Gabbay <oded.gabbay@gmail.com> Cc: Omer Shpigelman <oshpigelman@habana.ai> Cc: Tomer Tayar <ttayar@habana.ai> Cc: Moti Haimovski <mhaimovski@habana.ai> Cc: Ofir Bitton <obitton@habana.ai> Cc: Ben Segal <bpsegal20@gmail.com> Cc: Christine Gharzuzi <cgharzuzi@habana.ai> Cc: Pawel Piskorski <ppiskorski@habana.ai> Link: https://lore.kernel.org/r/20200728171851.55842-1-gregkh@linuxfoundation.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 25 7月, 2020 7 次提交
-
-
由 Ofir Bitton 提交于
Create a device MMU-mapped internal command buffer pool, in order to allow the driver to allocate CBs for the signal/wait operations that are fetched by the queues when they are configured with the user's address space ID. We must pre-map this internal pool due to performance issues. This pool is needed for future ASIC support and it is currently unused in GOYA and GAUDI. Signed-off-by: NOfir Bitton <obitton@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Oded Gabbay 提交于
Currently the driver halts the device CPU in the halt engines function, which halts all the engines of the ASIC. The problem is that if later on we stop the reset process (due to inability to clean memory mappings in time), the CPU will remain in halt mode. This creates many issues, such as thermal/power control and FLR handling. Therefore, move the halting of the device CPU to the very end of the reset process, just before writing to the registers to initiate the reset. In addition, the driver now needs to send a message to the device F/W to disable it from sending interrupts to the host machine because during halt engines function the driver disables the MSI/MSI-X interrupts. Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com> Reviewed-by: NTomer Tayar <ttayar@habana.ai>
-
由 Ofir Bitton 提交于
Currently the amount of maximum queues is statically configured. Using a static value is causing redundunt cycles when traversing all queues and consumes more memory than actually needed. In this patch we configure each asic with the exact number of queues needed. Signed-off-by: NOfir Bitton <obitton@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Ofir Bitton 提交于
Divide iATU initialization into inbound/outbound methods. We must separate it in order to enable different match mode per PCIe region. In addition, added support for PCI address match mode. Signed-off-by: NOfir Bitton <obitton@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Adam Aharon 提交于
The profiler needs to know the PLL values for correctly showing the profiling data. Because our firmware can use different PLL configurations, we need to read the PLL values from the ASIC to pass them to the profiler. Signed-off-by: NAdam Aharon <aaharon@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Ofir Bitton 提交于
Currently sync stream is limited only for external queues. We want to remove this constraint by adding a new queue property dedicated for sync stream. In addition we move the initialization and reset methods to the common code since we can re-use them with slight changes. Signed-off-by: NOfir Bitton <obitton@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Ofir Bitton 提交于
Training schemes requires much more concurrent command submissions than inference does. In addition, training command submissions can be completed in a non serialized manner. Hence, we add support in which each ASIC will be able to configure the amount of concurrent pending command submissions, rather than use a predefined amount. This change will enhance performance by allowing the user to add more concurrent work without waiting for the previous work to be completed. Signed-off-by: NOfir Bitton <obitton@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
- 11 7月, 2020 2 次提交
-
-
由 Oded Gabbay 提交于
We see that sometimes the CPU in GOYA and GAUDI is occupied by the power/thermal loop and can't answer requests from the driver fast enough. Therefore, to avoid false notifications on timeouts, increase the timeout to 4 seconds on each message sent to the device CPU. Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com> Reviewed-by: NTomer Tayar <ttayar@habana.ai>
-
由 Oded Gabbay 提交于
For debugging purposes, we need to allow the root user better control of the clock gating feature of the DMA and compute engines. Therefore, change the clock gating debugfs interface to be bitmask instead of true/false. Each bit represents a different engine, according to gaudi_engine_id enum. See debugfs documentation for more details. Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com> Reviewed-by: NOmer Shpigelman <oshpigelman@habana.ai>
-
- 01 7月, 2020 1 次提交
-
-
由 Lee Jones 提交于
Seeing as 'addr' is unsigned, it would be impossible for the assigned value to be anything other than zero or positive. Squashes the following W=1 warnings: drivers/misc/habanalabs/goya/goya.c: In function ‘goya_debugfs_read32’: drivers/misc/habanalabs/goya/goya.c:3945:19: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits] 3945 | } else if ((addr >= DRAM_PHYS_BASE) && | ^~ drivers/misc/habanalabs/goya/goya.c: In function ‘goya_debugfs_write32’: drivers/misc/habanalabs/goya/goya.c:4002:19: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits] 4002 | } else if ((addr >= DRAM_PHYS_BASE) && | ^~ drivers/misc/habanalabs/goya/goya.c: In function ‘goya_debugfs_read64’: drivers/misc/habanalabs/goya/goya.c:4047:19: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits] 4047 | } else if ((addr >= DRAM_PHYS_BASE) && | ^~ drivers/misc/habanalabs/goya/goya.c: In function ‘goya_debugfs_write64’: drivers/misc/habanalabs/goya/goya.c:4091:19: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits] 4091 | } else if ((addr >= DRAM_PHYS_BASE) && | ^~ drivers/misc/habanalabs/pci.c:328: warning: Excess function parameter 'dma_mask' description in 'hl_pci_set_dma_mask' drivers/misc/habanalabs/goya/goya_coresight.c: In function ‘goya_debug_coresight’: drivers/misc/habanalabs/goya/goya_coresight.c:643:6: warning: variable ‘val’ set but not used [-Wunused-but-set-variable] 643 | u32 val; | ^~~ Cc: Oded Gabbay <oded.gabbay@gmail.com> Cc: Tomer Tayar <ttayar@habana.ai> Signed-off-by: NLee Jones <lee.jones@linaro.org> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Link: https://lore.kernel.org/r/20200701085853.164358-7-lee.jones@linaro.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 25 5月, 2020 2 次提交
-
-
由 Omer Shpigelman 提交于
MMU cache invalidation timeout indicates that the device is unstable and therefore unusable. Hence in such case do hard reset and return an error to the user if was called from ioctl. In addition, change the print to error level and rephrase its text. Signed-off-by: NOmer Shpigelman <oshpigelman@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Oded Gabbay 提交于
GAUDI does not support soft-reset as it leaves the NIC ports in an awkward state, where their QMANs were reset but the NIC itself is still working. In addition, there is not much sense in doing soft-reset when training is done on multiple GAUDIs. Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com> Reviewed-by: NTomer Tayar <ttayar@habana.ai>
-
- 19 5月, 2020 6 次提交
-
-
由 Rachel Stahl 提交于
The patch_cb_size is not updated for Wreg32 in its validate function, so updated in goya_validate_cb. Signed-off-by: NRachel Stahl <rstahl@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Oded Gabbay 提交于
In Gaudi there is a feature of clock gating certain engines. Therefore, add this property to the device structure. In addition, due to a limitation of this feature, the driver needs to dynamically enable or disable this feature during run-time. Therefore, add ASIC interface functions to enable/disable this function from the common code. Moreover, this feature must be turned off when the user wishes to debug the ASIC by reading/writing registers and/or memory through the driver's debugfs. Therefore, add an option to enable/disable clock gating via the debugfs interface. Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Omer Shpigelman 提交于
Coresight is not supported on simulator, therefore add a boolean for checking that (currently used by un-upstreamed code). Signed-off-by: NOmer Shpigelman <oshpigelman@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Omer Shpigelman 提交于
This feature requires handling h/w resources which are a bit different from one ASIC to the other. Therefore, we need to define a set of interfaces the ASIC code provides to the common code to signal, wait, reset sync object and to reset and init a queue. As this feature is not supported in Goya, provide an empty implementation of those functions. Signed-off-by: NOmer Shpigelman <oshpigelman@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Ofir Bitton 提交于
Load CPU device boot loader during driver boot time in order to avoid flash write for every boot loader update. To preserve backward-compatibility, skip the device boot load if the device doesn't request it. Signed-off-by: NOfir Bitton <obitton@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Tomer Tayar 提交于
Add a new opcode to the INFO IOCTL that retrieves the device time alongside the host time, to allow a user application that want to measure device time together with host time (such as a profiler) to synchronize these times. Signed-off-by: NTomer Tayar <ttayar@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
- 17 5月, 2020 4 次提交
-
-
由 Oded Gabbay 提交于
When we have DMA QMAN with multiple streams, we need to know whether the command buffer contains at least one DMA packet in order to configure the barriers correctly when adding the 2xMSG_PROT at the end of the JOB. If there is no DMA packet, then there is no need to put engine barrier. This is relevant only for GAUDI as GOYA doesn't have streams so the engine can't be busy by another stream. Reviewed-by: NTomer Tayar <ttayar@habana.ai> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Oded Gabbay 提交于
Retrieve from the firmware the DMA mask value we need to set according to the device's PCI controller configuration. This is needed when working on POWER9 machines, as the device's PCI controller is configured in a different way in those machines. Reviewed-by: NTomer Tayar <ttayar@habana.ai> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Oded Gabbay 提交于
Move the code of device CPU initialization from being ASIC-Dependent to common code. In addition, add support for the new error reporting feature of the firmware boot code. Reviewed-by: NOmer Shpigelman <oshpigelman@habana.ai> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Omer Shpigelman 提交于
We want to remove the following restrictions/assumptions in our driver: 1. The H/W queue index is also the completion queue index. 2. The H/W queue index is also the IRQ number of the completion queue. 3. All queues of the same type have consecutive indexes. Therefore we add the support for H/W queues of the same type with nonconsecutive indexes and completion queue index and IRQ number different than the H/W queue index. Signed-off-by: NOmer Shpigelman <oshpigelman@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-