- 28 1月, 2021 4 次提交
-
-
由 Omer Shpigelman 提交于
For consistency, modify all memory ioctl functions to get the ioctl arguments structure rather than the arguments themselves. Signed-off-by: NOmer Shpigelman <oshpigelman@habana.ai> Reviewed-by: NOded Gabbay <ogabbay@kernel.org> Signed-off-by: NOded Gabbay <ogabbay@kernel.org>
-
由 Omer Shpigelman 提交于
Change all memory functions documentation according to kernel doc format. Signed-off-by: NOmer Shpigelman <oshpigelman@habana.ai> Reviewed-by: NOded Gabbay <ogabbay@kernel.org> Signed-off-by: NOded Gabbay <ogabbay@kernel.org>
-
由 Ofir Bitton 提交于
In order for reserving VA ranges for kernel memory, we need to allow the VM module to be initiated with kernel context. Signed-off-by: NOfir Bitton <obitton@habana.ai> Reviewed-by: NOded Gabbay <ogabbay@kernel.org> Signed-off-by: NOded Gabbay <ogabbay@kernel.org>
-
由 Ohad Sharabi 提交于
remove mmu_cache_lock as it protects a section which is already protected by mmu_lock. in addition, wrap mmu cache invalidate calls in hl_vm_ctx_fini with mmu_lock. Signed-off-by: NOhad Sharabi <osharabi@habana.ai> Reviewed-by: NOded Gabbay <ogabbay@kernel.org> Signed-off-by: NOded Gabbay <ogabbay@kernel.org>
-
- 12 1月, 2021 1 次提交
-
-
由 Oded Gabbay 提交于
When using Deep learning framework such as tensorflow or pytorch, there are tens of thousands of host memory mappings. When the user frees all those mappings at the same time, the process of unmapping and unpinning them can take a long time, which may cause a soft lockup bug. To prevent this, we need to free the core to do other things during the unmapping process. For now, we chose to do it every 32K unmappings (each unmap is a single 4K page). Signed-off-by: NOded Gabbay <ogabbay@kernel.org>
-
- 30 11月, 2020 12 次提交
-
-
由 Ofir Bitton 提交于
If huge range is not valid, driver uses the host range also for huge page allocations, but driver never frees its allocation. This introduces a memory leak every time a user closes its context. Signed-off-by: NOfir Bitton <obitton@habana.ai> Reviewed-by: NOded Gabbay <ogabbay@kernel.org> Signed-off-by: NOded Gabbay <ogabbay@kernel.org>
-
由 Ofir Bitton 提交于
We introduce a new wrapper which allows us to mmu map any size to any host va_range available. In addition we remove duplicated code from various places in driver and using this new wrapper instead. This wrapper supports mapping only contiguous physical memory blocks and will be used for mappings that are done to the driver ASID. Signed-off-by: NOfir Bitton <obitton@habana.ai> Reviewed-by: NOded Gabbay <ogabbay@kernel.org> Signed-off-by: NOded Gabbay <ogabbay@kernel.org>
-
由 Ofir Bitton 提交于
Add support for reserving va block with alignment different than page size. This is a pre-requisite for allocations needed in future ASICs Signed-off-by: NOfir Bitton <obitton@habana.ai> Reviewed-by: NOded Gabbay <ogabbay@kernel.org> Signed-off-by: NOded Gabbay <ogabbay@kernel.org>
-
由 Oded Gabbay 提交于
Whether an ASIC has MMU towards its DRAM is an ASIC property, so move it to the asic fixed properties structure. Signed-off-by: NOded Gabbay <ogabbay@kernel.org>
-
由 Ofir Bitton 提交于
Instead of using a dedicated va range for each internal pool, we introduce a new way for reserving a va block from an existing va range. This is a more generic way of reserving va blocks for future use. Signed-off-by: NOfir Bitton <obitton@habana.ai> Reviewed-by: NOded Gabbay <ogabbay@kernel.org> Signed-off-by: NOded Gabbay <ogabbay@kernel.org>
-
由 Ofir Bitton 提交于
Use an array of va_ranges instead of keeping each va_range separately, we do this for better readability and in order to support access to a specific range in a much elegant manner. Signed-off-by: NOfir Bitton <obitton@habana.ai> Reviewed-by: NOded Gabbay <ogabbay@kernel.org> Signed-off-by: NOded Gabbay <ogabbay@kernel.org>
-
由 Ofir Bitton 提交于
The new state indicates that device should be reset in order to re-gain funcionality. This unique state can occur if reset_on_lockup is disabled and an actual lockup has occurred. Signed-off-by: NOfir Bitton <obitton@habana.ai> Reviewed-by: NOded Gabbay <ogabbay@kernel.org> Signed-off-by: NOded Gabbay <ogabbay@kernel.org>
-
由 Tomer Tayar 提交于
Several header files are repeatedly included in many files. Move these files to habanalabs.h which is included by all. Signed-off-by: NTomer Tayar <ttayar@habana.ai> Reviewed-by: NOded Gabbay <ogabbay@kernel.org> Signed-off-by: NOded Gabbay <ogabbay@kernel.org>
-
由 farah kassabri 提交于
In cases of multi-tenants, administrators may want to prevent data leakage between users running on the same device one after another. To do that the driver can scrub the internal memory (both SRAM and DRAM) after a user finish to use the memory. Because in GAUDI the driver allows only one application to use the device at a time, it can scrub the memory when user app close FD. In future devices where we have MMU on the DRAM, we can scrub the DRAM memory with a finer granularity (page granularity) when the user allocates the memory. This feature is not supported in Goya. To allow users that want to debug their applications, we add a kernel module parameter to load the driver with this feature disabled. Signed-off-by: Nfarah kassabri <fkassabri@habana.ai> Reviewed-by: NOded Gabbay <ogabbay@kernel.org> Signed-off-by: NOded Gabbay <ogabbay@kernel.org>
-
由 Oded Gabbay 提交于
In GAUDI we don't have an MMU towards the HBM device memory. Therefore, the user access that memory directly through physical address (via the different engines) without the need to go through the driver to allocate/free memory on the HBM. For system monitoring purposes, the driver will keep track of the HBM usage. This can be done as long as the user accurately reports the allocations and releases of HBM memory, through the existing MEMORY IOCTL uapi. Signed-off-by: NOded Gabbay <ogabbay@kernel.org>
-
由 Oded Gabbay 提交于
In case we are running without MMU enabled (debug mode), no need to initialize the VM module in the driver. Signed-off-by: NOded Gabbay <ogabbay@kernel.org>
-
由 Ofir Bitton 提交于
If huge range is not valid, driver uses the host range also for huge page allocations, but driver never frees its allocation. This introduces a memory leak every time a user closes its context. Signed-off-by: NOfir Bitton <obitton@habana.ai> Reviewed-by: NOded Gabbay <ogabbay@kernel.org> Signed-off-by: NOded Gabbay <ogabbay@kernel.org>
-
- 25 9月, 2020 1 次提交
-
-
由 Oded Gabbay 提交于
We don't try to allocate huge pages here so remove the huge word. Reviewed-by: NTomer Tayar <ttayar@habana.ai> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
- 22 9月, 2020 1 次提交
-
-
由 Omer Shpigelman 提交于
Change the acquiring of a device virtual address for mapping by using the smallest possible alignment, rather than the biggest, depending on the page size used by the user for allocating the memory. This will lower the virtual space memory consumption. Signed-off-by: NOmer Shpigelman <oshpigelman@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
- 22 8月, 2020 1 次提交
-
-
由 Ofir Bitton 提交于
vmalloc can return different return code than NULL and a valid pointer. We must validate it in order to dereference a non valid pointer. Signed-off-by: NOfir Bitton <obitton@habana.ai> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
- 29 7月, 2020 1 次提交
-
-
由 Greg Kroah-Hartman 提交于
There's no need to try to be cute with the include file locations in the Makefile, so just specify exactly where the files are. Bonus is this fixes the problem of building with O= as well as trying to just build the subdirectory alone. Reported-by: NStephen Rothwell <sfr@canb.auug.org.au> Cc: Oded Gabbay <oded.gabbay@gmail.com> Cc: Omer Shpigelman <oshpigelman@habana.ai> Cc: Tomer Tayar <ttayar@habana.ai> Cc: Moti Haimovski <mhaimovski@habana.ai> Cc: Ofir Bitton <obitton@habana.ai> Cc: Ben Segal <bpsegal20@gmail.com> Cc: Christine Gharzuzi <cgharzuzi@habana.ai> Cc: Pawel Piskorski <ppiskorski@habana.ai> Link: https://lore.kernel.org/r/20200728171851.55842-1-gregkh@linuxfoundation.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 25 7月, 2020 2 次提交
-
-
由 Oded Gabbay 提交于
For internal needs of our CI we need to move all the common code into a common folder instead of putting them in the root folder of the driver. Same applies to the common header files under include/ Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com> Reviewed-by: NOmer Shpigelman <oshpigelman@habana.ai>
-
由 Oded Gabbay 提交于
rephrase some error/warning/notice messages to make them more accessible to ordinary users. There is no need to print context ASID as the driver currently doesn't support multiple contexts. Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com> Reviewed-by: NTomer Tayar <ttayar@habana.ai>
-
- 01 6月, 2020 1 次提交
-
-
由 Tomer Tayar 提交于
Fix the following smatch error in unmap_device_va(): error: uninitialized symbol 'rc'. Signed-off-by: NTomer Tayar <ttayar@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com> Link: https://lore.kernel.org/r/20200601065648.8775-1-oded.gabbay@gmail.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 25 5月, 2020 1 次提交
-
-
由 Omer Shpigelman 提交于
MMU cache invalidation timeout indicates that the device is unstable and therefore unusable. Hence in such case do hard reset and return an error to the user if was called from ioctl. In addition, change the print to error level and rephrase its text. Signed-off-by: NOmer Shpigelman <oshpigelman@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
- 24 3月, 2020 2 次提交
-
-
由 Omer Shpigelman 提交于
Host memory may be allocated with huge pages. A different virtual range may be used for mapping in this case. Add Huge PCI MMU (HPMMU) properties to support it. This patch is a prerequisite for future ASICs support and has no effect on Goya ASIC as currently a single virtual host range is used for all page sizes. Signed-off-by: NOmer Shpigelman <oshpigelman@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Pawel Piskorski 提交于
Optimize hl_mmu_map and hl_mmu_unmap by not calling flush(ctx) within per-page loop. Signed-off-by: NPawel Piskorski <ppiskorski@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
- 21 11月, 2019 6 次提交
-
-
由 Omer Shpigelman 提交于
Now that the VA block free list is not updated on context close in order to optimize this flow, no need in the sanity checks of the list contents as these will fail for sure. In addition, remove the "context closing with VA in use" print during hard reset as this situation is a side effect of the failure that caused the hard reset. Signed-off-by: NOmer Shpigelman <oshpigelman@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Omer Shpigelman 提交于
Reduce context close time by performing MMU cache invalidation once at the end of the unmap loop rather in each iteration, in order to avoid hard reset with open contexts. Reset with open contexts can potentially lead to a kernel crash as the generic pool of the MMU hops is destroyed while it is not empty because some unmap operations are not done. The commit affect mainly when running on simulator. Signed-off-by: NOmer Shpigelman <oshpigelman@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Omer Shpigelman 提交于
Reduce context close time by skipping the VA block free list update in order to avoid hard reset with open contexts. Reset with open contexts can potentially lead to a kernel crash as the generic pool of the MMU hops is destroyed while it is not empty because some unmap operations are not done. The commit affect mainly when running on simulator. Signed-off-by: NOmer Shpigelman <oshpigelman@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Omer Shpigelman 提交于
Split the properties used for MMU mappings to DRAM and PCI (host) types. This is a prerequisite for future ASICs support. Note that in Goya ASIC, the PMMU and DMMU are the same (except of page sizes) as only one MMU mechanism is used for both of the mapping types. Hence this patch should not have any effect on current behavior. Signed-off-by: NOmer Shpigelman <oshpigelman@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Omer Shpigelman 提交于
Add the ability to invalidate the necessary MMU cache only. This ability is a prerequisite for future ASICs support. Note that in Goya ASIC, a single cache is used for both host/DRAM mappings and hence this patch should not have any effect on current behavior. Signed-off-by: NOmer Shpigelman <oshpigelman@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Omer Shpigelman 提交于
Some of the functions in the memory module code were too long and/or contained multiple operations that are not always done together. Re-factor the code by dividing those functions to smaller functions which are more readable and maintainable. Signed-off-by: NOmer Shpigelman <oshpigelman@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
- 12 8月, 2019 1 次提交
-
-
由 Tomer Tayar 提交于
The patch fix the DRAM usage accounting by adding a missing update of the DRAM memory consumption, when a context is being torn down without an organized release of the allocated memory. Signed-off-by: NTomer Tayar <ttayar@habana.ai> Reviewed-by: NOded Gabbay <oded.gabbay@gmail.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
- 29 5月, 2019 2 次提交
-
-
由 Oded Gabbay 提交于
This patch initializes the MMU S/W structures before the VM S/W structures, instead of doing that as part of the VM S/W initialization. This is done because we need to configure some MMU mappings for the kernel context, before the VM is initialized. The VM initialization can't be moved earlier because it depends on the size of the DRAM, which is retrieved from the device CPU. Communication with the device CPU will require the MMU mappings to be configured and hence the de-coupling. Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
由 Oded Gabbay 提交于
This patch fix a bug in the mmu code that checks whether we can use huge page mappings for host pages. The code is supposed to enable huge page mappings only if ALL DMA addresses are aligned to 2MB AND the number of pages in each DMA chunk is a modulo of the number of pages in 2MB. However, the code ignored the first requirement for the first DMA chunk. This patch fix that issue by making sure the requirement of address alignment is validated against all DMA chunks. Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
- 01 5月, 2019 1 次提交
-
-
由 Tomer Tayar 提交于
Routing device accesses to the host memory requires the usage of a base offset, which is canceled by the iATU just before leaving the device. The value of the base offset might be distinctive between different ASIC types. The manipulation of the addresses is currently used throughout the driver code, and one should be aware to it whenever providing a host memory address to the device. This patch removes this manipulation from the driver common code, and moves it to the ASIC specific functions that are responsible for host memory allocation/mapping. Signed-off-by: NTomer Tayar <ttayar@habana.ai> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
- 06 4月, 2019 1 次提交
-
-
由 Oded Gabbay 提交于
This patch makes some improvement in how IOCTLs behave when the device is disabled or under reset. The new code checks, at the start of every IOCTL, if the device is disabled or in reset. If so, it prints an appropriate kernel message and returns -EBUSY to user-space. In addition, the code modifies the location of where the hard_reset_pending flag is being set or cleared: 1. It is now cleared immediately after the reset *tear-down* flow is finished but before the re-initialization flow begins. 2. It is being set in the remove function of the device, to make the behavior the same with the hard-reset flow There are two exceptions to the disable or in reset check: 1. The HL_INFO_DEVICE_STATUS opcode in the INFO IOCTL. This opcode allows the user to inquire about the status of the device, whether it is operational, in reset or malfunction (disabled). If the driver will block this IOCTL, the user won't be able to retrieve the status in case of malfunction or in reset. 2. The WAIT_FOR_CS IOCTL. This IOCTL allows the user to inquire about the status of a CS. We want to allow the user to continue to do so, even if we started a soft-reset process because it will allow the user to get the correct error code for each CS he submitted. Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
- 04 4月, 2019 1 次提交
-
-
由 Oded Gabbay 提交于
To make the memory ioctl code more readable, this patch moves the legacy/debug code path of mmu-disabled to a separate function, which is called (if necessary) from the main memory ioctl function. Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
- 01 4月, 2019 1 次提交
-
-
由 Oded Gabbay 提交于
Unmapping ptes in the device MMU on Palladium can take a long time, which can cause a kernel BUG of CPU soft lockup. This patch minimize the chances for this bug by sleeping a little between unmapping ptes. Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-