Commits · f19040ce418d6fb4837dba45a920c0ae0d5c698f · openeuler / Kernel

28 Jan, 2021 · 4 commits

habanalabs: modify memory functions signatures · f19040ce

Committed by Omer Shpigelman on Dec 09, 2020

For consistency, modify all memory ioctl functions to get the ioctl
arguments structure rather than the arguments themselves.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>

f19040ce

habanalabs: kernel doc format in memory functions · 3b762f55

Committed by Omer Shpigelman on Dec 09, 2020

Change the documentation of all memory functions to follow the
kernel-doc format.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>

3b762f55

habanalabs: Init the VM module for kernel context · 8e39e75a

Committed by Ofir Bitton on Nov 12, 2020

In order to reserve VA ranges for kernel memory, we need
to allow the VM module to be initialized with the kernel context.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>

8e39e75a

habanalabs: refactor MMU locks code · cb6ef0ee

Committed by Ohad Sharabi on Nov 26, 2020

Remove mmu_cache_lock as it protects a section which is already
protected by mmu_lock.

In addition, wrap the MMU cache invalidate calls in hl_vm_ctx_fini with
mmu_lock.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>

cb6ef0ee

12 Jan, 2021 · 1 commit

habanalabs: prevent soft lockup during unmap · 9488307a

Committed by Oded Gabbay on Jan 11, 2021

When using a deep learning framework such as TensorFlow or PyTorch, there
are tens of thousands of host memory mappings. When the user frees
all those mappings at the same time, the process of unmapping and
unpinning them can take a long time, which may cause a soft lockup
bug.

To prevent this, we need to free the core to do other things during
the unmapping process. For now, we chose to do it every 32K unmappings
(each unmap is a single 4K page).

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>

9488307a
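
The batching described in this commit can be pictured with a small user-space sketch. Everything here is illustrative: `unmap_all`, `struct unmap_stats`, and the two placeholder helpers are hypothetical names, not the driver's code, and the commit message does not say which kernel call is used to release the CPU (a rescheduling point such as `cond_resched()` would be one option in kernel code).

```c
#include <assert.h>
#include <stddef.h>

/* Threshold taken from the "every 32K unmappings" wording of the commit. */
#define UNMAPS_PER_YIELD (32u * 1024u)

/* Counters that let the sketch be observed without real hardware. */
struct unmap_stats {
    size_t unmapped;
    size_t yields;
};

/* Placeholder for unmapping and unpinning one 4K page. */
static void unmap_one_page(struct unmap_stats *s) { s->unmapped++; }

/* Placeholder for the point where the CPU is handed back to the scheduler. */
static void yield_cpu(struct unmap_stats *s) { s->yields++; }

/* Unmap npages pages, yielding after every full batch except the last one. */
void unmap_all(size_t npages, struct unmap_stats *s)
{
    assert(s != NULL);

    for (size_t i = 0; i < npages; i++) {
        unmap_one_page(s);

        if ((i + 1) % UNMAPS_PER_YIELD == 0 && i + 1 < npages)
            yield_cpu(s);
    }
}
```

With this structure, freeing 100,000 mappings would yield three times, once after each full batch of 32K pages.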

30 Nov, 2020 · 12 commits

habanalabs: free host huge va_range if not used · 8e718f2e

Committed by Ofir Bitton on Nov 26, 2020

If the huge range is not valid, the driver also uses the host range for
huge page allocations, but the driver never frees its allocation.
This introduces a memory leak every time a user closes their context.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>

8e718f2e

habanalabs: mmu map wrapper for sizes larger than a page · 5c05487f

Committed by Ofir Bitton on Oct 22, 2020

We introduce a new wrapper which allows us to MMU map any size
to any available host va_range. In addition, we remove duplicated
code from various places in the driver and use this new wrapper
instead.
This wrapper supports mapping only contiguous physical
memory blocks and will be used for mappings that are done to the
driver ASID.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>

5c05487f
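
As a rough illustration of what such a wrapper might look like, the sketch below maps a physically contiguous block one page at a time and undoes the partial work if a page cannot be mapped. The toy `struct page_table`, `map_page`, and `map_contiguous` are hypothetical stand-ins, and the rollback step is an assumption that the commit message does not mention.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ENTRIES 64

/* Toy page table: a flat list of (virtual, physical) page pairs. */
struct page_table {
    uint64_t va[MAX_ENTRIES];
    uint64_t pa[MAX_ENTRIES];
    size_t count;
};

/* Placeholder for a single-page MMU map operation. */
static int map_page(struct page_table *pt, uint64_t va, uint64_t pa)
{
    if (pt->count == MAX_ENTRIES)
        return -ENOMEM;
    pt->va[pt->count] = va;
    pt->pa[pt->count] = pa;
    pt->count++;
    return 0;
}

/*
 * Map a contiguous physical block of `size` bytes at `va`, page by page.
 * On failure, every page mapped by this call is dropped again.
 */
int map_contiguous(struct page_table *pt, uint64_t va, uint64_t pa,
                   uint64_t size, uint64_t page_size)
{
    assert(pt != NULL);

    if (page_size == 0 || size % page_size != 0)
        return -EINVAL;

    size_t first = pt->count;

    for (uint64_t off = 0; off < size; off += page_size) {
        int rc = map_page(pt, va + off, pa + off);

        if (rc != 0) {
            pt->count = first; /* roll back the partial mapping */
            return rc;
        }
    }
    return 0;
}
```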

habanalabs: support reserving aligned va block · 412c41fc

Committed by Ofir Bitton on Nov 04, 2020

Add support for reserving a va block with an alignment different from
the page size. This is a prerequisite for allocations needed in future
ASICs.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>

412c41fc
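
The arithmetic behind an aligned reservation can be sketched as follows. This is a simplified model under stated assumptions: the free space is a single half-open range `[start, end)`, the alignment is a power of two, and `align_up` and `reserve_aligned` are hypothetical helpers rather than the driver's functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Round addr up to the next multiple of align (align must be a power of two). */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/*
 * Try to place a block of `size` bytes, aligned to `align`, inside the free
 * range [start, end). On success the chosen base address is stored in *out.
 */
bool reserve_aligned(uint64_t start, uint64_t end, uint64_t size,
                     uint64_t align, uint64_t *out)
{
    uint64_t base = align_up(start, align);

    if (base < start)                    /* rounding up wrapped around */
        return false;
    if (base >= end || end - base < size)
        return false;

    *out = base;
    return true;
}
```

For example, asking for a 64 KB block aligned to 2 MB in the range `[0x1000, 0x400000)` would skip the unaligned head of the range and place the block at `0x200000`.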

habanalabs: move asic property to correct structure · 7f070c91

Committed by Oded Gabbay on Nov 09, 2020

Whether an ASIC has MMU towards its DRAM is an ASIC property, so
move it to the asic fixed properties structure.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>

7f070c91

habanalabs: use host va range for internal pools · be91b91f

Committed by Ofir Bitton on Oct 22, 2020

Instead of using a dedicated va range for each internal pool,
we introduce a new way of reserving a va block from an existing
va range. This is a more generic way of reserving va blocks for
future use.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>

be91b91f

habanalabs: refactor mmu va_range db structure · 784b916d

Committed by Ofir Bitton on Oct 22, 2020

Use an array of va_ranges instead of keeping each va_range separately.
We do this for better readability and in order to support access to
a specific range in a much more elegant manner.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>

784b916d
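
The kind of layout this refactoring points toward can be sketched as an array indexed by a range-type enum. The type and field names below (`enum va_range_type`, `struct vm_ctx`, `va_range_size`) are hypothetical and only illustrate the idea; the set of range types is an assumption as well.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical range types; the last enumerator doubles as the array size. */
enum va_range_type {
    VA_RANGE_HOST,
    VA_RANGE_HOST_HUGE,
    VA_RANGE_DRAM,
    VA_RANGE_COUNT
};

struct va_range {
    uint64_t start;
    uint64_t end;
};

/* One array replaces a separate struct member per range. */
struct vm_ctx {
    struct va_range ranges[VA_RANGE_COUNT];
};

/* Any range can now be reached through the same code path. */
uint64_t va_range_size(const struct vm_ctx *ctx, enum va_range_type type)
{
    assert(type < VA_RANGE_COUNT);
    return ctx->ranges[type].end - ctx->ranges[type].start;
}
```

Code that previously needed one branch per range can loop over `ranges` or index it directly by type.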

habanalabs: add 'needs reset' state in driver · 66a76401

Committed by Ofir Bitton on Oct 05, 2020

The new state indicates that the device should be reset in order
to regain functionality.
This unique state can occur if reset_on_lockup is disabled
and an actual lockup has occurred.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>

66a76401

habanalabs: Move repeatedly included headers to habanalabs.h · ba7e389c

Committed by Tomer Tayar on Oct 25, 2020

Several header files are repeatedly included in many files.
Move these files to habanalabs.h which is included by all.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>

ba7e389c

habanalabs/gaudi: scrub all memory upon closing FD · 03df136b

Committed by farah kassabri on May 06, 2020

In multi-tenant setups, administrators may want to prevent data
leakage between users running on the same device one after another.

To do that, the driver can scrub the internal memory (both SRAM and
DRAM) after a user finishes using the memory.

Because in GAUDI the driver allows only one application to use the
device at a time, it can scrub the memory when the user application
closes the FD.

In future devices where we have MMU on the DRAM, we can scrub the DRAM
memory with a finer granularity (page granularity) when the user
allocates the memory.

This feature is not supported in Goya.

To accommodate users who want to debug their applications, we add a kernel
module parameter to load the driver with this feature disabled.

Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>

03df136b
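
A minimal sketch of the close-time behavior, with ordinary byte buffers standing in for SRAM and DRAM. The names `struct device_mem`, `scrub_on_close`, and `on_fd_close` are hypothetical, and writing zeros is an assumption since the commit message does not state the scrub value or the name of the module parameter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Buffers standing in for the device memories, plus the opt-out switch. */
struct device_mem {
    unsigned char *sram;
    size_t sram_size;
    unsigned char *dram;
    size_t dram_size;
    bool scrub_on_close; /* models the module parameter that disables scrubbing */
};

/* Called when the single user application closes its file descriptor. */
void on_fd_close(struct device_mem *mem)
{
    assert(mem != NULL);

    if (!mem->scrub_on_close)
        return; /* debugging mode: leave the contents for inspection */

    memset(mem->sram, 0, mem->sram_size);
    memset(mem->dram, 0, mem->dram_size);
}
```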

habanalabs/gaudi: monitor device memory usage · 3e622996

Committed by Oded Gabbay on Oct 18, 2020

In GAUDI we don't have an MMU towards the HBM device memory. Therefore,
the user accesses that memory directly through physical addresses (via the
different engines) without the need to go through the driver to
allocate/free memory on the HBM.

For system monitoring purposes, the driver will keep track of the HBM
usage. This can be done as long as the user accurately reports the
allocations and releases of HBM memory, through the existing MEMORY
IOCTL uapi.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>

3e622996

habanalabs: don't init vm module if no MMU · f3a965c2

Committed by Oded Gabbay on Oct 04, 2020

When running without the MMU enabled (debug mode), there is no need to
initialize the VM module in the driver.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>

f3a965c2

habanalabs: free host huge va_range if not used · c8c39fbd

Committed by Ofir Bitton on Nov 26, 2020

If the huge range is not valid, the driver also uses the host range for
huge page allocations, but the driver never frees its allocation.
This introduces a memory leak every time a user closes their context.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>

c8c39fbd

25 Sep, 2020 · 1 commit

habanalabs: correct an error message · fc6121e9

Committed by Oded Gabbay on Sep 23, 2020

We don't try to allocate huge pages here, so remove the word "huge".

Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

fc6121e9

22 Sep, 2020 · 1 commit

habanalabs: use smallest possible alignment for virtual addresses · 7c52fb0a

Committed by Omer Shpigelman on Jun 28, 2020

Change how a device virtual address is acquired for mapping: use the
smallest possible alignment, rather than the biggest, depending on the
page size used by the user for allocating the memory. This lowers the
virtual address space consumption.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

7c52fb0a

22 Aug, 2020 · 1 commit

habanalabs: check correct vmalloc return code · 0839152f

Committed by Ofir Bitton on Aug 11, 2020

vmalloc can return a value other than NULL or a valid pointer.
We must check for it in order to avoid dereferencing an invalid
pointer.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

0839152f

29 Jul, 2020 · 1 commit

habanalabs: fix up absolute include instructions · 7b16a155

Committed by Greg Kroah-Hartman on Jul 28, 2020

There's no need to try to be cute with the include file locations in the
Makefile, so just specify exactly where the files are.

Bonus is this fixes the problem of building with O= as well as trying to
just build the subdirectory alone.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Oded Gabbay <oded.gabbay@gmail.com>
Cc: Omer Shpigelman <oshpigelman@habana.ai>
Cc: Tomer Tayar <ttayar@habana.ai>
Cc: Moti Haimovski <mhaimovski@habana.ai>
Cc: Ofir Bitton <obitton@habana.ai>
Cc: Ben Segal <bpsegal20@gmail.com>
Cc: Christine Gharzuzi <cgharzuzi@habana.ai>
Cc: Pawel Piskorski <ppiskorski@habana.ai>
Link: https://lore.kernel.org/r/20200728171851.55842-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

7b16a155

25 Jul, 2020 · 2 commits

habanalabs: create common folder · 70b2f993

Committed by Oded Gabbay on Jul 13, 2020

For internal needs of our CI, we need to move all the common code into a
common folder instead of putting it in the root folder of the driver.

The same applies to the common header files under include/.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>

70b2f993

habanalabs: rephrase error messages · 0eab4f89

Committed by Oded Gabbay on Jun 22, 2020

Rephrase some error/warning/notice messages to make them more accessible to
ordinary users.

There is no need to print the context ASID as the driver currently doesn't
support multiple contexts.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>

0eab4f89

01 Jun, 2020 · 1 commit

habanalabs: initialize variable to default value · c68f1bae

Committed by Tomer Tayar on Jun 01, 2020

Fix the following smatch error in unmap_device_va():
error: uninitialized symbol 'rc'.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Link: https://lore.kernel.org/r/20200601065648.8775-1-oded.gabbay@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

c68f1bae

25 May, 2020 · 1 commit

habanalabs: handle MMU cache invalidation timeout · 8ff5f4fd

Committed by Omer Shpigelman on May 24, 2020

An MMU cache invalidation timeout indicates that the device is unstable and
therefore unusable.
Hence, in such a case, do a hard reset and return an error to the user if
the invalidation was called from an ioctl.
In addition, change the print to error level and rephrase its text.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

8ff5f4fd

24 Mar, 2020 · 2 commits

habanalabs: split the host MMU properties · 64a7e295

Committed by Omer Shpigelman on Jan 05, 2020

Host memory may be allocated with huge pages.
A different virtual range may be used for mapping in this case.
Add Huge PCI MMU (HPMMU) properties to support it.
This patch is a prerequisite for future ASICs support and has no effect on
Goya ASIC as currently a single virtual host range is used for all page
sizes.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

64a7e295

habanalabs: flush only at the end of the map/unmap · 7fc40bca

Committed by Pawel Piskorski on Dec 06, 2019

Optimize hl_mmu_map and hl_mmu_unmap by not calling flush(ctx)
within the per-page loop.

Signed-off-by: Pawel Piskorski <ppiskorski@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

7fc40bca
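
The effect of hoisting the flush out of the loop can be seen in a toy model that only counts operations. The helpers below (`map_one_page`, `flush`, and the two loop variants) are hypothetical placeholders and do not reproduce the signatures of `hl_mmu_map` or `hl_mmu_unmap`.

```c
#include <assert.h>
#include <stddef.h>

/* Counters used to compare the two strategies. */
struct mmu_stats {
    size_t mapped_pages;
    size_t flushes;
};

/* Placeholder for writing one page's translation entries. */
static void map_one_page(struct mmu_stats *s) { s->mapped_pages++; }

/* Placeholder for the flush that makes the written entries visible. */
static void flush(struct mmu_stats *s) { s->flushes++; }

/* Before: one flush per mapped page. */
void map_flush_per_page(struct mmu_stats *s, size_t npages)
{
    assert(s != NULL);
    for (size_t i = 0; i < npages; i++) {
        map_one_page(s);
        flush(s);
    }
}

/* After: map every page first, then flush a single time. */
void map_flush_once(struct mmu_stats *s, size_t npages)
{
    assert(s != NULL);
    for (size_t i = 0; i < npages; i++)
        map_one_page(s);
    flush(s);
}
```

For a 1000-page mapping, the first variant issues 1000 flushes and the second issues one.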

21 Nov, 2019 · 6 commits

habanalabs: remove unnecessary checks · e604f551

Committed by Omer Shpigelman on Nov 14, 2019

Now that the VA block free list is not updated on context close in order
to optimize this flow, there is no need for the sanity checks of the list
contents, as these will fail for sure.
In addition, remove the "context closing with VA in use" print during hard
reset as this situation is a side effect of the failure that caused the
hard reset.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

e604f551

habanalabs: invalidate MMU cache only once · bea84c4d

Committed by Omer Shpigelman on Nov 14, 2019

Reduce context close time by performing MMU cache invalidation once at the
end of the unmap loop rather than in each iteration, in order to avoid hard
reset with open contexts.
Reset with open contexts can potentially lead to a kernel crash as the
generic pool of the MMU hops is destroyed while it is not empty because
some unmap operations are not done.
This commit mainly has an effect when running on the simulator.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

bea84c4d

habanalabs: skip VA block list update in reset flow · 71c5e55e

Committed by Omer Shpigelman on Nov 14, 2019

Reduce context close time by skipping the VA block free list update in
order to avoid hard reset with open contexts.
Reset with open contexts can potentially lead to a kernel crash as the
generic pool of the MMU hops is destroyed while it is not empty because
some unmap operations are not done.
This commit mainly has an effect when running on the simulator.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

71c5e55e

habanalabs: split MMU properties to PCI/DRAM · 54bb6744

Committed by Omer Shpigelman on Nov 14, 2019

Split the properties used for MMU mappings to DRAM and PCI (host) types.
This is a prerequisite for future ASICs support.
Note that in Goya ASIC, the PMMU and DMMU are the same (except for page
sizes) as only one MMU mechanism is used for both of the mapping types.
Hence this patch should not have any effect on current behavior.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

54bb6744

habanalabs: type specific MMU cache invalidation · 7b6e4ea0

Committed by Omer Shpigelman on Nov 14, 2019

Add the ability to invalidate the necessary MMU cache only.
This ability is a prerequisite for future ASICs support.
Note that in Goya ASIC, a single cache is used for both host/DRAM
mappings and hence this patch should not have any effect on current
behavior.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

7b6e4ea0

habanalabs: re-factor memory module code · 7f74d4d3

Committed by Omer Shpigelman on Aug 12, 2019

Some of the functions in the memory module code were too long and/or
contained multiple operations that are not always done together. Re-factor
the code by dividing those functions into smaller functions which are more
readable and maintainable.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

7f74d4d3

12 Aug, 2019 · 1 commit

habanalabs: fix DRAM usage accounting on context tear down · c8113756

Committed by Tomer Tayar on Aug 04, 2019

The patch fixes the DRAM usage accounting by adding a missing update of
the DRAM memory consumption, when a context is being torn down without an
organized release of the allocated memory.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

c8113756
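
The accounting gap can be modeled with two counters, one per context and one per device. This is a sketch under assumptions: `struct dev_usage`, `struct ctx_usage`, and the three functions are hypothetical names, and the real driver's counters and locking are not shown in the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Device-wide and per-context DRAM consumption, in bytes. */
struct dev_usage { uint64_t dram_used; };
struct ctx_usage { uint64_t dram_used; };

/* Normal allocation path: both counters grow together. */
void account_alloc(struct dev_usage *dev, struct ctx_usage *ctx, uint64_t size)
{
    ctx->dram_used += size;
    dev->dram_used += size;
}

/* Normal release path: both counters shrink together. */
void account_free(struct dev_usage *dev, struct ctx_usage *ctx, uint64_t size)
{
    assert(ctx->dram_used >= size && dev->dram_used >= size);
    ctx->dram_used -= size;
    dev->dram_used -= size;
}

/*
 * Tear-down path: whatever the context still holds is returned to the
 * device-wide counter. Skipping this step is the kind of missing update
 * the commit describes.
 */
void ctx_teardown(struct dev_usage *dev, struct ctx_usage *ctx)
{
    assert(dev->dram_used >= ctx->dram_used);
    dev->dram_used -= ctx->dram_used;
    ctx->dram_used = 0;
}
```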

29 May, 2019 · 2 commits

habanalabs: de-couple MMU and VM module initialization · 37d68ce5

Committed by Oded Gabbay on May 29, 2019

This patch initializes the MMU S/W structures before the VM S/W
structures, instead of doing that as part of the VM S/W initialization.

This is done because we need to configure some MMU mappings for the kernel
context, before the VM is initialized. The VM initialization can't be
moved earlier because it depends on the size of the DRAM, which is
retrieved from the device CPU. Communication with the device CPU will
require the MMU mappings to be configured and hence the de-coupling.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

37d68ce5

habanalabs: fix bug in checking huge page optimization · d7241701

Committed by Oded Gabbay on May 28, 2019

This patch fixes a bug in the MMU code that checks whether we can use huge
page mappings for host pages.

The code is supposed to enable huge page mappings only if ALL DMA
addresses are aligned to 2MB AND the number of pages in each DMA chunk is
a multiple of the number of pages in 2MB. However, the code ignored the
first requirement for the first DMA chunk.

This patch fixes that issue by making sure the requirement of address
alignment is validated against all DMA chunks.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

d7241701
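
The two conditions can be written as a single pass over the DMA chunks. The sketch below assumes a 4 KB base page size and uses hypothetical names (`struct dma_chunk`, `can_use_huge_pages`); it shows the intended check, not the driver's actual function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BASE_PAGE_SIZE 4096ull                 /* assumed regular page size */
#define HUGE_PAGE_SIZE (2ull * 1024 * 1024)    /* 2MB, as in the commit */
#define PAGES_IN_HUGE  (HUGE_PAGE_SIZE / BASE_PAGE_SIZE)

/* One physically contiguous DMA chunk: start address and length in pages. */
struct dma_chunk {
    uint64_t dma_addr;
    uint64_t npages;
};

/*
 * Huge page mappings are allowed only if every chunk, including the first,
 * starts on a 2MB boundary and spans a whole number of 2MB units.
 */
bool can_use_huge_pages(const struct dma_chunk *chunks, size_t nchunks)
{
    assert(chunks != NULL || nchunks == 0);

    for (size_t i = 0; i < nchunks; i++) {
        if (chunks[i].dma_addr % HUGE_PAGE_SIZE != 0)
            return false;
        if (chunks[i].npages % PAGES_IN_HUGE != 0)
            return false;
    }
    return true;
}
```

Starting the loop at index 0 is the point of the fix: a misaligned first chunk must disqualify the whole buffer just like any later chunk.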

01 May, 2019 · 1 commit

habanalabs: Manipulate DMA addresses in ASIC functions · 94cb669c

Committed by Tomer Tayar on May 01, 2019

Routing device accesses to the host memory requires the usage of a base
offset, which is canceled by the iATU just before leaving the device.
The value of the base offset might differ between ASIC types.
The manipulation of the addresses is currently done throughout the
driver code, and one should be aware of it whenever providing a host
memory address to the device.
This patch removes this manipulation from the driver common code, and
moves it to the ASIC specific functions that are responsible for
host memory allocation/mapping.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

94cb669c

06 Apr, 2019 · 1 commit

habanalabs: improve IOCTLs behavior when disabled or reset · 3f5398cf

Committed by Oded Gabbay on Apr 06, 2019

This patch improves how IOCTLs behave when the device is
disabled or under reset.

The new code checks, at the start of every IOCTL, if the device is
disabled or in reset. If so, it prints an appropriate kernel message and
returns -EBUSY to user-space.

In addition, the code modifies the location where the
hard_reset_pending flag is set or cleared:

1. It is now cleared immediately after the reset *tear-down* flow is
   finished but before the re-initialization flow begins.

2. It is set in the remove function of the device, to make the
   behavior the same as in the hard-reset flow.

There are two exceptions to the disabled or in reset check:

1. The HL_INFO_DEVICE_STATUS opcode in the INFO IOCTL. This opcode allows
   the user to inquire about the status of the device, whether it is
   operational, in reset or malfunctioning (disabled). If the driver
   blocked this IOCTL, the user wouldn't be able to retrieve the status in
   case of malfunction or reset.

2. The WAIT_FOR_CS IOCTL. This IOCTL allows the user to inquire about the
   status of a CS. We want to allow the user to continue to do so, even if
   we started a soft-reset process because it will allow the user to get
   the correct error code for each CS they submitted.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

3f5398cf
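
The entry check and its two exceptions can be summarized as a small gate function. The enums and `ioctl_gate` below are hypothetical and reduce the driver state to three values; only the -EBUSY return code and the two exempted requests come from the commit message.

```c
#include <assert.h>
#include <errno.h>

/* Simplified device states for the sketch. */
enum device_state { DEV_OPERATIONAL, DEV_IN_RESET, DEV_DISABLED };

/* Simplified request kinds; the first two are the exceptions named above. */
enum ioctl_kind {
    IOCTL_INFO_DEVICE_STATUS,
    IOCTL_WAIT_FOR_CS,
    IOCTL_MEMORY,
    IOCTL_CS
};

/* Returns 0 if the request may proceed, or -EBUSY if it must be rejected. */
int ioctl_gate(enum device_state state, enum ioctl_kind kind)
{
    assert(state == DEV_OPERATIONAL || state == DEV_IN_RESET ||
           state == DEV_DISABLED);

    if (state == DEV_OPERATIONAL)
        return 0;

    /* Status queries must keep working so the user can see what happened. */
    if (kind == IOCTL_INFO_DEVICE_STATUS || kind == IOCTL_WAIT_FOR_CS)
        return 0;

    return -EBUSY;
}
```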

04 Apr, 2019 · 1 commit

habanalabs: split mmu/no-mmu code paths in memory ioctl · 54303a1a

Committed by Oded Gabbay on Apr 04, 2019

To make the memory ioctl code more readable, this patch moves the
legacy/debug code path of mmu-disabled to a separate function, which is
called (if necessary) from the main memory ioctl function.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

54303a1a

01 Apr, 2019 · 1 commit

habanalabs: prevent CPU soft lockup on Palladium · e850b89f

Committed by Oded Gabbay on Mar 31, 2019

Unmapping PTEs in the device MMU on Palladium can take a long time, which
can cause a kernel BUG of CPU soft lockup.

This patch minimizes the chances of this bug by sleeping a little between
PTE unmappings.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

e850b89f

openeuler / Kernel · synced successfully nearly 2 years ago