1. 21 Nov, 2019 1 commit
  2. 05 Sep, 2019 10 commits
  3. 12 Aug, 2019 1 commit
    • habanalabs: fix endianness handling for internal QMAN submission · b9040c99
      Committed by Oded Gabbay
      The PQs of internal H/W queues (QMANs) can be located in different memory
      areas for different ASICs. Therefore, when writing PQEs, we need to use
      the correct function according to the location of the PQ. For example, if
      the PQ is located in the device's memory (SRAM or DRAM), we need to use
      memcpy_toio() so that it works on architectures that have separate
      address ranges for IO memory.
      
      This patch makes the code that writes the PQE ASIC-specific so that we
      can handle this properly per ASIC.
      Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
      Tested-by: Ben Segal <bpsegal20@gmail.com>
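A minimal userspace sketch of the idea in this commit, assuming a per-ASIC function table. The names `struct pqe`, `struct asic_funcs`, `write_pqe_host`, `write_pqe_device` and `submit_pqe` are hypothetical, and plain `memcpy()` stands in for `memcpy_toio()`, which is only available inside the kernel.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical PQ entry layout; the real one is defined by the driver. */
struct pqe {
	uint64_t ptr;
	uint32_t len;
	uint32_t ctl;
};

/* Per-ASIC operations table (illustrative names only). */
struct asic_funcs {
	void (*write_pqe)(void *pq_slot, const struct pqe *src);
};

/* PQ located in host memory: a plain copy is enough. */
void write_pqe_host(void *pq_slot, const struct pqe *src)
{
	memcpy(pq_slot, src, sizeof(*src));
}

/*
 * PQ located in device SRAM/DRAM: kernel code would call memcpy_toio()
 * here. memcpy() is only a userspace stand-in for this sketch.
 */
void write_pqe_device(void *pq_slot, const struct pqe *src)
{
	memcpy(pq_slot, src, sizeof(*src));
}

/* Common code does not need to know where the PQ lives. */
void submit_pqe(const struct asic_funcs *f, void *pq, unsigned int idx,
		const struct pqe *src)
{
	assert(f && f->write_pqe);
	f->write_pqe((char *)pq + idx * sizeof(*src), src);
}
```

The common submission path only calls through the table, so each ASIC can pick the copy routine that matches where its PQ is placed.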
  4. 29 Jul, 2019 1 commit
  5. 01 Jul, 2019 2 commits
  6. 29 May, 2019 1 commit
    • habanalabs: add MMU mappings for Goya CPU · 95b5a8b8
      Committed by Oded Gabbay
      This patch adds the necessary MMU mappings for the Goya CPU to access the
      device DRAM and the host memory.
      
      The first 256MB of the device DRAM is being mapped. That's where the F/W
      is running.
      
      The 2MB area located on the host memory for the purpose of communication
      between the driver and the device CPU is also being mapped.
      Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
  7. 25 May, 2019 1 commit
    • habanalabs: halt debug engines on user process close · 89225ce4
      Committed by Omer Shpigelman
      This patch fixes a potential bug where a user's process closes
      unexpectedly without disabling the debug engines. In that case, the debug
      engines might continue running but because the user's MMU mappings are
      going away, we will get page fault errors.
      
      This behavior also violates the general rule that nothing runs on
      the device after the user process closes.
      
      The patch stops the debug H/W engines upon process termination and thus
      makes sure nothing runs on the device after the process goes away.
      Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
      Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
  8. 17 May, 2019 1 commit
    • habanalabs: don't limit packet size for device CPU · cbb10f1e
      Committed by Oded Gabbay
      This patch removes a limitation on the maximum packet size that is read by
      the device CPU as that limitation is not needed.
      
      Therefore, the patch also removes an elaborate calculation that is based
      on this limitation which is also not needed now. Instead, use a fixed
      value for the memory pool size of the packets.
      Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
  9. 13 May, 2019 1 commit
  10. 12 May, 2019 1 commit
  11. 09 May, 2019 1 commit
    • habanalabs: change polling functions to macros · a08b51a9
      Committed by Oded Gabbay
      This patch changes two polling functions to macros, in order to make their
      API the same as the standard readl_poll_timeout so we would be able to
      define the "condition for exit" when calling these macros.
      
      This will simplify the code as it will eliminate the need to check both
      for timeout and for the (cond) in the calling function.
      Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
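To illustrate why a macro with a caller-supplied exit condition simplifies call sites, here is a userspace sketch in the spirit of `readl_poll_timeout`. The names `poll_until` and `fake_reg_read` are hypothetical, a read count stands in for a time-based timeout, and the macro relies on GNU C statement expressions (supported by GCC and Clang).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical register read; a real driver would read from MMIO. */
uint32_t fake_reg_read(const volatile uint32_t *reg)
{
	return *reg;
}

/*
 * Read `reg` into `val` until `cond` (an expression over `val`) is true
 * or `max_tries` reads have been made. Evaluates to 0 on success and to
 * -ETIMEDOUT otherwise, so the caller checks a single return value.
 */
#define poll_until(reg, val, cond, max_tries)				\
({									\
	unsigned int tries_left_ = (max_tries);				\
	for (;;) {							\
		(val) = fake_reg_read(reg);				\
		if (cond)						\
			break;						\
		if (tries_left_ == 0 || --tries_left_ == 0)		\
			break;						\
	}								\
	(cond) ? 0 : -ETIMEDOUT;					\
})
```

Because the condition is part of the macro call, the caller no longer has to re-check the value after a timeout is reported.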
  12. 04 May, 2019 1 commit
    • habanalabs: force user to set device debug mode · 19734970
      Committed by Oded Gabbay
      This patch adds the implementation of the HL_DEBUG_OP_SET_MODE opcode in
      the DEBUG IOCTL.
      
      It forces a user who wants to debug the device to set the device into
      debug mode before configuring the debug engines. The patch also makes
      sure to disable debug mode when the user releases the FD, in case the
      user forgot to disable debug mode.
      Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
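The rule described in this commit can be sketched as a small state check. This is an illustration only: `struct dev_state` and the three functions are hypothetical names, and the `-EPERM` error code is an assumption rather than what the driver returns.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical per-device state; the driver's real structure differs. */
struct dev_state {
	bool in_debug;
};

/* Models the effect of the set-mode opcode: enter or leave debug mode. */
void debug_set_mode(struct dev_state *d, bool enable)
{
	assert(d);
	d->in_debug = enable;
}

/* Configuring a debug engine is rejected unless debug mode is on. */
int debug_configure(const struct dev_state *d)
{
	assert(d);
	if (!d->in_debug)
		return -EPERM; /* error code chosen for this sketch */
	return 0;
}

/* On FD release, debug mode is cleared even if the user forgot to. */
void on_fd_release(struct dev_state *d)
{
	assert(d);
	d->in_debug = false;
}
```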
  13. 05 May, 2019 1 commit
  14. 30 Apr, 2019 1 commit
  15. 01 May, 2019 3 commits
  16. 29 Apr, 2019 1 commit
    • habanalabs: Use single pool for CPU accessible host memory · 03d5f641
      Committed by Tomer Tayar
      The device's CPU accessible memory on host is managed in a dedicated
      pool, except for 2 regions - Primary Queue (PQ) and Event Queue (EQ) -
      which are allocated from generic DMA pools.
      Due to address length limitations of the CPU, the addresses of all these
      memory regions must have the same MSBs starting at bit 40.
      This patch modifies the allocation of the PQ and EQ to also come from the
      dedicated pool, to ensure compliance with the limitation.
      Signed-off-by: Tomer Tayar <ttayar@habana.ai>
      Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
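The address constraint in this commit (all regions must share the same MSBs starting at bit 40) can be expressed as a simple check. This is a sketch; `same_msbs` and `CPU_ADDR_MSB_SHIFT` are hypothetical names, not driver symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Per the commit text, the shared MSBs start at bit 40. */
#define CPU_ADDR_MSB_SHIFT 40

/*
 * Return true if every byte of [addr, addr + size) has the same bits
 * from bit 40 upwards as `base`, the start of the dedicated pool.
 */
bool same_msbs(uint64_t base, uint64_t addr, uint64_t size)
{
	uint64_t last;

	assert(size > 0);
	last = addr + size - 1;
	return (addr >> CPU_ADDR_MSB_SHIFT) == (base >> CPU_ADDR_MSB_SHIFT) &&
	       (last >> CPU_ADDR_MSB_SHIFT) == (base >> CPU_ADDR_MSB_SHIFT);
}
```

Allocating the PQ and EQ from the same pool as the other regions makes this check hold by construction, as long as the pool itself does not cross a 2^40 boundary.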
  17. 28 Apr, 2019 1 commit
    • habanalabs: return old dram bar address upon change · a38693d7
      Committed by Oded Gabbay
      This patch changes the ASIC interface function that changes the DRAM bar
      window. The change is to return the old address that the DRAM bar pointed
      to instead of an error code.
      
      This simplifies the code that uses this function (mainly in debugfs) to
      restore the bar to the old setting.
      
      This is also needed for easier support in future ASICs.
      Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
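A short sketch of why returning the previous address helps callers that need to restore the window. All names here (`struct bar_dev`, `set_dram_bar_base`, `peek_with_restore`) are hypothetical, and the actual hardware access is replaced by a stand-in.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical device state tracking where the DRAM BAR window points. */
struct bar_dev {
	uint64_t dram_bar_cur_addr;
};

/* Move the window to new_addr and hand back the address it pointed to. */
uint64_t set_dram_bar_base(struct bar_dev *d, uint64_t new_addr)
{
	uint64_t old;

	assert(d);
	old = d->dram_bar_cur_addr;
	d->dram_bar_cur_addr = new_addr;
	return old;
}

/* debugfs-style access: move the window, access, then restore it. */
uint64_t peek_with_restore(struct bar_dev *d, uint64_t target)
{
	uint64_t old = set_dram_bar_base(d, target);
	uint64_t seen = d->dram_bar_cur_addr; /* stand-in for the real access */

	set_dram_bar_base(d, old);
	return seen;
}
```

The caller does not need to read or cache the window position separately; the value needed for the restore step comes back from the call that changed it.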
  18. 26 Apr, 2019 1 commit
  19. 22 Apr, 2019 1 commit
    • habanalabs: use ASIC functions interface for rreg/wreg · b2377e03
      Committed by Oded Gabbay
      This patch slightly changes the macros of RREG32 and WREG32, which are
      used when reading from or writing to registers.
      
      Instead of directly calling a function in the common code from these
      macros, the new code calls a function from the ASIC functions interface.
      
      This change allows us to share much more code between real ASICs and
      simulators, which in turn reduces the maintenance burden and
      the chance of forgetting to port code between the ASIC files.
      
      The patch also implements the hl_poll_timeout macro, instead of calling
      the generic readl_poll_timeout macro. This is required to allow use of
      this macro in the simulator files.
      
      As a result of this change, more functions in goya.c are shared with the
      simulator and therefore, should not be defined as static.
      Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
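The indirection described here can be sketched with a function table that a simulator backend fills in. This is an illustration under assumptions: the struct layout, the `sim_*` names and the explicit device argument in the macros are all hypothetical and do not reproduce the driver's definitions.

```c
#include <assert.h>
#include <stdint.h>

#define SIM_NUM_REGS 16

struct hl_dev; /* hypothetical device type for this sketch */

/* ASIC functions interface: one implementation per real ASIC or simulator. */
struct asic_funcs {
	uint32_t (*rreg)(struct hl_dev *d, uint32_t reg);
	void (*wreg)(struct hl_dev *d, uint32_t reg, uint32_t val);
};

struct hl_dev {
	const struct asic_funcs *funcs;
	uint32_t regs[SIM_NUM_REGS]; /* simulator backing store */
};

/* The macros dispatch through the interface instead of common code. */
#define RREG32(d, reg)		((d)->funcs->rreg((d), (reg)))
#define WREG32(d, reg, val)	((d)->funcs->wreg((d), (reg), (val)))

/* Simulator backend: registers are plain memory, offsets are in bytes. */
uint32_t sim_rreg(struct hl_dev *d, uint32_t reg)
{
	assert(reg % 4 == 0 && reg / 4 < SIM_NUM_REGS);
	return d->regs[reg / 4];
}

void sim_wreg(struct hl_dev *d, uint32_t reg, uint32_t val)
{
	assert(reg % 4 == 0 && reg / 4 < SIM_NUM_REGS);
	d->regs[reg / 4] = val;
}

const struct asic_funcs sim_funcs = { sim_rreg, sim_wreg };
```

Code written against `RREG32`/`WREG32` then runs unchanged on either backend, which is the sharing the commit message describes.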
  20. 10 Apr, 2019 1 commit
  21. 04 Apr, 2019 1 commit
    • habanalabs: ASIC_AUTO_DETECT enum value is redundant · 29593840
      Committed by Oded Gabbay
      This patch removes the enum value of ASIC_AUTO_DETECT because we can use
      the validity of the pdev variable to know whether we have a real device or
      a simulator. For a real device, we detect the ASIC type from the device ID
      while for a simulator, the simulator code calls create_hdev() with the
      specified ASIC type.
      
      Set ASIC_INVALID as the first option in the enum to make sure that no
      other enum value will receive the value 0 (which indicates a non-existing
      entry in the simulator array).
      Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
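A sketch of the two points in this commit: the presence of a PCI device decides how the ASIC type is found, and putting `ASIC_INVALID` first makes a zeroed entry read as "no device". The stub struct, the placeholder device ID and `resolve_asic_type` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* ASIC_INVALID is first, so it is the only enumerator with value 0. */
enum asic_type {
	ASIC_INVALID,
	ASIC_GOYA,
};

/* Stand-in for a PCI device handle; only the device ID matters here. */
struct pci_dev_stub {
	unsigned short device_id;
};

#define STUB_GOYA_DEVICE_ID 0x0001 /* placeholder value for this sketch */

/*
 * Real device (pdev != NULL): derive the type from the device ID.
 * Simulator (pdev == NULL): use the type supplied by the simulator code.
 */
enum asic_type resolve_asic_type(const struct pci_dev_stub *pdev,
				 enum asic_type sim_type)
{
	if (pdev)
		return pdev->device_id == STUB_GOYA_DEVICE_ID ?
			ASIC_GOYA : ASIC_INVALID;
	return sim_type;
}
```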
  22. 02 Apr, 2019 2 commits
    • habanalabs: refactoring in goya.c · bedd1442
      Committed by Oded Gabbay
      This patch does some refactoring in goya.c to make code more reusable
      between goya code and the goya simulator code (which is not upstreamed).
      
      In addition, the patch removes some dead functions from goya.c which are
      not used by the current upstream code.
      Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
    • habanalabs: add new IOCTL for debug, tracing and profiling · 315bc055
      Committed by Omer Shpigelman
      Habanalabs ASICs use the ARM coresight infrastructure to support debug,
      tracing and profiling of neural network topologies.
      
      Because the coresight is configured using register writes and reads, and
      some of the registers hold sensitive information (e.g. the address in
      the device's DRAM where the trace data is written to), the user must go
      through the kernel driver to configure this mechanism.
      
      This patch implements the common code of the IOCTL and calls the
      ASIC-specific function for the actual H/W configuration.
      
      The IOCTL supports configuration of seven coresight components:
      ETR, ETF, STM, FUNNEL, BMON, SPMU and TIMESTAMP
      
      The user specifies which component to configure and provides a
      pointer to a structure (located in the user's process space) that
      contains the relevant configuration.
      
      The common code copies the relevant data from the user-space to kernel
      space and then calls the ASIC-specific function to do the H/W
      configuration.
      
      After the configuration is done, which is usually composed
      of several IOCTL calls depending on what the user wanted to trace, the
      user can start executing the topology. The trace data will be written to
      the user's area in the device's DRAM.
      
      After the tracing operation is complete, the user will call the IOCTL
      again to disable the tracing operation. The user also needs to read
      values from registers for some of the components (e.g. the size of the
      trace data in the device's DRAM). In that case, the user will provide a
      pointer to an "output" structure in user-space, which the IOCTL code will
      fill according to the selected component.
      Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
      Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
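The split between common code and the ASIC-specific hook described in this commit might look roughly like the sketch below. Everything here is hypothetical (the enum, the structs, `debug_ioctl_common`, `recording_asic_cfg`, the size limit), and `memcpy()` stands in for the kernel's copy from user space.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The seven coresight components named in the commit message. */
enum dbg_component {
	DBG_ETR, DBG_ETF, DBG_STM, DBG_FUNNEL,
	DBG_BMON, DBG_SPMU, DBG_TIMESTAMP,
	DBG_COMPONENT_COUNT
};

#define DBG_MAX_PARAMS 64 /* arbitrary limit for this sketch */

/* What the "user" passes in: a component and a pointer to its config. */
struct dbg_request {
	enum dbg_component op;
	const void *input;
	size_t input_size;
};

/* Kernel-side copy handed to the ASIC-specific code. */
struct dbg_params {
	enum dbg_component op;
	size_t size;
	uint8_t data[DBG_MAX_PARAMS];
};

typedef int (*asic_debug_fn)(const struct dbg_params *p);

/* Common part: validate, copy the parameters, call the ASIC hook. */
int debug_ioctl_common(const struct dbg_request *req, asic_debug_fn asic_cfg)
{
	struct dbg_params p;

	if (!req || !asic_cfg)
		return -EINVAL;
	if ((unsigned int)req->op >= DBG_COMPONENT_COUNT)
		return -EINVAL;
	if (req->input_size > DBG_MAX_PARAMS ||
	    (req->input_size && !req->input))
		return -EINVAL;

	memset(&p, 0, sizeof(p));
	p.op = req->op;
	p.size = req->input_size;
	if (req->input_size)
		memcpy(p.data, req->input, req->input_size);
	return asic_cfg(&p);
}

/* Example hook that just records what it was asked to configure. */
struct dbg_params last_seen;

int recording_asic_cfg(const struct dbg_params *p)
{
	last_seen = *p;
	return 0;
}
```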
  23. 24 Mar, 2019 1 commit
  24. 08 Mar, 2019 1 commit
    • habanalabs: keep track of the device's dma mask · d9973871
      Committed by Oded Gabbay
      This patch refactors the code that is responsible for setting the DMA
      mask for the device.
      
      Upon each change of the DMA mask, the driver will save the new value that
      was set. This is needed in order to make sure we don't try to increase the
      mask a second time if the first attempt failed. This is
      especially relevant for Power machines, as that may cause a change in
      configuration of the TVT which will break the device.
      
      Goya will first try to set the device's DMA mask to 39 bits, so that the
      memory that is allocated on the host machine for communication with the
      device's CPU will be at a bus address that fits in 39 bits. Later,
      Goya will try to increase that mask to 48 bits, but only if setting the
      mask to 39 bits was successful.
      Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
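The two-step mask policy can be sketched as follows. This is a userspace model under assumptions: `platform_set_mask` is a stand-in for the kernel's DMA-mask call, the `platform_limit` field is an invented way to make it fail, and returning the error of the 48-bit step to the caller is this sketch's choice.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical device state; field names are illustrative only. */
struct dma_dev {
	unsigned int dma_mask_bits;  /* last mask set successfully, 0 = none */
	unsigned int platform_limit; /* widest mask the fake platform accepts */
	unsigned int platform_calls; /* how many times the platform was asked */
};

/* Stand-in for the kernel's DMA-mask call: fails above the limit. */
int platform_set_mask(struct dma_dev *d, unsigned int bits)
{
	d->platform_calls++;
	return bits <= d->platform_limit ? 0 : -EIO;
}

/* Set the mask and remember it only if the platform accepted it. */
int set_dma_mask(struct dma_dev *d, unsigned int bits)
{
	int rc = platform_set_mask(d, bits);

	if (rc == 0)
		d->dma_mask_bits = bits;
	return rc;
}

/* 39 bits first; try 48 bits only when the 39-bit step succeeded. */
int setup_dma_masks(struct dma_dev *d)
{
	int rc;

	assert(d);
	rc = set_dma_mask(d, 39);
	if (rc)
		return rc;
	return set_dma_mask(d, 48);
}
```

Because the saved value changes only on success, later code can tell from `dma_mask_bits` which mask is actually in effect.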
  25. 03 Mar, 2019 1 commit
    • habanalabs: perform accounting for active CS · cbaa99ed
      Committed by Oded Gabbay
      This patch adds accounting for active CS. Active means that the CS was
      submitted to the H/W queues and has not completed yet.
      
      This is necessary to support suspend operation. Because the device will be
      reset upon suspend, we can only suspend after all active CS have been
      completed. Hence, we need to perform accounting on their number.
      Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
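A minimal sketch of the accounting idea: count submissions, uncount completions, and allow suspend only at zero. The names are hypothetical, returning `-EBUSY` is an assumption of this sketch, and a plain `int` is used for brevity where a real driver would need an atomic counter or a lock.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical device state holding the number of in-flight CS. */
struct cs_dev {
	int active_cs;
};

/* Called when a CS is handed to the H/W queues. */
void cs_submit(struct cs_dev *d)
{
	assert(d);
	d->active_cs++;
}

/* Called when the H/W reports that a CS has completed. */
void cs_complete(struct cs_dev *d)
{
	assert(d && d->active_cs > 0);
	d->active_cs--;
}

/* Suspend resets the device, so it is refused while work is in flight. */
int try_suspend(const struct cs_dev *d)
{
	assert(d);
	return d->active_cs ? -EBUSY : 0;
}
```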
  26. 05 Mar, 2019 1 commit
  27. 28 Feb, 2019 1 commit
    • habanalabs: disable CPU access on timeouts · a28ce422
      Committed by Oded Gabbay
      This patch provides a workaround for a bug in the F/W where the response
      time for a request from KMD may take more than 100ms. This could cause the
      queue between KMD and the F/W to get out of sync.
      
      The WA is to:
      1. Increase the timeout of ALL requests to 1s.
      2. In case a request isn't answered in time, mark the state as
      "cpu_disabled" and prevent sending further requests from KMD to the F/W.
      This will eventually lead to a heartbeat failure and hard reset of the
      device.
      Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
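The two steps of the workaround can be modelled in a few lines. This is a sketch with hypothetical names; the F/W latency is passed in as a parameter instead of being measured, and the specific error codes are assumptions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define REQ_TIMEOUT_MS 1000 /* step 1 of the workaround: 1s for all requests */

/* Hypothetical device state. */
struct cpu_dev {
	bool cpu_disabled;
};

/*
 * Send one request to the F/W. `fw_response_ms` simulates how long the
 * F/W takes to answer. After the first timeout, no further requests are
 * sent (step 2 of the workaround).
 */
int send_cpu_request(struct cpu_dev *d, unsigned int fw_response_ms)
{
	assert(d);
	if (d->cpu_disabled)
		return -EIO; /* error code chosen for this sketch */
	if (fw_response_ms > REQ_TIMEOUT_MS) {
		d->cpu_disabled = true;
		return -ETIMEDOUT;
	}
	return 0;
}
```

Once `cpu_disabled` is set, heartbeat requests are refused as well, which is what eventually triggers the hard reset mentioned in the commit message.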