1. 22 Sep, 2020 13 commits
  2. 22 Aug, 2020 2 commits
  3. 29 Jul, 2020 1 commit
  4. 25 Jul, 2020 13 commits
  5. 11 Jul, 2020 2 commits
  6. 25 May, 2020 2 commits
  7. 19 May, 2020 7 commits
    •
      habanalabs: move event handling to common firmware file · ebd8d122
      Ofir Bitton committed
      Instead of writing similar event handling code for each ASIC, move the code
      to the common firmware file. This code will be used for GAUDI and all
      future ASICs.
      
      In addition, add two new fields to the auto-generated events file: valid
      and description. This will save the need to manually write the events
      description in the source code and simplify the code.
      Signed-off-by: Ofir Bitton <obitton@habana.ai>
      Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
      Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
      ebd8d122
    •
      habanalabs: add gaudi asic-dependent code · ac0ae6a9
      Oded Gabbay committed
      Add the ASIC-dependent code for GAUDI. Supply (almost) all of the function
      callbacks that the driver's common code needs to initialize, finalize and
      submit workloads to the GAUDI ASIC.
      
      It also contains the code to initialize the F/W of the GAUDI ASIC and to
      receive events from the F/W.
      Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
      ac0ae6a9
    •
      habanalabs: support clock gating enable/disable · ca62433f
      Oded Gabbay committed
      In Gaudi there is a feature of clock gating certain engines.
      Therefore, add this property to the device structure.
      
      In addition, due to a limitation of this feature, the driver needs to
      dynamically enable or disable it during run-time. Therefore, add
      ASIC interface functions to enable/disable clock gating from the common
      code.
      
      Moreover, this feature must be turned off when the user wishes to debug the
      ASIC by reading/writing registers and/or memory through the driver's
      debugfs. Therefore, add an option to enable/disable clock gating via the
      debugfs interface.
      Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
      ca62433f
    •
      habanalabs: add dedicated define for hard reset · e09498b0
      Omer Shpigelman committed
      Gaudi requires longer waiting during reset due to closing of network ports.
      Add this explanation to the relevant comment in the code and add a
      dedicated define for this reset timeout period, instead of multiplying
      another define.
      Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
      Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
      Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
      e09498b0
    •
      habanalabs: check if CoreSight is supported · 9e5e49cd
      Omer Shpigelman committed
      CoreSight is not supported on the simulator, therefore add a boolean for
      checking that (currently used by un-upstreamed code).
      Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
      Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
      Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
      9e5e49cd
    •
      habanalabs: add signal/wait to CS IOCTL operations · b75f2250
      Omer Shpigelman committed
      Add the following two operations to the CS IOCTL:
      
      Signal:
      
      The signal operation is basically a command submission that is created by
      the driver upon user request. It will be implemented using a dedicated PQE
      that will increment a specific SOB. There will be a new flag:
      HL_CS_FLAGS_SIGNAL. When the user sets this flag in the CS IOCTL structure,
      the driver will execute a dedicated code path that will prepare this
      special PQE and submit it. The user only needs to provide a queue index on
      which to put the signal.
      
      Wait:
      
      The wait operation is also a command submission that is created by the
      driver upon user request. It will be implemented using a dedicated PQE that
      will contain packets of "ARM a monitor" + FENCE packet. There will be a new
      flag: HL_CS_FLAGS_WAIT. When the user sets this flag in the CS structure,
      the driver will execute a dedicated code path that will prepare this
      special PQE and submit it.
      
      The user needs to provide the following parameters:
      1. queue ID
      2. an array of signal_seq numbers and the number of signals to wait on
         (the length of signal_seq_arr).
      
      The IOCTL will return the CS sequence number of the wait it put on the
      queue ID.
      
      Currently, the code supports signal_seq_nr==1. But this API definition will
      allow us to put a single PQE that waits on multiple signals.
      
      To correctly configure the monitor and fence, the driver will need to
      retrieve the specified signal CS object that contains the relevant SOB and
      its expected value. If the signal CS has already completed, there
      is no point in adding a wait operation. In this case, the driver will
      return to the user *without* putting anything on the PQ. The return code
      should reflect to the user that the signal was completed, as we won't
      return a CS sequence number for this wait.
      Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
      Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
      Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
      b75f2250
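The wait path described in the commit above can be sketched roughly as follows. This is only an illustrative model, not the driver's code: `signal_cs_model`, `wait_result`, `submit_wait` and the `WAIT_*` status values are hypothetical names invented for this sketch, and the actual packet building (monitor arming and FENCE) is reduced to a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a signal CS object tracked by the driver. */
struct signal_cs_model {
    uint64_t seq;       /* sequence number returned when the signal was submitted */
    bool completed;     /* true once the signal's PQE has executed */
    uint16_t sob_id;    /* SOB the signal increments */
    uint16_t sob_val;   /* value the SOB is expected to reach */
};

/* Hypothetical status codes; the real return codes are not given in the log. */
enum wait_status { WAIT_QUEUED, WAIT_SIGNAL_ALREADY_DONE, WAIT_UNSUPPORTED };

struct wait_result {
    enum wait_status status;
    uint64_t wait_seq;  /* meaningful only when status == WAIT_QUEUED */
};

/*
 * Model of a wait submission. next_seq is the next free CS sequence number
 * and is advanced only when something is actually put on the queue.
 */
static struct wait_result submit_wait(const struct signal_cs_model *sig,
                                      uint32_t signal_seq_nr,
                                      uint64_t *next_seq)
{
    struct wait_result res = { WAIT_UNSUPPORTED, 0 };

    assert(sig && next_seq);

    /* The commit states that only a single signal is supported for now. */
    if (signal_seq_nr != 1)
        return res;

    /* Signal already completed: return without queuing anything. */
    if (sig->completed) {
        res.status = WAIT_SIGNAL_ALREADY_DONE;
        return res;
    }

    /*
     * A real driver would build the "arm monitor" + FENCE packets here from
     * sig->sob_id and sig->sob_val and place the PQE on the requested queue.
     */
    res.status = WAIT_QUEUED;
    res.wait_seq = (*next_seq)++;
    return res;
}
```

Keeping the "already completed" case as a distinct status, rather than a sequence number, mirrors the commit's requirement that the return code tell the user no wait was queued.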
    •
      habanalabs: handle the h/w sync object · b0b5d925
      Omer Shpigelman committed
      Define a structure representing the h/w sync object (SOB).
      
      A SOB can contain up to 2^15 values. Each signal CS will increment the SOB
      by 1, so after some time we will reach the maximum number the SOB can
      represent. When that happens, the driver needs to move to a different SOB
      for the signal operation.
      
      A SOB can be in 1 of 4 states:
      
      1. Working state with value < 2^15
      
      2. We reached a value of 2^15, but the signal operations weren't completed
      yet OR there are pending waits on this signal. For the next submission, the
      driver will move to another SOB.
      
      3. ALL the signal operations on the SOB have finished AND there are no more
      pending waits on the SOB AND we reached a value of 2^15 (This basically
      means the refcnt of the SOB is 0 - see explanation below). When that
      happens, the driver can clear the SOB by simply doing WREG32 0 to it and
      set the refcnt back to 1.
      
      4. The SOB is cleared and can be used next time by the driver when it needs
      to reuse an SOB.
      
      Per SOB, the driver will maintain a single refcnt, that will be initialized
      to 1. When a signal or wait operation on this SOB is submitted to the PQ,
      the refcnt will be incremented. When a signal or wait operation on this SOB
      completes, the refcnt will be decremented. After the submission of the
      signal operation that increments the SOB to a value of 2^15, the refcnt is
      also decremented.
      Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
      Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
      Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
      b0b5d925
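The refcount rules in the commit above can be modeled with a small user-space sketch. It is an assumption-laden illustration rather than the driver's implementation: `sob_model`, `SOB_MAX_VAL` and the `sob_*` helpers are hypothetical names, a plain integer stands in for whatever reference-counting primitive the driver uses, and clearing the SOB with WREG32 is represented by resetting a field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SOB_MAX_VAL (1u << 15)  /* hypothetical name for the 2^15 limit */

/* Hypothetical model of one h/w sync object; not the driver's structure. */
struct sob_model {
    uint32_t val;     /* value the SOB reaches once all submitted signals run */
    uint32_t refcnt;  /* initialized to 1, as described in the commit */
    bool cleared;     /* true when the SOB is reset and ready for reuse */
};

static void sob_init(struct sob_model *sob)
{
    sob->val = 0;
    sob->refcnt = 1;
    sob->cleared = true;
}

/* Drop one reference; at zero, clear the SOB and re-arm the refcnt to 1. */
static void sob_put(struct sob_model *sob)
{
    assert(sob->refcnt > 0);
    if (--sob->refcnt == 0) {
        sob->val = 0;        /* stands in for writing 0 to the SOB register */
        sob->refcnt = 1;
        sob->cleared = true;
    }
}

/*
 * Submit a signal on this SOB. Returns false when the SOB is exhausted,
 * meaning the caller has to move to a different SOB.
 */
static bool sob_submit_signal(struct sob_model *sob)
{
    if (sob->val >= SOB_MAX_VAL)
        return false;

    sob->cleared = false;
    sob->refcnt++;
    sob->val++;

    /* The signal that reaches 2^15 also drops the initial reference. */
    if (sob->val == SOB_MAX_VAL)
        sob_put(sob);
    return true;
}

/* A wait submitted on this SOB takes a reference. */
static void sob_submit_wait(struct sob_model *sob)
{
    sob->refcnt++;
}

/* Completion of any signal or wait on this SOB releases its reference. */
static void sob_op_completed(struct sob_model *sob)
{
    sob_put(sob);
}
```

In this model the SOB is cleared only after it has reached 2^15 and every signal and wait on it has completed, which corresponds to state 3 in the commit message; before the limit is reached, the initial reference keeps the count from ever dropping to zero.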