1. 21 Aug, 2020 4 commits
  2. 16 Jun, 2020 1 commit
  3. 08 May, 2020 1 commit
  4. 25 Apr, 2020 3 commits
  5. 03 Jan, 2020 7 commits
  6. 01 Oct, 2019 4 commits
    • scsi: mpt3sas: Add app owned flag support for diag buffer · a8a6cbcd
      Sreekanth Reddy committed
      Add a new status flag, MPT3_DIAG_BUFFER_IS_APP_OWNED. It is set whenever
      an application registers the diag buffer and cleared when the
      application unregisters the buffer.
      
      When this flag is set and the application issues a diag buffer register
      command without releasing the buffer, the register command fails with
      -EINVAL, indicating that the buffer is already registered by the
      application.
      
      When the user issues a trace buffer register command through the sysfs
      parameter and the trace buffer is in the released state but not yet
      unregistered by the application that owns it, the driver unregisters the
      buffer itself and registers a fresh 1MB trace buffer with the HBA
      firmware.
      
      Link: https://lore.kernel.org/r/1568379890-18347-9-git-send-email-sreekanth.reddy@broadcom.com
      Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
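The ownership rules in the commit above can be modelled in user space roughly as follows. This is only a sketch: the bit values, the `struct diag_buf` type and the three function names are hypothetical, and the early return for a still-active buffer in the sysfs path is an assumption that the commit text does not state.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical bit values; the driver's actual values are not given here. */
#define DIAG_BUFFER_IS_REGISTERED  (1u << 0)
#define DIAG_BUFFER_IS_RELEASED    (1u << 1)
#define DIAG_BUFFER_IS_APP_OWNED   (1u << 2)

#define SYSFS_TRACE_BUFFER_BYTES   (1024u * 1024u)  /* 1MB, per the commit text */

/* Hypothetical stand-in for the per-buffer state kept by the driver. */
struct diag_buf {
    unsigned int status;
    size_t size_bytes;
};

/* Application-issued register command. */
int app_register(struct diag_buf *b, size_t size_bytes)
{
    assert(b != NULL);
    /* Still owned by an application and never released: refuse. */
    if ((b->status & DIAG_BUFFER_IS_APP_OWNED) &&
        !(b->status & DIAG_BUFFER_IS_RELEASED))
        return -EINVAL;
    b->size_bytes = size_bytes;
    b->status = DIAG_BUFFER_IS_REGISTERED | DIAG_BUFFER_IS_APP_OWNED;
    return 0;
}

/* Application-issued unregister command: ownership ends here. */
void app_unregister(struct diag_buf *b)
{
    assert(b != NULL);
    b->status = 0;
    b->size_bytes = 0;
}

/* Trace buffer register requested through sysfs. */
void sysfs_register_trace(struct diag_buf *b)
{
    assert(b != NULL);
    /* Assumption: an active (registered, unreleased) buffer is left alone. */
    if ((b->status & DIAG_BUFFER_IS_REGISTERED) &&
        !(b->status & DIAG_BUFFER_IS_RELEASED))
        return;
    /* Released but still app-owned: the driver unregisters it itself ... */
    if (b->status & DIAG_BUFFER_IS_APP_OWNED)
        app_unregister(b);
    /* ... and registers a fresh 1MB trace buffer that no application owns. */
    b->size_bytes = SYSFS_TRACE_BUFFER_BYTES;
    b->status = DIAG_BUFFER_IS_REGISTERED;
}
```

A second `app_register()` call on an unreleased, app-owned buffer returns `-EINVAL`, while `sysfs_register_trace()` on a released but still app-owned buffer ends with a 1MB buffer and the app-owned bit cleared.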
    • scsi: mpt3sas: Reuse diag buffer allocated at load time · a066f4c3
      Sreekanth Reddy committed
      A diag buffer that is allocated at driver load time or through the sysfs
      parameter is marked as a driver-allocated diag buffer: the
      MPT3_DIAG_BUFFER_IS_DRIVER_ALLOCATED bit is set for it.
      
      This buffer is not deallocated even when an application issues an
      unregister command; the driver only clears the registered status bit.
      The same buffer is reused when any application re-registers the same
      diag buffer type. When re-registering, the application must use the same
      size with which the buffer was allocated at driver load time. The
      application can read this size by issuing the diag 'query' command.
      
      This ensures that memory is always available to applications for
      collecting firmware logs. The one limitation is that an application
      cannot re-register the diag buffer with a different size, but the size
      allocated at driver load time is enough for collecting firmware logs in
      most cases.
      
      Link: https://lore.kernel.org/r/1568379890-18347-8-git-send-email-sreekanth.reddy@broadcom.com
      Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
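The reuse rule described in the commit above can be illustrated with a small user-space model. The type, function names and bit values are hypothetical, `malloc`/`free` stand in for the driver's DMA allocation, and returning `-EINVAL` for a size mismatch is an assumption (the commit text does not name the error code).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bit values for this sketch only. */
#define DIAG_BUFFER_IS_REGISTERED        (1u << 0)
#define DIAG_BUFFER_IS_DRIVER_ALLOCATED  (1u << 1)

struct diag_buffer {
    void *mem;
    size_t size;
    unsigned int status;
};

/* Models the diag 'query' command: report the current buffer size. */
size_t diag_query_size(const struct diag_buffer *b)
{
    assert(b != NULL);
    return b->size;
}

/* Unregister: a driver-allocated buffer only loses its registered bit. */
void diag_unregister(struct diag_buffer *b)
{
    assert(b != NULL);
    b->status &= ~DIAG_BUFFER_IS_REGISTERED;
    if (!(b->status & DIAG_BUFFER_IS_DRIVER_ALLOCATED)) {
        free(b->mem);
        b->mem = NULL;
        b->size = 0;
    }
}

/* Register: reuse the driver-allocated memory if the size matches. */
int diag_register(struct diag_buffer *b, size_t requested)
{
    assert(b != NULL);
    if (b->status & DIAG_BUFFER_IS_REGISTERED)
        return -EINVAL;
    if (b->status & DIAG_BUFFER_IS_DRIVER_ALLOCATED) {
        /* Assumed error code for a mismatched size. */
        if (requested != b->size)
            return -EINVAL;
    } else {
        b->mem = malloc(requested);
        if (b->mem == NULL)
            return -ENOMEM;
        b->size = requested;
    }
    b->status |= DIAG_BUFFER_IS_REGISTERED;
    return 0;
}
```

An application that wants to re-register would first call `diag_query_size()` and pass the returned value to `diag_register()`, which keeps the original memory in place.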
    • scsi: mpt3sas: Register trace buffer based on NVDATA settings · d04a6edf
      Sreekanth Reddy committed
      Currently, if the user wishes to enable the host trace buffer at driver
      load time, the driver has to be loaded with the module parameter
      'diag_buffer_enable' set to one.
      
      Alternatively, the user can now enable the host trace buffer by enabling
      the following fields in manufacturing page11 in NVDATA (the NVDATA XML is
      used while building the HBA firmware image):
      
       * HostTraceBufferMaxSizeKB - maximum trace buffer size, in KB, that the
                                    host can allocate,
      
       * HostTraceBufferMinSizeKB - minimum trace buffer size, in KB, that the
                                    host should allocate,
      
       * HostTraceBufferDecrementSizeKB - size by which the host reduces the
                                    buffer size before retrying the allocation
                                    when allocation with the previously
                                    calculated buffer size has failed.
      
      The driver registers the trace buffer automatically at boot time,
      without any module parameter, when the above fields are enabled in
      manufacturing page11 in the HBA firmware.
      
      The driver follows this algorithm for enabling the host trace buffer at
      driver load time:
      
      * If the user has loaded the driver with the module parameter
        'diag_buffer_enable' set to one, the driver allocates a 2MB buffer and
        registers it with the HBA firmware for capturing firmware trace logs.
      
      * Otherwise, the driver reads the manufacturing page11 data and checks
        whether the HostTraceBufferMaxSizeKB field is zero.
      
        - If HostTraceBufferMaxSizeKB is non-zero, the driver tries to allocate
          HostTraceBufferMaxSizeKB of memory. If the allocation succeeds, the
          driver registers the buffer with the HBA firmware. Otherwise, the
          driver retries in a loop, reducing the current buffer size by
          HostTraceBufferDecrementSizeKB each time, until the allocation
          succeeds or the buffer size falls below HostTraceBufferMinSizeKB. If
          the allocation succeeds, the buffer is registered with the firmware.
          If the buffer size falls below HostTraceBufferMinSizeKB, the driver
          does not register a trace buffer with the HBA firmware.
      
        - If HostTraceBufferMaxSizeKB is zero, the driver does not register a
          trace buffer with the HBA firmware.
      
      Link: https://lore.kernel.org/r/1568379890-18347-2-git-send-email-sreekanth.reddy@broadcom.com
      Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
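The sizing algorithm from the commit above can be sketched as a small retry loop. Everything named here (`struct trace_cfg`, `pick_trace_buffer_kb`, the `alloc_fn` callback and the `fake_alloc` test double) is hypothetical and exists only for illustration; the guard against a zero or oversized decrement is an added assumption to keep the loop finite.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DEFAULT_TRACE_KB 2048u  /* 2MB, used when diag_buffer_enable is set */

/* The three manufacturing page11 fields, in KB. */
struct trace_cfg {
    unsigned int max_kb;
    unsigned int min_kb;
    unsigned int dec_kb;
};

typedef void *(*alloc_fn)(size_t bytes);

/* Test double: pretends that requests above fake_limit_kb fail. */
unsigned int fake_limit_kb = 1024;

void *fake_alloc(size_t bytes)
{
    return bytes <= (size_t)fake_limit_kb * 1024u ? malloc(bytes) : NULL;
}

/*
 * Returns the size in KB of the buffer that would be registered and stores
 * the memory in *out, or returns 0 (with *out == NULL) if none is registered.
 */
unsigned int pick_trace_buffer_kb(int diag_buffer_enable,
                                  const struct trace_cfg *cfg,
                                  alloc_fn alloc, void **out)
{
    unsigned int size;

    assert(cfg != NULL && alloc != NULL && out != NULL);
    *out = NULL;

    if (diag_buffer_enable) {
        *out = alloc((size_t)DEFAULT_TRACE_KB * 1024u);
        return *out ? DEFAULT_TRACE_KB : 0;
    }
    if (cfg->max_kb == 0)
        return 0;  /* feature not enabled in NVDATA */

    size = cfg->max_kb;
    while (size > 0 && size >= cfg->min_kb) {
        *out = alloc((size_t)size * 1024u);
        if (*out)
            return size;
        /* Assumption: stop rather than loop forever or wrap around. */
        if (cfg->dec_kb == 0 || cfg->dec_kb > size)
            break;
        size -= cfg->dec_kb;
    }
    return 0;  /* fell below the minimum: no trace buffer */
}
```

With `max_kb = 4096`, `min_kb = 512`, `dec_kb = 1024` and an allocator that can satisfy at most 2048 KB, the loop tries 4096, 3072 and then succeeds at 2048.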
  7. 30 Aug, 2019 1 commit
  8. 08 Aug, 2019 5 commits
  9. 27 Jun, 2019 1 commit
    • scsi: mpt3sas: Determine smp affinity on per HBA basis · 610ef1e9
      Sreekanth Reddy committed
      Even when the 'smp_affinity_enable' module parameter is enabled, if the
      number of online CPUs is greater than the number of MSI-X vectors
      enabled on an HBA, then the smp affinity settings should be disabled
      only for that HBA.
      
      Currently, however, the smp affinity setting is disabled globally, so
      smp affinity is also disabled for subsequent HBAs even when the number
      of MSI-X vectors enabled on those HBAs matches the number of online
      CPUs.
      
      To fix this, define a per-HBA variable smp_affinity_enable. This
      variable is initialized with the value of the smp_affinity_enable module
      parameter. If an HBA has fewer MSI-X vectors configured than there are
      online CPUs, then only that HBA's smp_affinity_enable variable is set to
      zero.
      
      Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
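A minimal sketch of the per-HBA decision described above, assuming a hypothetical `struct hba` and `hba_init_affinity` helper (neither name comes from the driver):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-adapter state for this sketch. */
struct hba {
    unsigned int msix_vectors;   /* MSI-X vectors enabled on this HBA */
    int smp_affinity_enable;     /* per-HBA copy of the module parameter */
};

/* Start from the module parameter, then clear it only for this HBA if needed. */
void hba_init_affinity(struct hba *h, int module_param,
                       unsigned int online_cpus)
{
    assert(h != NULL);
    h->smp_affinity_enable = module_param;
    if (online_cpus > h->msix_vectors)
        h->smp_affinity_enable = 0;
}
```

Because the module parameter itself is never modified, an HBA with too few vectors does not affect the HBAs initialized after it.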
  10. 19 Jun, 2019 7 commits
  11. 19 Mar, 2019 4 commits
    • scsi: mpt3sas: Update mpt3sas driver version to 28.100.00.00 · 4bcb298e
      Suganath Prabu committed
      Update the driver version to 28.100.00.00, which is equivalent to OOB Phase 9.
      
      Signed-off-by: Suganath Prabu <suganath-prabu.subramani@broadcom.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: mpt3sas: Improve the threshold value and introduce module param · 288addd6
      Suganath Prabu committed
      * Reduce the threshold value to 1/4 of the queue depth.
      
      * With this, the firmware can find enough entries to post the reply
        descriptors in the reply descriptor post queue.
      
      * With the module parameter, the user can tune the threshold value. The
        same irqpoll_weight is used as the budget when processing reply
        descriptor post queues in _base_process_reply_queue.
      
      Signed-off-by: Suganath Prabu <suganath-prabu.subramani@broadcom.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
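One plausible reading of the threshold rule above is sketched here. The function name is hypothetical, and the convention that a non-positive `irqpoll_weight` selects the default of one quarter of the queue depth is an assumption, not something the commit text states.

```c
#include <assert.h>

/*
 * Threshold (and poll budget) selection.
 * Assumption: irqpoll_weight <= 0 means "not set by the user".
 */
unsigned int reply_threshold(unsigned int queue_depth, int irqpoll_weight)
{
    assert(queue_depth > 0);
    if (irqpoll_weight > 0)
        return (unsigned int)irqpoll_weight;
    return queue_depth / 4;
}
```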
    • scsi: mpt3sas: Load balance to improve performance and avoid soft lockups · 51e3b2ad
      Suganath Prabu committed
      The driver uses the "reply descriptor post queues" in round-robin
      fashion so that IOs are distributed equally across all available reply
      descriptor post queues. This balances the load on each reply descriptor
      post queue.
      
      This is enabled only if the ratio of CPU count to MSI-X vector count is
      X:1 (where X > 1). This improves performance and also fixes soft
      lockups.
      
      Signed-off-by: Suganath Prabu <suganath-prabu.subramani@broadcom.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
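Round-robin queue selection of the kind described above can be sketched with a shared atomic counter. The struct and function names are hypothetical, and C11 `<stdatomic.h>` is used here as a user-space stand-in for whatever counter the driver actually uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical per-adapter state for this sketch. */
struct hba_rr {
    atomic_uint next;                /* shared submission counter */
    unsigned int reply_queue_count;  /* number of reply descriptor post queues */
};

/* Round robin applies only when CPUs outnumber MSI-X vectors (X:1, X > 1). */
int use_round_robin(unsigned int cpus, unsigned int msix_vectors)
{
    return msix_vectors > 0 && cpus > msix_vectors;
}

/* Each submission takes the next queue index in turn. */
unsigned int pick_reply_queue(struct hba_rr *h)
{
    assert(h != NULL && h->reply_queue_count > 0);
    return atomic_fetch_add(&h->next, 1u) % h->reply_queue_count;
}
```

With three queues, successive calls to `pick_reply_queue()` return 0, 1, 2, 0 and so on, regardless of which CPU submits.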
    • scsi: mpt3sas: Irq poll to avoid CPU hard lockups · 320e77ac
      Suganath Prabu committed
      Issue description:
      CPU lockups have been reported from the field on systems with a large
      logical CPU count (more than 96). The SAS3.0 controller (Invader series)
      supports at most 96 MSI-X vectors and the SAS3.5 product (Ventura)
      supports at most 128 MSI-X vectors.
      
      This may be a generic issue for any PCI device that supports completion
      on multiple reply queues. It is explained here with respect to the
      hardware supported by mpt3sas to simplify the problem and the possible
      changes to handle it. The IT HBA (mpt3sas) supports multiple reply
      queues in the completion path. The driver creates MSI-X vectors for the
      controller as "min of (FW supported reply queues, logical CPUs)". If the
      submitter is not interrupted via completion on the same CPU, a loop
      forms in the IO path. This behavior can cause hard/soft CPU lockups, IO
      timeouts, system sluggishness, etc.
      
      Example: one CPU (e.g. CPU A) is busy submitting IOs and another CPU
      (e.g. CPU B) is busy processing the corresponding reply descriptors
      from the reply descriptor queue upon receiving interrupts from the HBA.
      If CPU A pumps IOs continuously, CPU B (which is executing the ISR)
      always sees valid reply descriptors in the reply descriptor queue and
      keeps processing them in a loop without leaving the ISR handler.
      
      The mpt3sas driver exits the ISR handler when it finds an unused reply
      descriptor in the reply descriptor queue. Since CPU A sends IOs
      continuously, CPU B may always see a valid reply descriptor (posted by
      the HBA firmware after processing the IO) in the reply descriptor queue.
      In the worst case, the driver never leaves this loop in the ISR handler.
      Eventually, the watchdog detects a CPU lockup.
      
      The behavior described above is uncommon if "rq_affinity" is set to 2 or
      affinity_hint is honored by irqbalance as "exact". If rq_affinity is set
      to 2, the submitter is always interrupted via completion on the same
      CPU. If irqbalance uses the "exact" policy, the interrupt is delivered to
      the submitter CPU.
      
      If the ratio of CPU count to MSI-X vector (reply descriptor queue) count
      is not 1:1, the issue explained above can still occur, and so far there
      has been no solution for it.
      
      Exposure to soft/hard lockups when the CPU count exceeds the number of
      MSI-X vectors supported by the device:
      
      If the ratio of CPU count to MSI-X vector count is not 1:1 (in other
      words, the ratio is X:1 where X > 1), then neither the 'exact'
      irqbalance policy nor rq_affinity = 2 helps to avoid CPU hard/soft
      lockups. There is no one-to-one mapping between CPUs and MSI-X vectors;
      instead, one MSI-X interrupt (or reply descriptor queue) is shared by a
      group of CPUs, so a loop can form in the IO path within that CPU group
      and lockups may be observed.
      
      For example, consider a system with two NUMA nodes, each with four
      logical CPUs, and an HBA with two MSI-X vectors enabled. The ratio of
      CPU count to MSI-X vector count is then 4:1: MSI-X vector 0 has affinity
      to CPU 0, CPU 1, CPU 2 and CPU 3 of NUMA node 0, and MSI-X vector 1 has
      affinity to CPU 4, CPU 5, CPU 6 and CPU 7 of NUMA node 1.
      
      numactl --hardware
      available: 2 nodes (0-1)
      node 0 cpus: 0 1 2 3                 --> MSI-x 0
      node 0 size: 65536 MB
      node 0 free: 63176 MB
      node 1 cpus: 4 5 6 7                 --> MSI-x 1
      node 1 size: 65536 MB
      node 1 free: 63176 MB
      
      Assume that the user starts an application that uses all the CPUs of
      NUMA node 0 to issue IOs. Only one CPU from the affinity list, say CPU 0
      (it can be any CPU, since this depends on irqbalance), receives the
      interrupts from MSI-X vector 0 for all the IOs. Over time, the share of
      CPU 0's time spent submitting IOs decreases and the share spent in ISR
      processing increases, as it is increasingly busy handling interrupts.
      Gradually, IO submission on CPU 0 drops to zero and its ISR processing
      reaches 100 percent, because an IO loop has formed within NUMA node 0:
      CPU 1, CPU 2 and CPU 3 are continuously busy submitting heavy IO, and
      only CPU 0 is busy in the ISR path, as it always finds a valid reply
      descriptor in the reply descriptor queue. Eventually, a hard lockup is
      observed.
      
      The likelihood of hard/soft lockups increases with the value of X: the
      higher X is, the higher the chance of observing CPU lockups.
      
      Solution: use the IRQ poll interface defined in "irq_poll.c". The
      mpt3sas driver executes the ISR routine in softirq context and always
      leaves the loop based on the budget provided to the IRQ poll interface.
      
      In these scenarios (i.e. where the ratio of CPU count to MSI-X vector
      count is X:1, with X > 1), the IRQ poll interface avoids CPU hard
      lockups through a voluntary exit from reply queue processing based on
      the budget. Note that only one MSI-X vector is busy doing the
      processing.
      
      Irqstat output:
      
      IRQs / 1 second(s)
      IRQ#   TOTAL   NODE0   NODE1  NODE2  NODE3  NAME
        44  122871  122871       0      0      0  IR-PCI-MSI-edge mpt3sas0-msix0
        45       0       0       0      0      0  IR-PCI-MSI-edge mpt3sas0-msix1
      
      This approach is used only if the CPU count is greater than the number
      of MSI-X vectors supported by the firmware.
      
      Signed-off-by: Suganath Prabu <suganath-prabu.subramani@broadcom.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
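The budget-based exit described in the commit above can be modelled in user space as follows. This sketch does not call the kernel's IRQ poll API; `struct reply_queue`, `process_reply_queue` and `poll_once` are hypothetical names, and the `irq_enabled` flag merely stands in for re-arming the interrupt once the queue has been drained.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one reply descriptor queue. */
struct reply_queue {
    unsigned int pending;  /* valid descriptors posted by firmware */
};

/* Process at most 'budget' descriptors; return how many were handled. */
int process_reply_queue(struct reply_queue *q, int budget)
{
    int done = 0;

    assert(q != NULL && budget > 0);
    while (done < budget && q->pending > 0) {
        q->pending--;  /* stands in for handling one reply descriptor */
        done++;
    }
    return done;
}

/*
 * One poll pass. Using less than the full budget means the queue is empty,
 * so polling can stop and the interrupt can be re-enabled. Using the full
 * budget means the caller should yield and poll again later.
 */
int poll_once(struct reply_queue *q, int budget, int *irq_enabled)
{
    int done;

    assert(irq_enabled != NULL);
    done = process_reply_queue(q, budget);
    if (done < budget)
        *irq_enabled = 1;
    return done;
}
```

Even if another CPU keeps refilling `pending`, each pass returns after at most `budget` descriptors, which is the voluntary exit that breaks the loop described in the commit message.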
  12. 05 Feb, 2019 2 commits