1. 06 Feb 2019 (8 commits)
    • scsi: lpfc: Adapt partitioned XRI lists to efficient sharing · c490850a
      James Smart committed
      The XRI get/put lists were partitioned per hardware queue. However, the
      adapter rarely had sufficient resources to give each queue a large
      allocation. As such, it became common for a cpu to run out of XRI
      resources and ask the upper io stack to retry after returning a BUSY
      condition. This occurred even though other cpus were idle and not using
      their resources.
      
      Create as efficient a scheme as possible to move resources to the cpus
      that need them. Each cpu maintains a small private pool which it
      allocates from for io. There is a watermark that the cpu attempts to
      keep in the private pool.  The private pool, when empty, pulls from the
      cpu's global pool. When the cpu's global pool is empty it will pull from
      other cpus' global pools. As there are many cpu global pools (1 per cpu
      or hardware queue count) and as each cpu selects which cpu to pull from
      at different rates and at different times, it creates a randomizing
      effect that minimizes the number of cpus that contend with each other
      when stealing XRIs from another cpu's global pool.
      
      On io completion, a cpu will push the XRI back onto its private pool.  A
      watermark level is maintained for the private pool such that when it is
      exceeded, XRIs are moved to the cpu's global pool so that other cpus may
      allocate them.
      
      On NVME, as heartbeat commands are critical to get placed on the wire, a
      single expedite pool is maintained. When a heartbeat is to be sent, it
      will allocate an XRI from the expedite pool rather than the normal cpu
      private/global pools. On any io completion, if a reduction in the
      expedite pool is seen, it will be replenished before the XRI is placed
      on the cpu private pool.
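
      The allocation and completion flow described above can be sketched in
      plain C. This is a minimal sketch, not driver code: the structure,
      counters, and function names are all illustrative, and the real driver
      manages linked lists of XRI buffers with per-pool locks.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_CPUS 4

/* Hypothetical per-cpu pool counters; the real driver keeps lists of
 * XRI buffers, watermarks and locks per hardware queue. */
struct xri_pool {
    int private_cnt;   /* cpu-local pool, lock-light fast path */
    int global_cnt;    /* per-cpu global pool others may steal from */
    int watermark;     /* target depth of the private pool */
};

static struct xri_pool pools[NUM_CPUS];

/* Allocation: private pool first, then own global pool, then steal
 * from another cpu's global pool. Returns false only when every pool
 * is empty (the upper layer would then see BUSY and retry). */
static bool xri_get(int cpu)
{
    struct xri_pool *p = &pools[cpu];

    if (p->private_cnt > 0) {
        p->private_cnt--;
        return true;
    }
    if (p->global_cnt > 0) {               /* refill path, one XRI at a time */
        p->global_cnt--;
        return true;
    }
    for (int i = 0; i < NUM_CPUS; i++) {   /* steal from a peer's global pool */
        if (i != cpu && pools[i].global_cnt > 0) {
            pools[i].global_cnt--;
            return true;
        }
    }
    return false;
}

/* Completion: push back onto the private pool; anything beyond the
 * watermark spills to this cpu's global pool for others to use. */
static void xri_put(int cpu)
{
    struct xri_pool *p = &pools[cpu];

    if (p->private_cnt < p->watermark)
        p->private_cnt++;
    else
        p->global_cnt++;
}
```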
      
      Statistics are added to aid in understanding the XRI levels on each cpu
      and their behavior.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      c490850a
    • scsi: lpfc: Synchronize hardware queues with SCSI MQ interface · ace44e48
      James Smart committed
      Now that the lower half has much better per-cpu parallelization using the
      hardware queues, the SCSI MQ support needs to be tied into it.
      
      This involves the following mods:
      
       - Use the hardware queue info from the midlayer to help select the
         hardware queue to utilize. This required changes to the
         get_scsi_buf_xxx routines.
      
       - Remove the lpfc_sli4_scmd_to_wqidx_distr() routine. No longer needed.
      
       - Includes a fix for SLI-3, which does not have multi-queue
         parallelization.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      ace44e48
    • scsi: lpfc: Convert ring number to hardware queue for nvme wqe posting. · 1fbf9742
      James Smart committed
      SLI4 nvme functions are passing the SLI3 ring number when posting wqe to
      hardware. This should be indicating the hardware queue to use, not the ring
      number.
      
      Replace ring number with the hardware queue that should be used.
      
      Note: SCSI avoided this issue as it utilized an older lpfc_issue_iocb
      routine that properly adapts.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      1fbf9742
    • scsi: lpfc: Move SCSI and NVME Stats to hardware queue structures · 4c47efc1
      James Smart committed
      Many io statistics were being sampled and saved using adapter-based data
      structures. This was creating a lot of contention and cache thrashing in
      the I/O path.
      
      Move the statistics to the hardware queue data structures.  Given the
      per-queue data structures, use of atomic types is lessened.
      
      Add new sysfs and debugfs stat routines to collate the per-hardware-queue
      values and report them at the adapter level.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      4c47efc1
    • scsi: lpfc: Partition XRI buffer list across Hardware Queues · 5e5b511d
      James Smart committed
      Once the IO buffer allocations were made shared, there was a single XRI
      buffer list shared by all hardware queues.  A single list isn't great for
      performance when shared across the per-cpu hardware queues.
      
      Create a separate XRI IO buffer get/put list for each Hardware Queue.  As
      SGLs and associated IO buffers get allocated/posted to the firmware,
      round-robin their assignment across all available Hardware Queues so that
      there is an equitable assignment.
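
      The equitable round-robin spread can be sketched with a hypothetical
      helper (not the driver's actual routine): buffer i lands on queue
      i mod nqueues, so queues fill evenly.

```c
#include <assert.h>

/* Illustrative round-robin assignment of newly posted IO buffers
 * across hardware queues; per_hwq[] counts buffers per queue. */
static void distribute_io_bufs(int nbufs, int nhwq, int per_hwq[])
{
    for (int i = 0; i < nbufs; i++)
        per_hwq[i % nhwq]++;
}
```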
      
      Modify SCSI and NVME IO submit code paths to use the Hardware Queue logic
      for XRI allocation.
      
      Add a debugfs interface to display hardware queue statistics.
      
      Add a new empty_io_bufs counter to track if a cpu runs out of XRIs.
      
      Replace common_ variables/names with io_ to make meanings clearer.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      5e5b511d
    • scsi: lpfc: Replace io_channels for nvme and fcp with general hdw_queues per cpu · cdb42bec
      James Smart committed
      Currently, both nvme and fcp each have their own concept of an io_channel,
      which is a combination wq/cq and associated msix.  Different cpus would
      share an io_channel.
      
      The driver is now moving to per-cpu wq/cq pairs and msix vectors.  The
      driver will still use separate wq/cq pairs per protocol on each cpu, but
      the protocols will share the msix vector.
      
      Given the elimination of the nvme and fcp io channels, the module
      parameters will be removed.  A new parameter, lpfc_hdw_queue, is added
      which allows the wq/cq pair allocation per cpu to be overridden and
      allocated to a lesser value. If lpfc_hdw_queue is zero, the number of
      pairs allocated will be based on the number of cpus. If non-zero, the
      parameter specifies the number of queues to allocate. At this time, the
      maximum non-zero value is 64.
      
      To manage this new paradigm, a new hardware queue structure is created to
      track queue activity and relationships.
      
      As MSIX vector allocation must be known before setting up the
      relationships, msix allocation now occurs before queue data structures
      are allocated. If the number of vectors allocated is less than the
      desired hardware queues, the hardware queue counts will be reduced to
      the number of vectors.
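
      The sizing rule described above can be summarized in a small sketch
      (function names are illustrative, not the driver's code): zero means
      one wq/cq pair per cpu, a non-zero value caps the count at 64, and the
      final count can never exceed the MSIX vectors actually granted.

```c
#include <assert.h>

#define LPFC_MAX_HDW_QUEUE 64   /* stated maximum for the new parameter */

static int min_int(int a, int b) { return a < b ? a : b; }

/* Illustrative sizing rule for the number of hardware queues. */
static int hdw_queue_count(int lpfc_hdw_queue, int ncpus, int msix_vectors)
{
    int want = lpfc_hdw_queue ? min_int(lpfc_hdw_queue, LPFC_MAX_HDW_QUEUE)
                              : ncpus;
    return min_int(want, msix_vectors);     /* clamp to granted vectors */
}
```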
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      cdb42bec
    • scsi: lpfc: Remove extra vector and SLI4 queue for Expresslane · 7370d10a
      James Smart committed
      There is an extra queue and msix vector for expresslane. Now that the
      driver will be doing queues per cpu, this oddball queue is no longer
      needed. Expresslane will utilize the normal per-cpu queues.
      
      Update the debugfs sli4 queue output to go along with the change.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      7370d10a
    • scsi: lpfc: Implement common IO buffers between NVME and SCSI · 0794d601
      James Smart committed
      Currently, both NVME and SCSI get their IO buffers from separate
      pools. XRIs are associated 1:1 with IO buffers, so XRIs are also split
      between protocols.
      
      Eliminate the independent pools and use a single pool. Each buffer
      structure now has a common section and a protocol section. Per protocol
      routines for SGL initialization are removed and replaced by common
      routines. Initialization of the buffers is only done on the common area.
      All other fields, which are protocol specific, are initialized when the
      buffer is allocated for use in the per-protocol allocation routine.
      
      In the past, the SCSI side allocated IO buffers as part of slave_alloc
      calls until the maximum XRIs for SCSI was reached. As all XRIs are now
      common and may be used for either protocol, allocation for everything is
      done as part of adapter initialization and the scsi side has no action in
      slave alloc.
      
      As XRIs are no longer split, the lpfc_xri_split module parameter is
      removed.
      
      Adapters based on SLI3 will continue to use the older
      scsi_buf_list_get/put routines.  All SLI4 adapters utilize the new IO
      buffer scheme.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      0794d601
  2. 20 Dec 2018 (1 commit)
  3. 08 Dec 2018 (5 commits)
  4. 29 Nov 2018 (1 commit)
  5. 16 Nov 2018 (1 commit)
  6. 07 Nov 2018 (4 commits)
  7. 17 Oct 2018 (1 commit)
  8. 03 Oct 2018 (1 commit)
  9. 12 Sep 2018 (3 commits)
    • scsi: lpfc: add support to retrieve firmware logs · d2cc9bcd
      James Smart committed
      This patch adds the ability to read firmware logs from the adapter. The
      driver registers a host buffer with the adapter, and the adapter writes
      log data into it. The adapter posts CQEs to indicate content updates in
      the buffer. While the adapter is writing to the buffer in a circular
      fashion, an application will poll the driver to read the next amount of
      log data from the buffer.
      
      The driver log buffer size is configurable via the ras_fwlog_buffsize
      sysfs attribute. The verbosity used by firmware when logging to host
      memory is controlled through the ras_fwlog_level attribute.  The
      ras_fwlog_func attribute enables or disables logging by firmware.
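
      A minimal sketch of the circular-buffer handoff described above
      (illustrative only: the real driver DMA-maps the registered buffer, is
      notified via CQEs rather than polling offsets, and full-buffer
      overwrite handling is omitted here):

```c
#include <assert.h>
#include <string.h>

#define FWLOG_BUF_SZ 16   /* tiny stand-in for ras_fwlog_buffsize */

/* Illustrative circular log buffer: the "firmware" advances a write
 * offset, the host application polls and consumes from a read offset. */
struct fwlog {
    char buf[FWLOG_BUF_SZ];
    int wr, rd;           /* offsets wrap modulo the buffer size */
};

/* Firmware side: append n bytes, wrapping around the end. */
static void fwlog_write(struct fwlog *l, const char *d, int n)
{
    for (int i = 0; i < n; i++)
        l->buf[(l->wr + i) % FWLOG_BUF_SZ] = d[i];
    l->wr = (l->wr + n) % FWLOG_BUF_SZ;
}

/* Host side: drain whatever is available, up to max bytes. */
static int fwlog_read(struct fwlog *l, char *out, int max)
{
    int n = 0;

    while (l->rd != l->wr && n < max) {
        out[n++] = l->buf[l->rd];
        l->rd = (l->rd + 1) % FWLOG_BUF_SZ;
    }
    return n;
}
```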
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      d2cc9bcd
    • scsi: lpfc: Correct irq handling via locks when taking adapter offline · 523128e5
      James Smart committed
      When taking the board offline while performing i/o, unsafe locking errors
      occurred and the irq level isn't properly managed.
      
      In lpfc_sli_hba_down, spin_lock_irqsave(&phba->hbalock, flags) does not
      disable softirqs raised from timer expiry.  It is possible that a softirq
      raised from the lpfc_els_retry_delay routine recursively requests the
      same phba->hbalock spinlock, causing a deadlock.
      
      Address the deadlocks by creating a new port_list lock. The softirq behavior
      can then be managed a level deeper into the calling sequences.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      523128e5
    • scsi: lpfc: raise sg count for nvme to use available sg resources · 5b9e70b2
      James Smart committed
      The driver allocates a sg list per io structure based on a fixed maximum
      size. When it registers with the protocol transports and indicates the
      max sg list size it supports, the driver manipulates the fixed value to
      report a lesser amount so that it has reserved space for sg elements
      that are used for DIF.
      
      The driver initialization path sets the cfg_sg_seg_cnt field to the
      manipulated value for scsi. NVME initialization ran afterward and capped
      its maximum by the manipulated value for SCSI. This erroneously made
      NVME report the SCSI-reduced-for-DIF value, which reduced the max io
      size for nvme and wasted sg elements.
      
      Rework the driver so that cfg_sg_seg_cnt becomes the overall maximum
      size and allow the max size to be tunable.  A separate (new) scsi sg
      count is then set up with the scsi-modified reduced value. NVME then
      initializes based off the overall maximum.
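
      The before/after sizing can be illustrated with a toy calculation. The
      DIF reserve value below is made up for illustration; the real reserve
      depends on the protection scheme and hardware limits.

```c
#include <assert.h>

#define DIF_RESERVED_SGES 2   /* hypothetical per-IO reserve for DIF */

/* After the rework: cfg_sg_seg_cnt holds the overall maximum, SCSI
 * derives its own reduced count, and NVME uses the full maximum. */
static int scsi_sg_seg_cnt(int cfg_sg_seg_cnt)
{
    return cfg_sg_seg_cnt - DIF_RESERVED_SGES;   /* room kept for DIF */
}

static int nvme_sg_seg_cnt(int cfg_sg_seg_cnt)
{
    return cfg_sg_seg_cnt;   /* no longer capped by the SCSI value */
}
```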
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      5b9e70b2
  10. 11 Jul 2018 (3 commits)
  11. 13 Jun 2018 (1 commit)
    • treewide: kzalloc() -> kcalloc() · 6396bb22
      Kees Cook committed
      The kzalloc() function has a 2-factor argument form, kcalloc(). This
      patch replaces cases of:
      
              kzalloc(a * b, gfp)
      
      with:
              kcalloc(a, b, gfp)
      
      as well as handling cases of:
      
              kzalloc(a * b * c, gfp)
      
      with:
      
              kzalloc(array3_size(a, b, c), gfp)
      
      as it's slightly less ugly than:
      
              kzalloc_array(array_size(a, b), c, gfp)
      
      This does, however, attempt to ignore constant size factors like:
      
              kzalloc(4 * 1024, gfp)
      
      though any constants defined via macros get caught up in the conversion.
      
      Any factors with a sizeof() of "unsigned char", "char", and "u8" were
      dropped, since they're redundant.
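
      A userspace analogy shows why the 2-factor form matters: calloc(),
      like kcalloc(), detects multiplication overflow, while an open-coded
      multiply silently wraps. This is a sketch with standard C library
      calls, not kernel code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* kzalloc(n * size, gfp) vs kcalloc(n, size, gfp), with calloc()
 * standing in for kcalloc(). */
static int demo_overflow(void)
{
    size_t huge = SIZE_MAX / 2 + 1;

    /* An open-coded multiply wraps around: huge * 2 == 0, so a manual
     * kzalloc-style size computation would request a zero-byte buffer. */
    assert(huge * 2 == 0);

    /* The 2-factor form detects the overflow and fails cleanly. */
    void *p = calloc(huge, 2);
    assert(p == NULL);
    return 0;
}
```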
      
      The Coccinelle script used for this was:
      
      // Fix redundant parens around sizeof().
      @@
      type TYPE;
      expression THING, E;
      @@
      
      (
        kzalloc(
      -	(sizeof(TYPE)) * E
      +	sizeof(TYPE) * E
        , ...)
      |
        kzalloc(
      -	(sizeof(THING)) * E
      +	sizeof(THING) * E
        , ...)
      )
      
      // Drop single-byte sizes and redundant parens.
      @@
      expression COUNT;
      typedef u8;
      typedef __u8;
      @@
      
      (
        kzalloc(
      -	sizeof(u8) * (COUNT)
      +	COUNT
        , ...)
      |
        kzalloc(
      -	sizeof(__u8) * (COUNT)
      +	COUNT
        , ...)
      |
        kzalloc(
      -	sizeof(char) * (COUNT)
      +	COUNT
        , ...)
      |
        kzalloc(
      -	sizeof(unsigned char) * (COUNT)
      +	COUNT
        , ...)
      |
        kzalloc(
      -	sizeof(u8) * COUNT
      +	COUNT
        , ...)
      |
        kzalloc(
      -	sizeof(__u8) * COUNT
      +	COUNT
        , ...)
      |
        kzalloc(
      -	sizeof(char) * COUNT
      +	COUNT
        , ...)
      |
        kzalloc(
      -	sizeof(unsigned char) * COUNT
      +	COUNT
        , ...)
      )
      
      // 2-factor product with sizeof(type/expression) and identifier or constant.
      @@
      type TYPE;
      expression THING;
      identifier COUNT_ID;
      constant COUNT_CONST;
      @@
      
      (
      - kzalloc
      + kcalloc
        (
      -	sizeof(TYPE) * (COUNT_ID)
      +	COUNT_ID, sizeof(TYPE)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(TYPE) * COUNT_ID
      +	COUNT_ID, sizeof(TYPE)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(TYPE) * (COUNT_CONST)
      +	COUNT_CONST, sizeof(TYPE)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(TYPE) * COUNT_CONST
      +	COUNT_CONST, sizeof(TYPE)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(THING) * (COUNT_ID)
      +	COUNT_ID, sizeof(THING)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(THING) * COUNT_ID
      +	COUNT_ID, sizeof(THING)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(THING) * (COUNT_CONST)
      +	COUNT_CONST, sizeof(THING)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(THING) * COUNT_CONST
      +	COUNT_CONST, sizeof(THING)
        , ...)
      )
      
      // 2-factor product, only identifiers.
      @@
      identifier SIZE, COUNT;
      @@
      
      - kzalloc
      + kcalloc
        (
      -	SIZE * COUNT
      +	COUNT, SIZE
        , ...)
      
      // 3-factor product with 1 sizeof(type) or sizeof(expression), with
      // redundant parens removed.
      @@
      expression THING;
      identifier STRIDE, COUNT;
      type TYPE;
      @@
      
      (
        kzalloc(
      -	sizeof(TYPE) * (COUNT) * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kzalloc(
      -	sizeof(TYPE) * (COUNT) * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kzalloc(
      -	sizeof(TYPE) * COUNT * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kzalloc(
      -	sizeof(TYPE) * COUNT * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kzalloc(
      -	sizeof(THING) * (COUNT) * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      |
        kzalloc(
      -	sizeof(THING) * (COUNT) * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      |
        kzalloc(
      -	sizeof(THING) * COUNT * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      |
        kzalloc(
      -	sizeof(THING) * COUNT * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      )
      
      // 3-factor product with 2 sizeof(variable), with redundant parens removed.
      @@
      expression THING1, THING2;
      identifier COUNT;
      type TYPE1, TYPE2;
      @@
      
      (
        kzalloc(
      -	sizeof(TYPE1) * sizeof(TYPE2) * COUNT
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
        , ...)
      |
        kzalloc(
      -	sizeof(TYPE1) * sizeof(TYPE2) * (COUNT)
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
        , ...)
      |
        kzalloc(
      -	sizeof(THING1) * sizeof(THING2) * COUNT
      +	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
        , ...)
      |
        kzalloc(
      -	sizeof(THING1) * sizeof(THING2) * (COUNT)
      +	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
        , ...)
      |
        kzalloc(
      -	sizeof(TYPE1) * sizeof(THING2) * COUNT
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
        , ...)
      |
        kzalloc(
      -	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
        , ...)
      )
      
      // 3-factor product, only identifiers, with redundant parens removed.
      @@
      identifier STRIDE, SIZE, COUNT;
      @@
      
      (
        kzalloc(
      -	(COUNT) * STRIDE * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kzalloc(
      -	COUNT * (STRIDE) * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kzalloc(
      -	COUNT * STRIDE * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kzalloc(
      -	(COUNT) * (STRIDE) * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kzalloc(
      -	COUNT * (STRIDE) * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kzalloc(
      -	(COUNT) * STRIDE * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kzalloc(
      -	(COUNT) * (STRIDE) * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kzalloc(
      -	COUNT * STRIDE * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      )
      
      // Any remaining multi-factor products, first at least 3-factor products,
      // when they're not all constants...
      @@
      expression E1, E2, E3;
      constant C1, C2, C3;
      @@
      
      (
        kzalloc(C1 * C2 * C3, ...)
      |
        kzalloc(
      -	(E1) * E2 * E3
      +	array3_size(E1, E2, E3)
        , ...)
      |
        kzalloc(
      -	(E1) * (E2) * E3
      +	array3_size(E1, E2, E3)
        , ...)
      |
        kzalloc(
      -	(E1) * (E2) * (E3)
      +	array3_size(E1, E2, E3)
        , ...)
      |
        kzalloc(
      -	E1 * E2 * E3
      +	array3_size(E1, E2, E3)
        , ...)
      )
      
      // And then all remaining 2 factors products when they're not all constants,
      // keeping sizeof() as the second factor argument.
      @@
      expression THING, E1, E2;
      type TYPE;
      constant C1, C2, C3;
      @@
      
      (
        kzalloc(sizeof(THING) * C2, ...)
      |
        kzalloc(sizeof(TYPE) * C2, ...)
      |
        kzalloc(C1 * C2 * C3, ...)
      |
        kzalloc(C1 * C2, ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(TYPE) * (E2)
      +	E2, sizeof(TYPE)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(TYPE) * E2
      +	E2, sizeof(TYPE)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(THING) * (E2)
      +	E2, sizeof(THING)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(THING) * E2
      +	E2, sizeof(THING)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	(E1) * E2
      +	E1, E2
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	(E1) * (E2)
      +	E1, E2
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	E1 * E2
      +	E1, E2
        , ...)
      )
      Signed-off-by: Kees Cook <keescook@chromium.org>
      6396bb22
  12. 29 May 2018 (2 commits)
  13. 08 May 2018 (5 commits)
  14. 19 Apr 2018 (3 commits)
  15. 13 Mar 2018 (1 commit)