1. 11 Feb 2020, 4 commits
  2. 22 Dec 2019, 3 commits
  3. 20 Dec 2019, 1 commit
  4. 22 Nov 2019, 1 commit
  5. 13 Nov 2019, 1 commit
  6. 09 Nov 2019, 2 commits
    • scsi: lpfc: Fix lpfc_cpumask_of_node_init() · 61951a6d
      Committed by Bart Van Assche
      Fix the following kernel warning:
      
      cpumask_of_node(-1): (unsigned)node >= nr_node_ids(1)
      
      Fixes: dcaa2136 ("scsi: lpfc: Change default IRQ model on AMD architectures")
      Link: https://lore.kernel.org/r/20191108225947.1395-1-jsmart2021@gmail.com
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      61951a6d
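      The warning above means cpumask_of_node() was handed NUMA_NO_NODE (-1), e.g. on a
      single-node system where the device reports no numa affinity. Below is a minimal sketch of
      the kind of guard that avoids the warning, using a hypothetical helper name rather than the
      actual lpfc change:

          /* Hedged sketch (needs <linux/numa.h>, <linux/topology.h>, <linux/device.h>):
           * never pass NUMA_NO_NODE into cpumask_of_node().
           */
          static const struct cpumask *example_node_mask(struct device *dev)
          {
                  int node = dev_to_node(dev);

                  if (node == NUMA_NO_NODE)
                          return cpu_online_mask;   /* fall back to all online CPUs */
                  return cpumask_of_node(node);
          }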
    • scsi: lpfc: Fix a kernel warning triggered by lpfc_sli4_enable_intr() · eea2d396
      Committed by Bart Van Assche
      Fix the following lockdep warning:
      
      ============================================
      WARNING: possible recursive locking detected
      5.4.0-rc6-dbg+ #2 Not tainted
      --------------------------------------------
      systemd-udevd/130 is trying to acquire lock:
      ffffffff826b05d0 (cpu_hotplug_lock.rw_sem){++++}, at: irq_calc_affinity_vectors+0x63/0x90
      
      but task is already holding lock:
      
      ffffffff826b05d0 (cpu_hotplug_lock.rw_sem){++++}, at: lpfc_sli4_enable_intr+0x422/0xd50 [lpfc]
      
      other info that might help us debug this:
      
       Possible unsafe locking scenario:
             CPU0
             ----
        lock(cpu_hotplug_lock.rw_sem);
        lock(cpu_hotplug_lock.rw_sem);
      
      *** DEADLOCK ***
       May be due to missing lock nesting notation
      2 locks held by systemd-udevd/130:
       #0: ffff8880d53fe210 (&dev->mutex){....}, at: __device_driver_lock+0x4a/0x70
       #1: ffffffff826b05d0 (cpu_hotplug_lock.rw_sem){++++}, at: lpfc_sli4_enable_intr+0x422/0xd50 [lpfc]
      
      stack backtrace:
      CPU: 1 PID: 130 Comm: systemd-udevd Not tainted 5.4.0-rc6-dbg+ #2
      Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      Call Trace:
       dump_stack+0xa5/0xe6
       __lock_acquire.cold+0xf7/0x23a
       lock_acquire+0x106/0x240
       cpus_read_lock+0x41/0xe0
       irq_calc_affinity_vectors+0x63/0x90
       __pci_enable_msix_range+0x10a/0x950
       pci_alloc_irq_vectors_affinity+0x144/0x210
       lpfc_sli4_enable_intr+0x4b2/0xd50 [lpfc]
       lpfc_pci_probe_one+0x1411/0x22b0 [lpfc]
       local_pci_probe+0x7c/0xc0
       pci_device_probe+0x25d/0x390
       really_probe+0x170/0x510
       driver_probe_device+0x127/0x190
       device_driver_attach+0x98/0xa0
       __driver_attach+0xb6/0x1a0
       bus_for_each_dev+0x100/0x150
       driver_attach+0x31/0x40
       bus_add_driver+0x246/0x300
       driver_register+0xe0/0x170
       __pci_register_driver+0xde/0xf0
       lpfc_init+0x134/0x1000 [lpfc]
       do_one_initcall+0xda/0x47e
       do_init_module+0x10a/0x3b0
       load_module+0x4318/0x47c0
       __do_sys_finit_module+0x134/0x1d0
       __x64_sys_finit_module+0x47/0x50
       do_syscall_64+0x6f/0x2e0
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Fixes: dcaa2136 ("scsi: lpfc: Change default IRQ model on AMD architectures")
      Link: https://lore.kernel.org/r/20191107052158.25788-4-bvanassche@acm.org
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Reviewed-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      eea2d396
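      The splat above shows lpfc_sli4_enable_intr() already holding cpu_hotplug_lock (taken via
      cpus_read_lock()) when pci_alloc_irq_vectors_affinity() tries to acquire it again inside
      irq_calc_affinity_vectors(). The following is a hedged sketch of that constraint with
      placeholder names, not the exact lpfc patch:

          /* pci_alloc_irq_vectors_affinity() takes cpus_read_lock() internally
           * (via irq_calc_affinity_vectors()), so it must not be called with
           * cpu_hotplug_lock already read-held by the caller.
           */
          static int example_enable_msix(struct pci_dev *pdev, int nvecs)
          {
                  struct irq_affinity desc = { 0 };

                  /* WRONG: cpus_read_lock() here would recurse on cpu_hotplug_lock. */
                  /* cpus_read_lock(); */

                  return pci_alloc_irq_vectors_affinity(pdev, 1, nvecs,
                                  PCI_IRQ_MSIX | PCI_IRQ_AFFINITY, &desc);
          }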
  7. 06 Nov 2019, 2 commits
    • scsi: lpfc: Change default IRQ model on AMD architectures · dcaa2136
      Committed by James Smart
      The current driver attempts to allocate an interrupt vector per cpu using the system's
      managed IRQ allocator (flag PCI_IRQ_AFFINITY). The system IRQ allocator will either provide
      the per-cpu vectors or return fewer vectors. When fewer vectors are returned, they are
      spread evenly between the numa nodes on the system.  On an AMD architecture, if an interrupt
      is delivered to a cpu that is not in the same numa node as the adapter generating the
      interrupt, the performance cost and overhead are extreme.  Thus, whether 1:1 vector
      allocation is used or the vectors are "balanced" across the other numa nodes, performance
      can suffer significantly.
      
      A much more performant model is to allocate interrupts only on the cpus that are in the numa
      node where the adapter resides.  I/O completion is still performed by the cpu where the I/O
      was generated. Unfortunately, there is no flag to request that the managed IRQ subsystem
      allocate vectors only for the CPUs in the same numa node as the adapter.

      On AMD architectures, revert the irq allocation to the normal (non-managed) style and then
      use irq_set_affinity_hint() to set the cpu affinity and disable user-space rebalancing, as
      sketched below.
      
      Tie the support into CPU offline/online. If the cpu being offlined owns a vector, the vector
      is re-affinitized to one of the other CPUs on the same numa node. If there are no more CPUs
      on the numa node, all affinity is removed from the vector and the system determines where it
      is serviced.  Similarly, when a cpu that owned a vector comes back online, the vector is
      re-affinitized to that cpu.
      
      Link: https://lore.kernel.org/r/20191105005708.7399-10-jsmart2021@gmail.com
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      dcaa2136
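      A minimal sketch of the non-managed allocation plus affinity-hint pattern described above;
      the function name and the single-node policy are illustrative, not the actual lpfc code:

          static int example_enable_intr(struct pci_dev *pdev, int want_vecs)
          {
                  /* Pin vectors to the CPUs of the adapter's numa node. */
                  const struct cpumask *mask =
                          cpumask_of_node(dev_to_node(&pdev->dev));
                  int vecs, i;

                  /* No PCI_IRQ_AFFINITY: the kernel neither spreads nor manages
                   * these vectors. */
                  vecs = pci_alloc_irq_vectors(pdev, 1, want_vecs, PCI_IRQ_MSIX);
                  if (vecs < 0)
                          return vecs;

                  /* The affinity hint steers each vector onto the adapter's node
                   * and, as described above, discourages user-space rebalancing. */
                  for (i = 0; i < vecs; i++)
                          irq_set_affinity_hint(pci_irq_vector(pdev, i), mask);

                  return vecs;
          }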
    • scsi: lpfc: Add registration for CPU Offline/Online events · 93a4d6f4
      Committed by James Smart
      The recent affinitization didn't address cpu offlining/onlining.  If an interrupt vector is
      shared and the low order cpu owning the vector is offlined, then, because the interrupts are
      managed, the vector is taken offline. This causes the other CPUs sharing the vector to hang,
      as they can't get io completions.
      
      Correct by registering callbacks with the system for CPU Offline/Online events (see the
      sketch below). When a cpu is taken offline, its eq, which is tied to an interrupt vector, is
      found. If the cpu is the "owner" of the vector and the eq/vector is shared by other CPUs,
      the eq is placed into a polled mode.  Additionally, code paths that perform io submission on
      the "sharing CPUs" will check the eq state and poll for completion after submission of new
      io to a wq that uses the eq.

      Similarly, when a cpu comes back online and owns an offlined vector, the eq is taken out of
      polled mode and rearmed to start driving interrupts for the eq.
      
      Link: https://lore.kernel.org/r/20191105005708.7399-9-jsmart2021@gmail.com
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      93a4d6f4
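      A hedged sketch of the hotplug-callback registration described above, using the generic
      cpuhp multi-instance API; the callback names and bodies are placeholders, not the actual
      lpfc functions:

          /* Needs <linux/cpuhotplug.h>. */
          static int example_cpu_online(unsigned int cpu, struct hlist_node *node)
          {
                  /* Re-enable interrupt-driven mode for the eq owned by this cpu. */
                  return 0;
          }

          static int example_cpu_offline(unsigned int cpu, struct hlist_node *node)
          {
                  /* If this cpu owns a shared vector, switch its eq to polled mode. */
                  return 0;
          }

          static int example_register_cpuhp(struct hlist_node *node)
          {
                  int state = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
                                  "example/lpfc:online",
                                  example_cpu_online, example_cpu_offline);
                  if (state < 0)
                          return state;
                  return cpuhp_state_add_instance_nocalls(state, node);
          }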
  8. 29 Oct 2019, 2 commits
  9. 25 Oct 2019, 6 commits
  10. 18 Oct 2019, 1 commit
  11. 01 Oct 2019, 3 commits
  12. 30 Aug 2019, 2 commits
  13. 20 Aug 2019, 12 commits
    • scsi: lpfc: Merge per-protocol WQ/CQ pairs into single per-cpu pair · c00f62e6
      Committed by James Smart
      Currently, each hardware queue, typically allocated per-cpu, consists of a WQ/CQ pair per
      protocol. Meaning, if both SCSI and NVMe are supported, 2 WQ/CQ pairs will exist for the
      hardware queue. Separate queues are unnecessary. The current implementation wastes memory
      backing the 2nd set of queues, and using double the SLI-4 WQ/CQs means fewer hardware queues
      can be supported, which means there may not always be enough to have a pair per cpu. If
      there is only 1 pair per hardware queue, more cpus may get their own WQ/CQ.
      
      Rework the implementation so that a single WQ/CQ pair is used by both protocols (illustrated
      below).
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      c00f62e6
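      A purely hypothetical data-structure view of the change; the struct and field names below
      are illustrative and do not match the driver's real definitions:

          /* Before: one WQ/CQ pair per protocol in every hardware queue. */
          struct example_hdwq_old {
                  struct example_queue *scsi_wq, *scsi_cq;   /* SCSI pair */
                  struct example_queue *nvme_wq, *nvme_cq;   /* NVMe pair */
          };

          /* After: a single pair shared by SCSI and NVMe, halving the SLI-4
           * WQ/CQ consumption per hardware queue. */
          struct example_hdwq_new {
                  struct example_queue *io_wq;
                  struct example_queue *io_cq;
          };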
    • scsi: lpfc: Add NVMe sequence level error recovery support · 0d8af096
      Committed by James Smart
      FC-NVMe-2 added support for sequence level error recovery in the FC-NVME
      protocol. This allows for the detection of errors and lost frames and
      immediate retransmission of data to avoid exchange termination, which
      escalates into NVMeoFC connection and association failures. A significant
      RAS improvement.
      
      The driver is modified to indicate support for SLER in the NVMe PRLI it issues and to check
      for support in the PRLI response.  When both sides support it, the driver will set a bit in
      the WQE to enable the recovery behavior on the exchange. The adapter will take care of all
      detection and retransmission.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      0d8af096
    • scsi: lpfc: Support dynamic unbounded SGL lists on G7 hardware. · d79c9e9d
      Committed by James Smart
      Typical SLI-4 hardware supports up to two 4KB pages to be registered per XRI to contain the
      exchange's Scatter/Gather List. This caps the number of SGL elements that can be in the SGL.
      There are no extensions to extend the list beyond the 2 pages.
      
      The G7 hardware adds an SGE type that allows the SGL to be vectored to a different
      scatter/gather list segment, and that segment can contain an SGE pointing to yet another
      segment, and so on.  The initial segment must still be pre-registered for the XRI, but it
      can be much smaller (256 bytes) since the list can now be grown dynamically.  This much
      smaller allocation can handle the SG list for most normal I/O, and the dynamic aspect allows
      it to support many MBs if needed.
      
      The implementation creates a pool of "segments" which is initially sized to hold the initial
      small segment per xri. If an I/O requires additional segments, they are allocated from the
      pool (see the sketch below).  If the pool has no more segments, the pool is grown based on
      what is now needed. After the I/O completes, the additional segments are returned to the
      pool for use by other I/Os. Once allocated, the additional segments are not released, under
      the assumption that "if needed once, it will be needed again". Pools are kept on a
      per-hardware-queue basis, which is typically 1:1 per cpu, but may be shared by multiple
      cpus.
      
      The switch to the smaller initial allocation significantly reduces the memory footprint of
      the driver (which only grows if large I/Os are issued). Based on the several thousand XRIs
      for the adapter, the 8KB->256B reduction can conserve 32MB or more.
      
      It has been observed with per-cpu resource pools that a resource allocated on CPU A may be
      put back on CPU B. While the get routines are distributed evenly, only a limited subset of
      CPUs may be handling the put routines. This can put a strain on the
      lpfc_put_cmd_rsp_buf_per_cpu routine because all the resources are being put on a limited
      subset of CPUs.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      d79c9e9d
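      A hedged sketch of the per-hardware-queue segment pool described above; the names
      (example_sgl_pool, example_seg) are illustrative, and the real driver's locking, DMA mapping
      and growth policy are omitted:

          struct example_seg {
                  struct list_head list;
                  void *sgl;                      /* 256-byte SGL segment */
          };

          struct example_sgl_pool {
                  spinlock_t lock;
                  struct list_head free_list;     /* grown on demand, never shrunk */
          };

          static struct example_seg *example_get_seg(struct example_sgl_pool *pool)
          {
                  struct example_seg *seg;

                  spin_lock(&pool->lock);
                  seg = list_first_entry_or_null(&pool->free_list,
                                                 struct example_seg, list);
                  if (seg)
                          list_del(&seg->list);
                  spin_unlock(&pool->lock);

                  if (!seg)
                          seg = kzalloc(sizeof(*seg), GFP_ATOMIC);  /* grow the pool */
                  return seg;
          }

          static void example_put_seg(struct example_sgl_pool *pool,
                                      struct example_seg *seg)
          {
                  spin_lock(&pool->lock);
                  list_add_tail(&seg->list, &pool->free_list);   /* kept for reuse */
                  spin_unlock(&pool->lock);
          }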
    • scsi: lpfc: Migrate to %px and %pf in kernel print calls · 32350664
      Committed by James Smart
      In order to see real addresses, replace %p with %px for kernel addresses, and replace %p
      with %pf for function pointers (example below).

      While converting, standardize on "x%px" throughout (not %px or 0x%px).
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      32350664
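      A one-line illustration of the format-specifier change; ptr and fn are placeholders for any
      kernel pointer and function pointer:

          /* %p would hash the address; x%px prints the raw value and %pf
           * prints the function's symbol name. */
          pr_info("ctx x%px handler %pf\n", ptr, fn);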
    • scsi: lpfc: Fix sli4 adapter initialization with MSI · 07b1b914
      Committed by James Smart
      When forcing the use of MSI (vs MSI-X), the driver crashes in pci_irq_get_affinity().

      The driver was not using the new pci_alloc_irq_vectors() interface in the MSI path.

      Fix by using pci_alloc_irq_vectors() with PCI_IRQ_MSI in the MSI path (see the sketch
      below).
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      07b1b914
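      A minimal sketch of the MSI fallback path, with a placeholder function name rather than the
      actual lpfc routine:

          /* Allocate a single MSI vector through pci_alloc_irq_vectors() so the
           * pci_irq_* helpers (including pci_irq_get_affinity()) have valid
           * state to work with. */
          static int example_enable_msi(struct pci_dev *pdev)
          {
                  int rc = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_MSI);

                  return rc < 0 ? rc : 0;
          }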
    • scsi: lpfc: Fix nvme sg_seg_cnt display if HBA does not support NVME · 6a224b47
      Committed by James Smart
      The driver is currently reporting a non-zero nvme sg_seg_cnt value of 256
      when nvme is disabled. It should be zero.
      
      Fix by ensuring the value is cleared.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      6a224b47
    • scsi: lpfc: Fix hang when downloading fw on port enabled for nvme · 84f2ddf8
      Committed by James Smart
      As part of firmware download, the adapter is reset. On the adapter the
      reset causes the function to stop and all outstanding io is terminated
      (without responses). The reset path then starts teardown of the adapter,
      starting with deregistration of the remote ports with the nvme-fc
      transport. The local port is then deregistered and the driver waits for
      local port deregistration. This never finishes.
      
      The remote port deregistrations terminated the nvme controllers, causing them to send aborts
      for all the outstanding io. The aborts were serviced in the driver, but stalled due to its
      state. The nvme layer then waits to reclaim its outstanding io before continuing.  The io
      must be returned before the reset on the controller is deemed complete and the controller
      delete performed.  The remote port deregistration won't complete until all the controllers
      are terminated. And the local port deregistration won't complete until all controllers and
      remote ports are terminated. Thus things hang.
      
      The issue is that the reset which stopped the adapter also stopped all the responses that
      would drive i/o completions, and the aborts that would otherwise drive completions were
      stopped as well. The driver, when resetting the adapter like this, needs to generate the
      completions itself as part of the adapter reset so that I/Os complete (in error) and no
      aborts are queued.
      
      Fix by adding flush routines whenever the adapter port has been reset or
      discovered in error. The flush routines will generate the completions for
      the scsi and nvme outstanding io. The abort ios, if waiting, will be caught
      and flushed as well.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      84f2ddf8
    • scsi: lpfc: Fix crash due to port reset racing vs adapter error handling · 8c24a4f6
      Committed by James Smart
      If the adapter encounters a condition which causes the adapter to fail (the driver must
      detect the failure) simultaneously with a request to the driver to reset the adapter (such
      as a host_reset), the reset path will race with the asynchronously-detected adapter failure
      path.  In the failing situation, one path has started to tear down the adapter data
      structures (io_wq's) while the other path has initiated a repeat of the teardown and is in
      the lpfc_sli_flush_xxx_rings path, attempting to access the just-freed data structures.
      
      Fix by the following:
      
       - In cases where an adapter failure is detected, rather than explicitly
         calling offline_eratt() to start the teardown, change the adapter state
         and let the later calls of posted work to the slowpath thread invoke the
         adapter recovery.  In essence, this means all requests to reset are
         serialized on the slowpath thread.
      
       - Clean up the routine that restarts the adapter. If there is a failure
         from brdreset, don't immediately error and leave things in a partial
         state. Instead, ensure the adapter state is set and finish the teardown
         of structures before returning.
      
       - If in the scsi host reset handler and the board fails to reset and
         restart (which can be due to parallel reset/recovery paths), instead of
         hard failing and explicitly calling offline_eratt() (which gets into the
         redundant path), just fail out and let the asynchronous path resolve the
         adapter state.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      8c24a4f6
    • scsi: lpfc: Fix sg_seg_cnt for HBAs that don't support NVME · c26c265b
      Committed by James Smart
      On an SLI-3 adapter which does not support NVMe, but with the driver's global attribute set
      to enable nvme on any adapter that does support it (e.g. module parameter
      lpfc_enable_fc4_type=3), the SGL and total SGE values are being munged by the protocol
      enablement when they shouldn't be.

      Correct by changing the location where the NVME sgl information is applied, which avoids any
      SLI-3-based adapter.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      c26c265b
    • scsi: lpfc: Fix oops when fewer hdwqs than cpus · 3ad348d9
      Committed by James Smart
      When tearing down the adapter for a reset, online/offline, or driver unload, the queue free
      routine would hit a GPF oops.  This only occurs when the number of hardware queues created
      is fewer than the number of cpus in the system, in which case cpus share a hardware queue.
      It is the 2nd cpu sharing a hardware queue that attempted to free it a second time and hit
      the oops.
      
      Fix by reworking the cpu to hardware queue mapping such that assignment of hardware queues
      to cpus occurs in two passes (sketched below):
      first pass: the first-time assignment of a hardware queue to a cpu.
        This sets the LPFC_CPU_FIRST_IRQ flag for the cpu.
      second pass: cpus that did not get a hardware queue are assigned one
        from a primary cpu (one set in the first pass).

      Deletion of hardware queues is driven by cpu iteration, and queues will only be deleted if
      the LPFC_CPU_FIRST_IRQ flag is set.
      
      Also contains a few small cleanup fixes and a little better logging.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      3ad348d9
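      A hedged sketch of the two-pass assignment and flag-guarded teardown described above. The
      map layout, helper names, and flag value are placeholders; only the LPFC_CPU_FIRST_IRQ name
      comes from the commit text:

          #define LPFC_CPU_FIRST_IRQ 0x1          /* value illustrative only */

          struct example_cpu_map {
                  u16 hdwq;
                  u16 flags;                      /* LPFC_CPU_FIRST_IRQ on the owner */
          };

          static void example_map_hdwqs(struct example_cpu_map *map,
                                        int nr_cpus, int nr_hwqs)
          {
                  int cpu;

                  /* Pass 1: each hardware queue gets a first owner cpu. */
                  for (cpu = 0; cpu < nr_cpus && cpu < nr_hwqs; cpu++) {
                          map[cpu].hdwq = cpu;
                          map[cpu].flags |= LPFC_CPU_FIRST_IRQ;
                  }

                  /* Pass 2: remaining cpus share a queue owned by a primary cpu. */
                  for (; cpu < nr_cpus; cpu++)
                          map[cpu].hdwq = map[cpu % nr_hwqs].hdwq;
          }

          static void example_free_hdwqs(struct example_cpu_map *map, int nr_cpus)
          {
                  int cpu;

                  for (cpu = 0; cpu < nr_cpus; cpu++) {
                          if (!(map[cpu].flags & LPFC_CPU_FIRST_IRQ))
                                  continue;       /* not the owner; skip */
                          /* free/destroy hardware queue map[cpu].hdwq here,
                           * exactly once, via its owning cpu */
                  }
          }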
    • scsi: lpfc: Fix failure to clear non-zero eq_delay after io rate reduction · 8d34a59c
      Committed by James Smart
      Unusually high IO latency can be observed with little IO in progress. The
      latency may remain high regardless of amount of IO and can only be cleared
      by forcing lpfc_fcp_imax values to non-zero and then back to zero.
      
      The driver's eq_delay mechanism that scales the interrupt coalescing based
      on io completion load failed to reduce or turn off coalescing when load
      decreased. Specifically, if no io completed on a cpu within an eq_delay
      polling window, the eq delay processing was skipped and no change was made
      to the coalescing values. This left the coalescing values set when they
      were no longer applicable.
      
      Fix by always clearing the percpu counters for each time period and always running the
      eq_delay calculation when an eq has a non-zero coalescing value (see the sketch below).
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      8d34a59c
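      A hedged sketch of the corrected per-period eq_delay step; the struct, field names, and
      scaling formula below are placeholders, not lpfc's actual values:

          struct example_eq {
                  u32 coalesce_usecs;        /* current eq_delay value */
                  u32 period_completions;    /* io completions in this window */
          };

          static void example_eq_delay_tick(struct example_eq *eq)
          {
                  /* Run even when no io completed in the window, so a stale
                   * non-zero coalescing value is ramped back down to zero. */
                  if (eq->period_completions || eq->coalesce_usecs)
                          eq->coalesce_usecs =
                                  min(eq->period_completions / 1000, 16U);

                  /* Always clear the per-period counter for the next window. */
                  eq->period_completions = 0;
          }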
    • scsi: lpfc: Fix crash on driver unload in wq free · 3cee98db
      Committed by James Smart
      If a timer routine uses workqueues, it could fire before the workqueue is
      allocated.
      
      Fix by allocating the workqueue before the timer routines are set up (see the sketch below).
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      3cee98db
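      A hedged sketch of the initialization-order fix described above; the structure and function
      names are placeholders for the driver's setup path:

          struct example_hba {
                  struct workqueue_struct *wq;
                  struct work_struct work;
                  struct timer_list poll_timer;
          };

          static void example_timer_fn(struct timer_list *t)
          {
                  struct example_hba *hba = from_timer(hba, t, poll_timer);

                  /* Safe: the workqueue already exists when this can fire. */
                  queue_work(hba->wq, &hba->work);
          }

          static int example_setup(struct example_hba *hba)
          {
                  /* Allocate the workqueue first ... */
                  hba->wq = alloc_workqueue("example_wq", WQ_MEM_RECLAIM, 0);
                  if (!hba->wq)
                          return -ENOMEM;

                  /* ... so a timer firing early has a valid queue to use. */
                  timer_setup(&hba->poll_timer, example_timer_fn, 0);
                  return 0;
          }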