1. 01 Dec, 2016 11 commits
    • scsi: cxlflash: Cleanup send_tmf() · d4ace351
      Matthew R. Ochs committed
      The send_tmf() routine includes some copy/paste cruft that can be
      removed as well as the setting of an AFU command-specific field while
      holding the tmf_slock. While not a bug, it is out of place and
      should be shifted down alongside the other command initialization
      statements for clarity.
      Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Acked-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: cxlflash: Remove AFU command lock · 9ba848ac
      Matthew R. Ochs committed
      The original design of the cxlflash driver required AFU commands
      to convey state information across multiple threads. The IOASA
      "host use" byte was used to track if a command was done, errored,
      or timed out. A per-command spin lock was used to serialize access
      to this byte. As this is no longer required with the introduction
      of completions and various refactoring over time, the spin lock,
      state tracking, and associated code can be removed. To support the
      simplification, the wait_resp() routine is refactored to return a
      success or failure. Additionally, as a simplification to the AFU
      internal command routine, explicit assignments of AFU command fields
      to zero are removed because the memory is zeroed upon allocation.
      Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Acked-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: cxlflash: Wait for active AFU commands to timeout upon tear down · de01283b
      Matthew R. Ochs committed
      With the removal of the static private command pool, the ability to
      'complete' outstanding commands was lost. While not an issue for the
      commands originating outside the driver, internal AFU commands are
      synchronous and therefore have a timeout associated with them. To
      avoid a stale memory access, the tear down sequence needs to ensure
      that there are not any active commands before proceeding. As these
      internal AFU commands are rare events, the simplest way to accomplish
      this is to detect the activity and wait for it to time out.
      Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Acked-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: cxlflash: Remove private command pool · 25bced2b
      Matthew R. Ochs committed
      Clean up and remove the remaining private command pool infrastructure
      that is no longer required.
      Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Acked-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: cxlflash: Use cmd_size for private commands · 5fbb96c8
      Matthew R. Ochs committed
      Instead of using a private pool of AFU commands, use cmd_size to prime
      the private pool of SCSI commands such that they are allocated with a
      size large enough to contain an aligned AFU command. Use scsi_cmd_priv()
      to derive the aligned/zeroed private command on queuecommand and TMF
      paths. Remove cmd_checkout() as it is no longer required. The remaining
      AFU private command infrastructure will be removed in a cleanup commit.
      Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Acked-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
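The idea in the commit above, sizing the per-command private area so that an aligned AFU command always fits inside it, can be sketched in user-space C. This is only an illustration and not the driver's code: `struct afu_cmd_sketch`, `CMD_ALIGN`, `priv_size()` and `priv_to_cmd()` are hypothetical names, the 64-byte alignment is an assumed value, and the raw pointer passed in stands in for whatever scsi_cmd_priv() would return.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed alignment requirement; the real value is defined by the driver. */
#define CMD_ALIGN 64u

/* Hypothetical stand-in for the AFU command structure. */
struct afu_cmd_sketch {
    uint8_t request[64];
    uint8_t response[64];
};

/* Bytes to request per command so an aligned command fits at any start offset. */
static size_t priv_size(void)
{
    return sizeof(struct afu_cmd_sketch) + CMD_ALIGN - 1;
}

/* Round the private area up to CMD_ALIGN and hand back a zeroed command. */
static struct afu_cmd_sketch *priv_to_cmd(void *priv)
{
    assert(priv != NULL);

    uintptr_t addr = ((uintptr_t)priv + CMD_ALIGN - 1) & ~(uintptr_t)(CMD_ALIGN - 1);
    struct afu_cmd_sketch *cmd = (struct afu_cmd_sketch *)addr;

    memset(cmd, 0, sizeof(*cmd));
    return cmd;
}
```

Requesting `CMD_ALIGN - 1` extra bytes is the usual worst-case padding: wherever the private area starts, rounding up can skip at most that many bytes, so the aligned command never runs past the end of the area.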
    • scsi: cxlflash: Allocate memory instead of using command pool for AFU sync · 350bb478
      Matthew R. Ochs committed
      As staging for the removal of the AFU command pool, remove the reliance
      upon the pool for the internal AFU sync command. Instead of obtaining an
      AFU command from the pool, dynamically allocate memory with the appropriate
      alignment requirements. Since the AFU sync service is only executed from
      the process environment, blocking is acceptable.
      Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Acked-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: cxlflash: Remove unused buffer from AFU command · e7ab2d40
      Matthew R. Ochs committed
      The cxlflash driver originally required a per-command 4K buffer that
      hosted data passed to the AFU. When the routines that initiate AFU
      and internal SCSI commands were refactored to use scsi_execute(), the
      need for this buffer became obsolete. As it is no longer necessary,
      the buffer is removed.
      Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Acked-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: cxlflash: Avoid command room violation · 11f7b184
      Uma Krishnan committed
      During test, a command room violation interrupt is occasionally seen
      for the master context when the CXL flash devices are stressed.
      
      A study of the code suggests gaps in the way the command room value
      is cached in cxlflash. When the cached command room is zero, the
      thread attempting to send becomes burdened with updating the cached
      value with the actual value from the AFU. Today, this is handled with an
      atomic set operation of the raw value read. Following the atomic update,
      the thread proceeds to send.
      
      This behavior is incorrect on two counts:
      
         - The update fails to take into account the current thread and its
           consumption of one of the hardware commands.
      
         - The update does not take into account other threads also atomically
           updating. Per design, a worker thread updates the cached value when a
           send thread times out. By not protecting the update with a lock, the
           cached value can be incorrectly clobbered.
      
      To correct these issues, the update of the cached command room has been
      simplified and also protected using a spin lock which is held until the
      MMIO is complete. This ensures the command room is properly consumed by
      the same thread. The update of the cached value also accounts for the
      current thread consuming a hardware command.
      Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
      Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
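The locking scheme described in the commit above can be sketched as follows. This is a user-space model under stated assumptions, not the driver's implementation: `struct afu_sketch` and `send_cmd_sketch()` are hypothetical, a pthread mutex stands in for the spin lock, and the plain `hw_room` field stands in for the command room value that the driver reads from the AFU over MMIO.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the state involved in sending a command. */
struct afu_sketch {
    pthread_mutex_t lock;   /* stands in for the driver's spin lock */
    uint64_t cached_room;   /* cached count of free hardware command slots */
    uint64_t hw_room;       /* stands in for the value read from the AFU */
    uint64_t sent;          /* commands handed to the "hardware" so far */
};

/* Try to send one command; returns false when no command room is left. */
static bool send_cmd_sketch(struct afu_sketch *afu)
{
    bool ok = false;

    assert(afu != NULL);
    pthread_mutex_lock(&afu->lock);

    /* Refresh the cache only under the lock so that no other thread can
     * clobber it between the read and the update. */
    if (afu->cached_room == 0)
        afu->cached_room = afu->hw_room;

    if (afu->cached_room > 0) {
        afu->cached_room--;  /* account for the slot this thread consumes */
        afu->hw_room--;      /* models the send itself, done before unlocking */
        afu->sent++;
        ok = true;
    }

    pthread_mutex_unlock(&afu->lock);
    return ok;
}
```

Keeping the refresh, the decrement and the send inside one critical section is what addresses both problems listed in the commit message: the refreshed value already reflects the current thread's command, and a concurrent updater cannot overwrite it midway.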
    • scsi: cxlflash: Improve context_reset() logic · 3d2f617d
      Uma Krishnan committed
      Currently, the context reset routine waits for command room to
      be available before sending the reset request. Per review of the
      SISLite specification and clarifications from the CXL Flash AFU
      designers, this wait is unnecessary. The reset request can be
      sent anytime regardless of command room, so long as only a single
      reset request is active at any one point in time.
      
      This commit simplifies the reset routine by removing the wait for
      command room. Additionally it adds a debug trace to help pinpoint
      hardware errors when a context reset does not complete.
      Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
      Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: cxlflash: Fix crash in cxlflash_restore_luntable() · 8a260543
      Uma Krishnan committed
      During test, the following crash was observed:
      
      [34538.981505] Faulting instruction address: 0xd000000007c9c870
      cpu 0x9: Vector: 300 (Data Access) at [c0000007f1e8f590]
          pc: d000000007c9c870: cxlflash_restore_luntable+0x70/0x1d0 [cxlflash]
          lr: d000000007c9c84c: cxlflash_restore_luntable+0x4c/0x1d0 [cxlflash]
          sp: c0000007f1e8f810
         msr: 9000000100009033
         dar: c00000171d637438
       dsisr: 40000000
        current = 0xc0000007f1e43f90
        paca    = 0xc000000007b25100   softe: 0        irq_happened: 0x01
          pid   = 493, comm = eehd
      enter ? for help
      [c0000007f1e8f8a0] d000000007c940b0 init_afu+0xd60/0x1200 [cxlflash]
      [c0000007f1e8f9a0] d000000007c945a8 cxlflash_pci_slot_reset+0x58/0xe0 [cxlflash]
      [c0000007f1e8fa20] d00000000715f790 cxl_pci_slot_reset+0x230/0x340 [cxl]
      [c0000007f1e8fae0] c000000000040dd4 eeh_report_reset+0x144/0x180
      [c0000007f1e8fb20] c00000000003f708 eeh_pe_dev_traverse+0x98/0x170
      [c0000007f1e8fbb0] c000000000041618 eeh_handle_normal_event+0x328/0x410
      [c0000007f1e8fc30] c000000000041db8 eeh_handle_event+0x178/0x330
      [c0000007f1e8fce0] c000000000042118 eeh_event_handler+0x1a8/0x1b0
      [c0000007f1e8fd80] c00000000011420c kthread+0xec/0x100
      [c0000007f1e8fe30] c00000000000a47c ret_from_kernel_thread+0x5c/0xe0
      
      When superpipe mode is disabled for a LUN, the references for the
      local LUN are deleted but the LUN is still identified as being present
      in the LUN table. This mismatched state can result in the above crash
      when the LUN table is restored during an error recovery operation.
      
      To fix this issue, the local LUN information structure is updated to
      reflect the LUN is no longer in the LUN table once all references to
      the LUN are gone.
      Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
      Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: cxlflash: Set sg_tablesize to 1 instead of SG_NONE · 68ab2d76
      Uma Krishnan committed
      The following Oops is encountered when blk_mq is enabled with the
      cxlflash driver:
      
      [ 2960.817172] Oops: Kernel access of bad area, sig: 11 [#5]
      [ 2960.817309] NIP  __blk_mq_run_hw_queue+0x278/0x4c0
      [ 2960.817313] LR __blk_mq_run_hw_queue+0x2bc/0x4c0
      [ 2960.817314] Call Trace:
      [ 2960.817320] __blk_mq_run_hw_queue+0x2bc/0x4c0 (unreliable)
      [ 2960.817324] blk_mq_run_hw_queue+0xd8/0x100
      [ 2960.817329] blk_mq_insert_requests+0x14c/0x1f0
      [ 2960.817333] blk_mq_flush_plug_list+0x150/0x190
      [ 2960.817338] blk_flush_plug_list+0x11c/0x2b0
      [ 2960.817344] blk_finish_plug+0x58/0x80
      [ 2960.817348] __do_page_cache_readahead+0x1c0/0x2e0
      [ 2960.817352] force_page_cache_readahead+0x68/0xd0
      [ 2960.817356] generic_file_read_iter+0x43c/0x6a0
      [ 2960.817359] blkdev_read_iter+0x68/0xa0
      [ 2960.817361] __vfs_read+0x11c/0x180
      [ 2960.817364] vfs_read+0xa4/0x1c0
      [ 2960.817366] SyS_read+0x6c/0x110
      [ 2960.817369] system_call+0x38/0xb4
      
      The SCSI blk_mq stack assumes that sg_tablesize is always a non-zero
      value with scsi_mq_setup_tags() allocating tags using sg_tablesize.
      The cxlflash driver currently uses SG_NONE (0) for the sg_tablesize
      as the devices it supports are not capable of scatter gather. This
      mismatch of values results in the Oops above.
      
      To resolve this issue, sg_tablesize for cxlflash can simply be set
      to 1, a value which satisfies the constraints in cxlflash and the
      lack of support of SG_NONE in SCSI blk_mq.
      Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
      Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  2. 15 Sep, 2016 4 commits
  3. 09 Sep, 2016 2 commits
    • scsi: cxlflash: Remove the device cleanly in the system shutdown path · babf985d
      Uma Krishnan committed
      Commit 704c4b0d ("cxlflash: Shutdown notify support for CXL Flash
      cards") was recently introduced to notify the AFU when a system is going
      down. Due to the position of the cxlflash driver in the device stack,
      cxlflash devices are _always_ removed during a reboot/shutdown. This can
      lead to a crash if the cxlflash shutdown hook is invoked _after_ the
      shutdown hook for the owning virtual PHB. Furthermore, the current
      implementation of the shutdown/remove hooks for cxlflash is not tolerant
      of being invoked when the device is not enabled. This can also lead to a
      crash in situations where the remove hook is invoked after the device
      has been removed via the vPHB's shutdown hook. An example of this
      scenario would be an EEH reset failure while a reboot/shutdown is in
      progress.
      
      To solve both problems, the shutdown hook for cxlflash is updated to
      simply remove the device. This path already includes the AFU
      notification and thus this solution will continue to perform the
      original intent. At the same time, the remove hook is updated to protect
      against being called when the device is not enabled.
      
      Fixes: 704c4b0d ("cxlflash: Shutdown notify support for CXL Flash
      cards")
      Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
      Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: cxlflash: Scan host only after the port is ready for I/O · bbbfae96
      Uma Krishnan committed
      When a port link is established, the AFU sends a 'link up' interrupt.
      After the link is up, corresponding initialization steps are performed
      on the card. Following that, when the card is ready for I/O, the AFU
      sends a 'login succeeded' interrupt. Today, cxlflash invokes
      scsi_scan_host() upon receipt of both interrupts.
      
      SCSI commands sent to the port prior to the 'login succeeded' interrupt
      will fail with a 'port not available' error. This is not desirable.
      Moreover, when async_scan is active for the host, subsequent scan calls
      are terminated with an error. Due to this, the scsi_scan_host() call
      performed after the 'login succeeded' interrupt could potentially return
      an error and the devices may not be scanned properly.
      
      To avoid this problem, scsi_scan_host() should be called only after the
      'login succeeded' interrupt.
      Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
      Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  4. 24 Aug, 2016 2 commits
  5. 19 Aug, 2016 3 commits
  6. 27 Jul, 2016 1 commit
  7. 19 Jul, 2016 1 commit
  8. 13 Jul, 2016 3 commits
  9. 06 May, 2016 1 commit
    • cxlflash: Fix to resolve dead-lock during EEH recovery · 635f6b08
      Manoj N. Kumar committed
      When a cxlflash adapter goes into EEH recovery and multiple processes
      (each having established its own context) are active, the EEH recovery
      can hang if the processes attempt to recover in parallel. The symptom
      logged after a couple of minutes is:
      
      INFO: task eehd:48 blocked for more than 120 seconds.
      Not tainted 4.5.0-491-26f710d+ #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      eehd            0    48      2
      Call Trace:
      __switch_to+0x2f0/0x410
      __schedule+0x300/0x980
      schedule+0x48/0xc0
      rwsem_down_write_failed+0x294/0x410
      down_write+0x88/0xb0
      cxlflash_pci_error_detected+0x100/0x1c0 [cxlflash]
      cxl_vphb_error_detected+0x88/0x110 [cxl]
      cxl_pci_error_detected+0xb0/0x1d0 [cxl]
      eeh_report_error+0xbc/0x130
      eeh_pe_dev_traverse+0x94/0x160
      eeh_handle_normal_event+0x17c/0x450
      eeh_handle_event+0x184/0x370
      eeh_event_handler+0x1c8/0x1d0
      kthread+0x110/0x130
      ret_from_kernel_thread+0x5c/0xa4
      INFO: task blockio:33215 blocked for more than 120 seconds.
      
      Not tainted 4.5.0-491-26f710d+ #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      blockio         0 33215  33213
      Call Trace:
      0x1 (unreliable)
      __switch_to+0x2f0/0x410
      __schedule+0x300/0x980
      schedule+0x48/0xc0
      rwsem_down_read_failed+0x124/0x1d0
      down_read+0x68/0x80
      cxlflash_ioctl+0x70/0x6f0 [cxlflash]
      scsi_ioctl+0x3b0/0x4c0
      sg_ioctl+0x960/0x1010
      do_vfs_ioctl+0xd8/0x8c0
      SyS_ioctl+0xd4/0xf0
      system_call+0x38/0xb4
      INFO: task eehd:48 blocked for more than 120 seconds.
      
      The hang is caused by a three-way deadlock:
      
      Process A holds the recovery mutex and waits for eehd to complete.
      Process B holds the semaphore and waits for the recovery mutex.
      eehd waits for the semaphore.
      
      The fix is to have Process B above release the semaphore before
      attempting to acquire the recovery mutex. This will allow
      eehd to proceed to completion.
      Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
      Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  10. 29 Mar, 2016 2 commits
    • cxlflash: Move to exponential back-off when cmd_room is not available · ea765431
      Manoj N. Kumar committed
      While profiling the cxlflash_queuecommand() path under a heavy load, it
      was found that the number of retries to find cmd_room was fairly high.
      
      There are two problems with the current back-off:
      a) It starts with a udelay of 0
      b) It backs off linearly
      
      Several approaches were tried (a higher multiple 10*n, 100*n, as well as
      n^2, 2^n), and the exponential back-off (2^n) approach had the least
      overall cost, with cost defined as the overall time spent waiting.
      
      The fix is to change the linear back-off to an exponential back-off.
      This solution also takes care of the problem with the initial
      delay (starts with 1 usec).
      Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
      Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
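The 2^n schedule chosen in the commit above can be written down in a few lines. This is a minimal sketch rather than the driver's retry loop: `backoff_delay_us()`, `backoff_total_us()` and the `MAX_BACKOFF_SHIFT` cap are hypothetical, and the functions only compute the delays instead of calling udelay().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cap on the exponent so the delay stays bounded. */
#define MAX_BACKOFF_SHIFT 10u

/* Delay in microseconds before retry number `attempt` (0-based): 1, 2, 4, ... */
static uint32_t backoff_delay_us(unsigned int attempt)
{
    if (attempt > MAX_BACKOFF_SHIFT)
        attempt = MAX_BACKOFF_SHIFT;
    assert(attempt < 32);  /* the shift must stay within uint32_t */
    return UINT32_C(1) << attempt;
}

/* Total time spent waiting over `retries` attempts, i.e. the "cost". */
static uint64_t backoff_total_us(unsigned int retries)
{
    uint64_t total = 0;

    for (unsigned int i = 0; i < retries; i++)
        total += backoff_delay_us(i);
    return total;
}
```

The first delay is 1 usec rather than 0, which covers the initial-delay problem, and four retries cost 1 + 2 + 4 + 8 = 15 usec in total.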
    • cxlflash: Fix regression issue with re-ordering patch · 9526f360
      Manoj N. Kumar committed
      While running 'sg_reset -H' back to back the following exception was seen:
      
      [  735.115695] Faulting instruction address: 0xd0000000098c0864
      cpu 0x0: Vector: 300 (Data Access) at [c000000ffffafa80]
          pc: d0000000098c0864: cxlflash_async_err_irq+0x84/0x5c0 [cxlflash]
          lr: c00000000013aed0: handle_irq_event_percpu+0xa0/0x310
          sp: c000000ffffafd00
         msr: 9000000000009033
         dar: 2010000
       dsisr: 40000000
        current = 0xc000000001510880
        paca    = 0xc00000000fb80000   softe: 0        irq_happened: 0x01
          pid   = 0, comm = swapper/0
      
      Linux version 4.5.0-491-26f710d+
      
      enter ? for help
      [c000000ffffafe10] c00000000013aed0 handle_irq_event_percpu+0xa0/0x310
      [c000000ffffafed0] c00000000013b1a8 handle_irq_event+0x68/0xc0
      [c000000ffffaff00] c0000000001404ec handle_fasteoi_irq+0xec/0x2a0
      [c000000ffffaff30] c00000000013a084 generic_handle_irq+0x54/0x80
      [c000000ffffaff60] c000000000011130 __do_irq+0x80/0x1d0
      [c000000ffffaff90] c000000000024d40 call_do_irq+0x14/0x24
      [c000000001573a20] c000000000011318 do_IRQ+0x98/0x140
      [c000000001573a70] c000000000002594 hardware_interrupt_common+0x114/0x180
      
      This exception is being hit because the async_err interrupt path performs
      an MMIO to read the interrupt status register. The MMIO region in this
      case is not available.
      
      Commit 6ded8b3c ("cxlflash: Unmap problem state area before detaching
      master context") re-ordered the sequence in which term_mc() and stop_afu()
      are called. This introduces a window, which did not exist previously, for
      interrupts to come in with the problem space area unmapped.
      
      The fix is to separate the disabling of all AFU interrupts into a distinct
      function, term_intr(), so that it is the first thing that is done in the
      tear down process.
      
      To keep the initialization process symmetric, the AFU interrupt setup is
      also separated into a distinct function, init_intr().
      
      Fixes: 6ded8b3c ("cxlflash: Unmap problem state area before detaching master context")
      Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
      Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  11. 09 Mar, 2016 8 commits
  12. 07 Jan, 2016 2 commits
    • cxlflash: Enable device id for future IBM CXL adapter · a2746fb1
      Manoj Kumar committed
      This drop enables a future card with a device id of 0x0600 to be
      recognized by the cxlflash driver.
      
      As per the design, the Accelerator Function Unit (AFU) for this new IBM
      CXL Flash Adapter retains the same host interface as the previous
      generation. For the early prototypes of the new card, the driver with
      this change behaves exactly as the driver prior to this behaved with the
      earlier generation card. Therefore, no card specific programming has
      been added. These card specific changes can be staged in later if
      needed.
      Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
      Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • cxlflash: Resolve oops in wait_port_offline · b45cdbaf
      Manoj Kumar committed
      If an async error interrupt is generated, and the error requires the FC
      link to be reset, it cannot be performed in the interrupt context. So a
      work element is scheduled to complete the link reset in a process
      context. If either an EEH event or an escalation occurs in between when
      the interrupt is generated and the scheduled work is started, the MMIO
      space may no longer be available. This will cause an oops in the worker
      thread.
      
      [  606.806583] NIP kthread_data+0x28/0x40
      [  606.806633] LR wq_worker_sleeping+0x30/0x100
      [  606.806694] Call Trace:
      [  606.806721] 0x50 (unreliable)
      [  606.806796] wq_worker_sleeping+0x30/0x100
      [  606.806884] __schedule+0x69c/0x8a0
      [  606.806959] schedule+0x44/0xc0
      [  606.807034] do_exit+0x770/0xb90
      [  606.807109] die+0x300/0x460
      [  606.807185] bad_page_fault+0xd8/0x150
      [  606.807259] handle_page_fault+0x2c/0x30
      [  606.807338] wait_port_offline.constprop.12+0x60/0x130 [cxlflash]
      
      To prevent the problem space area from being unmapped while work is
      pending, a mapcount (using the kref mechanism) is held.  The
      mapcount is released only when the work is completed.  The last
      reference release is tied to the unmapping service.
      Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
      Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Reviewed-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
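The reference-counting pattern described in the commit above, where the last release performs the unmap, can be modelled in user space. This sketch does not use the kernel's kref API: `struct mmio_map_sketch`, `map_init()`, `map_get()` and `map_put()` are hypothetical names, C11 atomics stand in for the reference count, and a boolean stands in for the mapped MMIO region.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of a mapping kept alive by a reference count. */
struct mmio_map_sketch {
    atomic_int refs;  /* number of users that still need the mapping */
    bool mapped;      /* stands in for the mapped problem space area */
};

/* The owner holds the initial reference. */
static void map_init(struct mmio_map_sketch *m)
{
    atomic_init(&m->refs, 1);
    m->mapped = true;
}

/* Taken when work that will touch the mapping is scheduled. */
static void map_get(struct mmio_map_sketch *m)
{
    atomic_fetch_add(&m->refs, 1);
}

/* Dropped when the work completes or the owner tears down. */
static void map_put(struct mmio_map_sketch *m)
{
    int old = atomic_fetch_sub(&m->refs, 1);

    assert(old > 0);        /* a put without a matching get is a bug */
    if (old == 1)
        m->mapped = false;  /* last reference gone: safe to unmap */
}
```

If tear down drops its reference while a work item is still pending, the mapping survives until that work item calls `map_put()`, which is the ordering the commit relies on to avoid the oops.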