1. 12 Jan, 2017 1 commit
  2. 01 Dec, 2016 13 commits
  3. 15 Sep, 2016 3 commits
  4. 09 Sep, 2016 2 commits
    • scsi: cxlflash: Remove the device cleanly in the system shutdown path · babf985d
      Uma Krishnan committed
      Commit 704c4b0d ("cxlflash: Shutdown notify support for CXL Flash
      cards") was recently introduced to notify the AFU when a system is going
      down. Due to the position of the cxlflash driver in the device stack,
      cxlflash devices are _always_ removed during a reboot/shutdown. This can
      lead to a crash if the cxlflash shutdown hook is invoked _after_ the
      shutdown hook for the owning virtual PHB. Furthermore, the current
      implementation of the shutdown/remove hooks for cxlflash does not
      tolerate being invoked when the device is not enabled. This can also
      lead to a crash in situations where the remove hook is invoked after the
      device has been removed via the vPHB's shutdown hook. An example of this
      scenario would be an EEH reset failure while a reboot/shutdown is in
      progress.
      
      To solve both problems, the shutdown hook for cxlflash is updated to
      simply remove the device. This path already includes the AFU
      notification and thus this solution will continue to perform the
      original intent. At the same time, the remove hook is updated to protect
      against being called when the device is not enabled.
      
      Fixes: 704c4b0d ("cxlflash: Shutdown notify support for CXL Flash
      cards")
      Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
      Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
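
The approach described in this commit can be sketched in plain user-space C. This is only an illustration under stated assumptions: `struct fake_adapter`, `fake_remove()` and `fake_shutdown()` are hypothetical stand-ins, not the driver's actual structures or hooks. It shows the shutdown hook delegating to the remove path, and the remove path returning early when the device is no longer enabled.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the adapter state; not the driver's real struct. */
struct fake_adapter {
    bool enabled;       /* true while the device is enabled */
    int  afu_notified;  /* number of shutdown notifications sent to the AFU */
    int  teardowns;     /* number of full teardown passes performed */
};

/* Remove hook: tolerate being called when the device is already gone. */
void fake_remove(struct fake_adapter *a)
{
    if (!a->enabled)
        return;         /* nothing left to tear down, do not touch hardware */

    a->afu_notified++;  /* the remove path already includes the AFU notification */
    a->teardowns++;
    a->enabled = false;
}

/* Shutdown hook: simply take the remove path. */
void fake_shutdown(struct fake_adapter *a)
{
    fake_remove(a);
}
```

Calling `fake_shutdown()` followed by `fake_remove()` on the same adapter performs the teardown once; the second call is a no-op.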
    • scsi: cxlflash: Scan host only after the port is ready for I/O · bbbfae96
      Uma Krishnan committed
      When a port link is established, the AFU sends a 'link up' interrupt.
      After the link is up, corresponding initialization steps are performed
      on the card. Following that, when the card is ready for I/O, the AFU
      sends a 'login succeeded' interrupt. Today, cxlflash invokes
      scsi_scan_host() upon receipt of both interrupts.
      
      SCSI commands sent to the port prior to the 'login succeeded' interrupt
      will fail with a 'port not available' error. This is not desirable.
      Moreover, when async_scan is active for the host, subsequent scan calls
      are terminated with an error. Due to this, the scsi_scan_host() call
      performed after the 'login succeeded' interrupt could potentially return
      an error and the devices may not be scanned properly.
      
      To avoid this problem, scsi_scan_host() should be called only after the
      'login succeeded' interrupt.
      Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
      Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
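
A minimal sketch of the event filtering this commit describes, written as user-space C. The event codes and `struct fake_host` are hypothetical; the real driver decodes interrupt bits from an AFU register, and the counter below merely stands in for the call to scsi_scan_host().

```c
#include <stdbool.h>

/* Hypothetical event codes used only for this illustration. */
enum fake_async_event {
    FAKE_EVT_LINK_UP,
    FAKE_EVT_LOGIN_SUCCEEDED,
};

struct fake_host {
    int scans_requested;  /* how many times a host scan was kicked off */
};

/* Only 'login succeeded' means the port is ready for I/O. */
bool fake_event_triggers_scan(enum fake_async_event evt)
{
    return evt == FAKE_EVT_LOGIN_SUCCEEDED;
}

void fake_handle_async_event(struct fake_host *h, enum fake_async_event evt)
{
    if (fake_event_triggers_scan(evt))
        h->scans_requested++;  /* stands in for scsi_scan_host() */
}
```

With this filter, a 'link up' followed by a 'login succeeded' results in a single scan request instead of two.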
  5. 27 Jul, 2016 1 commit
  6. 13 Jul, 2016 3 commits
  7. 29 Mar, 2016 2 commits
    • cxlflash: Move to exponential back-off when cmd_room is not available · ea765431
      Manoj N. Kumar committed
      While profiling the cxlflash_queuecommand() path under a heavy load, it
      was found that the number of retries to find cmd_room was fairly high.
      
      There are two problems with the current back-off:
      a) It starts with a udelay of 0
      b) It backs-off linearly
      
      Several approaches were tried (a higher multiple 10*n, 100*n, as well as
      n^2, 2^n) and the exponential back-off (2^n) approach had the least
      overall cost, where cost is defined as the overall time spent waiting.
      
      The fix is to change the linear back-off to an exponential back-off.
      This solution also takes care of the problem with the initial
      delay (starts with 1 usec).
      Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
      Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
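
The 2^n schedule described in this commit can be illustrated with a small user-space C sketch. The function names and the cap `FAKE_MAX_SHIFT` are assumptions made for this example; the commit message does not state an upper bound or show the driver's actual loop.

```c
#include <stdint.h>

/* Assumption: cap the exponent so the delay stays bounded. */
#define FAKE_MAX_SHIFT 10u

/* Delay in microseconds before retry number `retry` (0-based): 2^retry, capped. */
uint32_t fake_backoff_usec(uint32_t retry)
{
    if (retry > FAKE_MAX_SHIFT)
        retry = FAKE_MAX_SHIFT;
    return 1u << retry;  /* first delay is 1 usec, never 0 */
}

/* Total time spent waiting across the first `retries` attempts. */
uint32_t fake_total_wait_usec(uint32_t retries)
{
    uint32_t total = 0;

    for (uint32_t n = 0; n < retries; n++)
        total += fake_backoff_usec(n);
    return total;
}
```

The first four delays are 1, 2, 4 and 8 usec, so `fake_total_wait_usec(4)` returns 15.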
    • cxlflash: Fix regression issue with re-ordering patch · 9526f360
      Manoj N. Kumar committed
      While running 'sg_reset -H' back to back the following exception was seen:
      
      [  735.115695] Faulting instruction address: 0xd0000000098c0864
      cpu 0x0: Vector: 300 (Data Access) at [c000000ffffafa80]
          pc: d0000000098c0864: cxlflash_async_err_irq+0x84/0x5c0 [cxlflash]
          lr: c00000000013aed0: handle_irq_event_percpu+0xa0/0x310
          sp: c000000ffffafd00
         msr: 9000000000009033
         dar: 2010000
       dsisr: 40000000
        current = 0xc000000001510880
        paca    = 0xc00000000fb80000   softe: 0        irq_happened: 0x01
          pid   = 0, comm = swapper/0
      
      Linux version 4.5.0-491-26f710d+
      
      enter ? for help
      [c000000ffffafe10] c00000000013aed0 handle_irq_event_percpu+0xa0/0x310
      [c000000ffffafed0] c00000000013b1a8 handle_irq_event+0x68/0xc0
      [c000000ffffaff00] c0000000001404ec handle_fasteoi_irq+0xec/0x2a0
      [c000000ffffaff30] c00000000013a084 generic_handle_irq+0x54/0x80
      [c000000ffffaff60] c000000000011130 __do_irq+0x80/0x1d0
      [c000000ffffaff90] c000000000024d40 call_do_irq+0x14/0x24
      [c000000001573a20] c000000000011318 do_IRQ+0x98/0x140
      [c000000001573a70] c000000000002594 hardware_interrupt_common+0x114/0x180
      
      This exception is being hit because the async_err interrupt path performs
      an MMIO to read the interrupt status register. The MMIO region in this
      case is not available.
      
      Commit 6ded8b3c ("cxlflash: Unmap problem state area before detaching
      master context") re-ordered the sequence in which term_mc() and stop_afu()
      are called. This introduces a window, which did not exist previously, in
      which interrupts can arrive while the problem state area is unmapped.
      
      The fix is to separate the disabling of all AFU interrupts to a distinct
      function, term_intr() so that it is the first thing that is done in the
      tear down process.
      
      To keep the initialization process symmetric, separate the AFU interrupt
      setup also to a distinct function: init_intr().
      
      Fixes: 6ded8b3c ("cxlflash: Unmap problem state area before detaching master context")
      Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
      Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
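
The ordering issue in this commit can be modeled with a small user-space C simulation. Everything here is hypothetical (`struct fake_afu` and the `fake_*` helpers are not driver code); the names init_intr() and term_intr() from the commit are only mirrored to show why interrupts should be disabled before the MMIO region is unmapped.

```c
#include <stdbool.h>

struct fake_afu {
    bool mmio_mapped;   /* problem state area is mapped */
    bool irqs_enabled;  /* AFU interrupts are registered and unmasked */
    int  bad_accesses;  /* handler runs that found the MMIO region unmapped */
};

void fake_init_intr(struct fake_afu *afu) { afu->irqs_enabled = true; }
void fake_term_intr(struct fake_afu *afu) { afu->irqs_enabled = false; }
void fake_map(struct fake_afu *afu)       { afu->mmio_mapped = true; }
void fake_unmap(struct fake_afu *afu)     { afu->mmio_mapped = false; }

/* Simulated async error interrupt: the handler reads a status register via MMIO. */
void fake_async_err_irq(struct fake_afu *afu)
{
    if (!afu->irqs_enabled)
        return;               /* interrupt disabled, handler never runs */
    if (!afu->mmio_mapped)
        afu->bad_accesses++;  /* would be a data access exception */
}

void fake_init(struct fake_afu *afu)
{
    afu->bad_accesses = 0;
    fake_map(afu);
    fake_init_intr(afu);      /* symmetric counterpart of fake_term_intr() */
}

void fake_teardown(struct fake_afu *afu)
{
    fake_term_intr(afu);      /* first step: no more interrupts */
    fake_unmap(afu);
}
```

If `fake_unmap()` is called while interrupts are still enabled, a late `fake_async_err_irq()` records a bad access; after `fake_teardown()` the same interrupt is harmless.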
  8. 09 Mar, 2016 5 commits
  9. 07 Jan, 2016 5 commits
  10. 30 Oct, 2015 5 commits
    • cxlflash: Fix to avoid lock instrumentation rejection · 0d73122c
      Matthew R. Ochs committed
      When running with lock instrumentation (e.g. lockdep), some of the
      instrumentation can become disabled at probe time for a cxlflash
      adapter. This is due to a missing lock registration for the tmf_slock.
      
      The fix is to call spin_lock_init() for the tmf_slock during probe.
      Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Acked-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
      Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
      Signed-off-by: James Bottomley <JBottomley@Odin.com>
    • cxlflash: Fix to escalate to LINK_RESET on login timeout · e6e6df3f
      Manoj Kumar committed
      A 'login timed out' asynchronous error interrupt is generated if no
      response to a FLOGI is seen within 2 seconds. If the timeout error
      is not escalated to a LINK_RESET, the port will not be available for
      use. This fix provides the required escalation.
      Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
      Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
      Reviewed-by: Tomas Henzl <thenzl@redhat.com>
      Signed-off-by: James Bottomley <JBottomley@Odin.com>
    • cxlflash: Fix to avoid leaving dangling interrupt resources · ee3491ba
      Matthew R. Ochs committed
      When running with an unsupported AFU, the cxlflash driver fails
      the probe. When the driver is removed, the following Oops is
      encountered on a show_interrupts() thread:
      
      Call Trace:
      [c000001fba5a7a10] [0000000000000003] 0x3 (unreliable)
      [c000001fba5a7a60] [c00000000053dcf4] vsnprintf+0x204/0x4c0
      [c000001fba5a7ae0] [c00000000030045c] seq_vprintf+0x5c/0xd0
      [c000001fba5a7b20] [c00000000030051c] seq_printf+0x4c/0x60
      [c000001fba5a7b50] [c00000000013e140] show_interrupts+0x370/0x4f0
      [c000001fba5a7c10] [c0000000002ff898] seq_read+0xe8/0x530
      [c000001fba5a7ca0] [c00000000035d5c0] proc_reg_read+0xb0/0x110
      [c000001fba5a7cf0] [c0000000002ca74c] __vfs_read+0x6c/0x180
      [c000001fba5a7d90] [c0000000002cb464] vfs_read+0xa4/0x1c0
      [c000001fba5a7de0] [c0000000002cc51c] SyS_read+0x6c/0x110
      [c000001fba5a7e30] [c000000000009204] system_call+0x38/0xb4
      
      The Oops is due to not cleaning up correctly on the unsupported
      AFU error path, leaving various allocated and registered resources.
      In this case, interrupts are in a semi-allocated/registered state,
      which the show_interrupts() thread attempts to use.
      
      To fix, the cleanup logic in init_afu() is consolidated to error
      gates at the bottom of the function and the appropriate goto is
      added to each error path. As a mini side fix while refactoring
      in this routine, the else statement following the AFU version
      evaluation is eliminated as it is not needed.
      Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Acked-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
      Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
      Reviewed-by: Tomas Henzl <thenzl@redhat.com>
      Signed-off-by: James Bottomley <JBottomley@Odin.com>
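
The "error gates at the bottom of the function" pattern mentioned in this commit can be sketched as follows. This is a generic user-space illustration, not the driver's init_afu(): `struct fake_res`, the `fail_step` parameter and the return codes are all hypothetical and exist only to exercise each error path.

```c
#include <stdbool.h>

/* Hypothetical resources acquired in order during initialization. */
struct fake_res {
    bool ctx_started;
    bool irqs_registered;
    bool mmio_mapped;
};

/*
 * fail_step selects which step fails (0 = none). Step 4 models the
 * "unsupported AFU version" check that runs after everything is set up.
 */
int fake_init_afu(struct fake_res *r, int fail_step)
{
    int rc;

    if (fail_step == 1) { rc = -1; goto err_none; }
    r->ctx_started = true;

    if (fail_step == 2) { rc = -2; goto err_stop_ctx; }
    r->irqs_registered = true;

    if (fail_step == 3) { rc = -3; goto err_free_irqs; }
    r->mmio_mapped = true;

    if (fail_step == 4) { rc = -4; goto err_unmap; }

    return 0;

    /* Error gates: each label undoes one step and falls through to the next. */
err_unmap:
    r->mmio_mapped = false;
err_free_irqs:
    r->irqs_registered = false;
err_stop_ctx:
    r->ctx_started = false;
err_none:
    return rc;
}
```

Because every failure jumps to the gate matching what has been acquired so far, a failure at the version check releases the mapping, the interrupts and the context, leaving nothing half-registered.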
    • cxlflash: Correct trace string · fa3f2c6e
      Matthew R. Ochs committed
      The trace following the failure of alloc_mem() incorrectly identifies
      which function failed. This can lead to misdiagnosing a failure.
      
      Fix the string to correctly indicate that alloc_mem() failed.
      Reported-by: Brian King <brking@linux.vnet.ibm.com>
      Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
      Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
      Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
      Reviewed-by: Tomas Henzl <thenzl@redhat.com>
      Signed-off-by: James Bottomley <JBottomley@Odin.com>
    • cxlflash: Fix to avoid corrupting adapter fops · 17ead26f
      Matthew R. Ochs committed
      The fops owned by the adapter can be corrupted in certain scenarios,
      opening a window where certain fops are temporarily NULLed before being
      reset to their proper value. This can potentially lead software to make
      incorrect decisions, leaving the user with the inability to function as
      intended.
      
      An example of this behavior can be observed when there are a number of
      users with a high rate of turn around (attach to LUN, perform an I/O,
      detach from LUN, repeat). Every so often a user is given a valid
      context and adapter file descriptor, but the file associated with the
      descriptor lacks the correct read permission bit (FMODE_CAN_READ) and
      thus the read system call bails before calling the valid read fop.
      
      Background:
      
      The fops is stored in the adapter structure to provide the ability to
      lookup the adapter structure from within the fop handler. CXL services
      use the file's private_data and at present, the CXL context does not
      have a private section. In an effort to limit areas of the cxlflash
      driver with code specific to the superpipe function, a design choice was
      made to keep the details of the fops situated away from the legacy
      portions of the driver. This drove the behavior that the adapter fops
      is set at the beginning of the disk attach ioctl handler when there
      are no users present.
      
      The corruption that this fix remedies is due to the fact that the fops
      is initially defaulted to values found within a static structure. When
      the fops is handed down to the CXL services later in the attach path,
      certain services are patched. The fops structure remains correct until
      the user count drops to 0 and the fops is reset, triggering the process
      to repeat again. The user counts are tightly coupled with the creation
      and deletion of the user context. If multiple users perform a disk
      attach at the same time, when the user count is currently 0, some users
      can be in the middle of obtaining a file descriptor and have not yet
      reached the context creation code that [in addition to creating the
      context] increments the user count. Subsequent users coming in to
      perform the attach see that the user count is still 0, and reinitialize
      the fops, temporarily removing the patched fops. The users that are in
      the middle of obtaining their file descriptor may then receive an
      invalid descriptor.
      
      The fix simply removes the user count altogether and moves the fops
      initialization to probe time such that it is only performed one time
      for the life of the adapter. In the future, if the CXL services adopt
      a private member for their context, that could be used to store the
      adapter structure reference and cxlflash could revert to a model that
      does not require an embedded fops.
      Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
      Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
      Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
      Reviewed-by: Daniel Axtens <dja@axtens.net>
      Reviewed-by: Tomas Henzl <thenzl@redhat.com>
      Signed-off-by: James Bottomley <JBottomley@Odin.com>
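
The race described in this commit can be replayed deterministically in a simplified user-space model. This is a rough sketch under heavy assumptions: the attach path is reduced to a few hypothetical steps, "patched" is a single flag, and none of the `fake_*` names exist in the driver. It contrasts re-initializing the fops whenever the user count is 0 with initializing them once at probe time.

```c
#include <stdbool.h>

struct fake_adapter {
    bool fops_patched;  /* true once the CXL services have patched the fops */
    int  users;         /* user count, only consulted by the old scheme */
};

/* Reset the fops to the static defaults, which are unpatched. */
static void fake_fops_init(struct fake_adapter *a)
{
    a->fops_patched = false;
}

/* Probe: the fixed scheme initializes the fops here, exactly once. */
void fake_probe(struct fake_adapter *a)
{
    a->users = 0;
    fake_fops_init(a);
}

/* Old scheme, start of attach: re-initialize when no users are counted. */
void fake_old_attach_begin(struct fake_adapter *a)
{
    if (a->users == 0)
        fake_fops_init(a);
}

/* Later in attach (both schemes): the services patch the fops. */
void fake_patch_fops(struct fake_adapter *a)
{
    a->fops_patched = true;
}

/* A descriptor handed out now is only usable if the fops are still patched. */
bool fake_fd_is_valid(const struct fake_adapter *a)
{
    return a->fops_patched;
}
```

In the old scheme, user A can call `fake_old_attach_begin()` and `fake_patch_fops()`, after which user B's `fake_old_attach_begin()` still sees a user count of 0 and resets the fops, so A's descriptor is invalid. With initialization confined to `fake_probe()`, no later attach can undo the patching.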