1. 22 Sep, 2011 1 commit
  2. 27 Aug, 2011 2 commits
    • K
      [SCSI] mptfusion: Fix for device offline while doing aggressive HBA reset · 98cbe371
      kashyap.desai@lsi.com committed
      [Resending the patch per Bernd Schubert's comment]
      
      Issue:
      
      A device goes offline when aggressive HBA resets are issued
      from a utility while I/O is running.
      
      Root cause:
      
      The firmware goes into a bad state due to the aggressive resets, and a
      soft reset does not recover it. Aggressive resets also open a window
      in which the error handling thread is kicked off while the HBA is in a
      constant reset loop (as in the aggressive reset test case), which can
      cause the device to go offline.
      
      Changes:
      
      1. Added an extra check inside the eh_timed_out callback, as below.
         if(ioc->ioc_reset_in_progress) Rc = EH_TIMER_RESET
      
      2. Removed the "DOORBELL_ACTIVE" check for SAS controllers from the
         task management context. SAS controllers use a high-priority queue
         for task management, so this check is not required for them.
      
      3. Moved SoftReset call to HardReset from Task Mgmt context.
      Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
      98cbe371
    • K
      [SCSI] mptfusion: Better handling of DEAD IOC PCI-E Link down error condition · e62cca19
      kashyap.desai@lsi.com committed
      Find a non-operational IOC and remove it from the OS. A dead
      (non-functional) IOC is detected by reading the doorbell register
      value from the fault reset thread, which is called from work thread
      context at each polling interval. If the doorbell value is 0xFFFFFFFF,
      the IOC is considered non-operational and is marked as a dead IOC.
      
      Once a dead IOC has been detected, it is removed at the PCI layer
      using the "pci_remove_bus_device" API.
      Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
      e62cca19
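The detection rule described in this commit can be sketched in user-space C as follows. This is a minimal sketch, not the driver's code: `struct mock_ioc`, `mock_poll_dead_ioc`, and the plain pointer used in place of a memory-mapped register are hypothetical stand-ins. Only the 0xFFFFFFFF doorbell value comes from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Doorbell value that the commit message treats as "IOC is dead". */
#define MOCK_DEAD_DOORBELL 0xFFFFFFFFu

/* Hypothetical stand-in for the adapter state; not the real driver struct. */
struct mock_ioc {
    const volatile uint32_t *doorbell; /* stands in for the mapped register */
    bool dead;                         /* set once, never cleared */
};

/* One polling pass, as a periodic fault-reset worker might run it.
 * Returns true when the IOC is (or already was) marked dead, at which
 * point the caller would hand the device to the PCI layer for removal. */
static bool mock_poll_dead_ioc(struct mock_ioc *ioc)
{
    if (ioc->dead)
        return true;
    if (*ioc->doorbell == MOCK_DEAD_DOORBELL)
        ioc->dead = true;
    return ioc->dead;
}
```

The `dead` flag is kept sticky in this sketch on the assumption that a removed device should not be treated as alive again if a later read happens to return a different value.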
  3. 27 Jul, 2011 1 commit
  4. 02 May, 2011 1 commit
  5. 01 May, 2011 1 commit
  6. 13 Feb, 2011 1 commit
  7. 28 Jul, 2010 5 commits
    • K
      [SCSI] mptfusion: Block Error handling for deleting devices or Device in DMD · c9de7dc4
      Kashyap, Desai committed
      Issue description:
      In a multipath topology, while a device deletion is in a transient state,
      the multipath driver can call blk_flush_queue() as part of path failure.
      Before the device gets deleted from the OS, it may go OFFLINE as part of
      error handling triggered by the multipathing driver. This condition hits
      more frequently if the device missing delay timer (an LSI-specific firmware
      parameter) is set to a non-zero value.
      
      The root cause of this issue is that the error handling thread is kicked
      off for a device that is not really present (it is in the transient state
      of being deleted).
      
      This patch fixes the issue. The driver now uses the eh_timed_out
      callback. See below.
      
      mptsas_transport_template->eh_timed_out = mptsas_eh_timed_out
      
      Using the mptsas_eh_timed_out function, the driver can decide whether the
      vdevice is under device missing delay or in the deleting state.
      
      In either of those cases, BLK_EH_RESET_TIMER is returned to the SCSI
      midlayer and the error handling thread is not kicked off for that
      particular SCSI command.
      Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
      Cc: Stable Tree <stable@kernel.org>
      Signed-off-by: James Bottomley <James.Bottomley@suse.de>
      c9de7dc4
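The decision made in the timeout callback can be sketched in user-space C as follows. This is a minimal sketch under stated assumptions: the enum, `struct mock_vdevice`, its two flags, and `mock_eh_timed_out` are hypothetical stand-ins rather than the kernel's or the driver's definitions. The commit message only says that the timer is reset when the device is under device missing delay or being deleted.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the two outcomes discussed in the commit message. */
enum mock_eh_timer_return {
    MOCK_EH_NOT_HANDLED, /* let the midlayer start error handling */
    MOCK_EH_RESET_TIMER  /* re-arm the command timer, skip error handling */
};

/* Hypothetical per-device state; field names are illustrative only. */
struct mock_vdevice {
    bool in_missing_delay; /* firmware device-missing-delay timer running */
    bool deleting;         /* device removal is in progress */
};

/* Decide what to report when a command times out on this device. */
static enum mock_eh_timer_return
mock_eh_timed_out(const struct mock_vdevice *vdev)
{
    if (vdev == NULL)
        return MOCK_EH_NOT_HANDLED;
    if (vdev->in_missing_delay || vdev->deleting)
        return MOCK_EH_RESET_TIMER;
    return MOCK_EH_NOT_HANDLED;
}
```

Resetting the timer, rather than failing the command, keeps the error handling thread away from a device that is about to disappear anyway.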
    • K
      [SCSI] mptfusion: print Doorbell register in a case of hard reset and timeout · 97009a29
      Kei Tokunaga committed
      Printing the Doorbell register in the case of a hard reset and timeout
      should be useful for figuring out the state of the system.
      Signed-off-by: Kei Tokunaga <tokunaga.keiich@jp.fujitsu.com>
      Acked-by: "Desai, Kashyap" <Kashyap.Desai@lsi.com>
      Signed-off-by: James Bottomley <James.Bottomley@suse.de>
      97009a29
    • K
      [SCSI] mptfusion: schedule_target_reset from all Reset context · b68bf096
      Kashyap, Desai committed
      Issue:
      A target reset is queued to the driver's internal queue to be scheduled
      later. When the driver adds a target to the internal target_reset queue,
      it blocks I/O on that target using the SCSI midlayer API. In some cases
      the driver never executes the target_reset list, so the target stays in
      the blocked state.
      
      Changes:
      The target_reset queue is now cleared from all other callback contexts
      instead of only the DeviceReset context. Wherever the driver clears the
      taskmgmt_in_progress flag, it now also handles target_reset queue
      cleanup.
      Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
      Signed-off-by: James Bottomley <James.Bottomley@suse.de>
      b68bf096
    • K
      [SCSI] mptfusion: Use DID_TRANSPORT_DISRUPTED instead of DID_BUS_BUSY · 4d069566
      Kashyap, Desai committed
      Changed the return value for nexus loss I/Os to DID_TRANSPORT_DISRUPTED.
      This allows the multipath driver to delay the failover process: the
      path should stay up as long as the nexus loss loginfo is returned from
      the firmware. With DID_BUS_BUSY the path fails over immediately.
      Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
      Signed-off-by: James Bottomley <James.Bottomley@suse.de>
      4d069566
    • R
      [SCSI] mptsas: fix hangs caused by ATA pass-through · 2a1b7e57
      Ryan Kuester committed
      I may have an explanation for the LSI 1068 HBA hangs provoked by ATA
      pass-through commands, in particular by smartctl.
      
      First, my version of the symptoms.  On an LSI SAS1068E B3 HBA running
      01.29.00.00 firmware, with SATA disks, and with smartd running, I'm seeing
      occasional task, bus, and host resets, some of which lead to hard faults of
      the HBA requiring a reboot.  Abusively looping the smartctl command,
      
          # while true; do smartctl -a /dev/sdb > /dev/null; done
      
      dramatically increases the frequency of these failures to nearly one per
      minute.  A high IO load through the HBA while looping smartctl seems to
      increase the chance of a full scsi host reset or a non-recoverable hang.
      
      I reduced what smartctl was doing down to a simple test case which
      causes the hang with a single IO when pointed at the sd interface.  See
      the code at the bottom of this e-mail.  It uses an SG_IO ioctl to issue
      a single pass-through ATA identify device command.  If the buffer
      userspace gives for the read data has certain alignments, the task is
      issued to the HBA but the HBA fails to respond.  If run against the sg
      interface, neither the test code nor smartctl causes a hang.
      
      sd and sg handle the SG_IO ioctl slightly differently.  Unless you
      specifically set a flag to do direct IO, sg passes a buffer of its own,
      which is page-aligned, to the block layer and later copies the result
      into the userspace buffer regardless of its alignment.  sd, on the other
      hand, always does direct IO unless the userspace buffer fails an
      alignment test at block/blk-map.c line 57, in which case a page-aligned
      buffer is created and used for the transfer.
      
      The alignment test currently checks for word-alignment, the default
      setup by scsi_lib.c; therefore, userspace buffers of almost any
      alignment are given directly to the HBA as DMA targets.  The LSI 1068
      hardware doesn't seem to like at least a couple of the alignments which
      cross a page boundary (see the test code below).  Curiously, many
      page-boundary-crossing alignments do work just fine.
      
      So, either the hardware has a bug handling certain alignments or the
      hardware has a stricter alignment requirement than the driver is
      advertising.  If stricter alignment is required, then in no case should
      misaligned buffers from userspace be allowed through without being
      bounced or at least causing an error to be returned.
      
      It seems the mptsas driver could use blk_queue_dma_alignment() to advertise
      a stricter alignment requirement.  If it does, sd does the right thing and
      bounces misaligned buffers (see block/blk-map.c line 57).  The following
      patch to 2.6.34-rc5 makes my symptoms go away.  I'm sure this is the wrong
      place for this code, but it gets my idea across.
      Acked-by: "Desai, Kashyap" <Kashyap.Desai@lsi.com>
      Signed-off-by: James Bottomley <James.Bottomley@suse.de>
      2a1b7e57
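The alignment test discussed in this commit message amounts to a mask check on the user buffer address. The following user-space C sketch illustrates the idea only. `mock_needs_bounce` is a hypothetical name, the mask values in the note below are illustrative, and this is neither the code in block/blk-map.c nor the patch itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when a user buffer at `uaddr` violates the alignment mask
 * and would therefore need to be bounced through an aligned kernel buffer
 * instead of being handed to the HBA as a direct DMA target.
 * `dma_alignment_mask` is (alignment - 1) for a power-of-two alignment. */
static bool mock_needs_bounce(uintptr_t uaddr, unsigned int dma_alignment_mask)
{
    return (uaddr & dma_alignment_mask) != 0;
}
```

With a 4-byte mask (0x3), a buffer at 0x1004 passes the test and is used directly. With a stricter, purely illustrative 512-byte mask (0x1ff), the same address fails and would be bounced, which is the effect the commit message expects from advertising a stricter requirement through blk_queue_dma_alignment().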
  8. 24 May, 2010 1 commit
  9. 11 Apr, 2010 4 commits
  10. 30 Mar, 2010 1 commit
    • T
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Tejun Heo committed
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the following.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and tries to put the new include such that its order conforms
        to its surrounding.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         widely available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build tests were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable on most builds of
      the specific arch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      5a0e3ad6
  11. 09 Feb, 2010 1 commit
  12. 19 Jan, 2010 1 commit
    • K
      [SCSI] mptfusion: Added MPI_SCSIIO_CONTROL_HEADOFQ priority · 65f89c23
      Kashyap, Desai committed
      There is an 'ioprio' field in the BIO and the request structure.
      Check this priority field and set MPI_SCSIIO_CONTROL_HEADOFQ
      to pass down the I/O priority.
      An enhancement to the LSI Disk Array Controller firmware is being
      developed to look at the Head Of Queue (HOQ) bit so that I/Os with the
      HOQ bit set are processed before I/Os that do not have it set.
      To set the HOQ bit, the mpt fusion driver needs to look at the
      'ioprio' field in the request structure associated with the SCSI command.
      Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
      Signed-off-by: James Bottomley <James.Bottomley@suse.de>
      65f89c23
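How a request priority might be turned into a Head Of Queue flag can be sketched in user-space C as below. The commit message does not say which 'ioprio' values should map to HOQ, so the policy here (real-time class only) is an assumption. All `MOCK_` constants, their values, and `mock_build_scsi_control` are hypothetical stand-ins, not the MPI header definitions or the driver's logic.

```c
#include <stdint.h>

/* Hypothetical encoding: priority class in the top bits of ioprio. */
#define MOCK_IOPRIO_CLASS_SHIFT 13
#define MOCK_IOPRIO_CLASS_RT    1u      /* "real-time" class, assumed */

/* Placeholder control-word bits; values are illustrative only. */
#define MOCK_CONTROL_SIMPLEQ    0x000u
#define MOCK_CONTROL_HEADOFQ    0x100u

/* Build the queueing part of a SCSI IO control word from a base value
 * (direction bits and so on) and the request's ioprio. */
static uint32_t mock_build_scsi_control(uint32_t base, uint16_t ioprio)
{
    uint32_t control = base | MOCK_CONTROL_SIMPLEQ;

    /* Assumed policy: only real-time class requests jump the queue. */
    if (((uint32_t)ioprio >> MOCK_IOPRIO_CLASS_SHIFT) == MOCK_IOPRIO_CLASS_RT)
        control |= MOCK_CONTROL_HEADOFQ;

    return control;
}
```

Requests without a priority (ioprio of 0) keep the default simple-queue behavior in this sketch, so existing workloads would be unaffected.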
  13. 05 Dec, 2009 1 commit
    • M
      [SCSI] modify change_queue_depth to take in reason why it is being called · e881a172
      Mike Christie committed
      This patch modifies scsi_host_template->change_queue_depth so that
      it takes an argument indicating why it is being called. This will be
      used so that if an LLD needs to do some extra processing when
      handling queue fulls or later ramp ups, it can do so.
      
      This is a simple port of the drivers setting a change_queue_depth
      callback. In the patch I just have these LLDs adjust the queue depth
      if the user was requesting it.
      Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
      
      [Vasu.Dev: v2
      	Also converted pmcraid_change_queue_depth and then verified
      all modules compile  using "make allmodconfig" for any new build
      warnings on X86_64.
      
      	Updated original description after combining two original
      patches from Mike to make this patch git bisectable.]
      Signed-off-by: Vasu Dev <vasu.dev@intel.com>
      [jejb: fixed up 53c700]
      Signed-off-by: James Bottomley <James.Bottomley@suse.de>
      e881a172
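The shape of a callback that takes a reason argument, and only acts on user requests, can be sketched in user-space C as follows. This is a minimal sketch: the enum, `struct mock_sdev`, and `mock_change_queue_depth` are hypothetical stand-ins, and returning -EOPNOTSUPP for the other reasons is an assumption about how a simple port might decline them.

```c
#include <errno.h>

/* Hypothetical reasons a queue depth change might be requested. */
enum mock_qdepth_reason {
    MOCK_QDEPTH_DEFAULT, /* user-requested change */
    MOCK_QDEPTH_QFULL,   /* device reported queue full */
    MOCK_QDEPTH_RAMP_UP  /* midlayer wants to ramp the depth back up */
};

/* Hypothetical device state; not the real struct scsi_device. */
struct mock_sdev {
    int queue_depth;
    int max_depth;
};

/* Simple port: honor user requests only, clamp to [1, max_depth].
 * Returns the new depth, or a negative errno for unsupported reasons. */
static int mock_change_queue_depth(struct mock_sdev *sdev, int qdepth,
                                   enum mock_qdepth_reason reason)
{
    if (reason != MOCK_QDEPTH_DEFAULT)
        return -EOPNOTSUPP;

    if (qdepth < 1)
        qdepth = 1;
    if (qdepth > sdev->max_depth)
        qdepth = sdev->max_depth;

    sdev->queue_depth = qdepth;
    return sdev->queue_depth;
}
```

A driver that later wants to react to queue-full or ramp-up events would add cases for those reasons instead of rejecting them.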
  14. 30 Oct, 2009 1 commit
    • K
      [SCSI] mptspi: Fix for incorrect data underrun errata · 9b53b392
      Kashyap, Desai committed
      Errata:
      Certain conditions on the SCSI bus may cause the 53C1030 to incorrectly
      signal a SCSI_DATA_UNDERRUN to the host.
      
      Workaround 1:
      For an erratum on the LSI53C1030: when the length of the requested data
      and the transferred data differ in the result of a READ or VERIFY
      command, DID_SOFT_ERROR is set.
      
      Workaround 2:
      For potential trouble on the LSI53C1030: the driver checks whether the
      length of the requested data equals the transferred length plus the
      residual. If it does not, the data is incorrect and MEDIUM_ERROR is set.
      Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
      Signed-off-by: James Bottomley <James.Bottomley@suse.de>
      9b53b392
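One reading of the two workarounds can be sketched in user-space C as below. This is an interpretation, not the driver's code: the enum, `mock_check_underrun`, and the order in which the two checks are applied are assumptions. The commit message only names the conditions and the resulting DID_SOFT_ERROR and MEDIUM_ERROR outcomes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical verdicts standing in for the results named in the commit. */
enum mock_underrun_verdict {
    MOCK_XFER_OK,      /* lengths are consistent, accept the result */
    MOCK_SOFT_ERROR,   /* workaround 1: retryable, like DID_SOFT_ERROR */
    MOCK_MEDIUM_ERROR  /* workaround 2: lengths do not add up */
};

/* Classify a reported data underrun from the three lengths involved. */
static enum mock_underrun_verdict
mock_check_underrun(bool is_read_or_verify, uint32_t requested,
                    uint32_t transferred, uint32_t residual)
{
    /* Workaround 2: requested length must equal transferred + residual.
     * 64-bit arithmetic avoids wrap-around in the sum. */
    if ((uint64_t)transferred + residual != requested)
        return MOCK_MEDIUM_ERROR;

    /* Workaround 1: a short READ or VERIFY is treated as suspect. */
    if (is_read_or_verify && transferred != requested)
        return MOCK_SOFT_ERROR;

    return MOCK_XFER_OK;
}
```

In this sketch a short transfer on any other command is accepted as a legitimate underrun as long as the residual accounts for the missing bytes.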
  15. 12 Sep, 2009 1 commit
  16. 23 Aug, 2009 3 commits
  17. 15 Jun, 2009 1 commit
  18. 10 Jun, 2009 9 commits
  19. 14 Jan, 2009 1 commit
  20. 17 Dec, 2008 1 commit
  21. 24 Oct, 2008 1 commit
  22. 27 Jul, 2008 1 commit