1. 28 2月, 2017 1 次提交
  2. 07 2月, 2017 1 次提交
  3. 09 1月, 2017 1 次提交
  4. 15 7月, 2016 1 次提交
  5. 14 7月, 2016 2 次提交
  6. 09 6月, 2016 1 次提交
    • W
      scsi: fix race between simultaneous decrements of ->host_failed · 72d8c36e
      Wei Fang 提交于
      sas_ata_strategy_handler() adds the works of the ata error handler to
      system_unbound_wq. This workqueue asynchronously runs work items, so the
      ata error handler will be performed concurrently on different CPUs. In
      this case, ->host_failed will be decreased simultaneously in
      scsi_eh_finish_cmd() on different CPUs, and become abnormal.
      
      It will lead to permanently inequality between ->host_failed and
      ->host_busy, and scsi error handler thread won't start running. IO
      errors after that won't be handled.
      
      Since all scmds must have been handled in the strategy handler, just
      remove the decrement in scsi_eh_finish_cmd() and zero ->host_busy after
      the strategy handler to fix this race.
      
      Fixes: 50824d6c ("[SCSI] libsas: async ata-eh")
      Cc: stable@vger.kernel.org
      Signed-off-by: NWei Fang <fangwei1@huawei.com>
      Reviewed-by: NJames Bottomley <jejb@linux.vnet.ibm.com>
      Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
      72d8c36e
  7. 10 5月, 2016 3 次提交
  8. 05 4月, 2016 5 次提交
  9. 07 12月, 2015 1 次提交
  10. 04 8月, 2015 2 次提交
    • T
      Revert "libata: Implement NCQ autosense" · 74a80d67
      Tejun Heo 提交于
      This reverts commit 42b966fb.
      
      As implemented, ACS-4 sense reporting for ATA devices bypasses error
      diagnosis and handling in libata degrading EH behavior significantly.
      Revert the related changes for now.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: stable@vger.kernel.org #v4.1+
      74a80d67
    • T
      Revert "libata: Implement support for sense data reporting" · 84ded2f8
      Tejun Heo 提交于
      This reverts commit fe7173c2.
      
      As implemented, ACS-4 sense reporting for ATA devices bypasses error
      diagnosis and handling in libata degrading EH behavior significantly.
      Revert the related changes for now.
      
      ATA_ID_COMMAND_SET_3/4 constants are not reverted as they're used by
      later changes.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: stable@vger.kernel.org #v4.1+
      84ded2f8
  11. 03 8月, 2015 1 次提交
  12. 22 5月, 2015 1 次提交
  13. 05 5月, 2015 1 次提交
  14. 26 4月, 2015 1 次提交
  15. 27 3月, 2015 6 次提交
  16. 21 1月, 2015 1 次提交
  17. 09 1月, 2015 1 次提交
  18. 07 1月, 2015 1 次提交
  19. 06 1月, 2015 1 次提交
  20. 06 11月, 2014 1 次提交
  21. 15 7月, 2014 1 次提交
    • A
      libata: EH should handle AMNF error condition as a media error · eec7e1c1
      Alexey Asemov 提交于
      libata-eh.c should handle AMNF error condition (error byte bit 0,
      usually code 0x01) in libata-eh.c along with UNC as a media error so
      SCSI stack can handle it properly (translation code 0x01 is already
      present in libata-scsi.c) but was never passed down due to lack of
      handling in EH.
      
      While using linux-based machine (AMD 6550M-based notebook, PCI IDs for the
      controller are 1022:7801 subsys 1025:059d) and ddrescue to salvage data
      from failing hard drive (WD7500BPVT 2.5" 750G SATA2), I've found that pure
      AMNF 0x01 error code generates generic "device error" that is retried
      several times by SCSI stack instead of "media error" that is passed up to
      software.
      
      So we may assume deprecated AMNF error code is surely not dead yet, and
      it's better for it to be handled properly. As we may see it is used by
      modern enough devices, and used properly: drive returned AMNF only when IDs
      for track cannot be read completely due to dying head or positioning,
      otherwise it returned UNC(orrectables).
      
      Not handling it causes wrong generic error code ("device error") reporting
      down the stack, can damage failing drives further because of excessive
      retries, and slows salvaging down a lot. Also, there is handling code in
      libata-scsi.c for 0x01 AMNF error already.
      
      https://bugzilla.kernel.org/show_bug.cgi?id=80031
      
      tj: Shortened $SUBJ and moved its content to the first paragraph.
      Signed-off-by: NAlexey Asemov <alex@alex-at.ru>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      eec7e1c1
  22. 19 3月, 2014 1 次提交
    • D
      libata, libsas: kill pm_result and related cleanup · bc6e7c4b
      Dan Williams 提交于
      Tejun says:
        "At least for libata, worrying about suspend/resume failures don't make
         whole lot of sense.  If suspend failed, just proceed with suspend.  If
         the device can't be woken up afterwards, that's that.  There isn't
         anything we could have done differently anyway.  The same for resume, if
         spinup fails, the device is dud and the following commands will invoke
         EH actions and will eventually fail.  Again, there really isn't any
         *choice* to make.  Just making sure the errors are handled gracefully
         (ie. don't crash) and the following commands are handled correctly
         should be enough."
      
      The only libata user that actually cares about the result from a suspend
      operation is libsas.  However, it only cares about whether queuing a new
      operation collides with an in-flight one.  All libsas does with the
      error is retry, but we can just let libata wait for the previous
      operation before continuing.
      
      Other cleanups include:
      1/ Unifying all ata port pm operations on an ata_port_pm_ prefix
      2/ Marking all ata port pm helper routines as returning void, only
         ata_port_pm_ entry points need to fake a 0 return value.
      3/ Killing ata_port_{suspend|resume}_common() in favor of calling
         ata_port_request_pm() directly
      4/ Killing the wrappers that just do a to_ata_port() conversion
      5/ Clearly marking the entry points that do async operations with an
        _async suffix.
      
      Reference: http://marc.info/?l=linux-scsi&m=138995409532286&w=2
      
      Cc: Phillip Susi <psusi@ubuntu.com>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Suggested-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NTodd Brandt <todd.e.brandt@intel.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      bc6e7c4b
  23. 08 3月, 2014 1 次提交
    • D
      libata: end the r-word · 35bf8821
      Dan Williams 提交于
      Prompted by the social effort in the US to discourage usage of the
      adjective "retarded".
      
      In this case we needlessly anthropomorphize hard drives.  The
      implication is that due to design deficiencies in the device reset
      recovery time is negatively impacted.  We can simply clearly state that
      fact.  "Exceptional devices cause outliers in reset recovery time." This
      steers clear of any unintended comparison of such devices to humans with
      cognitive disabilities.
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      35bf8821
  24. 23 11月, 2013 1 次提交
  25. 15 11月, 2013 1 次提交
  26. 27 10月, 2013 1 次提交
  27. 08 10月, 2013 1 次提交
    • G
      libata: make ata_eh_qc_retry() bump scmd->allowed on bogus failures · f13e2201
      Gwendal Grignou 提交于
      libata EH decrements scmd->retries when the command failed for reasons
      unrelated to the command itself so that, for example, commands aborted
      due to suspend / resume cycle don't get penalized; however,
      decrementing scmd->retries isn't enough for ATA passthrough commands.
      
      Without this fix, ATA passthrough commands are not resend to the
      drive, and no error is signalled to the caller because:
      
      - allowed retry count is 1
      - ata_eh_qc_complete fill the sense data, so result is valid
      - sense data is filled with untouched ATA registers.
      Signed-off-by: NGwendal Grignou <gwendal@google.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: stable@vger.kernel.org
      f13e2201