1. 20 May, 2011 1 commit
  2. 15 May, 2011 1 commit
    • T
      libata: fix oops when LPM is used with PMP · 5f6f12cc
      Authored by Tejun Heo
      ae01b249 (libata: Implement ATA_FLAG_NO_DIPM and apply it to mcp65)
      added ATA_FLAG_NO_DIPM and made ata_eh_set_lpm() check the flag.
      However, @ap is NULL if @link points to a PMP link and thus the
      unconditional @ap->flags dereference leads to the following oops.
      
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
        IP: [<ffffffff813f98e1>] ata_eh_recover+0x9a1/0x1510
        ...
        Pid: 295, comm: scsi_eh_4 Tainted: P            2.6.38.5-core2 #1 System76, Inc. Serval Professional/Serval Professional
        RIP: 0010:[<ffffffff813f98e1>]  [<ffffffff813f98e1>] ata_eh_recover+0x9a1/0x1510
        RSP: 0018:ffff880132defbf0  EFLAGS: 00010246
        RAX: 0000000000000000 RBX: ffff880132f40000 RCX: 0000000000000000
        RDX: ffff88013377c000 RSI: ffff880132f40000 RDI: 0000000000000000
        RBP: ffff880132defce0 R08: ffff88013377dc58 R09: ffff880132defd98
        R10: 0000000000000000 R11: 00000000ffffffff R12: 0000000000000000
        R13: 0000000000000000 R14: ffff88013377c000 R15: 0000000000000000
        FS:  0000000000000000(0000) GS:ffff8800bf700000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
        CR2: 0000000000000018 CR3: 0000000001a03000 CR4: 00000000000406e0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
        Process scsi_eh_4 (pid: 295, threadinfo ffff880132dee000, task ffff880133b416c0)
        Stack:
         0000000000000000 ffff880132defcc0 0000000000000000 ffff880132f42738
         ffffffff813ee8f0 ffffffff813eefe0 ffff880132defd98 ffff88013377f190
         ffffffffa00b3e30 ffffffff813ef030 0000000032defc60 ffff880100000000
        Call Trace:
         [<ffffffff81400867>] sata_pmp_error_handler+0x607/0xc30
         [<ffffffffa00b273f>] ahci_error_handler+0x1f/0x70 [libahci]
         [<ffffffff813faade>] ata_scsi_error+0x5be/0x900
         [<ffffffff813cf724>] scsi_error_handler+0x124/0x650
         [<ffffffff810834b6>] kthread+0x96/0xa0
         [<ffffffff8100cd64>] kernel_thread_helper+0x4/0x10
        Code: 8b 95 70 ff ff ff b8 00 00 00 00 48 3b 9a 10 2e 00 00 48 0f 44 c2 48 89 85 70 ff ff ff 48 8b 8d 70 ff ff ff f6 83 69 02 00 00 01 <48> 8b 41 18 0f 85 48 01 00 00 48 85 c9 74 12 48 8b 51 08 48 83
        RIP  [<ffffffff813f98e1>] ata_eh_recover+0x9a1/0x1510
         RSP <ffff880132defbf0>
        CR2: 0000000000000018
      
      Fix it by testing @link->ap->flags instead.
      
      stable: ATA_FLAG_NO_DIPM was added during 2.6.39 cycle but was
              backported to 2.6.37 and 38.  This is a fix for that and thus
              also applicable to 2.6.37 and 38.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: "Nathan A. Mourey II" <nmoureyii@ne.rr.com>
      LKML-Reference: <1304555277.2059.2.camel@localhost.localdomain>
      Cc: Connor H <cmdkhh@gmail.com>
      Cc: stable@kernel.org
      Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
      5f6f12cc
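The failure mode described in this commit can be sketched in plain C. The structures below are simplified stand-ins rather than the kernel's definitions, `FLAG_NO_DIPM` uses an illustrative value, and `local_ap()` and `dipm_allowed()` are hypothetical helpers that only model the description above: the local `@ap` is NULL for PMP links, while `link->ap` is always valid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins (assumption: only the fields needed here). */
#define FLAG_NO_DIPM (1u << 0)   /* illustrative value, not the kernel's */

struct port { unsigned int flags; };
struct link { struct port *ap; bool is_pmp_link; };

/* Models the local @ap from the commit message: NULL for PMP links. */
static struct port *local_ap(const struct link *link)
{
    return link->is_pmp_link ? NULL : link->ap;
}

/* Fixed check: go through link->ap, which is valid for every link type. */
static bool dipm_allowed(const struct link *link)
{
    assert(link != NULL && link->ap != NULL);
    return !(link->ap->flags & FLAG_NO_DIPM);
}
```

Dereferencing the result of `local_ap()` for a PMP link would reproduce the NULL pointer access from the oops; `dipm_allowed()` avoids it by never using the conditional pointer.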
  3. 24 Apr, 2011 1 commit
  4. 31 Mar, 2011 1 commit
  5. 02 Mar, 2011 3 commits
    • J
      libata: separate error handler into usable components · 0e0b494c
      Authored by James Bottomley
      Right at the moment, the libata error handler is incredibly
      monolithic.  This makes it impossible to use from composite drivers
      like libsas and ipr which have to handle errors themselves in the first
      instance.
      
      The essence of the change is to split the monolithic error handler
      into two components: one which handles a queue of ata commands for
      processing and the other which handles the back end of readying a
      port.  This allows the upper error handler fine grained control in
      calling libsas functions (and making sure they only get called for ATA
      commands whose lower errors have been fixed up).
      Signed-off-by: James Bottomley <James.Bottomley@suse.de>
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
      0e0b494c
    • J
      libata: fix eh locking · c34aeebc
      Authored by James Bottomley
      The SCSI host eh_cmd_q should be protected by the host lock (not the
      port lock).  This probably doesn't matter that much at the moment,
      since we try to serialise the add and eh pieces, but it might matter
      in future for more convenient error handling.  Plus this switches
      libata to the standard eh pattern where you lock, remove from the cmd
      queue to a local list and unlock and then operate on the local list.
      Signed-off-by: James Bottomley <James.Bottomley@suse.de>
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
      c34aeebc
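The "lock, remove from the cmd queue to a local list, unlock, then operate on the local list" pattern mentioned above can be sketched in user-space C. This is a minimal illustration, assuming a pthread mutex in place of the SCSI host lock and a hand-rolled singly linked list in place of the kernel list helpers; `take_eh_cmds()` and `count_cmds()` are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct cmd { int id; struct cmd *next; };

struct host {
    pthread_mutex_t lock;   /* stands in for the SCSI host lock */
    struct cmd *eh_cmd_q;   /* shared queue, protected by lock */
};

/* Detach the whole shared queue under the lock and hand it to the caller. */
static struct cmd *take_eh_cmds(struct host *h)
{
    struct cmd *local;

    assert(h != NULL);
    pthread_mutex_lock(&h->lock);
    local = h->eh_cmd_q;
    h->eh_cmd_q = NULL;
    pthread_mutex_unlock(&h->lock);

    return local;           /* caller walks this list without the lock */
}

/* Example consumer: counts the commands on the private list. */
static int count_cmds(const struct cmd *c)
{
    int n = 0;

    for (; c != NULL; c = c->next)
        n++;
    return n;
}
```

The lock is held only for the pointer swap, so producers adding to the shared queue are blocked briefly while the potentially slow per-command work happens on the private list.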
    • T
      libata: fix hotplug for drivers which don't implement LPM · eb0e85e3
      Authored by Tejun Heo
      ata_eh_analyze_serror() suppresses hotplug notifications if LPM is
      being used because LPM generates spurious hotplug events.  It compared
      whether link->lpm_policy was different from ATA_LPM_MAX_POWER to
      determine whether LPM is enabled; however, this is incorrect as for
      drivers which don't implement LPM, lpm_policy is always
      ATA_LPM_UNKNOWN.  This disabled hotplug detection for all drivers
      which don't implement LPM.
      
      Fix it by comparing whether lpm_policy is greater than
      ATA_LPM_MAX_POWER.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: stable@kernel.org
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
      eb0e85e3
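The difference between the two comparisons in this commit can be shown with a small sketch. The enum below is an assumption about the ordering of the policy values (UNKNOWN sorting below MAX_POWER, as the fix implies); the `LPM_*` names and both helper functions are illustrative, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed ordering: UNKNOWN is the lowest value, below MAX_POWER. */
enum lpm_policy {
    LPM_UNKNOWN,
    LPM_MAX_POWER,
    LPM_MED_POWER,
    LPM_MIN_POWER,
};

/* Old test: anything other than MAX_POWER, including UNKNOWN, counts as
 * "LPM enabled", so hotplug events get suppressed for drivers without LPM. */
static bool lpm_enabled_old(enum lpm_policy p)
{
    assert(p <= LPM_MIN_POWER);
    return p != LPM_MAX_POWER;
}

/* New test: only policies above MAX_POWER count as "LPM enabled". */
static bool lpm_enabled_new(enum lpm_policy p)
{
    assert(p <= LPM_MIN_POWER);
    return p > LPM_MAX_POWER;
}
```

With this ordering the two tests disagree only for `LPM_UNKNOWN`, which is exactly the case the commit describes.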
  6. 14 Feb, 2011 2 commits
    • J
      [SCSI] libata: separate error handler into usable components · 64878c0e
      Authored by James Bottomley
      Right at the moment, the libata error handler is incredibly
      monolithic.  This makes it impossible to use from composite drivers
      like libsas and ipr which have to handle errors themselves in the first
      instance.
      
      The essence of the change is to split the monolithic error handler
      into two components: one which handles a queue of ata commands for
      processing and the other which handles the back end of readying a
      port.  This allows the upper error handler fine grained control in
      calling libsas functions (and making sure they only get called for ATA
      commands whose lower errors have been fixed up).
      
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Jeff Garzik <jeff@garzik.org>
      Signed-off-by: James Bottomley <James.Bottomley@suse.de>
      64878c0e
    • J
      [SCSI] libata: fix eh locking · 4451ef63
      Authored by James Bottomley
      The SCSI host eh_cmd_q should be protected by the host lock (not the
      port lock).  This probably doesn't matter that much at the moment,
      since we try to serialise the add and eh pieces, but it might matter
      in future for more convenient error handling.  Plus this switches
      libata to the standard eh pattern where you lock, remove from the cmd
      queue to a local list and unlock and then operate on the local list.
      
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Jeff Garzik <jeff@garzik.org>
      Signed-off-by: James Bottomley <James.Bottomley@suse.de>
      4451ef63
  7. 25 Dec, 2010 1 commit
    • T
      libata: issue DIPM enable commands with LPM state updated · e5005b15
      Authored by Tejun Heo
      Low level drivers may behave differently depending on the current
      link->lpm_policy.  During ata_eh_set_lpm(), DIPM enable commands are
      issued after the successful completion of ap->ops->set_lpm(), which
      means that the controller is already in the target state.  This causes
      DIPM enable commands to be processed with mismatching controller power
      state and link->lpm_policy value.
      
      In ahci, link->lpm_policy is used to ignore certain PHY events if LPM
      is enabled; however, as DIPM commands are issued with stale
      link->lpm_policy, they sometimes end up triggering these conditions
      and get aborted leading to LPM configuration failure.
      
      Fix it by updating link->lpm_policy before issuing DIPM enable
      commands.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Kyle McMartin <kyle@mcmartin.ca>
      Cc: stable@kernel.org
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
      e5005b15
  8. 22 Oct, 2010 6 commits
    • T
      libata: implement cross-port EH exclusion · c0c362b6
      Authored by Tejun Heo
      In libata, the non-EH code paths should always take and release
      ap->lock explicitly when accessing hardware or shared data structures.
      However, once EH is active, it's assumed that the port is owned by EH
      and EH methods don't explicitly take ap->lock unless race from irq
      handler or other code paths are expected.  However, libata EH didn't
      guarantee exclusion among EHs for ports of the same host.  IOW,
      multiple EHs may execute in parallel on multiple ports of the same
      controller.
      
      In many cases, especially in SATA, the ports are completely
      independent of each other and this doesn't cause problems; however,
      there are cases where different ports share the same resource, which
      leads to obscure timing-related bugs such as the one fixed by commit
      213373cf (ata_piix: fix locking around SIDPR access).
      
      This patch implements exclusion among EHs of the same host.  When EH
      begins, it acquires per-host EH ownership by calling ata_eh_acquire().
      When EH finishes, the ownership is released by calling
      ata_eh_release().  EH ownership is also released whenever the EH
      thread goes to sleep from ata_msleep() or explicitly and reacquired
      after waking up.
      
      This ensures that while EH is actively accessing the hardware, it has
      exclusive access to it while allowing EHs to interleave and progress
      in parallel as they hit waiting stages, which dominate the time spent
      in EH.  This achieves cross-port EH exclusion without pervasive and
      fragile changes while still allowing parallel EH for the most part.
      
      This was first reported by yuanding02@gmail.com more than three years
      ago in the following bugzilla.  :-)
      
        https://bugzilla.kernel.org/show_bug.cgi?id=8223
      
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Reported-by: yuanding02@gmail.com
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
      c0c362b6
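The acquire/release scheme described in this commit can be sketched in user-space C. This is a minimal model, assuming a pthread mutex for per-host ownership and a callback in place of an actual sleep; `eh_acquire()`, `eh_release()`, `eh_wait()` and `record_owner()` are hypothetical names and signatures, not the kernel's `ata_eh_acquire()` / `ata_eh_release()` / `ata_msleep()`.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct host {
    pthread_mutex_t eh_mutex;  /* per-host EH ownership */
    const void *eh_owner;      /* port whose EH owns the host, or NULL */
};

static void eh_acquire(struct host *h, const void *port)
{
    pthread_mutex_lock(&h->eh_mutex);
    h->eh_owner = port;
}

static void eh_release(struct host *h)
{
    assert(h->eh_owner != NULL);   /* only the owner may release */
    h->eh_owner = NULL;
    pthread_mutex_unlock(&h->eh_mutex);
}

/* Run a waiting stage without holding EH ownership, so that EH for other
 * ports of the same host can make progress in the meantime. */
static void eh_wait(struct host *h, const void *port,
                    void (*wait_fn)(struct host *, void *), void *arg)
{
    int owned = (h->eh_owner == port);

    if (owned)
        eh_release(h);
    wait_fn(h, arg);
    if (owned)
        eh_acquire(h, port);
}

/* Example waiting stage: records who owned EH while the caller waited. */
static void record_owner(struct host *h, void *out)
{
    *(const void **)out = h->eh_owner;
}
```

Hardware access happens only between `eh_acquire()` and `eh_release()`, while long waits go through `eh_wait()`; this mirrors the commit's point that waiting stages dominate EH time and can safely interleave.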
    • T
      libata: add @ap to ata_wait_register() and introduce ata_msleep() · 97750ceb
      Authored by Tejun Heo
      Add an optional @ap argument to ata_wait_register() and replace msleep()
      calls with ata_msleep(), which takes an optional @ap in addition to the
      duration.  These will be used to implement EH exclusion.
      
      This patch doesn't cause any behavior difference.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
      97750ceb
    • T
      libata: implement LPM support for port multipliers · 6c8ea89c
      Authored by Tejun Heo
      Port multipliers can do DIPM on fan-out links fine.  Implement support
      for it.  Tested w/ SIMG 57xx and marvell PMPs.  Both the host and
      fan-out links enter power save modes nicely.
      
      SIMG 37xx and 47xx report link offline on SStatus causing EH to detach
      the devices.  Blacklisted.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
      6c8ea89c
    • T
      libata: reimplement link power management · 6b7ae954
      Authored by Tejun Heo
      The current LPM implementation has the following issues.
      
      * Operation order isn't well thought-out.  e.g. HIPM should be
        configured after IPM in SControl is properly configured.  Not the
        other way around.
      
      * Suspend/resume paths call ata_lpm_enable/disable() which must only
        be called from EH context directly.  Also, ata_lpm_enable/disable()
        were called whether LPM was in use or not.
      
      * Implementation is per-port when it should be per-link.  As a result,
        it can't be used for controllers with slave links or PMP.
      
      * LPM state isn't managed consistently.  After a link reset for
        whatever reason including suspend/resume the actual LPM state would
        be reset leaving ap->lpm_policy inconsistent.
      
      * Generic/driver-specific logic boundary isn't clear.  Currently,
        libahci has to mangle stuff which libata EH proper should be
        handling.  This makes the implementation unnecessarily complex and
        fragile.
      
      * Tied to ALPM.  Doesn't consider DIPM only cases and doesn't check
        whether the device allows HIPM.
      
      * Error handling isn't implemented.
      
      Given the extent of mismatch with the rest of libata, I don't think
      trying to fix it piecewise makes much sense.  This patch reimplements
      LPM support.
      
      * The new implementation is per-link.  The target policy is still
        port-wide (ap->target_lpm_policy) but all the mechanisms and states
        are per-link and integrate well with the rest of link abstraction
        and can work with slave and PMP links.
      
      * Core EH has proper control of LPM state.  LPM state is reconfigured
        when and only when reconfiguration is necessary.  It makes sure that
        LPM state is reset when probing for new device on the link.
        Controller agnostic logic is now implemented in libata EH proper and
        driver implementation only has to deal with controller specifics.
      
      * Proper error handling.  LPM config failure is attributed to the
        device on the link and LPM is disabled for the link if it fails
        repeatedly.
      
      * ops->enable/disable_pm() are replaced with single ops->set_lpm()
        which takes @policy and @hints.  This simplifies driver specific
        implementation.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
      6b7ae954
    • T
      libata: clean up lpm related symbols and sysfs show/store functions · c93b263e
      Authored by Tejun Heo
      Link power management related symbols are in confusing state w/ mixed
      usages of lpm, ipm and pm.  This patch cleans up lpm related symbols
      and sysfs show/store functions as follows.
      
      * lpm states - NOT_AVAILABLE, MIN_POWER, MAX_PERFORMANCE and
        MEDIUM_POWER are renamed to ATA_LPM_UNKNOWN and
        ATA_LPM_{MIN|MAX|MED}_POWER.
      
      * Pre/postfixes are unified to lpm.
      
      * sysfs show/store functions for link_power_management_policy were
        curiously named get/put and unnecessarily complex.  Renamed to
        show/store and simplified.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
      c93b263e
    • G
      [libata] Add ATA transport class · d9027470
      Authored by Gwendal Grignou
      This is a skeleton for the libata transport class.
      All information is read only, exporting information from libata:
      - ata_port class: one per ATA port
      - ata_link class: one per ATA port or 15 for SATA Port Multiplier
      - ata_device class: up to 2 for PATA link, usually one for SATA.
      Signed-off-by: Gwendal Grignou <gwendal@google.com>
      Reviewed-by: Grant Grundler <grundler@google.com>
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
      d9027470
  9. 10 Sep, 2010 1 commit
  10. 02 Aug, 2010 1 commit
  11. 02 Jul, 2010 1 commit
    • T
      libata: take advantage of cmwq and remove concurrency limitations · ad72cf98
      Authored by Tejun Heo
      libata has two concurrency related limitations.
      
      a. ata_wq which is used for polling PIO has single thread per CPU.  If
         there are multiple devices doing polling PIO on the same CPU, they
         can't be executed simultaneously.
      
      b. ata_aux_wq which is used for SCSI probing has single thread.  In
         cases where SCSI probing is stalled for extended period of time
         which is possible for ATAPI devices, this will stall all probing.
      
      #a is solved by increasing maximum concurrency of ata_wq.  Please note
      that polling PIO might be used under allocation path and thus needs to
      be served by a separate wq with a rescuer.
      
      #b is solved by using the default wq instead and achieving exclusion
      via per-port mutex.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Jeff Garzik <jgarzik@pobox.com>
      ad72cf98
  12. 20 May, 2010 2 commits
    • T
      libata-sff: separate out BMDMA EH · fe06e5f9
      Authored by Tejun Heo
      Some of the error handling logic in ata_sff_error_handler() and all of
      ata_sff_post_internal_cmd() are for BMDMA.  Create
      ata_bmdma_error_handler() and ata_bmdma_post_internal_cmd() and move
      BMDMA part into those.
      
      While at it, change DMA protocol check to ata_is_dma(), fix
      post_internal_cmd to call ap->ops->bmdma_stop instead of directly
      calling ata_bmdma_stop() and open code hardreset selection so that
      ata_std_error_handler() doesn't have to know about sff hardreset.
      
      As these two functions are BMDMA specific, there's no reason to check
      for bmdma_addr before calling bmdma methods if the protocol of the
      failed command is DMA.  sata_mv and pata_mpc52xx now don't need to set
      .post_internal_cmd to ATA_OP_NULL and pata_icside and sata_qstor don't
      need to set it to their bmdma_stop routines.
      
      ata_sff_post_internal_cmd() becomes noop and is removed.
      
      This fixes p3 described in clean-up-BMDMA-initialization patch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
      fe06e5f9
    • T
      libata-sff: port_task is SFF specific · c429137a
      Authored by Tejun Heo
      port_task is tightly bound to the standard SFF PIO HSM implementation.
      Using it for any other purpose would be error-prone and there's no
      such user and if some drivers need such feature, it would be much
      better off using its own.  Move it inside CONFIG_ATA_SFF and rename it
      to sff_pio_task.
      
      The only function which is exposed to the core layer is
      ata_sff_flush_pio_task() which is renamed from ata_port_flush_task()
      and now also takes care of resetting hsm_task_state to HSM_ST_IDLE,
      which is possible as it's now specific to PIO HSM.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
      c429137a
  13. 23 Apr, 2010 2 commits
  14. 21 Jan, 2010 1 commit
  15. 03 Dec, 2009 1 commit
    • T
      libata: retry failed FLUSH if device didn't fail it · 6013efd8
      Authored by Tejun Heo
      If ATA device failed FLUSH, it means that the device failed to write
      out some amount of data and the error needs to be reported to upper
      layers. As retries can't recover the lost data, FLUSH failures need to
      be reported immediately in general.
      
      However, if FLUSH fails due to transmission errors, the FLUSH needs to
      be retried; otherwise, filesystems may switch to RO mode and/or raid
      array may drop a drive for a random transmission glitch.
      
      This condition can be rather easily reproduced on certain ahci
      controllers which go through a PHY event after powersave mode switch +
      ext4 combination.  Powersave mode switch is often closely followed by
      flush from the filesystem failing the FLUSH with ATA bus error which
      makes the filesystem code believe that data is lost and drop to RO
      mode.  This was reported in the following bugzilla bug.
      
        http://bugzilla.kernel.org/show_bug.cgi?id=14543
      
      This patch makes libata EH retry FLUSH if it wasn't failed by the
      device.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Andrey Vihrov <andrey.vihrov@gmail.com>
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
      6013efd8
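The retry rule described in this commit (retry a failed FLUSH unless the device itself failed it) can be sketched as a small decision function. The error bits and the name `flush_worth_retry()` are assumptions made for illustration; they do not reproduce the kernel's error mask definitions or its actual retry logic.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative error bits; names and values are assumptions. */
#define ERR_DEV     (1u << 0)  /* the device itself reported the failure */
#define ERR_BUS     (1u << 1)  /* transmission (bus) error */
#define ERR_TIMEOUT (1u << 2)  /* command timed out */
#define ERR_ALL     (ERR_DEV | ERR_BUS | ERR_TIMEOUT)

/* Decide whether a failed FLUSH is worth retrying. */
static bool flush_worth_retry(unsigned int err_mask)
{
    assert((err_mask & ~ERR_ALL) == 0);   /* reject unknown bits */

    /* The device failed the FLUSH: data was lost and a retry cannot
     * recover it, so the error must be reported to upper layers. */
    if (err_mask & ERR_DEV)
        return false;

    /* Any other failure (for example a bus glitch) may be transient. */
    return err_mask != 0;
}
```

Under this sketch a bus error after a powersave mode switch leads to a retry instead of the filesystem dropping to read-only mode, while a genuine device failure is still reported immediately.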
  16. 16 Oct, 2009 1 commit
    • T
      libata: fix PMP initialization · 4f7c2874
      Authored by Tejun Heo
      Commit 842faa6c fixed error handling
      during attach by not committing detected device class to dev->class
      while attaching a new device.  However, this change missed the PMP
      class check in the configuration loop causing a new PMP device to go
      through ata_dev_configure() as if it were an ATA or ATAPI device.
      
      As a PMP device doesn't have regular IDENTIFY data, this makes
      ata_dev_configure() try to configure a PMP device using invalid
      data.  For the most part, it wasn't too harmful and went unnoticed but
      this ends up clearing dev->flags which may have ATA_DFLAG_AN set by
      sata_pmp_attach().  This means that SATA_PMP_FEAT_NOTIFY ends up being
      disabled on PMPs and, on PMPs which honor the flag, this breaks hotplug
      support.
      
      This problem was discovered and reported by Ethan Hsiao.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Ethan Hsiao <ethanhsiao@jmicron.com>
      Cc: stable@kernel.org
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
      4f7c2874
  17. 07 Oct, 2009 1 commit
    • T
      libata: fix incorrect link online check during probe · 3b761d3d
      Authored by Tejun Heo
      While trying to work around spurious detection retries for
      non-existent devices on slave links, commit
      816ab897 incorrectly added link
      offline check logic before ata_eh_thaw() was called.  This means that
      if an occupied link goes down briefly at the time that offline check
      was performed, device class will be cleared to ATA_DEV_NONE and libata
      wouldn't retry thus failing detection of the device.
      
      The offline check should be done after the port is thawed together
      with online check so that such link glitches can be detected by the
      interrupt handler and handled properly.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Tim Blechmann <tim@klingt.org>
      Cc: stable@kernel.org
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
      3b761d3d
  18. 02 Sep, 2009 3 commits
    • R
      libata: add command name parsing for error output · 6521148c
      Authored by Robert Hancock
      This patch improves libata's output for error/notification messages
      to allow easier comprehension and debugging:
      
      When ATAPI commands issued through the SCSI layer fail, use SCSI
      functions to print the CDB in human-readable form instead of just
      dumping out the CDB in hex.
      
      Print out the name of the failed command (as defined by the ATA
      specification) in error handling output along with the raw register
      contents.
      
      When reporting status of ACPI taskfile commands executed on resume,
      also output the names of the commands being executed (or not) in
      readable form.
      
      Since the extra data for printing command names increases kernel
      size slightly, a config option has been added to allow disabling
      command name output (as well as some of the error register parsing)
      for those highly sensitive to kernel text size.
      Signed-off-by: Robert Hancock <hancockrwd@gmail.com>
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
      6521148c
    • T
      libata: clear eh_info on reset completion · 1e641060
      Authored by Tejun Heo
      Resets are done with port frozen but some controllers still issue
      interrupts during reset and they may end up recording error conditions
      in ehi leading to unnecessary EH retrials.
      
      This patch makes ata_eh_reset() clear ehi on reset completion.  As
      reset is the most severe recovery action, there's nothing to lose by
      clearing ehi on its completion.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Zdenek Kaspar <zkaspar82@gmail.com>
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
      1e641060
    • J
      [libata] EH: freeze port before aborting commands · 54c38444
      Authored by Jeff Garzik
      Call the ->freeze() hook before aborting qc's, because some hardware
      requires special handling prior to accessing the taskfile registers
      (for diagnosis/analysis/reset).  Most notably, hardware may wish to
      disable the DMA engine or interrupts in the ->freeze() hook.
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
      54c38444
  19. 29 Jul, 2009 1 commit
  20. 15 Jul, 2009 1 commit
  21. 13 Jun, 2009 1 commit
  22. 12 May, 2009 2 commits
    • T
      libata: clear ering on resume · 6f9c1ea2
      Authored by Tejun Heo
      Error timestamps are in jiffies, which don't run while suspended, and
      PHY events during resume aren't too uncommon.  When the two are
      combined, it can lead to unnecessary speed downs if the machine is
      suspended and resumed repeatedly.  Clear error history on resume.
      
      This was reported and verified in bnc#486803 by Vladimir Botka.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Vladimir Botka <vbotka@novell.com>
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
      6f9c1ea2
    • T
      libata: fix attach error handling · 842faa6c
      Authored by Tejun Heo
      New device attach path in ata_eh_revalidate_and_attach() is divided
      into two separate loops because ATA requires IDENTIFY to be issued to
      slave first while the user expects to see device probe messages from
      the master device.  new_mask is used to track which devices are the
      new ones between the first loop and the second.
      
      This usually works well but if an error occurs during configuration
      stage, ata_eh_revalidate_and_attach() returns with error code and
      forgets new_mask.  On the retry run, dev->class is set and new_mask
      for the device is clear, so the device just gets revalidated and thus
      ends up skipping post-configuration procedure including scheduling of
      SCSI_HOTPLUG for the device.  When this occurs, ATA part of probing
      works fine but SCSI probing usually doesn't happen and makes the
      device unreachable.
      
      The behavior has been around for a very long time but it has been
      uncovered with the recent addition of 1_5_GBPS horkage which uses
      -EAGAIN return value from ata_dev_configure() to restart the probing
      sequence after forcing cable speed.
      
      This can be fixed by making sure dev->class is permanently set only
      after all configurations are successfully complete.  Fix it.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Tim Connors <tconnors+linuxkml@astro.swin.edu.au>
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
      842faa6c
  23. 25 Mar, 2009 1 commit
    • A
      [libata] Improve timeout handling · c96f1732
      Authored by Alan Cox
      On a timeout call a device specific handler early in the recovery so that
      we can complete and process successful commands which timed out due to IRQ
      loss or the like rather more elegantly.
      
      [Revised to exclude the timeout handling on a few devices that inherit from
       SFF but are not SFF enough to use the default timeout handler]
      Signed-off-by: Alan Cox <alan@redhat.com>
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
      c96f1732
  24. 05 Mar, 2009 2 commits
    • T
      libata: make sure port is thawed when skipping resets · d6515e6f
      Authored by Tejun Heo
      When SCR access is available and the link is offline, softreset is
      skipped as it only wastes time and some controllers don't respond very
      well.  However, the skip path forgot to thaw the port, which not only
      blocks further event notification from the port but also causes
      repeated EH invocations on the same event on drivers which rely on
      ->thaw() to clear events if the IRQ is shared with another device or
      port.
      
      This problem has always been there but is uncovered by recent sata_nv
      nf2/3 change which dropped hardreset support while maintaining SCR
      access.  nf2/3 doesn't clear hotplug event mask from the interrupt
      handler but relies on ->thaw() to clear them.  When the hardreset was
      there, the reset action was never skipped and the port was always
      thawed but, with the hardreset gone, ->prereset() determines that
      there's no need for softreset and both ->softreset() and ->thaw() are
      skipped.  This leads to stuck hotplug event in the IRQ status register
      triggering hotplug event whenever IRQ is delivered on the same IRQ.
      As the controller shares the same IRQ for both ports, this happens on
      every IO if one port is occupied and the other isn't.
      
      This patch fixes the problem by making sure that the port is thawed on
      reset-skip path.
      
      bko#11615 reports this problem.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Robert Hancock <hancockrwd@gmail.com>
      Reported-by: Dan Andresan <danyer@gmail.com>
      Reported-by: Arne Woerner <arne_woerner@yahoo.com>
      Reported-by: Stefan Lippers-Hollmann <s.L-H@gmx.de>
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
      d6515e6f
    • T
      libata: don't use on-stack sense buffer · b5357081
      Authored by Tejun Heo
      sense_buffer is used as DMA target and shouldn't be allocated on
      stack.  Use ap->sector_buf instead.  This problem is spotted by Chuck
      Ebbert.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Chuck Ebbert <cebbert@redhat.com>
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
      b5357081
  25. 03 Feb, 2009 2 commits
    • T
      libata: add no penalty retry request for EH device handling routines · cf9a590a
      Authored by Tejun Heo
      Let -EAGAIN from EH device handling routines trigger EH retry without
      consuming its tries count.  This will be used to implement link SPD
      horkage which requires hardreset to adjust SPD without affecting other
      EH decisions.  As it bypasses the forward progress guarantee provided
      by the tries count, the requester is responsible for ensuring forward
      progress.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
      cf9a590a
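The "no penalty retry" idea in this commit can be sketched as a retry loop in which `-EAGAIN` does not consume the tries count. This is a simplified model, not the libata EH loop; `recover_with_tries()`, `struct script` and `scripted_step()` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Run a recovery step until it succeeds or the tries are used up.
 * A step returning -EAGAIN is retried without consuming a try; any other
 * error consumes one. As the commit notes, the requester of -EAGAIN is
 * then responsible for forward progress: a step that always returns
 * -EAGAIN would loop forever here. */
static int recover_with_tries(int (*step)(void *), void *ctx, int max_tries)
{
    int tries = max_tries;
    int rc;

    assert(step != NULL && max_tries > 0);
    do {
        rc = step(ctx);
        if (rc == 0)
            return 0;
        if (rc != -EAGAIN)
            tries--;
    } while (tries > 0);

    return rc;
}

/* Example step that replays a fixed sequence of return codes. */
struct script { const int *codes; int pos; };

static int scripted_step(void *ctx)
{
    struct script *s = ctx;

    return s->codes[s->pos++];
}
```

With two tries, a sequence of two `-EAGAIN` results, one hard error and then success still recovers, because only the hard error counted against the budget.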
    • T
      libata: improve probe failure handling · c2c7a89c
      Authored by Tejun Heo
      When link is flaky at high speed, it isn't uncommon for a device to
      repeatedly fail probing sequence early after successfully negotiating
      high link speed.  This often leads to consecutive hotplug events
      without successful probing.
      
      This patch improves libata EH such that it remembers probing trials
      and if there have been more than two unsuccessful trials in the past
      60 seconds, slows down link speed to 1.5Gbps.
      
      As link speed negotiation is the duty of the PHY layer proper, the
      goal of this fallback mechanism is to provide the last resort when
      everything else fails, which unfortunately happens not too
      infrequently, so no fancy 6->3->1.5 speeding down or highest
      successful transmission speed seen kind of logics (yet).
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
      c2c7a89c
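The probe-trial bookkeeping described in this commit can be sketched with a small ring of timestamps. The 60-second window and the "more than two" threshold come from the commit message; everything else is an assumption made for illustration, including the ring size, the use of seconds rather than jiffies, counting the current failure as a trial, and the names `probe_history` and `probe_failed()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PROBE_WINDOW_SECS 60   /* from the commit message */
#define PROBE_MAX_TRIALS   2   /* "more than two" triggers the fallback */
#define TRIAL_SLOTS        8   /* illustrative ring size */

struct probe_history {
    long when[TRIAL_SLOTS];    /* timestamps of failed trials, in seconds */
    size_t count;              /* total failures recorded so far */
};

/* Record a failed probe at time now (assumed non-decreasing). Returns true
 * when more than PROBE_MAX_TRIALS failures, including this one, fall inside
 * the window ending at now, i.e. when the caller should limit the link
 * speed to 1.5Gbps. */
static bool probe_failed(struct probe_history *h, long now)
{
    size_t i, n, recent = 0;

    assert(h != NULL);
    h->when[h->count % TRIAL_SLOTS] = now;
    h->count++;

    n = h->count < TRIAL_SLOTS ? h->count : TRIAL_SLOTS;
    for (i = 0; i < n; i++)
        if (now - h->when[i] <= PROBE_WINDOW_SECS)
            recent++;

    return recent > PROBE_MAX_TRIALS;
}
```

Failures spread far apart never trip the fallback, whereas three failures within a minute do, which matches the "last resort" intent stated above.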