- 25 3月, 2009 1 次提交
-
-
由 Alan Cox 提交于
On a timeout call a device specific handler early in the recovery so that we can complete and process successful commands which timed out due to IRQ loss or the like rather more elegantly. [Revised to exclude the timeout handling on a few devices that inherit from SFF but are not SFF enough to use the default timeout handler] Signed-off-by: NAlan Cox <alan@redhat.com> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
- 05 3月, 2009 2 次提交
-
-
由 Tejun Heo 提交于
When SCR access is available and the link is offline, softreset is skipped as it only wastes time and some controllers don't respond very well. However, the skip path forgot to thaw the port, which not only blocks further event notification from the port but also causes repeated EH invocations on the same event on drivers which rely on ->thaw() to clear events if the IRQ is shared with another device or port. This problem has always been there but is uncovered by recent sata_nv nf2/3 change which dropped hardreset support while maintaining SCR access. nf2/3 doesn't clear hotplug event mask from the interrupt handler but relies on ->thaw() to clear them. When the hardreset was there, the reset action was never skipped and the port was always thawed but, with the hardreset gone, ->prereset() determines that there's no need for softreset and both ->softreset() and ->thaw() are skipped. This leads to stuck hotplug event in the IRQ status register triggering hotplug event whenever IRQ is delieverd on the same IRQ. As the controller shares the same IRQ for both ports, this happens on every IO if one port is occpupied and the other isn't. This patch fixes the problem by making sure that the port is thawed on reset-skip path. bko#11615 reports this problem. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Robert Hancock <hancockrwd@gmail.com> Reported-by: NDan Andresan <danyer@gmail.com> Reported-by: NArne Woerner <arne_woerner@yahoo.com> Reported-by: NStefan Lippers-Hollmann <s.L-H@gmx.de> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Tejun Heo 提交于
sense_buffer is used as DMA target and shouldn't be allocated on stack. Use ap->sector_buf instead. This problem is spotted by Chuck Ebbert. Signed-off-by: NTejun Heo <tj@kernel.org> Reported-by: NChuck Ebbert <cebbert@redhat.com> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
- 03 2月, 2009 6 次提交
-
-
由 Tejun Heo 提交于
Let -EAGAIN from EH device handling routines trigger EH retry without consuming its tries count. This will be used to implement link SPD horkage which requires hardreset to adjust SPD without affecting other EH decisions. As it bypasses the forward progress guarantee provided by the tries count, the requester is responsible for ensuring forward progress. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Tejun Heo 提交于
When link is flaky at high speed, it isn't uncommon for a device to repeatedly fail probing sequence early after successfully negotiating high link speed. This often leads to consecutive hotplug events without successful probing. This patch improves libata EH such that it remembers probing trials and if there have been more than two unsuccessful trials in the past 60 seconds, slows down link speed to 1.5Gbps. As link speed negotiation is the duty of the PHY layer proper, the goal of this fallback mechanism is to provide the last resort when everything else fails, which unfortunately happens not too infrequently, so no fancy 6->3->1.5 speeding down or highest successful transmission speed seen kind of logics (yet). Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Tejun Heo 提交于
Add @spd_limit to sata_down_spd_limit() so that the caller can specify the SPD limit it wants. This parameter doesn't get in the way even when it's too low. The closest possible limit is applied. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Tejun Heo 提交于
dev->ering used to be cleared together with the rest of ata_device in ata_dev_init() which is called whenever a probing event occurs. dev->ering is about to be used to track probing failures so it needs to remain persistent over multiple porbing events. This patch achieves this by doing the following. * Instead of CLEAR_OFFSET, define CLEAR_BEGIN and CLEAR_END and only clear between BEGIN and END. ering is moved after END. The split of persistent area is to allow hotter items remain at the head. * ering is explicitly cleared on ata_dev_disable() and when device attach succeeds. So, ering is persistent throug a device's life time (unless explicitly cleared of course) and also through periods inbetween disablement of an attached device and successful detection of the next one. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Tejun Heo 提交于
ata_dev_disable() is about to be more tightly integrated into EH logic. Move it to libata-eh.c. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Tejun Heo 提交于
The dev->pio_mode > XFER_PIO_0 test is there to avoid unnecessary speed down warning messages but it accidentally disabled SATA link spd down during configuration phase after reset where PIO mode is always zero. This patch fixes the problem by moving the test where it belongs. This makes libata probing sequence behave better when the connection is flaky at higher link speeds which isn't too uncommon for eSATA devices. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
- 29 12月, 2008 2 次提交
-
-
由 Tejun Heo 提交于
ata_port_detach() first made sure EH saw ATA_PFLAG_UNLOADING and then assumed EH context belongs to it and performed detach operation itself. However, UNLOADING doesn't disable all of EH and this could lead to problems including triggering WARN_ON()'s in EH path. This patch makes port detach behave more like other EH actions such that ata_port_detach() requests EH to detach and waits for completion. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Tejun Heo 提交于
There currently are the following looping constructs. * __ata_port_for_each_link() for all available links * ata_port_for_each_link() for edge links * ata_link_for_each_dev() for all devices * ata_link_for_each_dev_reverse() for all devices in reverse order Now there's a need for looping construct which is similar to __ata_port_for_each_link() but iterates over PMP links before the host link. Instead of adding another one with long name, do the following cleanup. * Implement and export ata_link_next() and ata_dev_next() which take @mode parameter and can be used to build custom loop. * Implement ata_for_each_link() and ata_for_each_dev() which take looping mode explicitly. The following iteration modes are implemented. * ATA_LITER_EDGE : loop over edge links * ATA_LITER_HOST_FIRST : loop over all links, host link first * ATA_LITER_PMP_FIRST : loop over all links, PMP links first * ATA_DITER_ENABLED : loop over enabled devices * ATA_DITER_ENABLED_REVERSE : loop over enabled devices in reverse order * ATA_DITER_ALL : loop over all devices * ATA_DITER_ALL_REVERSE : loop over all devices in reverse order This change removes exlicit device enabledness checks from many loops and makes it clear which ones are iterated over in which direction. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
- 11 11月, 2008 1 次提交
-
-
由 Tejun Heo 提交于
ehc->last_reset is used to ensure that resets are not issued too close to each other. It's initialized to jiffies minus one minute on EH entry. However, when new links are initialized after PMP is probed, new links have zero for this timestamp resulting in long wait depending on the current jiffies. This patch makes last_set considered iff ATA_EHI_DID_RESET is set, in which case last_reset is always initialized. As an added precaution, WARN_ON() is added so that warning is printed if last_reset is in future. This problem is spotted and debugged by Shane Huang. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Shane Huang <Shane.Huang@amd.com> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
- 28 10月, 2008 2 次提交
-
-
由 Tejun Heo 提交于
libata EH saves xfer_mode and ncq_enabled at start to later set DUBIOUS_XFER flag if it has changed. These values need to be cleared on device detach such that hot device swap doesn't accidentally miss DUBIOUS_XFER. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Tejun Heo 提交于
There were several places where only enabled devices should be iterated over but device enabledness wasn't checked. * IDENTIFY data 40 wire check in cable_is_40wire() * xfer_mode/ncq_enabled saving in ata_scsi_error() * DUBIOUS_XFER handling in ata_set_mode() While at it, reformat comments in cable_is_40wire(). Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
- 23 10月, 2008 3 次提交
-
-
由 Tejun Heo 提交于
Reset methods don't have access to phys link status for slave links and may incorrectly indicate device presence causing unnecessary probe failures for unoccupied links. This patch clears device class to NONE during post-reset processing if phys link is offline. As on/offlineness semantics is strictly defined and used in multiple places by the core layer, this won't change behavior for drivers which don't use slave links. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Tejun Heo 提交于
Slave link action mask is transferred to master link and all the EH actions are taken by the master link. ata_eh_about_to_do() and ata_eh_done() are called with ATA_EH_ALL_ACTIONS to clear the slave link actions during transfer. This always sets ATA_PFLAG_RECOVERED flag causing spurious "EH complete" messages. Don't set ATA_PFLAG_RECOVERED for slave link actions. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Tejun Heo 提交于
ATA_EHI_NO_AUTOPSY and ATA_EHI_QUIET are used to control the behavior of EH. As only the master link is visible outside EH, these flags are set only for the master link although they should also apply to the slave link, which causes spurious EH messages during probe and suspend/resume. This patch transfers those two flags to slave ehc.i before performing slave autopsy and reporting. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
- 09 10月, 2008 1 次提交
-
-
由 Jens Axboe 提交于
Right now SCSI and others do their own command timeout handling. Move those bits to the block layer. Instead of having a timer per command, we try to be a bit more clever and simply have one per-queue. This avoids the overhead of having to tear down and setup a timer for each command, so it will result in a lot less timer fiddling. Signed-off-by: NMike Anderson <andmike@linux.vnet.ibm.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
- 29 9月, 2008 3 次提交
-
-
由 Tejun Heo 提交于
Resets make ATAPI devices raise UNIT ATTENTION which fails the next command. As resets can happen asynchronously for unrelated reasons, this sometimes disrupts innocent users. For example, reading DVD fails after the system wakes up from suspend or the other device sharing the channel went through bus error. Clearing UA has some problems as it might clear UA which the userland needs to know about. However, UA after resets can only be about the reset itself and benefits of clearing it overweights cons. Missing UA can only delay failure to one of the following commands anyway. For example, timeout while burning is in progress will trigger reset and reset the device state and probably corrupt the burning run. Although the userland application won't get the UA, its pending writes will fail. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Elias Oltmanns 提交于
On user request (through sysfs), the IDLE IMMEDIATE command with UNLOAD FEATURE as specified in ATA-7 is issued to the device and processing of the request queue is stopped thereafter until the specified timeout expires or user space asks to resume normal operation. This is supposed to prevent the heads of a hard drive from accidentally crashing onto the platter when a heavy shock is anticipated (like a falling laptop expected to hit the floor). In fact, the whole port stops processing commands until the timeout has expired in order to avoid any resets due to failed commands on another device. Signed-off-by: NElias Oltmanns <eo@nebensachen.de> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Tejun Heo 提交于
Explanation taken from the comment of ata_slave_link_init(). In libata, a port contains links and a link contains devices. There is single host link but if a PMP is attached to it, there can be multiple fan-out links. On SATA, there's usually a single device connected to a link but PATA and SATA controllers emulating TF based interface can have two - master and slave. However, there are a few controllers which don't fit into this abstraction too well - SATA controllers which emulate TF interface with both master and slave devices but also have separate SCR register sets for each device. These controllers need separate links for physical link handling (e.g. onlineness, link speed) but should be treated like a traditional M/S controller for everything else (e.g. command issue, softreset). slave_link is libata's way of handling this class of controllers without impacting core layer too much. For anything other than physical link handling, the default host link is used for both master and slave. For physical link handling, separate @ap->slave_link is used. All dirty details are implemented inside libata core layer. From LLD's POV, the only difference is that prereset, hardreset and postreset are called once more for the slave link, so the reset sequence looks like the following. prereset(M) -> prereset(S) -> hardreset(M) -> hardreset(S) -> softreset(M) -> postreset(M) -> postreset(S) Note that softreset is called only for the master. Softreset resets both M/S by definition, so SRST on master should handle both (the standard method will work just fine). As slave_link excludes PMP support and only code paths which deal with the attributes of physical link are affected, all the changes are localized to libata.h, libata-core.c and libata-eh.c. * ata_is_host_link() updated so that slave_link is considered as host link too. * iterator extended to iterate over the slave_link when using the underbarred version. * force param handling updated such that devno 16 is mapped to the slave link/device. * ata_link_on/offline() updated to return the combined result from master and slave link. ata_phys_link_on/offline() are the direct versions. * EH autopsy and report are performed separately for master slave links. Reset is udpated to implement the above described reset sequence. Except for reset update, most changes are minor, many of them just modifying dev->link to ata_dev_phys_link(dev) or using phys online test instead. After this update, LLDs can take full advantage of per-dev SCR registers by simply turning on slave link. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
- 22 8月, 2008 4 次提交
-
-
由 Tejun Heo 提交于
SError belongs to link not port. Use ata_link_printk() to print it. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Tejun Heo 提交于
As an optimization, follow-up SRST used to be skipped if classification wasn't requested even when hardreset requested it via -EAGAIN. However, some hardresets can't wait for device readiness and skipping SRST can cause timeout or other failures during revalidation. Always perform follow-up SRST if hardreset returns -EAGAIN. This makes reset paths more predictable and thus less error-prone. While at it, move hardreset error checking such that it's done right after hardreset is finished. This simplifies followup SRST condition check a bit and makes the reset path easier to modify. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Tejun Heo 提交于
ehc->i.action got accidentally overwritten to ATA_EH_HARD/SOFTRESET in ata_eh_reset(). The original intention was to clear reset action which wasn't selected. This can cause unexpected behavior when other EH actions are scheduled together with reset. Fix it. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Tejun Heo 提交于
Implement force params nohrst, nosrst and norst. This is to work around reset related problems and ease debugging. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
- 15 7月, 2008 5 次提交
-
-
由 Tejun Heo 提交于
Update atapi_eh_request_sense() to take @dev, @sense_buf and @dfl_sense_key instead of taking @qc and extracting information from it. This change is to make the function more generic and allow it to be called from other places. While at it, make cdb initialization use initializer. Signed-off-by: NTejun Heo <htejun@gmail.com> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Tejun Heo 提交于
ATA_TMOUT_INTERNAL which was 30secs were used for all internal commands which is way too long when something goes wrong. This patch implements command type based stepped timeouts. Different command types can use different timeouts and each command type can use different timeout values after timeouts. ie. the initial timeout is set to a value which should cover most of the cases but not too long so that run away cases don't delay things too much. After the first try times out, the second try can use longer timeout and if that one times out too, it can go for full 30sec timeout. IDENTIFYs use 5s - 10s - 30s timeout and all other commands use 5s - 10s timeouts. This patch significantly cuts down the needed time to handle failure cases while still allowing libata to work with nut job devices through retries. Signed-off-by: NTejun Heo <htejun@gmail.com> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Tejun Heo 提交于
This doesn't introduce any functional changes. This is to make reset timeout table consistent with to-be-added command timeout tables. Signed-off-by: NTejun Heo <htejun@gmail.com> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Tejun Heo 提交于
EH retries were delayed by 5 seconds to ensure that resets don't occur back-to-back. However, this 5 second delay is superflous or excessive in many cases. For example, after IDENTIFY times out, there's no reason to wait five more seconds before retrying. This patch adds ehc->last_reset timestamp and record the timestamp for the last reset trial or success and uses it to space resets by ATA_EH_RESET_COOL_DOWN which is 5 secs and removes unconditional 5 sec sleeps. As this change makes inter-try waits often shorter and they're redundant in nature, this patch also removes the "retrying..." messages. While at it, convert explicit rounding up division to DIV_ROUND_UP(). This change speeds up EH in many cases w/o sacrificing robustness. Signed-off-by: NTejun Heo <htejun@gmail.com> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Tejun Heo 提交于
libata has been using mix of jiffies and msecs for time druations. This is getting confusing. As writing sub HZ values in jiffies is PITA and msecs_to_jiffies() can't be used as initializer, unify unit for all time durations to msecs. So, durations are in msecs and deadlines are in jiffies. ata_deadline() is added to compute deadline from a start time and duration in msecs. While at it, drop now superflous _msec suffix from arguments and rename @timeout to @deadline if it represents a fixed point in time rather than duration. Signed-off-by: NTejun Heo <htejun@gmail.com> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
- 20 5月, 2008 4 次提交
-
-
由 Tejun Heo 提交于
No reason to get overzealous about recovered comm and data errors. Some PHYs habitually sets them w/o no good reason and being draconian about these soft error conditions doesn't seem to help anybody. If need ever rises, we might need to add soft PHY error condition, say AC_ERR_MAYBE_ATA_BUS and use it only to determine whether speed down is necessary but I don't think that's very likely to happen. It's far more likely we'll get timeouts or fatal transmission errors if recovered errors are so prominent that they hamper operation. Signed-off-by: NTejun Heo <htejun@gmail.com> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Tejun Heo 提交于
Originally, whole reset processing was done while the port is frozen and SError was cleared during @postreset(). This had two race conditions. 1: hotplug could occur after reset but before SError is cleared and libata won't know about it. 2: hotplug could occur after all the reset is complete but before the port is thawed. As all events are cleared on thaw, the hotplug event would be lost. Commit ac371987 kills the first race by clearing SError during link resume but before link onlineness test. However, this doesn't fix race #2 and in some cases clearing SError after SRST is a good idea. This patch solves this problem by cross checking link onlineness with classification result after SError is cleared and port is thawed. Reset is retried if link is online but all devices attached to the link are unknown. As all devices will be revalidated, this one-way check is enough to ensure that all devices are detected and revalidated reliably. This, luckily, also fixes the cases where host controller returns bogus status while harddrive is spinning up after hotplug making classification run before the device sends the first FIS and thus causes misdetection. Low level drivers can bypass the logic by setting class explicitly to ATA_DEV_NONE if ever necessary (currently none requires this). Signed-off-by: NTejun Heo <htejun@gmail.com> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Tejun Heo 提交于
Previously reset freeze/thaw handling lived outside of ata_eh_reset() mainly because the original PMP reset code needed the port frozen while resetting all the fan-out ports, which is no longer the case. This patch moves freeze/thaw handling into ata_eh_reset(). @prereset() and @postreset() are now called w/o freezing the port although @prereset() an be called frozen if the port is frozen prior to entering ata_eh_reset(). This makes code simpler and will help removing hotplug event related races. Signed-off-by: NTejun Heo <htejun@gmail.com> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Tejun Heo 提交于
Reorganize ata_eh_reset() such that @prereset() is called even when no reset method is available and if block is used instead of goto to skip actual reset. This makes no reset case behave better (readiness wait) and future changes easier. Signed-off-by: NTejun Heo <htejun@gmail.com> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
- 06 5月, 2008 1 次提交
-
-
由 Mark Lord 提交于
Export ata_eh_analyze_ncq_error() for subsequent use by sata_mv, as suggested by Tejun. Signed-off-by: NMark Lord <mlord@pobox.com> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
- 25 4月, 2008 1 次提交
-
-
由 Mark Lord 提交于
Fix mis-reporting of NCQ errors by ensuring that result_tf->flags is properly initialized in libata-eh. This allows ata_gen_ata_sense() to report the failed block number correctly to SCSI after a media error during NCQ. This patch may also be a candidate for backporting to earlier kernels. Without this fix, SCSI will fail I/O on the entire request rather than just the bad sector. That can be bad for a request that was merged from many independent read reads from different tasks. Signed-off-by: NMark Lord <mlord@pobox.com> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
- 18 4月, 2008 4 次提交
-
-
由 Tejun Heo 提交于
When no reset method is available, libata currently oopses. Although the condition can't happen unless there's a bug in a low level driver, oopsing isn't the best way to report the error condition. Complain, dump stack and fail reset instead. Signed-off-by: NTejun Heo <htejun@gmail.com> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Tejun Heo 提交于
Currently, SATA softresets should do link onlineness check before actually performing SRST protocol but it doesn't really belong to softreset. This patch moves onlineness check in softreset to ata_eh_reset() and ata_eh_followup_srst_needed() to clean up code and help future sata_mv changes which need clear separation between SCR and TF accesses. sata_fsl is peculiar in that its softreset really isn't softreset but combination of hardreset and softreset. This patch adds dummy private ->prereset to keep the current behavior but the driver really should implement separate hard and soft resets and return -EAGAIN from hardreset if it should be follwed by softreset. Signed-off-by: NTejun Heo <htejun@gmail.com> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Tejun Heo 提交于
Some code paths which had been made obsolete by recent reset simplification were still around. Kill them. * ata_eh_reset() checked for ATA_DEV_UNKNOWN to determine classification failure. This is no longer applicable. * ata_do_reset() should convert ATA_DEV_UNKNOWN to ATA_DEV_NONE regardless of reset result (e.g. -EAGAIN). * LLDs don't need to convert ATA_DEV_UNKNOWN to ATA_DEV_NONE. Signed-off-by: NTejun Heo <htejun@gmail.com> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Tejun Heo 提交于
Implement helpers to test whether PMP is supported, attached and determine pmp number to use when issuing SRST to a link. While at it, move ata_is_host_link() so that it's together with the two new PMP helpers. This change simplifies LLDs and helps making PMP support optional. Signed-off-by: NTejun Heo <htejun@gmail.com>
-