- 02 Mar, 2008 2 commits
-
-
Committed by Stefan Richter
"modprobe firewire-ohci; sleep .1; modprobe -r firewire-ohci" used to result in crashes like this:

BUG: unable to handle kernel paging request at ffffffff8807b455
IP: [<ffffffff8807b455>]
PGD 203067 PUD 207063 PMD 7c170067 PTE 0
Oops: 0010 [1] PREEMPT SMP
CPU 0
Modules linked in: i915 drm cpufreq_ondemand acpi_cpufreq freq_table applesmc input_polldev led_class coretemp hwmon eeprom snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss button thermal processor sg snd_hda_intel snd_pcm snd_timer snd snd_page_alloc sky2 i2c_i801 rtc [last unloaded: crc_itu_t]
Pid: 9, comm: events/0 Not tainted 2.6.25-rc2 #3
RIP: 0010:[<ffffffff8807b455>] [<ffffffff8807b455>]
RSP: 0018:ffff81007dcdde88 EFLAGS: 00010246
RAX: ffff81007dc95040 RBX: ffff81007dee5390 RCX: 0000000000005e13
RDX: 0000000000008c8b RSI: 0000000000000001 RDI: ffff81007dee5388
RBP: ffff81007dc5eb40 R08: 0000000000000002 R09: ffffffff8022d05c
R10: ffffffff8023b34c R11: ffffffff8041a353 R12: ffff81007dee5388
R13: ffffffff8807b455 R14: ffffffff80593bc0 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffffffff8055a000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: ffffffff8807b455 CR3: 0000000000201000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process events/0 (pid: 9, threadinfo ffff81007dcdc000, task ffff81007dc95040)
Stack: ffffffff8023b396 ffffffff88082524 0000000000000000 ffffffff8807d9ae
 ffff81007dc5eb40 ffff81007dc9dce0 ffff81007dc5eb40 ffff81007dc5eb80
 ffff81007dc9dce0 ffffffffffffffff ffffffff8023be87 0000000000000000
Call Trace:
 [<ffffffff8023b396>] ? run_workqueue+0xdf/0x1df
 [<ffffffff8023be87>] ? worker_thread+0xd8/0xe3
 [<ffffffff8023e917>] ? autoremove_wake_function+0x0/0x2e
 [<ffffffff8023bdaf>] ? worker_thread+0x0/0xe3
 [<ffffffff8023e813>] ? kthread+0x47/0x74
 [<ffffffff804198e0>] ? trace_hardirqs_on_thunk+0x35/0x3a
 [<ffffffff8020c008>] ? child_rip+0xa/0x12
 [<ffffffff8020b6e3>] ? restore_args+0x0/0x3d
 [<ffffffff8023e68a>] ? kthreadd+0x14c/0x171
 [<ffffffff8023e68a>] ? kthreadd+0x14c/0x171
 [<ffffffff8023e7cc>] ? kthread+0x0/0x74
 [<ffffffff8020bffe>] ? child_rip+0x0/0x12
Code: Bad RIP value.
RIP [<ffffffff8807b455>]
 RSP <ffff81007dcdde88>
CR2: ffffffff8807b455
---[ end trace c7366c6657fe5bed ]---

Note that this crash happened _after_ firewire-core was unloaded. The shared workqueue tried to run firewire-core's device initialization jobs or similar jobs.

The fix makes sure that firewire-ohci and hence firewire-core is not unloaded before all device shutdown jobs have been completed. This is determined by the count of device initializations minus device releases.

Also skip useless retries in the node initialization job if the node is to be shut down.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jarod Wilson <jwilson@redhat.com>
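
The "initializations minus releases" rule can be illustrated in user space. The following is only a sketch under assumptions: the names `live_devices`, `device_initialized`, `device_released` and `safe_to_unload` are hypothetical, and C11 atomics stand in for whatever counter primitive the driver actually uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a driver-wide count of live devices. */
static atomic_int live_devices;

/* Called once a device initialization job has been set up. */
static void device_initialized(void)
{
    atomic_fetch_add(&live_devices, 1);
}

/* Called from the device release path; returns the remaining count. */
static int device_released(void)
{
    int before = atomic_fetch_sub(&live_devices, 1);

    assert(before > 0); /* a release without a matching init is a bug */
    return before - 1;
}

/* Unloading may proceed only once every initialized device was released. */
static bool safe_to_unload(void)
{
    return atomic_load(&live_devices) == 0;
}
```

A real unload path would sleep until the count reaches zero rather than poll; the sketch only shows the accounting.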
-
Committed by Stefan Richter
Patch "firewire: fw-sbp2: fix NULL pointer deref. in scsi_remove_device" had the unintended effect that firewire-sbp2 could not be unloaded anymore until all SBP-2 devices were unplugged.

We now fix the NULL pointer bug by reacquiring a reference to the sdev instead of holding a reference to the sdev (and to the module) all the time.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Tested-by: Jarod Wilson <jwilson@redhat.com>
-
- 20 Feb, 2008 3 commits
-
-
Committed by Stefan Richter
Fix a kernel bug when unplugging an SBP-2 device whose scsi_device was already removed via the "delete" sysfs attribute.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
-
Committed by Stefan Richter
Fix a kernel bug when running rescan-scsi-bus while a FireWire disk is connected: http://bugzilla.kernel.org/show_bug.cgi?id=10008

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
-
Committed by Stefan Richter
While fw-sbp2 takes the necessary time to reconnect to a logical unit after bus reset, the SCSI core keeps sending new commands. They are all immediately completed with host busy status, and application clients or filesystems will break quickly. The SCSI device might even be taken offline: http://bugzilla.kernel.org/show_bug.cgi?id=9734

The only remedy seems to be to block the SCSI device until reconnect. Alas, the SCSI core has no useful API to block only one logical unit, i.e. the scsi_device; therefore we block the entire Scsi_Host. This currently corresponds to an SBP-2 target. In case of targets with multiple logical units, we need to satisfy the dependencies between logical units by carefully tracking the blocking state of the target and its units. We block all logical units of a target as soon as one of them needs to be blocked, and keep them blocked until all of them are ready to be unblocked.

Furthermore, as the history of the old sbp2 driver has shown, the scsi_block_requests() API is a minefield with high potential of deadlocks. We therefore take extra measures to keep logical units unblocked during __scsi_add_device() and during shutdown.

This avoids I/O errors during reconnect in many but alas not in all cases. There may still be errors after a re-login had to be performed. Also, some bridges have been seen to cease fetching management ORBs if I/O went on up until a bus reset. In these cases, all management ORBs time out after mgt_orb_timeout. The old sbp2 driver is less vulnerable or maybe not vulnerable to this, for as yet unknown reasons.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
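
The "block all units as soon as one needs it, unblock when all are ready" bookkeeping can be sketched as a per-target counter. This is a user-space illustration only: `struct target`, `lun_block`, `lun_unblock` and `MAX_LUNS` are hypothetical names, and the `host_blocked` flag merely models the effect of blocking the Scsi_Host.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_LUNS 8 /* hypothetical limit, for the sketch only */

struct target {
    bool lun_blocked[MAX_LUNS]; /* which units currently need blocking */
    unsigned int blocked_count; /* number of units that need blocking */
    bool host_blocked;          /* models the blocked state of the host */
};

/* A unit needs blocking: the first such unit blocks the whole host. */
static void lun_block(struct target *t, unsigned int lun)
{
    assert(lun < MAX_LUNS);
    if (!t->lun_blocked[lun]) {
        t->lun_blocked[lun] = true;
        t->blocked_count++;
    }
    t->host_blocked = true;
}

/* A unit is ready again: the host is unblocked only by the last one. */
static void lun_unblock(struct target *t, unsigned int lun)
{
    assert(lun < MAX_LUNS);
    if (t->lun_blocked[lun]) {
        t->lun_blocked[lun] = false;
        t->blocked_count--;
    }
    if (t->blocked_count == 0)
        t->host_blocked = false;
}
```

The per-unit flag makes repeated block or unblock calls for the same unit harmless, which is the property that keeps the counter balanced.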
-
- 16 Feb, 2008 10 commits
-
-
Committed by Stefan Richter
fw-sbp2 is unable to reconnect while performing __scsi_add_device because there is only a single workqueue thread context available for both at the moment. This should be fixed eventually.

An actual failure of __scsi_add_device is easy to handle, but an incomplete execution of __scsi_add_device with an sdev returned would remain undetected and leave the SBP-2 target unusable.

Therefore we use a workaround: If there was a bus reset during __scsi_add_device (i.e. during the SCSI probe), we remove the new sdev immediately, log out, and attempt login and SCSI probe again.

Tested-by: Jarod Wilson <jwilson@redhat.com> (earlier version)
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
-
Committed by Stefan Richter
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
-
Committed by Stefan Richter
If fw-sbp2 was too late with requesting the reconnect, the target would reject this. In this case, log out before attempting the reconnect. Otherwise several firmwares will deny the re-login because they somehow didn't invalidate the old login.

Also, don't retry reconnects in this situation. The retries won't succeed either.

These changes improve chances for successful re-login and shorten the period during which the logical unit is inaccessible.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jarod Wilson <jwilson@redhat.com>
-
Committed by Stefan Richter
When a reconnect failed but re-login succeeded, __scsi_add_device was called again. In those cases, __scsi_add_device succeeded and returned the pointer to the existing scsi_device. fw-sbp2 then continued orderly, except that it failed to call sbp2_cancel_orbs. SCSI core would call fw-sbp2's eh_abort_handler eventually if there had been an outstanding command.

This patch avoids the needless lookups and temporary allocations in SCSI core, as well as the I/O stall and timeout until eh_abort_handler hits.

Also, __scsi_add_device tolerating calls for devices which already exist is undocumented behavior on which we shouldn't rely.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jarod Wilson <jwilson@redhat.com>
-
Committed by Stefan Richter
for more easily readable logs if more than one SBP-2 device is present.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jarod Wilson <jwilson@redhat.com>
-
Committed by Stefan Richter
Like the old sbp2 driver, wait for the write transaction to the AGENT_RESET to complete before proceeding (after login, after reconnect, or in SCSI error handling).

There is one occasion where AGENT_RESET is written to from atomic context, when getting DEAD status for a command ORB. There we still continue without waiting for the transaction to complete because this is more difficult to fix...

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
-
Committed by Stefan Richter
Several different SBP-2 bridges accept a login early while the IDE device is still powering up. They are therefore unable to respond to SCSI INQUIRY immediately, and the SCSI core has to retry the INQUIRY. One of these retries is typically successful, and all is well.

But in case of Momobay FX-3A, the INQUIRY retries tend to fail entirely. This can usually be avoided by waiting a little while after login before letting the SCSI core send the INQUIRY.

The old sbp2 driver handles this more gracefully for as yet unknown reasons (perhaps because it waits for fetch agent resets to complete, unlike fw-sbp2 which quickly proceeds after requesting the agent reset). Therefore the workaround is not as necessary for sbp2.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jarod Wilson <jwilson@redhat.com>
-
Committed by Stefan Richter
If a device is being unplugged while fw-sbp2 had a login or reconnect on schedule, it would take about half a minute to shut the fw_unit down:

Jan 27 18:34:54 stein firewire_sbp2: logged in to fw2.0 LUN 0000 (0 retries)
<unplug>
Jan 27 18:34:59 stein firewire_sbp2: sbp2_scsi_abort
Jan 27 18:34:59 stein scsi 25:0:0:0: Device offlined - not ready after error recovery
Jan 27 18:35:01 stein firewire_sbp2: orb reply timed out, rcode=0x11
Jan 27 18:35:06 stein firewire_sbp2: orb reply timed out, rcode=0x11
Jan 27 18:35:12 stein firewire_sbp2: orb reply timed out, rcode=0x11
Jan 27 18:35:17 stein firewire_sbp2: orb reply timed out, rcode=0x11
Jan 27 18:35:22 stein firewire_sbp2: orb reply timed out, rcode=0x11
Jan 27 18:35:27 stein firewire_sbp2: orb reply timed out, rcode=0x11
Jan 27 18:35:32 stein firewire_sbp2: orb reply timed out, rcode=0x11
Jan 27 18:35:32 stein firewire_sbp2: failed to login to fw2.0 LUN 0000
Jan 27 18:35:32 stein firewire_sbp2: released fw2.0

After this patch, typically only a few seconds spent in __scsi_add_device remain:

Jan 27 19:05:50 stein firewire_sbp2: logged in to fw2.0 LUN 0000 (0 retries)
<unplug>
Jan 27 19:05:56 stein firewire_sbp2: sbp2_scsi_abort
Jan 27 19:05:56 stein scsi 33:0:0:0: Device offlined - not ready after error recovery
Jan 27 19:05:56 stein firewire_sbp2: released fw2.0

The benefit of this is less noise in the syslog. It furthermore avoids a few wasted CPU cycles and a needlessly prolonged lifetime of a few driver objects.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jarod Wilson <jwilson@redhat.com>
-
Committed by Stefan Richter
This fixes a "can't recognize device" kind of bug.

If the SCSI INQUIRY failed and hence __scsi_add_device failed due to a bus reset, we tried a logout and then waited for the already scheduled login work to happen. So far so good, but the generation used for the logout was outdated, hence the logout never reached the target. The target might therefore deny the subsequent relogin attempt, which would also leave the target inaccessible.

Therefore fetch a fresh device->generation for the logout. Use memory barriers to prevent our plan being foiled by compiler or hardware optimizations.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
-
Committed by Stefan Richter
Standardize on "unsigned int" style. Sort some struct members thematically.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
-
- 31 Jan, 2008 9 commits
-
-
Committed by Jarod Wilson
To be more compliant with section 7.4.8 of the SBP-2 specification, use the mgt_ORB_timeout specified in the SBP-2 device's config rom for login ORB attempts (though with some sanity checks).

A happy side-effect is that certain device and controller combinations that sometimes take more than 20 seconds to get synced up (like my laptop with just about any SBP-2 device) now function more reliably.

Signed-off-by: Jarod Wilson <jwilson@redhat.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> (silenced sparse)
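
A sanity check of this kind usually amounts to clamping the advertised value. The sketch below is an illustration under assumptions, not the patch itself: the 500 ms unit of the config rom field is my reading of the SBP-2 unit characteristics entry, and the bounds of 5 s and 40 s as well as all names are hypothetical.

```c
#include <assert.h>

#define ROM_TIMEOUT_UNIT_MS    500u /* assumed unit of the config rom field */
#define LOGIN_TIMEOUT_MIN_MS  5000u /* assumed lower bound */
#define LOGIN_TIMEOUT_MAX_MS 40000u /* assumed upper bound */

/* Convert the advertised timeout to milliseconds and clamp it. */
static unsigned int login_timeout_ms(unsigned int rom_units)
{
    unsigned int ms = rom_units * ROM_TIMEOUT_UNIT_MS;

    if (ms < LOGIN_TIMEOUT_MIN_MS)
        return LOGIN_TIMEOUT_MIN_MS;
    if (ms > LOGIN_TIMEOUT_MAX_MS)
        return LOGIN_TIMEOUT_MAX_MS;
    return ms;
}
```

Clamping from below protects against devices that advertise zero or an implausibly short timeout; clamping from above keeps a bogus entry from stalling login for minutes.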
-
Committed by Jarod Wilson
Increase (and rename) the login orb reply timeout value to 20s to match that of the old firewire stack. 2s simply didn't give many devices enough time to spin up and reply.

Fixes inability to recognize some devices. Failure mode was "orb reply timed out"/"failed to login".

Signed-off-by: Jarod Wilson <jwilson@redhat.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> (style, comments, changelog)
-
Committed by Stefan Richter
fw_device.node_id and fw_device.generation are accessed without mutexes. We have to ensure that all readers will get to see node_id updates before generation updates.

Fixes an inability to recognize devices after "giving up on config rom", https://bugzilla.redhat.com/show_bug.cgi?id=429950

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>

Reviewed by Nick Piggin <nickpiggin@yahoo.com.au>.

Verified to fix 'giving up on config rom' issues on multiple system and drive combinations that were previously affected.

Signed-off-by: Jarod Wilson <jwilson@redhat.com>
Signed-off-by: Kristian Høgsberg <krh@redhat.com>
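
The ordering requirement can be sketched with C11 release/acquire atomics, used here as a user-space stand-in for the kernel's write and read barriers. `struct device_ids`, `update_after_bus_reset` and `read_ids` are hypothetical names for this illustration.

```c
#include <stdatomic.h>

struct device_ids {
    atomic_int node_id;
    atomic_int generation;
};

/* Writer: publish the node ID first, then the generation. */
static void update_after_bus_reset(struct device_ids *d,
                                   int node_id, int generation)
{
    atomic_store_explicit(&d->node_id, node_id, memory_order_relaxed);
    /* Release: the node_id store above is visible before this one. */
    atomic_store_explicit(&d->generation, generation, memory_order_release);
}

/* Reader: fetch the generation first, then the node ID. */
static void read_ids(struct device_ids *d, int *node_id, int *generation)
{
    /* Acquire: pairs with the release store in the writer. */
    *generation = atomic_load_explicit(&d->generation, memory_order_acquire);
    *node_id = atomic_load_explicit(&d->node_id, memory_order_relaxed);
}
```

A reader that observes a given generation therefore sees a node ID at least as new as the one written for that generation; it may see a newer node ID, but never an older one.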
-
Committed by Stefan Richter
There was a small window where a login or reconnect job could use an already updated card generation with an outdated node ID. We have to use the fw_device.generation here, not the fw_card.generation, because the generation must never be newer than the node ID when we emit a transaction. This cannot be guaranteed with fw_card.generation.

Furthermore, the target's and initiator's node IDs can be obtained from fw_device and fw_card. Dereferencing their underlying topology objects is not necessary.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>

Verified in concert with subsequent memory barriers patch to fix 'giving up on config rom' issues on multiple system and drive combinations that were previously affected.

Signed-off-by: Jarod Wilson <jwilson@redhat.com>
-
Committed by Stefan Richter
Ask the target to grant a 4 second window after bus reset for reconnection instead of the standard and minimum window of 1 second. This accelerates reconnection if there is more than one target on the bus: If a login and inquiry to one target blocks the fw-sbp2 workqueue for more than 1s after bus reset, we now still can reconnect to the other target. Before that, fw-sbp2's reconnect attempts would be rejected with "error status: 0:9" (function rejected), and fw-sbp2 would finally re-login. All those futile reconnect attempts cost extra time until the target which needs re-login is ready for I/O again.

The reconnect timeout field in the login ORB doesn't have to be honored by the target though. I found that we could get up to
- allegedly 32768s from an old OXFW911 firmware,
- 256s from LSI bridges,
- 4s from OXUF922 and OXFW912 bridges,
- 2s from TI bridges,
- only the standard 1s from Initio and Prolific bridges and from Apple OpenFirmware in target mode.

We just try to get 4 seconds, which already covers the case of a few HDDs on the same bus quite nicely.

A minor drawback occurs in the following (rare and impractical) border case:
- two initiators are there, initiator 1 holds an exclusive login to a target,
- initiator 1 goes off the bus,
- target refuses login attempts from initiator 2 until reconnect_hold seconds after bus reset.

An alternative approach to the issue at hand would be to parallelize fw-sbp2's reconnect and login work.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Acked-by: Jarod Wilson <jwilson@redhat.com>
-
Committed by Stefan Richter
Don't attempt to send a logout ORB if the target was already unplugged or had its link switched off. If two targets are attached, this enhances the chance to quickly reconnect to the remaining target when one target is unplugged.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Acked-by: Jarod Wilson <jwilson@redhat.com>
-
Committed by Stefan Richter
SBP2_MAX_SECTORS is used nowhere in fw-sbp2. It merely got copied over from sbp2 where it played a role in the past.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
-
Committed by Stefan Richter
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
-
Committed by Stefan Richter
This somewhat reduces the size of firewire-sbp2.ko.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
-
- 12 Jan, 2008 1 commit
-
-
Committed by James Bottomley
This patch relaxes the default SCSI DMA alignment from 512 bytes to 4 bytes. I remember from previous discussions that usb and firewire have sector size alignment requirements, so I upped their alignments in the respective slave allocs.

The reason for doing this is so that we don't get such a huge amount of copy overhead in bio_copy_user() for udev (basically all inquiries it issues can now be directly mapped).

Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
- 07 Nov, 2007 1 commit
-
-
Committed by Stefan Richter
Since patch "fw-sbp2: use an own workqueue (fix system responsiveness)" increased parallelism between fw-sbp2 and fw-core, it was possible that fw-sbp2 didn't release the SCSI device when the FireWire device was disconnected.

This happened if sbp2_update() ran during sbp2_login(), because a bus reset occurred during sbp2_login(). The sbp2_login() work would [try to] reschedule itself because it failed due to the bus reset, and it would _not_ drop its reference on the target. However, sbp2_update() would schedule sbp2_login() too before sbp2_login() rescheduled itself and hence sbp2_update() would take an additional reference. And then we would have one reference too many.

The fix is to _always_ drop the reference when leaving the sbp2_login() work. If the sbp2_login() work reschedules itself, it takes a reference, but only if it wasn't already rescheduled by sbp2_update().

Ditto in the sbp2_reconnect() work.

The resulting code is actually simpler than before: We _always_ take a reference when successfully scheduling work. And we _always_ drop a reference when leaving a workqueue job. No exceptions.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
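
The two rules (take a reference only when scheduling actually succeeds, always drop one when a job ends) can be modeled in user space. This is a single-threaded sketch with hypothetical names; `queue_work_once` only imitates the "returns false if already pending" behavior of a work queueing call, and a plain int stands in for the real reference counter.

```c
#include <stdbool.h>

struct target {
    int refs;          /* stand-in for the target's reference count */
    bool work_pending; /* models "login work is already queued" */
};

/* Models queueing work: succeeds only if the work was not pending yet. */
static bool queue_work_once(struct target *t)
{
    if (t->work_pending)
        return false;
    t->work_pending = true;
    return true;
}

/* Rule 1: take a reference only when scheduling succeeded. */
static void schedule_login(struct target *t)
{
    if (queue_work_once(t))
        t->refs++;
}

/* The job starts running, so it is no longer pending. */
static void login_work_begin(struct target *t)
{
    t->work_pending = false;
}

/* Rule 2: always drop the job's reference on exit, retry or not. */
static void login_work_end(struct target *t, bool retry)
{
    if (retry)
        schedule_login(t);
    t->refs--;
}
```

If an update path calls `schedule_login()` between `login_work_begin()` and `login_work_end(t, true)`, the retry finds the work already pending and takes no second reference, so the count stays balanced.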
-
- 17 Oct, 2007 4 commits
-
-
Committed by Stefan Richter
Firewire-sbp2 did very uncooperative things in the kernel's shared workqueue: Sleeping until reception of management status from the target for up to 2 seconds, and performing SCSI inquiry and all of the setup of SCSI command set drivers via scsi_add_device. If there were transient or permanent error conditions, this caused long blockage of the kernel's events process, noticeable e.g. by blocked keyboard input.

We now allocate a workqueue process exclusive to fw-sbp2. As a side effect, this also increases parallelism of fw-sbp2's login and reconnect work versus fw-core's device discovery and device update work which is performed in the shared workqueue.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Kristian Høgsberg <krh@redhat.com>
-
Committed by Stefan Richter
On rare occasions, the ability to set one of the workaround flags at runtime may save the day. People who experience I/O errors with firewire-sbp2 while the old sbp2 driver worked for them should try workarounds=1 and report to the devel mailing list whether that improves things.

Firewire-sbp2 defaults to the SCSI stack's maximum transfer size per command, while sbp2 limits them to 128 kBytes. Flag 1 accomplishes just that.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
-
Committed by Stefan Richter
Fixes "New firewire stack only recognizing half of a chain of drives", https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=242254

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
-
Committed by Stefan Richter
On IOMMU-less noncoherent architectures, orb->callback will memcpy the whole SCSI command buffer for READ-like SCSI commands. It is therefore friendlier to enable IRQs before the call, like before patch "Add ref-counting for sbp2 orbs".

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Acked-by: Kristian Høgsberg <krh@redhat.com>
-
- 26 Aug, 2007 1 commit
-
-
Committed by Kristian Høgsberg
This handles the case where we get the status write before getting the complete_transaction callback ("status write for unknown orb"). In this case, we just assume that the initial orb pointer transaction succeeded and finish the orb.

To prevent the transaction callback from touching freed memory, we ref-count the orb structures.

Signed-off-by: Kristian Høgsberg <krh@redhat.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
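
The idea of letting whichever party finishes last free the orb can be sketched as follows. This is a non-atomic user-space illustration with hypothetical names (`struct orb` here is not the driver's structure, and `orbs_freed` exists only to make the sketch observable); a real implementation would use an atomic reference counter.

```c
#include <assert.h>
#include <stdlib.h>

struct orb {
    int refs;
};

static int orbs_freed; /* instrumentation for the sketch only */

/* One reference for the transaction callback, one for the status handler. */
static struct orb *orb_alloc(void)
{
    struct orb *orb = malloc(sizeof(*orb));

    if (orb)
        orb->refs = 2;
    return orb;
}

/* Drop one reference; the last holder frees the orb. */
static void orb_put(struct orb *orb)
{
    assert(orb->refs > 0);
    if (--orb->refs == 0) {
        free(orb);
        orbs_freed++;
    }
}
```

With this scheme the order of the status write and the transaction callback no longer matters: the memory stays valid until both have called `orb_put()`.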
-
- 03 Aug, 2007 1 commit
-
-
Committed by Stefan Richter
As far as I know, all CardBus FireWire 400 adapters have a maximum payload of 1024 bytes which is less than the speed-dependent limit of 2048 bytes. Fw-sbp2 has to take the host adapter's limit into account.

This apparently fixes Juju's incompatibility with my CardBus cards, a NEC based card and a VIA based card.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Acked-by: Kristian Høgsberg <krh@redhat.com>
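
In essence the payload limit becomes the smaller of two values. A minimal sketch, assuming the usual IEEE 1394 relation of 512 bytes at S100 doubling with each speed step; the function and parameter names are hypothetical and the driver's actual encoding of these limits is not shown here.

```c
/*
 * speed_code: 0 = S100, 1 = S200, 2 = S400.
 * host_limit: largest payload in bytes the host adapter can handle.
 */
static unsigned int max_payload_bytes(unsigned int speed_code,
                                      unsigned int host_limit)
{
    unsigned int by_speed = 512u << speed_code; /* 512, 1024, 2048 */

    return by_speed < host_limit ? by_speed : host_limit;
}
```

For an S400 connection through a CardBus adapter limited to 1024 bytes this yields 1024 rather than 2048.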
-
- 20 Jul, 2007 1 commit
-
-
Committed by Kristian Høgsberg
Signed-off-by: Kristian Høgsberg <krh@redhat.com>

collapsed with fw-sbp2 patch "Drop cast to non-const char * in host template initialization." from Kristian Høgsberg

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
-
- 19 Jul, 2007 1 commit
-
-
Committed by Stefan Richter
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
-
- 10 Jul, 2007 6 commits
-
-
Committed by Stefan Richter
The CPU must not touch the buffer after it was DMA-mapped.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Kristian Høgsberg <krh@redhat.com>
-
Committed by Stefan Richter
The CPU must not touch the buffer after it was DMA-mapped.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Kristian Høgsberg <krh@redhat.com>
-
Committed by Stefan Richter
- The CPU must not touch the buffer after it was DMA-mapped.
- The size argument of dma_unmap_single(...page_table...) was bogus.
- Move a comment closer to the code to which it refers.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Kristian Høgsberg <krh@redhat.com>
-
Committed by Stefan Richter
Add a rudimentary check for the case that the page table overflows due to merging of s/g elements by the IOMMU. This would have led to overwriting of arbitrary memory.

After this change I expect that an offending command will be unsuccessfully retried until the scsi_device is taken offline by SCSI core. It's a border case and not worth implementing a recovery strategy for.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Acked-by: Kristian Høgsberg <krh@redhat.com>
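
Why merged s/g elements can overflow the table, and what a bounds check looks like, can be shown with a small model. Everything here is an assumption for illustration: the table size, the per-entry length limit, and the names `pt_entry` and `fill_page_table` are hypothetical and do not describe the driver's real layout.

```c
#define PAGE_TABLE_ENTRIES 4       /* hypothetical table size */
#define MAX_ENTRY_LENGTH   0xf000u /* hypothetical per-entry length limit */

struct pt_entry {
    unsigned long addr;
    unsigned int len;
};

/*
 * Split each s/g element into chunks of at most MAX_ENTRY_LENGTH bytes.
 * Returns the number of entries written, or -1 if the table would overflow.
 */
static int fill_page_table(struct pt_entry *pt,
                           const struct pt_entry *sg, int sg_count)
{
    int used = 0;

    for (int i = 0; i < sg_count; i++) {
        unsigned long addr = sg[i].addr;
        unsigned int len = sg[i].len;

        while (len > 0) {
            unsigned int chunk =
                len < MAX_ENTRY_LENGTH ? len : MAX_ENTRY_LENGTH;

            if (used >= PAGE_TABLE_ENTRIES)
                return -1; /* refuse instead of writing past the end */
            pt[used].addr = addr;
            pt[used].len = chunk;
            used++;
            addr += chunk;
            len -= chunk;
        }
    }
    return used;
}
```

A single large element produced by merging needs several table entries, so the number of s/g elements alone does not bound the number of entries; the check has to sit inside the splitting loop.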
-
Committed by Stefan Richter
This is required per SBP-2 clause 5.2.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Acked-by: Kristian Høgsberg <krh@redhat.com>
-
Committed by Stefan Richter
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Acked-by: Kristian Høgsberg <krh@redhat.com>
-