1. 02 Mar, 2008 2 commits
    • firewire: fix crash in automatic module unloading · 855c603d
      Committed by Stefan Richter
      "modprobe firewire-ohci; sleep .1; modprobe -r firewire-ohci" used to
      result in crashes like this:
      
          BUG: unable to handle kernel paging request at ffffffff8807b455
          IP: [<ffffffff8807b455>]
          PGD 203067 PUD 207063 PMD 7c170067 PTE 0
          Oops: 0010 [1] PREEMPT SMP
          CPU 0
          Modules linked in: i915 drm cpufreq_ondemand acpi_cpufreq freq_table applesmc input_polldev led_class coretemp hwmon eeprom snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss button thermal processor sg snd_hda_intel snd_pcm snd_timer snd snd_page_alloc sky2 i2c_i801 rtc [last unloaded: crc_itu_t]
          Pid: 9, comm: events/0 Not tainted 2.6.25-rc2 #3
          RIP: 0010:[<ffffffff8807b455>]  [<ffffffff8807b455>]
          RSP: 0018:ffff81007dcdde88  EFLAGS: 00010246
          RAX: ffff81007dc95040 RBX: ffff81007dee5390 RCX: 0000000000005e13
          RDX: 0000000000008c8b RSI: 0000000000000001 RDI: ffff81007dee5388
          RBP: ffff81007dc5eb40 R08: 0000000000000002 R09: ffffffff8022d05c
          R10: ffffffff8023b34c R11: ffffffff8041a353 R12: ffff81007dee5388
          R13: ffffffff8807b455 R14: ffffffff80593bc0 R15: 0000000000000000
          FS:  0000000000000000(0000) GS:ffffffff8055a000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
          CR2: ffffffff8807b455 CR3: 0000000000201000 CR4: 00000000000006e0
          DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
          DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
          Process events/0 (pid: 9, threadinfo ffff81007dcdc000, task ffff81007dc95040)
          Stack:  ffffffff8023b396 ffffffff88082524 0000000000000000 ffffffff8807d9ae
          ffff81007dc5eb40 ffff81007dc9dce0 ffff81007dc5eb40 ffff81007dc5eb80
          ffff81007dc9dce0 ffffffffffffffff ffffffff8023be87 0000000000000000
          Call Trace:
          [<ffffffff8023b396>] ? run_workqueue+0xdf/0x1df
          [<ffffffff8023be87>] ? worker_thread+0xd8/0xe3
          [<ffffffff8023e917>] ? autoremove_wake_function+0x0/0x2e
          [<ffffffff8023bdaf>] ? worker_thread+0x0/0xe3
          [<ffffffff8023e813>] ? kthread+0x47/0x74
          [<ffffffff804198e0>] ? trace_hardirqs_on_thunk+0x35/0x3a
          [<ffffffff8020c008>] ? child_rip+0xa/0x12
          [<ffffffff8020b6e3>] ? restore_args+0x0/0x3d
          [<ffffffff8023e68a>] ? kthreadd+0x14c/0x171
          [<ffffffff8023e68a>] ? kthreadd+0x14c/0x171
          [<ffffffff8023e7cc>] ? kthread+0x0/0x74
          [<ffffffff8020bffe>] ? child_rip+0x0/0x12
      
          Code:  Bad RIP value.
          RIP  [<ffffffff8807b455>]
          RSP <ffff81007dcdde88>
          CR2: ffffffff8807b455
          ---[ end trace c7366c6657fe5bed ]---
      
      Note that this crash happened _after_ firewire-core was unloaded.  The
      shared workqueue tried to run firewire-core's device initialization jobs
      or similar jobs.
      
      The fix makes sure that firewire-ohci and hence firewire-core is not
      unloaded before all device shutdown jobs have been completed.  This is
      determined by the count of device initializations minus device releases.
      
      Also skip useless retries in the node initialization job if the node is
      to be shut down.
      Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
      Signed-off-by: Jarod Wilson <jwilson@redhat.com>
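The unload rule described in the commit above (no unload while device initializations outnumber device releases) can be modeled in userspace roughly as follows. This is only a sketch under assumptions: the class and method names are hypothetical and are not taken from firewire-core, and a Python condition variable stands in for whatever waiting mechanism the kernel code actually uses.

```python
import threading


class DeviceCounter:
    """Hypothetical model: counts device initializations minus releases."""

    def __init__(self):
        self._count = 0
        self._cond = threading.Condition()

    def device_initialized(self):
        # Called once for every device whose initialization job was started.
        with self._cond:
            self._count += 1

    def device_released(self):
        # Called once when a device's shutdown job has finished.
        with self._cond:
            self._count -= 1
            if self._count == 0:
                self._cond.notify_all()

    def wait_until_idle(self, timeout=None):
        """Block until the count is zero; return False if the timeout expired."""
        with self._cond:
            return self._cond.wait_for(lambda: self._count == 0, timeout)
```

In this model, the module exit path would call `wait_until_idle()` before tearing anything down, so that no queued job can run code that has already been unloaded.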
    • firewire: fw-sbp2: better fix for NULL pointer dereference in scsi_remove_device · f8436158
      Committed by Stefan Richter
      Patch "firewire: fw-sbp2: fix NULL pointer deref. in scsi_remove_device"
      had the unintended effect that firewire-sbp2 could not be unloaded
      anymore until all SBP-2 devices were unplugged.
      
      We now fix the NULL pointer bug by reacquiring a reference to the sdev
      instead of holding a reference to the sdev (and to the module) all the
      time.
      Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
      Tested-by: Jarod Wilson <jwilson@redhat.com>
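The idea of reacquiring a reference at release time, instead of holding one permanently, can be illustrated with a small model. Everything here is hypothetical: `Host` is a stand-in for the SCSI host's device list and `release_logical_unit` is an invented helper, not a function of fw-sbp2 or the SCSI core.

```python
class Host:
    """Hypothetical registry standing in for the SCSI host's device list."""

    def __init__(self):
        self._devices = {}

    def add(self, lun, device):
        self._devices[lun] = device

    def delete(self, lun):
        # Models removal, e.g. through the "delete" sysfs attribute.
        self._devices.pop(lun, None)

    def lookup(self, lun):
        # Returns None if the device has already been removed.
        return self._devices.get(lun)


def release_logical_unit(host, lun, removed_log):
    """Remove the device only if a fresh lookup still finds it."""
    device = host.lookup(lun)
    if device is None:
        return False  # already gone, so there is nothing to dereference
    host.delete(lun)
    removed_log.append(device)
    return True
```

Because no long-lived reference pins the device (or the module), the driver can still be unloaded while devices are plugged in, and a device that was deleted earlier is simply not found.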
  2. 20 Feb, 2008 3 commits
    • firewire: fw-sbp2: fix NULL pointer deref. in scsi_remove_device · 33f1c6c3
      Committed by Stefan Richter
      Fix a kernel bug when unplugging an SBP-2 device after having its
      scsi_device already removed via the "delete" sysfs attribute.
      Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
    • firewire: fw-sbp2: fix NULL pointer deref. in slave_alloc · 5513c5f6
      Committed by Stefan Richter
      Fix a kernel bug when running rescan-scsi-bus while a FireWire disk is
      connected:  http://bugzilla.kernel.org/show_bug.cgi?id=10008
      Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
    • firewire: fw-sbp2: (try to) avoid I/O errors during reconnect · 2e2705bd
      Committed by Stefan Richter
      While fw-sbp2 takes the necessary time to reconnect to a logical unit
      after bus reset, the SCSI core keeps sending new commands.  They are all
      immediately completed with host busy status, and application clients or
      filesystems will break quickly.  The SCSI device might even be taken
      offline:  http://bugzilla.kernel.org/show_bug.cgi?id=9734
      
      The only remedy seems to be to block the SCSI device until reconnect.
      Alas, the SCSI core has no useful API to block only one logical unit,
      i.e. the scsi_device; therefore we block the entire Scsi_Host.  This
      currently corresponds to an SBP-2 target.  In case of targets with
      multiple logical units, we need to satisfy the dependencies between
      logical units by carefully tracking the blocking state of the target and
      its units.  We block all logical units of a target as soon as one of
      them needs to be blocked, and keep them blocked until all of them are
      ready to be unblocked.
      
      Furthermore, as the history of the old sbp2 driver has shown, the
      scsi_block_requests() API is a minefield with high potential of
      deadlocks.  We therefore take extra measures to keep logical units
      unblocked during __scsi_add_device() and during shutdown.
      
      This avoids I/O errors during reconnect in many but alas not in all
      cases.  There may still be errors after a re-login had to be performed.
      Also, some bridges have been seen to cease fetching management ORBs if
      I/O went on up until a bus reset.  In these cases, all management ORBs
      time out after mgt_orb_timeout.  The old sbp2 driver is less vulnerable
      or maybe not vulnerable to this, for as yet unknown reasons.
      Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
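The blocking rule from the commit above (block the whole target as soon as one logical unit needs it, unblock only when all units are ready) can be sketched as a tiny state model. The class and attribute names are hypothetical; the real driver tracks this state differently and calls into the SCSI core where the comments indicate.

```python
class BlockTrackingTarget:
    """Hypothetical model of one SBP-2 target with several logical units."""

    def __init__(self):
        self.blocked_units = set()  # logical units not yet ready for I/O
        self.host_blocked = False   # stands in for the Scsi_Host state

    def block(self, lun):
        self.blocked_units.add(lun)
        if not self.host_blocked:
            # A driver would call scsi_block_requests() at this point.
            self.host_blocked = True

    def unblock(self, lun):
        self.blocked_units.discard(lun)
        if self.host_blocked and not self.blocked_units:
            # A driver would call scsi_unblock_requests() at this point.
            self.host_blocked = False
```

With two logical units, the host becomes blocked when the first unit calls `block()` and stays blocked until both units have called `unblock()`.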
  3. 16 Feb, 2008 10 commits
    • firewire: fw-sbp2: enforce a retry of __scsi_add_device if bus generation changed · e80de370
      Committed by Stefan Richter
      fw-sbp2 is unable to reconnect while performing __scsi_add_device
      because there is only a single workqueue thread context available for
      both at the moment.  This should be fixed eventually.
      
      An actual failure of __scsi_add_device is easy to handle, but an
      incomplete execution of __scsi_add_device with an sdev returned would
      remain undetected and leave the SBP-2 target unusable.
      
      Therefore we use a workaround:  If there was a bus reset during
      __scsi_add_device (i.e. during the SCSI probe), we remove the new sdev
      immediately, log out, and attempt login and SCSI probe again.
      
      Tested-by: Jarod Wilson <jwilson@redhat.com> (earlier version)
      Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
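The workaround in the commit above amounts to comparing the bus generation before and after the SCSI probe. The following sketch models that check as a simple loop; the function name, its callback parameters, the attempt limit, and the handling of a plain probe failure are all assumptions made for illustration, and the real driver reschedules workqueue jobs instead of looping.

```python
def add_device_with_retry(read_generation, login, probe, remove, logout,
                          max_attempts=3):
    """Return the probed device, or None if no attempt completed cleanly."""
    for _ in range(max_attempts):
        generation = read_generation()  # generation seen before the probe
        login()
        device = probe()
        if device is None:
            # Assumed handling of an outright probe failure: log out, retry.
            logout()
            continue
        if read_generation() == generation:
            return device  # no bus reset happened during the probe
        # A bus reset happened during the probe: the device cannot be trusted.
        remove(device)
        logout()
    return None
```

The key point is that a probe which returns a device is not accepted on its own; it only counts if the generation is unchanged afterwards.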
    • firewire: fw-sbp2: sort includes · 7bb6bf7c
      Committed by Stefan Richter
      Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
    • firewire: fw-sbp2: logout and login after failed reconnect · ce896d95
      Committed by Stefan Richter
      If fw-sbp2 was too late with requesting the reconnect, the target would
      reject this.  In this case, log out before attempting the re-login.
      Otherwise several firmwares will deny the re-login because they somehow
      didn't invalidate the old login.
      
      Also, don't retry reconnects in this situation.  The retries won't
      succeed either.
      
      These changes improve chances for successful re-login and shorten the
      period during which the logical unit is inaccessible.
      Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
      Signed-off-by: Jarod Wilson <jwilson@redhat.com>
    • firewire: fw-sbp2: don't add scsi_device twice · 0fa6dfdb
      Committed by Stefan Richter
      When a reconnect failed but re-login succeeded, __scsi_add_device was
      called again.
      
      In those cases, __scsi_add_device succeeded and returned the pointer to
      the existing scsi_device.  fw-sbp2 then continued in an orderly fashion,
      except that it failed to call sbp2_cancel_orbs.  SCSI core would call
      fw-sbp2's eh_abort_handler eventually if there had been an outstanding
      command.
      
      This patch avoids the needless lookups and temporary allocations in SCSI
      core, as well as the I/O stall and timeout until eh_abort_handler hits.
      
      Also, __scsi_add_device tolerating calls for devices which already exist
      is undocumented behavior on which we shouldn't rely.
      Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
      Signed-off-by: Jarod Wilson <jwilson@redhat.com>
    • firewire: fw-sbp2: log bus_id at management request failures · 48f18c76
      Committed by Stefan Richter
      This makes logs easier to read if more than one SBP-2 device is present.
      Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
      Signed-off-by: Jarod Wilson <jwilson@redhat.com>
    • firewire: fw-sbp2: wait for completion of fetch agent reset · e0e60215
      Committed by Stefan Richter
      Like the old sbp2 driver, wait for the write transaction to the
      AGENT_RESET to complete before proceeding (after login, after reconnect,
      or in SCSI error handling).
      
      There is one occasion where AGENT_RESET is written to from atomic
      context when getting DEAD status for a command ORB.  There we still
      continue without waiting for the transaction to complete because this
      is more difficult to fix...
      Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
    • firewire: fw-sbp2: add INQUIRY delay workaround · 9220f194
      Committed by Stefan Richter
      Several different SBP-2 bridges accept a login early while the IDE
      device is still powering up.  They are therefore unable to respond to
      SCSI INQUIRY immediately, and the SCSI core has to retry the INQUIRY.
      One of these retries is typically successful, and all is well.
      
      But in case of Momobay FX-3A, the INQUIRY retries tend to fail entirely.
      This can usually be avoided by waiting a little while after login before
      letting the SCSI core send the INQUIRY.  The old sbp2 driver handles
      this more gracefully for as yet unknown reasons (perhaps because it
      waits for fetch agent resets to complete, unlike fw-sbp2 which quickly
      proceeds after requesting the agent reset).  Therefore the workaround is
      less necessary for sbp2.
      Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
      Signed-off-by: Jarod Wilson <jwilson@redhat.com>
    • firewire: fw-sbp2: don't retry login or reconnect after unplug · be6f48b0
      Committed by Stefan Richter
      If a device was unplugged while fw-sbp2 had a login or reconnect
      scheduled, it would take about half a minute to shut the fw_unit down:
      
          Jan 27 18:34:54 stein firewire_sbp2: logged in to fw2.0 LUN 0000 (0 retries)
          <unplug>
          Jan 27 18:34:59 stein firewire_sbp2: sbp2_scsi_abort
          Jan 27 18:34:59 stein scsi 25:0:0:0: Device offlined - not ready after error recovery
          Jan 27 18:35:01 stein firewire_sbp2: orb reply timed out, rcode=0x11
          Jan 27 18:35:06 stein firewire_sbp2: orb reply timed out, rcode=0x11
          Jan 27 18:35:12 stein firewire_sbp2: orb reply timed out, rcode=0x11
          Jan 27 18:35:17 stein firewire_sbp2: orb reply timed out, rcode=0x11
          Jan 27 18:35:22 stein firewire_sbp2: orb reply timed out, rcode=0x11
          Jan 27 18:35:27 stein firewire_sbp2: orb reply timed out, rcode=0x11
          Jan 27 18:35:32 stein firewire_sbp2: orb reply timed out, rcode=0x11
          Jan 27 18:35:32 stein firewire_sbp2: failed to login to fw2.0 LUN 0000
          Jan 27 18:35:32 stein firewire_sbp2: released fw2.0
      
      After this patch, typically only a few seconds spent in __scsi_add_device
      remain:
      
          Jan 27 19:05:50 stein firewire_sbp2: logged in to fw2.0 LUN 0000 (0 retries)
          <unplug>
          Jan 27 19:05:56 stein firewire_sbp2: sbp2_scsi_abort
          Jan 27 19:05:56 stein scsi 33:0:0:0: Device offlined - not ready after error recovery
          Jan 27 19:05:56 stein firewire_sbp2: released fw2.0
      
      The benefit of this is less noise in the syslog.  It furthermore avoids
      a few wasted CPU cycles and needlessly prolonged lifetime of a few
      driver objects.
      Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
      Signed-off-by: Jarod Wilson <jwilson@redhat.com>
    • firewire: fw-sbp2: fix logout before login retry · 1b9c12ba
      Committed by Stefan Richter
      This fixes a "can't recognize device" kind of bug.
      
      If the SCSI INQUIRY failed and hence __scsi_add_device failed due to a
      bus reset, we tried a logout and then waited for the already scheduled
      login work to happen.  So far so good, but the generation used for the
      logout was outdated, hence the logout never reached the target.  The
      target might therefore deny the subsequent relogin attempt, which would
      also leave the target inaccessible.
      
      Therefore fetch a fresh device->generation for the logout.  Use memory
      barriers to prevent our plan being foiled by compiler or hardware
      optimizations.
      Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
    • firewire: fw-sbp2: unsigned int vs. unsigned · 05cca738
      Committed by Stefan Richter
      Standardize on "unsigned int" style.
      Sort some struct members thematically.
      Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
  4. 31 Jan, 2008 9 commits
  5. 12 Jan, 2008 1 commit
  6. 07 Nov, 2007 1 commit
    • firewire: fw-sbp2: fix refcounting · 7c45d191
      Committed by Stefan Richter
      Since patch "fw-sbp2: use an own workqueue (fix system responsiveness)"
      increased parallelism between fw-sbp2 and fw-core, it was possible that
      fw-sbp2 didn't release the SCSI device when the FireWire device was
      disconnected.
      
      This happened if sbp2_update() ran during sbp2_login(), because a bus
      reset occurred during sbp2_login().  The sbp2_login() work would [try
      to] reschedule itself because it failed due to the bus reset, and it
      would _not_ drop its reference on the target.  However, sbp2_update()
      would schedule sbp2_login() too before sbp2_login() rescheduled itself
      and hence sbp2_update() would take an additional reference.  And then
      we would have one reference too many.
      
      The fix is to _always_ drop the reference when leaving the sbp2_login()
      work.  If the sbp2_login() work reschedules itself, it takes a
      reference, but only if it wasn't already rescheduled by sbp2_update().
      
      Ditto in the sbp2_reconnect() work.
      
      The resulting code is actually simpler than before:  We _always_ take
      a reference when successfully scheduling work.  And we _always_ drop
      a reference when leaving a workqueue job.  No exceptions.
      Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
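The refcounting rule stated in the commit above (take a reference only when work was successfully scheduled, always drop one when leaving a job) can be checked with a small model. The class below is a hypothetical stand-in; its names are invented for this sketch, and a boolean flag replaces the real workqueue's "already queued" detection.

```python
class RefcountedTarget:
    """Hypothetical model of a target whose jobs each hold one reference."""

    def __init__(self):
        self.refcount = 1          # reference held by the driver binding
        self.work_pending = False  # True while a job is queued

    def queue_work(self):
        """Queue the job; take a reference only if it was not queued yet."""
        if self.work_pending:
            return False           # already queued: no additional reference
        self.work_pending = True
        self.refcount += 1
        return True

    def run_work(self, job):
        """Run the queued job and always drop its reference afterwards."""
        self.work_pending = False
        try:
            job(self)
        finally:
            self.refcount -= 1
```

In the racy scenario from the commit message, an update queues the login job while it is running, and the failed login then tries to reschedule itself. In this model the second `queue_work()` returns `False` and takes no reference, so the count returns to its initial value once the rescheduled job has run.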
  7. 17 Oct, 2007 4 commits
  8. 26 Aug, 2007 1 commit
  9. 03 Aug, 2007 1 commit
  10. 20 Jul, 2007 1 commit
  11. 19 Jul, 2007 1 commit
  12. 10 Jul, 2007 6 commits