  1. 10 Nov, 2011 1 commit
    • [SCSI] fix WARNING: at drivers/scsi/scsi_lib.c:1704 · 4e6c82b3
      Committed by James Bottomley
      On Mon, 2011-11-07 at 17:24 +1100, Stephen Rothwell wrote:
      > Hi all,
      >
      > Starting some time last week I am getting the following during boot on
      > our PPC970 blade:
      >
      > calling  .ipr_init+0x0/0x68 @ 1
      > ipr: IBM Power RAID SCSI Device Driver version: 2.5.2 (April 27, 2011)
      > ipr 0000:01:01.0: Found IOA with IRQ: 26
      > ipr 0000:01:01.0: Starting IOA initialization sequence.
      > ipr 0000:01:01.0: Adapter firmware version: 06160039
      > ipr 0000:01:01.0: IOA initialized.
      > scsi0 : IBM 572E Storage Adapter
      > ------------[ cut here ]------------
      > WARNING: at drivers/scsi/scsi_lib.c:1704
      > Modules linked in:
      > NIP: c00000000053b3d4 LR: c00000000053e5b0 CTR: c000000000541d70
      > REGS: c0000000783c2f60 TRAP: 0700   Not tainted  (3.1.0-autokern1)
      > MSR: 8000000000029032 <EE,ME,CE,IR,DR>  CR: 24002024  XER: 20000002
      > TASK = c0000000783b8000[1] 'swapper' THREAD: c0000000783c0000 CPU: 0
      > GPR00: 0000000000000001 c0000000783c31e0 c000000000cf38b0 c00000000239a9d0
      > GPR04: c000000000cbe8f8 0000000000000000 c0000000783c3040 0000000000000000
      > GPR08: c000000075daf488 c000000078a3b7ff c000000000bcacc8 0000000000000000
      > GPR12: 0000000044002028 c000000007ffb000 0000000002e40000 000000000099b800
      > GPR16: 0000000000000000 c000000000bba5fc c000000000a61db8 0000000000000000
      > GPR20: 0000000001b77200 0000000000000000 c000000078990000 0000000000000001
      > GPR24: c000000002396828 0000000000000000 0000000000000000 c000000078a3b938
      > GPR28: fffffffffffffffa c0000000008ad2c0 c000000000c7faa8 c00000000239a9d0
      > NIP [c00000000053b3d4] .scsi_free_queue+0x24/0x90
      > LR [c00000000053e5b0] .scsi_alloc_sdev+0x280/0x2e0
      > Call Trace:
      > [c0000000783c31e0] [c000000000c7faa8] wireless_seq_fops+0x278d0/0x2eb88 (unreliable)
      > [c0000000783c3270] [c00000000053e5b0] .scsi_alloc_sdev+0x280/0x2e0
      > [c0000000783c3330] [c00000000053eba0] .scsi_probe_and_add_lun+0x390/0xb40
      > [c0000000783c34a0] [c00000000053f7ec] .__scsi_scan_target+0x16c/0x650
      > [c0000000783c35f0] [c00000000053fd90] .scsi_scan_channel+0xc0/0x100
      > [c0000000783c36a0] [c00000000053fefc] .scsi_scan_host_selected+0x12c/0x1c0
      > [c0000000783c3750] [c00000000083dcb4] .ipr_probe+0x2c0/0x390
      > [c0000000783c3830] [c0000000003f50b4] .local_pci_probe+0x34/0x50
      > [c0000000783c38a0] [c0000000003f5f78] .pci_device_probe+0x148/0x150
      > [c0000000783c3950] [c0000000004e1e8c] .driver_probe_device+0xdc/0x210
      > [c0000000783c39f0] [c0000000004e20cc] .__driver_attach+0x10c/0x110
      > [c0000000783c3a80] [c0000000004e1228] .bus_for_each_dev+0x98/0xf0
      > [c0000000783c3b30] [c0000000004e1bf8] .driver_attach+0x28/0x40
      > [c0000000783c3bb0] [c0000000004e07d8] .bus_add_driver+0x218/0x340
      > [c0000000783c3c60] [c0000000004e2a2c] .driver_register+0x9c/0x1b0
      > [c0000000783c3d00] [c0000000003f62d4] .__pci_register_driver+0x64/0x140
      > [c0000000783c3da0] [c000000000b99f88] .ipr_init+0x4c/0x68
      > [c0000000783c3e20] [c00000000000ad24] .do_one_initcall+0x1a4/0x1e0
      > [c0000000783c3ee0] [c000000000b512d0] .kernel_init+0x14c/0x1fc
      > [c0000000783c3f90] [c000000000022468] .kernel_thread+0x54/0x70
      > Instruction dump:
      > ebe1fff8 7c0803a6 4e800020 7c0802a6 fba1ffe8 fbe1fff8 7c7f1b78 f8010010
      > f821ff71 e8030398 3120ffff 7c090110 <0b000000> e86303b0 482de065 60000000
      > ---[ end trace 759bed76a85e8dec ]---
      > scsi 0:0:1:0: Direct-Access     IBM-ESXS MAY2036RC        T106 PQ: 0 ANSI: 5
      > ------------[ cut here ]------------
      >
      > I get lots more of these.  The obvious commit to point the finger at
      > is 3308511c ("[SCSI] Make scsi_free_queue() kill pending SCSI
      > commands") but the root cause may be something different.
      
      Caused by
      
      commit f7c9c6bb
      Author: Anton Blanchard <anton@samba.org>
      Date:   Thu Nov 3 08:56:22 2011 +1100
      
          [SCSI] Fix block queue and elevator memory leak in scsi_alloc_sdev
      
      That commit doesn't completely do the teardown.  The true fix is to do a
      proper teardown instead of hand-rolling it.
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Tested-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: stable@kernel.org	#2.6.38+
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
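
The difference between a hand-rolled error path and reusing the full teardown can be sketched in user-space C. This is only a toy model under stated assumptions: the `toy_*` names, the two-state enum and the warning counter are all hypothetical and are not the kernel's API. It assumes the queue-freeing routine expects the device to have been marked deleted first.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified device states; not the kernel's enum. */
enum toy_state { TOY_CREATED, TOY_DELETED };

struct toy_dev {
    enum toy_state state;
    void *queue;    /* stands in for the request queue */
    int warnings;   /* counts "WARNING" events instead of printing them */
};

/* Frees the queue; complains if the device was not marked deleted first. */
static void toy_free_queue(struct toy_dev *d)
{
    if (d->state != TOY_DELETED)
        d->warnings++;
    free(d->queue);
    d->queue = NULL;
}

/* The one full teardown path: mark deleted, then release resources. */
static void toy_remove_device(struct toy_dev *d)
{
    d->state = TOY_DELETED;
    toy_free_queue(d);
}

/* Allocation with an optionally failing setup hook.
 * use_full_teardown selects which error path is taken. */
static int toy_alloc(struct toy_dev *d, bool hook_fails, bool use_full_teardown)
{
    d->state = TOY_CREATED;
    d->warnings = 0;
    d->queue = malloc(64);
    if (!d->queue)
        return -1;
    if (hook_fails) {
        if (use_full_teardown)
            toy_remove_device(d);   /* reuse the real teardown */
        else
            toy_free_queue(d);      /* hand-rolled: skips the state change */
        return -1;
    }
    return 0;
}
```

In this model the hand-rolled path trips the check once per failed allocation, while the path that reuses the full teardown does not.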
  2. 03 Nov, 2011 1 commit
    • [SCSI] Fix block queue and elevator memory leak in scsi_alloc_sdev · f7c9c6bb
      Committed by Anton Blanchard
      When looking at memory consumption issues I noticed quite a
      lot of memory in the kmalloc-2048 bucket:
      
        OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
        6561   6471  98%    2.30K    243       27     15552K kmalloc-2048
      
      Over 15MB. slub debug shows that cfq is responsible for almost
      all of it:
      
      # sort -nr /sys/kernel/slab/kmalloc-2048/alloc_calls
      6402 .cfq_init_queue+0xec/0x460 age=43423/43564/43655 pid=1 cpus=4,11,13
      
      In scsi_alloc_sdev we do scsi_alloc_queue but if slave_alloc
      fails we don't free it with scsi_free_queue.
      
      The patch below fixes the issue:
      
        OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
         135     72  53%    2.30K      5       27       320K kmalloc-2048
      
      # cat /sys/kernel/slab/kmalloc-2048/alloc_calls
      3 .cfq_init_queue+0xec/0x460 age=3811/3876/3925 pid=1 cpus=4,11,13
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Cc: <stable@kernel.org>		#2.6.38+
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
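
The leak pattern described above (allocate a queue, fail in a later hook, return without freeing) can be illustrated with a small user-space sketch. All names here (`leak_*`, `live_queues`) are hypothetical stand-ins; the counter plays the role of the slab statistics quoted in the message.

```c
#include <stdlib.h>

static int live_queues;   /* toy allocation counter, stands in for slab stats */

struct leak_queue { int depth; };

static struct leak_queue *leak_alloc_queue(void)
{
    struct leak_queue *q = calloc(1, sizeof(*q));
    if (q)
        live_queues++;
    return q;
}

static void leak_free_queue(struct leak_queue *q)
{
    if (q) {
        live_queues--;
        free(q);
    }
}

/* Returns 0 and hands the queue to *out on success.
 * hook_result models the return value of a per-device setup hook;
 * a nonzero value is treated as failure and propagated. */
static int leak_alloc_dev(int hook_result, struct leak_queue **out)
{
    struct leak_queue *q = leak_alloc_queue();
    if (!q)
        return -1;
    if (hook_result != 0) {
        leak_free_queue(q);   /* the cleanup step that was missing */
        return hook_result;
    }
    *out = q;
    return 0;
}
```

Without the `leak_free_queue()` call on the failure branch, every failed probe would leave one queue allocated, which matches the growth pattern reported in the kmalloc-2048 bucket.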
  3. 02 Jun, 2011 1 commit
    • [SCSI] Fix oops caused by queue refcounting failure · e73e079b
      Committed by James Bottomley
      In certain circumstances, we can get an oops from a torn down device.
      Most notably this is from CD roms trying to call scsi_ioctl.  The root
      cause of the problem is the fact that after scsi_remove_device() has
      been called, the queue is fully torn down.  This is actually wrong
      since the queue can be used until the sdev release function is called.
      Therefore, we add an extra reference to the queue which is released in
      sdev->release, so the queue always exists.
      Reported-by: Parag Warudkar <parag.lkml@gmail.com>
      Cc: stable@kernel.org
      Signed-off-by: James Bottomley <jbottomley@parallels.com>
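
The idea of holding an extra queue reference until the device's release function runs can be sketched as a plain reference-counting model. This is a user-space illustration with hypothetical names (`ref_queue`, `ref_sdev`), not the kernel's kobject or queue API.

```c
#include <stdbool.h>

struct ref_queue { int refs; bool alive; };
struct ref_sdev  { int refs; struct ref_queue *q; };

static void ref_queue_get(struct ref_queue *q) { q->refs++; }

static void ref_queue_put(struct ref_queue *q)
{
    if (--q->refs == 0)
        q->alive = false;   /* queue is torn down only at zero */
}

static void ref_sdev_init(struct ref_sdev *s, struct ref_queue *q)
{
    q->refs = 1;            /* creation reference */
    q->alive = true;
    s->refs = 1;
    s->q = q;
    ref_queue_get(q);       /* extra reference owned by the device */
}

static void ref_sdev_get(struct ref_sdev *s) { s->refs++; }

/* Analogue of removing the device: drops only the creation reference. */
static void ref_sdev_remove(struct ref_sdev *s) { ref_queue_put(s->q); }

/* Analogue of the final release: drops the device's own queue reference. */
static void ref_sdev_put(struct ref_sdev *s)
{
    if (--s->refs == 0)
        ref_queue_put(s->q);
}
```

With this arrangement a holder of a device reference (for example an open file) can still use the queue after removal, because the queue only goes away when the last device reference is dropped.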
  4. 17 May, 2011 1 commit
    • scsi: remove performance regression due to async queue run · 9937a5e2
      Committed by Jens Axboe
      Commit c21e6beb removed our queue request_fn re-enter
      protection, and defaulted to always running the queues from
      kblockd to be safe. This was a known potential slow down,
      but should be safe.
      
      Unfortunately this is causing big performance regressions for
      some, so we need to improve this logic. Looking into the details
      of the re-enter, the real issue is on requeue of requests.
      
      Requeue of requests upon seeing a BUSY condition from the device
      ends up re-running the queue, causing traces like this:
      
      scsi_request_fn()
              scsi_dispatch_cmd()
                      scsi_queue_insert()
                              __scsi_queue_insert()
                                      scsi_run_queue()
      					scsi_request_fn()
      						...
      
      potentially causing the issue we want to avoid. So special
      case the requeue re-run of the queue, but improve it to offload
      the entire run of local queue and starved queue from a single
      workqueue callback. This is a lot better than potentially
      kicking off a workqueue run for each device seen.
      
      This also fixes the issue of the local device going into recursion,
      since the above mentioned commit never moved that queue run out
      of line.
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
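
The re-entry problem in the trace above, and the effect of deferring the requeue run to a single worker, can be modelled in user-space C. This is a simplified sketch: `requeue_q`, its fields and the "busy budget" are hypothetical, and the worker is called directly rather than scheduled on a real workqueue.

```c
#include <stdbool.h>

struct requeue_q {
    int pending;         /* requests waiting to be dispatched */
    int busy_budget;     /* how many dispatches will report BUSY */
    int depth;           /* current nesting of requeue_request_fn */
    int max_depth;       /* deepest nesting observed */
    bool work_queued;    /* one deferred run has been requested */
    bool defer_requeue;  /* true: defer to worker; false: re-run inline */
};

static void requeue_request_fn(struct requeue_q *q);

static void requeue_insert(struct requeue_q *q)
{
    q->pending++;                    /* put the request back */
    if (q->defer_requeue)
        q->work_queued = true;       /* one flag, however many requeues */
    else
        requeue_request_fn(q);       /* inline re-run: re-enters request_fn */
}

static void requeue_request_fn(struct requeue_q *q)
{
    q->depth++;
    if (q->depth > q->max_depth)
        q->max_depth = q->depth;
    while (q->pending > 0) {
        q->pending--;
        if (q->busy_budget > 0) {    /* device reported BUSY */
            q->busy_budget--;
            requeue_insert(q);
            if (q->defer_requeue)
                break;               /* stop here; the worker retries */
        }
    }
    q->depth--;
}

/* Stand-in for the workqueue callback. */
static void requeue_worker(struct requeue_q *q)
{
    while (q->work_queued) {
        q->work_queued = false;
        requeue_request_fn(q);
    }
}
```

In the inline mode each BUSY adds one level of nesting; in the deferred mode the nesting never exceeds one, at the cost of waiting for the worker to run.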
  5. 23 Oct, 2010 1 commit
    • driver core: remove CONFIG_SYSFS_DEPRECATED_V2 but keep it for block devices · 39aba963
      Committed by Kay Sievers
      This patch removes the old CONFIG_SYSFS_DEPRECATED_V2 config option,
      but it keeps the logic around to handle block devices in the old manner
      as some people like to run new kernel versions on old (pre 2007/2008)
      distros.
      Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: "James E.J. Bottomley" <James.Bottomley@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Cc: Randy Dunlap <randy.dunlap@oracle.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jaroslav Kysela <perex@perex.cz>
      Cc: Takashi Iwai <tiwai@suse.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: David Howells <dhowells@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  6. 28 Jul, 2010 1 commit
    • [SCSI] implement runtime Power Management · bc4f2401
      Committed by Alan Stern
      This patch (as1398b) adds runtime PM support to the SCSI layer.  Only
      the mechanism is provided; use of it is up to the various high-level
      drivers, and the patch doesn't change any of them.  Except for sg --
      the patch explicitly prevents a device from being runtime-suspended
      while its sg device file is open.
      
      The implementation is simplistic.  In general, hosts and targets are
      automatically suspended when all their children are asleep, but for
      them the runtime-suspend code doesn't actually do anything.  (A host's
      runtime PM status is propagated up the device tree, though, so a
      runtime-PM-aware lower-level driver could power down the host adapter
      hardware at the appropriate times.)  There are comments indicating
      where a transport class might be notified or some other hooks added.
      
      LUNs are runtime-suspended by calling the drivers' existing suspend
      handlers (and likewise for runtime-resume).  Somewhat arbitrarily, the
      implementation delays for 100 ms before suspending an eligible LUN.
      This is because there typically are occasions during bootup when the
      same device file is opened and closed several times in quick
      succession.
      
      The way this all works is that the SCSI core increments a device's
      PM-usage count when it is registered.  If a high-level driver does
      nothing then the device will not be eligible for runtime-suspend
      because of the elevated usage count.  If a high-level driver wants to
      use runtime PM then it can call scsi_autopm_put_device() in its probe
      routine to decrement the usage count and scsi_autopm_get_device() in
      its remove routine to restore the original count.
      
      Hosts, targets, and LUNs are not suspended while they are being probed
      or removed, or while the error handler is running.  In fact, a fairly
      large part of the patch consists of code to make sure that things
      aren't suspended at such times.
      
      [jejb: fix up compile issues in PM config variations]
      Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
      Signed-off-by: James Bottomley <James.Bottomley@suse.de>
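
The usage-count scheme described above can be sketched as a small counter model. The names below (`pm_dev`, `pm_core_register`, `pm_autopm_put`, `pm_autopm_get`) are hypothetical user-space stand-ins chosen to mirror the message; they are not the kernel functions, and the 100 ms suspend delay is left out.

```c
#include <stdbool.h>

struct pm_dev { int usage; };

/* The core takes one usage count when the device is registered. */
static void pm_core_register(struct pm_dev *d) { d->usage = 1; }

/* Stand-ins for the put/get helpers named in the message. */
static void pm_autopm_put(struct pm_dev *d) { d->usage--; }
static void pm_autopm_get(struct pm_dev *d) { d->usage++; }

/* A device is eligible for runtime suspend only at usage count zero. */
static bool pm_may_suspend(const struct pm_dev *d) { return d->usage == 0; }

/* A PM-aware high-level driver drops the count in probe ... */
static void pm_driver_probe(struct pm_dev *d)  { pm_autopm_put(d); }
/* ... and restores it in remove. */
static void pm_driver_remove(struct pm_dev *d) { pm_autopm_get(d); }

/* An open device file pins the device, as the patch does for sg. */
static void pm_file_open(struct pm_dev *d)  { pm_autopm_get(d); }
static void pm_file_close(struct pm_dev *d) { pm_autopm_put(d); }
```

A driver that never calls the put helper leaves the count at one, so the device simply stays awake, which is the opt-in behaviour the message describes.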
  7. 26 May, 2010 1 commit
  8. 24 May, 2010 1 commit
  9. 11 Apr, 2010 1 commit
  10. 30 Mar, 2010 1 commit
    • include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Committed by Tejun Heo
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities to include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch a large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the following.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and tries to put the new include such that its order conforms
        to its surrounding.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         widely available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build tests were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable on most builds of
      the specific arch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
  11. 26 Feb, 2010 1 commit
  12. 19 Feb, 2010 2 commits
  13. 05 Dec, 2009 1 commit
    • [SCSI] add queue_depth ramp up code · 4a84067d
      Committed by Vasu Dev
      The current FC HBA queue_depth ramp up code depends on the last queue
      full time. The sdev already has a last_queue_full_time field to
      track the last queue full time, but the stored value is truncated by
      its last four bits.
      
      So this patch updates last_queue_full_time without truncating the
      last 4 bits, so that the full value is stored, and then updates its only
      current user, scsi_track_queue_full, to ignore the last four bits
      so that existing behavior stays the same, while also using this field
      in the added ramp up code.
      
      Adds scsi_handle_queue_ramp_up to ramp up queue_depth on
      successful completion of IO. The scsi_handle_queue_ramp_up will
      do ramp up on all luns of a target, just same as ramp down done
      on all luns on a target.
      
      The ramp up is skipped in case the change_queue_depth is not
      supported by LLD or already reached to added max_queue_depth.
      
      Updates added max_queue_depth on every new update to default
      queue_depth value.
      
      The ramp up is also skipped if lapsed time since either last
      queue ramp up or down is less than LLD specified
      queue_ramp_up_period.
      
      Adds queue_ramp_up_period to sysfs but only if change_queue_depth
      is supported since ramp up and queue_ramp_up_period is needed only
      in case change_queue_depth is supported first.
      
      Initializes queue_ramp_up_period to 120 * HZ jiffies as the initial
      default value; this is the same value used in the existing lpfc and qla2xxx.
      
      -v2
       Combined all ramp code into this single patch.
      
      -v3
       Moves max_queue_depth initialization after slave_configure is
      called from after slave_alloc calling done. Also adjusted
      max_queue_depth check to skip ramp up if current queue_depth
      is >= max_queue_depth.
      
      -v4
       Changes sdev->queue_ramp_up_period unit to ms when using sysfs i/f
      to store or show its value.
      Signed-off-by: Vasu Dev <vasu.dev@intel.com>
      Tested-by: Christof Schmitt <christof.schmitt@de.ibm.com>
      Tested-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
      Signed-off-by: James Bottomley <James.Bottomley@suse.de>
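
The skip conditions listed above can be collected into one decision function. The sketch below is a user-space model with hypothetical names (`ramp_dev`, `ramp_try`); it assumes the depth is raised by one per successful ramp up and uses plain unsigned tick arithmetic in place of the kernel's jiffies helpers.

```c
#include <stdbool.h>

struct ramp_dev {
    int queue_depth;
    int max_queue_depth;
    unsigned long last_ramp_up;     /* tick of the last ramp up */
    unsigned long last_queue_full;  /* tick of the last queue-full event */
    unsigned long ramp_up_period;   /* minimum ticks between changes */
    bool can_change_depth;          /* models LLD support for changing depth */
};

/* Called on successful I/O completion; returns true if depth was raised. */
static bool ramp_try(struct ramp_dev *d, unsigned long now)
{
    if (!d->can_change_depth)
        return false;               /* LLD cannot change the depth */
    if (d->queue_depth >= d->max_queue_depth)
        return false;               /* already at the ceiling */
    if (now - d->last_ramp_up < d->ramp_up_period)
        return false;               /* ramped up too recently */
    if (now - d->last_queue_full < d->ramp_up_period)
        return false;               /* ramped down too recently */
    d->queue_depth++;               /* assumption: step of one */
    d->last_ramp_up = now;
    return true;
}
```

The message says the real code applies the ramp up to all LUNs of a target; this sketch covers a single device only.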
  14. 26 Nov, 2009 1 commit
    • [SCSI] fix async scan add/remove race resulting in an oops · 860dc736
      Committed by James Bottomley
      Async scanning introduced a very wide window where the SCSI device is
      up and running but has not yet been added to sysfs.  We delay the
      adding until all scans have completed to retain the same ordering as
      sync scanning.
      
      This delay in visibility causes an oops if a device is removed before
      we make it visible because the SCSI removal routines have an inbuilt
      assumption that if a device is in SDEV_RUNNING state, it must be
      visible (which is not necessarily true in the async scanning case).
      
      Fix this by introducing an additional is_visible flag which we can use
      to condition the tear down so we do the right thing for running but
      not yet made visible.
      Reported-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Signed-off-by: James Bottomley <James.Bottomley@suse.de>
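
Conditioning the teardown on a visibility flag, rather than on the running state, can be sketched as follows. This is a toy user-space model; `vis_dev` and its helpers are hypothetical, and "sysfs removal" is reduced to a counter.

```c
#include <stdbool.h>

struct vis_dev {
    bool running;
    bool is_visible;      /* set only once the device was added to sysfs */
    int sysfs_removals;   /* counts how often sysfs entries were removed */
};

/* Async scan found the device: it runs but is not yet visible. */
static void vis_scan_found(struct vis_dev *d)
{
    d->running = true;
    d->is_visible = false;
    d->sysfs_removals = 0;
}

/* All scans completed: the device is now added and visible. */
static void vis_scan_finished(struct vis_dev *d) { d->is_visible = true; }

/* Removal undoes the sysfs registration only if it actually happened. */
static void vis_remove(struct vis_dev *d)
{
    if (d->is_visible) {
        d->sysfs_removals++;
        d->is_visible = false;
    }
    d->running = false;
}
```

Removing a device that is running but was never made visible then skips the sysfs step instead of assuming it took place.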
  15. 14 Oct, 2009 1 commit
    • [SCSI] fix memory leak in initialization · 37e6ba00
      Committed by James Bottomley
      The root cause of the problem is the fact that dev_set_name() now
      allocates storage instead of using the original array within the kobj.
      That means that the SCSI assumption that if you haven't made the
      containing object or any sub objects visible, you can just destroy it
      (and its component devices) lock stock and barrel becomes false.
      
      Fix this by doing the get of sdev_dev at parent time and thus do an
      extra put of it in scsi_destroy_sdev() (and all other destruction
      without add paths).
      Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Signed-off-by: James Bottomley <James.Bottomley@suse.de>
  16. 21 May, 2009 1 commit
  17. 15 May, 2009 1 commit
  18. 22 Apr, 2009 1 commit
    • driver synchronization: make scsi_wait_scan more advanced · d4d5291c
      Committed by Arjan van de Ven
      There is currently only one way for userspace to say "wait for my storage
      device to get ready for the modules I just loaded": to load the
      scsi_wait_scan module. Expectations of userspace are that once this
      module is loaded, all the (storage) devices for which the drivers
      were loaded before the module load are present.
      
      Now, there are some issues with the implementation, and the async
      stuff got caught in the middle of this: The existing code only
      waits for the scsi async probing to finish, but it does not take
      into account at all that probing might not have begun yet.
      (Russell ran into this problem on his computer and the fix works for him)
      
      This patch fixes this more thoroughly than the previous "fix", which
      had some bad side effects (namely, for kernel code that wanted to wait for
      the scsi scan it would also do an async sync, which would deadlock if you did
      it from async context already.. there's a report about that on lkml):
      The patch makes the module first wait for all device driver probes, and then it
      will wait for the scsi parallel scan to finish.
      Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
      Tested-by: Russell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  19. 13 Mar, 2009 1 commit
  20. 11 Feb, 2009 1 commit
  21. 08 Jan, 2009 1 commit
  22. 03 Jan, 2009 1 commit
  23. 30 Dec, 2008 2 commits
  24. 13 Oct, 2008 1 commit
    • [SCSI] Add helper code so transport classes/driver can control queueing (v3) · f0c0a376
      Committed by Mike Christie
      SCSI-ml manages the queueing limits for the device and host, but
      does not do so at the target level. However, something similar
      can come in useful when a driver is transitioning a transport object to
      the blocked state, because at that time we do not want to queue
      io and we do not want the queuecommand to be called again.
      
      The patch adds code similar to the existing SCSI_ML_*BUSY handlers.
      You can now return SCSI_MLQUEUE_TARGET_BUSY when we hit
      a transport level queueing issue like the hw cannot allocate some
      resource at the iscsi session/connection level, or the target has temporarily
      closed or shrunk the queueing window, or if we are transitioning
      to the blocked state.
      
      bnx2i, when they rework their firmware according to netdev
      developers requests, will also need to be able to limit queueing at this
      level. bnx2i will hook into libiscsi, but will allocate a scsi host per
      netdevice/hba, so unlike pure software iscsi/iser which is allocating
      a host per session, it cannot set the scsi_host->can_queue and return
      SCSI_MLQUEUE_HOST_BUSY to reflect queueing limits on the transport.
      
      The iscsi class/driver can also set a scsi_target->can_queue value which
      reflects the max commands the driver/class can support. For iscsi this
      reflects the number of commands we can support for each session due to
      session/connection hw limits, driver limits, and to also reflect the
      session/targets's queueing window.
      
      Changes:
      v1 - initial patch.
      v2 - Fix scsi_run_queue handling of multiple blocked targets.
      Previously we would break from the main loop if a device was added back on
      the starved list. We now run over the list and check if any target is
      blocked.
      v3 - Rediff for scsi-misc.
      Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
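
A target-level queueing limit of the kind described above can be sketched as a simple admission check. The names below (`tq_target`, `tq_try_queue`, the `TQ_*` result codes) are hypothetical user-space stand-ins, not the kernel's constants, and the sketch assumes that a `can_queue` of zero or less means "no target-level limit".

```c
#include <stdbool.h>

/* Toy result codes; the message's SCSI_MLQUEUE_TARGET_BUSY plays the
 * role of TQ_TARGET_BUSY in the real code. */
enum tq_result { TQ_QUEUED, TQ_TARGET_BUSY };

struct tq_target {
    int can_queue;   /* max commands in flight; <= 0 means unlimited (assumption) */
    int busy;        /* commands currently in flight */
    bool blocked;    /* transport object is transitioning to blocked */
};

/* Admission check performed before a command is handed to the driver. */
static enum tq_result tq_try_queue(struct tq_target *t)
{
    if (t->blocked)
        return TQ_TARGET_BUSY;
    if (t->can_queue > 0 && t->busy >= t->can_queue)
        return TQ_TARGET_BUSY;
    t->busy++;
    return TQ_QUEUED;
}

/* Completion frees one slot in the target's queueing window. */
static void tq_complete(struct tq_target *t)
{
    if (t->busy > 0)
        t->busy--;
}
```

Because the limit lives on the target rather than the host, several sessions sharing one host can each have their own window, which is the bnx2i case the message mentions.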
  25. 04 Oct, 2008 2 commits
  26. 29 Aug, 2008 1 commit
  27. 27 Jul, 2008 2 commits
  28. 12 Jul, 2008 1 commit
  29. 29 Apr, 2008 1 commit
  30. 23 Apr, 2008 2 commits
  31. 04 Mar, 2008 2 commits
  32. 24 Jan, 2008 1 commit
  33. 12 Jan, 2008 2 commits