1. 02 10月, 2006 1 次提交
  2. 20 8月, 2006 1 次提交
    • A
      [SCSI] limit recursion when flushing shost->starved_list · 04846f25
      Andreas Herrmann 提交于
      Attached is a patch that should limit a possible recursion that can
      lead to a stack overflow like follows:
      
      Kernel stack overflow.
      CPU:    3    Not tainted
      Process zfcperp0.0.d819
      (pid: 13897, task: 000000003e0d8cc8, ksp: 000000003499dbb8)
      Krnl PSW : 0404000180000000 000000000030f8b2 (get_device+0x12/0x48)
      Krnl GPRS: 00000000135a1980 000000000030f758 000000003ed6c1e8 0000000000000005
                 0000000000000000 000000000044a780 000000003dbf7000 0000000034e15800
                 000000003621c048 070000003499c108 000000003499c1a0 000000003ed6c000
                 0000000040895000 00000000408ab630 000000003499c0a0 000000003499c0a0
      Krnl Code: a7 fb ff e8 a7 19 00 00 b9 02 00 22 e3 e0 f0 98 00 24 a7 84
      Call Trace:
      ([<000000004089edc2>] scsi_request_fn+0x13e/0x650 [scsi_mod])
       [<00000000002c5ff4>] blk_run_queue+0xd4/0x1a4
       [<000000004089ff8c>] scsi_queue_insert+0x22c/0x2a4 [scsi_mod]
       [<000000004089779a>] scsi_dispatch_cmd+0x8a/0x3d0 [scsi_mod]
       [<000000004089f1ec>] scsi_request_fn+0x568/0x650 [scsi_mod]
      ...
       [<000000004089f1ec>] scsi_request_fn+0x568/0x650 [scsi_mod]
       [<00000000002c5ff4>] blk_run_queue+0xd4/0x1a4
       [<000000004089ff8c>] scsi_queue_insert+0x22c/0x2a4 [scsi_mod]
       [<000000004089779a>] scsi_dispatch_cmd+0x8a/0x3d0 [scsi_mod]
       [<000000004089f1ec>] scsi_request_fn+0x568/0x650 [scsi_mod]
       [<00000000002c5ff4>] blk_run_queue+0xd4/0x1a4
       [<000000004089fa9e>] scsi_run_host_queues+0x196/0x230 [scsi_mod]
       [<00000000409eba28>] zfcp_erp_thread+0x2638/0x3080 [zfcp]
       [<0000000000107462>] kernel_thread_starter+0x6/0xc
       [<000000000010745c>] kernel_thread_starter+0x0/0xc
      <0>Kernel panic - not syncing: Corrupt kernel stack, can't continue.
      
      This stack overflow occurred during tests on s390 using zfcp.
      Recursion depth for this panic was 19.
      
      Usually recursion between blk_run_queue and a request_fn is avoided
      using QUEUE_FLAG_REENTER. But this does not help if the scsi stack
      tries to flush the starved_list of a scsi_host.
      
      Limit recursion depth when flushing the starved_list
      of a scsi_host.
      Signed-off-by: NAndreas Herrmann <aherrman@de.ibm.com>
      Signed-off-by: NJames Bottomley <James.Bottomley@SteelEye.com>
      04846f25
  3. 10 7月, 2006 1 次提交
    • C
      [SCSI] hide EH backup data outside the scsi_cmnd · 631c228c
      Christoph Hellwig 提交于
      Currently struct scsi_cmnd has various fields that are used to backup
      original data after the corresponding fields have been overridden for
      EH commands.  This means drivers can easily get at it and misuse it.
      Due to the old_ naming this doesn't happen for most of them, but two
      that have different names have been used wrong a lot (see previous
      patch).  Another downside is that they unessecarily bloat the scsi_cmnd
      size.
      
      This patch moves them onstack in scsi_send_eh_cmnd to fix those two
      issues aswell as allowing future EH fixes like moving the EH command
      submissions to use SG lists like everything else.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NJames Bottomley <James.Bottomley@SteelEye.com>
      631c228c
  4. 03 7月, 2006 1 次提交
  5. 29 6月, 2006 1 次提交
    • B
      [SCSI] scsi: Device scanning oops for offlined devices (resend) · 309bd271
      Brian King 提交于
      If a device gets offlined as a result of the Inquiry sent
      during scanning, the following oops can occur. After the
      disk gets put into the SDEV_OFFLINE state, the error handler
      sends back the failed inquiry, which wakes the thread doing
      the scan. This starts a race between the scanning thread
      freeing the scsi device and the error handler calling
      scsi_run_host_queues to restart the host. Since the disk
      is in the SDEV_OFFLINE state, scsi_device_get will still
      work, which results in __scsi_iterate_devices getting
      a reference to the scsi disk when it shouldn't.
      
      The following execution thread causes the oops:
      
      CPU 0 (scan)				CPU 1 (eh)
      
      ---------------------------------------------------------
      scsi_probe_and_add_lun
                              ....
                                              scsi_eh_offline_sdevs
                                              scsi_eh_flush_done_q
      scsi_destroy_sdev
      scsi_device_dev_release
                                              scsi_restart_operations
                                               scsi_run_host_queues
                                                __scsi_iterate_devices
                                                 get_device
      scsi_device_dev_release_usercontext
                                                scsi_run_queue
                                                  <---OOPS--->
      
      The patch fixes this by changing the state of the sdev to SDEV_DEL
      before doing the final put_device, which should prevent the race
      from occurring.
      
      Original oops follows:
      
      Badness in kref_get at lib/kref.c:32
      Call Trace:
      [C00000002F4476D0] [C00000000000EE20] .show_stack+0x68/0x1b0 (unreliable)
      [C00000002F447770] [C00000000037515C] .program_check_exception+0x1cc/0x5a8
      [C00000002F447840] [C00000000000446C] program_check_common+0xec/0x100
       Exception: 700 at .kref_get+0x10/0x28
          LR = .kobject_get+0x20/0x3c
      [C00000002F447B30] [C00000002F447BC0] 0xc00000002f447bc0 (unreliable)
      [C00000002F447BB0] [C000000000254BDC] .get_device+0x20/0x3c
      [C00000002F447C30] [D000000000063188] .scsi_device_get+0x34/0xdc [scsi_mod]
      [C00000002F447CC0] [D0000000000633EC] .__scsi_iterate_devices+0x50/0xbc [scsi_mod]
      [C00000002F447D60] [D00000000006A910] .scsi_run_host_queues+0x34/0x5c [scsi_mod]
      [C00000002F447DF0] [D000000000069054] .scsi_error_handler+0xdb4/0xe44 [scsi_mod]
      [C00000002F447EE0] [C00000000007B4E0] .kthread+0x128/0x178
      [C00000002F447F90] [C000000000025E84] .kernel_thread+0x4c/0x68
      Unable to handle kernel paging request for <7>PCI: Enabling device: (0002:41:01.1), cmd 143
      data at address 0x000001b8
      Faulting instruction address: 0xd0000000000698e4
      sym1: <1010-66> rev 0x1 at pci 0002:41:01.1 irq 216
      sym1: No NVRAM, ID 7, Fast-80, LVD, parity checking
      sym1: SCSI BUS has been reset.
      scsi2 : sym-2.2.2
      cpu 0x0: Vector: 300 (Data Access) at [c00000002f447a30]
          pc: d0000000000698e4: .scsi_run_queue+0x2c/0x218 [scsi_mod]
          lr: d00000000006a904: .scsi_run_host_queues+0x28/0x5c [scsi_mod]
          sp: c00000002f447cb0
         msr: 9000000000009032
         dar: 1b8
       dsisr: 40000000
        current = 0xc0000000045fecd0
        paca    = 0xc00000000048ee80
          pid   = 1123, comm = scsi_eh_1
      enter ? for help
      [c00000002f447d60] d00000000006a904 .scsi_run_host_queues+0x28/0x5c [scsi_mod]
      [c00000002f447df0] d000000000069054 .scsi_error_handler+0xdb4/0xe44 [scsi_mod]
      [c00000002f447ee0] c00000000007b4e0 .kthread+0x128/0x178
      [c00000002f447f90] c000000000025e84 .kernel_thread+0x4c/0x68
      Signed-off-by: NBrian King <brking@us.ibm.com>
      Signed-off-by: NJames Bottomley <James.Bottomley@SteelEye.com>
      309bd271
  6. 26 6月, 2006 2 次提交
  7. 11 6月, 2006 1 次提交
  8. 10 6月, 2006 1 次提交
  9. 01 6月, 2006 1 次提交
  10. 15 5月, 2006 1 次提交
    • T
      [PATCH] SCSI: implement shost->host_eh_scheduled · ee7863bc
      Tejun Heo 提交于
      libata needs to invoke EH without scmd.  This patch adds
      shost->host_eh_scheduled to implement such behavior.
      
      Currently the only user of this feature is libata and no general
      interface is defined.  This patch simply adds handling for
      host_eh_scheduled where needed and exports scsi_eh_wakeup() to
      modules.  The rest is upto libata.  This is the result of the
      following discussion.
      
      http://thread.gmane.org/gmane.linux.scsi/23853/focus=9760
      
      In short, SCSI host is not supposed to know about exceptions unrelated
      to specific device or command.  Such exceptions should be handled by
      transport layer proper.  However, the distinction is not essential to
      ATA and libata is planning to depart from SCSI, so, for the time
      being, libata will be using SCSI EH to handle such exceptions.
      Signed-off-by: NTejun Heo <htejun@gmail.com>
      ee7863bc
  11. 28 4月, 2006 1 次提交
    • J
      [SCSI] Fix DVD burning issues. · f3e93f73
      James Bottomley 提交于
      Some pioneer DVDs are apparently returning odd "not ready" status
      codes that the mid-layer doesn't recognise and so passes back to the
      user as errors.
      
      This patch overhauls our not-ready handling and adds transparent retries for:
      
      format in progress
      rebuild in progress
      recalculation in progress
      operation in progress
      Long write in progress
      self test in progress
      
      The Pioneer was actually returning "long write in progress"
      Signed-off-by: NJames Bottomley <James.Bottomley@SteelEye.com>
      f3e93f73
  12. 20 4月, 2006 1 次提交
  13. 15 4月, 2006 1 次提交
    • G
      [SCSI] dc395x: dynamically map scatter-gather for PIO · cdb8c2a6
      Guennadi Liakhovetski 提交于
      The current dc395x driver uses PIO to transfer up to 4 bytes which do not
      get transferred by DMA (under unclear circumstances). For this the driver
      uses page_address() which is broken on highmem. Apart from this the
      actual calculation of the virtual address is wrong (even without highmem).
      So, e.g., for reading it reads bytes from the driver to a wrong address
      and returns wrong data, I guess, for writing it would just output random
      data to the device.
      
      The proper fix, as suggested by many, is to dynamically map data using
      kmap_atomic(page, KM_BIO_SRC_IRQ) / kunmap_atomic(virt). The reason why it
      has not been done until now, although I've done some preliminary patches
      more than a year ago was that nobody interested in fixing this problem was
      able to reliably reproduce it. Now it changed - with the help from
      Sebastian Frei (CC'ed) I was able to trigger the PIO path. Thus, I was
      also able to test and debug it.
      
      There are 4 cases when PIO is used in dc395x - data-in / -out with and
      without scatter-gather. I was able to reproduce and test only data-in with
      and without SG. So, the data-out path is still untested, but it is also
      somewhat simpler than the data-in. Fredrik Roubert (also CC'ed) also had
      PIO triggering on his system, and in his case it was data-out without SG.
      It would be great if he could test the attached patch on his system, but
      even if he cannot, I would still request to apply the patch and just wait
      if anybody cries...
      
      Implementation: I put 2 new functions in scsi_lib.c and their declarations
      in scsi_cmnd.h. I exported them without _GPL, although, I don't feel
      strongly about that - not many drivers are likely to use them. But there
      is at least one more - I want to use them in tmscsim.c. Whether these are
      the right files for the functions and their declarations - not sure
      either. Actually, they are not scsi-specific, so, might go somewhere
      around other scattergather magic? They are not platform specific either,
      and most SG functions are defined under arch/*/... As these issues were
      discussed previously there were some more routines suggested to manipulate
      scattergather buffers, I think, some of them were needed around
      crypto code... So, might be a common place reasonable, like
      lib/scattergather.c? I am open here.
      Signed-off-by: NJames Bottomley <James.Bottomley@SteelEye.com>
      cdb8c2a6
  14. 14 4月, 2006 1 次提交
  15. 27 3月, 2006 1 次提交
  16. 20 3月, 2006 1 次提交
  17. 28 2月, 2006 4 次提交
  18. 15 2月, 2006 1 次提交
    • J
      [PATCH] add scsi_execute_in_process_context() API · faead26d
      James Bottomley 提交于
      We have several points in the SCSI stack (primarily for our device
      functions) where we need to guarantee process context, but (given the
      place where the last reference was released) we cannot guarantee this.
      
      This API gets around the issue by executing the function directly if
      the caller has process context, but scheduling a workqueue to execute
      in process context if the caller doesn't have it.  Unfortunately, it
      requires memory allocation in interrupt context, but it's better than
      what we have previously.  The true solution will require a bit of
      re-engineering, so isn't appropriate for 2.6.16.
      Signed-off-by: NJames Bottomley <James.Bottomley@SteelEye.com>
      faead26d
  19. 27 1月, 2006 1 次提交
    • B
      [SCSI] Prevent scsi_execute_async from guessing cdb length · bb1d1073
      brking@us.ibm.com 提交于
      When the scsi_execute_async interface was added it ended up reducing
      the flexibility of userspace to send arbitrary scsi commands through
      sg using SG_IO. The SG_IO interface allows userspace to specify the
      CDB length. This is now ignored in scsi_execute_async and it is
      guessed using the COMMAND_SIZE macro, which is not always correct,
      particularly for vendor specific commands. This patch adds a cmd_len
      parameter to the scsi_execute_async interface to allow the caller
      to specify the length of the CDB.
      Signed-off-by: NBrian King <brking@us.ibm.com>
      Signed-off-by: NJames Bottomley <James.Bottomley@SteelEye.com>
      bb1d1073
  20. 15 1月, 2006 1 次提交
  21. 09 1月, 2006 1 次提交
  22. 06 1月, 2006 3 次提交
  23. 16 12月, 2005 2 次提交
    • J
      Fix up SCSI mismerge · 7b16318d
      James Bottomley 提交于
      I forgot to do a git-update-cache on the merged files ...
      7b16318d
    • M
      [SCSI] seperate max_sectors from max_hw_sectors · defd94b7
      Mike Christie 提交于
      - export __blk_put_request and blk_execute_rq_nowait
      needed for async REQ_BLOCK_PC requests
      - seperate max_hw_sectors and max_sectors for block/scsi_ioctl.c and
      SG_IO bio.c helpers per Jens's last comments. Since block/scsi_ioctl.c SG_IO was
      already testing against max_sectors and SCSI-ml was setting max_sectors and
      max_hw_sectors to the same value this does not change any scsi SG_IO behavior. It only
      prepares ll_rw_blk.c, scsi_ioctl.c and bio.c for when SCSI-ml begins to set
      a valid max_hw_sectors for all LLDs. Today if a LLD does not set it
      SCSI-ml sets it to a safe default and some LLDs set it to a artificial low
      value to overcome memory and feedback issues.
      
      Note: Since we now cap max_sectors to BLK_DEF_MAX_SECTORS, which is 1024,
      drivers that used to call blk_queue_max_sectors with a large value of
      max_sectors will now see the fs requests capped to BLK_DEF_MAX_SECTORS.
      Signed-off-by: NMike Christie <michaelc@cs.wisc.edu>
      Signed-off-by: NJames Bottomley <James.Bottomley@SteelEye.com>
      defd94b7
  24. 15 12月, 2005 4 次提交
  25. 14 12月, 2005 2 次提交
  26. 13 12月, 2005 1 次提交
    • L
      Revert revert of "[SCSI] fix usb storage oops" · 49d7bc64
      Linus Torvalds 提交于
      This reverts commit 1b0997f5, which in
      turn reverted 34ea80ec (which is thus
      re-instated).
      
      Quoth James Bottomley:
      
        "All it's doing is deferring the device_put() from the
         scsi_put_command() to after the scsi_run_queue(), which doesn't fix
         the sleep while atomic problem of the device release method.  In both
         cases we still get the semaphore in atomic context problem which is
         caused by scsi_reap_target() doing a device_del(), which I assumed
         (wrongly) was valid from atomic context."
      
      who also promised to fix scsi_reap_target().
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      49d7bc64
  27. 10 12月, 2005 1 次提交
  28. 03 12月, 2005 1 次提交
  29. 09 11月, 2005 1 次提交
    • G
      [SCSI] fix usb storage oops · 34ea80ec
      goggin, edward 提交于
      The problem is that scsi_run_queue is called from scsi_next_command()
      after doing a scsi_put_command.  If the command was the only thing
      holding the reference on the scsi_device then the resulting device put
      will tear down the block queue.  Fix this by taking a reference to the
      device and holding it around scsi_run_queue()
      Signed-off-by: NJames Bottomley <James.Bottomley@SteelEye.com>
      34ea80ec