1. 09 11月, 2013 1 次提交
  2. 08 11月, 2013 1 次提交
    • J
      block: fix race between request completion and timeout handling · 4912aa6c
      Jeff Moyer 提交于
      crocode i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support shpchp ioatdma dca be2net sg ses enclosure ext4 mbcache jbd2 sd_mod crc_t10dif ahci megaraid_sas(U) dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
      
      Pid: 491, comm: scsi_eh_0 Tainted: G        W  ----------------   2.6.32-220.13.1.el6.x86_64 #1 IBM  -[8722PAX]-/00D1461
      RIP: 0010:[<ffffffff8124e424>]  [<ffffffff8124e424>] blk_requeue_request+0x94/0xa0
      RSP: 0018:ffff881057eefd60  EFLAGS: 00010012
      RAX: ffff881d99e3e8a8 RBX: ffff881d99e3e780 RCX: ffff881d99e3e8a8
      RDX: ffff881d99e3e8a8 RSI: ffff881d99e3e780 RDI: ffff881d99e3e780
      RBP: ffff881057eefd80 R08: ffff881057eefe90 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: ffff881057f92338
      R13: 0000000000000000 R14: ffff881057f92338 R15: ffff883058188000
      FS:  0000000000000000(0000) GS:ffff880040200000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
      CR2: 00000000006d3ec0 CR3: 000000302cd7d000 CR4: 00000000000406b0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process scsi_eh_0 (pid: 491, threadinfo ffff881057eee000, task ffff881057e29540)
      Stack:
       0000000000001057 0000000000000286 ffff8810275efdc0 ffff881057f16000
      <0> ffff881057eefdd0 ffffffff81362323 ffff881057eefe20 ffffffff8135f393
      <0> ffff881057e29af8 ffff8810275efdc0 ffff881057eefe78 ffff881057eefe90
      Call Trace:
       [<ffffffff81362323>] __scsi_queue_insert+0xa3/0x150
       [<ffffffff8135f393>] ? scsi_eh_ready_devs+0x5e3/0x850
       [<ffffffff81362a23>] scsi_queue_insert+0x13/0x20
       [<ffffffff8135e4d4>] scsi_eh_flush_done_q+0x104/0x160
       [<ffffffff8135fb6b>] scsi_error_handler+0x35b/0x660
       [<ffffffff8135f810>] ? scsi_error_handler+0x0/0x660
       [<ffffffff810908c6>] kthread+0x96/0xa0
       [<ffffffff8100c14a>] child_rip+0xa/0x20
       [<ffffffff81090830>] ? kthread+0x0/0xa0
       [<ffffffff8100c140>] ? child_rip+0x0/0x20
      Code: 00 00 eb d1 4c 8b 2d 3c 8f 97 00 4d 85 ed 74 bf 49 8b 45 00 49 83 c5 08 48 89 de 4c 89 e7 ff d0 49 8b 45 00 48 85 c0 75 eb eb a4 <0f> 0b eb fe 0f 1f 84 00 00 00 00 00 55 48 89 e5 0f 1f 44 00 00
      RIP  [<ffffffff8124e424>] blk_requeue_request+0x94/0xa0
       RSP <ffff881057eefd60>
      
      The RIP is this line:
              BUG_ON(blk_queued_rq(rq));
      
      After digging through the code, I think there may be a race between the
      request completion and the timer handler running.
      
      A timer is started for each request put on the device's queue (see
      blk_start_request->blk_add_timer).  If the request does not complete
      before the timer expires, the timer handler (blk_rq_timed_out_timer)
      will mark the request complete atomically:
      
      static inline int blk_mark_rq_complete(struct request *rq)
      {
              return test_and_set_bit(REQ_ATOM_COMPLETE, &rq->atomic_flags);
      }
      
      and then call blk_rq_timed_out.  The latter function will call
      scsi_times_out, which will return one of BLK_EH_HANDLED,
      BLK_EH_RESET_TIMER or BLK_EH_NOT_HANDLED.  If BLK_EH_RESET_TIMER is
      returned, blk_clear_rq_complete is called, and blk_add_timer is again
      called to simply wait longer for the request to complete.
      
      Now, if the request happens to complete while this is going on, what
      happens?  Given that we know the completion handler will bail if it
      finds the REQ_ATOM_COMPLETE bit set, we need to focus on the completion
      handler running after that bit is cleared.  So, from the above
      paragraph, after the call to blk_clear_rq_complete.  If the completion
      sets REQ_ATOM_COMPLETE before the BUG_ON in blk_add_timer, we go boom
      there (I haven't seen this in the cores).  Next, if we get the
      completion before the call to list_add_tail, then the timer will
      eventually fire for an old req, which may either be freed or reallocated
      (there is evidence that this might be the case).  Finally, if the
      completion comes in *after* the addition to the timeout list, I think
      it's harmless.  The request will be removed from the timeout list,
      req_atom_complete will be set, and all will be well.
      
      This will only actually explain the coredumps *IF* the request
      structure was freed, reallocated *and* queued before the error handler
      thread had a chance to process it.  That is possible, but it may make
      sense to keep digging for another race.  I think that if this is what
      was happening, we would see other instances of this problem showing up
      as null pointer or garbage pointer dereferences, for example when the
      request structure was not re-used.  It looks like we actually do run
      into that situation in other reports.
      
      This patch moves the BUG_ON(test_bit(REQ_ATOM_COMPLETE,
      &req->atomic_flags)); from blk_add_timer to the only caller that could
      trip over it (blk_start_request).  It then inverts the calls to
      blk_clear_rq_complete and blk_add_timer in blk_rq_timed_out to address
      the race.  I've boot tested this patch, but nothing more.
      Signed-off-by: NJeff Moyer <jmoyer@redhat.com>
      Acked-by: NHannes Reinecke <hare@suse.de>
      Cc: stable@kernel.org
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      4912aa6c
  3. 01 7月, 2013 1 次提交
  4. 15 6月, 2012 1 次提交
    • A
      block: Drop dead function blk_abort_queue() · 76aaa510
      Asias He 提交于
      This function was only used by btrfs code in btrfs_abort_devices()
      (seems in a wrong way).
      
      It was removed in commit d07eb911,
      So, Let's remove the dead code to avoid any confusion.
      
      Changes in v2: update commit log, btrfs_abort_devices() was removed
      already.
      
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: linux-kernel@vger.kernel.org
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: linux-btrfs@vger.kernel.org
      Cc: David Sterba <dave@jikos.cz>
      Signed-off-by: NAsias He <asias@redhat.com>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      76aaa510
  5. 04 8月, 2011 1 次提交
    • A
      fault-injection: add ability to export fault_attr in arbitrary directory · dd48c085
      Akinobu Mita 提交于
      init_fault_attr_dentries() is used to export fault_attr via debugfs.
      But it can only export it in debugfs root directory.
      
      Per Forlin is working on mmc_fail_request which adds support to inject
      data errors after a completed host transfer in MMC subsystem.
      
      The fault_attr for mmc_fail_request should be defined per mmc host and
      export it in debugfs directory per mmc host like
      /sys/kernel/debug/mmc0/mmc_fail_request.
      
      init_fault_attr_dentries() doesn't help for mmc_fail_request.  So this
      introduces fault_create_debugfs_attr() which is able to create a
      directory in the arbitrary directory and replace
      init_fault_attr_dentries().
      
      [akpm@linux-foundation.org: extraneous semicolon, per Randy]
      Signed-off-by: NAkinobu Mita <akinobu.mita@gmail.com>
      Tested-by: NPer Forlin <per.forlin@linaro.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Matt Mackall <mpm@selenic.com>
      Cc: Randy Dunlap <rdunlap@xenotime.net>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      dd48c085
  6. 21 4月, 2010 1 次提交
  7. 15 4月, 2010 1 次提交
  8. 28 4月, 2009 1 次提交
  9. 24 4月, 2009 1 次提交
    • H
      block: fix intermittent dm timeout based oops · 17d5c8ca
      Hannes Reinecke 提交于
      Very rarely under stress testing of dm, oopses are occuring as
      something tampers with an old stack frame.  This has been traced back
      to blk_abort_queue() leaving a timeout_list pointing to the stack.
      The reason is that sometimes blk_abort_request() won't delete the
      timer (if the request is marked as complete but before the timer has
      been removed, a small race window).  Fix this by splicing back from
      the ususally empty list to the q->timeout_list.
      Signed-off-by: NHannes Reinecke <hare@suse.de>
      Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
      17d5c8ca
  10. 22 4月, 2009 1 次提交
  11. 18 2月, 2009 1 次提交
  12. 29 12月, 2008 3 次提交
  13. 06 11月, 2008 1 次提交
  14. 09 10月, 2008 4 次提交