提交 · c7d1ba417c7cb7297d14dd47a390ec90ce548d5c · openanolis / cloud-kernel

09 11月, 2013 1 次提交

block: replace IS_ERR and PTR_ERR with PTR_ERR_OR_ZERO · 8616ebb1

由 Duan Jiong 提交于 11月 06, 2013

This patch fixes coccinelle error regarding usage of IS_ERR and
PTR_ERR instead of PTR_ERR_OR_ZERO.
Signed-off-by: NDuan Jiong <duanj.fnst@cn.fujitsu.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

8616ebb1

08 11月, 2013 1 次提交

block: fix race between request completion and timeout handling · 4912aa6c

由 Jeff Moyer 提交于 10月 08, 2013

crocode i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support shpchp ioatdma dca be2net sg ses enclosure ext4 mbcache jbd2 sd_mod crc_t10dif ahci megaraid_sas(U) dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]

Pid: 491, comm: scsi_eh_0 Tainted: G        W  ----------------   2.6.32-220.13.1.el6.x86_64 #1 IBM  -[8722PAX]-/00D1461
RIP: 0010:[<ffffffff8124e424>]  [<ffffffff8124e424>] blk_requeue_request+0x94/0xa0
RSP: 0018:ffff881057eefd60  EFLAGS: 00010012
RAX: ffff881d99e3e8a8 RBX: ffff881d99e3e780 RCX: ffff881d99e3e8a8
RDX: ffff881d99e3e8a8 RSI: ffff881d99e3e780 RDI: ffff881d99e3e780
RBP: ffff881057eefd80 R08: ffff881057eefe90 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff881057f92338
R13: 0000000000000000 R14: ffff881057f92338 R15: ffff883058188000
FS:  0000000000000000(0000) GS:ffff880040200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00000000006d3ec0 CR3: 000000302cd7d000 CR4: 00000000000406b0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process scsi_eh_0 (pid: 491, threadinfo ffff881057eee000, task ffff881057e29540)
Stack:
 0000000000001057 0000000000000286 ffff8810275efdc0 ffff881057f16000
<0> ffff881057eefdd0 ffffffff81362323 ffff881057eefe20 ffffffff8135f393
<0> ffff881057e29af8 ffff8810275efdc0 ffff881057eefe78 ffff881057eefe90
Call Trace:
 [<ffffffff81362323>] __scsi_queue_insert+0xa3/0x150
 [<ffffffff8135f393>] ? scsi_eh_ready_devs+0x5e3/0x850
 [<ffffffff81362a23>] scsi_queue_insert+0x13/0x20
 [<ffffffff8135e4d4>] scsi_eh_flush_done_q+0x104/0x160
 [<ffffffff8135fb6b>] scsi_error_handler+0x35b/0x660
 [<ffffffff8135f810>] ? scsi_error_handler+0x0/0x660
 [<ffffffff810908c6>] kthread+0x96/0xa0
 [<ffffffff8100c14a>] child_rip+0xa/0x20
 [<ffffffff81090830>] ? kthread+0x0/0xa0
 [<ffffffff8100c140>] ? child_rip+0x0/0x20
Code: 00 00 eb d1 4c 8b 2d 3c 8f 97 00 4d 85 ed 74 bf 49 8b 45 00 49 83 c5 08 48 89 de 4c 89 e7 ff d0 49 8b 45 00 48 85 c0 75 eb eb a4 <0f> 0b eb fe 0f 1f 84 00 00 00 00 00 55 48 89 e5 0f 1f 44 00 00
RIP  [<ffffffff8124e424>] blk_requeue_request+0x94/0xa0
 RSP <ffff881057eefd60>

The RIP is this line:
        BUG_ON(blk_queued_rq(rq));

After digging through the code, I think there may be a race between the
request completion and the timer handler running.

A timer is started for each request put on the device's queue (see
blk_start_request->blk_add_timer).  If the request does not complete
before the timer expires, the timer handler (blk_rq_timed_out_timer)
will mark the request complete atomically:

static inline int blk_mark_rq_complete(struct request *rq)
{
        return test_and_set_bit(REQ_ATOM_COMPLETE, &rq->atomic_flags);
}

and then call blk_rq_timed_out.  The latter function will call
scsi_times_out, which will return one of BLK_EH_HANDLED,
BLK_EH_RESET_TIMER or BLK_EH_NOT_HANDLED.  If BLK_EH_RESET_TIMER is
returned, blk_clear_rq_complete is called, and blk_add_timer is again
called to simply wait longer for the request to complete.

Now, if the request happens to complete while this is going on, what
happens?  Given that we know the completion handler will bail if it
finds the REQ_ATOM_COMPLETE bit set, we need to focus on the completion
handler running after that bit is cleared.  So, from the above
paragraph, after the call to blk_clear_rq_complete.  If the completion
sets REQ_ATOM_COMPLETE before the BUG_ON in blk_add_timer, we go boom
there (I haven't seen this in the cores).  Next, if we get the
completion before the call to list_add_tail, then the timer will
eventually fire for an old req, which may either be freed or reallocated
(there is evidence that this might be the case).  Finally, if the
completion comes in *after* the addition to the timeout list, I think
it's harmless.  The request will be removed from the timeout list,
req_atom_complete will be set, and all will be well.

This will only actually explain the coredumps *IF* the request
structure was freed, reallocated *and* queued before the error handler
thread had a chance to process it.  That is possible, but it may make
sense to keep digging for another race.  I think that if this is what
was happening, we would see other instances of this problem showing up
as null pointer or garbage pointer dereferences, for example when the
request structure was not re-used.  It looks like we actually do run
into that situation in other reports.

This patch moves the BUG_ON(test_bit(REQ_ATOM_COMPLETE,
&req->atomic_flags)); from blk_add_timer to the only caller that could
trip over it (blk_start_request).  It then inverts the calls to
blk_clear_rq_complete and blk_add_timer in blk_rq_timed_out to address
the race.  I've boot tested this patch, but nothing more.
Signed-off-by: NJeff Moyer <jmoyer@redhat.com>
Acked-by: NHannes Reinecke <hare@suse.de>
Cc: stable@kernel.org
Signed-off-by: NJens Axboe <axboe@kernel.dk>

4912aa6c

01 7月, 2013 1 次提交

block: check for timeout function in blk_rq_timed_out() · 80bd7181

由 Hannes Reinecke 提交于 1月 30, 2013

rq_timed_out_fn might have been unset while the request
was in flight, so we need to check for it in blk_rq_timed_out().
Acked-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NHannes Reinecke <hare@suse.de>
Signed-off-by: NStefan Weinhuber <wein@de.ibm.com>
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

80bd7181

15 6月, 2012 1 次提交

block: Drop dead function blk_abort_queue() · 76aaa510

由 Asias He 提交于 6月 14, 2012

This function was only used by btrfs code in btrfs_abort_devices()
(seems in a wrong way).

It was removed in commit d07eb911,
So, Let's remove the dead code to avoid any confusion.

Changes in v2: update commit log, btrfs_abort_devices() was removed
already.

Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-kernel@vger.kernel.org
Cc: Chris Mason <chris.mason@oracle.com>
Cc: linux-btrfs@vger.kernel.org
Cc: David Sterba <dave@jikos.cz>
Signed-off-by: NAsias He <asias@redhat.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

76aaa510

04 8月, 2011 1 次提交

fault-injection: add ability to export fault_attr in arbitrary directory · dd48c085

由 Akinobu Mita 提交于 8月 03, 2011

init_fault_attr_dentries() is used to export fault_attr via debugfs.
But it can only export it in debugfs root directory.

Per Forlin is working on mmc_fail_request which adds support to inject
data errors after a completed host transfer in MMC subsystem.

The fault_attr for mmc_fail_request should be defined per mmc host and
export it in debugfs directory per mmc host like
/sys/kernel/debug/mmc0/mmc_fail_request.

init_fault_attr_dentries() doesn't help for mmc_fail_request.  So this
introduces fault_create_debugfs_attr() which is able to create a
directory in the arbitrary directory and replace
init_fault_attr_dentries().

[akpm@linux-foundation.org: extraneous semicolon, per Randy]
Signed-off-by: NAkinobu Mita <akinobu.mita@gmail.com>
Tested-by: NPer Forlin <per.forlin@linaro.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

dd48c085

21 4月, 2010 1 次提交

block: ensure jiffies wrap is handled correctly in blk_rq_timed_out_timer · a534dbe9

由 Richard Kennedy 提交于 4月 14, 2010

blk_rq_timed_out_timer() relied on blk_add_timer() never returning a
timer value of zero, but commit 7838c15b
removed the code that bumped this value when it was zero.
Therefore when jiffies is near wrap we could get unlucky & not set the
timeout value correctly.

This patch uses a flag to indicate that the timeout value was set and so
handles jiffies wrap correctly, and it keeps all the logic in one
function so should be easier to maintain in the future.
Signed-off-by: NRichard Kennedy <richard@rsk.demon.co.uk>
Cc: stable@kernel.org
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

a534dbe9

15 4月, 2010 1 次提交

block: ensure jiffies wrap is handled correctly in blk_rq_timed_out_timer · c0d97e9c

由 Richard Kennedy 提交于 4月 14, 2010

blk_rq_timed_out_timer() relied on blk_add_timer() never returning a
timer value of zero, but commit 7838c15b
removed the code that bumped this value when it was zero.
Therefore when jiffies is near wrap we could get unlucky & not set the
timeout value correctly.

This patch uses a flag to indicate that the timeout value was set and so
handles jiffies wrap correctly, and it keeps all the logic in one
function so should be easier to maintain in the future.
Signed-off-by: NRichard Kennedy <richard@rsk.demon.co.uk>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

c0d97e9c

28 4月, 2009 1 次提交

block: clean up misc stuff after block layer timeout conversion · 2eef33e4

由 Tejun Heo 提交于 4月 23, 2009

* In blk_rq_timed_out_timer(), else { if } to else if

* In blk_add_timer(), simplify if/else block

[ Impact: cleanup ]
Signed-off-by: NTejun Heo <tj@kernel.org>

2eef33e4

24 4月, 2009 1 次提交

block: fix intermittent dm timeout based oops · 17d5c8ca

由 Hannes Reinecke 提交于 4月 23, 2009

Very rarely under stress testing of dm, oopses are occuring as
something tampers with an old stack frame.  This has been traced back
to blk_abort_queue() leaving a timeout_list pointing to the stack.
The reason is that sometimes blk_abort_request() won't delete the
timer (if the request is marked as complete but before the timer has
been removed, a small race window).  Fix this by splicing back from
the ususally empty list to the q->timeout_list.
Signed-off-by: NHannes Reinecke <hare@suse.de>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

17d5c8ca

22 4月, 2009 1 次提交

block: make blk_abort_queue() ignore non-request based devices · b7591134

由 Jens Axboe 提交于 4月 17, 2009

There's nothing to do for those devices, since the timeout handling is
based on requests.
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

b7591134

18 2月, 2009 1 次提交

block: fix deadlock in blk_abort_queue() for drivers that readd to timeout list · be987fdb

由 Hannes Reinecke 提交于 2月 18, 2009

blk_abort_queue() iterates the timeout list and aborts each request on the
list, but if the driver error handling readds a request to the timeout list
during this processing, we could be looping forever. Fix this by splicing
current entries to a local list and run over that list instead.
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

be987fdb

29 12月, 2008 3 次提交

block: leave the request timeout timer running even on an empty list · 70ed28b9

由 Jens Axboe 提交于 11月 19, 2008

For sync IO, we'll often do them serialized. This means we'll be touching
the queue timer for every IO, as opposed to only occasionally like we
do for queued IO. Instead of deleting the timer when the last request
is removed, just let continue running. If a new request comes up soon
we then don't have to readd the timer again. If no new requests arrive,
the timer will expire without side effect later.

This improves high iops sync IO by ~1%.
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

70ed28b9

J
block: add comment in blk_rq_timed_out() about why next can not be 0 · 65d3618c
由 Jens Axboe 提交于 10月 30, 2008
```
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
```
65d3618c

block: optimizations in blk_rq_timed_out_timer() · 565e411d

由 malahal@us.ibm.com 提交于 10月 30, 2008

Now the rq->deadline can't be zero if the request is in the
timeout_list, so there is no need to have next_set. There is no need to
access a request's deadline field if blk_rq_timed_out is called on it.
Signed-off-by: NMalahal Naineni <malahal@us.ibm.com>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

565e411d

06 11月, 2008 1 次提交

Block: use round_jiffies_up() · 7838c15b

由 Alan Stern 提交于 11月 06, 2008

This patch (as1159b) changes the timeout routines in the block core to
use round_jiffies_up().  There's no point in rounding the timer
deadline down, since if it expires too early we will have to restart
it.

The patch also removes some unnecessary tests when a request is
removed from the queue's timer list.
Signed-off-by: NAlan Stern <stern@rowland.harvard.edu>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

7838c15b

09 10月, 2008 4 次提交

block: use rq complete marking in blk_abort_request() · 7ba1fbaa

由 Jens Axboe 提交于 9月 16, 2008

We cannot abort a request if we raced with the timeout handler already,
or with the IO completion. So make blk_abort_request() mark the request
as complete, and only continue if we succeeded.

Found and suggested by Mike Anderson <andmike@linux.vnet.ibm.com>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

7ba1fbaa

block: add fault injection mechanism for faking request timeouts · 581d4e28

由 Jens Axboe 提交于 9月 14, 2008

Only works for the generic request timer handling. Allows one to
sporadically ignore request completions, thus exercising the timeout
handling.
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

581d4e28

block: Add interface to abort queued requests · 11914a53

由 Mike Anderson 提交于 9月 13, 2008

Signed-off-by: NMike Anderson <andmike@linux.vnet.ibm.com>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

11914a53

block: unify request timeout handling · 242f9dcb

由 Jens Axboe 提交于 9月 14, 2008

Right now SCSI and others do their own command timeout handling.
Move those bits to the block layer.

Instead of having a timer per command, we try to be a bit more clever
and simply have one per-queue. This avoids the overhead of having to
tear down and setup a timer for each command, so it will result in a lot
less timer fiddling.
Signed-off-by: NMike Anderson <andmike@linux.vnet.ibm.com>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

242f9dcb

openanolis / cloud-kernel 大约 1 年 前同步成功

openanolis / cloud-kernel
大约 1 年前同步成功