- 22 7月, 2011 1 次提交
-
-
由 James Bottomley 提交于
USB surprise removal of sr is triggering an oops in scsi_dispatch_command(). What seems to be happening is that USB is hanging on to a queue reference until the last close of the upper device, so the crash is caused by surprise remove of a mounted CD followed by attempted unmount. The problem is that USB doesn't issue its final commands as part of the SCSI teardown path, but on last close when the block queue is long gone. The long term fix is probably to make sr do the teardown in the same way as sd (so remove all the lower bits on ejection, but keep the upper disk alive until last close of user space). However, the current oops can be simply fixed by not allowing any commands to be sent to a dead queue. Cc: stable@kernel.org Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
-
- 27 6月, 2011 2 次提交
-
-
由 Shaohua Li 提交于
ioc->ioc_data is rcu protectd, so uses correct API to access it. This doesn't change any behavior, but just make code consistent. Signed-off-by: NShaohua Li <shaohua.li@intel.com> Cc: stable@kernel.org # after ab4bd22dSigned-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Shaohua Li 提交于
I got a rcu warnning at boot. the ioc->ioc_data is rcu_deferenced, but doesn't hold rcu_read_lock. Signed-off-by: NShaohua Li <shaohua.li@intel.com> Cc: stable@kernel.org # after ab4bd22dSigned-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 13 6月, 2011 1 次提交
-
-
由 Joe Perches 提交于
Use the compiler to verify format strings and arguments. Fix fallout. Signed-off-by: NJoe Perches <joe@perches.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 10 6月, 2011 3 次提交
-
-
由 Tejun Heo 提交于
disk_block_events() should guarantee that the event work is not in flight on return and once blocked it shouldn't issue further cancellations. Because there was no synchronization between the first blocker doing cancel_delayed_work_sync() and the following blockers, the following blockers could finish before cancellation was complete, which broke both guarantees - event work could be in flight and cancellation could happen after return. This bug triggered WARN_ON_ONCE() in disk_clear_events() reported in bug#34662. https://bugzilla.kernel.org/show_bug.cgi?id=34662 Fix it by adding an outer mutex which protects both block count manipulation and work cancellation. -v2: Use outer mutex instead of bit waitqueue per Linus. Signed-off-by: NTejun Heo <tj@kernel.org> Tested-by: NSitsofe Wheeler <sitsofe@yahoo.com> Reported-by: NSitsofe Wheeler <sitsofe@yahoo.com> Reported-by: NBorislav Petkov <bp@alien8.de> Reported-by: NMeelis Roos <mroos@linux.ee> Reported-by: NLinus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Tejun Heo 提交于
After the previous update to disk_check_events(), nobody is using non-syncing __disk_block_events(). Remove @sync and, as this makes __disk_block_events() virtually identical to disk_block_events(), remove the underscore prefixed version. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Tejun Heo 提交于
This patch is part of fix for triggering of WARN_ON_ONCE() in disk_clear_events() reported in bug#34662. https://bugzilla.kernel.org/show_bug.cgi?id=34662 disk_clear_events() blocks events, schedules and flushes the event work. It expects the work to have started execution on schedule and finished on return from flush. WARN_ON_ONCE() triggers if the event work hasn't executed as expected. This problem happens because __disk_block_events() fails to guarantee that the event work item is not in flight on return from the function in race-free manner. The problem is two-fold and this patch addresses one of them. When __disk_block_events() is called with @sync == %false, it bumps event block count, calls cancel_delayed_work() and return. This makes it impossible to guarantee that event polling is not in flight on return from syncing __disk_block_events() - if the first blocker was non-syncing, polling could still be in progress and later syncing ones would assume that the first blocker already canceled it. Making __disk_block_events() cancel_sync regardless of block count isn't feasible either as it may race with forced event checking in disk_clear_events(). As disk_check_events() is the only user of non-syncing __disk_block_events(), updating it to directly cancel and schedule event work is the easiest way to solve the issue. Note that there's another bug in __disk_block_events() and this patch doesn't fix the issue completely. Later patch will fix the other bug. Signed-off-by: NTejun Heo <tj@kernel.org> Tested-by: NSitsofe Wheeler <sitsofe@yahoo.com> Reported-by: NSitsofe Wheeler <sitsofe@yahoo.com> Reported-by: NBorislav Petkov <bp@alien8.de> Reported-by: NMeelis Roos <mroos@linux.ee> Reported-by: NLinus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 06 6月, 2011 1 次提交
-
-
由 Jens Axboe 提交于
Since we are modifying this RCU pointer, we need to hold the lock protecting it around it. This fixes a potential reuse and double free of a cfq io_context structure. The bug has been in CFQ for a long time, it hit very few people but those it did hit seemed to see it a lot. Tracked in RH bugzilla here: https://bugzilla.redhat.com/show_bug.cgi?id=577968 Credit goes to Paul Bolle for figuring out that the issue was around the one-hit ioc->ioc_data cache. Thanks to his hard work the issue is now fixed. Cc: stable@kernel.org Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 02 6月, 2011 2 次提交
-
-
由 Paul Bolle 提交于
list_entry() and hlist_entry() are both simply aliases for container_of(), but since io_context.cic_list.first is an hlist_node one should at least use the correct alias. Signed-off-by: NPaul Bolle <pebolle@tiscali.nl> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Paul Bolle 提交于
queue_fail can only be reached if cic is NULL, so its check for cic must be bogus. Signed-off-by: NPaul Bolle <pebolle@tiscali.nl> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 01 6月, 2011 1 次提交
-
-
由 Kyungmin Park 提交于
Fix comment typo and remove unnecessary semicolon at macro Signed-off-by: NKyungmin Park <kyungmin.park@samsung.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 27 5月, 2011 4 次提交
-
-
由 Jens Axboe 提交于
We need them in SCSI to fix a bug, but currently they are not exported to modules. Export them. Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Ben Blum 提交于
Add cgroup subsystem callbacks for per-thread attachment in atomic contexts Add can_attach_task(), pre_attach(), and attach_task() as new callbacks for cgroups's subsystem interface. Unlike can_attach and attach, these are for per-thread operations, to be called potentially many times when attaching an entire threadgroup. Also, the old "bool threadgroup" interface is removed, as replaced by this. All subsystems are modified for the new interface - of note is cpuset, which requires from/to nodemasks for attach to be globally scoped (though per-cpuset would work too) to persist from its pre_attach to attach_task and attach. This is a pre-patch for cgroup-procs-writable.patch. Signed-off-by: NBen Blum <bblum@andrew.cmu.edu> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Matt Helsley <matthltc@us.ibm.com> Reviewed-by: NPaul Menage <menage@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Luca Tettamanti 提交于
sector is never read inside the function. Signed-off-by: NLuca Tettamanti <kronos.it@gmail.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Tejun Heo 提交于
9fd097b1 (block: unexport DISK_EVENT_MEDIA_CHANGE for legacy/fringe drivers) removed DISK_EVENT_MEDIA_CHANGE from legacy/fringe block drivers which have inadequate ->check_events(). Combined with earlier change 7c88a168 (block: don't propagate unlisted DISK_EVENTs to userland), this enables using ->check_events() for internal processing while avoiding enabling in-kernel block event polling which can lead to infinite event loop. Unfortunately, this made many drivers including floppy without any bit set in disk->events and ->async_events in which case disk_add_events() simply skipped allocation of disk->ev, which disables whole event handling. As ->check_events() is still used during open processing for revalidation, this can lead to open failure. This patch always allocates disk->ev if ->check_events is implemented. In the long term, it would make sense to simply include the event structure inline into genhd as it's now used by virtually all block devices. Signed-off-by: NTejun Heo <tj@kernel.org> Reported-by: NOndrej Zary <linux@rainbow-software.org> Reported-by: NAlex Villacis Lasso <avillaci@ceibo.fiec.espol.edu.ec> Cc: stable@kernel.org Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 24 5月, 2011 5 次提交
-
-
由 Namhyung Kim 提交于
When struct cfq_data allocation fails, cic_index need to be freed. Signed-off-by: NNamhyung Kim <namhyung@gmail.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Namhyung Kim 提交于
The 'group_changed' variable is initialized to 0 and never changed, so checking the variable is meaningless. It is a leftover from 0bbfeb83 ("cfq-iosched: Always provide group iosolation."). Let's get rid of it. Signed-off-by: NNamhyung Kim <namhyung@gmail.com> Cc: Justin TerAvest <teravest@google.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Namhyung Kim 提交于
Reduce the number of bit operations in cfq_choose_req() on average (and worst) cases. Signed-off-by: NNamhyung Kim <namhyung@gmail.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Namhyung Kim 提交于
Simplify the calculation in cfq_prio_to_maxrq(), plus replace CFQ_PRIO_LISTS to IOPRIO_BE_NR since they are the same and IOPRIO_BE_NR looks more reasonable in this context IMHO. Signed-off-by: NNamhyung Kim <namhyung@gmail.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Vivek Goyal 提交于
If we don't explicitly initialize it to zero, CFQ might think that cgroup of ioc has changed and it generates lots of unnecessary calls to call_for_each_cic(changed_cgroup). Fix it. cfq_get_io_context() cfq_ioc_set_cgroup() call_for_each_cic(ioc, changed_cgroup) Signed-off-by: NVivek Goyal <vgoyal@redhat.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 23 5月, 2011 3 次提交
-
-
由 Vivek Goyal 提交于
Commit 73c10101 ("block: initial patch for on-stack per-task plugging") removed calls to elv_bio_merged() when @bio merged with @req. Re-add them. This in turn will update merged stats in associated group. That should be safe as long as request has got reference to the blkio_group. Signed-off-by: NNamhyung Kim <namhyung@gmail.com> Cc: Divyesh Shah <dpshah@google.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Vivek Goyal 提交于
Make BLKIO_STAT_MERGED per cpu hence gettring rid of need of taking blkg->stats_lock. Signed-off-by: NVivek Goyal <vgoyal@redhat.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Vivek Goyal 提交于
We allocated per cpu stats struct for root group but did not free it. Fix it. Signed-off-by: NVivek Goyal <vgoyal@redhat.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 21 5月, 2011 14 次提交
-
-
由 Jens Axboe 提交于
We don't need them anymore, so kill: - REQ_ON_PLUG checks in various places - !rq_mergeable() check in plug merging Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Vivek Goyal 提交于
Currently we take a queue lock on each bio to check if there are any throttling rules associated with the group and also update the stats. Now access the group under rcu and update the stats without taking the queue lock. Queue lock is taken only if there are throttling rules associated with the group. So the common case of root group when there are no rules, save unnecessary pounding of request queue lock. Signed-off-by: NVivek Goyal <vgoyal@redhat.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Vivek Goyal 提交于
Now dispatch stats update is lock free. But reset of these stats still takes blkg->stats_lock and is dependent on that. As stats are per cpu, we should be able to just reset the stats on each cpu without any locks. (Atleast for 64bit arch). On 32bit arch there is a small race where 64bit updates are not atomic. The result of this race can be that in the presence of other writers, one might not get 0 value after reset of a stat and might see something intermediate One can write more complicated code to cover this race like sending IPI to other cpus to reset stats and for offline cpus, reset these directly. Right not I am not taking that path because reset_update is more of a debug feature and it can happen only on 32bit arch and possibility of it happening is small. Will fix it if it becomes a real problem. For the time being going for code simplicity. Signed-off-by: NVivek Goyal <vgoyal@redhat.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Vivek Goyal 提交于
Some of the stats are 64bit and updation will be non atomic on 32bit architecture. Use sequence counters on 32bit arch to make reading of stats safe. Signed-off-by: NVivek Goyal <vgoyal@redhat.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Vivek Goyal 提交于
Currently we take blkg_stat lock for even updating the stats. So even if a group has no throttling rules (common case for root group), we end up taking blkg_lock, for updating the stats. Make dispatch stats per cpu so that these can be updated without taking blkg lock. If cpu goes offline, these stats simply disappear. No protection has been provided for that yet. Do we really need anything for that? Signed-off-by: NVivek Goyal <vgoyal@redhat.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Vivek Goyal 提交于
Soon we will allow accessing a throtl_grp under rcu_read_lock(). Hence start freeing up throtl_grp after one rcu grace period. Signed-off-by: NVivek Goyal <vgoyal@redhat.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Vivek Goyal 提交于
Use same helper function for root group as we use with dynamically allocated groups to add it to various lists. Signed-off-by: NVivek Goyal <vgoyal@redhat.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Vivek Goyal 提交于
A helper function for the code which is used at 2-3 places. Makes reading code little easier. Signed-off-by: NVivek Goyal <vgoyal@redhat.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Vivek Goyal 提交于
Currently, we allocate root throtl_grp statically. But as we will be introducing per cpu stat pointers and that will be allocated dynamically even for root group, we might as well make whole root throtl_grp allocation dynamic and treat it in same manner as other groups. Signed-off-by: NVivek Goyal <vgoyal@redhat.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Vivek Goyal 提交于
Currently, all the cfq_group or throtl_group allocations happen while we are holding ->queue_lock and sleeping is not allowed. Soon, we will move to per cpu stats and also need to allocate the per group stats. As one can not call alloc_percpu() from atomic context as it can sleep, we need to drop ->queue_lock, allocate the group, retake the lock and continue processing. In throttling code, I check the queue DEAD flag again to make sure that driver did not call blk_cleanup_queue() in the mean time. Signed-off-by: NVivek Goyal <vgoyal@redhat.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Vivek Goyal 提交于
blkg->key = cfqd is an rcu protected pointer and hence we used to do call_rcu(cfqd->rcu_head) to free up cfqd after one rcu grace period. The problem here is that even though cfqd is around, there are no gurantees that associated request queue (td->queue) or q->queue_lock is still around. A driver might have called blk_cleanup_queue() and release the lock. It might happen that after freeing up the lock we call blkg->key->queue->queue_ock and crash. This is possible in following path. blkiocg_destroy() blkio_unlink_group_fn() cfq_unlink_blkio_group() Hence, wait for an rcu peirod if there are groups which have not been unlinked from blkcg->blkg_list. That way, if there are any groups which are taking cfq_unlink_blkio_group() path, can safely take queue lock. This is how we have taken care of race in throttling logic also. Signed-off-by: NVivek Goyal <vgoyal@redhat.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Vivek Goyal 提交于
Nobody seems to be using cfq_find_alloc_cfqg() function parameter "create". Get rid of that. Signed-off-by: NVivek Goyal <vgoyal@redhat.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Vivek Goyal 提交于
cgroup unaccounted_time file is created only if CONFIG_DEBUG_BLK_CGROUP=y. there are some fields which are out side this config option. Fix that. Signed-off-by: NVivek Goyal <vgoyal@redhat.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Vivek Goyal 提交于
Group initialization code seems to be at two places. root group initialization in blk_throtl_init() and dynamically allocated group in throtl_find_alloc_tg(). Create a common function and use at both the places. Signed-off-by: NVivek Goyal <vgoyal@redhat.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 19 5月, 2011 1 次提交
-
-
由 James Bottomley 提交于
blk_cleanup_queue() calls elevator_exit() and after this, we can't touch the elevator without oopsing. __elv_next_request() must check for this state because in the refcounted queue model, we can still call it after blk_cleanup_queue() has been called. This was reported as causing an oops attributable to scsi. Signed-off-by: NJames Bottomley <James.Bottomley@suse.de> Cc: stable@kernel.org Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 18 5月, 2011 2 次提交
-
-
由 Shaohua Li 提交于
Let's check a scenario: 1. blk_delay_queue(q, SCSI_QUEUE_DELAY); 2. blk_run_queue_async(); the second one will became a noop, because q->delay_work already has WORK_STRUCT_PENDING_BIT set, so the delayed work will still run after SCSI_QUEUE_DELAY. But blk_run_queue_async actually hopes the delayed work runs immediately. Fix this by doing a cancel on potentially pending delayed work before queuing an immediate run of the workqueue. Signed-off-by: NShaohua Li <shaohua.li@intel.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Martin K. Petersen 提交于
In some cases we would end up stacking discard_zeroes_data incorrectly. Fix this by enabling the feature by default for stacking drivers and clearing it for low-level drivers. Incorporating a device that does not support dzd will then cause the feature to be disabled in the stacking driver. Also ensure that the maximum discard value does not overflow when exported in sysfs and return 0 in the alignment and dzd fields for devices that don't support discard. Reported-by: NLukas Czerner <lczerner@redhat.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com> Acked-by: NMike Snitzer <snitzer@redhat.com> Cc: stable@kernel.org Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-