- 10 2月, 2015 15 次提交
-
-
由 Johannes Thumshirn 提交于
Currently the cleanup of all error cases are open-coded. Introduce a common exit path and labels. Signed-off-by: NJohannes Thumshirn <morbidrsa@gmail.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Rickard Strandqvist 提交于
The thin-pool target doesn't display the data block size as part of its table status, unlike the dm-cache target, so there is no need for dm_pool_get_data_block_size(). This was found using cppcheck. Signed-off-by: NRickard Strandqvist <rickard_strandqvist@spectrumdigital.se> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Junxiao Bi 提交于
dm_table_put() was replaced by dm_put_live_table(). Signed-off-by: NJunxiao Bi <junxiao.bi@oracle.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Loic Pefferkorn 提交于
Update the obsolete url in the CONFIG_DM_CRYPT help text. Signed-off-by: NLoic Pefferkorn <loic@loicp.eu> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Asaf Vertz 提交于
To be future-proof and for better readability the time comparison is modified to use time_after_eq() instead of plain, error-prone math. Signed-off-by: NAsaf Vertz <asaf.vertz@tandemg.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Manuel Schölling 提交于
To be future-proof and for better readability the time comparisons are modified to use time_in_range() and time_after() instead of plain, error-prone math. Signed-off-by: NManuel Schölling <manuel.schoelling@gmx.de> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Dan Carpenter 提交于
My static checker complains that if "num_raid_params" is UINT_MAX then the "if (num_raid_params + 1 > argc) {" check doesn't work as intended. The other change is that I moved the "if (argc != (num_raid_devs * 2))" condition forward a few lines so it was before the call to context_alloc(). If we had an integer overflow inside that function then it would lead to an immediate crash. Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Mike Snitzer 提交于
Otherwise replacing the multipath target with the error target fails: device-mapper: ioctl: can't change device type after initial table load. The error target was mistakenly considered to be target type DM_TYPE_REQUEST_BASED rather than DM_TYPE_MQ_REQUEST_BASED even if the target it was to replace was of type DM_TYPE_MQ_REQUEST_BASED. Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Mike Snitzer 提交于
For blk-mq request-based DM the responsibility of allocating a cloned request is transfered from DM core to the target type. Doing so enables the cloned request to be allocated from the appropriate blk-mq request_queue's pool (only the DM target, e.g. multipath, can know which block device to send a given cloned request to). Care was taken to preserve compatibility with old-style block request completion that requires request-based DM _not_ acquire the clone request's queue lock in the completion path. As such, there are now 2 different request-based DM target_type interfaces: 1) the original .map_rq() interface will continue to be used for non-blk-mq devices -- the preallocated clone request is passed in from DM core. 2) a new .clone_and_map_rq() and .release_clone_rq() will be used for blk-mq devices -- blk_get_request() and blk_put_request() are used respectively from these hooks. dm_table_set_type() was updated to detect if the request-based target is being stacked on blk-mq devices, if so DM_TYPE_MQ_REQUEST_BASED is set. DM core disallows switching the DM table's type after it is set. This means that there is no mixing of non-blk-mq and blk-mq devices within the same request-based DM table. [This patch was started by Keith and later heavily modified by Mike] Tested-by: NBart Van Assche <bvanassche@acm.org> Signed-off-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Keith Busch 提交于
For blk-mq request-based DM the responsibility of allocating a cloned request will be transfered from DM core to the target type. To prepare for conditionally using this new model the original request's 'special' now points to the dm_rq_target_io because the clone is allocated later in the block layer rather than in DM core. Signed-off-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Keith Busch 提交于
Switch to having request-based DM enqueue all prep'ed requests into work processed by another thread. This allows request-based DM to invoke block APIs that assume interrupt enabled context (e.g. blk_get_request) and is a prerequisite for adding blk-mq support to request-based DM. The new kernel thread is only initialized for request-based DM devices. multipath_map() is now always in irq enabled context so change multipath spinlock (m->lock) locking to always disable interrupts. Signed-off-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Mike Snitzer 提交于
Request-based DM support for blk-mq devices requires that dm_rq_target_io structures not be allocated with an embedded request structure. The request-based DM target (e.g. dm-multipath) must allocate the request from the blk-mq devices' request_queue using blk_get_request(). The unfortunate side-effect of this change is old-style request-based DM support will no longer use contiguous memory for the dm_rq_target_io and request structures for each clone. Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Mike Snitzer 提交于
Remove exports for dm_dispatch_request, dm_requeue_unmapped_request, and dm_kill_unmapped_request. Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
由 Mike Snitzer 提交于
Commit febf7158 ("block: require blk_rq_prep_clone() be given an initialized clone request") introduced a regression by calling blk_rq_init() on the original request rather than the clone request that is passed to setup_clone(). Signed-off-by: NMike Snitzer <snitzer@redhat.com> Fixes: febf7158 ("block: require blk_rq_prep_clone() be given an initialized clone request") Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Konstantin Khlebnikov 提交于
Cfq_lookup_create_cfqg() allocates struct blkcg_gq using GFP_ATOMIC. In cfq_find_alloc_queue() possible allocation failure is not handled. As a result kernel oopses on NULL pointer dereference when cfq_link_cfqq_cfqg() calls cfqg_get() for NULL pointer. Bug was introduced in v3.5 in commit cd1604fa ("blkcg: factor out blkio_group creation"). Prior to that commit cfq group lookup had returned pointer to root group as fallback. This patch handles this error using existing fallback oom_cfqq. Signed-off-by: NKonstantin Khlebnikov <khlebnikov@yandex-team.ru> Acked-by: NTejun Heo <tj@kernel.org> Acked-by: NVivek Goyal <vgoyal@redhat.com> Fixes: cd1604fa ("blkcg: factor out blkio_group creation") Cc: stable@kernel.org Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 06 2月, 2015 8 次提交
-
-
由 Martin K. Petersen 提交于
blkdev_issue_zeroout() printed a warning if a device failed a discard or write same request despite advertising support for these. That's fine for SCSI since we'll disable these commands if we get an error back from the disk saying that they are not supported. And consequently the warning only gets printed once. There are other types of block devices that support discard, however, and these may return -EOPNOTSUPP for each command but leave discard enabled in the queue limits. This will cause a warning message for every blkdev_issue_zeroout() invocation. Remove the offending warning messages. Reported-by: NSedat Dilek <sedat.dilek@gmail.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com> Tested-by: NSedat Dilek <sedat.dilek@gmail.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Dongsu Park 提交于
Rewrite __bio_copy_iov using the copy_page_{from,to}_iter helpers, and split it into two simpler functions. This commit should contain only literal replacements, without functional changes. Cc: Kent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: NDongsu Park <dongsu.park@profitbricks.com> [hch: removed the __bio_copy_iov wrapper] Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMing Lei <tom.leiming@gmail.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
And also remove the unused bdev argument. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMing Lei <tom.leiming@gmail.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
This saves a little code, and allow to simplify the error handling. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMing Lei <tom.leiming@gmail.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Kent Overstreet 提交于
Make use of a new interface provided by iov_iter, backed by scatter-gather list of iovec, instead of the old interface based on sg_iovec. Also use iov_iter_advance() instead of manual iteration. This commit should contain only literal replacements, without functional changes. Cc: Christoph Hellwig <hch@infradead.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Doug Gilbert <dgilbert@interlog.com> Cc: "James E.J. Bottomley" <JBottomley@parallels.com> Signed-off-by: NKent Overstreet <kmo@daterainc.com> [dpark: add more description in commit message] Signed-off-by: NDongsu Park <dongsu.park@profitbricks.com> [hch: fixed to do a deep clone of the iov_iter, and to properly use the iov_iter direction] Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMing Lei <tom.leiming@gmail.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
The code sniplet to walk all bio_vecs and free their pages is opencoded in way to many places, so factor it into a helper. Also convert the slightly more complex cases in bio_kern_endio and __bio_copy_iov where we break the freeing from an existing loop into a separate one. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMing Lei <tom.leiming@gmail.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMing Lei <tom.leiming@gmail.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
Just open code the trivial mapping from a kernel virtual address to a bio instead of going through the complex user address mapping machinery. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMing Lei <tom.leiming@gmail.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 29 1月, 2015 4 次提交
-
-
由 Mike Snitzer 提交于
Commit 4ee5eaf4 ("block: add a queue flag for request stacking support") introduced the concept of "STACKABLE" and blk-mq devices fit the definition in that they establish q->request_fn. So establish QUEUE_FLAG_STACKABLE in QUEUE_FLAG_MQ_DEFAULT. While not strictly needed (DM _could_ just check for q->mq_ops to assume the device is request-based), request-based DM support for blk-mq devices benefits from the ability to consistently check for QUEUE_FLAG_STACKABLE before allowing a device to be stacked into a request-based DM table. Signed-off-by: NMike Snitzer <snitzer@redhat.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Keith Busch 提交于
blk_mq_alloc_request() may establish REQ_MQ_INFLIGHT in addition to incrementing the hctx->nr_active count. Any cmd_flags that are established in the newly allocated clone request must be preserved in addition to the cmd_flags that are later copied over from the original request as part of blk_rq_prep_clone(). Otherwise, if REQ_MQ_INFLIGHT isn't set in the clone request the hctx->nr_active count won't get decremented via blk_mq_free_request(). The only consumer of blk_rq_prep_clone() is request-based DM, which uses blk_rq_init() prior to calling blk_rq_prep_clone() for the non-blk-mq case. Given the cloned request's cmd_flags will be 0 it is safe to OR them with the original request's cmd_flags for both the non-blk-mq and blk-mq cases. Reported-by: NBart Van Assche <bvanassche@acm.org> Signed-off-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Keith Busch 提交于
If the request passed to blk_insert_cloned_request() was allocated by a blk-mq device it must be submitted using blk_mq_insert_request(). Signed-off-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Keith Busch 提交于
Prepare to allow blk_rq_prep_clone() to accept clone requests that were allocated from blk-mq request queues. As such the blk_rq_prep_clone() caller must first initialize the clone request. Signed-off-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 24 1月, 2015 2 次提交
-
-
由 Shaohua Li 提交于
This is the blk-mq part to support tag allocation policy. The default allocation policy isn't changed (though it's not a strict FIFO). The new policy is round-robin for libata. But it's a try-best implementation. If multiple tasks are competing, the tags returned will be mixed (which is unavoidable even with !mq, as requests from different tasks can be mixed in queue) Cc: Jens Axboe <axboe@fb.com> Cc: Tejun Heo <tj@kernel.org> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: NShaohua Li <shli@fb.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Shaohua Li 提交于
The libata tag allocation is using a round-robin policy. Next patch will make libata use block generic tag allocation, so let's add a policy to tag allocation. Currently two policies: FIFO (default) and round-robin. Cc: Jens Axboe <axboe@fb.com> Cc: Tejun Heo <tj@kernel.org> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: NShaohua Li <shli@fb.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 22 1月, 2015 3 次提交
-
-
由 Boaz Harrosh 提交于
As Christoph put it: Can we just get rid of the warnings? It's fairly annoying as devices without partitions are perfectly fine and very useful. Me too I see this message every VM boot for ages on all my devices. Would love to just remove it. For me a partition-table is only needed for a booting BIOS, grub, and stuff. CC: Christoph Hellwig <hch@infradead.org> Signed-off-by: NBoaz Harrosh <boaz@plexistor.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Martin K. Petersen 提交于
blkdev_issue_discard() will zero a given block range. This is done by way of explicit writing, thus provisioning or allocating the blocks on disk. There are use cases where the desired behavior is to zero the blocks but unprovision them if possible. The blocks must deterministically contain zeroes when they are subsequently read back. This patch adds a flag to blkdev_issue_zeroout() that provides this variant. If the discard flag is set and a block device guarantees discard_zeroes_data we will use REQ_DISCARD to clear the block range. If the device does not support discard_zeroes_data or if the discard request fails we will fall back to first REQ_WRITE_SAME and then a regular REQ_WRITE. Also update the callers of blkdev_issue_zero() to reflect the new flag and make sb_issue_zeroout() prefer the discard approach. Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Jeff Moyer 提交于
Hi, If you can manage to submit an async write as the first async I/O from the context of a process with realtime scheduling priority, then a cfq_queue is allocated, but filed into the wrong async_cfqq bucket. It ends up in the best effort array, but actually has realtime I/O scheduling priority set in cfqq->ioprio. The reason is that cfq_get_queue assumes the default scheduling class and priority when there is no information present (i.e. when the async cfqq is created): static struct cfq_queue * cfq_get_queue(struct cfq_data *cfqd, bool is_sync, struct cfq_io_cq *cic, struct bio *bio, gfp_t gfp_mask) { const int ioprio_class = IOPRIO_PRIO_CLASS(cic->ioprio); const int ioprio = IOPRIO_PRIO_DATA(cic->ioprio); cic->ioprio starts out as 0, which is "invalid". So, class of 0 (IOPRIO_CLASS_NONE) is passed to cfq_async_queue_prio like so: async_cfqq = cfq_async_queue_prio(cfqd, ioprio_class, ioprio); static struct cfq_queue ** cfq_async_queue_prio(struct cfq_data *cfqd, int ioprio_class, int ioprio) { switch (ioprio_class) { case IOPRIO_CLASS_RT: return &cfqd->async_cfqq[0][ioprio]; case IOPRIO_CLASS_NONE: ioprio = IOPRIO_NORM; /* fall through */ case IOPRIO_CLASS_BE: return &cfqd->async_cfqq[1][ioprio]; case IOPRIO_CLASS_IDLE: return &cfqd->async_idle_cfqq; default: BUG(); } } Here, instead of returning a class mapped from the process' scheduling priority, we get back the bucket associated with IOPRIO_CLASS_BE. Now, there is no queue allocated there yet, so we create it: cfqq = cfq_find_alloc_queue(cfqd, is_sync, cic, bio, gfp_mask); That function ends up doing this: cfq_init_cfqq(cfqd, cfqq, current->pid, is_sync); cfq_init_prio_data(cfqq, cic); cfq_init_cfqq marks the priority as having changed. Then, cfq_init_prio data does this: ioprio_class = IOPRIO_PRIO_CLASS(cic->ioprio); switch (ioprio_class) { default: printk(KERN_ERR "cfq: bad prio %x\n", ioprio_class); case IOPRIO_CLASS_NONE: /* * no prio set, inherit CPU scheduling settings */ cfqq->ioprio = task_nice_ioprio(tsk); cfqq->ioprio_class = task_nice_ioclass(tsk); break; So we basically have two code paths that treat IOPRIO_CLASS_NONE differently, which results in an RT async cfqq filed into a best effort bucket. Attached is a patch which fixes the problem. I'm not sure how to make it cleaner. Suggestions would be welcome. Signed-off-by: NJeff Moyer <jmoyer@redhat.com> Tested-by: NHidehiro Kawai <hidehiro.kawai.ez@hitachi.com> Cc: stable@kernel.org Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 14 1月, 2015 2 次提交
-
-
由 Jens Axboe 提交于
The blk-mq tagging tries to maintain some locality between CPUs and the tags issued. The tags are split into groups of words, and the words may not be fully populated. When searching for a new free tag, blk-mq may look at partial words, hence it passes in an offset/size to find_next_zero_bit(). However, it does that wrong, the size must always be the full length of the number of tags in that word, otherwise we'll potentially miss some near the end. Another issue is when __bt_get() goes from one word set to the next. It bumps the index, but not the last_tag associated with the previous index. Bump that to be in the range of the new word. Finally, clean up __bt_get() and __bt_get_word() a bit and get rid of the goto in there, and the unnecessary 'wrap' variable. Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Matthew Wilcox 提交于
In order to support accesses to larger chunks of memory, pass in a 'size' parameter (counted in bytes), and return the amount available at that address. Add a new helper function, bdev_direct_access(), to handle common functionality including partition handling, checking the length requested is positive, checking for the sector being page-aligned, and checking the length of the request does not pass the end of the partition. Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com> Reviewed-by: NJan Kara <jack@suse.cz> Reviewed-by: NBoaz Harrosh <boaz@plexistor.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 03 1月, 2015 3 次提交
-
-
由 Jens Axboe 提交于
Commit b4c6a028 exported the start and unfreeze, but we need the regular blk_mq_freeze_queue() for the loop conversion. Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Jens Axboe 提交于
Linux 3.19-rc2
-
由 Ming Lei 提交于
Check IS_ERR_OR_NULL(return value) instead of just return value. Signed-off-by: NMing Lei <ming.lei@canonical.com> Reduced to IS_ERR() by me, we never return NULL. Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 01 1月, 2015 1 次提交
-
-
由 Jens Axboe 提交于
If it's dying, we can't expect new request to complete and come in an wake up other tasks waiting for requests. So after we have marked it as dying, wake up everybody currently waiting for a request. Once they wake, they will retry their allocation and fail appropriately due to the state of the queue. Tested-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 29 12月, 2014 2 次提交
-
-
由 Linus Torvalds 提交于
-
git://git.kernel.org/pub/scm/virt/kvm/kvm由 Linus Torvalds 提交于
Pull KVM fixes from Paolo Bonzini: "The important fixes are for two bugs introduced by the merge window. On top of this, add a couple of WARN_ONs and stop spamming dmesg on pretty much every boot of a virtual machine" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: kvm: warn on more invariant breakage kvm: fix sorting of memslots with base_gfn == 0 kvm: x86: drop severity of "generation wraparound" message kvm: x86: vmx: reorder some msr writing
-