- 11 Jun, 2016 3 commits
-
-
Committed by Mike Snitzer
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Committed by Mike Snitzer
Add "multipath-bio" target that offers a bio-based multipath target as an alternative to the request-based "multipath" target -- but in a following commit "multipath-bio" will immediately be replaced by a new "queue_mode" feature for the "multipath" target, which will allow bio-based mode to be selected.

When DM multipath was originally converted from bio-based to request-based, the motivation for the change was better dynamic load balancing (by leveraging block core's request-based IO schedulers, for merging and sorting, _before_ DM multipath would make the decision on where to steer the IO -- based on path load and/or availability). More background is available in this "Request-based Device-mapper multipath and Dynamic load balancing" paper:
https://www.kernel.org/doc/ols/2007/ols2007v2-pages-235-244.pdf

But we've now come full circle: significantly faster storage devices no longer need IOs to be made larger to drive optimal IO performance. And even if they do, there have been changes to the block and filesystem layers that help ensure upper layers are constructing larger IOs.

In addition, SCSI's differentiated IO errors will propagate through to bio-based IO completion hooks -- so that eliminates another historic justification for request-based DM multipath.

Lastly, the block layer's immutable biovec changes have made bio cloning cheaper than it has ever been, whereas request cloning is still relatively expensive (both on a CPU usage and memory footprint level).

As such, bio-based DM multipath offers the promise of a more efficient IO path for high IOPs devices that are, or will be, emerging.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Committed by Mike Snitzer
Add some separation between bio-based and request-based DM core code.

'struct mapped_device' and other DM core only structures and functions have been moved to dm-core.h, and all relevant DM core .c files have been updated to include dm-core.h rather than dm.h.

DM targets should _never_ include dm-core.h!

[block core merge conflict resolution from Stephen Rothwell]

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
-
- 08 Jun, 2016 13 commits
-
-
Committed by Mike Christie
To avoid confusion between REQ_OP_FLUSH, which is handled by request_fn drivers, and upper layers requesting the block layer perform a flush sequence along with possibly a WRITE, this patch renames REQ_FLUSH to REQ_PREFLUSH.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
-
Committed by Mike Christie
This adds a REQ_OP_FLUSH operation that is sent to request_fn based drivers by the block layer's flush code, instead of sending requests with the request->cmd_flags REQ_FLUSH bit set.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
-
Committed by Mike Christie
We don't need bi_rw to be so large on 64 bit archs, so reduce it to unsigned int.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
-
Committed by Mike Christie
The req operation REQ_OP is separated from the rq_flag_bits definition. This converts the block layer drivers to use req_op to get the op from the request struct.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
-
Committed by Mike Christie
Separate the op from the rq_flag_bits and have md set/get the bio using bio_set_op_attrs/bio_op.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
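As an illustration of what "separating the op from the flags" can look like, here is a minimal userspace sketch. The bit layout, the `sketch_` names, and the enum values are assumptions made for this example; they are not the kernel's actual definitions of bio_set_op_attrs() or bio_op().

```c
#include <stdint.h>

/* Hypothetical layout: the top 3 bits hold the operation, the rest hold flags. */
#define SKETCH_OP_BITS   3
#define SKETCH_OP_SHIFT  (32 - SKETCH_OP_BITS)
#define SKETCH_FLAG_MASK ((UINT32_C(1) << SKETCH_OP_SHIFT) - 1)

/* Illustrative operation codes; the values are made up for this sketch. */
enum sketch_op {
    SKETCH_OP_READ = 0,
    SKETCH_OP_WRITE = 1,
    SKETCH_OP_DISCARD = 2,
    SKETCH_OP_FLUSH = 3
};

struct sketch_bio {
    uint32_t bi_rw; /* operation and flags packed into one word */
};

/* Store the operation and the modifier flags in one call. */
void sketch_set_op_attrs(struct sketch_bio *bio, enum sketch_op op, uint32_t flags)
{
    bio->bi_rw = ((uint32_t)op << SKETCH_OP_SHIFT) | (flags & SKETCH_FLAG_MASK);
}

/* Recover the operation without looking at individual flag bits. */
enum sketch_op sketch_bio_op(const struct sketch_bio *bio)
{
    return (enum sketch_op)(bio->bi_rw >> SKETCH_OP_SHIFT);
}

/* Recover only the modifier flags. */
uint32_t sketch_bio_flags(const struct sketch_bio *bio)
{
    return bio->bi_rw & SKETCH_FLAG_MASK;
}
```

With accessors like these, callers compare the operation for equality instead of testing whether a particular bit happens to be set.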
-
Committed by Mike Christie
Separate the op from the rq_flag_bits and have bcache set/get the bio using bio_set_op_attrs/bio_op.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
-
Committed by Mike Christie
Separate the op from the rq_flag_bits and have dm set/get the bio using bio_set_op_attrs/bio_op.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
-
Committed by Mike Christie
It looks like dm stats cares about the data direction (READ vs WRITE) and does not need the bio/request flags. Commands like REQ_FLUSH, REQ_DISCARD and REQ_WRITE_SAME are currently always set with REQ_WRITE, so the extra check for REQ_DISCARD in dm_stats_account_io is not needed.

This patch has it use the bio and request data_dir helpers instead of accessing the bi_rw/cmd_flags directly. This makes the next patches that remove the operation from the cmd_flags and bi_rw easier, because we will no longer have the REQ_WRITE bit set for operations like discards.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
-
Committed by Mike Christie
This converts the block issue discard helper and users to use the bio_set_op_attrs accessor and only pass in the operation flags like REQ_SECURE.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
-
Committed by Mike Christie
We currently set REQ_WRITE/WRITE for all non READ IOs like discard, flush, writesame, etc. In the next patches where we no longer set up the op as a bitmap, we will not be able to detect an operation direction like writesame by testing if REQ_WRITE is set.

This has bcache use the op_is_write helper which will do the right thing.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
-
Committed by Mike Christie
We currently set REQ_WRITE/WRITE for all non READ IOs like discard, flush, writesame, etc. In the next patches where we no longer set up the op as a bitmap, we will not be able to detect an operation direction like writesame by testing if REQ_WRITE is set.

This has dm use the op_is_write helper which will do the right thing.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
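A minimal sketch of the idea behind such a helper, assuming that once operations are plain enumerated values every operation other than a read counts as having write direction. The enum and the `sketch_` names are hypothetical and do not reproduce the kernel's definitions.

```c
#include <stdbool.h>

/* Illustrative operation codes; the values are made up for this sketch. */
enum sketch_op {
    SKETCH_OP_READ,
    SKETCH_OP_WRITE,
    SKETCH_OP_DISCARD,
    SKETCH_OP_WRITE_SAME,
    SKETCH_OP_FLUSH
};

/*
 * With the operation no longer encoded as a bitmap there is no WRITE bit
 * to test, so the data direction is derived from the operation itself:
 * anything that is not a read is treated as a write.
 */
bool sketch_op_is_write(enum sketch_op op)
{
    return op != SKETCH_OP_READ;
}
```

Callers that used to test a write bit on the flags word would call the helper with the operation instead.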
-
Committed by Mike Christie
This has submit_bh users pass in the operation and flags separately, so submit_bh_wbc can set up the bio op and bi_rw flags on the bio that is submitted.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
-
Committed by Mike Christie
This has callers of submit_bio/submit_bio_wait set the bio->bi_rw instead of passing it in. This makes their use consistent with generic_make_request and with how we set the other bio fields.

Signed-off-by: Mike Christie <mchristi@redhat.com>

Fixed up fs/ext4/crypto.c

Signed-off-by: Jens Axboe <axboe@fb.com>
-
- 24 May, 2016 3 commits
-
-
Committed by Jiri Kosina
bch_gc_thread() doesn't mark itself freezable, so calling try_to_freeze() in its context is just an expensive no-op.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
-
Committed by Jiri Kosina
bch_allocator_thread() is calling try_to_freeze(), but that's just an expensive no-op given the fact that the thread is not marked freezable.

The bucket allocator has to be up and running until the very last stages of suspend, because bcache I/O may still be in flight (think of writing a hibernation image to a swap device served by bcache).

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
-
Committed by Jiri Kosina
bch_writeback_thread() is calling try_to_freeze(), but that's just an expensive no-op given the fact that the thread is not marked freezable.

I/O helper kthreads, such as the bcache writeback thread, actually shouldn't be freezable, because they are potentially necessary for finalizing the image write-out.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
-
- 13 May, 2016 4 commits
-
-
Committed by Joe Thornber
There is little benefit to doing this but it does structure DM thinp's code to more cleanly use the __blkdev_issue_discard() interface -- particularly in passdown_double_checking_shared_status().

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Committed by Mike Snitzer
With commit 38f25255 ("block: add __blkdev_issue_discard") DM thinp no longer needs to carry its own async discard method.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
-
Committed by Mike Snitzer
DM thinp's use of bio_inc_remaining() is critical to ensure the original parent discard bio isn't completed before sub-discards have.

DM thinp needs this due to the extra quiescing that occurs, via multiple DM thinp mappings, while processing large discards. As such DM thinp must build the async discard bio chain after some delay -- so bio_inc_remaining() is used to enable DM thinp to take a reference on the original parent discard bio for each mapping. This allows the immediate use of bio_endio() on that discard bio, but with the understanding that the actual completion won't occur until each of the sub-discards' per-mapping references are dropped.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
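The reference-counting pattern described here can be sketched in userspace C11 as follows. The structure and function names are hypothetical stand-ins, not the kernel's bio API; the sketch only shows why an early end-I/O call on the parent does not complete it while per-mapping references are outstanding.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a parent discard bio. */
struct sketch_parent {
    atomic_int remaining; /* starts at 1: the submitter's own reference */
    bool completed;       /* set once the last reference is dropped */
};

void sketch_parent_init(struct sketch_parent *p)
{
    atomic_init(&p->remaining, 1);
    p->completed = false;
}

/* Take one extra reference, e.g. one per mapping that will issue sub-discards. */
void sketch_inc_remaining(struct sketch_parent *p)
{
    atomic_fetch_add(&p->remaining, 1);
}

/* Drop one reference; only the final drop actually completes the parent. */
void sketch_endio(struct sketch_parent *p)
{
    if (atomic_fetch_sub(&p->remaining, 1) == 1)
        p->completed = true;
}
```

For example, with two mappings the parent takes two extra references, may call sketch_endio() on itself right away, and is marked completed only after both mappings have called sketch_endio() as well.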
-
Committed by Heinz Mauelshagen
Given we don't yet support any feature flags in the dm-raid ondisk metadata (see: 'features' member of 'struct dm_raid_superblock'), add a check to ensure no flags are actually set; if any features are set, reject the activation of the RAID mapping.

This is to prevent possible data corruption in case of a kernel downgrade when there'll potentially be feature flags set by a future dm-raid target.

Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
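A minimal sketch of this kind of guard, assuming a 32-bit 'features' word and an -EINVAL return on rejection. The struct and function names are hypothetical; the real superblock has many more fields and stores this one in little-endian form, which does not matter for a comparison against zero.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical, heavily reduced stand-in for the on-disk superblock. */
struct sketch_raid_superblock {
    uint32_t features; /* no feature flags are defined yet */
};

/* Refuse activation if any feature flag is set by a newer on-disk format. */
int sketch_validate_features(const struct sketch_raid_superblock *sb)
{
    if (sb->features != 0)
        return -EINVAL;
    return 0;
}
```

Rejecting unknown flags up front means an older kernel never operates on metadata whose meaning it cannot interpret.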
-
- 10 May, 2016 6 commits
-
-
Committed by Guoqing Jiang
We don't need to run the full path of recv_daemon if process_recvd_msg doesn't return 0.

Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
-
Committed by Guoqing Jiang
The in-memory bitmap is not ready when a node joins the cluster, so it doesn't make sense to call gather_all_resync_info() so early; we need to call it after the node's bitmap is set up. Also, recv_thread could be woken up after the node joins the cluster, but that could cause problems if the node receives a RESYNCING message without a personality, since mddev->pers->quiesce is called in process_suspend_info.

This commit introduces a new cluster interface, load_bitmaps, to fix the above problems. load_bitmaps is called in bitmap_load, where the bitmap and personality are ready, and it does the following tasks:

1. call gather_all_resync_info to load all the nodes' bitmap info.
2. set the MD_CLUSTER_ALREADY_IN_CLUSTER bit so that recv_thread can be woken up, and wake up recv_thread if there is a pending recv event.

Then ack_bast only wakes up recv_thread once the IN_CLUSTER bit is set; otherwise MD_CLUSTER_PENDING_RESYNC_EVENT is set.

Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
-
Committed by Guoqing Jiang
Some code waits for a metadata update by:

1. flagging that it is needed (MD_CHANGE_DEVS or MD_CHANGE_CLEAN)
2. setting MD_CHANGE_PENDING and waking the management thread
3. waiting for MD_CHANGE_PENDING to be cleared

If the first two are done without locking, the code in md_update_sb() which checks if it needs to repeat might test if an update is needed before step 1, then clear MD_CHANGE_PENDING after step 2, resulting in the wait returning early.

So make sure all places that set MD_CHANGE_PENDING do so atomically, and bit_clear_unless (suggested by Neil) is introduced for the purpose.

Cc: Martin Kepplinger <martink@posteo.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: <linux-kernel@vger.kernel.org>
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
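The "clear unless" idea can be sketched with a C11 compare-and-exchange loop. This is a userspace approximation under assumed names and bit positions; it is not the kernel's bit_clear_unless macro, and the SKETCH_ constants are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit positions, chosen only for this sketch. */
#define SKETCH_CHANGE_DEVS    (1UL << 0)
#define SKETCH_CHANGE_CLEAN   (1UL << 1)
#define SKETCH_CHANGE_PENDING (1UL << 2)

/*
 * Atomically clear the bits in `clear` unless any bit in `test` is set.
 * The test and the clear happen as one step, so a concurrent setter of a
 * `test` bit can never be overlooked. Returns true if the bits were cleared.
 */
bool sketch_bit_clear_unless(atomic_ulong *word, unsigned long clear,
                             unsigned long test)
{
    unsigned long old = atomic_load(word);

    do {
        if (old & test)
            return false; /* an update was requested again: keep PENDING */
        /* on failure, `old` is refreshed with the current value and we retry */
    } while (!atomic_compare_exchange_weak(word, &old, old & ~clear));

    return true;
}
```

In terms of the three steps above, the updater would try to clear the pending bit only if neither "update needed" flag has been raised in the meantime, and repeat the update when the call returns false.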
-
Committed by Heinz Mauelshagen
In case md runs underneath the dm-raid target, the mddev does not have a request queue or gendisk, thus avoid accessing them. This patch adds a missing conditional to the raid5 personality.

Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Shaohua Li <shli@fb.com>
-
Committed by Heinz Mauelshagen
In case md runs underneath the dm-raid target, the mddev does not have a request queue or gendisk, thus avoid accessing them. This patch adds two missing conditionals to the raid10 personality.

Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Shaohua Li <shli@fb.com>
-
Committed by Heinz Mauelshagen
Introduced by upstream commit 70d9798b.

The raid0 personality does not create mddev->thread, as opposed to other personalities, leading to its unconditional access in mddev_suspend() causing an oops.

Patch checks for mddev->thread in order to keep the intention of the aforementioned commit.

Fixes: 70d9798b ("MD: warn for potential deadlock")
Cc: stable@vger.kernel.org (4.5+)
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Shaohua Li <shli@fb.com>
-
- 06 May, 2016 7 commits
-
-
Committed by Michal Hocko
copy_params()'s use of __GFP_REPEAT for the __vmalloc() call doesn't make much sense because vmalloc doesn't rely on costly high order allocations.

Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Committed by Mike Snitzer
The primary motivation of this commit is to improve the scalability of DM multipath on large NUMA systems where m->lock spinlock contention has been proven to be a serious bottleneck on really fast storage.

The ability to atomically read a pointer, using lockless_dereference(), is leveraged in this commit. But all pointer writes are still protected by the m->lock spinlock (which is fine since these all now occur in the slow-path).

The following functions no longer require the m->lock spinlock in their fast-path: multipath_busy(), __multipath_map(), and do_end_io().

And choose_pgpath() is modified to _not_ update m->current_pgpath unless it also switches the path-group. This is done to avoid needing to take the m->lock every time __multipath_map() calls choose_pgpath(). But m->current_pgpath will be reset if it is failed via fail_path().

Suggested-by: Jeff Moyer <jmoyer@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Tested-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
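A rough userspace analogue of the lock-free pointer read, assuming C11 atomics with acquire/release ordering in place of the kernel's lockless_dereference(). The struct and function names are hypothetical; in the commit, writers additionally hold the m->lock spinlock, which this sketch omits.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical, reduced stand-ins for the multipath structures. */
struct sketch_pgpath {
    int id;
};

struct sketch_multipath {
    _Atomic(struct sketch_pgpath *) current_pgpath;
};

/* Fast path: read the current path without taking any lock. */
struct sketch_pgpath *sketch_read_current(struct sketch_multipath *m)
{
    return atomic_load_explicit(&m->current_pgpath, memory_order_acquire);
}

/*
 * Slow path: publish a new current path. A real implementation would also
 * serialize writers against each other, for example with a spinlock.
 */
void sketch_set_current(struct sketch_multipath *m, struct sketch_pgpath *p)
{
    atomic_store_explicit(&m->current_pgpath, p, memory_order_release);
}
```

Readers either see the old pointer or the new one, never a torn value, which is what lets the fast path skip the lock.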
-
Committed by Mike Snitzer
Allows the 'work_mutex' member to no longer cross a cacheline.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Tested-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Committed by Mike Snitzer
The use of atomic_t for nr_valid_paths, pg_init_in_progress and pg_init_count will allow relaxing the use of the m->lock spinlock.

Suggested-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Tested-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
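To illustrate why atomic counters relax the need for a lock, here is a small C11 sketch with a single counter. The struct and helper names are hypothetical and model only nr_valid_paths, using atomic_int in place of the kernel's atomic_t.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical, reduced stand-in for the multipath state. */
struct sketch_mp_counters {
    atomic_int nr_valid_paths;
};

/* A path failed: decrement without holding a lock. */
void sketch_fail_path(struct sketch_mp_counters *m)
{
    atomic_fetch_sub(&m->nr_valid_paths, 1);
}

/* A path came back: increment without holding a lock. */
void sketch_reinstate_path(struct sketch_mp_counters *m)
{
    atomic_fetch_add(&m->nr_valid_paths, 1);
}

/* Fast-path check that previously would have needed the spinlock. */
bool sketch_has_valid_path(struct sketch_mp_counters *m)
{
    return atomic_load(&m->nr_valid_paths) > 0;
}
```

Each update is indivisible on its own, so code that only needs one counter value no longer has to serialize on a shared lock.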
-
Committed by Mike Snitzer
Mechanical change that doesn't make any real effort to reduce the use of m->lock; that will come later (once atomics are used for counters, etc).

Suggested-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Tested-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Committed by Amitoj Kaur Chawla
Return statement at the end of a void function is useless.

The Coccinelle semantic patch used to make this change is as follows:

//<smpl>
@@
identifier f;
expression e;
@@
void f(...) {
<...
- return e;
...>
}
//</smpl>

Signed-off-by: Amitoj Kaur Chawla <amitoj1606@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Committed by Mike Snitzer
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
- 05 May, 2016 4 commits
-
-
Committed by kbuild test robot
drivers/md/bitmap.c:2049:6-11: WARNING: NULL check before freeing functions like kfree, debugfs_remove, debugfs_remove_recursive or usb_free_urb is not needed. Maybe consider reorganizing relevant code to avoid passing NULL values.

NULL check before some freeing functions is not needed.

Based on checkpatch warning "kfree(NULL) is safe this check is probably not required" and kfreeaddr.cocci by Julia Lawall.

Generated by: scripts/coccinelle/free/ifnullfree.cocci

Acked-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Shaohua Li <shli@fb.com>
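The same cleanup can be shown in userspace, where the C standard guarantees that free(NULL) is a no-op, mirroring the kfree(NULL) behavior the warning relies on. The struct and function names below are hypothetical.

```c
#include <stdlib.h>

/* Hypothetical holder of an optional heap buffer. */
struct sketch_bitmap {
    unsigned long *pages; /* may legitimately be NULL */
};

/*
 * No "if (b->pages)" guard is needed: free(NULL) does nothing, so the
 * unconditional call is both shorter and equally safe.
 */
void sketch_bitmap_release(struct sketch_bitmap *b)
{
    free(b->pages);
    b->pages = NULL;
}
```

The function behaves identically whether or not the buffer was ever allocated.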
-
Committed by Guoqing Jiang
This patch is doing two distinct but related things.

1. It adds bitmap_unplug() for the main bitmap (mddev->bitmap). As bits have been set, BITMAP_PAGE_DIRTY is set so bitmap_daemon_work() will not write those pages out in its regular scans, only bitmap_unplug() will. If there are no writes to the array, bitmap_unplug() won't be called, so we need to call it explicitly here.

2. bitmap_write_all() is a bit of a confusing interface as it doesn't actually write anything. The current code for writing "bitmap" works but this change makes it a bit clearer.

Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
-
Committed by Guoqing Jiang
The pnum passed to set_page_attr and test_page_attr should range from 0 to storage.file_pages - 1, but bitmap_file_set_bit and bitmap_file_clear_bit call set_page_attr and test_page_attr with page->index as the parameter, and page->index already has node_offset added to it.

So we need to subtract node_offset in both bitmap_file_clear_bit and bitmap_file_set_bit.

Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
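The index arithmetic can be sketched as below. Both helpers are hypothetical and only illustrate why the raw page index falls outside the valid range for any node with a non-zero offset.

```c
#include <stdbool.h>

/* Convert a page index that includes the per-node offset into a page number. */
unsigned long sketch_index_to_pnum(unsigned long page_index,
                                   unsigned long node_offset)
{
    return page_index - node_offset;
}

/* A valid page number lies in the range 0 .. file_pages - 1. */
bool sketch_pnum_valid(unsigned long pnum, unsigned long file_pages)
{
    return pnum < file_pages;
}
```

For instance, with a node offset of 6 pages and 3 pages per node, page index 7 maps to page number 1, which is in range, whereas passing 7 directly would not be.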
-
Committed by Guoqing Jiang
The offset is wrong in bitmap_storage_alloc; we should set it as is done in bitmap_init_from_disk():

    node_offset = bitmap->cluster_slot * (DIV_ROUND_UP(store->bytes, PAGE_SIZE));

Because 'offset' is only assigned to 'page->index', which is usually overwritten by read_sb_page, this does not cause problems in general, but it still needs to be fixed.

Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
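A worked version of the formula, assuming a 4096-byte page size for the example. The macro and function names are local stand-ins for the kernel's PAGE_SIZE and DIV_ROUND_UP, not the kernel definitions themselves.

```c
/* Assumed page size for this sketch; the real value is architecture dependent. */
#define SKETCH_PAGE_SIZE 4096UL

/* Integer division rounding up, in the style of the kernel's DIV_ROUND_UP. */
#define SKETCH_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Each cluster node owns a run of whole pages, so a node's first page is
 * its slot number times the number of pages needed for one node's bitmap.
 */
unsigned long sketch_node_offset(unsigned long cluster_slot,
                                 unsigned long store_bytes)
{
    return cluster_slot * SKETCH_DIV_ROUND_UP(store_bytes, SKETCH_PAGE_SIZE);
}
```

For a 10000-byte bitmap, one node needs 3 pages, so the node in slot 2 starts at page 6.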
-