- 10 12月, 2020 1 次提交
-
-
由 Bart Van Assche 提交于
Introduce the BLK_MQ_REQ_PM flag. This flag makes the request allocation functions set RQF_PM. This is the first step towards removing BLK_MQ_REQ_PREEMPT. Link: https://lore.kernel.org/r/20201209052951.16136-3-bvanassche@acm.org Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Stanley Chu <stanley.chu@mediatek.com> Cc: Ming Lei <ming.lei@redhat.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Can Guo <cang@codeaurora.org> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NJens Axboe <axboe@kernel.dk> Reviewed-by: NCan Guo <cang@codeaurora.org> Signed-off-by: NBart Van Assche <bvanassche@acm.org> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
- 14 10月, 2020 1 次提交
-
-
由 Keith Busch 提交于
A zoned device with limited resources to open or activate zones may return an error when the host exceeds those limits. The same command may be successful if retried later, but the host needs to wait for specific zone states before it should expect a retry to succeed. Have the block layer provide an appropriate status for these conditions so applications can distinuguish this error for special handling. Cc: linux-api@vger.kernel.org Cc: Niklas Cassel <niklas.cassel@wdc.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Signed-off-by: NKeith Busch <kbusch@kernel.org> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 09 10月, 2020 1 次提交
-
-
由 Tetsuo Handa 提交于
syzbot is reporting unkillable task [1], for the caller is failing to handle a corrupted filesystem image which attempts to access beyond the end of the device. While we need to fix the caller, flooding the console with handle_bad_sector() message is unlikely useful. [1] https://syzkaller.appspot.com/bug?id=f1f49fb971d7a3e01bd8ab8cff2ff4572ccf3092Signed-off-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 06 10月, 2020 1 次提交
-
-
由 Eric Biggers 提交于
blk_crypto_rq_bio_prep() assumes its gfp_mask argument always includes __GFP_DIRECT_RECLAIM, so that the mempool_alloc() will always succeed. However, blk_crypto_rq_bio_prep() might be called with GFP_ATOMIC via setup_clone() in drivers/md/dm-rq.c. This case isn't currently reachable with a bio that actually has an encryption context. However, it's fragile to rely on this. Just make blk_crypto_rq_bio_prep() able to fail. Suggested-by: NSatya Tangirala <satyat@google.com> Signed-off-by: NEric Biggers <ebiggers@google.com> Reviewed-by: NMike Snitzer <snitzer@redhat.com> Reviewed-by: NSatya Tangirala <satyat@google.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 25 9月, 2020 3 次提交
-
-
由 Mike Snitzer 提交于
Add QUEUE_FLAG_NOWAIT to allow a block device to advertise support for REQ_NOWAIT. Bio-based devices may set QUEUE_FLAG_NOWAIT where applicable. Update QUEUE_FLAG_MQ_DEFAULT to include QUEUE_FLAG_NOWAIT. Also update submit_bio_checks() to verify it is set for REQ_NOWAIT bios. Reported-by: NKonstantin Khlebnikov <khlebnikov@yandex-team.ru> Suggested-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NMike Snitzer <snitzer@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Just checking SB_I_CGROUPWB for cgroup writeback support is enough. Either the file system allocates its own bdi (e.g. btrfs), in which case it is known to support cgroup writeback, or the bdi comes from the block layer, which always supports cgroup writeback. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Set up a readahead size by default, as very few users have a good reason to change it. This means code, ecryptfs, and orangefs now set up the values while they were previously missing it, while ubifs, mtd and vboxsf manually set it to 0 to avoid readahead. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Acked-by: David Sterba <dsterba@suse.com> [btrfs] Acked-by: Richard Weinberger <richard@nod.at> [ubifs, mtd] Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 12 9月, 2020 1 次提交
-
-
由 Song Liu 提交于
These functions can be used to enable iostat for partitions on devices like md, bcache. Signed-off-by: NSong Liu <songliubraving@fb.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 04 9月, 2020 1 次提交
-
-
由 John Garry 提交于
The per-hctx nr_active value can no longer be used to fairly assign a share of tag depth per request queue for when using a shared sbitmap, as it does not consider that the tags are shared tags over all hctx's. For this case, record the nr_active_requests per request_queue, and make the judgement based on that value. Co-developed-with: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Tested-by: Don Brace<don.brace@microsemi.com> #SCSI resv cmds patches used Tested-by: NDouglas Gilbert <dgilbert@interlog.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 02 9月, 2020 4 次提交
-
-
由 Ritika Srivastava 提交于
If WRITE_ZERO/WRITE_SAME operation is not supported by the storage, blk_cloned_rq_check_limits() will return IO error which will cause device-mapper to fail the paths. Instead, if the queue limit is set to 0, return BLK_STS_NOTSUPP. BLK_STS_NOTSUPP will be ignored by device-mapper and will not fail the paths. Suggested-by: NMartin K. Petersen <martin.petersen@oracle.com> Signed-off-by: NRitika Srivastava <ritika.srivastava@oracle.com> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Ritika Srivastava 提交于
Replace returning legacy errno codes with blk_status_t in blk_cloned_rq_check_limits(). Signed-off-by: NRitika Srivastava <ritika.srivastava@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Xianting Tian 提交于
Replace various magic -1 constants for tags with BLK_MQ_NO_TAG. Signed-off-by: NXianting Tian <tian.xianting@h3c.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Baolin Wang 提交于
It's better to move bio merge related functions into blk-merge.c, which contains all merge related functions. Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NBaolin Wang <baolin.wang@linux.alibaba.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 01 9月, 2020 1 次提交
-
-
由 Jens Axboe 提交于
If a driver leaves the limit settings as the defaults, then we don't initialize bdi->io_pages. This means that file systems may need to work around bdi->io_pages == 0, which is somewhat messy. Initialize the default value just like we do for ->ra_pages. Cc: stable@vger.kernel.org Fixes: 9491ae4a ("mm: don't cap request size based on read-ahead setting") Reported-by: NOGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 08 7月, 2020 1 次提交
-
-
由 Christoph Hellwig 提交于
If blk_mq_submit_bio flushes the plug list, bios for other disks can show up on current->bio_list. As that doesn't involve any stacking of block device it is entirely harmless and we should not warn about this case. Fixes: ff93ea0c ("block: shortcut __submit_bio_noacct for blk-mq drivers") Reported-by: Nkernel test robot <lkp@intel.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 03 7月, 2020 1 次提交
-
-
由 Christoph Hellwig 提交于
bio_alloc_bioset references current->bio_list[1], so we need to initialize it for the blk-mq submission path as well. Fixes: ff93ea0c ("block: shortcut __submit_bio_noacct for blk-mq drivers") Reported-by: NQian Cai <cai@lca.pw> Reported-by: NNaresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: NChristoph Hellwig <hch@lst.de> Tested-by: NNaresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 01 7月, 2020 8 次提交
-
-
由 Christoph Hellwig 提交于
Now that submit_bio_noacct has a decent blk-mq fast path there is no more need for this bypass. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
For blk-mq drivers bios can only be inserted for the same queue. So bypass the complicated sorting logic in __submit_bio_noacct with a blk-mq simpler submission helper. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Split out a __submit_bio_noacct helper for the actual de-recursion algorithm, and simplify the loop by using a continue when we can't enter the queue for a bio. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
generic_make_request has always been very confusingly misnamed, so rename it to submit_bio_noacct to make it clear that it is submit_bio minus accounting and a few checks. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
The make_request_fn is a little weird in that it sits directly in struct request_queue instead of an operation vector. Replace it with a block_device_operations method called submit_bio (which describes much better what it does). Also remove the request_queue argument to it, as the queue can be derived pretty trivially from the bio. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
The variable is only used once, so just open code the bio_sector() there. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
All registers disks must have a valid queue pointer, so don't bother to log a warning for that case. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
The "generic_make_request: " prefix has no value, and will soon become stale. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 29 6月, 2020 1 次提交
-
-
由 Christoph Hellwig 提交于
blkcg_bio_issue_check is a giant inline function that does three entirely different things. Factor out the blk-cgroup related bio initalization into a new helper, and the open code the sequence in the only caller, relying on the fact that all the actual functionality is stubbed out for non-cgroup builds. Acked-by: NTejun Heo <tj@kernel.org> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 24 6月, 2020 4 次提交
-
-
由 Luis Chamberlain 提交于
We were only creating the request_queue debugfs_dir only for make_request block drivers (multiqueue), but never for request-based block drivers. We did this as we were only creating non-blktrace additional debugfs files on that directory for make_request drivers. However, since blktrace *always* creates that directory anyway, we special-case the use of that directory on blktrace. Other than this being an eye-sore, this exposes request-based block drivers to the same debugfs fragile race that used to exist with make_request block drivers where if we start adding files onto that directory we can later run a race with a double removal of dentries on the directory if we don't deal with this carefully on blktrace. Instead, just simplify things by always creating the request_queue debugfs_dir on request_queue registration. Rename the mutex also to reflect the fact that this is used outside of the blktrace context. Signed-off-by: NLuis Chamberlain <mcgrof@kernel.org> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Luis Chamberlain 提交于
Commit dc9edc44 ("block: Fix a blk_exit_rl() regression") merged on v4.12 moved the work behind blk_release_queue() into a workqueue after a splat floated around which indicated some work on blk_release_queue() could sleep in blk_exit_rl(). This splat would be possible when a driver called blk_put_queue() or blk_cleanup_queue() (which calls blk_put_queue() as its final call) from an atomic context. blk_put_queue() decrements the refcount for the request_queue kobject, and upon reaching 0 blk_release_queue() is called. Although blk_exit_rl() is now removed through commit db6d9952 ("block: remove request_list code") on v5.0, we reserve the right to be able to sleep within blk_release_queue() context. The last reference for the request_queue must not be called from atomic context. *When* the last reference to the request_queue reaches 0 varies, and so let's take the opportunity to document when that is expected to happen and also document the context of the related calls as best as possible so we can avoid future issues, and with the hopes that the synchronous request_queue removal sticks. We revert back to synchronous request_queue removal because asynchronous removal creates a regression with expected userspace interaction with several drivers. An example is when removing the loopback driver, one uses ioctls from userspace to do so, but upon return and if successful, one expects the device to be removed. Likewise if one races to add another device the new one may not be added as it is still being removed. This was expected behavior before and it now fails as the device is still present and busy still. Moving to asynchronous request_queue removal could have broken many scripts which relied on the removal to have been completed if there was no error. Document this expectation as well so that this doesn't regress userspace again. Using asynchronous request_queue removal however has helped us find other bugs. In the future we can test what could break with this arrangement by enabling CONFIG_DEBUG_KOBJECT_RELEASE. While at it, update the docs with the context expectations for the request_queue / gendisk refcount decrement, and make these expectations explicit by using might_sleep(). Fixes: dc9edc44 ("block: Fix a blk_exit_rl() regression") Suggested-by: NNicolai Stange <nstange@suse.de> Signed-off-by: NLuis Chamberlain <mcgrof@kernel.org> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBart Van Assche <bvanassche@acm.org> Cc: Bart Van Assche <bvanassche@acm.org> Cc: Omar Sandoval <osandov@fb.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Nicolai Stange <nstange@suse.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: yu kuai <yukuai3@huawei.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Luis Chamberlain 提交于
Let us clarify the context under which the helpers to increment the refcount for the gendisk and request_queue can be called under. We make this explicit on the places where we may sleep with might_sleep(). We don't address the decrement context yet, as that needs some extra work and fixes, but will be addressed in the next patch. Signed-off-by: NLuis Chamberlain <mcgrof@kernel.org> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBart Van Assche <bvanassche@acm.org> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Luis Chamberlain 提交于
This adds documentation for the gendisk / request_queue refcount helpers. Signed-off-by: NLuis Chamberlain <mcgrof@kernel.org> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBart Van Assche <bvanassche@acm.org> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 22 6月, 2020 1 次提交
-
-
由 Jens Axboe 提交于
Provide a way for the caller to specify that IO should be marked with REQ_NOWAIT to avoid blocking on allocation. Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 03 6月, 2020 1 次提交
-
-
由 Matthew Wilcox (Oracle) 提交于
Patch series "Change readahead API", v11. This series adds a readahead address_space operation to replace the readpages operation. The key difference is that pages are added to the page cache as they are allocated (and then looked up by the filesystem) instead of passing them on a list to the readpages operation and having the filesystem add them to the page cache. It's a net reduction in code for each implementation, more efficient than walking a list, and solves the direct-write vs buffered-read problem reported by yu kuai at http://lkml.kernel.org/r/20200116063601.39201-1-yukuai3@huawei.com The only unconverted filesystems are those which use fscache. Their conversion is pending Dave Howells' rewrite which will make the conversion substantially easier. This should be completed by the end of the year. I want to thank the reviewers/testers; Dave Chinner, John Hubbard, Eric Biggers, Johannes Thumshirn, Dave Sterba, Zi Yan, Christoph Hellwig and Miklos Szeredi have done a marvellous job of providing constructive criticism. These patches pass an xfstests run on ext4, xfs & btrfs with no regressions that I can tell (some of the tests seem a little flaky before and remain flaky afterwards). This patch (of 25): The readahead code is part of the page cache so should be found in the pagemap.h file. force_page_cache_readahead is only used within mm, so move it to mm/internal.h instead. Remove the parameter names where they add no value, and rename the ones which were actively misleading. Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NJohn Hubbard <jhubbard@nvidia.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NWilliam Kucharski <william.kucharski@oracle.com> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Chao Yu <yuchao0@huawei.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Dave Chinner <dchinner@redhat.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Gao Xiang <gaoxiang25@huawei.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Link: http://lkml.kernel.org/r/20200414150233.24495-1-willy@infradead.org Link: http://lkml.kernel.org/r/20200414150233.24495-2-willy@infradead.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 29 5月, 2020 1 次提交
-
-
由 Jens Axboe 提交于
This reverts commit c58c1f83. io_uring does do the right thing for this case, and we're still returning -EAGAIN to userspace for the cases we don't support. Revert this change to avoid doing endless spins of resubmits. Cc: stable@vger.kernel.org # v5.6 Reported-by: NBijan Mottahedeh <bijan.mottahedeh@oracle.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 27 5月, 2020 4 次提交
-
-
由 Christoph Hellwig 提交于
We only need the stats lock (aka preempt_disable()) for updating the states, not for looking up or dropping the hd_struct reference. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKonstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Konstantin Khlebnikov 提交于
Move the non-"new_io" branch of blk_account_io_start() into separate function. Fix merge accounting for discards (they were counted as write merges). The new blk_account_io_merge_bio() doesn't call update_io_ticks() unlike blk_account_io_start(), as there is no reason for that. [hch: rebased] Signed-off-by: NKonstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
All callers are in blk-core.c, so move update_io_ticks over. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKonstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Add two new helpers to simplify I/O accounting for bio based drivers. Currently these drivers use the generic_start_io_acct and generic_end_io_acct helpers which have very cumbersome calling conventions, don't actually return the time they started accounting, and try to deal with accounting for partitions, which can't happen for bio based drivers. The new helpers will be used to subsequently replace uses of the old helpers. The main API is the bio based wrappes in blkdev.h, but for zram which wants to account rw_page based I/O lower level routines are provided as well. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKonstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 19 5月, 2020 3 次提交
-
-
由 Christoph Hellwig 提交于
part_inc_in_flight and part_dec_in_flight are no-ops for blk-mq queues, so remove the calls in purely blk-mq callers. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
blk_mq_make_request currently needs to grab an q_usage_counter reference when allocating a request. This is because the block layer grabs one before calling blk_mq_make_request, but also releases it as soon as blk_mq_make_request returns. Remove the blk_queue_exit call after blk_mq_make_request returns, and instead let it consume the reference. This works perfectly fine for the block layer caller, just device mapper needs an extra reference as the old problem still persists there. Open code blk_queue_enter_live in device mapper, as there should be no other callers and this allows better documenting why we do a non-try get. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 14 5月, 2020 1 次提交
-
-
由 Satya Tangirala 提交于
We must have some way of letting a storage device driver know what encryption context it should use for en/decrypting a request. However, it's the upper layers (like the filesystem/fscrypt) that know about and manages encryption contexts. As such, when the upper layer submits a bio to the block layer, and this bio eventually reaches a device driver with support for inline encryption, the device driver will need to have been told the encryption context for that bio. We want to communicate the encryption context from the upper layer to the storage device along with the bio, when the bio is submitted to the block layer. To do this, we add a struct bio_crypt_ctx to struct bio, which can represent an encryption context (note that we can't use the bi_private field in struct bio to do this because that field does not function to pass information across layers in the storage stack). We also introduce various functions to manipulate the bio_crypt_ctx and make the bio/request merging logic aware of the bio_crypt_ctx. We also make changes to blk-mq to make it handle bios with encryption contexts. blk-mq can merge many bios into the same request. These bios need to have contiguous data unit numbers (the necessary changes to blk-merge are also made to ensure this) - as such, it suffices to keep the data unit number of just the first bio, since that's all a storage driver needs to infer the data unit number to use for each data block in each bio in a request. blk-mq keeps track of the encryption context to be used for all the bios in a request with the request's rq_crypt_ctx. When the first bio is added to an empty request, blk-mq will program the encryption context of that bio into the request_queue's keyslot manager, and store the returned keyslot in the request's rq_crypt_ctx. All the functions to operate on encryption contexts are in blk-crypto.c. Upper layers only need to call bio_crypt_set_ctx with the encryption key, algorithm and data_unit_num; they don't have to worry about getting a keyslot for each encryption context, as blk-mq/blk-crypto handles that. Blk-crypto also makes it possible for request-based layered devices like dm-rq to make use of inline encryption hardware by cloning the rq_crypt_ctx and programming a keyslot in the new request_queue when necessary. Note that any user of the block layer can submit bios with an encryption context, such as filesystems, device-mapper targets, etc. Signed-off-by: NSatya Tangirala <satyat@google.com> Reviewed-by: NEric Biggers <ebiggers@google.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-