- 20 1月, 2022 1 次提交
-
-
由 OGAWA Hirofumi 提交于
bio_truncate() clears the buffer outside of last block of bdev, however current bio_truncate() is using the wrong offset of page. So it can return the uninitialized data. This happened when both of truncated/corrupted FS and userspace (via bdev) are trying to read the last of bdev. Reported-by: syzbot+ac94ae5f68b84197f41c@syzkaller.appspotmail.com Signed-off-by: NOGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Reviewed-by: NMing Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/875yqt1c9g.fsf@mail.parknet.co.jpSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 17 12月, 2021 1 次提交
-
-
由 Matthew Wilcox (Oracle) 提交于
This is a thin wrapper around bio_add_page(). The main advantage here is the documentation that folios larger than 2GiB are not supported. It's not currently possible to allocate folios that large, but if it ever becomes possible, this function will fail gracefully instead of doing I/O to the wrong bytes. Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: NJens Axboe <axboe@kernel.dk> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
- 16 12月, 2021 1 次提交
-
-
由 Jens Axboe 提交于
Pointless to maintain a head/tail for the list, as we never need to access the tail. Entries are always LIFO for cache hotness reasons. Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 27 10月, 2021 1 次提交
-
-
由 Pavel Begunkov 提交于
Nobody cares about iov iterators state if we return -EIOCBQUEUED, so as the we now have __blkdev_direct_IO_async(), which gets pages only once, we can skip expensive iov_iter_advance(). It's around 1-2% of all CPU spent. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/a6158edfbfa2ae3bc24aed29a72f035df18fad2f.1635337135.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 25 10月, 2021 1 次提交
-
-
由 Pavel Begunkov 提交于
Combine bio_iov_bvec_set() and bio_iov_bvec_set_append() and let the caller to do iov_iter_advance(). Also get rid of __bio_iov_bvec_set(), which was duplicated in the final binary, and replace a weird iov_iter_truncate() of a temporal iter copy with min() better reflecting the intention. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/bcf1ac36fce769a514e19475f3623cd86a1d8b72.1635006010.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 20 10月, 2021 1 次提交
-
-
由 Pavel Begunkov 提交于
Inline BIO_NO_PAGE_REF check of bio_release_pages() to avoid function call. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 18 10月, 2021 8 次提交
-
-
由 Jens Axboe 提交于
If we're completing nbytes and nbytes is the size of the bio, don't bother with calling into the iterator increment helpers. Just clear the bio size and we're done. Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
Convert bdev->bd_disk->queue to bdev_get_queue(), it's uses a cached queue pointer and so is faster. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/85c36ea784d285a5075baa10049e6b59e15fb484.1634219547.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Replace the blk_poll interface that requires the caller to keep a queue and cookie from the submissions with polling based on the bio. Polling for the bio itself leads to a few advantages: - the cookie construction can made entirely private in blk-mq.c - the caller does not need to remember the request_queue and cookie separately and thus sidesteps their lifetime issues - keeping the device and the cookie inside the bio allows to trivially support polling BIOs remapping by stacking drivers - a lot of code to propagate the cookie back up the submission path can be removed entirely. Signed-off-by: NChristoph Hellwig <hch@lst.de> Tested-by: NMark Wunderlich <mark.wunderlich@intel.com> Link: https://lore.kernel.org/r/20211012111226.760968-15-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
This flags ensures that the pages will not be reused for non-bio allocations before the end of an RCU grace period. With that we can safely use a RCU lookup for bio polling as long as we are fine with occasionally polling the wrong device. Signed-off-by: NChristoph Hellwig <hch@lst.de> Tested-by: NMark Wunderlich <mark.wunderlich@intel.com> Link: https://lore.kernel.org/r/20211012111226.760968-13-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
bio_truncate is only used in bio.c, so mark it static. Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20211012161804.991559-9-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Mark __bio_try_merge_page static and move it up a bit to avoid the need for a forward declaration. Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20211012161804.991559-7-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
bio_full is only used in bio.c, so move it there. Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20211012161804.991559-6-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
BIO_DEBUG is always defined, so just switch the two instances to use BUG_ON directly. Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20211012161804.991559-2-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 25 9月, 2021 1 次提交
-
-
由 Ming Lei 提交于
rq_qos framework is only applied on request based driver, so: 1) rq_qos_done_bio() needn't to be called for bio based driver 2) rq_qos_done_bio() needn't to be called for bio which isn't tracked, such as bios ended from error handling code. Especially in bio_endio(): 1) request queue is referred via bio->bi_bdev->bd_disk->queue, which may be gone since request queue refcount may not be held in above two cases 2) q->rq_qos may be freed in blk_cleanup_queue() when calling into __rq_qos_done_bio() Fix the potential kernel panic by not calling rq_qos_ops->done_bio if the bio isn't tracked. This way is safe because both ioc_rqos_done_bio() and blkcg_iolatency_done_bio() are nop if the bio isn't tracked. Reported-by: NYu Kuai <yukuai3@huawei.com> Cc: tj@kernel.org Signed-off-by: NMing Lei <ming.lei@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Acked-by: NTejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20210924110704.1541818-1-ming.lei@redhat.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 03 9月, 2021 1 次提交
-
-
由 Jens Axboe 提交于
Apparently the last fixup got butter fingered a bit, the correct variable name is 'nr_vecs', not 'nr_iovecs'. Link: https://lore.kernel.org/lkml/20210903164939.02f6e8c5@canb.auug.org.au/Reported-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 24 8月, 2021 4 次提交
-
-
由 Pavel Begunkov 提交于
__bio_iov_append_get_pages() doesn't put not appended pages on bio_add_hw_page() failure, so potentially leaking them, fix it. Also, do the same for __bio_iov_iter_get_pages(), even though it looks like it can't be triggered by userspace in this case. Fixes: 0512a75b ("block: Introduce REQ_OP_ZONE_APPEND") Cc: stable@vger.kernel.org # 5.8+ Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/1edfa6a2ffd66d55e6345a477df5387d2c1415d0.1626653825.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
We're missing a description for the 'nr_vecs' parameter. While in there, clarify that freeing a bio allocated through this function must be done from process context. Fixes: 1cbbd31c4ada ("bio: add allocation cache abstraction") Reported-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
Add a per-cpu bio_set cache for bio allocations, enabling us to quickly recycle them instead of going through the slab allocator. This cache isn't IRQ safe, and hence is only really suitable for polled IO. Very simple - keeps a count of bio's in the cache, and maintains a max of 512 with a slack of 64. If we get above max + slack, we drop slack number of bio's. Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
The memset() used is measurably slower in targeted benchmarks, wasting about 1% of the total runtime, or 50% of the (later) hot path cached bio alloc. Get rid of it and fill in the bio manually. Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 23 8月, 2021 1 次提交
-
-
由 Chaitanya Kulkarni 提交于
The function bio_trim has offset and size arguments that are declared as int. The callers of this function use sector_t type when passing the offset and size, e.g. drivers/md/raid1.c:narrow_write_error() and drivers/md/raid1.c:narrow_write_error(). Change offset and size arguments to sector_t type for bio_trim(). Also, add WARN_ON_ONCE() to catch their overflow. Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: NNaohiro Aota <naohiro.aota@wdc.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
- 03 8月, 2021 2 次提交
-
-
由 Christoph Hellwig 提交于
Use the proper helpers instead of open coding the copy. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20210727055646.118787-11-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Use memzero_bvec to zero each segment in the bio instead of manually mapping and zeroing the data. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NIra Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20210727055646.118787-6-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 24 6月, 2021 1 次提交
-
-
由 Edward Hsieh 提交于
For chained bio, trace_block_bio_complete in bio_endio is currently called only by the parent bio once upon all chained bio completed. However, the sector and size for the parent bio are modified in bio_split. Therefore, the size and sector of the complete events might not match the queue events in blktrace. The original fix of bio completion trace <fbbaf700> ("block: trace completion of all bios.") wants multiple complete events to correspond to one queue event but missed this. The issue can be reproduced by md/raid5 read with bio cross chunks. To fix, move trace completion into the loop for every chained bio to call. Fixes: fbbaf700 ("block: trace completion of all bios.") Reviewed-by: NWade Liang <wadel@synology.com> Reviewed-by: NBingJing Chang <bingjingc@synology.com> Signed-off-by: NEdward Hsieh <edwardh@synology.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210624123030.27014-1-edwardh@synology.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 09 5月, 2021 1 次提交
-
-
由 Jens Axboe 提交于
This reverts commit cd2c7545. Alex reports that the commit causes corruption with LUKS on ext4. Revert it for now so that this can be investigated properly. Link: https://lore.kernel.org/linux-block/1620493841.bxdq8r5haw.none@localhost/Reported-by: NAlex Xu (Hello71) <alex_y_xu@yahoo.ca> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 04 5月, 2021 1 次提交
-
-
由 Changheun Lee 提交于
bio size can grow up to 4GB when muli-page bvec is enabled. but sometimes it would lead to inefficient behaviors. in case of large chunk direct I/O, - 32MB chunk read in user space - all pages for 32MB would be merged to a bio structure if the pages physical addresses are contiguous. it makes some delay to submit until merge complete. bio max size should be limited to a proper size. When 32MB chunk read with direct I/O option is coming from userspace, kernel behavior is below now in do_direct_IO() loop. it's timeline. | bio merge for 32MB. total 8,192 pages are merged. | total elapsed time is over 2ms. |------------------ ... ----------------------->| | 8,192 pages merged a bio. | at this time, first bio submit is done. | 1 bio is split to 32 read request and issue. |---------------> |---------------> |---------------> ...... |---------------> |--------------->| total 19ms elapsed to complete 32MB read done from device. | If bio max size is limited with 1MB, behavior is changed below. | bio merge for 1MB. 256 pages are merged for each bio. | total 32 bio will be made. | total elapsed time is over 2ms. it's same. | but, first bio submit timing is fast. about 100us. |--->|--->|--->|---> ... -->|--->|--->|--->|--->| | 256 pages merged a bio. | at this time, first bio submit is done. | and 1 read request is issued for 1 bio. |---------------> |---------------> |---------------> ...... |---------------> |--------------->| total 17ms elapsed to complete 32MB read done from device. | As a result, read request issue timing is faster if bio max size is limited. Current kernel behavior with multipage bvec, super large bio can be created. And it lead to delay first I/O request issue. Signed-off-by: NChangheun Lee <nanich.lee@samsung.com> Reviewed-by: NBart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20210503095203.29076-1-nanich.lee@samsung.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 12 4月, 2021 2 次提交
-
-
由 Christoph Hellwig 提交于
bio_list_copy_data is only used by pktcdvd, so move it there. Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210412134658.2623190-2-hch@lst.deReviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
zero_fill_bio_iter is only used to implement zero_fill_bio, so remove the indirection. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20210412134658.2623190-1-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 01 4月, 2021 1 次提交
-
-
由 Yufen Yu 提交于
For multiple split bios, if one of the bio is fail, the whole should return error to application. But we found there is a race between bio_integrity_verify_fn and bio complete, which return io success to application after one of the bio fail. The race as following: split bio(READ) kworker nvme_complete_rq blk_update_request //split error=0 bio_endio bio_integrity_endio queue_work(kintegrityd_wq, &bip->bip_work); bio_integrity_verify_fn bio_endio //split bio __bio_chain_endio if (!parent->bi_status) <interrupt entry> nvme_irq blk_update_request //parent error=7 req_bio_endio bio->bi_status = 7 //parent bio <interrupt exit> parent->bi_status = 0 parent->bi_end_io() // return bi_status=0 The bio has been split as two: split and parent. When split bio completed, it depends on kworker to do endio, while bio_integrity_verify_fn have been interrupted by parent bio complete irq handler. Then, parent bio->bi_status which have been set in irq handler will overwrite by kworker. In fact, even without the above race, we also need to conside the concurrency beteen mulitple split bio complete and update the same parent bi_status. Normally, multiple split bios will be issued to the same hctx and complete from the same irq vector. But if we have updated queue map between multiple split bios, these bios may complete on different hw queue and different irq vector. Then the concurrency update parent bi_status may cause the final status error. Suggested-by: NKeith Busch <kbusch@kernel.org> Signed-off-by: NYufen Yu <yuyufen@huawei.com> Reviewed-by: NMing Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20210331115359.1125679-1-yuyufen@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 25 3月, 2021 1 次提交
-
-
由 Johannes Thumshirn 提交于
Christoph reported that we'll likely trigger the WARN_ON_ONCE() checking that we're not submitting a bvec with REQ_OP_ZONE_APPEND in bio_iov_iter_get_pages() some time ago using zoned btrfs, but I couldn't reproduce it back then. Now Naohiro was able to trigger the bug as well with xfstests generic/095 on a zoned btrfs. There is nothing that prevents bvec submissions via REQ_OP_ZONE_APPEND if the hardware's zone append limit is met. Reported-by: NNaohiro Aota <naohiro.aota@wdc.com> Reported-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/10bd414d9326c90cd69029077db63b363854eee5.1616600835.git.johannes.thumshirn@wdc.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 11 3月, 2021 1 次提交
-
-
由 Christoph Hellwig 提交于
Ever since the addition of multipage bio_vecs BIO_MAX_PAGES has been horribly confusingly misnamed. Rename it to BIO_MAX_VECS to stop confusing users of the bio API. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20210311110137.1132391-2-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 09 2月, 2021 1 次提交
-
-
由 Johannes Thumshirn 提交于
Add bio_add_zone_append_page(), a wrapper around bio_add_hw_page() which is intended to be used by file systems that directly add pages to a bio instead of using bio_iov_iter_get_pages(). Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJosef Bacik <josef@toxicpanda.com> Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Acked-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
- 08 2月, 2021 8 次提交
-
-
由 Christoph Hellwig 提交于
Instead of encoding of the bvec pool using magic bio flags, just use a helper to find the pool based on the max_vecs value. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
bio_iov_bvec_set clones the bio_vecs from the iter, and thus should be treated like a cloned bio in every respect. That also includes not touching bi_max_vecs as that is a property of the bio allocation and not its current payload. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
bio_iov_bvec_set assigns the foreign bvec, so setting the NO_PAGE_REF directly there seems like the best fit. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Remove a pointless layer of indentation after a return statement. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
The bi_max_vecs and bi_vcnt fields are defined as unsigned short, so don't allow passing larger values in. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
All bios with up to 4 bvecs use the inline bvecs in the bio itself, so don't bother to define bvec_slabs entries for them. Also decruftify the bvec_slabs definition and initialization while we're at it. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Avoid the pointless goto by trying the slab allocation first and falling through to the mempool. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Clean up bvec_alloc a little by factoring out a helper for the gfp_t manipulations. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-