1. 03 Aug, 2021 1 commit
  2. 17 Jun, 2021 1 commit
    • C
      block: export blk_next_bio() · c28a6147
      Chaitanya Kulkarni committed
      The block layer provides emulation of zone management operations
      targeting all zones of a zoned block device only for the zone reset
      operation (REQ_OP_ZONE_RESET). In order to correctly implement
      exporting of zoned block devices with NVMeOF, emulating zone management
      operations targeting all zones of a device is also necessary for the
      open, close and finish zone operations (REQ_OP_ZONE_OPEN,
      REQ_OP_ZONE_CLOSE and REQ_OP_ZONE_FINISH).
      
      Instead of duplicating the code, export the existing helper from the
      block layer so that the NVMeOF zoned block device backend can use the
      same bio chaining pattern that the block layer uses for its
      REQ_OP_ZONE_RESET all-zones emulation.
      Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
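The chaining pattern this commit refers to can be modelled in userspace. The sketch below is only an illustration under stated assumptions: `toy_bio`, `toy_next_bio()`, `toy_submit()` and `toy_zone_mgmt_all()` are hypothetical names, the "device" completes every bio immediately, and none of this is the kernel's actual code. It shows the idea that each newly allocated bio becomes the parent of the previous one, so the last bio only completes after every earlier one has.

```c
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct bio; not the kernel type. */
struct toy_bio {
    unsigned int zone;       /* zone this operation targets */
    int remaining;           /* completions still pending, starts at 1 */
    struct toy_bio *parent;  /* bio whose completion waits on this one */
};

static unsigned int toy_completed;  /* number of bios fully completed */

/* Complete a bio; when its count reaches zero, propagate to its parent. */
static void toy_endio(struct toy_bio *bio)
{
    while (bio && --bio->remaining == 0) {
        struct toy_bio *parent = bio->parent;

        toy_completed++;
        free(bio);
        bio = parent;
    }
}

/* Assumption: the toy device completes every submitted bio immediately. */
static void toy_submit(struct toy_bio *bio)
{
    toy_endio(bio);
}

/* Allocate the next bio, chain the previous one to it, submit the previous. */
static struct toy_bio *toy_next_bio(struct toy_bio *prev, unsigned int zone)
{
    struct toy_bio *new = calloc(1, sizeof(*new));

    if (!new)
        abort();
    new->zone = zone;
    new->remaining = 1;
    if (prev) {
        prev->parent = new;   /* new must wait for prev to complete */
        new->remaining++;
        toy_submit(prev);
    }
    return new;
}

/* Emulate one zone management operation applied to every zone. */
static unsigned int toy_zone_mgmt_all(unsigned int nr_zones)
{
    struct toy_bio *bio = NULL;
    unsigned int zone;

    toy_completed = 0;
    for (zone = 0; zone < nr_zones; zone++)
        bio = toy_next_bio(bio, zone);
    if (bio)
        toy_submit(bio);      /* the caller submits the final bio itself */
    return toy_completed;
}
```

Calling `toy_zone_mgmt_all(3)` walks three zones and reports how many toy bios completed; the caller only ever waits on the final bio in the chain.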
  3. 09 Jun, 2021 1 commit
    • L
      block: return the correct bvec when checking for gaps · c9c9762d
      Long Li committed
      After commit 07173c3e ("block: enable multipage bvecs"), a bvec can
      have multiple pages. But bio_will_gap() still assumes one page bvec while
      checking for merging. If the pages in the bvec go across the
      seg_boundary_mask, this check for merging can potentially succeed if only
      the 1st page is tested, and can fail if all the pages are tested.
      
      Later, when SCSI builds the SG list, the same check for merging is done in
      __blk_segment_map_sg_merge() with all the pages in the bvec tested. This
      time the check may fail if the pages in the bvec go across the
      seg_boundary_mask (even though they tested okay in bio_will_gap() earlier,
      so those BIOs were merged). If this check fails, we end up with a broken
      SG list for drivers that assume the SG list has no offsets in intermediate
      pages. This results in incorrect pages being written to the disk.
      
      Fix this by returning the multi-page bvec when testing gaps for merging.
      
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      Cc: Pavel Begunkov <asml.silence@gmail.com>
      Cc: Ming Lei <ming.lei@redhat.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
      Cc: Jeffle Xu <jefflexu@linux.alibaba.com>
      Cc: linux-kernel@vger.kernel.org
      Cc: stable@vger.kernel.org
      Fixes: 07173c3e ("block: enable multipage bvecs")
      Signed-off-by: Long Li <longli@microsoft.com>
      Reviewed-by: Ming Lei <ming.lei@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Link: https://lore.kernel.org/r/1623094445-22332-1-git-send-email-longli@linuxonhyperv.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
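To see why testing only the first page can give a different answer than testing the whole bvec, here is a small userspace sketch. It assumes a 4096-byte page and a 64 KiB segment boundary (mask 0xFFFF); `toy_bvec`, `toy_mergeable()` and `toy_first_page()` are hypothetical names for illustration, not kernel functions.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u

/* Hypothetical physical range standing in for a bvec. */
struct toy_bvec {
    uint64_t phys;  /* physical start address */
    uint32_t len;   /* length in bytes, may span several pages */
};

/*
 * Two ranges can share a segment if they are physically contiguous and
 * the first byte of a and the last byte of b fall inside the same
 * boundary window described by mask.
 */
static bool toy_mergeable(struct toy_bvec a, struct toy_bvec b, uint64_t mask)
{
    uint64_t first = a.phys;
    uint64_t last = b.phys + b.len - 1;

    if (a.phys + a.len != b.phys)
        return false;
    return (first | mask) == (last | mask);
}

/* Clamp a range to its first page, as a single-page check would see it. */
static struct toy_bvec toy_first_page(struct toy_bvec v)
{
    uint32_t room = TOY_PAGE_SIZE - (uint32_t)(v.phys % TOY_PAGE_SIZE);

    if (v.len > room)
        v.len = room;
    return v;
}

/* Check using only the first page of the second range. */
static bool toy_demo_single_page(void)
{
    struct toy_bvec prev = { 0xD000, 0x1000 };
    struct toy_bvec next = { 0xE000, 0x3000 };  /* crosses 0x10000 */

    return toy_mergeable(prev, toy_first_page(next), 0xFFFF);
}

/* Check using the whole multi-page second range. */
static bool toy_demo_multi_page(void)
{
    struct toy_bvec prev = { 0xD000, 0x1000 };
    struct toy_bvec next = { 0xE000, 0x3000 };

    return toy_mergeable(prev, next, 0xFFFF);
}
```

With these example addresses the single-page view reports the ranges as mergeable while the multi-page view does not, which is the disagreement between the two checks that the commit message describes.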
  4. 09 May, 2021 1 commit
  5. 04 May, 2021 1 commit
    • C
      bio: limit bio max size · cd2c7545
      Changheun Lee committed
      bio size can grow up to 4GB when multi-page bvec is enabled, but
      sometimes this leads to inefficient behavior. In the case of large-chunk
      direct I/O (for example, a 32MB chunk read from user space), all pages
      for the 32MB would be merged into one bio structure if the pages'
      physical addresses are contiguous. This delays submission until the
      merge completes. The bio max size should be limited to a proper size.
      
      When a 32MB chunk read with the direct I/O option comes from userspace,
      the kernel currently behaves as follows in the do_direct_IO() loop.
      The timeline is:
      
       | bio merge for 32MB. total 8,192 pages are merged.
       | total elapsed time is over 2ms.
       |------------------ ... ----------------------->|
                                                       | 8,192 pages merged a bio.
                                                       | at this time, first bio submit is done.
                                                       | 1 bio is split to 32 read request and issue.
                                                       |--------------->
                                                        |--------------->
                                                         |--------------->
                                                                    ......
                                                                         |--------------->
                                                                          |--------------->|
                                total 19ms elapsed to complete 32MB read done from device. |
      
      If the bio max size is limited to 1MB, the behavior changes as follows.
      
       | bio merge for 1MB. 256 pages are merged for each bio.
       | total 32 bio will be made.
       | total elapsed time is over 2ms. it's same.
       | but, first bio submit timing is fast. about 100us.
       |--->|--->|--->|---> ... -->|--->|--->|--->|--->|
            | 256 pages merged a bio.
            | at this time, first bio submit is done.
            | and 1 read request is issued for 1 bio.
            |--------------->
                 |--------------->
                      |--------------->
                                            ......
                                                       |--------------->
                                                        |--------------->|
              total 17ms elapsed to complete 32MB read done from device. |
      
      As a result, read requests are issued sooner if the bio max size is limited.
      With the current kernel behavior and multipage bvec, a very large bio can
      be created, which delays the issue of the first I/O request.
      Signed-off-by: Changheun Lee <nanich.lee@samsung.com>
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Link: https://lore.kernel.org/r/20210503095203.29076-1-nanich.lee@samsung.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
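The page and bio counts quoted in the timelines above follow from simple arithmetic. The sketch below reproduces them, assuming a 4096-byte page; `toy_plan` and `toy_plan_io()` are hypothetical helpers, not kernel code, and the elapsed times in the commit message are measurements that this arithmetic does not model.

```c
#include <stddef.h>

#define TOY_PAGE_SIZE 4096u  /* assumption: 4 KiB pages */

struct toy_plan {
    size_t nr_bios;        /* bios needed to cover the whole I/O */
    size_t pages_per_bio;  /* pages merged into each full-sized bio */
};

/* Split io_bytes into bios of at most bio_max_bytes (must be nonzero). */
static struct toy_plan toy_plan_io(size_t io_bytes, size_t bio_max_bytes)
{
    struct toy_plan p;
    size_t per_bio = io_bytes < bio_max_bytes ? io_bytes : bio_max_bytes;

    p.nr_bios = (io_bytes + bio_max_bytes - 1) / bio_max_bytes;
    p.pages_per_bio = (per_bio + TOY_PAGE_SIZE - 1) / TOY_PAGE_SIZE;
    return p;
}
```

For a 32MB read, a cap of 32MB or more gives one bio of 8,192 pages, while a 1MB cap gives 32 bios of 256 pages each, so the first bio is ready for submission after merging 256 pages instead of 8,192.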
  6. 12 Apr, 2021 2 commits
  7. 11 Mar, 2021 1 commit
  8. 27 Feb, 2021 1 commit
  9. 09 Feb, 2021 1 commit
  10. 08 Feb, 2021 3 commits
  11. 28 Jan, 2021 1 commit
  12. 25 Jan, 2021 6 commits
  13. 03 Dec, 2020 1 commit
  14. 26 Nov, 2020 1 commit
  15. 29 Jun, 2020 2 commits
  16. 24 Jun, 2020 1 commit
  17. 02 Jun, 2020 1 commit
    • J
      block: mark bio_wouldblock_error() bio with BIO_QUIET · abb30460
      Jens Axboe committed
      We really don't care about triggering buffer errors for this condition.
      This avoids a spew of:
      
      Buffer I/O error on dev sdc, logical block 785929, async page read
      Buffer I/O error on dev sdc, logical block 759095, async page read
      Buffer I/O error on dev sdc, logical block 766922, async page read
      Buffer I/O error on dev sdc, logical block 17659, async page read
      Buffer I/O error on dev sdc, logical block 637571, async page read
      Buffer I/O error on dev sdc, logical block 39241, async page read
      Buffer I/O error on dev sdc, logical block 397241, async page read
      Buffer I/O error on dev sdc, logical block 763992, async page read
      
      from -EAGAIN conditions on request allocation for async reads.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
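The effect of marking a bio as quiet can be sketched in userspace as a flag that the error-reporting path consults before logging. Everything here is a hypothetical model: `TOY_BIO_QUIET`, `toy_bio`, `toy_wouldblock_error()` and `toy_should_log_error()` are illustrative names and do not reproduce the kernel's flag handling or status codes.

```c
#include <errno.h>
#include <stdbool.h>

#define TOY_BIO_QUIET (1u << 0)  /* hypothetical flag bit */

/* Hypothetical stand-in for a bio with flags and a completion status. */
struct toy_bio {
    unsigned int flags;
    int status;  /* 0 on success, negative errno on failure */
};

/* Fail the bio with "would block" and mark it so no error is logged. */
static void toy_wouldblock_error(struct toy_bio *bio)
{
    bio->flags |= TOY_BIO_QUIET;
    bio->status = -EAGAIN;
}

/* A completion handler would log only failed bios that are not quiet. */
static bool toy_should_log_error(const struct toy_bio *bio)
{
    return bio->status != 0 && !(bio->flags & TOY_BIO_QUIET);
}

/* Demo: does a failed bio get logged, depending on how it failed? */
static bool toy_demo_logs(bool wouldblock)
{
    struct toy_bio bio = { 0, 0 };

    if (wouldblock)
        toy_wouldblock_error(&bio);
    else
        bio.status = -EIO;
    return toy_should_log_error(&bio);
}
```

In this model a genuine I/O error is still reported, while the would-block case stays silent, which is the behavior the commit message asks for.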
  18. 27 May, 2020 1 commit
  19. 25 May, 2020 1 commit
  20. 19 May, 2020 1 commit
  21. 19 Apr, 2020 1 commit
    • G
      bio: Replace zero-length array with flexible-array member · 0a368bf0
      Gustavo A. R. Silva committed
      The current codebase makes use of the zero-length array language
      extension to the C90 standard, but the preferred mechanism to declare
      variable-length types such as these ones is a flexible array member[1][2],
      introduced in C99:
      
      struct foo {
              int stuff;
              struct boo array[];
      };
      
      By making use of the mechanism above, we will get a compiler warning
      in case the flexible array does not occur last in the structure, which
      will help us prevent some kind of undefined behavior bugs from being
      inadvertently introduced[3] to the codebase from now on.
      
      Also, notice that dynamic memory allocations won't be affected by
      this change:
      
      "Flexible array members have incomplete type, and so the sizeof operator
      may not be applied. As a quirk of the original implementation of
      zero-length arrays, sizeof evaluates to zero."[1]
      
      This issue was found with the help of Coccinelle.
      
      [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
      [2] https://github.com/KSPP/linux/issues/21
      [3] commit 76497732 ("cxgb3/l2t: Fix undefined behaviour")
      Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
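As a minimal sketch of how a structure with a flexible array member is typically allocated, the example below sizes the allocation as the fixed header plus the trailing elements. `toy_vec`, `toy_vec_alloc()` and `toy_demo_sum()` are hypothetical names for illustration; kernel code would normally use its own allocators and the overflow-checked `struct_size()` helper rather than open-coded arithmetic.

```c
#include <stdlib.h>

/* Hypothetical variable-length type; the flexible array must come last. */
struct toy_vec {
    size_t nr;
    int vals[];  /* flexible array member (C99) */
};

/* Allocate a toy_vec with room for nr trailing elements, filled 0..nr-1. */
static struct toy_vec *toy_vec_alloc(size_t nr)
{
    /* sizeof(*v) covers only the fixed part, not the trailing array */
    struct toy_vec *v = malloc(sizeof(*v) + nr * sizeof(v->vals[0]));
    size_t i;

    if (!v)
        return NULL;
    v->nr = nr;
    for (i = 0; i < nr; i++)
        v->vals[i] = (int)i;
    return v;
}

/* Demo: allocate, sum the elements, free; returns -1 if allocation fails. */
static int toy_demo_sum(size_t nr)
{
    struct toy_vec *v = toy_vec_alloc(nr);
    int sum = 0;
    size_t i;

    if (!v)
        return -1;
    for (i = 0; i < v->nr; i++)
        sum += v->vals[i];
    free(v);
    return sum;
}
```

Moving `vals` above `nr` in the structure should be rejected by the compiler, which is the safety net the commit message refers to.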
  22. 28 Mar, 2020 1 commit
  23. 25 Mar, 2020 1 commit
  24. 29 Dec, 2019 1 commit
    • M
      block: add bio_truncate to fix guard_bio_eod · 85a8ce62
      Ming Lei committed
      Some filesystems, such as vfat, may send a bio that crosses the device
      boundary. Worse, an IO request that starts within the device boundary
      can contain more than one segment past EOD.
      
      Commit dce30ca9 ("fs: fix guard_bio_eod to check for real EOD errors")
      tries to fix this issue by returning -EIO in this situation. However,
      with that approach the fs user code loses the chance to handle -EIO, and
      sync_inodes_sb() may then hang forever.
      
      Also, the current approach of truncating the last segment by updating the
      last bvec is dangerous: the bvec table is no longer immutable, and fs bio
      users may not be able to retrieve the truncated pages via
      bio_for_each_segment_all() in their .end_io callback.
      
      Fix this issue by supporting multi-segment truncation. The approach is
      also simpler:
      
      - just update the bio size, since the block layer can build the correct
      bvec from the updated bio size. The bvec table then becomes truly immutable.
      
      - zero all truncated segments for a read bio
      
      Cc: Carlos Maiolino <cmaiolino@redhat.com>
      Cc: linux-fsdevel@vger.kernel.org
      Fixed-by: dce30ca9 ("fs: fix guard_bio_eod to check for real EOD errors")
      Reported-by: syzbot+2b9e54155c8c25d8d165@syzkaller.appspotmail.com
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
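The two steps listed in this commit message (zero the truncated part of a read, then shrink only the size) can be sketched in userspace as follows. This is a simplified model under stated assumptions: `toy_seg`, `toy_bio` and `toy_bio_truncate()` are hypothetical, segments are plain byte buffers, and the segment table is deliberately never modified.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical segment: a plain buffer standing in for a bvec. */
struct toy_seg {
    unsigned char *buf;
    size_t len;
};

/* Hypothetical bio: a fixed segment table plus the number of valid bytes. */
struct toy_bio {
    struct toy_seg *segs;  /* never modified by truncation */
    size_t nr_segs;
    size_t size;           /* bytes of I/O this bio covers */
    int is_read;
};

/* Shrink the bio to new_size; for reads, zero every byte past new_size. */
static void toy_bio_truncate(struct toy_bio *bio, size_t new_size)
{
    size_t offset = 0;
    size_t i;

    if (new_size >= bio->size)
        return;
    if (bio->is_read) {
        for (i = 0; i < bio->nr_segs && offset < bio->size; i++) {
            struct toy_seg *s = &bio->segs[i];
            size_t seg_end = offset + s->len;

            if (seg_end > bio->size)
                seg_end = bio->size;
            if (seg_end > new_size) {
                /* first byte within this segment that is truncated */
                size_t from = offset > new_size ? 0 : new_size - offset;

                memset(s->buf + from, 0, (seg_end - offset) - from);
            }
            offset += s->len;
        }
    }
    bio->size = new_size;  /* only the size changes, not the table */
}

/* Demo: two 4-byte segments truncated to 2 bytes; count zeroed bytes. */
static size_t toy_truncate_demo(void)
{
    unsigned char a[4], b[4];
    struct toy_seg segs[2] = { { a, sizeof(a) }, { b, sizeof(b) } };
    struct toy_bio bio = { segs, 2, 8, 1 };
    size_t zeroed = 0;
    size_t i;

    memset(a, 0xAA, sizeof(a));
    memset(b, 0xAA, sizeof(b));
    toy_bio_truncate(&bio, 2);
    for (i = 0; i < 4; i++)
        zeroed += (a[i] == 0) + (b[i] == 0);
    return zeroed;
}
```

In the demo the truncation spans both segments: the last two bytes of the first buffer and all four bytes of the second are zeroed, while the segment table itself is left as it was.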
  25. 01 Jul, 2019 1 commit
    • M
      block: fix .bi_size overflow · 79d08f89
      Ming Lei committed
      'bio->bi_iter.bi_size' is 'unsigned int', which can hold at most
      4G - 1 bytes.
      
      Before 07173c3e ("block: enable multipage bvecs"), one bio could
      include only a limited number of pages, usually at most 256, so the fs
      bio size would not exceed 1M bytes most of the time.
      
      Since we support multi-page bvecs, in theory more than 1M pages can be
      added to one fs bio, especially in the case of hugepages or a big
      writeback with too many dirty pages. There is then a chance that
      .bi_size overflows.
      
      Fix this issue by using bio_full() to check whether the added segment
      may overflow .bi_size.
      
      Cc: Liu Yiding <liuyd.fnst@cn.fujitsu.com>
      Cc: kernel test robot <rong.a.chen@intel.com>
      Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
      Cc: linux-xfs@vger.kernel.org
      Cc: linux-fsdevel@vger.kernel.org
      Cc: stable@vger.kernel.org
      Fixes: 07173c3e ("block: enable multipage bvecs")
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
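The kind of check described here, testing whether adding a segment would overflow an `unsigned int` size before doing the addition, can be sketched like this. `toy_size_would_overflow()` is a hypothetical helper for illustration and is not the kernel's bio_full(), which may consider other limits as well.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Return true if cur_size + len would not fit in an unsigned int.
 * The comparison is arranged so the check itself cannot wrap around.
 */
static bool toy_size_would_overflow(unsigned int cur_size, unsigned int len)
{
    return cur_size > UINT_MAX - len;
}
```

Writing the test as `cur_size + len < cur_size` would also detect wraparound for unsigned types, but subtracting from `UINT_MAX` states the limit explicitly and never performs the overflowing addition.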
  26. 29 Jun, 2019 2 commits
  27. 27 Jun, 2019 1 commit
  28. 21 Jun, 2019 1 commit
    • C
      block: remove the bi_phys_segments field in struct bio · 14ccb66b
      Christoph Hellwig committed
      We only need the number of segments in the blk-mq submission path.
      Remove the field from struct bio, and return it from a variant of
      blk_queue_split instead, so that it can be passed as an argument to
      those functions that need the value.
      
      This also means we stop recounting segments except for cloning
      and partial segments.
      
      To keep the number of arguments in this hot path down, remove
      pointless struct request_queue arguments from any of the functions
      that had one and grew a nr_segs argument.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  29. 17 Jun, 2019 1 commit
  30. 24 May, 2019 1 commit