1. 03 Oct, 2020 13 commits
  2. 29 Sep, 2020 1 commit
    • N
      null_blk: add support for max open/active zone limit for zoned devices · dc4d137e
      Niklas Cassel committed
      Add support for user space to set a max open zone and a max active zone
      limit via configfs. Both limits default to 0, which means no limit.
      
      Call the block layer API functions used for exposing the configured
      limits to sysfs.
      
      Add accounting in null_blk_zoned so that these new limits are respected.
      Performing an operation that would exceed these limits results in a
      standard I/O error.
      
      A max open zone limit exists in the ZBC standard.
      While null_blk_zoned is used to test the Zoned Block Device model in
      Linux, when it comes to differences between ZBC and ZNS, null_blk_zoned
      mostly follows ZBC.
      
      Therefore, implement the manage open zone resources function from ZBC,
      but additionally add support for max active zones.
      This enables user space not only to test against a device with an open
      zone limit, but also to test against a device with an active zone limit.
      Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
      Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      dc4d137e
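The accounting described in this commit message can be pictured with a small user-space sketch. It is not the null_blk code: the `zdev` structure, the `zone_cond` enum, and both functions are hypothetical names chosen for illustration, and the sketch simply fails when a limit is hit instead of implicitly closing another zone. It assumes that "active" means implicitly open, explicitly open, or closed, and that a limit of 0 means no limit.

```c
#include <stdbool.h>

/* Hypothetical zone conditions, loosely following ZBC terminology. */
enum zone_cond { ZC_EMPTY, ZC_IMP_OPEN, ZC_EXP_OPEN, ZC_CLOSED, ZC_FULL };

/* Hypothetical per-device counters; 0 in a max_* field means "no limit". */
struct zdev {
    unsigned int max_open;
    unsigned int max_active;
    unsigned int nr_imp_open;
    unsigned int nr_exp_open;
    unsigned int nr_closed;
};

/* Active zones are assumed to be open (either kind) or closed. */
static bool can_activate(const struct zdev *d)
{
    unsigned int active = d->nr_imp_open + d->nr_exp_open + d->nr_closed;
    return d->max_active == 0 || active < d->max_active;
}

static bool can_open(const struct zdev *d)
{
    unsigned int open = d->nr_imp_open + d->nr_exp_open;
    return d->max_open == 0 || open < d->max_open;
}

/* Explicitly open a zone; returns 0 on success, -1 if a limit would be exceeded. */
int zone_open_explicit(struct zdev *d, enum zone_cond *z)
{
    switch (*z) {
    case ZC_EMPTY:              /* needs a free active slot and a free open slot */
        if (!can_activate(d) || !can_open(d))
            return -1;
        break;
    case ZC_CLOSED:             /* already active, only needs an open slot */
        if (!can_open(d))
            return -1;
        d->nr_closed--;
        break;
    case ZC_IMP_OPEN:           /* already open, just changes kind */
        d->nr_imp_open--;
        break;
    case ZC_EXP_OPEN:
        return 0;
    default:                    /* ZC_FULL cannot be opened */
        return -1;
    }
    d->nr_exp_open++;
    *z = ZC_EXP_OPEN;
    return 0;
}

/* Close an open zone; it stays active but frees an open slot. */
int zone_close(struct zdev *d, enum zone_cond *z)
{
    switch (*z) {
    case ZC_IMP_OPEN: d->nr_imp_open--; break;
    case ZC_EXP_OPEN: d->nr_exp_open--; break;
    case ZC_CLOSED:   return 0;
    default:          return -1;
    }
    d->nr_closed++;
    *z = ZC_CLOSED;
    return 0;
}
```

With `max_open = 1` and `max_active = 2`, a second open fails until the first zone is closed, and a third zone cannot be opened while two zones remain active.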
  3. 28 Sep, 2020 1 commit
    • J
      Merge tag 'nvme-5.10-2020-09-27' of git://git.infradead.org/nvme into for-5.10/drivers · 1ed4211d
      Jens Axboe committed
      Pull NVMe updates from Christoph:
      
      "nvme updates for 5.10
      
       - fix keep alive timer modification (Amit Engel)
       - order the PCI ID list more sensibly (Andy Shevchenko)
       - cleanup the open by controller helper (Chaitanya Kulkarni)
       - use an xarray for the CSE log lookup (Chaitanya Kulkarni)
       - support ZNS in nvmet passthrough mode (Chaitanya Kulkarni)
       - fix nvme_ns_report_zones (me)
       - add a sanity check to nvmet-fc (James Smart)
       - fix interrupt allocation when too many polled queues are specified
         (Jeffle Xu)
       - small nvmet-tcp optimization (Mark Wunderlich)"
      
      * tag 'nvme-5.10-2020-09-27' of git://git.infradead.org/nvme:
        nvme-pci: allocate separate interrupt for the reserved non-polled I/O queue
        nvme: fix error handling in nvme_ns_report_zones
        nvmet-fc: fix missing check for no hostport struct
        nvmet: add passthru ZNS support
        nvmet: handle keep-alive timer when kato is modified by a set features cmd
        nvmet-tcp: have queue io_work context run on sock incoming cpu
        nvme-pci: Move enumeration by class to be last in the table
        nvme: use an xarray to lookup the Commands Supported and Effects log
        nvme: lift the file open code from nvme_ctrl_get_by_path
      1ed4211d
  4. 27 Sep, 2020 9 commits
  5. 25 Sep, 2020 16 commits
    • J
      Merge branch 'md-next' of... · 163090c1
      Jens Axboe committed
      Merge branch 'md-next' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md into for-5.10/drivers
      
      Pull MD updates from Song.
      
      * 'md-next' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md:
        md/raid10: improve discard request for far layout
        md/raid10: improve raid10 discard request
        md/raid10: pull codes that wait for blocked dev into one function
        md/raid10: extend r10bio devs to raid disks
        md: add md_submit_discard_bio() for submitting discard bio
        md: Simplify code with existing definition RESYNC_SECTORS in raid10.c
        md/raid5: reallocate page array after setting new stripe_size
        md/raid5: resize stripe_head when reshape array
        md/raid5: let multiple devices of stripe_head share page
        md/raid6: let async recovery function support different page offset
        md/raid6: let syndrome computor support different page offset
        md/raid5: convert to new xor compution interface
        md/raid5: add new xor function to support different page offset
        md/raid5: make async_copy_data() to support different page offset
        md/raid5: add a new member of offset into r5dev
        md: only calculate blocksize once and use i_blocksize()
      163090c1
    • X
      md/raid10: improve discard request for far layout · d3ee2d84
      Xiao Ni committed
      For the far layout, the discard region is not contiguous on the disks,
      so it takes as many r10bios as there are far copies to cover all
      regions, and there must be a way to know whether all of them have
      finished. As in raid10_sync_request, only the master_bio of the first
      r10bio records the discard bio; the master_bio of every other r10bio
      records the first r10bio. The first r10bio finishes only after the
      other r10bios have finished, and then returns the discard bio.
      Signed-off-by: Xiao Ni <xni@redhat.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
      d3ee2d84
    • X
      md/raid10: improve raid10 discard request · bcc90d28
      Xiao Ni committed
      Currently the discard request is split by chunk size, so mkfs takes a
      long time on disks that support discard. This patch improves the
      handling of raid10 discard requests. It uses an approach similar to patch
      29efc390 (md/md0: optimize raid0 discard handling).
      
      It is a little more complex than raid0, though, because raid10 has
      different layouts. If raid10 uses the offset layout and the discard
      request is smaller than the stripe size, there are holes when we submit
      the discard bio to the underlying disks.
      
      For example: five disks (disk1 - disk5)
      D01 D02 D03 D04 D05
      D05 D01 D02 D03 D04
      D06 D07 D08 D09 D10
      D10 D06 D07 D08 D09
      The discard bio just wants to discard from D03 to D10. For disk3, there is
      a hole between D03 and D08. For disk4, there is a hole between D04 and D09.
      D03 is a chunk, raid10_write_request can handle one chunk perfectly. So
      the part that is not aligned with stripe size is still handled by
      raid10_write_request.
      
      If a reshape is running when the discard bio arrives and the discard bio
      spans the reshape position, raid10_write_request is responsible for
      handling this discard bio.
      
      I did a test with this patch set.
      Without patch:
      time mkfs.xfs /dev/md0
      real    4m39.775s
      user    0m0.000s
      sys     0m0.298s
      
      With patch:
      time mkfs.xfs /dev/md0
      real    0m0.105s
      user    0m0.000s
      sys     0m0.007s
      
      nvme3n1           259:1    0   477G  0 disk
      └─nvme3n1p1       259:10   0    50G  0 part
      nvme4n1           259:2    0   477G  0 disk
      └─nvme4n1p1       259:11   0    50G  0 part
      nvme5n1           259:6    0   477G  0 disk
      └─nvme5n1p1       259:12   0    50G  0 part
      nvme2n1           259:9    0   477G  0 disk
      └─nvme2n1p1       259:15   0    50G  0 part
      nvme0n1           259:13   0   477G  0 disk
      └─nvme0n1p1       259:14   0    50G  0 part
      Reviewed-by: Coly Li <colyli@suse.de>
      Reviewed-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
      Signed-off-by: Xiao Ni <xni@redhat.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
      bcc90d28
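The idea that only the stripe-aligned part of a discard takes the fast path, while unaligned ends go through the regular write path, can be sketched as a range split. This is an illustrative user-space sketch, not the raid10 code: `struct discard_split` and `split_discard()` are hypothetical names, and units are whatever the caller uses (chunks in the example from the commit message, sectors in practice). It assumes `stripe` is nonzero.

```c
#include <stdint.h>

/* Hypothetical result of splitting a discard range on stripe boundaries. */
struct discard_split {
    uint64_t head_len;   /* unaligned part before the first stripe boundary */
    uint64_t mid_start;  /* start of the stripe-aligned middle */
    uint64_t mid_len;    /* length of the middle, a multiple of stripe */
    uint64_t tail_len;   /* unaligned part after the last stripe boundary */
};

struct discard_split split_discard(uint64_t start, uint64_t len, uint64_t stripe)
{
    struct discard_split s = { 0 };
    uint64_t end = start + len;
    uint64_t first = (start + stripe - 1) / stripe * stripe; /* round up */
    uint64_t last = end / stripe * stripe;                   /* round down */

    if (first >= last) {
        /* No whole stripe is covered: everything takes the slow path. */
        s.head_len = len;
        s.mid_start = start;
        return s;
    }
    s.head_len = first - start;
    s.mid_start = first;
    s.mid_len = last - first;
    s.tail_len = end - last;
    return s;
}
```

For the five-disk example above, counting chunks from zero, discarding D03 to D10 is `split_discard(2, 8, 5)`: the head covers D03 to D05 and the aligned middle covers D06 to D10.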
    • X
      md/raid10: pull codes that wait for blocked dev into one function · f046f5d0
      Xiao Ni committed
      A following patch will reuse this logic, so pull the duplicated code
      into one function.
      Signed-off-by: Xiao Ni <xni@redhat.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
      f046f5d0
    • X
      md/raid10: extend r10bio devs to raid disks · 8650a889
      Xiao Ni committed
      Currently r10bio->devs is allocated with conf->copies entries. A discard
      bio must be submitted to all member disks and needs an r10bio to do so,
      so extend the array to r10bio->devs[geo.raid_disks].
      Reviewed-by: Coly Li <colyli@suse.de>
      Signed-off-by: Xiao Ni <xni@redhat.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
      8650a889
    • X
      md: add md_submit_discard_bio() for submitting discard bio · 2628089b
      Xiao Ni committed
      Move this logic from raid0.c to md.c so that it can also be used in
      raid10.c.
      Reviewed-by: Coly Li <colyli@suse.de>
      Reviewed-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
      Signed-off-by: Xiao Ni <xni@redhat.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
      2628089b
    • Z
      md: Simplify code with existing definition RESYNC_SECTORS in raid10.c · e287308b
      Zhen Lei committed
      #define RESYNC_SECTORS (RESYNC_BLOCK_SIZE >> 9)
      
      "RESYNC_BLOCK_SIZE/512" is equal to "RESYNC_BLOCK_SIZE >> 9", replace it
      with RESYNC_SECTORS.
      Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
      e287308b
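The equivalence this commit relies on is easy to check at compile time. In the sketch below the value of `RESYNC_BLOCK_SIZE` is an assumption made for illustration, and `resync_sectors()` is a hypothetical helper that only exposes the macro to callers; `_Static_assert` requires C11.

```c
/* Assumed value for illustration; the commit message does not state it. */
#define RESYNC_BLOCK_SIZE (64 * 1024)
#define RESYNC_SECTORS (RESYNC_BLOCK_SIZE >> 9)

/* Shifting right by 9 is dividing by 512, the sector size in bytes. */
_Static_assert(RESYNC_SECTORS == RESYNC_BLOCK_SIZE / 512,
               "RESYNC_SECTORS must equal RESYNC_BLOCK_SIZE / 512");

/* Hypothetical helper that returns the sector count. */
unsigned int resync_sectors(void)
{
    return RESYNC_SECTORS;
}
```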
    • Y
      md/raid5: reallocate page array after setting new stripe_size · 38912584
      Yufen Yu committed
      When resizing stripe_size, we also need to free the old shared page
      array and allocate a new one.
      Signed-off-by: Yufen Yu <yuyufen@huawei.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
      38912584
    • Y
      md/raid5: resize stripe_head when reshape array · f16acaf3
      Yufen Yu committed
      When reshaping an array, we try to reuse the shared pages of the old
      stripe_head and allocate more for the new one if needed.
      Signed-off-by: Yufen Yu <yuyufen@huawei.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
      f16acaf3
    • Y
      md/raid5: let multiple devices of stripe_head share page · 046169f0
      Yufen Yu committed
      In the current implementation, grow_buffers() uses alloc_page() to
      allocate the buffers for each stripe_head, i.e. it allocates a page
      for each dev[i] in stripe_head.
      
      Now that stripe_size is configurable through a sysfs entry, this
      means that on arm64 with 64KB pages we always allocate 64K buffers
      but use only 4K of each when stripe_size is 4K.
      
      To avoid wasting memory, we let multiple sh->dev entries share one
      real page. That is, multiple sh->dev[i].page pointers refer to the
      same page at different offsets. An example with 64K PAGE_SIZE and
      4K stripe_size:
      
                          64K PAGE_SIZE
                +---+---+---+---+------------------------------+
                |   |   |   |   |
                |   |   |   |   |
                +-+-+-+-+-+-+-+-+------------------------------+
                  ^   ^   ^   ^
                  |   |   |   +----------------------------+
                  |   |   |                                |
                  |   |   +-------------------+            |
                  |   |                       |            |
                  |   +----------+            |            |
                  |              |            |            |
                  +-+            |            |            |
                    |            |            |            |
              +-----+-----+------+-----+------+-----+------+------+
      sh      | offset(0) | offset(4K) | offset(8K) | offset(12K) |
       +      +-----------+------------+------------+-------------+
       +----> dev[0].page  dev[1].page  dev[2].page  dev[3].page
      
      A new 'pages' array is added to stripe_head to record the shared
      pages used by this stripe_head. They are allocated in grow_buffers()
      and freed in shrink_buffers().
      
      Once pages are shared, the users of sh->dev[i].page need to take care
      of the related page offset: the page of an issued bio and the page
      passed to the xor computation functions. Thanks to the earlier patches
      that added support for different page offsets, here we only need to
      set the correct dev[i].offset.
      Signed-off-by: Yufen Yu <yuyufen@huawei.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
      046169f0
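The mapping in the diagram, where dev[i] lands at offset i * 4K inside a shared 64K page, can be written out as a little arithmetic. The following is a sketch under stated assumptions rather than the raid5 code: `struct dev_slot`, `pages_needed()`, and `dev_slot_for()` are hypothetical names, and `stripe_size` is assumed to be nonzero and to divide `page_size` evenly.

```c
#include <stddef.h>

/* Hypothetical description of where one dev's buffer lives. */
struct dev_slot {
    size_t page_idx;  /* index into the stripe_head's shared 'pages' array */
    size_t offset;    /* byte offset of the buffer inside that page */
};

/* How many dev buffers fit in one page. */
static size_t devs_per_page(size_t page_size, size_t stripe_size)
{
    return page_size / stripe_size;
}

/* Number of shared pages needed to back ndevs buffers (rounded up). */
size_t pages_needed(size_t ndevs, size_t page_size, size_t stripe_size)
{
    size_t per = devs_per_page(page_size, stripe_size);
    return (ndevs + per - 1) / per;
}

/* Page index and offset for dev[i], packing buffers back to back. */
struct dev_slot dev_slot_for(size_t i, size_t page_size, size_t stripe_size)
{
    size_t per = devs_per_page(page_size, stripe_size);
    struct dev_slot s = { i / per, (i % per) * stripe_size };
    return s;
}
```

With a 64K page and a 4K stripe_size, dev[3] maps to page 0 at offset 12K, matching the diagram, and sixteen devices fit in a single page instead of sixteen.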
    • Y
      md/raid6: let async recovery function support different page offset · 4f86ff55
      Yufen Yu committed
      For now, the asynchronous raid6 recovery functions require a common
      offset for all pages. However, we expect them to support different page
      offsets once shared stripe pages are introduced. Do that by simply adding
      the page offset wherever a page address is referenced. Then replace the
      old interfaces with the new ones in raid6 and raid6test.
      Signed-off-by: Yufen Yu <yuyufen@huawei.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
      4f86ff55
    • Y
      md/raid6: let syndrome computor support different page offset · d69454bc
      Yufen Yu committed
      For now, the syndrome compute functions require a common offset in the
      pages array. However, we expect them to support different offsets when
      shared pages are used later in the series. Simply convert them by adding
      the page offset wherever a page address is referenced.
      
      Since the only callers of async_gen_syndrome() and async_syndrome_val()
      are in raid6, we do not keep the old interface but modify it directly.
      After that, replace the old interfaces with the new ones in raid6 and
      raid6test.
      Signed-off-by: Yufen Yu <yuyufen@huawei.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
      d69454bc
    • Y
      md/raid5: convert to new xor compution interface · a7c224a8
      Yufen Yu committed
      Replace async_xor() and async_xor_val() with the newly introduced
      interfaces async_xor_offs() and async_xor_val_offs() for raid456.
      Signed-off-by: Yufen Yu <yuyufen@huawei.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
      a7c224a8
    • Y
      md/raid5: add new xor function to support different page offset · 29bcff78
      Yufen Yu committed
      raid5 calls async_xor() and async_xor_val() to compute xor. For now,
      both require a common src/dst page offset, but we want them to support
      different src/dst page offsets for the upcoming shared pages.
      
      Add two new functions, async_xor_offs() and async_xor_val_offs(), as
      counterparts of async_xor() and async_xor_val() respectively.
      Signed-off-by: Yufen Yu <yuyufen@huawei.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
      29bcff78
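What "different src/dst page offsets" means for an xor can be shown with a plain synchronous loop. This sketch is not the async_tx API and does not reproduce the signature of async_xor_offs(): `xor_offs()` and its parameters are hypothetical, and it only illustrates that each source carries its own offset instead of one offset shared by all buffers.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * XOR len bytes from nsrc source buffers into dest. Source s is read
 * starting at src_offs[s], and the result is written starting at dest_off,
 * so several "devices" may live in one buffer at different offsets.
 */
void xor_offs(uint8_t *dest, size_t dest_off,
              uint8_t *const *srcs, const size_t *src_offs,
              size_t nsrc, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        uint8_t acc = 0;

        for (size_t s = 0; s < nsrc; s++)
            acc ^= srcs[s][src_offs[s] + i];
        dest[dest_off + i] = acc;
    }
}
```

Two sources may point at the same underlying buffer with offsets 0 and 4, which mirrors two dev[i].page pointers sharing one real page.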
    • Y
      md/raid5: make async_copy_data() to support different page offset · 248728dd
      Yufen Yu committed
      ops_run_biofill() and ops_run_biodrain() call async_copy_data()
      to copy sh->dev[i].page from or to a bio page. For now, this assumes
      that the offset of dev[i].page is 0, but we want to support different
      page offsets later in the series.
      
      Thus, pass the page offset to these functions and replace 'page_offset'
      with 'page_offset + poff'.
      
      No functional change.
      Signed-off-by: Yufen Yu <yuyufen@huawei.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
      248728dd
    • Y
      md/raid5: add a new member of offset into r5dev · 7aba13b7
      Yufen Yu committed
      Add a new member, offset, to struct r5dev. It indicates the offset
      of the related dev[i].page. For now, since each device has a private
      page, the value is always 0. Thus, we set offset to 0 when allocating
      a page in grow_buffers() and resize_stripes().
      
      To support different page offsets later, use the page offset rather
      than a literal '0' for async_memcpy() and ops_run_io().
      
      Support for different page offsets in the xor computation functions
      follows later in the series. To avoid repeatedly allocating a new
      array each time, we add a memory region to the scribble buffer to
      record the offsets.
      
      No functional change.
      Signed-off-by: Yufen Yu <yuyufen@huawei.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
      7aba13b7