1. 13 Nov, 2019 6 commits
    • block: Remove partition support for zoned block devices · 5eac3eb3
      Damien Le Moal committed
      No known partitioning tool supports zoned block devices, especially the
      host managed flavor with strong sequential write constraints.
      Furthermore, there are no known users or use cases for partitioned
      zoned block devices.
      
      This patch removes partition device creation for zoned block devices,
      which allows simplifying the processing of zone commands for zoned
      block devices. A warning is added if a partition table is found on the
      device.
      
      For report zones operations no zone sector information remapping is
      necessary anymore, simplifying the code. Of note is that remapping of
      zone reports for DM targets is still necessary as done by
      dm_remap_zone_report().
      
      Similarly, remapping of a zone reset bio is not necessary anymore.
      Testing for the applicability of the zone reset all request also becomes
      simpler and only needs to check that the number of sectors of the
      requested zone range is equal to the disk capacity.
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      5eac3eb3
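The simplified "reset all" test described in this commit can be modeled in a few lines. This is a Python sketch of the idea, not the kernel code: the function name and parameters are hypothetical, and the extra requirement that the range starts at sector 0 is an assumption added here so that "same length as the disk" really means "the whole disk".

```python
def is_reset_all(start_sector: int, nr_sectors: int, capacity: int) -> bool:
    """Return True when a zone reset request covers the entire disk.

    Hypothetical model: with no partitions there is no partition offset
    to remap, so the request range can be compared to the disk directly.
    """
    # Assumption: a whole-disk range must also begin at sector 0.
    return start_sector == 0 and nr_sectors == capacity
```

For example, `is_reset_all(0, 1000, 1000)` is true, while a request for half of the disk is not treated as a reset of all zones.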
    • block: Simplify report zones execution · ceeb373a
      Damien Le Moal committed
      All kernel users of blkdev_report_zones(), as well as applications using
      it through ioctl(BLKZONEREPORT), expect to potentially get fewer zone
      descriptors than requested. As such, the use of the internal report
      zones command execution loop implemented by blk_report_zones() is
      not necessary and can even be harmful to performance by causing the
      execution of inefficient small report zones commands to service the
      remainder of a requested zone array.
      
      This patch removes blk_report_zones(), simplifying the code. Also
      remove a now incorrect comment in dm_blk_report_zones().
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Javier Gonzalez <javier@javigon.com>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      ceeb373a
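The caller-side pattern this commit relies on (accept a short report, then ask again from where the last one ended) can be sketched as follows. This is a Python model with hypothetical names (`report_all_zones`, `make_fake_device`); the fake device and its per-call limit are assumptions made only so the example runs, and none of this mirrors the kernel API.

```python
from typing import Callable, List, Tuple

Zone = Tuple[int, int]  # (start sector, length in sectors)


def report_all_zones(report: Callable[[int, int], List[Zone]],
                     capacity: int, batch: int) -> List[Zone]:
    """Collect all zones, tolerating reports shorter than requested."""
    zones: List[Zone] = []
    sector = 0
    while sector < capacity:
        got = report(sector, batch)
        if not got:
            break  # nothing more to report
        zones.extend(got)
        last_start, last_len = got[-1]
        # Resume right after the last zone that was actually returned.
        sector = last_start + last_len
    return zones


def make_fake_device(capacity: int, zone_size: int,
                     max_per_call: int) -> Callable[[int, int], List[Zone]]:
    """Hypothetical device that never returns more than max_per_call zones."""
    def report(sector: int, nr_zones: int) -> List[Zone]:
        count = min(nr_zones, max_per_call)
        start = (sector // zone_size) * zone_size
        out: List[Zone] = []
        while start < capacity and len(out) < count:
            out.append((start, min(zone_size, capacity - start)))
            start += zone_size
        return out
    return report
```

Because the caller already loops, the reporting side has no need to issue extra small commands just to fill the remainder of the requested array.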
    • block: cleanup the !zoned case in blk_revalidate_disk_zones · c98c3d09
      Christoph Hellwig committed
      blk_revalidate_disk_zones is never called for non-zoned devices.  Just
      return early and warn instead of trying to handle this case.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      c98c3d09
    • block: Enhance blk_revalidate_disk_zones() · d9dd7308
      Damien Le Moal committed
      For ZBC and ZAC zoned devices, the scsi driver revalidation processing
      implemented by sd_revalidate_disk() includes a call to
      sd_zbc_read_zones() which executes a full disk zone report used to
      check that all zones of the disk are the same size. This processing is
      followed by a call to blk_revalidate_disk_zones(), used to initialize
      the device request queue zone bitmaps (zone type and zone write lock
      bitmaps). To do so, blk_revalidate_disk_zones() also executes a full
      device zone report to obtain zone types. As a result, the entire
      zoned block device revalidation process includes two full device zone
      reports.
      
      By moving the zone size checks into blk_revalidate_disk_zones(), this
      process can be optimized to a single full device zone report, leading to
      shorter device scan and revalidation times. This patch implements this
      optimization, reducing the original full device zone report implemented
      in sd_zbc_check_zones() to a single, small, report zones command
      execution to obtain the size of the first zone of the device. Checks
      whether all zones of the device are the same size as the first zone
      size are moved to the generic blk_check_zone() function called from
      blk_revalidate_disk_zones().
      
      This optimization also has the following benefits:
      1) Fewer memory allocations in the scsi layer during disk revalidation,
         as the potentially large buffer for zone report execution is not
         needed.
      2) Zone checks are implemented in a generic manner, reducing the burden
         on device drivers, which only need to obtain the zone size and check
         that this size is a power of 2 number of LBAs. Any new type of zoned
         block device will benefit from this.
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      d9dd7308
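The split of responsibilities described in this commit (the driver checks that the zone size is a power of 2, the generic code checks that every zone matches the first one) can be illustrated with a small Python model. The function names are hypothetical, and allowing a smaller last zone is an assumption of this sketch rather than something stated in the commit message.

```python
from typing import List, Tuple

Zone = Tuple[int, int]  # (start sector, length in sectors)


def zone_size_is_power_of_two(zone_size: int) -> bool:
    """Driver-side check on the size of the first zone."""
    return zone_size > 0 and (zone_size & (zone_size - 1)) == 0


def zones_are_uniform(zones: List[Zone], capacity: int) -> bool:
    """Generic check done while walking a single full zone report."""
    if not zones:
        return False
    first_len = zones[0][1]
    expected_start = 0
    for index, (start, length) in enumerate(zones):
        if start != expected_start:
            return False  # gap or overlap between zones
        is_last = index == len(zones) - 1
        if length != first_len:
            # Assumption: only the last zone may be smaller than the others.
            if not (is_last and 0 < length < first_len):
                return False
        expected_start = start + length
    return expected_start == capacity
```

Both checks run over one pass of the zone list, which is the point of the optimization: a single full report serves size validation and bitmap setup alike.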
    • Merge branch 'for-5.5/drivers-post' into for-5.5/zoned · 0788c4ed
      Jens Axboe committed
      * for-5.5/drivers-post:
        scsi: sd_zbc: add zone open, close, and finish support
        scsi: core: Handle drivers which set sg_tablesize to zero
        scsi: qla2xxx: fix NPIV tear down process
        scsi: sd_zbc: Fix sd_zbc_complete()
        scsi: qla2xxx: stop timer in shutdown path
        scsi: sd: define variable dif as unsigned int instead of bool
        scsi: target: cxgbit: Fix cxgbit_fw4_ack()
        scsi: qla2xxx: Fix partial flash write of MBI
        scsi: qla2xxx: Initialized mailbox to prevent driver load failure
        scsi: lpfc: Honor module parameter lpfc_use_adisc
        scsi: ufs-bsg: Wake the device before sending raw upiu commands
        scsi: lpfc: Check queue pointer before use
        scsi: qla2xxx: fixup incorrect usage of host_byte
      0788c4ed
    • Merge branch 'for-5.5/drivers' into for-5.5/zoned · d29510d3
      Jens Axboe committed
      * for-5.5/drivers: (38 commits)
        null_blk: add zone open, close, and finish support
        dm: add zone open, close and finish support
        nvme: Fix parsing of ANA log page
        nvmet: stop using bio_set_op_attrs
        nvmet: add plugging for read/write when ns is bdev
        nvmet: clean up command parsing a bit
        nvme-pci: Spelling s/resdicovered/rediscovered/
        nvmet: fill discovery controller sn, fr and mn correctly
        nvmet: Open code nvmet_req_execute()
        nvmet: Remove the data_len field from the nvmet_req struct
        nvmet: Introduce nvmet_dsm_len() helper
        nvmet: Cleanup discovery execute handlers
        nvmet: Introduce common execute function for get_log_page and identify
        nvmet-tcp: Don't set the request's data_len
        nvmet-tcp: Don't check data_len in nvmet_tcp_map_data()
        nvme: Introduce nvme_lba_to_sect()
        nvme: Cleanup and rename nvme_block_nr()
        nvme: resync include/linux/nvme.h with nvmecli
        nvme: move common call to nvme_cleanup_cmd to core layer
        nvme: introduce "Command Aborted By host" status code
        ...
      d29510d3
  2. 08 Nov, 2019 9 commits
    • block: split bio if the only bvec's length is > SZ_4K · 6952a7f8
      Ming Lei committed
      A 64K PAGE_SIZE is common on ARM64 and other architectures, and a 64K
      segment is probably large enough to break some devices, so change the
      logic to split the bio if the only bvec's length is > SZ_4K instead of
      PAGE_SIZE.
      
      Fixes: fa532287 (block: avoid blk_bio_segment_split for small I/O operations)
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      6952a7f8
    • block: still try to split bio if the bvec crosses pages · 59db8ba2
      Ming Lei committed
      Some devices may set the segment boundary to PAGE_SIZE - 1. If the bvec
      crosses pages while its length is <= PAGE_SIZE, we still need to split
      the bvec into 2 segments.
      
      Fix this issue by still splitting the bio if the single bvec crosses
      pages.
      Reported-by: kernel test robot <lkp@intel.com>
      Fixes: fa532287 (block: avoid blk_bio_segment_split for small I/O operations)
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      59db8ba2
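Taken together, this commit and the SZ_4K commit listed before it describe two conditions under which a bio with a single bvec should still go through the split path: the bvec is longer than 4K, or it crosses a page boundary. The Python sketch below models that decision. The names `crosses_page` and `may_skip_split` are hypothetical, and folding both conditions into one predicate is an assumption of this example, not a transcription of the patches.

```python
SZ_4K = 4096


def crosses_page(offset: int, length: int, page_size: int) -> bool:
    """True if the byte range [offset, offset + length) spans two pages."""
    first_page = offset // page_size
    last_page = (offset + length - 1) // page_size
    return first_page != last_page


def may_skip_split(offset: int, length: int, page_size: int) -> bool:
    """Model of the fast path for a bio whose only bvec is small.

    Returns True when the split logic can be skipped entirely.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    if length > SZ_4K:
        return False  # too long, even if it fits in one 64K page
    if crosses_page(offset, length, page_size):
        return False  # must become two segments on some devices
    return True
```

With a 64K page size, an 8K bvec at offset 0 fits in one page yet is still sent through the split path, which is the case the SZ_4K commit targets.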
    • blk-cgroup: separate out blkg_rwstat under CONFIG_BLK_CGROUP_RWSTAT · 1d156646
      Tejun Heo committed
      blkg_rwstat is now only used by bfq-iosched and blk-throtl when on
      cgroup1.  Let's move it into its own files and gate it behind a config
      option.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      1d156646
    • blk-cgroup: reimplement basic IO stats using cgroup rstat · f7331648
      Tejun Heo committed
      blk-cgroup has been using blkg_rwstat to track basic IO stats.
      Unfortunately, reading recursive stats scales badly as it involves
      walking all descendants.  On systems with a huge number of cgroups
      (dead or alive), this can lead to substantial CPU cost when reading IO
      stats.
      
      This patch reimplements basic IO stats using cgroup rstat which uses
      more memory but makes recursive stat reading O(# descendants which
      have been active since last reading) instead of O(# descendants).
      
      * blk-cgroup core no longer uses sync/async stats.  Introduce new stat
        enums - BLKG_IOSTAT_{READ|WRITE|DISCARD}.
      
      * Add blkg_iostat[_set] which encapsulates byte and io stats, last
        values for propagation delta calculation and u64_stats_sync for
        correctness on 32bit archs.
      
      * Update the new percpu stat counters directly and implement
        blkcg_rstat_flush() to implement propagation.
      
      * blkg_print_stat() can now bring the stats up to date by calling
        cgroup_rstat_flush() and print them instead of directly summing up
        all descendants.
      
      * It now allocates 96 bytes per cpu.  It used to be 40 bytes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Dan Schatzberg <dschatzberg@fb.com>
      Cc: Daniel Xu <dlxu@fb.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      f7331648
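The scaling argument in this commit (a recursive read touches only descendants that were active since the last read) can be illustrated with a toy model. The Python sketch below is not the cgroup rstat implementation: the `Group` class, its fields, and the single byte counter are hypothetical simplifications. It keeps a current value and a last-propagated value per group, and a list of updated children so that a flush never visits idle groups.

```python
from typing import List, Optional


class Group:
    """Toy cgroup node with one counter (hypothetical model)."""

    def __init__(self, parent: Optional["Group"] = None) -> None:
        self.parent = parent
        self.cur = 0    # own bytes plus bytes already propagated from children
        self.last = 0   # value of cur most recently pushed to the parent
        self.updated_children: List["Group"] = []
        self.on_updated_list = False

    def account(self, nbytes: int) -> None:
        """Record IO and mark the path to the root as needing a flush."""
        self.cur += nbytes
        node = self
        while node.parent is not None and not node.on_updated_list:
            node.on_updated_list = True
            node.parent.updated_children.append(node)
            node = node.parent


def flush(group: Group) -> int:
    """Propagate deltas into group; return the number of nodes visited."""
    visited = 1
    pending, group.updated_children = group.updated_children, []
    for child in pending:
        child.on_updated_list = False
        visited += flush(child)
        # Only the change since the previous propagation moves upward.
        group.cur += child.cur - child.last
        child.last = child.cur
    return visited


def read_recursive(group: Group) -> int:
    """Bring the stats up to date, then return the recursive total."""
    flush(group)
    return group.cur
```

In this model a root with a hundred idle children and two active ones visits only the active branches on each read, at the price of storing both a current and a last value per group, which matches the memory-for-speed trade described above.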
    • blk-cgroup: remove now unused blkg_print_stat_{bytes|ios}_recursive() · 8a80d5d6
      Tejun Heo committed
      These don't have users anymore.  Remove them.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      8a80d5d6
    • blk-throtl: stop using blkg->stat_bytes and ->stat_ios · 7ca46438
      Tejun Heo committed
      When used on cgroup1, blk-throtl uses the blkg->stat_bytes and
      ->stat_ios from blk-cgroup core to populate four stat knobs.
      blk-cgroup core is moving away from blkg_rwstat to improve scalability
      and won't be able to support this usage.
      
      It isn't like the sharing gains all that much.  Let's break them out
      to dedicated rwstat counters which are updated when on cgroup1.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      7ca46438
    • bfq-iosched: stop using blkg->stat_bytes and ->stat_ios · fd41e603
      Tejun Heo committed
      When used on cgroup1, bfq uses the blkg->stat_bytes and ->stat_ios
      from blk-cgroup core to populate six stat knobs.  blk-cgroup core is
      moving away from blkg_rwstat to improve scalability and won't be able
      to support this usage.
      
      It isn't like the sharing gains all that much.  Let's break it out to
      dedicated rwstat counters which are updated when on cgroup1.  This
      makes use of bfqg_*rwstat*() helpers outside of
      CONFIG_BFQ_CGROUP_DEBUG.  Move them out.
      
      v2: Compile fix when !CONFIG_BFQ_CGROUP_DEBUG.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Paolo Valente <paolo.valente@linaro.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      fd41e603
    • bfq-iosched: relocate bfqg_*rwstat*() helpers · a557f1c7
      Tejun Heo committed
      Collect them right under #ifdef CONFIG_BFQ_CGROUP_DEBUG.  The next
      patch will use them from !DEBUG path and this makes it easy to move
      them out of the ifdef block.
      
      This is pure code reorganization.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      a557f1c7
    • Merge branch 'for-linus' into for-5.5/block · 912c0a85
      Jens Axboe committed
      Pull on for-linus to resolve what otherwise would have been a conflict
      with the cgroups rstat patchset from Tejun.
      
      * for-linus: (942 commits)
        blkcg: make blkcg_print_stat() print stats only for online blkgs
        nvme: change nvme_passthru_cmd64 to explicitly mark rsvd
        nvme-multipath: fix crash in nvme_mpath_clear_ctrl_paths
        nvme-rdma: fix a segmentation fault during module unload
        iocost: don't nest spin_lock_irq in ioc_weight_write()
        io_uring: ensure we clear io_kiocb->result before each issue
        um-ubd: Entrust re-queue to the upper layers
        nvme-multipath: remove unused groups_only mode in ana log
        nvme-multipath: fix possible io hang after ctrl reconnect
        io_uring: don't touch ctx in setup after ring fd install
        io_uring: Fix leaked shadow_req
        Linux 5.4-rc5
        riscv: cleanup do_trap_break
        nbd: verify socket is supported during setup
        ata: libahci_platform: Fix regulator_get_optional() misuse
        nbd: handle racing with error'ed out commands
        nbd: protect cmd->status with cmd->lock
        io_uring: fix bad inflight accounting for SETUP_IOPOLL|SETUP_SQTHREAD
        io_uring: used cached copies of sq->dropped and cq->overflow
        ARM: dts: stm32: relax qspi pins slew-rate for stm32mp157
        ...
      912c0a85
  3. 07 Nov, 2019 11 commits
  4. 06 Nov, 2019 6 commits
  5. 05 Nov, 2019 8 commits