1. 03 Aug, 2020 4 commits
  2. 30 Jul, 2020 1 commit
    • Merge branch 'nvme-5.9' of git://git.infradead.org/nvme into for-5.9/drivers · a9e8e18a
      Committed by Jens Axboe
      Pull NVMe updates from Christoph.
      
      * 'nvme-5.9' of git://git.infradead.org/nvme: (30 commits)
        nvme-loop: remove extra variable in create ctrl
        nvme-loop: set ctrl state connecting after init
        nvme-multipath: do not fall back to __nvme_find_path() for non-optimized paths
        nvme-multipath: fix logic for non-optimized paths
        nvme-rdma: fix controller reset hang during traffic
        nvme-tcp: fix controller reset hang during traffic
        nvmet: introduce the passthru Kconfig option
        nvmet: introduce the passthru configfs interface
        nvmet: Add passthru enable/disable helpers
        nvmet: add passthru code to process commands
        nvme: export nvme_find_get_ns() and nvme_put_ns()
        nvme: introduce nvme_ctrl_get_by_path()
        nvme: introduce nvme_execute_passthru_rq to call nvme_passthru_[start|end]()
        nvme: create helper function to obtain command effects
        nvme: clear any SGL flags in passthru commands
        nvmet-fc: remove redundant del_work_active flag
        nvmet-fc: check successful reference in nvmet_fc_find_target_assoc
        nvme-fc: set max_segments to lldd max value
        nvme-fc: drop a duplicated word in a comment
        nvme-hwmon: log the controller device name
        ...
  3. 29 Jul, 2020 30 commits
  4. 28 Jul, 2020 1 commit
  5. 25 Jul, 2020 4 commits
    • bcache: fix bio_{start,end}_io_acct with proper device · a2f32ee8
      Committed by Coly Li
      Commit 85750aeb ("bcache: use bio_{start,end}_io_acct") moved the
      I/O accounting code to a location after bio_set_dev(bio, dc->bdev) in
      cached_dev_make_request(). As a result, the accounting is incorrectly
      performed on the backing device, whereas the I/O should be counted
      against the bcache device, such as /dev/bcache0.

      With the mistaken I/O accounting, iostat does not display I/O counts
      for the bcache device, and all the numbers go to the backing device. In
      writeback mode, the hard drive may appear to have 340K+ IOPS, which is
      impossible for a spinning disk.

      This patch introduces bch_bio_start_io_acct() and bch_bio_end_io_acct(),
      which switch bio->bi_disk to the bcache device before calling
      bio_start_io_acct() or bio_end_io_acct(). Now the I/Os are counted
      against the bcache device, and the bcache device, cache device and
      backing device all report correct I/O counts again.
      
      Fixes: 85750aeb ("bcache: use bio_{start,end}_io_acct")
      Signed-off-by: Coly Li <colyli@suse.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: stable@vger.kernel.org
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
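
The accounting fix described in this commit can be illustrated with a small userspace model. This is only a sketch, not the kernel code: `disk_model`, `bio_model`, `model_start_io_acct()` and `model_bch_start_io_acct()` are hypothetical stand-ins, and restoring the bio's original target after accounting is an assumption of this sketch rather than something the commit message states.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace models, not the kernel structures. */
struct disk_model {
	const char *name;
	unsigned long ios;		/* I/Os accounted to this disk */
};

struct bio_model {
	struct disk_model *bi_disk;	/* disk the bio currently targets */
};

/* Stand-in for bio_start_io_acct(): charges whichever disk the bio points at. */
static void model_start_io_acct(struct bio_model *bio)
{
	assert(bio->bi_disk != NULL);
	bio->bi_disk->ios++;
}

/*
 * Wrapper in the spirit of bch_bio_start_io_acct(): point the bio at the
 * bcache disk before accounting, so the I/O is charged there. Restoring the
 * previous target afterwards is an assumption made for this sketch.
 */
static void model_bch_start_io_acct(struct disk_model *bcache_disk,
				    struct bio_model *bio)
{
	struct disk_model *saved = bio->bi_disk;

	bio->bi_disk = bcache_disk;
	model_start_io_acct(bio);
	bio->bi_disk = saved;
}
```

If a bio that already targets the backing device is passed through `model_bch_start_io_acct()`, the counter of the bcache disk increases and the counter of the backing disk stays untouched, which is the behaviour the commit message asks for.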
    • bcache: avoid extra memory consumption in struct bbio for large bucket size · 4e4d4e09
      Committed by Coly Li
      Bcache uses struct bbio to do I/O for metadata pages such as uuids,
      disk_buckets, prio_buckets, and btree nodes.

      For example, writing a btree node onto the cache device proceeds as
      follows:
      - Allocate a struct bbio from mempool c->bio_meta.
      - A struct bio is embedded inside struct bbio; initialize
        bi_inline_vecs for this embedded bio.
      - Call bch_bio_map() to map each metadata page to a bvec in the
        inline bi_io_vec table.
      - Call bch_submit_bbio() to submit the bio to the underlying block
        layer.
      - When the I/O completes, release only the struct bbio; do not touch
        the reference counters of the metadata pages.

      struct bbio is defined as:
      struct bbio {
              unsigned int            submit_time_us;
              [snipped]
              struct bio              bio;
      };
      
      Because struct bio is embedded at the end of struct bbio, the actual
      size of an allocated struct bbio is sizeof(struct bbio) plus the size
      of the embedded bio's bi_inline_vecs array.

      All metadata buckets are now limited to meta_bucket_pages() in size;
      if the bucket size is larger than meta_bucket_pages() * PAGE_SECTORS,
      the remaining space in the bucket is unused. Therefore the most space
      ever used in a metadata bucket is (1<<MAX_ORDER) pages, or
      (1<<CONFIG_FORCE_MAX_ZONEORDER) pages if that option is configured.
      
      Therefore, for large bucket sizes, it is unnecessary to calculate the
      allocation size of mempool c->bio_meta as
      	mempool_init_kmalloc_pool(&c->bio_meta, 2,
      			sizeof(struct bbio) +
      			sizeof(struct bio_vec) * bucket_pages(c))
      That size is too large: either the Linux buddy allocator cannot
      allocate that many contiguous pages, or the extra allocated pages are
      wasted.
      
      This patch replaces bucket_pages() with meta_bucket_pages() in two
      places:
      - In bch_cache_set_alloc(), when initializing mempool c->bio_meta, use
        sizeof(struct bbio) + sizeof(struct bio_vec) * meta_bucket_pages()
        as the allocation object size.
      - In bch_bbio_alloc(), when calling bio_init() to set up the inline
        bvec table bi_inline_vecs, use meta_bucket_pages() as the number of
        inline bio vecs.

      Now the maximum size of the embedded bio inside struct bbio exactly
      matches the limit of meta_bucket_pages(), and no extra pages are wasted.
      Signed-off-by: Coly Li <colyli@suse.de>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
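
To get a feeling for the sizes discussed in this commit, the mempool object size formula can be evaluated in a small standalone sketch. Everything numeric here is an assumption made for illustration: a 16-byte struct bio_vec (typical on 64-bit), a placeholder of 128 bytes for sizeof(struct bbio), 4 KiB pages, and a metadata limit of 2048 pages (that is, MAX_ORDER assumed to be 11). `bbio_alloc_bytes()` is a hypothetical helper, not a bcache function.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_BBIO_BYTES	128u	/* placeholder for sizeof(struct bbio) */
#define MODEL_BIO_VEC_BYTES	16u	/* assumed sizeof(struct bio_vec) on 64-bit */

/*
 * Object size of the c->bio_meta mempool as described above:
 * the struct itself plus one inline bio_vec per page.
 */
static size_t bbio_alloc_bytes(size_t nr_pages)
{
	assert(nr_pages > 0);
	return MODEL_BBIO_BYTES + (size_t)MODEL_BIO_VEC_BYTES * nr_pages;
}
```

Under these assumptions, a 2 GiB bucket (524288 pages) gives `bbio_alloc_bytes(524288)` = 8388736 bytes, roughly 8 MiB per object, while the 2048-page metadata limit gives `bbio_alloc_bytes(2048)` = 32896 bytes, roughly 32 KiB.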
    • bcache: avoid extra memory allocation from mempool c->fill_iter · 6907dc49
      Committed by Coly Li
      Mempool c->fill_iter is used to allocate memory for struct btree_iter
      in bch_btree_node_read_done(), to iterate over all keys of a btree node
      that has just been read in.

      The allocation size is set in bch_cache_set_alloc() by
        mempool_init_kmalloc_pool(&c->fill_iter, 1, iter_size)
      where iter_size is calculated as
        (sb->bucket_size / sb->block_size + 1) * sizeof(struct btree_iter_set)

      For a 16-bit bucket_size the calculation is fine, but now that the
      bucket size has been extended to 32 bits, a bucket can be as large as
      2GB. With the above calculation, iter_size can then reach 2048 pages
      (order 11, which the buddy allocator still accepts).

      However, the space that actually holds bkeys in a metadata bucket is
      already limited to meta_bucket_pages(), which is 16MB. If
      sb->bucket_size in the above calculation is replaced by
      meta_bucket_pages() * PAGE_SECTORS, the result is 16 pages, which is
      large enough for the mempool allocation of struct btree_iter.

      Therefore, in the worst case, every allocation from mempool
      c->fill_iter wastes up to 4080 pages that will never be used. This
      patch uses meta_bucket_pages() * PAGE_SECTORS to calculate the iter
      size in bch_cache_set_alloc(), to avoid the extra memory allocation
      from mempool c->fill_iter.
      Signed-off-by: Coly Li <colyli@suse.de>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
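
The iter_size arithmetic in this commit can be reproduced with a short sketch. The inputs are assumptions chosen to match the figures in the message, not values stated in it: sizes are given in 512-byte sectors, the block size is taken as 4 KiB (8 sectors), and sizeof(struct btree_iter_set) is taken as 16 bytes. `iter_size_bytes()` is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_ITER_SET_BYTES	16u	/* assumed sizeof(struct btree_iter_set) */

/*
 * iter_size as in the formula above:
 * (bucket_size / block_size + 1) * sizeof(struct btree_iter_set),
 * with both sizes given in 512-byte sectors.
 */
static uint64_t iter_size_bytes(uint64_t bucket_sectors, uint64_t block_sectors)
{
	assert(block_sectors > 0);
	return (bucket_sectors / block_sectors + 1) * MODEL_ITER_SET_BYTES;
}
```

With these assumptions, a 2 GiB bucket (4194304 sectors) gives `iter_size_bytes(4194304, 8)` = 8388624 bytes, which is 2048 pages of 4 KiB plus 16 bytes. A 16 MiB limit (32768 sectors) gives `iter_size_bytes(32768, 8)` = 65552 bytes, which is 16 pages plus 16 bytes.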
    • bcache: add sysfs file to display feature sets information of cache set · 092bd54d
      Committed by Coly Li
      This patch adds the following three sysfs files to display the
      corresponding feature set information of a cache set:
      	/sys/fs/bcache/<cache set UUID>/internal/feature_compat
      	/sys/fs/bcache/<cache set UUID>/internal/feature_ro_compat
      	/sys/fs/bcache/<cache set UUID>/internal/feature_incompat

      Currently bcache has only one incompat feature, 'large_bucket', so the
      sysfs file content is:
              [large_bucket]
      The string large_bucket means that the running bcache driver supports
      the incompat feature 'large_bucket'; the surrounding [] means that the
      feature is currently enabled on this cache set.

      This patch is also ready to display compat and ro_compat features:
      once the bcache code implements such feature sets, the corresponding
      feature strings will be displayed in their sysfs files as well.
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
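
The display convention described in this commit (a feature name for a supported feature, wrapped in [] when it is enabled on the cache set) can be sketched as below. `format_feature()` is a hypothetical userspace helper written for illustration; it is not the bcache sysfs code, and the unbracketed form for a supported but disabled feature is inferred from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Write one feature name into buf. Enabled features are wrapped in
 * brackets; supported but disabled ones are written as the bare name.
 * Returns the number of characters snprintf() would have written.
 */
static int format_feature(char *buf, size_t size, const char *name, int enabled)
{
	assert(buf != NULL && size > 0 && name != NULL);
	return snprintf(buf, size, enabled ? "[%s]" : "%s", name);
}
```

Calling `format_feature(buf, sizeof(buf), "large_bucket", 1)` fills `buf` with `[large_bucket]`, the same form as the sysfs content quoted in the commit message.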