1. 13 Nov, 2017 4 commits
    • rbd: default to single-major device number scheme · 3cfa3b16
      Committed by Ilya Dryomov
      It's been 3.5 years, let's turn it on by default.  Support in rbd(8)
      utility goes back to pre-firefly, "rbd map" has been loading the module
      with single_major=Y ever since.  However, if the module is already
      loaded (whether by hand or at boot time), we end up with single_major=N.
      Also, some people don't install rbd(8) and use the sysfs interface
      directly.
      
      (With single-major=N, a major number is consumed for every mapping,
      imposing a limit of ~240 rbd images per host.  single-major=Y allows
      mapping thousands of rbd images on a single machine.)
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      Reviewed-by: Jason Dillaman <dillaman@redhat.com>
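The limits in the parenthetical above follow from how minor numbers are handed out under one shared major. A minimal userspace sketch of that arithmetic, assuming a 20-bit minor space and 16 minors reserved per mapping (the whole device plus its partitions). The constants and helper names are illustrative and are not copied from the driver:

```c
#include <assert.h>

/* Assumptions for illustration: 20-bit minor space, 16 minors per mapping. */
#define SKETCH_MINOR_BITS 20
#define SKETCH_PART_SHIFT 4

/* First minor number belonging to the mapping with the given device id. */
static int sketch_dev_id_to_minor(int dev_id)
{
    assert(dev_id >= 0);
    return dev_id << SKETCH_PART_SHIFT;
}

/* Inverse: which mapping a minor (whole device or partition) belongs to. */
static int sketch_minor_to_dev_id(int minor)
{
    assert(minor >= 0);
    return minor >> SKETCH_PART_SHIFT;
}

/* Number of mappings that fit under a single shared major. */
static int sketch_max_mappings(void)
{
    return 1 << (SKETCH_MINOR_BITS - SKETCH_PART_SHIFT);
}
```

Under these assumptions the shared major has room for far more mappings than the one-major-per-mapping scheme, which runs out once the pool of free major numbers is exhausted.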
    • rbd: set discard_alignment to zero · 7c084289
      Committed by David Disseldorp
      RBD devices are currently incorrectly initialised with the block queue
      discard_alignment set to the underlying RADOS object size.
      
      As per Documentation/ABI/testing/sysfs-block:
        The discard_alignment parameter indicates how many bytes the beginning
        of the device is offset from the internal allocation unit's natural
        alignment.
      
      Correcting the discard_alignment parameter from the RADOS object size to
      zero (the blk_set_default_limits() default) has no effect on how discard
      requests are propagated through the block layer - @alignment in
      __blkdev_issue_discard() remains zero. However, it does fix the UNMAP
      granularity alignment value advertised to SCSI initiators via the Block
      Limits VPD.
      Signed-off-by: David Disseldorp <ddiss@suse.de>
      Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
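One way to see why the change is invisible to the block layer: if the alignment is reduced modulo the granularity before it is used (an assumption of this sketch, consistent with the commit's remark that @alignment stays zero), then an alignment equal to the object size and an alignment of zero behave identically. The helper below is hypothetical and models the arithmetic in userspace:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes from position pos to the next allocation-unit boundary, for a
 * device whose start is offset by `alignment` bytes from the unit's
 * natural alignment. Hypothetical helper, not a kernel function.
 */
static uint64_t sketch_bytes_to_boundary(uint64_t pos, uint64_t granularity,
                                         uint64_t alignment)
{
    assert(granularity > 0);
    return (granularity + alignment % granularity - pos % granularity)
           % granularity;
}
```

With a 4 MiB granularity, passing an alignment of 0 or of 4 MiB gives the same result for every position; only a value that is not a multiple of the granularity shifts the boundaries.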
    • rbd: get rid of rbd_mapping::read_only · 9568c93e
      Committed by Ilya Dryomov
      It is redundant -- rw/ro state is stored in hd_struct and managed by
      the block layer.
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    • rbd: fix and simplify rbd_ioctl_set_ro() · 1de797bb
      Committed by Ilya Dryomov
      ->open_count/-EBUSY check is bogus and wrong: when an open device is
      set read-only, blkdev_write_iter() refuses further writes with -EPERM.
      This is standard behaviour and all other block devices allow this.
      
      set_disk_ro() call is also problematic: we affect the entire device
      when called on a single partition.
      
      All rbd_ioctl_set_ro() needs to do is refuse ro -> rw transition for
      mapped snapshots.  Everything else can be handled by generic code.
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
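The rule described in the last paragraph of this commit message can be modeled in a few lines. This is a userspace sketch, not the driver's code: the struct is a stand-in, and returning -ENOTTY to mean "not handled here, let generic code take over" is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for the mapping state; not the driver's struct. */
struct sketch_mapping {
    bool is_snapshot;   /* mapped from a snapshot rather than the image head */
};

/*
 * Returns -EROFS when a mapped snapshot is asked to become read-write.
 * Returns -ENOTTY in every other case, assumed here to signal that the
 * request should be handled by generic block layer code.
 */
static int sketch_ioctl_set_ro(const struct sketch_mapping *m, bool want_ro)
{
    assert(m != NULL);
    if (m->is_snapshot && !want_ro)
        return -EROFS;
    return -ENOTTY;
}
```

Note that the sketch has no open-count check and never touches whole-device state, mirroring the two things the commit removes.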
  2. 09 Nov, 2017 1 commit
  3. 07 Sep, 2017 1 commit
  4. 19 Jun, 2017 1 commit
    • rbd: use bio_clone_fast() instead of bio_clone() · f856dc36
      Committed by NeilBrown
      bio_clone() makes a copy of the bi_io_vec, but rbd never changes that,
      so there is no need for a copy.
      bio_clone_fast() can be used instead, which avoids making the copy.
      
      This requires that we provide a bio_set.  bio_clone() uses fs_bio_set,
      but it isn't, in general, safe to use the same bio_set at different
      levels of the stack, as that can lead to deadlocks.  As filesystems
      use fs_bio_set, block devices shouldn't.
      
      As rbd never stacks, it is safe to have a single global bio_set for
      all rbd devices to use.  So allocate that when the module is
      initialised, and use it with bio_clone_fast().
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  5. 09 Jun, 2017 2 commits
    • blk-mq: switch ->queue_rq return value to blk_status_t · fc17b653
      Committed by Christoph Hellwig
      Use the same values that are used for request completion errors as the
      return value from ->queue_rq.  BLK_STS_RESOURCE is special cased to cause
      a requeue, and all the others are completed as-is.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
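The dispatch rule in this commit message can be sketched as a small switch. This is a userspace model with hypothetical names and numeric values; treating the OK status as "accepted by the driver, nothing more to do" is an assumption of the sketch.

```c
#include <assert.h>

/* Hypothetical status type and values; not the kernel's definitions. */
typedef unsigned char sketch_status_t;
#define SKETCH_STS_OK       0
#define SKETCH_STS_RESOURCE 1
#define SKETCH_STS_IOERR    2

enum sketch_action {
    SKETCH_ACCEPTED,    /* driver took the request */
    SKETCH_REQUEUE,     /* put the request back and retry later */
    SKETCH_COMPLETE     /* finish the request with the returned status */
};

/* Decide what the caller of ->queue_rq does with the returned status. */
static enum sketch_action sketch_handle_queue_rq(sketch_status_t ret)
{
    switch (ret) {
    case SKETCH_STS_OK:
        return SKETCH_ACCEPTED;
    case SKETCH_STS_RESOURCE:
        return SKETCH_REQUEUE;
    default:
        return SKETCH_COMPLETE;
    }
}
```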
    • block: introduce new block status code type · 2a842aca
      Committed by Christoph Hellwig
      Currently we use normal Linux errno values in the block layer, and while
      we accept any error a few have overloaded magic meanings.  This patch
      instead introduces a new blk_status_t value that holds block layer specific
      status codes and explicitly explains their meaning.  Helpers to convert from
      and to the previous special meanings are provided for now, but I suspect
      we want to get rid of them in the long run - those drivers that have an
      errno input (e.g. networking) usually get errnos that don't know about
      the special block layer overloads, and similarly returning them to userspace
      will usually return something that strictly speaking isn't correct
      for file system operations, but that's left as an exercise for later.
      
      For now the set of errors is a very limited set that closely corresponds
      to the previous overloaded errno values, but there is some low hanging
      fruit to improve it.
      
      blk_status_t (ab)uses the sparse __bitwise annotations to allow for sparse
      typechecking, so that we can easily catch places passing the wrong values.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
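The conversion helpers mentioned in this commit message amount to a lookup table between a small status type and errno values. The sketch below is a userspace model: the type, the constant names, and the choice of which errnos to include are all illustrative, and anything without an entry falls back to a generic I/O error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical status type and codes; not the kernel's definitions. */
typedef unsigned char sketch_status_t;
enum {
    SKETCH_STS_OK = 0,
    SKETCH_STS_NOTSUPP,
    SKETCH_STS_TIMEOUT,
    SKETCH_STS_NOSPC,
    SKETCH_STS_IOERR
};

static const struct {
    int err;
    sketch_status_t status;
} sketch_map[] = {
    { -EOPNOTSUPP, SKETCH_STS_NOTSUPP },
    { -ETIMEDOUT,  SKETCH_STS_TIMEOUT },
    { -ENOSPC,     SKETCH_STS_NOSPC },
    { -EIO,        SKETCH_STS_IOERR },
};

#define SKETCH_MAP_LEN (sizeof(sketch_map) / sizeof(sketch_map[0]))

/* Map a negative errno (or 0) to a status; unknown errnos become IOERR. */
static sketch_status_t sketch_errno_to_status(int err)
{
    size_t i;

    assert(err <= 0);
    if (err == 0)
        return SKETCH_STS_OK;
    for (i = 0; i < SKETCH_MAP_LEN; i++)
        if (sketch_map[i].err == err)
            return sketch_map[i].status;
    return SKETCH_STS_IOERR;
}

/* Map a status back to a negative errno; unknown statuses become -EIO. */
static int sketch_status_to_errno(sketch_status_t status)
{
    size_t i;

    if (status == SKETCH_STS_OK)
        return 0;
    for (i = 0; i < SKETCH_MAP_LEN; i++)
        if (sketch_map[i].status == status)
            return sketch_map[i].err;
    return -EIO;
}
```

The round trip is lossy by design: an errno with no table entry comes back as -EIO, which illustrates why the message calls the helpers a stopgap.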
  6. 29 May, 2017 1 commit
    • rbd: implement REQ_OP_WRITE_ZEROES · 6ac56951
      Committed by Ilya Dryomov
      Commit 93c1defe ("rbd: remove the discard_zeroes_data flag")
      explicitly didn't implement REQ_OP_WRITE_ZEROES for rbd, while the
      following commit 48920ff2 ("block: remove the discard_zeroes_data
      flag") dropped ->discard_zeroes_data in favor of REQ_OP_WRITE_ZEROES.
      
      rbd does support efficient zeroing via CEPH_OSD_OP_ZERO opcode and will
      release either some or all blocks depending on whether the zeroing
      request is rbd_obj_bytes() aligned.  This is how we currently implement
      discards, so REQ_OP_WRITE_ZEROES can be identical to REQ_OP_DISCARD for
      now.  Caveats:
      
      - REQ_NOUNMAP is ignored, but AFAICT that's true of at least two other
        current implementations - nvme and loop
      
      - there is no ->write_zeroes_alignment and blk_bio_write_zeroes_split()
        is hence less helpful than blk_bio_discard_split(), but this can (and
        should) be fixed on the rbd side
      
      In the future we will split these into two code paths to respect
      REQ_NOUNMAP on zeroout and save on zeroing blocks that couldn't be
      released on discard.
      
      Fixes: 93c1defe ("rbd: remove the discard_zeroes_data flag")
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      Reviewed-by: Jason Dillaman <dillaman@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
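The "some or all blocks" distinction in this commit message depends on how a zeroing range lines up with object boundaries. The sketch below only does that bookkeeping: it counts how many objects a byte range covers completely and reports whether the range starts or ends in the middle of an object. It is a userspace illustration with hypothetical names, it ignores integer overflow, and it does not reproduce the driver's opcode selection.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct sketch_split {
    uint64_t whole_objects;  /* objects covered from first to last byte */
    bool start_unaligned;    /* range begins inside an object */
    bool end_unaligned;      /* range ends inside an object */
};

/* Classify the byte range [off, off + len) against obj_bytes-sized objects. */
static struct sketch_split sketch_split_range(uint64_t off, uint64_t len,
                                              uint64_t obj_bytes)
{
    struct sketch_split s = { 0, false, false };
    uint64_t end, first_full, past_last_full;

    assert(obj_bytes > 0);
    if (len == 0)
        return s;

    end = off + len;
    /* Index of the first object that starts at or after off. */
    first_full = (off + obj_bytes - 1) / obj_bytes;
    /* Index one past the last object that ends at or before end. */
    past_last_full = end / obj_bytes;

    if (past_last_full > first_full)
        s.whole_objects = past_last_full - first_full;
    s.start_unaligned = (off % obj_bytes) != 0;
    s.end_unaligned = (end % obj_bytes) != 0;
    return s;
}
```

For example, with 4 MiB objects an aligned 8 MiB request covers two whole objects, while the same request shifted by 1 MiB covers only one whole object and leaves a partial piece at each end.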
  7. 09 May, 2017 1 commit
  8. 04 May, 2017 10 commits
  9. 02 May, 2017 1 commit
  10. 09 Apr, 2017 1 commit
  11. 31 Mar, 2017 1 commit
  12. 07 Mar, 2017 1 commit
  13. 25 Feb, 2017 1 commit
  14. 20 Feb, 2017 14 commits