1. 02 Jun, 2015 1 commit
    • writeback: move backing_dev_info->state into bdi_writeback · 4452226e
      Committed by Tejun Heo
      Currently, a bdi (backing_dev_info) embeds single wb (bdi_writeback)
      and the role of the separation is unclear.  For cgroup support for
      writeback IOs, a bdi will be updated to host multiple wb's where each
      wb serves writeback IOs of a different cgroup on the bdi.  To achieve
      that, a wb should carry all states necessary for servicing writeback
      IOs for a cgroup independently.
      
      This patch moves bdi->state into wb.
      
      * enum bdi_state is renamed to wb_state and the prefix of all enums is
        changed from BDI_ to WB_.
      
      * Explicit zeroing of bdi->state is removed without adding zeroing of
        wb->state, as the whole data structure is zeroed on init anyway.
      
      * As there's still only one bdi_writeback per backing_dev_info, all
        uses of bdi->state are mechanically replaced with bdi->wb.state,
        introducing no behavior changes (see the sketch below).
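      
      An editorial sketch in C of the shape of the change (flag names
      abridged; the exact WB_* set follows the kernel tree of the time):
      
          /* Before: per-device state lived on the bdi itself. */
          enum bdi_state {
                  BDI_async_congested,
                  BDI_sync_congested,
                  BDI_registered,
          };
          
          /* After: the same flags, renamed WB_* and carried by each wb. */
          enum wb_state {
                  WB_async_congested,
                  WB_sync_congested,
                  WB_registered,
          };
          
          struct bdi_writeback {
                  unsigned long state;            /* WB_* flags */
          };
          
          struct backing_dev_info {
                  struct bdi_writeback wb;        /* single embedded wb */
          };
          
          /* Call sites are converted mechanically, e.g.
           *   test_bit(BDI_registered, &bdi->state)
           * becomes
           *   test_bit(WB_registered, &bdi->wb.state)
           */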
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Cc: drbd-dev@lists.linbit.com
      Cc: Neil Brown <neilb@suse.de>
      Cc: Alasdair Kergon <agk@redhat.com>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
  2. 06 May, 2015 2 commits
  3. 02 May, 2015 1 commit
    • rbd: end I/O the entire obj_request on error · 082a75da
      Committed by Ilya Dryomov
      When we end the I/O struct request with an error, we need to pass
      obj_request->length as @nr_bytes so that the entire obj_request worth
      of bytes is completed.  Otherwise the block layer ends up confused and
      we trip on
      
          rbd_assert(more ^ (which == img_request->obj_request_count));
      
      in rbd_img_obj_callback() due to more being true no matter what.  We
      already do this in most cases, but we are missing some, in particular
      those where we don't even get a chance to submit any obj_requests, due
      to an early -ENOMEM for example.
      
      A number of obj_request->xferred assignments seem to be redundant but
      I haven't touched any of obj_request->xferred stuff to keep this small
      and isolated.
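      
      A hedged sketch of the fix's shape (simplified; img_request->rq names
      the struct request backing the image request in the driver of the
      time):
      
          /* Complete the whole obj_request worth of bytes on error, so
           * the block layer's remaining-byte accounting reaches zero and
           * `more` eventually goes false in rbd_img_obj_callback(). */
          blk_end_request(img_request->rq, result,
                          obj_request->length); /* not obj_request->xferred */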
      
      Cc: Alex Elder <elder@linaro.org>
      Cc: stable@vger.kernel.org # 3.10+
      Reported-by: Shawn Edwards <lesser.evil@gmail.com>
      Reviewed-by: Sage Weil <sage@redhat.com>
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
  4. 22 Apr, 2015 1 commit
  5. 20 Apr, 2015 2 commits
  6. 16 Apr, 2015 11 commits
  7. 15 Apr, 2015 1 commit
  8. 12 Apr, 2015 1 commit
  9. 11 Apr, 2015 1 commit
    • sd, mmc, virtio_blk, string_helpers: fix block size units · b9f28d86
      Committed by James Bottomley
      The current string_get_size() overflows when the device size goes over
      2^64 bytes because the string helper routine computes the suffix from
      the size in bytes.  However, the entirety of Linux thinks in terms of
      blocks, not bytes, so this will artificially induce an overflow on very
      large devices.  Fix this by making the function string_get_size() take
      blocks and the block size instead of bytes.  This should allow us to
      keep working until the current SCSI standard overflows.
      
      Also fix virtio_blk and mmc (both of which were also artificially
      multiplying by the block size to pass a byte size to string_get_size()).
      
      The mathematics of this is pretty simple:  we're taking a product of
      size in blocks (S) and block size (B) and trying to re-express this in
      exponential form: S*B = R*N^E (where N, the base, is either 1000 or
      1024) with R < N.  Mathematically, S = RS*N^ES and B = RB*N^EB, so if
      RS*RB < N it's easy to see that S*B = RS*RB*N^(ES+EB).  However, if
      RS*RB >= N, that product can be re-expressed as RS*RB = R*N (where
      R = RS*RB/N < N), so the whole expression becomes S*B = R*N^(ES+EB+1).
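      
      A runnable userspace model of that exponent-combining step (decimal
      base N = 1000; remainder digits are dropped here for brevity, whereas
      the real helper keeps them for rounding):
      
          #include <stdint.h>
          #include <stdio.h>
          
          static void print_size(uint64_t blocks, uint64_t blk_size)
          {
                  const uint64_t N = 1000;   /* 1024 for binary units */
                  unsigned int exp = 0;
          
                  /* Reduce each factor below N, pooling the exponent. */
                  while (blocks >= N)   { blocks /= N;   exp++; }
                  while (blk_size >= N) { blk_size /= N; exp++; }
          
                  /* Both factors < N, so the product < N*N: no overflow. */
                  uint64_t r = blocks * blk_size;
                  if (r >= N) {              /* the RS*RB >= N carry case */
                          r /= N;
                          exp++;
                  }
                  printf("~%llu * 10^%u bytes\n",
                         (unsigned long long)r, 3 * exp);
          }
          
          int main(void)
          {
                  /* 2^32 blocks of 4 KiB: the byte count itself is never
                   * formed, so sizes past 2^64 bytes cannot overflow. */
                  print_size(4294967296ULL, 4096);
                  return 0;
          }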
      
      [jejb: fix incorrect 32 bit do_div spotted by kbuild test robot <fengguang.wu@intel.com>]
      Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: James Bottomley <JBottomley@Odin.com>
  10. 08 Apr, 2015 4 commits
  11. 07 Apr, 2015 2 commits
  12. 03 Apr, 2015 7 commits
  13. 01 Apr, 2015 6 commits
    • drivers/block/pmem: Fix 32-bit build warning in pmem_alloc() · 4c1eaa23
      Committed by Ingo Molnar
      Fix:
      
        drivers/block/pmem.c: In function ‘pmem_alloc’:
        drivers/block/pmem.c:138:7: warning: format ‘%llx’ expects argument of type ‘long long unsigned int’, but argument 3 has type ‘phys_addr_t’ [-Wformat=]
      
      Fix it by using %pa, the proper format specifier for 'phys_addr_t'
      arguments.
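      
      A minimal before/after illustration (the message text is approximate;
      what matters is that %pa takes the phys_addr_t by reference):
      
          /* Before: %llx assumes a 64-bit integer, so this warns when
           * phys_addr_t is 32 bits wide. */
          dev_warn(dev, "could not ioremap %llx\n", pmem->phys_addr);
          
          /* After: %pa prints a phys_addr_t at its native width on
           * every architecture and takes a pointer to the value. */
          dev_warn(dev, "could not ioremap %pa\n", &pmem->phys_addr);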
      
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Boaz Harrosh <boaz@plexistor.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Keith Busch <keith.busch@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matthew Wilcox <willy@linux.intel.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-nvdimm@ml01.01.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • drivers/block/pmem: Add a driver for persistent memory · 9e853f23
      Committed by Ross Zwisler
      PMEM is a new driver that presents a reserved range of memory as
      a block device.  This is useful for developing with NV-DIMMs,
      and can be used with volatile memory as a development platform.
      
      This patch contains the initial driver from Ross Zwisler, with
      various changes: converted it to use a platform_device for
      discovery, fixed partition support and merged various patches
      from Boaz Harrosh.
      Tested-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Acked-by: Dan Williams <dan.j.williams@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Boaz Harrosh <boaz@plexistor.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Keith Busch <keith.busch@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matthew Wilcox <willy@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-nvdimm@ml01.01.org
      Link: http://lkml.kernel.org/r/1427872339-6688-3-git-send-email-hch@lst.de
      [ Minor cleanups. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • NVMe: increase depth of admin queue · d31af0a3
      Committed by Jens Axboe
      Usually the admin queue depth of 64 is plenty, but for some use cases
      we really need it larger.  Examples are use cases like MAT, where you
      have to touch all of the NAND for init/format-like purposes.  In those
      cases, we see a good 2x performance increase with an increased queue
      depth.
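      
      The change itself is a one-liner; assuming the NVME_AQ_DEPTH macro
      name used by the driver of that era, its shape is:
      
          #define NVME_AQ_DEPTH   256     /* previously 64 */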
      Signed-off-by: Jens Axboe <axboe@fb.com>
      Acked-by: Keith Busch <keith.busch@intel.com>
    • nvme: Fix PRP list calculation for non-4k system page size · f137e0f1
      Committed by Murali Iyer
      PRP list calculation is supposed to be based on the device's page
      size.  Without this fix, systems with a page size larger than the
      device's page size corrupt the namespace as well as system memory.
      Systems like x86 might not experience this issue because they use a
      PAGE_SIZE of 4K, whereas powerpc uses a PAGE_SIZE of 64K while an NVMe
      device's page size varies depending upon the vendor.
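      
      A simplified model of the corrected calculation (names hypothetical;
      the device page size comes from the controller configuration field
      CC.MPS, which encodes 2^(12+mps) bytes):
      
          /* Size PRP entries by the *device's* page size, not the CPU's
           * PAGE_SIZE.  On ppc64, PAGE_SIZE is 64 KiB while the device
           * may use 4 KiB pages; dividing by PAGE_SIZE undercounts the
           * entries and the transfer scribbles past the PRP list. */
          static unsigned int nvme_prp_entries(size_t len, unsigned int mps)
          {
                  size_t dev_page_size = (size_t)1 << (12 + mps);
          
                  return (len + dev_page_size - 1) / dev_page_size;
          }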
      Signed-off-by: Murali Iyer <mniyer@us.ibm.com>
      Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
      Acked-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • NVMe: Fix blk-mq hot cpu notification · 1efccc9d
      Committed by Keith Busch
      The driver may issue commands to a device that may never return, so its
      request_queue could always have active requests while the controller is
      running. Waiting for the queue to freeze could block forever, which is
      what blk-mq's hot cpu notification handler was doing when nvme drives
      were in use.
      
      This patch has the nvme driver reserve the asynchronous event
      command's tag and not keep the request active.  We can't have more
      than one outstanding since the request is released back to the
      request_queue before the command is completed.  Having only one avoids
      potential tag collisions, and reserving the tag for this purpose
      prevents other admin tasks from reusing it.
      
      I also couldn't think of a scenario where issuing AEN requests at a
      depth of one is worse than issuing them in batches, so I don't think
      we lose anything with this change.
      
      As an added bonus, doing it this way removes "Cancelling I/O" warnings
      observed when unbinding the nvme driver from a device.
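      
      A hedged sketch of the scheme against the blk-mq API of that era
      (field and call shapes simplified):
      
          /* Carve one reserved tag out of the admin tag set for the AEN. */
          dev->admin_tagset.queue_depth = NVME_AQ_DEPTH - 1;
          dev->admin_tagset.reserved_tags = 1;
          
          /* Allocate the async event command from the reserved pool, so
           * it can never collide with, or starve, normal admin commands. */
          req = blk_mq_alloc_request(dev->admin_q, WRITE, GFP_ATOMIC,
                                     true /* reserved */);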
      Reported-by: Yigal Korman <yigal@plexistor.com>
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • NVMe: embedded iod mask cleanup · fda631ff
      Committed by Chong Yuan
      Remove the unused mask in nvme_alloc_iod().
      Signed-off-by: Chong Yuan <chong.yuan@memblaze.com>
      Reviewed-by: Wenbo Wang <wenbo.wang@memblaze.com>
      Acked-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>