1. 30 Aug, 2017 1 commit
    • libnvdimm, label: fix index block size calculation · 02881768
      Dan Williams authored
      The old calculation assumed that the label space was 128k and the label
      size was 128 bytes. With v1.2 labels, where the label size is 256 bytes, this
      calculation will return zero. We are saved by the fact that the
      nsindex_size is always pre-initialized from a previous 128 byte
      assumption and we are lucky that the index sizes turn out the same.
      
      Fix this going forward in case we start encountering different
      geometries of label areas besides 128k.
      
      Since the label size can change from one call to the next, drop the
      caching of nsindex_size.
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
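A minimal sketch of how an index block size could be derived from the label area size and the label size, rather than cached from a 128-byte assumption. The header size, the alignment, and every function name below are assumptions made for illustration; none of them come from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration only; not taken from the commit. */
#define INDEX_HEADER_SIZE 72u   /* hypothetical fixed index header size */
#define INDEX_ALIGN       256u  /* hypothetical index block alignment */

/* Round x up to the next multiple of a (a must be a power of two). */
static size_t align_up(size_t x, size_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/* Budget each slot as label_size bytes plus one byte of index overhead. */
static size_t num_label_slots(size_t config_size, size_t label_size)
{
	return config_size / (label_size + 1);
}

/* One index block: fixed header plus a one-bit-per-slot bitmap, aligned. */
static size_t index_block_size(size_t config_size, size_t label_size)
{
	size_t nslot = num_label_slots(config_size, label_size);
	size_t bitmap_bytes = (nslot + 7) / 8;

	return align_up(INDEX_HEADER_SIZE + bitmap_bytes, INDEX_ALIGN);
}
```

Under these assumed constants, a 128k label area gives the same index block size for 128-byte and 256-byte labels, which matches the commit's remark that the sizes happen to turn out the same. Because the result depends on both inputs, it is recomputed on each call instead of being cached.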
  2. 24 Aug, 2017 1 commit
    • block: replace bi_bdev with a gendisk pointer and partitions index · 74d46992
      Christoph Hellwig authored
      This way we don't need a block_device structure to submit I/O.  The
      block_device has different life time rules from the gendisk and
      request_queue and is usually only available when the block device node
      is open.  Other callers need to explicitly create one (e.g. the lightnvm
      passthrough code, or the new nvme multipathing code).
      
      For the actual I/O path all that we need is the gendisk, which exists
      once per block device.  But given that the block layer also does
      partition remapping we additionally need a partition index, which is
      used for said remapping in generic_make_request.
      
      Note that all the block drivers generally want request_queue or
      sometimes the gendisk, so this removes a layer of indirection all
      over the stack.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  3. 12 Aug, 2017 1 commit
  4. 10 Aug, 2017 1 commit
  5. 05 Aug, 2017 1 commit
  6. 26 Jul, 2017 1 commit
    • libnvdimm: Stop using HPAGE_SIZE · 0dd69643
      Oliver O'Halloran authored
      Currently libnvdimm uses HPAGE_SIZE as the default alignment for DAX and
      PFN devices. HPAGE_SIZE is the default hugetlbfs page size and when
      hugetlbfs is disabled it defaults to PAGE_SIZE. Given DAX has more
      in common with THP than hugetlbfs we should probably be using
      HPAGE_PMD_SIZE, but this is undefined when THP is disabled so let's just
      give it a new name.
      
      The other usage of HPAGE_SIZE in libnvdimm is when determining how large
      the altmap should be. For the reasons mentioned above it doesn't really
      make sense to use HPAGE_SIZE here either. PMD_SIZE seems to be safe to
      use in generic code and it happens to match the vmemmap allocation block
      on x86 and Power. It's still a hack, but it's a slightly nicer hack.
      Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  7. 30 Jun, 2017 1 commit
  8. 16 Jun, 2017 5 commits
  9. 11 May, 2017 1 commit
    • libnvdimm: add an atomic vs process context flag to rw_bytes · 3ae3d67b
      Vishal Verma authored
      nsio_rw_bytes can clear media errors, but this cannot be done while we
      are in an atomic context due to locking within ACPI. From the BTT,
      ->rw_bytes may be called either from atomic or process context depending
      on whether the calls happen during initialization or during IO.
      
      During init, we want to ensure error clearing happens, and the flag
      marking process context allows nsio_rw_bytes to do that. When called
      during IO, we're in atomic context, and error clearing can be skipped.
      
      Cc: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
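A rough user-space sketch of the idea: the caller passes a flag saying whether it is in atomic context, and the write path only attempts error clearing when it is not. The flag name `IO_ATOMIC`, the `fake_media` type, and the choice to report an error when clearing is skipped are all assumptions for illustration, not the kernel's actual interface.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define IO_ATOMIC 0x1UL /* hypothetical flag: caller must not sleep */

/* Hypothetical stand-in for a namespace with one known media error. */
struct fake_media {
	unsigned char data[64];
	bool poisoned;     /* a media error is recorded for this range */
	int clear_calls;   /* how many times error clearing ran */
};

/* Stand-in for an error-clearing path that may take sleeping locks. */
static void clear_media_error(struct fake_media *m)
{
	m->poisoned = false;
	m->clear_calls++;
}

/* Write len bytes at off. Returns 0 on success, -1 on error. */
static int rw_bytes_write(struct fake_media *m, size_t off,
			  const void *buf, size_t len, unsigned long flags)
{
	if (off > sizeof(m->data) || len > sizeof(m->data) - off)
		return -1;

	/* Only clear errors when the caller says sleeping is allowed. */
	if (m->poisoned && !(flags & IO_ATOMIC))
		clear_media_error(m);

	memcpy(m->data + off, buf, len);

	/* Assumption: an uncleared error is reported back to the caller. */
	return m->poisoned ? -1 : 0;
}
```

In this sketch an init-time caller would pass `0` so that clearing runs, while an I/O-path caller would pass `IO_ATOMIC` and skip it.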
  10. 05 May, 2017 1 commit
  11. 13 Apr, 2017 1 commit
  12. 01 Mar, 2017 1 commit
    • nfit, libnvdimm: fix interleave set cookie calculation · 86ef58a4
      Dan Williams authored
      The interleave-set cookie is a sum that sanity checks that the composition of
      an interleave set has not changed from when the namespace was initially
      created.  The checksum is calculated by sorting the DIMMs by their
      location in the interleave-set. The comparison for the sort must be
      64-bit wide, not byte-by-byte as performed by memcmp() in the broken
      case.
      
      Fix the implementation to accept correct cookie values in addition to
      the Linux "memcmp" order cookies, but only allow correct cookies to be
      generated going forward. It does mean that namespaces created by
      third-party-tooling, or created by newer kernels with this fix, will not
      validate on older kernels. However, there are a couple of mitigating
      conditions:
      
          1/ platforms with namespace-label capable NVDIMMs are not widely
             available.
      
          2/ interleave-sets with a single-dimm are by definition not affected
             (nothing to sort). This covers the QEMU-KVM NVDIMM emulation case.
      
      The cookie stored in the namespace label will be fixed by any write to the
      namespace label; the most straightforward way to achieve this is to
      write to the "alt_name" attribute of a namespace in sysfs.
      
      Cc: <stable@vger.kernel.org>
      Fixes: eaf96153 ("libnvdimm, nfit: add interleave-set state-tracking infrastructure")
      Reported-by: Nicholas Moulin <nicholas.w.moulin@linux.intel.com>
      Tested-by: Nicholas Moulin <nicholas.w.moulin@linux.intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
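To illustrate why the comparison width matters for the sort, the sketch below contrasts a byte-by-byte comparator built on memcmp() with a full 64-bit comparator. Keys are stored as explicit little-endian bytes so the demonstration does not depend on host byte order. The helper names are hypothetical and this is not the kernel's cookie code.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Store v as 8 little-endian bytes. */
static void put_le64(unsigned char out[8], uint64_t v)
{
	for (int i = 0; i < 8; i++)
		out[i] = (unsigned char)(v >> (8 * i));
}

/* Load 8 little-endian bytes as a 64-bit value. */
static uint64_t get_le64(const unsigned char in[8])
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v |= (uint64_t)in[i] << (8 * i);
	return v;
}

/* Byte-by-byte ordering: the lowest-addressed byte decides first. */
static int cmp_bytewise(const void *a, const void *b)
{
	return memcmp(a, b, 8);
}

/* 64-bit wide ordering: the numeric value decides. */
static int cmp_u64(const void *a, const void *b)
{
	uint64_t x = get_le64(a), y = get_le64(b);

	return (x > y) - (x < y);
}
```

For the keys 0x0100 and 0x0002, `cmp_u64` orders 0x0002 first while `cmp_bytewise` orders 0x0100 first, so a sum computed over the sorted order can differ between the two comparators. Either function can be passed to `qsort()` over an array of 8-byte keys.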
  13. 19 Oct, 2016 2 commits
    • libnvdimm: allow a platform to force enable label support · 42237e39
      Dan Williams authored
      Platforms like QEMU-KVM implement an NFIT table and label DSMs.
      However, since that environment does not define an aliased
      configuration, the labels are currently ignored and the kernel registers
      a single full-sized pmem-namespace per region. Now that the kernel
      supports sub-divisions of pmem regions the labels have a purpose.
      Arrange for the labels to be honored when we find an existing / valid
      namespace index block.
      
      Cc: <qemu-devel@nongnu.org>
      Cc: Haozhong Zhang <haozhong.zhang@intel.com>
      Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • libnvdimm: use generic iostat interfaces · 8d7c22ac
      Toshi Kani authored
      nd_iostat_start() and nd_iostat_end() implement the same functionality
      that generic_start_io_acct() and generic_end_io_acct() already provide.
      
      Change nd_iostat_start() and nd_iostat_end() to call the generic iostat
      interfaces.  There is no change in the nd interfaces.
      Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  14. 01 Oct, 2016 2 commits
  15. 25 Sep, 2016 1 commit
  16. 02 Sep, 2016 1 commit
  17. 09 Aug, 2016 1 commit
  18. 12 Jul, 2016 3 commits
    • libnvdimm: cycle flush hints · 0c27af60
      Dan Williams authored
      When the NFIT provides multiple flush hint addresses per-dimm it is
      expressing that the platform is capable of processing multiple flush
      requests in parallel.  There is some fixed cost per flush request; let
      the cost be shared in parallel on multiple cpus.
      
      Since there may not be enough flush hint addresses for each cpu to have
      one, keep a per-cpu index of the last used hint, hash it with current
      pid, and assume that access pattern and scheduler randomness will keep
      the flush-hint usage somewhat staggered across cpus.
      
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
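The hint selection described above might be sketched in user space as follows: each cpu remembers the last index it used, mixes it with the current pid through a hash, and masks the result down to the number of available hints. The hash function, the fixed cpu count, and the power-of-two hint count are assumptions for illustration; the kernel's actual helpers are not reproduced here.

```c
#include <stdint.h>

#define NUM_CPUS 4 /* hypothetical cpu count for the sketch */

/* Per-cpu index of the last flush hint used (modelled as an array). */
static uint32_t last_hint[NUM_CPUS];

/* Small multiplicative hash; a stand-in, not the kernel's hash. */
static uint32_t hash32(uint32_t v, unsigned int bits)
{
	return (v * 0x9E3779B1u) >> (32 - bits);
}

/*
 * Pick the next flush hint for this cpu. num_hints is assumed to be a
 * non-zero power of two so the index can be reduced with a mask.
 */
static uint32_t next_flush_hint(unsigned int cpu, uint32_t pid,
				uint32_t num_hints)
{
	uint32_t idx = last_hint[cpu % NUM_CPUS];

	/* Advance by a pid-dependent step so cpus tend to stagger. */
	idx += hash32(pid + idx, 8);
	last_hint[cpu % NUM_CPUS] = idx;

	return idx & (num_hints - 1);
}
```

The sketch makes no attempt to guarantee that two cpus never pick the same hint; like the commit message, it relies on the hash and on scheduling variation to keep usage roughly spread out.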
    • libnvdimm, nfit: move flush hint mapping to region-device driver-data · e5ae3b25
      Dan Williams authored
      In preparation for triggering flushes of a DIMM's writes-posted-queue
      (WPQ) via the pmem driver, move the mapping of flush hint addresses to the
      region driver.  Since this uses devm_nvdimm_memremap() the flush
      addresses will remain mapped while any region to which the dimm belongs
      is active.
      
      We need to communicate more information to the nvdimm core to facilitate
      this mapping, namely each dimm object now carries an array of flush hint
      address resources.
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • libnvdimm, nfit: remove nfit_spa_map() infrastructure · a8a6d2e0
      Dan Williams authored
      Now that all shared mappings are handled by devm_nvdimm_memremap() we no
      longer need nfit_spa_map() nor do we need to trigger a callback to the
      bus provider at region disable time.
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  19. 21 May, 2016 1 commit
  20. 10 May, 2016 1 commit
    • libnvdimm, dax: introduce device-dax infrastructure · cd03412a
      Dan Williams authored
      Device DAX is the device-centric analogue of Filesystem DAX
      (CONFIG_FS_DAX).  It allows persistent memory ranges to be allocated and
      mapped without need of an intervening file system.  This initial
      infrastructure arranges for a libnvdimm pfn-device to be represented as
      a different device-type so that it can be attached to a driver other
      than the pmem driver.
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  21. 23 Apr, 2016 5 commits
  22. 08 Apr, 2016 1 commit
    • libnvdimm, pfn: fix nvdimm_namespace_add_poison() vs section alignment · a3901802
      Dan Williams authored
      When section alignment padding is in effect we need to shift / truncate
      the range that is queried for poison by the 'start_pad' or 'end_trunc'
      reservations.
      
      It's easiest if we just pass in an adjusted resource range rather than
      deriving it from the passed in namespace.  With the resource range
      resolution pushed out to the caller we can also push the
      namespace-to-region lookup to the caller and drop the implicit pmem-type
      assumption about the passed in namespace object.
      
      Cc: Vishal Verma <vishal.l.verma@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  23. 10 3月, 2016 1 次提交
  24. 06 3月, 2016 1 次提交
  25. 10 1月, 2016 3 次提交
  26. 13 12月, 2015 1 次提交