1. 24 Oct, 2016 1 commit
  2. 08 Oct, 2016 1 commit
    • /dev/dax: fix Kconfig dependency build breakage · 4e65e938
      Ross Zwisler committed
      The function dax_pmem_probe() in drivers/dax/pmem.c is compiled under the
      CONFIG_DEV_DAX_PMEM tri-state config option.  This config option currently
      only depends on CONFIG_NVDIMM_DAX, a bool, which means that the following
      configuration is possible:
      
      CONFIG_LIBNVDIMM=m
      ...
      CONFIG_NVDIMM_DAX=y
      CONFIG_DEV_DAX=y
      CONFIG_DEV_DAX_PMEM=y
      
      With this config, LIBNVDIMM is built as a module, and NVDIMM_DAX=y only
      means that drivers/nvdimm/dax_devs.c is compiled into that module.
      However, dax_pmem_probe() depends on several symbols defined in
      drivers/nvdimm/dax_devs.c, which results in the following build errors:
      
      drivers/built-in.o: In function `dax_pmem_probe':
      linux/drivers/dax/pmem.c:70: undefined reference to `to_nd_dax'
      linux/drivers/dax/pmem.c:74: undefined reference to
      `nvdimm_namespace_common_probe'
      linux/drivers/dax/pmem.c:80: undefined reference to `devm_nsio_enable'
      linux/drivers/dax/pmem.c:81: undefined reference to `nvdimm_setup_pfn'
      linux/drivers/dax/pmem.c:84: undefined reference to `devm_nsio_disable'
      linux/drivers/dax/pmem.c:122: undefined reference to `to_nd_region'
      drivers/built-in.o: In function `dax_pmem_init':
      linux/drivers/dax/pmem.c:147: undefined reference to `__nd_driver_register'
      
      Fix this by making NVDIMM_DAX a tristate.  DEV_DAX_PMEM depends on
      NVDIMM_DAX which depends on LIBNVDIMM.  Since they are all now tristates,
      if LIBNVDIMM is built as a kernel module DEV_DAX_PMEM will be as well.
      This prevents drivers/dax/pmem.c from being built as a built-in while its
      dependencies are in the libnvdimm.ko module.
      Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
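The dependency behaviour described in this commit can be modelled in a few lines of C. This is a simplified sketch, not kernel code: it assumes the usual Kconfig rule that a bool symbol may be set to y as soon as its dependency is m or y, while a tristate symbol is capped at the value of its dependency. The function names are hypothetical, and the additional dependency of DEV_DAX_PMEM on DEV_DAX is ignored.

```c
#include <assert.h>

/* Kconfig tristate values, ordered so that n < m < y. */
enum tristate { TRI_N = 0, TRI_M = 1, TRI_Y = 2 };

/* A bool symbol is either n or y: any non-n dependency allows y. */
static enum tristate bool_limit(enum tristate dep)
{
	return dep == TRI_N ? TRI_N : TRI_Y;
}

/* A tristate symbol can never exceed the value of its dependency. */
static enum tristate tristate_limit(enum tristate dep)
{
	return dep;
}

/*
 * Highest value DEV_DAX_PMEM may take for a given LIBNVDIMM setting.
 * nvdimm_dax_is_tristate == 0 models the state before the fix (bool),
 * nvdimm_dax_is_tristate != 0 models the state after the fix.
 */
enum tristate max_dev_dax_pmem(enum tristate libnvdimm,
			       int nvdimm_dax_is_tristate)
{
	enum tristate nvdimm_dax = nvdimm_dax_is_tristate
		? tristate_limit(libnvdimm)
		: bool_limit(libnvdimm);

	/* DEV_DAX_PMEM is itself a tristate that depends on NVDIMM_DAX. */
	return tristate_limit(nvdimm_dax);
}
```

With LIBNVDIMM=m, the bool variant lets DEV_DAX_PMEM reach y (the broken configuration), while the tristate variant caps it at m.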
  3. 08 Jul, 2016 1 commit
    • libnvdimm: introduce devm_nvdimm_memremap(), convert nfit_spa_map() users · 29b9aa0a
      Dan Williams committed
      In preparation for generically mapping flush hint addresses for both the
      BLK and PMEM use case, provide a generic / reference counted mapping
      api.  Given the fact that a dimm may belong to multiple regions (PMEM
      and BLK), the flush hint addresses need to be held valid as long as any
      region associated with the dimm is active.  This is similar to the
      existing BLK-region case where multiple BLK-regions may share an
      aperture mapping.  Up-level this shared / reference-counted mapping
      capability from the nfit driver to a core nvdimm capability.
      
      This eliminates the need for the nd_blk_region.disable() callback.  Note
      that the removal of nfit_spa_map() and related infrastructure is
      deferred to a later patch.
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
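The shared, reference-counted mapping idea in this commit can be illustrated with a small userspace model. This is only a sketch under stated assumptions: the names shared_map_get() and shared_map_put() are hypothetical, calloc() stands in for the real memremap of a physical range, and there is no locking, which real kernel code would need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One entry per distinct mapped range; refs counts the active users. */
struct shared_map {
	unsigned long long offset;	/* key: start of the mapped range */
	size_t size;
	void *mem;			/* stand-in for the mapped address */
	unsigned int refs;
	struct shared_map *next;
};

static struct shared_map *maps;

/* Return the existing mapping for (offset, size) or create a new one. */
void *shared_map_get(unsigned long long offset, size_t size)
{
	struct shared_map *m;

	for (m = maps; m; m = m->next) {
		if (m->offset == offset && m->size == size) {
			m->refs++;
			return m->mem;
		}
	}

	m = malloc(sizeof(*m));
	if (!m)
		return NULL;
	m->mem = calloc(1, size);
	if (!m->mem) {
		free(m);
		return NULL;
	}
	m->offset = offset;
	m->size = size;
	m->refs = 1;
	m->next = maps;
	maps = m;
	return m->mem;
}

/*
 * Drop one reference. Returns 1 if the mapping was torn down,
 * 0 if other users remain, -1 if no mapping exists for offset.
 */
int shared_map_put(unsigned long long offset)
{
	struct shared_map **pp, *m;

	for (pp = &maps; (m = *pp) != NULL; pp = &m->next) {
		if (m->offset != offset)
			continue;
		if (--m->refs > 0)
			return 0;
		*pp = m->next;
		free(m->mem);
		free(m);
		return 1;
	}
	return -1;
}
```

Two regions that request the same range receive the same address, and the mapping stays valid until the last of them drops its reference.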
  4. 10 May, 2016 1 commit
    • libnvdimm, dax: introduce device-dax infrastructure · cd03412a
      Dan Williams committed
      Device DAX is the device-centric analogue of Filesystem DAX
      (CONFIG_FS_DAX).  It allows persistent memory ranges to be allocated and
      mapped without need of an intervening file system.  This initial
      infrastructure arranges for a libnvdimm pfn-device to be represented as
      a different device-type so that it can be attached to a driver other
      than the pmem driver.
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  5. 29 Aug, 2015 2 commits
    • libnvdimm, pmem: 'struct page' for pmem · 32ab0a3f
      Dan Williams committed
      Enable the pmem driver to handle PFN device instances.  Attaching a pmem
      namespace to a pfn device triggers the driver to allocate and initialize
      struct page entries for pmem.  Memory capacity for this allocation comes
      exclusively from RAM for now which is suitable for low PMEM to RAM
      ratios.  This mechanism will be expanded later for setting an "allocate
      from PMEM" policy.
      
      Cc: Boaz Harrosh <boaz@plexistor.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • libnvdimm, pfn: 'struct page' provider infrastructure · e1455744
      Dan Williams committed
      Implement the base infrastructure for libnvdimm PFN devices. Similar to
      BTT devices they take a namespace as a backing device and layer
      functionality on top. In this case the functionality is reserving space
      for an array of 'struct page' entries to be handed out through
      pfn_to_page(). For now this is just the basic libnvdimm-device-model for
      configuring the base PFN device.
      
      As the namespace claiming mechanism for PFN devices is mostly identical
      to BTT devices drivers/nvdimm/claim.c is created to house the common
      bits.
      
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
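A rough calculation shows why allocating the 'struct page' array from RAM is reasonable only for low PMEM to RAM ratios. The sketch below assumes 4 KiB pages and a 64-byte struct page; both values are assumptions typical of x86_64 and are not stated in the commits, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, not taken from the commit messages. */
#define ASSUMED_PAGE_SIZE		4096ULL
#define ASSUMED_STRUCT_PAGE_SIZE	64ULL

/* Bytes needed for one struct page entry per page frame of the namespace. */
uint64_t memmap_bytes(uint64_t namespace_bytes)
{
	uint64_t npfns = namespace_bytes / ASSUMED_PAGE_SIZE;

	return npfns * ASSUMED_STRUCT_PAGE_SIZE;
}
```

Under these assumptions the overhead is 64 / 4096, or about 1.6% of the namespace capacity: 16 MiB of page entries per GiB of pmem. That is small next to RAM when pmem capacity is modest, but grows linearly with it.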
  6. 26 Jun, 2015 2 commits
    • libnvdimm, nfit, nd_blk: driver for BLK-mode access persistent memory · 047fc8a1
      Ross Zwisler committed
      The libnvdimm implementation handles allocating dimm address space (DPA)
      between PMEM and BLK mode interfaces.  After DPA has been allocated from
      a BLK-region to a BLK-namespace the nd_blk driver attaches to handle I/O
      as a struct bio based block device. Unlike PMEM, BLK is required to
      handle platform specific details like mmio register formats and memory
      controller interleave.  For this reason the libnvdimm generic nd_blk
      driver calls back into the bus provider to carry out the I/O.
      
      This initial implementation handles the BLK interface defined by the
      ACPI 6 NFIT [1] and the NVDIMM DSM Interface Example [2] composed from
      DCR (dimm control region), BDW (block data window), IDT (interleave
      descriptor) NFIT structures and the hardware register format.
      [1]: http://www.uefi.org/sites/default/files/resources/ACPI_6.0.pdf
      [2]: http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Boaz Harrosh <boaz@plexistor.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
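The split between a generic BLK driver and a bus provider that performs the actual I/O can be sketched with a callback stored in the region object. This is an illustrative userspace model, not the kernel interface: the structure layout, the function names, and the RAM-backed provider are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Region object: the generic driver only knows the callback, not the media. */
struct blk_region {
	int (*do_io)(struct blk_region *r, uint64_t dpa, void *buf,
		     size_t len, int write);
	void *provider_data;	/* owned and interpreted by the provider */
	size_t size;		/* capacity of the region in bytes */
};

/* Generic side: validate the request, then call back into the provider. */
int generic_blk_rw(struct blk_region *r, uint64_t dpa, void *buf,
		   size_t len, int write)
{
	if (!r || !r->do_io || dpa > r->size || len > r->size - dpa)
		return -1;
	return r->do_io(r, dpa, buf, len, write);
}

/*
 * Hypothetical provider backed by plain memory. A real provider would
 * program block windows and handle interleave here instead.
 */
int ram_provider_do_io(struct blk_region *r, uint64_t dpa, void *buf,
		       size_t len, int write)
{
	unsigned char *media = r->provider_data;

	if (write)
		memcpy(media + dpa, buf, len);
	else
		memcpy(buf, media + dpa, len);
	return 0;
}
```

Keeping the platform-specific details behind do_io is what lets the generic driver stay independent of register formats and interleave layouts.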
    • nd_btt: atomic sector updates · 5212e11f
      Vishal Verma committed
      BTT stands for Block Translation Table, and is a way to provide power
      fail sector atomicity semantics for block devices that have the ability
      to perform byte granularity IO. It relies on the capability of libnvdimm
      namespace devices to do byte aligned IO.
      
      The BTT works as a stacked block device, and reserves a chunk of space
      from the backing device for its accounting metadata. It is a bio-based
      driver because all IO is done synchronously, and there is no queuing or
      asynchronous completions at either the device or the driver level.
      
      The BTT uses 'lanes' to index into various 'on-disk' data structures,
      and lanes also act as a synchronization mechanism in case there are more
      CPUs than available lanes. We did a comparison between two lane lock
      strategies - first where we kept an atomic counter around that tracked
      which was the last lane that was used, and 'our' lane was determined by
      atomically incrementing that. That way, for the nr_cpus > nr_lanes case,
      theoretically, no CPU would be blocked waiting for a lane. The other
      strategy was to take the number of the CPU we're scheduled on and hash it to
      a lane number. Theoretically, this could block an IO that could've
      otherwise run using a different, free lane. But some fio workloads
      showed that the direct cpu -> lane hash performed faster than tracking
      'last lane' - my reasoning is the cache thrash caused by moving the
      atomic variable made that approach slower than simply waiting out the
      in-progress IO. This supports the conclusion that the driver can be a
      very simple bio-based one that does synchronous IOs instead of queuing.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Boaz Harrosh <boaz@plexistor.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      [jmoyer: fix nmi watchdog timeout in btt_map_init]
      [jmoyer: move btt initialization to module load path]
      [jmoyer: fix memory leak in the btt initialization path]
      [jmoyer: Don't overwrite corrupted arenas]
      Signed-off-by: Vishal Verma <vishal.l.verma@linux.intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
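The two lane-selection strategies compared in this commit message can be sketched as follows. This is a minimal illustration, not the driver's code: the function names are hypothetical, the "hash" is assumed to be a plain modulo, and nr_lanes must be non-zero.

```c
#include <assert.h>
#include <stdatomic.h>

/*
 * Strategy 1: derive the lane directly from the CPU number.
 * No shared state is touched, but two CPUs that map to the same
 * lane must wait for each other even if other lanes are free.
 */
unsigned int lane_from_cpu(unsigned int cpu, unsigned int nr_lanes)
{
	return cpu % nr_lanes;
}

/* Shared counter used by strategy 2; every caller writes to it. */
static atomic_uint last_lane;

/*
 * Strategy 2: hand out lanes round-robin from a shared atomic counter.
 * Lanes are spread evenly, but the counter's cache line bounces
 * between CPUs on every I/O.
 */
unsigned int lane_round_robin(unsigned int nr_lanes)
{
	return atomic_fetch_add(&last_lane, 1) % nr_lanes;
}
```

With four lanes, CPUs 1 and 5 both map to lane 1 under the first strategy, which is the contention case the message describes; the second strategy avoids it at the cost of a shared write on every request.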
  7. 25 Jun, 2015 3 commits