1. 31 7月, 2018 1 次提交
  2. 18 7月, 2018 1 次提交
    • T
      block: make bdev_ops->rw_page() take a REQ_OP instead of bool · 3f289dcb
      Tejun Heo 提交于
      c11f0c0b ("block/mm: make bdev_ops->rw_page() take a bool for
      read/write") replaced @OP with boolean @is_write, which limited the
      amount of information going into ->rw_page() and more importantly
      page_endio(), which removed the need to expose block internals to mm.
      
      Unfortunately, we want to track discards separately and @is_write
      isn't enough information.  This patch updates bdev_ops->rw_page() to
      take REQ_OP instead but leaves page_endio() to take bool @is_write.
      This allows the block part of operations to have enough information
      while not leaking it to mm.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Mike Christie <mchristi@redhat.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      3f289dcb
  3. 29 6月, 2018 1 次提交
  4. 07 6月, 2018 2 次提交
    • R
      libnvdimm, pmem: Unconditionally deep flush on *sync · ce7f11a2
      Ross Zwisler 提交于
      Prior to this commit we would only do a "deep flush" (have nvdimm_flush()
      write to each of the flush hints for a region) in response to an
      msync/fsync/sync call if the nvdimm_has_cache() returned true at the time
      we were setting up the request queue.  This happens due to the write cache
      value passed in to blk_queue_write_cache(), which then causes the block
      layer to send down BIOs with REQ_FUA and REQ_PREFLUSH set.  We do have a
      "write_cache" sysfs entry for namespaces, i.e.:
      
        /sys/bus/nd/devices/pfn0.1/block/pmem0/dax/write_cache
      
      which can be used to control whether or not the kernel thinks a given
      namespace has a write cache, but this didn't modify the deep flush behavior
      that we set up when the driver was initialized.  Instead, it only modified
      whether or not DAX would flush CPU caches via dax_flush() in response to
      *sync calls.
      
      Simplify this by making the *sync deep flush always happen, regardless of
      the write cache setting of a namespace.  The DAX CPU cache flushing will
      still be controlled the write_cache setting of the namespace.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 5fdf8e5b ("libnvdimm: re-enable deep flush for pmem devices via fsync()")
      Signed-off-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      ce7f11a2
    • R
      libnvdimm, pmem: Complete REQ_FLUSH => REQ_PREFLUSH · d2d6364d
      Ross Zwisler 提交于
      Complete the move from REQ_FLUSH to REQ_PREFLUSH that apparently started
      way back in v4.8.
      Signed-off-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      d2d6364d
  5. 23 5月, 2018 2 次提交
  6. 22 5月, 2018 1 次提交
    • D
      mm: introduce MEMORY_DEVICE_FS_DAX and CONFIG_DEV_PAGEMAP_OPS · e7638488
      Dan Williams 提交于
      In preparation for fixing dax-dma-vs-unmap issues, filesystems need to
      be able to rely on the fact that they will get wakeups on dev_pagemap
      page-idle events. Introduce MEMORY_DEVICE_FS_DAX and
      generic_dax_page_free() as common indicator / infrastructure for dax
      filesytems to require. With this change there are no users of the
      MEMORY_DEVICE_HOST designation, so remove it.
      
      The HMM sub-system extended dev_pagemap to arrange a callback when a
      dev_pagemap managed page is freed. Since a dev_pagemap page is free /
      idle when its reference count is 1 it requires an additional branch to
      check the page-type at put_page() time. Given put_page() is a hot-path
      we do not want to incur that check if HMM is not in use, so a static
      branch is used to avoid that overhead when not necessary.
      
      Now, the FS_DAX implementation wants to reuse this mechanism for
      receiving dev_pagemap ->page_free() callbacks. Rework the HMM-specific
      static-key into a generic mechanism that either HMM or FS_DAX code paths
      can enable.
      
      For ARCH=um builds, and any other arch that lacks ZONE_DEVICE support,
      care must be taken to compile out the DEV_PAGEMAP_OPS infrastructure.
      However, we still need to support FS_DAX in the FS_DAX_LIMITED case
      implemented by the s390/dcssblk driver.
      
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Reported-by: Nkbuild test robot <lkp@intel.com>
      Reported-by: NThomas Meyer <thomas@m3y3r.de>
      Reported-by: NDave Jiang <dave.jiang@intel.com>
      Cc: "Jérôme Glisse" <jglisse@redhat.com>
      Reviewed-by: NJan Kara <jack@suse.cz>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      e7638488
  7. 15 5月, 2018 1 次提交
    • D
      x86/asm/memcpy_mcsafe: Return bytes remaining · 60622d68
      Dan Williams 提交于
      Machine check safe memory copies are currently deployed in the pmem
      driver whenever reading from persistent memory media, so that -EIO is
      returned rather than triggering a kernel panic. While this protects most
      pmem accesses, it is not complete in the filesystem-dax case. When
      filesystem-dax is enabled reads may bypass the block layer and the
      driver via dax_iomap_actor() and its usage of copy_to_iter().
      
      In preparation for creating a copy_to_iter() variant that can handle
      machine checks, teach memcpy_mcsafe() to return the number of bytes
      remaining rather than -EFAULT when an exception occurs.
      Co-developed-by: NTony Luck <tony.luck@intel.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: hch@lst.de
      Cc: linux-fsdevel@vger.kernel.org
      Cc: linux-nvdimm@lists.01.org
      Link: http://lkml.kernel.org/r/152539238119.31796.14318473522414462886.stgit@dwillia2-desk3.amr.corp.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      60622d68
  8. 15 3月, 2018 1 次提交
  9. 09 3月, 2018 1 次提交
  10. 07 3月, 2018 1 次提交
  11. 03 3月, 2018 1 次提交
    • D
      libnvdimm: re-enable deep flush for pmem devices via fsync() · 5fdf8e5b
      Dave Jiang 提交于
      Re-enable deep flush so that users always have a way to be sure that a
      write makes it all the way out to media. Writes from the PMEM driver
      always arrive at the NVDIMM since movnt is used to bypass the cache, and
      the driver relies on the ADR (Asynchronous DRAM Refresh) mechanism to
      flush write buffers on power failure. The Deep Flush mechanism is there
      to explicitly write buffers to protect against (rare) ADR failure.  This
      change prevents a regression in deep flush behavior so that applications
      can continue to depend on fsync() as a mechanism to trigger deep flush
      in the filesystem-DAX case.
      
      Fixes: 06e8ccda ("acpi: nfit: Add support for detect platform CPU cache...")
      Reviewed-by: NJeff Moyer <jmoyer@redhat.com>
      Signed-off-by: NDave Jiang <dave.jiang@intel.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      5fdf8e5b
  12. 01 3月, 2018 1 次提交
  13. 02 2月, 2018 1 次提交
  14. 09 1月, 2018 1 次提交
  15. 16 11月, 2017 1 次提交
  16. 11 9月, 2017 1 次提交
    • M
      dax: remove the pmem_dax_ops->flush abstraction · c3ca015f
      Mikulas Patocka 提交于
      Commit abebfbe2 ("dm: add ->flush() dax operation support") is
      buggy. A DM device may be composed of multiple underlying devices and
      all of them need to be flushed. That commit just routes the flush
      request to the first device and ignores the other devices.
      
      It could be fixed by adding more complex logic to the device mapper. But
      there is only one implementation of the method pmem_dax_ops->flush - that
      is pmem_dax_flush() - and it calls arch_wb_cache_pmem(). Consequently, we
      don't need the pmem_dax_ops->flush abstraction at all, we can call
      arch_wb_cache_pmem() directly from dax_flush() because dax_dev->ops->flush
      can't ever reach anything different from arch_wb_cache_pmem().
      
      It should be also pointed out that for some uses of persistent memory it
      is needed to flush only a very small amount of data (such as 1 cacheline),
      and it would be overkill if we go through that device mapper machinery for
      a single flushed cache line.
      
      Fix this by removing the pmem_dax_ops->flush abstraction and call
      arch_wb_cache_pmem() directly from dax_flush(). Also, remove the device
      mapper code that forwards the flushes.
      
      Fixes: abebfbe2 ("dm: add ->flush() dax operation support")
      Cc: stable@vger.kernel.org
      Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
      Reviewed-by: NDan Williams <dan.j.williams@intel.com>
      Signed-off-by: NMike Snitzer <snitzer@redhat.com>
      c3ca015f
  17. 07 9月, 2017 1 次提交
  18. 01 7月, 2017 1 次提交
    • D
      libnvdimm, region, pmem: fix 'badblocks' sysfs_get_dirent() reference lifetime · 6aa734a2
      Dan Williams 提交于
      We need to hold a reference on the 'dirent' until we are sure there are
      no more notifications that will be sent. As noted in the new comments we
      take advantage of the fact that the references are taken and dropped
      under device_lock() and that nd_device_notify() holds device_lock() over
      new badblocks notifications. The notifications that happen when
      badblocks are cleared only occur while the device is active.
      
      Also take the opportunity to fix up the error messages to report the
      user visible effect of a sysfs_get_dirent() failure.
      
      Fixes: 975750a9 ("libnvdimm, pmem: Add sysfs notifications to badblocks")
      Cc: Toshi Kani <toshi.kani@hpe.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      6aa734a2
  19. 30 6月, 2017 2 次提交
    • D
      libnvdimm, pmem: disable dax flushing when pmem is fronting a volatile region · 0b277961
      Dan Williams 提交于
      The pmem driver attaches to both persistent and volatile memory ranges
      advertised by the ACPI NFIT. When the region is volatile it is redundant
      to spend cycles flushing caches at fsync(). Check if the hosting region
      is volatile and do not set dax_write_cache() if it is.
      
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Matthew Wilcox <mawilcox@microsoft.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      0b277961
    • D
      libnvdimm, pmem, dax: export a cache control attribute · 6e0c90d6
      Dan Williams 提交于
      The dax_flush() operation can be turned into a nop on platforms where
      firmware arranges for cpu caches to be flushed on a power-fail event.
      The ACPI 6.2 specification defines a mechanism for the platform to
      indicate this capability so the kernel can select the proper default.
      However, for other platforms, the administrator must toggle this setting
      manually.
      
      Given this flush setting is a dax-specific mechanism we advertise it
      through a 'dax' attribute group hanging off a host device. For example,
      a 'pmem0' block-device gets a 'dax' sysfs-subdirectory with a
      'write_cache' attribute to control response to dax cache flush requests.
      This is similar to the 'queue/write_cache' attribute that appears under
      block devices.
      
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Matthew Wilcox <mawilcox@microsoft.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Suggested-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      6e0c90d6
  20. 28 6月, 2017 3 次提交
  21. 16 6月, 2017 4 次提交
    • D
      x86, dax, libnvdimm: remove wb_cache_pmem() indirection · 4e4f00a9
      Dan Williams 提交于
      With all handling of the CONFIG_ARCH_HAS_PMEM_API case being moved to
      libnvdimm and the pmem driver directly we do not need to provide global
      wrappers and fallbacks in the CONFIG_ARCH_HAS_PMEM_API=n case. The pmem
      driver will simply not link to arch_wb_cache_pmem() in that case.  Same
      as before, pmem flushing is only defined for x86_64, via
      clean_cache_range(), but it is straightforward to add other archs in the
      future.
      
      arch_wb_cache_pmem() is an exported function since the pmem module needs
      to find it, but it is privately declared in drivers/nvdimm/pmem.h because
      there are no consumers outside of the pmem driver.
      
      Cc: <x86@kernel.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Oliver O'Halloran <oohall@gmail.com>
      Cc: Matthew Wilcox <mawilcox@microsoft.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Suggested-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      4e4f00a9
    • D
      dax, pmem: introduce an optional 'flush' dax_operation · 3c1cebff
      Dan Williams 提交于
      Filesystem-DAX flushes caches whenever it writes to the address returned
      through dax_direct_access() and when writing back dirty radix entries.
      That flushing is only required in the pmem case, so add a dax operation
      to allow pmem to take this extra action, but skip it for other dax
      capable devices that do not provide a flush routine.
      
      An example for this differentiation might be a volatile ram disk where
      there is no expectation of persistence. In fact the pmem driver itself might
      front such an address range specified by the NFIT. So, this "no flush"
      property might be something passed down by the bus / libnvdimm.
      
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Matthew Wilcox <mawilcox@microsoft.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Reviewed-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      3c1cebff
    • T
      libnvdimm, pmem: Add sysfs notifications to badblocks · 975750a9
      Toshi Kani 提交于
      Sysfs "badblocks" information may be updated during run-time that:
       - MCE, SCI, and sysfs "scrub" may add new bad blocks
       - Writes and ioctl() may clear bad blocks
      
      Add support to send sysfs notifications to sysfs "badblocks" file
      under region and pmem directories when their badblocks information
      is re-evaluated (but is not necessarily changed) during run-time.
      Signed-off-by: NToshi Kani <toshi.kani@hpe.com>
      Cc: Vishal Verma <vishal.l.verma@intel.com>
      Cc: Linda Knippers <linda.knippers@hpe.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      975750a9
    • D
      libnvdimm, label: honor the lba size specified in v1.2 labels · f979b13c
      Dan Williams 提交于
      Previously we only honored the lba size for blk-aperture mode
      namespaces. For pmem namespaces the lba size was just assumed to be 512.
      With the new v1.2 label definition and compatibility with other
      operating environments, the ->lbasize property is now respected for pmem
      namespaces.
      
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      f979b13c
  22. 10 6月, 2017 1 次提交
    • D
      x86, uaccess: introduce copy_from_iter_flushcache for pmem / cache-bypass operations · 0aed55af
      Dan Williams 提交于
      The pmem driver has a need to transfer data with a persistent memory
      destination and be able to rely on the fact that the destination writes are not
      cached. It is sufficient for the writes to be flushed to a cpu-store-buffer
      (non-temporal / "movnt" in x86 terms), as we expect userspace to call fsync()
      to ensure data-writes have reached a power-fail-safe zone in the platform. The
      fsync() triggers a REQ_FUA or REQ_FLUSH to the pmem driver which will turn
      around and fence previous writes with an "sfence".
      
      Implement a __copy_from_user_inatomic_flushcache, memcpy_page_flushcache, and
      memcpy_flushcache, that guarantee that the destination buffer is not dirty in
      the cpu cache on completion. The new copy_from_iter_flushcache and sub-routines
      will be used to replace the "pmem api" (include/linux/pmem.h +
      arch/x86/include/asm/pmem.h). The availability of copy_from_iter_flushcache()
      and memcpy_flushcache() are gated by the CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
      config symbol, and fallback to copy_from_iter_nocache() and plain memcpy()
      otherwise.
      
      This is meant to satisfy the concern from Linus that if a driver wants to do
      something beyond the normal nocache semantics it should be something private to
      that driver [1], and Al's concern that anything uaccess related belongs with
      the rest of the uaccess code [2].
      
      The first consumer of this interface is a new 'copy_from_iter' dax operation so
      that pmem can inject cache maintenance operations without imposing this
      overhead on other dax-capable drivers.
      
      [1]: https://lists.01.org/pipermail/linux-nvdimm/2017-January/008364.html
      [2]: https://lists.01.org/pipermail/linux-nvdimm/2017-April/009942.html
      
      Cc: <x86@kernel.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Toshi Kani <toshi.kani@hpe.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Matthew Wilcox <mawilcox@microsoft.com>
      Reviewed-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      0aed55af
  23. 09 6月, 2017 1 次提交
  24. 01 5月, 2017 1 次提交
    • D
      mm, zone_device: Replace {get, put}_zone_device_page() with a single reference to fix pmem crash · 71389703
      Dan Williams 提交于
      The x86 conversion to the generic GUP code included a small change which causes
      crashes and data corruption in the pmem code - not good.
      
      The root cause is that the /dev/pmem driver code implicitly relies on the x86
      get_user_pages() implementation doing a get_page() on the page refcount, because
      get_page() does a get_zone_device_page() which properly refcounts pmem's separate
      page struct arrays that are not present in the regular page struct structures.
      (The pmem driver does this because it can cover huge memory areas.)
      
      But the x86 conversion to the generic GUP code changed the get_page() to
      page_cache_get_speculative() which is faster but doesn't do the
      get_zone_device_page() call the pmem code relies on.
      
      One way to solve the regression would be to change the generic GUP code to use
      get_page(), but that would slow things down a bit and punish other generic-GUP
      using architectures for an x86-ism they did not care about. (Arguably the pmem
      driver was probably not working reliably for them: but nvdimm is an Intel
      feature, so non-x86 exposure is probably still limited.)
      
      So restructure the pmem code's interface with the MM instead: get rid of the
      get/put_zone_device_page() distinction, integrate put_zone_device_page() into
      __put_page() and and restructure the pmem completion-wait and teardown machinery:
      
      Kirill points out that the calls to {get,put}_dev_pagemap() can be
      removed from the mm fast path if we take a single get_dev_pagemap()
      reference to signify that the page is alive and use the final put of the
      page to drop that reference.
      
      This does require some care to make sure that any waits for the
      percpu_ref to drop to zero occur *after* devm_memremap_page_release(),
      since it now maintains its own elevated reference.
      
      This speeds up things while also making the pmem refcounting more robust going
      forward.
      Suggested-by: NKirill Shutemov <kirill.shutemov@linux.intel.com>
      Tested-by: NKirill Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      Reviewed-by: NLogan Gunthorpe <logang@deltatee.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Jérôme Glisse <jglisse@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/149339998297.24933.1129582806028305912.stgit@dwillia2-desk3.amr.corp.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      71389703
  25. 29 4月, 2017 1 次提交
    • T
      libnvdimm, pmem: fix a NULL pointer BUG in nd_pmem_notify · b2518c78
      Toshi Kani 提交于
      The following BUG was observed when nd_pmem_notify() was called
      for a BTT device.  The use of a pmem_device pointer is not valid
      with BTT.
      
       BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
       IP: nd_pmem_notify+0x30/0xf0 [nd_pmem]
       Call Trace:
        nd_device_notify+0x40/0x50
        child_notify+0x10/0x20
        device_for_each_child+0x50/0x90
        nd_region_notify+0x20/0x30
        nd_device_notify+0x40/0x50
        nvdimm_region_notify+0x27/0x30
        acpi_nfit_scrub+0x341/0x590 [nfit]
        process_one_work+0x197/0x450
        worker_thread+0x4e/0x4a0
        kthread+0x109/0x140
      
      Fix nd_pmem_notify() by setting nd_region and badblocks pointers
      properly for BTT.
      
      Cc: <stable@vger.kernel.org>
      Cc: Vishal Verma <vishal.l.verma@intel.com>
      Fixes: 71999466 ("libnvdimm: async notification support")
      Signed-off-by: NToshi Kani <toshi.kani@hpe.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      b2518c78
  26. 26 4月, 2017 2 次提交
    • D
      x86, dax, pmem: remove indirection around memcpy_from_pmem() · 6abccd1b
      Dan Williams 提交于
      memcpy_from_pmem() maps directly to memcpy_mcsafe(). The wrapper
      serves no real benefit aside from affording a more generic function name
      than the x86-specific 'mcsafe'. However this would not be the first time
      that x86 terminology leaked into the global namespace. For lack of
      better name, just use memcpy_mcsafe() directly.
      
      This conversion also catches a place where we should have been using
      plain memcpy, acpi_nfit_blk_single_io().
      
      Cc: <x86@kernel.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Matthew Wilcox <mawilcox@microsoft.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Acked-by: NTony Luck <tony.luck@intel.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      6abccd1b
    • D
      block: remove block_device_operations ->direct_access() · d4b29fd7
      Dan Williams 提交于
      Now that all the producers and consumers of dax interfaces have been
      converted to using dax_operations on a dax_device, remove the block
      device direct_access enabling.
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      d4b29fd7
  27. 20 4月, 2017 1 次提交
  28. 13 1月, 2017 1 次提交
  29. 17 12月, 2016 1 次提交
    • D
      libnvdimm: fix mishandled nvdimm_clear_poison() return value · 868f036f
      Dan Williams 提交于
      Colin, via static analysis, reports that the length could be negative
      from nvdimm_clear_poison() in the error case. There was a similar
      problem with commit 0a3f27b9 "libnvdimm, namespace: avoid multiple
      sector calculations" that I noticed when merging the for-4.10/libnvdimm
      topic branch into libnvdimm-for-next, but I missed this one. Fix both of
      them to the following procedure:
      
      * if we clear a block's worth of media, clear that many blocks in
        badblocks
      
      * if we clear less than the requested size of the transfer return an
        error
      
      * always invalidate cache after any non-error / non-zero
        nvdimm_clear_poison result
      
      Fixes: 82bf1037 ("libnvdimm: check and clear poison before writing to pmem")
      Fixes: 0a3f27b9 ("libnvdimm, namespace: avoid multiple sector calculations")
      Cc: Fabian Frederick <fabf@skynet.be>
      Cc: Dave Jiang <dave.jiang@intel.com>
      Reported-by: NColin Ian King <colin.king@canonical.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      868f036f
  30. 05 12月, 2016 1 次提交
  31. 29 11月, 2016 1 次提交
    • D
      libnvdimm: use consistent naming for request_mem_region() · 450c6633
      Dan Williams 提交于
      Here is an example /proc/iomem listing for a system with 2 namespaces,
      one in "sector" mode and one in "memory" mode:
      
        1fc000000-2fbffffff : Persistent Memory (legacy)
          1fc000000-2fbffffff : namespace1.0
        340000000-34fffffff : Persistent Memory
          340000000-34fffffff : btt0.1
      
      Here is the corresponding ndctl listing:
      
        # ndctl list
        [
          {
            "dev":"namespace1.0",
            "mode":"memory",
            "size":4294967296,
            "blockdev":"pmem1"
          },
          {
            "dev":"namespace0.0",
            "mode":"sector",
            "size":267091968,
            "uuid":"f7594f86-badb-4592-875f-ded577da2eaf",
            "sector_size":4096,
            "blockdev":"pmem0s"
          }
        ]
      
      Notice that the ndctl listing is purely in terms of namespace devices,
      while the iomem listing leaks the internal "btt0.1" implementation
      detail. Given that ndctl requires the namespace device name to change
      the mode, for example:
      
        # ndctl create-namespace --reconfig=namespace0.0 --mode=raw --force
      
      ...use the namespace name in the iomem listing to keep the claiming
      device name consistent across different mode settings.
      
      Cc: Vishal Verma <vishal.l.verma@intel.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      450c6633