1. 24 Mar, 2019 1 commit
  2. 13 Dec, 2018 1 commit
  3. 22 May, 2018 1 commit
    • mm: introduce MEMORY_DEVICE_FS_DAX and CONFIG_DEV_PAGEMAP_OPS · e7638488
      Dan Williams committed
      In preparation for fixing dax-dma-vs-unmap issues, filesystems need to
      be able to rely on the fact that they will get wakeups on dev_pagemap
      page-idle events. Introduce MEMORY_DEVICE_FS_DAX and
      generic_dax_page_free() as common indicator / infrastructure for dax
      filesystems to require. With this change there are no users of the
      MEMORY_DEVICE_HOST designation, so remove it.
      
      The HMM sub-system extended dev_pagemap to arrange a callback when a
      dev_pagemap managed page is freed. Since a dev_pagemap page is free /
      idle when its reference count is 1 it requires an additional branch to
      check the page-type at put_page() time. Given put_page() is a hot-path
      we do not want to incur that check if HMM is not in use, so a static
      branch is used to avoid that overhead when not necessary.
      
      Now, the FS_DAX implementation wants to reuse this mechanism for
      receiving dev_pagemap ->page_free() callbacks. Rework the HMM-specific
      static-key into a generic mechanism that either HMM or FS_DAX code paths
      can enable.
      
      For ARCH=um builds, and any other arch that lacks ZONE_DEVICE support,
      care must be taken to compile out the DEV_PAGEMAP_OPS infrastructure.
      However, we still need to support FS_DAX in the FS_DAX_LIMITED case
      implemented by the s390/dcssblk driver.
      
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Reported-by: kbuild test robot <lkp@intel.com>
      Reported-by: Thomas Meyer <thomas@m3y3r.de>
      Reported-by: Dave Jiang <dave.jiang@intel.com>
      Cc: "Jérôme Glisse" <jglisse@redhat.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
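
      The put_page() gating described in this commit can be modelled in
      user space. The following is only a sketch under stated assumptions:
      the struct layout, the plain integer standing in for the static key,
      and the helper names are all illustrative and are not the kernel's
      actual definitions.

      ```c
      #include <assert.h>
      #include <stdbool.h>

      /* Hypothetical user-space model; not the kernel's struct page. */
      struct page {
      	int refcount;      /* a dev_pagemap page is idle at refcount 1 */
      	bool is_devmap;    /* true for dev_pagemap managed pages */
      	int freed_events;  /* counts ->page_free() style notifications */
      };

      /* Stands in for the static key: non-zero once HMM or FS_DAX opts in. */
      static int devmap_ops_users;

      static void devmap_ops_get(void)
      {
      	devmap_ops_users++;
      }

      static void devmap_ops_put(void)
      {
      	assert(devmap_ops_users > 0);
      	devmap_ops_users--;
      }

      /* Stand-in for the ->page_free() callback that wakes up waiters. */
      static void page_free_notify(struct page *page)
      {
      	page->freed_events++;
      }

      static void put_page(struct page *page)
      {
      	assert(page->refcount > 0);
      	page->refcount--;

      	/* Fast path: skip the page-type check unless a user enabled it. */
      	if (devmap_ops_users == 0)
      		return;

      	/* A dev_pagemap page is free / idle when its count drops to 1. */
      	if (page->is_devmap && page->refcount == 1)
      		page_free_notify(page);
      }
      ```

      In the kernel the integer test would be a static branch, so the
      disabled case costs a patched no-op rather than a memory load.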
  4. 14 Mar, 2018 1 commit
  5. 07 Mar, 2018 1 commit
  6. 09 Jan, 2018 1 commit
  7. 20 Dec, 2017 2 commits
    • libnvdimm, dax: fix 1GB-aligned namespaces vs physical misalignment · 41fce90f
      Dan Williams committed
      The following namespace configuration attempt:
      
          # ndctl create-namespace -e namespace0.0 -m devdax -a 1G -f
          libndctl: ndctl_dax_enable: dax0.1: failed to enable
            Error: namespace0.0: failed to enable
      
          failed to reconfigure namespace: No such device or address
      
      ...fails when the backing memory range is not physically aligned to 1G:
      
          # cat /proc/iomem | grep Persistent
          210000000-30fffffff : Persistent Memory (legacy)
      
      In the above example the 4G persistent memory range starts and ends on a
      256MB boundary.
      
      We already handle this correctly for ranges whose section alignment
      (128MB) is violated by collisions with "System RAM"; we simply need to
      extend that padding/truncation to the 1GB alignment use case.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 315c5625 ("libnvdimm, pfn: add 'align' attribute...")
      Reported-and-tested-by: Jane Chu <jane.chu@oracle.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
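
      The padding/truncation arithmetic for the range quoted above can be
      worked through as follows. This is a simplified sketch, not the
      driver's code: the struct and function names are hypothetical, and it
      only shows the rounding, not the "System RAM" collision checks.

      ```c
      #include <assert.h>
      #include <stdint.h>

      #define SZ_1G 0x40000000ULL

      /* Hypothetical result type: bytes skipped at each end of the range. */
      struct trim {
      	uint64_t start_pad;  /* bytes padded at the start */
      	uint64_t end_trunc;  /* bytes truncated at the end */
      };

      /* Round an inclusive [start, end_incl] range inward to 'align'. */
      static struct trim trim_to_align(uint64_t start, uint64_t end_incl,
      				 uint64_t align)
      {
      	struct trim t;
      	uint64_t end = end_incl + 1;

      	/* align must be a non-zero power of two */
      	assert(align && (align & (align - 1)) == 0);

      	t.start_pad = ((start + align - 1) & ~(align - 1)) - start;
      	t.end_trunc = end - (end & ~(align - 1));
      	return t;
      }
      ```

      For 210000000-30fffffff with 1G alignment this yields a start pad of
      0x30000000 (768MB) and an end truncation of 0x10000000 (256MB),
      leaving 3G of the 4G range usable.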
    • libnvdimm, pfn: fix start_pad handling for aligned namespaces · 19deaa21
      Dan Williams committed
      The alignment checks at pfn driver startup fail to properly account for
      the 'start_pad' in the case where the namespace is misaligned relative
      to its internal alignment. This is typically triggered in 1G-aligned
      namespaces, but could theoretically trigger with smaller namespace
      alignments. When this triggers, the kernel reports messages of the form:
      
          dax2.1: bad offset: 0x3c000000 dax disabled align: 0x40000000
      
      Cc: <stable@vger.kernel.org>
      Fixes: 1ee6667c ("libnvdimm, pfn, dax: fix initialization vs autodetect...")
      Reported-by: Jane Chu <jane.chu@oracle.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
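
      One way to picture the check this commit describes: the data offset
      alone need not be aligned, as long as the physical address it lands
      on (resource start plus start_pad plus offset) is. The helper below
      is a hypothetical sketch of that idea, and the numbers in the usage
      note are invented for illustration, not taken from the report.

      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <stdint.h>

      #define SZ_1G 0x40000000ULL

      /* True when the first data byte lands on an 'align' boundary. */
      static bool data_start_aligned(uint64_t res_start, uint64_t start_pad,
      			       uint64_t offset, uint64_t align)
      {
      	/* align must be a non-zero power of two */
      	assert(align && (align & (align - 1)) == 0);
      	return ((res_start + start_pad + offset) & (align - 1)) == 0;
      }
      ```

      With an assumed resource start of 0x200000000, a start_pad of
      0x4000000 and an offset of 0x3c000000, the data begins at
      0x240000000, which is 1G aligned even though the offset is not.
      Leaving start_pad out of the sum makes the same configuration look
      misaligned.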
  8. 29 Sep, 2017 1 commit
  9. 16 Aug, 2017 1 commit
  10. 12 Aug, 2017 1 commit
  11. 26 Jul, 2017 1 commit
    • libnvdimm: Stop using HPAGE_SIZE · 0dd69643
      Oliver O'Halloran committed
      Currently libnvdimm uses HPAGE_SIZE as the default alignment for DAX and
      PFN devices. HPAGE_SIZE is the default hugetlbfs page size and when
      hugetlbfs is disabled it defaults to PAGE_SIZE. Given DAX has more
      in common with THP than hugetlbfs we should probably be using
      HPAGE_PMD_SIZE, but this is undefined when THP is disabled so let's just
      give it a new name.
      
      The other usage of HPAGE_SIZE in libnvdimm is when determining how large
      the altmap should be. For the reasons mentioned above it doesn't really
      make sense to use HPAGE_SIZE here either. PMD_SIZE seems to be safe to
      use in generic code and it happens to match the vmemmap allocation block
      on x86 and Power. It's still a hack, but it's a slightly nicer hack.
      
      Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  12. 28 Jun, 2017 1 commit
    • libnvdimm, nfit: enable support for volatile ranges · c9e582aa
      Dan Williams committed
      Allow volatile nfit ranges to participate in all the same infrastructure
      provided for persistent memory regions. A resulting namespace
      device will still be called "pmem", but the parent region type will be
      "nd_volatile". This is in preparation for disabling the dax ->flush()
      operation in the pmem driver when it is hosted on a volatile range.
      
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Matthew Wilcox <mawilcox@microsoft.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  13. 16 Jun, 2017 1 commit
    • libnvdimm, label: add address abstraction identifiers · b3fde74e
      Dan Williams committed
      Starting with v1.2 labels, 'address abstractions' can be hinted via an
      address abstraction id that implies an info-block format. The standard
      address abstraction in the specification is the v2 format of the
      Block-Translation-Table (BTT). Support for that is saved for a later
      patch, for now we add support for the Linux supported address
      abstractions BTT (v1), PFN, and DAX.
      
      The new 'holder_class' attribute for namespace devices is added for
      tooling to specify the 'abstraction_guid' to store in the namespace label.
      For v1.1 labels this field is undefined and any setting of
      'holder_class' away from the default 'none' value will only have effect
      until the driver is unloaded. Setting 'holder_class' requires that
      whatever device tries to claim the namespace must be of the specified
      class.
      
      Cc: Vishal Verma <vishal.l.verma@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  14. 11 May, 2017 1 commit
    • libnvdimm: add an atomic vs process context flag to rw_bytes · 3ae3d67b
      Vishal Verma committed
      nsio_rw_bytes can clear media errors, but this cannot be done while we
      are in an atomic context due to locking within ACPI. From the BTT,
      ->rw_bytes may be called either from atomic or process context depending
      on whether the calls happen during initialization or during IO.
      
      During init, we want to ensure error clearing happens, and the flag
      marking process context allows nsio_rw_bytes to do that. When called
      during IO, we're in atomic context, and error clearing can be skipped.
      
      Cc: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
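
      The context flag can be sketched as a caller-supplied bit that the
      write path consults before attempting anything that may sleep. The
      flag name, the toy namespace state, and the return convention below
      are assumptions made for illustration; they are not the driver's
      actual interface.

      ```c
      #include <assert.h>
      #include <stdbool.h>

      /* Illustrative flag; the name is an assumption of this sketch. */
      #define IO_FLAG_ATOMIC 1UL

      /* Toy namespace state: one latent media error and a clear counter. */
      struct ns_model {
      	bool media_error;
      	int errors_cleared;
      };

      /* Stand-in for the error-clearing path, which may sleep. */
      static void clear_media_error(struct ns_model *ns)
      {
      	assert(ns->media_error);
      	ns->media_error = false;
      	ns->errors_cleared++;
      }

      /* Write path: returns true when this call cleared a media error. */
      static bool rw_bytes_write(struct ns_model *ns, unsigned long flags)
      {
      	if (!ns->media_error)
      		return false;

      	/* Atomic context (the IO path): must not sleep, skip clearing. */
      	if (flags & IO_FLAG_ATOMIC)
      		return false;

      	/* Process context (init): safe to clear the error. */
      	clear_media_error(ns);
      	return true;
      }
      ```

      A caller in the IO path would pass IO_FLAG_ATOMIC and leave the error
      in place, while an initialization-time caller passes 0 and gets the
      error cleared.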
  15. 05 May, 2017 1 commit
    • libnvdimm, pfn: fix 'npfns' vs section alignment · d5483fed
      Dan Williams committed
      Fix failures to create namespaces due to the vmem_altmap not advertising
      enough free space to store the memmap.
      
       WARNING: CPU: 15 PID: 8022 at arch/x86/mm/init_64.c:656 arch_add_memory+0xde/0xf0
       [..]
       Call Trace:
        dump_stack+0x63/0x83
        __warn+0xcb/0xf0
        warn_slowpath_null+0x1d/0x20
        arch_add_memory+0xde/0xf0
        devm_memremap_pages+0x244/0x440
        pmem_attach_disk+0x37e/0x490 [nd_pmem]
        nd_pmem_probe+0x7e/0xa0 [nd_pmem]
        nvdimm_bus_probe+0x71/0x120 [libnvdimm]
        driver_probe_device+0x2bb/0x460
        bind_store+0x114/0x160
        drv_attr_store+0x25/0x30
      
      In commit 658922e5 "libnvdimm, pfn: fix memmap reservation sizing"
      we arranged for the capacity to be allocated, but failed to also update
      the 'npfns' parameter. This leads to cases where there is enough
      capacity reserved to hold all the allocated sections, but
      vmemmap_populate_hugepages() still encounters -ENOMEM from
      altmap_alloc_block_buf().
      
      This fix is a stop-gap until we can teach the core memory hotplug
      implementation to permit sub-section hotplug.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 658922e5 ("libnvdimm, pfn: fix memmap reservation sizing")
      Reported-by: Anisha Allada <anisha.allada@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
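
      The mismatch between reserved capacity and 'npfns' can be illustrated
      by rounding the page count up to whole memory sections. This is a
      sketch only: the 4K page size, 128MB section size and 64-byte memmap
      entry are assumptions typical of x86_64, and the helper names are
      hypothetical rather than the driver's.

      ```c
      #include <assert.h>
      #include <stdint.h>

      #define PAGE_SIZE          4096ULL          /* assumed page size */
      #define SECTION_SIZE       (128ULL << 20)   /* assumed section size */
      #define PAGES_PER_SECTION  (SECTION_SIZE / PAGE_SIZE)
      #define MEMMAP_ENTRY_SIZE  64ULL            /* assumed sizeof(struct page) */

      /* Round a page count up to a whole number of sections. */
      static uint64_t section_align_npfns(uint64_t npfns)
      {
      	assert(npfns > 0);
      	return (npfns + PAGES_PER_SECTION - 1) / PAGES_PER_SECTION
      		* PAGES_PER_SECTION;
      }

      /* Bytes of memmap needed to describe 'npfns' pages. */
      static uint64_t memmap_bytes(uint64_t npfns)
      {
      	return npfns * MEMMAP_ENTRY_SIZE;
      }
      ```

      Under these assumptions a 130MB range holds 33280 pages but spans two
      sections, so hotplug populates memmap for 65536 pages (4MB) rather
      than the roughly 2MB that the unrounded count would advertise.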
  16. 01 May, 2017 1 commit
    • libnvdimm: fix nvdimm_bus_lock() vs device_lock() ordering · 452bae0a
      Dan Williams committed
      A debug patch to turn the standard device_lock() into something that
      lockdep can analyze yielded the following:
      
       ======================================================
       [ INFO: possible circular locking dependency detected ]
       4.11.0-rc4+ #106 Tainted: G           O
       -------------------------------------------------------
       lt-libndctl/1898 is trying to acquire lock:
        (&dev->nvdimm_mutex/3){+.+.+.}, at: [<ffffffffc023c948>] nd_attach_ndns+0x178/0x1b0 [libnvdimm]
      
       but task is already holding lock:
        (&nvdimm_bus->reconfig_mutex){+.+.+.}, at: [<ffffffffc022e0b1>] nvdimm_bus_lock+0x21/0x30 [libnvdimm]
      
       which lock already depends on the new lock.
      
       the existing dependency chain (in reverse order) is:
      
       -> #1 (&nvdimm_bus->reconfig_mutex){+.+.+.}:
              lock_acquire+0xf6/0x1f0
              __mutex_lock+0x88/0x980
              mutex_lock_nested+0x1b/0x20
              nvdimm_bus_lock+0x21/0x30 [libnvdimm]
              nvdimm_namespace_capacity+0x1b/0x40 [libnvdimm]
              nvdimm_namespace_common_probe+0x230/0x510 [libnvdimm]
              nd_pmem_probe+0x14/0x180 [nd_pmem]
              nvdimm_bus_probe+0xa9/0x260 [libnvdimm]
      
       -> #0 (&dev->nvdimm_mutex/3){+.+.+.}:
              __lock_acquire+0x1107/0x1280
              lock_acquire+0xf6/0x1f0
              __mutex_lock+0x88/0x980
              mutex_lock_nested+0x1b/0x20
              nd_attach_ndns+0x178/0x1b0 [libnvdimm]
              nd_namespace_store+0x308/0x3c0 [libnvdimm]
              namespace_store+0x87/0x220 [libnvdimm]
      
      In this case '&dev->nvdimm_mutex/3' mirrors '&dev->mutex'.
      
      Fix this by replacing the use of device_lock() with nvdimm_bus_lock() to protect
      nd_{attach,detach}_ndns() operations.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 8c2f7e86 ("libnvdimm: infrastructure for btt devices")
      Reported-by: Yi Zhang <yizhan@redhat.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  17. 05 Feb, 2017 1 commit
    • libnvdimm, pfn: fix memmap reservation size versus 4K alignment · bfb34527
      Dan Williams committed
      When vmemmap_populate() allocates space for the memmap it does so in 2MB
      sized chunks. The libnvdimm-pfn driver incorrectly accounts for this
      when the alignment of the device is set to 4K. When this happens we
      trigger memory allocation failures in altmap_alloc_block_buf() and
      trigger warnings of the form:
      
       WARNING: CPU: 0 PID: 3376 at arch/x86/mm/init_64.c:656 arch_add_memory+0xe4/0xf0
       [..]
       Call Trace:
        dump_stack+0x86/0xc3
        __warn+0xcb/0xf0
        warn_slowpath_null+0x1d/0x20
        arch_add_memory+0xe4/0xf0
        devm_memremap_pages+0x29b/0x4e0
      
      Fixes: 315c5625 ("libnvdimm, pfn: add 'align' attribute, default to HPAGE_SIZE")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
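
      The accounting problem can be sketched as a rounding question: if the
      memmap is populated in 2MB chunks, the reservation has to be rounded
      to the larger of the device alignment and 2MB. The helper below is a
      hypothetical model; the 64-byte entry size is an assumption and the
      info-block overhead is left out for brevity.

      ```c
      #include <assert.h>
      #include <stdint.h>

      #define SZ_4K              0x1000ULL
      #define SZ_2M              0x200000ULL
      #define MEMMAP_ENTRY_SIZE  64ULL   /* assumed sizeof(struct page) */

      /* Round 'x' up to a power-of-two 'align'. */
      static uint64_t align_up(uint64_t x, uint64_t align)
      {
      	assert(align && (align & (align - 1)) == 0);
      	return (x + align - 1) & ~(align - 1);
      }

      /* Reservation that only honours the device alignment (the bug). */
      static uint64_t memmap_reserve_naive(uint64_t npfns, uint64_t align)
      {
      	return align_up(npfns * MEMMAP_ENTRY_SIZE, align);
      }

      /* Reservation that also covers the 2MB population chunk size. */
      static uint64_t memmap_reserve(uint64_t npfns, uint64_t align)
      {
      	uint64_t chunk = align > SZ_2M ? align : SZ_2M;

      	return align_up(npfns * MEMMAP_ENTRY_SIZE, chunk);
      }
      ```

      For 1000 pages at 4K alignment the naive form reserves 64KB, while
      the chunk-aware form reserves a full 2MB, which is what a 2MB-sized
      allocation from the altmap would actually consume.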
  18. 11 Dec, 2016 1 commit
  19. 24 Jun, 2016 1 commit
    • libnvdimm, pfn, dax: fix initialization vs autodetect for mode + alignment · 1ee6667c
      Dan Williams committed
      The updated ndctl unit tests discovered that if a pfn configuration with
      a 4K alignment is read from the namespace, that alignment will be
      ignored in favor of the default 2M alignment.  The result is that the
      configuration will fail initialization with a message like:
      
          dax6.1: bad offset: 0x22000 dax disabled align: 0x200000
      
      Fix this by allowing the alignment read from the info block to override
      the default, which is 2M rather than 0, in the autodetect path.  This also fixes a
      similar problem with the mode and alignment settings silently being
      overwritten by the kernel when userspace has changed it.  We now will
      either overwrite the info block if userspace changes the uuid or fail
      and warn if a live setting disagrees with the info block.
      
      Cc: <stable@vger.kernel.org>
      Cc: Micah Parrish <micah.parrish@hpe.com>
      Cc: Toshi Kani <toshi.kani@hpe.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
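
      The override described above amounts to preferring the alignment
      recorded in the info block over the built-in default when validating
      the offset. The sketch below assumes that a recorded alignment of 0
      means "not recorded"; that convention and the function names are
      illustrative, not the driver's.

      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <stdint.h>

      #define SZ_4K 0x1000ULL
      #define SZ_2M 0x200000ULL

      /* Prefer the info block's alignment; fall back to the 2M default. */
      static uint64_t effective_align(uint64_t info_block_align)
      {
      	return info_block_align ? info_block_align : SZ_2M;
      }

      /* True when 'offset' is a multiple of the power-of-two 'align'. */
      static bool offset_ok(uint64_t offset, uint64_t align)
      {
      	assert(align && (align & (align - 1)) == 0);
      	return (offset & (align - 1)) == 0;
      }
      ```

      The offset 0x22000 from the message above passes against a recorded
      4K alignment but fails against the 2M default, which matches the
      failure seen when the recorded value was ignored.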
  20. 22 May, 2016 2 commits
    • libnvdimm, dax: fix deletion · 03dca343
      Dan Williams committed
      The ndctl unit tests discovered that the dax enabling omitted updates to
      nd_detach_and_reset().  This routine clears the device configuration
      when the namespace is detached.  Without this clearing userspace may
      assume that the device is in the process of being configured by another
      agent in the system.
      
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • libnvdimm, dax: fix alignment validation · 5e24c9fd
      Dan Williams committed
      Testing the dax-device autodetect support revealed a probe failure with
      the following result:
      
          dax0.1: bad offset: 0x8200000 dax disabled
      
      The original pfn-device implementation inferred the alignment from
      ilog2(offset), now that the alignment is explicit the is_power_of_2()
      needs replacing with a real sanity check against the recorded alignment.
      Otherwise the alignment check is useless in the implicit case and only
      the minimum size of the offset matters.
      
      This self-consistency check is further validated by the probe path that
      will re-check that the offset is large enough to contain all the
      metadata required to enable the device.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
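
      The difference between the two checks can be shown with the offset
      from the failure message. This is a sketch with hypothetical helper
      names, and it assumes the recorded alignment is 2M.

      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <stdint.h>

      #define SZ_2M 0x200000ULL

      /* Old-style check: the offset itself must be a power of two. */
      static bool offset_is_power_of_2(uint64_t offset)
      {
      	return offset != 0 && (offset & (offset - 1)) == 0;
      }

      /* New-style check: the offset must be a multiple of the alignment. */
      static bool offset_is_aligned(uint64_t offset, uint64_t align)
      {
      	/* align must be a non-zero power of two */
      	assert(align && (align & (align - 1)) == 0);
      	return (offset & (align - 1)) == 0;
      }
      ```

      The offset 0x8200000 is 65 times 2M, so it is correctly aligned, but
      it is not a power of two, which is why the inherited check rejected
      it.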
  21. 21 May, 2016 1 commit
  22. 10 May, 2016 3 commits
  23. 23 Apr, 2016 3 commits
  24. 08 Apr, 2016 1 commit
    • libnvdimm, pfn: fix uuid validation · e5670563
      Dan Williams committed
      If we detect a namespace has a stale info block in the init path, we
      should overwrite with the latest configuration.  In fact, we already
      return -ENODEV when the parent uuid is invalid, the same should be done
      for the 'self' uuid.  Otherwise we can get into a condition where
      userspace is unable to reconfigure the pfn-device without directly /
      manually invalidating the info block.
      
      Cc: <stable@vger.kernel.org>
      Reported-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  25. 06 Mar, 2016 2 commits
  26. 30 Jan, 2016 1 commit
  27. 16 Jan, 2016 1 commit
  28. 14 Dec, 2015 1 commit
  29. 13 Dec, 2015 1 commit
  30. 11 Dec, 2015 2 commits
  31. 17 Sep, 2015 1 commit
  32. 29 Aug, 2015 1 commit
    • libnvdimm, pmem: 'struct page' for pmem · 32ab0a3f
      Dan Williams committed
      Enable the pmem driver to handle PFN device instances.  Attaching a pmem
      namespace to a pfn device triggers the driver to allocate and initialize
      struct page entries for pmem.  Memory capacity for this allocation comes
      exclusively from RAM for now which is suitable for low PMEM to RAM
      ratios.  This mechanism will be expanded later for setting an "allocate
      from PMEM" policy.
      
      Cc: Boaz Harrosh <boaz@plexistor.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
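
      Why RAM-backed struct page entries only suit low PMEM to RAM ratios
      can be estimated with a little arithmetic. The sketch assumes 4K
      pages and a 64-byte struct page, which are typical for x86_64 but are
      not stated in the commit; the function name is hypothetical.

      ```c
      #include <assert.h>
      #include <stdint.h>

      #define PAGE_SIZE         4096ULL  /* assumed page size */
      #define STRUCT_PAGE_SIZE  64ULL    /* assumed sizeof(struct page) */

      /* RAM consumed by struct page entries covering 'pmem_bytes'. */
      static uint64_t memmap_ram_bytes(uint64_t pmem_bytes)
      {
      	assert(pmem_bytes % PAGE_SIZE == 0);
      	return pmem_bytes / PAGE_SIZE * STRUCT_PAGE_SIZE;
      }
      ```

      Under these assumptions the overhead is 1/64 of the mapped capacity,
      so 1TiB of pmem would need 16GiB of RAM for its memmap.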