1. 01 9月, 2017 2 次提交
    • R
      libnvdimm, nd_blk: remove mmio_flush_range() · 5deb67f7
      Robin Murphy 提交于
      mmio_flush_range() suffers from a lack of clearly-defined semantics,
      and is somewhat ambiguous to port to other architectures where the
      scope of the writeback implied by "flush" and ordering might matter,
      but MMIO would tend to imply non-cacheable anyway. Per the rationale
      in 67a3e8fe ("nd_blk: change aperture mapping from WC to WB"), the
      only existing use is actually to invalidate clean cache lines for
      ARCH_MEMREMAP_PMEM type mappings *without* writeback. Since the recent
      cleanup of the pmem API, that also now happens to be the exact purpose
      of arch_invalidate_pmem(), which would be a far more well-defined tool
      for the job.
      
      Rather than risk potentially inconsistent implementations of
      mmio_flush_range() for the sake of one callsite, streamline things by
      removing it entirely and instead move the ARCH_MEMREMAP_PMEM related
      definitions up to the libnvdimm level, so they can be shared by NFIT
      as well. This allows NFIT to be enabled for arm64.
      Signed-off-by: NRobin Murphy <robin.murphy@arm.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      5deb67f7
    • D
      libnvdimm, nfit: export an 'ecc_unit_size' sysfs attribute · a15797f4
      Dan Williams 提交于
      When the nfit driver initializes it runs an ARS (Address Range Scrub)
      operation across every pmem range. Part of that process involves
      determining the ARS capabilities of a given address range. One of the
      capabilities that is reported is the 'Clear Uncorrectable Error Range
      Length Unit Size' (see: ACPI 6.2 section 9.20.7.4 Function Index 1 -
      Query ARS Capabilities). This property is of interest to userspace
      software as it indicates the boundary at which the NVDIMM may need to
      perform read-modify-write cycles to maintain ECC blocks.
      
      Cc: Vishal Verma <vishal.l.verma@intel.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      a15797f4
  2. 08 8月, 2017 1 次提交
  3. 05 8月, 2017 1 次提交
  4. 18 7月, 2017 1 次提交
  5. 03 7月, 2017 1 次提交
  6. 01 7月, 2017 3 次提交
  7. 30 6月, 2017 1 次提交
  8. 28 6月, 2017 2 次提交
    • D
      libnvdimm, nfit: enable support for volatile ranges · c9e582aa
      Dan Williams 提交于
      Allow volatile nfit ranges to participate in all the same infrastructure
      provided for persistent memory regions. A resulting resulting namespace
      device will still be called "pmem", but the parent region type will be
      "nd_volatile". This is in preparation for disabling the dax ->flush()
      operation in the pmem driver when it is hosted on a volatile range.
      
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Matthew Wilcox <mawilcox@microsoft.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      c9e582aa
    • D
      x86, libnvdimm, pmem: remove global pmem api · ca6a4657
      Dan Williams 提交于
      Now that all callers of the pmem api have been converted to dax helpers that
      call back to the pmem driver, we can remove include/linux/pmem.h and
      asm/pmem.h.
      
      Cc: <x86@kernel.org>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Toshi Kani <toshi.kani@hpe.com>
      Cc: Oliver O'Halloran <oohall@gmail.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Reviewed-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      ca6a4657
  9. 16 6月, 2017 4 次提交
  10. 10 6月, 2017 1 次提交
    • D
      x86, uaccess: introduce copy_from_iter_flushcache for pmem / cache-bypass operations · 0aed55af
      Dan Williams 提交于
      The pmem driver has a need to transfer data with a persistent memory
      destination and be able to rely on the fact that the destination writes are not
      cached. It is sufficient for the writes to be flushed to a cpu-store-buffer
      (non-temporal / "movnt" in x86 terms), as we expect userspace to call fsync()
      to ensure data-writes have reached a power-fail-safe zone in the platform. The
      fsync() triggers a REQ_FUA or REQ_FLUSH to the pmem driver which will turn
      around and fence previous writes with an "sfence".
      
      Implement a __copy_from_user_inatomic_flushcache, memcpy_page_flushcache, and
      memcpy_flushcache, that guarantee that the destination buffer is not dirty in
      the cpu cache on completion. The new copy_from_iter_flushcache and sub-routines
      will be used to replace the "pmem api" (include/linux/pmem.h +
      arch/x86/include/asm/pmem.h). The availability of copy_from_iter_flushcache()
      and memcpy_flushcache() are gated by the CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
      config symbol, and fallback to copy_from_iter_nocache() and plain memcpy()
      otherwise.
      
      This is meant to satisfy the concern from Linus that if a driver wants to do
      something beyond the normal nocache semantics it should be something private to
      that driver [1], and Al's concern that anything uaccess related belongs with
      the rest of the uaccess code [2].
      
      The first consumer of this interface is a new 'copy_from_iter' dax operation so
      that pmem can inject cache maintenance operations without imposing this
      overhead on other dax-capable drivers.
      
      [1]: https://lists.01.org/pipermail/linux-nvdimm/2017-January/008364.html
      [2]: https://lists.01.org/pipermail/linux-nvdimm/2017-April/009942.html
      
      Cc: <x86@kernel.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Toshi Kani <toshi.kani@hpe.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Matthew Wilcox <mawilcox@microsoft.com>
      Reviewed-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      0aed55af
  11. 07 6月, 2017 1 次提交
  12. 06 6月, 2017 1 次提交
  13. 05 5月, 2017 2 次提交
  14. 29 4月, 2017 1 次提交
    • D
      acpi, nfit: kill ACPI_NFIT_DEBUG · 7699a6a3
      Dan Williams 提交于
      Inevitably when one actually needs to debug a DSM issue it's on a
      distribution kernel that has CONFIG_ACPI_NFIT_DEBUG=n. The config symbol
      was only there to avoid the compile error due to the missing fallback for
      print_hex_dump_debug in the CONFIG_DYNAMIC_DEBUG=n case. That was fixed
      with commit cdf17449 "hexdump: do not print debug dumps for
      !CONFIG_DEBUG", so the config symbol can just be dropped.
      
      Cc: Joe Perches <joe@perches.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      7699a6a3
  15. 26 4月, 2017 1 次提交
    • D
      x86, dax, pmem: remove indirection around memcpy_from_pmem() · 6abccd1b
      Dan Williams 提交于
      memcpy_from_pmem() maps directly to memcpy_mcsafe(). The wrapper
      serves no real benefit aside from affording a more generic function name
      than the x86-specific 'mcsafe'. However this would not be the first time
      that x86 terminology leaked into the global namespace. For lack of
      better name, just use memcpy_mcsafe() directly.
      
      This conversion also catches a place where we should have been using
      plain memcpy, acpi_nfit_blk_single_io().
      
      Cc: <x86@kernel.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Matthew Wilcox <mawilcox@microsoft.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Acked-by: NTony Luck <tony.luck@intel.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      6abccd1b
  16. 19 4月, 2017 1 次提交
  17. 18 4月, 2017 3 次提交
  18. 15 4月, 2017 1 次提交
  19. 13 4月, 2017 4 次提交
  20. 28 3月, 2017 1 次提交
  21. 01 3月, 2017 1 次提交
    • D
      nfit, libnvdimm: fix interleave set cookie calculation · 86ef58a4
      Dan Williams 提交于
      The interleave-set cookie is a sum that sanity checks the composition of
      an interleave set has not changed from when the namespace was initially
      created.  The checksum is calculated by sorting the DIMMs by their
      location in the interleave-set. The comparison for the sort must be
      64-bit wide, not byte-by-byte as performed by memcmp() in the broken
      case.
      
      Fix the implementation to accept correct cookie values in addition to
      the Linux "memcmp" order cookies, but only allow correct cookies to be
      generated going forward. It does mean that namespaces created by
      third-party-tooling, or created by newer kernels with this fix, will not
      validate on older kernels. However, there are a couple mitigating
      conditions:
      
          1/ platforms with namespace-label capable NVDIMMs are not widely
             available.
      
          2/ interleave-sets with a single-dimm are by definition not affected
             (nothing to sort). This covers the QEMU-KVM NVDIMM emulation case.
      
      The cookie stored in the namespace label will be fixed by any write the
      namespace label, the most straightforward way to achieve this is to
      write to the "alt_name" attribute of a namespace in sysfs.
      
      Cc: <stable@vger.kernel.org>
      Fixes: eaf96153 ("libnvdimm, nfit: add interleave-set state-tracking infrastructure")
      Reported-by: NNicholas Moulin <nicholas.w.moulin@linux.intel.com>
      Tested-by: NNicholas Moulin <nicholas.w.moulin@linux.intel.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      86ef58a4
  22. 04 2月, 2017 1 次提交
    • D
      acpi, nfit: fix acpi_nfit_flush_probe() crash · e471486c
      Dan Williams 提交于
      We queue an on-stack work item to 'nfit_wq' and wait for it to complete
      as part of a 'flush_probe' request. However, if the user cancels the
      wait we need to make sure the item is flushed from the queue otherwise
      we are leaving an out-of-scope stack address on the work list.
      
       BUG: unable to handle kernel paging request at ffffbcb3c72f7cd0
       IP: [<ffffffffa9413a7b>] __list_add+0x1b/0xb0
       [..]
       RIP: 0010:[<ffffffffa9413a7b>]  [<ffffffffa9413a7b>] __list_add+0x1b/0xb0
       RSP: 0018:ffffbcb3c7ba7c00  EFLAGS: 00010046
       [..]
       Call Trace:
        [<ffffffffa90bb11a>] insert_work+0x3a/0xc0
        [<ffffffffa927fdda>] ? seq_open+0x5a/0xa0
        [<ffffffffa90bb30a>] __queue_work+0x16a/0x460
        [<ffffffffa90bbb08>] queue_work_on+0x38/0x40
        [<ffffffffc0cf2685>] acpi_nfit_flush_probe+0x95/0xc0 [nfit]
        [<ffffffffc0cf25d0>] ? nfit_visible+0x40/0x40 [nfit]
        [<ffffffffa9571495>] wait_probe_show+0x25/0x60
        [<ffffffffa9546b30>] dev_attr_show+0x20/0x50
      
      Fixes: 7ae0fa43 ("nfit, libnvdimm: async region scrub workqueue")
      Cc: <stable@vger.kernel.org>
      Reviewed-by: NVishal Verma <vishal.l.verma@intel.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      e471486c
  23. 21 12月, 2016 1 次提交
    • L
      ACPI / osl: Remove acpi_get_table_with_size()/early_acpi_os_unmap_memory() users · 6b11d1d6
      Lv Zheng 提交于
      This patch removes the users of the deprectated APIs:
       acpi_get_table_with_size()
       early_acpi_os_unmap_memory()
      The following APIs should be used instead of:
       acpi_get_table()
       acpi_put_table()
      
      The deprecated APIs are invented to be a replacement of acpi_get_table()
      during the early stage so that the early mapped pointer will not be stored
      in ACPICA core and thus the late stage acpi_get_table() won't return a
      wrong pointer. The mapping size is returned just because it is required by
      early_acpi_os_unmap_memory() to unmap the pointer during early stage.
      
      But as the mapping size equals to the acpi_table_header.length
      (see acpi_tb_init_table_descriptor() and acpi_tb_validate_table()), when
      such a convenient result is returned, driver code will start to use it
      instead of accessing acpi_table_header to obtain the length.
      
      Thus this patch cleans up the drivers by replacing returned table size with
      acpi_table_header.length, and should be a no-op.
      Reported-by: NDan Williams <dan.j.williams@intel.com>
      Signed-off-by: NLv Zheng <lv.zheng@intel.com>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      6b11d1d6
  24. 07 12月, 2016 4 次提交
    • D
      tools/testing/nvdimm: unit test acpi_nfit_ctl() · a7de92da
      Dan Williams 提交于
      A recent flurry of bug discoveries in the nfit driver's DSM marshalling
      routine has highlighted the fact that we do not have unit test coverage
      for this routine. Add a self-test of acpi_nfit_ctl() routine before
      probing the "nfit_test.0" device. This mocks stimulus to acpi_nfit_ctl()
      and if any of the tests fail "nfit_test.0" will be unavailable causing
      the rest of the tests to not run / fail.
      
      This unit test will also be a place to land reproductions of quirky BIOS
      behavior discovered in the field and ensure the kernel does not regress
      against implementations it has seen in practice.
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      a7de92da
    • D
      acpi, nfit: fix bus vs dimm confusion in xlat_status · d6eb270c
      Dan Williams 提交于
      Given dimms and bus commands share the same command number space we need
      to be careful that we are translating status in the correct context.
      Otherwise we can, for example, fail an ND_CMD_GET_CONFIG_SIZE command
      because max_xfer is zero. It fails because that condition erroneously
      correlates with the 'cleared == 0' failure of ND_CMD_CLEAR_ERROR.
      
      Cc: <stable@vger.kernel.org>
      Fixes: aef25338 ("libnvdimm, nfit: centralize command status translation")
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      d6eb270c
    • D
      acpi, nfit: validate ars_status output buffer size · 82aa37cf
      Dan Williams 提交于
      If an ARS Status command returns truncated output, do not process
      partial records or otherwise consume non-status fields.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 0caeef63 ("libnvdimm: Add a poison list and export badblocks")
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      82aa37cf
    • D
      acpi, nfit, libnvdimm: fix / harden ars_status output length handling · efda1b5d
      Dan Williams 提交于
      Given ambiguities in the ACPI 6.1 definition of the "Output (Size)"
      field of the ARS (Address Range Scrub) Status command, a firmware
      implementation may in practice return 0, 4, or 8 to indicate that there
      is no output payload to process.
      
      The specification states "Size of Output Buffer in bytes, including this
      field.". However, 'Output Buffer' is also the name of the entire
      payload, and earlier in the specification it states "Max Query ARS
      Status Output Buffer Size: Maximum size of buffer (including the Status
      and Extended Status fields)".
      
      Without this fix if the BIOS happens to return 0 it causes memory
      corruption as evidenced by this result from the acpi_nfit_ctl() unit
      test.
      
       ars_status00000000: 00020000 00000000                    ........
       BUG: stack guard page was hit at ffffc90001750000 (stack is ffffc9000174c000..ffffc9000174ffff)
       kernel stack overflow (page fault): 0000 [#1] SMP DEBUG_PAGEALLOC
       task: ffff8803332d2ec0 task.stack: ffffc9000174c000
       RIP: 0010:[<ffffffff814cfe72>]  [<ffffffff814cfe72>] __memcpy+0x12/0x20
       RSP: 0018:ffffc9000174f9a8  EFLAGS: 00010246
       RAX: ffffc9000174fab8 RBX: 0000000000000000 RCX: 000000001fffff56
       RDX: 0000000000000000 RSI: ffff8803231f5a08 RDI: ffffc90001750000
       RBP: ffffc9000174fa88 R08: ffffc9000174fab0 R09: ffff8803231f54b8
       R10: 0000000000000008 R11: 0000000000000001 R12: 0000000000000000
       R13: 0000000000000000 R14: 0000000000000003 R15: ffff8803231f54a0
       FS:  00007f3a611af640(0000) GS:ffff88033ed00000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: ffffc90001750000 CR3: 0000000325b20000 CR4: 00000000000406e0
       Stack:
        ffffffffa00bc60d 0000000000000008 ffffc90000000001 ffffc9000174faac
        0000000000000292 ffffffffa00c24e4 ffffffffa00c2914 0000000000000000
        0000000000000000 ffffffff00000003 ffff880331ae8ad0 0000000800000246
       Call Trace:
        [<ffffffffa00bc60d>] ? acpi_nfit_ctl+0x49d/0x750 [nfit]
        [<ffffffffa01f4fe0>] nfit_test_probe+0x670/0xb1b [nfit_test]
      
      Cc: <stable@vger.kernel.org>
      Fixes: 747ffe11 ("libnvdimm, tools/testing/nvdimm: fix 'ars_status' output buffer sizing")
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      efda1b5d