- 30 4月, 2020 5 次提交
-
-
由 Aneesh Kumar K.V 提交于
fix #27138800 commit 4c806b897d6075bfa5067e524fb058c57ab64e7b upstream. Some environments want to use a host tmpfs/ramdisk to back guest pmem. While the data is not persisted relative to the host it *is* persisted relative to guest crashes / reboots. The guest is free to use dax and MAP_SYNC to keep filesystem metadata consistent with dax accesses without requiring guest fsync(). The guest can also observe that the region is volatile and skip cache flushing as global visibility is enough to "persist" data relative to the host staying alive over guest reset events. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Reviewed-by: NPankaj Gupta <pagupta@redhat.com> Link: https://lore.kernel.org/r/20190924114327.14700-1-aneesh.kumar@linux.ibm.com [djbw: reword the changelog] Signed-off-by: NDan Williams <dan.j.williams@intel.com> Reviewed-by: NYang Shi <yang.shi@linux.alibaba.com> Signed-off-by: NXunlei Pang <xlpang@linux.alibaba.com>
-
由 Pankaj Gupta 提交于
fix #27138800 commit 8c2e408e73f735d2e6e8b43f9b038c9abb082939 upstream. This patch fixes below sparse warning related to __virtio type in virtio pmem driver. This is reported by Intel test bot on linux-next tree. nd_virtio.c:56:28: warning: incorrect type in assignment (different base types) nd_virtio.c:56:28: expected unsigned int [unsigned] [usertype] type nd_virtio.c:56:28: got restricted __virtio32 nd_virtio.c:93:59: warning: incorrect type in argument 2 (different base types) nd_virtio.c:93:59: expected restricted __virtio32 [usertype] val nd_virtio.c:93:59: got unsigned int [unsigned] [usertype] ret Reported-by: Nkbuild test robot <lkp@intel.com> Signed-off-by: NPankaj Gupta <pagupta@redhat.com> Acked-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com> Reviewed-by: NYang Shi <yang.shi@linux.alibaba.com> Signed-off-by: NXunlei Pang <xlpang@linux.alibaba.com>
-
由 Pankaj Gupta 提交于
fix #27138800 commit fefc1d97fa4b5e016bbe15447dc3edcd9e1bcb9f upstream. This patch adds 'DAXDEV_SYNC' flag which is set for nd_region doing synchronous flush. This later is used to disable MAP_SYNC functionality for ext4 & xfs filesystem for devices don't support synchronous flush. Signed-off-by: NPankaj Gupta <pagupta@redhat.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com> Reviewed-by: NYang Shi <yang.shi@linux.alibaba.com> Signed-off-by: NXunlei Pang <xlpang@linux.alibaba.com>
-
由 Pankaj Gupta 提交于
fix #27138800 commit 6e84200c0a2994b991259d19450eee561029bf70 upstream. This patch adds virtio-pmem driver for KVM guest. Guest reads the persistent memory range information from Qemu over VIRTIO and registers it on nvdimm_bus. It also creates a nd_region object with the persistent memory range information so that existing 'nvdimm/pmem' driver can reserve this into system memory map. This way 'virtio-pmem' driver uses existing functionality of pmem driver to register persistent memory compatible for DAX capable filesystems. This also provides function to perform guest flush over VIRTIO from 'pmem' driver when userspace performs flush on DAX memory range. Signed-off-by: NPankaj Gupta <pagupta@redhat.com> Reviewed-by: NYuval Shaia <yuval.shaia@oracle.com> Acked-by: NMichael S. Tsirkin <mst@redhat.com> Acked-by: NJakub Staron <jstaron@google.com> Tested-by: NJakub Staron <jstaron@google.com> Reviewed-by: NCornelia Huck <cohuck@redhat.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com> Reviewed-by: NYang Shi <yang.shi@linux.alibaba.com> Signed-off-by: NXunlei Pang <xlpang@linux.alibaba.com>
-
由 Pankaj Gupta 提交于
fix #27138800 commit c5d4355d10d414a96ca870b731756b89d068d57a upstream. This patch adds functionality to perform flush from guest to host over VIRTIO. We are registering a callback based on 'nd_region' type. virtio_pmem driver requires this special flush function. For rest of the region types we are registering existing flush function. Report error returned by host fsync failure to userspace. Signed-off-by: NPankaj Gupta <pagupta@redhat.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com> Reviewed-by: NYang Shi <yang.shi@linux.alibaba.com> Signed-off-by: NXunlei Pang <xlpang@linux.alibaba.com>
-
- 15 1月, 2020 1 次提交
-
-
由 Dan Williams 提交于
commit 8fc5c73554db0ac18c0c6ac5b2099ab917f83bdf upstream Persistent memory, as described by the ACPI NFIT (NVDIMM Firmware Interface Table), is the first known instance of a memory range described by a unique "target" proximity domain. Where "initiator" and "target" proximity domains is an approach that the ACPI HMAT (Heterogeneous Memory Attributes Table) uses to described the unique performance properties of a memory range relative to a given initiator (e.g. CPU or DMA device). Currently the numa-node for a /dev/pmemX block-device or /dev/daxX.Y char-device follows the traditional notion of 'numa-node' where the attribute conveys the closest online numa-node. That numa-node attribute is useful for cpu-binding and memory-binding processes *near* the device. However, when the memory range backing a 'pmem', or 'dax' device is onlined (memory hot-add) the memory-only-numa-node representing that address needs to be differentiated from the set of online nodes. In other words, the numa-node association of the device depends on whether you can bind processes *near* the cpu-numa-node in the offline device-case, or bind process *on* the memory-range directly after the backing address range is onlined. Allow for the case that platform firmware describes persistent memory with a unique proximity domain, i.e. when it is distinct from the proximity of DRAM and CPUs that are on the same socket. Plumb the Linux numa-node translation of that proximity through the libnvdimm region device to namespaces that are in device-dax mode. With this in place the proposed kmem driver [1] can optionally discover a unique numa-node number for the address range as it transitions the memory from an offline state managed by a device-driver to an online memory range managed by the core-mm. [1]: https://lore.kernel.org/lkml/20181022201317.8558C1D8@viggo.jf.intel.comReported-by: NFan Du <fan.du@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Oliver O'Halloran" <oohall@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Jérôme Glisse <jglisse@redhat.com> Reviewed-by: NYang Shi <yang.shi@linux.alibaba.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com> [yshi: Removed PowerPC stuff which is not applicable 4.19] Signed-off-by: NYang Shi <yang.shi@linux.alibaba.com> Reviewed-by: NGavin Shan <shan.gavin@linux.alibaba.com>
-
- 12 10月, 2019 1 次提交
-
-
由 Aneesh Kumar K.V 提交于
[ Upstream commit c42adf87e4e7ed77f6ffe288dc90f980d07d68df ] We do check for a bad block during namespace init and that use region bad block list. We need to initialize the bad block for volatile regions for this to work. We also observe a lockdep warning as below because the lock is not initialized correctly since we skip bad block init for volatile regions. INFO: trying to register non-static key. the code is fine but needs lockdep annotation. turning off the locking correctness validator. CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.3.0-rc1-15699-g3dee241c937e #149 Call Trace: [c0000000f95cb250] [c00000000147dd84] dump_stack+0xe8/0x164 (unreliable) [c0000000f95cb2a0] [c00000000022ccd8] register_lock_class+0x308/0xa60 [c0000000f95cb3a0] [c000000000229cc0] __lock_acquire+0x170/0x1ff0 [c0000000f95cb4c0] [c00000000022c740] lock_acquire+0x220/0x270 [c0000000f95cb580] [c000000000a93230] badblocks_check+0xc0/0x290 [c0000000f95cb5f0] [c000000000d97540] nd_pfn_validate+0x5c0/0x7f0 [c0000000f95cb6d0] [c000000000d98300] nd_dax_probe+0xd0/0x1f0 [c0000000f95cb760] [c000000000d9b66c] nd_pmem_probe+0x10c/0x160 [c0000000f95cb790] [c000000000d7f5ec] nvdimm_bus_probe+0x10c/0x240 [c0000000f95cb820] [c000000000d0f844] really_probe+0x254/0x4e0 [c0000000f95cb8b0] [c000000000d0fdfc] driver_probe_device+0x16c/0x1e0 [c0000000f95cb930] [c000000000d10238] device_driver_attach+0x68/0xa0 [c0000000f95cb970] [c000000000d1040c] __driver_attach+0x19c/0x1c0 [c0000000f95cb9f0] [c000000000d0c4c4] bus_for_each_dev+0x94/0x130 [c0000000f95cba50] [c000000000d0f014] driver_attach+0x34/0x50 [c0000000f95cba70] [c000000000d0e208] bus_add_driver+0x178/0x2f0 [c0000000f95cbb00] [c000000000d117c8] driver_register+0x108/0x170 [c0000000f95cbb70] [c000000000d7edb0] __nd_driver_register+0xe0/0x100 [c0000000f95cbbd0] [c000000001a6baa4] nd_pmem_driver_init+0x34/0x48 [c0000000f95cbbf0] [c0000000000106f4] do_one_initcall+0x1d4/0x4b0 [c0000000f95cbcd0] [c0000000019f499c] kernel_init_freeable+0x544/0x65c [c0000000f95cbdb0] [c000000000010d6c] kernel_init+0x2c/0x180 [c0000000f95cbe20] [c00000000000b954] ret_from_kernel_thread+0x5c/0x68 Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Link: https://lore.kernel.org/r/20190919083355.26340-1-aneesh.kumar@linux.ibm.comSigned-off-by: NDan Williams <dan.j.williams@intel.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
- 09 8月, 2019 4 次提交
-
-
由 Dan Williams 提交于
commit ca6bf264f6d856f959c4239cda1047b587745c67 upstream. A multithreaded namespace creation/destruction stress test currently deadlocks with the following lockup signature: INFO: task ndctl:2924 blocked for more than 122 seconds. Tainted: G OE 5.2.0-rc4+ #3382 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. ndctl D 0 2924 1176 0x00000000 Call Trace: ? __schedule+0x27e/0x780 schedule+0x30/0xb0 wait_nvdimm_bus_probe_idle+0x8a/0xd0 [libnvdimm] ? finish_wait+0x80/0x80 uuid_store+0xe6/0x2e0 [libnvdimm] kernfs_fop_write+0xf0/0x1a0 vfs_write+0xb7/0x1b0 ksys_write+0x5c/0xd0 do_syscall_64+0x60/0x240 INFO: task ndctl:2923 blocked for more than 122 seconds. Tainted: G OE 5.2.0-rc4+ #3382 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. ndctl D 0 2923 1175 0x00000000 Call Trace: ? __schedule+0x27e/0x780 ? __mutex_lock+0x489/0x910 schedule+0x30/0xb0 schedule_preempt_disabled+0x11/0x20 __mutex_lock+0x48e/0x910 ? nvdimm_namespace_common_probe+0x95/0x4d0 [libnvdimm] ? __lock_acquire+0x23f/0x1710 ? nvdimm_namespace_common_probe+0x95/0x4d0 [libnvdimm] nvdimm_namespace_common_probe+0x95/0x4d0 [libnvdimm] __dax_pmem_probe+0x5e/0x210 [dax_pmem_core] ? nvdimm_bus_probe+0x1d0/0x2c0 [libnvdimm] dax_pmem_probe+0xc/0x20 [dax_pmem] nvdimm_bus_probe+0x90/0x2c0 [libnvdimm] really_probe+0xef/0x390 driver_probe_device+0xb4/0x100 In this sequence an 'nd_dax' device is being probed and trying to take the lock on its backing namespace to validate that the 'nd_dax' device indeed has exclusive access to the backing namespace. Meanwhile, another thread is trying to update the uuid property of that same backing namespace. So one thread is in the probe path trying to acquire the lock, and the other thread has acquired the lock and tries to flush the probe path. Fix this deadlock by not holding the namespace device_lock over the wait_nvdimm_bus_probe_idle() synchronization step. In turn this requires the device_lock to be held on entry to wait_nvdimm_bus_probe_idle() and subsequently dropped internally to wait_nvdimm_bus_probe_idle(). Cc: <stable@vger.kernel.org> Fixes: bf9bccc1 ("libnvdimm: pmem label sets and namespace instantiation") Cc: Vishal Verma <vishal.l.verma@intel.com> Tested-by: NJane Chu <jane.chu@oracle.com> Link: https://lore.kernel.org/r/156341210094.292348.2384694131126767789.stgit@dwillia2-desk3.amr.corp.intel.comSigned-off-by: NDan Williams <dan.j.williams@intel.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Dan Williams 提交于
commit 6de5d06e657acdbcf9637dac37916a4a5309e0f4 upstream. In preparation for not holding a lock over the execution of nd_ioctl(), update the implementation to allow multiple threads to be attempting ioctls at the same time. The bus lock still prevents multiple in-flight ->ndctl() invocations from corrupting each other's state, but static global staging buffers are moved to the heap. Reported-by: NVishal Verma <vishal.l.verma@intel.com> Reviewed-by: NVishal Verma <vishal.l.verma@intel.com> Tested-by: NVishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/156341208947.292348.10560140326807607481.stgit@dwillia2-desk3.amr.corp.intel.comSigned-off-by: NDan Williams <dan.j.williams@intel.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Dan Williams 提交于
commit 700cd033a82d466ad8f9615f9985525e45f8960a upstream. Namespace activation expects to be able to reference region badblocks. The following warning sometimes triggers when asynchronous namespace activation races in front of the completion of namespace probing. Move all possible namespace probing after region badblocks initialization. Otherwise, lockdep sometimes catches the uninitialized state of the badblocks seqlock with stack trace signatures like: INFO: trying to register non-static key. pmem2: detected capacity change from 0 to 136365211648 the code is fine but needs lockdep annotation. turning off the locking correctness validator. CPU: 9 PID: 358 Comm: kworker/u80:5 Tainted: G OE 5.2.0-rc4+ #3382 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015 Workqueue: events_unbound async_run_entry_fn Call Trace: dump_stack+0x85/0xc0 pmem1.12: detected capacity change from 0 to 8589934592 register_lock_class+0x56a/0x570 ? check_object+0x140/0x270 __lock_acquire+0x80/0x1710 ? __mutex_lock+0x39d/0x910 lock_acquire+0x9e/0x180 ? nd_pfn_validate+0x28f/0x440 [libnvdimm] badblocks_check+0x93/0x1f0 ? nd_pfn_validate+0x28f/0x440 [libnvdimm] nd_pfn_validate+0x28f/0x440 [libnvdimm] ? lockdep_hardirqs_on+0xf0/0x180 nd_dax_probe+0x9a/0x120 [libnvdimm] nd_pmem_probe+0x6d/0x180 [nd_pmem] nvdimm_bus_probe+0x90/0x2c0 [libnvdimm] Fixes: 48af2f7e52f4 ("libnvdimm, pfn: during init, clear errors...") Cc: <stable@vger.kernel.org> Cc: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: NVishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/156341208365.292348.1547528796026249120.stgit@dwillia2-desk3.amr.corp.intel.comSigned-off-by: NDan Williams <dan.j.williams@intel.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Dan Williams 提交于
commit 8aac0e2338916e273ccbd438a2b7a1e8c61749f5 upstream. A multithreaded namespace creation/destruction stress test currently fails with signatures like the following: sysfs group 'power' not found for kobject 'dax1.1' RIP: 0010:sysfs_remove_group+0x76/0x80 Call Trace: device_del+0x73/0x370 device_unregister+0x16/0x50 nd_async_device_unregister+0x1e/0x30 [libnvdimm] async_run_entry_fn+0x39/0x160 process_one_work+0x23c/0x5e0 worker_thread+0x3c/0x390 BUG: kernel NULL pointer dereference, address: 0000000000000020 RIP: 0010:klist_put+0x1b/0x6c Call Trace: klist_del+0xe/0x10 device_del+0x8a/0x2c9 ? __switch_to_asm+0x34/0x70 ? __switch_to_asm+0x40/0x70 device_unregister+0x44/0x4f nd_async_device_unregister+0x22/0x2d [libnvdimm] async_run_entry_fn+0x47/0x15a process_one_work+0x1a2/0x2eb worker_thread+0x1b8/0x26e Use the kill_device() helper to atomically resolve the race of multiple threads issuing kill, device_unregister(), requests. Reported-by: NJane Chu <jane.chu@oracle.com> Reported-by: NErwin Tsaur <erwin.tsaur@oracle.com> Fixes: 4d88a97a ("libnvdimm, nvdimm: dimm driver and base libnvdimm device-driver...") Cc: <stable@vger.kernel.org> Link: https://github.com/pmem/ndctl/issues/96Tested-by: NTested-by: Jane Chu <jane.chu@oracle.com> Link: https://lore.kernel.org/r/156341207846.292348.10435719262819764054.stgit@dwillia2-desk3.amr.corp.intel.comSigned-off-by: NDan Williams <dan.j.williams@intel.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
- 31 7月, 2019 1 次提交
-
-
由 Dan Williams 提交于
commit b70d31d054ee3a6fc1034b9d7fc0ae1e481aa018 upstream. In preparation for fixing a deadlock between wait_for_bus_probe_idle() and the nvdimm_bus_list_mutex arrange for __nd_ioctl() without nvdimm_bus_list_mutex held. This also unifies the 'dimm' and 'bus' level ioctls into a common nd_ioctl() preamble implementation. Marked for -stable as it is a pre-requisite for a follow-on fix. Cc: <stable@vger.kernel.org> Fixes: bf9bccc1 ("libnvdimm: pmem label sets and namespace instantiation") Cc: Vishal Verma <vishal.l.verma@intel.com> Tested-by: NJane Chu <jane.chu@oracle.com> Link: https://lore.kernel.org/r/156341209518.292348.7183897251740665198.stgit@dwillia2-desk3.amr.corp.intel.comSigned-off-by: NDan Williams <dan.j.williams@intel.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 26 7月, 2019 1 次提交
-
-
由 Dan Williams 提交于
commit 7e3e888dfc138089f4c15a81b418e88f0978f744 upstream. At namespace creation time there is the potential for the "expected to be zero" fields of a 'pfn' info-block to be filled with indeterminate data. While the kernel buffer is zeroed on allocation it is immediately overwritten by nd_pfn_validate() filling it with the current contents of the on-media info-block location. For fields like, 'flags' and the 'padding' it potentially means that future implementations can not rely on those fields being zero. In preparation to stop using the 'start_pad' and 'end_trunc' fields for section alignment, arrange for fields that are not explicitly initialized to be guaranteed zero. Bump the minor version to indicate it is safe to assume the 'padding' and 'flags' are zero. Otherwise, this corruption is expected to benign since all other critical fields are explicitly initialized. Note The cc: stable is about spreading this new policy to as many kernels as possible not fixing an issue in those kernels. It is not until the change titled "libnvdimm/pfn: Stop padding pmem namespaces to section alignment" where this improper initialization becomes a problem. So if someone decides to backport "libnvdimm/pfn: Stop padding pmem namespaces to section alignment" (which is not tagged for stable), make sure this pre-requisite is flagged. Link: http://lkml.kernel.org/r/156092356065.979959.6681003754765958296.stgit@dwillia2-desk3.amr.corp.intel.com Fixes: 32ab0a3f ("libnvdimm, pmem: 'struct page' for pmem") Signed-off-by: NDan Williams <dan.j.williams@intel.com> Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> [ppc64] Cc: <stable@vger.kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: Jane Chu <jane.chu@oracle.com> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Logan Gunthorpe <logang@deltatee.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Toshi Kani <toshi.kani@hpe.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Yang <richardw.yang@linux.intel.com> Cc: Jason Gunthorpe <jgg@mellanox.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 19 6月, 2019 1 次提交
-
-
由 Qian Cai 提交于
[ Upstream commit c01dafad77fea8d64c4fdca0a6031c980842ad65 ] Several places (dimm_devs.c, core.c etc) include label.h but only label.c uses NSINDEX_SIGNATURE, so move its definition to label.c instead. In file included from drivers/nvdimm/dimm_devs.c:23: drivers/nvdimm/label.h:41:19: warning: 'NSINDEX_SIGNATURE' defined but not used [-Wunused-const-variable=] Also, some places abuse "/**" which is only reserved for the kernel-doc. drivers/nvdimm/bus.c:648: warning: cannot understand function prototype: 'struct attribute_group nd_device_attribute_group = ' drivers/nvdimm/bus.c:677: warning: cannot understand function prototype: 'struct attribute_group nd_numa_attribute_group = ' Those are just some member assignments for the "struct attribute_group" instances and it can't be expressed in the kernel-doc. Reviewed-by: NVishal Verma <vishal.l.verma@intel.com> Signed-off-by: NQian Cai <cai@lca.pw> Signed-off-by: NDan Williams <dan.j.williams@intel.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
- 31 5月, 2019 1 次提交
-
-
由 Dan Williams 提交于
commit 52f476a323f9efc959be1c890d0cdcf12e1582e0 upstream. Jeff discovered that performance improves from ~375K iops to ~519K iops on a simple psync-write fio workload when moving the location of 'struct page' from the default PMEM location to DRAM. This result is surprising because the expectation is that 'struct page' for dax is only needed for third party references to dax mappings. For example, a dax-mapped buffer passed to another system call for direct-I/O requires 'struct page' for sending the request down the driver stack and pinning the page. There is no usage of 'struct page' for first party access to a file via read(2)/write(2) and friends. However, this "no page needed" expectation is violated by CONFIG_HARDENED_USERCOPY and the check_copy_size() performed in copy_from_iter_full_nocache() and copy_to_iter_mcsafe(). The check_heap_object() helper routine assumes the buffer is backed by a slab allocator (DRAM) page and applies some checks. Those checks are invalid, dax pages do not originate from the slab, and redundant, dax_iomap_actor() has already validated that the I/O is within bounds. Specifically that routine validates that the logical file offset is within bounds of the file, then it does a sector-to-pfn translation which validates that the physical mapping is within bounds of the block device. Bypass additional hardened usercopy overhead and call the 'no check' versions of the copy_{to,from}_iter operations directly. Fixes: 0aed55af ("x86, uaccess: introduce copy_from_iter_flushcache...") Cc: <stable@vger.kernel.org> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Matthew Wilcox <willy@infradead.org> Reported-and-tested-by: NJeff Smits <jeff.smits@intel.com> Acked-by: NKees Cook <keescook@chromium.org> Acked-by: NJan Kara <jack@suse.cz> Signed-off-by: NDan Williams <dan.j.williams@intel.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 22 5月, 2019 1 次提交
-
-
由 Dan Williams 提交于
commit c4703ce11c23423d4b46e3d59aef7979814fd608 upstream. Users have reported intermittent occurrences of DIMM initialization failures due to duplicate allocations of address capacity detected in the labels, or errors of the form below, both have the same root cause. nd namespace1.4: failed to track label: 0 WARNING: CPU: 17 PID: 1381 at drivers/nvdimm/label.c:863 RIP: 0010:__pmem_label_update+0x56c/0x590 [libnvdimm] Call Trace: ? nd_pmem_namespace_label_update+0xd6/0x160 [libnvdimm] nd_pmem_namespace_label_update+0xd6/0x160 [libnvdimm] uuid_store+0x17e/0x190 [libnvdimm] kernfs_fop_write+0xf0/0x1a0 vfs_write+0xb7/0x1b0 ksys_write+0x57/0xd0 do_syscall_64+0x60/0x210 Unfortunately those reports were typically with a busy parallel namespace creation / destruction loop making it difficult to see the components of the bug. However, Jane provided a simple reproducer using the work-in-progress sub-section implementation. When ndctl is reconfiguring a namespace it may take an existing defunct / disabled namespace and reconfigure it with a new uuid and other parameters. Critically namespace_update_uuid() takes existing address resources and renames them for the new namespace to use / reconfigure as it sees fit. The bug is that this rename only happens in the resource tracking tree. Existing labels with the old uuid are not reaped leading to a scenario where multiple active labels reference the same span of address range. Teach namespace_update_uuid() to flag any references to the old uuid for reaping at the next label update attempt. Cc: <stable@vger.kernel.org> Fixes: bf9bccc1 ("libnvdimm: pmem label sets and namespace instantiation") Link: https://github.com/pmem/ndctl/issues/91Reported-by: NJane Chu <jane.chu@oracle.com> Reported-by: NJeff Moyer <jmoyer@redhat.com> Reported-by: NErwin Tsaur <erwin.tsaur@oracle.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NDan Williams <dan.j.williams@intel.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 17 5月, 2019 3 次提交
-
-
由 Li RongQing 提交于
[ Upstream commit 9dc6488e84b0f64df17672271664752488cd6a25 ] If offset is not zero and length is bigger than PAGE_SIZE, this will cause to out of boundary access to a page memory Fixes: 98cc093c ("block, THP: make block_device_operations.rw_page support THP") Co-developed-by: NLiang ZhiCheng <liangzhicheng@baidu.com> Signed-off-by: NLiang ZhiCheng <liangzhicheng@baidu.com> Signed-off-by: NLi RongQing <lirongqing@baidu.com> Reviewed-by: NIra Weiny <ira.weiny@intel.com> Reviewed-by: NJeff Moyer <jmoyer@redhat.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Aditya Pakki 提交于
[ Upstream commit 486fa92df4707b5df58d6508728bdb9321a59766 ] In case kmemdup fails, the fix releases resources and returns to avoid the NULL pointer dereference. Signed-off-by: NAditya Pakki <pakki001@umn.edu> Signed-off-by: NDan Williams <dan.j.williams@intel.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Kangjie Lu 提交于
[ Upstream commit 55c1fc0af29a6c1b92f217b7eb7581a882e0c07c ] In case kmemdup fails, the fix goes to blk_err to avoid NULL pointer dereference. Signed-off-by: NKangjie Lu <kjlu@umn.edu> Signed-off-by: NDan Williams <dan.j.williams@intel.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
- 24 3月, 2019 4 次提交
-
-
由 Oliver O'Halloran 提交于
commit 07464e88365e9236febaca9ed1a2e2006d8bc952 upstream. Libnvdimm reserves the first 8K of pfn and devicedax namespaces to store a superblock describing the namespace. This 8K reservation is contained within the altmap area which the kernel uses for the vmemmap backing for the pages within the namespace. The altmap allows for some pages at the start of the altmap area to be reserved and that mechanism is used to protect the superblock from being re-used as vmemmap backing. The number of PFNs to reserve is calculated using: PHYS_PFN(SZ_8K) Which is implemented as: #define PHYS_PFN(x) ((unsigned long)((x) >> PAGE_SHIFT)) So on systems where PAGE_SIZE is greater than 8K the reservation size is truncated to zero and the superblock area is re-used as vmemmap backing. As a result all the namespace information stored in the superblock (i.e. if it's a PFN or DAX namespace) is lost and the namespace needs to be re-created to get access to the contents. This patch fixes this by using PFN_UP() rather than PHYS_PFN() to ensure that at least one page is reserved. On systems with a 4K pages size this patch should have no effect. Cc: stable@vger.kernel.org Cc: Dan Williams <dan.j.williams@intel.com> Fixes: ac515c08 ("libnvdimm, pmem, pfn: move pfn setup to the core") Signed-off-by: NOliver O'Halloran <oohall@gmail.com> Reviewed-by: NVishal Verma <vishal.l.verma@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Dan Williams 提交于
commit fa7d2e639cd90442d868dfc6ca1d4cc9d8bf206e upstream. For recovery, where non-dax access is needed to a given physical address range, and testing, allow the 'force_raw' attribute to override the default establishment of a dev_pagemap. Otherwise without this capability it is possible to end up with a namespace that can not be activated due to corrupted info-block, and one that can not be repaired due to a section collision. Cc: <stable@vger.kernel.org> Fixes: 004f1afb ("libnvdimm, pmem: direct map legacy pmem by default") Signed-off-by: NDan Williams <dan.j.williams@intel.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Wei Yang 提交于
commit f101ada7da6551127d192c2f1742c1e9e0f62799 upstream. When trying to see whether current nd_region intersects with others, trim_pfn_device() has already calculated the *size* to be expanded to SECTION size. Do not double append 'adjust' to 'size' when calculating whether the end of a region collides with the next pmem region. Fixes: ae86cbfef381 "libnvdimm, pfn: Pad pfn namespaces relative to other regions" Cc: <stable@vger.kernel.org> Signed-off-by: NWei Yang <richardw.yang@linux.intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Dan Williams 提交于
commit 966d23a006ca7b44ac8cf4d0c96b19785e0c3da0 upstream. The UEFI 2.7 specification sets expectations that the 'updating' flag is eventually cleared. To date, the libnvdimm core has never adhered to that protocol. The policy of the core matches the policy of other multi-device info-block formats like MD-Software-RAID that expect administrator intervention on inconsistent info-blocks, not automatic invalidation. However, some pre-boot environments may unfortunately attempt to "clean up" the labels and invalidate a set when it fails to find at least one "non-updating" label in the set. Clear the updating flag after set updates to minimize the window of vulnerability to aggressive pre-boot environments. Ideally implementations would not write to the label area outside of creating namespaces. Note that this only minimizes the window, it does not close it as the system can still crash while clearing the flag and the set can be subsequently deleted / invalidated by the pre-boot environment. Fixes: f524bf27 ("libnvdimm: write pmem label set") Cc: <stable@vger.kernel.org> Cc: Kelly Couch <kelly.j.couch@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 13 1月, 2019 1 次提交
-
-
由 Dan Williams 提交于
commit a95c90f1e2c253b280385ecf3d4ebfe476926b28 upstream. The last step before devm_memremap_pages() returns success is to allocate a release action, devm_memremap_pages_release(), to tear the entire setup down. However, the result from devm_add_action() is not checked. Checking the error from devm_add_action() is not enough. The api currently relies on the fact that the percpu_ref it is using is killed by the time the devm_memremap_pages_release() is run. Rather than continue this awkward situation, offload the responsibility of killing the percpu_ref to devm_memremap_pages_release() directly. This allows devm_memremap_pages() to do the right thing relative to init failures and shutdown. Without this change we could fail to register the teardown of devm_memremap_pages(). The likelihood of hitting this failure is tiny as small memory allocations almost always succeed. However, the impact of the failure is large given any future reconfiguration, or disable/enable, of an nvdimm namespace will fail forever as subsequent calls to devm_memremap_pages() will fail to setup the pgmap_radix since there will be stale entries for the physical address range. An argument could be made to require that the ->kill() operation be set in the @pgmap arg rather than passed in separately. However, it helps code readability, tracking the lifetime of a given instance, to be able to grep the kill routine directly at the devm_memremap_pages() call site. Link: http://lkml.kernel.org/r/154275558526.76910.7535251937849268605.stgit@dwillia2-desk3.amr.corp.intel.comSigned-off-by: NDan Williams <dan.j.williams@intel.com> Fixes: e8d51348 ("memremap: change devm_memremap_pages interface...") Reviewed-by: N"Jérôme Glisse" <jglisse@redhat.com> Reported-by: NLogan Gunthorpe <logang@deltatee.com> Reviewed-by: NLogan Gunthorpe <logang@deltatee.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 13 12月, 2018 1 次提交
-
-
由 Dan Williams 提交于
commit ae86cbfe upstream. Commit cfe30b87 "libnvdimm, pmem: adjust for section collisions with 'System RAM'" enabled Linux to workaround occasions where platform firmware arranges for "System RAM" and "Persistent Memory" to collide within a single section boundary. Unfortunately, as reported in this issue [1], platform firmware can inflict the same collision between persistent memory regions. The approach of interrogating iomem_resource does not work in this case because platform firmware may merge multiple regions into a single iomem_resource range. Instead provide a method to interrogate regions that share the same parent bus. This is a stop-gap until the core-MM can grow support for hotplug on sub-section boundaries. [1]: https://github.com/pmem/ndctl/issues/76 Fixes: cfe30b87 ("libnvdimm, pmem: adjust for section collisions with...") Cc: <stable@vger.kernel.org> Reported-by: NPatrick Geary <patrickg@supermicro.com> Tested-by: NPatrick Geary <patrickg@supermicro.com> Reviewed-by: NVishal Verma <vishal.l.verma@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 14 11月, 2018 3 次提交
-
-
由 Dan Williams 提交于
commit 91ed7ac444ef749603a95629a5ec483988c4f14b upstream. The driver is only initializing bb_res in the devm_memremap_pages() paths, but the raw namespace case is passing an uninitialized bb_res to nvdimm_badblocks_populate(). Fixes: e8d51348 ("memremap: change devm_memremap_pages interface...") Cc: <stable@vger.kernel.org> Cc: Christoph Hellwig <hch@lst.de> Reported-by: NJacek Zloch <jacek.zloch@intel.com> Reported-by: NKrzysztof Rusocki <krzysztof.rusocki@intel.com> Reviewed-by: NVishal Verma <vishal.l.verma@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Dan Williams 提交于
commit 5d394eee2c102453278d81d9a7cf94c80253486a upstream. While experimenting with region driver loading the following backtrace was triggered: INFO: trying to register non-static key. the code is fine but needs lockdep annotation. turning off the locking correctness validator. [..] Call Trace: dump_stack+0x85/0xcb register_lock_class+0x571/0x580 ? __lock_acquire+0x2ba/0x1310 ? kernfs_seq_start+0x2a/0x80 __lock_acquire+0xd4/0x1310 ? dev_attr_show+0x1c/0x50 ? __lock_acquire+0x2ba/0x1310 ? kernfs_seq_start+0x2a/0x80 ? lock_acquire+0x9e/0x1a0 lock_acquire+0x9e/0x1a0 ? dev_attr_show+0x1c/0x50 badblocks_show+0x70/0x190 ? dev_attr_show+0x1c/0x50 dev_attr_show+0x1c/0x50 This results from a missing successful call to devm_init_badblocks() from nd_region_probe(). Block attempts to show badblocks while the region is not enabled. Fixes: 6a6bef90 ("libnvdimm: add mechanism to publish badblocks...") Cc: <stable@vger.kernel.org> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Reviewed-by: NDave Jiang <dave.jiang@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Alexander Duyck 提交于
commit b6eae0f6 upstream. Unlike asynchronous initialization in the core we have not yet associated the device with the parent, and as such the device doesn't hold a reference to the parent. In order to resolve that we should be holding a reference on the parent until the asynchronous initialization has completed. Cc: <stable@vger.kernel.org> Fixes: 4d88a97a ("libnvdimm: ...base ... infrastructure") Signed-off-by: NAlexander Duyck <alexander.h.duyck@linux.intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 21 8月, 2018 1 次提交
-
-
由 Dan Williams 提交于
Use clear_mce_nospec() to restore WB mode for the kernel linear mapping of a pmem page that was marked 'HWPoison'. A page with 'HWPoison' set has also been marked UC in PAT (page attribute table) via set_mce_nospec() to prevent speculative retrievals of poison. The 'HWPoison' flag is only cleared when overwriting an entire page. Signed-off-by: NDan Williams <dan.j.williams@intel.com> Signed-off-by: NDave Jiang <dave.jiang@intel.com>
-
- 20 8月, 2018 1 次提交
-
-
由 Vishal Verma 提交于
Commit efda1b5d ("acpi, nfit, libnvdimm: fix / harden ars_status output length handling") Introduced additional hardening for ambiguity in the ACPI spec for ars_status output sizing. However, it had a couple of cases mixed up. Where it should have been checking for (and returning) "out_field[1] - 4" it was using "out_field[1] - 8" and vice versa. This caused a four byte discrepancy in the buffer size passed on to the command handler, and in some cases, this caused memory corruption like: ./daxdev-errors.sh: line 76: 24104 Aborted (core dumped) ./daxdev-errors $busdev $region malloc(): memory corruption Program received signal SIGABRT, Aborted. [...] #5 0x00007ffff7865a2e in calloc () from /lib64/libc.so.6 #6 0x00007ffff7bc2970 in ndctl_bus_cmd_new_ars_status (ars_cap=ars_cap@entry=0x6153b0) at ars.c:136 #7 0x0000000000401644 in check_ars_status (check=0x7fffffffdeb0, bus=0x604c20) at daxdev-errors.c:144 #8 test_daxdev_clear_error (region_name=<optimized out>, bus_name=<optimized out>) at daxdev-errors.c:332 Cc: <stable@vger.kernel.org> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Keith Busch <keith.busch@intel.com> Cc: Lukasz Dorau <lukasz.dorau@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Fixes: efda1b5d ("acpi, nfit, libnvdimm: fix / harden ars_status output length handling") Signed-off-by: NVishal Verma <vishal.l.verma@intel.com> Reviewed-by: NKeith Busch <keith.busch@intel.com> Signed-of-by: NDave Jiang <dave.jiang@intel.com>
-
- 31 7月, 2018 1 次提交
-
-
由 Huaisheng Ye 提交于
pmem_direct_access() needs to check the validity of pointers kaddr and pfn for NULL assignment. If anyone equals to NULL, it doesn't need to calculate the value. If pointer equals to NULL, that is to say callers may have no need for kaddr or pfn, so this patch is prepared for allowing them to pass in NULL instead of having to pass in a pointer or local variable that they then just throw away. Signed-off-by: NHuaisheng Ye <yehs1@lenovo.com> Reviewed-by: NRoss Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: NDave Jiang <dave.jiang@intel.com>
-
- 26 7月, 2018 2 次提交
-
-
由 Keith Busch 提交于
The 'available_size' attribute showing the combined total of all unallocated space isn't always useful to know how large of a namespace a user may be able to allocate if the region is fragmented. This patch will export the largest extent of unallocated space that may be allocated to create a new namespace. Signed-off-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NVishal Verma <vishal.l.verma@intel.com> Signed-off-by: NDave Jiang <dave.jiang@intel.com>
-
由 Keith Busch 提交于
This patch will find the max contiguous area to determine the largest pmem namespace size that can be created. If the requested size exceeds the largest available, ENOSPC error will be returned. This fixes the allocation underrun error and wrong error return code that have otherwise been observed as the following kernel warning: WARNING: CPU: <CPU> PID: <PID> at drivers/nvdimm/namespace_devs.c:913 size_store Fixes: a1f3e4d6 ("libnvdimm, region: update nd_region_available_dpa() for multi-pmem support") Cc: <stable@vger.kernel.org> Signed-off-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NVishal Verma <vishal.l.verma@intel.com> Signed-off-by: NDave Jiang <dave.jiang@intel.com>
-
- 18 7月, 2018 2 次提交
-
-
由 Michael Callahan 提交于
Add and use a new op_stat_group() function for indexing partition stat fields rather than indexing them by rq_data_dir() or bio_data_dir(). This function works similarly to op_is_sync() in that it takes the request::cmd_flags or bio::bi_opf flags and determines which stats should et updated. In addition, the second parameter to generic_start_io_acct() and generic_end_io_acct() is now a REQ_OP rather than simply a read or write bit and it uses op_stat_group() on the parameter to determine the stat group. Note that the partition in_flight counts are not part of the per-cpu statistics and as such are not indexed via this function. It's now indexed by op_is_write(). tj: Refreshed on top of v4.17. Updated to pass around REQ_OP. Signed-off-by: NMichael Callahan <michaelcallahan@fb.com> Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Joshua Morris <josh.h.morris@us.ibm.com> Cc: Philipp Reisner <philipp.reisner@linbit.com> Cc: Matias Bjorling <mb@lightnvm.io> Cc: Kent Overstreet <kent.overstreet@gmail.com> Cc: Alasdair Kergon <agk@redhat.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Tejun Heo 提交于
c11f0c0b ("block/mm: make bdev_ops->rw_page() take a bool for read/write") replaced @OP with boolean @is_write, which limited the amount of information going into ->rw_page() and more importantly page_endio(), which removed the need to expose block internals to mm. Unfortunately, we want to track discards separately and @is_write isn't enough information. This patch updates bdev_ops->rw_page() to take REQ_OP instead but leaves page_endio() to take bool @is_write. This allows the block part of operations to have enough information while not leaking it to mm. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Mike Christie <mchristi@redhat.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 15 7月, 2018 1 次提交
-
-
由 Dan Williams 提交于
When a DIMM is locked its namespace label area may not be. Introduce the distinction of locked namespaces to allow namespace enumeration while the capacity is locked. Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
- 29 6月, 2018 2 次提交
-
-
由 Dan Williams 提交于
Commit 60622d68 "x86/asm/memcpy_mcsafe: Return bytes remaining" converted callers of memcpy_mcsafe() to expect a positive 'bytes remaining' value rather than a negative error code. The nsio_rw_bytes() conversion failed to return success. The failure is benign in that nsio_rw_bytes() will end up writing back what it just read. Fixes: 60622d68 ("x86/asm/memcpy_mcsafe: Return bytes remaining") Cc: Dan Williams <dan.j.williams@intel.com> Reviewed-by: NVishal Verma <vishal.l.verma@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Ross Zwisler 提交于
QUEUE_FLAG_DAX is an indication that a given block device supports filesystem DAX and should not be set for PMEM namespaces which are in "raw" mode. These namespaces lack struct page and are prevented from participating in filesystem DAX as of commit 569d0365 ("dax: require 'struct page' by default for filesystem dax"). Signed-off-by: NRoss Zwisler <ross.zwisler@linux.intel.com> Suggested-by: NMike Snitzer <snitzer@redhat.com> Fixes: 569d0365 ("dax: require 'struct page' by default for filesystem dax") Cc: stable@vger.kernel.org Acked-by: NDan Williams <dan.j.williams@intel.com> Reviewed-by: NToshi Kani <toshi.kani@hpe.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com>
-
- 07 6月, 2018 2 次提交
-
-
由 Ross Zwisler 提交于
This commit: 5fdf8e5b ("libnvdimm: re-enable deep flush for pmem devices via fsync()") intended to make sure that deep flush was always available even on platforms which support a power-fail protected CPU cache. An unintended side effect of this change was that we also lost the ability to skip flushing CPU caches on those power-fail protected CPU cache. Fix this by skipping the low level cache flushing in dax_flush() if we have CPU caches which are power-fail protected. The user can still override this behavior by manually setting the write_cache state of a namespace. See libndctl's ndctl_namespace_write_cache_is_enabled(), ndctl_namespace_enable_write_cache() and ndctl_namespace_disable_write_cache() functions. Cc: <stable@vger.kernel.org> Fixes: 5fdf8e5b ("libnvdimm: re-enable deep flush for pmem devices via fsync()") Signed-off-by: NRoss Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Ross Zwisler 提交于
Prior to this commit we would only do a "deep flush" (have nvdimm_flush() write to each of the flush hints for a region) in response to an msync/fsync/sync call if the nvdimm_has_cache() returned true at the time we were setting up the request queue. This happens due to the write cache value passed in to blk_queue_write_cache(), which then causes the block layer to send down BIOs with REQ_FUA and REQ_PREFLUSH set. We do have a "write_cache" sysfs entry for namespaces, i.e.: /sys/bus/nd/devices/pfn0.1/block/pmem0/dax/write_cache which can be used to control whether or not the kernel thinks a given namespace has a write cache, but this didn't modify the deep flush behavior that we set up when the driver was initialized. Instead, it only modified whether or not DAX would flush CPU caches via dax_flush() in response to *sync calls. Simplify this by making the *sync deep flush always happen, regardless of the write cache setting of a namespace. The DAX CPU cache flushing will still be controlled the write_cache setting of the namespace. Cc: <stable@vger.kernel.org> Fixes: 5fdf8e5b ("libnvdimm: re-enable deep flush for pmem devices via fsync()") Signed-off-by: NRoss Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-