- 30 11月, 2017 1 次提交
-
-
由 Dan Williams 提交于
Similar to how device-dax enforces that the 'address', 'offset', and 'len' parameters to mmap() be aligned to the device's fundamental alignment, the same constraints apply to munmap(). Implement ->split() to fail munmap calls that violate the alignment constraint. Otherwise, we later fail VM_BUG_ON checks in the unmap_page_range() path with crash signatures of the form: vma ffff8800b60c8a88 start 00007f88c0000000 end 00007f88c0e00000 next (null) prev (null) mm ffff8800b61150c0 prot 8000000000000027 anon_vma (null) vm_ops ffffffffa0091240 pgoff 0 file ffff8800b638ef80 private_data (null) flags: 0x380000fb(read|write|shared|mayread|maywrite|mayexec|mayshare|softdirty|mixedmap|hugepage) ------------[ cut here ]------------ kernel BUG at mm/huge_memory.c:2014! [..] RIP: 0010:__split_huge_pud+0x12a/0x180 [..] Call Trace: unmap_page_range+0x245/0xa40 ? __vma_adjust+0x301/0x990 unmap_vmas+0x4c/0xa0 unmap_region+0xae/0x120 ? __vma_rb_erase+0x11a/0x230 do_munmap+0x276/0x410 vm_munmap+0x6a/0xa0 SyS_munmap+0x1d/0x30 Link: http://lkml.kernel.org/r/151130418681.4029.7118245855057952010.stgit@dwillia2-desk3.amr.corp.intel.com Fixes: dee41079 ("/dev/dax, core: file operations and dax-mmap") Signed-off-by: NDan Williams <dan.j.williams@intel.com> Reported-by: NJeff Moyer <jmoyer@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 20 10月, 2017 1 次提交
-
-
由 Ross Zwisler 提交于
Fix this build warning: warning: 'phys' may be used uninitialized in this function [-Wuninitialized] As reported here: https://lkml.org/lkml/2017/10/16/152 http://kisskb.ellerman.id.au/kisskb/buildresult/13181373/log/Signed-off-by: NRoss Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
- 19 7月, 2017 1 次提交
-
-
由 Dan Williams 提交于
Fix warnings of the form... WARNING: CPU: 10 PID: 4983 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x62/0x80 sysfs: cannot create duplicate filename '/class/dax/dax12.0' Call Trace: dump_stack+0x63/0x86 __warn+0xcb/0xf0 warn_slowpath_fmt+0x5a/0x80 ? kernfs_path_from_node+0x4f/0x60 sysfs_warn_dup+0x62/0x80 sysfs_do_create_link_sd.isra.2+0x97/0xb0 sysfs_create_link+0x25/0x40 device_add+0x266/0x630 devm_create_dax_dev+0x2cf/0x340 [dax] dax_pmem_probe+0x1f5/0x26e [dax_pmem] nvdimm_bus_probe+0x71/0x120 ...by reusing the namespace id for the device-dax instance name. Now that we have decided that there will never by more than one device-dax instance per libnvdimm-namespace parent device [1], we can directly reuse the namepace ids. There are some possible follow-on cleanups, but those are saved for a later patch to simplify the -stable backport. [1]: https://lists.01.org/pipermail/linux-nvdimm/2016-December/008266.html Fixes: 98a29c39 ("libnvdimm, namespace: allow creation of multiple pmem...") Cc: Jeff Moyer <jmoyer@redhat.com> Cc: <stable@vger.kernel.org> Reported-by: NDariusz Dokupil <dariusz.dokupil@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
- 18 7月, 2017 1 次提交
-
-
由 Dan Williams 提交于
Dan Carpenter reports: The patch 7b6be844: "dax: refactor dax-fs into a generic provider of 'struct dax_device' instances" from Apr 11, 2017, leads to the following static checker warning: drivers/dax/device.c:643 devm_create_dev_dax() warn: passing zero to 'ERR_PTR' Fix the case where we inadvertently leak 0 to ERR_PTR() by setting at every error case, and make it clear that 'count' is never 0. Reported-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
- 06 7月, 2017 1 次提交
-
-
由 Jeff Layton 提交于
Most filesystems currently use mapping_set_error and filemap_check_errors for setting and reporting/clearing writeback errors at the mapping level. filemap_check_errors is indirectly called from most of the filemap_fdatawait_* functions and from filemap_write_and_wait*. These functions are called from all sorts of contexts to wait on writeback to finish -- e.g. mostly in fsync, but also in truncate calls, getattr, etc. The non-fsync callers are problematic. We should be reporting writeback errors during fsync, but many places spread over the tree clear out errors before they can be properly reported, or report errors at nonsensical times. If I get -EIO on a stat() call, there is no reason for me to assume that it is because some previous writeback failed. The fact that it also clears out the error such that a subsequent fsync returns 0 is a bug, and a nasty one since that's potentially silent data corruption. This patch adds a small bit of new infrastructure for setting and reporting errors during address_space writeback. While the above was my original impetus for adding this, I think it's also the case that current fsync semantics are just problematic for userland. Most applications that call fsync do so to ensure that the data they wrote has hit the backing store. In the case where there are multiple writers to the file at the same time, this is really hard to determine. The first one to call fsync will see any stored error, and the rest get back 0. The processes with open fds may not be associated with one another in any way. They could even be in different containers, so ensuring coordination between all fsync callers is not really an option. One way to remedy this would be to track what file descriptor was used to dirty the file, but that's rather cumbersome and would likely be slow. However, there is a simpler way to improve the semantics here without incurring too much overhead. This set adds an errseq_t to struct address_space, and a corresponding one is added to struct file. Writeback errors are recorded in the mapping's errseq_t, and the one in struct file is used as the "since" value. This changes the semantics of the Linux fsync implementation such that applications can now use it to determine whether there were any writeback errors since fsync(fd) was last called (or since the file was opened in the case of fsync having never been called). Note that those writeback errors may have occurred when writing data that was dirtied via an entirely different fd, but that's the case now with the current mapping_set_error/filemap_check_error infrastructure. This will at least prevent you from getting a false report of success. The new behavior is still consistent with the POSIX spec, and is more reliable for application developers. This patch just adds some basic infrastructure for doing this, and ensures that the f_wb_err "cursor" is properly set when a file is opened. Later patches will change the existing code to use this new infrastructure for reporting errors at fsync time. Signed-off-by: NJeff Layton <jlayton@redhat.com> Reviewed-by: NJan Kara <jack@suse.cz>
-
- 20 4月, 2017 2 次提交
-
-
由 Dan Williams 提交于
Track a set of dax_operations per dax_device that can be set at alloc_dax() time. These operations will be used to stop the abuse of block_device_operations for communicating dax capabilities to filesystems. It will also be used to replace the "pmem api" and move pmem-specific cache maintenance, and other dax-driver-specific filesystem-dax operations, to dax device methods. In particular this allows us to stop abusing __copy_user_nocache(), via memcpy_to_pmem(), with a driver specific replacement. This is a standalone introduction of the operations. Follow on patches convert each dax-driver and teach fs/dax.c to use ->direct_access() from dax_operations instead of block_device_operations. Suggested-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
For the current block_device based filesystem-dax path, we need a way for it to lookup the dax_device associated with a block_device. Add a 'host' property of a dax_device that can be used for this purpose. It is a free form string, but for a dax_device associated with a block device it is the bdev name. This is a stop-gap until filesystems are able to mount on a dax-inode directly. Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
- 13 4月, 2017 6 次提交
-
-
由 Dan Williams 提交于
We want dax capable drivers to be able to publish a set of dax operations [1]. However, we do not want to further abuse block_devices to advertise these operations. Instead we will attach these operations to a dax device and add a lookup mechanism to go from block device path to a dax device. A dax capable driver like pmem or brd is responsible for registering a dax device, alongside a block device, and then a dax capable filesystem is responsible for retrieving the dax device by path name if it wants to call dax_operations. For now, we refactor the dax pseudo-fs to be a generic facility, rather than an implementation detail, of the device-dax use case. Where a "dax device" is just an inode + dax infrastructure, and "Device DAX" is a mapping service layered on top of that base 'struct dax_device'. "Filesystem DAX" is then a mapping service that layers a filesystem on top of that same base device. Filesystem DAX is associated with a block_device for now, but perhaps directly to a dax device in the future, or for new pmem-only filesystems. [1]: https://lkml.org/lkml/2017/1/19/880Suggested-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
In preparation for introducing a struct dax_device type to the kernel global type namespace, rename dax_dev to dev_dax. A 'dax_device' instance will be a generic device-driver object for any provider of dax functionality. A 'dev_dax' object is a device-dax-driver local / internal instance. Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Oliver O'Halloran 提交于
A couple of minor improvements to the debug output in the fault handlers: a) Print the region alignment and fault size when we sent a SIGBUS because the region alignment is greater than the fault size. b) Fix the message in the PFN_{DEV|MAP} check. c) Additionally print the fault size enum value in the huge fault handler. Signed-off-by: NOliver O'Halloran <oohall@gmail.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Pushkar Jambhlekar 提交于
The default case for dax_dev_huge_fault() fault size handling mistakenly returns when it should unlock. This is not a problem in practice since the only three possible fault sizes are handled. Going forward, if the core mm adds a new fault size beyond pte, pmd, or pud device-dax should abort VM_FAULT_SIGBUS requests not VM_FAULT_FALLBACK since device-dax guarantees a configured fault granularity for all faults. Signed-off-by: NPushkar Jambhlekar <pushkar.iit@gmail.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dave Jiang 提交于
Provide a replacement pgoff_to_phys() that translates an nfit_test resource (allocated by vmalloc()) to a pfn. Signed-off-by: NDave Jiang <dave.jiang@intel.com> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
The following warning triggers with a new unit test that stresses the device-dax interface. =============================== [ ERR: suspicious RCU usage. ] 4.11.0-rc4+ #1049 Tainted: G O ------------------------------- ./include/linux/rcupdate.h:521 Illegal context switch in RCU read-side critical section! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 0 2 locks held by fio/9070: #0: (&mm->mmap_sem){++++++}, at: [<ffffffff8d0739d7>] __do_page_fault+0x167/0x4f0 #1: (rcu_read_lock){......}, at: [<ffffffffc03fbd02>] dax_dev_huge_fault+0x32/0x620 [dax] Call Trace: dump_stack+0x86/0xc3 lockdep_rcu_suspicious+0xd7/0x110 ___might_sleep+0xac/0x250 __might_sleep+0x4a/0x80 __alloc_pages_nodemask+0x23a/0x360 alloc_pages_current+0xa1/0x1f0 pte_alloc_one+0x17/0x80 __pte_alloc+0x1e/0x120 __get_locked_pte+0x1bf/0x1d0 insert_pfn.isra.70+0x3a/0x100 ? lookup_memtype+0xa6/0xd0 vm_insert_mixed+0x64/0x90 dax_dev_huge_fault+0x520/0x620 [dax] ? dax_dev_huge_fault+0x32/0x620 [dax] dax_dev_fault+0x10/0x20 [dax] __do_fault+0x1e/0x140 __handle_mm_fault+0x9af/0x10d0 handle_mm_fault+0x16d/0x370 ? handle_mm_fault+0x47/0x370 __do_page_fault+0x28c/0x4f0 trace_do_page_fault+0x58/0x2a0 do_async_page_fault+0x1a/0xa0 async_page_fault+0x28/0x30 Inserting a page table entry may trigger an allocation while we are holding a read lock to keep the device instance alive for the duration of the fault. Use srcu for this keep-alive protection. Fixes: dee41079 ("/dev/dax, core: file operations and dax-mmap") Cc: <stable@vger.kernel.org> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
- 21 3月, 2017 2 次提交
-
-
由 Logan Gunthorpe 提交于
Replace the open coded registration of the cdev and dev with the new device_add_cdev() helper. The helper replaces a common pattern by taking the proper reference against the parent device and adding both the cdev and the device. Signed-off-by: NLogan Gunthorpe <logang@deltatee.com> Reviewed-by: NDan Williams <dan.j.williams@intel.com> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Dan Williams 提交于
If device_add() fails, cleanup the cdev. Otherwise, we leak a kobj_map() with a stale device number. As Jason points out, there is a small possibility that userspace has opened and mapped the device in the time between cdev_add() and the device_add() failure. We need a new kill_dax_dev() helper to invalidate any established mappings. Fixes: ba09c01d ("dax: convert to the cdev api") Cc: <stable@vger.kernel.org> Reported-by: NJason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com> Signed-off-by: NLogan Gunthorpe <logang@deltatee.com> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 11 3月, 2017 3 次提交
-
-
由 Dave Jiang 提交于
The debug output for return the return data of pgoff_to_phys() in the fault handlers has 'phys' and 'pgoff' incorrectly swapped. Reported-by: NJeff Moyer <jmoyer@redhat.com> Signed-off-by: NDave Jiang <dave.jiang@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dave Jiang 提交于
Jeff Moyer reports: With a device dax alignment of 4KB or 2MB, I get sigbus when running the attached fio job file for the current kernel (4.11.0-rc1+). If I specify an alignment of 1GB, it works. I turned on debug output, and saw that it was failing in the huge fault code. dax dax1.0: dax_open dax dax1.0: dax_mmap dax dax1.0: dax_dev_huge_fault: fio: write (0x7f08f0a00000 - dax dax1.0: __dax_dev_pud_fault: phys_to_pgoff(0xffffffffcf60) dax dax1.0: dax_release fio config for reproduce: [global] ioengine=dev-dax direct=0 filename=/dev/dax0.0 bs=2m [write] rw=write [read] stonewall rw=read The driver fails to fallback when taking a fault that is larger than the device alignment, or handling a larger fault when a smaller mapping is already established. While we could support larger mappings for a device with a smaller alignment, that change is too large for the immediate fix. The simplest change is to force fallback until the fault size matches the alignment. Reported-by: NJeff Moyer <jmoyer@redhat.com> Signed-off-by: NDave Jiang <dave.jiang@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dave Jiang 提交于
Jeff Moyer reports: With a device dax alignment of 4KB or 2MB, I get sigbus when running the attached fio job file for the current kernel (4.11.0-rc1+). If I specify an alignment of 1GB, it works. I turned on debug output, and saw that it was failing in the huge fault code. dax dax1.0: dax_open dax dax1.0: dax_mmap dax dax1.0: dax_dev_huge_fault: fio: write (0x7f08f0a00000 - dax dax1.0: __dax_dev_pud_fault: phys_to_pgoff(0xffffffffcf60 dax dax1.0: dax_release fio config for reproduce: [global] ioengine=dev-dax direct=0 filename=/dev/dax0.0 bs=2m [write] rw=write [read] stonewall rw=read The driver fails to fallback when taking a fault that is larger than the device alignment, or handling a larger fault when a smaller mapping is already established. While we could support larger mappings for a device with a smaller alignment, that change is too large for the immediate fix. The simplest change is to force fallback until the fault size matches the alignment. Fixes: dee41079 ("/dev/dax, core: file operations and dax-mmap") Cc: <stable@vger.kernel.org> Reported-by: NJeff Moyer <jmoyer@redhat.com> Signed-off-by: NDave Jiang <dave.jiang@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
- 02 3月, 2017 1 次提交
-
-
由 Ingo Molnar 提交于
Update files that depend on the magic.h inclusion. Acked-by: NLinus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 25 2月, 2017 4 次提交
-
-
由 Dave Jiang 提交于
Since the introduction of FAULT_FLAG_SIZE to the vm_fault flag, it has been somewhat painful with getting the flags set and removed at the correct locations. More than one kernel oops was introduced due to difficulties of getting the placement correctly. Remove the flag values and introduce an input parameter to huge_fault that indicates the size of the page entry. This makes the code easier to trace and should avoid the issues we see with the fault flags where removal of the flag was necessary in the fallback paths. Link: http://lkml.kernel.org/r/148615748258.43180.1690152053774975329.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NDave Jiang <dave.jiang@intel.com> Tested-by: NDan Williams <dan.j.williams@intel.com> Reviewed-by: NJan Kara <jack@suse.cz> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Nilesh Choudhury <nilesh.choudhury@oracle.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Dave Jiang 提交于
Add transparent huge PUD pages support for device DAX by adding a pud_fault handler. Link: http://lkml.kernel.org/r/148545060002.17912.6765687780007547551.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NDave Jiang <dave.jiang@intel.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Jan Kara <jack@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Nilesh Choudhury <nilesh.choudhury@oracle.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Dave Jiang 提交于
Patch series "1G transparent hugepage support for device dax", v2. The following series implements support for 1G trasparent hugepage on x86 for device dax. The bulk of the code was written by Mathew Wilcox a while back supporting transparent 1G hugepage for fs DAX. I have forward ported the relevant bits to 4.10-rc. The current submission has only the necessary code to support device DAX. Comments from Dan Williams: So the motivation and intended user of this functionality mirrors the motivation and users of 1GB page support in hugetlbfs. Given expected capacities of persistent memory devices an in-memory database may want to reduce tlb pressure beyond what they can already achieve with 2MB mappings of a device-dax file. We have customer feedback to that effect as Willy mentioned in his previous version of these patches [1]. [1]: https://lkml.org/lkml/2016/1/31/52 Comments from Nilesh @ Oracle: There are applications which have a process model; and if you assume 10,000 processes attempting to mmap all the 6TB memory available on a server; we are looking at the following: processes : 10,000 memory : 6TB pte @ 4k page size: 8 bytes / 4K of memory * #processes = 6TB / 4k * 8 * 10000 = 1.5GB * 80000 = 120,000GB pmd @ 2M page size: 120,000 / 512 = ~240GB pud @ 1G page size: 240GB / 512 = ~480MB As you can see with 2M pages, this system will use up an exorbitant amount of DRAM to hold the page tables; but the 1G pages finally brings it down to a reasonable level. Memory sizes will keep increasing; so this number will keep increasing. An argument can be made to convert the applications from process model to thread model, but in the real world that may not be always practical. Hopefully this helps explain the use case where this is valuable. This patch (of 3): In preparation for adding the ability to handle PUD pages, convert vm_operations_struct.pmd_fault to vm_operations_struct.huge_fault. The vm_fault structure is extended to include a union of the different page table pointers that may be needed, and three flag bits are reserved to indicate which type of pointer is in the union. [ross.zwisler@linux.intel.com: remove unused function ext4_dax_huge_fault()] Link: http://lkml.kernel.org/r/1485813172-7284-1-git-send-email-ross.zwisler@linux.intel.com [dave.jiang@intel.com: clear PMD or PUD size flags when in fall through path] Link: http://lkml.kernel.org/r/148589842696.5820.16078080610311444794.stgit@djiang5-desk3.ch.intel.com Link: http://lkml.kernel.org/r/148545058784.17912.6353162518188733642.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NMatthew Wilcox <mawilcox@microsoft.com> Signed-off-by: NDave Jiang <dave.jiang@intel.com> Signed-off-by: NRoss Zwisler <ross.zwisler@linux.intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Jan Kara <jack@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Nilesh Choudhury <nilesh.choudhury@oracle.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Dave Jiang <dave.jiang@intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Dave Jiang 提交于
->fault(), ->page_mkwrite(), and ->pfn_mkwrite() calls do not need to take a vma and vmf parameter when the vma already resides in vmf. Remove the vma parameter to simplify things. [arnd@arndb.de: fix ARM build] Link: http://lkml.kernel.org/r/20170125223558.1451224-1-arnd@arndb.de Link: http://lkml.kernel.org/r/148521301778.19116.10840599906674778980.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NDave Jiang <dave.jiang@intel.com> Signed-off-by: NArnd Bergmann <arnd@arndb.de> Reviewed-by: NRoss Zwisler <ross.zwisler@linux.intel.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 23 2月, 2017 2 次提交
-
-
由 Dave Jiang 提交于
pmd_fault() and related functions really only need the vmf parameter since the additional parameters are all included in the vmf struct. Remove the additional parameter and simplify pmd_fault() and friends. Link: http://lkml.kernel.org/r/1484085142-2297-8-git-send-email-ross.zwisler@linux.intel.comSigned-off-by: NDave Jiang <dave.jiang@intel.com> Reviewed-by: NRoss Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: NJan Kara <jack@suse.cz> Cc: Dave Chinner <david@fromorbit.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Dave Jiang 提交于
Instead of passing in multiple parameters in the pmd_fault() handler, a vmf can be passed in just like a fault() handler. This will simplify code and remove the need for the actual pmd fault handlers to allocate a vmf. Related functions are also modified to do the same. [dave.jiang@intel.com: fix issue with xfs_tests stall when DAX option is off] Link: http://lkml.kernel.org/r/148469861071.195597.3619476895250028518.stgit@djiang5-desk3.ch.intel.com Link: http://lkml.kernel.org/r/1484085142-2297-7-git-send-email-ross.zwisler@linux.intel.comSigned-off-by: NDave Jiang <dave.jiang@intel.com> Reviewed-by: NRoss Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: NJan Kara <jack@suse.cz> Cc: Dave Chinner <david@fromorbit.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 18 12月, 2016 1 次提交
-
-
由 Dan Williams 提交于
While this information is available by looking at the nvdimm parent device that may not always be the case when/if we add support for other memory regions. Tooling should not depend on walking a given ancestor topology that is not guaranteed by the device's class. For example, a device-dax instance will always have a dax_region parent, but it may not always have a libnvdimm "dax" device as a grandparent. Reported-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
- 15 12月, 2016 1 次提交
-
-
由 Jan Kara 提交于
Every single user of vmf->virtual_address typed that entry to unsigned long before doing anything with it so the type of virtual_address does not really provide us any additional safety. Just use masked vmf->address which already has the appropriate type. Link: http://lkml.kernel.org/r/1479460644-25076-3-git-send-email-jack@suse.czSigned-off-by: NJan Kara <jack@suse.cz> Acked-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 07 12月, 2016 1 次提交
-
-
由 Dan Williams 提交于
Hugh notes in response to commit 4cb19355 "device-dax: fail all private mapping attempts": "I think that is more restrictive than you intended: haven't tried, but I believe it rejects a PROT_READ, MAP_SHARED, O_RDONLY fd mmap, leaving no way to mmap /dev/dax without write permission to it." Indeed it does restrict read-only mappings, switch to checking VM_MAYSHARE, not VM_SHARED. Cc: <stable@vger.kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Pawel Lebioda <pawel.lebioda@intel.com> Fixes: 4cb19355 ("device-dax: fail all private mapping attempts") Reported-by: NHugh Dickins <hughd@google.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
- 17 11月, 2016 1 次提交
-
-
由 Dan Williams 提交于
The device-dax implementation originally tried to be tricky and allow private read-only mappings, but in the process allowed writable MAP_PRIVATE + MAP_NORESERVE mappings. For simplicity and predictability just fail all private mapping attempts since device-dax memory is statically allocated and will never support overcommit. Cc: <stable@vger.kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Fixes: dee41079 ("/dev/dax, core: file operations and dax-mmap") Reported-by: NPawel Lebioda <pawel.lebioda@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
- 08 10月, 2016 2 次提交
-
-
由 Arnd Bergmann 提交于
The dev_t variable in devm_create_dax_dev() is used before it's first set: drivers/dax/dax.c: In function 'devm_create_dax_dev': drivers/dax/dax.c:205:39: error: 'dev_t' may be used uninitialized in this function [-Werror=maybe-uninitialized] inode = iget5_locked(dax_superblock, hash_32(devt + DAXFS_MAGIC, 31), ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/dax/dax.c:688:8: note: 'dev_t' was declared here This reorders the code to how it looks correct to me. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Fixes: 3bc52c45 ("dax: define a unified inode/address_space for device-dax mappings") Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
For sub-division support we need access to the dax_dev created by devm_create_dax_dev(). Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
- 04 9月, 2016 1 次提交
-
-
由 Dan Williams 提交于
pgoff_to_phys() validates that both the starting address and the length of the mapping against the resource list. We need to check for a mapping size of PMD_SIZE not PAGE_SIZE in the pmd fault path. Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
- 24 8月, 2016 8 次提交
-
-
由 Dan Williams 提交于
All the extents of a dax-device must match the alignment of the region. Otherwise, we are unable to guarantee fault semantics of a given page size. The region must be self-consistent itself as well. Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
Invalidate all mappings of a device-dax instance when the device is unregistered. Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
In support of enabling resize / truncate of device-dax instances, define a pseudo-fs to provide a unified inode/address space for vm operations. Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
A goal of the device-DAX interface is to be able to support many exclusive allocations (partitions) of performance / feature differentiated memory. This count may exceed the default minors limit of 256. As a result of switching to an embedded cdev the inode-to-dax_dev conversion is simplified, as well as reference counting which can switch to the cdev kobject lifetime. Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
The kref in dax_dev can be made redundant if the final put_device() on the device associated with the dax_dev frees the dax_dev. This can be accomplished by embedding a struct device in struct dax_dev, open coding device_create() and specifying a custom release method. Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
Shorten the prefix of the file operations to distinguish them from operations on the struct device associated with the dax_dev. Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
In order to convert devm_create_dax_dev() to use cdev, it will need access to dax_fops. Move dax_fops and related function definitions before devm_create_dax_dev(). Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
drivers/dax/dax.c:75:6: warning: symbol 'dax_region_put' was not declared. drivers/dax/dax.c:95:19: warning: symbol 'alloc_dax_region' was not declared. drivers/dax/dax.c:173:5: warning: symbol 'devm_create_dax_dev' was not declared. drivers/dax/pmem.c:27:17: warning: symbol 'to_dax_pmem' was not declared. Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-