- 29 3月, 2016 1 次提交
-
-
由 Dan Williams 提交于
Update the definition of memcpy_from_pmem() to return 0 or a negative error code. Implement x86/arch_memcpy_from_pmem() with memcpy_mcsafe(). Cc: Borislav Petkov <bp@alien8.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: NIngo Molnar <mingo@kernel.org> Reviewed-by: NRoss Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
- 10 3月, 2016 5 次提交
-
-
由 Dan Williams 提交于
If a write is directed at a known bad block perform the following: 1/ write the data 2/ send a clear poison command 3/ invalidate the poison out of the cache hierarchy Cc: <x86@kernel.org> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: NVishal Verma <vishal.l.verma@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
When we enounter a bad block we need to kunmap_atomic() before returning. Cc: <stable@vger.kernel.org> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: NVishal Verma <vishal.l.verma@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 NeilBrown 提交于
alloc_disk(0) does not require or use a ->major number, all devices are allocated with a major of BLOCK_EXT_MAJOR. So don't allocate btt_major. Signed-off-by: NNeilBrown <neilb@suse.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 NeilBrown 提交于
When alloc_disk(0) is used ->major is completely ignored, all devices are allocated with a "major" of BLOCK_EXT_MAJOR. So don't allocate nd_blk_major Signed-off-by: NNeilBrown <neilb@suse.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 NeilBrown 提交于
When alloc_disk(0) or alloc_disk-node(0, XX) is used, the ->major number is completely ignored: all devices are allocated with a major of BLOCK_EXT_MAJOR. So there is no point allocating pmem_major. Signed-off-by: NNeilBrown <neilb@suse.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
- 07 3月, 2016 1 次提交
-
-
由 Dan Williams 提交于
drivers/nvdimm/pmem.c: In function 'nvdimm_namespace_attach_pfn': drivers/nvdimm/pmem.c:367:3: error: implicit declaration of function '__phys_to_pfn' [-Werror=implicit-function-declaration] .base_pfn = __phys_to_pfn(nsio->res.start), ia64 does not provide __phys_to_pfn(), just use the PHYS_PFN() alias. Cc: Guenter Roeck <linux@roeck-us.net> Reported-by: Nkbuild test robot <fengguang.wu@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
- 06 3月, 2016 11 次提交
-
-
由 Dan Williams 提交于
Add the boiler-plate for a 'clear error' command based on section 9.20.7.6 "Function Index 4 - Clear Uncorrectable Error" from the ACPI 6.1 specification, and add a reference implementation in nfit_test. Reviewed-by: NVishal Verma <vishal.l.verma@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
Currenty with a raw mode pmem namespace the physical memory address range for the device can be obtained via /sys/block/pmemX/device/{resource|size}. Add similar attributes for pfn instances that takes the struct page memmap and section padding into account. Reported-by: NHaozhong Zhang <haozhong.zhang@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
On a platform where 'Persistent Memory' and 'System RAM' are mixed within a given sparsemem section, trim the namespace and notify about the sub-optimal alignment. Cc: Toshi Kani <toshi.kani@hpe.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
The altmap for a section-misaligned namespace needs to arrange for the base_pfn to be section-aligned. As a result the 'reserve' region (pfns from base that do not have a struct page) must be increased. Otherwise we trip the altmap validation check in __add_pages: if (altmap->base_pfn != phys_start_pfn || vmem_altmap_offset(altmap) > nr_pages) { pr_warn_once("memory add fail, invalid altmap\n"); return -EINVAL; } Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Jerry Hoemann 提交于
Code attempts to prevent certain IOCTL DSM from being called when device is opened read only. This security feature can be trivially overcome by changing the size portion of the ioctl_command which isn't used. Check only the _IOC_NR (i.e. the command). Cc: <stable@vger.kernel.org> Signed-off-by: NJerry Hoemann <jerry.hoemann@hpe.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Jerry Hoemann 提交于
Change nd_ioctl and nvdimm_ioctl access mode check to use O_RDONLY. Signed-off-by: NJerry Hoemann <jerry.hoemann@hpe.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
While the nfit driver is issuing address range scrub commands and reaping the results do not permit an ars_start command issued from userspace. The scrub thread assumes that all ars completions are for scrubs initiated by platform firmware at boot, or by the nfit driver. Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
Introduce a workqueue that will be used to run address range scrub asynchronously with the rest of nvdimm device probing. Userspace still wants notification when probing operations complete, so introduce a new callback to flush this workqueue when userspace is awaiting probe completion. Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
In preparation for asynchronous address range scrub support add an ability for the pmem driver to dynamically consume address range scrub results. Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
In preparation for making poison list retrieval asynchronus to region registration, add protection for walking and mutating the bus-level poison list. Cc: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
The return value from an 'ndctl_fn' reports the command execution status, i.e. was the command properly formatted and was it successfully submitted to the bus provider. The new 'cmd_rc' parameter allows the bus provider to communicate command specific results, translated into common error codes. Convert the ARS commands to this scheme to: 1/ Consolidate status reporting 2/ Prepare for for expanding ars unit test cases 3/ Make the implementation more generic Cc: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
- 24 2月, 2016 2 次提交
-
-
由 Arnd Bergmann 提交于
A recent bugfix changed pfn_t to always be 64-bit wide, but did not change the code in pmem.c, which is now broken on 32-bit architectures as reported by gcc: In file included from ../drivers/nvdimm/pmem.c:28:0: drivers/nvdimm/pmem.c: In function 'pmem_alloc': include/linux/pfn_t.h:15:17: error: large integer implicitly truncated to unsigned type [-Werror=overflow] #define PFN_DEV (1ULL << (BITS_PER_LONG_LONG - 3)) This changes the intermediate pfn_flags in struct pmem_device to be 64 bit wide as well, so they can store the flags correctly. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Fixes: db78c222 ("mm: fix pfn_t vs highmem") Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
The original format of these commands from the "NVDIMM DSM Interface Example" [1] are superseded by the ACPI 6.1 definition of the "NVDIMM Root Device _DSMs" [2]. [1]: http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf [2]: http://www.uefi.org/sites/default/files/resources/ACPI_6_1.pdf "9.20.7 NVDIMM Root Device _DSMs" Changes include: 1/ New 'restart' fields in ars_status, unfortunately these are implemented in the middle of the existing definition so this change is not backwards compatible. The expectation is that shipping platforms will only ever support the ACPI 6.1 definition. 2/ New status values for ars_start ('busy') and ars_status ('overflow'). Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Linda Knippers <linda.knippers@hpe.com> Cc: <stable@vger.kernel.org> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
- 20 2月, 2016 1 次提交
-
-
由 Dan Williams 提交于
Use the output length specified in the command to size the receive buffer rather than the arbitrary 4K limit. This bug was hiding the fact that the ndctl implementation of ndctl_bus_cmd_new_ars_status() was not specifying an output buffer size. Cc: <stable@vger.kernel.org> Cc: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
- 17 2月, 2016 1 次提交
-
-
由 Dan Williams 提交于
This patch ensures that existing bus match callbacks don't return negative values (which might be interpreted as potential errors in the future) in case of positive match. Signed-off-by: NDan Williams <dan.j.williams@intel.com> Signed-off-by: NMarek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
-
- 30 1月, 2016 2 次提交
-
-
由 Toshi Kani 提交于
Change the callers of walk_iomem_res() scanning for the following resources by name to use walk_iomem_res_desc() instead. "ACPI Tables" "ACPI Non-volatile Storage" "Persistent Memory (legacy)" "Crash kernel" Note, the caller of walk_iomem_res() with "GART" will be removed in a later patch. Signed-off-by: NToshi Kani <toshi.kani@hpe.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Reviewed-by: NDave Young <dyoung@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Chun-Yi <joeyli.kernel@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Don Zickus <dzickus@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Lee, Chun-Yi <joeyli.kernel@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Minfei Huang <mnfhuang@gmail.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Takao Indoh <indou.takao@jp.fujitsu.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: kexec@lists.infradead.org Cc: linux-arch@vger.kernel.org Cc: linux-mm <linux-mm@kvack.org> Cc: linux-nvdimm@lists.01.org Link: http://lkml.kernel.org/r/1453841853-11383-15-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Dan Williams 提交于
This path was missed when turning on the memmap in pmem support. Permit 'pmem' as a valid location for the map. Reported-by: NJeff Moyer <jmoyer@redhat.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
- 27 1月, 2016 1 次提交
-
-
由 Dan Williams 提交于
Correctly display "safe" mode when a btt is established on a e820/memmap defined pmem namespace. Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
- 16 1月, 2016 6 次提交
-
-
由 Dan Williams 提交于
get_dev_page() enables paths like get_user_pages() to pin a dynamically mapped pfn-range (devm_memremap_pages()) while the resulting struct page objects are in use. Unlike get_page() it may fail if the device is, or is in the process of being, disabled. While the initial lookup of the range may be an expensive list walk, the result is cached to speed up subsequent lookups which are likely to be in the same mapped range. devm_memremap_pages() now requires a reference counter to be specified at init time. For pmem this means moving request_queue allocation into pmem_alloc() so the existing queue usage counter can track "device pages". ZONE_DEVICE pages always have an elevated count and will never be on an lru reclaim list. That space in 'struct page' can be redirected for other uses, but for safety introduce a poison value that will always trip __list_add() to assert. This allows half of the struct list_head storage to be reclaimed with some assurance to back up the assumption that the page count never goes to zero and a list_add() is never attempted. Signed-off-by: NDan Williams <dan.j.williams@intel.com> Tested-by: NLogan Gunthorpe <logang@deltatee.com> Cc: Dave Hansen <dave@sr71.net> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Dan Williams 提交于
Before the dynamically allocated struct pages from devm_memremap_pages() can be put to use outside the driver, we need a mechanism to track whether they are still in use at teardown. Towards that goal reorder the initialization sequence to allow the 'q_usage_counter' from the request_queue to be used by the devm_memremap_pages() implementation (in subsequent patches). Signed-off-by: NDan Williams <dan.j.williams@intel.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Dan Williams 提交于
Use the new vmem_altmap capability to enable the pmem driver to arrange for a struct page memmap to be established in persistent memory. [linux@roeck-us.net: mn10300: declare __pfn_to_phys() to fix build error] Signed-off-by: NDan Williams <dan.j.williams@intel.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Chinner <david@fromorbit.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: NGuenter Roeck <linux@roeck-us.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Dan Williams 提交于
In support of providing struct page for large persistent memory capacities, use struct vmem_altmap to change the default policy for allocating memory for the memmap array. The default vmemmap_populate() allocates page table storage area from the page allocator. Given persistent memory capacities relative to DRAM it may not be feasible to store the memmap in 'System Memory'. Instead vmem_altmap represents pre-allocated "device pages" to satisfy vmemmap_alloc_block_buf() requests. Signed-off-by: NDan Williams <dan.j.williams@intel.com> Reported-by: Nkbuild test robot <lkp@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Dan Williams 提交于
There are several scenarios where we need to retrieve and update metadata associated with a given devm_memremap_pages() mapping, and the only lookup key available is a pfn in the range: 1/ We want to augment vmemmap_populate() (called via arch_add_memory()) to allocate memmap storage from pre-allocated pages reserved by the device driver. At vmemmap_alloc_block_buf() time it grabs device pages rather than page allocator pages. This is in support of devm_memremap_pages() mappings where the memmap is too large to fit in main memory (i.e. large persistent memory devices). 2/ Taking a reference against the mapping when inserting device pages into the address_space radix of a given inode. This facilitates unmap_mapping_range() and truncate_inode_pages() operations when the driver is tearing down the mapping. 3/ get_user_pages() operations on ZONE_DEVICE memory require taking a reference against the mapping so that the driver teardown path can revoke and drain usage of device pages. Signed-off-by: NDan Williams <dan.j.williams@intel.com> Tested-by: NLogan Gunthorpe <logang@deltatee.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Chinner <david@fromorbit.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Dan Williams 提交于
For the purpose of communicating the optional presence of a 'struct page' for the pfn returned from ->direct_access(), introduce a type that encapsulates a page-frame-number plus flags. These flags contain the historical "page_link" encoding for a scatterlist entry, but can also denote "device memory". Where "device memory" is a set of pfns that are not part of the kernel's linear mapping by default, but are accessed via the same memory controller as ram. The motivation for this new type is large capacity persistent memory that needs struct page entries in the 'memmap' to support 3rd party DMA (i.e. O_DIRECT I/O with a persistent memory source/target). However, we also need it in support of maintaining a list of mapped inodes which need to be unmapped at driver teardown or freeze_bdev() time. Signed-off-by: NDan Williams <dan.j.williams@intel.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Hansen <dave@sr71.net> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 10 1月, 2016 7 次提交
-
-
由 Dan Williams 提交于
Support badblock checking in all the pmem read paths that do not go through the block layer. This protects info block reads (btt or pfn) as well as data reads to a pmem namespace via a btt instance. Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
Longer term teach dax to punch "error" holes in mapping requests and deliver SIGBUS to applications that consume a bad pmem page. For now, simply disable the dax performance optimization in the presence of known errors. Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
Check the sectors specified in a read bio to see if they hit a known bad block, and return an error code pmem_do_bvec(). Note that the ->rw_page() is not in a position to return errors. For now, copy the same layering violation present in zram_rw_page() to avoid crashes of the form: kernel BUG at mm/filemap.c:822! [..] Call Trace: [<ffffffff811c540e>] page_endio+0x1e/0x60 [<ffffffff81290d29>] mpage_end_io+0x39/0x60 [<ffffffff8141c4ef>] bio_endio+0x3f/0x60 [<ffffffffa005c491>] pmem_make_request+0x111/0x230 [nd_pmem] ...i.e. unlock a page that was already unlocked via pmem_rw_page() => page_endio(). Reported-by: NVishal Verma <vishal.l.verma@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
If a device will ever have badblocks it should always have a badblocks instance available. So, similar to md, embed a badblocks instance in pmem_device. This reduces pointer chasing in the i/o fast path, and simplifies the init path. Reported-by: NVishal Verma <vishal.l.verma@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
If the badblocks list runs out of space it simply means that software is unable to intercept all errors. This is no different than the latent discovery of new badblocks case and should not be an initialization failure condition. Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
nd-core.h is private to the libnvdimm core internals and should not be used by drivers. Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Vishal Verma 提交于
During region creation, perform Address Range Scrubs (ARS) for the SPA (System Physical Address) ranges to retrieve known poison locations from firmware. Add a new data structure 'nd_poison' which is used as a list in nvdimm_bus to store these poison locations. When creating a pmem namespace, if there is any known poison associated with its physical address space, convert the poison ranges to bad sectors that are exposed using the badblocks interface. Signed-off-by: NVishal Verma <vishal.l.verma@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
- 06 1月, 2016 1 次提交
-
-
由 Dan Williams 提交于
When btt devices were re-worked to be child devices of regions this routine was overlooked. It mistakenly attempts to_nd_namespace_pmem() or to_nd_namespace_blk() conversions on btt and pfn devices. By luck to date we have happened to be hitting valid memory leading to a uuid miscompare, but a recent change to struct nd_namespace_common causes: BUG: unable to handle kernel NULL pointer dereference at 0000000000000001 IP: [<ffffffff814610dc>] memcmp+0xc/0x40 [..] Call Trace: [<ffffffffa0028631>] is_uuid_busy+0xc1/0x2a0 [libnvdimm] [<ffffffffa0028570>] ? to_nd_blk_region+0x50/0x50 [libnvdimm] [<ffffffff8158c9c0>] device_for_each_child+0x50/0x90 Cc: <stable@vger.kernel.org> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
- 25 12月, 2015 1 次提交
-
-
由 Dan Williams 提交于
'Memory mode' is defined as the capability of a DAX mapping to be the source/target of DMA and other "direct I/O" scenarios. While it currently requires allocating 'struct page' for each page frame of persistent memory in the namespace it will not always be the case. Work continues on reducing the kernel's dependency on 'struct page'. Let's not maintain a suffix that is expected to lose meaning over time. In other words a future 'raw mode' pmem namespace may be as capable as today's 'memory mode' namespace. Undo the encoding of the mode in the device name and leave it to other tooling to determine the mode of the namespace from its attributes. Reported-by: NMatthew Wilcox <willy@linux.intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-