- 31 1月, 2016 1 次提交
-
-
由 Aneesh Kumar K.V 提交于
In file included from mm/vmscan.c:54:0: include/linux/swapops.h: In function ‘pte_to_swp_entry’: include/linux/swapops.h:69:2: error: implicit declaration of function ‘pte_swp_soft_dirty’ [-Werror=implicit-function-declaration] if (pte_swp_soft_dirty(pte)) ^ include/linux/swapops.h:70:3: error: implicit declaration of function ‘pte_swp_clear_soft_dirty’ [-Werror=implicit-function-declaration] pte = pte_swp_clear_soft_dirty(pte); We support soft dirty tracking only with book3s 64 for now. So change the Kconfig dependency accordingly. Also CHECKPOINT_RESTORE feature is not really dependent on SOFT_DIRTY. We track the dependency between MEM_SOFT_DIRTY and ARCH_SOFT_DIRTY through headers Fixes: 7207f436 ("powerpc/mm: Add page soft dirty tracking") Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 28 1月, 2016 2 次提交
-
-
由 Aneesh Kumar K.V 提交于
This was wrongly updated by commit 7aa9a23c ("powerpc, thp: remove infrastructure for handling splitting PMDs") during the last merge window. Fix it up. This could lead to incorrect behaviour in THP and/or mprotect(), at a minimum. Fixes: 7aa9a23c ("powerpc, thp: remove infrastructure for handling splitting PMDs") Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Madhavan Srinivasan 提交于
Commit 7a786832 ("powerpc/perf: Add an explict flag indicating presence of SLOT field") introduced the PPMU_HAS_SSLOT flag to remove the assumption that MMCRA[SLOT] was present when PPMU_ALT_SIPR was not set. That commit's changelog also mentions that Power8 does not support MMCRA[SLOT]. However when the Power8 PMU support was merged, it errnoeously included the PPMU_HAS_SSLOT flag. So remove PPMU_HAS_SSLOT from the Power8 flags. mpe: On systems where MMCRA[SLOT] exists, the field occupies bits 37:39 (IBM numbering). On Power8 bit 37 is reserved, and 38:39 overlap with the high bits of the Threshold Event Counter Mantissa. I am not aware of any published events which use the threshold counting mechanism, which would cause the mantissa bits to be set. So in practice this bug is unlikely to trigger. Fixes: e05b9b9e ("powerpc/perf: Power8 PMU support") Signed-off-by: NMadhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 27 1月, 2016 1 次提交
-
-
由 Gavin Shan 提交于
In eeh_pe_loc_get(), the PE location code is retrieved from the "ibm,loc-code" property of the device node for the bridge of the PE's primary bus. It's not correct because the property indicates the parent PE's location code. This reads the correct PE location code from "ibm,io-base-loc-code" or "ibm,slot-location-code" property of PE parent bus's device node. Cc: stable@vger.kernel.org # v3.16+ Fixes: 357b2f3d ("powerpc/eeh: Dump PE location code") Signed-off-by: NGavin Shan <gwshan@linux.vnet.ibm.com> Tested-by: NRussell Currey <ruscur@russell.cc> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 25 1月, 2016 1 次提交
-
-
由 Vasant Hegde 提交于
With commit 90a545e9 (restrict /dev/mem to idle io memory ranges) mapping rtas_rmo_buf from user space is failing. Hence we are not able to make RTAS syscall. This patch calls page_is_rtas_user_buf before calling iomem_is_exclusive in devmem_is_allowed(). This will allow user space to map rtas_rmo_buf and we are able to make RTAS syscall. Reported-by: NBharata B Rao <bharata@linux.vnet.ibm.com> CC: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: NVasant Hegde <hegdevasant@linux.vnet.ibm.com> Acked-by: NDan Williams <dan.j.williams@intel.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 21 1月, 2016 3 次提交
-
-
由 Stephen Rothwell 提交于
Commit d5d6a443 ("arch/powerpc/include/asm/pgtable-ppc64.h: add pmd_[dirty|mkclean] for THP") added a new identical definition of pmd_dirty(). Remove it again. Cc: Minchan Kim <minchan@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Alan Modra 提交于
PowerPC64 uses the symbol .TOC. much as other targets use _GLOBAL_OFFSET_TABLE_. It identifies the value of the GOT pointer (or in powerpc parlance, the TOC pointer). Global offset tables are generally local to an executable or shared library, or in the kernel, module. Thus it does not make sense for a module to resolve a relocation against .TOC. to the kernel's .TOC. value. A module has its own .TOC., and indeed the powerpc64 module relocation processing ignores the kernel value of .TOC. and instead calculates a module-local value. This patch removes code involved in exporting the kernel .TOC., tweaks modpost to ignore an undefined .TOC., and the module loader to twiddle the section symbol so that .TOC. isn't seen as undefined. Note that if the kernel was compiled with -msingle-pic-base then ELFv2 would not have function global entry code setting up r2. In that case the module call stubs would need to be modified to set up r2 using the kernel .TOC. value, requiring some of this code to be reinstated. mpe: Furthermore a change in binutils master (not yet released) causes the current way we handle the TOC to no longer work when building with MODVERSIONS=y and RELOCATABLE=n. The symptom is that modules can not be loaded due to there being no version found for TOC. Cc: stable@vger.kernel.org # 3.16+ Signed-off-by: NAlan Modra <amodra@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Chandan Rajendra 提交于
Test runs on a ppc64 BE guest succeeded using modified fstests. Also tested on ppc64 LE using a home made test - mpe. Signed-off-by: NChandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 20 1月, 2016 2 次提交
-
-
由 Souvik Kumar Chakravarty 提交于
Intel PM Telemetry is a software mechanism via which various SoC PM and performance related parameters like PM counters, firmware trace verbosity, the status of different devices inside the SoC, etc. can be monitored and analyzed. The different samples that may be monitored can be configured at runtime via exported APIs. This patch adds the telemetry core driver that implements basic exported APIs. Signed-off-by: NSouvik Kumar Chakravarty <souvik.k.chakravarty@intel.com> Signed-off-by: NDarren Hart <dvhart@linux.intel.com>
-
由 Qipeng Zha 提交于
This driver provides support for P-Unit mailbox IPC on Intel platforms. The heart of the P-Unit is the Foxton microcontroller and its firmware, which provide mailbox interface for power management usage. Signed-off-by: NQipeng Zha <qipeng.zha@intel.com> Reviewed-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: NDarren Hart <dvhart@linux.intel.com>
-
- 19 1月, 2016 7 次提交
-
-
由 Chris Metcalf 提交于
This isn't used anywhere, so delete it. Looks like the last usage (in x86-specific code) was removed by Tejun in 2011 in commit bd6709a9 ("x86, NUMA: Make 32bit use common NUMA init path"). Signed-off-by: NChris Metcalf <cmetcalf@ezchip.com>
-
由 Chris Metcalf 提交于
This ensures that we always notify context tracking that we have exited from user space no matter how we enter the kernel. It is similar to how arm64 handles context tracking, for example. This allows the removal of all the exception_enter() calls that were added in commit 49e4e156 ("tile: support CONTEXT_TRACKING and thus NOHZ_FULL"). Signed-off-by: NChris Metcalf <cmetcalf@ezchip.com>
-
由 Chris Metcalf 提交于
This flag value is saved in ptregs and used to decide whether to disable irqs when returning from the kernel. Commit 1168df528fe4 ("tile: don't assume user privilege is zero") performed a bad merge from some KVM-enabled code that had not yet been upstreamed. The only issue with the old code is that we will read the interrupt mask in more conditions than we need to (e.g., coming from user space when user space has the Interrupt Critical Section bit set, or coming from a guest kernel), which is a slow multi-cycle operation. This change saves those few cycles in the common case. Signed-off-by: NChris Metcalf <cmetcalf@ezchip.com>
-
由 Chris Metcalf 提交于
Missing parentheses could cause an argument of the form "integer + pointer" to get cast to "(long)integer + pointer" and remain a pointer type, causing compiler warnings. Signed-off-by: NChris Metcalf <cmetcalf@ezchip.com>
-
由 Chris Metcalf 提交于
The warning occurs in setup.c, where it is known that it can't be a problem, but it's still a good idea to silence the warning. The onstack array is converted from an s32 to a u8, which still is plenty of range for the values being managed there. Signed-off-by: NChris Metcalf <cmetcalf@ezchip.com>
-
由 Chris Metcalf 提交于
This information is easily available in the backtrace data and can be helpful when trying to figure out the backtrace, particularly if we're early in kernel entry or late in kernel exit. Signed-off-by: NChris Metcalf <cmetcalf@ezchip.com>
-
由 Chris Metcalf 提交于
This change is a prerequisite change for TASK_ISOLATION but also stands on its own for readability and maintainability. The existing tile do_work_pending() was called in a loop from assembly on the slow path; this change moves the loop into C code as well. For the x86 version see commit c5c46f59 ("x86/entry: Add new, comprehensible entry and exit handlers written in C"). This change exposes a pre-existing bug on the older tilepro platform; the singlestep processing is done last, but on tilepro (unlike tilegx) we enable interrupts while doing that processing, so we could in theory miss a signal or other asynchronous event. A future change could fix this by breaking the singlestep work into a "prepare" step done in the main loop, and a "trigger" step done after exiting the loop. Since this change is intended as purely a restructuring change, we call out the bug explicitly now, but don't yet fix it. Signed-off-by: NChris Metcalf <cmetcalf@ezchip.com>
-
- 18 1月, 2016 2 次提交
-
-
由 Zi Shen Lim 提交于
Code generation functions in arch/arm64/kernel/insn.c previously BUG_ON invalid parameters. Following change of that behavior, now we need to handle the error case where AARCH64_BREAK_FAULT is returned. Instead of error-handling on every emit() in JIT, we add a new validation pass at the end of JIT compilation. There's no point in running JITed code at run-time only to trap due to AARCH64_BREAK_FAULT. Instead, we drop this failed JIT compilation and allow the system to gracefully fallback on the BPF interpreter. Signed-off-by: NZi Shen Lim <zlim.lnx@gmail.com> Suggested-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Zi Shen Lim 提交于
During code generation, we used to BUG_ON unknown/unsupported encoding or invalid parameters. Instead, now we report these as errors and simply return the instruction AARCH64_BREAK_FAULT. Users of these codegen helpers should check for and handle this failure condition as appropriate. Otherwise, unhandled codegen failure will result in trapping at run-time due to AARCH64_BREAK_FAULT, which is arguably better than a BUG_ON. Signed-off-by: NZi Shen Lim <zlim.lnx@gmail.com> Acked-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 1月, 2016 2 次提交
-
-
由 Will Deacon 提交于
As illustrated by commit a3afe70b ("[S390] latencytop s390 support."), HAVE_LATENCYTOP_SUPPORT is defined by an architecture to advertise an implementation of save_stack_trace_tsk. However, as of 9212ddb5 ("stacktrace: provide save_stack_trace_tsk() weak alias") a dummy implementation is provided if STACKTRACE=y. Given that LATENCYTOP already depends on STACKTRACE_SUPPORT and selects STACKTRACE, we can remove HAVE_LATENCYTOP_SUPPORT altogether. Signed-off-by: NWill Deacon <will.deacon@arm.com> Acked-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: James Hogan <james.hogan@imgtec.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Helge Deller <deller@gmx.de> Acked-by: NMichael Ellerman <mpe@ellerman.id.au> Cc: "David S. Miller" <davem@davemloft.net> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Helge Deller 提交于
PA-RISC doesn't have atomic instructions to modify page table entries, so it takes spinlock in the TLB handler and modifies the page table entry non-atomically. If you modify the page table entry without the spinlock, you may race with TLB handler on another CPU and your modification may be lost. Protect against that with usage of purge_tlb_start() and purge_tlb_end() which handles the TLB spinlock. Signed-off-by: NHelge Deller <deller@gmx.de> Cc: stable@vger.kernel.org # v4.4
-
- 16 1月, 2016 19 次提交
-
-
由 Dominik Dingel 提交于
By passing a non-null flag we allow fixup_user_fault to retry, which enables userfaultfd. As during these retries we might drop the mmap_sem we need to check if that happened and redo the complete chain of actions. Signed-off-by: NDominik Dingel <dingel@linux.vnet.ibm.com> Reviewed-by: NAndrea Arcangeli <aarcange@redhat.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: "Jason J. Herne" <jjherne@linux.vnet.ibm.com> Cc: David Rientjes <rientjes@google.com> Cc: Eric B Munson <emunson@akamai.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Dominik Dingel <dingel@linux.vnet.ibm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Dominik Dingel 提交于
During Jason's work with postcopy migration support for s390 a problem regarding gmap faults was discovered. The gmap code will call fixup_user_fault which will end up always in handle_mm_fault. Till now we never cared about retries, but as the userfaultfd code kind of relies on it. this needs some fix. This patchset does not take care of the futex code. I will now look closer at this. This patch (of 2): With the introduction of userfaultfd, kvm on s390 needs fixup_user_fault to pass in FAULT_FLAG_ALLOW_RETRY and give feedback if during the faulting we ever unlocked mmap_sem. This patch brings in the logic to handle retries as well as it cleans up the current documentation. fixup_user_fault was not having the same semantics as filemap_fault. It never indicated if a retry happened and so a caller wasn't able to handle that case. So we now changed the behaviour to always retry a locked mmap_sem. Signed-off-by: NDominik Dingel <dingel@linux.vnet.ibm.com> Reviewed-by: NAndrea Arcangeli <aarcange@redhat.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: "Jason J. Herne" <jjherne@linux.vnet.ibm.com> Cc: David Rientjes <rientjes@google.com> Cc: Eric B Munson <emunson@akamai.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Dominik Dingel <dingel@linux.vnet.ibm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Dan Williams 提交于
A dax mapping establishes a pte with _PAGE_DEVMAP set when the driver has established a devm_memremap_pages() mapping, i.e. when the pfn_t return from ->direct_access() has PFN_DEV and PFN_MAP set. Later, when encountering _PAGE_DEVMAP during a page table walk we lookup and pin a struct dev_pagemap instance to keep the result of pfn_to_page() valid until put_page(). Signed-off-by: NDan Williams <dan.j.williams@intel.com> Tested-by: NLogan Gunthorpe <logang@deltatee.com> Cc: Dave Hansen <dave@sr71.net> Cc: Mel Gorman <mgorman@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Dan Williams 提交于
A dax-huge-page mapping while it uses some thp helpers is ultimately not a transparent huge page. The distinction is especially important in the get_user_pages() path. pmd_devmap() is used to distinguish dax-pmds from pmd_huge() and pmd_trans_huge() which have slightly different semantics. Explicitly mark the pmd_trans_huge() helpers that dax needs by adding pmd_devmap() checks. [kirill.shutemov@linux.intel.com: fix regression in handling mlocked pages in __split_huge_pmd()] Signed-off-by: NDan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave@sr71.net> Cc: Mel Gorman <mgorman@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Dan Williams 提交于
Similar to the conversion of vm_insert_mixed() use pfn_t in the vmf_insert_pfn_pmd() to tag the resulting pte with _PAGE_DEVICE when the pfn is backed by a devm_memremap_pages() mapping. Signed-off-by: NDan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave@sr71.net> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Dan Williams 提交于
Convert the raw unsigned long 'pfn' argument to pfn_t for the purpose of evaluating the PFN_MAP and PFN_DEV flags. When both are set it triggers _PAGE_DEVMAP to be set in the resulting pte. There are no functional changes to the gpu drivers as a result of this conversion. Signed-off-by: NDan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave@sr71.net> Cc: David Airlie <airlied@linux.ie> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Dan Williams 提交于
_PAGE_DEVMAP is a hardware-unused pte bit that will later be used in the get_user_pages() path to identify pfns backed by the dynamic allocation established by devm_memremap_pages. Upon seeing that bit the gup path will lookup and pin the allocation while the pages are in use. Since the _PAGE_DEVMAP bit is > 32 it must be cast to u64 instead of a pteval_t to allow pmd_flags() usage in the realmode boot code to build. Signed-off-by: NDan Williams <dan.j.williams@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Dan Williams 提交于
Take into account that the pmd_t type is a array inside a struct, so it needs two levels of brackets to initialize. Otherwise, a usage of __pmd generates a warning: include/linux/mm.h:986:2: warning: missing braces around initializer [-Wmissing-braces] Signed-off-by: NDan Williams <dan.j.williams@intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Dan Williams 提交于
Switch avr32/include/asm/page.h to use the common defintions for pfn_to_page(), page_to_pfn(), and ARCH_PFN_OFFSET. Signed-off-by: NDan Williams <dan.j.williams@intel.com> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Dan Williams 提交于
Use the new vmem_altmap capability to enable the pmem driver to arrange for a struct page memmap to be established in persistent memory. [linux@roeck-us.net: mn10300: declare __pfn_to_phys() to fix build error] Signed-off-by: NDan Williams <dan.j.williams@intel.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Chinner <david@fromorbit.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: NGuenter Roeck <linux@roeck-us.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Dan Williams 提交于
In support of providing struct page for large persistent memory capacities, use struct vmem_altmap to change the default policy for allocating memory for the memmap array. The default vmemmap_populate() allocates page table storage area from the page allocator. Given persistent memory capacities relative to DRAM it may not be feasible to store the memmap in 'System Memory'. Instead vmem_altmap represents pre-allocated "device pages" to satisfy vmemmap_alloc_block_buf() requests. Signed-off-by: NDan Williams <dan.j.williams@intel.com> Reported-by: Nkbuild test robot <lkp@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Dan Williams 提交于
For the purpose of communicating the optional presence of a 'struct page' for the pfn returned from ->direct_access(), introduce a type that encapsulates a page-frame-number plus flags. These flags contain the historical "page_link" encoding for a scatterlist entry, but can also denote "device memory". Where "device memory" is a set of pfns that are not part of the kernel's linear mapping by default, but are accessed via the same memory controller as ram. The motivation for this new type is large capacity persistent memory that needs struct page entries in the 'memmap' to support 3rd party DMA (i.e. O_DIRECT I/O with a persistent memory source/target). However, we also need it in support of maintaining a list of mapped inodes which need to be unmapped at driver teardown or freeze_bdev() time. Signed-off-by: NDan Williams <dan.j.williams@intel.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Hansen <dave@sr71.net> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Dan Williams 提交于
To date, we have implemented two I/O usage models for persistent memory, PMEM (a persistent "ram disk") and DAX (mmap persistent memory into userspace). This series adds a third, DAX-GUP, that allows DAX mappings to be the target of direct-i/o. It allows userspace to coordinate DMA/RDMA from/to persistent memory. The implementation leverages the ZONE_DEVICE mm-zone that went into 4.3-rc1 (also discussed at kernel summit) to flag pages that are owned and dynamically mapped by a device driver. The pmem driver, after mapping a persistent memory range into the system memmap via devm_memremap_pages(), arranges for DAX to distinguish pfn-only versus page-backed pmem-pfns via flags in the new pfn_t type. The DAX code, upon seeing a PFN_DEV+PFN_MAP flagged pfn, flags the resulting pte(s) inserted into the process page tables with a new _PAGE_DEVMAP flag. Later, when get_user_pages() is walking ptes it keys off _PAGE_DEVMAP to pin the device hosting the page range active. Finally, get_page() and put_page() are modified to take references against the device driver established page mapping. Finally, this need for "struct page" for persistent memory requires memory capacity to store the memmap array. Given the memmap array for a large pool of persistent may exhaust available DRAM introduce a mechanism to allocate the memmap from persistent memory. The new "struct vmem_altmap *" parameter to devm_memremap_pages() enables arch_add_memory() to use reserved pmem capacity rather than the page allocator. This patch (of 18): The core has developed a need for a "pfn_t" type [1]. Move the existing pfn_t in KVM to kvm_pfn_t [2]. [1]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002199.html [2]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002218.htmlSigned-off-by: NDan Williams <dan.j.williams@intel.com> Acked-by: NChristoffer Dall <christoffer.dall@linaro.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Dan Williams 提交于
The core has developed a need for a "pfn_t" type [1]. Convert the usage of pfn_t by usermode-linux to an unsigned long, and update pfn_to_phys() to drop its expectation of a typed pfn. [1]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002199.htmlSigned-off-by: NDan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Dan Williams 提交于
To date, we have implemented two I/O usage models for persistent memory, PMEM (a persistent "ram disk") and DAX (mmap persistent memory into userspace). This series adds a third, DAX-GUP, that allows DAX mappings to be the target of direct-i/o. It allows userspace to coordinate DMA/RDMA from/to persistent memory. The implementation leverages the ZONE_DEVICE mm-zone that went into 4.3-rc1 (also discussed at kernel summit) to flag pages that are owned and dynamically mapped by a device driver. The pmem driver, after mapping a persistent memory range into the system memmap via devm_memremap_pages(), arranges for DAX to distinguish pfn-only versus page-backed pmem-pfns via flags in the new pfn_t type. The DAX code, upon seeing a PFN_DEV+PFN_MAP flagged pfn, flags the resulting pte(s) inserted into the process page tables with a new _PAGE_DEVMAP flag. Later, when get_user_pages() is walking ptes it keys off _PAGE_DEVMAP to pin the device hosting the page range active. Finally, get_page() and put_page() are modified to take references against the device driver established page mapping. Finally, this need for "struct page" for persistent memory requires memory capacity to store the memmap array. Given the memmap array for a large pool of persistent may exhaust available DRAM introduce a mechanism to allocate the memmap from persistent memory. The new "struct vmem_altmap *" parameter to devm_memremap_pages() enables arch_add_memory() to use reserved pmem capacity rather than the page allocator. This patch (of 25): Both __dax_pmd_fault, and clear_pmem() were taking special steps to clear memory a page at a time to take advantage of non-temporal clear_page() implementations. However, x86_64 does not use non-temporal instructions for clear_page(), and arch_clear_pmem() was always incurring the cost of __arch_wb_cache_pmem(). Clean up the assumption that doing clear_pmem() a page at a time is more performant. Signed-off-by: NDan Williams <dan.j.williams@intel.com> Reported-by: NDave Hansen <dave.hansen@linux.intel.com> Reviewed-by: NRoss Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: NJeff Moyer <jmoyer@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Christoffer Dall <christoffer.dall@linaro.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Chinner <david@fromorbit.com> Cc: David Airlie <airlied@linux.ie> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kara <jack@suse.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Jens Axboe <axboe@fb.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Logan Gunthorpe <logang@deltatee.com> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Richard Weinberger <richard@nod.at> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hpe.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Minchan Kim 提交于
MADV_FREE needs pmd_dirty and pmd_mkclean for detecting recent overwrite of the contents since MADV_FREE syscall is called for THP page. This patch adds pmd_mkclean for THP page MADV_FREE support. Signed-off-by: NMinchan Kim <minchan@kernel.org> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Shaohua Li <shli@kernel.org> Cc: <yalin.wang2010@gmail.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chen Gang <gang.chen.5i5j@gmail.com> Cc: Chris Zankel <chris@zankel.net> Cc: Daniel Micay <danielmicay@gmail.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: David S. Miller <davem@davemloft.net> Cc: Helge Deller <deller@gmx.de> Cc: Hugh Dickins <hughd@google.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Jason Evans <je@fb.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mika Penttil <mika.penttila@nextfour.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Rik van Riel <riel@redhat.com> Cc: Roland Dreier <roland@kernel.org> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Shaohua Li <shli@kernel.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Minchan Kim 提交于
MADV_FREE needs pmd_dirty and pmd_mkclean for detecting recent overwrite of the contents since MADV_FREE syscall is called for THP page. This patch adds pmd_mkclean for THP page MADV_FREE support. Signed-off-by: NMinchan Kim <minchan@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Shaohua Li <shli@kernel.org> Cc: <yalin.wang2010@gmail.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Chen Gang <gang.chen.5i5j@gmail.com> Cc: Chris Zankel <chris@zankel.net> Cc: Daniel Micay <danielmicay@gmail.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: David S. Miller <davem@davemloft.net> Cc: Helge Deller <deller@gmx.de> Cc: Hugh Dickins <hughd@google.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Jason Evans <je@fb.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mika Penttil <mika.penttila@nextfour.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Rik van Riel <riel@redhat.com> Cc: Roland Dreier <roland@kernel.org> Cc: Shaohua Li <shli@kernel.org> Cc: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Minchan Kim 提交于
MADV_FREE needs pmd_dirty and pmd_mkclean for detecting recent overwrite of the contents since MADV_FREE syscall is called for THP page. This patch adds pmd_dirty and pmd_mkclean for THP page MADV_FREE support. Signed-off-by: NMinchan Kim <minchan@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Shaohua Li <shli@kernel.org> Cc: <yalin.wang2010@gmail.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chen Gang <gang.chen.5i5j@gmail.com> Cc: Chris Zankel <chris@zankel.net> Cc: Daniel Micay <danielmicay@gmail.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: David S. Miller <davem@davemloft.net> Cc: Helge Deller <deller@gmx.de> Cc: Hugh Dickins <hughd@google.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Jason Evans <je@fb.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mika Penttil <mika.penttila@nextfour.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Rik van Riel <riel@redhat.com> Cc: Roland Dreier <roland@kernel.org> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Shaohua Li <shli@kernel.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Minchan Kim 提交于
MADV_FREE needs pmd_dirty and pmd_mkclean for detecting recent overwrite of the contents since MADV_FREE syscall is called for THP page. This patch adds pmd_dirty and pmd_mkclean for THP page MADV_FREE support. Signed-off-by: NMinchan Kim <minchan@kernel.org> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Shaohua Li <shli@kernel.org> Cc: <yalin.wang2010@gmail.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chen Gang <gang.chen.5i5j@gmail.com> Cc: Chris Zankel <chris@zankel.net> Cc: Daniel Micay <danielmicay@gmail.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: David S. Miller <davem@davemloft.net> Cc: Helge Deller <deller@gmx.de> Cc: Hugh Dickins <hughd@google.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Jason Evans <je@fb.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mika Penttil <mika.penttila@nextfour.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Rik van Riel <riel@redhat.com> Cc: Roland Dreier <roland@kernel.org> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Shaohua Li <shli@kernel.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-