- 31 January 2017, 17 commits
-
By Paul Mackerras:

If the guest is in radix mode, then it doesn't have a hashed page table (HPT), so all of the hypercalls that manipulate the HPT can't work and should return an error. This adds checks to make them return H_FUNCTION ("function not supported").

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
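The check itself is a one-liner per hypercall; here is a minimal sketch of the pattern using one of the HPT handlers as an example (the exact handler signature is an assumption, not quoted from the patch):

```c
/* Sketch: reject an HPT-manipulating hypercall for a radix guest. */
long kvmppc_h_protect(struct kvm_vcpu *vcpu, unsigned long flags,
		      unsigned long pte_index, unsigned long avpn)
{
	if (kvm_is_radix(vcpu->kvm))
		return H_FUNCTION;	/* "function not supported" */
	/* ... unchanged HPT-mode implementation continues here ... */
	return H_SUCCESS;
}
```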
-
By Paul Mackerras:

This adds code to keep track of dirty pages when requested (that is, when memslot->dirty_bitmap is non-NULL) for radix guests. We use the dirty bits in the PTEs in the second-level (partition-scoped) page tables, together with a bitmap of pages that were dirty when their PTE was invalidated (e.g., when the page was paged out). This bitmap is stored in the first half of the memslot->dirty_bitmap area, and kvm_vm_ioctl_get_dirty_log_hv() now uses the second half for the bitmap that gets returned to userspace.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
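A sketch of how the two halves of the area relate, assuming the generic kvm_dirty_bitmap_bytes() helper; the function name here is illustrative:

```c
/* First half of memslot->dirty_bitmap: pages that were dirty when their
 * PTE was invalidated.  Second half: the log copied out to userspace. */
static unsigned long *hv_user_half(struct kvm_memory_slot *memslot)
{
	unsigned long n = kvm_dirty_bitmap_bytes(memslot);

	return memslot->dirty_bitmap + n / sizeof(long);
}
```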
-
By Paul Mackerras:

This adapts our implementations of the MMU notifier callbacks (unmap_hva, unmap_hva_range, age_hva, test_age_hva, set_spte_hva) to call radix functions when the guest is using radix. These implementations are much simpler than for HPT guests because we have only one PTE to deal with, so we don't need to traverse rmap chains.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
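The dispatch is a simple mode test; a sketch under the assumption that per-mode helpers exist (the helper names below are placeholders, not the exact ones from the patch):

```c
/* Route an MMU notifier callback to the radix or HPT implementation. */
static int kvm_unmap_hva_range_hv(struct kvm *kvm, unsigned long start,
				  unsigned long end)
{
	if (kvm_is_radix(kvm))
		return kvm_unmap_range_radix(kvm, start, end); /* one PTE, no rmap */
	return kvm_unmap_range_hpt(kvm, start, end);	       /* rmap chains */
}
```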
-
By Paul Mackerras:

This adds the code to construct the second-level ("partition-scoped" in architecturese) page tables for guests using the radix MMU. Apart from the PGD level, which is allocated when the guest is created, the rest of the tree is all constructed in response to hypervisor page faults. As well as hypervisor page faults for missing pages, we also get faults for reference/change (RC) bits needing to be set, as well as various other error conditions. For now, we only set the R or C bit in the guest page table if the same bit is set in the host PTE for the backing page. This code can take advantage of the guest being backed with either transparent or ordinary 2MB huge pages, and insert 2MB page entries into the guest page tables. There is no support for 1GB huge pages yet.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
By Paul Mackerras:

This adds code to branch around the parts that radix guests don't need - clearing and loading the SLB with the guest SLB contents, saving the guest SLB contents on exit, and restoring the host SLB contents. Since the host is now using radix, we need to save and restore the host value for the PID register. On hypervisor data/instruction storage interrupts, we don't do the guest HPT lookup on radix, but just save the guest physical address for the fault (from the ASDR register) in the vcpu struct.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
By Paul Mackerras:

This adds a field in struct kvm_arch and an inline helper to indicate whether a guest is a radix guest or not, plus a new file to contain the radix MMU code, which currently contains just a translate function which knows how to traverse the guest page tables to translate an address.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
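The flag and helper are small; a sketch of their likely shape (the field name is an assumption):

```c
struct kvm_arch {
	/* ... existing fields ... */
	u8 radix;	/* non-zero if the guest uses radix translation */
};

static inline bool kvm_is_radix(struct kvm *kvm)
{
	return kvm->arch.radix;
}
```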
-
By Paul Mackerras:

POWER9 adds a register called ASDR (Access Segment Descriptor Register), which is set by hypervisor data/instruction storage interrupts to contain the segment descriptor for the address being accessed, assuming the guest is using HPT translation. (For radix guests, it contains the guest real address of the access.) Thus, for HPT guests on POWER9, we can use this register rather than looking up the SLB with the slbfee. instruction.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
By Paul Mackerras:

This adds the implementation of the KVM_PPC_CONFIGURE_V3_MMU ioctl for HPT guests on POWER9. With this, we can return 1 for the KVM_CAP_PPC_MMU_HASH_V3 capability.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
By Paul Mackerras:

This adds two capabilities and two ioctls to allow userspace to find out about and configure the POWER9 MMU in a guest. The two capabilities tell userspace whether KVM can support a guest using the radix MMU, or using the hashed page table (HPT) MMU with a process table and segment tables. (Note that the MMUs in the POWER9 processor cores do not use the process and segment tables when in HPT mode, but the nest MMU does.) The KVM_PPC_CONFIGURE_V3_MMU ioctl allows userspace to specify whether a guest will use the radix MMU or the HPT MMU, and to specify the size and location (in guest space) of the process table. The KVM_PPC_GET_RMMU_INFO ioctl gives userspace information about the radix MMU. It returns a list of supported radix tree geometries (base page size and number of bits indexed at each level of the radix tree) and the encoding used to specify the various page sizes for the TLB invalidate entry instruction. Initially, both capabilities return 0 and the ioctls return -EINVAL, until the necessary infrastructure for them to operate correctly is added.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
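A sketch of the uapi structures this describes, as I understand them from arch/powerpc/include/uapi/asm/kvm.h (treat field names and array sizes as assumptions):

```c
/* Argument to KVM_PPC_CONFIGURE_V3_MMU. */
struct kvm_ppc_mmuv3_cfg {
	__u64 flags;		/* e.g. KVM_PPC_MMUV3_RADIX, KVM_PPC_MMUV3_GTSE */
	__u64 process_table;	/* guest address and size of the process table */
};

/* Returned by KVM_PPC_GET_RMMU_INFO. */
struct kvm_ppc_rmmu_info {
	struct kvm_ppc_radix_geom {
		__u8 page_shift;	/* base page size */
		__u8 level_bits[4];	/* bits indexed at each tree level */
		__u8 pad[3];
	} geometries[8];
	__u32 ap_encodings[8];		/* page-size encodings for tlbie */
};
```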
-
By Paul Mackerras:

With host and guest both using radix translation, it is feasible for the host to take interrupts that come from the guest with relocation on, and that is in fact what the POWER9 hardware will do when LPCR[AIL] = 3. All such interrupts use HSRR0/1 not SRR0/1 except for system call with LEV=1 (hcall). Therefore this adds the KVM tests to the _HV variants of the relocation-on interrupt handlers, and adds the KVM test to the relocation-on system call entry point. We also instantiate the relocation-on versions of the hypervisor data storage and instruction interrupt handlers, since these can occur with relocation on in radix guests.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
By Paul Mackerras:

When changing a partition table entry on POWER9, we do a particular form of the tlbie instruction which flushes all TLBs and caches of the partition table for a given logical partition ID (LPID). This instruction has a field in the instruction word, labelled R (radix), which should be 1 if the partition was previously a radix partition and 0 if it was an HPT partition. This implements that logic.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
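A heavily simplified sketch of that logic, assuming the kernel's PPC_TLBIE_5 macro with operands (rb, rs, ric, prs, r); the exact operand values are assumptions:

```c
/* Flush all TLB/caching of the partition table for this LPID; the last
 * operand is the R field: 1 if the old entry was radix (PATB_HR), else 0. */
if (old_patb0 & PATB_HR)
	asm volatile(PPC_TLBIE_5(%0, %1, 2, 0, 1) : :
		     "r" (TLBIEL_INVAL_SET_LPID), "r" (lpid));
else
	asm volatile(PPC_TLBIE_5(%0, %1, 2, 0, 0) : :
		     "r" (TLBIEL_INVAL_SET_LPID), "r" (lpid));
asm volatile("eieio; tlbsync; ptesync" : : : "memory");
```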
-
By Paul Mackerras:

This exports the pgtable_cache array and the pgtable_cache_add function so that HV KVM can use them for allocating radix page tables for guests.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
By Paul Mackerras:

This adds definitions for bits in the DSISR register which are used by POWER9 for various translation-related exception conditions, and for some more bits in the partition table entry that will be needed by KVM.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
By Paul Mackerras:

To use radix as a guest, we first need to tell the hypervisor via the ibm,client-architecture call that we support POWER9 and architecture v3.00, that we can do either radix or hash, and that we would like to choose later using an hcall (the H_REGISTER_PROC_TBL hcall). Then we need to check whether the hypervisor agreed to us using radix. We need to do this very early in the kernel boot process, before any of the MMU initialization is done. If the hypervisor doesn't agree, we can't use radix and therefore clear the radix MMU feature bit. Later, when we have set up our process table, which points to the radix tree for each process, we need to install that using the H_REGISTER_PROC_TBL hcall.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
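A sketch of the registration step, assuming the pseries hcall wrapper plpar_hcall_norets() and the PROC_TABLE_* flag names; the exact flag combination is an assumption:

```c
/* Tell the hypervisor where our process table lives. */
static int register_process_table(unsigned long base, unsigned long page_size,
				  unsigned long table_size)
{
	unsigned long flags = PROC_TABLE_NEW | PROC_TABLE_RADIX |
			      PROC_TABLE_GTSE;

	return plpar_hcall_norets(H_REGISTER_PROC_TBL, flags,
				  base, page_size, table_size);
}
```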
-
By Paul Mackerras:

This fixes the byte index values for some of the option bits in the "ibm,architecture-vec-5" property. The "platform facilities options" bits are in byte 17 not byte 14, so the upper 8 bits of their definitions need to be 0x11 not 0x0E. The "sub processor support" option is in byte 21 not byte 15. Note that none of these options are actually looked up in "ibm,architecture-vec-5" at this time, so there is no bug. When checking whether option bits are set, we should check that the offset of the byte being checked is less than the vector length that we got from the hypervisor.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
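A sketch of the encoding and the bounds check described above; the OV5_* macro shapes follow arch/powerpc/include/asm/prom.h, while the helper function is illustrative:

```c
/* An OV5 option constant packs the byte index in its upper 8 bits
 * and the bit mask in its lower 8 bits. */
#define OV5_INDX(x)	((x) >> 8)
#define OV5_FEAT(x)	((x) & 0xff)

static bool vec5_has_feature(const u8 *vec5, u8 vec5_len, u16 opt)
{
	/* never read past the vector length the hypervisor gave us */
	return OV5_INDX(opt) < vec5_len &&
	       (vec5[OV5_INDX(opt)] & OV5_FEAT(opt));
}
```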
-
By Paul Mackerras:

Currently, if the kernel is running on a POWER9 processor under a hypervisor, it will try to use the radix MMU even though it doesn't have the necessary code to use radix under a hypervisor (it doesn't negotiate use of radix, and it doesn't do the H_REGISTER_PROC_TBL hcall). The result is that the guest kernel will crash when it tries to turn on the MMU.

This fixes it by looking for the /chosen/ibm,architecture-vec-5 property, and if it exists, clears the radix MMU feature bit, before we decide whether to initialize for radix or HPT. This property is created by the hypervisor as a result of the guest calling the ibm,client-architecture-support method to indicate its capabilities, so it will indicate whether the hypervisor agreed to us using radix.

Systems without a hypervisor may have this property also (for example, skiboot creates it), so we check the HV bit in the MSR to see whether we are running as a guest or not. If we are in hypervisor mode, then we can do whatever we like including using the radix MMU.

The reason for using this property is that in future, when we have support for using radix under a hypervisor, we will need to check this property to see whether the hypervisor agreed to us using radix.

Fixes: 2bfd65e4 ("powerpc/mm/radix: Add radix callbacks for early init routines")
Cc: stable@vger.kernel.org # v4.7+
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
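A sketch of the early check, using the flat-device-tree helpers; the surrounding function shape is an assumption, and it is only reached when the MSR HV bit is clear (i.e. we are a guest):

```c
static void __init early_check_vec5(void)
{
	unsigned long root = of_get_flat_dt_root();
	long chosen;
	int size;

	chosen = of_get_flat_dt_subnode_by_name(root, "chosen");
	if (chosen == -FDT_ERR_NOTFOUND)
		return;		/* no client-architecture-support negotiation */
	if (of_get_flat_dt_prop(chosen, "ibm,architecture-vec-5", &size))
		/* running under a hypervisor: don't assume radix is usable */
		cur_cpu_spec->mmu_features &= ~MMU_FTR_TYPE_RADIX;
}
```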
-
By Nicholas Piggin:

64-bit Book3S exception handlers must find the dynamic kernel base to add to the target address when branching beyond __end_interrupts, in order to support the kernel running at a non-0 physical address. Support this in KVM by branching with CTR, similarly to regular interrupt handlers. The guest CTR is saved in HSTATE_SCRATCH1 and restored after the branch. Without this, the host kernel hangs and crashes randomly when it is running at a non-0 address and a KVM guest is started.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 27 January 2017, 2 commits
-
By Nicholas Piggin:

A subsequent patch to make KVM handlers relocation-safe makes them unusable from within alt section "else" cases (due to the way fixed addresses are taken from within fixed section head code). Stop open-coding the KVM handlers, and add them both as normal. A more optimal fix may be to allow some level of alternate feature patching in the exception macros themselves, but for now this will do. The TRAMP_KVM handlers must be moved to the "virt" fixed section area (the name is arbitrary) in order to be closer to .text and avoid the dreaded "relocation truncated to fit" error.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
By Nicholas Piggin:

Change the calling convention to put the trap number together with CR in two halves of r12, which frees up HSTATE_SCRATCH2 in the HV handler. The 64-bit PR handler entry translates the calling convention back to match the previous call convention (i.e., shared with 32-bit), for simplicity.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 26 December 2016, 2 commits
-
By Larry Finger:

I am getting the following warning when I build kernel 4.9-git on my PowerBook G4 with a 32-bit PPC processor:

  AS      arch/powerpc/kernel/misc_32.o
  arch/powerpc/kernel/misc_32.S:299:7: warning: "CONFIG_FSL_BOOKE" is not defined [-Wundef]

This problem is evident after commit 989cea5c ("kbuild: prevent lib-ksyms.o rebuilds"); however, this change in kbuild only exposes an error that has been in the code since 2005 when this source file was created. That was with commit 9994a338 ("powerpc: Introduce entry_{32,64}.S, misc_{32,64}.S, systbl.S"). The offending line does not make a lot of sense. This error does not seem to cause any errors in the executable, thus I am not recommending that it be applied to any stable versions. Thanks to Nicholas Piggin for suggesting this solution.

Fixes: 9994a338 ("powerpc: Introduce entry_{32,64}.S, misc_{32,64}.S, systbl.S")
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
By Thomas Gleixner:

ktime_set(S,N) was required for the timespec storage type and is still useful for situations where a seconds and nanoseconds part of a time value needs to be converted. For anything where the seconds argument is 0, this is pointless and can be replaced with a simple assignment.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
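A worked example of the simplification: since ktime_t is a plain s64 nanosecond count, ktime_set(0, N) is just N (the helper below is illustrative):

```c
#include <linux/ktime.h>

static ktime_t poll_interval(void)
{
	/* before: return ktime_set(0, 100 * NSEC_PER_MSEC); */
	return 100 * NSEC_PER_MSEC;	/* after: a simple assignment */
}
```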
-
- 25 December 2016, 3 commits
-
By Thomas Gleixner:

There is no point in having an extra type for extra confusion. u64 is unambiguous. Conversion was done with the following coccinelle script:

  @rem@
  @@
  -typedef u64 cycle_t;

  @fix@
  typedef cycle_t;
  @@
  -cycle_t
  +u64

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
-
By Thomas Gleixner:

When the state names got added, a script was used to add the extra argument to the calls. The script basically converted the state constant to a string, but the cleanup to convert these strings into meaningful ones did not happen. Replace all the useless strings with 'subsys/xxx/yyy:state' strings which are used in all the other places already.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Link: http://lkml.kernel.org/r/20161221192112.085444152@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
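An example of the naming convention, with a hypothetical subsystem and callbacks:

```c
#include <linux/cpuhotplug.h>

static int foo_cpu_online(unsigned int cpu)  { return 0; }	/* hypothetical */
static int foo_cpu_offline(unsigned int cpu) { return 0; }	/* hypothetical */

static int __init foo_init(void)
{
	int ret;

	/* "subsys/xxx/yyy:state" style name, not a stringified constant */
	ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "lib/foo:online",
				foo_cpu_online, foo_cpu_offline);
	return ret < 0 ? ret : 0;	/* DYN states return the slot number */
}
```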
-
By Linus Torvalds:

This was entirely automated, using the script by Al:

  PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
  sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
      $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.

Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 21 December 2016, 3 commits
-
By Madalin Bucur:

The fsl/fman drivers will use of_platform_populate() on all supported platforms. Call of_platform_populate() to probe the FMan sub-nodes.

Signed-off-by: Igal Liberman <igal.liberman@freescale.com>
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Acked-by: Scott Wood <oss@buserror.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
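A sketch of the probe-time call, with an illustrative function name; of_platform_populate() creates platform devices for the children of the given node:

```c
#include <linux/of_platform.h>

static int fman_populate_children(struct platform_device *of_dev)
{
	/* probe the FMan sub-nodes (ports, MACs, ...) */
	return of_platform_populate(of_dev->dev.of_node, NULL, NULL,
				    &of_dev->dev);
}
```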
-
By Thiago Jung Bauermann:

The IMA kexec buffer allows the currently running kernel to pass the measurement list via a kexec segment to the kernel that will be kexec'd. This is the architecture-specific part of setting up the IMA kexec buffer for the next kernel. It will be used in the next patch.

Link: http://lkml.kernel.org/r/1480554346-29071-6-git-send-email-zohar@linux.vnet.ibm.com
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andreas Steffen <andreas.steffen@strongswan.org>
Cc: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Josh Sklar <sklar@linux.vnet.ibm.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
By Thiago Jung Bauermann:

Patch series "ima: carry the measurement list across kexec", v8.

The TPM PCRs are only reset on a hard reboot. In order to validate a TPM's quote after a soft reboot (eg. kexec -e), the IMA measurement list of the running kernel must be saved and then restored on the subsequent boot, possibly of a different architecture. The existing securityfs binary_runtime_measurements file conveniently provides a serialized format of the IMA measurement list. This patch set serializes the measurement list in this format and restores it.

Up to now, the binary_runtime_measurements was defined as architecture-native format, the assumption being that userspace could and would handle any architecture conversions. With the ability to carry the measurement list across kexec, possibly from one architecture to a different one, the per-boot architecture information is lost and with it the ability to recalculate the template digest hash. To resolve this problem, without breaking the existing ABI, this patch set introduces the boot command line option "ima_canonical_fmt", which is arbitrarily defined as little endian. The need for this boot command line option will be limited to the existing version 1 format of the binary_runtime_measurements. Subsequent formats will be defined as canonical format (eg. TPM 2.0 support for larger digests).

A simplified method of Thiago Bauermann's "kexec buffer handover" patch series for carrying the IMA measurement list across kexec is included in this patch set. The simplified method requires all file measurements be taken prior to executing the kexec load, as subsequent measurements will not be carried across the kexec and restored.

This patch (of 10): The IMA kexec buffer allows the currently running kernel to pass the measurement list via a kexec segment to the kernel that will be kexec'd. The second kernel can check whether the previous kernel sent the buffer and retrieve it. This is the architecture-specific part which enables IMA to receive the measurement list passed by the previous kernel. It will be used in the next patch. The change in machine_kexec_64.c is to factor out the logic of removing an FDT memory reservation so that it can be used by remove_ima_buffer.

Link: http://lkml.kernel.org/r/1480554346-29071-2-git-send-email-zohar@linux.vnet.ibm.com
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andreas Steffen <andreas.steffen@strongswan.org>
Cc: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Josh Sklar <sklar@linux.vnet.ibm.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 15 December 2016, 2 commits
-
By Jan Kara:

Every single user of vmf->virtual_address typed that entry to unsigned long before doing anything with it, so the type of virtual_address does not really provide us any additional safety. Just use masked vmf->address, which already has the appropriate type.

Link: http://lkml.kernel.org/r/1479460644-25076-3-git-send-email-jack@suse.cz
Signed-off-by: Jan Kara <jack@suse.cz>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
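A worked example of the conversion (the helper is illustrative):

```c
static unsigned long fault_page_addr(struct vm_fault *vmf)
{
	/* before: return (unsigned long)vmf->virtual_address; */
	return vmf->address & PAGE_MASK;	/* after: already unsigned long */
}
```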
-
By Alexander Duyck:

This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC, which allows us to avoid invoking cache line invalidation if the driver will just handle it via a sync_for_cpu or sync_for_device call.

Link: http://lkml.kernel.org/r/20161110113534.76501.86492.stgit@ahduyck-blue-test.jf.intel.com
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
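A sketch of the pattern this enables, assuming the dma_map_page_attrs()/dma_sync_single_for_cpu() API; the function name and sizes are illustrative:

```c
#include <linux/dma-mapping.h>

static dma_addr_t map_rx_buffer(struct device *dev, struct page *page,
				size_t size)
{
	/* skip the implicit CPU sync at map time ... */
	dma_addr_t addr = dma_map_page_attrs(dev, page, 0, size,
					     DMA_FROM_DEVICE,
					     DMA_ATTR_SKIP_CPU_SYNC);

	/* ... later, before the CPU reads the received data: */
	dma_sync_single_for_cpu(dev, addr, size, DMA_FROM_DEVICE);
	return addr;
}
```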
-
- 13 December 2016, 4 commits
-
By Aneesh Kumar K.V:

Add an arch-specific callback in the generic THP page cache code that will deposit and withdraw the preallocated page table. Archs like ppc64 use this preallocated table to store the hash pte slot information.

Testing: kernel build of the patch series on tmpfs mounted with option huge=always. The related thp stats:

  thp_fault_alloc 72939
  thp_fault_fallback 60547
  thp_collapse_alloc 603
  thp_collapse_alloc_failed 0
  thp_file_alloc 253763
  thp_file_mapped 4251
  thp_split_page 51518
  thp_split_page_failed 1
  thp_deferred_split_page 73566
  thp_split_pmd 665
  thp_zero_page_alloc 3
  thp_zero_page_alloc_failed 0

[akpm@linux-foundation.org: remove unneeded parentheses, per Kirill]
Link: http://lkml.kernel.org/r/20161113150025.17942-2-aneesh.kumar@linux.vnet.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
By Aneesh Kumar K.V:

Independent of whether the vma is for anonymous memory, some arches like ppc64 would like to override pmd_move_must_withdraw(). One option is to encapsulate the vma_is_anonymous() check for general architectures inside pmd_move_must_withdraw() so that it is always called, and architectures that need unconditional overriding can override this function. ppc64 needs to override the function when the MMU is configured to use hash PTEs.

[bsingharora@gmail.com: reworked changelog]
Link: http://lkml.kernel.org/r/20161113150025.17942-1-aneesh.kumar@linux.vnet.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
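A sketch of the encapsulated generic helper after this change, as I read the description (the body mirrors the previously open-coded test):

```c
static bool pmd_move_must_withdraw(spinlock_t *new_pmd_ptl,
				   spinlock_t *old_pmd_ptl,
				   struct vm_area_struct *vma)
{
	/*
	 * Generic rule: deposited page tables only matter for anonymous
	 * THP when the destination pmd uses a different lock.  Archs such
	 * as ppc64 (hash MMU) override this to return true unconditionally.
	 */
	return (new_pmd_ptl != old_pmd_ptl) && vma_is_anonymous(vma);
}
```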
-
By Reza Arbab:

Patch series "enable movable nodes on non-x86 configs", v7.

This patchset allows more configs to make use of movable nodes. When CONFIG_MOVABLE_NODE is selected, there are two ways to introduce such nodes into the system:

1. Discover movable nodes at boot. Currently this is only possible on x86, but we will enable configs supporting fdt to do the same.

2. Hotplug and online all of a node's memory using online_movable. This is already possible on any config supporting memory hotplug, not just x86, but the Kconfig doesn't say so. We will fix that.

We'll also remove some cruft on power which would prevent (2).

This patch (of 5): Remove the check which prevents us from hotplugging into an empty node. The original commit b226e462 ("[PATCH] powerpc: don't add memory to empty node/zone") states that this was intended to be a temporary measure. It is a workaround for an oops which no longer occurs.

Link: http://lkml.kernel.org/r/1479160961-25840-2-git-send-email-arbab@linux.vnet.ibm.com
Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Alistair Popple <apopple@au1.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Cc: Frank Rowand <frowand.list@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
By Aneesh Kumar K.V:

With commit e77b0852 ("mm/mmu_gather: track page size with mmu gather and force flush if page size change") we added the ability to force a tlb flush when the page size changes in a mmu_gather loop. We did that by checking for a page size change every time we added a page to mmu_gather for lazy flush/remove. We can improve that by moving the page size change check early and not doing it every time we add a page.

This also helps us to do a tlb flush when invalidating a range covering a dax mapping. For dax mappings we don't have a backing struct page and hence we don't call tlb_remove_page, which earlier forced the tlb flush on page size change. Moving the page size change check earlier means we will do the same even for dax mappings. We also avoid doing this check on architectures other than powerpc. In a later patch we will remove the page size check from tlb_remove_page().

Link: http://lkml.kernel.org/r/20161026084839.27299-5-aneesh.kumar@linux.vnet.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 10 December 2016, 7 commits
-
By Madalin Bucur:

Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
-
By Madalin Bucur:

The alias is used by the boot loader to perform a device tree fixup.

Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
-
By Madalin Bucur:

Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
-
By Madalin Bucur:

Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
-
By Fabian Frederick:

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Scott Wood <oss@buserror.net>
-
By Christophe Leroy:

8xx uses a two-level page table with support for two different Linux page sizes (4k and 16k). 8xx also supports two different hugepage sizes, 512k and 8M. In order to support them on Linux we define two different page table layouts.

The size of pages is in the PGD entry, using the PS field (bits 28-29):

  00 : Small pages (4k or 16k)
  01 : 512k pages
  10 : reserved
  11 : 8M pages

For 512K hugepage size a pgd entry has the below format [<hugepte address>0101]. The hugepte table allocated will contain 8 entries pointing to 512K huge ptes in 4k pages mode and 64 entries in 16k pages mode.

For 8M in 16k mode, a pgd entry has the below format [<hugepte address>1101]. The hugepte table allocated will contain 8 entries pointing to 8M huge ptes.

For 8M in 4k mode, multiple pgd entries point to the same hugepte address and each pgd entry has the below format [<hugepte address>1101]. The hugepte table allocated will only have one entry.

For the time being, we do not support the CPU15 ERRATA when HUGETLB is selected.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> (v3, for the generic bits)
Signed-off-by: Scott Wood <oss@buserror.net>
-
By Christophe Leroy:

Today there are two implementations of hugetlbpages which are managed by exclusive #ifdefs:

* FSL_BOOKE: several directory entries point to the same single hugepage
* BOOK3S: one upper-level directory entry points to a table of hugepages

In preparation for implementing hugepage support on the 8xx, we need a mix of the two above solutions, because the 8xx needs both cases depending on the size of pages:

* In 4k page size mode, each PGD entry covers a 4M byte area. It means that 2 PGD entries will be necessary to cover an 8M hugepage while a single PGD entry will cover 8x 512k hugepages.
* In 16k page size mode, each PGD entry covers a 64M byte area. It means that 8x 8M hugepages will be covered by one PGD entry and 64x 512k hugepages will be covered by one PGD entry.

This patch:

* removes #ifdefs in favor of if/else based on the range sizes
* merges the two huge_pte_alloc() functions as they are pretty similar
* merges the two hugetlbpage_init() functions as they are pretty similar

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> (v3)
Signed-off-by: Scott Wood <oss@buserror.net>
-