- 25 1月, 2017 3 次提交
-
-
由 Aneesh Kumar K.V 提交于
In commit a4b34954 ("powerpc/mm: Cleanup LPCR defines") we updated LPCR_VRMASD wrongly as below. -#define LPCR_VRMASD (0x1ful << (63-16)) +#define LPCR_VRMASD_SH 47 +#define LPCR_VRMASD (ASM_CONST(1) << LPCR_VRMASD_SH) We initialize the VRMA bits in LPCR to 0x00 in kvm. Hence using a different mask value as above while updating lpcr should not have any impact. This patch updates it to the correct value. Fixes: a4b34954 ("powerpc/mm: Cleanup LPCR defines") Reported-by: NRam Pai <linuxram@us.ibm.com> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NJia He <hejianet@gmail.com> Acked-by: NPaul Mackerras <paulus@ozlabs.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Paul Mackerras 提交于
Currently we have optimized hand-coded assembly checksum routines for big-endian 64-bit systems, but for little-endian we use the generic C routines. This modifies the optimized routines to work for little-endian. With this, we no longer need to enable CONFIG_GENERIC_CSUM. This also fixes a couple of comments in checksum_64.S so they accurately reflect what the associated instruction does. Signed-off-by: NPaul Mackerras <paulus@ozlabs.org> [mpe: Use the more common __BIG_ENDIAN__] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Paul Mackerras 提交于
These functions compute an IP checksum by computing a 64-bit sum and folding it to 32 bits (the "nofold" in their names refers to folding down to 16 bits). However, doing (u32) (s + (s >> 32)) is not sufficient to fold a 64-bit sum to 32 bits correctly. The addition can produce a carry out from bit 31, which needs to be added in to the sum to produce the correct result. To fix this, we copy the from64to32() function from lib/checksum.c and use that. Signed-off-by: NPaul Mackerras <paulus@ozlabs.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 23 1月, 2017 2 次提交
-
-
由 Aneesh Kumar K.V 提交于
We now support THP with both 64k and 4K page size configuration for radix. (hash only support THP with 64K page size). Hence we will have CONFIG_TRANSPARENT_HUGEPAGE enabled for both PPC_64K and PPC_4K config. Since we only need large pmd page table with hash configuration (to store the slot information in the second half of the table) restrict the large pmd page table to THP and 64K configs. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Reviewed-by: NAnshuman Khandual <khandual@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
We don't do this for other page table entries. So lets keep this simple and always return false for hugepd check on a 64K page size config. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 20 1月, 2017 1 次提交
-
-
由 Anton Blanchard 提交于
IBM bit 31 (for the rest of us - bit 0) is a reserved field in the instruction definition of mtspr and mfspr. Hardware is encouraged to (and does) ignore it. As a result, if userspace executes an mtspr DSCR with the reserved bit set, we get a DSCR facility unavailable exception. The kernel fails to match against the expected value/mask, and we silently return to userspace to try and re-execute the same mtspr DSCR instruction. We loop forever until the process is killed. We should do something here, and it seems mirroring what hardware does is the better option vs killing the process. While here, relax the matching of mfspr PVR too. Cc: stable@vger.kernel.org Signed-off-by: NAnton Blanchard <anton@samba.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 18 1月, 2017 2 次提交
-
-
由 Madhavan Srinivasan 提交于
SIER and SIAR are not updated correctly for some samples, so force the use of MSR and regs->nip instead for misc_flag updates. This is done by adding a new ppmu flag and updating the use_siar logic in perf_read_regs() to use it, and dropping the PPMU_HAS_SIER flag. Signed-off-by: NMadhavan Srinivasan <maddy@linux.vnet.ibm.com> [mpe: Rename flag to PPMU_NO_SIAR, and also drop PPMU_HAS_SIER] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
When we switched to big endian page table, we never updated the hugepd format such that it can work for both big endian and little endian config. This patch series update hugepd format such that it is looked at as __be64 value in big endian page table config. This patch also switch hugepd_t.pd from signed long to unsigned long. I did update the FSL hugepd_ok check to check for the top bit instead of checking > 0. Fixes: 5dc1ef85 ("powerpc/mm: Use big endian Linux page tables for book3s 64") Cc: stable@vger.kernel.org # v4.7+ Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 17 1月, 2017 1 次提交
-
-
由 Reza Arbab 提交于
Memory hotplug is leading to hash page table calls, even on radix: arch_add_memory create_section_mapping htab_bolt_mapping BUG_ON(!ppc_md.hpte_insert); To fix, refactor {create,remove}_section_mapping() into hash__ and radix__ variants. Leave the radix versions stubbed for now. Reviewed-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: NBalbir Singh <bsingharora@gmail.com> Signed-off-by: NReza Arbab <arbab@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 25 12月, 2016 1 次提交
-
-
由 Linus Torvalds 提交于
This was entirely automated, using the script by Al: PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>' sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \ $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h) to do the replacement at the end of the merge window. Requested-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 21 12月, 2016 2 次提交
-
-
由 Thiago Jung Bauermann 提交于
The IMA kexec buffer allows the currently running kernel to pass the measurement list via a kexec segment to the kernel that will be kexec'd. This is the architecture-specific part of setting up the IMA kexec buffer for the next kernel. It will be used in the next patch. Link: http://lkml.kernel.org/r/1480554346-29071-6-git-send-email-zohar@linux.vnet.ibm.comSigned-off-by: NThiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Signed-off-by: NMimi Zohar <zohar@linux.vnet.ibm.com> Acked-by: N"Eric W. Biederman" <ebiederm@xmission.com> Cc: Andreas Steffen <andreas.steffen@strongswan.org> Cc: Dmitry Kasatkin <dmitry.kasatkin@gmail.com> Cc: Josh Sklar <sklar@linux.vnet.ibm.com> Cc: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Baoquan He <bhe@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Thiago Jung Bauermann 提交于
Patch series "ima: carry the measurement list across kexec", v8. The TPM PCRs are only reset on a hard reboot. In order to validate a TPM's quote after a soft reboot (eg. kexec -e), the IMA measurement list of the running kernel must be saved and then restored on the subsequent boot, possibly of a different architecture. The existing securityfs binary_runtime_measurements file conveniently provides a serialized format of the IMA measurement list. This patch set serializes the measurement list in this format and restores it. Up to now, the binary_runtime_measurements was defined as architecture native format. The assumption being that userspace could and would handle any architecture conversions. With the ability of carrying the measurement list across kexec, possibly from one architecture to a different one, the per boot architecture information is lost and with it the ability of recalculating the template digest hash. To resolve this problem, without breaking the existing ABI, this patch set introduces the boot command line option "ima_canonical_fmt", which is arbitrarily defined as little endian. The need for this boot command line option will be limited to the existing version 1 format of the binary_runtime_measurements. Subsequent formats will be defined as canonical format (eg. TPM 2.0 support for larger digests). A simplified method of Thiago Bauermann's "kexec buffer handover" patch series for carrying the IMA measurement list across kexec is included in this patch set. The simplified method requires all file measurements be taken prior to executing the kexec load, as subsequent measurements will not be carried across the kexec and restored. This patch (of 10): The IMA kexec buffer allows the currently running kernel to pass the measurement list via a kexec segment to the kernel that will be kexec'd. The second kernel can check whether the previous kernel sent the buffer and retrieve it. This is the architecture-specific part which enables IMA to receive the measurement list passed by the previous kernel. It will be used in the next patch. The change in machine_kexec_64.c is to factor out the logic of removing an FDT memory reservation so that it can be used by remove_ima_buffer. Link: http://lkml.kernel.org/r/1480554346-29071-2-git-send-email-zohar@linux.vnet.ibm.comSigned-off-by: NThiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Signed-off-by: NMimi Zohar <zohar@linux.vnet.ibm.com> Acked-by: N"Eric W. Biederman" <ebiederm@xmission.com> Cc: Andreas Steffen <andreas.steffen@strongswan.org> Cc: Dmitry Kasatkin <dmitry.kasatkin@gmail.com> Cc: Josh Sklar <sklar@linux.vnet.ibm.com> Cc: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Baoquan He <bhe@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 13 12月, 2016 3 次提交
-
-
由 Aneesh Kumar K.V 提交于
Add arch specific callback in the generic THP page cache code that will deposit and withdarw preallocated page table. Archs like ppc64 use this preallocated table to store the hash pte slot information. Testing: kernel build of the patch series on tmpfs mounted with option huge=always The related thp stat: thp_fault_alloc 72939 thp_fault_fallback 60547 thp_collapse_alloc 603 thp_collapse_alloc_failed 0 thp_file_alloc 253763 thp_file_mapped 4251 thp_split_page 51518 thp_split_page_failed 1 thp_deferred_split_page 73566 thp_split_pmd 665 thp_zero_page_alloc 3 thp_zero_page_alloc_failed 0 [akpm@linux-foundation.org: remove unneeded parentheses, per Kirill] Link: http://lkml.kernel.org/r/20161113150025.17942-2-aneesh.kumar@linux.vnet.ibm.comSigned-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Michael Neuling <mikey@neuling.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Balbir Singh <bsingharora@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Aneesh Kumar K.V 提交于
Independent of whether the vma is for anonymous memory, some arches like ppc64 would like to override pmd_move_must_withdraw(). One option is to encapsulate the vma_is_anonymous() check for general architectures inside pmd_move_must_withdraw() so that is always called and architectures that need unconditional overriding can override this function. ppc64 needs to override the function when the MMU is configured to use hash PTE's. [bsingharora@gmail.com: reworked changelog] Link: http://lkml.kernel.org/r/20161113150025.17942-1-aneesh.kumar@linux.vnet.ibm.comSigned-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Michael Neuling <mikey@neuling.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Balbir Singh <bsingharora@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Aneesh Kumar K.V 提交于
With commit e77b0852 ("mm/mmu_gather: track page size with mmu gather and force flush if page size change") we added the ability to force a tlb flush when the page size change in a mmu_gather loop. We did that by checking for a page size change every time we added a page to mmu_gather for lazy flush/remove. We can improve that by moving the page size change check early and not doing it every time we add a page. This also helps us to do tlb flush when invalidating a range covering dax mapping. Wrt dax mapping we don't have a backing struct page and hence we don't call tlb_remove_page, which earlier forced the tlb flush on page size change. Moving the page size change check earlier means we will do the same even for dax mapping. We also avoid doing this check on architecture other than powerpc. In a later patch we will remove page size check from tlb_remove_page(). Link: http://lkml.kernel.org/r/20161026084839.27299-5-aneesh.kumar@linux.vnet.ibm.comSigned-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 10 12月, 2016 2 次提交
-
-
由 Christophe Leroy 提交于
8xx uses a two level page table with two different linux page size support (4k and 16k). 8xx also support two different hugepage sizes 512k and 8M. In order to support them on linux we define two different page table layout. The size of pages is in the PGD entry, using PS field (bits 28-29): 00 : Small pages (4k or 16k) 01 : 512k pages 10 : reserved 11 : 8M pages For 512K hugepage size a pgd entry have the below format [<hugepte address >0101] . The hugepte table allocated will contain 8 entries pointing to 512K huge pte in 4k pages mode and 64 entries in 16k pages mode. For 8M in 16k mode, a pgd entry have the below format [<hugepte address >1101] . The hugepte table allocated will contain 8 entries pointing to 8M huge pte. For 8M in 4k mode, multiple pgd entries point to the same hugepte address and pgd entry will have the below format [<hugepte address>1101]. The hugepte table allocated will only have one entry. For the time being, we do not support CPU15 ERRATA when HUGETLB is selected Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> (v3, for the generic bits) Signed-off-by: NScott Wood <oss@buserror.net>
-
由 Christophe Leroy 提交于
Today powerpc64 uses a set of pgtable_caches while powerpc32 uses standard pages when using 4k pages and a single pgtable_cache if using other size pages. In preparation of implementing huge pages on the 8xx, this patch replaces the specific powerpc32 handling by the 64 bits approach. This is done by: * moving 64 bits pgtable_cache_add() and pgtable_cache_init() in a new file called init-common.c * modifying pgtable_cache_init() to also handle the case without PMD * removing the 32 bits version of pgtable_cache_add() and pgtable_cache_init() * copying related header contents from 64 bits into both the book3s/32 and nohash/32 header files On the 8xx, the following cache sizes will be used: * 4k pages mode: - PGT_CACHE(10) for PGD - PGT_CACHE(3) for 512k hugepage tables * 16k pages mode: - PGT_CACHE(6) for PGD - PGT_CACHE(7) for 512k hugepage tables - PGT_CACHE(3) for 8M hugepage tables Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NScott Wood <oss@buserror.net>
-
- 09 12月, 2016 1 次提交
-
-
由 Steven Rostedt (Red Hat) 提交于
Some tracepoints have a registration function that gets enabled when the tracepoint is enabled. There may be cases that the registraction function must fail (for example, can't allocate enough memory). In this case, the tracepoint should also fail to register, otherwise the user would not know why the tracepoint is not working. Cc: David Howells <dhowells@redhat.com> Cc: Seiji Aguchi <seiji.aguchi@hds.com> Cc: Anton Blanchard <anton@samba.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 02 12月, 2016 2 次提交
-
-
由 Alexey Kardashevskiy 提交于
This changes mm_iommu_xxx helpers to take mm_struct as a parameter instead of getting it from @current which in some situations may not have a valid reference to mm. This changes helpers to receive @mm and moves all references to @current to the caller, including checks for !current and !current->mm; checks in mm_iommu_preregistered() are removed as there is no caller yet. This moves the mm_iommu_adjust_locked_vm() call to the caller as it receives mm_iommu_table_group_mem_t but it needs mm. This should cause no behavioral change. Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Acked-by: NAlex Williamson <alex.williamson@redhat.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Alexey Kardashevskiy 提交于
We are going to get rid of @current references in mmu_context_boos3s64.c and cache mm_struct in the VFIO container. Since mm_context_t does not have reference counting, we will be using mm_struct which does have the reference counter. This changes mm_iommu_init/mm_iommu_cleanup to receive mm_struct rather than mm_context_t (which is embedded into mm). This should not cause any behavioral change. Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 01 12月, 2016 1 次提交
-
-
由 Paul Mackerras 提交于
This moves the prototypes for functions that are only called from assembler code out of asm/asm-prototypes.h into asm/kvm_ppc.h. The prototypes were added in commit ebe4535f ("KVM: PPC: Book3S HV: sparse: prototypes for functions called from assembler", 2016-10-10), but given that the functions are KVM functions, having them in a KVM header will be better for long-term maintenance. Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
-
- 30 11月, 2016 5 次提交
-
-
由 Michael Ellerman 提交于
Now that we've defined structures to describe each of the client architecture vectors, we can use those to construct the value we pass to firmware. This avoids the tricks we previously played with the W() macro, allows us to properly endian annotate fields, and should help to avoid bugs introduced by failing to have the correct number of zero pad bytes between fields. It also means we can avoid hard coding IBM_ARCH_VEC_NRCORES_OFFSET in order to update the max_cpus value and instead just set it. Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Nicholas Piggin 提交于
This has not made its way to a PAPR release yet, but we have an hcall number assigned. H_SIGNAL_SYS_RESET = 0x380 Syntax: hcall(uint64 H_SIGNAL_SYS_RESET, int64 target); Generate a system reset NMI on the threads indicated by target. Values for target: -1 = target all online threads including the caller -2 = target all online threads except for the caller All other negative values: reserved Positive values: The thread to be targeted, obtained from the value of the "ibm,ppc-interrupt-server#s" property of the CPU in the OF device tree. Semantics: - Invalid target: return H_Parameter. - Otherwise: Generate a system reset NMI on target thread(s), return H_Success. This will be used by crash/debug code to get stuck CPUs into a known state. Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Thiago Jung Bauermann 提交于
Define the Kconfig symbol so that the kexec_file_load() code can be built, and wire up the syscall so that it can be called. Signed-off-by: NThiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Thiago Jung Bauermann 提交于
This patch adds the support code needed for implementing kexec_file_load() on powerpc. This consists of functions to load the ELF kernel, either big or little endian, and setup the purgatory enviroment which switches from the first kernel to the second kernel. None of this code is built yet, as it depends on CONFIG_KEXEC_FILE which we have not yet defined. Although we could define CONFIG_KEXEC_FILE in this patch, we'd then have a window in history where the kconfig symbol is present but the syscall is not, which would be awkward. Signed-off-by: NJosh Sklar <sklar@linux.vnet.ibm.com> Signed-off-by: NThiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Thiago Jung Bauermann 提交于
Commit 2965faa5 ("kexec: split kexec_load syscall from kexec core code") introduced CONFIG_KEXEC_CORE so that CONFIG_KEXEC means whether the kexec_load system call should be compiled-in and CONFIG_KEXEC_FILE means whether the kexec_file_load system call should be compiled-in. These options can be set independently from each other. Since until now powerpc only supported kexec_load, CONFIG_KEXEC and CONFIG_KEXEC_CORE were synonyms. That is not the case anymore, so we need to make a distinction. Almost all places where CONFIG_KEXEC was being used should be using CONFIG_KEXEC_CORE instead, since kexec_file_load also needs that code compiled in. Signed-off-by: NThiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 29 11月, 2016 1 次提交
-
-
The Xilinx interrupt controller driver is now available in drivers/irqchip. Switch to using that driver. Acked-by: NMichael Ellerman <mpe@ellerman.id.au> Acked-by: NMichal Simek <michal.simek@xilinx.com> Signed-off-by: NZubair Lutfullah Kakakhel <Zubair.Kakakhel@imgtec.com> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
-
- 28 11月, 2016 8 次提交
-
-
由 Aneesh Kumar K.V 提交于
This will improve the task exit case, by batching tlb invalidates. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
When we are updating a pte, we just need to flush the tlb mapping that pte. Right now we do a full mm flush because we don't track page size. Now that we have page size details in pte use that to do the optimized flush Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
When we are updating a pte, we just need to flush the tlb mapping that pte. Right now we do a full mm flush because we don't track the page size. Now that we have page size details in pte use that to do the optimized flush Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
Now that we have page size details encoded in pte using software pte bits, use that to find the page size needed for tlb flush. This function should only be used on P9 DD1, so give it a horrible name to make that clear. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
This patch adds a new software defined pte bit. We use the reserved fields of ISA 3.0 pte definition since we will only be using this on DD1 code paths. We can possibly look at removing this code later. The software bit will be used to differentiate between 64K/4K and 2M ptes. This helps in finding the page size mapping by a pte so that we can do efficient tlb flush. We don't support 1G hugetlb pages yet. So we add a DEBUG WARN_ON to catch wrong usage. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
W.r.t hash page table config, we support 16MB and 16GB as the hugepage size. Update the hstate_get_psize to handle 16M and 16G. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
We will start moving some book3s specific hugetlb functions there. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Suraj Jitindar Singh 提交于
KVM_HALT_POLL_NS_DEFAULT is an arch specific constant which sets the default value of the halt_poll_ns kvm module parameter which determines the global maximum halt polling interval. The current value for powerpc is 500000 (500us) which means that any repetitive workload with a period of less than that can drive the cpu usage to 100% where it may have been mostly idle without halt polling. This presents the possibility of a large increase in power usage with a comparatively small performance benefit. Reduce the default to 10000 (10us) and a user can tune this themselves to set their affinity for halt polling based on the trade off between power and performance which they are willing to make. Signed-off-by: NSuraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
-
- 25 11月, 2016 2 次提交
-
-
由 Aneesh Kumar K.V 提交于
With commit e58e87ad ("powerpc/mm: Update _PAGE_KERNEL_RO") we started using the ppp value 0b110 to map kernel readonly. But that facility was only added as part of ISA 2.04. For earlier ISA version only supported ppp bit value for readonly mapping is 0b011. (This implies both user and kernel get mapped using the same ppp bit value for readonly mapping.). Update the code such that for earlier architecture version we use ppp value 0b011 for readonly mapping. We don't differentiate between power5+ and power5 here and apply the new ppp bits only from power6 (ISA 2.05). This keep the changes minimal. This fixes issue with PS3 spu usage reported at https://lkml.kernel.org/r/rep.1421449714.geoff@infradead.org Fixes: e58e87ad ("powerpc/mm: Update _PAGE_KERNEL_RO") Cc: stable@vger.kernel.org # v4.7+ Tested-by: NGeoff Levand <geoff@infradead.org> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Michael Ellerman 提交于
In commit d0563a12 ("powerpc: Implement {cmp}xchg for u8 and u16") we removed the volatile from __cmpxchg(). This is leading to warnings such as: drivers/gpu/drm/drm_lock.c: In function ‘drm_lock_take’: arch/powerpc/include/asm/cmpxchg.h:484:37: warning: passing argument 1 of ‘__cmpxchg’ discards ‘volatile’ qualifier from pointer target (__typeof__(*(ptr))) __cmpxchg((ptr), (unsigned long)_o_, \ There doesn't seem to be consensus across architectures whether the argument is volatile or not, so at least for now put the volatile back. Fixes: d0563a12 ("powerpc: Implement {cmp}xchg for u8 and u16") Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 24 11月, 2016 3 次提交
-
-
由 Paul Mackerras 提交于
The new XIVE interrupt controller on POWER9 can direct external interrupts to the hypervisor or the guest. The interrupts directed to the hypervisor are controlled by an LPCR bit called LPCR_HVICE, and come in as a "hypervisor virtualization interrupt". This sets the LPCR bit so that hypervisor virtualization interrupts can occur while we are in the guest. We then also need to cope with exiting the guest because of a hypervisor virtualization interrupt. Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
-
由 Paul Mackerras 提交于
POWER9 includes a new interrupt controller, called XIVE, which is quite different from the XICS interrupt controller on POWER7 and POWER8 machines. KVM-HV accesses the XICS directly in several places in order to send and clear IPIs and handle interrupts from PCI devices being passed through to the guest. In order to make the transition to XIVE easier, OPAL firmware will include an emulation of XICS on top of XIVE. Access to the emulated XICS is via OPAL calls. The one complication is that the EOI (end-of-interrupt) function can now return a value indicating that another interrupt is pending; in this case, the XIVE will not signal an interrupt in hardware to the CPU, and software is supposed to acknowledge the new interrupt without waiting for another interrupt to be delivered in hardware. This adapts KVM-HV to use the OPAL calls on machines where there is no XICS hardware. When there is no XICS, we look for a device-tree node with "ibm,opal-intc" in its compatible property, which is how OPAL indicates that it provides XICS emulation. In order to handle the EOI return value, kvmppc_read_intr() has become kvmppc_read_one_intr(), with a boolean variable passed by reference which can be set by the EOI functions to indicate that another interrupt is pending. The new kvmppc_read_intr() keeps calling kvmppc_read_one_intr() until there are no more interrupts to process. The return value from kvmppc_read_intr() is the largest non-zero value of the returns from kvmppc_read_one_intr(). Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
-
由 Paul Mackerras 提交于
POWER9 adds new capabilities to the tlbie (TLB invalidate entry) and tlbiel (local tlbie) instructions. Both instructions get a set of new parameters (RIC, PRS and R) which appear as bits in the instruction word. The tlbiel instruction now has a second register operand, which contains a PID and/or LPID value if needed, and should otherwise contain 0. This adapts KVM-HV's usage of tlbie and tlbiel to work on POWER9 as well as older processors. Since we only handle HPT guests so far, we need RIC=0 PRS=0 R=0, which ends up with the same instruction word as on previous processors, so we don't need to conditionally execute different instructions depending on the processor. The local flush on first entry to a guest in book3s_hv_rmhandlers.S is a loop which depends on the number of TLB sets. Rather than using feature sections to set the number of iterations based on which CPU we're on, we now work out this number at VM creation time and store it in the kvm_arch struct. That will make it possible to get the number from the device tree in future, which will help with compatibility with future processors. Since mmu_partition_table_set_entry() does a global flush of the whole LPID, we don't need to do the TLB flush on first entry to the guest on each processor. Therefore we don't set all bits in the tlb_need_flush bitmap on VM startup on POWER9. Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
-