- 11 4月, 2017 13 次提交
-
-
由 Gautham R. Shenoy 提交于
POWER9 DD1.0 hardware has a bug where the SPRs of a thread waking up from stop 0,1,2 with ESL=1 can endup being misplaced in the core. Thus the HSPRG0 of a thread waking up from can contain the paca pointer of its sibling. This patch implements a context recovery framework within threads of a core, by provisioning space in paca_struct for saving every sibling threads's paca pointers. Basically, we should be able to arrive at the right paca pointer from any of the thread's existing paca pointer. At bootup, during powernv idle-init, we save the paca address of every CPU in each one its siblings paca_struct in the slot corresponding to this CPU's index in the core. On wakeup from a stop, the thread will determine its index in the core from the TIR register and recover its PACA pointer by indexing into the correct slot in the provisioned space in the current PACA. Furthermore, ensure that the NVGPRs are restored from the stack on the way out by setting the NAPSTATELOST in paca. [Changelog written with inputs from svaidy@linux.vnet.ibm.com] Signed-off-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com> Reviewed-by: NNicholas Piggin <npiggin@gmail.com> [mpe: Call it a bug] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Gautham R. Shenoy 提交于
Currently during idle-init on power9, if we don't find suitable stop states in the device tree that can be used as the default_stop/deepest_stop, we set stop0 (ESL=1,EC=1) as the default stop state psscr to be used by power9_idle and deepest stop state which is used by CPU-Hotplug. However, if the platform firmware has not configured or enabled a stop state, the kernel should not make any assumptions and fallback to a default choice. If the kernel uses a stop state that is not configured by the platform firmware, it may lead to further failures which should be avoided. In this patch, we modify the init code to ensure that the kernel uses only the stop states exposed by the firmware through the device tree. When a suitable default stop state isn't found, we disable ppc_md.power_save for power9. Similarly, when a suitable deepest_stop_state is not found in the device tree exported by the firmware, fall back to the default busy-wait loop in the CPU-Hotplug code. [Changelog written with inputs from svaidy@linux.vnet.ibm.com] Reviewed-by: NNicholas Piggin <npiggin@gmail.com> Signed-off-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Gautham R. Shenoy 提交于
Currently, the powernv cpu-offline function assumes that platform idle states such as stop on POWER9, winkle/sleep/nap on POWER8 are always available. On POWER8, it picks nap as the default state if other deep idle states like sleep/winkle are not available and enabled in the platform. On POWER9, nap is not available and all idle states are managed by STOP instruction. The parameters to the idle state are passed through processor stop status control register (PSSCR). Hence as such executing STOP would take parameters from current PSSCR. We do not want to make any assumptions in kernel on what STOP states and PSSCR features are configured by the platform. Ideally platform will configure a good set of stop states that can be used in the kernel. We would like to start with a clean slate, if the platform choose to not configure any state or there is an error in platform firmware that lead to no stop states being configured or allowed to be requested. This patch adds a fallback method for CPU-Hotplug that is similar to snooze loop at idle where the threads are left to spin at low priority and hence reduce the cycles consumed. This is a safe fallback mechanism in the case when no stop state would be requested if the platform firmware did not configure them most likely due to an error condition. Requesting a stop state when the platform has not configured them or enabled them would lead to further error conditions which could be difficult to debug. [Changelog written with inputs from svaidy@linux.vnet.ibm.com] Reviewed-by: NNicholas Piggin <npiggin@gmail.com> Signed-off-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Gautham R. Shenoy 提交于
Move the piece of code in powernv/smp.c::pnv_smp_cpu_kill_self() which transitions the CPU to the deepest available platform idle state to a new function named pnv_cpu_offline() in powernv/idle.c. The rationale behind this code movement is that the data required to determine the deepest available platform state resides in powernv/idle.c. Reviewed-by: NNicholas Piggin <npiggin@gmail.com> Signed-off-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Anshuman Khandual 提交于
Add user space exported API definitions for 512KB, 1MB, 2MB, 8MB, 16MB, 1GB, 16GB non default huge page sizes to be used with mmap() system call. Signed-off-by: NAnshuman Khandual <khandual@linux.vnet.ibm.com> [mpe: Reword the comment to emphasise that these are only needed to use the non-default huge page size, and updated the change log.] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Anshuman Khandual 提交于
Generic core VM already prints these information in the log buffer, hence there is no need for a second print. This just removes the second print from arch powerpc NUMA init path. Before the patch: $ dmesg | grep "Initmem" numa: Initmem setup node 0 [mem 0x00000000-0xffffffff] numa: Initmem setup node 1 [mem 0x100000000-0x1ffffffff] numa: Initmem setup node 2 [mem 0x200000000-0x2ffffffff] numa: Initmem setup node 3 [mem 0x300000000-0x3ffffffff] numa: Initmem setup node 4 [mem 0x400000000-0x4ffffffff] numa: Initmem setup node 5 [mem 0x500000000-0x5ffffffff] numa: Initmem setup node 6 [mem 0x600000000-0x6ffffffff] numa: Initmem setup node 7 [mem 0x700000000-0x7ffffffff] Initmem setup node 0 [mem 0x0000000000000000-0x00000000ffffffff] Initmem setup node 1 [mem 0x0000000100000000-0x00000001ffffffff] Initmem setup node 2 [mem 0x0000000200000000-0x00000002ffffffff] Initmem setup node 3 [mem 0x0000000300000000-0x00000003ffffffff] Initmem setup node 4 [mem 0x0000000400000000-0x00000004ffffffff] Initmem setup node 5 [mem 0x0000000500000000-0x00000005ffffffff] Initmem setup node 6 [mem 0x0000000600000000-0x00000006ffffffff] Initmem setup node 7 [mem 0x0000000700000000-0x00000007ffffffff] After the patch just the latter set is printed. Signed-off-by: NAnshuman Khandual <khandual@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Michael Ellerman 提交于
Make sparsemem the default on all 64-bit Book3S platforms. It already is for pseries and ps3, and we need to enable it for powernv because on POWER9 memory between chips is discontiguous. For the other platforms sparsemem should work fine, though it might add a small amount of overhead. We can always force FLATMEM in the defconfigs if necessary. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Michael Ellerman 提交于
setup_initial_memory_limit() is called from early_init_devtree(), which runs prior to feature patching. If the kernel is built with CONFIG_JUMP_LABEL=y and CONFIG_JUMP_LABEL_FEATURE_CHECKS=y then we will potentially get the wrong value. If we also have CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG=y we get a warning and backtrace: Warning! mmu_has_feature() used prior to jump label init! CPU: 0 PID: 0 Comm: swapper Not tainted 4.11.0-rc4-gccN-next-20170331-g6af2434 #1 Call Trace: [c000000000fc3d50] [c000000000a26c30] .dump_stack+0xa8/0xe8 (unreliable) [c000000000fc3de0] [c00000000002e6b8] .setup_initial_memory_limit+0xa4/0x104 [c000000000fc3e60] [c000000000d5c23c] .early_init_devtree+0xd0/0x2f8 [c000000000fc3f00] [c000000000d5d3b0] .early_setup+0x90/0x11c [c000000000fc3f90] [c000000000000520] start_here_multiplatform+0x68/0x80 Fix it by using early_mmu_has_feature(). Fixes: c12e6f24 ("powerpc: Add option to use jump label for mmu_has_feature()") Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Michael Ellerman 提交于
These files don't seem to have any need for asm/debug.h, now that all it includes are the debugger hooks and breakpoint definitions. Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Michael Ellerman 提交于
powerpc_debugfs_root is the dentry representing the root of the "powerpc" directory tree in debugfs. Currently it sits in asm/debug.h, a long with some other things that have "debug" in the name, but are otherwise unrelated. Pull it out into a separate header, which also includes linux/debugfs.h, and convert all the users to include debugfs.h instead of debug.h. Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Alistair Popple 提交于
In the recent commit 1ab66d1f ("powerpc/powernv: Introduce address translation services for Nvlink2") the NPU code gained a dependency on MMU notifiers. All our defconfigs have KVM enabled, which selects MMU_NOTIFIER, but if KVM is not enabled then the build breaks. Fix it by always selecting MMU_NOTIFIER when we're building powernv. Fixes: 1ab66d1f ("powerpc/powernv: Introduce address translation services for Nvlink2") Signed-off-by: NAlistair Popple <alistair@popple.id.au> [mpe: Reword change log] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
For a tlbiel with pid, we need to issue tlbiel with set number encoded. We don't need to do ptesync for each of those. Instead we need one for the entire tlbiel pid operation. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: NAnton Blanchard <anton@samba.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
For fullmm tlb flush, we do a flush with RIC_FLUSH_ALL which will invalidate all related caches (radix__tlb_flush()). Hence the pwc flush is not needed. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: NAnton Blanchard <anton@samba.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 10 4月, 2017 6 次提交
-
-
由 Benjamin Herrenschmidt 提交于
We need to set LPES in order for normal external interrupts (0x500) to be directed to the guest while running in guest state. We also need HEIC set to prevent them to be sent to the host while in host state. With XIVE the host never gets one of these and wouldn't know how to handle it. All host external interrupts come in via the new hypervisor virtualization interrupts vector. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Benjamin Herrenschmidt 提交于
We have all sort of variants of MMIO accessors for the real mode instructions. This creates a clean set of accessors based on Linux normal naming conventions, replacing all occurrences of the old ones in the tree. I have purposefully removed the "out/in" variants in favor of only including __raw variants. Any code using these is already pretty much hand tuned to operate in a very specific environment. I've fixed up the 2 users (only one of them actually needed a barrier in the first place). Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Benjamin Herrenschmidt 提交于
The function doesn't exist anymore Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Benjamin Herrenschmidt 提交于
It's only used within the same file it's defined Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Benjamin Herrenschmidt 提交于
We traditionally have linux/ before asm/ Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Benjamin Herrenschmidt 提交于
The XIVE interrupt controller is the new interrupt controller found in POWER9. It supports advanced virtualization capabilities among other things. Currently we use a set of firmware calls that simulate the old "XICS" interrupt controller but this is fairly inefficient. This adds the framework for using XIVE along with a native backend which OPAL for configuration. Later, a backend allowing the use in a KVM or PowerVM guest will also be provided. This disables some fast path for interrupts in KVM when XIVE is enabled as these rely on the firmware emulation code which is no longer available when the XIVE is used natively by Linux. A latter patch will make KVM also directly exploit the XIVE, thus recovering the lost performance (and more). Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> [mpe: Fixup pr_xxx("XIVE:"...), don't split pr_xxx() strings, tweak Kconfig so XIVE_NATIVE selects XIVE and depends on POWERNV, fix build errors when SMP=n, fold in fixes from Ben: Don't call cpu_online() on an invalid CPU number Fix irq target selection returning out of bounds cpu# Extra sanity checks on cpu numbers ] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 07 4月, 2017 1 次提交
-
-
由 Benjamin Herrenschmidt 提交于
Some powerpc platforms use this to move IRQs away from a CPU being unplugged. This function has several bugs such as not taking the right locks or failing to NULL check pointers. There's a new generic function doing exactly the same thing without all the bugs, so let's use it instead. mpe: The obvious place for the select of GENERIC_IRQ_MIGRATION is on HOTPLUG_CPU, but that doesn't work. On some configs PM_SLEEP_SMP will select HOTPLUG_CPU even though its dependencies are not met, which means the select of GENERIC_IRQ_MIGRATION doesn't happen. That leads to the build breaking. Fix it by moving the select of GENERIC_IRQ_MIGRATION to SMP. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 06 4月, 2017 3 次提交
-
-
由 Benjamin Herrenschmidt 提交于
Some platforms (will) need to perform allocations before bringing a new CPU online. Doing it from smp_ops->setup_cpu is the wrong thing to do: - It has no useful failure path (too late) - Calling any allocator will enable interrupts prematurely causing problems with large decrementer among others Instead, add a new callback that is called from __cpu_up (so from the context trying to online the new CPU) at a point where we can safely allocate and handle failures. This will be used by XIVE support. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Benjamin Herrenschmidt 提交于
Add 32 and 8 bit variants Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Benjamin Herrenschmidt 提交于
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 04 4月, 2017 3 次提交
-
-
由 Matt Brown 提交于
New versions of OPAL have a device node /ibm,opal/firmware/exports, each property of which describes a range of memory in OPAL that Linux might want to export to userspace for debugging. This patch adds a sysfs file under 'opal/exports' for each property found there, and makes it read-only by root. Signed-off-by: NMatt Brown <matthew.brown.dev@gmail.com> [mpe: Drop counting of props, rename to attr, free on sysfs error, c'log] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Sukadev Bhattiprolu 提交于
When booting very large systems with a large initrd, we run out of space early in boot for either RTAS or the flattened device tree (FDT). Boot fails with messages like: Could not allocate memory for RTAS or No memory for flatten_device_tree (no room) Increasing the minimum RMA size to 512MB fixes the problem. This should not have an impact on smaller LPARs (with 256MB memory), as the firmware will cap the RMA to the memory assigned to the LPAR. Fix is based on input/discussions with Michael Ellerman. Thanks to Praveen K. Pandey for testing on a large system. Reported-by: NPraveen K. Pandey <preveen.pandey@in.ibm.com> Signed-off-by: NSukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Alistair Popple 提交于
Nvlink2 supports address translation services (ATS) allowing devices to request address translations from an mmu known as the nest MMU which is setup to walk the CPU page tables. To access this functionality certain firmware calls are required to setup and manage hardware context tables in the nvlink processing unit (NPU). The NPU also manages forwarding of TLB invalidates (known as address translation shootdowns/ATSDs) to attached devices. This patch exports several methods to allow device drivers to register a process id (PASID/PID) in the hardware tables and to receive notification of when a device should stop issuing address translation requests (ATRs). It also adds a fault handler to allow device drivers to demand fault pages in. Signed-off-by: NAlistair Popple <alistair@popple.id.au> [mpe: Fix up comment formatting, use flush_tlb_mm()] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 03 4月, 2017 5 次提交
-
-
由 Alistair Popple 提交于
The pnv_pci_get_{gpu|npu}_dev functions are used to find associations between nvlink PCIe devices and standard PCIe devices. However they lacked basic sanity checking which results in NULL pointer dereferencing if they are incorrect called can be harder to spot than an explicit WARN_ON. Signed-off-by: NAlistair Popple <alistair@popple.id.au> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Oliver O'Halloran 提交于
The code to fix the problem it describes was removed in commit c40785ad ("powerpc/dart: Use a cachable DART"), and it uses the stupid comment style. Away it goooooooooooooes! Signed-off-by: NOliver O'Halloran <oohall@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Anton Blanchard 提交于
Early on in do_page_fault() we call store_updates_sp(), regardless of the type of exception. For an instruction miss this doesn't make sense, because we only use this information to detect if a data miss is the result of a stack expansion instruction or not. Worse still, it results in a data miss within every userspace instruction miss handler, because we try and load the very instruction we are about to install a pte for! A simple exec microbenchmark runs 6% faster on POWER8 with this fix: #include <stdlib.h> #include <stdio.h> #include <unistd.h> int main(int argc, char *argv[]) { unsigned long left = atol(argv[1]); char leftstr[16]; if (left-- == 0) return 0; sprintf(leftstr, "%ld", left); execlp(argv[0], argv[0], leftstr, NULL); perror("exec failed\n"); return 0; } Pass the number of iterations on the command line (eg 10000) and time how long it takes to execute. Signed-off-by: NAnton Blanchard <anton@samba.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Michael Ellerman 提交于
For an MCE (Machine Check Exception) that hits while in user mode MSR(PR=1), print the task info to the console MCE error log. This may help to identify an application that triggered the MCE. After this patch the MCE console looks like: Severe Machine check interrupt [Recovered] NIP: [0000000010039778] PID: 762 Comm: ebizzy Initiator: CPU Error type: SLB [Multihit] Effective address: 0000000010039778 Severe Machine check interrupt [Not recovered] NIP: [0000000010039778] PID: 763 Comm: ebizzy Initiator: CPU Error type: UE [Page table walk ifetch] Effective address: 0000000010039778 ebizzy[763]: unhandled signal 7 at 0000000010039778 nip 0000000010039778 lr 0000000010001b44 code 30004 Signed-off-by: NMahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Mahesh Salgaonkar 提交于
For D-side errors we print the load/store address that caused the machine check as 'Effective address'. But the instruction that may have caused the machine check can also be helpful, so in addition to printing the NIP, also print the kernel function name as well. After this patch the MCE console log would look like: Severe Machine check interrupt [Recovered] NIP [d00000001bc70194]: init_module+0x194/0x2b0 [bork_kernel] Initiator: CPU Error type: SLB [Parity] Effective address: d000000026de0000 Signed-off-by: NMahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 01 4月, 2017 5 次提交
-
-
由 Aneesh Kumar K.V 提交于
Not all user space application is ready to handle wide addresses. It's known that at least some JIT compilers use higher bits in pointers to encode their information. It collides with valid pointers with 512TB addresses and leads to crashes. To mitigate this, we are not going to allocate virtual address space above 128TB by default. But userspace can ask for allocation from full address space by specifying hint address (with or without MAP_FIXED) above 128TB. If hint address set above 128TB, but MAP_FIXED is not specified, we try to look for unmapped area by specified address. If it's already occupied, we look for unmapped area in *full* address space, rather than from 128TB window. This approach helps to easily make application's memory allocator aware about large address space without manually tracking allocated virtual address space. This is going to be a per mmap decision. ie, we can have some mmaps with larger addresses and other that do not. A sample memory layout looks like: 10000000-10010000 r-xp 00000000 fc:00 9057045 /home/max_addr_512TB 10010000-10020000 r--p 00000000 fc:00 9057045 /home/max_addr_512TB 10020000-10030000 rw-p 00010000 fc:00 9057045 /home/max_addr_512TB 10029630000-10029660000 rw-p 00000000 00:00 0 [heap] 7fff834a0000-7fff834b0000 rw-p 00000000 00:00 0 7fff834b0000-7fff83670000 r-xp 00000000 fc:00 9177190 /lib/powerpc64le-linux-gnu/libc-2.23.so 7fff83670000-7fff83680000 r--p 001b0000 fc:00 9177190 /lib/powerpc64le-linux-gnu/libc-2.23.so 7fff83680000-7fff83690000 rw-p 001c0000 fc:00 9177190 /lib/powerpc64le-linux-gnu/libc-2.23.so 7fff83690000-7fff836a0000 rw-p 00000000 00:00 0 7fff836a0000-7fff836c0000 r-xp 00000000 00:00 0 [vdso] 7fff836c0000-7fff83700000 r-xp 00000000 fc:00 9177193 /lib/powerpc64le-linux-gnu/ld-2.23.so 7fff83700000-7fff83710000 r--p 00030000 fc:00 9177193 /lib/powerpc64le-linux-gnu/ld-2.23.so 7fff83710000-7fff83720000 rw-p 00040000 fc:00 9177193 /lib/powerpc64le-linux-gnu/ld-2.23.so 7fffdccf0000-7fffdcd20000 rw-p 00000000 00:00 0 [stack] 1000000000000-1000000010000 rw-p 00000000 00:00 0 1ffff83710000-1ffff83720000 rw-p 00000000 00:00 0 Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
Now that we use all the available virtual address range, we need to make sure we don't generate VSID such that it overlaps with the reserved vsid range. Reserved vsid range include the virtual address range used by the adjunct partition and also the VRMA virtual segment. We find the context value that can result in generating such a VSID and reserve it early in boot. We don't look at the adjunct range, because for now we disable the adjunct usage in a Linux LPAR via CAS interface. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [mpe: Rewrite hash__reserve_context_id(), move the rest into pseries] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
We optmize the slice page size array copy to paca by copying only the range based on addr_limit. This will require us to not look at page size array beyond addr_limit in PACA on slb fault. To enable that copy task size to paca which will be used during slb fault. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [mpe: Rename from task_size to addr_limit, consolidate #ifdefs] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
In the followup patch, we will increase the slice array size to handle 512TB range, but will limit the max addr to 128TB. Avoid doing unnecessary computation and avoid doing slice mask related operation above address limit. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 31 3月, 2017 4 次提交
-
-
由 Aneesh Kumar K.V 提交于
We update the hash linux page table layout such that we can support 512TB. But we limit the TASK_SIZE to 128TB. We can switch to 128TB by default without conditional because that is the max virtual address supported by other architectures. We will later add a mechanism to on-demand increase the application's effective address range to 512TB. Having the page table layout changed to accommodate 512TB makes testing large memory configuration easier with less code changes to kernel Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
This doesn't have any functional change. But helps in avoiding mistakes in case the shift bit changes Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
Inorder to support large effective address range (512TB), we want to increase the virtual address bits to 68. But we do have platforms like p4 and p5 that can only do 65 bit VA. We support those platforms by limiting context bits on them to 16. The protovsid -> vsid conversion is verified to work with both 65 and 68 bit va values. I also documented the restrictions in a table format as part of code comments. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Michael Ellerman 提交于
get_kernel_vsid() has a very stern comment saying that it's only valid for kernel addresses, but there's nothing in the code to enforce that. Rather than hoping our callers are well behaved, add a check and return a VSID of 0 (invalid). Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-