- 18 4月, 2013 1 次提交
-
-
由 Paul Bolle 提交于
Commit c1fb6816 ("powerpc: Add relocation on exception vector handlers") added two lines of code that depend on the macro CONFIG_HVC_SCOM. That macro doesn't exist. Perhaps it was intended to use CONFIG_PPC_SCOM here. But since "maintence_interrupt" is a typo and there's nothing in arch/powerpc that looks like maintenance_interrupt it seems best to just delete these lines. Signed-off-by: NPaul Bolle <pebolle@tiscali.nl> Acked-by: NMichael Ellerman <michael@ellerman.id.au> Signed-off-by: NMichael Ellerman <michael@ellerman.id.au>
-
- 25 3月, 2013 1 次提交
-
-
由 Chen Gang 提交于
The FWNMI region is fixed at 0x7000 and the vector are now overflowing that with allmodconfig. Fix that by moving slb_miss_realmode code out of that region as it doesn't need to be that close to the call sites (it is a _GLOBAL function) Fixes this build error: arch/powerpc/kernel/exceptions-64s.S: Assembler messages: arch/powerpc/kernel/exceptions-64s.S:1304: Error: attempt to move .org backwards Signed-off-by: NChen Gang <gang.chen@asianux.com> Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
-
- 17 3月, 2013 2 次提交
-
-
由 Aneesh Kumar K.V 提交于
Now we use ESID_BITS of kernel address to build proto vsid. So rename USER_ESIT_BITS to ESID_BITS Acked-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> CC: <stable@vger.kernel.org> [v3.8]
-
由 Aneesh Kumar K.V 提交于
This patch change the kernel VSID range so that we limit VSID_BITS to 37. This enables us to support 64TB with 65 bit VA (37+28). Without this patch we have boot hangs on platforms that only support 65 bit VA. With this patch we now have proto vsid generated as below: We first generate a 37-bit "proto-VSID". Proto-VSIDs are generated from mmu context id and effective segment id of the address. For user processes max context id is limited to ((1ul << 19) - 5) for kernel space, we use the top 4 context ids to map address as below 0x7fffc - [ 0xc000000000000000 - 0xc0003fffffffffff ] 0x7fffd - [ 0xd000000000000000 - 0xd0003fffffffffff ] 0x7fffe - [ 0xe000000000000000 - 0xe0003fffffffffff ] 0x7ffff - [ 0xf000000000000000 - 0xf0003fffffffffff ] Acked-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Tested-by: NGeoff Levand <geoff@infradead.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> CC: <stable@vger.kernel.org> [v3.8]
-
- 05 3月, 2013 1 次提交
-
-
由 Michael Neuling 提交于
Currently we use the link register to branch up high in the early MMU on syscall entry path. Unfortunately, this trashes the link stack as the address we are going to is not associated with the earlier mflr. This patch simply converts us to used the count register (volatile over syscalls anyway) instead. This is much better at predicting in this scenario and doesn't trash link stack causing a bunch of additional branch mispredicts later. Benchmarking this on POWER8 saves a bunch of cycles on Anton's null syscall benchmark here: http://ozlabs.org/~anton/junkcode/null_syscall.cSigned-off-by: NMichael Neuling <mikey@neuling.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 15 2月, 2013 4 次提交
-
-
由 Michael Neuling 提交于
This hooks the new transactional memory code into context switching, FP/VMX/VMX unavailable and exception return. Signed-off-by: NMatt Evans <matt@ozlabs.org> Signed-off-by: NMichael Neuling <mikey@neuling.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Michael Neuling 提交于
These should never happen since we always turn on MSR TM when in userspace. We don't do lazy TM. Hence if we hit this, we barf and kill the task as something's gone horribly wrong. Signed-off-by: NMatt Evans <matt@ozlabs.org> Signed-off-by: NMichael Neuling <mikey@neuling.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Paul Mackerras 提交于
Some of the interrupt vectors on 64-bit POWER server processors are only 32 bytes long, which is not enough for the full first-level interrupt handler. For these we currently just have a branch to an out-of-line handler. However, this means that we corrupt the CFAR (come-from address register) on POWER7 and later processors. To fix this, we split the EXCEPTION_PROLOG_1 macro into two pieces: EXCEPTION_PROLOG_0 contains the part up to the point where the CFAR is saved in the PACA, and EXCEPTION_PROLOG_1 contains the rest. We then put EXCEPTION_PROLOG_0 in the short interrupt vectors before we branch to the out-of-line handler, which contains the rest of the first-level interrupt handler. To facilitate this, we define new _OOL (out of line) variants of STD_EXCEPTION_PSERIES, etc. In order to get EXCEPTION_PROLOG_0 to be short enough, i.e., no more than 6 instructions, it was necessary to move the stores that move the PPR and CFAR values into the PACA into __EXCEPTION_PROLOG_1 and to get rid of one of the two HMT_MEDIUM instructions. Previously there was a HMT_MEDIUM_PPR_DISCARD before the prolog, which was nop'd out on processors with the PPR (POWER7 and later), and then another HMT_MEDIUM inside the HMT_MEDIUM_PPR_SAVE macro call inside __EXCEPTION_PROLOG_1, which was nop'd out on processors without PPR. Now the HMT_MEDIUM inside EXCEPTION_PROLOG_0 is there unconditionally and the HMT_MEDIUM_PPR_DISCARD is not strictly necessary, although this leaves it in for the interrupt vectors where there is room for it. Previously we had a handler for hypervisor maintenance interrupts at 0xe50, which doesn't leave enough room for the vector for hypervisor emulation assist interrupts at 0xe40, since we need 8 instructions. The 0xe50 vector was only used on POWER6, as the HMI vector was moved to 0xe60 on POWER7. Since we don't support running in hypervisor mode on POWER6, we just remove the handler at 0xe50. This also changes denorm_exception_hv to use EXCEPTION_PROLOG_0 instead of open-coding it, and removes the HMT_MEDIUM_PPR_DISCARD from the relocation-on vectors (since any CPU that supports relocation-on interrupts also has the PPR). Signed-off-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Paul Mackerras 提交于
The Cell processor doesn't support relocation-on interrupts, so we don't need relocation-on versions of the interrupt vectors that are purely Cell-specific. This removes them. Signed-off-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 10 1月, 2013 6 次提交
-
-
由 Benjamin Herrenschmidt 提交于
The FWNMI region is fixed at 0x7000 and the vector are now overflowing that with some configurations. Fix that by moving some hash management code out of that region as it doesn't need to be that close to the call sites (isn't accessed using conditional branches). Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Michael Neuling 提交于
This is a rewrite so that we don't assume we are using the DABR throughout the code. We now use the arch_hw_breakpoint to store the breakpoint in a generic manner in the thread_struct, rather than storing the raw DABR value. The ptrace GET/SET_DEBUGREG interface currently passes the raw DABR in from userspace. We keep this functionality, so that future changes (like the POWER8 DAWR), will still fake the DABR to userspace. Signed-off-by: NMichael Neuling <mikey@neuling.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Haren Myneni 提交于
[PATCH 6/6] powerpc: Implement PPR save/restore When the task enters in to kernel space, the user defined priority (PPR) will be saved in to PACA at the beginning of first level exception vector and then copy from PACA to thread_info in second level vector. PPR will be restored from thread_info before exits the kernel space. P7/P8 temporarily raises the thread priority to higher level during exception until the program executes HMT_* calls. But it will not modify PPR register. So we save PPR value whenever some register is available to use and then calls HMT_MEDIUM to increase the priority. This feature supports on P7 or later processors. We save/ restore PPR for all exception vectors except system call entry. GLIBC will be saving / restore for system calls. So the default PPR value (3) will be set for the system call exit when the task returned to the user space. Signed-off-by: NHaren Myneni <haren@us.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Ian Munsie 提交于
This patch adds the logic to properly handle doorbells that come in when interrupts have been soft disabled and to replay them when interrupts are re-enabled: - masked_##_H##interrupt is modified to leave interrupts enabled when a doorbell has come in since doorbells are edge sensitive and as such won't be automatically re-raised. - __check_irq_replay now tests if a doorbell happened on book3s, and returns either 0xe80 or 0xa00 depending on whether we are the hypervisor or not. - restore_check_irq_replay now tests for the two possible server doorbell vector numbers to replay. - __replay_interrupt also adds tests for the two server doorbell vector numbers, and is modified to use a compare instruction rather than an andi. on the single bit difference between 0x500 and 0x900. The last two use a CPU feature section to avoid needlessly testing against the hypervisor vector if it is not the hypervisor, and vice versa. Signed-off-by: NIan Munsie <imunsie@au1.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Ian Munsie 提交于
Directed Privileged Doorbell Interrupts come in at 0xa00 (or 0xc000000000004a00 if relocation on exception is enabled), so add exception vectors at these locations. If doorbell support is not compiled in we handle it as an unknown_exception. Signed-off-by: NIan Munsie <imunsie@au1.ibm.com> Tested-by: NMichael Neuling <mikey@neuling.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Ian Munsie 提交于
Directed Hypervisor Doorbell Interrupts come in at 0xe80 (or 0xc000000000004e80 if relocation on exceptions is enabled), so add exception vectors at these locations. If doorbell support is not compiled in we handle it as an unknown_exception. Signed-off-by: NIan Munsie <imunsie@au1.ibm.com> Tested-by: NMichael Neuling <mikey@neuling.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 15 11月, 2012 7 次提交
-
-
由 Michael Neuling 提交于
POWER8/v2.07 allows exceptions to be taken with the MMU still on. A new set of exception vectors is added at 0xc000_0000_0000_4xxx. When the HW takes us here, MSR IR/DR will be set already and we no longer need a costly RFID to turn the MMU back on again. The original 0x0 based exception vectors remain for when the HW can't leave the MMU on. Examples of this are when we can't trust the current MMU mappings, like when we are changing from guest to hypervisor (HV 0 -> 1) or when the MMU was off already. In these cases the HW will take us to the original 0x0 based exception vectors with the MMU off as before. This uses the new macros added previously too implement these new execption vectors at 0xc000_0000_0000_4xxx. We exit these exception vectors using mflr/blr (rather than mtspr SSR0/RFID), since we don't need the costly MMU switch anymore. This moves the __end_interrupts marker down past these new 0x4000 vectors since they will need to be copied down to 0x0 when the kernel is not at 0x0. Signed-off-by: NMatt Evans <matt@ozlabs.org> Signed-off-by: NMichael Neuling <mikey@neuling.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Michael Neuling 提交于
POWER8/v2.07 allows exceptions to be taken with the MMU still on. A new set of exception vectors is added at 0xc000_0000_0000_4xxx. When the HW takes us here, MSR IR/DR will be set already and we no longer need a costly RFID to turn the MMU back on again. The original 0x0 based exception vectors remain for when the HW can't leave the MMU on. Examples of this are when we can't trust the current the MMU mappings, like when we are changing from guest to hypervisor (HV 0 -> 1) or when the MMU was off already. In these cases the HW will take us to the original 0x0 based exception vectors with the MMU off as before. The below macros are copies of the macros used at the 0x0 offset but modified to handle the MMU being on. In these macros we use the link register to jump to the secondary handlers rather than using RFID (RFID was also use to turn on the MMU). Signed-off-by: NMatt Evans <matt@ozlabs.org> Signed-off-by: NMichael Neuling <mikey@neuling.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Michael Neuling 提交于
This turns the syscall handler into macros as we are going to want to reuse them again later. Signed-off-by: NMatt Evans <matt@ozlabs.org> Signed-off-by: NMichael Neuling <mikey@neuling.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Michael Neuling 提交于
If we change load_hander() to use an ori instead of addi, we can load handlers upto 64k away provided we are still 64k aligned. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Michael Neuling 提交于
This removes the large gap between 0x1800 and 0x3000. Signed-off-by: NMichael Neuling <mikey@neuling.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Michael Neuling 提交于
Remove redundancy spaces and make tab usage consistent. Signed-off-by: NMichael Neuling <mikey@neuling.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Michael Neuling 提交于
Fix global symbol name to match actual denorm_exception_hv label. Signed-off-by: NMichael Neuling <mikey@neuling.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 17 9月, 2012 2 次提交
-
-
由 Aneesh Kumar K.V 提交于
Increase max addressable range to 64TB. This is not tested on real hardware yet. Reviewed-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Michael Neuling 提交于
On POWER6 and POWER7 if the input operand to an instruction is a denormalised single precision binary floating point value we can take a denormalisation exception where it's expected that the hypervisor (HV=1) will fix up the inputs before the instruction is run. This adds code to handle this denormalisation exception for POWER6 and POWER7. It also add a CONFIG_PPC_DENORMALISATION option and sets it in pseries/ppc64_defconfig. This is useful on bare metal systems only. Based on patch from Milton Miller. Signed-off-by: NMichael Neuling <mikey@neuling.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 05 9月, 2012 1 次提交
-
-
由 Paul Mackerras 提交于
At the moment the handler for hypervisor decrementer interrupts is the same as for decrementer interrupts, i.e. timer_interrupt(). This is bogus; if we ever do get a hypervisor decrementer interrupt it won't have anything to do with the next timer event. In fact the only time we get hypervisor decrementer interrupts is when one is left pending on exit from a KVM guest. When we get a hypervisor decrementer interrupt we don't need to do anything special to clear it, since they are edge-triggered on the transition of HDEC from 0 to -1. Thus this adds an empty handler function for them. We don't need to have them masked when interrupts are soft-disabled, so we use STD_EXCEPTION_HV instead of MASKABLE_EXCEPTION_HV. Signed-off-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 11 7月, 2012 2 次提交
-
-
由 Michael Ellerman 提交于
Purely for cosmetic purposes, otherwise it can appear that we are in single_step_pSeries() which is slightly confusing. Signed-off-by: NMichael Ellerman <michael@ellerman.id.au> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Stuart Yoder 提交于
Signed-off-by: NStuart Yoder <stuart.yoder@freescale.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 09 5月, 2012 1 次提交
-
-
由 Benjamin Herrenschmidt 提交于
Alignment was the last user of the ENABLE_INTS macro, which we can now remove. All non-syscall exceptions now disable interrupts on entry, they get re-enabled conditionally from C code. Don't unconditionally re-enable in program check either, check the original context. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 30 4月, 2012 1 次提交
-
-
由 Anton Blanchard 提交于
Remove CONFIG_POWER4_ONLY, the option is badly named and only does two things: - It wraps the MMU segment table code. With feature fixups there is little downside to compiling this in. - It uses the newer mtocrf instruction in various assembly functions. Instead of making this a compile option just do it at runtime via a feature fixup. Signed-off-by: NAnton Blanchard <anton@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 08 4月, 2012 1 次提交
-
-
由 Paul Mackerras 提交于
Currently on POWER7, if we are running the guest on a core and we don't need all the hardware threads, we do nothing to ensure that the unused threads aren't executing in the kernel (other than checking that they are offline). We just assume they're napping and we don't do anything to stop them trying to enter the kernel while the guest is running. This means that a stray IPI can wake up the hardware thread and it will then try to enter the kernel, but since the core is in guest context, it will execute code from the guest in hypervisor mode once it turns the MMU on, which tends to lead to crashes or hangs in the host. This fixes the problem by adding two new one-byte flags in the kvmppc_host_state structure in the PACA which are used to interlock between the primary thread and the unused secondary threads when entering the guest. With these flags, the primary thread can ensure that the unused secondaries are not already in kernel mode (i.e. handling a stray IPI) and then indicate that they should not try to enter the kernel if they do get woken for any reason. Instead they will go into KVM code, find that there is no vcpu to run, acknowledge and clear the IPI and go back to nap mode. Signed-off-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
- 09 3月, 2012 7 次提交
-
-
由 Benjamin Herrenschmidt 提交于
The current implementation of lazy interrupts handling has some issues that this tries to address. We don't do the various workarounds we need to do when re-enabling interrupts in some cases such as when returning from an interrupt and thus we may still lose or get delayed decrementer or doorbell interrupts. The current scheme also makes it much harder to handle the external "edge" interrupts provided by some BookE processors when using the EPR facility (External Proxy) and the Freescale Hypervisor. Additionally, we tend to keep interrupts hard disabled in a number of cases, such as decrementer interrupts, external interrupts, or when a masked decrementer interrupt is pending. This is sub-optimal. This is an attempt at fixing it all in one go by reworking the way we do the lazy interrupt disabling from the ground up. The base idea is to replace the "hard_enabled" field with a "irq_happened" field in which we store a bit mask of what interrupt occurred while soft-disabled. When re-enabling, either via arch_local_irq_restore() or when returning from an interrupt, we can now decide what to do by testing bits in that field. We then implement replaying of the missed interrupts either by re-using the existing exception frame (in exception exit case) or via the creation of a new one from an assembly trampoline (in the arch_local_irq_enable case). This removes the need to play with the decrementer to try to create fake interrupts, among others. In addition, this adds a few refinements: - We no longer hard disable decrementer interrupts that occur while soft-disabled. We now simply bump the decrementer back to max (on BookS) or leave it stopped (on BookE) and continue with hard interrupts enabled, which means that we'll potentially get better sample quality from performance monitor interrupts. - Timer, decrementer and doorbell interrupts now hard-enable shortly after removing the source of the interrupt, which means they no longer run entirely hard disabled. Again, this will improve perf sample quality. - On Book3E 64-bit, we now make the performance monitor interrupt act as an NMI like Book3S (the necessary C code for that to work appear to already be present in the FSL perf code, notably calling nmi_enter instead of irq_enter). (This also fixes a bug where BookE perfmon interrupts could clobber r14 ... oops) - We could make "masked" decrementer interrupts act as NMIs when doing timer-based perf sampling to improve the sample quality. Signed-off-by-yet: Benjamin Herrenschmidt <benh@kernel.crashing.org> --- v2: - Add hard-enable to decrementer, timer and doorbells - Fix CR clobber in masked irq handling on BookE - Make embedded perf interrupt act as an NMI - Add a PACA_HAPPENED_EE_EDGE for use by FSL if they want to retrigger an interrupt without preventing hard-enable v3: - Fix or vs. ori bug on Book3E - Fix enabling of interrupts for some exceptions on Book3E v4: - Fix resend of doorbells on return from interrupt on Book3E v5: - Rebased on top of my latest series, which involves some significant rework of some aspects of the patch. v6: - 32-bit compile fix - more compile fixes with various .config combos - factor out the asm code to soft-disable interrupts - remove the C wrapper around preempt_schedule_irq v7: - Fix a bug with hard irq state tracking on native power7
-
由 Benjamin Herrenschmidt 提交于
On 64-bit, the mfmsr instruction can be quite slow, slower than loading a field from the cache-hot PACA, which happens to already contain the value we want in most cases. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Benjamin Herrenschmidt 提交于
If we get a floating point, altivec or vsx unavaible interrupt in kernel, we trigger a kernel error. There is no point preserving the interrupt state, in fact, that can even make debugging harder as the processor state might change (we may even preempt) between taking the exception and landing in a debugger. So just make those 3 disable interrupts unconditionally. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> --- v2: On BookE only disable when hitting the kernel unavailable path, otherwise it will fail to restore softe as fast_exception_return doesn't do it.
-
由 Benjamin Herrenschmidt 提交于
We currently turn interrupts back to their previous state before calling do_page_fault(). This can be annoying when debugging as a bad fault will potentially have lost some processor state before getting into the debugger. We also end up calling some generic code with interrupts enabled such as notify_page_fault() with interrupts enabled, which could be unexpected. This changes our code to behave more like other architectures, and make the assembly entry code call into do_page_faults() with interrupts disabled. They are conditionally re-enabled from within do_page_fault() in the same spot x86 does it. While there, add the might_sleep() test in the case of a successful trylock of the mmap semaphore, again like x86. Also fix a bug in the existing assembly where r12 (_MSR) could get clobbered by C calls (the DTL accounting in the exception common macro and DISABLE_INTS) in some cases. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> --- v2. Add the r12 clobber fix
-
由 Benjamin Herrenschmidt 提交于
This moves the inlines into system.h and changes the runlatch code to use the thread local flags (non-atomic) rather than the TIF flags (atomic) to keep track of the latch state. The code to turn it back on in an asynchronous interrupt is now simplified and partially inlined. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Benjamin Herrenschmidt 提交于
The perfmon interrupt is the sole user of a special variant of the interrupt prolog which differs from the one used by external and timer interrupts in that it saves the non-volatile GPRs and doesn't turn the runlatch on. The former is unnecessary and the later is arguably incorrect, so let's clean that up by using the same prolog. While at it we rename that prolog to use the _ASYNC prefix. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Benjamin Herrenschmidt 提交于
This removes the various bits of assembly in the kernel entry, exception handling and SLB management code that were specific to running under the legacy iSeries hypervisor which is no longer supported. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 05 3月, 2012 1 次提交
-
-
由 Paul Mackerras 提交于
This provides the low-level support for MMIO emulation in Book3S HV guests. When the guest tries to map a page which is not covered by any memslot, that page is taken to be an MMIO emulation page. Instead of inserting a valid HPTE, we insert an HPTE that has the valid bit clear but another hypervisor software-use bit set, which we call HPTE_V_ABSENT, to indicate that this is an absent page. An absent page is treated much like a valid page as far as guest hcalls (H_ENTER, H_REMOVE, H_READ etc.) are concerned, except of course that an absent HPTE doesn't need to be invalidated with tlbie since it was never valid as far as the hardware is concerned. When the guest accesses a page for which there is an absent HPTE, it will take a hypervisor data storage interrupt (HDSI) since we now set the VPM1 bit in the LPCR. Our HDSI handler for HPTE-not-present faults looks up the hash table and if it finds an absent HPTE mapping the requested virtual address, will switch to kernel mode and handle the fault in kvmppc_book3s_hv_page_fault(), which at present just calls kvmppc_hv_emulate_mmio() to set up the MMIO emulation. This is based on an earlier patch by Benjamin Herrenschmidt, but since heavily reworked. Signed-off-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
- 22 2月, 2012 1 次提交
-
-
由 Michael Ellerman 提交于
In commit 54321242 ("Disable interrupts early in Program Check"), we switched from enabling to disabling interrupts in program_check_common. Whereas ENABLE_INTS leaves r3 untouched, if lockdep is enabled DISABLE_INTS calls into lockdep code and will clobber r3. That means we pass a bogus struct pt_regs* into program_check_exception() and all hell breaks loose. So load our regs pointer into r3 after we call DISABLE_INTS. Signed-off-by: NMichael Ellerman <michael@ellerman.id.au> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 16 2月, 2012 1 次提交
-
-
由 Benjamin Herrenschmidt 提交于
Program Check exceptions are the result of WARNs, BUGs, some type of breakpoints, kprobe, and other illegal instructions. We want interrupts (and thus preemption) to remain disabled while doing the initial stage of testing the reason and branching off to a debugger or kprobe, so we are still on the original CPU which makes debugging easier in various cases. This is how the code was intended, hence the local_irq_enable() right in the middle of program_check_exception(). However, the assembly exception prologue for that exception was incorrectly marked as enabling interrupts, which defeats that (and records a redundant enable with lockdep). Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-