- 05 3月, 2012 24 次提交
-
-
由 Alexander Graf 提交于
When running the 64-bit Book3s PR code without CONFIG_PREEMPT_NONE, we were doing a few things wrong, most notably access to PACA fields without making sure that the pointers stay stable accross the access (preempt_disable()). This patch moves to_svcpu towards a get/put model which allows us to disable preemption while accessing the shadow vcpu fields in the PACA. That way we can run preemptible and everyone's happy! Reported-by: NJörg Sommer <joerg@alea.gnuu.de> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Alexander Graf 提交于
Somewhere during merges we ended up from local_irq_enable() foo(); local_irq_disable() to always keeping irqs enabled during that part. However, we now have the following code: foo(); local_irq_disable() which disables interrupts without the surrounding code enabling them again! So let's remove that disable and be happy. Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Alexander Graf 提交于
When entering the guest, we want to make sure we're not getting preempted away, so let's disable preemption on entry, but enable it again while handling guest exits. Reported-by: NJörg Sommer <joerg@alea.gnuu.de> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Scott Wood 提交于
Decrementers are now properly driven by TCR/TSR, and the guest has full read/write access to these registers. The decrementer keeps ticking (and setting the TSR bit) regardless of whether the interrupts are enabled with TCR. The decrementer stops at zero, rather than going negative. Decrementers (and FITs, once implemented) are delivered as level-triggered interrupts -- dequeued when the TSR bit is cleared, not on delivery. Signed-off-by: NLiu Yu <yu.liu@freescale.com> [scottwood@freescale.com: significant changes] Signed-off-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Scott Wood 提交于
This allows additional registers to be accessed by the guest in PR-mode KVM without trapping. SPRG4-7 are readable from userspace. On booke, KVM will sync these registers when it enters the guest, so that accesses from guest userspace will work. The guest kernel, OTOH, must consistently use either the real registers or the shared area between exits. This also applies to the already-paravirted SPRG3. On non-booke, it's not clear to what extent SPRG4-7 are supported (they're not architected for book3s, but exist on at least some classic chips). They are copied in the get/set regs ioctls, but I do not see any non-booke emulation. I also do not see any syncing with real registers (in PR-mode) including the user-readable SPRG3. This patch should not make that situation any worse. Signed-off-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Scott Wood 提交于
Also fix wrteei 1 paravirt to check for a pending interrupt. Signed-off-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Scott Wood 提交于
int_pending was only being lowered if a bit in pending_exceptions was cleared during exception delivery -- but for interrupts, we clear it during IACK/TSR emulation. This caused paravirt for enabling MSR[EE] to be ineffective. Signed-off-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Scott Wood 提交于
This prevents us from inappropriately blocking in a KVM_SET_REGS ioctl -- the MSR[WE] will take effect when the guest is next entered. It also causes SRR1[WE] to be set when we enter the guest's interrupt handler, which is what e500 hardware is documented to do. Signed-off-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Scott Wood 提交于
This function should be called with interrupts disabled, to avoid a race where an exception is delivered after we check, but the resched kick is received before we disable interrupts (and thus doesn't actually trigger the exit code that would recheck exceptions). booke already does this properly in the lightweight exit case, but not on initial entry. For now, move the call of prepare_to_enter into subarch-specific code so that booke can do the right thing here. Ideally book3s would do the same thing, but I'm having a hard time seeing where it does any interrupt disabling of this sort (plus it has several additional call sites), so I'm deferring the book3s fix to someone more familiar with that code. book3s behavior should be unchanged by this patch. Signed-off-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Scott Wood 提交于
This function also updates paravirt int_pending, so rename it to be more obvious that this is a collection of checks run prior to (re)entering a guest. Signed-off-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Scott Wood 提交于
Currently we check prior to returning from a lightweight exit, but not prior to initial entry. book3s already does a similar test. Signed-off-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Bharat Bhushan 提交于
As per specification the decrementer interrupt not happen when DEC is written with 0. Also when DEC is zero, no decrementer running. So we should not start hrtimer for decrementer when DEC = 0. Signed-off-by: NBharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Bharat Bhushan 提交于
kvmppc_emulate_dec() uses dec_nsec of type unsigned long and does below calculation: dec_nsec = vcpu->arch.dec; dec_nsec *= 1000; This will truncate if DEC value "vcpu->arch.dec" is greater than 0xffff_ffff/1000. For example : For tb_ticks_per_usec = 4a, we can not set decrementer more than ~58ms. Signed-off-by: NBharat Bhushan <bharat.bhushan@freescale.com> Acked-by: NLiu Yu <yu.liu@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Bharat Bhushan 提交于
The current implementation of mtmsr and mtmsrd are racy in that it does: * check (int_pending == 0) ---> host sets int_pending = 1 <--- * write shared page * done while instead we should check for int_pending after the shared page is written. Signed-off-by: NBharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Alexander Graf 提交于
With hugetlbfs support emerging on e500, we should also support KVM backing its guest memory by it. This patch adds support for hugetlbfs into the e500 shadow mmu code. Signed-off-by: NAlexander Graf <agraf@suse.de> Acked-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Scott Wood 提交于
The hardcoded behavior prevents proper SMP support. user space shall specify the vcpu's PIR as the vcpu id. Signed-off-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Scott Wood 提交于
It should contain the way, not the absolute TLB0 index. Signed-off-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Scott Wood 提交于
This implements a shared-memory API for giving host userspace access to the guest's TLB. Signed-off-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Scott Wood 提交于
Split out the portions of tlbe_priv that should be associated with host entries into tlbe_ref. Base victim selection on the number of hardware entries, not guest entries. For TLB1, where one guest entry can be mapped by multiple host entries, we use the host tlbe_ref for tracking page references. For the guest TLB0 entries, we still track it with gtlb_priv, to avoid having to retranslate if the entry is evicted from the host TLB but not the guest TLB. Signed-off-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Scott Wood 提交于
The only place it makes sense to call this function already needs to have preemption disabled. Signed-off-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Scott Wood 提交于
Delay allocation of the shadow pid until we're ready to disable preemption and write the entry. Signed-off-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Christian Borntraeger 提交于
On some cpus the overhead for virtualization instructions is in the same range as a system call. Having to call multiple ioctls to get set registers will make certain userspace handled exits more expensive than necessary. Lets provide a section in kvm_run that works as a shared save area for guest registers. We also provide two 64bit flags fields (architecture specific), that will specify 1. which parts of these fields are valid. 2. which registers were modified by userspace Each bit for these flag fields will define a group of registers (like general purpose) or a single register. Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Carsten Otte 提交于
This patch exports the s390 SIE hardware control block to userspace via the mapping of the vcpu file descriptor. In order to do so, a new arch callback named kvm_arch_vcpu_fault is introduced for all architectures. It allows to map architecture specific pages. Signed-off-by: NCarsten Otte <cotte@de.ibm.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Carsten Otte 提交于
This patch introduces a new config option for user controlled kernel virtual machines. It introduces a parameter to KVM_CREATE_VM that allows to set bits that alter the capabilities of the newly created virtual machine. The parameter is passed to kvm_arch_init_vm for all architectures. The only valid modifier bit for now is KVM_VM_S390_UCONTROL. This requires CAP_SYS_ADMIN privileges and creates a user controlled virtual machine on s390 architectures. Signed-off-by: NCarsten Otte <cotte@de.ibm.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
- 22 2月, 2012 3 次提交
-
-
由 Benjamin Herrenschmidt 提交于
We have a few problems when returning to userspace. This is a quick set of fixes for 3.3, I'll look into a more comprehensive rework for 3.4. This fixes: - We kept interrupts soft-disabled when schedule'ing or calling do_signal when returning to userspace as a result of a hardware interrupt. - Rename do_signal to do_notify_resume like all other archs (and do_signal_pending back to do_signal, which it was before Roland changed it). - Add the missing call to key_replace_session_keyring() to do_notify_resume(). Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> ---
-
由 Michael Ellerman 提交于
In commit 54321242 ("Disable interrupts early in Program Check"), we switched from enabling to disabling interrupts in program_check_common. Whereas ENABLE_INTS leaves r3 untouched, if lockdep is enabled DISABLE_INTS calls into lockdep code and will clobber r3. That means we pass a bogus struct pt_regs* into program_check_exception() and all hell breaks loose. So load our regs pointer into r3 after we call DISABLE_INTS. Signed-off-by: NMichael Ellerman <michael@ellerman.id.au> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Rusty Russell 提交于
This has been obsolescent for a while; time for the final push. In adjacent context, replaced old cpus_* with cpumask_*. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 16 2月, 2012 5 次提交
-
-
由 Anton Blanchard 提交于
perf on POWER stopped working after commit e050e3f0 (perf: Fix broken interrupt rate throttling). That patch exposed a bug in the POWER perf_events code. Since the PMCs count upwards and take an exception when the top bit is set, we want to write 0x80000000 - left in power_pmu_start. We were instead programming in left which effectively disables the counter until we eventually hit 0x80000000. This could take seconds or longer. With the patch applied I get the expected number of samples: SAMPLE events: 9948 Signed-off-by: NAnton Blanchard <anton@samba.org> Acked-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Cc: <stable@kernel.org>
-
由 Benjamin Herrenschmidt 提交于
Program Check exceptions are the result of WARNs, BUGs, some type of breakpoints, kprobe, and other illegal instructions. We want interrupts (and thus preemption) to remain disabled while doing the initial stage of testing the reason and branching off to a debugger or kprobe, so we are still on the original CPU which makes debugging easier in various cases. This is how the code was intended, hence the local_irq_enable() right in the middle of program_check_exception(). However, the assembly exception prologue for that exception was incorrectly marked as enabling interrupts, which defeats that (and records a redundant enable with lockdep). Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Stephen Rothwell 提交于
Since we are heading towards removing the Legacy iSeries platform, start by no longer building it for ppc64_defconfig. Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Benjamin Herrenschmidt 提交于
Upstream changes to the way PHB resources are registered broke the resource fixup for FSL boards. We can no longer rely on the resource pointer array for the PHB's pci_bus structure, so let's leave it alone and go straight for the PHB resources instead. This also makes the code generally more readable. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Ira Snyder 提交于
A kernel oops/panic prints an instruction dump showing several instructions before and after the instruction which caused the oops/panic. The code intended that the faulting instruction be enclosed in angle brackets, however a bug caused the faulting instruction to be interpreted by printk() as the message log level. To fix this, the KERN_CONT log level is added before the actual text of the printed message. === Before the patch === [ 1081.587266] Instruction dump: [ 1081.590236] 7c000110 7c0000f8 5400077c 552907f6 7d290378 992b0003 4e800020 38000001 [ 1081.598034] 3d20c03a 9009a114 7c0004ac 39200000 [ 1081.602500] 4e800020 3803ffd0 2b800009 <4>[ 1081.587266] Instruction dump: <4>[ 1081.590236] 7c000110 7c0000f8 5400077c 552907f6 7d290378 992b0003 4e800020 38000001 <4>[ 1081.598034] 3d20c03a 9009a114 7c0004ac 39200000 <98090000>[ 1081.602500] 4e800020 3803ffd0 2b800009 === After the patch === [ 51.385216] Instruction dump: [ 51.388186] 7c000110 7c0000f8 5400077c 552907f6 7d290378 992b0003 4e800020 38000001 [ 51.395986] 3d20c03a 9009a114 7c0004ac 39200000 <98090000> 4e800020 3803ffd0 2b800009 <4>[ 51.385216] Instruction dump: <4>[ 51.388186] 7c000110 7c0000f8 5400077c 552907f6 7d290378 992b0003 4e800020 38000001 <4>[ 51.395986] 3d20c03a 9009a114 7c0004ac 39200000 <98090000> 4e800020 3803ffd0 2b800009 Signed-off-by: NIra W. Snyder <iws@ovro.caltech.edu> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 14 2月, 2012 7 次提交
-
-
EEH may happen during a PCI driver probe. If the driver is trying to access some register in a loop, the EEH code will try to print the driver name. But the driver pointer in struct pci_dev is not set until probe returns successfully. Use a function to test if the device and the driver pointer is NULL before accessing the driver's name. Signed-off-by: NThadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Brian King 提交于
This fixes a hang that was observed during live partition migration. Since stop_topology_update must not be called from an interrupt context, call it earlier in the migration process. The hang observed can be seen below: WARNING: at kernel/timer.c:1011 Modules linked in: ip6t_LOG xt_tcpudp xt_pkttype ipt_LOG xt_limit ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw xt_NOTRACK ipt_REJECT xt_state iptable_raw iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables ip6table_filter ip6_tables x_tables ipv6 fuse loop ibmveth sg ext3 jbd mbcache raid456 async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx raid10 raid1 raid0 scsi_dh_alua scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc dm_round_robin dm_multipath scsi_dh sd_mod crc_t10dif ibmvfc scsi_transport_fc scsi_tgt scsi_mod dm_snapshot dm_mod NIP: c0000000000c52d8 LR: c00000000004be28 CTR: 0000000000000000 REGS: c00000005ffd77d0 TRAP: 0700 Not tainted (3.2.0-git-00001-g07d106d0) MSR: 8000000000021032 <ME,CE,IR,DR> CR: 48000084 XER: 00000001 CFAR: c00000000004be20 TASK = c00000005ec78860[0] 'swapper/3' THREAD: c00000005ec98000 CPU: 3 GPR00: 0000000000000001 c00000005ffd7a50 c000000000fbbc98 c000000000ec8340 GPR04: 00000000282a0020 0000000000000000 0000000000004000 0000000000000101 GPR08: 0000000000000012 c00000005ffd4000 0000000000000020 c000000000f3ba88 GPR12: 0000000000000000 c000000007f40900 0000000000000001 0000000000000004 GPR16: 0000000000000001 0000000000000000 0000000000000000 c000000001022310 GPR20: 0000000000000001 0000000000000000 0000000000200200 c000000001029e14 GPR24: 0000000000000000 0000000000000001 0000000000000040 c00000003f74bc80 GPR28: c00000003f74bc84 c000000000f38038 c000000000f16b58 c000000000ec8340 NIP [c0000000000c52d8] .del_timer_sync+0x28/0x60 LR [c00000000004be28] .stop_topology_update+0x20/0x38 Call Trace: [c00000005ffd7a50] [c00000005ec78860] 0xc00000005ec78860 (unreliable) [c00000005ffd7ad0] [c00000000004be28] .stop_topology_update+0x20/0x38 [c00000005ffd7b40] [c000000000028378] .__rtas_suspend_last_cpu+0x58/0x260 [c00000005ffd7bf0] [c0000000000fa230] .generic_smp_call_function_interrupt+0x160/0x358 [c00000005ffd7cf0] [c000000000036ec8] .smp_ipi_demux+0x88/0x100 [c00000005ffd7d80] [c00000000005c154] .icp_hv_ipi_action+0x5c/0x80 [c00000005ffd7e00] [c00000000012a088] .handle_irq_event_percpu+0x100/0x318 [c00000005ffd7f00] [c00000000012e774] .handle_percpu_irq+0x84/0xd0 [c00000005ffd7f90] [c000000000022ba8] .call_handle_irq+0x1c/0x2c [c00000005ec9ba20] [c00000000001157c] .do_IRQ+0x22c/0x2a8 [c00000005ec9bae0] [c0000000000054bc] hardware_interrupt_entry+0x18/0x1c Exception: 501 at .cpu_idle+0x194/0x2f8 LR = .cpu_idle+0x194/0x2f8 [c00000005ec9bdd0] [c000000000017e58] .cpu_idle+0x188/0x2f8 (unreliable) [c00000005ec9be90] [c00000000067ec18] .start_secondary+0x3e4/0x524 [c00000005ec9bf90] [c0000000000093e8] .start_secondary_prolog+0x10/0x14 Instruction dump: ebe1fff8 4e800020 fbe1fff8 7c0802a6 f8010010 7c7f1b78 f821ff81 78290464 80090014 5400019e 7c0000d0 78000fe0 <0b000000> 4800000c 7c210b78 7c421378 Signed-off-by: NBrian King <brking@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Michael Ellerman 提交于
We need to disable interrupts when taking the phb->lock. Otherwise we could deadlock with pci_lock taken from an interrupt. Signed-off-by: NMichael Ellerman <michael@ellerman.id.au> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Benjamin Herrenschmidt 提交于
We use __get_cpu_var() which triggers a false positive warning in smp_processor_id() thinking interrupts are enabled (at this point, they are soft-enabled but hard-disabled). Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Benjamin Herrenschmidt 提交于
We call the cache_hwirq_map() function with a linux IRQ number but it expects a HW irq number. This triggers a BUG on multic-chip setups in addition to not doing the right thing. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Srikar Dronamraju 提交于
With this change, helpers such as instruction_pointer() et al, get defined in the generic header in terms of GET_IP Removed the unnecessary definition of profile_pc in !CONFIG_SMP case as suggested by Mike Frysinger. Signed-off-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: NAnanth N Mavinakayanahalli <ananth@in.ibm.com> Acked-by: NMike Frysinger <vapier@gentoo.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Benjamin Herrenschmidt 提交于
It appears that on the Chroma card, the class code of the root complex is still wrong even on DD2 or later chips. This could be a firmware issue, but that breaks resource allocation so let's unconditionally fix it up. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 25 1月, 2012 1 次提交
-
-
由 Benjamin Herrenschmidt 提交于
Commit 9deaa53a broke build on platforms that use legacy_serial.c without also having CONFIG_SERIAL_8250_FSL enabled due to an unconditional code to a routine in that module. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-