- 05 Apr 2008, 2 commits
-
Committed by Mark McLoughlin

i.e. with this simple test case:

    int fd = open("/dev/zero", O_RDONLY);
    munmap(mmap((void *)0x40000000, 0x1000, PROT_READ, MAP_PRIVATE, fd, 0), 0x1000);
    close(fd);

we currently get:

    kernel BUG at arch/x86/xen/enlighten.c:678!
    ...
    EIP is at xen_release_pt+0x79/0xa9
    ...
    Call Trace:
     [<c041da25>] ? __pmd_free_tlb+0x1a/0x75
     [<c047a192>] ? free_pgd_range+0x1d2/0x2b5
     [<c047a2f3>] ? free_pgtables+0x7e/0x93
     [<c047b272>] ? unmap_region+0xb9/0xf5
     [<c047c1bd>] ? do_munmap+0x193/0x1f5
     [<c047c24f>] ? sys_munmap+0x30/0x3f
     [<c0408cce>] ? syscall_call+0x7/0xb
    =======================

and xen complains:

    (XEN) mm.c:2241:d4 Mfn 1cc37 not pinned

Further details at: https://bugzilla.redhat.com/436453

Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Cc: xen-devel@lists.xensource.com
Cc: Mark McLoughlin <markmc@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Committed by Mark McLoughlin

Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Cc: xen-devel@lists.xensource.com
Cc: Mark McLoughlin <markmc@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 27 Mar 2008, 2 commits
-
Committed by Jeremy Fitzhardinge

We need to set up the shared_info pointer once we've mapped the real shared_info into its fixmap slot. That needs to happen once the general pagetable setup has been done. Previously, the UP shared_info was set up once in xen_start_kernel, but that was left pointing to the dummy shared_info. Unfortunately there's no really good place to do a later setup of the shared_info in UP, so just do it once the pagetable setup has been done.

[ Stable: needed in 2.6.24.x ]

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stable Kernel <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Committed by Jeremy Fitzhardinge

xen_irq_enable_direct and xen_sysexit were using "andw $0x00ff, XEN_vcpu_info_pending(vcpu)" to unmask events and test for pending ones in one instruction. Unfortunately, the pending flag must be modified with a locked operation since it can be set by another CPU, and the unlocked form of this operation was causing the pending flag to get lost, allowing the processor to return to usermode with pending events and ultimately deadlock.

The simple fix would be to make it a locked operation, but that's rather costly and unnecessary. The fix here is to split the mask-clearing and pending-testing into two instructions; the interrupt window between them is of no concern because either way pending or new events will be processed.

This should fix lingering bugs in using direct vcpu structure access too.

[ Stable: needed in 2.6.24.x ]

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stable <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
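For context, a minimal C sketch of the semantics described above; the actual fix is in the Xen assembly stubs, and the helper name here is illustrative only (the vcpu_info field names come from the public Xen interface headers):

    #include <linux/compiler.h>        /* barrier() */
    #include <xen/interface/xen.h>     /* struct vcpu_info */

    /* Illustrative only: unmask first with a plain byte store (no other CPU
     * ever writes the mask), then test the pending flag separately.  The
     * pending flag can be set by other CPUs, so it is only read here,
     * never touched with an unlocked read-modify-write. */
    static int xen_unmask_and_check_pending(struct vcpu_info *vcpu)
    {
        vcpu->evtchn_upcall_mask = 0;
        barrier();                            /* keep the store before the test */
        return vcpu->evtchn_upcall_pending;   /* caller forces an upcall if set */
    }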
-
- 01 Mar 2008, 1 commit
-
Committed by Jeremy Fitzhardinge

Fix 32-on-64 pvops kernel: we don't want userspace using syscall/sysenter, even if the hypervisor supports it, so mask it out from CPUID.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
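A hedged sketch of the kind of CPUID filtering described; the function is illustrative, while the actual commit touches xen_cpuid() in enlighten.c and may differ in detail:

    /* Sketch: hide SYSENTER (CPUID.1:EDX bit 11) and SYSCALL
     * (CPUID.80000001:EDX bit 11) from the guest so 32-bit userspace
     * falls back to int $0x80.  Illustrative only. */
    static void xen_cpuid_filter(unsigned int leaf, unsigned int *dx)
    {
        if (leaf == 1)
            *dx &= ~(1u << 11);        /* clear SEP: no sysenter for userspace */
        if (leaf == 0x80000001)
            *dx &= ~(1u << 11);        /* clear SYSCALL on the extended leaf   */
    }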
-
- 13 Feb 2008, 1 commit
-
Committed by Jeremy Fitzhardinge

Unpin the Xen-provided pagetable once we've finished with it, so it doesn't cause stray references which cause later swapper_pg_dir pagetable updates to fail.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Tested-by: Jody Belka <knew-linux@pimb.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 30 Jan 2008, 10 commits
-
Committed by Jeremy Fitzhardinge

Deal properly with pmd-level pages being allocated and freed dynamically. We can handle them more or less the same as pte pages. Also, deal with early_ioremap pagetable manipulations.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Committed by Jan Beulich

Based on patch from Jan Beulich <jbeulich@novell.com>.

Don't rely on kmalloc(PAGE_SIZE) returning PAGE_SIZE-aligned memory (Xen requires the GDT *and* the LDT to be page-aligned). Using the page allocator interface also removes the (albeit small) slab allocator overhead. The same change is made for 64-bit for consistency.

Further, the Xen hypercall interface expects the LDT address to be virtual, not machine.

[ Adjusted to unified ldt.c - Jeremy ]

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
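A minimal sketch of the allocation pattern being described, assuming a single-page table for simplicity (larger LDTs would use __get_free_pages with a higher order); the function names are placeholders, not the real descriptor code:

    #include <linux/gfp.h>
    #include <linux/mm.h>

    /* Sketch: the page allocator always returns page-aligned memory,
     * unlike kmalloc(PAGE_SIZE), which makes no alignment guarantee
     * that Xen could rely on. */
    static void *alloc_page_aligned_table(void)
    {
        return (void *)__get_free_page(GFP_KERNEL | __GFP_ZERO);
    }

    static void free_page_aligned_table(void *table)
    {
        free_page((unsigned long)table);
    }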
-
Committed by Glauber de Oliveira Costa

This patch changes the signature of write_ldt_entry.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
CC: Zachary Amsden <zach@vmware.com>
CC: Jeremy Fitzhardinge <Jeremy.Fitzhardinge.citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Committed by Glauber de Oliveira Costa

This patch changes the write_gdt_entry function signature. Instead of the old "a" and "b" parameters, it now receives a pointer to a desc_struct and the size of the entry being handled. This is because x86_64 can have some 16-byte entries as well as 8-byte ones.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
CC: Zachary Amsden <zach@vmware.com>
CC: Jeremy Fitzhardinge <Jeremy.Fitzhardinge.citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
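A sketch of the interface shape this describes; the parameter names and the _old/_new suffixes are assumptions for illustration, and the real declaration in the kernel headers may differ:

    #include <linux/types.h>

    struct desc_struct;    /* x86 descriptor: 8 bytes on i386, up to 16 on x86_64 */

    /* Old style: two 32-bit halves of the descriptor. */
    void write_gdt_entry_old(void *gdt, int entry, u32 a, u32 b);

    /* New style: a pointer to the descriptor plus its size, so x86_64's
     * 16-byte system descriptors fit the same interface. */
    void write_gdt_entry_new(struct desc_struct *gdt, int entry,
                             const void *desc, int size);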
-
Committed by Glauber de Oliveira Costa

This patch changes the write_idt_entry signature. It now takes a gate_desc instead of the "a" and "b" parameters, which will allow it to be unified between i386 and x86_64 later.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
CC: Zachary Amsden <zach@vmware.com>
CC: Jeremy Fitzhardinge <Jeremy.Fitzhardinge.citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Committed by Glauber de Oliveira Costa

This patch unifies struct desc_ptr between i386 and x86_64. They can be expressed in exactly the same way in C code, only having to change the name of one of them. As Xgt_desc_struct is ugly and big, that is the one that goes away. There's also a padding field in i386, but it is not really needed in the C structure definition.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
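For reference, the unified descriptor-table pointer has roughly this shape (a sketch based on the description above; the in-tree definition may carry additional attributes):

    /* One definition serves both i386 (32-bit address) and x86_64 (64-bit
     * address); packed so "size" is not padded out to pointer alignment. */
    struct desc_ptr {
        unsigned short size;
        unsigned long  address;
    } __attribute__((packed));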
-
Committed by H. Peter Anvin

This changes size-specific register names (eip/rip, esp/rsp, etc.) to generic names in the thread and tss structures.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Committed by H. Peter Anvin

We have a lot of code which differs only by the naming of specific members of structures that contain registers. In order to enable additional unifications, this patch drops the e- or r- size prefix from the register names in struct pt_regs, and drops the x- prefixes for segment registers on the 32-bit side. This patch also performs the equivalent renames in some additional places that might be candidates for unification in the future.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Committed by Glauber de Oliveira Costa

This patch consolidates the irqflags include files containing common paravirt definitions. The native definitions for interrupt handling, halt, and so on are the same for 32- and 64-bit, and they are kept in irqflags.h; the differences are split into the arch-specific files. The syscall function irq_enable_sysexit has a very i386-specific name, so it is renamed to something more general.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Committed by Thomas Gleixner

Use u32 so 32- and 64-bit have the same interface.

Andrew Morton: xen, lguest build fixes

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 24 Jan 2008, 1 commit
-
Committed by Jeremy Fitzhardinge

There have been several reports of Xen guest domains locking up when using vcpu_info structure placement. Disable it for now.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 11 Dec 2007, 1 commit
-
Committed by Jeremy Fitzhardinge

Some versions of Xen 3.x set their magic number to "xen-3.[12]", so relax the test to match them.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
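A hedged sketch of what such a relaxed check amounts to; the real test in xen_start_kernel may be written differently:

    #include <linux/string.h>

    /* Sketch: accept any "xen-3.x" magic string rather than insisting on
     * an exact "xen-3.0" match. */
    static int xen_magic_ok(const char *magic)
    {
        return strncmp(magic, "xen-3", 5) == 0;
    }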
-
- 18 Oct 2007, 1 commit
-
Committed by Jesper Juhl

This patch cleans up duplicate includes in arch/i386/xen/.

[ tglx: arch/x86 adaptation ]

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
- 17 Oct 2007, 8 commits
-
Committed by H. Peter Anvin

Instead of using magic macros for boot_params access, simply use the boot_params structure.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
-
Committed by Jeremy Fitzhardinge

The kernel's copy of struct vcpu_register_vcpu_info was out of date, at best causing the hypercall to fail and the guest kernel to fall back to the old mechanism, or worse, causing random memory corruption.

[ Stable folks: applies to 2.6.23 ]

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Stable Kernel <stable@kernel.org>
Cc: Morten Bøgeskov <xen-users@morten.bogeskov.dk>
Cc: Mark Williamson <mark.williamson@cl.cam.ac.uk>
-
Committed by Jeremy Fitzhardinge

Ask the hypervisor how much space it needs reserved, since 32-on-64 doesn't need any space, and it may change in the future.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
-
Committed by Jeremy Fitzhardinge

When a pagetable is created, it is made globally visible in the rmap prio tree before it is pinned via arch_dup_mmap(), and it remains in the rmap tree while it is unpinned with arch_exit_mmap(). This means that other CPUs may race with the pinning/unpinning process, and see a pte between when it gets marked RO and actually pinned, causing any pte updates to fail with write-protect faults.

As a result, all pte pages must be properly locked, and only unlocked once the pinning/unpinning process has finished. In order to avoid taking spinlocks for the whole pagetable - which may overflow the PREEMPT_BITS portion of the preempt counter - it locks and pins each pte page individually, and then finally pins the whole pagetable.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickens <hugh@veritas.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andi Kleen <ak@suse.de>
Cc: Keir Fraser <keir@xensource.com>
Cc: Jan Beulich <jbeulich@novell.com>
-
Committed by Jeremy Fitzhardinge

When a pagetable is no longer in use, it must be unpinned so that its pages can be freed. However, this is only possible if there are no stray uses of the pagetable. The code currently deals with all the usual cases, but there's a rare case where a vcpu is changing cr3, but is doing so lazily, and the change hasn't actually happened by the time the pagetable is unpinned, even though it appears to have been completed.

This change adds a second per-cpu cr3 variable - xen_current_cr3 - which tracks the actual state of the vcpu cr3. It is only updated once the actual hypercall to set cr3 has been completed. Other processors wishing to unpin a pagetable can check other vcpus' xen_current_cr3 values to see if any cross-cpu IPIs are needed to clean things up.

[ Stable folks: 2.6.23 bugfix ]

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Stable Kernel <stable@kernel.org>
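A minimal sketch of the two per-cpu variables this describes, declarations only; xen_current_cr3 is named in the message above, while the name of the "intended" cr3 variable is an assumption, and the surrounding update logic is omitted:

    #include <linux/percpu.h>

    /* Sketch:
     *   xen_cr3         - the cr3 we intend to load (may still be queued lazily)
     *   xen_current_cr3 - the cr3 the hypervisor has actually been told about;
     *                     only updated after the set-cr3 hypercall completes. */
    static DEFINE_PER_CPU(unsigned long, xen_cr3);
    static DEFINE_PER_CPU(unsigned long, xen_current_cr3);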
-
Committed by Jesper Juhl

This patch cleans up duplicate includes in arch/i386/xen/.

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
-
Committed by Jeremy Fitzhardinge

Currently, the set_lazy_mode pv_op is overloaded with 5 functions:
 1. enter lazy cpu mode
 2. leave lazy cpu mode
 3. enter lazy mmu mode
 4. leave lazy mmu mode
 5. flush pending batched operations

This complicates each paravirt backend, since it needs to deal with all the possible state transitions, handling flushing, etc. In particular, flushing is quite distinct from the other 4 functions, and seems to just cause complication.

This patch removes the set_lazy_mode operation, and adds "enter" and "leave" lazy mode operations on mmu_ops and cpu_ops. All the logic associated with entering and leaving lazy states is now in common code (basically BUG_ONs to make sure that no mode is current when entering a lazy mode, and that the mode is current when leaving). Also, flush is handled in a common way, by simply leaving and re-entering the lazy mode.

The result is that the Xen, lguest and VMI lazy mode implementations are much simpler.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Zach Amsden <zach@vmware.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Avi Kivity <avi@qumranet.com>
Cc: Anthony Liguory <aliguori@us.ibm.com>
Cc: "Glauber de Oliveira Costa" <glommer@gmail.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
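The flush-by-leaving-and-re-entering idea reads roughly like this; a hedged sketch for the MMU case only, with an assumed function name rather than the actual common code:

    #include <asm/paravirt.h>    /* paravirt_get_lazy_mode(), PARAVIRT_LAZY_MMU */
    #include <asm/pgtable.h>     /* arch_{enter,leave}_lazy_mmu_mode()          */

    /* Sketch: flushing pending batched mmu operations is just "leave, then
     * re-enter" the lazy mode that is currently active. */
    static void flush_lazy_mmu(void)
    {
        if (paravirt_get_lazy_mode() == PARAVIRT_LAZY_MMU) {
            arch_leave_lazy_mmu_mode();    /* issues everything queued so far */
            arch_enter_lazy_mmu_mode();    /* then resume batching            */
        }
    }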
-
Committed by Jeremy Fitzhardinge

This patch refactors the paravirt_ops structure into groups of functionally related ops:

  pv_info     - random info, rather than function entrypoints
  pv_init_ops - functions used at boot time (some for module_init too)
  pv_misc_ops - lazy mode, which didn't fit well anywhere else
  pv_time_ops - time-related functions
  pv_cpu_ops  - various privileged instruction ops
  pv_irq_ops  - operations for managing interrupt state
  pv_apic_ops - APIC operations
  pv_mmu_ops  - operations for managing pagetables

There are several motivations for this:

 1. Some of these ops will be general to all x86, and some will be i386/x86-64 specific. This makes it easier to share common stuff while allowing separate implementations where needed.
 2. At the moment we must export all of paravirt_ops, but modules only need selected parts of it. This allows us to export on a case-by-case basis (and also choose which export license we want to apply).
 3. Functional groupings make things a bit more readable.

Struct paravirt_ops is now only used as a template to generate patch-site identifiers, and to extract function pointers for inserting into jmp/calls when patching. It is only instantiated when needed.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andi Kleen <ak@suse.de>
Cc: Zach Amsden <zach@vmware.com>
Cc: Avi Kivity <avi@qumranet.com>
Cc: Anthony Liguory <aliguori@us.ibm.com>
Cc: "Glauber de Oliveira Costa" <glommer@gmail.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
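A hedged illustration of what the split means for a backend; the structure and callbacks below are deliberately self-contained stand-ins, not the real pv_irq_ops definition or the Xen implementations:

    /* Sketch of the idea: instead of one monolithic paravirt_ops, related
     * hooks live in small per-area structures (groups as listed above,
     * members trimmed), and each backend fills only the groups it overrides. */
    struct pv_irq_ops_sketch {
        unsigned long (*save_fl)(void);
        void (*restore_fl)(unsigned long flags);
        void (*irq_disable)(void);
        void (*irq_enable)(void);
    };

    static unsigned long noop_save_fl(void)          { return 0; }
    static void noop_restore_fl(unsigned long flags) { (void)flags; }
    static void noop_irq_disable(void)               { }
    static void noop_irq_enable(void)                { }

    static struct pv_irq_ops_sketch example_irq_ops = {
        .save_fl     = noop_save_fl,
        .restore_fl  = noop_restore_fl,
        .irq_disable = noop_irq_disable,
        .irq_enable  = noop_irq_enable,
    };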
-
- 11 Oct 2007, 1 commit
-
Committed by Thomas Gleixner

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 20 Sep 2007, 1 commit
-
Committed by Jeremy Fitzhardinge

Xen ignores all updates to cr4, and some versions will kill the domain if you try to change its value. Just ignore all changes.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
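In paravirt terms, "ignore all changes" amounts to a no-op write_cr4 hook; a minimal sketch (the function name follows the Xen naming pattern but is assumed here):

    /* Sketch: a write_cr4 paravirt hook that silently drops the write, since
     * the hypervisor would either ignore it or kill the domain anyway. */
    static void xen_write_cr4_sketch(unsigned long cr4)
    {
        (void)cr4;    /* intentionally do nothing */
    }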
-
- 12 Aug 2007, 1 commit
-
Committed by Andi Kleen

Commit 19d36ccd "x86: Fix alternatives and kprobes to remap write-protected kernel text" uses code which is being patched for patching. In particular, paravirt_ops does patching in two stages: first it calls paravirt_ops.patch, then it fills any remaining instructions with nop_out(). nop_out calls text_poke(), which calls lookup_address(), which calls pgd_val() (aka paravirt_ops.pgd_val): that call site is one of the places we patch.

If we always do patching as one single call to text_poke(), we only need to make sure we're not patching the memcpy in text_poke itself. This means the prototype of paravirt_ops.patch needs to change, to marshal the new code into a buffer rather than patching in place as it does now. It also means all patching goes through text_poke(), which is known to be safe (apply_alternatives is also changed to make a single patch).

AK: fix compilation on x86-64 (bad rusty!)
AK: fix boot on x86-64 (sigh)
AK: merged with other patches

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
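The reworked hook described above ends up with roughly this shape; a sketch with assumed parameter and function names, not the verbatim kernel declaration:

    #include <linux/types.h>

    /* Sketch: instead of writing into the patch site directly, the backend
     * emits its replacement instructions into insnbuf; the caller then
     * installs the buffer with a single text_poke().  Returns the number of
     * bytes of insnbuf actually used. */
    unsigned paravirt_patch(u8 type, u16 clobbers, void *insnbuf,
                            unsigned long site_addr, unsigned site_len);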
-
- 18 Jul 2007, 10 commits
-
Committed by Jeremy Fitzhardinge

Most of the time we can simply use the iret instruction to exit the kernel, rather than having to use the iret hypercall - the only exception is if we're returning into vm86 mode, or from delivering an NMI (which we don't support yet).

When running native, iret has the behaviour of testing for a pending interrupt atomically with re-enabling interrupts. Unfortunately there's no way to do this with Xen, so there's a window in which we could get a recursive exception after enabling events but before actually returning to userspace.

This causes a problem: if the nested interrupt causes one of the task's TIF_WORK_MASK flags to be set, they will not be checked again before returning to userspace. This means that pending work may be left pending indefinitely, until the process enters and leaves the kernel again. The net effect is that a pending signal or reschedule event could be delayed for an unbounded amount of time.

To deal with this, the xen event upcall handler checks to see if the EIP is within the critical section of the iret code, after events are (potentially) enabled up to the iret itself. If it is within this range, it calls the iret critical section fixup, which adjusts the stack to deal with any unrestored registers, and then shifts the stack frame up to replace the previous invocation.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
-
Committed by Jeremy Fitzhardinge

This patch adds the mechanism to allow us to patch inline versions of common operations. The implementations of the direct-access versions of save_fl, restore_fl, irq_enable and irq_disable are now in assembler, and the same code is used for both out-of-line and inline uses.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Keir Fraser <keir@xensource.com>
-
Committed by Jeremy Fitzhardinge

An experimental patch for Xen allows guests to place their vcpu_info structs anywhere. We try to use this to place the vcpu_info into the PDA, which allows direct access. If this works, we then switch to using direct-access operations for irq_enable, irq_disable, save_fl and restore_fl.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Keir Fraser <keir@xensource.com>
-
Committed by Jeremy Fitzhardinge

Make the appropriate hypercalls to halt and reboot the virtual machine.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Acked-by: Chris Wright <chrisw@sous-sol.org>
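A hedged sketch of what such a shutdown hypercall looks like, using the public Xen sched_op interface; the function added by the actual commit may be structured differently:

    #include <xen/interface/sched.h>    /* SCHEDOP_shutdown, struct sched_shutdown */
    #include <asm/xen/hypercall.h>      /* HYPERVISOR_sched_op()                   */

    /* Sketch: ask the hypervisor to power the domain off (SHUTDOWN_reboot
     * would reboot it instead). */
    static void xen_poweroff_sketch(void)
    {
        struct sched_shutdown r = { .reason = SHUTDOWN_poweroff };

        HYPERVISOR_sched_op(SCHEDOP_shutdown, &r);
    }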
-
Committed by Jeremy Fitzhardinge

The hypervisor saves and restores the segment registers as part of the state it saves while context switching. If, during a context switch, the next process doesn't use the TLS segments, it invalidates the GDT entry, causing the segment register reload to fault. This fault effectively doubles the cost of a context switch.

This patch is a band-aid workaround which clears the usermode %gs after it has been saved for the previous process, but before it gets reloaded for the next, and it avoids having the hypervisor attempt to erroneously reload it.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
-
Committed by Jeremy Fitzhardinge

This patch uses the lazy-mmu hooks to batch mmu operations where possible. This is primarily useful for batching operations applied to active pagetables, which happens during mprotect, munmap, mremap and the like (mmap does not do bulk pagetable operations, so it isn't helped).

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Acked-by: Chris Wright <chrisw@sous-sol.org>
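For flavour, the generic pattern the lazy-mmu hooks enable looks like this; a sketch of a hypothetical caller in mm code, not code from this commit:

    #include <linux/mm.h>
    #include <asm/pgtable.h>

    /* Sketch: wrap a run of pte updates in enter/leave so a Xen backend can
     * queue them as one multicall instead of one hypercall per pte. */
    static void update_range(struct mm_struct *mm, unsigned long addr,
                             unsigned long end, pte_t *ptep, pgprot_t newprot)
    {
        arch_enter_lazy_mmu_mode();
        for (; addr < end; addr += PAGE_SIZE, ptep++)
            set_pte_at(mm, addr, ptep, pte_modify(*ptep, newprot));
        arch_leave_lazy_mmu_mode();    /* queued updates are flushed here */
    }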
-
Committed by Jeremy Fitzhardinge

Add Xen support for preemption. This is mostly a cleanup of existing preempt_enable/disable calls, or just comments to explain the current usage.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
-
Committed by Jeremy Fitzhardinge

This is a fairly straightforward Xen implementation of smp_ops. Xen has its own IPI mechanisms, and has no dependency on any APIC-based IPI. The smp_ops hooks and the flush_tlb_others pv_op allow a Xen guest to avoid all APIC code in arch/i386 (the only apic operation is a single apic_read for the apic version number).

One subtle point which needs to be addressed is unpinning pagetables when another cpu may have a lazy tlb reference to the pagetable. Xen will not allow an in-use pagetable to be unpinned, so we must find any other cpus with a reference to the pagetable and get them to shoot down their references.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Andi Kleen <ak@suse.de>
-
Committed by Jeremy Fitzhardinge

Implement xen_sched_clock, which returns the number of ns the current vcpu has actually spent in an unstolen state (ie, running or blocked, vs runnable-but-not-running, or offline) since boot.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Cc: john stultz <johnstul@us.ibm.com>
-
Committed by Jeremy Fitzhardinge

When setting up the initial pagetable, which includes mappings of all low physical memory, ignore a mapping which tries to set the RW bit on an RO pte. An RO pte indicates a page which is part of the current pagetable, and so it cannot be allowed to become RW. Once xen_pagetable_setup_done is called, set_pte reverts to its normal behaviour.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Cc: ebiederm@xmission.com (Eric W. Biederman)
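A hedged sketch of the early set_pte filtering described above; the helper name and the exact predicate are assumptions, and the real code in the Xen mmu setup may differ:

    #include <asm/pgtable.h>

    /* Sketch: while the initial pagetable is being built, refuse to turn an
     * existing read-only pte (i.e. a page that is part of the live pagetable)
     * into a writable one - just strip the RW bit from the new value. */
    static pte_t mask_rw_pte_sketch(pte_t *ptep, pte_t newpte)
    {
        if (pte_present(*ptep) && !pte_write(*ptep))
            newpte = pte_wrprotect(newpte);
        return newpte;
    }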
-