- 25 6月, 2008 4 次提交
-
-
由 Jeremy Fitzhardinge 提交于
Some Xen hypercalls accept an array of operations to work on. In general this is because its more efficient for the hypercall to the work all at once rather than as separate hypercalls (even batched as a multicall). This patch adds a mechanism (xen_mc_extend_args()) to allocate more argument space to the last-issued multicall, in order to extend its argument list. The user of this mechanism is xen/mmu.c, which uses it to extend the args array of mmu_update. This is particularly valuable when doing the update for a large mprotect, which goes via ptep_modify_prot_commit(), but it also manages to batch updates to pgd/pmds as well. Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: NLinus Torvalds <torvalds@linux-foundation.org> Acked-by: NHugh Dickins <hugh@veritas.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Jeremy Fitzhardinge 提交于
Xen has a pte update function which will update a pte while preserving its accessed and dirty bits. This means that ptep_modify_prot_start() can be implemented as a simple read of the pte value. The hardware may update the pte in the meantime, but ptep_modify_prot_commit() updates it while preserving any changes that may have happened in the meantime. The updates in ptep_modify_prot_commit() are batched if we're currently in lazy mmu mode. The mmu_update hypercall can take a batch of updates to perform, but this code doesn't make particular use of that feature, in favour of using generic multicall batching to get them all into the hypervisor. The net effect of this is that each mprotect pte update turns from two expensive trap-and-emulate faults into they hypervisor into a single hypercall whose cost is amortized in a batched multicall. Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: NLinus Torvalds <torvalds@linux-foundation.org> Acked-by: NHugh Dickins <hugh@veritas.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Jeremy Fitzhardinge 提交于
This patch adds paravirt-ops hooks in pv_mmu_ops for ptep_modify_prot_start and ptep_modify_prot_commit. This allows the hypervisor-specific backends to implement these in some more efficient way. Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: NLinus Torvalds <torvalds@linux-foundation.org> Acked-by: NHugh Dickins <hugh@veritas.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Gerd Hoffmann 提交于
This patch updates the xen guest to use the pvclock structs and helper functions. Signed-off-by: NGerd Hoffmann <kraxel@redhat.com> Acked-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
- 24 6月, 2008 1 次提交
-
-
由 Jeremy Fitzhardinge 提交于
Non-PAE operation has been deprecated in Xen for a while, and is rarely tested or used. xen-unstable has now officially dropped non-PAE support. Since Xen/pvops' non-PAE support has also been broken for a while, we may as well completely drop it altogether. Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 20 6月, 2008 4 次提交
-
-
由 Jeremy Fitzhardinge 提交于
Because NX is now enforced properly, we must put the hypercall page into the .text segment so that it is executable. Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stable Kernel <stable@kernel.org> Cc: the arch/x86 maintainers <x86@kernel.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Jeremy Fitzhardinge 提交于
[ Stable: this isn't a bugfix in itself, but it's a pre-requiste for "xen: don't drop NX bit" ] Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stable Kernel <stable@kernel.org> Cc: the arch/x86 maintainers <x86@kernel.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Jeremy Fitzhardinge 提交于
Because NX is now enforced properly, we must put the hypercall page into the .text segment so that it is executable. Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stable Kernel <stable@kernel.org> Cc: the arch/x86 maintainers <x86@kernel.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Jeremy Fitzhardinge 提交于
[ Stable: this isn't a bugfix in itself, but it's a pre-requiste for "xen: don't drop NX bit" ] Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stable Kernel <stable@kernel.org> Cc: the arch/x86 maintainers <x86@kernel.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 12 6月, 2008 1 次提交
-
-
由 Jeremy Fitzhardinge 提交于
We have a few instances of the open-coded iterative div/mod loop, used when we don't expcet the dividend to be much bigger than the divisor. Unfortunately modern gcc's have the tendency to strength "reduce" this into a full mod operation, which isn't necessarily any faster, and even if it were, doesn't exist if gcc implements it in libgcc. The workaround is to put a dummy asm statement in the loop to prevent gcc from performing the transformation. This patch creates a single implementation of this loop, and uses it to replace the open-coded versions I know about. Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: john stultz <johnstul@us.ibm.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Robert Hancock <hancockr@shaw.ca> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 02 6月, 2008 5 次提交
-
-
由 Jeremy Fitzhardinge 提交于
Define recently added XEN_ELFNOTEs, and use them appropriately. Most significantly, this enables domain checkpointing (xm save -c). Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Jeremy Fitzhardinge 提交于
On resume, the vcpu timer modes will not be restored. The timer infrastructure doesn't do this for us, since it assumes the cpus are offline. We can just poke the other vcpus into the right mode directly though. Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Jeremy Fitzhardinge 提交于
If we're using vcpu_info mapping, then make sure its restored on all processors before relasing them from stop_machine. The only complication is that if this fails, we can't continue because we've already made assumptions that the mapping is available (baked in calls to the _direct versions of the functions, for example). Fortunately this can only happen with a 32-bit hypervisor, which may possibly run out of mapping space. On a 64-bit hypervisor, this is a non-issue. Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Jeremy Fitzhardinge 提交于
When operating on an unpinned pagetable (ie, one under construction or destruction), it isn't necessary to use a hypercall to update a pud/pmd entry. Jan Beulich observed that a similar optimisation avoided many thousands of hypercalls while doing a kernel build. One tricky part is that early in the kernel boot there's no page structure, so we can't check to see if the page is pinned. In that case, we just always use the hypercall. Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Jan Beulich <jbeulich@novell.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
-tip testing found the following xen-console symbols trouble: ERROR: "get_phys_to_machine" [drivers/video/xen-fbfront.ko] undefined! ERROR: "get_phys_to_machine" [drivers/net/xen-netfront.ko] undefined! ERROR: "get_phys_to_machine" [drivers/input/xen-kbdfront.ko] undefined! with: http://redhat.com/~mingo/misc/config-Mon_Jun__2_12_25_13_CEST_2008.bad
-
- 28 5月, 2008 1 次提交
-
-
由 Ingo Molnar 提交于
-tip tree auto-testing found the following early bootup hang: --------------> get_memcfg_from_srat: assigning address to rsdp RSD PTR v0 [Nvidia] BUG: Int 14: CR2 ffd00040 EDI 8092fbfe ESI ffd00040 EBP 80b0aee8 ESP 80b0aed0 EBX 000f76f0 EDX 0000000e ECX 00000003 EAX ffd00040 err 00000000 EIP 802c055a CS 00000060 flg 00010006 Stack: ffd00040 80bc78d0 80b0af6c 80b1dbfe 8093d8ba 00000008 80b42810 80b4ddb4 80b42842 00000000 80b0af1c 801079c8 808e724e 00000000 80b42871 802c0531 00000100 00000000 0003fff0 80b0af40 80129999 00040100 00040100 00000000 Pid: 0, comm: swapper Not tainted 2.6.26-rc4-sched-devel.git #570 [<802c055a>] ? strncmp+0x11/0x25 [<80b1dbfe>] ? get_memcfg_from_srat+0xb4/0x568 [<801079c8>] ? mcount_call+0x5/0x9 [<802c0531>] ? strcmp+0xa/0x22 [<80129999>] ? printk+0x38/0x3a [<80129999>] ? printk+0x38/0x3a [<8011b122>] ? memory_present+0x66/0x6f [<80b216b4>] ? setup_memory+0x13/0x40c [<80b16b47>] ? propagate_e820_map+0x80/0x97 [<80b1622a>] ? setup_arch+0x248/0x477 [<80129999>] ? printk+0x38/0x3a [<80b11759>] ? start_kernel+0x6e/0x2eb [<80b110fc>] ? i386_start_kernel+0xeb/0xf2 ======================= <------ with this config: http://redhat.com/~mingo/misc/config-Wed_May_28_01_33_33_CEST_2008.bad The thing is, the crash makes little sense at first sight. We crash on a benign-looking printk. The code around it got changed in -tip but checking those topic branches individually did not reproduce the bug. Bisection led to this commit: | d5edbc1f is first bad commit | commit d5edbc1f | Author: Jeremy Fitzhardinge <jeremy@goop.org> | Date: Mon May 26 23:31:22 2008 +0100 | | xen: add p2m mfn_list_list Which is somewhat surprising, as on native hardware Xen client side should have little to no side-effects. After some head scratching, it turns out the following happened: randconfig enabled the following Xen options: CONFIG_XEN=y CONFIG_XEN_MAX_DOMAIN_MEMORY=8 # CONFIG_XEN_BLKDEV_FRONTEND is not set # CONFIG_XEN_NETDEV_FRONTEND is not set CONFIG_HVC_XEN=y # CONFIG_XEN_BALLOON is not set which activated this piece of code in arch/x86/xen/mmu.c: > @@ -69,6 +69,13 @@ > __attribute__((section(".data.page_aligned"))) = > { [ 0 ... TOP_ENTRIES - 1] = &p2m_missing[0] }; > > +/* Arrays of p2m arrays expressed in mfns used for save/restore */ > +static unsigned long p2m_top_mfn[TOP_ENTRIES] > + __attribute__((section(".bss.page_aligned"))); > + > +static unsigned long p2m_top_mfn_list[TOP_ENTRIES / P2M_ENTRIES_PER_PAGE] > + __attribute__((section(".bss.page_aligned"))); The problem is, you must only put variables into .bss.page_aligned that have a _size_ that is _exactly_ page aligned. In this case the size of p2m_top_mfn_list is not page aligned: 80b8d000 b p2m_top_mfn 80b8f000 b p2m_top_mfn_list 80b8f008 b softirq_stack 80b97008 b hardirq_stack 80b9f008 b bm_pte So all subsequent variables get unaligned which, depending on luck, breaks the kernel in various funny ways. In this case what killed the kernel first was the misaligned bootmap pte page, resulting in that creative crash above. Anyway, this was a fun bug to track down :-) I think the moral is that .bss.page_aligned is a dangerous construct in its current form, and the symptoms of breakage are very non-trivial, so i think we need build-time checks to make sure all symbols in .bss.page_aligned are truly page aligned. The Xen fix below gets the kernel booting again. Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 27 5月, 2008 16 次提交
-
-
由 Jeremy Fitzhardinge 提交于
Hook into the device model to make sure that timekeeping's resume handler is called. This deals with our clocksource's non-monotonicity over the save/restore. Explicitly call clock_has_changed() to make sure that all the timers get retriggered properly. Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Jeremy Fitzhardinge 提交于
This patch implements Xen save/restore and migration. Saving is triggered via xenbus, which is polled in drivers/xen/manage.c. When a suspend request comes in, the kernel prepares itself for saving by: 1 - Freeze all processes. This is primarily to prevent any partially-completed pagetable updates from confusing the suspend process. If CONFIG_PREEMPT isn't defined, then this isn't necessary. 2 - Suspend xenbus and other devices 3 - Stop_machine, to make sure all the other vcpus are quiescent. The Xen tools require the domain to run its save off vcpu0. 4 - Within the stop_machine state, it pins any unpinned pgds (under construction or destruction), performs canonicalizes various other pieces of state (mostly converting mfns to pfns), and finally 5 - Suspend the domain Restore reverses the steps used to save the domain, ending when all the frozen processes are thawed. Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Jeremy Fitzhardinge 提交于
When saving a domain, the Xen tools need to remap all our mfns to portable pfns. In order to remap our p2m table, it needs to know where all its pages are, so maintain the references to the p2m table for it to use. Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Jeremy Fitzhardinge 提交于
Rename dummy_shared_info to xen_dummy_shared_info and make it non-static, in anticipation of users outside of enlighten.c Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Jeremy Fitzhardinge 提交于
When using sparsemem and memory hotplug, the kernel's pseudo-physical address space can be discontigious. Previously this was dealt with by having the upper parts of the radix tree stubbed off. Unfortunately, this is incompatible with save/restore, which requires a complete p2m table. The solution is to have a special distinguished all-invalid p2m leaf page, which we can point all the hole areas at. This allows the tools to see a complete p2m table, but it only costs a page for all memory holes. It also simplifies the code since it removes a few special cases. Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Jeremy Fitzhardinge 提交于
Add a config option to set the max size of a Xen domain. This is used to scale the size of the physical-to-machine array; it ends up using around 1 page/GByte, so there's no reason to be very restrictive. For a 32-bit guest, the default value of 8GB is probably sufficient; there's not much point in giving a 32-bit machine much more memory than that. Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Jeremy Fitzhardinge 提交于
We now support the use of memory hotplug, so the physical to machine page mapping structure must be dynamic. This is implemented as a two-level radix tree structure, which allows us to efficiently incrementally allocate memory for the p2m table as new pages are added. Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Jeremy Fitzhardinge 提交于
Make sure resched interrupts appear in /proc/interrupts in the proper place. Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Isaku Yamahata 提交于
move arch/x86/xen/manage.c under drivers/xen/to share codes with x86 and ia64. ia64/xen also uses manage.c Signed-off-by: NIsaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Jeremy Fitzhardinge 提交于
For some perverse reason, if you call add_preferred_console() it prevents setup_early_printk() from successfully enabling the boot console - unless you make it a preferred console too... Also, make xenboot console output distinct from normal console output, since it gets repeated when the console handover happens, and the duplicated output is confusing without disambiguation. Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Markus Armbruster <armbru@redhat.com> Cc: Gerd Hoffmann <kraxel@redhat.com>
-
由 Markus Armbruster 提交于
Without console= arguments on the kernel command line, the first console to register becomes enabled and the preferred console (the one behind /dev/console). This is normally tty (assuming CONFIG_VT_CONSOLE is enabled, which it commonly is). This is okay as long tty is a useful console. But unless we have the PV framebuffer, and it is enabled for this domain, tty0 in domU is merely a dummy. In that case, we want the preferred console to be the Xen console hvc0, and we want it without having to fiddle with the kernel command line. Commit b8c2d3df did that for us. Since we now have the PV framebuffer, we want to enable and prefer tty again, but only when PVFB is enabled. But even then we still want to enable the Xen console as well. Problem: when tty registers, we can't yet know whether the PVFB is enabled. By the time we can know (xenstore is up), the console setup game is over. Solution: enable console tty by default, but keep hvc as the preferred console. Change the preferred console to tty when PVFB probes successfully, unless we've been given console kernel parameters. Signed-off-by: NMarkus Armbruster <armbru@redhat.com> Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Jeremy Fitzhardinge 提交于
Add pte_flags() to extract the flags from a pte. This is a special case of pte_val() which is only guaranteed to return the pte's flags correctly; the page number may be corrupted or missing. The intent is to allow paravirt implementations to return pte flags without having to do any translation of the page number (most notably, Xen). Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Jeremy Fitzhardinge 提交于
When enabling interrupts, we don't need to worry about preemption, because we either enter with interrupts disabled - so no preemption - or the caller is confused and is re-enabling interrupts on some indeterminate processor. Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Jeremy Fitzhardinge 提交于
The guest can legitimately change things like cr4.OSFXSR and OSXMMEXCPT, so let it. Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Jeremy Fitzhardinge 提交于
Use the new sched_op hypercall, mainly because xenner doesn't support the old one. Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Jeremy Fitzhardinge 提交于
Xen will trap and emulate clts, but its better to use a hypercall. Also, xenner doesn't handle clts. Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 23 5月, 2008 2 次提交
-
-
由 Jan Beulich 提交于
While I realize that the function isn't currently being used, I still think an obvious mistake like this should be corrected. Signed-off-by: NJan Beulich <jbeulich@novell.com> Acked-by: NJeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Jeremy Fitzhardinge 提交于
Non-PAE operation has been deprecated in Xen for a while, and is rarely tested or used. xen-unstable has now officially dropped non-PAE support. Since Xen/pvops' non-PAE support has also been broken for a while, we may as well completely drop it altogether. Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 28 4月, 2008 1 次提交
-
-
由 Christoph Lameter 提交于
Xen uses bitops to manipulate page flags. Make it use proper page flag functions. Signed-off-by: NChristoph Lameter <clameter@sgi.com> Cc: Andy Whitcroft <apw@shadowen.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 26 4月, 2008 1 次提交
-
-
由 Akinobu Mita 提交于
cpu_online(), cpu_present(), for_each_possible_cpu(), num_possible_cpus() Signed-off-by: NAkinobu Mita <akinobu.mita@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 25 4月, 2008 4 次提交
-
-
由 Jeremy Fitzhardinge 提交于
There's no real reason we can't support sparsemem/discontigmem, so do so. This is mostly useful to support hotplug memory. Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Jeremy Fitzhardinge 提交于
xen_sysexit and xen_iret were doing essentially the same thing. Rather than having a separate implementation for xen_sysexit, we can just strip the stack back to an iret frame and jump into xen_iret. This removes a lot of code and complexity - specifically, another critical region. Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Jeremy Fitzhardinge 提交于
The usual pagetable locking protocol doesn't seem to apply to updates to init_mm, so don't rely on preemption being disabled in xen_set_pte_at on init_mm. Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Jeremy Fitzhardinge 提交于
Various places in the kernel flush the tlb even though preemption doens't guarantee the tlb flush is happening on any particular CPU. In many cases this doesn't seem to matter, so don't make a fuss about it. Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-