- 18 3月, 2009 1 次提交
-
-
由 Jeremy Fitzhardinge 提交于
Impact: crash fix head_32.S needs to map the kernel itself, and enough space so that mm/init.c can allocate space from the e820 allocator for the linear map of low memory. Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 15 3月, 2009 4 次提交
-
-
由 Yinghai Lu 提交于
Impact: makes vmlinux section information more useful Don't use ram after _end blindly for pagetables. aka init pages is before _end put those pg table into .bss [Adapted to use brk segment - Jeremy] v2: keep initial page table up to 512M only. v4: put initial page tables just before _end Signed-off-by: NYinghai Lu <yinghai@kernel.org> Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Jeremy Fitzhardinge 提交于
Impact: new interface; remove hard-coded limit Add RESERVE_BRK(name, size) macro to reserve space in the brk area. This should be a conservative (ie, larger) estimate of how much space might possibly be required from the brk area. Any unused space will be freed, so there's no real downside on making the reservation too large (within limits). The name should be unique within a given file, and somewhat descriptive. The C definition of RESERVE_BRK() ends up being more complex than one would expect to work around a cluster of gcc infelicities: The first attempt was to simply try putting __section(.brk_reservation) on a variable. This doesn't work because it ends up making it a @progbits section, which gets actual space allocated in the vmlinux executable. The second attempt was to emit the space into a section using asm, but gcc doesn't allow arguments to be passed to file-level asm() statements, making it hard to pass in the size. The final attempt is to wrap the asm() in a function to allow it to have arguments, and put the function itself into the .discard section, which vmlinux*.lds drops entirely from the emitted vmlinux. Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Yinghai Lu 提交于
Impact: simplification We only need to map the kernel in head_32.S, not the whole of lowmem. We use 512MB as a reasonable (but arbitrary) limit on the maximum size of the kernel image. Signed-off-by: NYinghai Lu <yinghai@kernel.org> Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Jeremy Fitzhardinge 提交于
Impact: use new interface instead of previous ad hoc implementation Rather than having special purpose init_pg_table_start/end variables to delimit the kernel pagetable built by head_32.S, just use the brk mechanism to extend the bss for the new pagetable. This patch removes init_pg_table_start/end and pg0, defines __brk_base (which is page-aligned and immediately follows _end), initializes the brk region to start there, and uses it for the 32-bit pagetable. Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
- 14 2月, 2009 1 次提交
-
-
由 Jeremy Fitzhardinge 提交于
In general, the only definitions that assembly files can use are in _types.S headers (where available), so convert them. Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
-
- 11 2月, 2009 1 次提交
-
-
由 Tejun Heo 提交于
Impact: fix x86_32 stack protector Brian Gerst found out that %gs was being initialized to stack_canary instead of stack_canary - 20, which basically gave the same canary value for all threads. Fixing this also exposed the following bugs. * cpu_idle() didn't call boot_init_stack_canary() * stack canary switching in switch_to() was being done too late making the initial run of a new thread use the old stack canary value. Fix all of them and while at it update comment in cpu_idle() about calling boot_init_stack_canary(). Reported-by: NBrian Gerst <brgerst@gmail.com> Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 10 2月, 2009 1 次提交
-
-
由 Tejun Heo 提交于
Impact: stack protector for x86_32 Implement stack protector for x86_32. GDT entry 28 is used for it. It's set to point to stack_canary-20 and have the length of 24 bytes. CONFIG_CC_STACKPROTECTOR turns off CONFIG_X86_32_LAZY_GS and sets %gs to the stack canary segment on entry. As %gs is otherwise unused by the kernel, the canary can be anywhere. It's defined as a percpu variable. x86_32 exception handlers take register frame on stack directly as struct pt_regs. With -fstack-protector turned on, gcc copies the whole structure after the stack canary and (of course) doesn't copy back on return thus losing all changed. For now, -fno-stack-protector is added to all files which contain those functions. We definitely need something better. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 26 1月, 2009 2 次提交
-
-
由 Ingo Molnar 提交于
Impact: add a stack dump to early IRQs/faults Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
Impact: cleanup Remove such constructs: #ifdef CONFIG_EARLY_PRINTK call early_printk #else call printk #endif Not only are they ugly, they are also pointless: a call to printk() maps to early_printk during early bootup anyway, if CONFIG_EARLY_PRINTK is enabled. Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 21 1月, 2009 1 次提交
-
-
由 Brian Gerst 提交于
Impact: cleanup %fs is currently set to __KERNEL_DS at boot, and conditionally switched to __KERNEL_PERCPU for secondary cpus. Instead, initialize GDT_ENTRY_PERCPU to the same attributes as GDT_ENTRY_KERNEL_DS and set %fs to __KERNEL_PERCPU unconditionally. Signed-off-by: NBrian Gerst <brgerst@gmail.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
- 11 10月, 2008 1 次提交
-
-
由 Suresh Siddha 提交于
Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: arjan@linux.intel.com Cc: venkatesh.pallipadi@intel.com Cc: jeremy@goop.org Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 28 7月, 2008 1 次提交
-
-
由 Thomas Gleixner 提交于
commit 3e970473 ("x86: boot secondary cpus through initial_code") causes the kernel to crash when a CPU is brought online after the read only sections have been write protected. The write to initial_code in do_boot_cpu() fails. Move inital_code to .cpuinit.data section. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NH. Peter Anvin <hpa@zytor.com>
-
- 08 7月, 2008 2 次提交
-
-
由 Glauber Costa 提交于
remove "initialize_secondary". Boot both architectures via initial_code. Signed-off-by: NGlauber Costa <gcosta@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Glauber Costa 提交于
x86_64 jumps to whatever is written in "initial_code" symbol, instead of a fixed address. Do it for i386 too. It will allow us to integrate more of the smp boot code. Signed-off-by: NGlauber Costa <gcosta@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 13 6月, 2008 1 次提交
-
-
由 Joe Korty 提交于
On Mon, May 19, 2008 at 04:10:02PM -0700, Linus Torvalds wrote: > It also causes these warnings on 32-bit PAE: > > AS arch/x86/kernel/head_32.o > arch/x86/kernel/head_32.S: Assembler messages: > arch/x86/kernel/head_32.S:225: Warning: left operand is a bignum; integer 0 assumed > arch/x86/kernel/head_32.S:609: Warning: left operand is a bignum; integer 0 assumed > > and I do not see why (the end result seems to be identical). Fix head_32.S gcc bignum warnings when CONFIG_PAE=y. arch/x86/kernel/head_32.S: Assembler messages: arch/x86/kernel/head_32.S:225: Warning: left operand is a bignum; integer 0 assumed arch/x86/kernel/head_32.S:609: Warning: left operand is a bignum; integer 0 assumed The assembler was stumbling over the 64-bit constant 0x100000000 in the KPMDS #define. Testing: a cmp(1) on head_32.o before and after shows the binary is unchanged. Signed-off-by: Joe Korty <joe.korty@ccur.com Cc: Hugh Dickins <hugh@veritas.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Gabriel C <nix.or.die@googlemail.com> Cc: Keith Packard <keithp@keithp.com> Cc: "Pallipadi Venkatesh" <venkatesh.pallipadi@intel.com> Cc: Eric Anholt <eric@anholt.net> Cc: "Siddha Suresh B" <suresh.b.siddha@intel.com> Cc: bugme-daemon@bugzilla.kernel.org Cc: airlied@linux.ie Cc: "Barnes Jesse" <jesse.barnes@intel.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 03 6月, 2008 1 次提交
-
-
由 Yinghai Lu 提交于
on 32-bit in head_32.S after initial page table is done, we get initial max_pfn_mapped, and then kernel_physical_mapping_init will give us a final one. We need to use that to make sure find_e820_area will get valid addresses for boot_map and for NODE_DATA(0) on numa32. XEN PV and lguest may need to assign max_pfn_mapped too. Signed-off-by: NYinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 31 5月, 2008 1 次提交
-
-
由 Yinghai Lu 提交于
introduce init_pg_table_start, so xen PV could specify the value. Signed-off-by: NYinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 01 5月, 2008 1 次提交
-
-
由 Vegard Nossum 提交于
The .asciz directive takes any number of strings, but each one is zero- terminated, and string pasting is not done as in C. That results in only the first line being output. Replace .asciz with multiple .ascii directives and terminate with .asciz. Signed-off-by: NVegard Nossum <vegard.nossum@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 20 4月, 2008 1 次提交
-
-
由 WANG Cong 提交于
Remove old comments that include the old arch/i386 directory. Signed-off-by: NWANG Cong <xiyou.wangcong@gmail.com> Acked-by: NH. Peter Anvin <hpa@zytor.com> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 17 4月, 2008 1 次提交
-
-
由 Eric W. Biederman 提交于
Copy x86_64 and add a head32.c so we can start moving early architecture initialization out of assembly. [ Sam Ravnborg <sam@ravnborg.org>: updated it to x86 ] Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> Signed-off-by: NSam Ravnborg <sam@ravnborg.org> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 22 3月, 2008 1 次提交
-
-
由 Jiri Slaby 提交于
The fault_msg text is not explictly nul terminated now in startup assembly. Do so by converting .ascii to .asciz. Signed-off-by: NJiri Slaby <jirislaby@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 26 2月, 2008 1 次提交
-
-
由 Adrian Bunk 提交于
There doesn't seem to be any reason for swapper_pg_pmd being global. Signed-off-by: NAdrian Bunk <bunk@kernel.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 19 2月, 2008 1 次提交
-
-
由 Adrian Bunk 提交于
Signed-off-by: NAdrian Bunk <bunk@kernel.org> Cc: Ian Campbell <ijc@hellion.org.uk> Cc: hpa@zytor.com Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 10 2月, 2008 1 次提交
-
-
由 Ian Campbell 提交于
Specifically the boot time page tables in a CONFIG_X86_PAE=y enabled kernel are in PAE format. early_ioremap is updated to use the standard page table accessors. Clear any mappings beyond max_low_pfn from the boot page tables in native_pagetable_setup_start because the initial mappings can extend beyond the range of physical memory and into the vmalloc area. Derived from patches by Eric Biederman and H. Peter Anvin. [ jeremy@goop.org: PAE swapper_pg_dir needs to be page-sized fix ] Signed-off-by: NIan Campbell <ijc@hellion.org.uk> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Mika Penttilä <mika.penttila@kolumbus.fi> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 30 1月, 2008 4 次提交
-
-
由 Sam Ravnborg 提交于
fix: > WARNING: vmlinux.o(.data+0x4): Section mismatch: reference to .init.text:lguest_entry (between 'subarch_entries' and 'stack_start') Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Ian Campbell 提交于
I am preparing to convert the boot time page table to the kernels native format. To achieve that I need to enable PAE. Enabling PSE and the no execute bit would not hurt. So this patch modifies the boot cpu path to execute all of the kernels enable code if and only if we have the proper bits set in mmu_cr4_features. Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> Signed-off-by: NIan Campbell <ijc@hellion.org.uk> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Mika Penttilä <mika.penttila@kolumbus.fi> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Ian Campbell 提交于
Currently in head_32.S there are two ways we test to see if we are the boot cpu. By looking at %ebx and by looking at the static variable ready. When changing things around I have found that it gets tricky to preserve %ebx. So this patch just switches head.S over to the more reliable test of always using ready. Hopefully later we can kill these tests entirely. Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> Signed-off-by: NIan Campbell <ijc@hellion.org.uk> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Mika Penttilä <mika.penttila@kolumbus.fi> Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Ingo Molnar 提交于
Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 02 1月, 2008 1 次提交
-
-
由 Rusty Russell 提交于
After 17d57a92 ("x86: fix x86-32 early fixmap initialization.") removing lg.ko caused a printk from vunmap: mm/memory.c:115: bad pgd 004b3027. On the second use after module load, the kernel crashes. This fixes the immediate problem (accessed and dirty bits not set as expected in pmd_none_or_clear_bad). I can't see why this would cause a crash, but I haven't been able to reproduce it once this is applied. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 04 12月, 2007 1 次提交
-
-
由 Eric W. Biederman 提交于
pageexec@freemail.hu writes: > i've just noticed that the chunk in i386/kernel/head.S ended up in a > weird place, namely, it's not going to be executed as it's just after > a 'jmp 3f' and before startup_32_smp, probably not what you intended. > on a sidenote, the whole thing can be done in a single insn, like: > > movl $(swapper_pg_pmd - __PAGE_OFFSET + 0x067), (swapper_pg_dir - > __PAGE_OFFSET+ 4092) Thanks for the reminder I thought we had fixed this problem a while ago. Needed to get fixed virtual address for USB debug and earlycon with mmio. Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 24 10月, 2007 1 次提交
-
-
由 H. Peter Anvin 提交于
Make <asm/setup.h> usable by the boot code. Clean up vestiges of the old command-line protocol from setup.h and head_32.S (it is still supported from the boot loader point of view, since it is converted to the new command-line protocol by the boot code.) Signed-off-by: NH. Peter Anvin <hpa@zytor.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 22 10月, 2007 1 次提交
-
-
由 Rusty Russell 提交于
This patch uses the updated boot protocol to do paravirtualized boot. If the boot version is >= 2.07, then it will do two things: 1. Check the bootparams loadflags to see if we should reload the segment registers and clear interrupts. This is appropriate for normal native boot and some paravirtualized environments, but inapproprate for others. 2. Check the hardware architecture, and dispatch to the appropriate kernel entrypoint. If the bootloader doesn't set this, then we simply do the normal boot sequence. Signed-off-by: NJeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Acked-by: NH. Peter Anvin <hpa@zytor.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Vivek Goyal <vgoyal@in.ibm.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Zachary Amsden <zach@vmware.com> Cc: Andi Kleen <ak@suse.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 18 10月, 2007 2 次提交
-
-
由 Ingo Molnar 提交于
improve early fault output. old format: Int 14: CR2 010001e3 err 00000002 EIP c011f2f9 CS 00000060 flags 00010046 Stack: c073695e c0791c10 00000000 ffffffff 00000000 01000000 00001000 c0791c10 new format: BUG: Int 14: CR2 010001e3 EDI c1000000 ESI c0693c10 EBP c0637f9c ESP c0637f08 EBX 00000000 EDX 0000000e ECX 00000000 EAX 010001e3 err 00000002 EIP c0123119 CS 00000060 flg 00010046 Stack: c064d589 c0693000 00000000 c0637f60 00c001e3 01000000 00038000 00000163 00000000 00000163 00000000 ffffffff 00038000 00000000 00000000 00001000 00001000 00000000 c0637f88 c06509be c0a2ae60 00001000 00001000 00000000 Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Ingo Molnar 提交于
To preserve the DMA pool in CONFIG_DEBUG_PAGEALLOC=y kernels, we'll allocate pagetables from above the 16MB DMA limit, so we'll have to set up boot pagetables to cover 16MB more RAM (worst-case). Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 11 10月, 2007 3 次提交
-
-
由 Thomas Gleixner 提交于
Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Thomas Gleixner 提交于
Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Thomas Gleixner 提交于
Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 12 8月, 2007 1 次提交
-
-
由 Andi Kleen 提交于
Fix WARNING: vmlinux.o(.text+0x183): Section mismatch: reference to .init.text:start_kernel (between 'is386' and 'check_x87') Signed-off-by: NAndi Kleen <ak@suse.de> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 18 7月, 2007 1 次提交
-
-
由 Jeremy Fitzhardinge 提交于
This patch is a rollup of all the core pieces of the Xen implementation, including: - booting and setup - pagetable setup - privileged instructions - segmentation - interrupt flags - upcalls - multicall batching BOOTING AND SETUP The vmlinux image is decorated with ELF notes which tell the Xen domain builder what the kernel's requirements are; the domain builder then constructs the address space accordingly and starts the kernel. Xen has its own entrypoint for the kernel (contained in an ELF note). The ELF notes are set up by xen-head.S, which is included into head.S. In principle it could be linked separately, but it seems to provoke lots of binutils bugs. Because the domain builder starts the kernel in a fairly sane state (32-bit protected mode, paging enabled, flat segments set up), there's not a lot of setup needed before starting the kernel proper. The main steps are: 1. Install the Xen paravirt_ops, which is simply a matter of a structure assignment. 2. Set init_mm to use the Xen-supplied pagetables (analogous to the head.S generated pagetables in a native boot). 3. Reserve address space for Xen, since it takes a chunk at the top of the address space for its own use. 4. Call start_kernel() PAGETABLE SETUP Once we hit the main kernel boot sequence, it will end up calling back via paravirt_ops to set up various pieces of Xen specific state. One of the critical things which requires a bit of extra care is the construction of the initial init_mm pagetable. Because Xen places tight constraints on pagetables (an active pagetable must always be valid, and must always be mapped read-only to the guest domain), we need to be careful when constructing the new pagetable to keep these constraints in mind. It turns out that the easiest way to do this is use the initial Xen-provided pagetable as a template, and then just insert new mappings for memory where a mapping doesn't already exist. This means that during pagetable setup, it uses a special version of xen_set_pte which ignores any attempt to remap a read-only page as read-write (since Xen will map its own initial pagetable as RO), but lets other changes to the ptes happen, so that things like NX are set properly. PRIVILEGED INSTRUCTIONS AND SEGMENTATION When the kernel runs under Xen, it runs in ring 1 rather than ring 0. This means that it is more privileged than user-mode in ring 3, but it still can't run privileged instructions directly. Non-performance critical instructions are dealt with by taking a privilege exception and trapping into the hypervisor and emulating the instruction, but more performance-critical instructions have their own specific paravirt_ops. In many cases we can avoid having to do any hypercalls for these instructions, or the Xen implementation is quite different from the normal native version. The privileged instructions fall into the broad classes of: Segmentation: setting up the GDT and the GDT entries, LDT, TLS and so on. Xen doesn't allow the GDT to be directly modified; all GDT updates are done via hypercalls where the new entries can be validated. This is important because Xen uses segment limits to prevent the guest kernel from damaging the hypervisor itself. Traps and exceptions: Xen uses a special format for trap entrypoints, so when the kernel wants to set an IDT entry, it needs to be converted to the form Xen expects. Xen sets int 0x80 up specially so that the trap goes straight from userspace into the guest kernel without going via the hypervisor. sysenter isn't supported. Kernel stack: The esp0 entry is extracted from the tss and provided to Xen. TLB operations: the various TLB calls are mapped into corresponding Xen hypercalls. Control registers: all the control registers are privileged. The most important is cr3, which points to the base of the current pagetable, and we handle it specially. Another instruction we treat specially is CPUID, even though its not privileged. We want to control what CPU features are visible to the rest of the kernel, and so CPUID ends up going into a paravirt_op. Xen implements this mainly to disable the ACPI and APIC subsystems. INTERRUPT FLAGS Xen maintains its own separate flag for masking events, which is contained within the per-cpu vcpu_info structure. Because the guest kernel runs in ring 1 and not 0, the IF flag in EFLAGS is completely ignored (and must be, because even if a guest domain disables interrupts for itself, it can't disable them overall). (A note on terminology: "events" and interrupts are effectively synonymous. However, rather than using an "enable flag", Xen uses a "mask flag", which blocks event delivery when it is non-zero.) There are paravirt_ops for each of cli/sti/save_fl/restore_fl, which are implemented to manage the Xen event mask state. The only thing worth noting is that when events are unmasked, we need to explicitly see if there's a pending event and call into the hypervisor to make sure it gets delivered. UPCALLS Xen needs a couple of upcall (or callback) functions to be implemented by each guest. One is the event upcalls, which is how events (interrupts, effectively) are delivered to the guests. The other is the failsafe callback, which is used to report errors in either reloading a segment register, or caused by iret. These are implemented in i386/kernel/entry.S so they can jump into the normal iret_exc path when necessary. MULTICALL BATCHING Xen provides a multicall mechanism, which allows multiple hypercalls to be issued at once in order to mitigate the cost of trapping into the hypervisor. This is particularly useful for context switches, since the 4-5 hypercalls they would normally need (reload cr3, update TLS, maybe update LDT) can be reduced to one. This patch implements a generic batching mechanism for hypercalls, which gets used in many places in the Xen code. Signed-off-by: NJeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: NChris Wright <chrisw@sous-sol.org> Cc: Ian Pratt <ian.pratt@xensource.com> Cc: Christian Limpach <Christian.Limpach@cl.cam.ac.uk> Cc: Adrian Bunk <bunk@stusta.de>
-