1. 31 May, 2008 1 commit
  2. 01 May, 2008 1 commit
  3. 20 April, 2008 1 commit
  4. 17 April, 2008 1 commit
  5. 22 March, 2008 1 commit
  6. 26 February, 2008 1 commit
  7. 19 February, 2008 1 commit
  8. 10 February, 2008 1 commit
    • I
      x86: construct 32-bit boot time page tables in native format. · 551889a6
      Ian Campbell committed
      Specifically the boot time page tables in a CONFIG_X86_PAE=y enabled
      kernel are in PAE format.
      
      early_ioremap is updated to use the standard page table accessors.
      
      Clear any mappings beyond max_low_pfn from the boot page tables in
      native_pagetable_setup_start because the initial mappings can extend
      beyond the range of physical memory and into the vmalloc area.
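The clearing step described above can be pictured roughly as follows. This is only a sketch, not the code from native_pagetable_setup_start: the function name, the flat array of 32-bit entries, and the assumption that max_low_pfn is the first page frame number past low memory are all illustrative.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: ptes[i] is assumed to map page frame (first_pfn + i).
 * Every populated entry at or beyond max_low_pfn is zeroed so that no boot
 * mapping remains past the end of low memory.
 * Returns the number of entries that were cleared.
 */
static size_t clear_beyond_max_low_pfn(uint32_t *ptes, size_t nr_ptes,
                                       unsigned long first_pfn,
                                       unsigned long max_low_pfn)
{
    size_t cleared = 0;

    for (size_t i = 0; i < nr_ptes; i++) {
        unsigned long pfn = first_pfn + i;

        if (pfn >= max_low_pfn && ptes[i] != 0) {
            ptes[i] = 0;   /* drop the stale boot-time mapping */
            cleared++;
        }
    }
    return cleared;
}
```

A real implementation would walk the page table levels through the standard accessors instead of indexing a flat array.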
      
      Derived from patches by Eric Biederman and H. Peter Anvin.
      
      [ jeremy@goop.org: PAE swapper_pg_dir needs to be page-sized fix ]
      Signed-off-by: Ian Campbell <ijc@hellion.org.uk>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Mika Penttilä <mika.penttila@kolumbus.fi>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  9. 30 January, 2008 4 commits
  10. 02 January, 2008 1 commit
  11. 04 December, 2007 1 commit
    • E
      x86: fix x86-32 early fixmap initialization. · 17d57a92
      Eric W. Biederman committed
      pageexec@freemail.hu writes:
      
      > i've just noticed that the chunk in i386/kernel/head.S ended up in a
      > weird place, namely, it's not going to be executed as it's just after
      > a 'jmp 3f' and before startup_32_smp, probably not what you intended.
      > on a sidenote, the whole thing can be done in a single insn, like:
      >
      > movl $(swapper_pg_pmd - __PAGE_OFFSET + 0x067), (swapper_pg_dir - __PAGE_OFFSET + 4092)
      
      Thanks for the reminder; I thought we had fixed this problem a while ago.
      
      This is needed to get a fixed virtual address for USB debug and earlycon with MMIO.
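The constants in the quoted movl can be unpacked as follows. This is a sketch under the assumption of the non-PAE 32-bit layout (4-byte entries, 1024 per page directory) and the standard x86 flag bits; the helper names are invented for illustration.

```c
#include <stdint.h>

/* Standard x86 page-table flag bits. */
#define PTE_PRESENT  0x001u
#define PTE_RW       0x002u
#define PTE_USER     0x004u
#define PTE_ACCESSED 0x020u
#define PTE_DIRTY    0x040u

/* Index of the directory entry stored at a byte offset, assuming
 * 4-byte (non-PAE) entries. */
static unsigned pde_index_from_offset(unsigned byte_offset)
{
    return byte_offset / (unsigned)sizeof(uint32_t);
}

/* The 0x067 constant, spelled out bit by bit. */
static uint32_t fixmap_pde_flags(void)
{
    return PTE_PRESENT | PTE_RW | PTE_USER | PTE_ACCESSED | PTE_DIRTY;
}

/* Combine the page-aligned physical address of the next-level table
 * with the flag bits to form one directory entry. */
static uint32_t make_pde(uint32_t table_phys, uint32_t flags)
{
    return (table_phys & ~0xfffu) | flags;
}
```

Offset 4092 is therefore entry 1023, the last slot of the directory, which in this layout covers the top 4 MB of the virtual address space.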
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  12. 24 October, 2007 1 commit
  13. 22 October, 2007 1 commit
    • R
      i386: paravirt boot sequence · a24e7851
      Rusty Russell committed
      This patch uses the updated boot protocol to do paravirtualized boot.
      If the boot version is >= 2.07, then it will do two things:
      
       1. Check the bootparams loadflags to see if we should reload the
          segment registers and clear interrupts.  This is appropriate
          for normal native boot and some paravirtualized environments, but
          inappropriate for others.
      
       2. Check the hardware architecture, and dispatch to the appropriate
          kernel entrypoint.  If the bootloader doesn't set this, then we
          simply do the normal boot sequence.
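The two checks above could be modelled along these lines. This is a sketch only: the struct, the field names, the KEEP_SEGMENTS bit position, and the convention that a zero architecture value means "not set by the bootloader" are assumptions made for illustration, not taken from the patch.

```c
#include <stdint.h>

#define BOOT_PROTO_2_07 0x0207u
#define KEEP_SEGMENTS   (1u << 6)   /* assumed loadflags bit */

/* Hypothetical subset of the boot parameters. */
struct boot_hdr_sketch {
    uint16_t version;            /* boot protocol version, e.g. 0x0207 */
    uint8_t  loadflags;
    uint32_t hardware_subarch;   /* 0 is assumed to mean "not set" */
};

enum entry_kind { ENTRY_NATIVE, ENTRY_SUBARCH };

struct boot_decision {
    int             reload_segments_and_cli;
    enum entry_kind entry;
    uint32_t        subarch;
};

static struct boot_decision decide_boot(const struct boot_hdr_sketch *h)
{
    /* Default: the normal native boot sequence. */
    struct boot_decision d = { 1, ENTRY_NATIVE, 0 };

    if (h->version < BOOT_PROTO_2_07)
        return d;                        /* old protocol: nothing to check */

    if (h->loadflags & KEEP_SEGMENTS)
        d.reload_segments_and_cli = 0;   /* step 1: leave segments alone */

    if (h->hardware_subarch != 0) {      /* step 2: dispatch on architecture */
        d.entry = ENTRY_SUBARCH;
        d.subarch = h->hardware_subarch;
    }
    return d;
}
```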
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Acked-by: H. Peter Anvin <hpa@zytor.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Vivek Goyal <vgoyal@in.ibm.com>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Zachary Amsden <zach@vmware.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  14. 18 October, 2007 2 commits
    • I
      i386: print better early fault info · 382f64ab
      Ingo Molnar committed
      Improve early fault output.
      
      old format:
      
       Int 14: CR2 010001e3  err 00000002  EIP c011f2f9  CS 00000060  flags 00010046
       Stack: c073695e c0791c10 00000000 ffffffff 00000000 01000000 00001000 c0791c10
      
      new format:
      
       BUG: Int 14: CR2 010001e3
            EDI c1000000  ESI c0693c10  EBP c0637f9c  ESP c0637f08
            EBX 00000000  EDX 0000000e  ECX 00000000  EAX 010001e3
            err 00000002  EIP c0123119   CS 00000060  flg 00010046
       Stack: c064d589 c0693000 00000000 c0637f60 00c001e3 01000000 00038000 00000163
              00000000 00000163 00000000 ffffffff 00038000 00000000 00000000 00001000
              00001000 00000000 c0637f88 c06509be c0a2ae60 00001000 00001000 00000000
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • I
      x86: prepare page allocator for high allocations on PAGEALLOC=y · 1e3e1972
      Ingo Molnar committed
      To preserve the DMA pool in CONFIG_DEBUG_PAGEALLOC=y kernels, we'll
      allocate pagetables from above the 16MB DMA limit, so we'll have to set
      up boot pagetables to cover 16MB more RAM (worst-case).
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  15. 11 October, 2007 3 commits
  16. 12 August, 2007 1 commit
  17. 18 July, 2007 1 commit
    • J
      xen: Core Xen implementation · 5ead97c8
      Jeremy Fitzhardinge committed
      This patch is a rollup of all the core pieces of the Xen
      implementation, including:
       - booting and setup
       - pagetable setup
       - privileged instructions
       - segmentation
       - interrupt flags
       - upcalls
       - multicall batching
      
      BOOTING AND SETUP
      
      The vmlinux image is decorated with ELF notes which tell the Xen
      domain builder what the kernel's requirements are; the domain builder
      then constructs the address space accordingly and starts the kernel.
      
      Xen has its own entrypoint for the kernel (contained in an ELF note).
      The ELF notes are set up by xen-head.S, which is included into head.S.
      In principle it could be linked separately, but it seems to provoke
      lots of binutils bugs.
      
      Because the domain builder starts the kernel in a fairly sane state
      (32-bit protected mode, paging enabled, flat segments set up), there's
      not a lot of setup needed before starting the kernel proper.  The main
      steps are:
        1. Install the Xen paravirt_ops, which is simply a matter of a
           structure assignment.
        2. Set init_mm to use the Xen-supplied pagetables (analogous to the
           head.S generated pagetables in a native boot).
        3. Reserve address space for Xen, since it takes a chunk at the top
           of the address space for its own use.
        4. Call start_kernel()
      
      PAGETABLE SETUP
      
      Once we hit the main kernel boot sequence, it will end up calling back
      via paravirt_ops to set up various pieces of Xen specific state.  One
      of the critical things which requires a bit of extra care is the
      construction of the initial init_mm pagetable.  Because Xen places
      tight constraints on pagetables (an active pagetable must always be
      valid, and must always be mapped read-only to the guest domain), we
      need to be careful when constructing the new pagetable to keep these
      constraints in mind.  It turns out that the easiest way to do this is
      to use the initial Xen-provided pagetable as a template, and then just
      insert new mappings for memory where a mapping doesn't already exist.
      
      This means that during pagetable setup, it uses a special version of
      xen_set_pte which ignores any attempt to remap a read-only page as
      read-write (since Xen will map its own initial pagetable as RO), but
      lets other changes to the ptes happen, so that things like NX are set
      properly.
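The behaviour of that special set_pte variant can be sketched as a pure function on PTE values. The name and the idea of passing the old entry in explicitly are illustrative assumptions; the real xen_set_pte variant operates on the live pagetable.

```c
#include <stdint.h>

#define PTE_PRESENT 0x1ull
#define PTE_RW      0x2ull
#define PTE_NX      (1ull << 63)

/*
 * Hypothetical model of the init-time rule: if the existing mapping is
 * present and read-only, a request to make it writable is ignored, while
 * every other bit of the new value (for example NX) is still applied.
 * Returns the value that would actually be written.
 */
static uint64_t init_set_pte_sketch(uint64_t old_pte, uint64_t new_pte)
{
    if ((old_pte & PTE_PRESENT) && !(old_pte & PTE_RW))
        new_pte &= ~PTE_RW;   /* keep Xen's read-only mapping read-only */

    return new_pte;
}
```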
      
      PRIVILEGED INSTRUCTIONS AND SEGMENTATION
      
      When the kernel runs under Xen, it runs in ring 1 rather than ring 0.
      This means that it is more privileged than user-mode in ring 3, but it
      still can't run privileged instructions directly.  Non-performance
      critical instructions are dealt with by taking a privilege exception
      and trapping into the hypervisor and emulating the instruction, but
      more performance-critical instructions have their own specific
      paravirt_ops.  In many cases we can avoid having to do any hypercalls
      for these instructions, or the Xen implementation is quite different
      from the normal native version.
      
      The privileged instructions fall into the broad classes of:
        Segmentation: setting up the GDT and the GDT entries, LDT,
           TLS and so on.  Xen doesn't allow the GDT to be directly
           modified; all GDT updates are done via hypercalls where the new
           entries can be validated.  This is important because Xen uses
           segment limits to prevent the guest kernel from damaging the
           hypervisor itself.
        Traps and exceptions: Xen uses a special format for trap entrypoints,
           so when the kernel wants to set an IDT entry, it needs to be
           converted to the form Xen expects.  Xen sets int 0x80 up specially
           so that the trap goes straight from userspace into the guest kernel
           without going via the hypervisor.  sysenter isn't supported.
        Kernel stack: The esp0 entry is extracted from the tss and provided to
           Xen.
        TLB operations: the various TLB calls are mapped into corresponding
           Xen hypercalls.
        Control registers: all the control registers are privileged.  The most
           important is cr3, which points to the base of the current pagetable,
           and we handle it specially.
      
      Another instruction we treat specially is CPUID, even though it's not
      privileged.  We want to control what CPU features are visible to the
      rest of the kernel, and so CPUID ends up going into a paravirt_op.
      Xen implements this mainly to disable the ACPI and APIC subsystems.
      
      INTERRUPT FLAGS
      
      Xen maintains its own separate flag for masking events, which is
      contained within the per-cpu vcpu_info structure.  Because the guest
      kernel runs in ring 1 and not 0, the IF flag in EFLAGS is completely
      ignored (and must be, because even if a guest domain disables
      interrupts for itself, it can't disable them overall).
      
      (A note on terminology: "events" and interrupts are effectively
      synonymous.  However, rather than using an "enable flag", Xen uses a
      "mask flag", which blocks event delivery when it is non-zero.)
      
      There are paravirt_ops for each of cli/sti/save_fl/restore_fl, which
      are implemented to manage the Xen event mask state.  The only thing
      worth noting is that when events are unmasked, we need to explicitly
      see if there's a pending event and call into the hypervisor to make
      sure it gets delivered.
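A minimal model of the four operations, assuming a per-vcpu structure with one pending byte and one mask byte. The struct layout, the function names, and the counter that stands in for the hypercall are hypothetical; real code would also need compiler barriers around the flag updates.

```c
#include <stdint.h>

#define IF_FLAG 0x200ul   /* EFLAGS.IF, used here as the "enabled" encoding */

/* Hypothetical stand-in for the relevant part of vcpu_info. */
struct vcpu_info_sketch {
    uint8_t upcall_pending;   /* an event is waiting for delivery */
    uint8_t upcall_mask;      /* non-zero blocks event delivery */
};

static unsigned forced_callbacks;   /* counts simulated hypervisor calls */

/* Stand-in for calling into the hypervisor to deliver a pending event. */
static void force_callback_sketch(struct vcpu_info_sketch *v)
{
    forced_callbacks++;
    v->upcall_pending = 0;
}

static void irq_disable_sketch(struct vcpu_info_sketch *v)   /* cli */
{
    v->upcall_mask = 1;
}

static void irq_enable_sketch(struct vcpu_info_sketch *v)    /* sti */
{
    v->upcall_mask = 0;
    if (v->upcall_pending)          /* unmasking: check for a pending event */
        force_callback_sketch(v);
}

static unsigned long save_fl_sketch(const struct vcpu_info_sketch *v)
{
    return v->upcall_mask ? 0 : IF_FLAG;
}

static void restore_fl_sketch(struct vcpu_info_sketch *v, unsigned long flags)
{
    if (flags & IF_FLAG)
        irq_enable_sketch(v);
    else
        irq_disable_sketch(v);
}
```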
      
      UPCALLS
      
      Xen needs a couple of upcall (or callback) functions to be implemented
      by each guest.  One is the event upcalls, which is how events
      (interrupts, effectively) are delivered to the guests.  The other is
      the failsafe callback, which is used to report errors in either
      reloading a segment register, or caused by iret.  These are
      implemented in i386/kernel/entry.S so they can jump into the normal
      iret_exc path when necessary.
      
      MULTICALL BATCHING
      
      Xen provides a multicall mechanism, which allows multiple hypercalls
      to be issued at once in order to mitigate the cost of trapping into
      the hypervisor.  This is particularly useful for context switches,
      since the 4-5 hypercalls they would normally need (reload cr3, update
      TLS, maybe update LDT) can be reduced to one.  This patch implements a
      generic batching mechanism for hypercalls, which gets used in many
      places in the Xen code.
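The batching idea can be sketched as a small queue that is flushed with a single simulated trap. The types, the fixed capacity, and the counters are illustrative assumptions and do not reflect the actual Xen multicall interface.

```c
#include <stddef.h>

#define BATCH_MAX 8

struct call_entry {
    unsigned long op;    /* which operation to perform */
    unsigned long arg;   /* its single argument, for simplicity */
};

struct call_batch {
    struct call_entry entries[BATCH_MAX];
    size_t count;    /* entries queued but not yet issued */
    size_t traps;    /* simulated trips into the hypervisor */
    size_t issued;   /* total operations issued so far */
};

/* Issue everything queued so far with one simulated trap. */
static void batch_flush(struct call_batch *b)
{
    if (b->count == 0)
        return;
    b->traps++;               /* stand-in for one multicall */
    b->issued += b->count;
    b->count = 0;
}

/* Queue one operation, flushing first if the batch is full. */
static void batch_add(struct call_batch *b, unsigned long op,
                      unsigned long arg)
{
    if (b->count == BATCH_MAX)
        batch_flush(b);
    b->entries[b->count].op = op;
    b->entries[b->count].arg = arg;
    b->count++;
}
```

In this model a context switch would queue its cr3, TLS and LDT updates with batch_add and call batch_flush once, paying for one trap instead of several.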
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Chris Wright <chrisw@sous-sol.org>
      Cc: Ian Pratt <ian.pratt@xensource.com>
      Cc: Christian Limpach <Christian.Limpach@cl.cam.ac.uk>
      Cc: Adrian Bunk <bunk@stusta.de>
  18. 17 July, 2007 1 commit
  19. 11 May, 2007 1 commit
    • E
      Revert "[PATCH] paravirt: Add startup infrastructure for paravirtualization" · 5a18c92a
      Eric W. Biederman committed
      This reverts commit c9ccf30d.
      
      Entering the kernel at startup_32 without passing our real mode data in
      %esi, and without guaranteeing that physical and virtual addresses are
      identity mapped makes head.S impossible to maintain.
      
      The only user of this infrastructure is lguest, which is not merged, so
      nothing we currently support will break by removing this over-designed
      nightmare, and only the pending lguest patches will be affected.  The
      pending Xen patches have a different entry point that they use.
      
      We are currently discussing what Xen and lguest need to do to boot the
      kernel in a more normal fashion, so using startup_32 in this weird manner is
      clearly not their long-term direction.
      
      So let's remove this code in head.S before it causes brain damage to people
      trying to maintain head.S.
      
      Cc: Chris Wright <chrisw@sous-sol.org>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Zachary Amsden <zach@vmware.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  20. 03 May, 2007 6 commits
  21. 13 February, 2007 6 commits
    • R
      [PATCH] i386: Rename cpu_gdt_descr and remove extern declaration from smpboot.c · 2a57ff1a
      Rusty Russell committed
      When I implemented the DECLARE_PER_CPU(var) macros, I was careful that
      people couldn't use "var" in a non-percpu context, by prepending
      percpu__.  I never considered that this would allow them to overload
      the same name for a per-cpu and a non-percpu variable.
      
      It is only one of many horrors in the i386 boot code, but let's rename
      the non-percpu cpu_gdt_descr to early_gdt_descr (not boot_gdt_descr,
      that's something else...)
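The name overloading described above can be reproduced with a toy version of the macros. The _SKETCH macro names are made up for illustration, and only the prefixing trick is modelled, not real per-cpu storage.

```c
/* Toy per-cpu macros: the variable's real symbol gets a prefix, so the
 * bare name cannot be used by accident in non-percpu code. */
#define DEFINE_PER_CPU_SKETCH(type, name)  static type percpu__##name
#define per_cpu_var_sketch(name)           percpu__##name

/* The "per-cpu" object; its actual symbol is percpu__cpu_gdt_descr. */
DEFINE_PER_CPU_SKETCH(int, cpu_gdt_descr) = 1;

/* Because of the prefix, an unrelated plain global may reuse the same
 * source-level name without any compiler complaint. */
static int cpu_gdt_descr = 2;

static int read_percpu(void)
{
    return per_cpu_var_sketch(cpu_gdt_descr);
}

static int read_plain(void)
{
    return cpu_gdt_descr;
}
```

Both objects compile side by side, which is the confusion the rename to early_gdt_descr removes.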
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andi Kleen <ak@suse.de>
    • R
      [PATCH] i386: paravirt unhandled fallthrough · 992af681
      Rusty Russell committed
      The current code simply calls "start_kernel" directly if we're under a
      hypervisor and no paravirt_ops backend wants us, because paravirt.c
      registers that as a backend.
      
      This was always a vain hope; start_kernel won't get far without setup.
      It's also impossible for paravirt_ops backends which don't sit in the
      arch/i386/kernel directory: they can't link before paravirt.o anyway.
      
      Keep it simple: if we pass all the registered paravirt probes, BUG().
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andi Kleen <ak@suse.de>
    • V
      [PATCH] i386: move startup_32() in text.head section · f8657e1b
      Vivek Goyal committed
      o The startup_32 entry point was in the .text section but also accessed
        some init data, which prompted MODPOST to generate warnings:
      
      WARNING: vmlinux - Section mismatch: reference to .init.data:boot_params from
      .text between '_text' (at offset 0xc0100029) and 'startup_32_smp'
      WARNING: vmlinux - Section mismatch: reference to .init.data:boot_params from
      .text between '_text' (at offset 0xc0100037) and 'startup_32_smp'
      WARNING: vmlinux - Section mismatch: reference to
      .init.data:init_pg_tables_end from .text between '_text' (at offset
      0xc0100099) and 'startup_32_smp'
      
      o startup_32 can't be moved to .init.text because this entry point has to
        be at the start of the bzImage. Hence startup_32 was moved to a new
        section, .text.head, and MODPOST was instructed not to generate warnings
        when init data is accessed from the .text.head section. This code has
        been audited.
      
      o The SMP boot-up code (startup_32_smp) can go into .init.text if CPU
        hotplug is not supported. Otherwise it generates more warnings:
      
      WARNING: vmlinux - Section mismatch: reference to .init.data:new_cpu_data from
      .text between 'checkCPUtype' (at offset 0xc0100126) and 'is486'
      WARNING: vmlinux - Section mismatch: reference to .init.data:new_cpu_data from
      .text between 'checkCPUtype' (at offset 0xc0100130) and 'is486'
      Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Andi Kleen <ak@suse.de>
    • Z
      [PATCH] i386: vMI backend for paravirt-ops · 7ce0bcfd
      Zachary Amsden committed
      Fairly straightforward implementation of VMI backend for paravirt-ops.
      
      [Adrian Bunk: some cleanups]
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
    • J
      [PATCH] i386: Convert i386 PDA code to use %fs · 464d1a78
      Jeremy Fitzhardinge committed
      Convert the PDA code to use %fs rather than %gs as the segment for
      per-processor data.  This is because some processors show a small but
      measurable performance gain for reloading a NULL segment selector (as %fs
      generally is in user-space) versus a non-NULL one (as %gs generally is).
      
      On modern processors the difference is very small, perhaps undetectable.
      Some old AMD "K6 3D+" processors are noticeably slower when %fs is used
      rather than %gs; I have no idea why this might be, but I think they're
      sufficiently rare that it doesn't matter much.
      
      This patch also fixes the math emulator, which had not been adjusted to
      match the changed struct pt_regs.
      
      [frederik.deweerdt@gmail.com: fixit with gdb]
      [mingo@elte.hu: Fix KVM too]
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Ian Campbell <Ian.Campbell@XenSource.com>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Acked-by: Zachary Amsden <zach@vmware.com>
      Cc: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
    • A
      [PATCH] Dynamic kernel command-line: i386 · 4e498b66
      Alon Bar-Lev committed
      1. Rename saved_command_line into boot_command_line.
      2. Set command_line as __initdata.
      Signed-off-by: Alon Bar-Lev <alon.barlev@gmail.com>
      Cc: Andi Kleen <ak@muc.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  22. 07 December, 2006 3 commits
    • R
      [PATCH] paravirt: Add startup infrastructure for paravirtualization · c9ccf30d
      Rusty Russell committed
      1) Each hypervisor writes a probe function to detect whether we are
         running under that hypervisor.  paravirt_probe() registers this
         function.
      
      2) If vmlinux is booted with ring != 0, we call all the probe
         functions (with registers except %esp intact) in link order: the
         winner will not return.
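The probe walk can be sketched as a loop over a table of function pointers. In this simplified model a probe returns non-zero to claim the boot, whereas the commit message says the real winner does not return at all; the type and function names are hypothetical, and the array order stands in for link order.

```c
#include <stddef.h>

/* A probe returns non-zero if it recognises its hypervisor. */
typedef int (*probe_fn)(void);

static int probe_none(void)  { return 0; }   /* hypervisor not detected */
static int probe_match(void) { return 1; }   /* hypervisor detected */

/*
 * Call each registered probe in table order.
 * Returns the index of the first probe that claims the boot, or -1 if
 * every probe declines.
 */
static int run_probes(const probe_fn *probes, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (probes[i]())
            return (int)i;
    }
    return -1;
}
```

The -1 case corresponds to booting with ring != 0 under a hypervisor that no registered backend recognises.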
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Chris Wright <chrisw@sous-sol.org>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
    • J
      [PATCH] i386: Use %gs as the PDA base-segment in the kernel · f95d47ca
      Jeremy Fitzhardinge committed
      This patch is the meat of the PDA change.  This patch makes several related
      changes:
      
      1: Most significantly, %gs is now used in the kernel.  This means that on
         entry, the old value of %gs is saved away, and it is reloaded with
         __KERNEL_PDA.
      
      2: entry.S constructs the stack in the shape of struct pt_regs, and this
         is passed around the kernel so that the process's saved register
         state can be accessed.
      
         Unfortunately struct pt_regs doesn't currently have space for %gs
         (or %fs). This patch extends pt_regs to add space for gs (no space
         is allocated for %fs, since it won't be used, and it would just
         complicate the code in entry.S to work around the space).
      
      3: Because %gs is now saved on the stack like %ds, %es and the integer
         registers, there are a number of places where it no longer needs to
         be handled specially; namely context switch, and saving/restoring the
         register state in a signal context.
      
      4: And since kernel threads run in kernel space and call normal kernel
         code, they need to be created with their %gs == __KERNEL_PDA.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Chuck Ebbert <76306.1226@compuserve.com>
      Cc: Zachary Amsden <zach@vmware.com>
      Cc: Jan Beulich <jbeulich@novell.com>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
    • J
      [PATCH] i386: Basic definitions for i386-pda · 9ca36101
      Jeremy Fitzhardinge committed
      This patch has the basic definitions of struct i386_pda, and the segment
      selector in the GDT.
      
      asm-i386/pda.h is more or less a direct copy of asm-x86_64/pda.h.  The most
      interesting difference is the use of _proxy_pda, which is used to give gcc a
      model for the actual memory operations on the real pda structure.  No actual
      reference is ever made to _proxy_pda, so it is never defined.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Chuck Ebbert <76306.1226@compuserve.com>
      Cc: Zachary Amsden <zach@vmware.com>
      Cc: Jan Beulich <jbeulich@novell.com>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>