1. 05 Sep, 2008 1 commit
  2. 15 Aug, 2008 1 commit
  3. 25 May, 2008 2 commits
  4. 17 Apr, 2008 1 commit
    • x86: use ELF section to list CPU vendor specific code · 03ae5768
      Committed by Thomas Petazzoni
      Replace the hardcoded list of initialization functions for each CPU
      vendor by a list in an ELF section, which is read at initialization in
      arch/x86/kernel/cpu/cpu.c to fill the cpu_devs[] array. The ELF
      section, named .x86cpuvendor.init, is reclaimed after boot, and
      contains entries of type "struct cpu_vendor_dev" which associates a
      vendor number with a pointer to a "struct cpu_dev" structure.
      
      This first modification makes it possible to remove all the
      VENDOR_init_cpu() functions.
      
      This patch also removes the hardcoded calls to early_init_amd() and
      early_init_intel(). Instead, we add a "c_early_init" member to the
      cpu_dev structure, which is then called if not NULL by the generic CPU
      initialization code. Unfortunately, in early_cpu_detect(), this_cpu is
      not yet set, so we have to use the cpu_devs[] array directly.
      
      This patch is part of the Linux Tiny project, and is needed for a
      further patch that will allow compilation of unused CPU support code
      to be disabled.
      Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
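The section-based registration described in the commit above can be approximated in user space. This is a minimal sketch, not the kernel code: it assumes GCC or Clang with an ELF linker such as GNU ld, the structures are cut down to two fields each, and the section is named `x86cpuvendor_init` (a valid C identifier, unlike the kernel's `.x86cpuvendor.init`) so that the linker provides `__start_`/`__stop_` symbols automatically. The helper names `early_cpu_init` and `early_cpu_detect_sketch` are illustrative.

```c
#include <stddef.h>

/* Simplified, hypothetical stand-ins for the kernel structures. */
struct cpu_dev {
    const char *c_vendor;
    void (*c_early_init)(void);      /* optional hook; may be NULL */
};

struct cpu_vendor_dev {
    int vendor;                      /* vendor number */
    const struct cpu_dev *cpu_dev;   /* vendor-specific description */
};

enum { VENDOR_INTEL, VENDOR_AMD, VENDOR_NUM };

/* Emit one table entry into a dedicated section (GCC/Clang attribute). */
#define cpu_vendor_dev_register(id, dev)                          \
    static const struct cpu_vendor_dev cpu_vendor_dev_##id        \
        __attribute__((used, section("x86cpuvendor_init"))) = { id, dev }

int early_init_calls;                /* only here so the sketch is observable */
void early_init_amd(void) { early_init_calls++; }

const struct cpu_dev intel_cpu_dev = { "GenuineIntel", NULL };
const struct cpu_dev amd_cpu_dev   = { "AuthenticAMD", early_init_amd };

cpu_vendor_dev_register(VENDOR_INTEL, &intel_cpu_dev);
cpu_vendor_dev_register(VENDOR_AMD, &amd_cpu_dev);

/* Provided by the linker for sections whose name is a C identifier. */
extern const struct cpu_vendor_dev __start_x86cpuvendor_init[];
extern const struct cpu_vendor_dev __stop_x86cpuvendor_init[];

const struct cpu_dev *cpu_devs[VENDOR_NUM];

/* Walk the section once and fill cpu_devs[], replacing hardcoded calls. */
void early_cpu_init(void)
{
    const struct cpu_vendor_dev *cvdev;

    for (cvdev = __start_x86cpuvendor_init;
         cvdev < __stop_x86cpuvendor_init; cvdev++)
        cpu_devs[cvdev->vendor] = cvdev->cpu_dev;
}

/* Call the optional early hook through cpu_devs[]; returns 1 if it ran. */
int early_cpu_detect_sketch(int vendor)
{
    if (cpu_devs[vendor] && cpu_devs[vendor]->c_early_init) {
        cpu_devs[vendor]->c_early_init();
        return 1;
    }
    return 0;
}
```

Adding support for another vendor then only needs one more `cpu_vendor_dev_register()` line in that vendor's file; the generic code does not change.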
  5. 19 Feb, 2008 1 commit
  6. 30 Jan, 2008 2 commits
  7. 29 Jan, 2008 1 commit
  8. 11 Oct, 2007 2 commits
  9. 20 Jul, 2007 2 commits
    • i386: Put allocated ELF notes in read-only data segment · cbe87121
      Committed by Roland McGrath
      This changes the i386 linker script and the asm-generic macro it uses so that
      ELF note sections with SHF_ALLOC set are linked into the kernel image along
      with other read-only data.  The PT_NOTE also points to their location.
      
      This paves the way for putting useful build-time information into ELF notes
      that can be found easily later in a kernel memory dump.
      Signed-off-by: Roland McGrath <roland@redhat.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • define new percpu interface for shared data · 5fb7dc37
      Committed by Fenghua Yu
      The per cpu data section contains two types of data: one set which is
      exclusively accessed by the local cpu, and another set which is per cpu
      but also shared by remote cpus.  In the current kernel, these two sets are
      not clearly separated.  This can potentially cause a single cacheline
      to be shared between the two sets of data, which will result in
      unnecessary bouncing of the cacheline between cpus.
      
      One way to fix the problem is to cacheline align the remotely accessed per
      cpu data, both at the beginning and at the end.  Because of the padding at
      both ends, this will likely cause some memory wastage and also the
      interface to achieve this is not clean.
      
      This patch:
      
      Moves the remotely accessed per cpu data (which is currently marked
      as ____cacheline_aligned_in_smp) into a different section, where all the data
      elements are cacheline aligned. And as such, this differentiates the local
      only data and remotely accessed data cleanly.
      Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
      Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: <linux-arch@vger.kernel.org>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
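The separation of local-only and remotely accessed per cpu data can be illustrated with a plain C11 structure. This is a rough user-space analogue under stated assumptions, not the kernel's percpu interface: the 64-byte `CACHELINE_SIZE`, the structure, and its field names are all hypothetical.

```c
#include <stddef.h>

#define CACHELINE_SIZE 64   /* assumption: typical x86 cacheline size */

/* One CPU's data: local-only fields first, remotely accessed fields after. */
struct percpu_area_sketch {
    /* Touched only by the owning CPU; packed together without padding. */
    unsigned long local_counter;
    unsigned int  local_flags;

    /* Read or written by other CPUs; each starts on its own cacheline. */
    _Alignas(CACHELINE_SIZE) unsigned long remote_wakeups;
    _Alignas(CACHELINE_SIZE) unsigned long remote_pending;
};

/* True if two byte offsets within the area fall in the same cacheline. */
int shares_cacheline(size_t off_a, size_t off_b)
{
    return off_a / CACHELINE_SIZE == off_b / CACHELINE_SIZE;
}
```

Because the aligned members raise the alignment of the whole structure, its size is also rounded up to a multiple of the cacheline size, so the last shared field does not share a line with whatever follows in an array of such areas.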
  10. 18 Jul, 2007 1 commit
    • xen: Core Xen implementation · 5ead97c8
      Committed by Jeremy Fitzhardinge
      This patch is a rollup of all the core pieces of the Xen
      implementation, including:
       - booting and setup
       - pagetable setup
       - privileged instructions
       - segmentation
       - interrupt flags
       - upcalls
       - multicall batching
      
      BOOTING AND SETUP
      
      The vmlinux image is decorated with ELF notes which tell the Xen
      domain builder what the kernel's requirements are; the domain builder
      then constructs the address space accordingly and starts the kernel.
      
      Xen has its own entrypoint for the kernel (contained in an ELF note).
      The ELF notes are set up by xen-head.S, which is included into head.S.
      In principle it could be linked separately, but it seems to provoke
      lots of binutils bugs.
      
      Because the domain builder starts the kernel in a fairly sane state
      (32-bit protected mode, paging enabled, flat segments set up), there's
      not a lot of setup needed before starting the kernel proper.  The main
      steps are:
        1. Install the Xen paravirt_ops, which is simply a matter of a
           structure assignment.
        2. Set init_mm to use the Xen-supplied pagetables (analogous to the
           head.S generated pagetables in a native boot).
        3. Reserve address space for Xen, since it takes a chunk at the top
           of the address space for its own use.
        4. Call start_kernel()
      
      PAGETABLE SETUP
      
      Once we hit the main kernel boot sequence, it will end up calling back
      via paravirt_ops to set up various pieces of Xen specific state.  One
      of the critical things which requires a bit of extra care is the
      construction of the initial init_mm pagetable.  Because Xen places
      tight constraints on pagetables (an active pagetable must always be
      valid, and must always be mapped read-only to the guest domain), we
      need to be careful when constructing the new pagetable to keep these
      constraints in mind.  It turns out that the easiest way to do this is
      to use the initial Xen-provided pagetable as a template, and then just
      insert new mappings for memory where a mapping doesn't already exist.
      
      This means that during pagetable setup, it uses a special version of
      xen_set_pte which ignores any attempt to remap a read-only page as
      read-write (since Xen will map its own initial pagetable as RO), but
      lets other changes to the ptes happen, so that things like NX are set
      properly.
      
      PRIVILEGED INSTRUCTIONS AND SEGMENTATION
      
      When the kernel runs under Xen, it runs in ring 1 rather than ring 0.
      This means that it is more privileged than user-mode in ring 3, but it
      still can't run privileged instructions directly.  Non-performance
      critical instructions are dealt with by taking a privilege exception
      and trapping into the hypervisor and emulating the instruction, but
      more performance-critical instructions have their own specific
      paravirt_ops.  In many cases we can avoid having to do any hypercalls
      for these instructions, or the Xen implementation is quite different
      from the normal native version.
      
      The privileged instructions fall into the broad classes of:
        Segmentation: setting up the GDT and the GDT entries, LDT,
           TLS and so on.  Xen doesn't allow the GDT to be directly
           modified; all GDT updates are done via hypercalls where the new
           entries can be validated.  This is important because Xen uses
           segment limits to prevent the guest kernel from damaging the
           hypervisor itself.
        Traps and exceptions: Xen uses a special format for trap entrypoints,
           so when the kernel wants to set an IDT entry, it needs to be
           converted to the form Xen expects.  Xen sets int 0x80 up specially
           so that the trap goes straight from userspace into the guest kernel
           without going via the hypervisor.  sysenter isn't supported.
        Kernel stack: The esp0 entry is extracted from the tss and provided to
           Xen.
        TLB operations: the various TLB calls are mapped into corresponding
           Xen hypercalls.
        Control registers: all the control registers are privileged.  The most
           important is cr3, which points to the base of the current pagetable,
           and we handle it specially.
      
      Another instruction we treat specially is CPUID, even though it's not
      privileged.  We want to control what CPU features are visible to the
      rest of the kernel, and so CPUID ends up going into a paravirt_op.
      Xen implements this mainly to disable the ACPI and APIC subsystems.
      
      INTERRUPT FLAGS
      
      Xen maintains its own separate flag for masking events, which is
      contained within the per-cpu vcpu_info structure.  Because the guest
      kernel runs in ring 1 and not 0, the IF flag in EFLAGS is completely
      ignored (and must be, because even if a guest domain disables
      interrupts for itself, it can't disable them overall).
      
      (A note on terminology: "events" and interrupts are effectively
      synonymous.  However, rather than using an "enable flag", Xen uses a
      "mask flag", which blocks event delivery when it is non-zero.)
      
      There are paravirt_ops for each of cli/sti/save_fl/restore_fl, which
      are implemented to manage the Xen event mask state.  The only thing
      worth noting is that when events are unmasked, we need to explicitly
      see if there's a pending event and call into the hypervisor to make
      sure it gets delivered.
      
      UPCALLS
      
      Xen needs a couple of upcall (or callback) functions to be implemented
      by each guest.  One is the event upcall, which is how events
      (interrupts, effectively) are delivered to the guests.  The other is
      the failsafe callback, which is used to report errors in either
      reloading a segment register, or caused by iret.  These are
      implemented in i386/kernel/entry.S so they can jump into the normal
      iret_exc path when necessary.
      
      MULTICALL BATCHING
      
      Xen provides a multicall mechanism, which allows multiple hypercalls
      to be issued at once in order to mitigate the cost of trapping into
      the hypervisor.  This is particularly useful for context switches,
      since the 4-5 hypercalls they would normally need (reload cr3, update
      TLS, maybe update LDT) can be reduced to one.  This patch implements a
      generic batching mechanism for hypercalls, which gets used in many
      places in the Xen code.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Chris Wright <chrisw@sous-sol.org>
      Cc: Ian Pratt <ian.pratt@xensource.com>
      Cc: Christian Limpach <Christian.Limpach@cl.cam.ac.uk>
      Cc: Adrian Bunk <bunk@stusta.de>
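The INTERRUPT FLAGS section of the Xen commit above describes a mask flag that blocks event delivery when non-zero, plus an explicit pending check on unmask. A minimal sketch of that logic follows. It is an illustration only: the structure is reduced to two fields, `force_evtchn_callback()` is a stub standing in for the call into the hypervisor, and the `_sketch` function names and the 0/1 flag encoding are assumptions rather than the kernel's actual interface.

```c
#include <stdint.h>

/* Reduced stand-in for the per-cpu vcpu_info structure. */
struct vcpu_info_sketch {
    uint8_t evtchn_upcall_pending;   /* an event is waiting for delivery */
    uint8_t evtchn_upcall_mask;      /* non-zero blocks event delivery */
};

struct vcpu_info_sketch vcpu;
int forced_callbacks;                /* counts simulated hypervisor calls */

/* Stub: the real code would trap into the hypervisor to deliver the event. */
void force_evtchn_callback(void)
{
    forced_callbacks++;
    vcpu.evtchn_upcall_pending = 0;
}

/* save_fl: report 1 if events are enabled (mask clear), 0 otherwise. */
int xen_save_fl_sketch(void)
{
    return !vcpu.evtchn_upcall_mask;
}

/* cli: mask events. */
void xen_irq_disable_sketch(void)
{
    vcpu.evtchn_upcall_mask = 1;
}

/* sti: unmask, then explicitly check for an event that arrived meanwhile.
 * Real code also needs a compiler barrier between the two steps. */
void xen_irq_enable_sketch(void)
{
    vcpu.evtchn_upcall_mask = 0;
    if (vcpu.evtchn_upcall_pending)
        force_evtchn_callback();
}

/* restore_fl: restore a state previously returned by save_fl. */
void xen_restore_fl_sketch(int enabled)
{
    vcpu.evtchn_upcall_mask = !enabled;
    if (enabled && vcpu.evtchn_upcall_pending)
        force_evtchn_callback();
}
```

The pending check is the part that has no native equivalent: the hardware delivers a held-off interrupt by itself after `sti`, whereas here the guest has to ask for delivery.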
  11. 19 May, 2007 2 commits
  12. 11 May, 2007 1 commit
    • Revert "[PATCH] paravirt: Add startup infrastructure for paravirtualization" · 5a18c92a
      Committed by Eric W. Biederman
      This reverts commit c9ccf30d.
      
      Entering the kernel at startup_32 without passing our real mode data in
      %esi, and without guaranteeing that physical and virtual addresses are
      identity mapped makes head.S impossible to maintain.
      
      The only user of this infrastructure is lguest, which is not merged, so
      nothing we currently support will break by removing this over-designed
      nightmare, and only the pending lguest patches will be affected.  The
      pending Xen patches have a different entry point that they use.
      
      We are currently discussing what Xen and lguest need to do to boot the
      kernel in a more normal fashion, so using startup_32 in this weird manner is
      clearly not their long term direction.
      
      So let's remove this code in head.S before it causes brain damage to people
      trying to maintain head.S.
      
      Cc: Chris Wright <chrisw@sous-sol.org>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Zachary Amsden <zach@vmware.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  13. 03 May, 2007 6 commits
  14. 16 Apr, 2007 1 commit
    • [PATCH] x86: Fix gcc 4.2 _proxy_pda workaround · 08269c6d
      Committed by Andi Kleen
      Due to an overly aggressive optimizer, gcc 4.2 cannot optimize away _proxy_pda
      in all cases (counterintuitive, but true).  This breaks loading of some
      modules.
      
      The earlier workaround of just exporting a dummy symbol unfortunately didn't
      work, because the module code ignores exports with a value of 0.
      
      Make it 1 instead.
      Signed-off-by: Andi Kleen <ak@suse.de>
  15. 13 Feb, 2007 1 commit
    • [PATCH] i386: move startup_32() in text.head section · f8657e1b
      Committed by Vivek Goyal
      o The startup_32 entry point was in the .text section but it also accessed
        some init data, which prompts MODPOST to generate warnings at build time:
      
      WARNING: vmlinux - Section mismatch: reference to .init.data:boot_params from
      .text between '_text' (at offset 0xc0100029) and 'startup_32_smp'
      WARNING: vmlinux - Section mismatch: reference to .init.data:boot_params from
      .text between '_text' (at offset 0xc0100037) and 'startup_32_smp'
      WARNING: vmlinux - Section mismatch: reference to
      .init.data:init_pg_tables_end from .text between '_text' (at offset
      0xc0100099) and 'startup_32_smp'
      
      o Can't move startup_32 to .init.text as this entry point has to be at the
        start of bzImage. Hence startup_32 is moved to a new section, .text.head,
        and MODPOST is instructed not to generate warnings if init data is
        accessed from the .text.head section. This code has been audited.
      
      o SMP boot up code (startup_32_smp) can go into .init.text if CPU hotplug
        is not supported. Otherwise it generates more warnings:
      
      WARNING: vmlinux - Section mismatch: reference to .init.data:new_cpu_data from
      .text between 'checkCPUtype' (at offset 0xc0100126) and 'is486'
      WARNING: vmlinux - Section mismatch: reference to .init.data:new_cpu_data from
      .text between 'checkCPUtype' (at offset 0xc0100130) and 'is486'
      Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Andi Kleen <ak@suse.de>
  16. 12 Feb, 2007 1 commit
  17. 10 Dec, 2006 1 commit
    • [PATCH] x86: Work around gcc 4.2 over aggressive optimizer · 1bac3b38
      Committed by Andi Kleen
      The new PDA code uses a dummy _proxy_pda variable to describe
      memory references to the PDA. It is never referenced
      in inline assembly, but exists as input/output arguments.
      gcc 4.2 in some cases can CSE references to this which causes
      unresolved symbols.  Define it to zero to avoid this.
      Signed-off-by: Andi Kleen <ak@suse.de>
  18. 09 Dec, 2006 1 commit
    • [PATCH] Generic BUG for i386 · 91768d6c
      Committed by Jeremy Fitzhardinge
      This makes i386 use the generic BUG machinery.  There are no functional
      changes from the old i386 implementation.
      
      The main advantage in using the generic BUG machinery for i386 is that the
      inlined overhead of BUG is just the ud2a instruction; the file+line(+function)
      information is no longer inlined into the instruction stream.  This reduces
      cache pollution, and makes disassembly work properly.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Hugh Dickens <hugh@veritas.com>
      Cc: Michael Ellerman <michael@ellerman.id.au>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  19. 07 Dec, 2006 8 commits
    • [PATCH] unwinder: move .eh_frame to RODATA · b65780e1
      Committed by Jan Beulich
      The contents of the .eh_frame section are never written to, so it may as
      well benefit from CONFIG_DEBUG_RODATA.
      
      Diff-ed against firstfloor tree.
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
    • [PATCH] i386: Convert more absolute symbols to section relative · 79929fd1
      Committed by Vivek Goyal
      o Convert more absolute symbols to section relative ones, to stay consistent
        within the vmlinux.lds.S file and to avoid problems if the kernel is
        relocated.
      
      o Also add a message so that in future people are aware of this and
        avoid introducing absolute symbols.
      Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
    • [PATCH] paravirt: Add startup infrastructure for paravirtualization · c9ccf30d
      Committed by Rusty Russell
      1) Each hypervisor writes a probe function to detect whether we are
         running under that hypervisor.  paravirt_probe() registers this
         function.
      
      2) If vmlinux is booted with ring != 0, we call all the probe
         functions (with registers except %esp intact) in link order: the
         winner will not return.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Chris Wright <chrisw@sous-sol.org>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
    • [PATCH] paravirt: Patch inline replacements for paravirt intercepts · 139ec7c4
      Committed by Rusty Russell
      It turns out that the most called ops, by several orders of magnitude,
      are the interrupt manipulation ops.  These are obvious candidates for
      patching, so mark them up and create infrastructure for it.
      
      The method used is that the ops structure has a patch function, which
      is called for each place which needs to be patched: this returns a
      number of instructions (the rest are NOP-padded).
      
      Usually we can spare a register (%eax) for the binary patched code to
      use, but in a couple of critical places in entry.S we can't: we make
      the clobbers explicit at the call site, and manually clobber the
      allowed registers in debug mode as an extra check.
      
      And:
      
      Don't abuse CONFIG_DEBUG_KERNEL, add CONFIG_DEBUG_PARAVIRT.
      
      And:
      
      AK:  Fix warnings in x86-64 alternative.c build
      
      And:
      
      AK: Fix compilation with defconfig
      
      And:
      
      From: Andrew Morton <akpm@osdl.org>
      
      Some binutils versions still like to emit references to
      __stop_parainstructions and __start_parainstructions.
      
      And:
      
      AK: Fix warnings about unused variables when PARAVIRT is disabled.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Chris Wright <chrisw@sous-sol.org>
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
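The patching method in the commit above (a patch function is called for each site, reports how much it wrote, and the remainder is NOP-padded) can be sketched on a plain byte buffer. This is a hedged illustration, not the kernel implementation: the ops structure, the type constants, and the function names are hypothetical, and the sketch counts bytes rather than instructions. The only hard facts used are the one-byte x86 encodings of `cli` (0xfa), `sti` (0xfb), and `nop` (0x90).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NOP_OPCODE 0x90   /* x86 one-byte nop */

enum { PARAVIRT_IRQ_DISABLE, PARAVIRT_IRQ_ENABLE, PARAVIRT_OTHER };

/* Hypothetical ops structure carrying only the patch hook. */
struct paravirt_ops_sketch {
    /* Writes replacement code into site[0..len); returns bytes used. */
    size_t (*patch)(int type, unsigned char *site, size_t len);
};

/* Native patcher: replace the interrupt ops with cli/sti, keep the rest. */
size_t native_patch_sketch(int type, unsigned char *site, size_t len)
{
    unsigned char insn;

    switch (type) {
    case PARAVIRT_IRQ_DISABLE: insn = 0xfa; break;   /* cli */
    case PARAVIRT_IRQ_ENABLE:  insn = 0xfb; break;   /* sti */
    default:                   return len;           /* leave site as is */
    }
    if (len < 1)
        return len;
    site[0] = insn;
    return 1;
}

/* Patch one call site and pad whatever the patcher did not use with nops. */
void apply_paravirt_sketch(const struct paravirt_ops_sketch *ops, int type,
                           unsigned char *site, size_t len)
{
    size_t used = ops->patch(type, site, len);

    assert(used <= len);
    memset(site + used, NOP_OPCODE, len - used);
}
```

Returning the full length for an unknown type means "nothing to pad", so a site the patcher does not understand keeps its original indirect call.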
    • [PATCH] i386: Implement CONFIG_PHYSICAL_ALIGN · e69f202d
      Committed by Vivek Goyal
      o CONFIG_PHYSICAL_START is now replaced with CONFIG_PHYSICAL_ALIGN.
        Hardcoding the kernel physical start value creates a problem in the
        relocatable kernel context due to boot loader limitations. For example,
        if somebody compiles a relocatable kernel to run from address 4MB, the
        kernel will still run from 1MB, because grub loads it at physical
        address 1MB and a relocatable kernel runs from the address at which it
        was loaded. So somebody wanting to run the kernel from a 4MB-aligned
        location (for improved performance) can't do that.
      
      o Hence, Eric proposed that CONFIG_PHYSICAL_ALIGN makes more sense in
        the relocatable kernel context. At run time the kernel will move
        itself to a physical address which meets the user-specified alignment
        restrictions.
      Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
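The arithmetic behind "move itself to a physical address which meets the alignment restriction" is a round-up to the next multiple of the alignment. A minimal sketch, assuming the alignment is a power of two; `align_up` and `choose_run_address` are illustrative names, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of align (align must be a power of 2). */
uint32_t align_up(uint32_t addr, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* Pick the address to run from, given where the boot loader put the image. */
uint32_t choose_run_address(uint32_t load_addr, uint32_t physical_align)
{
    return align_up(load_addr, physical_align);
}
```

With the example from the commit message, an image loaded by grub at 1MB (`0x100000`) and built with a 4MB alignment (`0x400000`) would move itself to `0x400000`; an image already loaded at an aligned address stays where it is.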
    • [PATCH] i386: CONFIG_PHYSICAL_START cleanup · 2a43f3ed
      Committed by Eric W. Biederman
      Defining __PHYSICAL_START and __KERNEL_START in asm-i386/page.h works but
      it triggers a full kernel rebuild for the silliest of reasons.  This
      modifies the users to directly use CONFIG_PHYSICAL_START and linux/config.h
      which prevents the full rebuild problem, which makes the code much
      more maintainer and hopefully user friendly.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
    • [PATCH] i386: Add comment for align to vmlinux.lds · 6ed01884
      Committed by Vivek Goyal
      Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
    • [PATCH] i386: Distinguish absolute symbols · 6569580d
      Committed by Vivek Goyal
      Ld knows about 2 kinds of symbols, absolute and section
      relative.  Section relative symbols change value
      when a section is moved and absolute symbols do not.
      
      Currently in the linker script we have several labels
      marking the beginning and ending of sections that
      are outside of sections, making them absolute symbols.
      Having a mixture of absolute and section relative
      symbols referring to the same data is currently harmless
      but it is confusing.
      
      This must be done carefully as newer revs of ld do not place
      symbols that appear in sections without data and instead
      ld makes those symbols global :(
      
      My ultimate goal is to build a relocatable kernel.  The
      safest and least intrusive technique is to generate
      relocation entries so the kernel can be relocated at load
      time.  The only penalty would be an increase in the size
      of the kernel binary.  The problem is that if absolute and
      relocatable symbols are not properly specified absolute symbols
      will be relocated or section relative symbols won't be, which
      is fatal.
      
      The practical motivation is that when generating kernels that
      will run from a reserved area for analyzing what caused
      a kernel panic, it is simpler if you don't need to hard code
      the physical memory location they will run at, especially
      for the distributions.
      
      [AK: and merged:]
      
      o Also put a message so that in future people can be aware of it and
        avoid introducing absolute symbols.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
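The difference between the two symbol kinds when a kernel image is relocated can be shown with a toy symbol table. This is a simplified model for illustration only: the structure, the enum, and `relocate_symbols` are hypothetical and do not correspond to ld's or the kernel's data structures.

```c
#include <stddef.h>
#include <stdint.h>

enum sym_kind { SYM_ABSOLUTE, SYM_SECTION_RELATIVE };

struct symbol_sketch {
    const char   *name;
    enum sym_kind kind;
    uint32_t      value;
};

/* Apply a load-time relocation of `delta` bytes to a symbol table:
 * section relative symbols move with their section, absolute ones do not. */
void relocate_symbols(struct symbol_sketch *syms, size_t count, uint32_t delta)
{
    size_t i;

    for (i = 0; i < count; i++)
        if (syms[i].kind == SYM_SECTION_RELATIVE)
            syms[i].value += delta;
}
```

A label that marks the start of a section but was mistakenly emitted as absolute would keep its old value after relocation and point at the wrong memory, which is the fatal case the commit message warns about.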
  20. 09 Nov, 2006 1 commit
  21. 28 Oct, 2006 1 commit
  22. 26 Sep, 2006 1 commit
    • [PATCH] x86: put .note.* sections into a PT_NOTE segment in vmlinux · 9c9b8b38
      Committed by Jeremy Fitzhardinge
      This patch will pack any .note.* section into a PT_NOTE segment in the output
      file.
      
      To do this, we tell ld that we need a PT_NOTE segment.  This requires us to
      start explicitly mapping sections to segments, so we also need to explicitly
      create PT_LOAD segments for text and data, and map the sections to them
      appropriately.  Fortunately, each section will default to its previous
      section's segment, so it doesn't take many changes to vmlinux.lds.S.
      
      This only changes i386 for now, but I presume the corresponding changes for
      other architectures will be as simple.
      
      This change also adds <linux/elfnote.h>, which defines C and Assembler macros
      for actually creating ELF notes.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Hollis Blanchard <hollisb@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
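For readers unfamiliar with what ends up in a `.note.*` section, the record layout of a 32-bit ELF note can be sketched as follows: a header of three 32-bit words (name size, descriptor size, type), then the NUL-terminated name and the descriptor, each padded to a 4-byte boundary. The helper below is a user-space illustration with hypothetical names; it is not the `<linux/elfnote.h>` macros, which build the same layout at assembly or compile time.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Header that precedes every note: sizes of name and descriptor, and type. */
struct elf_note_hdr_sketch {
    uint32_t n_namesz;   /* length of name, including the terminating NUL */
    uint32_t n_descsz;   /* length of the descriptor payload */
    uint32_t n_type;     /* owner-defined note type */
};

/* Round a length up to the 4-byte alignment used inside notes. */
size_t note_pad4(size_t n)
{
    return (n + 3) & ~(size_t)3;
}

/* Serialize one note into buf; returns bytes written, or 0 if it won't fit.
 * desc must point to at least descsz bytes. */
size_t elf_note_write(unsigned char *buf, size_t buflen, const char *name,
                      uint32_t type, const void *desc, size_t descsz)
{
    struct elf_note_hdr_sketch hdr;
    size_t namesz = strlen(name) + 1;
    size_t total = sizeof hdr + note_pad4(namesz) + note_pad4(descsz);

    if (total > buflen)
        return 0;

    hdr.n_namesz = (uint32_t)namesz;
    hdr.n_descsz = (uint32_t)descsz;
    hdr.n_type = type;

    memset(buf, 0, total);                            /* zero the padding */
    memcpy(buf, &hdr, sizeof hdr);
    memcpy(buf + sizeof hdr, name, namesz);
    if (descsz)
        memcpy(buf + sizeof hdr + note_pad4(namesz), desc, descsz);
    return total;
}
```

A consumer such as a domain builder or a crash-dump tool can walk a PT_NOTE segment by reading each header and skipping the padded name and descriptor lengths.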
  23. 27 Jun, 2006 1 commit