1. 01 4月, 2009 1 次提交
  2. 10 2月, 2009 1 次提交
    • T
      x86: make lazy %gs optional on x86_32 · ccbeed3a
      Tejun Heo 提交于
      Impact: pt_regs changed, lazy gs handling made optional, add slight
              overhead to SAVE_ALL, simplifies error_code path a bit
      
      On x86_32, %gs hasn't been used by kernel and handled lazily.  pt_regs
      doesn't have place for it and gs is saved/loaded only when necessary.
      In preparation for stack protector support, this patch makes lazy %gs
      handling optional by doing the followings.
      
      * Add CONFIG_X86_32_LAZY_GS and place for gs in pt_regs.
      
      * Save and restore %gs along with other registers in entry_32.S unless
        LAZY_GS.  Note that this unfortunately adds "pushl $0" on SAVE_ALL
        even when LAZY_GS.  However, it adds no overhead to common exit path
        and simplifies entry path with error code.
      
      * Define different user_gs accessors depending on LAZY_GS and add
        lazy_save_gs() and lazy_load_gs() which are noop if !LAZY_GS.  The
        lazy_*_gs() ops are used to save, load and clear %gs lazily.
      
      * Define ELF_CORE_COPY_KERNEL_REGS() which always read %gs directly.
      
      xen and lguest changes need to be verified.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ccbeed3a
  3. 18 12月, 2008 1 次提交
  4. 08 7月, 2008 1 次提交
    • J
      x86/paravirt: split sysret and sysexit · d75cd22f
      Jeremy Fitzhardinge 提交于
      Don't conflate sysret and sysexit; they're different instructions with
      different semantics, and may be in use at the same time (at least
      within the same kernel, depending on whether its an Intel or AMD
      system).
      
      sysexit - just return to userspace, does no register restoration of
          any kind; must explicitly atomically enable interrupts.
      
      sysret - reloads flags from r11, so no need to explicitly enable
          interrupts on 64-bit, responsible for restoring usermode %gs
      Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citirx.com>
      Cc: xen-devel <xen-devel@lists.xensource.com>
      Cc: Stephen Tweedie <sct@redhat.com>
      Cc: Eduardo Habkost <ehabkost@redhat.com>
      Cc: Mark McLoughlin <markmc@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      d75cd22f
  5. 29 4月, 2008 1 次提交
  6. 17 4月, 2008 1 次提交
  7. 26 2月, 2008 1 次提交
  8. 19 2月, 2008 1 次提交
    • R
      x86: fix lguest build failure · f6c540cd
      Rusty Russell 提交于
      drivers/lguest/x86/switcher_32.S:(.text+0x3815f8): 
      	undefined reference to `LGUEST_PAGES_regs_trapnum'
      
      This problem was caused by asm-offsets.c only having the offsets when
      lguest *guest* support was set, not lguest host (host support used to
      imply guest support, so now they're separate these bugs come out).
      
      Lguest guest support and host support are separate config options:
      they used to be tied together. Sort out which parts of asm-offsets are
      needed for Guest and Host.
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      f6c540cd
  9. 30 1月, 2008 7 次提交
  10. 23 10月, 2007 1 次提交
    • R
      Boot with virtual == physical to get closer to native Linux. · 47436aa4
      Rusty Russell 提交于
      1) This allows us to get alot closer to booting bzImages.
      
      2) It means we don't have to know page_offset.
      
      3) The Guest needs to modify the boot pagetables to create the
         PAGE_OFFSET mapping before jumping to C code.
      
      4) guest_pa() walks the page tables rather than using page_offset.
      
      5) We don't use page_offset to figure out whether to emulate: it was
         always kinda quesationable, and won't work for instructions done
         before remapping (bzImage unpacking in particular).
      
      6) We still want the kernel address for tlb flushing: have the initial
         hypercall give us that, too.
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      47436aa4
  11. 22 10月, 2007 1 次提交
  12. 17 10月, 2007 1 次提交
    • J
      paravirt: refactor struct paravirt_ops into smaller pv_*_ops · 93b1eab3
      Jeremy Fitzhardinge 提交于
      This patch refactors the paravirt_ops structure into groups of
      functionally related ops:
      
      pv_info - random info, rather than function entrypoints
      pv_init_ops - functions used at boot time (some for module_init too)
      pv_misc_ops - lazy mode, which didn't fit well anywhere else
      pv_time_ops - time-related functions
      pv_cpu_ops - various privileged instruction ops
      pv_irq_ops - operations for managing interrupt state
      pv_apic_ops - APIC operations
      pv_mmu_ops - operations for managing pagetables
      
      There are several motivations for this:
      
      1. Some of these ops will be general to all x86, and some will be
         i386/x86-64 specific.  This makes it easier to share common stuff
         while allowing separate implementations where needed.
      
      2. At the moment we must export all of paravirt_ops, but modules only
         need selected parts of it.  This allows us to export on a case by case
         basis (and also choose which export license we want to apply).
      
      3. Functional groupings make things a bit more readable.
      
      Struct paravirt_ops is now only used as a template to generate
      patch-site identifiers, and to extract function pointers for inserting
      into jmp/calls when patching.  It is only instantiated when needed.
      Signed-off-by: NJeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Zach Amsden <zach@vmware.com>
      Cc: Avi Kivity <avi@qumranet.com>
      Cc: Anthony Liguory <aliguori@us.ibm.com>
      Cc: "Glauber de Oliveira Costa" <glommer@gmail.com>
      Cc: Jun Nakajima <jun.nakajima@intel.com>
      93b1eab3
  13. 11 10月, 2007 3 次提交
  14. 20 7月, 2007 1 次提交
  15. 18 7月, 2007 2 次提交
    • J
      xen: use iret directly when possible · 9ec2b804
      Jeremy Fitzhardinge 提交于
      Most of the time we can simply use the iret instruction to exit the
      kernel, rather than having to use the iret hypercall - the only
      exception is if we're returning into vm86 mode, or from delivering an
      NMI (which we don't support yet).
      
      When running native, iret has the behaviour of testing for a pending
      interrupt atomically with re-enabling interrupts.  Unfortunately
      there's no way to do this with Xen, so there's a window in which we
      could get a recursive exception after enabling events but before
      actually returning to userspace.
      
      This causes a problem: if the nested interrupt causes one of the
      task's TIF_WORK_MASK flags to be set, they will not be checked again
      before returning to userspace.  This means that pending work may be
      left pending indefinitely, until the process enters and leaves the
      kernel again.  The net effect is that a pending signal or reschedule
      event could be delayed for an unbounded amount of time.
      
      To deal with this, the xen event upcall handler checks to see if the
      EIP is within the critical section of the iret code, after events
      are (potentially) enabled up to the iret itself.  If its within this
      range, it calls the iret critical section fixup, which adjusts the
      stack to deal with any unrestored registers, and then shifts the
      stack frame up to replace the previous invocation.
      Signed-off-by: NJeremy Fitzhardinge <jeremy@xensource.com>
      9ec2b804
    • J
      xen: Attempt to patch inline versions of common operations · 6487673b
      Jeremy Fitzhardinge 提交于
      This patchs adds the mechanism to allow us to patch inline versions of
      common operations.
      
      The implementations of the direct-access versions save_fl, restore_fl,
      irq_enable and irq_disable are now in assembler, and the same code is
      used for both out of line and inline uses.
      Signed-off-by: NJeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Cc: Keir Fraser <keir@xensource.com>
      6487673b
  16. 03 5月, 2007 5 次提交
    • J
      [PATCH] i386: map enough initial memory to create lowmem mappings · 9ce8c2ed
      Jeremy Fitzhardinge 提交于
      head.S creates the very initial pagetable for the kernel.  This just
      maps enough space for the kernel itself, and an allocation bitmap.
      The amount of mapped memory is rounded up to 4Mbytes, and so this
      typically ends up mapping 8Mbytes of memory.
      
      When booting, pagetable_init() needs to create mappings for all
      lowmem, and the pagetables for these mappings are allocated from the
      free pages around the kernel in low memory.  If the number of
      pagetable pages + kernel size exceeds head.S's initial mapping, it
      will end up faulting on an unmapped page.  This will only happen with
      specific combinations of kernel size and memory size.
      
      This patch makes sure that head.S also maps enough space to fit the
      kernel pagetables as well as the kernel itself.  It ends up using an
      additional two pages of unreclaimable memory.
      Signed-off-by: NJeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      Acked-by: N"H. Peter Anvin" <hpa@zytor.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Zachary Amsden <zach@vmware.com>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>,
      9ce8c2ed
    • J
      [PATCH] i386: Convert PDA into the percpu section · 7c3576d2
      Jeremy Fitzhardinge 提交于
      Currently x86 (similar to x84-64) has a special per-cpu structure
      called "i386_pda" which can be easily and efficiently referenced via
      the %fs register.  An ELF section is more flexible than a structure,
      allowing any piece of code to use this area.  Indeed, such a section
      already exists: the per-cpu area.
      
      So this patch:
      (1) Removes the PDA and uses per-cpu variables for each current member.
      (2) Replaces the __KERNEL_PDA segment with __KERNEL_PERCPU.
      (3) Creates a per-cpu mirror of __per_cpu_offset called this_cpu_off, which
          can be used to calculate addresses for this CPU's variables.
      (4) Simplifies startup, because %fs doesn't need to be loaded with a
          special segment at early boot; it can be deferred until the first
          percpu area is allocated (or never for UP).
      
      The result is less code and one less x86-specific concept.
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: NJeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      Cc: Andi Kleen <ak@suse.de>
      7c3576d2
    • R
      [PATCH] i386: i386 separate hardware-defined TSS from Linux additions · a75c54f9
      Rusty Russell 提交于
      On Thu, 2007-03-29 at 13:16 +0200, Andi Kleen wrote:
      > Please clean it up properly with two structs.
      
      Not sure about this, now I've done it.  Running it here.
      
      If you like it, I can do x86-64 as well.
      
      ==
      lguest defines its own TSS struct because the "struct tss_struct"
      contains linux-specific additions.  Andi asked me to split the struct
      in processor.h.
      
      Unfortunately it makes usage a little awkward.
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      a75c54f9
    • A
      [PATCH] i386: VDSO_PRELINK warning fix · 1b523fb5
      Andrew Morton 提交于
      The lguest patches somehow managed to trigger this:
      
      In file included from arch/i386/lguest/lguest.c:38:
      include/asm/asm-offsets.h:67:1: warning: "VDSO_PRELINK" redefined
      In file included from include/linux/elf.h:7,
                       from include/linux/module.h:15,
                       from include/linux/device.h:21,
                       from include/linux/interrupt.h:15,
                       from arch/i386/lguest/lguest.c:27:
      include/asm/elf.h:140:1: warning: this is the location of the previous definition
      
      I assume that using the same identifier twice was a bad idea..
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      1b523fb5
    • A
      [PATCH] i386: workaround for a -Wmissing-prototypes warning · 27142219
      Adrian Bunk 提交于
      Work around a warning with -Wmissing-prototypes in
      arch/i386/kernel/asm-offsets.c
      
      The warning isn't gcc's fault - asm-offsets.c is simply a special file.
      Signed-off-by: NAdrian Bunk <bunk@stusta.de>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      Cc: Andi Kleen <ak@muc.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      27142219
  17. 13 2月, 2007 1 次提交
  18. 07 12月, 2006 6 次提交
  19. 28 6月, 2006 1 次提交
    • I
      [PATCH] vdso: randomize the i386 vDSO by moving it into a vma · e6e5494c
      Ingo Molnar 提交于
      Move the i386 VDSO down into a vma and thus randomize it.
      
      Besides the security implications, this feature also helps debuggers, which
      can COW a vma-backed VDSO just like a normal DSO and can thus do
      single-stepping and other debugging features.
      
      It's good for hypervisors (Xen, VMWare) too, which typically live in the same
      high-mapped address space as the VDSO, hence whenever the VDSO is used, they
      get lots of guest pagefaults and have to fix such guest accesses up - which
      slows things down instead of speeding things up (the primary purpose of the
      VDSO).
      
      There's a new CONFIG_COMPAT_VDSO (default=y) option, which provides support
      for older glibcs that still rely on a prelinked high-mapped VDSO.  Newer
      distributions (using glibc 2.3.3 or later) can turn this option off.  Turning
      it off is also recommended for security reasons: attackers cannot use the
      predictable high-mapped VDSO page as syscall trampoline anymore.
      
      There is a new vdso=[0|1] boot option as well, and a runtime
      /proc/sys/vm/vdso_enabled sysctl switch, that allows the VDSO to be turned
      on/off.
      
      (This version of the VDSO-randomization patch also has working ELF
      coredumping, the previous patch crashed in the coredumping code.)
      
      This code is a combined work of the exec-shield VDSO randomization
      code and Gerd Hoffmann's hypervisor-centric VDSO patch. Rusty Russell
      started this patch and i completed it.
      
      [akpm@osdl.org: cleanups]
      [akpm@osdl.org: compile fix]
      [akpm@osdl.org: compile fix 2]
      [akpm@osdl.org: compile fix 3]
      [akpm@osdl.org: revernt MAXMEM change]
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NArjan van de Ven <arjan@infradead.org>
      Cc: Gerd Hoffmann <kraxel@suse.de>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Zachary Amsden <zach@vmware.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Jan Beulich <jbeulich@novell.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      e6e5494c
  20. 26 6月, 2006 1 次提交
    • H
      [CRYPTO] all: Pass tfm instead of ctx to algorithms · 6c2bb98b
      Herbert Xu 提交于
      Up until now algorithms have been happy to get a context pointer since
      they know everything that's in the tfm already (e.g., alignment, block
      size).
      
      However, once we have parameterised algorithms, such information will
      be specific to each tfm.  So the algorithm API needs to be changed to
      pass the tfm structure instead of the context pointer.
      
      This patch is basically a text substitution.  The only tricky bit is
      the assembly routines that need to get the context pointer offset
      through asm-offsets.h.
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      6c2bb98b
  21. 17 4月, 2005 1 次提交
    • L
      Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds 提交于
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!
      1da177e4