1. 15 Apr, 2015 2 commits
  2. 17 Mar, 2015 1 commit
    • D
      x86/asm/entry/32: Document the 32-bit SYSENTER "emergency stack" better · d828c71f
      Denys Vlasenko committed
      Before the patch, the 'tss_struct::stack' field was not referenced anywhere.
      
      It was used only to set SYSENTER's stack to point after the last byte
      of tss_struct, thus the trailing field, stack[64], was used.
      
      But grep would not know it. You can comment it out, compile,
      and kernel will even run until an unlucky NMI corrupts
      io_bitmap[] (which is also not easily detectable).
      
      This patch changes code so that the purpose and usage of this
      field is not mysterious anymore, and can be easily grepped for.
      
      This does change generated code, for a subtle reason:
      since tss_struct is ____cacheline_aligned, there happens to be
      5 longs of padding at the end. Old code was using the padding
      too; new code will strictly use it only for SYSENTER_stack[].
      Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Will Drewry <wad@chromium.org>
      Link: http://lkml.kernel.org/r/1425912738-559-2-git-send-email-dvlasenk@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      d828c71f
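The padding effect described in this commit can be illustrated with a minimal sketch. The struct below is a hypothetical, heavily simplified stand-in for tss_struct (field sizes and the 64-byte cache line are assumptions, and `__attribute__((aligned))` is a GCC/Clang extension). It contrasts "one past the whole struct" with "one past the named stack field":

```c
#include <assert.h>
#include <stddef.h>

#define CACHELINE 64  /* assumed cache line size for this sketch */

/* Hypothetical, heavily simplified stand-in for tss_struct. */
struct toy_tss {
    unsigned long hw_state[26];        /* stand-in for the hardware TSS */
    unsigned long io_bitmap[8];        /* stand-in for io_bitmap[] */
    unsigned long SYSENTER_stack[64];  /* emergency stack, now named */
} __attribute__((aligned(CACHELINE)));

/* Alignment forces the total size up to a multiple of the cache line. */
static_assert(sizeof(struct toy_tss) % CACHELINE == 0,
              "aligned struct size must be a multiple of the cache line");

/* Old style: stack top is "one past the whole struct", padding included. */
static size_t old_stack_top(void)
{
    return sizeof(struct toy_tss);
}

/* New style: stack top is "one past SYSENTER_stack[]", greppable by name. */
static size_t new_stack_top(void)
{
    return offsetof(struct toy_tss, SYSENTER_stack)
         + sizeof(((struct toy_tss *)0)->SYSENTER_stack);
}

/* Bytes of tail padding that the old computation silently used as stack. */
static size_t tail_padding(void)
{
    return old_stack_top() - new_stack_top();
}
```

The amount of tail padding depends on the word size and on the real field sizes, so the sketch does not try to reproduce the "5 longs" figure from the commit message.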
  3. 11 Dec, 2014 1 commit
  4. 07 Jun, 2013 1 commit
  5. 03 May, 2013 1 commit
    • K
      x86, gdt, hibernate: Store/load GDT for hibernate path. · cc456c4e
      Konrad Rzeszutek Wilk committed
      The git commit e7a5cd06
      ("x86-64, gdt: Store/load GDT for ACPI S3 or hibernate/resume path
      is not needed.") assumes that for the hibernate path the booting
      kernel and the resuming kernel MUST be the same. That is certainly
      the case for a 32-bit kernel (see check_image_kernel and
      CONFIG_ARCH_HIBERNATION_HEADER config option).
      
      However for 64-bit kernels it is OK to have a different kernel
      version (and size of the image) of the booting and resuming kernels.
      Hence the above mentioned git commit introduces a regression.
      
      This patch fixes it by introducing a 'struct desc_ptr gdt_desc'
      back in the 'struct saved_context'. However instead of having in the
      'save_processor_state' and 'restore_processor_state' the
      store/load_gdt calls, we are only saving the GDT in the
      save_processor_state.
      
      For the restore path the lgdt operation is done in
      hibernate_asm_[32|64].S in the 'restore_registers' path.
      
      The apt reader of this description will recognize that only 64-bit
      kernels need this treatment, not 32-bit. This patch adds the logic
      in the 32-bit path to be more similar to 64-bit so that in the future
      the unification process can take advantage of this.
      
      [ hpa: this also reverts an inadvertent on-disk format change ]
      Suggested-by: "H. Peter Anvin" <hpa@zytor.com>
      Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Link: http://lkml.kernel.org/r/1367459610-9656-2-git-send-email-konrad.wilk@oracle.com
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      cc456c4e
  6. 18 Nov, 2011 1 commit
    • H
      x86: Generate system call tables and unistd_*.h from tables · 303395ac
      H. Peter Anvin committed
      Generate system call tables and unistd_*.h automatically from the
      tables in arch/x86/syscalls.  All other information, like NR_syscalls,
      is auto-generated, some of which is in asm-offsets_*.c.
      
      This allows us to keep all the system call information in one place,
      and allows for kernel space and user space to see different
      information; this is currently used for the ia32 system call numbers
      when building the 64-bit kernel, but will be used by the x32 ABI in
      the near future.
      
      This also removes some gratuitous differences between i386, x86-64
      and ia32; in particular, now all system call tables are generated with
      the same mechanism.
      
      Cc: H. J. Lu <hjl.tools@gmail.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Michal Marek <mmarek@suse.cz>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      303395ac
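The idea of deriving both the numbers and the dispatch table from a single list can be sketched in plain C with an X-macro. This is an analogy only: the commit generates its files from the tables under arch/x86/syscalls at build time, and every `TOY_`/`toy_` name below is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical three-entry list standing in for a syscall table file. */
#define TOY_SYSCALL_TABLE(X) \
    X(0, restart_syscall)    \
    X(1, exit)               \
    X(2, fork)

/* Expansion 1: the numbers, as a generated unistd header would define them. */
#define X_NR(nr, name) TOY_NR_##name = nr,
enum { TOY_SYSCALL_TABLE(X_NR) };

/* Expansion 2: the entry count, derived rather than maintained by hand. */
#define X_COUNT(nr, name) +1
enum { TOY_NR_syscalls = 0 TOY_SYSCALL_TABLE(X_COUNT) };

static_assert(TOY_NR_syscalls == 3, "count follows from the list itself");

/* Stub handlers so the table has something to point at. */
static long toy_sys_restart_syscall(void) { return 100; }
static long toy_sys_exit(void)            { return 101; }
static long toy_sys_fork(void)            { return 102; }

/* Expansion 3: the dispatch table, indexed by syscall number. */
typedef long (*toy_syscall_fn)(void);
#define X_ENTRY(nr, name) [nr] = toy_sys_##name,
static const toy_syscall_fn toy_sys_call_table[TOY_NR_syscalls] = {
    TOY_SYSCALL_TABLE(X_ENTRY)
};
```

Because all three expansions read the same list, adding an entry updates the number, the count and the table together.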
  7. 22 Jul, 2011 1 commit
    • R
      lguest: use a special 1:1 linear pagetable mode until first switch. · 5dea1c88
      Rusty Russell committed
      The Host used to create some page tables for the Guest to use at the
      top of Guest memory; it would then tell the Guest where this was.  In
      particular, it created linear mappings for 0 and 0xC0000000 addresses
      because lguest used to switch to its real page tables quite late in
      boot.
      
      However, since d50d8fe1 Linux initialized boot page tables in
      head_32.S even before the "are we lguest?" boot jump.  So, now we can
      simplify things: the Host pagetable code assumes 1:1 linear mapping
      until it first calls the LHCALL_NEW_PGTABLE hypercall, which we now do
      before we reach C code.
      
      This also means that the Host doesn't need to know anything about the
      Guest's PAGE_OFFSET.  (Non-Linux guests might not even have such a
      thing).
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      5dea1c88
  8. 10 Feb, 2011 1 commit
  9. 20 Oct, 2010 1 commit
    • J
      x86, asm: Fix CFI macro invocations to deal with shortcomings in gas · 3234282f
      Jan Beulich committed
      gas prior to (perhaps) 2.16.90 has problems with passing non-
      parenthesized expressions containing spaces to macros. Spaces, however,
      get inserted by cpp between any macro expanding to a number and a
      subsequent + or -. For the +, current x86 gas then removes the space
      again (future gas may not do so), but for the - the space gets retained
      and is then considered a separator between macro arguments.
      
      Fix the respective definitions for both the - and + cases, so that they
      neither contain spaces nor make cpp insert any (the latter by adding
      seemingly redundant parentheses).
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      LKML-Reference: <4CBDBEBA020000780001E05A@vpn.id2.novell.com>
      Cc: Alexander van Heukelum <heukelum@fastmail.fm>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      3234282f
  10. 12 Jun, 2009 1 commit
    • R
      lguest: optimize by coding restore_flags and irq_enable in assembler. · 61f4bc83
      Rusty Russell committed
      The downside of the last patch which made restore_flags and irq_enable
      check interrupts is that they are now too big to be patched directly
      into the callsites, so the C versions are always used.
      
      But the C versions go via PV_CALLEE_SAVE_REGS_THUNK which saves all
      the registers.  In fact, we don't need any registers in the fast path,
      so we can do better than this if we actually code them in assembler.
      
      The results are in the noise, but since it's about the same amount of
      code, it's worth applying.
      
      1GB Guest->Host: input(suppressed),output(suppressed)
      Before:
      	Seconds: 0:16.53
      	Packets: 377268,753673
      	Interrupts: 22461,24297
      	Notifications: 1(5245),21303(732370)
      	Net IRQs triggered: 377023(245),42578(711095)
      
      After:
      	Seconds: 0:16.48
      	Packets: 377289,753673
      	Interrupts: 22281,24465
      	Notifications: 1(5245),21296(732377)
      	Net IRQs triggered: 377060(229),42564(711109)
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      61f4bc83
  11. 12 May, 2009 1 commit
    • H
      x86, boot: make kernel_alignment adjustable; new bzImage fields · 37ba7ab5
      H. Peter Anvin committed
      Make the kernel_alignment field adjustable; this allows us to set it
      to a large value (intended to be 16 MB to avoid ZONE_DMA contention,
      memory holes and other weirdness) while a smart bootloader can still
      force a loading at a lesser alignment if absolutely necessary.
      
      Also export pref_address (preferred loading address, corresponding to
      the link-time address) and init_size, the total amount of linear
      memory the kernel will require during initialization.
      
      [ Impact: allows better kernel placement, gives bootloader more info ]
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      37ba7ab5
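How a bootloader might use these fields can be sketched as follows. The policy, the function names and the explicit `fallback_alignment` parameter are assumptions made for illustration and are not taken from the boot protocol. The loader tries the preferred (large) alignment first and falls back to a smaller one only if `init_size` bytes do not fit:

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of align (align must be a power of two). */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/*
 * Hypothetical loader policy: return the first suitably aligned address
 * in [region_start, region_end) that leaves init_size bytes of room.
 * The preferred alignment is tried first, then the fallback.
 * Returns 0 if neither fits.
 */
static uint64_t pick_load_address(uint64_t region_start, uint64_t region_end,
                                  uint64_t init_size,
                                  uint64_t kernel_alignment,
                                  uint64_t fallback_alignment)
{
    uint64_t aligns[2] = { kernel_alignment, fallback_alignment };

    for (int i = 0; i < 2; i++) {
        uint64_t addr = align_up(region_start, aligns[i]);
        if (addr < region_end && region_end - addr >= init_size)
            return addr;
    }
    return 0;
}
```

With a 16 MB preferred alignment, a region starting at 1 MB yields a load address of 16 MB when there is room, and a lower address at the fallback alignment when there is not.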
  12. 01 Apr, 2009 1 commit
  13. 10 Feb, 2009 1 commit
    • T
      x86: make lazy %gs optional on x86_32 · ccbeed3a
      Tejun Heo committed
      Impact: pt_regs changed, lazy gs handling made optional, add slight
              overhead to SAVE_ALL, simplifies error_code path a bit
      
      On x86_32, %gs hasn't been used by the kernel and is handled lazily.  pt_regs
      doesn't have a place for it and gs is saved/loaded only when necessary.
      In preparation for stack protector support, this patch makes lazy %gs
      handling optional by doing the following.
      
      * Add CONFIG_X86_32_LAZY_GS and place for gs in pt_regs.
      
      * Save and restore %gs along with other registers in entry_32.S unless
        LAZY_GS.  Note that this unfortunately adds "pushl $0" on SAVE_ALL
        even when LAZY_GS.  However, it adds no overhead to common exit path
        and simplifies entry path with error code.
      
      * Define different user_gs accessors depending on LAZY_GS and add
        lazy_save_gs() and lazy_load_gs() which are noop if !LAZY_GS.  The
        lazy_*_gs() ops are used to save, load and clear %gs lazily.
      
      * Define ELF_CORE_COPY_KERNEL_REGS() which always reads %gs directly.
      
      xen and lguest changes need to be verified.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      ccbeed3a
  14. 18 Dec, 2008 1 commit
  15. 08 Jul, 2008 1 commit
    • J
      x86/paravirt: split sysret and sysexit · d75cd22f
      Jeremy Fitzhardinge committed
      Don't conflate sysret and sysexit; they're different instructions with
      different semantics, and may be in use at the same time (at least
      within the same kernel, depending on whether it's an Intel or AMD
      system).
      
      sysexit - just return to userspace, does no register restoration of
          any kind; must explicitly atomically enable interrupts.
      
      sysret - reloads flags from r11, so no need to explicitly enable
          interrupts on 64-bit, responsible for restoring usermode %gs
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citirx.com>
      Cc: xen-devel <xen-devel@lists.xensource.com>
      Cc: Stephen Tweedie <sct@redhat.com>
      Cc: Eduardo Habkost <ehabkost@redhat.com>
      Cc: Mark McLoughlin <markmc@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      d75cd22f
  16. 29 Apr, 2008 1 commit
  17. 17 Apr, 2008 1 commit
  18. 26 Feb, 2008 1 commit
  19. 19 Feb, 2008 1 commit
    • R
      x86: fix lguest build failure · f6c540cd
      Rusty Russell committed
      drivers/lguest/x86/switcher_32.S:(.text+0x3815f8): 
      	undefined reference to `LGUEST_PAGES_regs_trapnum'
      
      This problem was caused by asm-offsets.c only having the offsets when
      lguest *guest* support was set, not lguest host (host support used to
      imply guest support, so now they're separate these bugs come out).
      
      Lguest guest support and host support are separate config options:
      they used to be tied together. Sort out which parts of asm-offsets are
      needed for Guest and Host.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      f6c540cd
  20. 30 Jan, 2008 7 commits
  21. 23 Oct, 2007 1 commit
    • R
      Boot with virtual == physical to get closer to native Linux. · 47436aa4
      Rusty Russell committed
      1) This allows us to get a lot closer to booting bzImages.
      
      2) It means we don't have to know page_offset.
      
      3) The Guest needs to modify the boot pagetables to create the
         PAGE_OFFSET mapping before jumping to C code.
      
      4) guest_pa() walks the page tables rather than using page_offset.
      
      5) We don't use page_offset to figure out whether to emulate: it was
         always kinda questionable, and won't work for instructions done
         before remapping (bzImage unpacking in particular).
      
      6) We still want the kernel address for tlb flushing: have the initial
         hypercall give us that, too.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      47436aa4
  22. 22 Oct, 2007 1 commit
  23. 17 Oct, 2007 1 commit
    • J
      paravirt: refactor struct paravirt_ops into smaller pv_*_ops · 93b1eab3
      Jeremy Fitzhardinge committed
      This patch refactors the paravirt_ops structure into groups of
      functionally related ops:
      
      pv_info - random info, rather than function entrypoints
      pv_init_ops - functions used at boot time (some for module_init too)
      pv_misc_ops - lazy mode, which didn't fit well anywhere else
      pv_time_ops - time-related functions
      pv_cpu_ops - various privileged instruction ops
      pv_irq_ops - operations for managing interrupt state
      pv_apic_ops - APIC operations
      pv_mmu_ops - operations for managing pagetables
      
      There are several motivations for this:
      
      1. Some of these ops will be general to all x86, and some will be
         i386/x86-64 specific.  This makes it easier to share common stuff
         while allowing separate implementations where needed.
      
      2. At the moment we must export all of paravirt_ops, but modules only
         need selected parts of it.  This allows us to export on a case by case
         basis (and also choose which export license we want to apply).
      
      3. Functional groupings make things a bit more readable.
      
      Struct paravirt_ops is now only used as a template to generate
      patch-site identifiers, and to extract function pointers for inserting
      into jmp/calls when patching.  It is only instantiated when needed.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Zach Amsden <zach@vmware.com>
      Cc: Avi Kivity <avi@qumranet.com>
      Cc: Anthony Liguory <aliguori@us.ibm.com>
      Cc: "Glauber de Oliveira Costa" <glommer@gmail.com>
      Cc: Jun Nakajima <jun.nakajima@intel.com>
      93b1eab3
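The grouping, and the use of a template struct purely to derive patch-site identifiers, can be sketched like this. The groups are cut down to a couple of members each and every `toy_`/`TOY_` name is hypothetical. The sketch also assumes that function pointers and `void *` have the same size:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down groups; the real structures have many more members. */
struct toy_pv_time_ops {
    unsigned long long (*sched_clock)(void);
};

struct toy_pv_irq_ops {
    unsigned long (*save_fl)(void);
    void (*restore_fl)(unsigned long flags);
};

/* Template only: used to derive identifiers, never to hold live pointers. */
struct toy_paravirt_patch_template {
    struct toy_pv_time_ops pv_time_ops;
    struct toy_pv_irq_ops  pv_irq_ops;
};

static_assert(sizeof(void (*)(void)) == sizeof(void *),
              "sketch assumes function pointers are pointer-sized");

/* A patch-site identifier is the op's slot index within the template. */
#define TOY_PATCH_ID(member) \
    (offsetof(struct toy_paravirt_patch_template, member) / sizeof(void *))

/* Stand-ins for native implementations operating on a fake flags word. */
static unsigned long fake_flags;
static unsigned long toy_native_save_fl(void) { return fake_flags; }
static void toy_native_restore_fl(unsigned long flags) { fake_flags = flags; }

/* Each group is its own object, so it could be exported independently. */
static struct toy_pv_irq_ops pv_irq_ops = {
    .save_fl    = toy_native_save_fl,
    .restore_fl = toy_native_restore_fl,
};
```

Splitting the groups into separate objects is what allows exporting only the parts that modules need, as motivation 2 in the commit message describes.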
  24. 11 Oct, 2007 3 commits
  25. 20 Jul, 2007 1 commit
  26. 18 Jul, 2007 2 commits
    • J
      xen: use iret directly when possible · 9ec2b804
      Jeremy Fitzhardinge committed
      Most of the time we can simply use the iret instruction to exit the
      kernel, rather than having to use the iret hypercall - the only
      exception is if we're returning into vm86 mode, or from delivering an
      NMI (which we don't support yet).
      
      When running native, iret has the behaviour of testing for a pending
      interrupt atomically with re-enabling interrupts.  Unfortunately
      there's no way to do this with Xen, so there's a window in which we
      could get a recursive exception after enabling events but before
      actually returning to userspace.
      
      This causes a problem: if the nested interrupt causes one of the
      task's TIF_WORK_MASK flags to be set, they will not be checked again
      before returning to userspace.  This means that pending work may be
      left pending indefinitely, until the process enters and leaves the
      kernel again.  The net effect is that a pending signal or reschedule
      event could be delayed for an unbounded amount of time.
      
      To deal with this, the xen event upcall handler checks to see if the
      EIP is within the critical section of the iret code, after events
      are (potentially) enabled up to the iret itself.  If it's within this
      range, it calls the iret critical section fixup, which adjusts the
      stack to deal with any unrestored registers, and then shifts the
      stack frame up to replace the previous invocation.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      9ec2b804
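The range test performed by the event upcall handler can be sketched as below. The struct, the function name and the choice of a half-open range (the iret address itself excluded) are assumptions for illustration. The actual fixup of the stack frame is not shown:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bounds of the iret critical section. */
struct crit_range {
    uintptr_t start;  /* first instruction after events may be enabled */
    uintptr_t end;    /* address of the iret itself (assumed exclusive) */
};

/*
 * Returns true if the interrupted EIP lies inside the critical section,
 * i.e. the handler would need to run the iret fixup before continuing.
 */
static bool in_iret_critical_section(const struct crit_range *r, uintptr_t eip)
{
    assert(r->start <= r->end);
    return eip >= r->start && eip < r->end;
}
```

An upcall that interrupted code outside this range takes the normal path, so only events landing in the narrow window pay for the fixup.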
    • J
      xen: Attempt to patch inline versions of common operations · 6487673b
      Jeremy Fitzhardinge committed
      This patch adds the mechanism to allow us to patch inline versions of
      common operations.
      
      The implementations of the direct-access versions save_fl, restore_fl,
      irq_enable and irq_disable are now in assembler, and the same code is
      used for both out of line and inline uses.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Cc: Keir Fraser <keir@xensource.com>
      6487673b
  27. 03 May, 2007 4 commits
    • J
      [PATCH] i386: map enough initial memory to create lowmem mappings · 9ce8c2ed
      Jeremy Fitzhardinge committed
      head.S creates the very initial pagetable for the kernel.  This just
      maps enough space for the kernel itself, and an allocation bitmap.
      The amount of mapped memory is rounded up to 4Mbytes, and so this
      typically ends up mapping 8Mbytes of memory.
      
      When booting, pagetable_init() needs to create mappings for all
      lowmem, and the pagetables for these mappings are allocated from the
      free pages around the kernel in low memory.  If the number of
      pagetable pages + kernel size exceeds head.S's initial mapping, it
      will end up faulting on an unmapped page.  This will only happen with
      specific combinations of kernel size and memory size.
      
      This patch makes sure that head.S also maps enough space to fit the
      kernel pagetables as well as the kernel itself.  It ends up using an
      additional two pages of unreclaimable memory.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Acked-by: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Zachary Amsden <zach@vmware.com>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      9ce8c2ed
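The arithmetic behind the failure can be sketched under simplifying assumptions: non-PAE i386 (each page-table page maps 4 MiB), the allocation bitmap ignored, and hypothetical function names. The sketch shows how adding the page-table pages to the kernel size can push the required initial mapping over a 4 MiB boundary:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE     4096u
#define PTRS_PER_PTE  1024u                       /* non-PAE i386 assumed */
#define PT_COVERAGE   (PAGE_SIZE * PTRS_PER_PTE)  /* bytes mapped per page table */

static_assert(PT_COVERAGE == (4u << 20), "one page table maps 4 MiB");

static uint32_t div_round_up(uint32_t n, uint32_t d)
{
    return n / d + (n % d != 0);
}

/* Page-table pages needed to map lowmem_bytes of low memory. */
static uint32_t pagetable_pages(uint32_t lowmem_bytes)
{
    return div_round_up(lowmem_bytes, PT_COVERAGE);
}

/*
 * Bytes the boot code would need mapped so that both the kernel and the
 * lowmem page tables fit, rounded up to a 4 MiB boundary.
 */
static uint32_t initial_mapping_bytes(uint32_t kernel_bytes,
                                      uint32_t lowmem_bytes)
{
    uint32_t need = kernel_bytes + pagetable_pages(lowmem_bytes) * PAGE_SIZE;
    return div_round_up(need, PT_COVERAGE) * PT_COVERAGE;
}
```

For 896 MiB of lowmem this gives 224 page-table pages (896 KiB). A 7.5 MiB kernel alone fits in an 8 MiB mapping, but kernel plus page tables needs 12 MiB, which is the kind of size combination the commit describes.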
    • J
      [PATCH] i386: Convert PDA into the percpu section · 7c3576d2
      Jeremy Fitzhardinge committed
      Currently x86 (similar to x86-64) has a special per-cpu structure
      called "i386_pda" which can be easily and efficiently referenced via
      the %fs register.  An ELF section is more flexible than a structure,
      allowing any piece of code to use this area.  Indeed, such a section
      already exists: the per-cpu area.
      
      So this patch:
      (1) Removes the PDA and uses per-cpu variables for each current member.
      (2) Replaces the __KERNEL_PDA segment with __KERNEL_PERCPU.
      (3) Creates a per-cpu mirror of __per_cpu_offset called this_cpu_off, which
          can be used to calculate addresses for this CPU's variables.
      (4) Simplifies startup, because %fs doesn't need to be loaded with a
          special segment at early boot; it can be deferred until the first
          percpu area is allocated (or never for UP).
      
      The result is less code and one less x86-specific concept.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Andi Kleen <ak@suse.de>
      7c3576d2
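The offset-based addressing in point (3) can be modelled in user space. This is a sketch only: the struct, array and function names are hypothetical, the areas are ordinary static arrays rather than an ELF section, and the pointer arithmetic goes through `uintptr_t`. A variable's address in the template plus a CPU's offset gives that CPU's private copy:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NR_CPUS 4

/* Hypothetical per-cpu "section": one template plus one copy per CPU. */
struct percpu_area {
    int       current_task_id;
    uintptr_t this_cpu_off;   /* per-cpu mirror of this CPU's offset */
};

static struct percpu_area template_area;       /* stands in for the link-time image */
static struct percpu_area cpu_areas[NR_CPUS];  /* run-time copies */
static uintptr_t per_cpu_offset[NR_CPUS];      /* offset from template to each copy */

/* Copy the template for every CPU and record the distance to each copy. */
static void setup_per_cpu_areas(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        memcpy(&cpu_areas[cpu], &template_area, sizeof(template_area));
        per_cpu_offset[cpu] =
            (uintptr_t)&cpu_areas[cpu] - (uintptr_t)&template_area;
        cpu_areas[cpu].this_cpu_off = per_cpu_offset[cpu];
    }
}

/* Translate the address of a template variable into a given CPU's copy. */
static int *per_cpu_int(int *template_var, int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return (int *)((uintptr_t)template_var + per_cpu_offset[cpu]);
}
```

Keeping the offset inside each area as well is what lets code running on a CPU find its own variables without indexing the global offset table.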
    • R
      [PATCH] i386: i386 separate hardware-defined TSS from Linux additions · a75c54f9
      Rusty Russell committed
      On Thu, 2007-03-29 at 13:16 +0200, Andi Kleen wrote:
      > Please clean it up properly with two structs.
      
      Not sure about this, now I've done it.  Running it here.
      
      If you like it, I can do x86-64 as well.
      
      ==
      lguest defines its own TSS struct because the "struct tss_struct"
      contains linux-specific additions.  Andi asked me to split the struct
      in processor.h.
      
      Unfortunately it makes usage a little awkward.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andi Kleen <ak@suse.de>
      a75c54f9
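The split, and the slightly awkward usage it leads to, can be sketched with two hypothetical structs (fields abridged, names invented for illustration): the hardware-defined part is nested as the first member of the Linux wrapper, so callers have to go through one extra member.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the hardware-defined part (fields abridged). */
struct toy_hw_tss {
    unsigned short back_link, __blh;
    unsigned long  sp0;
    unsigned short ss0, __ss0h;
};

/* Hypothetical stand-in for the Linux wrapper with its software additions. */
struct toy_tss_struct {
    struct toy_hw_tss x86_tss;       /* kept first in this sketch */
    unsigned long     io_bitmap[4];  /* Linux-specific addition (abridged) */
};

static_assert(offsetof(struct toy_tss_struct, x86_tss) == 0,
              "hardware part starts at the wrapper's address");

/* Callers now reach hardware fields via the nested member. */
static void toy_load_sp0(struct toy_tss_struct *tss, unsigned long sp0)
{
    tss->x86_tss.sp0 = sp0;  /* previously this would have been tss->sp0 */
}
```

A user such as lguest that only cares about the hardware layout can then use the inner struct directly instead of defining its own copy.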
    • A
      [PATCH] i386: VDSO_PRELINK warning fix · 1b523fb5
      Andrew Morton committed
      The lguest patches somehow managed to trigger this:
      
      In file included from arch/i386/lguest/lguest.c:38:
      include/asm/asm-offsets.h:67:1: warning: "VDSO_PRELINK" redefined
      In file included from include/linux/elf.h:7,
                       from include/linux/module.h:15,
                       from include/linux/device.h:21,
                       from include/linux/interrupt.h:15,
                       from arch/i386/lguest/lguest.c:27:
      include/asm/elf.h:140:1: warning: this is the location of the previous definition
      
      I assume that using the same identifier twice was a bad idea..
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Andi Kleen <ak@suse.de>
      1b523fb5