1. 03 May 2007: 40 commits
    • [PATCH] i386: PARAVIRT: drop unused ptep_get_and_clear · 4cdd9c89
      Authored by Jeremy Fitzhardinge
      In shadow mode hypervisors, ptep_get_and_clear achieves the desired
      purpose of keeping the shadows in sync by issuing a native_get_and_clear,
      followed by a call to pte_update, which indicates the PTE has been
      modified.
      
      Direct mode hypervisors (Xen) have no need for this anyway, and will trap
      the update using writable pagetables.
      
      This means no hypervisor makes use of ptep_get_and_clear; there is no
      reason to have it in the paravirt-ops structure.  Change confusing
      terminology about raw vs. native functions into consistent use of
      native_pte_xxx for operations which do not invoke paravirt-ops.
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      4cdd9c89
    • [PATCH] i386: PARAVIRT: Clean up paravirt patchable wrappers · 1a45b7aa
      Authored by Jeremy Fitzhardinge
      Replace all the open-coded macros for generating calls with a pair of
      more general macros (__PVOP_CALL/VCALL), and redefine all the
      PVOP_V?CALL[0-4] in terms of them.
      
      [ Andrew, Andi: this should slot in immediately after "Document asm-i386/paravirt.h"
        (paravirt_ops-document-asm-i386-paravirth.patch) ]
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      1a45b7aa
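      As a rough userspace illustration of that layering (toy C with invented op
      names; the kernel macros additionally emit patch-site labels and clobber
      annotations, which are omitted here), a single general pair of call macros
      can back all of the fixed-arity wrappers:

        #include <stdio.h>

        /* Toy ops table; the real paravirt_ops has many more members. */
        struct pv_ops {
            unsigned long (*read_cr2)(void);
            void (*write_cr3)(unsigned long val);
        };

        static unsigned long native_read_cr2(void)      { return 0x1000; }
        static void native_write_cr3(unsigned long val) { printf("cr3 <- %#lx\n", val); }

        static struct pv_ops pv_ops = { native_read_cr2, native_write_cr3 };

        /* One general pair of call macros... */
        #define __PVOP_CALL(op, ...)   pv_ops.op(__VA_ARGS__)
        #define __PVOP_VCALL(op, ...)  ((void)pv_ops.op(__VA_ARGS__))

        /* ...and the numbered wrappers defined purely in terms of them. */
        #define PVOP_CALL0(op)         __PVOP_CALL(op)
        #define PVOP_VCALL1(op, a1)    __PVOP_VCALL(op, a1)

        int main(void)
        {
            printf("cr2 = %#lx\n", PVOP_CALL0(read_cr2));
            PVOP_VCALL1(write_cr3, 0x2000UL);
            return 0;
        }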
    • [PATCH] i386: PARAVIRT: Use enums for paravirt lazy flush modi · 4e0fa856
      Authored by Jeremy Fitzhardinge
      Remove #defines, add enum for PARAVIRT_LAZY_FLUSH.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      4e0fa856
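      For illustration, the kind of change this describes (the constant names are
      from the paravirt lazy-mode interface; the exact values are incidental):

        /* Before: bare preprocessor constants.
         *   #define PARAVIRT_LAZY_NONE  0
         *   #define PARAVIRT_LAZY_MMU   1
         *   #define PARAVIRT_LAZY_CPU   2
         *   #define PARAVIRT_LAZY_FLUSH 3
         * After: a typed enum the compiler can check in switch statements. */
        enum paravirt_lazy_mode {
            PARAVIRT_LAZY_NONE,
            PARAVIRT_LAZY_MMU,
            PARAVIRT_LAZY_CPU,
            PARAVIRT_LAZY_FLUSH,
        };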
    • [PATCH] i386: PARAVIRT: add kmap_atomic_pte for mapping highpte pages · ce6234b5
      Authored by Jeremy Fitzhardinge
      Xen and VMI both have special requirements when mapping a highmem pte
      page into the kernel address space.  These can be dealt with by adding
      a new kmap_atomic_pte() function for mapping highptes, and hooking it
      into the paravirt_ops infrastructure.
      
      Xen specifically wants to map the pte page RO, so this patch exposes a
      helper function, kmap_atomic_prot, which maps the page with the
      specified page protections.
      
      This also adds a kmap_flush_unused() function to clear out the cached
      kmap mappings.  Xen needs this to clear out any potential stray RW
      mappings of pages which will become part of a pagetable.
      
      [ Zach - vmi.c will need some attention after this patch.  It wasn't
        immediately obvious to me what needs to be done. ]
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Zachary Amsden <zach@vmware.com>
      ce6234b5
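      A compressed sketch of the arrangement described above, with stubbed-out
      bodies (the real code deals with fixmap slots, TLB flushes and the
      paravirt_ops plumbing, none of which is modelled here):

        #include <stdio.h>

        struct page { unsigned long pfn; };
        typedef unsigned long pgprot_t;

        /* Illustrative protection values, not the kernel's. */
        #define PAGE_KERNEL     0x063UL  /* read-write */
        #define PAGE_KERNEL_RO  0x061UL  /* read-only  */

        /* Map a highmem page with an explicit protection (stub). */
        static void *kmap_atomic_prot(struct page *pg, pgprot_t prot)
        {
            printf("map pfn %lu with prot %#lx\n", pg->pfn, prot);
            return (void *)0xffe00000UL;   /* pretend kernel virtual address */
        }

        /* Native backend: a pte page is mapped like any other page. */
        static void *native_kmap_atomic_pte(struct page *pg)
        {
            return kmap_atomic_prot(pg, PAGE_KERNEL);
        }

        /* Xen-style backend: pte pages must never be writably mapped. */
        static void *xen_kmap_atomic_pte(struct page *pg)
        {
            return kmap_atomic_prot(pg, PAGE_KERNEL_RO);
        }

        /* The hook itself is just an indirect call through an ops slot. */
        static void *(*kmap_atomic_pte)(struct page *) = native_kmap_atomic_pte;

        int main(void)
        {
            struct page pte_page = { .pfn = 42 };
            kmap_atomic_pte(&pte_page);
            kmap_atomic_pte = xen_kmap_atomic_pte;   /* hypervisor override */
            kmap_atomic_pte(&pte_page);
            return 0;
        }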
    • [PATCH] i386: PARAVIRT: revert map_pt_hook. · a27fe809
      Authored by Jeremy Fitzhardinge
      Back out the map_pt_hook to clear the way for kmap_atomic_pte.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Zachary Amsden <zach@vmware.com>
      a27fe809
    • [PATCH] i386: PARAVIRT: add flush_tlb_others paravirt_op · d4c10477
      Authored by Jeremy Fitzhardinge
      This patch adds a pv_op for flush_tlb_others.  Linux running on native
      hardware uses cross-CPU IPIs to flush the TLB on any CPU which may
      have a particular mm's pagetable entries cached in its TLB.  This is
      inefficient in a paravirtualized environment, since the hypervisor
      knows which real CPUs actually contain cached mappings, which may be a
      small subset of a guest's VCPUs.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      d4c10477
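      The shape of the new hook is roughly the following (the argument types track
      the i386 cross-CPU flush path of the time; treat the exact signature as
      illustrative):

        /* Native code sends an IPI to every CPU in the mask; a hypervisor
         * backend can instead flush only the VCPUs that really hold cached
         * mappings for this mm. */
        typedef unsigned long cpumask_t;   /* stand-in for the kernel type */
        struct mm_struct;

        struct mmu_ops_sketch {
            void (*flush_tlb_others)(const cpumask_t *cpus,
                                     struct mm_struct *mm,
                                     unsigned long va);
        };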
    • [PATCH] i386: PARAVIRT: add common patching machinery · 63f70270
      Authored by Jeremy Fitzhardinge
      Implement the actual patching machinery.  paravirt_patch_default()
      contains the logic to automatically patch a callsite based on a few
      simple rules:
      
       - if the paravirt_op function is paravirt_nop, then patch nops
       - if the paravirt_op function is a jmp target, then jmp to it
       - if the paravirt_op function is callable and doesn't clobber too much
          for the callsite, call it directly
      
      paravirt_patch_default is suitable as a default implementation of
      paravirt_ops.patch, and will remove most of the expensive indirect calls
      in favour of either a direct call or a pile of nops.
      
      Backends may implement their own patcher, however.  There are several
      helper functions to help with this:
      
      paravirt_patch_nop	nop out a callsite
      paravirt_patch_ignore	leave the callsite as-is
      paravirt_patch_call	patch a call if the caller and callee
      			have compatible clobbers
      paravirt_patch_jmp	patch in a jmp
      paravirt_patch_insns	patch some literal instructions over
      			the callsite, if they fit
      
      This patch also implements more direct patches for the native case, so
      that when running on native hardware many common operations are
      implemented inline.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Zachary Amsden <zach@vmware.com>
      Cc: Anthony Liguori <anthony@codemonkey.ws>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      63f70270
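      The decision logic reads roughly like the following toy C (the helper
      predicates and result codes are invented purely to show the flow; the kernel
      works on raw instruction bytes and clobber masks instead):

        #include <stdbool.h>

        enum patch_result { PATCH_NOPS, PATCH_JMP, PATCH_CALL, PATCH_INDIRECT };

        static void _paravirt_nop(void) { }
        #define paravirt_nop ((void *)_paravirt_nop)

        /* Stub predicates standing in for real instruction/clobber analysis. */
        static bool is_jmp_target(void *f)              { (void)f; return false; }
        static bool clobbers_ok(void *f, unsigned clob) { (void)f; (void)clob; return true; }

        /* Mirror of the rules listed in the changelog above. */
        static enum patch_result patch_site_sketch(void *opfunc, unsigned site_clobbers)
        {
            if (opfunc == paravirt_nop)
                return PATCH_NOPS;       /* nop out the whole callsite   */
            if (is_jmp_target(opfunc))
                return PATCH_JMP;        /* patch in a direct jmp        */
            if (clobbers_ok(opfunc, site_clobbers))
                return PATCH_CALL;       /* patch in a direct call       */
            return PATCH_INDIRECT;       /* keep the indirect call as-is */
        }

        int main(void)
        {
            return patch_site_sketch(paravirt_nop, 0) == PATCH_NOPS ? 0 : 1;
        }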
    • [PATCH] i386: PARAVIRT: Document asm-i386/paravirt.h · 294688c0
      Authored by Jeremy Fitzhardinge
      Clean things up, and broadly document:
       - the paravirt_ops functions themselves
       - the patching mechanism
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      294688c0
    • [PATCH] i386: PARAVIRT: Consistently wrap paravirt ops callsites to make them patchable · f8822f42
      Authored by Jeremy Fitzhardinge
      Wrap a set of interesting paravirt_ops calls in a wrapper which makes
      the callsites available for patching.  Unfortunately this is pretty
      ugly, because there's no way to get gcc to both generate a function call
      and wrap just the callsite itself with the necessary labels.
      
      This patch supports functions with 0-4 arguments, and either void or
      returning a value.  64-bit arguments must be split into a pair of
      32-bit arguments (lower word first).  Small structures are returned in
      registers.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Zachary Amsden <zach@vmware.com>
      Cc: Anthony Liguori <anthony@codemonkey.ws>
      f8822f42
    • [PATCH] i386: PARAVIRT: Fix patch site clobbers to include return register · 42c24fa2
      Authored by Jeremy Fitzhardinge
      Fix a few clobbers to include the return register.  The clobbers set
      is the set of all registers modified (or that may be modified) by the code
      snippet, regardless of whether it was deliberate or accidental.
      
      Also, make sure that callsites which are used in contexts which don't
      allow clobbers actually save and restore all clobberable registers.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Zachary Amsden <zach@vmware.com>
      42c24fa2
    • [PATCH] i386: PARAVIRT: Use patch site IDs computed from offset in paravirt_ops structure · d5822035
      Authored by Jeremy Fitzhardinge
      Use patch type identifiers derived from the offset of the operation in
      the paravirt_ops structure.  This avoids having to maintain a separate
      enum for patch site types.
      
      Also, since the identifier is derived from the offset into
      paravirt_ops, the offset can be derived from the identifier.  This is
      used to remove replicated information in the various callsite macros,
      which has been a source of bugs in the past.
      
      This patch also drops the fused save_fl+cli operation, which doesn't
      really add much and makes things more complex - specifically because
      it breaks the 1:1 relationship between identifiers and offsets.  If
      this operation turns out to be particularly beneficial, then the right
      answer is to define a new entrypoint for it.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Zachary Amsden <zach@vmware.com>
      d5822035
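      The underlying trick is small enough to show directly; the names below are
      illustrative rather than copied from paravirt.h:

        #include <stddef.h>
        #include <stdio.h>

        /* A few representative members of a paravirt_ops-style table. */
        struct pv_ops_sketch {
            void (*irq_disable)(void);
            void (*irq_enable)(void);
            unsigned long (*read_cr2)(void);
        };

        /* A patch-site type is just the word offset of the op in the table,
         * so the offset can always be recomputed from the identifier. */
        #define PV_PATCH_ID(member) \
            (offsetof(struct pv_ops_sketch, member) / sizeof(void *))

        int main(void)
        {
            printf("irq_disable -> id %zu\n", PV_PATCH_ID(irq_disable));
            printf("irq_enable  -> id %zu\n", PV_PATCH_ID(irq_enable));
            printf("read_cr2    -> id %zu\n", PV_PATCH_ID(read_cr2));
            return 0;
        }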
    • [PATCH] i386: PARAVIRT: rename struct paravirt_patch to paravirt_patch_site for clarity · 98de032b
      Authored by Jeremy Fitzhardinge
      Rename struct paravirt_patch to paravirt_patch_site, so that it
      clearly refers to a callsite, and not the patch which may be applied
      to that callsite.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Zachary Amsden <zach@vmware.com>
      98de032b
    • [PATCH] x86: PARAVIRT: add hooks to intercept mm creation and destruction · d6dd61c8
      Authored by Jeremy Fitzhardinge
      Add hooks to allow a paravirt implementation to track the lifetime of
      an mm.  Paravirtualization requires three hooks, but only two are
      needed in common code.  They are:
      
      arch_dup_mmap, which is called when a new mmap is created at fork
      
      arch_exit_mmap, which is called when the last process reference to an
        mm is dropped, which typically happens on exit and exec.
      
      The third hook is activate_mm, which is called from the arch-specific
      activate_mm() macro/function, and so doesn't need stub versions for
      other architectures.  It's called when an mm is first used.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: linux-arch@vger.kernel.org
      Cc: James Bottomley <James.Bottomley@SteelEye.com>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      d6dd61c8
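      For architectures that do not care, the common-code stubs amount to empty
      inlines along these lines (a sketch of the generic fallback, not copied from
      the tree):

        struct mm_struct;

        /* Generic no-op fallbacks; a paravirt-capable arch overrides these to
         * tell the hypervisor when an mm comes into and goes out of use. */
        static inline void arch_dup_mmap(struct mm_struct *oldmm,
                                         struct mm_struct *mm)
        {
            /* called at fork, once the new mm has been set up */
        }

        static inline void arch_exit_mmap(struct mm_struct *mm)
        {
            /* called when the last reference to the mm is dropped (exit/exec) */
        }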
    • [PATCH] i386: PARAVIRT: Allow paravirt backend to choose kernel PMD sharing · 5311ab62
      Authored by Jeremy Fitzhardinge
      Normally when running in PAE mode, the 4th PMD maps the kernel address space,
      which can be shared among all processes (since they all need the same kernel
      mappings).
      
      Xen, however, does not allow guests to have the kernel pmd shared between page
      tables, so parameterize pgtable.c to allow both modes of operation.
      
      There are several side-effects of this.  One is that vmalloc will update the
      kernel address space mappings, and those updates need to be propagated into
      all processes if the kernel mappings are not intrinsically shared.  In the
      non-PAE case, this is done by maintaining a pgd_list of all processes; this
      list is used when all process pagetables must be updated.  pgd_list is
      threaded via otherwise unused entries in the page structure for the pgd, which
      means that the pgd must be page-sized for this to work.
      
      Normally the PAE pgd is only 4 x 64-bit entries large, but Xen requires the PAE
      pgd to be page aligned anyway, so this patch forces the pgd to be page
      aligned+sized when the kernel pmd is unshared, to accommodate both these
      requirements.
      
      Also, since there may be several distinct kernel pmds (if the user/kernel
      split is below 3G), there's no point in allocating them from a slab cache;
      they're just allocated with get_free_page and initialized appropriately.  (Of
      course they could be cached if there is just a single kernel pmd - which is the
      default with a 3G user/kernel split - but it doesn't seem worthwhile to add
      yet another case into this code.)
      
      [ Many thanks to wli for review comments. ]
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: William Lee Irwin III <wli@holomorphy.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Zachary Amsden <zach@vmware.com>
      Cc: Christoph Lameter <clameter@sgi.com>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      5311ab62
    • [PATCH] i386: PARAVIRT: Allocate a fixmap slot · 90caccb9
      Authored by Jeremy Fitzhardinge
      Allocate a fixmap slot for use by a paravirt_ops implementation.  This
      is intended for early-boot bootstrap mappings.  Once the zones and
      allocator have been set up, it would be better to use get_vm_area() to
      allocate some virtual space.
      
      Xen uses this to map the hypervisor's shared info page, which doesn't
      have a pseudo-physical page number, and therefore can't be mapped
      ordinarily.  It is needed early because it contains the vcpu state,
      including the interrupt mask.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      90caccb9
    • [PATCH] i386: PARAVIRT: Hooks to set up initial pagetable · b239fb25
      Authored by Jeremy Fitzhardinge
      This patch introduces paravirt_ops hooks to control how the kernel's
      initial pagetable is set up.
      
      In the case of a native boot, the very early bootstrap code creates a
      simple non-PAE pagetable to map the kernel and physical memory.  When
      the VM subsystem is initialized, it creates a proper pagetable which
      respects the PAE mode, large pages, etc.
      
      When booting under a hypervisor, there are many possibilities for what
      paging environment the hypervisor establishes for the guest kernel, so
      the construction of the kernel's pagetable depends on the hypervisor.
      
      In the case of Xen, the hypervisor boots the kernel with a fully
      constructed pagetable, which is already using PAE if necessary.  Also,
      Xen requires particular care when constructing pagetables to make sure
      all pagetables are always mapped read-only.
      
      In order to make this easier, the kernel's initial pagetable construction
      has been changed to only allocate and initialize a pagetable page if
      there's no page already present in the pagetable.  This allows the Xen
      paravirt backend to make a copy of the hypervisor-provided pagetable,
      allowing the kernel to establish any more mappings it needs while
      keeping the existing ones.
      
      A slightly subtle point which is worth highlighting here is that Xen
      requires all kernel mappings to share the same pte_t pages between all
      pagetables, so that updating a kernel page's mapping in one pagetable
      is reflected in all other pagetables.  This makes it possible to
      allocate a page and attach it to a pagetable without having to
      explicitly enumerate that page's mapping in all pagetables.
      
      And:
      
      +From: "Eric W. Biederman" <ebiederm@xmission.com>
      
      If we don't set the leaf page table entries it is quite possible that
      will inherit and incorrect page table entry from the initial boot
      page table setup in head.S.  So we need to redo the effort here,
      so we pick up PSE, PGE and the like.
      
      Hypervisors like Xen require that their page tables be read-only,
      which is slightly incompatible with our low identity mappings, however
      I discussed this with Jeremy he has modified the Xen early set_pte
      function to avoid problems in this area.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Acked-by: William Irwin <bill.irwin@oracle.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      b239fb25
    • [PATCH] i386: PARAVIRT: Add pagetable accessors to pack and unpack pagetable entries · 3dc494e8
      Authored by Jeremy Fitzhardinge
      Add a set of accessors to pack, unpack and modify page table entries
      (at all levels).  This allows a paravirt implementation to control the
      contents of pgd/pmd/pte entries.  For example, Xen uses this to
      convert the (pseudo-)physical address into a machine address when
      populating a pagetable entry, and to convert back to a pseudo-physical
      address when an entry is read.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      3dc494e8
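      The idea in miniature: every construction and inspection of an entry goes
      through a pair of hooks, so a backend can translate addresses on the way in
      and out (all names, and the fake phys/machine offset, are invented for this
      sketch):

        #include <stdio.h>

        typedef struct { unsigned long pte; } pte_t;   /* opaque entry type */

        /* The two hooks: pack a value into an entry, unpack it back out. */
        static pte_t (*make_pte)(unsigned long val);
        static unsigned long (*pte_value)(pte_t pte);

        /* Native backend: the entry is the value; nothing to translate. */
        static pte_t native_make_pte(unsigned long val) { return (pte_t){ val }; }
        static unsigned long native_pte_val(pte_t pte)  { return pte.pte; }

        /* Xen-style backend: swap pseudo-physical for machine frames (faked
         * here with a constant offset standing in for the real p2m/m2p maps). */
        #define FAKE_P2M_OFFSET 0x100000UL
        static pte_t xen_make_pte(unsigned long val)  { return (pte_t){ val + FAKE_P2M_OFFSET }; }
        static unsigned long xen_pte_val(pte_t pte)   { return pte.pte - FAKE_P2M_OFFSET; }

        int main(void)
        {
            make_pte = native_make_pte;            /* native: identity        */
            pte_value = native_pte_val;
            pte_t n = make_pte(0x1234);
            printf("native stored: %#lx\n", pte_value(n));

            make_pte = xen_make_pte;               /* hypervisor override     */
            pte_value = xen_pte_val;
            pte_t e = make_pte(0x1234);            /* pseudo-physical in...   */
            printf("xen stored:    %#lx\n", e.pte);  /* ...machine address out */
            printf("xen read back: %#lx\n", pte_value(e));
            return 0;
        }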
    • [PATCH] i386: PARAVIRT: use paravirt_nop to consistently mark no-op operations · 45876233
      Authored by Jeremy Fitzhardinge
      Add a _paravirt_nop function for use as a stub for no-op operations,
      and a paravirt_nop #define for a void * version of it, to make it
      easier to use (since all its uses are as a void *).
      
      This is useful to allow the patcher to automatically identify noop
      operations so it can simply nop out the callsite.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      [mingo] but only as a cleanup of the current open-coded (void *) casts.
      My problem with this is that it loses the types.  Not that there is much
      to check for, but still, this adds some assumptions about what function
      calls look like.
      45876233
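      The whole mechanism is tiny; roughly (per the description, with the cast
      being exactly the type loss Ingo is objecting to):

        /* One real, empty function whose address the patcher can recognise... */
        void _paravirt_nop(void)
        {
        }

        /* ...and a void * alias so it can be dropped into any ops slot.  The
         * cast is what throws away the function's type information. */
        #define paravirt_nop ((void *)_paravirt_nop)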
    • [PATCH] i386: i386 separate hardware-defined TSS from Linux additions · a75c54f9
      Authored by Rusty Russell
      On Thu, 2007-03-29 at 13:16 +0200, Andi Kleen wrote:
      > Please clean it up properly with two structs.
      
      Not sure about this, now I've done it.  Running it here.
      
      If you like it, I can do x86-64 as well.
      
      ==
      lguest defines its own TSS struct because the "struct tss_struct"
      contains linux-specific additions.  Andi asked me to split the struct
      in processor.h.
      
      Unfortunately it makes usage a little awkward.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andi Kleen <ak@suse.de>
      a75c54f9
    • [PATCH] i386: Remove smp_alt_instructions · d0175ab6
      Authored by Jeremy Fitzhardinge
      The .smp_altinstructions section and its corresponding symbols are
      completely unused, so remove them.
      
      Also, remove stray #ifdef __KENREL__ in asm-i386/alternative.h
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Andi Kleen <ak@suse.de>
      d0175ab6
    • [PATCH] x86: Clean up x86 control register and MSR macros (corrected) · 4bc5aa91
      Authored by H. Peter Anvin
      This patch is based on Rusty's recent cleanup of the EFLAGS-related
      macros; it extends the same kind of cleanup to control registers and
      MSRs.
      
      It also unifies these between i386 and x86-64; at least with regards
      to MSRs, the two had definitely gotten out of sync.
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      4bc5aa91
    • [PATCH] x86: Don't use MWAIT on AMD Family 10 · f039b754
      Authored by Andi Kleen
      It doesn't put the CPU into deeper sleep states, so it's better to use the
      standard idle loop to save power.  But allow it to be re-enabled for
      benchmarking.
      
      I also removed the obsolete idle=halt on i386
      
      Cc: andreas.herrmann@amd.com
      Signed-off-by: Andi Kleen <ak@suse.de>
      f039b754
    • [PATCH] i386: Make COMPAT_VDSO runtime selectable. · 1dbf527c
      Authored by Jeremy Fitzhardinge
      Now that relocation of the VDSO for COMPAT_VDSO users is done at
      runtime rather than compile time, it is possible to enable/disable
      compat mode at runtime.
      
      This patch allows you to enable COMPAT_VDSO mode with "vdso=2" on the
      kernel command line, or via sysctl.  (Switching on a running system
      shouldn't be done lightly; any process which was relying on the compat
      VDSO will be upset if it goes away.)
      
      The COMPAT_VDSO config option still exists, but if enabled it just
      makes vdso_enabled default to VDSO_COMPAT.
      
      From: Hugh Dickins <hugh@veritas.com>
      
      Fix oops from i386-make-compat_vdso-runtime-selectable.patch.
      
      Even mingetty at system startup finds it easy to trigger an oops
      while reading /proc/PID/maps: though it has a good hold on the mm
      itself, that cannot stop exit_mm() from resetting tsk->mm to NULL.
      
      (It is usually show_map()'s call to get_gate_vma() which oopses,
      and I expect we could change that to check priv->tail_vma instead;
      but no matter, even m_start()'s call just after get_task_mm() is racy.)
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Zachary Amsden <zach@vmware.com>
      Cc: "Jan Beulich" <JBeulich@novell.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Roland McGrath <roland@redhat.com>
      1dbf527c
    • [PATCH] i386: Relocate VDSO ELF headers to match mapped location with COMPAT_VDSO · d4f7a2c1
      Authored by Jeremy Fitzhardinge
      Some versions of libc can't deal with a VDSO which doesn't have its
      ELF headers matching its mapped address.  COMPAT_VDSO maps the VDSO at
      a specific system-wide fixed address.  Previously this was all done at
      build time, on the grounds that the fixed VDSO address is always at
      the top of the address space.  However, a hypervisor may reserve some
      of that address space, pushing the fixmap address down.
      
      This patch does the adjustment dynamically at runtime, depending on
      the runtime location of the VDSO fixmap.
      
      [ Patch has been through several hands: Jan Beulich wrote the original
        version; Zach reworked it, and Jeremy converted it to relocate phdrs
        as well as sections. ]
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Zachary Amsden <zach@vmware.com>
      Cc: "Jan Beulich" <JBeulich@novell.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Roland McGrath <roland@redhat.com>
      d4f7a2c1
    • [PATCH] i386: clean up identify_cpu · a6c4e076
      Authored by Jeremy Fitzhardinge
      identify_cpu() is used to identify both the boot CPU and secondary
      CPUs, but it performs some actions which only apply to the boot CPU.
      Those functions are therefore really __init functions, but because
      they're called by identify_cpu(), they must be marked __cpuinit.
      
      This patch splits identify_cpu() into identify_boot_cpu() and
      identify_secondary_cpu(), and calls the appropriate init functions
      from each.  Also, identify_boot_cpu() and all the functions it
      dominates are marked __init.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      a6c4e076
    • [PATCH] i386: Clean up asm-i386/bugs.h · 1353ebb4
      Authored by Jeremy Fitzhardinge
      Most of asm-i386/bugs.h is code which should be in a C file, so put it there.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      1353ebb4
    • [PATCH] x86: fix amd64-agp aperture validation · b92e9fac
      Authored by Jan Beulich
      Under CONFIG_DISCONTIGMEM, assuming that !pfn_valid() for one pfn implies that
      all subsequent pfns are also invalid is wrong.  Thus replace this by
      explicitly checking against the E820 map.
      
      AK: make e820 on x86-64 not initdata
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Acked-by: Mark Langsdorf <mark.langsdorf@amd.com>
      b92e9fac
    • [PATCH] i386: Add machine_ops interface to abstract halting and rebooting · 07f3331c
      Authored by Jeremy Fitzhardinge
      machine_ops is an interface for the machine_* functions defined in
      <linux/reboot.h>.  This is intended to allow hypervisors to intercept
      the reboot process, but it could be used to implement other x86
      subarchitecture reboots.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      07f3331c
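      An interface of this shape (the member list is reconstructed from the
      machine_* entry points in <linux/reboot.h> and should be read as a sketch
      rather than the exact struct):

        struct pt_regs;

        /* One function pointer per machine_* entry point, so a hypervisor
         * port can route reboot/halt/power-off through its own hypercalls. */
        struct machine_ops_sketch {
            void (*restart)(char *cmd);
            void (*halt)(void);
            void (*power_off)(void);
            void (*shutdown)(void);
            void (*crash_shutdown)(struct pt_regs *regs);
            void (*emergency_restart)(void);
        };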
    • [PATCH] i386: Add smp_ops interface · 01a2f435
      Authored by Jeremy Fitzhardinge
      Add a smp_ops interface.  This abstracts the API defined by
      <linux/smp.h> for use within arch/i386.  The primary intent is that it
      be used by a paravirtualizing hypervisor to implement SMP, but it
      could also be used by non-APIC-using sub-architectures.
      
      This is related to CONFIG_PARAVIRT, but is implemented unconditionally
      since it is simpler that way and not a highly performance-sensitive
      interface.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      01a2f435
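      Again a table of function pointers; the members below mirror the SMP entry
      points named in <linux/smp.h> at the time, but the exact list and types are
      illustrative:

        typedef unsigned long cpumask_sketch_t;   /* stand-in for cpumask_t */

        struct smp_ops_sketch {
            /* boot-time bring-up */
            void (*smp_prepare_boot_cpu)(void);
            void (*smp_prepare_cpus)(unsigned int max_cpus);
            int  (*cpu_up)(unsigned int cpu);
            void (*smp_cpus_done)(unsigned int max_cpus);

            /* runtime cross-CPU operations */
            void (*smp_send_stop)(void);
            void (*smp_send_reschedule)(int cpu);
            int  (*smp_call_function_mask)(cpumask_sketch_t mask,
                                           void (*func)(void *info),
                                           void *info, int wait);
        };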
    • [PATCH] i386: cleanup GDT Access · 4fbb5968
      Authored by Rusty Russell
      Now that we have an explicit per-cpu GDT variable, we don't need to keep the
      descriptors around to use them to find the GDT: expose cpu_gdt directly.
      
      We could go further and make load_gdt() pack the descriptor for us, or even
      assume it means "load the current cpu's GDT", which is what it always does.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      4fbb5968
    • [PATCH] i386: Use X86_EFLAGS_IF in irqflags.h. · b4531e86
      Authored by Andi Kleen
      Move X86_EFLAGS_IF et al out to a new header, processor-flags.h, so we
      can include it from irqflags.h and use it in raw_irqs_disabled_flags().
      
      As a side-effect, we can now use these flags in .S files.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andi Kleen <ak@suse.de>
      b4531e86
    • [PATCH] x86: Improve handling of kernel mappings in change_page_attr · d01ad8dd
      Authored by Jan Beulich
      Fix various broken corner cases in i386 and x86-64 change_page_attr.
      
      AK: split off from "tighten kernel image access rights"
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      d01ad8dd
    • [PATCH] i386: rationalize paravirt wrappers · 90a0a06a
      Authored by Rusty Russell
      paravirt.c used to implement native versions of all low-level
      functions.  Far cleaner is to have the native versions exposed in the
      headers and as inline native_XXX, and if !CONFIG_PARAVIRT, then simply
      #define XXX native_XXX.
      
      There are several nice side effects:
      
      1) write_dt_entry() now takes the correct "struct Xgt_desc_struct *"
         not "void *".
      
      2) load_TLS is reintroduced to the for loop, not manually unrolled
         with a #error in case the bounds ever change.
      
      3) Macros become inlines, with type checking.
      
      4) Access to the native versions is trivial for KVM, lguest, Xen and
         others who might want it.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Avi Kivity <avi@qumranet.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      90a0a06a
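      The pattern in outline, as a header-style sketch (write_cr3 is just a
      representative op, and the body is stubbed out):

        /* The native version is always present, with a real prototype... */
        static inline void native_write_cr3(unsigned long val)
        {
            /* privileged mov-to-%cr3 would go here */
            (void)val;
        }

        #ifdef CONFIG_PARAVIRT
        /* ...paravirt kernels route write_cr3() through the ops table... */
        #else
        /* ...and non-paravirt kernels simply alias the native name. */
        #define write_cr3(x) native_write_cr3(x)
        #endif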
    • [PATCH] i386: clean up cpu_init() · d2cbcc49
      Authored by Rusty Russell
      We now have cpu_init() and secondary_cpu_init() doing nothing but calling
      _cpu_init() with the same arguments.  Rename _cpu_init() to cpu_init() and use
      it as a replacement for secondary_cpu_init().
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      d2cbcc49
    • [PATCH] i386: Use per-cpu GDT immediately upon boot · bf504672
      Authored by Rusty Russell
      Now that we are no longer dynamically allocating the GDT, we don't need the
      "cpu_gdt_table" at all: we can switch straight from "boot_gdt_table" to the
      per-cpu GDT.  This means initializing the cpu_gdt array in C.
      
      The boot CPU uses the per-cpu var directly, then in smp_prepare_cpus() it
      switches to the per-cpu copy just allocated.  For secondary CPUs, the
      early_gdt_descr is set to point directly to their per-cpu copy.
      
      For UP the code is very simple: it keeps using the "per-cpu" GDT as per SMP,
      but we never have to move.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      bf504672
    • [PATCH] i386: Use per-cpu variables for GDT, PDA · ae1ee11b
      Authored by Rusty Russell
      Allocating PDA and GDT at boot is a pain.  Using simple per-cpu variables adds
      happiness (although we need the GDT page-aligned for Xen, which we do in a
      followup patch).
      
      [akpm@linux-foundation.org: build fix]
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      ae1ee11b
    • [PATCH] i386: Allow i386 crash kernels to handle x86_64 dumps · 79e03011
      Authored by Ian Campbell
      The specific case I am encountering is kdump under Xen with a 64 bit
      hypervisor and 32 bit kernel/userspace.  The dump created is 64 bit due to
      the hypervisor but the dump kernel is 32 bit for maximum compatibility.
      
      It's possibly less likely to be useful in a purely native scenario but I
      see no reason to disallow it.
      
      [akpm@linux-foundation.org: build fix]
      Signed-off-by: Ian Campbell <ian.campbell@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Acked-by: Vivek Goyal <vgoyal@in.ibm.com>
      Cc: Horms <horms@verge.net.au>
      Cc: Magnus Damm <magnus.damm@gmail.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      79e03011
    • [PATCH] i386: Initialize esp0 properly all the time · 692174b9
      Authored by Rusty Russell
      Whenever we schedule, __switch_to calls load_esp0 which does:
      
      	tss->esp0 = thread->esp0;
      
      This is never initialized for the initial thread (ie "swapper"), so when we're
      scheduling that, we end up setting esp0 to 0.  This is fine: the swapper never
      leaves ring 0, so this field is never used.
      
      lguest, however, gets upset that we're trying to use an unmapped page as our
      kernel stack.  Rather than work around it there, let's initialize it.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      692174b9
    • [PATCH] x86: Log reason why TSC was marked unstable · 5a90cf20
      Authored by john stultz
      Change mark_tsc_unstable() so it takes a string argument, which holds the
      reason the TSC was marked unstable.
      
      This is then displayed the first time mark_tsc_unstable is called.
      
      This should help us better debug why the TSC was marked unstable on certain
      systems and allow us to make sure we're not being overly paranoid when
      throwing out this troublesome clocksource.
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Andi Kleen <ak@suse.de>
      5a90cf20
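      The interface change amounts to the following (the signature matches the
      description; the printed text and the example callers are illustrative):

        #include <stdio.h>

        /* Old: void mark_tsc_unstable(void);
         * New: the caller says why, and the reason is reported once. */
        static int tsc_unstable;

        static void mark_tsc_unstable(const char *reason)
        {
            if (!tsc_unstable) {
                tsc_unstable = 1;
                printf("Marking TSC unstable due to: %s\n", reason);
            }
        }

        int main(void)
        {
            mark_tsc_unstable("cpufreq changes");     /* reported            */
            mark_tsc_unstable("TSC halts in idle");   /* silent: already set */
            return 0;
        }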
    • [PATCH] i386: modpost apic related warning fixes · 1833d6bc
      Authored by Vivek Goyal
      o Modpost generates warnings for i386 if compiled with CONFIG_RELOCATABLE=y
      
      WARNING: vmlinux - Section mismatch: reference to .init.text:find_unisys_acpi_oem_table from .text between 'acpi_madt_oem_check' (at offset 0xc0101eda) and 'enable_apic_mode'
      WARNING: vmlinux - Section mismatch: reference to .init.text:acpi_get_table_header_early from .text between 'acpi_madt_oem_check' (at offset 0xc0101ef0) and 'enable_apic_mode'
      WARNING: vmlinux - Section mismatch: reference to .init.text:parse_unisys_oem from .text between 'acpi_madt_oem_check' (at offset 0xc0101f2e) and 'enable_apic_mode'
      WARNING: vmlinux - Section mismatch: reference to .init.text:setup_unisys from .text between 'acpi_madt_oem_check' (at offset 0xc0101f37) and 'enable_apic_mode'
      WARNING: vmlinux - Section mismatch: reference to .init.text:parse_unisys_oem from .text between 'mps_oem_check' (at offset 0xc0101ec7) and 'acpi_madt_oem_check'
      WARNING: vmlinux - Section mismatch: reference to .init.text:es7000_sw_apic from .text between 'enable_apic_mode' (at offset 0xc0101f48) and 'check_apicid_present'
      
      o Some functions which are inline (acpi_madt_oem_check) are not inlined by
        the compiler because they are accessed through function pointers.  These
        functions end up in the .text section, and they in turn access __init
        functions, hence modpost generates warnings.
      
      o Do not inline acpi_madt_oem_check; instead, make it __init.
      Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Len Brown <lenb@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      1833d6bc