1. 23 Jul, 2007 1 commit
    • A
      x86: Fix alternatives and kprobes to remap write-protected kernel text · 19d36ccd
      Andi Kleen committed
      Reenable kprobes and alternative patching when the kernel text is write
      protected by DEBUG_RODATA.
      
      Add a general utility function to change write protected text.  The new
      function remaps the code using vmap to write it and takes care of CPU
      synchronization.  It also does CLFLUSH to make icache recovery faster.
      
      There are some limitations on when the function can be used, see the
      comment.
      
      This is a newer version that also changes the paravirt_ops code.
      text_poke also supports multi byte patching now.
      
      Contains bug fixes from Zach Amsden and suggestions from Mathieu
      Desnoyers.
      
      Cc: Jan Beulich <jbeulich@novell.com>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org>
      Cc: Zach Amsden <zach@vmware.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      19d36ccd
  2. 18 Jul, 2007 2 commits
    • J
      Add a sched_clock paravirt_op · 688340ea
      Jeremy Fitzhardinge committed
      The tsc-based get_scheduled_cycles interface is not a good match for
      Xen's runstate accounting, which reports everything in nanoseconds.
      
      This patch replaces this interface with a sched_clock interface, which
      matches both Xen and VMI's requirements.
      
      In order to do this, we:
         1. replace get_scheduled_cycles with sched_clock
         2. hoist cycles_2_ns into a common header
         3. update vmi accordingly
      
      One thing to note: because sched_clock is implemented as a weak
      function in kernel/sched.c, we must define a real function in order to
      override this weak binding.  This means the usual paravirt_ops
      technique of using an inline function won't work in this case.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Zachary Amsden <zach@vmware.com>
      Cc: Dan Hecht <dhecht@vmware.com>
      Cc: john stultz <johnstul@us.ibm.com>
      688340ea
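The cycles-to-nanoseconds conversion that this commit hoists into a common header can be sketched as fixed-point arithmetic. This is a minimal user-space sketch, assuming a 10-bit scale factor and a caller-supplied CPU frequency in kHz; the `struct cyc2ns` holder and `cyc2ns_init` are illustrative names, not the kernel's actual header.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed precision: the scale carries 10 fractional bits. */
#define CYC2NS_SCALE_FACTOR 10

/* Hypothetical holder for the precomputed scale (ns per cycle << 10). */
struct cyc2ns {
    uint64_t scale;
};

/* Precompute nanoseconds-per-cycle as a fixed-point value. */
static void cyc2ns_init(struct cyc2ns *c, uint64_t cpu_khz)
{
    assert(cpu_khz != 0);
    /* 1,000,000 ns per millisecond divided by cycles per millisecond. */
    c->scale = (UINT64_C(1000000) << CYC2NS_SCALE_FACTOR) / cpu_khz;
}

/* Convert a cycle count to nanoseconds with one multiply and one shift. */
static uint64_t cycles_2_ns(const struct cyc2ns *c, uint64_t cycles)
{
    return (cycles * c->scale) >> CYC2NS_SCALE_FACTOR;
}
```

With a 2 GHz clock (`cpu_khz = 2000000`) the scale is 512, so 2000 cycles convert to 1000 ns. A backend whose clock already reports nanoseconds would bypass this conversion entirely, which is the mismatch the commit message describes.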
    • J
      paravirt: helper to disable all IO space · d572929c
      Jeremy Fitzhardinge committed
      In a virtual environment, device drivers such as legacy IDE will waste
      quite a lot of time probing for their devices which will never appear.
      This helper function allows a paravirt implementation to lay claim to
      the whole iomem and ioport space, thereby disabling all device drivers
      trying to claim IO resources.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Chris Wright <chrisw@sous-sol.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      d572929c
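The effect of claiming the whole IO space can be illustrated with a toy resource table: once one owner holds the entire range, every later request overlaps it and fails. This is a sketch only; `struct resource_space`, `request_region` as written here, and `disable_io_space` are hypothetical stand-ins, not the kernel's resource API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CLAIMS 8

/* One claimed range, inclusive at both ends. */
struct region {
    uint64_t start, end;
    const char *owner;
};

struct resource_space {
    struct region claims[MAX_CLAIMS];
    size_t nclaims;
};

/* Returns 0 on success, -1 if the range overlaps a claim or the table is full. */
static int request_region(struct resource_space *s, uint64_t start,
                          uint64_t end, const char *owner)
{
    assert(start <= end);
    for (size_t i = 0; i < s->nclaims; i++)
        if (start <= s->claims[i].end && s->claims[i].start <= end)
            return -1;
    if (s->nclaims == MAX_CLAIMS)
        return -1;
    s->claims[s->nclaims++] = (struct region){ start, end, owner };
    return 0;
}

/* Hypothetical helper: claim everything so that driver probes fail fast. */
static int disable_io_space(struct resource_space *s)
{
    return request_region(s, 0, UINT64_MAX, "paravirt");
}
```

After `disable_io_space()` succeeds, a later request such as ports 0x1f0 to 0x1f7 for a legacy IDE driver returns -1 immediately instead of leading to a slow device probe.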
  3. 11 May, 2007 1 commit
    • E
      Revert "[PATCH] paravirt: Add startup infrastructure for paravirtualization" · 5a18c92a
      Eric W. Biederman committed
      This reverts commit c9ccf30d.
      
      Entering the kernel at startup_32 without passing our real mode data in
      %esi, and without guaranteeing that physical and virtual addresses are
      identity mapped makes head.S impossible to maintain.
      
      The only user of this infrastructure is lguest which is not merged so
      nothing we currently support will break by removing this over designed
      nightmare, and only the pending lguest patches will be affected.  The
      pending Xen patches have a different entry point that they use.
      
      We are currently discussing what Xen and lguest need to do to boot the
      kernel in a more normal fashion so using startup_32 in this weird manner is
      clearly not their long term direction.
      
      So let's remove this code in head.S before it causes brain damage to people
      trying to maintain head.S
      
      Cc: Chris Wright <chrisw@sous-sol.org>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Zachary Amsden <zach@vmware.com>
      CC: H. Peter Anvin <hpa@zytor.com>
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      5a18c92a
  4. 03 May, 2007 14 commits
    • J
      [PATCH] i386: PARAVIRT: fix startup_ipi_hook config dependency · 0260c196
      Jeremy Fitzhardinge committed
      startup_ipi_hook depends on CONFIG_X86_LOCAL_APIC, so move it to the
      right part of the paravirt_ops initialization.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      0260c196
    • A
      [PATCH] i386: PARAVIRT: Export paravirt_ops for non GPL modules too · 21564fd2
      Andi Kleen committed
      Otherwise non GPL modules cannot even do basic operations
      like disabling interrupts anymore, which would be excessive.
      
      Longer term, we should split the single structure up into
      internal and external symbols and not export the internal
      ones at all.
      Signed-off-by: Andi Kleen <ak@suse.de>
      21564fd2
    • J
      [PATCH] i386: PARAVIRT: drop unused ptep_get_and_clear · 4cdd9c89
      Jeremy Fitzhardinge committed
      In shadow mode hypervisors, ptep_get_and_clear achieves the desired
      purpose of keeping the shadows in sync by issuing a native_get_and_clear,
      followed by a call to pte_update, which indicates the PTE has been
      modified.
      
      Direct mode hypervisors (Xen) have no need for this anyway, and will trap
      the update using writable pagetables.
      
      This means no hypervisor makes use of ptep_get_and_clear; there is no
      reason to have it in the paravirt-ops structure.  Change confusing
      terminology about raw vs. native functions into consistent use of
      native_pte_xxx for operations which do not invoke paravirt-ops.
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      4cdd9c89
    • J
      [PATCH] i386: PARAVIRT: add kmap_atomic_pte for mapping highpte pages · ce6234b5
      Jeremy Fitzhardinge committed
      Xen and VMI both have special requirements when mapping a highmem pte
      page into the kernel address space.  These can be dealt with by adding
      a new kmap_atomic_pte() function for mapping highptes, and hooking it
      into the paravirt_ops infrastructure.
      
      Xen specifically wants to map the pte page RO, so this patch exposes a
      helper function, kmap_atomic_prot, which maps the page with the
      specified page protections.
      
      This also adds a kmap_flush_unused() function to clear out the cached
      kmap mappings.  Xen needs this to clear out any potential stray RW
      mappings of pages which will become part of a pagetable.
      
      [ Zach - vmi.c will need some attention after this patch.  It wasn't
        immediately obvious to me what needs to be done. ]
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Zachary Amsden <zach@vmware.com>
      ce6234b5
    • J
      [PATCH] i386: PARAVIRT: revert map_pt_hook. · a27fe809
      Jeremy Fitzhardinge committed
      Back out the map_pt_hook to clear the way for kmap_atomic_pte.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Zachary Amsden <zach@vmware.com>
      a27fe809
    • J
      [PATCH] i386: PARAVIRT: add flush_tlb_others paravirt_op · d4c10477
      Jeremy Fitzhardinge committed
      This patch adds a pv_op for flush_tlb_others.  Linux running on native
      hardware uses cross-CPU IPIs to flush the TLB on any CPU which may
      have a particular mm's pagetable entries cached in its TLB.  This is
      inefficient in a paravirtualized environment, since the hypervisor
      knows which real CPUs actually contain cached mappings, which may be a
      small subset of a guest's VCPUs.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      d4c10477
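The difference between the native path and a hypervisor-aware path for flush_tlb_others can be sketched with a function pointer in an ops table. Everything here is a toy model: the bitmask type, the counters, and the `example_hv_flush_tlb_others` backend (which assumes the hypervisor knows which VCPUs currently sit on real CPUs) are illustrative assumptions, not code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t example_cpumask_t;   /* one bit per (V)CPU, toy model */

static unsigned ipis_sent;            /* IPIs the native path would send */
static unsigned example_real_flushes; /* flushes the toy hypervisor performs */

/* Hypothetical hypervisor knowledge: only VCPUs 0 and 1 are on real CPUs. */
static example_cpumask_t example_running_vcpus = 0x3;

static unsigned popcount32(uint32_t v)
{
    unsigned n = 0;
    for (; v; v &= v - 1)
        n++;
    return n;
}

/* Native: one cross-CPU IPI per CPU in the mask. */
static void native_flush_tlb_others(example_cpumask_t mask)
{
    ipis_sent += popcount32(mask);
}

/* Toy hypervisor backend: flush only VCPUs that can hold stale entries. */
static void example_hv_flush_tlb_others(example_cpumask_t mask)
{
    example_real_flushes += popcount32(mask & example_running_vcpus);
}

struct example_ops {
    void (*flush_tlb_others)(example_cpumask_t mask);
};

static struct example_ops example_ops = { native_flush_tlb_others };

/* Callers go through the ops table, so a backend can replace the method. */
static void flush_tlb_others(example_cpumask_t mask)
{
    assert(example_ops.flush_tlb_others != NULL);
    example_ops.flush_tlb_others(mask);
}
```

Flushing CPUs 1 to 3 (mask 0xE) costs three IPIs on the native path, while the toy backend only touches VCPU 1, the single one in the mask that is actually running.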
    • J
      [PATCH] i386: PARAVIRT: add common patching machinery · 63f70270
      Jeremy Fitzhardinge committed
      Implement the actual patching machinery.  paravirt_patch_default()
      contains the logic to automatically patch a callsite based on a few
      simple rules:
      
       - if the paravirt_op function is paravirt_nop, then patch nops
       - if the paravirt_op function is a jmp target, then jmp to it
       - if the paravirt_op function is callable and doesn't clobber too much
          for the callsite, call it directly
      
      paravirt_patch_default is suitable as a default implementation of
      paravirt_ops.patch, and will remove most of the expensive indirect calls
      in favour of either a direct call or a pile of nops.
      
      Backends may implement their own patcher, however.  There are several
      helper functions to help with this:
      
      paravirt_patch_nop	nop out a callsite
      paravirt_patch_ignore	leave the callsite as-is
      paravirt_patch_call	patch a call if the caller and callee
      			have compatible clobbers
      paravirt_patch_jmp	patch in a jmp
      paravirt_patch_insns	patch some literal instructions over
      			the callsite, if they fit
      
      This patch also implements more direct patches for the native case, so
      that when running on native hardware many common operations are
      implemented inline.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Zachary Amsden <zach@vmware.com>
      Cc: Anthony Liguori <anthony@codemonkey.ws>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      63f70270
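The three rules listed for paravirt_patch_default amount to a small decision function. The sketch below models only the decision, not the instruction rewriting; the `op_info` descriptor, the clobber bits, and the enum names are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum patch_action {
    PATCH_NOPS,           /* op is a no-op: overwrite the callsite with nops */
    PATCH_JMP,            /* op is a jmp target: patch in a direct jmp */
    PATCH_CALL,           /* op is callable with compatible clobbers */
    PATCH_KEEP_INDIRECT   /* none of the rules apply: leave the callsite */
};

/* Illustrative clobber bits; the real register sets are not modelled. */
#define CLBR_EAX 1u
#define CLBR_ECX 2u
#define CLBR_EDX 4u

static int example_calls;
static void example_nop(void) {}
static void example_op(void) { example_calls++; }

/* Hypothetical description of one operation's current implementation. */
struct op_info {
    void (*func)(void);   /* function installed for this operation */
    int is_jmp_target;    /* nonzero if control never returns to the site */
    unsigned clobbers;    /* registers the function may clobber */
};

/* Apply the rules in the order the commit message lists them. */
static enum patch_action choose_patch(const struct op_info *op,
                                      void (*nop)(void),
                                      unsigned site_clobbers)
{
    assert(op != NULL);
    if (op->func == nop)
        return PATCH_NOPS;
    if (op->is_jmp_target)
        return PATCH_JMP;
    if ((op->clobbers & ~site_clobbers) == 0)
        return PATCH_CALL;
    return PATCH_KEEP_INDIRECT;
}
```

A callee that clobbers only EAX at a site that tolerates EAX being clobbered yields `PATCH_CALL`; a callee that also clobbers EDX at the same site falls through to `PATCH_KEEP_INDIRECT`.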
    • J
      [PATCH] i386: PARAVIRT: Use patch site IDs computed from offset in paravirt_ops structure · d5822035
      Jeremy Fitzhardinge committed
      Use patch type identifiers derived from the offset of the operation in
      the paravirt_ops structure.  This avoids having to maintain a separate
      enum for patch site types.
      
      Also, since the identifier is derived from the offset into
      paravirt_ops, the offset can be derived from the identifier.  This is
      used to remove replicated information in the various callsite macros,
      which has been a source of bugs in the past.
      
      This patch also drops the fused save_fl+cli operation, which doesn't
      really add much and makes things more complex - specifically because
      it breaks the 1:1 relationship between identifiers and offsets.  If
      this operation turns out to be particularly beneficial, then the right
      answer is to define a new entrypoint for it.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Zachary Amsden <zach@vmware.com>
      d5822035
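Deriving an identifier from a member's offset, and recovering the member from the identifier, can be shown with `offsetof`. The `example_ops` structure and the `PATCH_ID` and `op_slot` names are hypothetical; the sketch assumes every member is a function pointer of the same size as `void *`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ops table: every member is a pointer-sized function pointer. */
struct example_ops {
    void (*irq_disable)(void);
    void (*irq_enable)(void);
    unsigned long (*save_fl)(void);
    void (*restore_fl)(unsigned long);
};

/* The sketch relies on function pointers and void * having the same size. */
_Static_assert(sizeof(void (*)(void)) == sizeof(void *),
               "function pointers must be pointer-sized for this sketch");

/* Identifier = index of the member within the table. */
#define PATCH_ID(field) (offsetof(struct example_ops, field) / sizeof(void *))

/* Inverse mapping: recover the member's slot from its identifier. */
static void *op_slot(struct example_ops *ops, size_t id)
{
    assert(id < sizeof(*ops) / sizeof(void *));
    return (char *)ops + id * sizeof(void *);
}
```

Because the mapping is 1:1, no separate enum has to be kept in sync with the structure: adding a member automatically creates its identifier.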
    • J
      [PATCH] x86: PARAVIRT: add hooks to intercept mm creation and destruction · d6dd61c8
      Jeremy Fitzhardinge committed
      Add hooks to allow a paravirt implementation to track the lifetime of
      an mm.  Paravirtualization requires three hooks, but only two are
      needed in common code.  They are:
      
      arch_dup_mmap, which is called when a new mmap is created at fork
      
      arch_exit_mmap, which is called when the last process reference to an
        mm is dropped, which typically happens on exit and exec.
      
      The third hook is activate_mm, which is called from the arch-specific
      activate_mm() macro/function, and so doesn't need stub versions for
      other architectures.  It's called when an mm is first used.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: linux-arch@vger.kernel.org
      Cc: James Bottomley <James.Bottomley@SteelEye.com>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      d6dd61c8
    • J
      [PATCH] i386: PARAVIRT: Allow paravirt backend to choose kernel PMD sharing · 5311ab62
      Jeremy Fitzhardinge committed
      Normally when running in PAE mode, the 4th PMD maps the kernel address space,
      which can be shared among all processes (since they all need the same kernel
      mappings).
      
      Xen, however, does not allow guests to have the kernel pmd shared between page
      tables, so parameterize pgtable.c to allow both modes of operation.
      
      There are several side-effects of this.  One is that vmalloc will update the
      kernel address space mappings, and those updates need to be propagated into
      all processes if the kernel mappings are not intrinsically shared.  In the
      non-PAE case, this is done by maintaining a pgd_list of all processes; this
      list is used when all process pagetables must be updated.  pgd_list is
      threaded via otherwise unused entries in the page structure for the pgd, which
      means that the pgd must be page-sized for this to work.
      
      Normally the PAE pgd is only four 64-bit entries large, but Xen requires the
      PAE pgd to be page aligned anyway, so this patch forces the pgd to be page
      aligned+sized when the kernel pmd is unshared, to accommodate both these
      requirements.
      
      Also, since there may be several distinct kernel pmds (if the user/kernel
      split is below 3G), there's no point in allocating them from a slab cache;
      they're just allocated with get_free_page and initialized appropriately.  (Of
      course they could be cached if there is just a single kernel pmd - which is the
      default with a 3G user/kernel split - but it doesn't seem worthwhile to add
      yet another case into this code.)
      
      [ Many thanks to wli for review comments. ]
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: William Lee Irwin III <wli@holomorphy.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Zachary Amsden <zach@vmware.com>
      Cc: Christoph Lameter <clameter@sgi.com>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      5311ab62
    • J
      [PATCH] i386: PARAVIRT: Hooks to set up initial pagetable · b239fb25
      Jeremy Fitzhardinge committed
      This patch introduces paravirt_ops hooks to control how the kernel's
      initial pagetable is set up.
      
      In the case of a native boot, the very early bootstrap code creates a
      simple non-PAE pagetable to map the kernel and physical memory.  When
      the VM subsystem is initialized, it creates a proper pagetable which
      respects the PAE mode, large pages, etc.
      
      When booting under a hypervisor, there are many possibilities for what
      paging environment the hypervisor establishes for the guest kernel, so
      the construction of the kernel's pagetable depends on the hypervisor.
      
      In the case of Xen, the hypervisor boots the kernel with a fully
      constructed pagetable, which is already using PAE if necessary.  Also,
      Xen requires particular care when constructing pagetables to make sure
      all pagetables are always mapped read-only.
      
      In order to make this easier, the kernel's initial pagetable construction
      has been changed to only allocate and initialize a pagetable page if
      there's no page already present in the pagetable.  This allows the Xen
      paravirt backend to make a copy of the hypervisor-provided pagetable,
      allowing the kernel to establish any more mappings it needs while
      keeping the existing ones.
      
      A slightly subtle point which is worth highlighting here is that Xen
      requires all kernel mappings to share the same pte_t pages between all
      pagetables, so that updating a kernel page's mapping in one pagetable
      is reflected in all other pagetables.  This makes it possible to
      allocate a page and attach it to a pagetable without having to
      explicitly enumerate that page's mapping in all pagetables.
      
      And:
      
      +From: "Eric W. Biederman" <ebiederm@xmission.com>
      
      If we don't set the leaf page table entries it is quite possible that
      we will inherit an incorrect page table entry from the initial boot
      page table setup in head.S.  So we need to redo the effort here,
      so we pick up PSE, PGE and the like.
      
      Hypervisors like Xen require that their page tables be read-only,
      which is slightly incompatible with our low identity mappings.  However,
      I discussed this with Jeremy, and he has modified the Xen early set_pte
      function to avoid problems in this area.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Acked-by: William Irwin <bill.irwin@oracle.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      b239fb25
    • J
      [PATCH] i386: PARAVIRT: Add pagetable accessors to pack and unpack pagetable entries · 3dc494e8
      Jeremy Fitzhardinge committed
      Add a set of accessors to pack, unpack and modify page table entries
      (at all levels).  This allows a paravirt implementation to control the
      contents of pgd/pmd/pte entries.  For example, Xen uses this to
      convert the (pseudo-)physical address into a machine address when
      populating a pagetable entry, and converting back to pphys address
      when an entry is read.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      3dc494e8
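The Xen use of the pack and unpack accessors described above can be sketched with two small translation tables. The table contents, `NR_FRAMES`, and the accessor names below are made up for illustration; only the idea (translate pseudo-physical to machine when packing, and back when unpacking) comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define FLAGS_MASK ((UINT64_C(1) << PAGE_SHIFT) - 1)
#define NR_FRAMES  4
#define NR_MFRAMES 8

typedef struct { uint64_t raw; } pte_t;

/* Toy tables with made-up values: pseudo-physical frame <-> machine frame. */
static const uint64_t p2m[NR_FRAMES]  = { 7, 5, 6, 4 };
static const uint64_t m2p[NR_MFRAMES] = { 0, 0, 0, 0, 3, 1, 2, 0 };

/* Pack: store the machine frame number, not the pseudo-physical one. */
static pte_t make_pte(uint64_t pfn, uint64_t flags)
{
    assert(pfn < NR_FRAMES && (flags & ~FLAGS_MASK) == 0);
    return (pte_t){ (p2m[pfn] << PAGE_SHIFT) | flags };
}

/* Unpack: translate the stored machine frame back to pseudo-physical. */
static uint64_t pte_pfn(pte_t pte)
{
    uint64_t mfn = pte.raw >> PAGE_SHIFT;
    assert(mfn < NR_MFRAMES);
    return m2p[mfn];
}

static uint64_t pte_flags(pte_t pte)
{
    return pte.raw & FLAGS_MASK;
}
```

Code that only uses the accessors never sees machine frame numbers: `pte_pfn(make_pte(1, 0x3))` returns 1 even though the stored entry refers to machine frame 5.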
    • J
      [PATCH] i386: PARAVIRT: use paravirt_nop to consistently mark no-op operations · 45876233
      Jeremy Fitzhardinge committed
      Add a _paravirt_nop function for use as a stub for no-op operations,
      and a paravirt_nop #define that casts it to void * to make using it
      easier (since all its uses are as a void *).
      
      This is useful to allow the patcher to automatically identify noop
      operations so it can simply nop out the callsite.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      [mingo] but only as a cleanup of the current open-coded (void *) casts.
      My problem with this is that it loses the types. Not that there is much
      to check for, but still, this adds some assumptions about what function
      calls look like
      45876233
    • R
      [PATCH] i386: rationalize paravirt wrappers · 90a0a06a
      Rusty Russell committed
      paravirt.c used to implement native versions of all low-level
      functions.  Far cleaner is to have the native versions exposed in the
      headers and as inline native_XXX, and if !CONFIG_PARAVIRT, then simply
      #define XXX native_XXX.
      
      There are several nice side effects:
      
      1) write_dt_entry() now takes the correct "struct Xgt_desc_struct *"
         not "void *".
      
      2) load_TLS is reintroduced to the for loop, not manually unrolled
         with a #error in case the bounds ever change.
      
      3) Macros become inlines, with type checking.
      
      4) Access to the native versions is trivial for KVM, lguest, Xen and
         others who might want it.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Avi Kivity <avi@qumranet.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      90a0a06a
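The wrapper pattern described in this commit (inline native_XXX versions in headers, with XXX defined directly to native_XXX when CONFIG_PARAVIRT is off) can be sketched as follows. The `fake_cr2` variable stands in for real CPU state, and the `example_paravirt_ops` table is a hypothetical reduction of the real structure; cr2 was picked arbitrarily as the illustrated operation.

```c
#include <assert.h>

/* Stand-in for a piece of CPU state; real code would touch hardware. */
static unsigned long fake_cr2;

/* Native versions live in the header as typed inline functions. */
static inline unsigned long native_read_cr2(void) { return fake_cr2; }
static inline void native_write_cr2(unsigned long val) { fake_cr2 = val; }

#ifdef CONFIG_PARAVIRT
/* Paravirt build: calls are routed through a table a backend can override. */
struct example_paravirt_ops {
    unsigned long (*read_cr2)(void);
    void (*write_cr2)(unsigned long);
};
static struct example_paravirt_ops example_ops = {
    native_read_cr2, native_write_cr2
};
static inline unsigned long read_cr2(void) { return example_ops.read_cr2(); }
static inline void write_cr2(unsigned long val) { example_ops.write_cr2(val); }
#else
/* Non-paravirt build: the generic name is simply the native version. */
#define read_cr2  native_read_cr2
#define write_cr2 native_write_cr2
#endif

/* Callers are written once and compile correctly in either configuration. */
static void example_usage(void)
{
    write_cr2(0x1000);
    assert(read_cr2() == 0x1000);
}
```

Because the native versions are ordinary inline functions with real prototypes, the compiler type-checks every call, and a hypervisor backend can still call `native_read_cr2()` directly when it wants the unvirtualized behaviour.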
  5. 05 Mar, 2007 5 commits
    • Z
      [PATCH] vmi: pit override · e30fab3a
      Zachary Amsden committed
      The time_init_hook in paravirt-ops no longer functions in the correct manner
      after the integration of the hrtimers code.  The problem is that now the call
      path for time initialization is:
      
        time_init :
             late_time_init = hpet_time_init;
      
        late_time_init -> hpet_time_init:
             setup_pit_timer (BAD)
             do_time_init --> (via paravirt.h)
                time_init_hook --> (via arch_hooks.h)
                    time_init_hook (in SUBARCH/setup.c)
      
      If this isn't confusing enough, the paravirt case goes through an indirect
      function pointer in the paravirt-ops table.  The problem is, by the time the
      paravirt hook is called, the pit timer is already enabled.
      
      But paravirt guests have their own timer, and don't want to use the PIT.
      Rather than intensify the struggle for power going on here, just make it all
      nice and simple and just unconditionally do all timer setup in the
      late_time_init hook.  This also has the advantage of enabling timers in the
      same place in all code paths, so everyone has the same bugs and we don't have
      outliers who break other code because they turn on timer too early or too
      late.
      
      So the paravirt-ops time init function is now by default hpet_time_init, which
      is the time init function used for native hardware.  Paravirt guests have the
      chance to override this when they setup the paravirt-ops table, and should
      need no change.
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e30fab3a
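The reworked flow, in which all timer setup happens from the late_time_init hook and the ops table decides which init function runs, can be modelled in a few lines. This is a toy model under stated assumptions: the flags, `example_guest_time_init`, the single-member ops table, and `boot` are hypothetical, and the real functions do far more than set a flag.

```c
#include <assert.h>
#include <stddef.h>

static int pit_enabled;          /* set when the native path programs the PIT */
static int guest_timer_enabled;  /* set when a guest installs its own timer */

/* Native default: the only place where the PIT gets set up. */
static void hpet_time_init(void) { pit_enabled = 1; }

/* Hypothetical guest backend that never touches the PIT. */
static void example_guest_time_init(void) { guest_timer_enabled = 1; }

struct example_ops {
    void (*time_init)(void);
};

/* The default entry is the native init function. */
static struct example_ops example_ops = { hpet_time_init };

static void (*late_time_init)(void);

/* Early init only records which function to run later. */
static void time_init(void)
{
    late_time_init = example_ops.time_init;
}

static void boot(void)
{
    time_init();
    assert(late_time_init != NULL);
    late_time_init();   /* all timer setup happens here, in one place */
}
```

A backend that installs `example_guest_time_init` before `boot()` ends up with its own timer enabled and the PIT untouched, which is the ordering problem the commit set out to fix.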
    • Z
      [PATCH] vmi: paravirt drop udelay op · eda08b1b
      Zachary Amsden committed
      Not respecting udelay causes problems with any virtual hardware that is passed
      through to real hardware.  This can be noticed by any device that interacts
      with the real world in real time - like AP startup, which takes real time.  Or
      keyboard LEDs, which should blink in real-time.  Or floppy drives, but only
      when passed through to a real floppy controller on OSes which can't
      sufficiently buffer the floppy commands to emulate a zero latency floppy.  Or
      IDE drives, when connecting to a physical CDROM.
      
      This was mostly a hack to get the kernel to boot faster, but it introduced a
      number of misvirtualization bugs, and Alan and Pavel argued pretty strongly
      against it.  We were the only client, and now want to clean up this cruft.
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      eda08b1b
    • Z
      [PATCH] vmi: fix highpte · 9a1c13e9
      Zachary Amsden committed
      Provide a PT map hook for HIGHPTE kernels to designate where they are mapping
      page tables.  This information is required so the physical address of PTE
      updates can be determined; otherwise, the mm layer would have to carry the
      physical address all the way to each PTE modification callsite, which is even
      more hideous than the macros required to provide the proper hooks.
      
      So let's not mess up arch neutral code to achieve this, but keep the horror in
      an #ifdef HIGHPTE in include/asm-i386/pgtable.h.  I had to use macros here
      because some types are not yet defined in all the include paths for this
      header.
      
      This patch is absolutely required for HIGHPTE kernels to operate properly with
      VMI.
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9a1c13e9
    • Z
      [PATCH] vmi: cpu cycles fix · 1182d852
      Zachary Amsden committed
      In order to share the common code in tsc.c which does CPU Khz calibration, we
      need to make an accurate value of CPU speed available to the tsc.c code.  This
      value loses a lot of precision in a VM because of the timing differences with
      real hardware, but we need it to be as precise as possible so the guest can
      make accurate time calculations with the cycle counters.
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1182d852
    • Z
      [PATCH] vmi: sched clock paravirt op fix · 6cb9a835
      Zachary Amsden committed
      The custom_sched_clock hook is broken.  The result from sched_clock needs to
      be in nanoseconds, not in CPU cycles.  The TSC is insufficient for this
      purpose, because TSC is poorly defined in a virtual environment, and mostly
      represents real world time instead of scheduled process time (which can be
      interrupted without notice when a virtual machine is descheduled).
      
      To make the scheduler consistent, we must expose a different nature of time,
      that is scheduled time.  So deprecate this custom_sched_clock hack and turn it
      into a paravirt-op, as it should have been all along.  This allows the tsc.c
      code which converts cycles to nanoseconds to be shared by all paravirt-ops
      backends.
      
      It is unfortunate to add a new paravirt-op, but this is a very distinct
      abstraction which is clearly different for all virtual machine
      implementations, and it gets rid of an ugly indirect function which I
      ashamedly admit I hacked in to try to get this to work earlier, and then even
      got in the wrong units.
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6cb9a835
  6. 13 Feb, 2007 6 commits
    • R
      [PATCH] i386: paravirt unhandled fallthrough · 992af681
      Rusty Russell committed
      The current code simply calls "start_kernel" directly if we're under a
      hypervisor and no paravirt_ops backend wants us, because paravirt.c
      registers that as a backend.
      
      This was always a vain hope; start_kernel won't get far without setup.
      It's also impossible for paravirt_ops backends which don't sit in the
      arch/i386/kernel directory: they can't link before paravirt.o anyway.
      
      Keep it simple: if we pass all the registered paravirt probes, BUG().
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andi Kleen <ak@suse.de>
      992af681
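The simplified behaviour (try every registered probe, and treat "nobody claimed us" as a bug rather than falling through to start_kernel) can be sketched like this. The probe signature and function names are illustrative assumptions, and `abort()` stands in for the kernel's BUG(); in the real kernel a successful probe takes over the boot instead of returning an index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* A probe returns nonzero if its backend recognises the hypervisor. */
typedef int (*probe_fn)(void);

static int probe_declines(void) { return 0; }
static int probe_accepts(void)  { return 1; }

/* Returns the index of the first backend that accepts, or -1 if all decline. */
static int run_probes(const probe_fn *probes, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(probes[i] != NULL);
        if (probes[i]())
            return (int)i;
    }
    return -1;
}

/* Passing every registered probe without a match is treated as fatal. */
static void boot_under_hypervisor(const probe_fn *probes, size_t n)
{
    if (run_probes(probes, n) < 0) {
        fprintf(stderr, "BUG: no paravirt backend claimed this hypervisor\n");
        abort();   /* stand-in for BUG() */
    }
}
```

With `{ probe_declines, probe_accepts }` the second backend wins; with only declining probes, `boot_under_hypervisor` aborts instead of continuing into a kernel that has had no setup.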
    • A
      [PATCH] i386: Remove fastcall in paravirt.[ch] · 1a1eecd1
      Andi Kleen committed
      Not needed because fastcall is always default now
      Signed-off-by: Andi Kleen <ak@suse.de>
      1a1eecd1
    • Z
      [PATCH] i386: vMI timer patches · bbab4f3b
      Zachary Amsden committed
      VMI timer code.  It works by taking over the local APIC clock when APIC is
      configured, which requires a couple hooks into the APIC code.  The backend
      timer code could be commonized into the timer infrastructure, but there are
      some pieces missing (stolen time, in particular), and the exact semantics of
      when to do accounting for NO_IDLE need to be shared between different
      hypervisors as well.  So for now, VMI timer is a separate module.
      
      [Adrian Bunk: cleanups]
      
      Subject: VMI timer patches
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      bbab4f3b
    • Z
      [PATCH] i386: SMP boot hook for paravirt · ae5da273
      Zachary Amsden committed
      Add VMI SMP boot hook.  We emulate a regular boot sequence and use the same
      APIC IPI initiation, we just poke magic values to load into the CPU state when
      the startup IPI is received, rather than having to jump through a real mode
      trampoline.
      
      This is all that was needed to get SMP to work.
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      ae5da273
    • Z
      [PATCH] i386: paravirt CPU hypercall batching mode · 9226d125
      Zachary Amsden committed
      The VMI ROM has a mode where hypercalls can be queued and batched.  This turns
      out to be a significant win during context switch, but must be done at a
      specific point before side effects to CPU state are visible to subsequent
      instructions.  This is similar to the MMU batching hooks already provided.
      The same hooks could be used by the Xen backend to implement a context switch
      multicall.
      
      To explain a bit more about lazy modes in the paravirt patches, basically, the
      idea is that only one of lazy CPU or MMU mode can be active at any given time.
      Lazy MMU mode is similar to this lazy CPU mode, and allows for batching of
      multiple PTE updates (say, inside a remap loop), but to avoid keeping some
      kind of state machine about when to flush cpu or mmu updates, we just allow
      one or the other to be active.  Although there is no real reason a more
      comprehensive scheme could not be implemented, there is also no demonstrated
      need for this extra complexity.
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      9226d125
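The "only one lazy mode at a time" rule and the batching it enables can be modelled with a small queue. This is a sketch under stated assumptions: the `struct batch` bookkeeping, the integer hypercall IDs, and the function names are hypothetical, and the "hypervisor" is reduced to two counters.

```c
#include <assert.h>
#include <stddef.h>

enum lazy_mode { LAZY_NONE, LAZY_MMU, LAZY_CPU };

#define QUEUE_MAX 16

struct batch {
    enum lazy_mode mode;
    int queued[QUEUE_MAX];   /* opaque hypercall IDs (made up) */
    size_t nqueued;
    size_t issued;           /* hypercalls delivered to the hypervisor */
    size_t submissions;      /* transitions into the hypervisor */
};

/* Deliver everything queued so far in a single submission. */
static void flush(struct batch *b)
{
    if (b->nqueued == 0)
        return;
    b->issued += b->nqueued;
    b->nqueued = 0;
    b->submissions++;
}

/* Entering a lazy mode requires that no other lazy mode is active. */
static void set_lazy_mode(struct batch *b, enum lazy_mode mode)
{
    assert(mode == LAZY_NONE || b->mode == LAZY_NONE);
    if (mode == LAZY_NONE)
        flush(b);            /* leaving lazy mode makes the effects visible */
    b->mode = mode;
}

static void hypercall(struct batch *b, int id)
{
    if (b->mode == LAZY_NONE) {   /* not batching: issue immediately */
        b->issued++;
        b->submissions++;
        return;
    }
    if (b->nqueued == QUEUE_MAX)
        flush(b);
    b->queued[b->nqueued++] = id;
}
```

Three hypercalls made inside lazy CPU mode reach the hypervisor in one submission when the mode is left, whereas the same three calls outside lazy mode would cost three submissions. Since only one mode can be active, no state machine is needed to decide which queue to flush.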
    • Z
      [PATCH] MM: page allocation hooks for VMI backend · c119ecce
      Zachary Amsden committed
      The VMI backend uses explicit page type notification to track shadow page
      tables.  The allocation of page table roots is especially tricky.  We need to
      clone the root for non-PAE mode while it is protected under the pgd lock to
      correctly copy the shadow.
      
      In PAE mode we don't need to allocate pgds (PDPs in Intel terminology), as
      they only have 4 entries, and are cached entirely by the processor, which
      makes shadowing them rather simple.
      
      For base page table level allocation, pmd_populate provides the exact hook
      point we need.  Also, we need to allocate pages when splitting a large page,
      and we must release pages before returning the page to any free pool.
      
      Despite being required with these slightly odd semantics for VMI, Xen also
      uses these hooks to determine the exact moment when page tables are created or
      released.
      
      AK: All nops for other architectures
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      c119ecce
  7. 23 Jan, 2007 1 commit
  8. 07 Dec, 2006 5 commits