提交 · 9d02b43dee0d7fb18dfb13a00915550b1a3daa9f · openeuler / Kernel

02 11月, 2012 1 次提交

xen PVonHVM: use E820_Reserved area for shared_info · 9d02b43d

由 Olaf Hering 提交于 11月 01, 2012

This is a respin of 00e37bdb
("xen PVonHVM: move shared_info to MMIO before kexec").

Currently kexec in a PVonHVM guest fails with a triple fault because the
new kernel overwrites the shared info page. The exact failure depends on
the size of the kernel image. This patch moves the pfn from RAM into an
E820 reserved memory area.

The pfn containing the shared_info is located somewhere in RAM. This will
cause trouble if the current kernel is doing a kexec boot into a new
kernel. The new kernel (and its startup code) can not know where the pfn
is, so it can not reserve the page. The hypervisor will continue to update
the pfn, and as a result memory corruption occours in the new kernel.

The toolstack marks the memory area FC000000-FFFFFFFF as reserved in the
E820 map. Within that range newer toolstacks (4.3+) will keep 1MB
starting from FE700000 as reserved for guest use. Older Xen4 toolstacks
will usually not allocate areas up to FE700000, so FE700000 is expected
to work also with older toolstacks.

In Xen3 there is no reserved area at a fixed location. If the guest is
started on such old hosts the shared_info page will be placed in RAM. As
a result kexec can not be used.
Signed-off-by: NOlaf Hering <olaf@aepfle.de>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

9d02b43d

17 8月, 2012 1 次提交

Revert "xen PVonHVM: move shared_info to MMIO before kexec" · ca08649e

由 Konrad Rzeszutek Wilk 提交于 8月 16, 2012

This reverts commit 00e37bdb.

During shutdown of PVHVM guests with more than 2VCPUs on certain
machines we can hit the race where the replaced shared_info is not
replaced fast enough and the PV time clock retries reading the same
area over and over without any any success and is stuck in an
infinite loop.
Acked-by: NOlaf Hering <olaf@aepfle.de>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

ca08649e

20 7月, 2012 1 次提交

xen PVonHVM: move shared_info to MMIO before kexec · 00e37bdb

由 Olaf Hering 提交于 7月 17, 2012

Currently kexec in a PVonHVM guest fails with a triple fault because the
new kernel overwrites the shared info page. The exact failure depends on
the size of the kernel image. This patch moves the pfn from RAM into
MMIO space before the kexec boot.

The pfn containing the shared_info is located somewhere in RAM. This
will cause trouble if the current kernel is doing a kexec boot into a
new kernel. The new kernel (and its startup code) can not know where the
pfn is, so it can not reserve the page. The hypervisor will continue to
update the pfn, and as a result memory corruption occours in the new
kernel.

One way to work around this issue is to allocate a page in the
xen-platform pci device's BAR memory range. But pci init is done very
late and the shared_info page is already in use very early to read the
pvclock. So moving the pfn from RAM to MMIO is racy because some code
paths on other vcpus could access the pfn during the small   window when
the old pfn is moved to the new pfn. There is even a  small window were
the old pfn is not backed by a mfn, and during that time all reads
return -1.

Because it is not known upfront where the MMIO region is located it can
not be used right from the start in xen_hvm_init_shared_info.

To minimise trouble the move of the pfn is done shortly before kexec.
This does not eliminate the race because all vcpus are still online when
the syscore_ops will be called. But hopefully there is no work pending
at this point in time. Also the syscore_op is run last which reduces the
risk further.
Signed-off-by: NOlaf Hering <olaf@aepfle.de>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

00e37bdb

26 2月, 2011 2 次提交
- I
  xen: suspend: add "arch" to pre/post suspend hooks · 03c8142b
  由 Ian Campbell 提交于 2月 17, 2011
```
xen_pre_device_suspend is unused on ia64.
Signed-off-by: NIan Campbell <ian.campbell@citrix.com>
Reviewed-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
```
  03c8142b
- S
  xen: fix compile issue if XEN is enabled but XEN_PVHVM is disabled · e057a4b6
  由 Stefano Stabellini 提交于 2月 11, 2011
```
Signed-off-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com>
```
  e057a4b6
02 12月, 2010 1 次提交

xen: unplug the emulated devices at resume time · 512b109e

由 Stefano Stabellini 提交于 12月 01, 2010

Early after being resumed we need to unplug again the emulated devices.
Signed-off-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com>

512b109e

27 7月, 2010 1 次提交

x86: Use xen_vcpuop_clockevent, xen_clocksource and xen wallclock. · 409771d2

由 Stefano Stabellini 提交于 5月 14, 2010

Use xen_vcpuop_clockevent instead of hpet and APIC timers as main
clockevent device on all vcpus, use the xen wallclock time as wallclock
instead of rtc and use xen_clocksource as clocksource.
The pv clock algorithm needs to work correctly for the xen_clocksource
and xen wallclock to be usable, only modern Xen versions offer a
reliable pv clock in HVM guests (XENFEAT_hvm_safe_pvclock).

Using the hpet as clocksource means a VMEXIT every time we read/write to
the hpet mmio addresses, pvclock give us a better rating without
VMEXITs. Same goes for the xen wallclock and xen_vcpuop_clockevent
Signed-off-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: NDon Dutile <ddutile@redhat.com>
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

409771d2

23 7月, 2010 1 次提交

xen: Add suspend/resume support for PV on HVM guests. · 016b6f5f

由 Stefano Stabellini 提交于 5月 14, 2010

Suspend/resume requires few different things on HVM: the suspend
hypercall is different; we don't need to save/restore memory related
settings; except the shared info page and the callback mechanism.
Signed-off-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

016b6f5f

03 6月, 2010 1 次提交

xen: ensure timer tick is resumed even on CPU driving the resume · cd52e17e

由 Ian Campbell 提交于 5月 19, 2010

The core suspend/resume code is run from stop_machine on CPU0 but
parts of the suspend/resume machinery (including xen_arch_resume) are
run on whichever CPU happened to schedule the xenwatch kernel thread.

As part of the non-core resume code xen_arch_resume is called in order
to restart the timer tick on non-boot processors. The boot processor
itself is taken care of by core timekeeping code.

xen_arch_resume uses smp_call_function which does not call the given
function on the current processor. This means that we can end up with
one CPU not receiving timer ticks if the xenwatch thread happened to
be scheduled on CPU > 0.

Use on_each_cpu instead of smp_call_function to ensure the timer tick
is resumed everywhere.
Signed-off-by: NIan Campbell <ian.campbell@citrix.com>
Acked-by: NJeremy Fitzhardinge <jeremy@goop.org>
Cc: Stable Kernel <stable@kernel.org> # .32.x

cd52e17e

04 12月, 2009 2 次提交

xen: call clock resume notifier on all CPUs · f6eafe36

由 Ian Campbell 提交于 11月 25, 2009

tick_resume() is never called on secondary processors. Presumably this
is because they are offlined for suspend on native and so this is
normally taken care of in the CPU onlining path. Under Xen we keep all
CPUs online over a suspend.

This patch papers over the issue for me but I will investigate a more
generic, less hacky, way of doing to the same.

tick_suspend is also only called on the boot CPU which I presume should
be fixed too.
Signed-off-by: NIan Campbell <Ian.Campbell@citrix.com>
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stable Kernel <stable@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>

f6eafe36

xen: correctly restore pfn_to_mfn_list_list after resume · fa24ba62

由 Ian Campbell 提交于 11月 21, 2009

pvops kernels >= 2.6.30 can currently only be saved and restored once. The
second attempt to save results in:

ERROR Internal error: Frame# in pfn-to-mfn frame list is not in pseudophys
ERROR Internal error: entry 0: p2m_frame_list[0] is 0xf2c2c2c2, max 0x120000
ERROR Internal error: Failed to map/save the p2m frame list

I finally narrowed it down to:

commit cdaead6b
Author: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Date: Fri Feb 27 15:34:59 2009 -0800

xen: split construction of p2m mfn tables from registration

Build the p2m_mfn_list_list early with the rest of the p2m table, but
register it later when the real shared_info structure is in place.
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

The unforeseen side-effect of this change was to cause the mfn list list to not
be rebuilt on resume. Prior to this change it would have been rebuilt via
xen_post_suspend() -> xen_setup_shared_info() -> xen_setup_mfn_list_list().

Fix by explicitly calling xen_build_mfn_list_list() from xen_post_suspend().
Signed-off-by: NIan Campbell <ian.campbell@citrix.com>
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stable Kernel <stable@kernel.org>

fa24ba62

23 1月, 2009 1 次提交

x86, xen: fix hardirq.h merge fallout · 99d0000f

由 Ingo Molnar 提交于 1月 23, 2009

Impact: build fix

This build error:

 arch/x86/xen/suspend.c:22: error: implicit declaration of function 'fix_to_virt'
 arch/x86/xen/suspend.c:22: error: 'FIX_PARAVIRT_BOOTMAP' undeclared (first use in this function)
 arch/x86/xen/suspend.c:22: error: (Each undeclared identifier is reported only once
 arch/x86/xen/suspend.c:22: error: for each function it appears in.)

triggers because the hardirq.h unification removed an implicit fixmap.h
include - on which arch/x86/xen/suspend.c depended. Add the fixmap.h
include explicitly.
Signed-off-by: NIngo Molnar <mingo@elte.hu>

99d0000f

17 12月, 2008 1 次提交

xen: convert to cpumask_var_t and new cpumask primitives. · b78936e1

由 Mike Travis 提交于 12月 16, 2008

Simple change, and eventual space saving when NR_CPUS >> nr_cpu_ids.
Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
Signed-off-by: NMike Travis <travis@sgi.com>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>

b78936e1

16 7月, 2008 1 次提交

xen: add xen_arch_resume()/xen_timer_resume hook for ia64 support · ad55db9f

由 Isaku Yamahata 提交于 7月 08, 2008

add xen_timer_resume() hook.

Timer resume should be done after event channel is resumed.
add xen_arch_resume() hook when ipi becomes usable after resume.
After resume, some cpu specific resource must be reinitialized
on ia64 that can't be set by another cpu.

However available hooks is run once on only one cpu so that ipi has
to be used.

During stop_machine_run() ipi can't be used because interrupt is masked.
So add another hook after stop_machine_run().
Another approach might be use resume hook which is run by
device_resume(). However device_resume() may be executed on
suspend error recovery path.

So it is necessary to determine whether it is executed on real resume path
or error recovery path.
Signed-off-by: NIsaku Yamahata <yamahata@valinux.co.jp>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

ad55db9f

02 6月, 2008 2 次提交

xen: resume timers on all vcpus · d07af1f0

由 Jeremy Fitzhardinge 提交于 5月 31, 2008

On resume, the vcpu timer modes will not be restored.  The timer
infrastructure doesn't do this for us, since it assumes the cpus
are offline.  We can just poke the other vcpus into the right mode
directly though.
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

d07af1f0

xen: restore vcpu_info mapping · 9c7a7942

由 Jeremy Fitzhardinge 提交于 5月 31, 2008

If we're using vcpu_info mapping, then make sure its restored on all
processors before relasing them from stop_machine.

The only complication is that if this fails, we can't continue because
we've already made assumptions that the mapping is available (baked in
calls to the _direct versions of the functions, for example).

Fortunately this can only happen with a 32-bit hypervisor, which may
possibly run out of mapping space.  On a 64-bit hypervisor, this is a
non-issue.
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

9c7a7942

27 5月, 2008 1 次提交

xen: implement save/restore · 0e91398f

由 Jeremy Fitzhardinge 提交于 5月 26, 2008

This patch implements Xen save/restore and migration.

Saving is triggered via xenbus, which is polled in
drivers/xen/manage.c.  When a suspend request comes in, the kernel
prepares itself for saving by:

1 - Freeze all processes.  This is primarily to prevent any
    partially-completed pagetable updates from confusing the suspend
    process.  If CONFIG_PREEMPT isn't defined, then this isn't necessary.

2 - Suspend xenbus and other devices

3 - Stop_machine, to make sure all the other vcpus are quiescent.  The
    Xen tools require the domain to run its save off vcpu0.

4 - Within the stop_machine state, it pins any unpinned pgds (under
    construction or destruction), performs canonicalizes various other
    pieces of state (mostly converting mfns to pfns), and finally

5 - Suspend the domain

Restore reverses the steps used to save the domain, ending when all
the frozen processes are thawed.
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

0e91398f

openeuler / Kernel 大约 1 年 前同步成功

openeuler / Kernel
大约 1 年前同步成功