1. 14 March 2011 (3 commits)
    • xen/mmu: WARN_ON when racing to swap middle leaf. · c7617798
      Committed by Konrad Rzeszutek Wilk
      The initial bootup code uses set_phys_to_machine quite a lot, and after
      bootup it would be used by the balloon driver. The balloon driver does hold
      a mutex, so this should not be necessary - but just in case, add
      a WARN_ON if we do hit this scenario. If we do fail here, it is OK
      to continue, as there is a backup mechanism (VM_IO) that can bypass
      the P2M and still set the _PAGE_IOMAP flag.
      
      [v2: Change from WARN to BUG_ON]
      [v3: Rebased on top of xen->p2m code split]
      [v4: Change from BUG_ON to WARN]
      Reviewed-by: Ian Campbell <Ian.Campbell@eu.citrix.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    • xen/mmu: Set _PAGE_IOMAP if PFN is an identity PFN. · fb38923e
      Committed by Konrad Rzeszutek Wilk
      If we find that the PFN is within the P2M as an identity PFN, make sure
      to tack on the _PAGE_IOMAP flag.
      Reviewed-by: Ian Campbell <ian.campbell@citrix.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    • xen/mmu: Add the notion of identity (1-1) mapping. · f4cec35b
      Committed by Konrad Rzeszutek Wilk
      Our P2M tree structure is three-level. In the leaf nodes
      we store the Machine Frame Number (MFN) of the PFN. What this means
      is that when one does pfn_to_mfn(pfn), which is used when creating
      PTE entries, you get the real MFN of the hardware. When Xen sets
      up a guest it initially populates an array which has descending
      (or ascending) MFN values, like so:
      
       idx: 0,  1,       2
       [0x290F, 0x290E, 0x290D, ..]
      
      so pfn_to_mfn(2)==0x290D. If you start and restart many guests, that list
      starts looking quite random.
      
      We graft this structure onto our P2M tree structure and stick those
      MFNs in the leaves. But for all other leaf entries, or for the top
      root or middle ones, for which there is a void entry, we assume they
      are "missing". So
       pfn_to_mfn(0xc0000)=INVALID_P2M_ENTRY.
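
      For orientation, here is a minimal user-space sketch of that three-level
      lookup (the constants assume 64-bit entries and 4K pages; the flat tables
      and the sketch_lookup name are illustrative, not the kernel's):

      #include <stdio.h>

      #define P2M_PER_PAGE      512UL   /* 4096 / sizeof(unsigned long)       */
      #define P2M_MID_PER_PAGE  512UL   /* pointers per middle page           */
      #define INVALID_P2M_ENTRY (~0UL)

      /* Illustrative stand-ins for the tree the kernel builds at boot. */
      static unsigned long leaf_missing[512];      /* all INVALID              */
      static unsigned long *mid_missing[512];      /* all -> leaf_missing      */
      static unsigned long **top[512];             /* all -> mid_missing       */

      static unsigned long sketch_lookup(unsigned long pfn)
      {
              unsigned long topidx = pfn / (P2M_MID_PER_PAGE * P2M_PER_PAGE);
              unsigned long mididx = (pfn / P2M_PER_PAGE) % P2M_MID_PER_PAGE;
              unsigned long idx    = pfn % P2M_PER_PAGE;

              /* A leaf entry holds the MFN, or ~0 when the page is "missing". */
              return top[topidx][mididx][idx];
      }

      int main(void)
      {
              for (int i = 0; i < 512; i++) {
                      leaf_missing[i] = INVALID_P2M_ENTRY;
                      mid_missing[i]  = leaf_missing;
                      top[i]          = mid_missing;
              }
              printf("%#lx\n", sketch_lookup(0xc0000));   /* prints ~0: missing */
              return 0;
      }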
      
      We add the possibility of setting 1-1 mappings on certain regions, so
      that:
       pfn_to_mfn(0xc0000)=0xc0000
      
      The benefit of this is that for non-RAM regions (think
      PCI BARs, or ACPI spaces) we can create mappings easily, because we
      get the PFN value to match the MFN.
      
      For this to work efficiently we introduce one new page, p2m_identity, and
      allocate (via reserve_brk) any other pages we need to cover the edges
      (1GB or 4MB boundary violations). All entries in p2m_identity are set to
      INVALID_P2M_ENTRY (the Xen toolstack only recognizes that and real MFNs,
      no other fancy values).
      
      On lookup we spot that the entry points to p2m_identity and return the
      identity value instead of dereferencing and returning INVALID_P2M_ENTRY.
      If the entry points to an allocated page, we just proceed as before and
      return the PFN. If the PFN has IDENTITY_FRAME_BIT set, we unmask it in
      the appropriate functions (pfn_to_mfn).
      
      The reason for having the IDENTITY_FRAME_BIT instead of just returning the
      PFN is that we could find ourselves in a situation where pfn_to_mfn(pfn)==pfn
      for a non-identity PFN. To protect ourselves against this, we elect to set
      (and get) the IDENTITY_FRAME_BIT on all identity-mapped PFNs.
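
      The bit layout, roughly as described here and in the [v9] note below (a
      compile-only sketch; treat the exact macro names as assumptions):

      #include <limits.h>

      #define XEN_BITS_PER_LONG   (sizeof(unsigned long) * CHAR_BIT)
      #define FOREIGN_FRAME_BIT   (1UL << (XEN_BITS_PER_LONG - 1))
      #define IDENTITY_FRAME_BIT  (1UL << (XEN_BITS_PER_LONG - 2))
      #define IDENTITY_FRAME(pfn) ((pfn) | IDENTITY_FRAME_BIT)

      /* Sketch of the unmasking step in pfn_to_mfn(): an identity entry stores
       * IDENTITY_FRAME(pfn), so a stored value can never be confused with a
       * coincidental pfn_to_mfn(pfn) == pfn of a non-identity page. */
      static unsigned long sketch_unmask(unsigned long entry)
      {
              if (entry == ~0UL)        /* INVALID_P2M_ENTRY stays as-is */
                      return entry;
              return entry & ~(FOREIGN_FRAME_BIT | IDENTITY_FRAME_BIT);
      }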
      
      This simplistic diagram is used to explain the more subtle pieces of code.
      There is also a diagram of the P2M at the end that can help.
      Imagine your E820 looking like this:
      
                         1GB                                           2GB
      /-------------------+---------\/----\         /----------\    /---+-----\
      | System RAM        | Sys RAM ||ACPI|         | reserved |    | Sys RAM |
      \-------------------+---------/\----/         \----------/    \---+-----/
                                    ^- 1029MB                       ^- 2001MB
      
      [1029MB = 263424 (0x40500), 2001MB = 512256 (0x7D100), 2048MB = 524288 (0x80000)]
      
      And dom0_mem=max:3GB,1GB is passed to the guest, meaning memory past 1GB
      is actually not present (you would have to kick the balloon driver to put
      it in).
      
      When we are told to set the PFNs for identity mapping (see patch: "xen/setup:
      Set identity mapping for non-RAM E820 and E820 gaps.") we pass in the start
      PFN and the end PFN (263424 and 512256 respectively). The first step is
      to reserve_brk a top leaf page if p2m[1] is missing. A top leaf page
      covers 512^2 of page estate (1GB), and in case the start or end PFN is not
      aligned on 512^2*PAGE_SIZE (1GB) we loop over 1GB-aligned PFNs from the
      start PFN to the end PFN. We reserve_brk top leaf pages if they are missing
      (meaning they point to p2m_mid_missing).
      
      With the E820 example above, 263424 is not 1GB aligned, so we allocate a
      reserve_brk page which will cover the PFN estate from 0x40000 to 0x80000.
      Each entry in the allocated page is "missing" (points to p2m_missing).
      
      The next stage is to determine if we need to do a more granular boundary
      check on the 4MB (or 2MB, depending on architecture) chunks at the start
      and end PFNs. We check if the start PFN and end PFN violate that boundary
      check, and if so reserve_brk a middle (p2m[x][y]) leaf page. This way we
      have a much finer granularity for marking which PFNs are missing and which
      ones are identity. In our example 263424 and 512256 both fail the check,
      so we reserve_brk two pages, populate them with INVALID_P2M_ENTRY (so they
      both have "missing" values) and assign them to p2m[1][2] and p2m[1][488]
      respectively.
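
      As a sanity check on those indices, this tiny program (assuming the 64-bit
      layout of 512 entries per leaf and 512 leaves per top entry) reproduces the
      p2m[1][2][256] and p2m[1][488][256] decomposition of the two edge PFNs:

      #include <stdio.h>

      #define P2M_PER_PAGE      512UL
      #define P2M_MID_PER_PAGE  512UL

      static void decompose(unsigned long pfn)
      {
              printf("pfn %lu -> p2m[%lu][%lu][%lu]\n", pfn,
                     pfn / (P2M_MID_PER_PAGE * P2M_PER_PAGE),
                     (pfn / P2M_PER_PAGE) % P2M_MID_PER_PAGE,
                     pfn % P2M_PER_PAGE);
      }

      int main(void)
      {
              decompose(263424);   /* 1029MB: p2m[1][2][256]   - unaligned start */
              decompose(512256);   /* 2001MB: p2m[1][488][256] - unaligned end   */
              return 0;
      }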
      
      At this point we would at minimum reserve_brk one page, but it could be up
      to three. Each call to set_phys_range_identity has at most a three-page
      cost. If we were to query the P2M at this stage, all those entries from the
      start PFN through the end PFN (so 1029MB -> 2001MB) would return
      INVALID_P2M_ENTRY ("missing").
      
      The next step is to walk from the start PFN to the end PFN, setting
      the IDENTITY_FRAME_BIT on each PFN. This is done in set_phys_range_identity.
      If we find that a middle leaf is pointing to p2m_missing, we can swap it over
      to p2m_identity - this way covering 4MB (or 2MB) of PFN space. At this point
      we do not need to worry about boundary alignment (so no need to reserve_brk
      a middle page, or figure out which PFNs are "missing" and which ones are
      identity), as that has been done earlier. If we find that the middle leaf
      is not occupied by p2m_identity or p2m_missing, we dereference that page
      (which covers 512 PFNs) and set the appropriate PFN with IDENTITY_FRAME_BIT.
      In our example 263424 and 512256 end up there, and we set p2m[1][2][256->511]
      and p2m[1][488][0->256] with IDENTITY_FRAME_BIT set.
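
      A compact user-space model of that walk (a sketch, not the kernel code:
      LEAF, leaf_of[] and the two edge pages stand in for the real p2m tree, and
      IDENTITY_BIT for IDENTITY_FRAME_BIT):

      #include <stdio.h>

      #define LEAF          512UL
      #define INVALID       (~0UL)
      #define IDENTITY_BIT  (1UL << 62)

      static unsigned long p2m_missing[LEAF];   /* shared, all INVALID          */
      static unsigned long p2m_identity[LEAF];  /* shared marker, all INVALID   */
      static unsigned long *leaf_of[1024];      /* flat stand-in for p2m[x][y]  */

      static void set_range_identity(unsigned long s, unsigned long e)
      {
              for (unsigned long pfn = s; pfn < e; pfn++) {
                      unsigned long li = pfn / LEAF, idx = pfn % LEAF;

                      if (leaf_of[li] == p2m_missing)
                              leaf_of[li] = p2m_identity;   /* swap whole leaf  */
                      else if (leaf_of[li] != p2m_identity)
                              leaf_of[li][idx] = pfn | IDENTITY_BIT;  /* edges  */
              }
      }

      static unsigned long lookup(unsigned long pfn)
      {
              unsigned long li = pfn / LEAF, idx = pfn % LEAF;

              if (leaf_of[li] == p2m_identity)
                      return pfn | IDENTITY_BIT;  /* synthesized, page untouched */
              return leaf_of[li][idx];
      }

      int main(void)
      {
              static unsigned long edge_a[LEAF], edge_b[LEAF]; /* the brk pages  */

              for (unsigned long i = 0; i < LEAF; i++) {
                      p2m_missing[i] = p2m_identity[i] = INVALID;
                      edge_a[i] = edge_b[i] = INVALID;
              }
              for (int i = 0; i < 1024; i++)
                      leaf_of[i] = p2m_missing;
              leaf_of[263424 / LEAF] = edge_a;   /* p2m[1][2] in the text above  */
              leaf_of[512256 / LEAF] = edge_b;   /* p2m[1][488]                  */

              set_range_identity(263424, 512256);

              printf("%#lx\n", lookup(263423));                  /* INVALID (~0) */
              printf("%lu\n",  lookup(263424) & ~IDENTITY_BIT);  /* 263424       */
              printf("%lu\n",  lookup(400000) & ~IDENTITY_BIT);  /* 400000       */
              return 0;
      }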
      
      All other regions that are void (or not filled) either point to p2m_missing
      (considered missing) or have the default value of INVALID_P2M_ENTRY (also
      considered missing). In our case, p2m[1][2][0->255] and p2m[1][488][257->511]
      contain the INVALID_P2M_ENTRY value and are considered "missing."
      
      This is what the p2m ends up looking like (for the E820 above), with this
      fabulous drawing:
      
         p2m         /--------------\
       /-----\       | &mfn_list[0],|                           /-----------------\
       |  0  |------>| &mfn_list[1],|    /---------------\      | ~0, ~0, ..      |
       |-----|       |  ..., ~0, ~0 |    | ~0, ~0, [x]---+----->| IDENTITY [@256] |
       |  1  |---\   \--------------/    | [p2m_identity]+\     | IDENTITY [@257] |
       |-----|    \                      | [p2m_identity]+\\    | ....            |
       |  2  |--\  \-------------------->|  ...          | \\   \----------------/
       |-----|   \                       \---------------/  \\
       |  3  |\   \                                          \\  p2m_identity
       |-----| \   \-------------------->/---------------\   /-----------------\
       | ..  +->+                        | [p2m_identity]+-->| ~0, ~0, ~0, ... |
       \-----/ /                         | [p2m_identity]+-->| ..., ~0         |
              / /---------------\        | ....          |   \-----------------/
             /  | IDENTITY[@0]  |      /-+-[x], ~0, ~0.. |
            /   | IDENTITY[@256]|<----/  \---------------/
           /    | ~0, ~0, ....  |
          |     \---------------/
          |
          p2m_missing             p2m_missing
      /------------------\     /------------\
      | [p2m_mid_missing]+---->| ~0, ~0, ~0 |
      | [p2m_mid_missing]+---->| ..., ~0    |
      \------------------/     \------------/
      
      where ~0 is INVALID_P2M_ENTRY and IDENTITY is (PFN | IDENTITY_BIT).
      Reviewed-by: Ian Campbell <ian.campbell@citrix.com>
      [v5: Changed code to use ranges, added ASCII art]
      [v6: Rebased on top of xen->p2m code split]
      [v4: Squished patches in just this one]
      [v7: Added RESERVE_BRK for potentially allocated pages]
      [v8: Fixed alignment problem]
      [v9: Changed 1<<3X to 1<<BITS_PER_LONG-X]
      [v10: Copied git commit description in the p2m code + Add Review tag]
      [v11: Title had '2-1' - should be '1-1' mapping]
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
  2. 04 March 2011 (1 commit)
    • xen: Mark all initial reserved pages for the balloon as INVALID_P2M_ENTRY. · 6eaa412f
      Committed by Konrad Rzeszutek Wilk
      With this patch, we diligently set regions that will be used by the
      balloon driver to INVALID_P2M_ENTRY and put them under the ownership
      of the balloon driver. We are OK using __set_phys_to_machine,
      as we do not expect to be allocating any P2M middle or entry pages.
      set_phys_to_machine has the side effect of potentially allocating
      new pages, and we do not want that at this stage.
      
      We can do this because xen_build_mfn_list_list will have already
      allocated all such pages up to xen_max_p2m_pfn.
      
      We also move the check for the auto-translated physmap down the
      stack so it is present in __set_phys_to_machine.
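
      The shape of the change, as a compile-only sketch (the range bounds and the
      helper name here are illustrative; only __set_phys_to_machine and
      INVALID_P2M_ENTRY come from the text above):

      #include <stdbool.h>

      #define INVALID_P2M_ENTRY  (~0UL)

      /* Provided by the Xen p2m code; declared here only for the sketch. */
      extern bool __set_phys_to_machine(unsigned long pfn, unsigned long mfn);

      /* Mark a balloon-owned PFN range invalid without allocating P2M pages -
       * safe because xen_build_mfn_list_list() already covered the tree up to
       * xen_max_p2m_pfn. The bounds are placeholders. */
      static void mark_balloon_range_invalid(unsigned long start_pfn,
                                             unsigned long end_pfn)
      {
              for (unsigned long pfn = start_pfn; pfn < end_pfn; pfn++)
                      __set_phys_to_machine(pfn, INVALID_P2M_ENTRY);
      }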
      
      [v2: Rebased with mmu->p2m code split]
      Reviewed-by: Ian Campbell <ian.campbell@citrix.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
  3. 22 January 2011 (1 commit)
    • xen: p2m: correctly initialize partial p2m leaf · 8e1b4cf2
      Committed by Stefan Bader
      After changing the p2m mapping to a tree by
      
        commit 58e05027
          xen: convert p2m to a 3 level tree
      
      and trying to boot a DomU with 615MB of memory, the following crash was
      observed in the dump:
      
      kernel direct mapping tables up to 26f00000 @ 1ec4000-1fff000
      BUG: unable to handle kernel NULL pointer dereference at (null)
      IP: [<c0107397>] xen_set_pte+0x27/0x60
      *pdpt = 0000000000000000 *pde = 0000000000000000
      
      Adding further debug statements showed that when trying to set up
      pfn=0x26700 the returned mapping was invalid.
      
      pfn=0x266ff calling set_pte(0xc1fe77f8, 0x6b3003)
      pfn=0x26700 calling set_pte(0xc1fe7800, 0x3)
      
      Although the last_pfn obtained from the startup info is 0x26700, which
      should in turn not be hit, the additional 8MB which are added as extra
      memory normally seem to be ok. This led to looking into the initial
      p2m tree construction, which uses the smaller value, assuming that
      other code handles the extra memory.
      
      When the p2m tree is set up, the leaves point directly into the
      array which the domain builder set up. But if the mapping does not end on
      a boundary that fits into one p2m page, this will result in the last leaf
      being only partially valid. And as the invalid entries are not
      initialized in that case, things go badly wrong.
      
      I am trying to fix that by checking whether the current leaf is a
      complete map and, if not, allocating a completely new page and copying
      only the valid pointers there. This may not be the most efficient or
      elegant solution, but at least it seems to allow me to boot DomUs with
      memory assignments all over the range.
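
      A compile-only sketch of that fix (helper names such as alloc_brk_page()
      and p2m_init_invalid() are placeholders; the real code lives in
      xen_build_dynamic_phys_to_machine()):

      #define P2M_PER_PAGE  512UL

      extern unsigned long *alloc_brk_page(void);          /* placeholder      */
      extern void p2m_init_invalid(unsigned long *leaf);   /* fill with ~0UL   */

      /* If the mfn_list chunk backing this leaf is shorter than a full leaf
       * page, copy the valid part into a fresh, INVALID-initialized page
       * instead of pointing into the (partially undefined) array. */
      static unsigned long *build_leaf(unsigned long *mfn_list,
                                       unsigned long pfn, unsigned long max_pfn)
      {
              if (pfn + P2M_PER_PAGE > max_pfn) {          /* partial leaf      */
                      unsigned long *leaf = alloc_brk_page();

                      p2m_init_invalid(leaf);
                      for (unsigned long i = 0; pfn + i < max_pfn; i++)
                              leaf[i] = mfn_list[pfn + i];
                      return leaf;
              }
              return &mfn_list[pfn];                       /* full leaf         */
      }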
      
      BugLink: http://bugs.launchpad.net/bugs/686692
      [v2: Redid a bit of commit wording and fixed a compile warning]
      Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
  4. 21 January 2011 (1 commit)
  5. 20 January 2011 (1 commit)
    • lockdep: Move early boot local IRQ enable/disable status to init/main.c · 2ce802f6
      Committed by Tejun Heo
      During early boot, local IRQs are disabled until the IRQ subsystem is
      properly initialized. During this time, no one should enable
      local IRQs, and some operations which usually are not allowed with
      IRQs disabled, e.g. operations which might sleep or require
      communication with other processors, are allowed.
      
      lockdep tracked this with the early_boot_irqs_off/on() callbacks.
      As other subsystems need this information too, move it to
      init/main.c and make it generally available. While at it,
      invert the boolean to early_boot_irqs_disabled instead of
      enabled, so that it can be initialized with %false and %true
      indicates the exceptional condition.
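
      A minimal sketch of what the flag looks like after the move (kernel
      decorations such as __read_mostly are omitted and the placement inside
      start_kernel() is simplified):

      #include <stdbool.h>

      /* Generic flag, now in init/main.c: false in BSS by default, true only
       * during the exceptional early-boot window. */
      bool early_boot_irqs_disabled;

      static void sketch_start_kernel(void)
      {
              early_boot_irqs_disabled = true;   /* was early_boot_irqs_off() */
              /* ... early setup runs with local IRQs off ... */
              early_boot_irqs_disabled = false;  /* was early_boot_irqs_on()  */
              /* local_irq_enable() follows once the IRQ subsystem is ready   */
      }
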
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Pekka Enberg <penberg@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      LKML-Reference: <20110120110635.GB6036@htj.dyndns.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  6. 15 January 2011 (1 commit)
  7. 12 January 2011 (5 commits)
  8. 10 January 2011 (1 commit)
  9. 07 January 2011 (1 commit)
  10. 17 December 2010 (1 commit)
  11. 03 December 2010 (1 commit)
    • vmalloc: eagerly clear ptes on vunmap · 64141da5
      Committed by Jeremy Fitzhardinge
      On stock 2.6.37-rc4, running:
      
        # mount lilith:/export /mnt/lilith
        # find  /mnt/lilith/ -type f -print0 | xargs -0 file
      
      crashes the machine fairly quickly under Xen.  Often it results in oops
      messages, but the couple of times I tried just now, it just hung quietly
      and made Xen print some rude messages:
      
          (XEN) mm.c:2389:d80 Bad type (saw 7400000000000001 != exp
          3000000000000000) for mfn 1d7058 (pfn 18fa7)
          (XEN) mm.c:964:d80 Attempt to create linear p.t. with write perms
          (XEN) mm.c:2389:d80 Bad type (saw 7400000000000010 != exp
          1000000000000000) for mfn 1d2e04 (pfn 1d1fb)
          (XEN) mm.c:2965:d80 Error while pinning mfn 1d2e04
      
      Which means the domain tried to map a pagetable page RW, which would
      allow it to map arbitrary memory, so Xen stopped it.  This is because
      vm_unmap_ram() left some pages mapped in the vmalloc area after NFS had
      finished with them, and those pages got recycled as pagetable pages
      while still having these RW aliases.
      
      Removing those mappings immediately removes the Xen-visible aliases, and
      so it has no problem with those pages being reused as pagetable pages.
      Deferring the TLB flush doesn't upset Xen because it can flush the TLB
      itself as needed to maintain its invariants.
      
      When unmapping a region in the vmalloc space, clear the ptes
      immediately.  There's no point in deferring this because there's no
      amortization benefit.
      
      The TLBs are left dirty, and they are flushed lazily to amortize the
      cost of the IPIs.
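
      In other words (a conceptual sketch only; the helper names below are
      placeholders, not the mm/vmalloc.c API): clear the PTEs at unmap time and
      keep the TLB flush lazy.

      #include <stddef.h>

      /* Placeholders for illustration; not real kernel functions. */
      static void clear_ptes_now(unsigned long addr, size_t size)
      { (void)addr; (void)size; }
      static void queue_lazy_tlb_flush(unsigned long addr, size_t size)
      { (void)addr; (void)size; }

      static void sketch_vm_unmap(unsigned long addr, size_t size)
      {
              clear_ptes_now(addr, size);        /* no stale RW aliases remain */
              queue_lazy_tlb_flush(addr, size);  /* IPIs are still amortized   */
      }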
      
      The specific motivation for this patch is an oops-causing regression
      since 2.6.36 when using NFS under Xen, triggered by the NFS client's use
      of vm_map_ram() introduced in 56e4ebf8 ("NFS: readdir with vmapped
      pages"). XFS also uses vm_map_ram() and could cause similar problems.
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Bryan Schumaker <bjschuma@netapp.com>
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Cc: Alex Elder <aelder@sgi.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  12. 02 December 2010 (1 commit)
  13. 30 November 2010 (2 commits)
    • xen: x86/32: perform initial startup on initial_page_table · 805e3f49
      Committed by Ian Campbell
      Only make swapper_pg_dir read-only and pinned when generic x86 architecture
      code (which also starts on initial_page_table) switches to it. This helps
      ensure that the generic setup paths work on Xen unmodified. In particular,
      clone_pgd_range writes directly to the destination pgd entries and is used
      to initialise swapper_pg_dir, so we need to ensure that it remains writeable
      until the last possible moment during bring-up.
      
      This is complicated slightly by the need to avoid sharing kernel PMD entries
      when running under Xen; therefore the Xen implementation must make a copy of
      the kernel PMD (which is otherwise referred to by both initial_page_table
      and swapper_pg_dir) before switching to swapper_pg_dir.
      Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@linux.intel.com>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
    • xen: don't bother to stop other cpus on shutdown/reboot · 31e323cc
      Committed by Jeremy Fitzhardinge
      Xen will shoot all the VCPUs when we do a shutdown hypercall, so there's
      no need to do it manually.
      
      In any case it will fail because all the IPI irqs have been pulled
      down by this point, so the cross-CPU calls will simply hang forever.
      
      Until change 76fac077 the function calls
      were not synchronously waited for, so this wasn't apparent. However, after
      that change the calls became synchronous, leading to a hang on shutdown
      on multi-VCPU guests.
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: Stable Kernel <stable@kernel.org>
      Cc: Alok Kataria <akataria@vmware.com>
  14. 28 November 2010 (1 commit)
  15. 25 November 2010 (2 commits)
  16. 23 November 2010 (3 commits)
    • xen: use default_idle · bc15fde7
      Committed by Jeremy Fitzhardinge
      We just need the idle loop to drop into safe_halt, which default_idle()
      is perfectly capable of doing.  There's no need to duplicate it.
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
    • xen: clean up "extra" memory handling some more · c2d08791
      Committed by Jeremy Fitzhardinge
      Make sure that extra_pages is added for all E820_RAM regions beyond
      mem_end - completely excluded regions as well as the remains of partially
      included regions.
      
      Also make sure the extra region is not unnecessarily high, and simplify
      the logic that decides which regions should be added.
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
    • xen: set IO permission early (before early_cpu_init()) · ec35a69c
      Committed by Konrad Rzeszutek Wilk
      This patch is based off "xen dom0: Set up basic IO permissions for dom0."
      by Juan Quintela <quintela@redhat.com>.
      
      On AMD machines, when we boot the kernel as Domain 0, we get this nasty crash:
      
      mapping kernel into physical memory
      Xen: setup ISA identity maps
      about to get started...
      (XEN) traps.c:475:d0 Unhandled general protection fault fault/trap [#13] on VCPU 0 [ec=0000]
      (XEN) domain_crash_sync called from entry.S
      (XEN) Domain 0 (vcpu#0) crashed on cpu#0:
      (XEN) ----[ Xen-4.1-101116  x86_64  debug=y  Not tainted ]----
      (XEN) CPU:    0
      (XEN) RIP:    e033:[<ffffffff8130271b>]
      (XEN) RFLAGS: 0000000000000282   EM: 1   CONTEXT: pv guest
      (XEN) rax: 000000008000c068   rbx: ffffffff8186c680   rcx: 0000000000000068
      (XEN) rdx: 0000000000000cf8   rsi: 000000000000c000   rdi: 0000000000000000
      (XEN) rbp: ffffffff81801e98   rsp: ffffffff81801e50   r8:  ffffffff81801eac
      (XEN) r9:  ffffffff81801ea8   r10: ffffffff81801eb4   r11: 00000000ffffffff
      (XEN) r12: ffffffff8186c694   r13: ffffffff81801f90   r14: ffffffffffffffff
      (XEN) r15: 0000000000000000   cr0: 000000008005003b   cr4: 00000000000006f0
      (XEN) cr3: 0000000221803000   cr2: 0000000000000000
      (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e02b   cs: e033
      (XEN) Guest stack trace from rsp=ffffffff81801e50:
      
      RIP points to read_pci_config() function.
      
      The issue is that we don't set IO permissions for the Linux kernel early enough.
      
      The call sequence used to be:
      
          xen_start_kernel()
      	x86_init.oem.arch_setup = xen_arch_setup;
              setup_arch:
                 - early_cpu_init
                     - early_init_amd
                        - read_pci_config
                 - x86_init.oem.arch_setup [ xen_arch_setup ]
                     - set IO permissions.
      
      We need to set the IO permissions earlier on, which this patch does.
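
      A hedged sketch of the mechanism (the PHYSDEVOP_set_iopl hypercall; the
      guards and exact placement in xen_start_kernel() are simplified here):

      /* Kernel-context sketch (not standalone): raise the I/O privilege level
       * for dom0 before early_cpu_init() reaches read_pci_config(), instead of
       * waiting for xen_arch_setup(). */
      static void __init sketch_set_iopl_early(void)
      {
              struct physdev_set_iopl set_iopl = { .iopl = 1 };
              int rc = HYPERVISOR_physdev_op(PHYSDEVOP_set_iopl, &set_iopl);

              WARN(rc != 0, "PHYSDEVOP_set_iopl failed: %d\n", rc);
      }
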
      Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
  17. 20 November 2010 (1 commit)
  18. 13 November 2010 (1 commit)
  19. 12 November 2010 (1 commit)
  20. 11 November 2010 (1 commit)
  21. 30 October 2010 (1 commit)
  22. 28 October 2010 (1 commit)
  23. 27 October 2010 (1 commit)
    • xen: initialize cpu masks for pv guests in xen_smp_init · ea5b8f73
      Committed by Stefano Stabellini
      PV guests don't have ACPI and need the cpu masks to be set
      correctly as early as possible, so we call xen_fill_possible_map from
      xen_smp_init.
      On the other hand, the initial domain supports ACPI, so in this case we
      skip xen_fill_possible_map and rely on ACPI. However, Xen might limit the
      number of cpus usable by the domain, so we filter those masks during smp
      initialization using the VCPUOP_is_up hypercall, as sketched below.
      It is important that the filtering is done before
      xen_setup_vcpu_info_placement.
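
      A sketch of that filtering step (kernel context, simplified from the
      description above; the helper name is illustrative):

      /* Trim the possible/present masks to the VCPUs Xen actually gave us,
       * before xen_setup_vcpu_info_placement() runs. */
      static void __init sketch_filter_cpu_maps(void)
      {
              int i, rc;

              if (!xen_initial_domain())
                      return;

              for_each_possible_cpu(i) {
                      rc = HYPERVISOR_vcpu_op(VCPUOP_is_up, i, NULL);
                      if (rc >= 0) {
                              set_cpu_possible(i, true);
                      } else {
                              set_cpu_possible(i, false);
                              set_cpu_present(i, false);
                      }
              }
      }
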
      Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
  24. 26 October 2010 (1 commit)
  25. 23 October 2010 (6 commits)