1. 21 May, 2011: 4 commits
  2. 13 May, 2011: 2 commits
    • x86,xen: introduce x86_init.mapping.pagetable_reserve · 279b706b
      Committed by Stefano Stabellini
      Introduce a new x86_init hook called pagetable_reserve that is used at
      the end of init_memory_mapping to reserve the range of memory addresses
      holding the kernel pagetable pages we actually used, and to free the
      other ones.
      
      On native it just calls memblock_x86_reserve_range, while on Xen it
      also takes care of flipping the spare memory previously allocated for
      kernel pagetable pages from RO back to RW, so that it can be used for
      other purposes.
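
      As a rough sketch (not the literal patch; exact field placement and
      signatures may differ), the hook and its native implementation look
      something like this:

          /* Sketch of the new hook, using the pgt_buf_* naming of this series. */
          struct x86_init_mapping {
                  /* reserve the pagetable pages actually used, free the rest */
                  void (*pagetable_reserve)(u64 start, u64 end);
          };

          /* Native implementation: only the memblock reservation is needed. */
          void native_pagetable_reserve(u64 start, u64 end)
          {
                  memblock_x86_reserve_range(start, end, "PGTABLE");
          }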
      
      A detailed explanation of the reason why this hook is needed follows.
      
      As a consequence of the commit:
      
      commit 4b239f45
      Author: Yinghai Lu <yinghai@kernel.org>
      Date:   Fri Dec 17 16:58:28 2010 -0800
      
          x86-64, mm: Put early page table high
      
      at some point init_memory_mapping is going to reach the pagetable pages
      area and map those pages too (mapping them as normal memory that falls
      in the range of addresses passed to init_memory_mapping as argument).
      Some of those pages are already pagetable pages (they are in the range
      pgt_buf_start-pgt_buf_end) therefore they are going to be mapped RO and
      everything is fine.
      Some of these pages are not pagetable pages yet (they fall in the range
      pgt_buf_end-pgt_buf_top; for example the page at pgt_buf_end) so they
      are going to be mapped RW.  When these pages become pagetable pages and
      are hooked into the pagetable, Xen will find that the guest already has
      a RW mapping of them somewhere and will fail the operation.
      The reason Xen requires pagetables to be RO is that the hypervisor needs
      to verify that the pagetables are valid before using them. The validation
      operations are called "pinning" (more details in arch/x86/xen/mmu.c).
      
      In order to fix the issue we mark all the pages in the entire range
      pgt_buf_start-pgt_buf_top as RO; however, when the pagetable allocation
      is completed, only the range pgt_buf_start-pgt_buf_end is reserved by
      init_memory_mapping. Hence the kernel is going to crash as soon as one
      of the pages in the range pgt_buf_end-pgt_buf_top is reused (because
      that range is still RO).
      
      For this reason we need a hook to reserve the kernel pagetable pages we
      used and free the other ones so that they can be reused for other
      purposes.
      On native this just means calling memblock_x86_reserve_range; on Xen it
      also means marking RW the pagetable pages that we allocated earlier but
      never actually used.
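
      Conceptually, the end of init_memory_mapping then invokes the hook, and
      the Xen implementation flips the unused tail back to RW before
      reserving (sketch; xen_mark_rw is a hypothetical helper standing in
      for the actual pte fixup):

          /* At the end of init_memory_mapping (sketch): */
          if (pgt_buf_end > pgt_buf_start)
                  x86_init.mapping.pagetable_reserve(PFN_PHYS(pgt_buf_start),
                                                     PFN_PHYS(pgt_buf_end));

          /* Xen-side implementation (sketch): */
          static void xen_mapping_pagetable_reserve(u64 start, u64 end)
          {
                  /* return the unused, still-RO pages to normal RW use */
                  xen_mark_rw(end, PFN_PHYS(pgt_buf_top)); /* hypothetical helper */
                  memblock_x86_reserve_range(start, end, "PGTABLE");
          }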
      
      Another way to fix this without using the hook would be to add an 'if
      (xen_pv_domain())' check to the 'init_memory_mapping' code and call the
      Xen counterpart directly, but that is just nasty.
      Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
      Acked-by: Yinghai Lu <yinghai@kernel.org>
      Acked-by: H. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    • Revert "xen/mmu: Add workaround "x86-64, mm: Put early page table high"" · 92bdaef7
      Committed by Konrad Rzeszutek Wilk
      This reverts commit a3864783.
      
      It does not work with certain AMD machines.
      
      last_pfn = 0x100000 max_arch_pfn = 0x400000000
      initial memory mapped : 0 - 02c3a000
      Base memory trampoline at [ffff88000009b000] 9b000 size 20480
      init_memory_mapping: 0000000000000000-0000000100000000
       0000000000 - 0100000000 page 4k
      kernel direct mapping tables up to 100000000 @ ff7fb000-100000000
      init_memory_mapping: 0000000100000000-00000001e0800000
       0100000000 - 01e0800000 page 4k
      kernel direct mapping tables up to 1e0800000 @ 1df0f3000-1e0000000
      xen: setting RW the range fffdc000 - 100000000
      RAMDISK: 0203b000 - 02c3a000
      No NUMA configuration found
      Faking a node at 0000000000000000-00000001e0800000
      NUMA: Using 63 for the hash shift.
      Initmem setup node 0 0000000000000000-00000001e0800000
        NODE_DATA [00000001dfffb000 - 00000001dfffffff]
      BUG: unable to handle kernel NULL pointer dereference at           (null)
      IP: [<ffffffff81cf6a75>] setup_node_bootmem+0x18a/0x1ea
      PGD 0
      Oops: 0003 [#1] SMP
      last sysfs file:
      CPU 0
      Modules linked in:
      
      Pid: 0, comm: swapper Not tainted 2.6.39-0-virtual #6~smb1
      RIP: e030:[<ffffffff81cf6a75>]  [<ffffffff81cf6a75>] setup_node_bootmem+0x18a/0x1ea
      RSP: e02b:ffffffff81c01e38  EFLAGS: 00010046
      RAX: 0000000000000000 RBX: 00000001e0800000 RCX: 0000000000001040
      RDX: 0000000000004100 RSI: 0000000000000000 RDI: ffff8801dfffb000
      RBP: ffffffff81c01e58 R08: 0000000000000020 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
      R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000bfe400
      FS:  0000000000000000(0000) GS:ffffffff81cca000(0000) knlGS:0000000000000000
      CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000000 CR3: 0000000001c03000 CR4: 0000000000000660
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process swapper (pid: 0, threadinfo ffffffff81c00000, task ffffffff81c0b020)
      Stack:
       0000000000000040 0000000000000001 0000000000000000 ffffffffffffffff
       ffffffff81c01e88 ffffffff81cf6c25 0000000000000000 0000000000000000
       ffffffff81cf687f 0000000000000000 ffffffff81c01ea8 ffffffff81cf6e45
      Call Trace:
       [<ffffffff81cf6c25>] numa_register_memblks.constprop.3+0x150/0x181
       [<ffffffff81cf687f>] ? numa_add_memblk+0x7c/0x7c
       [<ffffffff81cf6e45>] numa_init.part.2+0x1c/0x7c
       [<ffffffff81cf687f>] ? numa_add_memblk+0x7c/0x7c
       [<ffffffff81cf6f67>] numa_init+0x6c/0x70
       [<ffffffff81cf7057>] initmem_init+0x39/0x3b
       [<ffffffff81ce5865>] setup_arch+0x64e/0x769
       [<ffffffff815e43c1>] ? printk+0x51/0x53
       [<ffffffff81cdf92b>] start_kernel+0xd4/0x3f3
       [<ffffffff81cdf388>] x86_64_start_reservations+0x132/0x136
       [<ffffffff81ce2ed4>] xen_start_kernel+0x588/0x58f
      Code: 41 00 00 48 8b 3c c5 a0 24 cc 81 31 c0 40 f6 c7 01 74 05 aa 66 ba ff 40 40 f6 c7 02 74 05 66 ab 83 ea 02 89 d1 c1 e9 02 f6 c2 02 <f3> ab 74 02 66 ab 80 e2 01 74 01 aa 49 63 c4 48 c1 eb 0c 44 89
      RIP  [<ffffffff81cf6a75>] setup_node_bootmem+0x18a/0x1ea
       RSP <ffffffff81c01e38>
      CR2: 0000000000000000
      ---[ end trace a7919e7f17c0a725 ]---
      Kernel panic - not syncing: Attempted to kill the idle task!
      Pid: 0, comm: swapper Tainted: G      D     2.6.39-0-virtual #6~smb1
      Reported-by: Stefan Bader <stefan.bader@canonical.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
  3. 03 May, 2011: 2 commits
    • xen: mask_rw_pte mark RO all pagetable pages up to pgt_buf_top · b9269dc7
      Committed by Stefano Stabellini
      mask_rw_pte currently treats a pfn as a pagetable page if it falls in
      the range pgt_buf_start - pgt_buf_end, but that is incorrect because
      pgt_buf_end is a moving target: pgt_buf_top is the real boundary.
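
      The fix amounts to widening the upper bound of the check, roughly
      (sketch, simplified from the real mask_rw_pte):

          static pte_t __init mask_rw_pte(pte_t *ptep, pte_t pte)
          {
                  unsigned long pfn = pte_pfn(pte);

                  /* was: pfn < pgt_buf_end -- wrong, pgt_buf_end keeps moving */
                  if (pfn >= pgt_buf_start && pfn < pgt_buf_top)
                          pte = pte_wrprotect(pte);
                  return pte;
          }
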
      Acked-by: N"H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    • xen/mmu: Add workaround "x86-64, mm: Put early page table high" · a3864783
      Committed by Konrad Rzeszutek Wilk
      As a consequence of the commit:
      
      commit 4b239f45
      Author: Yinghai Lu <yinghai@kernel.org>
      Date:   Fri Dec 17 16:58:28 2010 -0800
      
          x86-64, mm: Put early page table high
      
      the Linux kernel crashes under Xen:
      
      mapping kernel into physical memory
      Xen: setup ISA identity maps
      about to get started...
      (XEN) mm.c:2466:d0 Bad type (saw 7400000000000001 != exp 1000000000000000) for mfn b1d89 (pfn bacf7)
      (XEN) mm.c:3027:d0 Error while pinning mfn b1d89
      (XEN) traps.c:481:d0 Unhandled invalid opcode fault/trap [#6] on VCPU 0 [ec=0000]
      (XEN) domain_crash_sync called from entry.S
      (XEN) Domain 0 (vcpu#0) crashed on cpu#0:
      ...
      
      The reason is that at some point init_memory_mapping is going to reach
      the pagetable pages area and map those pages too (mapping them as normal
      memory that falls in the range of addresses passed to init_memory_mapping
      as argument). Some of those pages are already pagetable pages (they are
      in the range pgt_buf_start-pgt_buf_end) therefore they are going to be
      mapped RO and everything is fine.
      Some of these pages are not pagetable pages yet (they fall in the range
      pgt_buf_end-pgt_buf_top; for example the page at pgt_buf_end) so they
      are going to be mapped RW.  When these pages become pagetable pages and
      are hooked into the pagetable, Xen will find that the guest already has
      a RW mapping of them somewhere and will fail the operation.
      The reason Xen requires pagetables to be RO is that the hypervisor needs
      to verify that the pagetables are valid before using them. The validation
      operations are called "pinning" (more details in arch/x86/xen/mmu.c).
      
      In order to fix the issue we mark all the pages in the entire range
      pgt_buf_start-pgt_buf_top as RO; however, when the pagetable allocation
      is completed, only the range pgt_buf_start-pgt_buf_end is reserved by
      init_memory_mapping. Hence the kernel is going to crash as soon as one
      of the pages in the range pgt_buf_end-pgt_buf_top is reused (because
      that range is still RO).
      
      For this reason this function is introduced; it is called _after_
      init_memory_mapping has completed (in a perfect world we would call it
      from init_memory_mapping, but let's ignore that).
      
      Because we are called _after_ init_memory_mapping, the pgt_buf_[start,
      end,top] values have all changed to new ones (because another
      init_memory_mapping call has run). Hence, the first time we enter this
      function, we save away the pgt_buf_start value and update
      pgt_buf_[end,top].
      
      When we detect that the "old" pgt_buf_start through pgt_buf_end PFNs
      have been reserved (i.e. memblock_x86_reserve_range has been called),
      we immediately set the "old" pgt_buf_end through pgt_buf_top range back
      to RW.
      
      And then we update those "old" pgt_buf_[end|top] with the new ones
      so that we can redo this on the next pagetable.
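
      In code terms the bookkeeping looks roughly like this (sketch;
      range_is_reserved and set_range_rw are hypothetical stand-ins for the
      real memblock and pte helpers):

          static unsigned long old_start, old_end, old_top;

          /* called after each init_memory_mapping has completed */
          void xen_pagetable_workaround(void)
          {
                  if (!old_start) {
                          /* first invocation: just record the start */
                          old_start = pgt_buf_start;
                  } else if (range_is_reserved(old_start, old_end)) {
                          /* the old tables are final: unprotect the unused tail */
                          set_range_rw(old_end, old_top);
                  }
                  /* track this round's values so we can redo this next time */
                  old_end = pgt_buf_end;
                  old_top = pgt_buf_top;
          }
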
      Acked-by: N"H. Peter Anvin" <hpa@zytor.com>
      Reviewed-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      [v1: Updated with Jeremy's comments]
      [v2: Added the crash output]
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
  4. 20 Apr, 2011: 1 commit
    • xen: mask_rw_pte: do not apply the early_ioremap checks on x86_32 · ee176455
      Committed by Stefano Stabellini
      The two "is_early_ioremap_ptep" checks in mask_rw_pte are only used on
      x86_64, in fact early_ioremap is not used at all to setup the initial
      pagetable on x86_32.
      Moreover on x86_32 the two checks are wrong because the range
      pgt_buf_start..pgt_buf_end initially should be mapped RW because
      the pages in the range are not pagetable pages yet and haven't been
      cleared yet. Afterwards considering the pgt_buf_start..pgt_buf_end is
      part of the initial mapping, xen_alloc_pte is capable of turning
      the ptes RO when they become pagetable pages.
      
      Fix the issue and improve the readability of the code providing two
      different implementation of mask_rw_pte for x86_32 and x86_64.
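
      A sketch of the resulting split (simplified; names as in this series):

          #ifdef CONFIG_X86_32
          static pte_t __init mask_rw_pte(pte_t *ptep, pte_t pte)
          {
                  /* An existing present pte means the page is (or is about
                   * to become) a pagetable page: don't let _PAGE_RW be set. */
                  if (pte_val_ma(*ptep) & _PAGE_PRESENT)
                          pte = __pte_ma(pte_val_ma(pte) & ~_PAGE_RW);
                  return pte;
          }
          #else /* CONFIG_X86_64 */
          static pte_t __init mask_rw_pte(pte_t *ptep, pte_t pte)
          {
                  unsigned long pfn = pte_pfn(pte);

                  /* keep the pgt_buf_*-range check on 64-bit */
                  if (pfn >= pgt_buf_start && pfn < pgt_buf_top)
                          pte = pte_wrprotect(pte);
                  return pte;
          }
          #endif
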
      Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
  5. 05 Apr, 2011: 1 commit
  6. 20 Mar, 2011: 2 commits
    • xen: update mask_rw_pte after kernel page tables init changes · d8aa5ec3
      Committed by Stefano Stabellini
      After "x86-64, mm: Put early page table high" already existing kernel
      page table pages can be mapped using early_ioremap too so we need to
      update mask_rw_pte to make sure these pages are still mapped RO.
      The reason why we have to do that is explain by the commit message of
      fef5ba79:
      
      "Xen requires that all pages containing pagetable entries to be mapped
      read-only.  If pages used for the initial pagetable are already mapped
      then we can change the mapping to RO.  However, if they are initially
      unmapped, we need to make sure that when they are later mapped, they
      are also mapped RO.
      
      ..SNIP..
      
      the pagetable setup code early_ioremaps the pages to write their
      entries, so we must make sure that mappings created in the early_ioremap
      fixmap area are mapped RW.  (Those mappings are removed before the pages
      are presented to Xen as pagetable pages.)"
      
      We accomplish all this in mask_rw_pte by mapping RO all the pages
      mapped using early_ioremap, apart from the last one allocated, because
      that one is not a page table page yet (it has not been hooked into the
      page tables).
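
      Sketched as a condition inside mask_rw_pte (simplified):

          /* RO everything early_ioremap'd for pagetable setup, except the
           * page being written right now: it is not hooked in yet. */
          if (is_early_ioremap_ptep(ptep) && pfn != (pgt_buf_end - 1))
                  pte = pte_wrprotect(pte);
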
      Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
      Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      LKML-Reference: <alpine.DEB.2.00.1103171739050.3382@kaball-desktop>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
    • xen: set max_pfn_mapped to the last pfn mapped · 14988a4d
      Committed by Stefano Stabellini
      Do not set max_pfn_mapped to the end of the initial memory mappings,
      which also contain pages that don't belong in pfn space (like the mfn
      list).
      
      Set max_pfn_mapped to the last real pfn mapped in the initial memory
      mappings, that is, the pfn backing _end.
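
      In sketch form (assuming _end is the usual linker symbol marking the
      end of the kernel image; the exact expression in the patch may differ):

          extern char _end[];

          /* clamp to the pfn backing _end rather than the end of the
           * initial mappings, which also cover the mfn list */
          max_pfn_mapped = PFN_DOWN(__pa(_end));
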
      Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
      Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      LKML-Reference: <alpine.DEB.2.00.1103171739050.3382@kaball-desktop>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  7. 18 Mar, 2011: 1 commit
  8. 14 Mar, 2011: 4 commits
  9. 10 Mar, 2011: 1 commit
  10. 04 Mar, 2011: 1 commit
    • xen: Mark all initial reserved pages for the balloon as INVALID_P2M_ENTRY. · 6eaa412f
      Committed by Konrad Rzeszutek Wilk
      With this patch, we diligently set regions that will be used by the
      balloon driver to INVALID_P2M_ENTRY and place them under the ownership
      of the balloon driver. We are OK using __set_phys_to_machine as we do
      not expect to allocate any P2M middle or entry pages.
      set_phys_to_machine has the side effect of potentially allocating new
      pages, and we do not want that at this stage.
      
      We can do this because xen_build_mfn_list_list will have already
      allocated all such pages up to xen_max_p2m_pfn.
      
      We also move the check for auto translated physmap down the
      stack so it is present in __set_phys_to_machine.
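
      A sketch of the idea (loop bounds hypothetical):

          /* Mark the balloon-owned range invalid in the P2M.  The double
           * underscore variant never allocates P2M mid/entry pages, which
           * xen_build_mfn_list_list has already set up to xen_max_p2m_pfn. */
          for (pfn = reserved_start; pfn < reserved_end; pfn++)
                  __set_phys_to_machine(pfn, INVALID_P2M_ENTRY);
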
      
      [v2: Rebased with mmu->p2m code split]
      Reviewed-by: Ian Campbell <ian.campbell@citrix.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
  11. 24 Feb, 2011: 1 commit
    • x86: Rename e820_table_* to pgt_buf_* · d1b19426
      Committed by Yinghai Lu
      e820_table_{start|end|top}, which are used to buffer page table
      allocation during early boot, are now derived from memblock and don't
      have much to do with e820.  Change the names so that they reflect what
      they're used for.
      
      This patch doesn't introduce any behavior change.
      
      -v2: Ingo found that the earlier patch "x86: Use early pre-allocated
           page table buffer top-down" caused a crash on 32-bit and needed
           to be dropped.  This patch was updated to reflect that change.
      
      -tj: Updated commit description.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Tejun Heo <tj@kernel.org>
  12. 15 Jan, 2011: 1 commit
  13. 12 Jan, 2011: 1 commit
  14. 03 Dec, 2010: 1 commit
    • vmalloc: eagerly clear ptes on vunmap · 64141da5
      Committed by Jeremy Fitzhardinge
      On stock 2.6.37-rc4, running:
      
        # mount lilith:/export /mnt/lilith
        # find  /mnt/lilith/ -type f -print0 | xargs -0 file
      
      crashes the machine fairly quickly under Xen.  Often it results in oops
      messages, but the couple of times I tried just now, it just hung quietly
      and made Xen print some rude messages:
      
          (XEN) mm.c:2389:d80 Bad type (saw 7400000000000001 != exp
          3000000000000000) for mfn 1d7058 (pfn 18fa7)
          (XEN) mm.c:964:d80 Attempt to create linear p.t. with write perms
          (XEN) mm.c:2389:d80 Bad type (saw 7400000000000010 != exp
          1000000000000000) for mfn 1d2e04 (pfn 1d1fb)
          (XEN) mm.c:2965:d80 Error while pinning mfn 1d2e04
      
      Which means the domain tried to map a pagetable page RW, which would
      allow it to map arbitrary memory, so Xen stopped it.  This is because
      vm_unmap_ram() left some pages mapped in the vmalloc area after NFS had
      finished with them, and those pages got recycled as pagetable pages
      while still having these RW aliases.
      
      Removing those mappings immediately removes the Xen-visible aliases, and
      so it has no problem with those pages being reused as pagetable pages.
      Deferring the TLB flush doesn't upset Xen because it can flush the TLB
      itself as needed to maintain its invariants.
      
      When unmapping a region in the vmalloc space, clear the ptes
      immediately.  There's no point in deferring this because there's no
      amortization benefit.
      
      The TLBs are left dirty, and they are flushed lazily to amortize the
      cost of the IPIs.
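
      The change, in sketch form (simplified from mm/vmalloc.c; helper names
      approximate):

          static void free_unmap_vmap_area(struct vmap_area *va)
          {
                  /* clear the ptes immediately: this removes the alias
                   * that Xen would otherwise see on the recycled page */
                  vunmap_page_range(va->va_start, va->va_end);
                  /* the TLB flush stays lazy and batched to amortize IPIs */
                  free_vmap_area_noflush(va);
          }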
      
      The specific motivation for this patch is an oops-causing regression
      since 2.6.36 when using NFS under Xen, triggered by the NFS client's
      use of vm_map_ram() introduced in 56e4ebf8 ("NFS: readdir with vmapped
      pages").  XFS also uses vm_map_ram() and could cause similar problems.
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Bryan Schumaker <bjschuma@netapp.com>
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Cc: Alex Elder <aelder@sgi.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  15. 30 Nov, 2010: 1 commit
    • xen: x86/32: perform initial startup on initial_page_table · 805e3f49
      Committed by Ian Campbell
      Only make swapper_pg_dir readonly and pinned when generic x86 architecture code
      (which also starts on initial_page_table) switches to it.  This helps ensure
      that the generic setup paths work on Xen unmodified. In particular
      clone_pgd_range writes directly to the destination pgd entries and is used to
      initialise swapper_pg_dir so we need to ensure that it remains writeable until
      the last possible moment during bring up.
      
      This is complicated slightly by the need to avoid sharing kernel PMD entries
      when running under Xen, therefore the Xen implementation must make a copy of
      the kernel PMD (which is otherwise referred to by both initial_page_table and
      swapper_pg_dir) before switching to swapper_pg_dir.
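
      Conceptually (sketch; identifiers hypothetical), the Xen side does
      something like this before the switch:

          /* Give swapper_pg_dir its own copy of the kernel PMD so that it
           * does not share PMD pages with initial_page_table. */
          copy_page(swapper_kernel_pmd, initial_kernel_pmd);
          set_pgd(&swapper_pg_dir[KERNEL_PGD_BOUNDARY],
                  __pgd(__pa(swapper_kernel_pmd) | _PAGE_PRESENT));
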
      Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@linux.intel.com>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
  16. 25 Nov, 2010: 1 commit
    • xen: x86/32: perform initial startup on initial_page_table · 5b5c1af1
      Committed by Ian Campbell
      Only make swapper_pg_dir readonly and pinned when generic x86 architecture code
      (which also starts on initial_page_table) switches to it.  This helps ensure
      that the generic setup paths work on Xen unmodified. In particular
      clone_pgd_range writes directly to the destination pgd entries and is used to
      initialise swapper_pg_dir so we need to ensure that it remains writeable until
      the last possible moment during bring up.
      
      This is complicated slightly by the need to avoid sharing kernel PMD entries
      when running under Xen, therefore the Xen implementation must make a copy of
      the kernel PMD (which is otherwise referred to by both initial_page_table and
      swapper_pg_dir) before switching to swapper_pg_dir.
      Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
      Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@linux.intel.com>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
  17. 13 Nov, 2010: 1 commit
  18. 12 Nov, 2010: 1 commit
  19. 30 Oct, 2010: 1 commit
  20. 23 Oct, 2010: 12 commits