1. 18 December 2014 (1 commit)
  2. 16 December 2014 (2 commits)
    • x86: mm: consolidate VM_FAULT_RETRY handling · 26178ec1
      Linus Torvalds committed
      The VM_FAULT_RETRY handling was confusing and incorrect for the case of
      returning to kernel mode.  We need to handle the exception table fixup
      if we return to kernel mode due to a fatal signal - it will basically
      look to the kernel user mode access like the access failed due to the VM
      going away from under it.  Which is correct - the process is dying - and
      avoids the whole "repeat endless kernel page faults" case.
      
      Handling the VM_FAULT_RETRY early and in just one place also simplifies
      the mmap_sem handling, since once we've taken care of VM_FAULT_RETRY we
      know that we can just drop the lock.  The remaining accounting and
      possible error handling is thread-local and does not need the mmap_sem.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
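      A minimal sketch of the resulting flow (condensed and illustrative,
      not a verbatim quote of the arch/x86/mm/fault.c hunk):

      	if (unlikely(fault & VM_FAULT_RETRY)) {
      		/* mmap_sem has already been released by this point */
      		if (flags & FAULT_FLAG_ALLOW_RETRY) {
      			flags &= ~FAULT_FLAG_ALLOW_RETRY;
      			flags |= FAULT_FLAG_TRIED;	/* retry at most once */
      			if (!fatal_signal_pending(tsk))
      				goto retry;
      		}
      		/* Returning to user mode: the fatal signal is handled there */
      		if (flags & FAULT_FLAG_USER)
      			return;
      		/* Kernel mode: take the exception-table fixup path or die */
      		no_context(regs, error_code, address, SIGBUS, BUS_ADRERR);
      		return;
      	}

      	up_read(&mm->mmap_sem);	/* remaining work is thread-local */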
    • x86: mm: move mmap_sem unlock from mm_fault_error() to caller · 7fb08eca
      Linus Torvalds committed
      This replaces four copies in various stages of mm_fault_error() handling
      with just a single one.  It will also allow for more natural placement
      of the unlocking after some further cleanup.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 14 December 2014 (5 commits)
  4. 12 December 2014 (3 commits)
    • ftrace/x86: Update i386 call to prepare_ftrace_return() · f823b37b
      Steven Rostedt (Red Hat) committed
      The parameters for prepare_ftrace_return() used by the function graph
      tracer were swapped to simplify the code on x86_64. But the i386 function
      graph trampoline also calls this function, and it did not have its
      parameters swapped.
      
      Link: http://lkml.kernel.org/r/20141210231732.GA24163@wfg-t540p.sh.intel.com
      Reported-by: Fengguang Wu <fengguang.wu@intel.com>
      Tested-by: Fengguang Wu <fengguang.wu@intel.com>
      Fixes: 6a06bdbf "ftrace/fgraph/x86: Have prepare_ftrace_return() take ip as first parameter"
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
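      For context, the parameter swap referenced in the Fixes line changes
      the prototype shape roughly as follows (illustration only; the actual
      fix adjusts the order in which the i386 assembly trampoline passes
      its arguments):

      	/* before 6a06bdbf: parent return-address slot came first */
      	void prepare_ftrace_return(unsigned long *parent,
      				   unsigned long self_addr,
      				   unsigned long frame_pointer);

      	/* after: the instruction pointer comes first, so the i386
      	 * trampoline must now pass its arguments in this order too */
      	void prepare_ftrace_return(unsigned long ip,
      				   unsigned long *parent,
      				   unsigned long frame_pointer);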
    • arch: Add lightweight memory barriers dma_rmb() and dma_wmb() · 1077fa36
      Alexander Duyck committed
      There are a number of situations where the mandatory barriers rmb() and
      wmb() are used to order memory/memory operations in the device drivers
      and those barriers are much heavier than they actually need to be.  For
      example in the case of PowerPC wmb() calls the heavy-weight sync
      instruction when for coherent memory operations all that is really needed
      is an lwsync or eieio instruction.
      
      This commit adds a coherent only version of the mandatory memory barriers
      rmb() and wmb().  In most cases this should result in the barrier being the
      same as the SMP barriers for the SMP case, however in some cases we use a
      barrier that is somewhere in between rmb() and smp_rmb().  For example on
      ARM the rmb barriers break down as follows:
      
        Barrier   Call     Explanation
        --------- -------- ----------------------------------
        rmb()     dsb()    Data synchronization barrier - system
        dma_rmb() dmb(osh) Data memory barrier - outer shareable
        smp_rmb() dmb(ish) Data memory barrier - inner shareable
      
      These new barriers are not as safe as the standard rmb() and wmb().
      Specifically they do not guarantee ordering between coherent and incoherent
      memories.  The primary use case for these would be to enforce ordering of
      reads and writes when accessing coherent memory that is shared between the
      CPU and a device.
      
      It may also be noted that there is no dma_mb().  Most architectures don't
      provide a good mechanism for performing a coherent only full barrier without
      resorting to the same mechanism used in mb().  As such there isn't much to
      be gained in trying to define such a function.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Michael Ellerman <michael@ellerman.id.au>
      Cc: Michael Neuling <mikey@neuling.org>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: David Miller <davem@davemloft.net>
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
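      The intended use looks like this hypothetical NIC driver fragment
      (the descriptor layout and the RX_DONE/TX_OWNED flags are made up
      for illustration; only dma_rmb()/dma_wmb() come from this commit):

      	/* rx: order the status check against the payload reads */
      	if (desc->status & RX_DONE) {
      		dma_rmb();		/* status is read before data */
      		len = desc->len;
      		process_packet(desc->data, len);
      	}

      	/* tx: make the descriptor fields visible to the device
      	 * before handing the descriptor over */
      	desc->addr = buf_dma;
      	desc->len  = len;
      	dma_wmb();
      	desc->status = TX_OWNED;	/* device may now consume it */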
    • arch: Cleanup read_barrier_depends() and comments · 8a449718
      Alexander Duyck committed
      This patch is meant to cleanup the handling of read_barrier_depends and
      smp_read_barrier_depends.  In multiple spots in the kernel headers
      read_barrier_depends is defined as "do {} while (0)", however we then go
      into the SMP vs non-SMP sections and have the SMP version reference
      read_barrier_depends, and the non-SMP one define it as yet another empty
      do/while.
      
      With this commit I went through and cleaned out the duplicate definitions
      and reduced the number of definitions down to 2 per header.  In addition I
      moved the 50 line comments for the macro from the x86 and mips headers that
      defined it as an empty do/while to those that were actually defining the
      macro, alpha and blackfin.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
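      The per-header pattern after the cleanup looks roughly like this
      sketch (empty on every architecture except alpha and blackfin, which
      provide real barrier instructions):

      	#define read_barrier_depends()		do { } while (0)

      	#ifdef CONFIG_SMP
      	#define smp_read_barrier_depends()	read_barrier_depends()
      	#else
      	#define smp_read_barrier_depends()	do { } while (0)
      	#endif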
  5. 11 December 2014 (8 commits)
    • xen: switch to post-init routines in xen mmu.c earlier · cdfa0bad
      Juergen Gross committed
      With the virtual mapped linear p2m list the post-init mmu operations
      must be used for setting up the p2m mappings, as in case of
      CONFIG_FLATMEM the init routines may trigger BUGs.
      
      paging_init() sets up all infrastructure needed to switch to the
      post-init mmu ops done by xen_post_allocator_init(). With the virtual
      mapped linear p2m list we need some mmu ops during setup of this list,
      so we have to switch to the correct mmu ops as soon as possible.
      
      The p2m list is usable from the beginning; only expanding it requires
      the new linear mapping to be established first. That is why the call
      to xen_remap_memory() had to be introduced, not because the mmu ops
      require it.
      
      Summing it up: calling xen_post_allocator_init() not directly after
      paging_init() was conceptually wrong from the beginning; it just
      didn't matter until now, as no function used between the two calls
      needed any critical mmu ops (e.g. alloc_pte). This has changed now,
      so it is corrected.
      Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: David Vrabel <david.vrabel@citrix.com>
    • x86/asm: Unify segment selector defines · be9d1738
      Borislav Petkov committed
      Those are identical on 32- and 64-bit, unify them. No functional
      change.
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1418127959-29902-1-git-send-email-bp@alien8.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/asm: Guard against building the 32/64-bit versions of the asm-offsets*.c file directly · 5de2b61a
      Borislav Petkov committed
      Sometimes it is helpful to build a kernel compilation unit
      directly, i.e.:
      
        make .../<filename>.i
      
      in order to look at compiler output.
      
      Since asm-offsets_{32,64}.c are included by asm-offsets.c and
      building them directly doesn't evaluate the macros used (thus
      making the preprocessor output not very useful), error out when
      an attempt is made to build them. Issue a hint for the user to
      build asm-offsets.c instead.
      Suggested-by: Michael Matz <matz@suse.de>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Michal Marek <mmarek@suse.cz>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1418139917-12722-1-git-send-email-bp@alien8.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
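      The guard itself is a small preprocessor check at the top of the two
      included files; a sketch of the pattern (the exact macro and message
      are assumptions):

      	/* asm-offsets_{32,64}.c only make sense when included from
      	 * asm-offsets.c, which pulls in the kbuild macros first. */
      	#ifndef __LINUX_KBUILD_H
      	# error "Please do not build this file directly, build asm-offsets.c instead"
      	#endif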
    • x86_64, switch_to(): Load TLS descriptors before switching DS and ES · f647d7c1
      Andy Lutomirski committed
      Otherwise, if buggy user code points DS or ES into the TLS
      array, they would be corrupted after a context switch.
      
      This also significantly improves the comments and documents some
      gotchas in the code.
      
      Before this patch, both tests below failed.  With this
      patch, the es test passes, although the gsbase test still fails.
      
       ----- begin es test -----
      
      /*
       * Copyright (c) 2014 Andy Lutomirski
       * GPL v2
       */

      #include <stdio.h>
      #include <unistd.h>
      #include <err.h>
      #include <sys/syscall.h>
      #include <asm/ldt.h>
      
      static unsigned short GDT3(int idx)
      {
      	return (idx << 3) | 3;
      }
      
      static int create_tls(int idx, unsigned int base)
      {
      	struct user_desc desc = {
      		.entry_number    = idx,
      		.base_addr       = base,
      		.limit           = 0xfffff,
      		.seg_32bit       = 1,
      		.contents        = 0, /* Data, grow-up */
      		.read_exec_only  = 0,
      		.limit_in_pages  = 1,
      		.seg_not_present = 0,
      		.useable         = 0,
      	};
      
      	if (syscall(SYS_set_thread_area, &desc) != 0)
      		err(1, "set_thread_area");
      
      	return desc.entry_number;
      }
      
      int main()
      {
      	int idx = create_tls(-1, 0);
      	printf("Allocated GDT index %d\n", idx);
      
      	unsigned short orig_es;
      	asm volatile ("mov %%es,%0" : "=rm" (orig_es));
      
      	int errors = 0;
      	int total = 1000;
      	for (int i = 0; i < total; i++) {
      		asm volatile ("mov %0,%%es" : : "rm" (GDT3(idx)));
      		usleep(100);
      
      		unsigned short es;
      		asm volatile ("mov %%es,%0" : "=rm" (es));
      		asm volatile ("mov %0,%%es" : : "rm" (orig_es));
      		if (es != GDT3(idx)) {
      			if (errors == 0)
      				printf("[FAIL]\tES changed from 0x%hx to 0x%hx\n",
      				       GDT3(idx), es);
      			errors++;
      		}
      	}
      
      	if (errors) {
      		printf("[FAIL]\tES was corrupted %d/%d times\n", errors, total);
      		return 1;
      	} else {
      		printf("[OK]\tES was preserved\n");
      		return 0;
      	}
      }
      
       ----- end es test -----
      
       ----- begin gsbase test -----
      
      /*
       * gsbase.c, a gsbase test
       * Copyright (c) 2014 Andy Lutomirski
       * GPL v2
       */

      #include <stdio.h>
      #include <unistd.h>
      #include <err.h>
      #include <sys/mman.h>
      #include <sys/syscall.h>
      #include <asm/prctl.h>
      
      static unsigned char *testptr, *testptr2;
      
      static unsigned char read_gs_testvals(void)
      {
      	unsigned char ret;
      	asm volatile ("movb %%gs:%1, %0" : "=r" (ret) : "m" (*testptr));
      	return ret;
      }
      
      int main()
      {
      	int errors = 0;
      
      	testptr = mmap((void *)0x200000000UL, 1, PROT_READ | PROT_WRITE,
      		       MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOUS, -1, 0);
      	if (testptr == MAP_FAILED)
      		err(1, "mmap");
      
      	testptr2 = mmap((void *)0x300000000UL, 1, PROT_READ | PROT_WRITE,
      		       MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOUS, -1, 0);
      	if (testptr2 == MAP_FAILED)
      		err(1, "mmap");
      
      	*testptr = 0;
      	*testptr2 = 1;
      
      	if (syscall(SYS_arch_prctl, ARCH_SET_GS,
      		    (unsigned long)testptr2 - (unsigned long)testptr) != 0)
      		err(1, "ARCH_SET_GS");
      
      	usleep(100);
      
      	if (read_gs_testvals() == 1) {
      		printf("[OK]\tARCH_SET_GS worked\n");
      	} else {
      		printf("[FAIL]\tARCH_SET_GS failed\n");
      		errors++;
      	}
      
      	asm volatile ("mov %0,%%gs" : : "r" (0));
      
      	if (read_gs_testvals() == 0) {
      		printf("[OK]\tWriting 0 to gs worked\n");
      	} else {
      		printf("[FAIL]\tWriting 0 to gs failed\n");
      		errors++;
      	}
      
      	usleep(100);
      
      	if (read_gs_testvals() == 0) {
      		printf("[OK]\tgsbase is still zero\n");
      	} else {
      		printf("[FAIL]\tgsbase was corrupted\n");
      		errors++;
      	}
      
      	return errors == 0 ? 0 : 1;
      }
      
       ----- end gsbase test -----
      Signed-off-by: Andy Lutomirski <luto@amacapital.net>
      Cc: <stable@vger.kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/509d27c9fec78217691c3dad91cec87e1006b34a.1418075657.git.luto@amacapital.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/mm: Use min() instead of min_t() in the e820 printout code · 29258cf4
      Xishi Qiu committed
      The types of "MAX_DMA_PFN" and "xXx_pfn" are both unsigned long
      now, so use min() instead of min_t().
      Suggested-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
      Cc: Linux MM <linux-mm@kvack.org>
      Cc: <dave@sr71.net>
      Cc: Rik van Riel <riel@redhat.com>
      Link: http://lkml.kernel.org/r/5487AB3F.7050807@huawei.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
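      For context, a generic illustration (not the patched line itself) of
      why min() is preferable once the types match:

      	unsigned long dma = MAX_DMA_PFN;	/* both unsigned long now */
      	unsigned long end = max_low_pfn;

      	unsigned long lim = min(dma, end);	/* compiler checks types */
      	/* min_t(unsigned long, dma, end) would cast both sides and
      	 * silently hide a future type mismatch */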
    • x86/mm: Fix zone ranges boot printout · c072b90c
      Xishi Qiu committed
      This is the usual physical memory layout boot printout:
      	...
      	[    0.000000] Zone ranges:
      	[    0.000000]   DMA      [mem 0x00001000-0x00ffffff]
      	[    0.000000]   DMA32    [mem 0x01000000-0xffffffff]
      	[    0.000000]   Normal   [mem 0x100000000-0xc3fffffff]
      	[    0.000000] Movable zone start for each node
      	[    0.000000] Early memory node ranges
      	[    0.000000]   node   0: [mem 0x00001000-0x00099fff]
      	[    0.000000]   node   0: [mem 0x00100000-0xbf78ffff]
      	[    0.000000]   node   0: [mem 0x100000000-0x63fffffff]
      	[    0.000000]   node   1: [mem 0x640000000-0xc3fffffff]
      	...
      
      This is the log when we set "mem=2G" on the boot cmdline:
      	...
      	[    0.000000] Zone ranges:
      	[    0.000000]   DMA      [mem 0x00001000-0x00ffffff]
      	[    0.000000]   DMA32    [mem 0x01000000-0xffffffff]  // should be 0x7fffffff, right?
      	[    0.000000]   Normal   empty
      	[    0.000000] Movable zone start for each node
      	[    0.000000] Early memory node ranges
      	[    0.000000]   node   0: [mem 0x00001000-0x00099fff]
      	[    0.000000]   node   0: [mem 0x00100000-0x7fffffff]
      	...
      
      This patch fixes the printout, the following log shows the right
      ranges:
      	...
      	[    0.000000] Zone ranges:
      	[    0.000000]   DMA      [mem 0x00001000-0x00ffffff]
      	[    0.000000]   DMA32    [mem 0x01000000-0x7fffffff]
      	[    0.000000]   Normal   empty
      	[    0.000000] Movable zone start for each node
      	[    0.000000] Early memory node ranges
      	[    0.000000]   node   0: [mem 0x00001000-0x00099fff]
      	[    0.000000]   node   0: [mem 0x00100000-0x7fffffff]
      	...
      Suggested-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
      Cc: Linux MM <linux-mm@kvack.org>
      Cc: <dave@sr71.net>
      Cc: Rik van Riel <riel@redhat.com>
      Link: http://lkml.kernel.org/r/5487AB3D.6070306@huawei.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
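      The fix is presumably a clamp of the static zone ceilings to the
      actual end of memory, along these lines (a sketch; the exact call
      site in the x86 zone setup code is an assumption):

      	max_zone_pfns[ZONE_DMA]   = min(MAX_DMA_PFN,   max_low_pfn);
      	max_zone_pfns[ZONE_DMA32] = min(MAX_DMA32_PFN, max_low_pfn);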
    • mm: fix huge zero page accounting in smaps report · c164e038
      Kirill A. Shutemov committed
      Like the small zero page, the huge zero page should not be accounted
      as a normal page in the smaps report.
      
      For small pages we rely on vm_normal_page() to filter out the zero
      page, but vm_normal_page() is not designed to handle pmds.  We only
      get here due to a hackish cast from pmd to pte in smaps_pte_range()
      -- the pte and pmd formats are not necessarily compatible on each
      and every architecture.
      
      Let's add a separate codepath to handle pmds.  follow_trans_huge_pmd()
      will detect the huge zero page for us.
      
      We would need pmd_dirty() helper to do this properly.  The patch adds it
      to THP-enabled architectures which don't yet have one.
      
      [akpm@linux-foundation.org: use do_div to fix 32-bit build]
      Signed-off-by: "Kirill A. Shutemov" <kirill@shutemov.name>
      Reported-by: Fengguang Wu <fengguang.wu@intel.com>
      Tested-by: Fengwei Yin <yfw.kernel@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
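      On x86, for example, such a helper can reuse the pmd flag accessor,
      since the hardware dirty bit sits in the same position as in a pte
      (a sketch of the shape, not necessarily the exact hunk):

      	static inline int pmd_dirty(pmd_t pmd)
      	{
      		return pmd_flags(pmd) & _PAGE_DIRTY;
      	}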
    • net, lib: kill arch_fast_hash library bits · 0cb6c969
      Daniel Borkmann committed
      As there are now no remaining users of arch_fast_hash(), let's kill
      it entirely.
      
      This basically reverts commit 71ae8aac ("lib: introduce arch
      optimized hash library") and follow-up work, that is f.e., commit
      23721754 ("lib: hash: follow-up fixups for arch hash"),
      commit e3fec2f7 ("lib: Add missing arch generic-y entries for
      asm-generic/hash.h") and last but not least commit 6a02652d
      ("perf tools: Fix include for non x86 architectures").
      
      Cc: Francesco Fusco <fusco@ntop.org>
      Cc: Thomas Graf <tgraf@suug.ch>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 10 December 2014 (3 commits)
  7. 08 December 2014 (6 commits)
  8. 06 December 2014 (4 commits)
  9. 04 December 2014 (8 commits)
    • xen: Speed up set_phys_to_machine() by using read-only mappings · 2e917175
      Juergen Gross committed
      Instead of checking at each call of set_phys_to_machine() whether a
      new p2m page has to be allocated due to writing an entry in a large
      invalid or identity area, just map those areas read only and react
      to a page fault on write by allocating the new page.
      
      This change will make the common path with no allocation much
      faster as it only requires a single write of the new mfn instead
      of walking the address translation tables and checking for the
      special cases.
      Suggested-by: David Vrabel <david.vrabel@citrix.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Reviewed-by: David Vrabel <david.vrabel@citrix.com>
      Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: David Vrabel <david.vrabel@citrix.com>
    • xen: switch to linear virtual mapped sparse p2m list · 054954eb
      Juergen Gross committed
      At start of the day the Xen hypervisor presents a contiguous mfn list
      to a pv-domain. In order to support sparse memory this mfn list is
      accessed via a three level p2m tree built early in the boot process.
      Whenever the system needs the mfn associated with a pfn this tree is
      used to find the mfn.
      
      Instead of using a software-walked tree to access a specific mfn
      list entry, this patch creates a virtual address area for the
      entire possible mfn list, including memory holes. The holes are
      covered by mapping a pre-defined page consisting only of "invalid
      mfn" entries. An mfn entry can then be accessed by just using the
      virtual base address of the mfn list with the pfn as index into
      that list. This speeds up the (hot) path of determining the mfn of
      a pfn.
      
      Kernel build on a Dell Latitude E6440 (2 cores, HT) in 64 bit Dom0
      showed the following improvements:
      
      Elapsed time: 32:50 ->  32:35
      System:       18:07 ->  17:47
      User:        104:00 -> 103:30
      
      Tested with following configurations:
      - 64 bit dom0, 8GB RAM
      - 64 bit dom0, 128 GB RAM, PCI-area above 4 GB
      - 32 bit domU, 512 MB, 8 GB, 43 GB (more wouldn't work even without
                                          the patch)
      - 32 bit domU, ballooning up and down
      - 32 bit domU, save and restore
      - 32 bit domU with PCI passthrough
      - 64 bit domU, 8 GB, 2049 MB, 5000 MB
      - 64 bit domU, ballooning up and down
      - 64 bit domU, save and restore
      - 64 bit domU with PCI passthrough
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: David Vrabel <david.vrabel@citrix.com>
    • xen: Hide get_phys_to_machine() to be able to tune common path · 0aad5689
      Juergen Gross committed
      Today get_phys_to_machine() is always called when the mfn for a pfn
      is to be obtained. Add an inline wrapper __pfn_to_mfn() so that
      calls to get_phys_to_machine() can be avoided where possible once
      the switch to a linear mapped p2m list has been done.
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Reviewed-by: David Vrabel <david.vrabel@citrix.com>
      Signed-off-by: David Vrabel <david.vrabel@citrix.com>
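      A sketch of the wrapper's eventual role (the exact body is an
      assumption; xen_p2m_addr is the virtual base of the linear list from
      the related commits): the common case becomes a plain array index,
      with get_phys_to_machine() kept as the slow path.

      	static inline unsigned long __pfn_to_mfn(unsigned long pfn)
      	{
      		unsigned long mfn = xen_p2m_addr[pfn];	/* linear lookup */

      		if (unlikely(mfn == INVALID_P2M_ENTRY))
      			return get_phys_to_machine(pfn);	/* slow path */

      		return mfn;
      	}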
    • x86: Introduce function to get pmd entry pointer · 792230c3
      Juergen Gross committed
      Introduces lookup_pmd_address() to get the address of the pmd entry
      related to a virtual address in the current address space. This
      function is needed to support a virtually mapped sparse p2m list in
      xen pv domains, as in that case we need the address of the pmd
      entry rather than that of the pte.
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: David Vrabel <david.vrabel@citrix.com>
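      A sketch of what such a helper looks like, mirroring the classic
      lookup_address() page-table walk (the real function may differ in
      its large-page handling):

      	pmd_t *lookup_pmd_address(unsigned long address)
      	{
      		pgd_t *pgd = pgd_offset_k(address);
      		pud_t *pud;

      		if (pgd_none(*pgd))
      			return NULL;

      		pud = pud_offset(pgd, address);
      		if (pud_none(*pud) || pud_large(*pud))
      			return NULL;	/* unmapped or huge mapping */

      		return pmd_offset(pud, address);
      	}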
    • xen: Delay invalidating extra memory · 5b8e7d80
      Juergen Gross committed
      When the physical memory configuration is initialized, the p2m
      entries for not yet populated memory pages are set to "invalid". As
      those pages are beyond the hypervisor-built p2m list, the p2m tree
      has to be extended.
      
      This patch delays processing the extra memory related p2m entries
      during the boot process until some more basic memory management
      functions are callable. This removes the need to create new p2m
      entries until virtual memory management is available.
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Reviewed-by: David Vrabel <david.vrabel@citrix.com>
      Signed-off-by: David Vrabel <david.vrabel@citrix.com>
    • xen: Delay m2p_override initialization · 97f4533a
      Juergen Gross committed
      The m2p overrides exist to allow finding the local pfn for a
      foreign mfn mapped into the domain. They are used by driver
      backends that have to access frontend data.
      
      As this functionality isn't used in early boot it makes no sense to
      initialize the m2p override functions very early. It can be done
      later without doing any harm, removing the need for allocating memory
      via extend_brk().
      
      While at it make some m2p override functions static as they are only
      used internally.
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Reviewed-by: David Vrabel <david.vrabel@citrix.com>
      Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: David Vrabel <david.vrabel@citrix.com>
    • xen: Delay remapping memory of pv-domain · 1f3ac86b
      Juergen Gross committed
      Early in the boot process the memory layout of a pv-domain is changed
      to match the E820 map (either the host one for Dom0 or the Xen one)
      regarding placement of RAM and PCI holes. This requires removing memory
      pages initially located at positions not suitable for RAM and adding
      them later at higher addresses where no restrictions apply.
      
      To be able to operate on the hypervisor-supplied p2m list until a
      virtually mapped linear p2m list can be constructed, remapping must
      be delayed until virtual memory management is initialized, as the
      initial p2m list can't be extended without limit at physical memory
      initialization time due to its fixed structure.
      
      A further advantage is the reduction in complexity and code volume as
      we don't have to be careful regarding memory restrictions during p2m
      updates.
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Reviewed-by: David Vrabel <david.vrabel@citrix.com>
      Signed-off-by: David Vrabel <david.vrabel@citrix.com>
    • xen: use common page allocation function in p2m.c · 7108c9ce
      Juergen Gross committed
      In arch/x86/xen/p2m.c three different allocation functions for
      obtaining a memory page are used: extend_brk(), alloc_bootmem_align()
      or __get_free_page().  Which of those functions is used depends on the
      progress of the boot process of the system.
      
      Introduce a common allocation routine that dynamically selects which
      of those routines to call based on boot progress. This allows moving
      initialization steps around without having to adjust allocation
      calls.
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: David Vrabel <david.vrabel@citrix.com>
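      A sketch of the dispatch (the helper name and the exact phase checks
      are assumptions; the point is selecting the allocator by boot
      progress):

      	static void * __ref alloc_p2m_page(void)
      	{
      		/* before the slab allocator is up, use bootmem */
      		if (unlikely(!slab_is_available()))
      			return alloc_bootmem_align(PAGE_SIZE, PAGE_SIZE);

      		return (void *)__get_free_page(GFP_KERNEL);
      	}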