1. 23 8月, 2012 1 次提交
    • K
      xen/setup: Fix one-off error when adding for-balloon PFNs to the P2M. · c96aae1f
      Konrad Rzeszutek Wilk 提交于
      When we are finished with return PFNs to the hypervisor, then
      populate it back, and also mark the E820 MMIO and E820 gaps
      as IDENTITY_FRAMEs, we then call P2M to set areas that can
      be used for ballooning. We were off by one, and ended up
      over-writting a P2M entry that most likely was an IDENTITY_FRAME.
      For example:
      
      1-1 mapping on 40000->40200
      1-1 mapping on bc558->bc5ac
      1-1 mapping on bc5b4->bc8c5
      1-1 mapping on bc8c6->bcb7c
      1-1 mapping on bcd00->100000
      Released 614 pages of unused memory
      Set 277889 page(s) to 1-1 mapping
      Populating 40200-40466 pfn range: 614 pages added
      
      => here we set from 40466 up to bc559 P2M tree to be
      INVALID_P2M_ENTRY. We should have done it up to bc558.
      
      The end result is that if anybody is trying to construct
      a PTE for PFN bc558 they end up with ~PAGE_PRESENT.
      
      CC: stable@vger.kernel.org
      Reported-by-and-Tested-by: NAndre Przywara <andre.przywara@amd.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      c96aae1f
  2. 20 7月, 2012 1 次提交
    • Z
      xen: populate correct number of pages when across mem boundary (v2) · c3d93f88
      zhenzhong.duan 提交于
      When populate pages across a mem boundary at bootup, the page count
      populated isn't correct. This is due to mem populated to non-mem
      region and ignored.
      
      Pfn range is also wrongly aligned when mem boundary isn't page aligned.
      
      For a dom0 booted with dom_mem=3368952K(0xcd9ff000-4k) dmesg diff is:
       [    0.000000] Freeing 9e-100 pfn range: 98 pages freed
       [    0.000000] 1-1 mapping on 9e->100
       [    0.000000] 1-1 mapping on cd9ff->100000
       [    0.000000] Released 98 pages of unused memory
       [    0.000000] Set 206435 page(s) to 1-1 mapping
      -[    0.000000] Populating cd9fe-cda00 pfn range: 1 pages added
      +[    0.000000] Populating cd9fe-cd9ff pfn range: 1 pages added
      +[    0.000000] Populating 100000-100061 pfn range: 97 pages added
       [    0.000000] BIOS-provided physical RAM map:
       [    0.000000] Xen: 0000000000000000 - 000000000009e000 (usable)
       [    0.000000] Xen: 00000000000a0000 - 0000000000100000 (reserved)
       [    0.000000] Xen: 0000000000100000 - 00000000cd9ff000 (usable)
       [    0.000000] Xen: 00000000cd9ffc00 - 00000000cda53c00 (ACPI NVS)
      ...
       [    0.000000] Xen: 0000000100000000 - 0000000100061000 (usable)
       [    0.000000] Xen: 0000000100061000 - 000000012c000000 (unusable)
      ...
       [    0.000000] MEMBLOCK configuration:
      ...
      -[    0.000000]  reserved[0x4]       [0x000000cd9ff000-0x000000cd9ffbff], 0xc00 bytes
      -[    0.000000]  reserved[0x5]       [0x00000100000000-0x00000100060fff], 0x61000 bytes
      
      Related xen memory layout:
      (XEN) Xen-e820 RAM map:
      (XEN)  0000000000000000 - 000000000009ec00 (usable)
      (XEN)  00000000000f0000 - 0000000000100000 (reserved)
      (XEN)  0000000000100000 - 00000000cd9ffc00 (usable)
      Signed-off-by: NZhenzhong Duan <zhenzhong.duan@oracle.com>
      [v2: If xen_do_chunk fail(populate), abort this chunk and any others]
      Suggested by David, thanks.
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      c3d93f88
  3. 30 5月, 2012 1 次提交
    • K
      xen/balloon: Subtract from xen_released_pages the count that is populated. · 58b7b53a
      Konrad Rzeszutek Wilk 提交于
      We did not take into account that xen_released_pages would be
      used outside the initial E820 parsing code. As such we would
      did not subtract from xen_released_pages the count of pages
      that we had populated back (instead we just did a simple
      extra_pages = released - populated).
      
      The balloon driver uses xen_released_pages to set the initial
      current_pages count.  If this is wrong (too low) then when a new
      (higher) target is set, the balloon driver will request too many pages
      from Xen."
      
      This fixes errors such as:
      
      (XEN) memory.c:133:d0 Could not allocate order=0 extent: id=0 memflags=0 (51 of 512)
      during bootup and
      free_memory            : 0
      
      where the free_memory should be 128.
      Acked-by: NDavid Vrabel <david.vrabel@citrix.com>
      [v1: Per David's review made the git commit better]
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      58b7b53a
  4. 08 5月, 2012 4 次提交
    • D
      xen/setup: update VA mapping when releasing memory during setup · 83d51ab4
      David Vrabel 提交于
      In xen_memory_setup(), if a page that is being released has a VA
      mapping this must also be updated.  Otherwise, the page will be not
      released completely -- it will still be referenced in Xen and won't be
      freed util the mapping is removed and this prevents it from being
      reallocated at a different PFN.
      
      This was already being done for the ISA memory region in
      xen_ident_map_ISA() but on many systems this was omitting a few pages
      as many systems marked a few pages below the ISA memory region as
      reserved in the e820 map.
      
      This fixes errors such as:
      
      (XEN) page_alloc.c:1148:d0 Over-allocation for domain 0: 2097153 > 2097152
      (XEN) memory.c:133:d0 Could not allocate order=0 extent: id=0 memflags=0 (0 of 17)
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      83d51ab4
    • K
      xen/setup: Combine the two hypercall functions - since they are quite similar. · 96dc08b3
      Konrad Rzeszutek Wilk 提交于
      They use the same set of arguments, so it is just the matter
      of using the proper hypercall.
      Acked-by: NDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      96dc08b3
    • K
      xen/setup: Populate freed MFNs from non-RAM E820 entries and gaps to E820 RAM · 2e2fb754
      Konrad Rzeszutek Wilk 提交于
      When the Xen hypervisor boots a PV kernel it hands it two pieces
      of information: nr_pages and a made up E820 entry.
      
      The nr_pages value defines the range from zero to nr_pages of PFNs
      which have a valid Machine Frame Number (MFN) underneath it. The
      E820 mirrors that (with the VGA hole):
      BIOS-provided physical RAM map:
       Xen: 0000000000000000 - 00000000000a0000 (usable)
       Xen: 00000000000a0000 - 0000000000100000 (reserved)
       Xen: 0000000000100000 - 0000000080800000 (usable)
      
      The fun comes when a PV guest that is run with a machine E820 - that
      can either be the initial domain or a PCI PV guest, where the E820
      looks like the normal thing:
      
      BIOS-provided physical RAM map:
       Xen: 0000000000000000 - 000000000009e000 (usable)
       Xen: 000000000009ec00 - 0000000000100000 (reserved)
       Xen: 0000000000100000 - 0000000020000000 (usable)
       Xen: 0000000020000000 - 0000000020200000 (reserved)
       Xen: 0000000020200000 - 0000000040000000 (usable)
       Xen: 0000000040000000 - 0000000040200000 (reserved)
       Xen: 0000000040200000 - 00000000bad80000 (usable)
       Xen: 00000000bad80000 - 00000000badc9000 (ACPI NVS)
      ..
      With that overlaying the nr_pages directly on the E820 does not
      work as there are gaps and non-RAM regions that won't be used
      by the memory allocator. The 'xen_release_chunk' helps with that
      by punching holes in the P2M (PFN to MFN lookup tree) for those
      regions and tells us that:
      
      Freeing  20000-20200 pfn range: 512 pages freed
      Freeing  40000-40200 pfn range: 512 pages freed
      Freeing  bad80-badf4 pfn range: 116 pages freed
      Freeing  badf6-bae7f pfn range: 137 pages freed
      Freeing  bb000-100000 pfn range: 282624 pages freed
      Released 283999 pages of unused memory
      
      Those 283999 pages are subtracted from the nr_pages and are returned
      to the hypervisor. The end result is that the initial domain
      boots with 1GB less memory as the nr_pages has been subtracted by
      the amount of pages residing within the PCI hole. It can balloon up
      to that if desired using 'xl mem-set 0 8092', but the balloon driver
      is not always compiled in for the initial domain.
      
      This patch, implements the populate hypercall (XENMEM_populate_physmap)
      which increases the the domain with the same amount of pages that
      were released.
      
      The other solution (that did not work) was to transplant the MFN in
      the P2M tree - the ones that were going to be freed were put in
      the E820_RAM regions past the nr_pages. But the modifications to the
      M2P array (the other side of creating PTEs) were not carried away.
      As the hypervisor is the only one capable of modifying that and the
      only two hypercalls that would do this are: the update_va_mapping
      (which won't work, as during initial bootup only PFNs up to nr_pages
      are mapped in the guest) or via the populate hypercall.
      
      The end result is that the kernel can now boot with the
      nr_pages without having to subtract the 283999 pages.
      
      On a 8GB machine, with various dom0_mem= parameters this is what we get:
      
      no dom0_mem
      -Memory: 6485264k/9435136k available (5817k kernel code, 1136060k absent, 1813812k reserved, 2899k data, 696k init)
      +Memory: 7619036k/9435136k available (5817k kernel code, 1136060k absent, 680040k reserved, 2899k data, 696k init)
      
      dom0_mem=3G
      -Memory: 2616536k/9435136k available (5817k kernel code, 1136060k absent, 5682540k reserved, 2899k data, 696k init)
      +Memory: 2703776k/9435136k available (5817k kernel code, 1136060k absent, 5595300k reserved, 2899k data, 696k init)
      
      dom0_mem=max:3G
      -Memory: 2696732k/4281724k available (5817k kernel code, 1136060k absent, 448932k reserved, 2899k data, 696k init)
      +Memory: 2702204k/4281724k available (5817k kernel code, 1136060k absent, 443460k reserved, 2899k data, 696k init)
      
      And the 'xm list' or 'xl list' now reflect what the dom0_mem=
      argument is.
      Acked-by: NDavid Vrabel <david.vrabel@citrix.com>
      [v2: Use populate hypercall]
      [v3: Remove debug printks]
      [v4: Simplify code]
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      2e2fb754
    • K
      xen/setup: Only print "Freeing XXX-YYY pfn range: Z pages freed" if Z > 0 · ca118238
      Konrad Rzeszutek Wilk 提交于
      Otherwise we can get these meaningless:
      Freeing  bad80-badf4 pfn range: 0 pages freed
      
      We also can do this for the summary ones - no point of printing
      "Set 0 page(s) to 1-1 mapping"
      Acked-by: NDavid Vrabel <david.vrabel@citrix.com>
      [v1: Extended to the summary printks]
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      ca118238
  5. 21 3月, 2012 1 次提交
  6. 11 3月, 2012 1 次提交
  7. 16 12月, 2011 1 次提交
    • I
      xen: only limit memory map to maximum reservation for domain 0. · d3db7281
      Ian Campbell 提交于
      d312ae87 "xen: use maximum reservation to limit amount of usable RAM"
      clamped the total amount of RAM to the current maximum reservation. This is
      correct for dom0 but is not correct for guest domains. In order to boot a guest
      "pre-ballooned" (e.g. with memory=1G but maxmem=2G) in order to allow for
      future memory expansion the guest must derive max_pfn from the e820 provided by
      the toolstack and not the current maximum reservation (which can reflect only
      the current maximum, not the guest lifetime max). The existing algorithm
      already behaves this correctly if we do not artificially limit the maximum
      number of pages for the guest case.
      
      For a guest booted with maxmem=512, memory=128 this results in:
       [    0.000000] BIOS-provided physical RAM map:
       [    0.000000]  Xen: 0000000000000000 - 00000000000a0000 (usable)
       [    0.000000]  Xen: 00000000000a0000 - 0000000000100000 (reserved)
      -[    0.000000]  Xen: 0000000000100000 - 0000000008100000 (usable)
      -[    0.000000]  Xen: 0000000008100000 - 0000000020800000 (unusable)
      +[    0.000000]  Xen: 0000000000100000 - 0000000020800000 (usable)
      ...
       [    0.000000] NX (Execute Disable) protection: active
       [    0.000000] DMI not present or invalid.
       [    0.000000] e820 update range: 0000000000000000 - 0000000000010000 (usable) ==> (reserved)
       [    0.000000] e820 remove range: 00000000000a0000 - 0000000000100000 (usable)
      -[    0.000000] last_pfn = 0x8100 max_arch_pfn = 0x1000000
      +[    0.000000] last_pfn = 0x20800 max_arch_pfn = 0x1000000
       [    0.000000] initial memory mapped : 0 - 027ff000
       [    0.000000] Base memory trampoline at [c009f000] 9f000 size 4096
      -[    0.000000] init_memory_mapping: 0000000000000000-0000000008100000
      -[    0.000000]  0000000000 - 0008100000 page 4k
      -[    0.000000] kernel direct mapping tables up to 8100000 @ 27bb000-27ff000
      +[    0.000000] init_memory_mapping: 0000000000000000-0000000020800000
      +[    0.000000]  0000000000 - 0020800000 page 4k
      +[    0.000000] kernel direct mapping tables up to 20800000 @ 26f8000-27ff000
       [    0.000000] xen: setting RW the range 27e8000 - 27ff000
       [    0.000000] 0MB HIGHMEM available.
      -[    0.000000] 129MB LOWMEM available.
      -[    0.000000]   mapped low ram: 0 - 08100000
      -[    0.000000]   low ram: 0 - 08100000
      +[    0.000000] 520MB LOWMEM available.
      +[    0.000000]   mapped low ram: 0 - 20800000
      +[    0.000000]   low ram: 0 - 20800000
      
      With this change "xl mem-set <domain> 512M" will successfully increase the
      guest RAM (by reducing the balloon).
      
      There is no change for dom0.
      Reported-and-Tested-by: NGeorge Shuklin <george.shuklin@gmail.com>
      Signed-off-by: NIan Campbell <ian.campbell@citrix.com>
      Cc: stable@kernel.org
      Reviewed-by: NDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      d3db7281
  8. 04 12月, 2011 1 次提交
    • K
      xen/pm_idle: Make pm_idle be default_idle under Xen. · e5fd47bf
      Konrad Rzeszutek Wilk 提交于
      The idea behind commit d91ee586 ("cpuidle: replace xen access to x86
      pm_idle and default_idle") was to have one call - disable_cpuidle()
      which would make pm_idle not be molested by other code.  It disallows
      cpuidle_idle_call to be set to pm_idle (which is excellent).
      
      But in the select_idle_routine() and idle_setup(), the pm_idle can still
      be set to either: amd_e400_idle, mwait_idle or default_idle.  This
      depends on some CPU flags (MWAIT) and in AMD case on the type of CPU.
      
      In case of mwait_idle we can hit some instances where the hypervisor
      (Amazon EC2 specifically) sets the MWAIT and we get:
      
        Brought up 2 CPUs
        invalid opcode: 0000 [#1] SMP
      
        Pid: 0, comm: swapper Not tainted 3.1.0-0.rc6.git0.3.fc16.x86_64 #1
        RIP: e030:[<ffffffff81015d1d>]  [<ffffffff81015d1d>] mwait_idle+0x6f/0xb4
        ...
        Call Trace:
         [<ffffffff8100e2ed>] cpu_idle+0xae/0xe8
         [<ffffffff8149ee78>] cpu_bringup_and_idle+0xe/0x10
        RIP  [<ffffffff81015d1d>] mwait_idle+0x6f/0xb4
         RSP <ffff8801d28ddf10>
      
      In the case of amd_e400_idle we don't get so spectacular crashes, but we
      do end up making an MSR which is trapped in the hypervisor, and then
      follow it up with a yield hypercall.  Meaning we end up going to
      hypervisor twice instead of just once.
      
      The previous behavior before v3.0 was that pm_idle was set to
      default_idle regardless of select_idle_routine/idle_setup.
      
      We want to do that, but only for one specific case: Xen.  This patch
      does that.
      
      Fixes RH BZ #739499 and Ubuntu #881076
      Reported-by: NStefan Bader <stefan.bader@canonical.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e5fd47bf
  9. 29 9月, 2011 4 次提交
    • D
      xen: release all pages within 1-1 p2m mappings · f3f436e3
      David Vrabel 提交于
      In xen_memory_setup() all reserved regions and gaps are set to an
      identity (1-1) p2m mapping.  If an available page has a PFN within one
      of these 1-1 mappings it will become inaccessible (as it MFN is lost)
      so release them before setting up the mapping.
      
      This can make an additional 256 MiB or more of RAM available
      (depending on the size of the reserved regions in the memory map) if
      the initial pages overlap with reserved regions.
      
      The 1:1 p2m mappings are also extended to cover partial pages.  This
      fixes an issue with (for example) systems with a BIOS that puts the
      DMI tables in a reserved region that begins on a non-page boundary.
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      f3f436e3
    • D
      xen: allow extra memory to be in multiple regions · dc91c728
      David Vrabel 提交于
      Allow the extra memory (used by the balloon driver) to be in multiple
      regions (typically two regions, one for low memory and one for high
      memory).  This allows the balloon driver to increase the number of
      available low pages (if the initial number if pages is small).
      
      As a side effect, the algorithm for building the e820 memory map is
      simpler and more obviously correct as the map supplied by the
      hypervisor is (almost) used as is (in particular, all reserved regions
      and gaps are preserved).  Only RAM regions are altered and RAM regions
      above max_pfn + extra_pages are marked as unused (the region is split
      in two if necessary).
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      dc91c728
    • D
      xen: allow balloon driver to use more than one memory region · 8b5d44a5
      David Vrabel 提交于
      Allow the xen balloon driver to populate its list of extra pages from
      more than one region of memory.  This will allow platforms to provide
      (for example) a region of low memory and a region of high memory.
      
      The maximum possible number of extra regions is 128 (== E820MAX) which
      is quite large so xen_extra_mem is placed in __initdata.  This is safe
      as both xen_memory_setup() and balloon_init() are in __init.
      
      The balloon regions themselves are not altered (i.e., there is still
      only the one region).
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      8b5d44a5
    • D
      xen/balloon: account for pages released during memory setup · aa24411b
      David Vrabel 提交于
      In xen_memory_setup() pages that occur in gaps in the memory map are
      released back to Xen.  This reduces the domain's current page count in
      the hypervisor.  The Xen balloon driver does not correctly decrease
      its initial current_pages count to reflect this.  If 'delta' pages are
      released and the target is adjusted the resulting reservation is
      always 'delta' less than the requested target.
      
      This affects dom0 if the initial allocation of pages overlaps the PCI
      memory region but won't affect most domU guests that have been setup
      with pseudo-physical memory maps that don't have gaps.
      
      Fix this by accouting for the released pages when starting the balloon
      driver.
      
      If the domain's targets are managed by xapi, the domain may eventually
      run out of memory and die because xapi currently gets its target
      calculations wrong and whenever it is restarted it always reduces the
      target by 'delta'.
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      aa24411b
  10. 13 9月, 2011 1 次提交
    • D
      xen/e820: if there is no dom0_mem=, don't tweak extra_pages. · e3b73c4a
      David Vrabel 提交于
      The patch "xen: use maximum reservation to limit amount of usable RAM"
      (d312ae87) breaks machines that
      do not use 'dom0_mem=' argument with:
      
      reserve RAM buffer: 000000133f2e2000 - 000000133fffffff
      (XEN) mm.c:4976:d0 Global bit is set to kernel page fffff8117e
      (XEN) domain_crash_sync called from entry.S
      (XEN) Domain 0 (vcpu#0) crashed on cpu#0:
      ...
      
      The reason being that the last E820 entry is created using the
      'extra_pages' (which is based on how many pages have been freed).
      The mentioned git commit sets the initial value of 'extra_pages'
      using a hypercall which returns the number of pages (if dom0_mem
      has been used) or -1 otherwise. If the later we return with
      MAX_DOMAIN_PAGES as basis for calculation:
      
          return min(max_pages, MAX_DOMAIN_PAGES);
      
      and use it:
      
           extra_limit = xen_get_max_pages();
           if (extra_limit >= max_pfn)
                   extra_pages = extra_limit - max_pfn;
           else
                   extra_pages = 0;
      
      which means we end up with extra_pages = 128GB in PFNs (33554432)
      - 8GB in PFNs (2097152, on this specific box, can be larger or smaller),
      and then we add that value to the E820 making it:
      
        Xen: 00000000ff000000 - 0000000100000000 (reserved)
        Xen: 0000000100000000 - 000000133f2e2000 (usable)
      
      which is clearly wrong. It should look as so:
      
        Xen: 00000000ff000000 - 0000000100000000 (reserved)
        Xen: 0000000100000000 - 000000027fbda000 (usable)
      
      Naturally this problem does not present itself if dom0_mem=max:X
      is used.
      
      CC: stable@kernel.org
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      e3b73c4a
  11. 01 9月, 2011 1 次提交
    • D
      xen: use maximum reservation to limit amount of usable RAM · d312ae87
      David Vrabel 提交于
      Use the domain's maximum reservation to limit the amount of extra RAM
      for the memory balloon. This reduces the size of the pages tables and
      the amount of reserved low memory (which defaults to about 1/32 of the
      total RAM).
      
      On a system with 8 GiB of RAM with the domain limited to 1 GiB the
      kernel reports:
      
      Before:
      
      Memory: 627792k/4472000k available
      
      After:
      
      Memory: 549740k/11132224k available
      
      A increase of about 76 MiB (~1.5% of the unused 7 GiB).  The reserved
      low memory is also reduced from 253 MiB to 32 MiB.  The total
      additional usable RAM is 329 MiB.
      
      For dom0, this requires at patch to Xen ('x86: use 'dom0_mem' to limit
      the number of pages for dom0') (c/s 23790)
      
      CC: stable@kernel.org
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      d312ae87
  12. 05 8月, 2011 2 次提交
  13. 04 8月, 2011 1 次提交
  14. 15 7月, 2011 1 次提交
  15. 17 6月, 2011 1 次提交
    • K
      xen/setup: Fix for incorrect xen_extra_mem_start. · acd049c6
      Konrad Rzeszutek Wilk 提交于
      The earlier attempts (24bdb0b6)
      at fixing this problem caused other problems to surface (PV guests
      with no PCI passthrough would have SWIOTLB turned on - which meant
      64MB of precious contingous DMA32 memory being eaten up per guest).
      The problem was: "on xen we add an extra memory region at the end of
      the e820, and on this particular machine this extra memory region
      would start below 4g and cross over the 4g boundary:
      
      [0xfee01000-0x192655000)
      
      Unfortunately e820_end_of_low_ram_pfn does not expect an
      e820 layout like that so it returns 4g, therefore initial_memory_mapping
      will map [0 - 0x100000000), that is a memory range that includes some
      reserved memory regions."
      
      The memory range was the IOAPIC regions, and with the 1-1 mapping
      turned on, it would map them as RAM, not as MMIO regions. This caused
      the hypervisor to complain. Fortunately this is experienced only under
      the initial domain so we guard for it.
      Acked-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      acd049c6
  16. 13 5月, 2011 3 次提交
  17. 20 4月, 2011 1 次提交
  18. 14 3月, 2011 1 次提交
    • K
      xen/setup: Set identity mapping for non-RAM E820 and E820 gaps. · 68df0da7
      Konrad Rzeszutek Wilk 提交于
      We walk the E820 region and start at 0 (for PV guests we start
      at ISA_END_ADDRESS) and skip any E820 RAM regions. For all other
      regions and as well the gaps we set them to be identity mappings.
      
      The reasons we do not want to set the identity mapping from 0->
      ISA_END_ADDRESS when running as PV is b/c that the kernel would
      try to read DMI information and fail (no permissions to read that).
      There is a lot of gnarly code to deal with that weird region so
      we won't try to do a cleanup in this patch.
      
      This code ends up calling 'set_phys_to_identity' with the start
      and end PFN of the the E820 that are non-RAM or have gaps.
      On 99% of machines that means one big region right underneath the
      4GB mark. Usually starts at 0xc0000 (or 0x80000) and goes to
      0x100000.
      
      [v2: Fix for E820 crossing 1MB region and clamp the start]
      [v3: Squshed in code that does this over ranges]
      [v4: Moved the comment to the correct spot]
      [v5: Use the "raw" E820 from the hypervisor]
      [v6: Added Review-by tag]
      Reviewed-by: NIan Campbell <ian.campbell@citrix.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      68df0da7
  19. 12 3月, 2011 1 次提交
    • K
      xen/e820: Don't mark balloon memory as E820_UNUSABLE when running as guest and fix overflow. · 86b32122
      Konrad Rzeszutek Wilk 提交于
      If we have a guest that asked for:
      
      memory=1024
      maxmem=2048
      
      Which means we want 1GB now, and create pagetables so that we can expand
      up to 2GB, we would have this E820 layout:
      
      [    0.000000] BIOS-provided physical RAM map:
      [    0.000000]  Xen: 0000000000000000 - 00000000000a0000 (usable)
      [    0.000000]  Xen: 00000000000a0000 - 0000000000100000 (reserved)
      [    0.000000]  Xen: 0000000000100000 - 0000000080800000 (usable)
      
      Due to patch: "xen/setup: Inhibit resource API from using System RAM E820 gaps as PCI mem gaps."
      we would mark the memory past the 1GB mark as unusuable resulting in:
      
      [    0.000000] BIOS-provided physical RAM map:
      [    0.000000]  Xen: 0000000000000000 - 00000000000a0000 (usable)
      [    0.000000]  Xen: 00000000000a0000 - 0000000000100000 (reserved)
      [    0.000000]  Xen: 0000000000100000 - 0000000040000000 (usable)
      [    0.000000]  Xen: 0000000040000000 - 0000000080800000 (unusable)
      
      which meant that we could not balloon up anymore. We could
      balloon the guest down. The fix is to run the code introduced
      by the above mentioned patch only for the initial domain.
      
      We will have to revisit this once we start introducing a modified
      E820 for PCI passthrough so that we can utilize the P2M identity code.
      
      We also fix an overflow by having UL instead of ULL on 32-bit machines.
      
      [v2: Ian pointed to the overflow issue]
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      86b32122
  20. 04 3月, 2011 1 次提交
    • K
      xen: Mark all initial reserved pages for the balloon as INVALID_P2M_ENTRY. · 6eaa412f
      Konrad Rzeszutek Wilk 提交于
      With this patch, we diligently set regions that will be used by the
      balloon driver to be INVALID_P2M_ENTRY and under the ownership
      of the balloon driver. We are OK using the __set_phys_to_machine
      as we do not expect to be allocating any P2M middle or entries pages.
      The set_phys_to_machine has the side-effect of potentially allocating
      new pages and we do not want that at this stage.
      
      We can do this because xen_build_mfn_list_list will have already
      allocated all such pages up to xen_max_p2m_pfn.
      
      We also move the check for auto translated physmap down the
      stack so it is present in __set_phys_to_machine.
      
      [v2: Rebased with mmu->p2m code split]
      Reviewed-by: NIan Campbell <ian.campbell@citrix.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      6eaa412f
  21. 23 2月, 2011 1 次提交
    • Z
      xen/setup: Inhibit resource API from using System RAM E820 gaps as PCI mem gaps. · 2f14ddc3
      Zhang, Fengzhe 提交于
      With the hypervisor argument of dom0_mem=X we iterate over the physical
      (only for the initial domain) E820 and subtract the the size from each
      E820_RAM region the delta so that the cumulative size of all E820_RAM regions
      is equal to 'X'. This sometimes ends up with E820_RAM regions with zero size
      (which are removed by e820_sanitize) and E820_RAM that are smaller
      than physically.
      
      Later on the PCI API looks at the E820 and attempts to set up an
      resource region for the "PCI mem". The E820 (assume dom0_mem=1GB is
      set) compared to the physical looks as so:
      
       [    0.000000] BIOS-provided physical RAM map:
       [    0.000000]  Xen: 0000000000000000 - 0000000000097c00 (usable)
       [    0.000000]  Xen: 0000000000097c00 - 0000000000100000 (reserved)
      -[    0.000000]  Xen: 0000000000100000 - 00000000defafe00 (usable)
      +[    0.000000]  Xen: 0000000000100000 - 0000000040000000 (usable)
       [    0.000000]  Xen: 00000000defafe00 - 00000000defb1ea0 (ACPI NVS)
       [    0.000000]  Xen: 00000000defb1ea0 - 00000000e0000000 (reserved)
       [    0.000000]  Xen: 00000000f4000000 - 00000000f8000000 (reserved)
      ..
      And we get
      [    0.000000] Allocating PCI resources starting at 40000000 (gap: 40000000:9efafe00)
      
      while it should have started at e0000000 (a nice big gap up to
      f4000000 exists). The "Allocating PCI" is part of the resource API.
      
      The users that end up using those PCI I/O regions usually supply their
      own BARs when calling the resource API (request_resource, or allocate_resource),
      but there are exceptions which provide an empty 'struct resource' and
      expect the API to provide the 'struct resource' to be populated with valid values.
      The one that triggered this bug was the intel AGP driver that requested
      a region for the flush page (intel_i9xx_setup_flush).
      
      Before this patch, when running under Xen hypervisor, the 'struct resource'
      returned could have (depending on the dom0_mem size) physical ranges of a 'System RAM'
      instead of 'I/O' regions. This ended up with the Hypervisor failing a request
      to populate PTE's with those PFNs as the domain did not have access to those
      'System RAM' regions (rightly so).
      
      After this patch, the left-over E820_RAM region from the truncation, will be
      labeled as E820_UNUSABLE. The E820 will look as so:
      
       [    0.000000] BIOS-provided physical RAM map:
       [    0.000000]  Xen: 0000000000000000 - 0000000000097c00 (usable)
       [    0.000000]  Xen: 0000000000097c00 - 0000000000100000 (reserved)
      -[    0.000000]  Xen: 0000000000100000 - 00000000defafe00 (usable)
      +[    0.000000]  Xen: 0000000000100000 - 0000000040000000 (usable)
      +[    0.000000]  Xen: 0000000040000000 - 00000000defafe00 (unusable)
       [    0.000000]  Xen: 00000000defafe00 - 00000000defb1ea0 (ACPI NVS)
       [    0.000000]  Xen: 00000000defb1ea0 - 00000000e0000000 (reserved)
       [    0.000000]  Xen: 00000000f4000000 - 00000000f8000000 (reserved)
      
      For more information:
      http://mid.gmane.org/1A42CE6F5F474C41B63392A5F80372B2335E978C@shsmsx501.ccr.corp.intel.com
      
      BugLink: http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1726Signed-off-by: NFengzhe Zhang <fengzhe.zhang@intel.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      2f14ddc3
  22. 28 1月, 2011 1 次提交
  23. 27 1月, 2011 1 次提交
  24. 25 11月, 2010 1 次提交
  25. 23 11月, 2010 3 次提交
    • J
      xen: use default_idle · bc15fde7
      Jeremy Fitzhardinge 提交于
      We just need the idle loop to drop into safe_halt, which default_idle()
      is perfectly capable of doing.  There's no need to duplicate it.
      Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      bc15fde7
    • J
      xen: clean up "extra" memory handling some more · c2d08791
      Jeremy Fitzhardinge 提交于
      Make sure that extra_pages is added for all E820_RAM regions beyond
      mem_end - completely excluded regions as well as the remains of partially
      included regions.
      
      Also makes sure the extra region is not unnecessarily high, and simplifies
      the logic to decide which regions should be added.
      Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      c2d08791
    • K
      xen: set IO permission early (before early_cpu_init()) · ec35a69c
      Konrad Rzeszutek Wilk 提交于
      This patch is based off "xen dom0: Set up basic IO permissions for dom0."
      by Juan Quintela <quintela@redhat.com>.
      
      On AMD machines when we boot the kernel as Domain 0 we get this nasty:
      
      mapping kernel into physical memory
      Xen: setup ISA identity maps
      about to get started...
      (XEN) traps.c:475:d0 Unhandled general protection fault fault/trap [#13] on VCPU 0 [ec=0000]
      (XEN) domain_crash_sync called from entry.S
      (XEN) Domain 0 (vcpu#0) crashed on cpu#0:
      (XEN) ----[ Xen-4.1-101116  x86_64  debug=y  Not tainted ]----
      (XEN) CPU:    0
      (XEN) RIP:    e033:[<ffffffff8130271b>]
      (XEN) RFLAGS: 0000000000000282   EM: 1   CONTEXT: pv guest
      (XEN) rax: 000000008000c068   rbx: ffffffff8186c680   rcx: 0000000000000068
      (XEN) rdx: 0000000000000cf8   rsi: 000000000000c000   rdi: 0000000000000000
      (XEN) rbp: ffffffff81801e98   rsp: ffffffff81801e50   r8:  ffffffff81801eac
      (XEN) r9:  ffffffff81801ea8   r10: ffffffff81801eb4   r11: 00000000ffffffff
      (XEN) r12: ffffffff8186c694   r13: ffffffff81801f90   r14: ffffffffffffffff
      (XEN) r15: 0000000000000000   cr0: 000000008005003b   cr4: 00000000000006f0
      (XEN) cr3: 0000000221803000   cr2: 0000000000000000
      (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e02b   cs: e033
      (XEN) Guest stack trace from rsp=ffffffff81801e50:
      
      RIP points to read_pci_config() function.
      
      The issue is that we don't set IO permissions for the Linux kernel early enough.
      
      The call sequence used to be:
      
          xen_start_kernel()
      	x86_init.oem.arch_setup = xen_setup_arch;
              setup_arch:
                 - early_cpu_init
                     - early_init_amd
                        - read_pci_config
                 - x86_init.oem.arch_setup [ xen_arch_setup ]
                     - set IO permissions.
      
      We need to set the IO permissions earlier on, which this patch does.
      Acked-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      ec35a69c
  26. 20 11月, 2010 1 次提交
  27. 11 11月, 2010 1 次提交
  28. 26 10月, 2010 1 次提交
  29. 23 10月, 2010 1 次提交