1. 20 4月, 2011 1 次提交
  2. 18 4月, 2011 1 次提交
    • K
      xen/p2m/m2p/gnttab: Support GNTMAP_host_map in the M2P override. · cf8d9163
      Konrad Rzeszutek Wilk 提交于
      We only supported the M2P (and P2M) override only for the
      GNTMAP_contains_pte type mappings. Meaning that we grants
      operations would "contain the machine address of the PTE to update"
      If the flag is unset, then the grant operation is
      "contains a host virtual address". The latter case means that
      the Hypervisor takes care of updating our page table
      (specifically the PTE entry) with the guest's MFN. As such we should
      not try to do anything with the PTE. Previous to this patch
      we would try to clear the PTE which resulted in Xen hypervisor
      being upset with us:
      
      (XEN) mm.c:1066:d0 Attempt to implicitly unmap a granted PTE c0100000ccc59067
      (XEN) domain_crash called from mm.c:1067
      (XEN) Domain 0 (vcpu#0) crashed on cpu#3:
      (XEN) ----[ Xen-4.0-110228  x86_64  debug=y  Not tainted ]----
      
      and crashing us.
      
      This patch allows us to inhibit the PTE clearing in the PV guest
      if the GNTMAP_contains_pte is not set.
      
      On the m2p_remove_override path we provide the same parameter.
      
      Sadly in the grant-table driver we do not have a mechanism to
      tell m2p_remove_override whether to clear the PTE or not. Since
      the grant-table driver is used by user-space, we can safely assume
      that it operates only on PTE's. Hence the implementation for
      it to work on !GNTMAP_contains_pte returns -EOPNOTSUPP. In the future
      we can implement the support for this. It will require some extra
      accounting structure to keep track of the page[i], and the flag.
      
      [v1: Added documentation details, made it return -EOPNOTSUPP instead
       of trying to do a half-way implementation]
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      cf8d9163
  3. 29 3月, 2011 1 次提交
    • R
      xen: fix p2m section mismatches · b83c6e55
      Randy Dunlap 提交于
      Fix section mismatch warnings:
      set_phys_range_identity() is called by __init xen_set_identity(),
      so also mark set_phys_range_identity() as __init.
      then:
      __early_alloc_p2m() is called set_phys_range_identity(), so also mark
      __early_alloc_p2m() as __init.
      
      WARNING: arch/x86/built-in.o(.text+0x7856): Section mismatch in reference from the function __early_alloc_p2m() to the function .init.text:extend_brk()
      The function __early_alloc_p2m() references
      the function __init extend_brk().
      This is often because __early_alloc_p2m lacks a __init
      annotation or the annotation of extend_brk is wrong.
      
      WARNING: arch/x86/built-in.o(.text+0x7967): Section mismatch in reference from the function set_phys_range_identity() to the function .init.text:extend_brk()
      The function set_phys_range_identity() references
      the function __init extend_brk().
      This is often because set_phys_range_identity lacks a __init
      annotation or the annotation of extend_brk is wrong.
      
      [v2: Per Stephen Hemming recommonedation made __early_alloc_p2m static]
      Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      b83c6e55
  4. 24 3月, 2011 1 次提交
  5. 14 3月, 2011 3 次提交
    • K
      xen/debugfs: Add 'p2m' file for printing out the P2M layout. · 2222e71b
      Konrad Rzeszutek Wilk 提交于
      We walk over the whole P2M tree and construct a simplified view of
      which PFN regions belong to what level and what type they are.
      
      Only enabled if CONFIG_XEN_DEBUG_FS is set.
      
      [v2: UNKN->UNKNOWN, use uninitialized_var]
      [v3: Rebased on top of mmu->p2m code split]
      [v4: Fixed the else if]
      Reviewed-by: NIan Campbell <Ian.Campbell@eu.citrix.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      2222e71b
    • K
      xen/mmu: WARN_ON when racing to swap middle leaf. · c7617798
      Konrad Rzeszutek Wilk 提交于
      The initial bootup code uses set_phys_to_machine quite a lot, and after
      bootup it would be used by the balloon driver. The balloon driver does have
      mutex lock so this should not be necessary - but just in case, add
      a WARN_ON if we do hit this scenario. If we do fail this, it is OK
      to continue as there is a backup mechanism (VM_IO) that can bypass
      the P2M and still set the _PAGE_IOMAP flags.
      
      [v2: Change from WARN to BUG_ON]
      [v3: Rebased on top of xen->p2m code split]
      [v4: Change from BUG_ON to WARN]
      Reviewed-by: NIan Campbell <Ian.Campbell@eu.citrix.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      c7617798
    • K
      xen/mmu: Add the notion of identity (1-1) mapping. · f4cec35b
      Konrad Rzeszutek Wilk 提交于
      Our P2M tree structure is a three-level. On the leaf nodes
      we set the Machine Frame Number (MFN) of the PFN. What this means
      is that when one does: pfn_to_mfn(pfn), which is used when creating
      PTE entries, you get the real MFN of the hardware. When Xen sets
      up a guest it initially populates a array which has descending
      (or ascending) MFN values, as so:
      
       idx: 0,  1,       2
       [0x290F, 0x290E, 0x290D, ..]
      
      so pfn_to_mfn(2)==0x290D. If you start, restart many guests that list
      starts looking quite random.
      
      We graft this structure on our P2M tree structure and stick in
      those MFN in the leafs. But for all other leaf entries, or for the top
      root, or middle one, for which there is a void entry, we assume it is
      "missing". So
       pfn_to_mfn(0xc0000)=INVALID_P2M_ENTRY.
      
      We add the possibility of setting 1-1 mappings on certain regions, so
      that:
       pfn_to_mfn(0xc0000)=0xc0000
      
      The benefit of this is, that we can assume for non-RAM regions (think
      PCI BARs, or ACPI spaces), we can create mappings easily b/c we
      get the PFN value to match the MFN.
      
      For this to work efficiently we introduce one new page p2m_identity and
      allocate (via reserved_brk) any other pages we need to cover the sides
      (1GB or 4MB boundary violations). All entries in p2m_identity are set to
      INVALID_P2M_ENTRY type (Xen toolstack only recognizes that and MFNs,
      no other fancy value).
      
      On lookup we spot that the entry points to p2m_identity and return the identity
      value instead of dereferencing and returning INVALID_P2M_ENTRY. If the entry
      points to an allocated page, we just proceed as before and return the PFN.
      If the PFN has IDENTITY_FRAME_BIT set we unmask that in appropriate functions
      (pfn_to_mfn).
      
      The reason for having the IDENTITY_FRAME_BIT instead of just returning the
      PFN is that we could find ourselves where pfn_to_mfn(pfn)==pfn for a
      non-identity pfn. To protect ourselves against we elect to set (and get) the
      IDENTITY_FRAME_BIT on all identity mapped PFNs.
      
      This simplistic diagram is used to explain the more subtle piece of code.
      There is also a digram of the P2M at the end that can help.
      Imagine your E820 looking as so:
      
                         1GB                                           2GB
      /-------------------+---------\/----\         /----------\    /---+-----\
      | System RAM        | Sys RAM ||ACPI|         | reserved |    | Sys RAM |
      \-------------------+---------/\----/         \----------/    \---+-----/
                                    ^- 1029MB                       ^- 2001MB
      
      [1029MB = 263424 (0x40500), 2001MB = 512256 (0x7D100), 2048MB = 524288 (0x80000)]
      
      And dom0_mem=max:3GB,1GB is passed in to the guest, meaning memory past 1GB
      is actually not present (would have to kick the balloon driver to put it in).
      
      When we are told to set the PFNs for identity mapping (see patch: "xen/setup:
      Set identity mapping for non-RAM E820 and E820 gaps.") we pass in the start
      of the PFN and the end PFN (263424 and 512256 respectively). The first step is
      to reserve_brk a top leaf page if the p2m[1] is missing. The top leaf page
      covers 512^2 of page estate (1GB) and in case the start or end PFN is not
      aligned on 512^2*PAGE_SIZE (1GB) we loop on aligned 1GB PFNs from start pfn to
      end pfn.  We reserve_brk top leaf pages if they are missing (means they point
      to p2m_mid_missing).
      
      With the E820 example above, 263424 is not 1GB aligned so we allocate a
      reserve_brk page which will cover the PFNs estate from 0x40000 to 0x80000.
      Each entry in the allocate page is "missing" (points to p2m_missing).
      
      Next stage is to determine if we need to do a more granular boundary check
      on the 4MB (or 2MB depending on architecture) off the start and end pfn's.
      We check if the start pfn and end pfn violate that boundary check, and if
      so reserve_brk a middle (p2m[x][y]) leaf page. This way we have a much finer
      granularity of setting which PFNs are missing and which ones are identity.
      In our example 263424 and 512256 both fail the check so we reserve_brk two
      pages. Populate them with INVALID_P2M_ENTRY (so they both have "missing" values)
      and assign them to p2m[1][2] and p2m[1][488] respectively.
      
      At this point we would at minimum reserve_brk one page, but could be up to
      three. Each call to set_phys_range_identity has at maximum a three page
      cost. If we were to query the P2M at this stage, all those entries from
      start PFN through end PFN (so 1029MB -> 2001MB) would return INVALID_P2M_ENTRY
      ("missing").
      
      The next step is to walk from the start pfn to the end pfn setting
      the IDENTITY_FRAME_BIT on each PFN. This is done in 'set_phys_range_identity'.
      If we find that the middle leaf is pointing to p2m_missing we can swap it over
      to p2m_identity - this way covering 4MB (or 2MB) PFN space.  At this point we
      do not need to worry about boundary aligment (so no need to reserve_brk a middle
      page, figure out which PFNs are "missing" and which ones are identity), as that
      has been done earlier.  If we find that the middle leaf is not occupied by
      p2m_identity or p2m_missing, we dereference that page (which covers
      512 PFNs) and set the appropriate PFN with IDENTITY_FRAME_BIT. In our example
      263424 and 512256 end up there, and we set from p2m[1][2][256->511] and
      p2m[1][488][0->256] with IDENTITY_FRAME_BIT set.
      
      All other regions that are void (or not filled) either point to p2m_missing
      (considered missing) or have the default value of INVALID_P2M_ENTRY (also
      considered missing). In our case, p2m[1][2][0->255] and p2m[1][488][257->511]
      contain the INVALID_P2M_ENTRY value and are considered "missing."
      
      This is what the p2m ends up looking (for the E820 above) with this
      fabulous drawing:
      
         p2m         /--------------\
       /-----\       | &mfn_list[0],|                           /-----------------\
       |  0  |------>| &mfn_list[1],|    /---------------\      | ~0, ~0, ..      |
       |-----|       |  ..., ~0, ~0 |    | ~0, ~0, [x]---+----->| IDENTITY [@256] |
       |  1  |---\   \--------------/    | [p2m_identity]+\     | IDENTITY [@257] |
       |-----|    \                      | [p2m_identity]+\\    | ....            |
       |  2  |--\  \-------------------->|  ...          | \\   \----------------/
       |-----|   \                       \---------------/  \\
       |  3  |\   \                                          \\  p2m_identity
       |-----| \   \-------------------->/---------------\   /-----------------\
       | ..  +->+                        | [p2m_identity]+-->| ~0, ~0, ~0, ... |
       \-----/ /                         | [p2m_identity]+-->| ..., ~0         |
              / /---------------\        | ....          |   \-----------------/
             /  | IDENTITY[@0]  |      /-+-[x], ~0, ~0.. |
            /   | IDENTITY[@256]|<----/  \---------------/
           /    | ~0, ~0, ....  |
          |     \---------------/
          |
          p2m_missing             p2m_missing
      /------------------\     /------------\
      | [p2m_mid_missing]+---->| ~0, ~0, ~0 |
      | [p2m_mid_missing]+---->| ..., ~0    |
      \------------------/     \------------/
      
      where ~0 is INVALID_P2M_ENTRY. IDENTITY is (PFN | IDENTITY_BIT)
      Reviewed-by: NIan Campbell <ian.campbell@citrix.com>
      [v5: Changed code to use ranges, added ASCII art]
      [v6: Rebased on top of xen->p2m code split]
      [v4: Squished patches in just this one]
      [v7: Added RESERVE_BRK for potentially allocated pages]
      [v8: Fixed alignment problem]
      [v9: Changed 1<<3X to 1<<BITS_PER_LONG-X]
      [v10: Copied git commit description in the p2m code + Add Review tag]
      [v11: Title had '2-1' - should be '1-1' mapping]
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      f4cec35b
  6. 04 3月, 2011 1 次提交
    • K
      xen: Mark all initial reserved pages for the balloon as INVALID_P2M_ENTRY. · 6eaa412f
      Konrad Rzeszutek Wilk 提交于
      With this patch, we diligently set regions that will be used by the
      balloon driver to be INVALID_P2M_ENTRY and under the ownership
      of the balloon driver. We are OK using the __set_phys_to_machine
      as we do not expect to be allocating any P2M middle or entries pages.
      The set_phys_to_machine has the side-effect of potentially allocating
      new pages and we do not want that at this stage.
      
      We can do this because xen_build_mfn_list_list will have already
      allocated all such pages up to xen_max_p2m_pfn.
      
      We also move the check for auto translated physmap down the
      stack so it is present in __set_phys_to_machine.
      
      [v2: Rebased with mmu->p2m code split]
      Reviewed-by: NIan Campbell <ian.campbell@citrix.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      6eaa412f
  7. 12 2月, 2011 2 次提交
    • I
      xen: annotate functions which only call into __init at start of day · 44b46c3e
      Ian Campbell 提交于
      Both xen_hvm_init_shared_info and xen_build_mfn_list_list can be
      called at resume time as well as at start of day but only reference
      __init functions (extend_brk) at start of day. Hence annotate with
      __ref.
      
          WARNING: arch/x86/built-in.o(.text+0x4f1): Section mismatch in reference
              from the function xen_hvm_init_shared_info() to the function
              .init.text:extend_brk()
          The function xen_hvm_init_shared_info() references
          the function __init extend_brk().
          This is often because xen_hvm_init_shared_info lacks a __init
          annotation or the annotation of extend_brk is wrong.
      
      xen_hvm_init_shared_info calls extend_brk() iff !shared_info_page and
      initialises shared_info_page with the result. This happens at start of
      day only.
      
          WARNING: arch/x86/built-in.o(.text+0x599b): Section mismatch in reference
              from the function xen_build_mfn_list_list() to the function
              .init.text:extend_brk()
          The function xen_build_mfn_list_list() references
          the function __init extend_brk().
          This is often because xen_build_mfn_list_list lacks a __init
          annotation or the annotation of extend_brk is wrong.
      
      (this warning occurs multiple times)
      
      xen_build_mfn_list_list only calls extend_brk() at boot time, while
      building the initial mfn list list
      Signed-off-by: NIan Campbell <ian.campbell@citrix.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      44b46c3e
    • I
      xen p2m: annotate variable which appears unused · 6b08cfeb
      Ian Campbell 提交于
       CC      arch/x86/xen/p2m.o
      arch/x86/xen/p2m.c: In function 'm2p_remove_override':
      arch/x86/xen/p2m.c:460: warning: 'address' may be used uninitialized in this function
      arch/x86/xen/p2m.c: In function 'm2p_add_override':
      arch/x86/xen/p2m.c:426: warning: 'address' may be used uninitialized in this function
      
      In actual fact address is inialised in one "if (!PageHighMem(page))"
      statement and used in a second and so is always initialised before
      use.
      Signed-off-by: NIan Campbell <ian.campbell@citrix.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      6b08cfeb
  8. 27 1月, 2011 1 次提交
  9. 22 1月, 2011 1 次提交
    • S
      xen: p2m: correctly initialize partial p2m leaf · 8e1b4cf2
      Stefan Bader 提交于
      After changing the p2m mapping to a tree by
      
        commit 58e05027
          xen: convert p2m to a 3 level tree
      
      and trying to boot a DomU with 615MB of memory, the following crash was
      observed in the dump:
      
      kernel direct mapping tables up to 26f00000 @ 1ec4000-1fff000
      BUG: unable to handle kernel NULL pointer dereference at (null)
      IP: [<c0107397>] xen_set_pte+0x27/0x60
      *pdpt = 0000000000000000 *pde = 0000000000000000
      
      Adding further debug statements showed that when trying to set up
      pfn=0x26700 the returned mapping was invalid.
      
      pfn=0x266ff calling set_pte(0xc1fe77f8, 0x6b3003)
      pfn=0x26700 calling set_pte(0xc1fe7800, 0x3)
      
      Although the last_pfn obtained from the startup info is 0x26700, which
      should in turn not be hit, the additional 8MB which are added as extra
      memory normally seem to be ok. This lead to looking into the initial
      p2m tree construction, which uses the smaller value and assuming that
      there is other code handling the extra memory.
      
      When the p2m tree is set up, the leaves are directly pointed to the
      array which the domain builder set up. But if the mapping is not on a
      boundary that fits into one p2m page, this will result in the last leaf
      being only partially valid. And as the invalid entries are not
      initialized in that case, things go badly wrong.
      
      I am trying to fix that by checking whether the current leaf is a
      complete map and if not, allocate a completely new page and copy only
      the valid pointers there. This may not be the most efficient or elegant
      solution, but at least it seems to allow me booting DomUs with memory
      assignments all over the range.
      
      BugLink: http://bugs.launchpad.net/bugs/686692
      [v2: Redid a bit of commit wording and fixed a compile warning]
      Signed-off-by: NStefan Bader <stefan.bader@canonical.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      8e1b4cf2
  10. 12 1月, 2011 5 次提交