1. 28 2月, 2010 2 次提交
  2. 04 12月, 2009 1 次提交
    • I
      xen: correctly restore pfn_to_mfn_list_list after resume · fa24ba62
      Ian Campbell 提交于
      pvops kernels >= 2.6.30 can currently only be saved and restored once. The
      second attempt to save results in:
      
          ERROR Internal error: Frame# in pfn-to-mfn frame list is not in pseudophys
          ERROR Internal error: entry 0: p2m_frame_list[0] is 0xf2c2c2c2, max 0x120000
          ERROR Internal error: Failed to map/save the p2m frame list
      
      I finally narrowed it down to:
      
          commit cdaead6b
              Author: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
              Date:   Fri Feb 27 15:34:59 2009 -0800
      
                  xen: split construction of p2m mfn tables from registration
      
                  Build the p2m_mfn_list_list early with the rest of the p2m table, but
                  register it later when the real shared_info structure is in place.
      Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      
      The unforeseen side-effect of this change was to cause the mfn list list to not
      be rebuilt on resume. Prior to this change it would have been rebuilt via
      xen_post_suspend() -> xen_setup_shared_info() -> xen_setup_mfn_list_list().
      
      Fix by explicitly calling xen_build_mfn_list_list() from xen_post_suspend().
      Signed-off-by: NIan Campbell <ian.campbell@citrix.com>
      Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: Stable Kernel <stable@kernel.org>
      fa24ba62
  3. 24 9月, 2009 1 次提交
  4. 31 8月, 2009 2 次提交
  5. 13 5月, 2009 1 次提交
  6. 08 5月, 2009 1 次提交
  7. 11 4月, 2009 1 次提交
  8. 10 4月, 2009 2 次提交
  9. 09 4月, 2009 6 次提交
  10. 31 3月, 2009 3 次提交
  11. 30 3月, 2009 5 次提交
  12. 19 3月, 2009 1 次提交
  13. 15 3月, 2009 1 次提交
    • J
      x86: add brk allocation for very, very early allocations · 93dbda7c
      Jeremy Fitzhardinge 提交于
      Impact: new interface
      
      Add a brk()-like allocator which effectively extends the bss in order
      to allow very early code to do dynamic allocations.  This is better than
      using statically allocated arrays for data in subsystems which may never
      get used.
      
      The space for brk allocations is in the bss ELF segment, so that the
      space is mapped properly by the code which maps the kernel, and so
      that bootloaders keep the space free rather than putting a ramdisk or
      something into it.
      
      The bss itself, delimited by __bss_stop, ends before the brk area
      (__brk_base to __brk_limit).  The kernel text, data and bss is reserved
      up to __bss_stop.
      
      Any brk-allocated data is reserved separately just before the kernel
      pagetable is built, as that code allocates from unreserved spaces
      in the e820 map, potentially allocating from any unused brk memory.
      Ultimately any unused memory in the brk area is used in the general
      kernel memory pool.
      
      Initially the brk space is set to 1MB, which is probably much larger
      than any user needs (the largest current user is i386 head_32.S's code
      to build the pagetables to map the kernel, which can get fairly large
      with a big kernel image and no PSE support).  So long as the system
      has sufficient memory for the bootloader to reserve the kernel+1MB brk,
      there are no bad effects resulting from an over-large brk.
      Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      93dbda7c
  14. 02 3月, 2009 1 次提交
    • J
      xen: deal with virtually mapped percpu data · 9976b39b
      Jeremy Fitzhardinge 提交于
      The virtually mapped percpu space causes us two problems:
      
       - for hypercalls which take an mfn, we need to do a full pagetable
         walk to convert the percpu va into an mfn, and
      
       - when a hypercall requires a page to be mapped RO via all its aliases,
         we need to make sure its RO in both the percpu mapping and in the
         linear mapping
      
      This primarily affects the gdt and the vcpu info structure.
      Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: Xen-devel <xen-devel@lists.xensource.com>
      Cc: Gerd Hoffmann <kraxel@redhat.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Tejun Heo <htejun@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      9976b39b
  15. 13 2月, 2009 1 次提交
    • I
      xen: fix xen_flush_tlb_others · 694aa960
      Ian Campbell 提交于
      The commit
          commit 4595f962
          Author: Rusty Russell <rusty@rustcorp.com.au>
          Date:   Sat Jan 10 21:58:09 2009 -0800
      
              x86: change flush_tlb_others to take a const struct cpumask
      
      causes xen_flush_tlb_others to allocate a multicall and then issue it
      without initializing it in the case where the cpumask is empty,
      leading to:
      
              [    8.354898] 1 multicall(s) failed: cpu 1
              [    8.354921] Pid: 2213, comm: bootclean Not tainted 2.6.29-rc3-x86_32p-xenU-tip #135
              [    8.354937] Call Trace:
              [    8.354955]  [<c01036e3>] xen_mc_flush+0x133/0x1b0
              [    8.354971]  [<c0105d2a>] ? xen_force_evtchn_callback+0x1a/0x30
              [    8.354988]  [<c0105a60>] xen_flush_tlb_others+0xb0/0xd0
              [    8.355003]  [<c0126643>] flush_tlb_page+0x53/0xa0
              [    8.355018]  [<c0176a80>] do_wp_page+0x2a0/0x7c0
              [    8.355034]  [<c0238f0a>] ? notify_remote_via_irq+0x3a/0x70
              [    8.355049]  [<c0178950>] handle_mm_fault+0x7b0/0xa50
              [    8.355065]  [<c0131a3e>] ? wake_up_new_task+0x8e/0xb0
              [    8.355079]  [<c01337b5>] ? do_fork+0xe5/0x320
              [    8.355095]  [<c0121919>] do_page_fault+0xe9/0x240
              [    8.355109]  [<c0121830>] ? do_page_fault+0x0/0x240
              [    8.355125]  [<c032457a>] error_code+0x72/0x78
              [    8.355139]   call  1/1: op=2863311530 arg=[aaaaaaaa] result=-38     xen_flush_tlb_others+0x41/0xd0
      
      Since empty cpumasks are rare and undoing an xen_mc_entry() is tricky
      just issue such requests normally.
      Signed-off-by: NIan Campbell <ian.campbell@citrix.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      694aa960
  16. 05 2月, 2009 1 次提交
  17. 31 1月, 2009 2 次提交
  18. 18 1月, 2009 1 次提交
  19. 16 1月, 2009 1 次提交
    • I
      percpu: add optimized generic percpu accessors · 6dbde353
      Ingo Molnar 提交于
      It is an optimization and a cleanup, and adds the following new
      generic percpu methods:
      
        percpu_read()
        percpu_write()
        percpu_add()
        percpu_sub()
        percpu_and()
        percpu_or()
        percpu_xor()
      
      and implements support for them on x86. (other architectures will fall
      back to a default implementation)
      
      The advantage is that for example to read a local percpu variable,
      instead of this sequence:
      
       return __get_cpu_var(var);
      
       ffffffff8102ca2b:	48 8b 14 fd 80 09 74 	mov    -0x7e8bf680(,%rdi,8),%rdx
       ffffffff8102ca32:	81
       ffffffff8102ca33:	48 c7 c0 d8 59 00 00 	mov    $0x59d8,%rax
       ffffffff8102ca3a:	48 8b 04 10          	mov    (%rax,%rdx,1),%rax
      
      We can get a single instruction by using the optimized variants:
      
       return percpu_read(var);
      
       ffffffff8102ca3f:	65 48 8b 05 91 8f fd 	mov    %gs:0x7efd8f91(%rip),%rax
      
      I also cleaned up the x86-specific APIs and made the x86 code use
      these new generic percpu primitives.
      
      tj: * fixed generic percpu_sub() definition as Roel Kluin pointed out
          * added percpu_and() for completeness's sake
          * made generic percpu ops atomic against preemption
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      6dbde353
  20. 17 12月, 2008 2 次提交
  21. 23 11月, 2008 1 次提交
    • I
      xen: pin correct PGD on suspend · 86bbc2c2
      Ian Campbell 提交于
      Impact: fix Xen guest boot failure
      
      commit eefb47f6 ("xen: use
      spin_lock_nest_lock when pinning a pagetable") changed xen_pgd_walk to
      walk over mm->pgd rather than taking pgd as an argument.
      
      This breaks xen_mm_(un)pin_all() because it makes init_mm.pgd readonly
      instead of the pgd we are interested in and therefore the pin subsequently
      fails.
      
      (XEN) mm.c:2280:d15 Bad type (saw 00000000e8000001 != exp 0000000060000000) for mfn bc464 (pfn 21ca7)
      (XEN) mm.c:2665:d15 Error while pinning mfn bc464
      
      [   14.586913] 1 multicall(s) failed: cpu 0
      [   14.586926] Pid: 14, comm: kstop/0 Not tainted 2.6.28-rc5-x86_32p-xenU-00172-gee2f6cc7 #200
      [   14.586940] Call Trace:
      [   14.586955]  [<c030c17a>] ? printk+0x18/0x1e
      [   14.586972]  [<c0103df3>] xen_mc_flush+0x163/0x1d0
      [   14.586986]  [<c0104bc1>] __xen_pgd_pin+0xa1/0x110
      [   14.587000]  [<c015a330>] ? stop_cpu+0x0/0xf0
      [   14.587015]  [<c0104d7b>] xen_mm_pin_all+0x4b/0x70
      [   14.587029]  [<c022bcb9>] xen_suspend+0x39/0xe0
      [   14.587042]  [<c015a330>] ? stop_cpu+0x0/0xf0
      [   14.587054]  [<c015a3cd>] stop_cpu+0x9d/0xf0
      [   14.587067]  [<c01417cd>] run_workqueue+0x8d/0x150
      [   14.587080]  [<c030e4b3>] ? _spin_unlock_irqrestore+0x23/0x40
      [   14.587094]  [<c014558a>] ? prepare_to_wait+0x3a/0x70
      [   14.587107]  [<c0141918>] worker_thread+0x88/0xf0
      [   14.587120]  [<c01453c0>] ? autoremove_wake_function+0x0/0x50
      [   14.587133]  [<c0141890>] ? worker_thread+0x0/0xf0
      [   14.587146]  [<c014509c>] kthread+0x3c/0x70
      [   14.587157]  [<c0145060>] ? kthread+0x0/0x70
      [   14.587170]  [<c0109d1b>] kernel_thread_helper+0x7/0x10
      [   14.587181]   call  1/3: op=14 arg=[c0415000] result=0
      [   14.587192]   call  2/3: op=14 arg=[e1ca2000] result=0
      [   14.587204]   call  3/3: op=26 arg=[c1808860] result=-22
      Signed-off-by: NIan Campbell <ian.campbell@citrix.com>
      Acked-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      86bbc2c2
  22. 07 11月, 2008 2 次提交
    • J
      xen: make sure stray alias mappings are gone before pinning · d05fdf31
      Jeremy Fitzhardinge 提交于
      Xen requires that all mappings of pagetable pages are read-only, so
      that they can't be updated illegally.  As a result, if a page is being
      turned into a pagetable page, we need to make sure all its mappings
      are RO.
      
      If the page had been used for ioremap or vmalloc, it may still have
      left over mappings as a result of not having been lazily unmapped.
      This change makes sure we explicitly mop them all up before pinning
      the page.
      
      Unlike aliases created by kmap, the there can be vmalloc aliases even
      for non-high pages, so we must do the flush unconditionally.
      Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: Linux Memory Management List <linux-mm@kvack.org>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      d05fdf31
    • J
      x86, xen: fix use of pgd_page now that it really does return a page · 47cb2ed9
      Jeremy Fitzhardinge 提交于
      Impact: fix 32-bit Xen guest boot crash
      
      On 32-bit PAE, pud_page, for no good reason, didn't really return a
      struct page *.  Since Jan Beulich's fix "i386/PAE: fix pud_page()",
      pud_page does return a struct page *.
      
      Because PAE has 3 pagetable levels, the pud level is folded into the
      pgd level, so pgd_page() is the same as pud_page(), and now returns
      a struct page *.  Update the xen/mmu.c code which uses pgd_page()
      accordingly.
      Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      47cb2ed9
  23. 27 10月, 2008 1 次提交
    • C
      xen: fix Xen domU boot with batched mprotect · 9f32d21c
      Chris Lalancette 提交于
      Impact: fix guest kernel boot crash on certain configs
      
      Recent i686 2.6.27 kernels with a certain amount of memory (between
      736 and 855MB) have a problem booting under a hypervisor that supports
      batched mprotect (this includes the RHEL-5 Xen hypervisor as well as
      any 3.3 or later Xen hypervisor).
      
      The problem ends up being that xen_ptep_modify_prot_commit() is using
      virt_to_machine to calculate which pfn to update.  However, this only
      works for pages that are in the p2m list, and the pages coming from
      change_pte_range() in mm/mprotect.c are kmap_atomic pages.  Because of
      this, we can run into the situation where the lookup in the p2m table
      returns an INVALID_MFN, which we then try to pass to the hypervisor,
      which then (correctly) denies the request to a totally bogus pfn.
      
      The right thing to do is to use arbitrary_virt_to_machine, so that we
      can be sure we are modifying the right pfn.  This unfortunately
      introduces a performance penalty because of a full page-table-walk,
      but we can avoid that penalty for pages in the p2m list by checking if
      virt_addr_valid is true, and if so, just doing the lookup in the p2m
      table.
      
      The attached patch implements this, and allows my 2.6.27 i686 based
      guest with 768MB of memory to boot on a RHEL-5 hypervisor again.
      Thanks to Jeremy for the suggestions about how to fix this particular
      issue.
      Signed-off-by: NChris Lalancette <clalance@redhat.com>
      Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: Chris Lalancette <clalance@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      9f32d21c