1. 22 1月, 2011 1 次提交
    • S
      xen: p2m: correctly initialize partial p2m leaf · 8e1b4cf2
      Stefan Bader 提交于
      After changing the p2m mapping to a tree by
      
        commit 58e05027
          xen: convert p2m to a 3 level tree
      
      and trying to boot a DomU with 615MB of memory, the following crash was
      observed in the dump:
      
      kernel direct mapping tables up to 26f00000 @ 1ec4000-1fff000
      BUG: unable to handle kernel NULL pointer dereference at (null)
      IP: [<c0107397>] xen_set_pte+0x27/0x60
      *pdpt = 0000000000000000 *pde = 0000000000000000
      
      Adding further debug statements showed that when trying to set up
      pfn=0x26700 the returned mapping was invalid.
      
      pfn=0x266ff calling set_pte(0xc1fe77f8, 0x6b3003)
      pfn=0x26700 calling set_pte(0xc1fe7800, 0x3)
      
      Although the last_pfn obtained from the startup info is 0x26700, which
      should in turn not be hit, the additional 8MB which are added as extra
      memory normally seem to be ok. This lead to looking into the initial
      p2m tree construction, which uses the smaller value and assuming that
      there is other code handling the extra memory.
      
      When the p2m tree is set up, the leaves are directly pointed to the
      array which the domain builder set up. But if the mapping is not on a
      boundary that fits into one p2m page, this will result in the last leaf
      being only partially valid. And as the invalid entries are not
      initialized in that case, things go badly wrong.
      
      I am trying to fix that by checking whether the current leaf is a
      complete map and if not, allocate a completely new page and copy only
      the valid pointers there. This may not be the most efficient or elegant
      solution, but at least it seems to allow me booting DomUs with memory
      assignments all over the range.
      
      BugLink: http://bugs.launchpad.net/bugs/686692
      [v2: Redid a bit of commit wording and fixed a compile warning]
      Signed-off-by: NStefan Bader <stefan.bader@canonical.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      8e1b4cf2
  2. 21 1月, 2011 1 次提交
  3. 20 1月, 2011 1 次提交
    • T
      lockdep: Move early boot local IRQ enable/disable status to init/main.c · 2ce802f6
      Tejun Heo 提交于
      During early boot, local IRQ is disabled until IRQ subsystem is
      properly initialized.  During this time, no one should enable
      local IRQ and some operations which usually are not allowed with
      IRQ disabled, e.g. operations which might sleep or require
      communications with other processors, are allowed.
      
      lockdep tracked this with early_boot_irqs_off/on() callbacks.
      As other subsystems need this information too, move it to
      init/main.c and make it generally available.  While at it,
      toggle the boolean to early_boot_irqs_disabled instead of
      enabled so that it can be initialized with %false and %true
      indicates the exceptional condition.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: NPekka Enberg <penberg@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      LKML-Reference: <20110120110635.GB6036@htj.dyndns.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2ce802f6
  4. 15 1月, 2011 1 次提交
  5. 12 1月, 2011 5 次提交
  6. 10 1月, 2011 1 次提交
  7. 07 1月, 2011 1 次提交
  8. 17 12月, 2010 1 次提交
  9. 03 12月, 2010 1 次提交
    • J
      vmalloc: eagerly clear ptes on vunmap · 64141da5
      Jeremy Fitzhardinge 提交于
      On stock 2.6.37-rc4, running:
      
        # mount lilith:/export /mnt/lilith
        # find  /mnt/lilith/ -type f -print0 | xargs -0 file
      
      crashes the machine fairly quickly under Xen.  Often it results in oops
      messages, but the couple of times I tried just now, it just hung quietly
      and made Xen print some rude messages:
      
          (XEN) mm.c:2389:d80 Bad type (saw 7400000000000001 != exp
          3000000000000000) for mfn 1d7058 (pfn 18fa7)
          (XEN) mm.c:964:d80 Attempt to create linear p.t. with write perms
          (XEN) mm.c:2389:d80 Bad type (saw 7400000000000010 != exp
          1000000000000000) for mfn 1d2e04 (pfn 1d1fb)
          (XEN) mm.c:2965:d80 Error while pinning mfn 1d2e04
      
      Which means the domain tried to map a pagetable page RW, which would
      allow it to map arbitrary memory, so Xen stopped it.  This is because
      vm_unmap_ram() left some pages mapped in the vmalloc area after NFS had
      finished with them, and those pages got recycled as pagetable pages
      while still having these RW aliases.
      
      Removing those mappings immediately removes the Xen-visible aliases, and
      so it has no problem with those pages being reused as pagetable pages.
      Deferring the TLB flush doesn't upset Xen because it can flush the TLB
      itself as needed to maintain its invariants.
      
      When unmapping a region in the vmalloc space, clear the ptes
      immediately.  There's no point in deferring this because there's no
      amortization benefit.
      
      The TLBs are left dirty, and they are flushed lazily to amortize the
      cost of the IPIs.
      
      This specific motivation for this patch is an oops-causing regression
      since 2.6.36 when using NFS under Xen, triggered by the NFS client's use
      of vm_map_ram() introduced in 56e4ebf8 ("NFS: readdir with vmapped
      pages") .  XFS also uses vm_map_ram() and could cause similar problems.
      Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Bryan Schumaker <bjschuma@netapp.com>
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Cc: Alex Elder <aelder@sgi.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      64141da5
  10. 02 12月, 2010 1 次提交
  11. 30 11月, 2010 2 次提交
    • I
      xen: x86/32: perform initial startup on initial_page_table · 805e3f49
      Ian Campbell 提交于
      Only make swapper_pg_dir readonly and pinned when generic x86 architecture code
      (which also starts on initial_page_table) switches to it.  This helps ensure
      that the generic setup paths work on Xen unmodified. In particular
      clone_pgd_range writes directly to the destination pgd entries and is used to
      initialise swapper_pg_dir so we need to ensure that it remains writeable until
      the last possible moment during bring up.
      
      This is complicated slightly by the need to avoid sharing kernel PMD entries
      when running under Xen, therefore the Xen implementation must make a copy of
      the kernel PMD (which is otherwise referred to by both intial_page_table and
      swapper_pg_dir) before switching to swapper_pg_dir.
      Signed-off-by: NIan Campbell <ian.campbell@citrix.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@linux.intel.com>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      805e3f49
    • J
      xen: don't bother to stop other cpus on shutdown/reboot · 31e323cc
      Jeremy Fitzhardinge 提交于
      Xen will shoot all the VCPUs when we do a shutdown hypercall, so there's
      no need to do it manually.
      
      In any case it will fail because all the IPI irqs have been pulled
      down by this point, so the cross-CPU calls will simply hang forever.
      
      Until change 76fac077 the function calls
      were not synchronously waited for, so this wasn't apparent.  However after
      that change the calls became synchronous leading to a hang on shutdown
      on multi-VCPU guests.
      Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: Stable Kernel <stable@kernel.org>
      Cc: Alok Kataria <akataria@vmware.com>
      31e323cc
  12. 28 11月, 2010 1 次提交
  13. 25 11月, 2010 2 次提交
  14. 23 11月, 2010 3 次提交
    • J
      xen: use default_idle · bc15fde7
      Jeremy Fitzhardinge 提交于
      We just need the idle loop to drop into safe_halt, which default_idle()
      is perfectly capable of doing.  There's no need to duplicate it.
      Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      bc15fde7
    • J
      xen: clean up "extra" memory handling some more · c2d08791
      Jeremy Fitzhardinge 提交于
      Make sure that extra_pages is added for all E820_RAM regions beyond
      mem_end - completely excluded regions as well as the remains of partially
      included regions.
      
      Also makes sure the extra region is not unnecessarily high, and simplifies
      the logic to decide which regions should be added.
      Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      c2d08791
    • K
      xen: set IO permission early (before early_cpu_init()) · ec35a69c
      Konrad Rzeszutek Wilk 提交于
      This patch is based off "xen dom0: Set up basic IO permissions for dom0."
      by Juan Quintela <quintela@redhat.com>.
      
      On AMD machines when we boot the kernel as Domain 0 we get this nasty:
      
      mapping kernel into physical memory
      Xen: setup ISA identity maps
      about to get started...
      (XEN) traps.c:475:d0 Unhandled general protection fault fault/trap [#13] on VCPU 0 [ec=0000]
      (XEN) domain_crash_sync called from entry.S
      (XEN) Domain 0 (vcpu#0) crashed on cpu#0:
      (XEN) ----[ Xen-4.1-101116  x86_64  debug=y  Not tainted ]----
      (XEN) CPU:    0
      (XEN) RIP:    e033:[<ffffffff8130271b>]
      (XEN) RFLAGS: 0000000000000282   EM: 1   CONTEXT: pv guest
      (XEN) rax: 000000008000c068   rbx: ffffffff8186c680   rcx: 0000000000000068
      (XEN) rdx: 0000000000000cf8   rsi: 000000000000c000   rdi: 0000000000000000
      (XEN) rbp: ffffffff81801e98   rsp: ffffffff81801e50   r8:  ffffffff81801eac
      (XEN) r9:  ffffffff81801ea8   r10: ffffffff81801eb4   r11: 00000000ffffffff
      (XEN) r12: ffffffff8186c694   r13: ffffffff81801f90   r14: ffffffffffffffff
      (XEN) r15: 0000000000000000   cr0: 000000008005003b   cr4: 00000000000006f0
      (XEN) cr3: 0000000221803000   cr2: 0000000000000000
      (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e02b   cs: e033
      (XEN) Guest stack trace from rsp=ffffffff81801e50:
      
      RIP points to read_pci_config() function.
      
      The issue is that we don't set IO permissions for the Linux kernel early enough.
      
      The call sequence used to be:
      
          xen_start_kernel()
      	x86_init.oem.arch_setup = xen_setup_arch;
              setup_arch:
                 - early_cpu_init
                     - early_init_amd
                        - read_pci_config
                 - x86_init.oem.arch_setup [ xen_arch_setup ]
                     - set IO permissions.
      
      We need to set the IO permissions earlier on, which this patch does.
      Acked-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      ec35a69c
  15. 20 11月, 2010 1 次提交
  16. 13 11月, 2010 1 次提交
  17. 12 11月, 2010 1 次提交
  18. 11 11月, 2010 1 次提交
  19. 30 10月, 2010 1 次提交
  20. 28 10月, 2010 1 次提交
  21. 27 10月, 2010 1 次提交
    • S
      xen: initialize cpu masks for pv guests in xen_smp_init · ea5b8f73
      Stefano Stabellini 提交于
      Pv guests don't have ACPI and need the cpu masks to be set
      correctly as early as possible so we call xen_fill_possible_map from
      xen_smp_init.
      On the other hand the initial domain supports ACPI so in this case we skip
      xen_fill_possible_map and rely on it. However Xen might limit the number
      of cpus usable by the domain, so we filter those masks during smp
      initialization using the VCPUOP_is_up hypercall.
      It is important that the filtering is done before
      xen_setup_vcpu_info_placement.
      Signed-off-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com>
      ea5b8f73
  22. 26 10月, 2010 1 次提交
  23. 23 10月, 2010 10 次提交