1. 19 7月, 2011 8 次提交
  2. 30 6月, 2011 1 次提交
  3. 17 6月, 2011 1 次提交
    • K
      xen/setup: Fix for incorrect xen_extra_mem_start. · acd049c6
      Konrad Rzeszutek Wilk 提交于
      The earlier attempts (24bdb0b6)
      at fixing this problem caused other problems to surface (PV guests
      with no PCI passthrough would have SWIOTLB turned on - which meant
      64MB of precious contingous DMA32 memory being eaten up per guest).
      The problem was: "on xen we add an extra memory region at the end of
      the e820, and on this particular machine this extra memory region
      would start below 4g and cross over the 4g boundary:
      
      [0xfee01000-0x192655000)
      
      Unfortunately e820_end_of_low_ram_pfn does not expect an
      e820 layout like that so it returns 4g, therefore initial_memory_mapping
      will map [0 - 0x100000000), that is a memory range that includes some
      reserved memory regions."
      
      The memory range was the IOAPIC regions, and with the 1-1 mapping
      turned on, it would map them as RAM, not as MMIO regions. This caused
      the hypervisor to complain. Fortunately this is experienced only under
      the initial domain so we guard for it.
      Acked-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      acd049c6
  4. 16 6月, 2011 2 次提交
  5. 09 6月, 2011 1 次提交
  6. 04 6月, 2011 1 次提交
  7. 21 5月, 2011 9 次提交
  8. 19 5月, 2011 6 次提交
  9. 13 5月, 2011 9 次提交
    • D
      arch/x86/xen/setup: Cleanup code/data sections definitions · ae15a3b4
      Daniel Kiper 提交于
      Cleanup code/data sections definitions
      accordingly to include/linux/init.h.
      Signed-off-by: NDaniel Kiper <dkiper@net-space.pl>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      ae15a3b4
    • D
      arch/x86/xen/enlighten: Cleanup code/data sections definitions · ad3062a0
      Daniel Kiper 提交于
      Cleanup code/data sections definitions
      accordingly to include/linux/init.h.
      Signed-off-by: NDaniel Kiper <dkiper@net-space.pl>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      ad3062a0
    • D
      arch/x86/xen/irq: Cleanup code/data sections definitions · 251511a1
      Daniel Kiper 提交于
      Cleanup code/data sections definitions
      accordingly to include/linux/init.h.
      Signed-off-by: NDaniel Kiper <dkiper@net-space.pl>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      251511a1
    • K
      xen/p2m: Create entries in the P2M_MFN trees's to track 1-1 mappings · 8c595088
      Konrad Rzeszutek Wilk 提交于
      .. when applicable. We need to track in the p2m_mfn and
      p2m_mfn_p the MFNs and pointers, respectivly, for the P2M entries
      that are allocated for the identity mappings. Without this,
      a PV domain with an E820 that triggers the 1-1 mapping to kick in,
      won't be able to be restored as the P2M won't have the identity
      mappings.
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      8c595088
    • D
      xen/setup: Fix for incorrect xen_extra_mem_start initialization under 32-bit · 0f16d0df
      Daniel Kiper 提交于
      git commit 24bdb0b6 (xen: do not create
      the extra e820 region at an addr lower than 4G) does not take into
      account that ifdef CONFIG_X86_32 instead of e820_end_of_low_ram_pfn()
      find_low_pfn_range() is called (both calls are from arch/x86/kernel/setup.c).
      find_low_pfn_range() behaves correctly and does not require change in
      xen_extra_mem_start initialization. Additionally, if xen_extra_mem_start
      is initialized in the same way as ifdef CONFIG_X86_64 then memory hotplug
      support for Xen balloon driver (under development) is broken.
      Signed-off-by: NDaniel Kiper <dkiper@net-space.pl>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      0f16d0df
    • K
      xen/setup: Ignore E820_UNUSABLE when setting 1-1 mappings. · 15bfc094
      Konrad Rzeszutek Wilk 提交于
      When we parse the raw E820, the Xen hypervisor can set "E820_RAM"
      to "E820_UNUSABLE" if the mem=X argument is used. As such we
      should _not_ consider the E820_UNUSABLE as an 1-1 identity
      mapping, but instead use the same case as for E820_RAM.
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      15bfc094
    • T
      xen mmu: fix a race window causing leave_mm BUG() · 7899891c
      Tian, Kevin 提交于
      There's a race window in xen_drop_mm_ref, where remote cpu may exit
      dirty bitmap between the check on this cpu and the point where remote
      cpu handles drop request. So in drop_other_mm_ref we need check
      whether TLB state is still lazy before calling into leave_mm. This
      bug is rarely observed in earlier kernel, but exaggerated by the
      commit 831d52bc
      ("x86, mm: avoid possible bogus tlb entries by clearing prev mm_cpumask after switching mm")
      which clears bitmap after changing the TLB state. the call trace is as below:
      
      ---------------------------------
      kernel BUG at arch/x86/mm/tlb.c:61!
      invalid opcode: 0000 [#1] SMP
      last sysfs file: /sys/devices/system/xen_memory/xen_memory0/info/current_kb
      CPU 1
      Modules linked in: 8021q garp xen_netback xen_blkback blktap blkback_pagemap nbd bridge stp llc autofs4 ipmi_devintf ipmi_si ipmi_msghandler lockd sunrpc bonding ipv6 xenfs dm_multipath video output sbs sbshc parport_pc lp parport ses enclosure snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device serio_raw bnx2 snd_pcm_oss snd_mixer_oss snd_pcm snd_timer iTCO_wdt snd soundcore snd_page_alloc i2c_i801 iTCO_vendor_support i2c_core pcs pkr pata_acpi ata_generic ata_piix shpchp mptsas mptscsih mptbase [last unloaded: freq_table]
      Pid: 25581, comm: khelper Not tainted 2.6.32.36fixxen #1 Tecal RH2285
      RIP: e030:[<ffffffff8103a3cb>]  [<ffffffff8103a3cb>] leave_mm+0x15/0x46
      RSP: e02b:ffff88002805be48  EFLAGS: 00010046
      RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff88015f8e2da0
      RDX: ffff88002805be78 RSI: 0000000000000000 RDI: 0000000000000001
      RBP: ffff88002805be48 R08: ffff88009d662000 R09: dead000000200200
      R10: dead000000100100 R11: ffffffff814472b2 R12: ffff88009bfc1880
      R13: ffff880028063020 R14: 00000000000004f6 R15: 0000000000000000
      FS:  00007f62362d66e0(0000) GS:ffff880028058000(0000) knlGS:0000000000000000
      CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 0000003aabc11909 CR3: 000000009b8ca000 CR4: 0000000000002660
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 00000000000000 00
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process khelper (pid: 25581, threadinfo ffff88007691e000, task ffff88009b92db40)
      Stack:
       ffff88002805be68 ffffffff8100e4ae 0000000000000001 ffff88009d733b88
      <0> ffff88002805be98 ffffffff81087224 ffff88002805be78 ffff88002805be78
      <0> ffff88015f808360 00000000000004f6 ffff88002805bea8 ffffffff81010108
      Call Trace:
       <IRQ>
       [<ffffffff8100e4ae>] drop_other_mm_ref+0x2a/0x53
       [<ffffffff81087224>] generic_smp_call_function_single_interrupt+0xd8/0xfc
       [<ffffffff81010108>] xen_call_function_single_interrupt+0x13/0x28
       [<ffffffff810a936a>] handle_IRQ_event+0x66/0x120
       [<ffffffff810aac5b>] handle_percpu_irq+0x41/0x6e
       [<ffffffff8128c1c0>] __xen_evtchn_do_upcall+0x1ab/0x27d
       [<ffffffff8128dd11>] xen_evtchn_do_upcall+0x33/0x46
       [<ffffffff81013efe>] xen_do_hyper visor_callback+0x1e/0x30
       <EOI>
       [<ffffffff814472b2>] ? _spin_unlock_irqrestore+0x15/0x17
       [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1
       [<ffffffff81113f71>] ? flush_old_exec+0x3ac/0x500
       [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
       [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
       [<ffffffff8115115d>] ? load_elf_binary+0x398/0x17ef
       [<ffffffff81042fcf>] ? need_resched+0x23/0x2d
       [<ffffffff811f4648>] ? process_measurement+0xc0/0xd7
       [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
       [<ffffffff81113094>] ? search_binary_handler+0xc8/0x255
       [<ffffffff81114362>] ? do_execve+0x1c3/0x29e
       [<ffffffff8101155d>] ? sys_execve+0x43/0x5d
       [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
       [<ffffffff81013e28>] ? kernel_execve+0x68/0xd0
       [<ffffffff 8106fc45>] ? __call_usermodehelper+0x0/0x6f
       [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1
       [<ffffffff8106fb64>] ? ____call_usermodehelper+0x113/0x11e
       [<ffffffff81013daa>] ? child_rip+0xa/0x20
       [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
       [<ffffffff81012f91>] ? int_ret_from_sys_call+0x7/0x1b
       [<ffffffff8101371d>] ? retint_restore_args+0x5/0x6
       [<ffffffff81013da0>] ? child_rip+0x0/0x20
      Code: 41 5e 41 5f c9 c3 55 48 89 e5 0f 1f 44 00 00 e8 17 ff ff ff c9 c3 55 48 89 e5 0f 1f 44 00 00 65 8b 04 25 c8 55 01 00 ff c8 75 04 <0f> 0b eb fe 65 48 8b 34 25 c0 55 01 00 48 81 c6 b8 02 00 00 e8
      RIP  [<ffffffff8103a3cb>] leave_mm+0x15/0x46
       RSP <ffff88002805be48>
      ---[ end trace ce9cee6832a9c503 ]---
      
      Tested-by: Maoxiaoyun<tinnycloud@hotmail.com>
      Signed-off-by: NKevin Tian <kevin.tian@intel.com>
      [v1: Fleshed out the git description a bit]
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      7899891c
    • S
      x86,xen: introduce x86_init.mapping.pagetable_reserve · 279b706b
      Stefano Stabellini 提交于
      Introduce a new x86_init hook called pagetable_reserve that at the end
      of init_memory_mapping is used to reserve a range of memory addresses for
      the kernel pagetable pages we used and free the other ones.
      
      On native it just calls memblock_x86_reserve_range while on xen it also
      takes care of setting the spare memory previously allocated
      for kernel pagetable pages from RO to RW, so that it can be used for
      other purposes.
      
      A detailed explanation of the reason why this hook is needed follows.
      
      As a consequence of the commit:
      
      commit 4b239f45
      Author: Yinghai Lu <yinghai@kernel.org>
      Date:   Fri Dec 17 16:58:28 2010 -0800
      
          x86-64, mm: Put early page table high
      
      at some point init_memory_mapping is going to reach the pagetable pages
      area and map those pages too (mapping them as normal memory that falls
      in the range of addresses passed to init_memory_mapping as argument).
      Some of those pages are already pagetable pages (they are in the range
      pgt_buf_start-pgt_buf_end) therefore they are going to be mapped RO and
      everything is fine.
      Some of these pages are not pagetable pages yet (they fall in the range
      pgt_buf_end-pgt_buf_top; for example the page at pgt_buf_end) so they
      are going to be mapped RW.  When these pages become pagetable pages and
      are hooked into the pagetable, xen will find that the guest has already
      a RW mapping of them somewhere and fail the operation.
      The reason Xen requires pagetables to be RO is that the hypervisor needs
      to verify that the pagetables are valid before using them. The validation
      operations are called "pinning" (more details in arch/x86/xen/mmu.c).
      
      In order to fix the issue we mark all the pages in the entire range
      pgt_buf_start-pgt_buf_top as RO, however when the pagetable allocation
      is completed only the range pgt_buf_start-pgt_buf_end is reserved by
      init_memory_mapping. Hence the kernel is going to crash as soon as one
      of the pages in the range pgt_buf_end-pgt_buf_top is reused (b/c those
      ranges are RO).
      
      For this reason we need a hook to reserve the kernel pagetable pages we
      used and free the other ones so that they can be reused for other
      purposes.
      On native it just means calling memblock_x86_reserve_range, on Xen it
      also means marking RW the pagetable pages that we allocated before but
      that haven't been used before.
      
      Another way to fix this is without using the hook is by adding a 'if
      (xen_pv_domain)' in the 'init_memory_mapping' code and calling the Xen
      counterpart, but that is just nasty.
      Signed-off-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com>
      Acked-by: NYinghai Lu <yinghai@kernel.org>
      Acked-by: NH. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      279b706b
    • K
      Revert "xen/mmu: Add workaround "x86-64, mm: Put early page table high"" · 92bdaef7
      Konrad Rzeszutek Wilk 提交于
      This reverts commit a3864783.
      
      It does not work with certain AMD machines.
      
      last_pfn = 0x100000 max_arch_pfn = 0x400000000
      initial memory mapped : 0 - 02c3a000
      Base memory trampoline at [ffff88000009b000] 9b000 size 20480
      init_memory_mapping: 0000000000000000-0000000100000000
       0000000000 - 0100000000 page 4k
      kernel direct mapping tables up to 100000000 @ ff7fb000-100000000
      init_memory_mapping: 0000000100000000-00000001e0800000
       0100000000 - 01e0800000 page 4k
      kernel direct mapping tables up to 1e0800000 @ 1df0f3000-1e0000000
      xen: setting RW the range fffdc000 - 100000000
      RAMDISK: 0203b000 - 02c3a000
      No NUMA configuration found
      Faking a node at 0000000000000000-00000001e0800000
      NUMA: Using 63 for the hash shift.
      Initmem setup node 0 0000000000000000-00000001e0800000
        NODE_DATA [00000001dfffb000 - 00000001dfffffff]
      BUG: unable to handle kernel NULL pointer dereference at           (null)
      IP: [<ffffffff81cf6a75>] setup_node_bootmem+0x18a/0x1ea
      PGD 0
      Oops: 0003 [#1] SMP
      last sysfs file:
      CPU 0
      Modules linked in:
      
      Pid: 0, comm: swapper Not tainted 2.6.39-0-virtual #6~smb1
      RIP: e030:[<ffffffff81cf6a75>]  [<ffffffff81cf6a75>] setup_node_bootmem+0x18a/0x1ea
      RSP: e02b:ffffffff81c01e38  EFLAGS: 00010046
      RAX: 0000000000000000 RBX: 00000001e0800000 RCX: 0000000000001040
      RDX: 0000000000004100 RSI: 0000000000000000 RDI: ffff8801dfffb000
      RBP: ffffffff81c01e58 R08: 0000000000000020 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
      R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000bfe400
      FS:  0000000000000000(0000) GS:ffffffff81cca000(0000) knlGS:0000000000000000
      CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000000 CR3: 0000000001c03000 CR4: 0000000000000660
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process swapper (pid: 0, threadinfo ffffffff81c00000, task ffffffff81c0b020)
      Stack:
       0000000000000040 0000000000000001 0000000000000000 ffffffffffffffff
       ffffffff81c01e88 ffffffff81cf6c25 0000000000000000 0000000000000000
       ffffffff81cf687f 0000000000000000 ffffffff81c01ea8 ffffffff81cf6e45
      Call Trace:
       [<ffffffff81cf6c25>] numa_register_memblks.constprop.3+0x150/0x181
       [<ffffffff81cf687f>] ? numa_add_memblk+0x7c/0x7c
       [<ffffffff81cf6e45>] numa_init.part.2+0x1c/0x7c
       [<ffffffff81cf687f>] ? numa_add_memblk+0x7c/0x7c
       [<ffffffff81cf6f67>] numa_init+0x6c/0x70
       [<ffffffff81cf7057>] initmem_init+0x39/0x3b
       [<ffffffff81ce5865>] setup_arch+0x64e/0x769
       [<ffffffff815e43c1>] ? printk+0x51/0x53
       [<ffffffff81cdf92b>] start_kernel+0xd4/0x3f3
       [<ffffffff81cdf388>] x86_64_start_reservations+0x132/0x136
       [<ffffffff81ce2ed4>] xen_start_kernel+0x588/0x58f
      Code: 41 00 00 48 8b 3c c5 a0 24 cc 81 31 c0 40 f6 c7 01 74 05 aa 66 ba ff 40 40 f6 c7 02 74 05 66 ab 83 ea 02 89 d1 c1 e9 02 f6 c2 02 <f3> ab 74 02 66 ab 80 e2 01 74 01 aa 49 63 c4 48 c1 eb 0c 44 89
      RIP  [<ffffffff81cf6a75>] setup_node_bootmem+0x18a/0x1ea
       RSP <ffffffff81c01e38>
      CR2: 0000000000000000
      ---[ end trace a7919e7f17c0a725 ]---
      Kernel panic - not syncing: Attempted to kill the idle task!
      Pid: 0, comm: swapper Tainted: G      D     2.6.39-0-virtual #6~smb1
      Reported-by: NStefan Bader <stefan.bader@canonical.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      92bdaef7
  10. 10 5月, 2011 1 次提交
  11. 03 5月, 2011 1 次提交