1. 22 3月, 2008 1 次提交
    • Y
      x86: reserve dma32 early for gart · f62f1fc9
      Yinghai Lu 提交于
      a system with 256 GB of RAM, when NUMA is disabled crashes the
      following way:
      
      Your BIOS doesn't leave a aperture memory hole
      Please enable the IOMMU option in the BIOS setup
      This costs you 64 MB of RAM
      Cannot allocate aperture memory hole (ffff8101c0000000,65536K)
      Kernel panic - not syncing: Not enough memory for aperture
      Pid: 0, comm: swapper Not tainted 2.6.25-rc4-x86-latest.git #33
      
      Call Trace:
       [<ffffffff84037c62>] panic+0xb2/0x190
       [<ffffffff840381fc>] ? release_console_sem+0x7c/0x250
       [<ffffffff847b1628>] ? __alloc_bootmem_nopanic+0x48/0x90
       [<ffffffff847b0ac9>] ? free_bootmem+0x29/0x50
       [<ffffffff847ac1f7>] gart_iommu_hole_init+0x5e7/0x680
       [<ffffffff847b255b>] ? alloc_large_system_hash+0x16b/0x310
       [<ffffffff84506a2f>] ? _etext+0x0/0x1
       [<ffffffff847a2e8c>] pci_iommu_alloc+0x1c/0x40
       [<ffffffff847ac795>] mem_init+0x45/0x1a0
       [<ffffffff8479ff35>] start_kernel+0x295/0x380
       [<ffffffff8479f1c2>] _sinittext+0x1c2/0x230
      
      the root cause is : memmap PMD is too big,
      [ffffe200e0600000-ffffe200e07fffff] PMD ->ffff81383c000000 on node 0
      almost near 4G..., and vmemmap_alloc_block will use up the ram under 4G.
      
      solution will be:
      1. make memmap allocation get memory above 4G...
      2. reserve some dma32 range early before we try to set up memmap for all.
      and release that before pci_iommu_alloc, so gart or swiotlb could get some
      range under 4g limit for sure.
      
      the patch is using method 2.
      because method1 may need more code to handle SPARSEMEM and SPASEMEM_VMEMMAP
      
      will get
      Your BIOS doesn't leave a aperture memory hole
      Please enable the IOMMU option in the BIOS setup
      This costs you 64 MB of RAM
      Mapping aperture over 65536 KB of RAM @ 4000000
      Memory: 264245736k/268959744k available (8484k kernel code, 4187464k reserved, 4004k data, 724k init)
      Signed-off-by: NYinghai Lu <yhlu.kernel@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      f62f1fc9
  2. 26 2月, 2008 1 次提交
    • M
      x86: fix boot failure on 486 due to TSC breakage · 12c247a6
      Mikael Pettersson 提交于
       > Diffing dmesg between git7 and git8 doesn't sched any light since
       > git8 also removed the printouts of the x86 caps as they were being
       > initialised and updated. I'm currently adding those printouts back
       > in the hope of seeing where and when the caps get broken.
      
      That turned out to be very illuminating:
      
       --- dmesg-2.6.24-git7	2008-02-24 18:01:25.295851000 +0100
       +++ dmesg-2.6.24-git8	2008-02-24 18:01:25.530358000 +0100
       ...
       CPU: After generic identify, caps: 00000003 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      
       CPU: After all inits, caps: 00000003 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      +CPU: After applying cleared_cpu_caps, caps: 00000013 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      
      Notice how the TSC cap bit goes from Off to On.
      
      (The first two lines are printout loops from -git7 forward-ported
      to -git8, the third line is the same printout loop added just after
      the xor-with-cleared_cpu_caps[] loop.)
      
      Here's how the breakage occurs:
      1. arch/x86/kernel/tsc_32.c:tsc_init() sees !cpu_has_tsc,
         so bails and calls setup_clear_cpu_cap(X86_FEATURE_TSC).
      2. include/asm-x86/cpufeature.h:setup_clear_cpu_cap(bit) clears
         the bit in boot_cpu_data and sets it in cleared_cpu_caps
      3. arch/x86/kernel/cpu/common.c:identify_cpu() XORs all caps
         in with cleared_cpu_caps
         HOWEVER, at this point c->x86_capability correctly has TSC
         Off, cleared_cpu_caps has TSC On, so the XOR incorrectly
         sets TSC to On in c->x86_capability, with disastrous results.
      
      The real bug is that clearing bits with XOR only works if the
      bits are known to be 1 prior to the XOR, and that's not true here.
      
      A simple fix is to convert the XOR to AND-NOT instead. The following
      patch does that, and allows my 486 to boot 2.6.25-rc kernels again.
      
      [ mingo@elte.hu: fixed a similar bug in setup_64.c as well. ]
      
      The breakage was introduced via commit 7d851c8d.
      Signed-off-by: NMikael Pettersson <mikpe@it.uu.se>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      12c247a6
  3. 19 2月, 2008 3 次提交
  4. 09 2月, 2008 1 次提交
  5. 08 2月, 2008 2 次提交
  6. 04 2月, 2008 1 次提交
  7. 02 2月, 2008 1 次提交
    • Y
      x86_64: make bootmap_start page align v6 · 24a5da73
      Yinghai Lu 提交于
      boot oopses when a system has 64 or 128 GB of RAM installed:
      
      Calling initcall 0xffffffff80bc33b6: sctp_init+0x0/0x711()
      BUG: unable to handle kernel NULL pointer dereference at 000000000000005f
      IP: [<ffffffff802bfe55>] proc_register+0xe7/0x10f
      PGD 0
      Oops: 0000 [1] SMP
      CPU 0
      Modules linked in:
      Pid: 1, comm: swapper Not tainted 2.6.24-smp-g5a514e21-dirty #6
      RIP: 0010:[<ffffffff802bfe55>]  [<ffffffff802bfe55>] proc_register+0xe7/0x10f
      RSP: 0000:ffff810824c57e60  EFLAGS: 00010246
      RAX: 000000000000d7d7 RBX: ffff811024c5fa80 RCX: ffff810824c57e08
      RDX: 0000000000000000 RSI: 0000000000000195 RDI: ffffffff80cc2460
      RBP: ffffffffffffffff R08: 0000000000000000 R09: ffff811024c5fa80
      R10: 0000000000000000 R11: 0000000000000002 R12: ffff810824c57e6c
      R13: 0000000000000000 R14: ffff810824c57ee0 R15: 00000006abd25bee
      FS:  0000000000000000(0000) GS:ffffffff80b4d000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
      CR2: 000000000000005f CR3: 0000000000201000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process swapper (pid: 1, threadinfo ffff810824c56000, task ffff812024c52000)
      Stack:  ffffffff80a57348 0000019500000000 ffff811024c5fa80 0000000000000000
       00000000ffffff97 ffffffff802bfef0 0000000000000000 ffffffffffffffff
       0000000000000000 ffffffff80bc3b4b ffff810824c57ee0 ffffffff80bc34a5
      Call Trace:
       [<ffffffff802bfef0>] ? create_proc_entry+0x73/0x8a
       [<ffffffff80bc3b4b>] ? sctp_snmp_proc_init+0x1c/0x34
       [<ffffffff80bc34a5>] ? sctp_init+0xef/0x711
       [<ffffffff80b976e3>] ? kernel_init+0x175/0x2e1
       [<ffffffff8020ccf8>] ? child_rip+0xa/0x12
       [<ffffffff80b9756e>] ? kernel_init+0x0/0x2e1
       [<ffffffff8020ccee>] ? child_rip+0x0/0x12
      
      Code: 1e 48 83 7b 38 00 75 08 48 c7 43 38 f0 e8 82 80 48 83 7b 30 00 75 08 48 c7 43 30 d0 e9 82 80 48 c7 c7 60 24 cc 80 e8 bd 5a 54 00 <48> 8b 45 60 48 89 6b 58 48 89 5d 60 48 89 43 50 fe 05 f5 25 a0
      RIP  [<ffffffff802bfe55>] proc_register+0xe7/0x10f
       RSP <ffff810824c57e60>
      CR2: 000000000000005f
      ---[ end trace 02c2d78def82877a ]---
      Kernel panic - not syncing: Attempted to kill init!
      
      it turns out some variables near end of bss are corrupted already.
      
      in System.map we have
      ffffffff80d40420 b rsi_table
      ffffffff80d40620 B krb5_seq_lock
      ffffffff80d40628 b i.20437
      ffffffff80d40630 b xprt_rdma_inline_write_padding
      ffffffff80d40638 b sunrpc_table_header
      ffffffff80d40640 b zero
      ffffffff80d40644 b min_memreg
      ffffffff80d40648 b rpcrdma_tk_lock_g
      ffffffff80d40650 B sctp_assocs_id_lock
      ffffffff80d40658 B proc_net_sctp
      ffffffff80d40660 B sctp_assocs_id
      ffffffff80d40680 B sysctl_sctp_mem
      ffffffff80d40690 B sysctl_sctp_rmem
      ffffffff80d406a0 B sysctl_sctp_wmem
      ffffffff80d406b0 b sctp_ctl_socket
      ffffffff80d406b8 b sctp_pf_inet6_specific
      ffffffff80d406c0 b sctp_pf_inet_specific
      ffffffff80d406c8 b sctp_af_v4_specific
      ffffffff80d406d0 b sctp_af_v6_specific
      ffffffff80d406d8 b sctp_rand.33270
      ffffffff80d406dc b sctp_memory_pressure
      ffffffff80d406e0 b sctp_sockets_allocated
      ffffffff80d406e4 b sctp_memory_allocated
      ffffffff80d406e8 b sctp_sysctl_header
      ffffffff80d406f0 b zero
      ffffffff80d406f4 A __bss_stop
      ffffffff80d406f4 A _end
      
      and setup_node_bootmem() will use that page 0xd40000 for bootmap
      Bootmem setup node 0 0000000000000000-0000000828000000
        NODE_DATA [000000000008a485 - 0000000000091484]
        bootmap [0000000000d406f4 -  0000000000e456f3] pages 105
      Bootmem setup node 1 0000000828000000-0000001028000000
        NODE_DATA [0000000828000000 - 0000000828006fff]
        bootmap [0000000828007000 -  0000000828106fff] pages 100
      Bootmem setup node 2 0000001028000000-0000001828000000
        NODE_DATA [0000001028000000 - 0000001028006fff]
        bootmap [0000001028007000 -  0000001028106fff] pages 100
      Bootmem setup node 3 0000001828000000-0000002028000000
        NODE_DATA [0000001828000000 - 0000001828006fff]
        bootmap [0000001828007000 -  0000001828106fff] pages 100
      
      setup_node_bootmem() makes NODE_DATA cacheline aligned,
      and bootmap is page-aligned.
      
      the patch updates find_e820_area() to make sure we can meet
      the alignment constraints.
      Signed-off-by: NYinghai Lu <yinghai.lu@sun.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      24a5da73
  8. 30 1月, 2008 30 次提交