1. 28 Apr, 2008 1 commit
  2. 27 Apr, 2008 2 commits
    • x86_64/mm: check and print vmemmap allocation continuous · c2b91e2e
      Yinghai Lu committed
      On big systems with lots of memory, don't print out too much during
      bootup, and make it easy to see whether the vmemmap is contiguous.
      
      A 256G 8-socket system will print:
       [ffffe20000000000-ffffe20002bfffff] PMD -> [ffff810001400000-ffff810003ffffff] on node 0
      [ffffe2001c700000-ffffe2001c7fffff] potential offnode page_structs
       [ffffe20002c00000-ffffe2001c7fffff] PMD -> [ffff81000c000000-ffff8100255fffff] on node 0
      [ffffe20038700000-ffffe200387fffff] potential offnode page_structs
       [ffffe2001c800000-ffffe200387fffff] PMD -> [ffff810820200000-ffff81083c1fffff] on node 1
       [ffffe20040000000-ffffe2007fffffff] PUD ->ffff811027a00000 on node 2
       [ffffe20038800000-ffffe2003fffffff] PMD -> [ffff811020200000-ffff8110279fffff] on node 2
      [ffffe20054700000-ffffe200547fffff] potential offnode page_structs
       [ffffe20040000000-ffffe200547fffff] PMD -> [ffff811027c00000-ffff81103c3fffff] on node 2
      [ffffe20070700000-ffffe200707fffff] potential offnode page_structs
       [ffffe20054800000-ffffe200707fffff] PMD -> [ffff811820200000-ffff81183c1fffff] on node 3
       [ffffe20080000000-ffffe200bfffffff] PUD ->ffff81202fa00000 on node 4
       [ffffe20070800000-ffffe2007fffffff] PMD -> [ffff812020200000-ffff81202f9fffff] on node 4
      [ffffe2008c700000-ffffe2008c7fffff] potential offnode page_structs
       [ffffe20080000000-ffffe2008c7fffff] PMD -> [ffff81202fc00000-ffff81203c3fffff] on node 4
      [ffffe200a8700000-ffffe200a87fffff] potential offnode page_structs
       [ffffe2008c800000-ffffe200a87fffff] PMD -> [ffff812820200000-ffff81283c1fffff] on node 5
       [ffffe200c0000000-ffffe200ffffffff] PUD ->ffff813037a00000 on node 6
       [ffffe200a8800000-ffffe200bfffffff] PMD -> [ffff813020200000-ffff8130379fffff] on node 6
      [ffffe200c4700000-ffffe200c47fffff] potential offnode page_structs
       [ffffe200c0000000-ffffe200c47fffff] PMD -> [ffff813037c00000-ffff81303c3fffff] on node 6
       [ffffe200c4800000-ffffe200e07fffff] PMD -> [ffff813820200000-ffff81383c1fffff] on node 7
      
      instead of a very long print out...
      Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
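The idea of printing one line per contiguous run rather than one line per PMD can be sketched as follows. This is a user-space model only, not the kernel code: `struct pmd_map` and `print_coalesced()` are hypothetical names, and the input is assumed to be a list of 2 MiB mappings already sorted by virtual address.

```c
#include <stdint.h>
#include <stdio.h>

#define PMD_SIZE (2ULL << 20) /* one 2 MiB large page */

/* Hypothetical record: one vmemmap large page and the memory backing it. */
struct pmd_map {
    uint64_t vaddr;   /* start of the vmemmap range */
    uint64_t backing; /* start of the backing range */
    int node;         /* NUMA node that provided the backing memory */
};

/*
 * Print one line per run of mappings that are contiguous in both
 * address spaces and live on the same node.
 * Returns the number of lines printed.
 */
static int print_coalesced(const struct pmd_map *m, int n)
{
    int lines = 0;
    int start = 0;

    for (int i = 1; i <= n; i++) {
        int contiguous = i < n &&
            m[i].node == m[i - 1].node &&
            m[i].vaddr == m[i - 1].vaddr + PMD_SIZE &&
            m[i].backing == m[i - 1].backing + PMD_SIZE;

        if (contiguous)
            continue;

        /* The run [start, i - 1] ends here: flush it as a single line. */
        printf(" [%016llx-%016llx] PMD -> [%016llx-%016llx] on node %d\n",
               (unsigned long long)m[start].vaddr,
               (unsigned long long)(m[i - 1].vaddr + PMD_SIZE - 1),
               (unsigned long long)m[start].backing,
               (unsigned long long)(m[i - 1].backing + PMD_SIZE - 1),
               m[start].node);
        lines++;
        start = i;
    }
    return lines;
}
```

A run is broken whenever the node changes or either address range has a gap, which is what makes non-contiguous allocations stand out in the log.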
    • x86_64: make reserve_bootmem_generic() use new reserve_bootmem() · 8b3cd09e
      Yinghai Lu committed
      "mm: make reserve_bootmem can crossed the nodes" provides a new
      reserve_bootmem(); let reserve_bootmem_generic() use it.
      
      reserve_bootmem_generic() is used to reserve the initial ramdisk, so
      this makes sure reserve_bootmem still works even when the ranges loaded
      by the bootloader or kexec cross node memory boundaries.
      Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
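What "crossing node boundaries" means for a reservation can be illustrated with a small model that clips one physical range against each node and reserves the pieces separately. Everything here is an assumption made for illustration (`struct node_range`, `reserve_on_node()`, `reserve_across_nodes()`); it is not the bootmem implementation.

```c
#include <stdint.h>

#define MAX_NODES 8

/* Hypothetical description of the memory owned by one node: [start, end). */
struct node_range {
    uint64_t start;
    uint64_t end;
};

/* Stand-in for a per-node reservation; it only counts reserved bytes. */
static uint64_t reserved_bytes[MAX_NODES];

static void reserve_on_node(int nid, uint64_t start, uint64_t end)
{
    reserved_bytes[nid] += end - start;
}

/*
 * Reserve [start, start + len) even if it spans several nodes by
 * clipping it against every node (nr must not exceed MAX_NODES).
 * Returns the number of nodes that received a piece of the range.
 */
static int reserve_across_nodes(const struct node_range *nodes, int nr,
                                uint64_t start, uint64_t len)
{
    uint64_t end = start + len;
    int touched = 0;

    for (int nid = 0; nid < nr; nid++) {
        uint64_t s = start > nodes[nid].start ? start : nodes[nid].start;
        uint64_t e = end < nodes[nid].end ? end : nodes[nid].end;

        if (s < e) { /* the range intersects this node */
            reserve_on_node(nid, s, e);
            touched++;
        }
    }
    return touched;
}
```

With two nodes meeting at 0x828000000, a ramdisk loaded at 0x827000000 with length 0x2000000 would be split into one 16 MiB piece per node.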
  3. 26 Apr, 2008 2 commits
  4. 25 Apr, 2008 5 commits
  5. 20 Apr, 2008 1 commit
  6. 17 Apr, 2008 9 commits
  7. 26 Feb, 2008 2 commits
    • x86: fix spontaneous reboot with allyesconfig bzImage · 88f3aec7
      Ingo Molnar committed
      recently the 64-bit allyesconfig bzImage kernel started spontaneously
      rebooting during early bootup.
      
      after a few fun hours spent with early init debugging, it turns out
      that we've got this rather annoying limit on the size of the kernel
      image:
      
            #define KERNEL_TEXT_SIZE  (40*1024*1024)
      
      which limit my vmlinux just happened to pass:
      
             text      data       bss        dec       hex   filename
         29703744   4222751   8646224   42572719   2899baf   vmlinux
      
      40 MB is 41943040 bytes, so my vmlinux was just 1.5% above this limit :-/
      
      So it happily crashed right in head_64.S, which - as we all know - is
      the most debuggable code in the whole architecture ;-)
      
      So increase the limit to allow an up to 128MB kernel image to be mapped.
      (should anyone be that crazy or lazy)
      
      We have a full 4K of pagetable (level2_kernel_pgt) allocated for these
      mappings already, so there's no RAM overhead and the limit was rather
      pointless and arbitrary.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
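The arithmetic behind the "1.5% above this limit" remark can be checked with a few lines of C. The helper names and the `_OLD`/`_NEW` suffixes are made up for this sketch; the sizes are the ones quoted in the commit message.

```c
#include <stdint.h>

/* Old and new mapping limits; the suffixed names are illustrative only. */
#define KERNEL_TEXT_SIZE_OLD (40ULL * 1024 * 1024)  /* 41943040 bytes */
#define KERNEL_TEXT_SIZE_NEW (128ULL * 1024 * 1024)

/* Total image size as reported in the "dec" column of size(1). */
static uint64_t image_size(uint64_t text, uint64_t data, uint64_t bss)
{
    return text + data + bss;
}

/*
 * How far an image exceeds a limit, in hundredths of a percent.
 * Returns 0 when the image fits.
 */
static uint64_t overshoot_bp(uint64_t size, uint64_t limit)
{
    if (size <= limit)
        return 0;
    return (size - limit) * 10000 / limit;
}
```

For the vmlinux above, `image_size(29703744, 4222751, 8646224)` is 42572719, which overshoots the 40 MB limit by 150 hundredths of a percent (1.5%) and fits comfortably under 128 MB.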
    • x86: remove double-checking empty zero pages debug · 3b57bc46
      Yinghai Lu committed
      so far no one complained about that.
      Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  8. 19 Feb, 2008 1 commit
    • x86: zap invalid and unused pmds in early boot · 31eedd82
      Thomas Gleixner committed
      The early boot code maps KERNEL_TEXT_SIZE (currently 40MB) starting
      from __START_KERNEL_map. The kernel itself only needs _text to _end
      mapped in the high alias. On relocatable kernels the ASM setup code
      adjusts the compile time created high mappings to the relocation. This
      creates invalid pmd entries for negative offsets:
      
      0xffffffff80000000 -> pmd entry: ffffffffff2001e3
      It points outside of the physical address space and is marked present.
      
      This starts at the virtual address __START_KERNEL_map and goes up to
      the point where the first valid physical address (0x0) is mapped.
      
      Zap the mappings before _text and after _end right away in early
      boot. This also removes the invalid entries.
      
      Furthermore it simplifies the range check for high aliases.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: H. Peter Anvin <hpa@zytor.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
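A simplified user-space model of the zapping step: treat the high kernel mapping as an array of 512 PMD entries, each covering 2 MiB starting at `__START_KERNEL_map`, and clear every entry that lies entirely outside `_text` to `_end`. The function name and the plain `uint64_t` array are assumptions of this sketch; real page-table code would also need to flush the TLB.

```c
#include <stdint.h>

#define PMD_SHIFT        21
#define PMD_SIZE         (1ULL << PMD_SHIFT) /* 2 MiB per entry */
#define PTRS_PER_PMD     512
#define START_KERNEL_MAP 0xffffffff80000000ULL

/*
 * Clear every PMD entry whose 2 MiB window lies entirely outside
 * [text, end). Both addresses must be at or above START_KERNEL_MAP.
 */
static void zap_outside(uint64_t *pmd, uint64_t text, uint64_t end)
{
    /* Index of the entry containing text (round down). */
    uint64_t first = (text - START_KERNEL_MAP) >> PMD_SHIFT;
    /* Index just past the entry containing end - 1 (round up). */
    uint64_t last = (end - START_KERNEL_MAP + PMD_SIZE - 1) >> PMD_SHIFT;

    for (uint64_t i = 0; i < PTRS_PER_PMD; i++)
        if (i < first || i >= last)
            pmd[i] = 0; /* not present any more */
}
```

Once only `_text` to `_end` remains mapped, a high-alias range check reduces to a comparison against those two symbols, which is the simplification the commit message mentions.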
  9. 15 Feb, 2008 1 commit
  10. 10 Feb, 2008 2 commits
  11. 08 Feb, 2008 1 commit
    • Introduce flags for reserve_bootmem() · 72a7fe39
      Bernhard Walle committed
      This patchset adds a flags variable to reserve_bootmem() and uses the
      BOOTMEM_EXCLUSIVE flag in crashkernel reservation code to detect collisions
      between crashkernel area and already used memory.
      
      This patch:
      
      Change the reserve_bootmem() function to accept a new flag BOOTMEM_EXCLUSIVE.
      If that flag is set, the function returns with -EBUSY if the memory already
      has been reserved in the past.  This is to avoid conflicts.
      
      Because that code runs before SMP initialisation, there's no race condition
      inside reserve_bootmem_core().
      
      [akpm@linux-foundation.org: coding-style fixes]
      [akpm@linux-foundation.org: fix powerpc build]
      Signed-off-by: Bernhard Walle <bwalle@suse.de>
      Cc: <linux-arch@vger.kernel.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Vivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
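The exclusive-reservation semantics described above can be modelled in user space with a one-byte-per-page map. This is a sketch under assumptions: `reserve_pages()`, `bootmap[]` and `NPAGES` are invented for the example, and the model checks for conflicts before marking anything, which is not necessarily how reserve_bootmem_core() is structured.

```c
#include <errno.h>
#include <stdint.h>

#define BOOTMEM_DEFAULT   0
#define BOOTMEM_EXCLUSIVE (1 << 0)

#define NPAGES 64

/* Toy bootmem map: one byte per page, non-zero means reserved. */
static uint8_t bootmap[NPAGES];

/*
 * Reserve count pages starting at page index start.
 * With BOOTMEM_EXCLUSIVE the call fails with -EBUSY if any page in the
 * range is already reserved, and leaves the map untouched.
 */
static int reserve_pages(unsigned int start, unsigned int count, int flags)
{
    if (start >= NPAGES || count > NPAGES - start)
        return -EINVAL;

    if (flags & BOOTMEM_EXCLUSIVE)
        for (unsigned int i = start; i < start + count; i++)
            if (bootmap[i])
                return -EBUSY;

    for (unsigned int i = start; i < start + count; i++)
        bootmap[i] = 1;
    return 0;
}
```

No locking is needed in the model for the same reason given in the commit message: the code runs before any other CPU is started.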
  12. 07 Feb, 2008 1 commit
  13. 04 Feb, 2008 3 commits
  14. 02 Feb, 2008 2 commits
    • x86_64: make bootmap_start page align v6 · 24a5da73
      Yinghai Lu committed
      boot oopses when a system has 64 or 128 GB of RAM installed:
      
      Calling initcall 0xffffffff80bc33b6: sctp_init+0x0/0x711()
      BUG: unable to handle kernel NULL pointer dereference at 000000000000005f
      IP: [<ffffffff802bfe55>] proc_register+0xe7/0x10f
      PGD 0
      Oops: 0000 [1] SMP
      CPU 0
      Modules linked in:
      Pid: 1, comm: swapper Not tainted 2.6.24-smp-g5a514e21-dirty #6
      RIP: 0010:[<ffffffff802bfe55>]  [<ffffffff802bfe55>] proc_register+0xe7/0x10f
      RSP: 0000:ffff810824c57e60  EFLAGS: 00010246
      RAX: 000000000000d7d7 RBX: ffff811024c5fa80 RCX: ffff810824c57e08
      RDX: 0000000000000000 RSI: 0000000000000195 RDI: ffffffff80cc2460
      RBP: ffffffffffffffff R08: 0000000000000000 R09: ffff811024c5fa80
      R10: 0000000000000000 R11: 0000000000000002 R12: ffff810824c57e6c
      R13: 0000000000000000 R14: ffff810824c57ee0 R15: 00000006abd25bee
      FS:  0000000000000000(0000) GS:ffffffff80b4d000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
      CR2: 000000000000005f CR3: 0000000000201000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process swapper (pid: 1, threadinfo ffff810824c56000, task ffff812024c52000)
      Stack:  ffffffff80a57348 0000019500000000 ffff811024c5fa80 0000000000000000
       00000000ffffff97 ffffffff802bfef0 0000000000000000 ffffffffffffffff
       0000000000000000 ffffffff80bc3b4b ffff810824c57ee0 ffffffff80bc34a5
      Call Trace:
       [<ffffffff802bfef0>] ? create_proc_entry+0x73/0x8a
       [<ffffffff80bc3b4b>] ? sctp_snmp_proc_init+0x1c/0x34
       [<ffffffff80bc34a5>] ? sctp_init+0xef/0x711
       [<ffffffff80b976e3>] ? kernel_init+0x175/0x2e1
       [<ffffffff8020ccf8>] ? child_rip+0xa/0x12
       [<ffffffff80b9756e>] ? kernel_init+0x0/0x2e1
       [<ffffffff8020ccee>] ? child_rip+0x0/0x12
      
      Code: 1e 48 83 7b 38 00 75 08 48 c7 43 38 f0 e8 82 80 48 83 7b 30 00 75 08 48 c7 43 30 d0 e9 82 80 48 c7 c7 60 24 cc 80 e8 bd 5a 54 00 <48> 8b 45 60 48 89 6b 58 48 89 5d 60 48 89 43 50 fe 05 f5 25 a0
      RIP  [<ffffffff802bfe55>] proc_register+0xe7/0x10f
       RSP <ffff810824c57e60>
      CR2: 000000000000005f
      ---[ end trace 02c2d78def82877a ]---
      Kernel panic - not syncing: Attempted to kill init!
      
      it turns out some variables near end of bss are corrupted already.
      
      in System.map we have
      ffffffff80d40420 b rsi_table
      ffffffff80d40620 B krb5_seq_lock
      ffffffff80d40628 b i.20437
      ffffffff80d40630 b xprt_rdma_inline_write_padding
      ffffffff80d40638 b sunrpc_table_header
      ffffffff80d40640 b zero
      ffffffff80d40644 b min_memreg
      ffffffff80d40648 b rpcrdma_tk_lock_g
      ffffffff80d40650 B sctp_assocs_id_lock
      ffffffff80d40658 B proc_net_sctp
      ffffffff80d40660 B sctp_assocs_id
      ffffffff80d40680 B sysctl_sctp_mem
      ffffffff80d40690 B sysctl_sctp_rmem
      ffffffff80d406a0 B sysctl_sctp_wmem
      ffffffff80d406b0 b sctp_ctl_socket
      ffffffff80d406b8 b sctp_pf_inet6_specific
      ffffffff80d406c0 b sctp_pf_inet_specific
      ffffffff80d406c8 b sctp_af_v4_specific
      ffffffff80d406d0 b sctp_af_v6_specific
      ffffffff80d406d8 b sctp_rand.33270
      ffffffff80d406dc b sctp_memory_pressure
      ffffffff80d406e0 b sctp_sockets_allocated
      ffffffff80d406e4 b sctp_memory_allocated
      ffffffff80d406e8 b sctp_sysctl_header
      ffffffff80d406f0 b zero
      ffffffff80d406f4 A __bss_stop
      ffffffff80d406f4 A _end
      
      and setup_node_bootmem() will use that page 0xd40000 for bootmap
      Bootmem setup node 0 0000000000000000-0000000828000000
        NODE_DATA [000000000008a485 - 0000000000091484]
        bootmap [0000000000d406f4 -  0000000000e456f3] pages 105
      Bootmem setup node 1 0000000828000000-0000001028000000
        NODE_DATA [0000000828000000 - 0000000828006fff]
        bootmap [0000000828007000 -  0000000828106fff] pages 100
      Bootmem setup node 2 0000001028000000-0000001828000000
        NODE_DATA [0000001028000000 - 0000001028006fff]
        bootmap [0000001028007000 -  0000001028106fff] pages 100
      Bootmem setup node 3 0000001828000000-0000002028000000
        NODE_DATA [0000001828000000 - 0000001828006fff]
        bootmap [0000001828007000 -  0000001828106fff] pages 100
      
      setup_node_bootmem() makes NODE_DATA cacheline aligned,
      and bootmap is page-aligned.
      
      the patch updates find_e820_area() to make sure we can meet
      the alignment constraints.
      Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
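The effect of an alignment-aware search can be sketched as below. `find_area()` is a hypothetical stand-in that only handles a single free region; it is not the signature of find_e820_area(). The point is that the bootmap must start at the next page boundary after `_end` (0xd41000), not at 0xd406f4 inside the page that still holds the tail of bss.

```c
#include <stdint.h>

#define PAGE_SIZE 4096ULL

/* Round addr up to a multiple of align (align must be a power of two). */
static uint64_t round_up_to(uint64_t addr, uint64_t align)
{
    return (addr + align - 1) & ~(align - 1);
}

/*
 * Lowest aligned start address in the free region [lo, hi) that can
 * hold size bytes, or UINT64_MAX if the region is too small.
 */
static uint64_t find_area(uint64_t lo, uint64_t hi, uint64_t size,
                          uint64_t align)
{
    uint64_t start = round_up_to(lo, align);

    if (start < lo || start > hi || hi - start < size)
        return UINT64_MAX;
    return start;
}
```

Passing a cacheline-sized alignment for NODE_DATA and `PAGE_SIZE` for the bootmap would give each caller the constraint it needs from the same helper.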
    • x86_64: add debug name for early_res · 25eff8d4
      Yinghai Lu committed
      helps debugging problems in this rather murky area of code.
      Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  15. 30 Jan, 2008 7 commits
    • x86: fix overlap between pagetable with bss section · 91987157
      Yinghai Lu committed
      one early crash on one 8 node 256g machine:
      
      Command line: console=uart8250,io,0x3f8,115200n8 initrd=kernel.org/mydisk11_x86_64.gz rw root=/dev/ram0 debug initcall_debug apic=debug acpi.debug_level=0x0000000f pci=routeirq ip=dhcp load_ramdisk=1 ramdisk_size=131072 BOOT_IMAGE=kernel.org/bzImage_2.6.25_k8.1
      BIOS-provided physical RAM map:
       BIOS-e820: 0000000000000000 - 000000000009bc00 (usable)
       BIOS-e820: 000000000009bc00 - 00000000000a0000 (reserved)
       BIOS-e820: 00000000000e6000 - 0000000000100000 (reserved)
       BIOS-e820: 0000000000100000 - 00000000dffe0000 (usable)
       BIOS-e820: 00000000dffe0000 - 00000000dffee000 (ACPI data)
       BIOS-e820: 00000000dffee000 - 00000000dffff050 (ACPI NVS)
       BIOS-e820: 00000000dffff050 - 00000000e0000000 (reserved)
       BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
       BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
       BIOS-e820: 00000000ff700000 - 0000000100000000 (reserved)
       BIOS-e820: 0000000100000000 - 0000004020000000 (usable)
      Early serial console at I/O port 0x3f8 (options '115200n8')
      console [uart0] enabled
      end_pfn_map = 67239936
      Kernel panic - not syncing: Duplicated early reservation d40000-e42000
      
      Pid: 0, comm: swapper Not tainted 2.6.24-smp-g5a514e21-dirty #3
      
      Call Trace:
       [<ffffffff80221545>] lapic_get_maxlvt+0x0/0x10
       [<ffffffff80221657>] clear_local_APIC+0x5/0xcf
       [<ffffffff80221726>] disable_local_APIC+0x5/0x17
       [<ffffffff8021fe16>] smp_send_stop+0x46/0x4c
       [<ffffffff80235293>] panic+0x94/0x13e
       [<ffffffff80bc3b03>] sctp_eps_proc_init+0x12/0x34
       [<ffffffff80b9f1c5>] reserve_early+0x30/0x6c
       [<ffffffff80803925>] init_memory_mapping+0x2cd/0x2dc
       [<ffffffff80b9dc01>] setup_arch+0x21f/0x44e
       [<ffffffff80b978be>] start_kernel+0x6f/0x2c7
       [<ffffffff80b971cc>] _sinittext+0x1cc/0x1d3
      
      it turns out there is overlap between pgtable and bss...
      
      in System.map we have
      ffffffff80d40420 b rsi_table
      ffffffff80d40620 B krb5_seq_lock
      ffffffff80d40628 b i.20437
      ffffffff80d40630 b xprt_rdma_inline_write_padding
      ffffffff80d40638 b sunrpc_table_header
      ffffffff80d40640 b zero
      ffffffff80d40644 b min_memreg
      ffffffff80d40648 b rpcrdma_tk_lock_g
      ffffffff80d40650 B sctp_assocs_id_lock
      ffffffff80d40658 B proc_net_sctp
      ffffffff80d40660 B sctp_assocs_id
      ffffffff80d40680 B sysctl_sctp_mem
      ffffffff80d40690 B sysctl_sctp_rmem
      ffffffff80d406a0 B sysctl_sctp_wmem
      ffffffff80d406b0 b sctp_ctl_socket
      ffffffff80d406b8 b sctp_pf_inet6_specific
      ffffffff80d406c0 b sctp_pf_inet_specific
      ffffffff80d406c8 b sctp_af_v4_specific
      ffffffff80d406d0 b sctp_af_v6_specific
      ffffffff80d406d8 b sctp_rand.33270
      ffffffff80d406dc b sctp_memory_pressure
      ffffffff80d406e0 b sctp_sockets_allocated
      ffffffff80d406e4 b sctp_memory_allocated
      ffffffff80d406e8 b sctp_sysctl_header
      ffffffff80d406f0 b zero
      ffffffff80d406f4 A __bss_stop
      ffffffff80d406f4 A _end
      
      need to round up table_start to PAGE_SIZE.
      
      also make the panic more informative.
      Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
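Why the reservation d40000-e42000 collides with a kernel image ending at d406f4 comes down to a half-open interval overlap test. The sketch below is illustrative only: `struct res_range`, `overlaps()` and `page_align_up()` are invented names, and the image start used in the usage note is an assumed value, not taken from the log.

```c
#include <stdint.h>

#define PAGE_SIZE 4096ULL

/* Hypothetical early reservation covering [start, end). */
struct res_range {
    uint64_t start;
    uint64_t end;
};

/* Two half-open ranges overlap when each starts before the other ends. */
static int overlaps(const struct res_range *a, const struct res_range *b)
{
    return a->start < b->end && b->start < a->end;
}

/* Round an address up to the next page boundary. */
static uint64_t page_align_up(uint64_t addr)
{
    return (addr + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);
}
```

With the image reserved as [0x200000, 0xd406f4) (start assumed), a page table at [0xd40000, 0xe42000) overlaps it; starting the table at `page_align_up(0xd406f4)`, which is 0xd41000, does not.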
    • x86: arch/x86/mm/init_64.c printk fixes · 10f22dde
      Ingo Molnar committed
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86: unify ioremap · 14a62c34
      Thomas Gleixner committed
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86: cpa: fix the self-test · 86f03989
      Ingo Molnar committed
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86: init memory debugging · ee01f112
      Ingo Molnar committed
      debug incorrect/late access to init memory, by permanently unmapping
      the init memory ranges. Depends on CONFIG_DEBUG_PAGEALLOC=y.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86: move misplaced rodata check call · 1a487252
      Arjan van de Ven committed
      It looks like a mismerge put the rodata self-check in the wrong spot; move
      it to the right place after marking the .rodata section read only.
      Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86: add testcases for RODATA and NX protections/attributes · edeed305
      Arjan van de Ven committed
      Latest update; I now have 4 NX tests, but 2 fail so they're #if 0'd.
      I also cleaned up the NX test code quite a bit, and got rid of the ugly
      exception table sorting stuff.
      
      From: Arjan van de Ven <arjan@linux.intel.com>
      
      This patch adds testcases for the CONFIG_DEBUG_RODATA configuration option
      as well as the NX CPU feature/mappings. Both testcases can move to tests/
      once that patch gets merged into mainline.
      (I'm half considering moving the rodata test into mm/init.c but I'll
      wait with that until init.c is unified)
      
      As part of this I had to fix a not-quite-right alignment in the vmlinux.lds.h
      for the RODATA sections, which led to 1 page less being marked read only.
      Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>