1. 27 4月, 2008 15 次提交
    • R
      x86: validate against acpi motherboard resources · 7752d5cf
      Robert Hancock 提交于
      This path adds validation of the MMCONFIG table against the ACPI reserved
      motherboard resources.  If the MMCONFIG table is found to be reserved in
      ACPI, we don't bother checking the E820 table.  The PCI Express firmware
      spec apparently tells BIOS developers that reservation in ACPI is required
      and E820 reservation is optional, so checking against ACPI first makes
      sense.  Many BIOSes don't reserve the MMCONFIG region in E820 even though
      it is perfectly functional, the existing check needlessly disables MMCONFIG
      in these cases.
      
      In order to do this, MMCONFIG setup has been split into two phases.  If PCI
      configuration type 1 is not available then MMCONFIG is enabled early as
      before.  Otherwise, it is enabled later after the ACPI interpreter is
      enabled, since we need to be able to execute control methods in order to
      check the ACPI reserved resources.  Presently this is just triggered off
      the end of ACPI interpreter initialization.
      
      There are a few other behavioral changes here:
      
      - Validate all MMCONFIG configurations provided, not just the first one.
      
      - Validate the entire required length of each configuration according to
        the provided ending bus number is reserved, not just the minimum required
        allocation.
      
      - Validate that the area is reserved even if we read it from the chipset
        directly and not from the MCFG table.  This catches the case where the
        BIOS didn't set the location properly in the chipset and has mapped it
        over other things it shouldn't have.
      
      This also cleans up the MMCONFIG initialization functions so that they
      simply do nothing if MMCONFIG is not compiled in.
      
      Based on an original patch by Rajesh Shah from Intel.
      
      [akpm@linux-foundation.org: many fixes and cleanups]
      Signed-off-by: NRobert Hancock <hancockr@shaw.ca>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Greg KH <greg@kroah.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Tested-by: NAndi Kleen <ak@suse.de>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Greg KH <greg@kroah.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      7752d5cf
    • Y
      x86_64/mm: check and print vmemmap allocation continuous · c2b91e2e
      Yinghai Lu 提交于
      On big systems with lots of memory, don't print out too much during
      bootup, and make it easy to find if it is continuous.
      
      on 256G 8 sockets system will get
       [ffffe20000000000-ffffe20002bfffff] PMD -> [ffff810001400000-ffff810003ffffff] on node 0
      [ffffe2001c700000-ffffe2001c7fffff] potential offnode page_structs
       [ffffe20002c00000-ffffe2001c7fffff] PMD -> [ffff81000c000000-ffff8100255fffff] on node 0
      [ffffe20038700000-ffffe200387fffff] potential offnode page_structs
       [ffffe2001c800000-ffffe200387fffff] PMD -> [ffff810820200000-ffff81083c1fffff] on node 1
       [ffffe20040000000-ffffe2007fffffff] PUD ->ffff811027a00000 on node 2
       [ffffe20038800000-ffffe2003fffffff] PMD -> [ffff811020200000-ffff8110279fffff] on node 2
      [ffffe20054700000-ffffe200547fffff] potential offnode page_structs
       [ffffe20040000000-ffffe200547fffff] PMD -> [ffff811027c00000-ffff81103c3fffff] on node 2
      [ffffe20070700000-ffffe200707fffff] potential offnode page_structs
       [ffffe20054800000-ffffe200707fffff] PMD -> [ffff811820200000-ffff81183c1fffff] on node 3
       [ffffe20080000000-ffffe200bfffffff] PUD ->ffff81202fa00000 on node 4
       [ffffe20070800000-ffffe2007fffffff] PMD -> [ffff812020200000-ffff81202f9fffff] on node 4
      [ffffe2008c700000-ffffe2008c7fffff] potential offnode page_structs
       [ffffe20080000000-ffffe2008c7fffff] PMD -> [ffff81202fc00000-ffff81203c3fffff] on node 4
      [ffffe200a8700000-ffffe200a87fffff] potential offnode page_structs
       [ffffe2008c800000-ffffe200a87fffff] PMD -> [ffff812820200000-ffff81283c1fffff] on node 5
       [ffffe200c0000000-ffffe200ffffffff] PUD ->ffff813037a00000 on node 6
       [ffffe200a8800000-ffffe200bfffffff] PMD -> [ffff813020200000-ffff8130379fffff] on node 6
      [ffffe200c4700000-ffffe200c47fffff] potential offnode page_structs
       [ffffe200c0000000-ffffe200c47fffff] PMD -> [ffff813037c00000-ffff81303c3fffff] on node 6
       [ffffe200c4800000-ffffe200e07fffff] PMD -> [ffff813820200000-ffff81383c1fffff] on node 7
      
      instead of a very long print out...
      Signed-off-by: NYinghai Lu <yhlu.kernel@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      c2b91e2e
    • Y
      x86_64: fix setup_node_bootmem to support big mem excluding with memmap · 1a27fc0a
      Yinghai Lu 提交于
      typical case: four sockets system, every node has 4g ram, and we are using:
      
      	memmap=10g$4g
      
      to mask out memory on node1 and node2
      
      when numa is enabled, early_node_mem is used to get node_data and node_bootmap.
      
      if it can not get memory from the same node with find_e820_area(), it will
      use alloc_bootmem to get buff from previous nodes.
      
      so check it and print out some info about it.
      
      need to move early_res_to_bootmem into every setup_node_bootmem.
      and it takes range that node has. otherwise alloc_bootmem could return addr
      that reserved early.
      
      depends on "mm: make reserve_bootmem can crossed the nodes".
      Signed-off-by: NYinghai Lu <yhlu.kernel@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      1a27fc0a
    • Y
      x86_64: make reserve_bootmem_generic() use new reserve_bootmem() · 8b3cd09e
      Yinghai Lu 提交于
      "mm: make reserve_bootmem can crossed the nodes" provides new
      reserve_bootmem(), let reserve_bootmem_generic() use that.
      
      reserve_bootmem_generic() is used to reserve initramdisk, so this way
      we can make sure even when bootloader or kexec load ranges cross the
      node memory boundaries, reserve_bootmem still works.
      Signed-off-by: NYinghai Lu <yhlu.kernel@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8b3cd09e
    • H
      x86, boot: export linked list of struct setup_data via debugfs · c14b2adf
      Huang, Ying 提交于
      Export linked list of struct setup_data via debugfs.
      Signed-off-by: NHuang Ying <ying.huang@intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      c14b2adf
    • H
      x86, boot: add linked list of struct setup_data · 8b664aa6
      Huang, Ying 提交于
      This patch adds a field of 64-bit physical pointer to NULL terminated
      single linked list of struct setup_data to real-mode kernel
      header. This is used as a more extensible boot parameters passing
      mechanism.
      Signed-off-by: NHuang Ying <ying.huang@intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      8b664aa6
    • H
      x86, boot: add free_early to early reservation machanism · 50eae2a7
      Huang, Ying 提交于
      Add free_early to early reservation mechanism - this way early bootup
      failure paths can stop wasting memory.
      Signed-off-by: NHuang Ying <ying.huang@intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      50eae2a7
    • V
      x86, PAT: disable /dev/mem mmap RAM with PAT · 0124cecf
      Venki Pallipadi 提交于
      disable /dev/mem mmap of RAM with PAT. It makes things safer and
      eliminates aliasing. A future improvement would be to avoid the
      range_is_allowed duplication.
      Signed-off-by: NVenkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      0124cecf
    • A
      x86, bitops: select the generic bitmap search functions · 19870def
      Alexander van Heukelum 提交于
      Introduce GENERIC_FIND_FIRST_BIT and GENERIC_FIND_NEXT_BIT in
      lib/Kconfig, defaulting to off. An arch that wants to use the
      generic implementation now only has to use a select statement
      to include them.
      
      I added an always-y option (X86_CPU) to arch/x86/Kconfig.cpu
      and used that to select the generic search functions. This
      way ARCH=um SUBARCH=i386 automatically picks up the change
      too, and arch/um/Kconfig.i386 can therefore be simplified a
      bit. ARCH=um SUBARCH=x86_64 does things differently, but
      still compiles fine. It seems that a "def_bool y" always
      wins over a "def_bool n"?
      Signed-off-by: NAlexander van Heukelum <heukelum@fastmail.fm>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      19870def
    • A
      x86, UML: remove x86-specific implementations of find_first_bit · 5245698f
      Alexander van Heukelum 提交于
      x86 has been switched to the generic versions of find_first_bit
      and find_first_zero_bit, but the original versions were retained.
      This patch just removes the now unused x86-specific versions.
      
      also update UML.
      Signed-off-by: NAlexander van Heukelum <heukelum@fastmail.fm>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      5245698f
    • A
      x86: switch 64-bit to generic find_first_bit · 2aba6925
      Alexander van Heukelum 提交于
      Switch x86_64 to generic find_first_bit. The x86_64-specific
      implementation is not removed.
      Signed-off-by: NAlexander van Heukelum <heukelum@fastmail.fm>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2aba6925
    • A
      x86: generic versions of find_first_(zero_)bit, convert i386 · 77b9bd9c
      Alexander van Heukelum 提交于
      Generic versions of __find_first_bit and __find_first_zero_bit
      are introduced as simplified versions of __find_next_bit and
      __find_next_zero_bit. Their compilation and use are guarded by
      a new config variable GENERIC_FIND_FIRST_BIT.
      
      The generic versions of find_first_bit and find_first_zero_bit
      are implemented in terms of the newly introduced __find_first_bit
      and __find_first_zero_bit.
      
      This patch does not remove the i386-specific implementation,
      but it does switch i386 to use the generic functions by setting
      GENERIC_FIND_FIRST_BIT=y for X86_32.
      Signed-off-by: NAlexander van Heukelum <heukelum@fastmail.fm>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      77b9bd9c
    • A
      x86: merge the simple bitops and move them to bitops.h · 12d9c842
      Alexander van Heukelum 提交于
      Some of those can be written in such a way that the same
      inline assembly can be used to generate both 32 bit and
      64 bit code.
      
      For ffs and fls, x86_64 unconditionally used the cmov
      instruction and i386 unconditionally used a conditional
      branch over a mov instruction. In the current patch I
      chose to select the version based on the availability
      of the cmov instruction instead. A small detail here is
      that x86_64 did not previously set CONFIG_X86_CMOV=y.
      
      Improved comments for ffs, ffz, fls and variations.
      Signed-off-by: NAlexander van Heukelum <heukelum@fastmail.fm>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      12d9c842
    • A
      x86: change x86 to use generic find_next_bit · 6fd92b63
      Alexander van Heukelum 提交于
      The versions with inline assembly are in fact slower on the machines I
      tested them on (in userspace) (Athlon XP 2800+, p4-like Xeon 2.8GHz, AMD
      Opteron 270). The i386-version needed a fix similar to 06024f21 to avoid
      crashing the benchmark.
      
      Benchmark using: gcc -fomit-frame-pointer -Os. For each bitmap size
      1...512, for each possible bitmap with one bit set, for each possible
      offset: find the position of the first bit starting at offset. If you
      follow ;). Times include setup of the bitmap and checking of the
      results.
      
      		Athlon		Xeon		Opteron 32/64bit
      x86-specific:	0m3.692s	0m2.820s	0m3.196s / 0m2.480s
      generic:	0m2.622s	0m1.662s	0m2.100s / 0m1.572s
      
      If the bitmap size is not a multiple of BITS_PER_LONG, and no set
      (cleared) bit is found, find_next_bit (find_next_zero_bit) returns a
      value outside of the range [0, size]. The generic version always returns
      exactly size. The generic version also uses unsigned long everywhere,
      while the x86 versions use a mishmash of int, unsigned (int), long and
      unsigned long.
      
      Using the generic version does give a slightly bigger kernel, though.
      
      defconfig:	   text    data     bss     dec     hex filename
      x86-specific:	4738555  481232  626688 5846475  5935cb vmlinux (32 bit)
      generic:	4738621  481232  626688 5846541  59360d vmlinux (32 bit)
      x86-specific:	5392395  846568  724424 6963387  6a40bb vmlinux (64 bit)
      generic:	5392458  846568  724424 6963450  6a40fa vmlinux (64 bit)
      Signed-off-by: NAlexander van Heukelum <heukelum@fastmail.fm>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      6fd92b63
    • I
      x86 PAT: decouple from nonpromisc devmem · 2a8a2719
      Ingo Molnar 提交于
      Linus pointed it out that PAT should not depend on NONPROMISC_DEVMEM.
      
      Also make PAT non-default.
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2a8a2719
  2. 26 4月, 2008 25 次提交