  1. 22 Jun 2009, 2 commits
    • x86: reorganize cpa_process_alias() · 992f4c1c
      Committed by Tejun Heo
      Reorganize cpa_process_alias() so that new alias conditions can be
      added easily.
      
      Jan Beulich spotted a problem in the original cleanup thread, which
      incorrectly assumed the two existing conditions were mutually
      exclusive.
      
      [ Impact: code reorganization ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Jan Beulich <JBeulich@novell.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Ingo Molnar <mingo@elte.hu>
    • Move FAULT_FLAG_xyz into handle_mm_fault() callers · d06063cc
      Committed by Linus Torvalds
      This allows the callers to now pass down the full set of FAULT_FLAG_xyz
      flags to handle_mm_fault().  All callers have been (mechanically)
      converted to the new calling convention; there's almost certainly room
      for architectures to clean up their code and then add FAULT_FLAG_RETRY
      when that support is added.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 21 Jun 2009, 1 commit
    • x86: don't use 'access_ok()' as a range check in get_user_pages_fast() · 7f818906
      Committed by Linus Torvalds
      It's really not right to use 'access_ok()', since that is meant for the
      normal "get_user()" and "copy_from/to_user()" accesses, which are done
      through the TLB, rather than through the page tables.
      
      Why? access_ok() does both too few, and too many checks.  Too many,
      because it is meant for regular kernel accesses that will not honor the
      'user' bit in the page tables, and because it honors the USER_DS vs
      KERNEL_DS distinction that we shouldn't care about in GUP.  And too few,
      because it doesn't do the 'canonical' check on the address on x86-64,
      since the TLB will do that for us.
      
      So instead of using a function that isn't meant for this, and does
      something else and much more complicated, just do the real rules: we
      don't want the range to overflow, and on x86-64, we want it to be a
      canonical low address (on 32-bit, all addresses are canonical).
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
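
The two rules named in this commit message (no overflow, and a canonical low address on x86-64) can be sketched in plain C roughly as follows. This is a userspace illustration, not the kernel code: the helper name `gup_range_ok` and the constant `USER_VIRTUAL_MASK_SHIFT` are hypothetical, and the 47-bit user address width is an assumption that matches 4-level paging.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this sketch: 47 usable bits in the low (user) half of the
 * x86-64 address space, as with 4-level paging. */
#define USER_VIRTUAL_MASK_SHIFT 47

/* Hypothetical helper: returns true if [start, start + len) passes the two
 * checks described in the commit message. */
static bool gup_range_ok(uint64_t start, uint64_t len)
{
    uint64_t end = start + len;

    /* Rule 1: the range must not overflow (wrap around zero). */
    if (end < start)
        return false;

    /* Rule 2: the end of the range must be a canonical low address,
     * meaning no bits are set at or above bit 47. */
    if (end >> USER_VIRTUAL_MASK_SHIFT)
        return false;

    return true;
}
```

On 32-bit the second check would simply be dropped, since the message notes that all addresses are canonical there.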
  3. 19 Jun 2009, 1 commit
    • perf_counter, x86: Improve interactions with fast-gup · 0c871971
      Committed by Ingo Molnar
      Improve a few details in perfcounter call-chain recording that
      makes use of fast-GUP:
      
      - Use ACCESS_ONCE() to observe the pte value. ptes are fundamentally
        racy and can be changed on another CPU, so we have to be careful
        about how we access them. The PAE branch is already careful with
        read-barriers - but the non-PAE and 64-bit side needs an
        ACCESS_ONCE() to make sure the pte value is observed only once.
      
      - make the checks a bit stricter so that we can feed it any kind of
        cra^H^H^H user-space input ;-)
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
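
The point about observing a pte value only once can be illustrated with a small userspace sketch. The `ACCESS_ONCE()` macro below is a stand-in written with the GNU C `__typeof__` extension, and the simplified `pte_t`, the flag constants, and the helper `pte_snapshot_usable` are hypothetical names introduced for this example only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for ACCESS_ONCE(): a volatile access, so the compiler
 * emits exactly one load and cannot re-read the location later.
 * Relies on the GNU C __typeof__ extension (GCC, Clang). */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Simplified, hypothetical page table entry for this sketch. */
typedef struct { uint64_t val; } pte_t;

#define PTE_PRESENT 0x1ULL   /* x86 "present" bit */
#define PTE_USER    0x4ULL   /* x86 "user" bit */

/* Take one snapshot of the entry, then test and use only that snapshot.
 * Reading ptep->val twice could see two different values if another CPU
 * changes the entry in between. */
static bool pte_snapshot_usable(pte_t *ptep, uint64_t *out)
{
    uint64_t v = ACCESS_ONCE(ptep->val);
    uint64_t need = PTE_PRESENT | PTE_USER;

    if ((v & need) != need)
        return false;

    *out = v;
    return true;
}
```

The design choice is that every later decision is made on the local copy `v`, so the check and the use cannot disagree.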
  4. 16 Jun 2009, 1 commit
    • x86: mm: Read cr2 before prefetching the mmap_lock · 5dfaf90f
      Committed by Ingo Molnar
      Prefetch instructions can generate spurious faults on certain
      models of older CPUs. The faults themselves cannot be stopped
      and they can occur pretty much anywhere - so the way we solve
      them is that we detect certain patterns and ignore the fault.
      
      There is one small path of code where we must not take faults
      though: the #PF handler execution leading up to the reading
      of the CR2 (the faulting address). If we take a fault there
      then we destroy the CR2 value (with that of the prefetching
      instruction's) and possibly mishandle user-space or
      kernel-space pagefaults.
      
      It turns out that in current upstream we do exactly that:
      
      	prefetchw(&mm->mmap_sem);
      
      	/* Get the faulting address: */
      	address = read_cr2();
      
      This is not good.
      
      So turn around the order: first read the cr2 then prefetch
      the lock address. Reading cr2 is plenty fast (2 cycles) so
      delaying the prefetch by this amount shouldn't be a big issue
      performance-wise.
      
      [ And this might explain a mystery fault.c warning that sometimes
        occurs on an old AMD/Sempron-based test system I have -
        which does have such prefetch problems. ]
      
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Vegard Nossum <vegard.nossum@gmail.com>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
      LKML-Reference: <20090616030522.GA22162@Krystal>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
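
Why the ordering matters can be shown with a toy userspace model. Nothing here touches real hardware: `sim_cr2`, `sim_prefetchw`, and the two `fault_addr_*` functions are hypothetical names, and the "spurious fault" is simulated by overwriting a variable.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simulated CR2 register: holds the address of the most recent fault. */
static uintptr_t sim_cr2;

/* Simulated prefetch. If it takes a spurious fault, the nested fault
 * overwrites CR2 with the prefetched address. */
static void sim_prefetchw(const void *addr, bool spurious_fault)
{
    if (spurious_fault)
        sim_cr2 = (uintptr_t)addr;
}

/* Old order: prefetch first, then read CR2. A spurious fault loses the
 * original faulting address. */
static uintptr_t fault_addr_old_order(const void *lock, bool spurious)
{
    sim_prefetchw(lock, spurious);
    return sim_cr2;
}

/* New order: read CR2 first, then prefetch. The saved value is safe no
 * matter what the prefetch does afterwards. */
static uintptr_t fault_addr_new_order(const void *lock, bool spurious)
{
    uintptr_t address = sim_cr2;

    sim_prefetchw(lock, spurious);
    return address;
}
```

With the new order, a spurious fault can still clobber the register, but only after the handler has already taken its copy.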
  5. 15 Jun 2009, 10 commits
  6. 13 Jun 2009, 2 commits
    • kmemcheck: include module.h to prevent warnings · 60e38393
      Committed by Randy Dunlap
      kmemcheck/shadow.c needs to include <linux/module.h> to prevent
      the following warnings:
      
      linux-next-20080724/arch/x86/mm/kmemcheck/shadow.c:64: warning : data definition has no type or storage class
      linux-next-20080724/arch/x86/mm/kmemcheck/shadow.c:64: warning : type defaults to 'int' in declaration of 'EXPORT_SYMBOL_GPL'
      linux-next-20080724/arch/x86/mm/kmemcheck/shadow.c:64: warning : parameter names (without types) in function declaration
      Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
      Cc: vegardno@ifi.uio.no
      Cc: penberg@cs.helsinki.fi
      Cc: akpm <akpm@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • kmemcheck: add the kmemcheck core · dfec072e
      Committed by Vegard Nossum
      General description: kmemcheck is a patch to the Linux kernel that
      detects use of uninitialized memory. It does this by trapping every
      read and write to memory that was allocated dynamically (e.g. using
      kmalloc()). If a memory address is read that has not previously been
      written to, a message is printed to the kernel log.
      
      Thanks to Andi Kleen for the set_memory_4k() solution.
      
      Andrew Morton suggested documenting the shadow member of struct page.
      Signed-off-by: Vegard Nossum <vegardno@ifi.uio.no>
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
      
      [export kmemcheck_mark_initialized]
      [build fix for setup_max_cpus]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      
      [rebased for mainline inclusion]
      Signed-off-by: Vegard Nossum <vegardno@ifi.uio.no>
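
The idea of tracking which bytes have been written, and reporting reads of bytes that have not, can be modelled in a few lines of userspace C. This is a toy: the accesses below go through explicit wrapper functions rather than being trapped, and every name in it (`pool`, `shadow`, `tracked_write`, `tracked_read`, `uninit_reads`) is hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define POOL_SIZE 64

static unsigned char pool[POOL_SIZE];    /* the tracked memory */
static unsigned char shadow[POOL_SIZE];  /* 1 = written before, 0 = never written */
static unsigned int uninit_reads;        /* reads of never-written bytes seen so far */

/* Mark the whole pool as freshly allocated, i.e. uninitialized. */
static void tracked_alloc_reset(void)
{
    memset(shadow, 0, sizeof(shadow));
    uninit_reads = 0;
}

/* A write stores the value and marks the byte as initialized.
 * The caller must keep off below POOL_SIZE. */
static void tracked_write(size_t off, unsigned char value)
{
    pool[off] = value;
    shadow[off] = 1;
}

/* A read of a byte that was never written is counted; a real checker
 * would log a message instead. */
static unsigned char tracked_read(size_t off)
{
    if (!shadow[off])
        uninit_reads++;
    return pool[off];
}
```

One shadow byte per data byte is the simplest possible layout and is chosen here only for readability.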
  7. 12 Jun 2009, 2 commits
    • x86: change kernel_physical_mapping_init() __init to __meminit · 41d840e2
      Committed by Shaohua Li
      kernel_physical_mapping_init() can be called in the memory hotplug path.
      
      [ Impact: fix potential crash with memory hotplug ]
      Signed-off-by: Shaohua Li <shaohua.li@intel.com>
      LKML-Reference: <20090612045752.GA827@sli10-desk.sh.intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: make zap_low_mapping could be used early · 55cd6367
      Committed by Yinghai Lu
      Only one CPU is up at this point, so just call __flush_tlb() for it. Fixes the
      following boot warning on x86:
      
        [    0.000000] Memory: 885032k/915540k available (5993k kernel code, 29844k reserved, 3842k data, 428k init, 0k highmem)
        [    0.000000] virtual kernel memory layout:
        [    0.000000]     fixmap  : 0xffe17000 - 0xfffff000   (1952 kB)
        [    0.000000]     vmalloc : 0xf8615000 - 0xffe15000   ( 120 MB)
        [    0.000000]     lowmem  : 0xc0000000 - 0xf7e15000   ( 894 MB)
        [    0.000000]       .init : 0xc19a5000 - 0xc1a10000   ( 428 kB)
        [    0.000000]       .data : 0xc15da4bb - 0xc199af6c   (3842 kB)
        [    0.000000]       .text : 0xc1000000 - 0xc15da4bb   (5993 kB)
        [    0.000000] Checking if this processor honours the WP bit even in supervisor mode...Ok.
        [    0.000000] ------------[ cut here ]------------
        [    0.000000] WARNING: at kernel/smp.c:369 smp_call_function_many+0x50/0x1b0()
        [    0.000000] Hardware name: System Product Name
        [    0.000000] Modules linked in:
        [    0.000000] Pid: 0, comm: swapper Not tainted 2.6.30-tip #52504
        [    0.000000] Call Trace:
        [    0.000000]  [<c104aa16>] warn_slowpath_common+0x65/0x95
        [    0.000000]  [<c104aa58>] warn_slowpath_null+0x12/0x15
        [    0.000000]  [<c1073bbe>] smp_call_function_many+0x50/0x1b0
        [    0.000000]  [<c1037615>] ? do_flush_tlb_all+0x0/0x41
        [    0.000000]  [<c1037615>] ? do_flush_tlb_all+0x0/0x41
        [    0.000000]  [<c1073d4f>] smp_call_function+0x31/0x58
        [    0.000000]  [<c1037615>] ? do_flush_tlb_all+0x0/0x41
        [    0.000000]  [<c104f635>] on_each_cpu+0x26/0x65
        [    0.000000]  [<c10374b5>] flush_tlb_all+0x19/0x1b
        [    0.000000]  [<c1032ab3>] zap_low_mappings+0x4d/0x56
        [    0.000000]  [<c15d64b5>] ? printk+0x14/0x17
        [    0.000000]  [<c19b42a8>] mem_init+0x23d/0x245
        [    0.000000]  [<c19a56a1>] start_kernel+0x17a/0x2d5
        [    0.000000]  [<c19a5347>] ? unknown_bootoption+0x0/0x19a
        [    0.000000]  [<c19a5039>] __init_begin+0x39/0x41
        [    0.000000] ---[ end trace 4eaa2a86a8e2da22 ]---
      Reported-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
  8. 11 Jun 2009, 2 commits
  9. 09 Jun 2009, 1 commit
  10. 29 May 2009, 1 commit
    • x86: ignore VM_LOCKED when determining if hugetlb-backed page tables can be shared or not · 32b154c0
      Committed by Mel Gorman
      Addresses http://bugzilla.kernel.org/show_bug.cgi?id=13302
      
      On x86 and x86-64, it is possible that page tables are shared between
      shared mappings backed by hugetlbfs.  As part of this,
      page_table_shareable() checks a pair of vma->vm_flags and they must match
      if they are to be shared.  All VMA flags are taken into account, including
      VM_LOCKED.
      
      The problem is that VM_LOCKED is cleared on fork().  When a process with a
      shared memory segment forks() to exec() a helper, there will be shared
      VMAs with different flags.  The impact is that the shared segment is
      sometimes considered shareable and other times not, depending on what
      process is checking.
      
      What happens is that the segment page tables are being shared but the
      count is inaccurate depending on the ordering of events.  As the page
      tables are freed with put_page(), bad pmd's are found when some of the
      children exit.  The hugepage counters also get corrupted and the Total and
      Free count will no longer match even when all the hugepage-backed regions
      are freed.  This requires a reboot of the machine to "fix".
      
      This patch addresses the problem by comparing all flags except VM_LOCKED
      when deciding if pagetables should be shared or not for hugetlbfs-backed
      mappings.
      Signed-off-by: Mel Gorman <mel@csn.ul.ie>
      Acked-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: <stable@kernel.org>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: <starlight@binnacle.cx>
      Cc: Eric B Munson <ebmunson@us.ibm.com>
      Cc: Adam Litke <agl@us.ibm.com>
      Cc: Andy Whitcroft <apw@canonical.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
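
Comparing two sets of VMA flags while ignoring VM_LOCKED comes down to masking that bit out of both sides before the comparison. A minimal sketch, assuming illustrative flag values and a hypothetical helper name `flags_match_for_sharing` (neither is taken from the patch itself):

```c
#include <stdbool.h>

/* Flag values are illustrative for this sketch and should not be read as
 * the kernel's authoritative definitions. */
#define VM_READ   0x0001UL
#define VM_WRITE  0x0002UL
#define VM_SHARED 0x0008UL
#define VM_LOCKED 0x2000UL

/* Hypothetical helper: two flag sets are considered equal for page table
 * sharing if they agree on every bit except VM_LOCKED. */
static bool flags_match_for_sharing(unsigned long a, unsigned long b)
{
    return (a & ~VM_LOCKED) == (b & ~VM_LOCKED);
}
```

With this comparison, a parent that holds the segment locked and a forked child whose VM_LOCKED bit was cleared reach the same answer, which removes the dependence on which process performs the check.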
  11. 27 May 2009, 1 commit
  12. 23 May 2009, 2 commits
  13. 18 May 2009, 2 commits
    • x86, mm: Fix node_possible_map logic · 7c43769a
      Committed by Yinghai Lu
      Recently there were some changes to the meaning of node_possible_map,
      and the result is quite strange:
      
      - a node without memory is set in node_possible_map
      - but a node with less than NODE_MIN_SIZE of memory is kicked out of node_possible_map.
      
      Fix it by adding strict_setup_node_bootmem().
      
      Also, remove unparse_node().
      
      The result will be:
      
      1. cpu_to_node() returns only an online node (the nearest one)
      2. apicid_to_node() still returns a node that may not be online but is set
         in node_possible_map.
      3. node_possible_map includes nodes whose memory is less than NODE_MIN_SIZE
      
      v2: after move_cpus_to_node change.
      
      [ Impact: get node_possible_map right ]
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Tested-by: Jack Steiner <steiner@sgi.com>
      LKML-Reference: <4A0C49BE.6080800@kernel.org>
      [ v3: various small cleanups and comment clarifications ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • mm, x86: remove MEMORY_HOTPLUG_RESERVE related code · 888a589f
      Committed by Yinghai Lu
      after:
      
       | commit b263295d
       | Author: Christoph Lameter <clameter@sgi.com>
       | Date:   Wed Jan 30 13:30:47 2008 +0100
       |
       |    x86: 64-bit, make sparsemem vmemmap the only memory model
      
      we don't have MEMORY_HOTPLUG_RESERVE anymore.
      
      Historically, x86-64 had an architecture-specific method for memory hotplug
      whereby it scanned the SRAT for physical memory ranges that could be
      potentially used for memory hot-add later. By reserving those ranges
      without physical memory, the memmap would be allocated and left dormant
      until needed. This depended on the DISCONTIG memory model which has been
      removed so the code implementing HOTPLUG_RESERVE is now dead.
      
      This patch removes the dead code used by MEMORY_HOTPLUG_RESERVE.
      
      (Changelog authored by Mel.)
      
      v2: updated changelog, and remove hotadd= in doc
      
      [ Impact: remove dead code ]
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
      Reviewed-by: Mel Gorman <mel@csn.ul.ie>
      Workflow-found-OK-by: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <4A0C4910.7090508@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  14. 12 May 2009, 1 commit
  15. 11 May 2009, 5 commits
  16. 08 May 2009, 1 commit
  17. 06 May 2009, 2 commits
  18. 03 May 2009, 1 commit
  19. 30 Apr 2009, 2 commits