1. 07 March 2010, 20 commits
  2. 04 March 2010, 2 commits
    • SLUB: Fix per-cpu merge conflict · 1154fab7
      Committed by Stephen Rothwell
      The slab tree adds a percpu variable usage case (commit
      9dfc6e68 "SLUB: Use this_cpu operations in
      slub"), but the percpu tree removes the prefixing of percpu variables (commit
      dd17c8f7 "percpu: remove per_cpu__ prefix"),
      thus causing the following compilation error:
      
          CC      mm/slub.o
        mm/slub.c: In function ‘alloc_kmem_cache_cpus’:
        mm/slub.c:2078: error: implicit declaration of function ‘per_cpu_var’
        mm/slub.c:2078: warning: assignment makes pointer from integer without a cast
        make[1]: *** [mm/slub.o] Error 1
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
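      The resolution is essentially to stop using the removed accessor in
      alloc_kmem_cache_cpus(). A minimal sketch of the change (reconstructed from
      memory; KMALLOC_CACHES and the exact expression are illustrative):
      
      	/* Sketch: after "percpu: remove per_cpu__ prefix", variables declared
      	 * with DEFINE_PER_CPU are referenced by their plain name rather than
      	 * through the per_cpu_var() wrapper the slab tree still used. */
      	static DEFINE_PER_CPU(struct kmem_cache_cpu, kmalloc_percpu[KMALLOC_CACHES]);
      
      	/* before: s->cpu_slab = per_cpu_var(kmalloc_percpu) + (s - kmalloc_caches); */
      	s->cpu_slab = kmalloc_percpu + (s - kmalloc_caches);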
    • kill unused invalidate_inode_pages helper · 2ecdc82e
      Committed by Christoph Hellwig
      No one is calling this anymore, as everyone switched to
      invalidate_mapping_pages a long time ago.  Also update a few
      references to it in comments.  nfs has two more, but I can't
      easily figure out what they are actually referring to, so I left
      them as-is.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
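      For context, the removed helper was only a trivial wrapper around the
      replacement; roughly (reconstructed from memory, so treat as a sketch):
      
      	/* Sketch of the dropped helper: it invalidated the whole range, which
      	 * callers now do by calling invalidate_mapping_pages() directly. */
      	static inline void invalidate_inode_pages(struct address_space *mapping)
      	{
      		invalidate_mapping_pages(mapping, 0, ~0UL);
      	}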
  3. 02 March 2010, 1 commit
  4. 27 February 2010, 1 commit
  5. 26 February 2010, 1 commit
  6. 23 February 2010, 1 commit
  7. 22 February 2010, 2 commits
    • x86: Fix non-bootmem compilation on PowerPC · 2ee78f7b
      Committed by Yinghai Lu
      These build errors show up on some non-x86 platforms (PowerPC, for example):
      
       mm/page_alloc.c: In function '__alloc_memory_core_early':
         mm/page_alloc.c:3468: error: implicit declaration of function 'find_early_area'
         mm/page_alloc.c:3483: error: implicit declaration of function 'reserve_early_without_check'
      
      The function is only needed when CONFIG_NO_BOOTMEM is enabled.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Johannes Weiner <hannes@saeurebad.de>
      Cc: Mel Gorman <mel@csn.ul.ie>
      LKML-Reference: <4B747239.4070907@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
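      The shape of the fix is a configuration guard; a hedged sketch (the exact
      placement in mm/page_alloc.c and the argument list are assumptions here):
      
      	/* Sketch: compile the early_res-based allocator only when the
      	 * no-bootmem path is selected, so architectures that lack
      	 * find_early_area()/reserve_early_without_check() still build. */
      	#ifdef CONFIG_NO_BOOTMEM
      	void * __init __alloc_memory_core_early(int nid, u64 size, u64 align,
      						u64 goal, u64 limit)
      	{
      		/* ... allocation via find_early_area() + reserve_early_without_check() ... */
      		return NULL;	/* body elided in this sketch */
      	}
      	#endif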
    • mm: Make copy_from_user() in migrate.c statically predictable · 87b8d1ad
      Committed by H. Peter Anvin
      x86-32 has had a static test for copy_from_user() overflow for a while.
      This test currently fails in mm/migrate.c resulting in an
      allyesconfig/allmodconfig build failure on x86-32:
      
      In function ‘copy_from_user’,
          inlined from ‘do_pages_stat’ at
          /home/hpa/kernel/git/mm/migrate.c:1012:
      /home/hpa/kernel/git/arch/x86/include/asm/uaccess_32.h:212: error:
          call to ‘copy_from_user_overflow’ declared
      
      Make the logic more explicit and therefore easier for gcc to
      understand.
      
      v2: rewrite the loop entirely using a more normal structure for a
          chunked-data loop (Linus Torvalds)
      Reported-by: Len Brown <lenb@kernel.org>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      Reviewed-and-Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Arjan van de Ven <arjan@linux.kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
      Cc: Rik van Riel <riel@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
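      The v2 structure is a conventional chunked-copy loop whose per-iteration copy
      size is bounded by a compile-time constant, so the static overflow check can
      prove it safe. A sketch of do_pages_stat() under that assumption (names such
      as DO_PAGES_STAT_CHUNK_NR and do_pages_stat_array follow the fix's style but
      are illustrative here):
      
      	#define DO_PAGES_STAT_CHUNK_NR	16UL
      
      	const void __user *chunk_pages[DO_PAGES_STAT_CHUNK_NR];
      	int chunk_status[DO_PAGES_STAT_CHUNK_NR];
      
      	while (nr_pages) {
      		unsigned long chunk_nr = min(nr_pages, DO_PAGES_STAT_CHUNK_NR);
      
      		/* the copy size is now provably <= sizeof(chunk_pages) */
      		if (copy_from_user(chunk_pages, pages, chunk_nr * sizeof(*chunk_pages)))
      			break;
      
      		do_pages_stat_array(mm, chunk_nr, chunk_pages, chunk_status);
      
      		if (copy_to_user(status, chunk_status, chunk_nr * sizeof(*status)))
      			break;
      
      		pages += chunk_nr;
      		status += chunk_nr;
      		nr_pages -= chunk_nr;
      	}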
  8. 21 February 2010, 1 commit
    • MM: Pass a PTE pointer to update_mmu_cache() rather than the PTE itself · 4b3073e1
      Committed by Russell King
      On VIVT ARM, when we have multiple shared mappings of the same file
      in the same MM, we need to ensure that we have coherency across all
      copies.  We do this via make_coherent() by making the pages
      uncacheable.
      
      This used to work fine, until we allowed highmem with highpte - we
      now have a page table which is mapped as required, and is not available
      for modification via update_mmu_cache().
      
      Ralf Baechle suggested getting rid of the PTE value passed to
      update_mmu_cache():
      
        On MIPS update_mmu_cache() calls __update_tlb() which walks pagetables
        to construct a pointer to the pte again.  Passing a pte_t * is much
        more elegant.  Maybe we might even replace the pte argument with the
        pte_t?
      
      Ben Herrenschmidt would also like the pte pointer for PowerPC:
      
        Passing the ptep in there is exactly what I want.  I want that
        -instead- of the PTE value, because I have issue on some ppc cases,
        for I$/D$ coherency, where set_pte_at() may decide to mask out the
        _PAGE_EXEC.
      
      So, pass in the mapped page table pointer into update_mmu_cache(), and
      remove the PTE value, updating all implementations and call sites to
      suit.
      
      Includes a fix from Stephen Rothwell:
      
        sparc: fix fallout from update_mmu_cache API change
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
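      At the interface level the change amounts, in effect, to the following
      (each architecture's implementation is updated to match):
      
      	/* Before: implementations receive the PTE value; some (e.g. MIPS) then
      	 * re-walk the page tables just to find the entry again, and a highpte
      	 * page table may not even be mapped at this point. */
      	void update_mmu_cache(struct vm_area_struct *vma, unsigned long addr, pte_t pte);
      
      	/* After: the already-mapped PTE pointer is passed through directly. */
      	void update_mmu_cache(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep);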
  9. 17 February 2010, 1 commit
  10. 13 February 2010, 3 commits
    • sparsemem: Put mem map for one node together. · 9bdac914
      Committed by Yinghai Lu
      Add vmemmap_alloc_block_buf for the mem map only.
      
      It will fall back to the old way if it cannot get a block that big.
      
      Before this patch, when a node has 128GB of RAM installed, the memmap
      is split into two or more parts:
      [    0.000000]  [ffffea0000000000-ffffea003fffffff] PMD -> [ffff880100600000-ffff88013e9fffff] on node 1
      [    0.000000]  [ffffea0040000000-ffffea006fffffff] PMD -> [ffff88013ec00000-ffff88016ebfffff] on node 1
      [    0.000000]  [ffffea0070000000-ffffea007fffffff] PMD -> [ffff882000600000-ffff8820105fffff] on node 0
      [    0.000000]  [ffffea0080000000-ffffea00bfffffff] PMD -> [ffff882010800000-ffff8820507fffff] on node 0
      [    0.000000]  [ffffea00c0000000-ffffea00dfffffff] PMD -> [ffff882050a00000-ffff8820709fffff] on node 0
      [    0.000000]  [ffffea00e0000000-ffffea00ffffffff] PMD -> [ffff884000600000-ffff8840205fffff] on node 2
      [    0.000000]  [ffffea0100000000-ffffea013fffffff] PMD -> [ffff884020800000-ffff8840607fffff] on node 2
      [    0.000000]  [ffffea0140000000-ffffea014fffffff] PMD -> [ffff884060a00000-ffff8840709fffff] on node 2
      [    0.000000]  [ffffea0150000000-ffffea017fffffff] PMD -> [ffff886000600000-ffff8860305fffff] on node 3
      [    0.000000]  [ffffea0180000000-ffffea01bfffffff] PMD -> [ffff886030800000-ffff8860707fffff] on node 3
      [    0.000000]  [ffffea01c0000000-ffffea01ffffffff] PMD -> [ffff888000600000-ffff8880405fffff] on node 4
      [    0.000000]  [ffffea0200000000-ffffea022fffffff] PMD -> [ffff888040800000-ffff8880707fffff] on node 4
      [    0.000000]  [ffffea0230000000-ffffea023fffffff] PMD -> [ffff88a000600000-ffff88a0105fffff] on node 5
      [    0.000000]  [ffffea0240000000-ffffea027fffffff] PMD -> [ffff88a010800000-ffff88a0507fffff] on node 5
      [    0.000000]  [ffffea0280000000-ffffea029fffffff] PMD -> [ffff88a050a00000-ffff88a0709fffff] on node 5
      [    0.000000]  [ffffea02a0000000-ffffea02bfffffff] PMD -> [ffff88c000600000-ffff88c0205fffff] on node 6
      [    0.000000]  [ffffea02c0000000-ffffea02ffffffff] PMD -> [ffff88c020800000-ffff88c0607fffff] on node 6
      [    0.000000]  [ffffea0300000000-ffffea030fffffff] PMD -> [ffff88c060a00000-ffff88c0709fffff] on node 6
      [    0.000000]  [ffffea0310000000-ffffea033fffffff] PMD -> [ffff88e000600000-ffff88e0305fffff] on node 7
      [    0.000000]  [ffffea0340000000-ffffea037fffffff] PMD -> [ffff88e030800000-ffff88e0707fffff] on node 7
      
      After the patch we get:
      [    0.000000]  [ffffea0000000000-ffffea006fffffff] PMD -> [ffff880100200000-ffff88016e5fffff] on node 0
      [    0.000000]  [ffffea0070000000-ffffea00dfffffff] PMD -> [ffff882000200000-ffff8820701fffff] on node 1
      [    0.000000]  [ffffea00e0000000-ffffea014fffffff] PMD -> [ffff884000200000-ffff8840701fffff] on node 2
      [    0.000000]  [ffffea0150000000-ffffea01bfffffff] PMD -> [ffff886000200000-ffff8860701fffff] on node 3
      [    0.000000]  [ffffea01c0000000-ffffea022fffffff] PMD -> [ffff888000200000-ffff8880701fffff] on node 4
      [    0.000000]  [ffffea0230000000-ffffea029fffffff] PMD -> [ffff88a000200000-ffff88a0701fffff] on node 5
      [    0.000000]  [ffffea02a0000000-ffffea030fffffff] PMD -> [ffff88c000200000-ffff88c0701fffff] on node 6
      [    0.000000]  [ffffea0310000000-ffffea037fffffff] PMD -> [ffff88e000200000-ffff88e0701fffff] on node 7
      
      -v2: change buf to vmemmap_buf instead according to Ingo
           also add CONFIG_SPARSEMEM_ALLOC_MEM_MAP_TOGETHER according to Ingo
      -v3: according to Andrew, use sizeof(name) instead of hard coded 15
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      LKML-Reference: <1265793639-15071-19-git-send-email-yinghai@kernel.org>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Acked-by: Christoph Lameter <cl@linux-foundation.org>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
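      The core idea, sketched (reconstructed from memory; the real helper in
      mm/sparse-vmemmap.c differs in detail, and vmemmap_buf_end and
      vmemmap_alloc_block are assumptions here): grab one large block for the
      node's whole mem map up front and carve per-section pieces out of it,
      falling back to the old per-call allocation when no buffer is available.
      
      	static void *vmemmap_buf;
      	static void *vmemmap_buf_end;
      
      	void * __init vmemmap_alloc_block_buf(unsigned long size, int node)
      	{
      		if (vmemmap_buf) {
      			/* carve an aligned chunk out of the node-wide buffer */
      			void *ptr = (void *)ALIGN((unsigned long)vmemmap_buf, size);
      
      			if (ptr + size <= vmemmap_buf_end) {
      				vmemmap_buf = ptr + size;
      				return ptr;
      			}
      		}
      		/* no buffer (or buffer exhausted): fall back to the old way */
      		return vmemmap_alloc_block(size, node);
      	}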
    • sparsemem: Put usemap for one node together · a4322e1b
      Committed by Yinghai Lu
      This could save some buffer space, compared with allocating the usemaps
      one by one.
      
      It could also help systems that are going to use early_res instead of
      bootmem: fewer entries in early_res make searching faster on systems
      with more memory.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      LKML-Reference: <1265793639-15071-18-git-send-email-yinghai@kernel.org>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
    • x86: Make 64 bit use early_res instead of bootmem before slab · 08677214
      Committed by Yinghai Lu
      Finally we can use early_res to replace bootmem for x86_64 now.
      
      CONFIG_NO_BOOTMEM can still be used to switch this on or off.
      
      -v2: fix 32-bit compile breakage around MAX_DMA32_PFN
      -v3: folded in a bug fix from the LKML message below
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      LKML-Reference: <4B747239.4070907@kernel.org>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  11. 07 February 2010, 1 commit
  12. 03 February 2010, 4 commits
    • hugetlb: fix section mismatches · 094e9539
      Committed by Jeff Mahoney
      hugetlb_sysfs_add_hstate is called by hugetlb_register_node directly
      during init and also indirectly via sysfs after init.
      
      This patch removes the __init tag from hugetlb_sysfs_add_hstate.
      Signed-off-by: Jeff Mahoney <jeffm@suse.com>
      Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
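      The mismatch and the fix, sketched (the real function takes more arguments
      than shown; the abbreviated prototype is an assumption here):
      
      	/* Broken: __init places the function in .init.text, which is freed after
      	 * boot, yet the sysfs node-registration path can still call it later. */
      	static int __init hugetlb_sysfs_add_hstate(struct hstate *h);
      
      	/* Fixed: drop __init so the function stays resident after boot. */
      	static int hugetlb_sysfs_add_hstate(struct hstate *h);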
    • mm: flush dcache before writing into page to avoid alias · 931e80e4
      Committed by anfei zhou
      The cache alias problem will happen if changes to a user shared mapping
      are not flushed before copying: the user and kernel mappings may then
      land in two different cache lines, making it impossible to guarantee
      coherence after iov_iter_copy_from_user_atomic.  So the right steps
      should be:
      
      	flush_dcache_page(page);
      	kmap_atomic(page);
      	write to page;
      	kunmap_atomic(page);
      	flush_dcache_page(page);
      
      More precisely, we might create two new APIs, flush_dcache_user_page
      and flush_dcache_kern_page, to replace the two flush_dcache_page calls
      accordingly.
      
      Here is a snippet tested on omap2430 with VIPT cache, and I think it is
      not ARM-specific:
      
      	int val = 0x11111111;
      	fd = open("abc", O_RDWR);
      	addr = mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
      	*(addr+0) = 0x44444444;
      	tmp = *(addr+0);
      	*(addr+1) = 0x77777777;
      	write(fd, &val, sizeof(int));
      	close(fd);
      
      The results are not always 0x11111111 0x77777777 at the beginning as expected.  Sometimes we see 0x44444444 0x77777777.
      Signed-off-by: Anfei <anfei.zhou@gmail.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Miklos Szeredi <miklos@szeredi.hu>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Cc: <linux-arch@vger.kernel.org>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
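      For convenience, a self-contained expansion of the quoted snippet (the
      includes, variable declarations, and the assumption that the file "abc"
      already exists and holds at least two ints are additions; error handling
      is omitted):
      
      	#include <fcntl.h>
      	#include <sys/mman.h>
      	#include <unistd.h>
      
      	int main(void)
      	{
      		int val = 0x11111111;
      		int tmp, fd;
      		int *addr;
      
      		fd = open("abc", O_RDWR);
      		addr = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
      
      		*(addr + 0) = 0x44444444;	/* dirty the shared user mapping */
      		tmp = *(addr + 0);		/* read it back through the same mapping */
      		*(addr + 1) = 0x77777777;
      		write(fd, &val, sizeof(int));	/* kernel writes via its own mapping */
      		close(fd);
      
      		(void)tmp;
      		return 0;
      	}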
    • mm: purge fragmented percpu vmap blocks · 02b709df
      Committed by Nick Piggin
      Improve handling of fragmented per-CPU vmaps.  Previously we did not
      free up a per-CPU vmap block until all of its addresses had been used
      and freed, so fragmented blocks could fill up vmalloc space even when
      they had no active vmap regions within them.
      
      Add some logic to purge these blocks on all CPUs when allocation of a
      new vm area fails, and also to trim such blocks on the current CPU if
      we hit them in the allocation path (so as to avoid a large build-up of
      them).
      
      Christoph reported some vmap allocation failures when using the per CPU
      vmap APIs in XFS, which cannot be reproduced after this patch and the
      previous bug fix.
      
      Cc: linux-mm@kvack.org
      Cc: stable@kernel.org
      Tested-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      --
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: percpu-vmap fix RCU list walking · de560423
      Committed by Nick Piggin
      RCU list walking of the per-cpu vmap cache was broken.  It did not use
      RCU primitives, and also the union of free_list and rcu_head is
      obviously wrong (because free_list is indeed the list we are RCU
      walking).
      
      While we are there, remove a couple of unused fields from an earlier
      iteration.
      
      These APIs aren't actually used anywhere, because of problems with the
      XFS conversion.  Christoph has now verified that the problems are solved
      with these patches.  Also it is an exported interface, so I think it
      will be good to be merged now (and Christoph wants to get the XFS
      changes into their local tree).
      
      Cc: stable@kernel.org
      Cc: linux-mm@kvack.org
      Tested-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      --
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
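      The structural problem, sketched (field list abbreviated; the struct name
      and layout here are assumptions, not the exact kernel definition):
      
      	/* Broken sketch: free_list is the very list RCU readers traverse, so it
      	 * cannot share storage with rcu_head -- call_rcu() scribbles over the
      	 * list linkage while a reader may still be walking it. */
      	struct vmap_block {
      		/* ... */
      		union {
      			struct list_head free_list;
      			struct rcu_head rcu_head;
      		};
      	};
      
      	/* Fixed sketch: separate members, with the list manipulated through the
      	 * RCU list primitives (list_add_rcu()/list_for_each_entry_rcu()). */
      	struct vmap_block {
      		/* ... */
      		struct list_head free_list;
      		struct rcu_head rcu_head;
      	};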
  13. 30 January 2010, 2 commits
    • slab: fix regression in touched logic · 44b57f1c
      Committed by Nick Piggin
      When factoring common code into transfer_objects in commit 3ded175a ("slab: add
      transfer_objects() function"), the 'touched' logic got a bit broken. When
      refilling from the shared array (taking objects from the shared array), we are
      making use of the shared array so it should be marked as touched.
      
      Subsequently pulling an element from the cpu array and allocating it should
      also touch the cpu array, but that is taken care of after the alloc_done label.
      (So yes, the cpu array was getting touched = 1 twice).
      
      So revert this logic to how it worked in earlier kernels.
      
      This also affects the behaviour in __drain_alien_cache, which would previously
      'touch' the shared array and now does not. I think it is more logical not to
      touch there, because we are pushing objects into the shared array rather than
      pulling them off. So there is no good reason to postpone reaping them -- if the
      shared array is getting utilized, then it will get 'touched' in the alloc path
      (where this patch now restores the touch).
      Acked-by: Christoph Lameter <cl@linux-foundation.org>
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
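      The restored behaviour in the refill path looks roughly like this (a sketch
      of the cache_alloc_refill() hunk; the local names ac, shared and batchcount
      are assumptions about the surrounding code):
      
      	/* Refilling the cpu array from the shared array is a use of the shared
      	 * array, so mark it touched again (as it was before 3ded175a). */
      	if (shared && transfer_objects(ac, shared, batchcount)) {
      		shared->touched = 1;
      		goto alloc_done;
      	}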
    • mm: fix migratetype bug which slowed swapping · a7016235
      Committed by Hugh Dickins
      After memory pressure has forced it to dip into the reserves, 2.6.32's
      5f8dcc21 "page-allocator: split per-cpu
      list into one-list-per-migrate-type" has been returning MIGRATE_RESERVE
      pages to the MIGRATE_MOVABLE free_list: in some sense depleting reserves.
      
      Fix that in the most straightforward way (which, considering the overheads
      of alternative approaches, is Mel's preference): the right migratetype is
      already in page_private(page), but free_pcppages_bulk() wasn't using it.
      
      How did this bug show up?  As a 20% slowdown in my tmpfs loop kbuild
      swapping tests, on PowerMac G5 with SLUB allocator.  Bisecting to that
      commit was easy, but explaining the magnitude of the slowdown not easy.
      
      The same effect appears, but much less markedly, with SLAB, and even
      less markedly on other machines (the PowerMac divides into fewer zones
      than x86, I think that may be a factor).  We guess that lumpy reclaim
      of short-lived high-order pages is implicated in some way, and probably
      this bug has been tickling a poor decision somewhere in page reclaim.
      
      But instrumentation hasn't told me much, I've run out of time and
      imagination to determine exactly what's going on, and shouldn't hold up
      the fix any longer: it's valid, and might even fix other misbehaviours.
      Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
      Acked-by: Mel Gorman <mel@csn.ul.ie>
      Cc: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
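      The fix itself is small; sketched from the description above (the surrounding
      free_pcppages_bulk() loop is omitted, and the buddy-free helper shown is an
      assumption about that path):
      
      	/* The pcp list being drained does not determine the page's real
      	 * migratetype once reserves have been handed out; the value recorded
      	 * in page_private(page) does, so free back to that list instead. */
      	__free_one_page(page, zone, 0, page_private(page));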