1. 30 May 2012, 6 commits
  2. 26 May 2012, 2 commits
    • mm: add new arch_make_huge_pte() method for tile support · d9ed9faa
      Committed by Chris Metcalf
      The tile support for multiple-size huge pages requires tagging
      the hugetlb PTE with a "super" bit for PTEs that are multiples of
      the basic size of a pagetable span.  To set that bit properly
      we need to tweak the PTE in make_huge_pte() based on the vma.
      
      This change provides the API for a subsequent tile-specific
      change to use (a sketch of the hook follows this entry).
      Reviewed-by: Hillf Danton <dhillf@gmail.com>
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
      d9ed9faa
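
      For context, a minimal sketch of how such a hook typically fits together: a no-op
      generic default that an architecture can override, called from make_huge_pte().
      The signature and placement below are assumptions based on the description above,
      not a quote of the actual patch.

          /* No-op generic default; an architecture (here, tile) overrides it to
           * adjust the PTE, e.g. to set a "super" bit for multi-page spans. */
          #ifndef arch_make_huge_pte
          static inline pte_t arch_make_huge_pte(pte_t entry, struct vm_area_struct *vma,
                                                 struct page *page, int writable)
          {
                  return entry;   /* generic case: leave the PTE as built */
          }
          #endif

          /* make_huge_pte() then passes the freshly built entry through the hook,
           * letting the architecture consult the vma (e.g. its huge page size): */
          entry = arch_make_huge_pte(entry, vma, page, writable);
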
    • arch/tile: allow building Linux with transparent huge pages enabled · 73636b1a
      Committed by Chris Metcalf
      The change adds some infrastructure for managing tile pmds more generally,
      using pte_pmd() and pmd_pte() methods to translate pmd values to and
      from ptes, since on TILEPro a pmd is really just a nested structure
      holding a pgd (aka pte).  Several existing pmd methods are moved into
      this framework, and a whole raft of additional pmd accessors are defined
      that are used by the transparent hugepage framework (see the sketch after this entry).
      
      The tile PTE now has a "client2" bit.  The bit is used to indicate a
      transparent huge page is in the process of being split into subpages.
      
      This change also fixes a generic bug where the return value of the
      generic pmdp_splitting_flush() was incorrect.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
      73636b1a
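
      To illustrate the pte_pmd()/pmd_pte() scheme described above, here is a small
      standalone model; the types, the flag bit, and the pmd_mkhuge() example are
      illustrative, not the actual arch/tile headers.

          /* Illustrative model: a pmd wraps a pte, so pmd accessors can be
           * defined by round-tripping through existing pte helpers. */
          typedef unsigned long pteval_t;
          typedef struct { pteval_t val; } pte_t;
          typedef struct { pte_t pte; } pmd_t;            /* pmd is a nested pte */

          #define _PAGE_HUGE_PAGE 0x1UL                   /* placeholder flag bit */
          static inline pte_t pte_mkhuge(pte_t pte) { pte.val |= _PAGE_HUGE_PAGE; return pte; }

          static inline pmd_t pte_pmd(pte_t pte) { pmd_t pmd = { pte }; return pmd; }
          static inline pte_t pmd_pte(pmd_t pmd) { return pmd.pte; }

          /* A pmd accessor needed by transparent hugepages, built from the
           * conversion helpers and the corresponding pte helper: */
          static inline pmd_t pmd_mkhuge(pmd_t pmd)
          {
                  return pte_pmd(pte_mkhuge(pmd_pte(pmd)));
          }
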
  3. 24 May 2012, 2 commits
    • mm: add a low limit to alloc_large_system_hash · 31fe62b9
      Committed by Tim Bird
      The UDP stack needs a minimum hash size for proper operation, and it
      uses alloc_large_system_hash() for proper NUMA distribution of its hash
      tables and for automatic sizing depending on available system memory.

      In some low-memory situations, udp_table_init() must ignore the
      alloc_large_system_hash() result and reallocate a bigger memory area.

      As we cannot easily free the old hash table, we leak it, and kmemleak
      can issue a warning.

      This patch adds a low-limit parameter to alloc_large_system_hash() to
      solve this problem (sketched after this entry).
      
      We then specify UDP_HTABLE_SIZE_MIN for UDP/UDPLite hash table
      allocation.
      Reported-by: Mark Asselstine <mark.asselstine@windriver.com>
      Reported-by: Tim Bird <tim.bird@am.sony.com>
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      31fe62b9
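
      The shape of the extended interface, as a sketch; the parameter names and the
      UDP call-site values below are illustrative rather than the exact kernel code.

          /* New low_limit argument, alongside the existing upper limit: */
          void *alloc_large_system_hash(const char *tablename,
                                        unsigned long bucketsize,
                                        unsigned long numentries,
                                        int scale,
                                        int flags,
                                        unsigned int *_hash_shift,
                                        unsigned int *_hash_mask,
                                        unsigned long low_limit,   /* new: minimum entries */
                                        unsigned long high_limit);

          /* udp_table_init() can then demand a usable minimum up front instead of
           * reallocating (and leaking) a too-small table afterwards: */
          table->hash = alloc_large_system_hash("UDP", sizeof(struct udp_hslot),
                                                uhash_entries, 21, 0,
                                                &table->log, &table->mask,
                                                UDP_HTABLE_SIZE_MIN,  /* low limit */
                                                64 * 1024);           /* high limit */
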
    • mm: mempolicy: Let vma_merge and vma_split handle vma->vm_policy linkages · 05f144a0
      Committed by Mel Gorman
      Dave Jones' system call fuzz-testing tool "trinity" triggered the
      following BUG report with slab debugging enabled:
      
          =============================================================================
          BUG numa_policy (Not tainted): Poison overwritten
          -----------------------------------------------------------------------------
      
          INFO: 0xffff880146498250-0xffff880146498250. First byte 0x6a instead of 0x6b
          INFO: Allocated in mpol_new+0xa3/0x140 age=46310 cpu=6 pid=32154
           __slab_alloc+0x3d3/0x445
           kmem_cache_alloc+0x29d/0x2b0
           mpol_new+0xa3/0x140
           sys_mbind+0x142/0x620
           system_call_fastpath+0x16/0x1b
          INFO: Freed in __mpol_put+0x27/0x30 age=46268 cpu=6 pid=32154
           __slab_free+0x2e/0x1de
           kmem_cache_free+0x25a/0x260
           __mpol_put+0x27/0x30
           remove_vma+0x68/0x90
           exit_mmap+0x118/0x140
           mmput+0x73/0x110
           exit_mm+0x108/0x130
           do_exit+0x162/0xb90
           do_group_exit+0x4f/0xc0
           sys_exit_group+0x17/0x20
           system_call_fastpath+0x16/0x1b
          INFO: Slab 0xffffea0005192600 objects=27 used=27 fp=0x          (null) flags=0x20000000004080
          INFO: Object 0xffff880146498250 @offset=592 fp=0xffff88014649b9d0
      
      This implied a reference-counting bug, and the problem happened during
      mbind().
      
      mbind() applies a new memory policy to a range and uses mbind_range() to
      merge existing VMAs or split them as necessary.  In the event of splits,
      mpol_dup() will allocate a new struct mempolicy and maintain existing
      reference counts whose rules are documented in
      Documentation/vm/numa_memory_policy.txt .
      
      The problem occurs with shared memory policies.  The vm_ops->set_policy
      callback increments the reference count if necessary, and split_vma() and
      vma_merge() have already handled the existing reference counts.
      However, policy_vma() then breaks this by replacing an existing
      vma->vm_policy with one that potentially has the wrong reference count,
      leading to a premature free.  This patch removes the damage caused by
      policy_vma() (a toy model of the hazard follows this entry).
      
      With this patch applied Dave's trinity tool runs an mbind test for 5
      minutes without error.  /proc/slabinfo reported that there are no
      numa_policy or shared_policy_node objects allocated after the test
      completed and the shared memory region was deleted.
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Cc: Dave Jones <davej@redhat.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Stephen Wilson <wilsons@start.ca>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      05f144a0
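
      To make the reference-counting hazard concrete, here is a small standalone toy
      model (plain userspace C, not kernel code): whoever stores a pointer to a shared
      policy must hold its own reference, otherwise the object is freed while still in
      use, which is the kind of premature free the removed policy_vma() could cause.

          #include <assert.h>
          #include <stdlib.h>

          struct policy { int refcnt; };

          static struct policy *policy_get(struct policy *p) { p->refcnt++; return p; }

          static void policy_put(struct policy *p)
          {
                  if (--p->refcnt == 0)
                          free(p);        /* last reference dropped: object goes away */
          }

          int main(void)
          {
                  struct policy *shared = calloc(1, sizeof(*shared));
                  shared->refcnt = 1;                     /* held by the shared region */

                  /* Correct: take a reference before storing the pointer elsewhere. */
                  struct policy *vma_policy = policy_get(shared);

                  policy_put(shared);                     /* region drops its reference */
                  assert(vma_policy->refcnt == 1);        /* stored pointer stays valid */
                  policy_put(vma_policy);                 /* now the object is really freed */

                  /* Storing the pointer WITHOUT policy_get() would leave refcnt at 1, and
                   * the first policy_put() above would free memory still in use: the same
                   * class of bug as the poison-overwrite report quoted above. */
                  return 0;
          }
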
  4. 21 May 2012, 13 commits
  5. 20 May 2012, 1 commit
    • memcg,thp: fix res_counter:96 regression · 62ade86a
      Committed by Hugh Dickins
      Occasionally, testing memcg's move_charge_at_immigrate on rc7 shows
      a flurry of hundreds of warnings at kernel/res_counter.c:96, where
      res_counter_uncharge_locked() does WARN_ON(counter->usage < val).
      
      The first trace of each flurry implicates __mem_cgroup_cancel_charge()
      of mc.precharge, and an audit of mc.precharge handling points to
      mem_cgroup_move_charge_pte_range()'s THP handling in commit 12724850
      ("memcg: avoid THP split in task migration").
      
      Checking !mc.precharge is fine everywhere else, where a single page is to
      be charged; but here the "mc.precharge -= HPAGE_PMD_NR" that is likely to
      follow is liable to underflow (a lot can change since the precharge was
      estimated).
      
      Simply check against HPAGE_PMD_NR: there is probably a better
      alternative, such as trying to precharge for more and splitting if
      unsuccessful; but this one-liner is safer for now - no
      kernel/res_counter.c:96 warnings seen in 26 hours (the added check is
      sketched after this entry).
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      62ade86a
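
      A sketch of the added check, simplified from the description above; the exact
      locking context in mem_cgroup_move_charge_pte_range() is not reproduced here.

          /* Only take the THP path when a full huge page worth of precharge
           * is available, since HPAGE_PMD_NR pages will be debited at once. */
          if (mc.precharge < HPAGE_PMD_NR) {
                  spin_unlock(&vma->vm_mm->page_table_lock);
                  return 0;       /* retry later rather than underflow */
          }
          /* ... later in the THP path, "mc.precharge -= HPAGE_PMD_NR" can no
           * longer underflow thanks to the check above. */
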
  6. 18 May 2012, 1 commit
    • slub: missing test for partial pages flush work in flush_all() · 02e1a9cd
      Committed by majianpeng
      I found some kernel messages such as:
      
          SLUB raid5-md127: kmem_cache_destroy called for cache that still has objects.
          Pid: 6143, comm: mdadm Tainted: G           O 3.4.0-rc6+        #75
          Call Trace:
          kmem_cache_destroy+0x328/0x400
          free_conf+0x2d/0xf0 [raid456]
          stop+0x41/0x60 [raid456]
          md_stop+0x1a/0x60 [md_mod]
          do_md_stop+0x74/0x470 [md_mod]
          md_ioctl+0xff/0x11f0 [md_mod]
          blkdev_ioctl+0xd8/0x7a0
          block_ioctl+0x3b/0x40
          do_vfs_ioctl+0x96/0x560
          sys_ioctl+0x91/0xa0
          system_call_fastpath+0x16/0x1b
      
      Then using kmemleak I found these messages:
      
          unreferenced object 0xffff8800b6db7380 (size 112):
            comm "mdadm", pid 5783, jiffies 4294810749 (age 90.589s)
            hex dump (first 32 bytes):
              01 01 db b6 ad 4e ad de ff ff ff ff ff ff ff ff  .....N..........
              ff ff ff ff ff ff ff ff 98 40 4a 82 ff ff ff ff  .........@J.....
            backtrace:
              kmemleak_alloc+0x21/0x50
              kmem_cache_alloc+0xeb/0x1b0
              kmem_cache_open+0x2f1/0x430
              kmem_cache_create+0x158/0x320
              setup_conf+0x649/0x770 [raid456]
              run+0x68b/0x840 [raid456]
              md_run+0x529/0x940 [md_mod]
              do_md_run+0x18/0xc0 [md_mod]
              md_ioctl+0xba8/0x11f0 [md_mod]
              blkdev_ioctl+0xd8/0x7a0
              block_ioctl+0x3b/0x40
              do_vfs_ioctl+0x96/0x560
              sys_ioctl+0x91/0xa0
              system_call_fastpath+0x16/0x1b
      
      This bug was introduced by commit a8364d55 ("slub: only IPI CPUs that
      have per cpu obj to flush"), which did not check whether per-cpu partial
      pages were present on a cpu (see the sketch after this entry).
      Signed-off-by: majianpeng <majianpeng@gmail.com>
      Cc: Gilad Ben-Yossef <gilad@benyossef.com>
      Acked-by: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Tested-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      02e1a9cd
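
      A sketch of the fix as described above; the function and field names follow
      common mm/slub.c usage, but treat the exact shape as illustrative.

          /* The predicate that decides whether a CPU needs an IPI must also look
           * at per-cpu partial pages, not only the active page: */
          static bool has_cpu_slab(int cpu, void *info)
          {
                  struct kmem_cache *s = info;
                  struct kmem_cache_cpu *c = per_cpu_ptr(s->cpu_slab, cpu);

                  return c->page || c->partial;   /* previously only c->page */
          }

          /* flush_all() then IPIs only the CPUs for which the predicate is true: */
          static void flush_all(struct kmem_cache *s)
          {
                  on_each_cpu_cond(has_cpu_slab, flush_cpu_slab, s, 1, GFP_ATOMIC);
          }
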
  7. 16 May 2012, 2 commits
  8. 12 May 2012, 1 commit
    • mm: raise MemFree by reverting percpu_pagelist_fraction to 0 · 1b76b02f
      Committed by Hugh Dickins
      Why is there less MemFree than there used to be?  It perturbed a test,
      so I've just been bisecting linux-next, and now find the offender went
      upstream yesterday.
      
      Commit 93278814 ("mm: fix division by 0 in percpu_pagelist_fraction()")
      mistakenly initialized percpu_pagelist_fraction to the sysctl's minimum of 8,
      which leaves 1/8th of memory on percpu lists (on each cpu??); but most of
      us expect it to be left unset at 0, in which case it is not used as a
      divisor at all (the intended behaviour is sketched after this entry).
      
        MemTotal: 8061476kB  8061476kB  8061476kB  8061476kB  8061476kB  8061476kB
        Repetitive test with percpu_pagelist_fraction 8:
        MemFree:  6948420kB  6237172kB  6949696kB  6840692kB  6949048kB  6862984kB
        Same test with percpu_pagelist_fraction back to 0:
        MemFree:  7945000kB  7944908kB  7948568kB  7949060kB  7948796kB  7948812kB
      Signed-off-by: Hugh Dickins <hughd@google.com>
      [ We really should fix the crazy sysctl interface too, but that's a
        separate thing - Linus ]
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1b76b02f
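
      The gist of the revert, as a sketch grounded only in the description above; the
      helper and its callee below are hypothetical, not the literal kernel code.

          int percpu_pagelist_fraction;   /* back to 0, i.e. "unset" (was mistakenly 8) */

          /* Hypothetical helper showing how the value is meant to be used: */
          static void setup_zone_pcp_high(unsigned long zone_pages)
          {
                  if (!percpu_pagelist_fraction)
                          return;         /* 0: keep the kernel's own pcp sizing */
                  /* e.g. a fraction of 8 keeps up to 1/8th of the zone on per-cpu lists */
                  set_pcp_high_mark(zone_pages / percpu_pagelist_fraction);  /* hypothetical */
          }
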
  9. 11 May 2012, 4 commits
  10. 10 May 2012, 2 commits
  11. 08 May 2012, 1 commit
  12. 07 May 2012, 2 commits
  13. 06 May 2012, 2 commits
  14. 03 May 2012, 1 commit