  1. 12 Jun, 2009 (17 commits)
    • slab: setup cpu caches later on when interrupts are enabled · 8429db5c
      Pekka Enberg committed
      Fixes the following boot-time warning:
      
        [    0.000000] ------------[ cut here ]------------
        [    0.000000] WARNING: at kernel/smp.c:369 smp_call_function_many+0x56/0x1bc()
        [    0.000000] Hardware name:
        [    0.000000] Modules linked in:
        [    0.000000] Pid: 0, comm: swapper Not tainted 2.6.30 #492
        [    0.000000] Call Trace:
        [    0.000000]  [<ffffffff8149e021>] ? _spin_unlock+0x4f/0x5c
        [    0.000000]  [<ffffffff8108f11b>] ? smp_call_function_many+0x56/0x1bc
        [    0.000000]  [<ffffffff81061764>] warn_slowpath_common+0x7c/0xa9
        [    0.000000]  [<ffffffff810617a5>] warn_slowpath_null+0x14/0x16
        [    0.000000]  [<ffffffff8108f11b>] smp_call_function_many+0x56/0x1bc
        [    0.000000]  [<ffffffff810f3e00>] ? do_ccupdate_local+0x0/0x54
        [    0.000000]  [<ffffffff810f3e00>] ? do_ccupdate_local+0x0/0x54
        [    0.000000]  [<ffffffff8108f2be>] smp_call_function+0x3d/0x68
        [    0.000000]  [<ffffffff810f3e00>] ? do_ccupdate_local+0x0/0x54
        [    0.000000]  [<ffffffff81066fd8>] on_each_cpu+0x31/0x7c
        [    0.000000]  [<ffffffff810f64f5>] do_tune_cpucache+0x119/0x454
        [    0.000000]  [<ffffffff81087080>] ? lockdep_init_map+0x94/0x10b
        [    0.000000]  [<ffffffff818133b0>] ? kmem_cache_init+0x421/0x593
        [    0.000000]  [<ffffffff810f69cf>] enable_cpucache+0x68/0xad
        [    0.000000]  [<ffffffff818133c3>] kmem_cache_init+0x434/0x593
        [    0.000000]  [<ffffffff8180987c>] ? mem_init+0x156/0x161
        [    0.000000]  [<ffffffff817f8aae>] start_kernel+0x1cc/0x3b9
        [    0.000000]  [<ffffffff817f829a>] x86_64_start_reservations+0xaa/0xae
        [    0.000000]  [<ffffffff817f837f>] x86_64_start_kernel+0xe1/0xe8
        [    0.000000] ---[ end trace 4eaa2a86a8e2da22 ]---
      
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
    • slab,slub: don't enable interrupts during early boot · 7e85ee0c
      Pekka Enberg committed
      As explained by Benjamin Herrenschmidt:
      
        Oh and btw, your patch alone doesn't fix powerpc, because it's missing
        a whole bunch of GFP_KERNEL's in the arch code... You would have to
        grep the entire kernel for things that check slab_is_available() and
        even then you'll be missing some.
      
        For example, slab_is_available() didn't always exist, and so in the
        early days on powerpc, we used a mem_init_done global that is set from
        mem_init() (not perfect but works in practice). And we still have code
        using that to do the test.
      
      Therefore, mask out __GFP_WAIT, __GFP_IO, and __GFP_FS in the slab allocators
      in early boot code to avoid enabling interrupts.
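      A minimal sketch of the idea (the constant and variable names here are
      illustrative, not necessarily the ones used in the patch):
      
        /* Illustrative sketch: strip the flags that could sleep or start
         * I/O while interrupts are still disabled during early boot. */
        #define GFP_BOOT_MASK  (__GFP_BITS_MASK & ~(__GFP_WAIT | __GFP_IO | __GFP_FS))
        
        static gfp_t boot_gfp_mask __read_mostly = GFP_BOOT_MASK;
        
        /* Applied at the top of the slab/slub allocation paths: */
        flags &= boot_gfp_mask;
        
        /* Once interrupts are enabled for good, the mask is widened again,
         * e.g. boot_gfp_mask = __GFP_BITS_MASK; */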
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
    • slab: fix gfp flag in setup_cpu_cache() · eb91f1d0
      Pekka Enberg committed
      Fixes the following warning during bootup when compiling with CONFIG_SLAB:
      
        [    0.000000] ------------[ cut here ]------------
        [    0.000000] WARNING: at kernel/lockdep.c:2282 lockdep_trace_alloc+0x91/0xb9()
        [    0.000000] Hardware name:
        [    0.000000] Modules linked in:
        [    0.000000] Pid: 0, comm: swapper Not tainted 2.6.30 #491
        [    0.000000] Call Trace:
        [    0.000000]  [<ffffffff81087d84>] ? lockdep_trace_alloc+0x91/0xb9
        [    0.000000]  [<ffffffff81061764>] warn_slowpath_common+0x7c/0xa9
        [    0.000000]  [<ffffffff810617a5>] warn_slowpath_null+0x14/0x16
        [    0.000000]  [<ffffffff81087d84>] lockdep_trace_alloc+0x91/0xb9
        [    0.000000]  [<ffffffff810f5b03>] kmem_cache_alloc_node_notrace+0x26/0xdf
        [    0.000000]  [<ffffffff81487f4e>] ? setup_cpu_cache+0x7e/0x210
        [    0.000000]  [<ffffffff81487fe3>] setup_cpu_cache+0x113/0x210
        [    0.000000]  [<ffffffff810f73ff>] kmem_cache_create+0x409/0x486
        [    0.000000]  [<ffffffff818131c1>] kmem_cache_init+0x232/0x593
        [    0.000000]  [<ffffffff8180987c>] ? mem_init+0x156/0x161
        [    0.000000]  [<ffffffff817f8aae>] start_kernel+0x1cc/0x3b9
        [    0.000000]  [<ffffffff817f829a>] x86_64_start_reservations+0xaa/0xae
        [    0.000000]  [<ffffffff817f837f>] x86_64_start_kernel+0xe1/0xe8
        [    0.000000] ---[ end trace 4eaa2a86a8e2da22 ]---
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
    • memcg: fix page_cgroup fatal error in FLATMEM · ca371c0d
      KAMEZAWA Hiroyuki committed
      SLAB is now set up at a very early stage of boot and can be used from
      init routines.
      
      However, replacing alloc_bootmem() with slab allocations in the
      FLATMEM/DISCONTIGMEM page_cgroup initialization breaks the allocation,
      because the per-node page_cgroup array can be too large for the page
      allocator. (SPARSEMEM works fine: it supports MEMORY_HOTPLUG and keeps
      each page_cgroup section to a reasonable size, below 1 << MAX_ORDER
      pages.)
      
      This patch revives FLATMEM + memory cgroup by going back to
      alloc_bootmem() for that case.
      
      In the future we can either stop supporting FLATMEM (if it has no users)
      or rewrite its code path completely, but doing so now would add more
      messy code and overhead.
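      A sketch of the FLATMEM path (function and field names here are
      illustrative, reconstructed from memory rather than quoted from the
      patch):
      
        /* Illustrative sketch: FLATMEM keeps its per-node page_cgroup array
         * in bootmem, because the whole array can exceed the 1 << MAX_ORDER
         * limit of the page/slab allocators. */
        static int __init alloc_node_page_cgroup(int nid)
        {
            struct page_cgroup *base;
            unsigned long nr_pages = NODE_DATA(nid)->node_spanned_pages;
            unsigned long table_size = sizeof(struct page_cgroup) * nr_pages;
        
            if (!nr_pages)
                return 0;
            /* alloc_bootmem_node() panics rather than returning NULL */
            base = alloc_bootmem_node(NODE_DATA(nid), table_size);
            NODE_DATA(nid)->node_page_cgroup = base;
            return 0;
        }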
      Reported-by: Li Zefan <lizf@cn.fujitsu.com>
      Tested-by: Li Zefan <lizf@cn.fujitsu.com>
      Tested-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
    • memcg: don't use bootmem allocator in setup code · 959982fe
      Yinghai Lu committed
      The bootmem allocator is no longer available for page_cgroup_init() because we
      set up the kernel slab allocator much earlier now.
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
    • vmalloc: use kzalloc() instead of alloc_bootmem() · 43ebdac4
      Pekka Enberg committed
      We can call vmalloc_init() after kmem_cache_init() and use kzalloc() instead of
      the bootmem allocator when initializing vmalloc data structures.
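      A sketch of the kind of hunk involved (illustrative, not the exact diff):
      
        /* Illustrative sketch: vmalloc_init() now runs after
         * kmem_cache_init(), so the early vmap_area descriptors can come
         * from the slab allocator.  GFP_NOWAIT is used because interrupts
         * are still disabled at this point. */
        struct vmap_area *va = kzalloc(sizeof(struct vmap_area), GFP_NOWAIT);
        /* previously: va = alloc_bootmem(sizeof(struct vmap_area)); */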
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Acked-by: Nick Piggin <npiggin@suse.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
    • slab: setup allocators earlier in the boot sequence · 83b519e8
      Pekka Enberg committed
      This patch makes kmalloc() available earlier in the boot sequence so we can get
      rid of some bootmem allocations. The bulk of the changes are due to
      kmem_cache_init() being called with interrupts disabled which requires some
      changes to the allocator bootstrap code.
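      The flavour of change in the bootstrap path is roughly this (illustrative):
      
        /* Illustrative sketch: with interrupts still disabled during
         * kmem_cache_init(), bootstrap allocations must not sleep or
         * re-enable interrupts, so GFP_KERNEL becomes GFP_NOWAIT. */
        ptr = kmalloc(size, GFP_NOWAIT);              /* was: GFP_KERNEL */
        nptr = kmalloc_node(size, GFP_NOWAIT, nid);   /* was: GFP_KERNEL */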
      
      Note: 32-bit x86 performs the WP-protect test in mem_init(), so we must set
      up traps before calling mem_init() during boot, as reported by Ingo Molnar:
      
        We have a hard crash in the WP-protect code:
      
        [    0.000000] Checking if this processor honours the WP bit even in supervisor mode...BUG: Int 14: CR2 ffcff000
        [    0.000000]      EDI 00000188  ESI 00000ac7  EBP c17eaf9c  ESP c17eaf8c
        [    0.000000]      EBX 000014e0  EDX 0000000e  ECX 01856067  EAX 00000001
        [    0.000000]      err 00000003  EIP c10135b1   CS 00000060  flg 00010002
        [    0.000000] Stack: c17eafa8 c17fd410 c16747bc c17eafc4 c17fd7e5 000011fd f8616000 c18237cc
        [    0.000000]        00099800 c17bb000 c17eafec c17f1668 000001c5 c17f1322 c166e039 c1822bf0
        [    0.000000]        c166e033 c153a014 c18237cc 00020800 c17eaff8 c17f106a 00020800 01ba5003
        [    0.000000] Pid: 0, comm: swapper Not tainted 2.6.30-tip-02161-g7a74539-dirty #52203
        [    0.000000] Call Trace:
        [    0.000000]  [<c15357c2>] ? printk+0x14/0x16
        [    0.000000]  [<c10135b1>] ? do_test_wp_bit+0x19/0x23
        [    0.000000]  [<c17fd410>] ? test_wp_bit+0x26/0x64
        [    0.000000]  [<c17fd7e5>] ? mem_init+0x1ba/0x1d8
        [    0.000000]  [<c17f1668>] ? start_kernel+0x164/0x2f7
        [    0.000000]  [<c17f1322>] ? unknown_bootoption+0x0/0x19c
        [    0.000000]  [<c17f106a>] ? __init_begin+0x6a/0x6f
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Matt Mackall <mpm@selenic.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
    • bootmem: fix slab fallback on numa · c91c4773
      Pekka Enberg committed
      If the user requested bootmem allocation on a specific node, we should use
      kzalloc_node() for the fallback allocation.
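      A sketch of the fallback (close to, but not quoted from, the bootmem
      slab-fallback path):
      
        /* Illustrative sketch: once slab is up, a node-specific bootmem
         * request falls back to the slab allocator on that node instead of
         * a plain kzalloc(). */
        if (WARN_ON_ONCE(slab_is_available()))
            return kzalloc_node(size, GFP_NOWAIT, pgdat->node_id);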
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
    • bootmem: use slab if bootmem is no longer available · 441c7e0a
      Pekka Enberg committed
      As a preparation for initializing the slab allocator early, make sure the
      bootmem allocator does not crash and burn if someone calls it after slab is up;
      otherwise we'd need a flag day for switching to early slab.
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Matt Mackall <mpm@selenic.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
    • kmemleak: Simple testing module for kmemleak · 0822ee4a
      Catalin Marinas committed
      This patch adds a loadable module that deliberately leaks memory. It
      is used for testing various memory leaking scenarios.
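      A minimal sketch of such a module (not the actual kmemleak test code,
      just the idea of leaking a tracked allocation on purpose):
      
        #include <linux/init.h>
        #include <linux/module.h>
        #include <linux/slab.h>
        
        static int __init leak_test_init(void)
        {
            /* Allocate and deliberately drop the pointer: a kmemleak scan
             * should later report this object as unreferenced. */
            void *leak = kmalloc(64, GFP_KERNEL);
        
            pr_info("leak-test: leaked object at %p\n", leak);
            return 0;
        }
        module_init(leak_test_init);
        
        MODULE_LICENSE("GPL");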
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • kmemleak: Enable the building of the memory leak detector · 3bba00d7
      Catalin Marinas committed
      This patch adds the Kconfig.debug and Makefile entries needed for
      building kmemleak into the kernel.
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • kmemleak: Add kmemleak_alloc callback from alloc_large_system_hash · dbb1f81c
      Catalin Marinas committed
      The alloc_large_system_hash function is called from various places in
      the kernel and it contains pointers to other allocated structures. It
      therefore needs to be traced by kmemleak.
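      A sketch of the hook (kmemleak_alloc() takes the pointer, the object
      size, a minimum expected reference count, and GFP flags; the surrounding
      details and the helper name below are simplified/hypothetical):
      
        /* Illustrative sketch: tell kmemleak about the hash table memory so
         * the pointers stored inside it get scanned, instead of the table's
         * own allocation being invisible to the tracker. */
        table = alloc_hash_table(size);             /* hypothetical helper */
        kmemleak_alloc(table, size, 1, GFP_ATOMIC);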
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • kmemleak: Add the vmalloc memory allocation/freeing hooks · 89219d37
      Catalin Marinas committed
      This patch adds the callbacks to kmemleak_(alloc|free) functions from
      vmalloc/vfree.
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • kmemleak: Add the slub memory allocation/freeing hooks · 06f22f13
      Catalin Marinas committed
      This patch adds the callbacks to kmemleak_(alloc|free) functions from the
      slub allocator.
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
    • kmemleak: Add the slob memory allocation/freeing hooks · 4374e616
      Catalin Marinas committed
      This patch adds the callbacks to kmemleak_(alloc|free) functions from the
      slob allocator.
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: Matt Mackall <mpm@selenic.com>
      Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
    • kmemleak: Add the slab memory allocation/freeing hooks · d5cff635
      Catalin Marinas committed
      This patch adds the callbacks to kmemleak_(alloc|free) functions from
      the slab allocator. The patch also adds the SLAB_NOLEAKTRACE flag to
      avoid recursive calls to kmemleak when it allocates its own data
      structures.
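      A sketch of how the recursion is avoided (illustrative; kmemleak's own
      metadata cache is created with SLAB_NOLEAKTRACE so its allocations are
      not fed back into the tracker):
      
        /* Illustrative sketch: allocations from this cache do not trigger
         * another kmemleak_alloc() callback. */
        object_cache = kmem_cache_create("kmemleak_object",
                                         sizeof(struct kmemleak_object), 0,
                                         SLAB_NOLEAKTRACE, NULL);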
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
    • kmemleak: Add the base support · 3c7b4e6b
      Catalin Marinas committed
      This patch adds the base support for the kernel memory leak
      detector. It traces the memory allocation/freeing in a way similar to
      Boehm's conservative garbage collector, the difference being that
      the unreferenced objects are not freed but only shown in
      /sys/kernel/debug/kmemleak. Enabling this feature introduces an
      overhead to memory allocations.
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
  2. 10 Jun, 2009 (2 commits)
    • nommu: Provide mmap_min_addr definition. · 35f2c2f6
      Paul Mundt committed
      With the "security: use mmap_min_addr indepedently of security models"
      change, mmap_min_addr is used in common areas, which subsequently blows
      up the nommu build. This stubs in the definition in the nommu case as
      well.
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
      
      --
      
       mm/nommu.c |    3 +++
       1 file changed, 3 insertions(+)
      Signed-off-by: James Morris <jmorris@namei.org>
    • tracing/events: convert block trace points to TRACE_EVENT() · 55782138
      Li Zefan committed
      TRACE_EVENT is a more generic way to define tracepoints. Converting the
      block tracepoints to it adds these new capabilities:
      
        - zero-copy and per-cpu splice() tracing
        - binary tracing without printf overhead
        - structured logging records exposed under /debug/tracing/events
        - trace events embedded in function tracer output and other plugins
        - user-defined, per tracepoint filter expressions
        ...
      
      Cons:
      
        - no dev_t info for the output of plug, unplug_timer and unplug_io events.
          no dev_t info for getrq and sleeprq events if bio == NULL.
          no dev_t info for rq_abort,...,rq_requeue events if rq->rq_disk == NULL.
      
          This is mainly because we can't get the device from a request queue.
          But this may change in the future.
      
        - A packet command is converted to a string in TP_assign, not TP_print,
          while blktrace does the conversion just before output.
      
          Since pc requests should be rather rare, this is not a big issue.
      
        - In blktrace, an event can have 2 different print formats, but a TRACE_EVENT
          has a single format, which means we have some unused data in a trace entry.
      
          The overhead is minimized by using __dynamic_array() instead of __array().
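      For reference, a TRACE_EVENT definition has roughly this shape (a sketch
      modelled on the simple plug event; the other block events carry more
      fields):
      
        TRACE_EVENT(block_plug,
        
                TP_PROTO(struct request_queue *q),
        
                TP_ARGS(q),
        
                TP_STRUCT__entry(
                        __array(char, comm, TASK_COMM_LEN)
                ),
        
                TP_fast_assign(
                        memcpy(__entry->comm, current->comm, TASK_COMM_LEN);
                ),
        
                TP_printk("[%s]", __entry->comm)
        );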
      
      I've benchmarked the ioctl blktrace vs the splice based TRACE_EVENT tracing:
      
            dd                   dd + ioctl blktrace       dd + TRACE_EVENT (splice)
      1     7.36s, 42.7 MB/s     7.50s, 42.0 MB/s          7.41s, 42.5 MB/s
      2     7.43s, 42.3 MB/s     7.48s, 42.1 MB/s          7.43s, 42.4 MB/s
      3     7.38s, 42.6 MB/s     7.45s, 42.2 MB/s          7.41s, 42.5 MB/s
      
      So the overhead of tracing is very small, and there is no regression when
      using those trace events instead of blktrace.
      
      And the binary output of TRACE_EVENT is much smaller than blktrace:
      
       # ls -l -h
       -rw-r--r-- 1 root root 8.8M 06-09 13:24 sda.blktrace.0
       -rw-r--r-- 1 root root 195K 06-09 13:24 sda.blktrace.1
       -rw-r--r-- 1 root root 2.7M 06-09 13:25 trace_splice.out
      
      Following are some comparisons between TRACE_EVENT and blktrace:
      
      plug:
        kjournald-480   [000]   303.084981: block_plug: [kjournald]
        kjournald-480   [000]   303.084981:   8,0    P   N [kjournald]
      
      unplug_io:
        kblockd/0-118   [000]   300.052973: block_unplug_io: [kblockd/0] 1
        kblockd/0-118   [000]   300.052974:   8,0    U   N [kblockd/0] 1
      
      remap:
        kjournald-480   [000]   303.085042: block_remap: 8,0 W 102736992 + 8 <- (8,8) 33384
        kjournald-480   [000]   303.085043:   8,0    A   W 102736992 + 8 <- (8,8) 33384
      
      bio_backmerge:
        kjournald-480   [000]   303.085086: block_bio_backmerge: 8,0 W 102737032 + 8 [kjournald]
        kjournald-480   [000]   303.085086:   8,0    M   W 102737032 + 8 [kjournald]
      
      getrq:
        kjournald-480   [000]   303.084974: block_getrq: 8,0 W 102736984 + 8 [kjournald]
        kjournald-480   [000]   303.084975:   8,0    G   W 102736984 + 8 [kjournald]
      
        bash-2066  [001]  1072.953770:   8,0    G   N [bash]
        bash-2066  [001]  1072.953773: block_getrq: 0,0 N 0 + 0 [bash]
      
      rq_complete:
        konsole-2065  [001]   300.053184: block_rq_complete: 8,0 W () 103669040 + 16 [0]
        konsole-2065  [001]   300.053191:   8,0    C   W 103669040 + 16 [0]
      
        ksoftirqd/1-7   [001]  1072.953811:   8,0    C   N (5a 00 08 00 00 00 00 00 24 00) [0]
        ksoftirqd/1-7   [001]  1072.953813: block_rq_complete: 0,0 N (5a 00 08 00 00 00 00 00 24 00) 0 + 0 [0]
      
      rq_insert:
        kjournald-480   [000]   303.084985: block_rq_insert: 8,0 W 0 () 102736984 + 8 [kjournald]
        kjournald-480   [000]   303.084986:   8,0    I   W 102736984 + 8 [kjournald]
      
      Changelog from v2 -> v3:
      
      - use the newly introduced __dynamic_array().
      
      Changelog from v1 -> v2:
      
      - use __string() instead of __array() to minimize the memory required
        to store hex dump of rq->cmd().
      
      - support large pc requests.
      
      - add missing blk_fill_rwbs_rq() in block_rq_requeue TRACE_EVENT.
      
      - some cleanups.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <4A2DF669.5070905@cn.fujitsu.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  3. 09 Jun, 2009 (1 commit)
  4. 05 Jun, 2009 (1 commit)
  5. 04 Jun, 2009 (2 commits)
  6. 29 May, 2009 (4 commits)
  7. 23 May, 2009 (1 commit)
  8. 22 May, 2009 (3 commits)
  9. 18 May, 2009 (3 commits)
    • [ARM] Double check memmap is actually valid with a memmap has unexpected holes V2 · eb33575c
      Mel Gorman committed
      pfn_valid() is meant to be able to tell if a given PFN has valid memmap
      associated with it or not. In FLATMEM, it is expected that holes always
      have valid memmap as long as there is valid PFNs either side of the hole.
      In SPARSEMEM, it is assumed that a valid section has a memmap for the
      entire section.
      
      However, ARM and maybe other embedded architectures in the future free
      memmap backing holes to save memory on the assumption the memmap is never
      used. The page_zone linkages are then broken even though pfn_valid()
      returns true. A walker of the full memmap must then do this additional
      check to ensure the memmap they are looking at is sane by making sure the
      zone and PFN linkages are still valid. This is expensive, but walkers of
      the full memmap are extremely rare.
      
      This was caught before for FLATMEM and hacked around but it hits again for
      SPARSEMEM because the page_zone linkages can look ok where the PFN linkages
      are totally screwed. This looks like a hatchet job but the reality is that
      any clean solution would end up consuming all the memory saved by punching
      these unexpected holes in the memmap. For example, we tried marking the
      memmap within the section invalid but the section size exceeds the size of
      the hole in most cases so pfn_valid() starts returning false where valid
      memmap exists. Shrinking the size of the section would increase memory
      consumption offsetting the gains.
      
      This patch identifies when an architecture is punching unexpected holes
      in the memmap that the memory model cannot automatically detect and sets
      ARCH_HAS_HOLES_MEMORYMODEL. At the moment, this is restricted to EP93xx
      which is the model sub-architecture this has been reported on but may expand
      later. When set, walkers of the full memmap must call memmap_valid_within()
      for each PFN and passing in what it expects the page and zone to be for
      that PFN. If it finds the linkages to be broken, it assumes the memmap is
      invalid for that PFN.
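      A sketch of what a full-memmap walker looks like with that check
      (illustrative):
      
        for (pfn = zone->zone_start_pfn; pfn < end_pfn; pfn++) {
                struct page *page;
        
                if (!pfn_valid(pfn))
                        continue;
                page = pfn_to_page(pfn);
                /* The backing memmap may have been freed even though
                 * pfn_valid() returned true; verify the page/zone linkages
                 * before touching the struct page. */
                if (!memmap_valid_within(pfn, page, zone))
                        continue;
                /* ... safe to inspect the page here ... */
        }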
      Signed-off-by: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
    • mm, x86: remove MEMORY_HOTPLUG_RESERVE related code · 888a589f
      Yinghai Lu committed
      after:
      
       | commit b263295d
       | Author: Christoph Lameter <clameter@sgi.com>
       | Date:   Wed Jan 30 13:30:47 2008 +0100
       |
       |    x86: 64-bit, make sparsemem vmemmap the only memory model
      
      we don't have MEMORY_HOTPLUG_RESERVE anymore.
      
      Historically, x86-64 had an architecture-specific method for memory hotplug
      whereby it scanned the SRAT for physical memory ranges that could be
      potentially used for memory hot-add later. By reserving those ranges
      without physical memory, the memmap would be allocated and left dormant
      until needed. This depended on the DISCONTIG memory model which has been
      removed so the code implementing HOTPLUG_RESERVE is now dead.
      
      This patch removes the dead code used by MEMORY_HOTPLUG_RESERVE.
      
      (Changelog authored by Mel.)
      
      v2: updated changelog, and remove hotadd= in doc
      
      [ Impact: remove dead code ]
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
      Reviewed-by: Mel Gorman <mel@csn.ul.ie>
      Workflow-found-OK-by: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <4A0C4910.7090508@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • page-writeback: fix the calculation of the oldest_jif in wb_kupdate() · 22ef37ee
      Toshiyuki Okajima committed
      The wb_kupdate() function has a bug in linux-2.6.30-rc5.  This bug causes
      generic_sync_sb_inodes() to start writing inodes back much earlier than
      expected, because it miscalculates oldest_jif in wb_kupdate().
      
      This bug was introduced in 704503d8
      ('mm: fix proc_dointvec_userhz_jiffies "breakage"').
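      The corrected computation looks roughly like this (a sketch;
      dirty_expire_interval is kept in hundredths of a second, so it has to be
      scaled to milliseconds before the jiffies conversion):
      
        /* Sketch of the fixed calculation in wb_kupdate() */
        oldest_jif = jiffies - msecs_to_jiffies(dirty_expire_interval * 10);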
      Signed-off-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  10. 15 May, 2009 (1 commit)
    • Revert "mm: add /proc controls for pdflush threads" · cd17cbfd
      Jens Axboe committed
      This reverts commit fafd688e.
      
      Work is progressing to switch away from pdflush as the process backing
      for flushing out dirty data. So it seems pointless to add more knobs
      to control pdflush threads. The original author of the patch did not
      have any specific use cases for adding the knobs, so we can easily
      revert this before 2.6.30 to avoid having to maintain this API
      forever.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
  11. 13 May, 2009 (1 commit)
  12. 08 May, 2009 (1 commit)
  13. 07 May, 2009 (3 commits)
    • nommu: make the initial mmap allocation excess behaviour Kconfig configurable · fc4d5c29
      David Howells committed
      NOMMU mmap() has an option controlled by a sysctl variable that determines
      whether the allocations made by do_mmap_private() should have the excess
      space trimmed off and returned to the allocator.  Make the initial setting
      of this variable a Kconfig configuration option.
      
      The reason there can be excess space is that the allocator only allocates
      in power-of-2 size chunks, but mmap()'s can be made in sizes that aren't a
      power of 2.
      
      There are two alternatives:
      
       (1) Keep the excess as dead space.  The dead space then remains unused for the
           lifetime of the mapping.  Mappings of shared objects such as libc, ld.so
           or busybox's text segment may retain their dead space forever.
      
       (2) Return the excess to the allocator.  This means that the dead space is
           limited to less than a page per mapping, but it means that for a transient
           process, there's more chance of fragmentation as the excess space may be
           reused fairly quickly.
      
      During the boot process, a lot of transient processes are created, and
      this can cause a lot of fragmentation as the pagecache and various slabs
      grow greatly during this time.
      
      By turning off the trimming of excess space during boot and disabling
      batching of frees, Coldfire can manage to boot.
      
      A better way of doing things might be to have /sbin/init turn this option
      off.  By that point libc, ld.so and init - which are all long-duration
      processes - have all been loaded and trimmed.
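      What the change amounts to is roughly this (a sketch; the Kconfig symbol
      name CONFIG_NOMMU_INITIAL_TRIM_EXCESS is assumed here, not quoted from
      the patch):
      
        /* mm/nommu.c sketch: the sysctl controlling whether excess space from
         * do_mmap_private() allocations is trimmed now takes its boot-time
         * default from Kconfig instead of being hard-coded. */
        int sysctl_nommu_trim_excess = CONFIG_NOMMU_INITIAL_TRIM_EXCESS;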
      Reported-by: Lanttor Guo <lanttor.guo@freescale.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Tested-by: Lanttor Guo <lanttor.guo@freescale.com>
      Cc: Greg Ungerer <gerg@snapgear.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • nommu: clamp zone_batchsize() to 0 under NOMMU conditions · 3a6be87f
      David Howells committed
      Clamp zone_batchsize() to 0 under NOMMU conditions to stop
      free_hot_cold_page() from queueing and batching frees.
      
      The problem is that under NOMMU conditions it is really important to be
      able to allocate large contiguous chunks of memory, but when munmap() or
      exit_mmap() releases big stretches of memory, return of these to the buddy
      allocator can be deferred, and when it does finally happen, it can be in
      small chunks.
      
      Whilst the fragmentation this incurs isn't so much of a problem under MMU
      conditions as userspace VM is glued together from individual pages with
      the aid of the MMU, it is a real problem if there isn't an MMU.
      
      By clamping the page freeing queue size to 0, pages are returned to the
      allocator immediately, and the buddy detector is more likely to be able to
      glue them together into large chunks immediately, and fragmentation is
      less likely to occur.
      
      By disabling batching of frees, and by turning off the trimming of excess
      space during boot, Coldfire can manage to boot.
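      A sketch of the change (illustrative; the MMU-side calculation is
      unchanged and elided here):
      
        static int zone_batchsize(struct zone *zone)
        {
        #ifdef CONFIG_MMU
                int batch = 0;
                /* ... existing per-CPU batch size calculation ... */
                return batch;
        #else
                /* NOMMU: never batch page frees; return pages to the buddy
                 * allocator immediately so large contiguous chunks can
                 * reform. */
                return 0;
        #endif
        }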
      Reported-by: Lanttor Guo <lanttor.guo@freescale.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Tested-by: Lanttor Guo <lanttor.guo@freescale.com>
      Cc: Greg Ungerer <gerg@snapgear.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: use rounddown_pow_of_two() in zone_batchsize() · 9155203a
      David Howells committed
      Use rounddown_pow_of_two(N) in zone_batchsize() rather than (1 <<
      (fls(N)-1)) as they are equivalent, and with the former it is easier to
      see what is going on.
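      The hunk in zone_batchsize() then reads roughly like this (a sketch):
      
        /* rounddown_pow_of_two(n) is equivalent to 1 << (fls(n) - 1) for
         * n > 0, but says what it means. */
        batch = rounddown_pow_of_two(batch + batch / 2) - 1;
        /* previously: batch = (1 << (fls(batch + batch / 2) - 1)) - 1; */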
      Signed-off-by: David Howells <dhowells@redhat.com>
      Tested-by: Lanttor Guo <lanttor.guo@freescale.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>