1. 30 10月, 2005 2 次提交
    • H
      [PATCH] mm: update_hiwaters just in time · 365e9c87
      Hugh Dickins 提交于
      update_mem_hiwater has attracted various criticisms, in particular from those
      concerned with mm scalability.  Originally it was called whenever rss or
      total_vm got raised.  Then many of those callsites were replaced by a timer
      tick call from account_system_time.  Now Frank van Maarseveen reports that to
      be found inadequate.  How about this?  Works for Frank.
      
      Replace update_mem_hiwater, a poor combination of two unrelated ops, by macros
      update_hiwater_rss and update_hiwater_vm.  Don't attempt to keep
      mm->hiwater_rss up to date at timer tick, nor every time we raise rss (usually
      by 1): those are hot paths.  Do the opposite, update only when about to lower
      rss (usually by many), or just before final accounting in do_exit.  Handle
      mm->hiwater_vm in the same way, though it's much less of an issue.  Demand
      that whoever collects these hiwater statistics do the work of taking the
      maximum with rss or total_vm.
      
      And there has been no collector of these hiwater statistics in the tree.  The
      new convention needs an example, so match Frank's usage by adding a VmPeak
      line above VmSize to /proc/<pid>/status, and also a VmHWM line above VmRSS
      (High-Water-Mark or High-Water-Memory).
      
      There was a particular anomaly during mremap move, that hiwater_vm might be
      captured too high.  A fleeting such anomaly remains, but it's quickly
      corrected now, whereas before it would stick.
      
      What locking?  None: if the app is racy then these statistics will be racy,
      it's not worth any overhead to make them exact.  But whenever it suits,
      hiwater_vm is updated under exclusive mmap_sem, and hiwater_rss under
      page_table_lock (for now) or with preemption disabled (later on): without
      going to any trouble, minimize the time between reading current values and
      updating, to minimize those occasions when a racing thread bumps a count up
      and back down in between.
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      365e9c87
    • H
      [PATCH] mm: rss = file_rss + anon_rss · 4294621f
      Hugh Dickins 提交于
      I was lazy when we added anon_rss, and chose to change as few places as
      possible.  So currently each anonymous page has to be counted twice, in rss
      and in anon_rss.  Which won't be so good if those are atomic counts in some
      configurations.
      
      Change that around: keep file_rss and anon_rss separately, and add them
      together (with get_mm_rss macro) when the total is needed - reading two
      atomics is much cheaper than updating two atomics.  And update anon_rss
      upfront, typically in memory.c, not tucked away in page_add_anon_rmap.
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      4294621f
  2. 09 10月, 2005 1 次提交
  3. 12 9月, 2005 1 次提交
    • G
      [PATCH] uclinux: add NULL check, 0 end valid check and some more exports to nommu.c · 66aa2b4b
      Greg Ungerer 提交于
      Move call to get_mm_counter() in update_mem_hiwater() to be
      inside the check for tsk->mm being null. Otherwise you can be
      following a null pointer here. This patch submitted by
      Javier Herrero <jherrero@hvsistemas.es>.
      
      Modify the end check for munmap regions to allow for the
      legacy behavior of 0 being valid. Pretty much all current
      uClinux system libc malloc's pass in 0 as the end point.
      A hard check will fail on these, so change the check so
      that if it is non-zero it must be valid otherwise it fails.
      A passed in value will always succeed (as it used too).
      
      Also export a few more mm system functions - to be consistent
      with the VM code exports.
      Signed-off-by: NGreg Ungerer <gerg@uclinux.com>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      66aa2b4b
  4. 05 8月, 2005 1 次提交
    • S
      [PATCH] __vm_enough_memory() signedness fix · 2f60f8d3
      Simon Derr 提交于
      We have found what seems to be a small bug in __vm_enough_memory() when
      sysctl_overcommit_memory is set to OVERCOMMIT_NEVER.
      
      When this bug occurs the systems fails to boot, with /sbin/init whining
      about fork() returning ENOMEM.
      
      We hunted down the problem to this:
      
      The deferred update mecanism used in vm_acct_memory(), on a SMP system,
      allows the vm_committed_space counter to have a negative value.
      
      This should not be a problem since this counter is known to be inaccurate.
      
      But in __vm_enough_memory() this counter is compared to the `allowed'
      variable, which is an unsigned long.  This comparison is broken since it
      will consider the negative values of vm_committed_space to be huge positive
      values, resulting in a memory allocation failure.
      
      Signed-off-by: <Jean-Marc.Saffroy@ext.bull.net>
      Signed-off-by: <Simon.Derr@bull.net>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      2f60f8d3
  5. 22 6月, 2005 1 次提交
    • W
      [PATCH] Avoiding mmap fragmentation · 1363c3cd
      Wolfgang Wander 提交于
      Ingo recently introduced a great speedup for allocating new mmaps using the
      free_area_cache pointer which boosts the specweb SSL benchmark by 4-5% and
      causes huge performance increases in thread creation.
      
      The downside of this patch is that it does lead to fragmentation in the
      mmap-ed areas (visible via /proc/self/maps), such that some applications
      that work fine under 2.4 kernels quickly run out of memory on any 2.6
      kernel.
      
      The problem is twofold:
      
        1) the free_area_cache is used to continue a search for memory where
           the last search ended.  Before the change new areas were always
           searched from the base address on.
      
           So now new small areas are cluttering holes of all sizes
           throughout the whole mmap-able region whereas before small holes
           tended to close holes near the base leaving holes far from the base
           large and available for larger requests.
      
        2) the free_area_cache also is set to the location of the last
           munmap-ed area so in scenarios where we allocate e.g.  five regions of
           1K each, then free regions 4 2 3 in this order the next request for 1K
           will be placed in the position of the old region 3, whereas before we
           appended it to the still active region 1, placing it at the location
           of the old region 2.  Before we had 1 free region of 2K, now we only
           get two free regions of 1K -> fragmentation.
      
      The patch addresses thes issues by introducing yet another cache descriptor
      cached_hole_size that contains the largest known hole size below the
      current free_area_cache.  If a new request comes in the size is compared
      against the cached_hole_size and if the request can be filled with a hole
      below free_area_cache the search is started from the base instead.
      
      The results look promising: Whereas 2.6.12-rc4 fragments quickly and my
      (earlier posted) leakme.c test program terminates after 50000+ iterations
      with 96 distinct and fragmented maps in /proc/self/maps it performs nicely
      (as expected) with thread creation, Ingo's test_str02 with 20000 threads
      requires 0.7s system time.
      
      Taking out Ingo's patch (un-patch available per request) by basically
      deleting all mentions of free_area_cache from the kernel and starting the
      search for new memory always at the respective bases we observe: leakme
      terminates successfully with 11 distinctive hardly fragmented areas in
      /proc/self/maps but thread creating is gringdingly slow: 30+s(!) system
      time for Ingo's test_str02 with 20000 threads.
      
      Now - drumroll ;-) the appended patch works fine with leakme: it ends with
      only 7 distinct areas in /proc/self/maps and also thread creation seems
      sufficiently fast with 0.71s for 20000 threads.
      Signed-off-by: NWolfgang Wander <wwc@rentec.com>
      Credit-to: "Richard Purdie" <rpurdie@rpsys.net>
      Signed-off-by: NKen Chen <kenneth.w.chen@intel.com>
      Acked-by: Ingo Molnar <mingo@elte.hu> (partly)
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      1363c3cd
  6. 17 5月, 2005 1 次提交
  7. 17 4月, 2005 1 次提交
    • L
      Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds 提交于
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!
      1da177e4