1. 22 Jun, 2005 1 commit
    • M
      [PATCH] VM: early zone reclaim · 753ee728
      Committed by Martin Hicks
      This is the core of the (much simplified) early reclaim.  The goal of this
      patch is to reclaim some easily-freed pages from a zone before falling back
      onto another zone.
      
      One of the major uses of this is NUMA machines.  With the default allocator
      behavior the allocator would look for memory in another zone, which might be
      off-node, before trying to reclaim from the current zone.
      
      This adds a zone tuneable to enable early zone reclaim.  It is selected on a
      per-zone basis and is turned on/off via syscall.
      
      Adding some extra throttling on the reclaim was also required (patch
      4/4).  Without it, the machine would grind to a crawl when doing a "make -j"
      kernel build.  Even with this patch the System Time is higher on
      average, but it seems tolerable.  Here are some numbers for kernbench
      runs on a 2-node, 4cpu, 8Gig RAM Altix in the "make -j" run:
      
                           wall  user   sys  %cpu  ctx sw.  sleeps
                           ----  ----   ---  ----  -------  ------
      No patch             1009  1384   847   258   298170  504402
      w/patch, no reclaim   880  1376   667   288   254064  396745
      w/patch & reclaim    1079  1385   926   252   291625  548873
      
      These numbers are the average of 2 runs of 3 "make -j" runs done right
      after system boot.  Run-to-run variability for "make -j" is huge, so
      these numbers aren't terribly useful except to see that with reclaim
      the benchmark still finishes in a reasonable amount of time.
      
      I also looked at the NUMA hit/miss stats for the "make -j" runs and the
      reclaim doesn't make any difference when the machine is thrashing away.
      
      Doing a "make -j8" on a single node that is filled with page cache pages
      takes 700 seconds with reclaim turned on and 735 seconds without reclaim
      (due to remote memory accesses).
      
      The simple zone_reclaim syscall program is at
      http://www.bork.org/~mort/sgi/zone_reclaim.c
      Signed-off-by: Martin Hicks <mort@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  2. 11 May, 2005 1 commit
    • D
      [IA64] Avoid .spillpsp directive in handcoded assembly · bfd68594
      Committed by David Mosberger-Tang
      Some time ago, GAS was fixed to bring the .spillpsp directive in line
      with the Intel assembler manual (there was some disagreement as to
      whether or not there is a built-in 16-byte offset).  Unfortunately,
      there are two places in the kernel where this directive is used in
      handwritten assembly files and those of course relied on the "buggy"
      behavior.  As a result, when using a "fixed" assembler, the kernel
      picks up the UNaT bits from the wrong place (off by 16) and randomly
      sets NaT bits on the scratch registers.  This can be noticed easily by
      looking at a coredump and finding various scratch registers with
      unexpected NaT values.  The patch below fixes this by using the
      .spillsp directive instead, which works correctly no matter what
      assembler is in use.
      Signed-off-by: David Mosberger-Tang <davidm@hpl.hp.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
  3. 04 May, 2005 1 commit
  4. 01 May, 2005 1 commit
  5. 26 Apr, 2005 2 commits
    • D
      [IA64] fix syscall-optimization goof · a37d98f6
      Committed by David Mosberger-Tang
      Sadly, I goofed in this syscall-tuning patch:
      
      ChangeSet 1.1966.1.40 2005/01/22 13:31:05 davidm@hpl.hp.com
        [IA64] Improve ia64_leave_syscall() for McKinley-type cores.
      
        Optimize ia64_leave_syscall() a bit better for McKinley-type cores.
        The patch looks big, but that's mostly due to renaming r16/r17 to r2/r3.
        Good for a 13 cycle improvement.
      
      The problem is that the size of the physical stacked registers was
      loaded into the wrong register (r3 instead of r17).  Since r17 by
      coincidence always had the value 1, this had the effect of turning
      rse_clear_invalid into a no-op.  That poses the risk of leaking kernel
      state back to user-land and is hence not acceptable.
      
      The fix below is simple, but unfortunately it costs us about 28 cycles
      in syscall overhead. ;-(
      
      Unfortunately, there isn't much we can do about that since those
      registers have to be cleared one way or another.
      
      	--david
      Signed-off-by: Tony Luck <tony.luck@intel.com>
    • D
      [IA64] speed up syscall path a bit more · 30325d17
      Committed by David Mosberger-Tang
      Recently I noticed that clearing ar.ssd/ar.csd right before srlz.d is
      causing significant stalling in the syscall path.  The patch below
      fixes that by moving the register-writes after srlz.d.  On a Madison,
      this drops break-based getpid() from 241 to 226 cycles (-15 cycles).
      Signed-off-by: David Mosberger-Tang <davidm@hpl.hp.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
  6. 17 Apr, 2005 1 commit
    • L
      Linux-2.6.12-rc2 · 1da177e4
      Committed by Linus Torvalds
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!
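
The "early zone reclaim" commit (753ee728) in the log above describes reclaiming easily-freed pages from a zone before the allocator falls back to another, possibly off-node, zone. The following is only a toy model of that idea, not the kernel code: the `Zone` class, the `reclaim_enabled` field, the `allocate` function and the `batch_limit` throttle are all hypothetical names chosen for illustration, and the throttle is a stand-in for the extra throttling the commit message mentions rather than a description of how the patch implements it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Zone:
    """Toy stand-in for a memory zone (all field names are hypothetical)."""
    name: str
    free_pages: int
    reclaimable_pages: int          # easily freed pages, e.g. clean page cache
    reclaim_enabled: bool = False   # models the per-zone tunable

    def early_reclaim(self, wanted: int, batch_limit: int) -> int:
        """Free at most min(wanted, batch_limit) easily reclaimable pages."""
        freed = min(wanted, batch_limit, self.reclaimable_pages)
        self.reclaimable_pages -= freed
        self.free_pages += freed
        return freed


def allocate(zonelist: List[Zone], pages: int,
             batch_limit: int = 32) -> Optional[str]:
    """Return the name of the zone that satisfied the request, or None.

    zonelist is assumed to be ordered with the local zone first.
    """
    for zone in zonelist:
        if zone.free_pages < pages and zone.reclaim_enabled:
            # Try to free just enough from this zone before falling back.
            zone.early_reclaim(pages - zone.free_pages, batch_limit)
        if zone.free_pages >= pages:
            zone.free_pages -= pages
            return zone.name
    return None
```

In this model, a local zone with 2 free pages and plenty of page cache satisfies an 8-page request locally when `reclaim_enabled` is set, and the same request goes to the next zone in the list when it is not. A request larger than `batch_limit` pages short still falls back, which mirrors the intent of keeping early reclaim cheap.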