1. 05 9月, 2005 10 次提交
    • C
      [PATCH] /proc/<pid>/numa_maps to show on which nodes pages reside · 6e21c8f1
      Christoph Lameter 提交于
      This patch was recently discussed on linux-mm:
      http://marc.theaimsgroup.com/?t=112085728500002&r=1&w=2
      
      I inherited a large code base from Ray for page migration.  There was a
      small patch in there that I find to be very useful since it allows the
      display of the locality of the pages in use by a process.  I reworked that
      patch and came up with a /proc/<pid>/numa_maps that gives more information
      about the vma's of a process.  numa_maps is indexes by the start address
      found in /proc/<pid>/maps.  F.e.  with this patch you can see the page use
      of the "getty" process:
      
      margin:/proc/12008 # cat maps
      00000000-00004000 r--p 00000000 00:00 0
      2000000000000000-200000000002c000 r-xp 00000000 08:04 516                /lib/ld-2.3.3.so
      2000000000038000-2000000000040000 rw-p 00028000 08:04 516                /lib/ld-2.3.3.so
      2000000000040000-2000000000044000 rw-p 2000000000040000 00:00 0
      2000000000058000-2000000000260000 r-xp 00000000 08:04 54707842           /lib/tls/libc.so.6.1
      2000000000260000-2000000000268000 ---p 00208000 08:04 54707842           /lib/tls/libc.so.6.1
      2000000000268000-2000000000274000 rw-p 00200000 08:04 54707842           /lib/tls/libc.so.6.1
      2000000000274000-2000000000280000 rw-p 2000000000274000 00:00 0
      2000000000280000-20000000002b4000 r--p 00000000 08:04 9126923            /usr/lib/locale/en_US.utf8/LC_CTYPE
      2000000000300000-2000000000308000 r--s 00000000 08:04 60071467           /usr/lib/gconv/gconv-modules.cache
      2000000000318000-2000000000328000 rw-p 2000000000318000 00:00 0
      4000000000000000-4000000000008000 r-xp 00000000 08:04 29576399           /sbin/mingetty
      6000000000004000-6000000000008000 rw-p 00004000 08:04 29576399           /sbin/mingetty
      6000000000008000-600000000002c000 rw-p 6000000000008000 00:00 0          [heap]
      60000fff7fffc000-60000fff80000000 rw-p 60000fff7fffc000 00:00 0
      60000ffffff44000-60000ffffff98000 rw-p 60000ffffff44000 00:00 0          [stack]
      a000000000000000-a000000000020000 ---p 00000000 00:00 0                  [vdso]
      
      cat numa_maps
      2000000000000000 default MaxRef=43 Pages=11 Mapped=11 N0=4 N1=3 N2=2 N3=2
      2000000000038000 default MaxRef=1 Pages=2 Mapped=2 Anon=2 N0=2
      2000000000040000 default MaxRef=1 Pages=1 Mapped=1 Anon=1 N0=1
      2000000000058000 default MaxRef=43 Pages=61 Mapped=61 N0=14 N1=15 N2=16 N3=16
      2000000000268000 default MaxRef=1 Pages=2 Mapped=2 Anon=2 N0=2
      2000000000274000 default MaxRef=1 Pages=3 Mapped=3 Anon=3 N0=3
      2000000000280000 default MaxRef=8 Pages=3 Mapped=3 N0=3
      2000000000300000 default MaxRef=8 Pages=2 Mapped=2 N0=2
      2000000000318000 default MaxRef=1 Pages=1 Mapped=1 Anon=1 N2=1
      4000000000000000 default MaxRef=6 Pages=2 Mapped=2 N1=2
      6000000000004000 default MaxRef=1 Pages=1 Mapped=1 Anon=1 N0=1
      6000000000008000 default MaxRef=1 Pages=1 Mapped=1 Anon=1 N0=1
      60000fff7fffc000 default MaxRef=1 Pages=1 Mapped=1 Anon=1 N0=1
      60000ffffff44000 default MaxRef=1 Pages=1 Mapped=1 Anon=1 N0=1
      
      getty uses ld.so.  The first vma is the code segment which is used by 43
      other processes and the pages are evenly distributed over the 4 nodes.
      
      The second vma is the process specific data portion for ld.so.  This is
      only one page.
      
      The display format is:
      
      <startaddress>	 Links to information in /proc/<pid>/map
      <memory policy>  This can be "default" "interleave={}", "prefer=<node>" or "bind={<zones>}"
      MaxRef=		<maximum reference to a page in this vma>
      Pages=		<Nr of pages in use>
      Mapped=		<Nr of pages with mapcount >
      Anon=		<nr of anonymous pages>
      Nx=		<Nr of pages on Node x>
      
      The content of the proc-file is self-evident.  If this would be tied into
      the sparsemem system then the contents of this file would not be too
      useful.
      Signed-off-by: NChristoph Lameter <clameter@sgi.com>
      Cc: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      6e21c8f1
    • H
      [PATCH] swap: swap_lock replace list+device · 5d337b91
      Hugh Dickins 提交于
      The idea of a swap_device_lock per device, and a swap_list_lock over them all,
      is appealing; but in practice almost every holder of swap_device_lock must
      already hold swap_list_lock, which defeats the purpose of the split.
      
      The only exceptions have been swap_duplicate, valid_swaphandles and an
      untrodden path in try_to_unuse (plus a few places added in this series).
      valid_swaphandles doesn't show up high in profiles, but swap_duplicate does
      demand attention.  However, with the hold time in get_swap_pages so much
      reduced, I've not yet found a load and set of swap device priorities to show
      even swap_duplicate benefitting from the split.  Certainly the split is mere
      overhead in the common case of a single swap device.
      
      So, replace swap_list_lock and swap_device_lock by spinlock_t swap_lock
      (generally we seem to prefer an _ in the name, and not hide in a macro).
      
      If someone can show a regression in swap_duplicate, then probably we should
      add a hashlock for the swap_map entries alone (shorts being anatomic), so as
      to help the case of the single swap device too.
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      5d337b91
    • H
      [PATCH] swap: scan_swap_map drop swap_device_lock · 52b7efdb
      Hugh Dickins 提交于
      get_swap_page has often shown up on latency traces, doing lengthy scans while
      holding two spinlocks.  swap_list_lock is already dropped, now scan_swap_map
      drop swap_device_lock before scanning the swap_map.
      
      While scanning for an empty cluster, don't worry that racing tasks may
      allocate what was free and free what was allocated; but when allocating an
      entry, check it's still free after retaking the lock.  Avoid dropping the lock
      in the expected common path.  No barriers beyond the locks, just let the
      cookie crumble; highest_bit limit is volatile, but benign.
      
      Guard against swapoff: must check SWP_WRITEOK before allocating, must raise
      SWP_SCANNING reference count while in scan_swap_map, swapoff wait for that to
      fall - just use schedule_timeout, we don't want to burden scan_swap_map
      itself, and it's very unlikely that anyone can really still be in
      scan_swap_map once swapoff gets this far.
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      52b7efdb
    • H
      [PATCH] swap: swap unsigned int consistency · 6eb396dc
      Hugh Dickins 提交于
      The swap header's unsigned int last_page determines the range of swap pages,
      but swap_info has been using int or unsigned long in some cases: use unsigned
      int throughout (except, in several places a local unsigned long is useful to
      avoid overflows when adding).
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Signed-off-by: NJens Axboe <axboe@suse.de>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      6eb396dc
    • H
      [PATCH] swap: show span of swap extents · 53092a74
      Hugh Dickins 提交于
      The "Adding %dk swap" message shows the number of swap extents, as a guide to
      how fragmented the swapfile may be.  But a useful further guide is what total
      extent they span across (sometimes scarily large).
      
      And there's no need to keep nr_extents in swap_info: it's unused after the
      initial message, so save a little space by keeping it on stack.
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      53092a74
    • H
      [PATCH] swap: swap extent list is ordered · 11d31886
      Hugh Dickins 提交于
      There are several comments that swap's extent_list.prev points to the lowest
      extent: that's not so, it's extent_list.next which points to it, as you'd
      expect.  And a couple of loops in add_swap_extent which go all the way through
      the list, when they should just add to the other end.
      
      Fix those up, and let map_swap_page search the list forwards: profiles shows
      it to be twice as quick that way - because prefetch works better on how the
      structs are typically kmalloc'ed?  or because usually more is written to than
      read from swap, and swap is allocated ascendingly?
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      11d31886
    • S
      [PATCH] mm: consolidate get_order · fd4fd5aa
      Stephen Rothwell 提交于
      Someone mentioned that almost all the architectures used basically the same
      implementation of get_order.  This patch consolidates them into
      asm-generic/page.h and includes that in the appropriate places.  The
      exceptions are ia64 and ppc which have their own (presumably optimised)
      versions.
      Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      fd4fd5aa
    • D
      [PATCH] sparsemem extreme: hotplug preparation · 28ae55c9
      Dave Hansen 提交于
      This splits up sparse_index_alloc() into two pieces.  This is needed
      because we'll allocate the memory for the second level in a different place
      from where we actually consume it to keep the allocation from happening
      underneath a lock
      Signed-off-by: NDave Hansen <haveblue@us.ibm.com>
      Signed-off-by: NBob Picco <bob.picco@hp.com>
      Cc: Andy Whitcroft <apw@shadowen.org>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      28ae55c9
    • B
      [PATCH] sparsemem extreme implementation · 3e347261
      Bob Picco 提交于
      With cleanups from Dave Hansen <haveblue@us.ibm.com>
      
      SPARSEMEM_EXTREME makes mem_section a one dimensional array of pointers to
      mem_sections.  This two level layout scheme is able to achieve smaller
      memory requirements for SPARSEMEM with the tradeoff of an additional shift
      and load when fetching the memory section.  The current SPARSEMEM
      implementation is a one dimensional array of mem_sections which is the
      default SPARSEMEM configuration.  The patch attempts isolates the
      implementation details of the physical layout of the sparsemem section
      array.
      
      SPARSEMEM_EXTREME requires bootmem to be functioning at the time of
      memory_present() calls.  This is not always feasible, so architectures
      which do not need it may allocate everything statically by using
      SPARSEMEM_STATIC.
      Signed-off-by: NAndy Whitcroft <apw@shadowen.org>
      Signed-off-by: NBob Picco <bob.picco@hp.com>
      Signed-off-by: NDave Hansen <haveblue@us.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      3e347261
    • B
      [PATCH] SPARSEMEM EXTREME · 802f192e
      Bob Picco 提交于
      A new option for SPARSEMEM is ARCH_SPARSEMEM_EXTREME.  Architecture
      platforms with a very sparse physical address space would likely want to
      select this option.  For those architecture platforms that don't select the
      option, the code generated is equivalent to SPARSEMEM currently in -mm.
      I'll be posting a patch on ia64 ml which uses this new SPARSEMEM feature.
      
      ARCH_SPARSEMEM_EXTREME makes mem_section a one dimensional array of
      pointers to mem_sections.  This two level layout scheme is able to achieve
      smaller memory requirements for SPARSEMEM with the tradeoff of an
      additional shift and load when fetching the memory section.  The current
      SPARSEMEM -mm implementation is a one dimensional array of mem_sections
      which is the default SPARSEMEM configuration.  The patch attempts isolates
      the implementation details of the physical layout of the sparsemem section
      array.
      
      ARCH_SPARSEMEM_EXTREME depends on 64BIT and is by default boolean false.
      
      I've boot tested under aim load ia64 configured for ARCH_SPARSEMEM_EXTREME.
       I've also boot tested a 4 way Opteron machine with !ARCH_SPARSEMEM_EXTREME
      and tested with aim.
      Signed-off-by: NAndy Whitcroft <apw@shadowen.org>
      Signed-off-by: NBob Picco <bob.picco@hp.com>
      Signed-off-by: NDave Hansen <haveblue@us.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      802f192e
  2. 02 9月, 2005 2 次提交
  3. 01 9月, 2005 4 次提交
  4. 31 8月, 2005 1 次提交
  5. 30 8月, 2005 23 次提交