1. 09 Jan, 2006 18 commits
    • [PATCH] Change maxaligned_in_smp alignment macros to internodealigned_in_smp macros · 22fc6ecc
      Ravikiran G Thirumalai committed
      ____cacheline_maxaligned_in_smp is currently used to align critical structures
      and avoid false sharing.  It uses per-arch L1_CACHE_SHIFT_MAX and people find
      L1_CACHE_SHIFT_MAX useless.
      
      However, we have been using ____cacheline_maxaligned_in_smp to align
      structures on the internode cacheline size.  As per Andi's suggestion, the
      following patch kills ____cacheline_maxaligned_in_smp and introduces
      INTERNODE_CACHE_SHIFT, which defaults to L1_CACHE_SHIFT for all arches.
      Arches needing L3/internode cacheline alignment can define
      INTERNODE_CACHE_SHIFT in the arch asm/cache.h.  The patch replaces
      ____cacheline_maxaligned_in_smp with ____cacheline_internodealigned_in_smp.
      
      With this patch, L1_CACHE_SHIFT_MAX can be killed.
      Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
      Signed-off-by: Shai Fultheim <shai@scalex86.org>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
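The defaulting scheme described in this commit can be sketched in userspace C. This is an illustration under assumptions, not the kernel's header: the value of `L1_CACHE_SHIFT`, the `internode_aligned` macro name, and `struct hot_counter` are made up for the example, and the real macro only takes effect on SMP builds.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed value for illustration; real values come from the arch asm/cache.h. */
#define L1_CACHE_SHIFT 6

/* An arch that needs L3/internode alignment would define this beforehand. */
#ifndef INTERNODE_CACHE_SHIFT
#define INTERNODE_CACHE_SHIFT L1_CACHE_SHIFT
#endif

#define INTERNODE_CACHE_BYTES (1 << INTERNODE_CACHE_SHIFT)

/* Hypothetical userspace stand-in for ____cacheline_internodealigned_in_smp. */
#define internode_aligned __attribute__((__aligned__(INTERNODE_CACHE_BYTES)))

/* Hypothetical hot structure that should not share a line with its neighbours. */
struct hot_counter {
    unsigned long value;
} internode_aligned;

/* The aligned attribute pads the type to a whole number of cache lines. */
static_assert(sizeof(struct hot_counter) % INTERNODE_CACHE_BYTES == 0,
              "hot_counter must fill whole internode cache lines");

size_t hot_counter_alignment(void)
{
    return _Alignof(struct hot_counter);
}

size_t hot_counter_size(void)
{
    return sizeof(struct hot_counter);
}
```

With the assumed shift of 6, both the alignment and the size of `struct hot_counter` come out as 64 bytes, so two adjacent instances never share a cache line.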
    • [PATCH] mempolicies: unexport get_vma_policy() · 48fce342
      Christoph Lameter committed
      Since the numa_maps functionality is now in mempolicy.c we no longer need to
      export get_vma_policy().
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Cc: Andi Kleen <ak@muc.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] set_page_count() macro safety · 152194aa
      Avishay Traeger committed
      Fix set_page_count() macro to handle complex arguments.
      Signed-off-by: Avishay Traeger <atraeger@cs.sunysb.edu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
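The class of bug this commit addresses, a macro argument used without parentheses, can be shown with a small standalone sketch. The macros and `struct page_stub` below are hypothetical stand-ins, not the kernel's definitions; the commit message does not show the actual macro body.

```c
#include <assert.h>

/* Hypothetical stand-in for a page with a stored reference count. */
struct page_stub {
    int count;
};

/* Unsafe: v is pasted in bare, so "a ? b : c" as an argument binds wrongly. */
#define SET_COUNT_UNSAFE(p, v) ((p)->count = v - 1)

/* Safe: parenthesizing v keeps a complex argument as one operand. */
#define SET_COUNT_SAFE(p, v) ((p)->count = (v) - 1)

int store_unsafe(int flag)
{
    struct page_stub pg = { 0 };

    /* Expands to: pg.count = flag ? 2 : (3 - 1) */
    SET_COUNT_UNSAFE(&pg, flag ? 2 : 3);
    return pg.count;
}

int store_safe(int flag)
{
    struct page_stub pg = { 0 };

    /* Expands to: pg.count = (flag ? 2 : 3) - 1 */
    SET_COUNT_SAFE(&pg, flag ? 2 : 3);
    return pg.count;
}

/* Self-check documenting the difference between the two expansions. */
void macro_safety_selfcheck(void)
{
    assert(store_safe(1) == 1);
    assert(store_unsafe(1) == 2);   /* the subtraction was silently skipped */
}
```

Only the ternary's true branch exposes the difference here: the false branch happens to evaluate to the same value in both macros, which is why this kind of bug can go unnoticed.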
    • [PATCH] cpusets: swap migration interface · 45b07ef3
      Paul Jackson committed
      Add a boolean "memory_migrate" to each cpuset, represented by a file
      containing "0" or "1" in each directory below /dev/cpuset.
      
      It defaults to false (file contains "0").  It can be set true by writing
      "1" to the file.
      
      If true, then anytime that a task is attached to the cpuset so marked, the
      pages of that task will be moved to that cpuset, preserving, to the extent
      practical, the cpuset-relative placement of the pages.
      
      Also, anytime a cpuset so marked has its memory placement changed (by
      writing to its "mems" file), the tasks in that cpuset will have their pages
      moved to the cpuset's new nodes, preserving, to the extent practical, the
      cpuset-relative placement of the moved pages.
      Signed-off-by: Paul Jackson <pj@sgi.com>
      Cc: Christoph Lameter <christoph@lameter.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] SwapMig: Extend parameters for migrate_pages() · d4984711
      Christoph Lameter committed
      Extend the parameters of migrate_pages() to allow the caller control over the
      fate of successfully migrated or impossible to migrate pages.
      
      Swap migration and direct migration will have the same interface after this
      patch so that patches can be independently applied to the policy layer and the
      core migration code.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Cc: Andi Kleen <ak@muc.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] SwapMig: add_to_swap() avoid atomic allocations · 1480a540
      Christoph Lameter committed
      Add gfp_mask to add_to_swap
      
      add_to_swap does allocations with GFP_ATOMIC in order not to interfere with
      swapping.  During migration we may have to use add_to_swap extensively, which
      may lead to out-of-memory errors.
      
      This patch makes add_to_swap take a parameter that specifies the gfp mask.
      The page migration code can then make add_to_swap use GFP_KERNEL.
      Signed-off-by: Hirokazu Takahashi <taka@valinux.co.jp>
      Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] SwapMig: CONFIG_MIGRATION fixes · 8419c318
      Christoph Lameter committed
      Move move_to_lru, putback_lru_pages and isolate_lru into the section guarded by
      CONFIG_MIGRATION, saving some code size for single-processor kernels.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Swap Migration V5: sys_migrate_pages interface · 39743889
      Christoph Lameter committed
      sys_migrate_pages implementation using swap based page migration
      
      This is the original API proposed by Ray Bryant in his posts during the first
      half of 2005 on linux-mm@kvack.org and linux-kernel@vger.kernel.org.
      
      The intent of sys_migrate_pages is to migrate the memory of a process.  A
      process may have migrated to another node, while its memory was allocated
      optimally for the prior context.  sys_migrate_pages allows shifting the
      memory to the new node.
      
      sys_migrate_pages is also useful for manually moving a process's memory if
      the process's available memory nodes have changed through cpuset operations.
      Paul Jackson is working on a mechanism that will migrate memory automatically
      if the cpuset of a process is changed.  However, a user may decide to control
      the migration manually.
      
      This implementation is put into the policy layer since it uses concepts and
      functions that are also needed for mbind and friends.  The patch also provides
      a do_migrate_pages function that may be useful for cpusets to automatically
      move memory.  sys_migrate_pages does not modify policies in contrast to Ray's
      implementation.
      
      The current code here is based on the swap based page migration capability and
      thus is not able to preserve the physical layout relative to its containing
      nodeset (which may be a cpuset).  When direct page migration becomes available,
      the implementation needs to be changed to do an isomorphic move of pages
      between different nodesets.  The current implementation simply evicts all
      pages in the source nodeset that are not in the target nodeset.
      
      Patch supports ia64, i386 and x86_64.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
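From userspace, the system call described above could be invoked roughly as follows. This is a sketch assuming a Linux system whose headers define `SYS_migrate_pages`; the helper names `single_node_mask` and `move_task_pages` are invented for the example, and the single-word node mask limits it to low node numbers.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Number of node bits that fit in the one-word masks used below. */
#define MASK_BITS (8 * sizeof(unsigned long))

/* Hypothetical helper: build a node mask with exactly one node set. */
unsigned long single_node_mask(unsigned int node)
{
    assert(node < MASK_BITS);
    return 1UL << node;
}

/*
 * Hypothetical wrapper: ask the kernel to move the pages of task `pid`
 * that currently sit on `from_node` over to `to_node`.
 * Returns the raw syscall result; -1 with errno set on failure.
 */
long move_task_pages(pid_t pid, unsigned int from_node, unsigned int to_node)
{
    unsigned long old_nodes = single_node_mask(from_node);
    unsigned long new_nodes = single_node_mask(to_node);

    return syscall(SYS_migrate_pages, pid, (unsigned long)MASK_BITS,
                   &old_nodes, &new_nodes);
}
```

A call such as `move_task_pages(pid, 0, 1)` would request that pages on node 0 be moved to node 1. On a machine without a second node, or without sufficient privileges for the target process, the call is expected to fail, so callers should check the return value and `errno`.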
    • [PATCH] Swap Migration V5: MPOL_MF_MOVE interface · dc9aa5b9
      Christoph Lameter committed
      Add page migration support via swap to the NUMA policy layer
      
      This patch adds page migration support to the NUMA policy layer.  An
      additional flag MPOL_MF_MOVE is introduced for mbind.  If MPOL_MF_MOVE is
      specified, then pages that do not conform to the memory policy will be evicted
      from memory.  When they get paged back in, new pages will be allocated
      following the NUMA policy.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Swap Migration V5: Add CONFIG_MIGRATION for page migration support · 7cbe34cf
      Christoph Lameter committed
      Include page migration if the system is NUMA or has a memory model that
      allows distinct areas of memory (SPARSEMEM, DISCONTIGMEM).
      
      And:
      - Only include lru_add_drain_per_cpu if building for an SMP system.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Swap Migration V5: migrate_pages() function · 49d2e9cc
      Christoph Lameter committed
      This adds the basic page migration function with a minimal implementation that
      only allows the eviction of pages to swap space.
      
      Page eviction and migration may be useful to migrate pages, to suspend
      programs or to remap single pages (useful for faulty pages or pages with
      soft ECC failures).
      
      The process is as follows:
      
      The function wanting to migrate pages must first build a list of pages to be
      migrated or evicted and take them off the lru lists via isolate_lru_page().
      isolate_lru_page determines that a page is freeable based on the LRU bit set.
      
      Then the actual migration or swapout can happen by calling migrate_pages().
      
      migrate_pages does its best to migrate or swap out the pages and does multiple
      passes over the list.  Some pages may only be swappable if they are not dirty.
      migrate_pages may start writing out dirty pages in the initial passes over
      the pages.  However, migrate_pages may not be able to migrate or evict all
      pages for a variety of reasons.
      
      The remaining pages may be returned to the LRU lists using putback_lru_pages().
      
      Changelog V4->V5:
      - Use the lru caches to return pages to the LRU
      
      Changelog V3->V4:
      - Restructure code so that applying patches to support full migration
        requires only minimal changes. Rename swapout_pages() to migrate_pages().
      
      Changelog V2->V3:
      - Extract common code from shrink_list() and swapout_pages()
      Signed-off-by: Mike Kravetz <kravetz@us.ibm.com>
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Cc: "Michael Kerrisk" <mtk-manpages@gmx.net>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Swap Migration V5: PF_SWAPWRITE to allow writing to swap · 930d9152
      Christoph Lameter committed
      Add PF_SWAPWRITE to control a process's permission to write to swap.
      
      - Use PF_SWAPWRITE in may_write_to_queue() instead of checking for kswapd
        and pdflush
      
      - Set PF_SWAPWRITE flag for kswapd and pdflush
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
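The idea of replacing identity checks (is this kswapd or pdflush?) with a per-task flag can be sketched as below. This is a simplified userspace model, not the kernel function: the flag value, `struct task_stub`, and the single `queue_congested` input are assumptions, and the real may_write_to_queue() may consider further conditions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative bit only; the real PF_SWAPWRITE value lives in linux/sched.h. */
#define PF_SWAPWRITE 0x01000000u

/* Hypothetical stand-in for the task's flags word. */
struct task_stub {
    unsigned int flags;
};

/*
 * Sketch of the permission check: a task carrying PF_SWAPWRITE may always
 * write to swap; any other task may write only if the queue is not congested.
 */
bool may_write_to_queue_sketch(const struct task_stub *task, bool queue_congested)
{
    assert(task != NULL);

    if (task->flags & PF_SWAPWRITE)
        return true;

    return !queue_congested;
}
```

Because the decision hangs on a flag rather than on which thread is running, other code paths (such as page migration) could in principle grant themselves the same permission by setting the flag for the duration of their work.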
    • [PATCH] Swap Migration V5: LRU operations · 21eac81f
      Christoph Lameter committed
      This is the start of the `swap migration' patch series.
      
      Swap migration allows the moving of the physical location of pages between
      nodes in a numa system while the process is running.  This means that the
      virtual addresses that the process sees do not change.  However, the system
      rearranges the physical location of those pages.
      
      The main intent of page migration patches here is to reduce the latency of
      memory access by moving pages near to the processor where the process
      accessing that memory is running.
      
      The patchset allows a process to manually relocate the node on which its
      pages are located through the MF_MOVE and MF_MOVE_ALL options while
      setting a new memory policy.
      
      The pages of a process can also be relocated from another process using the
      sys_migrate_pages() function call, which requires CAP_SYS_ADMIN.  The
      migrate_pages function call takes two sets of nodes and moves the pages of a
      process that are located on the source nodes to the destination nodes.
      
      Manual migration is very useful if for example the scheduler has relocated a
      process to a processor on a distant node.  A batch scheduler or an
      administrator can detect the situation and move the pages of the process
      nearer to the new processor.
      
      sys_migrate_pages() could be used on non-NUMA machines as well, to force all
      of a particular process's pages out to swap, if someone thinks that's useful.
      
      Larger installations usually partition the system using cpusets into sections
      of nodes.  Paul has equipped cpusets with the ability to move pages when a
      task is moved to another cpuset.  This allows automatic control over locality
      of a process.  If a task is moved to a new cpuset then also all its pages are
      moved with it so that the performance of the process does not sink
      dramatically (as is the case today).
      
      Swap migration works by simply evicting the page.  The pages must be faulted
      back in.  The pages are then typically reallocated by the system near the node
      where the process is executing.
      
      For swap migration the destination of the move is controlled by the allocation
      policy.  Cpusets set the allocation policy before calling sys_migrate_pages()
      in order to move the pages as intended.
      
      No allocation policy changes are performed for sys_migrate_pages().  This
      means that the pages may not be faulted in to the specified nodes if no
      allocation policy was set by other means.  The pages will just end up near the
      node where the fault occurred.
      
      There's another patch series in the pipeline which implements "direct
      migration".
      
      The direct migration patchset extends the migration functionality to avoid
      going through swap.  The destination node of the relocation is controllable
      during the actual moving of pages.  The crutch of using the allocation policy
      to relocate is not necessary and the pages are moved directly to the target.
      It's also faster since swap is not used.
      
      And sys_migrate_pages() can then move pages directly to the specified node.
      
      This patch:
      
      Implement functions to isolate pages from the LRU and put them back later.
      
      An earlier implementation was provided by Hirokazu Takahashi
      <taka@valinux.co.jp> and IWAMOTO Toshihiro <iwamoto@valinux.co.jp> for the
      memory hotplug project.
      
      From: Magnus
      
      This breaks out isolate_lru_page() and putback_lru_page().  Needed for swap
      migration.
      Signed-off-by: Magnus Damm <magnus.damm@gmail.com>
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] add schedule_on_each_cpu() · 15316ba8
      Christoph Lameter committed
      swap migration's isolate_lru_page() currently uses an IPI to notify other
      processors that the lru caches need to be drained if the page cannot be
      found on the LRU.  The IPI interrupt may interrupt a processor that is just
      processing lru requests and cause a race condition.
      
      This patch introduces a new function schedule_on_each_cpu() that uses
      keventd to run the LRU draining on each processor.  Processors disable
      preemption when dealing with the LRU caches (these are per processor) and thus
      executing LRU draining from another process is safe.
      
      Thanks to Lee Schermerhorn <lee.schermerhorn@hp.com> for finding this race
      condition.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Make high and batch sizes of per_cpu_pagelists configurable · 8ad4b1fb
      Rohit Seth committed
      Recently there has been a lot of traffic about the right values for the batch
      and high water marks of per_cpu_pagelists.  This patch makes these two
      variables configurable through a /proc interface.
      
      A new tunable /proc/sys/vm/percpu_pagelist_fraction is added.  This entry
      controls the fraction of pages at most in each zone that are allocated for
      each per cpu page list.  The min value for this is 8.  It means that we
      don't allow more than 1/8th of pages in each zone to be allocated in any
      single per_cpu_pagelist.
      
      The batch value of each per cpu pagelist is also updated as a result.  It
      is set to pcp->high/4.  The upper limit of batch is (PAGE_SHIFT * 8).
      Signed-off-by: Rohit Seth <rohit.seth@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
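The arithmetic described in this commit message can be worked through with a small sketch. It follows only what the message states (high is the zone size divided by the fraction, batch is high/4 capped at PAGE_SHIFT * 8, minimum fraction 8); the function name, the struct, and the assumed `PAGE_SHIFT` of 12 are illustrative and not taken from the kernel source.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT 12                      /* assumed: 4 KiB pages */
#define MIN_PERCPU_PAGELIST_FRACTION 8     /* minimum stated in the commit */

/* Hypothetical container for the two per-cpu pagelist limits. */
struct pcp_limits {
    unsigned long high;
    unsigned long batch;
};

/*
 * Derive high and batch for one zone from the tunable fraction.
 * Returns 0 on success, -1 if the fraction is below the allowed minimum.
 */
int pcp_limits_from_fraction(unsigned long zone_pages, unsigned long fraction,
                             struct pcp_limits *out)
{
    assert(out != NULL);

    if (fraction < MIN_PERCPU_PAGELIST_FRACTION)
        return -1;

    out->high = zone_pages / fraction;     /* at most 1/fraction of the zone */
    out->batch = out->high / 4;
    if (out->batch > PAGE_SHIFT * 8)       /* upper limit of batch */
        out->batch = PAGE_SHIFT * 8;

    return 0;
}
```

For a zone of 262144 pages (1 GiB with the assumed page size) and a fraction of 8, this gives high = 32768 and a batch that hits the cap of 96. For a small zone of 1024 pages, high = 128 and batch = 32.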
    • [PATCH] drop-pagecache · 9d0243bc
      Andrew Morton committed
      Add /proc/sys/vm/drop_caches.  When written to, this will cause the kernel to
      discard as much pagecache and/or reclaimable slab objects as it can.  This
      operation requires root permissions.
      
      It won't drop dirty data, so the user should run `sync' first.
      
      Caveats:
      
      a) Holds inode_lock for exorbitant amounts of time.
      
      b) Needs to be taught about NUMA nodes: propagate these all the way through
         so the discarding can be controlled on a per-node basis.
      
      This is a debugging feature: useful for getting consistent results between
      filesystem benchmarks.  We could possibly put it under a config option, but
      it's less than 300 bytes.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
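A benchmark harness might use the new file roughly as sketched below. The helper name `request_cache_drop` is invented, and the meaning of the written values (1 for pagecache, 2 for reclaimable slab objects such as dentries and inodes, 3 for both) is taken from the mainline sysctl documentation rather than from this commit message, so treat it as an assumption here. The path is a parameter so the helper can be exercised against an ordinary file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Write a drop request to `path` (normally /proc/sys/vm/drop_caches).
 * mode: 1 = pagecache, 2 = reclaimable slab objects, 3 = both (assumed
 * semantics, see lead-in). Returns 0 on success, -1 on any failure.
 */
int request_cache_drop(const char *path, int mode)
{
    FILE *f;

    assert(path != NULL);

    if (mode < 1 || mode > 3)
        return -1;

    /* Dirty data is not dropped, so flush it to disk first. */
    sync();

    f = fopen(path, "w");
    if (f == NULL)
        return -1;          /* e.g. not root, or file not present */

    if (fprintf(f, "%d\n", mode) < 0) {
        fclose(f);
        return -1;
    }

    return fclose(f) == 0 ? 0 : -1;
}
```

Calling `request_cache_drop("/proc/sys/vm/drop_caches", 3)` as root before each benchmark run would give every run a cold cache, which is the use case the commit describes.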
    • [PATCH] slab: remove unused align parameter from alloc_percpu · f9f75005
      Pekka Enberg committed
      __alloc_percpu and alloc_percpu both take an 'align' argument which is
      completely ignored.  snmp6_mib_init() in net/ipv6/af_inet6.c attempts to use
      it, but it has no effect.  Therefore, remove the 'align' argument and fix up
      the lone caller.
      Signed-off-by: Matthew Dobson <colpatch@us.ibm.com>
      Acked-by: Manfred Spraul <manfred@colorfullife.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Fix compilation with CONFIG_MEMORY_HOTPLUG=y and gcc41. · b792de39
      Olaf Hering committed
      Fix compilation with CONFIG_MEMORY_HOTPLUG=y and gcc41.
      Also remove unneeded declarations and add a public function.
      
      drivers/base/memory.c:53: error: static declaration of 'register_memory_notifier' follows non-static declaration
      include/linux/memory.h:85: error: previous declaration of 'register_memory_notifier' was here
      drivers/base/memory.c:58: error: static declaration of 'unregister_memory_notifier' follows non-static declaration
      include/linux/memory.h:86: error: previous declaration of 'unregister_memory_notifier' was here
      drivers/base/memory.c:68: error: static declaration of 'register_memory' follows non-static declaration
      include/linux/memory.h:73: error: previous declaration of 'register_memory' was here
      Signed-off-by: Olaf Hering <olh@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  2. 08 Jan, 2006 7 commits
  3. 07 Jan, 2006 15 commits