1. 29 Dec 2008, 1 commit
2. 20 Dec 2008, 4 commits
3. 19 Dec 2008, 3 commits
4. 17 Dec 2008, 2 commits
• x86: consolidate __swp_XXX() macros · 1796316a
  Authored by Jan Beulich
Impact: cleanup, code robustness
      
      The __swp_...() macros silently relied upon which bits are used for
      _PAGE_FILE and _PAGE_PROTNONE. After having changed _PAGE_PROTNONE in
      our Xen kernel to no longer overlap _PAGE_PAT, live locks and crashes
      were reported that could have been avoided if these macros properly
      used the symbolic constants. Since, as pointed out earlier, for Xen
      Dom0 support mainline likewise will need to eliminate the conflict
      between _PAGE_PAT and _PAGE_PROTNONE, this patch does all the necessary
      adjustments, plus it introduces a mechanism to check consistency
      between MAX_SWAPFILES_SHIFT and the actual encoding macros.
      
      This also fixes a latent bug in that x86-64 used a 6-bit mask in
      __swp_type(), and if MAX_SWAPFILES_SHIFT was increased beyond 5 in (the
      seemingly unrelated) linux/swap.h, this would have resulted in a
      collision with _PAGE_FILE.
      
      Non-PAE 32-bit code gets similarly adjusted for its pte_to_pgoff() and
      pgoff_to_pte() calculations.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
      1796316a
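A rough sketch of the consistency mechanism described above (the names are illustrative, not necessarily the exact macros the patch adds): a compile-time assertion ties the number of type bits available in the swap-entry encoding to MAX_SWAPFILES_SHIFT, so that growing one without the other breaks the build instead of silently colliding with _PAGE_FILE.

    /* Hedged sketch; illustrative names, shown as a header fragment. */

    /* Bits usable for the swap type in a non-present PTE, assuming the
     * encoding reserves the bits between _PAGE_BIT_PRESENT and
     * _PAGE_BIT_FILE for the type field. */
    #define SWP_TYPE_BITS  (_PAGE_BIT_FILE - _PAGE_BIT_PRESENT - 1)

    /* Fail the build if linux/swap.h ever asks for more type bits than
     * the PTE encoding can represent. */
    #define MAX_SWAPFILES_CHECK() \
            BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > SWP_TYPE_BITS)

    /* The type extractor then masks with the symbolic width instead of
     * a hard-coded 5- or 6-bit constant. */
    #define __swp_type(x)  (((x).val >> (_PAGE_BIT_PRESENT + 1)) & \
                            ((1U << SWP_TYPE_BITS) - 1))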
• mm: Don't touch uninitialized variable in do_pages_stat_array() · c095adbc
  Authored by KOSAKI Motohiro
Commit 80bba129 removed a necessary variable initialization.  As a
result, the following warning appeared:
      
          CC      mm/migrate.o
        mm/migrate.c: In function 'sys_move_pages':
        mm/migrate.c:1001: warning: 'err' may be used uninitialized in this function
      
Worse, if find_vma() failed, the kernel read uninitialized memory.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Brice Goglin <Brice.Goglin@inria.fr>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c095adbc
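A minimal sketch of the bug class (the shape is simplified from mm/migrate.c, not the literal patch): if the early-exit path runs before err is ever assigned, the status value copied back to userspace is stack garbage; initializing err before the lookup closes the hole.

    /* Hedged sketch of the do_pages_stat_array() bug shape; names follow
     * the kernel code but this is not the literal diff. */
    static void stat_one_page(struct mm_struct *mm, unsigned long addr,
                              int *status)
    {
            struct vm_area_struct *vma;
            int err = -EFAULT;       /* the fix: give err a defined value */

            vma = find_vma(mm, addr);
            if (!vma)
                    goto set_status; /* err would be garbage without the init */

            /* ... page lookup normally sets err to a node id or error ... */

    set_status:
            *status = err;
    }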
5. 16 Dec 2008, 1 commit
6. 11 Dec 2008, 5 commits
7. 03 Dec 2008, 2 commits
8. 02 Dec 2008, 2 commits
• memcg: memory hotplug fix for notifier callback · dc19f9db
  Authored by KAMEZAWA Hiroyuki
Fixes for memcg/memory hotplug.

While memory hotplug allocates and frees the memmap, page_cgroup does
not free its page_cgroup array at OFFLINE when that array was allocated
from bootmem (because freeing bootmem requires special care).

Consequently, if page_cgroup was allocated from bootmem and the memmap
is later freed and reallocated by memory hotplug, page_cgroup->page ==
page no longer holds.

The current MEM_ONLINE handler does not check for this, and fails to
update page_cgroup->page in the case where no new page_cgroup
allocation is needed.  (This went unnoticed because the memmap is never
freed when SPARSEMEM_VMEMMAP=y.)

I also noticed that MEM_ONLINE can be called against only part of a
section, so freeing page_cgroup at CANCEL_ONLINE would free entries
that are still in use.  Don't roll back at CANCEL.

Finally, the memory hotplug notifier chain is currently stopped by
slub, because slub sets NOTIFY_STOP_MASK in its return value; as a
result page_cgroup's callback, which now has lower priority than
slub's, is never called.  I believe this slub behavior is unintentional
(a bug), so this patch fixes it too.  A sketch of the return-value
convention follows this entry.

An alternative for page_cgroup allocation would be to free page_cgroup
at OFFLINE even when it came from bootmem and drop the special handler,
but that requires more changes.

Addresses http://bugzilla.kernel.org/show_bug.cgi?id=12041
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Balbir Singh <balbir@in.ibm.com>
      Cc: Pavel Emelyanov <xemul@openvz.org>
Tested-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      dc19f9db
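The sketch promised above: a hedged illustration of the notifier-return convention at issue (an illustrative callback, not slub's or page_cgroup's actual code). A callback should report failure through notifier_from_errno() and otherwise return NOTIFY_OK, rather than hand-rolling a value with NOTIFY_STOP_MASK set, which silently prevents lower-priority callbacks from running.

    /* Hedged sketch of a well-behaved memory-hotplug callback;
     * prepare_for_online() is a hypothetical helper. */
    static int example_memory_callback(struct notifier_block *self,
                                       unsigned long action, void *arg)
    {
            int ret = 0;

            switch (action) {
            case MEM_GOING_ONLINE:
                    ret = prepare_for_online(arg);
                    break;
            case MEM_CANCEL_ONLINE:        /* no rollback done here */
            case MEM_ONLINE:
            case MEM_OFFLINE:
                    break;
            }
            /* notifier_from_errno(0) == NOTIFY_OK, so lower-priority
             * callbacks (e.g. page_cgroup's) still get to run. */
            return notifier_from_errno(ret);
    }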
• mm: vmalloc fix lazy unmapping cache aliasing · b29acbdc
  Authored by Nick Piggin
      Jim Radford has reported that the vmap subsystem rewrite was sometimes
      causing his VIVT ARM system to behave strangely (seemed like going into
      infinite loops trying to fault in pages to userspace).
      
      We determined that the problem was most likely due to a cache aliasing
      issue.  flush_cache_vunmap was only being called at the moment the page
      tables were to be taken down, however with lazy unmapping, this can happen
      after the page has subsequently been freed and allocated for something
      else.  The dangling alias may still have dirty data attached to it.
      
      The fix for this problem is to do the cache flushing when the caller has
      called vunmap -- it would be a bug for them to write anything else to the
      mapping at that point.
      
      That appeared to solve Jim's problems.
Reported-by: Jim Radford <radford@blackbean.org>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b29acbdc
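A sketch of the ordering fix (simplified from the description above, assuming the vmap code's split between immediate and lazy teardown): flush through the virtual alias at vunmap time, while the caller still guarantees nothing else writes the mapping, rather than at the deferred page-table teardown, by which point the pages may already be reused.

    /* Hedged sketch of the fix's ordering, simplified from mm/vmalloc.c. */
    static void free_unmap_vmap_area(struct vmap_area *va)
    {
            /*
             * Flush dirty lines through the vmap alias *now*; with lazy
             * unmapping, the page tables are torn down long after the
             * underlying pages may have been freed and reallocated.
             */
            flush_cache_vunmap(va->va_start, va->va_end);
            free_unmap_vmap_area_noflush(va);  /* deferred (lazy) unmap */
    }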
9. 01 Dec 2008, 2 commits
10. 26 Nov 2008, 2 commits
11. 20 Nov 2008, 7 commits
12. 17 Nov 2008, 1 commit
13. 16 Nov 2008, 1 commit
14. 14 Nov 2008, 3 commits
15. 13 Nov 2008, 4 commits
• memcg: bugfix for memory hotplug · 33c5d3d6
  Authored by KAMEZAWA Hiroyuki
      The start pfn calculation in page_cgroup's memory hotplug notifier chain
      is wrong.
Tested-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      33c5d3d6
• mm: remove lru_add_drain_all() from the munlock path · 8891d6da
  Authored by KOSAKI Motohiro
lockdep emits the warning below at boot on one of my test machines:
schedule_on_each_cpu() must not be called while the task holds
mmap_sem.

lru_add_drain_all() exists here to keep unevictable pages from
lingering on a reclaimable LRU list, but the current unevictable code
can rescue such pages even while they sit on a reclaimable list, so
removing the call from the munlock path is the better option.

In addition, this patch adds lru_add_drain_all() to sys_mlock() and
sys_mlockall(), before mmap_sem is taken.  This is not strictly
required, but it reduces the chance that pages fail to move to the
unevictable list; such failures can still be rescued by vmscan later,
yet reducing them up front is preferable.  A sketch of the resulting
call ordering follows the lockdep trace below.

Note that when such a rescue happens, the Mlocked and Unevictable
fields in /proc/meminfo can temporarily disagree, but this causes no
real trouble.
      
      =======================================================
      [ INFO: possible circular locking dependency detected ]
      2.6.28-rc2-mm1 #2
      -------------------------------------------------------
      lvm/1103 is trying to acquire lock:
       (&cpu_hotplug.lock){--..}, at: [<c0130789>] get_online_cpus+0x29/0x50
      
      but task is already holding lock:
       (&mm->mmap_sem){----}, at: [<c01878ae>] sys_mlockall+0x4e/0xb0
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #3 (&mm->mmap_sem){----}:
             [<c0153da2>] check_noncircular+0x82/0x110
             [<c0185e6a>] might_fault+0x4a/0xa0
             [<c0156161>] validate_chain+0xb11/0x1070
             [<c0185e6a>] might_fault+0x4a/0xa0
             [<c0156923>] __lock_acquire+0x263/0xa10
             [<c015714c>] lock_acquire+0x7c/0xb0			(*) grab mmap_sem
             [<c0185e6a>] might_fault+0x4a/0xa0
             [<c0185e9b>] might_fault+0x7b/0xa0
             [<c0185e6a>] might_fault+0x4a/0xa0
             [<c0294dd0>] copy_to_user+0x30/0x60
             [<c01ae3ec>] filldir+0x7c/0xd0
             [<c01e3a6a>] sysfs_readdir+0x11a/0x1f0			(*) grab sysfs_mutex
             [<c01ae370>] filldir+0x0/0xd0
             [<c01ae370>] filldir+0x0/0xd0
             [<c01ae4c6>] vfs_readdir+0x86/0xa0			(*) grab i_mutex
             [<c01ae75b>] sys_getdents+0x6b/0xc0
             [<c010355a>] syscall_call+0x7/0xb
             [<ffffffff>] 0xffffffff
      
      -> #2 (sysfs_mutex){--..}:
             [<c0153da2>] check_noncircular+0x82/0x110
             [<c01e3d2c>] sysfs_addrm_start+0x2c/0xc0
             [<c0156161>] validate_chain+0xb11/0x1070
             [<c01e3d2c>] sysfs_addrm_start+0x2c/0xc0
             [<c0156923>] __lock_acquire+0x263/0xa10
             [<c015714c>] lock_acquire+0x7c/0xb0			(*) grab sysfs_mutex
             [<c01e3d2c>] sysfs_addrm_start+0x2c/0xc0
             [<c04f8b55>] mutex_lock_nested+0xa5/0x2f0
             [<c01e3d2c>] sysfs_addrm_start+0x2c/0xc0
             [<c01e3d2c>] sysfs_addrm_start+0x2c/0xc0
             [<c01e3d2c>] sysfs_addrm_start+0x2c/0xc0
             [<c01e422f>] create_dir+0x3f/0x90
             [<c01e42a9>] sysfs_create_dir+0x29/0x50
             [<c04faaf5>] _spin_unlock+0x25/0x40
             [<c028f21d>] kobject_add_internal+0xcd/0x1a0
             [<c028f37a>] kobject_set_name_vargs+0x3a/0x50
             [<c028f41d>] kobject_init_and_add+0x2d/0x40
             [<c019d4d2>] sysfs_slab_add+0xd2/0x180
             [<c019d580>] sysfs_add_func+0x0/0x70
             [<c019d5dc>] sysfs_add_func+0x5c/0x70			(*) grab slub_lock
             [<c01400f2>] run_workqueue+0x172/0x200
             [<c014008f>] run_workqueue+0x10f/0x200
             [<c0140bd0>] worker_thread+0x0/0xf0
             [<c0140c6c>] worker_thread+0x9c/0xf0
             [<c0143c80>] autoremove_wake_function+0x0/0x50
             [<c0140bd0>] worker_thread+0x0/0xf0
             [<c0143972>] kthread+0x42/0x70
             [<c0143930>] kthread+0x0/0x70
             [<c01042db>] kernel_thread_helper+0x7/0x1c
             [<ffffffff>] 0xffffffff
      
      -> #1 (slub_lock){----}:
             [<c0153d2d>] check_noncircular+0xd/0x110
             [<c04f650f>] slab_cpuup_callback+0x11f/0x1d0
             [<c0156161>] validate_chain+0xb11/0x1070
             [<c04f650f>] slab_cpuup_callback+0x11f/0x1d0
             [<c015433d>] mark_lock+0x35d/0xd00
             [<c0156923>] __lock_acquire+0x263/0xa10
             [<c015714c>] lock_acquire+0x7c/0xb0
             [<c04f650f>] slab_cpuup_callback+0x11f/0x1d0
             [<c04f93a3>] down_read+0x43/0x80
             [<c04f650f>] slab_cpuup_callback+0x11f/0x1d0		(*) grab slub_lock
             [<c04f650f>] slab_cpuup_callback+0x11f/0x1d0
             [<c04fd9ac>] notifier_call_chain+0x3c/0x70
             [<c04f5454>] _cpu_up+0x84/0x110
             [<c04f552b>] cpu_up+0x4b/0x70				(*) grab cpu_hotplug.lock
             [<c06d1530>] kernel_init+0x0/0x170
             [<c06d15e5>] kernel_init+0xb5/0x170
             [<c06d1530>] kernel_init+0x0/0x170
             [<c01042db>] kernel_thread_helper+0x7/0x1c
             [<ffffffff>] 0xffffffff
      
      -> #0 (&cpu_hotplug.lock){--..}:
             [<c0155bff>] validate_chain+0x5af/0x1070
             [<c040f7e0>] dev_status+0x0/0x50
             [<c0156923>] __lock_acquire+0x263/0xa10
             [<c015714c>] lock_acquire+0x7c/0xb0
             [<c0130789>] get_online_cpus+0x29/0x50
             [<c04f8b55>] mutex_lock_nested+0xa5/0x2f0
             [<c0130789>] get_online_cpus+0x29/0x50
             [<c0130789>] get_online_cpus+0x29/0x50
             [<c017bc30>] lru_add_drain_per_cpu+0x0/0x10
             [<c0130789>] get_online_cpus+0x29/0x50			(*) grab cpu_hotplug.lock
             [<c0140cf2>] schedule_on_each_cpu+0x32/0xe0
             [<c0187095>] __mlock_vma_pages_range+0x85/0x2c0
             [<c0156945>] __lock_acquire+0x285/0xa10
             [<c0188f09>] vma_merge+0xa9/0x1d0
             [<c0187450>] mlock_fixup+0x180/0x200
             [<c0187548>] do_mlockall+0x78/0x90			(*) grab mmap_sem
             [<c01878e1>] sys_mlockall+0x81/0xb0
             [<c010355a>] syscall_call+0x7/0xb
             [<ffffffff>] 0xffffffff
      
      other info that might help us debug this:
      
      1 lock held by lvm/1103:
       #0:  (&mm->mmap_sem){----}, at: [<c01878ae>] sys_mlockall+0x4e/0xb0
      
      stack backtrace:
      Pid: 1103, comm: lvm Not tainted 2.6.28-rc2-mm1 #2
      Call Trace:
       [<c01555fc>] print_circular_bug_tail+0x7c/0xd0
       [<c0155bff>] validate_chain+0x5af/0x1070
       [<c040f7e0>] dev_status+0x0/0x50
       [<c0156923>] __lock_acquire+0x263/0xa10
       [<c015714c>] lock_acquire+0x7c/0xb0
       [<c0130789>] get_online_cpus+0x29/0x50
       [<c04f8b55>] mutex_lock_nested+0xa5/0x2f0
       [<c0130789>] get_online_cpus+0x29/0x50
       [<c0130789>] get_online_cpus+0x29/0x50
       [<c017bc30>] lru_add_drain_per_cpu+0x0/0x10
       [<c0130789>] get_online_cpus+0x29/0x50
       [<c0140cf2>] schedule_on_each_cpu+0x32/0xe0
       [<c0187095>] __mlock_vma_pages_range+0x85/0x2c0
       [<c0156945>] __lock_acquire+0x285/0xa10
       [<c0188f09>] vma_merge+0xa9/0x1d0
       [<c0187450>] mlock_fixup+0x180/0x200
       [<c0187548>] do_mlockall+0x78/0x90
       [<c01878e1>] sys_mlockall+0x81/0xb0
       [<c010355a>] syscall_call+0x7/0xb
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Tested-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      8891d6da
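The sketch promised above (simplified; the checks inside the lock are abbreviated): draining the per-CPU pagevecs before taking mmap_sem means schedule_on_each_cpu() never runs with the semaphore held, so the lock cycle shown in the trace cannot form.

    /* Hedged sketch of the fixed sys_mlock() shape; body abbreviated. */
    asmlinkage long sys_mlock(unsigned long start, size_t len)
    {
            int error;

            lru_add_drain_all();    /* flush pagevecs before mmap_sem */

            down_write(&current->mm->mmap_sem);
            /* ... alignment and RLIMIT_MEMLOCK checks elided ... */
            error = do_mlock(start, len, 1);
            up_write(&current->mm->mmap_sem);
            return error;
    }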
• cpusets: update mems allowed in page allocator · e33c3b5e
  Authored by David Rientjes
If all allowable memory is unreclaimable, it is possible to loop
forever in the page allocator for allocations that lack __GFP_NORETRY.
      
      During this time, it is also possible for a task's cpuset to expand its
      set of allowable nodes so that it now includes free memory.  The cached
      copy of this set, current->mems_allowed, is stale, however, since there
      has not been a subsequent call to cpuset_update_task_memory_state().
      
      The cached copy of the set of allowable nodes is now updated in the page
      allocator's slow path so the additional memory is available to
      get_page_from_freelist().
      
      [akpm@linux-foundation.org: add comment]
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Paul Menage <menage@google.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e33c3b5e
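A hedged sketch of where the refresh lands (the surrounding slow path is heavily abbreviated and the argument lists are simplified; cpuset_update_task_memory_state() is the real hook): re-reading mems_allowed once the fast path has failed lets a cpuset that grew in the meantime contribute its newly allowed nodes to the next get_page_from_freelist() pass.

    /* Hedged sketch, abbreviated from the page allocator's slow path;
     * the loop shape and argument lists are simplified. */
    for (;;) {
            page = get_page_from_freelist(gfp_mask, nodemask, order,
                                          zonelist, high_zoneidx,
                                          ALLOC_WMARK_LOW);
            if (page)
                    return page;
            /*
             * current->mems_allowed may be stale: the cpuset could have
             * grown since it was last cached.  Refresh it so newly
             * allowed nodes are visible on the next pass.
             */
            cpuset_update_task_memory_state();
            /* ... kswapd wakeup, direct reclaim, NORETRY checks ... */
    }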
• hugetlb: make unmap_ref_private multi-size-aware · 7526674d
  Authored by Adam Litke
      Oops.  Part of the hugetlb private reservation code was not fully
      converted to use hstates.
      
      When a huge page must be unmapped from VMAs due to a failed COW,
      HPAGE_SIZE is used in the call to unmap_hugepage_range() regardless of
      the page size being used.  This works if the VMA is using the default
      huge page size.  Otherwise we might unmap too much, too little, or
      trigger a BUG_ON.  Rare but serious -- fix it.
Signed-off-by: Adam Litke <agl@us.ibm.com>
Cc: Jon Tollefson <kniht@linux.vnet.ibm.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7526674d
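A sketch of the conversion described above (the fragment follows the commit's description, not the literal diff): derive the range to unmap from the VMA's hstate instead of hard-coding HPAGE_SIZE, so a non-default-size mapping unmaps exactly one huge page.

    /* Hedged sketch: size the unmap range from the VMA's hstate. */
    struct hstate *h = hstate_vma(vma);

    address &= huge_page_mask(h);   /* align to this VMA's huge page size */
    unmap_hugepage_range(iter_vma, address,
                         address + huge_page_size(h), page);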