1. 30 Dec, 2011 1 commit
  2. 22 Dec, 2011 1 commit
  3. 09 Dec, 2011 1 commit
    • thp: set compound tail page _count to zero · 58a84aa9
      Youquan Song committed
      Commit 70b50f94 ("mm: thp: tail page refcounting fix") keeps all
      page_tail->_count zero at all times.  But the current kernel does not
      set page_tail->_count to zero if a 1GB page is utilized.  So when an
      IOMMU 1GB page is used by KVM, it will result in a kernel oops because a
      tail page's _count does not equal zero.
      
        kernel BUG at include/linux/mm.h:386!
        invalid opcode: 0000 [#1] SMP
        Call Trace:
          gup_pud_range+0xb8/0x19d
          get_user_pages_fast+0xcb/0x192
          ? trace_hardirqs_off+0xd/0xf
          hva_to_pfn+0x119/0x2f2
          gfn_to_pfn_memslot+0x2c/0x2e
          kvm_iommu_map_pages+0xfd/0x1c1
          kvm_iommu_map_memslots+0x7c/0xbd
          kvm_iommu_map_guest+0xaa/0xbf
          kvm_vm_ioctl_assigned_device+0x2ef/0xa47
          kvm_vm_ioctl+0x36c/0x3a2
          do_vfs_ioctl+0x49e/0x4e4
          sys_ioctl+0x5a/0x7c
          system_call_fastpath+0x16/0x1b
        RIP  gup_huge_pud+0xf2/0x159

      Signed-off-by: Youquan Song <youquan.song@intel.com>
      Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
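The invariant described in this commit can be illustrated with a toy model. This is only a sketch: `struct toy_page` and the `toy_*` helpers are hypothetical names, not kernel APIs, and the starting count of 1 for every page is an assumption of the model.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct page; only the reference count is modeled. */
struct toy_page {
    int count;
};

/* Assumption of this sketch: every page starts out with count 1. */
void toy_init_pages(struct toy_page *pages, size_t n)
{
    for (size_t i = 0; i < n; i++)
        pages[i].count = 1;
}

/* Build one compound page from pages[0..n): the head (index 0) keeps
 * its reference, every tail page gets its count forced to zero. */
void toy_prep_compound(struct toy_page *pages, size_t n)
{
    for (size_t i = 1; i < n; i++)
        pages[i].count = 0;
}

/* The invariant the commit relies on: all tail counts are zero. */
bool toy_tails_are_zero(const struct toy_page *pages, size_t n)
{
    for (size_t i = 1; i < n; i++)
        if (pages[i].count != 0)
            return false;
    return true;
}
```

In this model, skipping `toy_prep_compound()` leaves the tail counts at 1, which is the kind of state the BUG check in the trace above rejects.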
  4. 16 Nov, 2011 1 commit
  5. 26 Jul, 2011 2 commits
  6. 16 Jun, 2011 1 commit
    • mm: fix negative commitlimit when gigantic hugepages are allocated · b0320c7b
      Rafael Aquini committed
      When 1GB hugepages are allocated on a system, free(1) reports less
      available memory than is actually installed in the box.  Also, if the
      total size of hugepages allocated on a system is over half of the total
      memory size, CommitLimit becomes a negative number.
      
      The problem is that gigantic hugepages (order > MAX_ORDER) can only be
      allocated at boot with bootmem, so their frames are not counted in
      'totalram_pages'.  However, they are counted in hugetlb_total_pages().
      
      What happens to turn CommitLimit into a negative number is this
      calculation, in fs/proc/meminfo.c:
      
              allowed = ((totalram_pages - hugetlb_total_pages())
                      * sysctl_overcommit_ratio / 100) + total_swap_pages;
      
      A similar calculation occurs in __vm_enough_memory() in mm/mmap.c.
      
      Also, every VM statistic that depends on 'totalram_pages' will show
      confusing values, as if the system were 'missing' some part of its memory.
      
      Impact of this bug:
      
      The bug matters when gigantic hugepages are allocated and
      sysctl_overcommit_memory == OVERCOMMIT_NEVER.  In that situation,
      __vm_enough_memory() goes through the 'allowed' calculation above and
      might mistakenly return -ENOMEM, forcing the system to start reclaiming
      pages earlier than usual.  Depending on the workload, this could hurt
      overall system performance.
      
      Besides the aforementioned scenario, I can only think of this causing
      annoyances with memory reports from /proc/meminfo and free(1).
      
      [akpm@linux-foundation.org: standardize comment layout]
      Reported-by: Russ Anderson <rja@sgi.com>
      Signed-off-by: Rafael Aquini <aquini@linux.com>
      Acked-by: Russ Anderson <rja@sgi.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
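To see how the 'allowed' formula quoted in this commit can go negative, here is a small standalone sketch in plain C. The function name `commit_limit_pages` and the example figures are illustrative assumptions, not kernel code.

```c
/*
 * Same arithmetic as the 'allowed' formula quoted in the commit message,
 * using signed values so a negative result is visible.
 * All quantities are in pages.
 */
long commit_limit_pages(long totalram_pages, long hugetlb_pages,
                        long overcommit_ratio, long total_swap_pages)
{
    return (totalram_pages - hugetlb_pages) * overcommit_ratio / 100
           + total_swap_pages;
}
```

As a worked example, assume a 16 GiB machine with 4 KiB pages (4194304 pages), ten 1 GiB hugepages (2621440 pages), a ratio of 50 and no swap. If the gigantic pages are missing from `totalram_pages`, that value is 1572864, and `commit_limit_pages(1572864, 2621440, 50, 0)` evaluates to -524288. If they are counted, `commit_limit_pages(4194304, 2621440, 50, 0)` evaluates to 786432.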
  7. 06 Jun, 2011 1 commit
  8. 27 May, 2011 1 commit
  9. 25 May, 2011 1 commit
  10. 10 Apr, 2011 1 commit
  11. 31 Mar, 2011 1 commit
  12. 23 Mar, 2011 1 commit
  13. 14 Jan, 2011 5 commits
  14. 03 Dec, 2010 1 commit
  15. 27 Oct, 2010 1 commit
  16. 08 Oct, 2010 9 commits
  17. 24 Sep, 2010 2 commits
  18. 11 Aug, 2010 5 commits
  19. 10 Aug, 2010 1 commit
  20. 25 May, 2010 1 commit
    • cpuset,mm: fix no node to alloc memory when changing cpuset's mems · c0ff7453
      Miao Xie committed
      Before applying this patch, cpuset updates task->mems_allowed and
      mempolicy by setting all new bits in the nodemask first, and clearing all
      old unallowed bits later.  But in between, the allocator may find that
      there is no node to allocate memory from.

      The reason is that when cpuset rebinds the task's mempolicy, it clears the
      nodes on which the allocator can allocate pages, for example:
      
      (mpol: mempolicy)
      	task1			task1's mpol	task2
      	alloc page		1
      	  alloc on node0? NO	1
      				1		change mems from 1 to 0
      				1		rebind task1's mpol
      				0-1		  set new bits
      				0	  	  clear disallowed bits
      	  alloc on node1? NO	0
      	  ...
      	can't alloc page
      	  goto oom
      
      This patch fixes this problem by expanding the node range first (setting
      newly allowed bits) and shrinking it lazily (clearing newly disallowed
      bits).  So we use a variable to tell the write-side task that the
      read-side task is reading the nodemask, and the write-side task clears
      newly disallowed nodes after the read-side task ends the current memory
      allocation.
      
      [akpm@linux-foundation.org: fix spello]
      Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Paul Menage <menage@google.com>
      Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
      Cc: Ravikiran Thirumalai <kiran@scalex86.org>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
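The expand-first, shrink-lazily ordering described in this commit can be sketched with a bitmask. This is a single-threaded toy model under stated assumptions: `struct toy_task` and the `toy_*` helpers are hypothetical names, and the memory barriers and locking that real concurrent code needs are deliberately left out.

```c
#include <stdbool.h>

/* Toy task: bit i of mems_allowed set means node i may be used. */
struct toy_task {
    unsigned long mems_allowed;
    bool allocating;    /* read side is in the middle of an allocation */
};

/* Step 1 (write side): only add the newly allowed nodes, so the
 * read side never observes an empty mask. */
void toy_expand(struct toy_task *t, unsigned long newmask)
{
    t->mems_allowed |= newmask;
}

/* Step 2 (write side): drop the newly disallowed nodes, but only once
 * the read side has finished its current allocation.
 * Returns true if the shrink was applied. */
bool toy_try_shrink(struct toy_task *t, unsigned long newmask)
{
    if (t->allocating)
        return false;
    t->mems_allowed &= newmask;
    return true;
}
```

With the task starting on node 1 (mask 0x2) and moving to node 0 (mask 0x1), the mask goes 0x2, then 0x3 after `toy_expand()`, and reaches 0x1 only after the allocation flag is cleared. At no point is it empty, unlike the set-then-clear sequence in the table above.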
  21. 12 May, 2010 1 commit
    • hugetlbfs: kill applications that use MAP_NORESERVE with SIGBUS instead of OOM-killer · 4a6018f7
      Mel Gorman committed
      Ordinarily, an application using hugetlbfs will create mappings with
      reserves.  For shared mappings, these pages are reserved before mmap()
      returns success; for private mappings, the calling process is guaranteed
      the pages, and a child process that cannot get them is killed with SIGBUS.
      
      An application that uses MAP_NORESERVE gets no reservations and mmap()
      will always succeed at the risk the page will not be available at fault
      time.  This might be used for example on very large sparse mappings where
      the developer is confident the necessary huge pages exist to satisfy all
      faults even though the whole mapping cannot be backed by huge pages.
      Unfortunately, if an allocation does fail, VM_FAULT_OOM is returned to the
      fault handler which proceeds to trigger the OOM-killer.  This is
      unhelpful.
      
      Even without hugetlbfs mounted, a user using mmap() can trivially trigger
      the OOM-killer because VM_FAULT_OOM is returned (will provide example
      program if desired - it's a whopping 24 lines long).  It could be
      considered a DoS available to an unprivileged user.
      
      This patch alters hugetlbfs to kill a process that uses MAP_NORESERVE
      where huge pages were not available with SIGBUS instead of triggering the
      OOM killer.
      
      This change affects hugetlb_cow() as well.  I feel there is a failure case
      in there, but I didn't create one.  It would need a fairly specific target
      in terms of the faulting application and the hugepage pool size.  The
      hugetlb_no_page() path is much easier to hit but both might as well be
      closed.

      Signed-off-by: Mel Gorman <mel@csn.ul.ie>
      Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
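For orientation, a mapping of the kind this commit discusses might be requested as in the sketch below. It is not the 24-line program mentioned in the message. It assumes a Linux system where `MAP_HUGETLB` is available, and the helper name `map_huge_noreserve` is hypothetical.

```c
#define _GNU_SOURCE     /* exposes MAP_HUGETLB and MAP_NORESERVE on glibc */
#include <stddef.h>
#include <sys/mman.h>

/*
 * Request an anonymous, private huge-page mapping with no reservation.
 * `len` should be a multiple of the huge page size.
 * Returns MAP_FAILED on error, like mmap() itself.
 */
void *map_huge_noreserve(size_t len)
{
    return mmap(NULL, len, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB | MAP_NORESERVE,
                -1, 0);
}
```

Because no huge pages are reserved, a successful return says nothing about whether pages will be available later. The failure, if any, shows up on first touch of the memory, which is the fault path this commit changes to deliver SIGBUS.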
  22. 25 Apr, 2010 1 commit