1. 09 10月, 2012 14 次提交
    • G
      thp: remove assumptions on pgtable_t type · e3ebcf64
      Gerald Schaefer 提交于
      The thp page table pre-allocation code currently assumes that pgtable_t is
      of type "struct page *".  This may not be true for all architectures, so
      this patch removes that assumption by replacing the functions
      prepare_pmd_huge_pte() and get_pmd_huge_pte() with two new functions that
      can be defined architecture-specific.
      
      It also removes two VM_BUG_ON checks for page_count() and page_mapcount()
      operating on a pgtable_t.  Apart from the VM_BUG_ON removal, there will be
      no functional change introduced by this patch.
      Signed-off-by: NGerald Schaefer <gerald.schaefer@de.ibm.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e3ebcf64
    • D
      oom: remove deprecated oom_adj · 01dc52eb
      Davidlohr Bueso 提交于
      The deprecated /proc/<pid>/oom_adj is scheduled for removal this month.
      Signed-off-by: NDavidlohr Bueso <dave@gnu.org>
      Acked-by: NDavid Rientjes <rientjes@google.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      01dc52eb
    • S
      mm: mmu_notifier: have mmu_notifiers use a global SRCU so they may safely schedule · 21a92735
      Sagi Grimberg 提交于
      With an RCU based mmu_notifier implementation, any callout to
      mmu_notifier_invalidate_range_{start,end}() or
      mmu_notifier_invalidate_page() would not be allowed to call schedule()
      as that could potentially allow a modification to the mmu_notifier
      structure while it is currently being used.
      
      Since srcu allocs 4 machine words per instance per cpu, we may end up
      with memory exhaustion if we use srcu per mm.  So all mms share a global
      srcu.  Note that during large mmu_notifier activity exit & unregister
      paths might hang for longer periods, but it is tolerable for current
      mmu_notifier clients.
      Signed-off-by: NSagi Grimberg <sagig@mellanox.co.il>
      Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Haggai Eran <haggaie@mellanox.com>
      Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      21a92735
    • X
      mm: mmu_notifier: fix inconsistent memory between secondary MMU and host · 48af0d7c
      Xiao Guangrong 提交于
      There is a bug in set_pte_at_notify() which always sets the pte to the
      new page before releasing the old page in the secondary MMU.  At this
      time, the process will access on the new page, but the secondary MMU
      still access on the old page, the memory is inconsistent between them
      
      The below scenario shows the bug more clearly:
      
      at the beginning: *p = 0, and p is write-protected by KSM or shared with
      parent process
      
      CPU 0                                       CPU 1
      write 1 to p to trigger COW,
      set_pte_at_notify will be called:
        *pte = new_page + W; /* The W bit of pte is set */
      
                                           *p = 1; /* pte is valid, so no #PF */
      
                                           return back to secondary MMU, then
                                           the secondary MMU read p, but get:
                                           *p == 0;
      
                               /*
                                * !!!!!!
                                * the host has already set p to 1, but the secondary
                                * MMU still get the old value 0
                                */
      
        call mmu_notifier_change_pte to release
        old page in secondary MMU
      
      We can fix it by release old page first, then set the pte to the new
      page.
      
      Note, the new page will be firstly used in secondary MMU before it is
      mapped into the page table of the process, but this is safe because it
      is protected by the page table lock, there is no race to change the pte
      
      [akpm@linux-foundation.org: add comment from Andrea]
      Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      48af0d7c
    • M
      mempolicy: fix a race in shared_policy_replace() · b22d127a
      Mel Gorman 提交于
      shared_policy_replace() use of sp_alloc() is unsafe.  1) sp_node cannot
      be dereferenced if sp->lock is not held and 2) another thread can modify
      sp_node between spin_unlock for allocating a new sp node and next
      spin_lock.  The bug was introduced before 2.6.12-rc2.
      
      Kosaki's original patch for this problem was to allocate an sp node and
      policy within shared_policy_replace and initialise it when the lock is
      reacquired.  I was not keen on this approach because it partially
      duplicates sp_alloc().  As the paths were sp->lock is taken are not that
      performance critical this patch converts sp->lock to sp->mutex so it can
      sleep when calling sp_alloc().
      
      [kosaki.motohiro@jp.fujitsu.com: Original patch]
      Signed-off-by: NMel Gorman <mgorman@suse.de>
      Acked-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Reviewed-by: NChristoph Lameter <cl@linux.com>
      Cc: Josh Boyer <jwboyer@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b22d127a
    • M
      mm: compaction: capture a suitable high-order page immediately when it is made available · 1fb3f8ca
      Mel Gorman 提交于
      While compaction is migrating pages to free up large contiguous blocks
      for allocation it races with other allocation requests that may steal
      these blocks or break them up.  This patch alters direct compaction to
      capture a suitable free page as soon as it becomes available to reduce
      this race.  It uses similar logic to split_free_page() to ensure that
      watermarks are still obeyed.
      Signed-off-by: NMel Gorman <mgorman@suse.de>
      Reviewed-by: NRik van Riel <riel@redhat.com>
      Reviewed-by: NMinchan Kim <minchan@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1fb3f8ca
    • K
      mm: kill vma flag VM_RESERVED and mm->reserved_vm counter · 314e51b9
      Konstantin Khlebnikov 提交于
      A long time ago, in v2.4, VM_RESERVED kept swapout process off VMA,
      currently it lost original meaning but still has some effects:
      
       | effect                 | alternative flags
      -+------------------------+---------------------------------------------
      1| account as reserved_vm | VM_IO
      2| skip in core dump      | VM_IO, VM_DONTDUMP
      3| do not merge or expand | VM_IO, VM_DONTEXPAND, VM_HUGETLB, VM_PFNMAP
      4| do not mlock           | VM_IO, VM_DONTEXPAND, VM_HUGETLB, VM_PFNMAP
      
      This patch removes reserved_vm counter from mm_struct.  Seems like nobody
      cares about it, it does not exported into userspace directly, it only
      reduces total_vm showed in proc.
      
      Thus VM_RESERVED can be replaced with VM_IO or pair VM_DONTEXPAND | VM_DONTDUMP.
      
      remap_pfn_range() and io_remap_pfn_range() set VM_IO|VM_DONTEXPAND|VM_DONTDUMP.
      remap_vmalloc_range() set VM_DONTEXPAND | VM_DONTDUMP.
      
      [akpm@linux-foundation.org: drivers/vfio/pci/vfio_pci.c fixup]
      Signed-off-by: NKonstantin Khlebnikov <khlebnikov@openvz.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Carsten Otte <cotte@de.ibm.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Morris <james.l.morris@oracle.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Venkatesh Pallipadi <venki@google.com>
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      314e51b9
    • K
      mm: prepare VM_DONTDUMP for using in drivers · 0103bd16
      Konstantin Khlebnikov 提交于
      Rename VM_NODUMP into VM_DONTDUMP: this name matches other negative flags:
      VM_DONTEXPAND, VM_DONTCOPY.  Currently this flag used only for
      sys_madvise.  The next patch will use it for replacing the outdated flag
      VM_RESERVED.
      
      Also forbid madvise(MADV_DODUMP) for special kernel mappings VM_SPECIAL
      (VM_IO | VM_DONTEXPAND | VM_RESERVED | VM_PFNMAP)
      Signed-off-by: NKonstantin Khlebnikov <khlebnikov@openvz.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Carsten Otte <cotte@de.ibm.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Morris <james.l.morris@oracle.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Venkatesh Pallipadi <venki@google.com>
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0103bd16
    • K
      mm: kill vma flag VM_EXECUTABLE and mm->num_exe_file_vmas · e9714acf
      Konstantin Khlebnikov 提交于
      Currently the kernel sets mm->exe_file during sys_execve() and then tracks
      number of vmas with VM_EXECUTABLE flag in mm->num_exe_file_vmas, as soon
      as this counter drops to zero kernel resets mm->exe_file to NULL.  Plus it
      resets mm->exe_file at last mmput() when mm->mm_users drops to zero.
      
      VMA with VM_EXECUTABLE flag appears after mapping file with flag
      MAP_EXECUTABLE, such vmas can appears only at sys_execve() or after vma
      splitting, because sys_mmap ignores this flag.  Usually binfmt module sets
      mm->exe_file and mmaps executable vmas with this file, they hold
      mm->exe_file while task is running.
      
      comment from v2.6.25-6245-g925d1c40 ("procfs task exe symlink"),
      where all this stuff was introduced:
      
      > The kernel implements readlink of /proc/pid/exe by getting the file from
      > the first executable VMA.  Then the path to the file is reconstructed and
      > reported as the result.
      >
      > Because of the VMA walk the code is slightly different on nommu systems.
      > This patch avoids separate /proc/pid/exe code on nommu systems.  Instead of
      > walking the VMAs to find the first executable file-backed VMA we store a
      > reference to the exec'd file in the mm_struct.
      >
      > That reference would prevent the filesystem holding the executable file
      > from being unmounted even after unmapping the VMAs.  So we track the number
      > of VM_EXECUTABLE VMAs and drop the new reference when the last one is
      > unmapped.  This avoids pinning the mounted filesystem.
      
      exe_file's vma accounting is hooked into every file mmap/unmmap and vma
      split/merge just to fix some hypothetical pinning fs from umounting by mm,
      which already unmapped all its executable files, but still alive.
      
      Seems like currently nobody depends on this behaviour.  We can try to
      remove this logic and keep mm->exe_file until final mmput().
      
      mm->exe_file is still protected with mm->mmap_sem, because we want to
      change it via new sys_prctl(PR_SET_MM_EXE_FILE).  Also via this syscall
      task can change its mm->exe_file and unpin mountpoint explicitly.
      Signed-off-by: NKonstantin Khlebnikov <khlebnikov@openvz.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Carsten Otte <cotte@de.ibm.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Morris <james.l.morris@oracle.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Venkatesh Pallipadi <venki@google.com>
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e9714acf
    • K
      mm: kill vma flag VM_CAN_NONLINEAR · 0b173bc4
      Konstantin Khlebnikov 提交于
      Move actual pte filling for non-linear file mappings into the new special
      vma operation: ->remap_pages().
      
      Filesystems must implement this method to get non-linear mapping support,
      if it uses filemap_fault() then generic_file_remap_pages() can be used.
      
      Now device drivers can implement this method and obtain nonlinear vma support.
      Signed-off-by: NKonstantin Khlebnikov <khlebnikov@openvz.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Carsten Otte <cotte@de.ibm.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>	#arch/tile
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Morris <james.l.morris@oracle.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Venkatesh Pallipadi <venki@google.com>
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0b173bc4
    • K
      mm: kill vma flag VM_INSERTPAGE · 4b6e1e37
      Konstantin Khlebnikov 提交于
      Merge VM_INSERTPAGE into VM_MIXEDMAP.  VM_MIXEDMAP VMA can mix pure-pfn
      ptes, special ptes and normal ptes.
      
      Now copy_page_range() always copies VM_MIXEDMAP VMA on fork like
      VM_PFNMAP.  If driver populates whole VMA at mmap() it probably not
      expects page-faults.
      
      This patch removes special check from vma_wants_writenotify() which
      disables pages write tracking for VMA populated via vm_instert_page().
      BDI below mapped file should not use dirty-accounting, moreover
      do_wp_page() can handle this.
      
      vm_insert_page() still marks vma after first usage.  Usually it is called
      from f_op->mmap() handler under mm->mmap_sem write-lock, so it able to
      change vma->vm_flags.  Caller must set VM_MIXEDMAP at mmap time if it
      wants to call this function from other places, for example from page-fault
      handler.
      Signed-off-by: NKonstantin Khlebnikov <khlebnikov@openvz.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Carsten Otte <cotte@de.ibm.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Morris <james.l.morris@oracle.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Venkatesh Pallipadi <venki@google.com>
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      4b6e1e37
    • K
      mm: introduce arch-specific vma flag VM_ARCH_1 · cc2383ec
      Konstantin Khlebnikov 提交于
      Combine several arch-specific vma flags into one.
      
      before patch:
      
              0x00000200      0x01000000      0x20000000      0x40000000
      x86     VM_NOHUGEPAGE   VM_HUGEPAGE     -               VM_PAT
      powerpc -               -               VM_SAO          -
      parisc  VM_GROWSUP      -               -               -
      ia64    VM_GROWSUP      -               -               -
      nommu   -               VM_MAPPED_COPY  -               -
      others  -               -               -               -
      
      after patch:
      
              0x00000200      0x01000000      0x20000000      0x40000000
      x86     -               VM_PAT          VM_HUGEPAGE     VM_NOHUGEPAGE
      powerpc -               VM_SAO          -               -
      parisc  -               VM_GROWSUP      -               -
      ia64    -               VM_GROWSUP      -               -
      nommu   -               VM_MAPPED_COPY  -               -
      others  -               VM_ARCH_1       -               -
      
      And voila! One completely free bit.
      Signed-off-by: NKonstantin Khlebnikov <khlebnikov@openvz.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Carsten Otte <cotte@de.ibm.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Morris <james.l.morris@oracle.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Venkatesh Pallipadi <venki@google.com>
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      cc2383ec
    • K
      mm, x86, pat: rework linear pfn-mmap tracking · b3b9c293
      Konstantin Khlebnikov 提交于
      Replace the generic vma-flag VM_PFN_AT_MMAP with x86-only VM_PAT.
      
      We can toss mapping address from remap_pfn_range() into
      track_pfn_vma_new(), and collect all PAT-related logic together in
      arch/x86/.
      
      This patch also restores orignal frustration-free is_cow_mapping() check
      in remap_pfn_range(), as it was before commit v2.6.28-rc8-88-g3c8bb73a
      ("x86: PAT: store vm_pgoff for all linear_over_vma_region mappings - v3")
      
      is_linear_pfn_mapping() checks can be removed from mm/huge_memory.c,
      because it already handled by VM_PFNMAP in VM_NO_THP bit-mask.
      
      [suresh.b.siddha@intel.com: Reset the VM_PAT flag as part of untrack_pfn_vma()]
      Signed-off-by: NKonstantin Khlebnikov <khlebnikov@openvz.org>
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Cc: Venkatesh Pallipadi <venki@google.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Carsten Otte <cotte@de.ibm.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: James Morris <james.l.morris@oracle.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Venkatesh Pallipadi <venki@google.com>
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b3b9c293
    • R
      mm: remove __GFP_NO_KSWAPD · c6543459
      Rik van Riel 提交于
      When transparent huge pages were introduced, memory compaction and swap
      storms were an issue, and the kernel had to be careful to not make THP
      allocations cause pageout or compaction.
      
      Now that we have working compaction deferral, kswapd is smart enough to
      invoke compaction and the quadratic behaviour around isolate_free_pages
      has been fixed, it should be safe to remove __GFP_NO_KSWAPD.
      
      [minchan@kernel.org: Comment fix]
      [mgorman@suse.de: Avoid direct reclaim for deferred compaction]
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Signed-off-by: NRik van Riel <riel@redhat.com>
      Signed-off-by: NMel Gorman <mgorman@suse.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c6543459
  2. 06 10月, 2012 26 次提交