1. 10 3月, 2017 4 次提交
    • A
      userfaultfd: non-cooperative: release all ctx in dup_userfaultfd_complete · 8c9e7bb7
      Andrea Arcangeli 提交于
      Don't stop running dup_fctx() even if userfaultfd_event_wait_completion
      fails as it has to run userfaultfd_ctx_put on all ctx to pair against
      the userfaultfd_ctx_get that was run on all fctx->orig in
      dup_userfaultfd.
      
      Link: http://lkml.kernel.org/r/20170224181957.19736-4-aarcange@redhat.comSigned-off-by: NAndrea Arcangeli <aarcange@redhat.com>
      Acked-by: NMike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8c9e7bb7
    • A
      userfaultfd: non-cooperative: robustness check · 9a69a829
      Andrea Arcangeli 提交于
      Similar to the handle_userfault() case, also make sure to never attempt
      to send any event past the PF_EXITING point of no return.
      
      This is purely a robustness check.
      
      Link: http://lkml.kernel.org/r/20170224181957.19736-3-aarcange@redhat.comSigned-off-by: NAndrea Arcangeli <aarcange@redhat.com>
      Acked-by: NMike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9a69a829
    • A
      userfaultfd: non-cooperative: rollback userfaultfd_exit · dd0db88d
      Andrea Arcangeli 提交于
      Patch series "userfaultfd non-cooperative further update for 4.11 merge
      window".
      
      Unfortunately I noticed one relevant bug in userfaultfd_exit while doing
      more testing.  I've been doing testing before and this was also tested
      by kbuild bot and exercised by the selftest, but this bug never
      reproduced before.
      
      I dropped userfaultfd_exit as result.  I dropped it because of
      implementation difficulty in receiving signals in __mmput and because I
      think -ENOSPC as result from the background UFFDIO_COPY should be enough
      already.
      
      Before I decided to remove userfaultfd_exit, I noticed userfaultfd_exit
      wasn't exercised by the selftest and when I tried to exercise it, after
      moving it to a more correct place in __mmput where it would make more
      sense and where the vma list is stable, it resulted in the
      event_wait_completion in D state.  So then I added the second patch to
      be sure even if we call userfaultfd_event_wait_completion too late
      during task exit(), we won't risk to generate tasks in D state.  The
      same check exists in handle_userfault() for the same reason, except it
      makes a difference there, while here is just a robustness check and it's
      run under WARN_ON_ONCE.
      
      While looking at the userfaultfd_event_wait_completion() function I
      looked back at its callers too while at it and I think it's not ok to
      stop executing dup_fctx on the fcs list because we relay on
      userfaultfd_event_wait_completion to execute
      userfaultfd_ctx_put(fctx->orig) which is paired against
      userfaultfd_ctx_get(fctx->orig) in dup_userfault just before
      list_add(fcs).  This change only takes care of fctx->orig but this area
      also needs further review looking for similar problems in fctx->new.
      
      The only patch that is urgent is the first because it's an use after
      free during a SMP race condition that affects all processes if
      CONFIG_USERFAULTFD=y.  Very hard to reproduce though and probably
      impossible without SLUB poisoning enabled.
      
      This patch (of 3):
      
      I once reproduced this oops with the userfaultfd selftest, it's not
      easily reproducible and it requires SLUB poisoning to reproduce.
      
          general protection fault: 0000 [#1] SMP
          Modules linked in:
          CPU: 2 PID: 18421 Comm: userfaultfd Tainted: G               ------------ T 3.10.0+ #15
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.1-0-g8891697-prebuilt.qemu-project.org 04/01/2014
          task: ffff8801f83b9440 ti: ffff8801f833c000 task.ti: ffff8801f833c000
          RIP: 0010:[<ffffffff81451299>]  [<ffffffff81451299>] userfaultfd_exit+0x29/0xa0
          RSP: 0018:ffff8801f833fe80  EFLAGS: 00010202
          RAX: ffff8801f833ffd8 RBX: 6b6b6b6b6b6b6b6b RCX: ffff8801f83b9440
          RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8800baf18600
          RBP: ffff8801f833fee8 R08: 0000000000000000 R09: 0000000000000001
          R10: 0000000000000000 R11: ffffffff8127ceb3 R12: 0000000000000000
          R13: ffff8800baf186b0 R14: ffff8801f83b99f8 R15: 00007faed746c700
          FS:  0000000000000000(0000) GS:ffff88023fc80000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
          CR2: 00007faf0966f028 CR3: 0000000001bc6000 CR4: 00000000000006e0
          DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
          DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
          Call Trace:
            do_exit+0x297/0xd10
            SyS_exit+0x17/0x20
            tracesys+0xdd/0xe2
          Code: 00 00 66 66 66 66 90 55 48 89 e5 41 54 53 48 83 ec 58 48 8b 1f 48 85 db 75 11 eb 73 66 0f 1f 44 00 00 48 8b 5b 10 48 85 db 74 64 <4c> 8b a3 b8 00 00 00 4d 85 e4 74 eb 41 f6 84 24 2c 01 00 00 80
          RIP  [<ffffffff81451299>] userfaultfd_exit+0x29/0xa0
           RSP <ffff8801f833fe80>
          ---[ end trace 9fecd6dcb442846a ]---
      
      In the debugger I located the "mm" pointer in the stack and walking
      mm->mmap->vm_next through the end shows the vma->vm_next list is fully
      consistent and it is null terminated list as expected.  So this has to
      be an SMP race condition where userfaultfd_exit was running while the
      vma list was being modified by another CPU.
      
      When userfaultfd_exit() run one of the ->vm_next pointers pointed to
      SLAB_POISON (RBX is the vma pointer and is 0x6b6b..).
      
      The reason is that it's not running in __mmput but while there are still
      other threads running and it's not holding the mmap_sem (it can't as it
      has to wait the even to be received by the manager).  So this is an use
      after free that was happening for all processes.
      
      One more implementation problem aside from the race condition:
      userfaultfd_exit has really to check a flag in mm->flags before walking
      the vma or it's going to slowdown the exit() path for regular tasks.
      
      One more implementation problem: at that point signals can't be
      delivered so it would also create a task in D state if the manager
      doesn't read the event.
      
      The major design issue: it overall looks superfluous as the manager can
      check for -ENOSPC in the background transfer:
      
      	if (mmget_not_zero(ctx->mm)) {
      [..]
      	} else {
      		return -ENOSPC;
      	}
      
      It's safer to roll it back and re-introduce it later if at all.
      
      [rppt@linux.vnet.ibm.com: documentation fixup after removal of UFFD_EVENT_EXIT]
        Link: http://lkml.kernel.org/r/1488345437-4364-1-git-send-email-rppt@linux.vnet.ibm.com
      Link: http://lkml.kernel.org/r/20170224181957.19736-2-aarcange@redhat.comSigned-off-by: NAndrea Arcangeli <aarcange@redhat.com>
      Signed-off-by: NMike Rapoport <rppt@linux.vnet.ibm.com>
      Acked-by: NMike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      dd0db88d
    • A
      userfaultfd: shmem: __do_fault requires VM_FAULT_NOPAGE · 6bbc4a41
      Andrea Arcangeli 提交于
      __do_fault assumes vmf->page has been initialized and is valid if
      VM_FAULT_NOPAGE is not returned by vma->vm_ops->fault(vma, vmf).
      
      handle_userfault() in turn should return VM_FAULT_NOPAGE if it doesn't
      return VM_FAULT_SIGBUS or VM_FAULT_RETRY (the other two possibilities).
      
      This VM_FAULT_NOPAGE case is only invoked when signal are pending and it
      didn't matter for anonymous memory before.  It only started to matter
      since shmem was introduced.  hugetlbfs also takes a different path and
      doesn't exercise __do_fault.
      
      Link: http://lkml.kernel.org/r/20170228154201.GH5816@redhat.comSigned-off-by: NAndrea Arcangeli <aarcange@redhat.com>
      Reported-by: NDmitry Vyukov <dvyukov@google.com>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6bbc4a41
  2. 02 3月, 2017 2 次提交
  3. 28 2月, 2017 2 次提交
  4. 25 2月, 2017 4 次提交
  5. 23 2月, 2017 16 次提交
  6. 25 1月, 2017 1 次提交
    • A
      userfaultfd: fix SIGBUS resulting from false rwsem wakeups · 15a77c6f
      Andrea Arcangeli 提交于
      With >=32 CPUs the userfaultfd selftest triggered a graceful but
      unexpected SIGBUS because VM_FAULT_RETRY was returned by
      handle_userfault() despite the UFFDIO_COPY wasn't completed.
      
      This seems caused by rwsem waking the thread blocked in
      handle_userfault() and we can't run up_read() before the wait_event
      sequence is complete.
      
      Keeping the wait_even sequence identical to the first one, would require
      running userfaultfd_must_wait() again to know if the loop should be
      repeated, and it would also require retaking the rwsem and revalidating
      the whole vma status.
      
      It seems simpler to wait the targeted wakeup so that if false wakeups
      materialize we still wait for our specific wakeup event, unless of
      course there are signals or the uffd was released.
      
      Debug code collecting the stack trace of the wakeup showed this:
      
        $ ./userfaultfd 100 99999
        nr_pages: 25600, nr_pages_per_cpu: 800
        bounces: 99998, mode: racing ver poll, userfaults: 32 35 90 232 30 138 69 82 34 30 139 40 40 31 20 19 43 13 15 28 27 38 21 43 56 22 1 17 31 8 4 2
        bounces: 99997, mode: rnd ver poll, Bus error (core dumped)
      
          save_stack_trace+0x2b/0x50
          try_to_wake_up+0x2a6/0x580
          wake_up_q+0x32/0x70
          rwsem_wake+0xe0/0x120
          call_rwsem_wake+0x1b/0x30
          up_write+0x3b/0x40
          vm_mmap_pgoff+0x9c/0xc0
          SyS_mmap_pgoff+0x1a9/0x240
          SyS_mmap+0x22/0x30
          entry_SYSCALL_64_fastpath+0x1f/0xbd
          0xffffffffffffffff
          FAULT_FLAG_ALLOW_RETRY missing 70
        CPU: 24 PID: 1054 Comm: userfaultfd Tainted: G        W       4.8.0+ #30
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
        Call Trace:
          dump_stack+0xb8/0x112
          handle_userfault+0x572/0x650
          handle_mm_fault+0x12cb/0x1520
          __do_page_fault+0x175/0x500
          trace_do_page_fault+0x61/0x270
          do_async_page_fault+0x19/0x90
          async_page_fault+0x25/0x30
      
      This always happens when the main userfault selftest thread is running
      clone() while glibc runs either mprotect or mmap (both taking mmap_sem
      down_write()) to allocate the thread stack of the background threads,
      while locking/userfault threads already run at full throttle and are
      susceptible to false wakeups that may cause handle_userfault() to return
      before than expected (which results in graceful SIGBUS at the next
      attempt).
      
      This was reproduced only with >=32 CPUs because the loop to start the
      thread where clone() is too quick with fewer CPUs, while with 32 CPUs
      there's already significant activity on ~32 locking and userfault
      threads when the last background threads are started with clone().
      
      This >=32 CPUs SMP race condition is likely reproducible only with the
      selftest because of the much heavier userfault load it generates if
      compared to real apps.
      
      We'll have to allow "one more" VM_FAULT_RETRY for the WP support and a
      patch floating around that provides it also hidden this problem but in
      reality only is successfully at hiding the problem.
      
      False wakeups could still happen again the second time
      handle_userfault() is invoked, even if it's a so rare race condition
      that getting false wakeups twice in a row is impossible to reproduce.
      This full fix is needed for correctness, the only alternative would be
      to allow VM_FAULT_RETRY to be returned infinitely.  With this fix the WP
      support can stick to a strict "one more" VM_FAULT_RETRY logic (no need
      of returning it infinite times to avoid the SIGBUS).
      
      Link: http://lkml.kernel.org/r/20170111005535.13832-2-aarcange@redhat.comSigned-off-by: NAndrea Arcangeli <aarcange@redhat.com>
      Reported-by: NShubham Kumar Sharma <shubham.kumar.sharma@oracle.com>
      Tested-by: NMike Kravetz <mike.kravetz@oracle.com>
      Acked-by: NHillf Danton <hillf.zj@alibaba-inc.com>
      Cc: Michael Rapoport <RAPOPORT@il.ibm.com>
      Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      15a77c6f
  7. 15 12月, 2016 1 次提交
  8. 27 7月, 2016 1 次提交
  9. 21 5月, 2016 1 次提交
  10. 03 3月, 2016 1 次提交
    • L
      userfaultfd: don't block on the last VM updates at exit time · 39680f50
      Linus Torvalds 提交于
      The exit path will do some final updates to the VM of an exiting process
      to inform others of the fact that the process is going away.
      
      That happens, for example, for robust futex state cleanup, but also if
      the parent has asked for a TID update when the process exits (we clear
      the child tid field in user space).
      
      However, at the time we do those final VM accesses, we've already
      stopped accepting signals, so the usual "stop waiting for userfaults on
      signal" code in fs/userfaultfd.c no longer works, and the process can
      become an unkillable zombie waiting for something that will never
      happen.
      
      To solve this, just make handle_userfault() abort any user fault
      handling if we're already in the exit path past the signal handling
      state being dead (marked by PF_EXITING).
      
      This VM special case is pretty ugly, and it is possible that we should
      look at finalizing signals later (or move the VM final accesses
      earlier).  But in the meantime this is a fairly minimally intrusive fix.
      Reported-and-tested-by: NDmitry Vyukov <dvyukov@google.com>
      Acked-by: NAndrea Arcangeli <aarcange@redhat.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      39680f50
  11. 23 9月, 2015 1 次提交
  12. 18 9月, 2015 1 次提交
  13. 05 9月, 2015 5 次提交
    • A
      userfaultfd: avoid missing wakeups during refile in userfaultfd_read · 2c5b7e1b
      Andrea Arcangeli 提交于
      During the refile in userfaultfd_read both waitqueues could look empty to
      the lockless wake_userfault().  Use a seqcount to prevent this false
      negative that could leave an userfault blocked.
      Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2c5b7e1b
    • A
      userfaultfd: allow signals to interrupt a userfault · dfa37dc3
      Andrea Arcangeli 提交于
      This is only simple to achieve if the userfault is going to return to
      userland (not to the kernel) because we can avoid returning VM_FAULT_RETRY
      despite we temporarily released the mmap_sem.  The fault would just be
      retried by userland then.  This is safe at least on x86 and powerpc (the
      two archs with the syscall implemented so far).
      
      Hint to verify for which archs this is safe: after handle_mm_fault
      returns, no access to data structures protected by the mmap_sem must be
      done by the fault code in arch/*/mm/fault.c until up_read(&mm->mmap_sem)
      is called.
      
      This has two main benefits: signals can run with lower latency in
      production (signals aren't blocked by userfaults and userfaults are
      immediately repeated after signal processing) and gdb can then trivially
      debug the threads blocked in this kind of userfaults coming directly from
      userland.
      
      On a side note: while gdb has a need to get signal processed, coredumps
      always worked perfectly with userfaults, no matter if the userfault is
      triggered by GUP a kernel copy_user or directly from userland.
      Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      dfa37dc3
    • A
      userfaultfd: require UFFDIO_API before other ioctls · e6485a47
      Andrea Arcangeli 提交于
      UFFDIO_API was already forced before read/poll could work.  This makes the
      code more strict to force it also for all other ioctls.
      
      All users would already have been required to call UFFDIO_API before
      invoking other ioctls but this makes it more explicit.
      
      This will ensure we can change all ioctls (all but UFFDIO_API/struct
      uffdio_api) with a bump of uffdio_api.api.
      
      There's no actual plan or need to change the API or the ioctl, the current
      API already should cover fine even the non cooperative usage, but this is
      just for the longer term future just in case.
      Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e6485a47
    • A
      userfaultfd: UFFDIO_COPY and UFFDIO_ZEROPAGE · ad465cae
      Andrea Arcangeli 提交于
      These two ioctl allows to either atomically copy or to map zeropages
      into the virtual address space. This is used by the thread that opened
      the userfaultfd to resolve the userfaults.
      Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com>
      Acked-by: NPavel Emelyanov <xemul@parallels.com>
      Cc: Sanidhya Kashyap <sanidhya.gatech@gmail.com>
      Cc: zhang.zhanghailiang@huawei.com
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Andres Lagar-Cavilla <andreslc@google.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Peter Feiner <pfeiner@google.com>
      Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: "Huangpeng (Peter)" <peter.huangpeng@huawei.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ad465cae
    • A
      userfaultfd: solve the race between UFFDIO_COPY|ZEROPAGE and read · 8d2afd96
      Andrea Arcangeli 提交于
      Solve in-kernel the race between UFFDIO_COPY|ZEROPAGE and
      userfaultfd_read if they are run on different threads simultaneously.
      
      Until now qemu solved the race in userland: the race was explicitly
      and intentionally left for userland to solve. However we can also
      solve it in kernel.
      
      Requiring all users to solve this race if they use two threads (one
      for the background transfer and one for the userfault reads) isn't
      very attractive from an API prospective, furthermore this allows to
      remove a whole bunch of mutex and bitmap code from qemu, making it
      faster. The cost of __get_user_pages_fast should be insignificant
      considering it scales perfectly and the pagetables are already hot in
      the CPU cache, compared to the overhead in userland to maintain those
      structures.
      
      Applying this patch is backwards compatible with respect to the
      userfaultfd userland API, however reverting this change wouldn't be
      backwards compatible anymore.
      
      Without this patch qemu in the background transfer thread, has to read
      the old state, and do UFFDIO_WAKE if old_state is missing but it
      become REQUESTED by the time it tries to set it to RECEIVED (signaling
      the other side received an userfault).
      
          vcpu                background_thr userfault_thr
          -----               -----          -----
          vcpu0 handle_mm_fault()
      
                              postcopy_place_page
                              read old_state -> MISSING
                              UFFDIO_COPY 0x7fb76a139000 (no wakeup, still pending)
      
          vcpu0 fault at 0x7fb76a139000 enters handle_userfault
          poll() is kicked
      
                                              poll() -> POLLIN
                                              read() -> 0x7fb76a139000
                                              postcopy_pmi_change_state(MISSING, REQUESTED) -> REQUESTED
      
                              tmp_state = postcopy_pmi_change_state(old_state, RECEIVED) -> REQUESTED
                              /* check that no userfault raced with UFFDIO_COPY */
                              if (old_state == MISSING && tmp_state == REQUESTED)
                                      UFFDIO_WAKE from background thread
      
      And a second case where a UFFDIO_WAKE would be needed is in the userfault thread:
      
          vcpu                background_thr userfault_thr
          -----               -----          -----
          vcpu0 handle_mm_fault()
      
                              postcopy_place_page
                              read old_state -> MISSING
                              UFFDIO_COPY 0x7fb76a139000 (no wakeup, still pending)
                              tmp_state = postcopy_pmi_change_state(old_state, RECEIVED) -> RECEIVED
      
          vcpu0 fault at 0x7fb76a139000 enters handle_userfault
          poll() is kicked
      
                                              poll() -> POLLIN
                                              read() -> 0x7fb76a139000
      
                                              if (postcopy_pmi_change_state(MISSING, REQUESTED) == RECEIVED)
                                                      UFFDIO_WAKE from userfault thread
      
      This patch removes the need of both UFFDIO_WAKE and of the associated
      per-page tristate as well.
      Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com>
      Acked-by: NPavel Emelyanov <xemul@parallels.com>
      Cc: Sanidhya Kashyap <sanidhya.gatech@gmail.com>
      Cc: zhang.zhanghailiang@huawei.com
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Andres Lagar-Cavilla <andreslc@google.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Peter Feiner <pfeiner@google.com>
      Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: "Huangpeng (Peter)" <peter.huangpeng@huawei.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8d2afd96