1. 08 Aug, 2020 2 commits
  2. 25 Jul, 2020 1 commit
  3. 21 Jul, 2020 1 commit
  4. 17 Jul, 2020 1 commit
  5. 26 Jun, 2020 3 commits
  6. 10 Jun, 2020 6 commits
  7. 05 Jun, 2020 2 commits
  8. 04 Jun, 2020 6 commits
  9. 03 Jun, 2020 1 commit
  10. 27 May, 2020 2 commits
  11. 11 Apr, 2020 2 commits
  12. 08 Apr, 2020 7 commits
    • mm: fix ambiguous comments for better code readability · 552657b7
      chenqiwu authored
      The @pfn parameter that callers pass to remap_pfn_range() is actually
      a page-frame number derived from the corresponding physical address of
      kernel memory.  The original comment is ambiguous and may mislead users.
      
      Meanwhile, the comment of vm_area_struct contains the ambiguous typo
      "VMM".  Fixing both makes the code more readable.
      Signed-off-by: chenqiwu <chenqiwu@xiaomi.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1583026921-15279-1-git-send-email-qiwuchen55@gmail.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
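To illustrate the relation between a physical address and the page-frame number that remap_pfn_range() expects as @pfn, here is a minimal userspace sketch. It assumes 4 KiB pages, and the helper names (phys_to_pfn, pfn_to_phys, pfn_for_aligned_buffer) are hypothetical stand-ins rather than the kernel's own macros.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages; other page sizes only change PAGE_SHIFT. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  (1ULL << PAGE_SHIFT)

/* Page-frame number of the page that contains physical address phys. */
static inline uint64_t phys_to_pfn(uint64_t phys)
{
    return phys >> PAGE_SHIFT;
}

/* Physical address of the first byte of page frame pfn. */
static inline uint64_t pfn_to_phys(uint64_t pfn)
{
    return pfn << PAGE_SHIFT;
}

/*
 * Hypothetical helper: a driver mapping a page-aligned buffer would pass
 * the returned value as @pfn, not the raw physical address.
 */
static inline uint64_t pfn_for_aligned_buffer(uint64_t phys)
{
    assert((phys & (PAGE_SIZE - 1)) == 0); /* must start on a page boundary */
    return phys_to_pfn(phys);
}
```

Passing the physical address itself instead of the shifted value would map memory that is PAGE_SIZE times further away, which is the kind of mistake the clarified comment is meant to prevent.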
    • userfaultfd: wp: support swap and page migration · f45ec5ff
      Peter Xu authored
      For both swap and page migration, we use bit 2 of the entry to
      identify whether this entry is uffd write-protected.  It plays a similar
      role to the existing soft dirty bit in swap entries, but only for keeping
      the uffd-wp tracking for a specific PTE/PMD.
      
      Something special here is that when we want to recover the uffd-wp bit
      from a swap/migration entry to the PTE bit we'll also need to take care of
      the _PAGE_RW bit and make sure it's cleared, otherwise even with the
      _PAGE_UFFD_WP bit we can't trap it at all.
      
      In change_pte_range() we do nothing for uffd if the PTE is a swap entry.
      That can lead to data mismatch if the page that we are going to write
      protect is swapped out when sending the UFFDIO_WRITEPROTECT.  This patch
      also applies/removes the uffd-wp bit even for the swap entries.
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Bobby Powers <bobbypowers@gmail.com>
      Cc: Brian Geffon <bgeffon@google.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Denis Plotnikov <dplotnikov@virtuozzo.com>
      Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
      Cc: Martin Cracauer <cracauer@cons.org>
      Cc: Marty McFadden <mcfadden8@llnl.gov>
      Cc: Maya Gokhale <gokhale2@llnl.gov>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Shaohua Li <shli@fb.com>
      Link: http://lkml.kernel.org/r/20200220163112.11409-11-peterx@redhat.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
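The bit handling described in this commit can be modeled in userspace as follows. This is only a sketch: the bit positions are illustrative values modeled on x86-64, and the function names (swp_set_uffd_wp, restore_pte) are hypothetical, not the kernel's helpers.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit positions; treat them as assumptions. */
#define PTE_RW          (1UL << 1)   /* stands in for _PAGE_RW */
#define PTE_UFFD_WP     (1UL << 10)  /* stands in for _PAGE_UFFD_WP */
#define SWP_PTE_UFFD_WP (1UL << 2)   /* "bit 2" of a swap/migration entry */

/* Apply or remove uffd-wp tracking on a swap or migration entry. */
static inline unsigned long swp_set_uffd_wp(unsigned long swp_pte, bool wp)
{
    return wp ? (swp_pte | SWP_PTE_UFFD_WP) : (swp_pte & ~SWP_PTE_UFFD_WP);
}

/* Rebuild a present PTE when the page is swapped in or migration ends. */
static inline unsigned long restore_pte(unsigned long swp_pte,
                                        unsigned long new_pte)
{
    if (swp_pte & SWP_PTE_UFFD_WP) {
        new_pte |= PTE_UFFD_WP;
        /* Without clearing the write bit the write would never trap. */
        new_pte &= ~PTE_RW;
    }
    /* A uffd write-protected PTE must not be writable. */
    assert(!(new_pte & PTE_UFFD_WP) || !(new_pte & PTE_RW));
    return new_pte;
}
```

The point of the model is the pairing: carrying the tracking bit across swap is not enough on its own, because the restored PTE also has to lose its write permission for the fault to be raised.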
    • userfaultfd: wp: drop _PAGE_UFFD_WP properly when fork · b569a176
      Peter Xu authored
      UFFD_EVENT_FORK support for uffd-wp should already be there, except that
      we should clear the uffd-wp bit if the uffd fork event is not enabled.
      Detect that to avoid _PAGE_UFFD_WP being set even if the VMA is not being
      tracked by VM_UFFD_WP.  Do this for both small PTEs and huge PMDs.
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Jerome Glisse <jglisse@redhat.com>
      Reviewed-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Bobby Powers <bobbypowers@gmail.com>
      Cc: Brian Geffon <bgeffon@google.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Denis Plotnikov <dplotnikov@virtuozzo.com>
      Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
      Cc: Martin Cracauer <cracauer@cons.org>
      Cc: Marty McFadden <mcfadden8@llnl.gov>
      Cc: Maya Gokhale <gokhale2@llnl.gov>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Shaohua Li <shli@fb.com>
      Link: http://lkml.kernel.org/r/20200220163112.11409-9-peterx@redhat.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
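A rough userspace model of the fork-time rule: when the child's VMA does not carry VM_UFFD_WP (the fork event was not enabled), the copied entry must lose its uffd-wp bit. The constants are illustrative and the function name copy_pte_for_child is hypothetical.

```c
#include <assert.h>

/* Illustrative values; treat them as assumptions. */
#define VMA_UFFD_WP (1UL << 12)  /* stands in for VM_UFFD_WP in vm_flags */
#define PTE_UFFD_WP (1UL << 10)  /* stands in for _PAGE_UFFD_WP */

/*
 * Return the PTE value to install in the child, given the parent's PTE
 * and the child's VMA flags.
 */
static inline unsigned long copy_pte_for_child(unsigned long parent_pte,
                                               unsigned long child_vm_flags)
{
    unsigned long pte = parent_pte;

    /* The child is not tracked by uffd-wp, so drop the stale bit. */
    if (!(child_vm_flags & VMA_UFFD_WP))
        pte &= ~PTE_UFFD_WP;

    /* Invariant: the bit survives only inside a tracked VMA. */
    assert((child_vm_flags & VMA_UFFD_WP) || !(pte & PTE_UFFD_WP));
    return pte;
}
```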
    • userfaultfd: wp: apply _PAGE_UFFD_WP bit · 292924b2
      Peter Xu authored
      Firstly, introduce two new flags MM_CP_UFFD_WP[_RESOLVE] for
      change_protection() when used with uffd-wp and make sure the two new flags
      are exclusively used.  Then,
      
        - For MM_CP_UFFD_WP: apply the _PAGE_UFFD_WP bit and remove _PAGE_RW
          when a range of memory is write protected by uffd
      
        - For MM_CP_UFFD_WP_RESOLVE: remove the _PAGE_UFFD_WP bit and recover
          _PAGE_RW when write protection is resolved from userspace
      
      And use this new interface in mwriteprotect_range() to replace the old
      MM_CP_DIRTY_ACCT.
      
      Do this change for both PTEs and huge PMDs.  Then we can start to identify
      which PTE/PMD is write protected by general mechanisms (e.g., COW or soft
      dirty tracking), and which is write protected for userfaultfd-wp.
      
      Since we should keep the _PAGE_UFFD_WP when doing pte_modify(), add it
      into _PAGE_CHG_MASK as well.  Meanwhile, since we have this new bit, we
      can be even more strict when detecting uffd-wp page faults in either
      do_wp_page() or wp_huge_pmd().
      
      With _PAGE_UFFD_WP in place, a special case is when a page is protected
      both by the general COW logic and by userfault-wp.  Here userfault-wp
      has higher priority and is handled first.  Only after the uffd-wp bit is
      cleared on the PTE/PMD do we continue to handle the general COW.  These
      are the steps that happen with such a page:
      
        1. CPU accesses write protected shared page (so both protected by
           general COW and uffd-wp), blocked by uffd-wp first because in
           do_wp_page we handle uffd-wp first, so it has higher priority
           than general COW.
      
        2. Uffd service thread receives the request and does
           UFFDIO_WRITEPROTECT to remove the uffd-wp bit from the PTE/PMD.
           However, here we still keep the write bit cleared.  Notify the
           blocked CPU.
      
        3. The blocked CPU resumes the page fault process with a fault
           retry.  During the retry it notices that the uffd-wp bit is no
           longer set but the page is still write protected by general COW,
           so it goes through the COW path in the fault handler, copies the
           page, applies the write bit where necessary, and retries again.
      
        4. The CPU will be able to access this page with write bit set.
      Suggested-by: Andrea Arcangeli <aarcange@redhat.com>
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Brian Geffon <bgeffon@google.com>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Martin Cracauer <cracauer@cons.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Bobby Powers <bobbypowers@gmail.com>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
      Cc: Maya Gokhale <gokhale2@llnl.gov>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Marty McFadden <mcfadden8@llnl.gov>
      Cc: Denis Plotnikov <dplotnikov@virtuozzo.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Shaohua Li <shli@fb.com>
      Link: http://lkml.kernel.org/r/20200220163112.11409-8-peterx@redhat.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
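The two change_protection() flags and the fault ordering from the numbered steps in this commit can be sketched as a small userspace state model. Everything here is an assumption made for illustration: the bit values, the may_write parameter (false for a COW-shared page whose write bit must stay cleared), and the names change_pte and classify_write_fault.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values; treat them as assumptions. */
#define PTE_RW              (1UL << 1)   /* stands in for _PAGE_RW */
#define PTE_UFFD_WP         (1UL << 10)  /* stands in for _PAGE_UFFD_WP */
#define CP_UFFD_WP          (1UL << 2)   /* models MM_CP_UFFD_WP */
#define CP_UFFD_WP_RESOLVE  (1UL << 3)   /* models MM_CP_UFFD_WP_RESOLVE */

enum wp_fault_action { WP_FAULT_NONE, WP_FAULT_NOTIFY_UFFD, WP_FAULT_COW };

/* Apply one of the two mutually exclusive uffd-wp flags to a PTE. */
static inline unsigned long change_pte(unsigned long pte,
                                       unsigned long cp_flags,
                                       bool may_write)
{
    bool wp = (cp_flags & CP_UFFD_WP) != 0;
    bool resolve = (cp_flags & CP_UFFD_WP_RESOLVE) != 0;

    assert(!(wp && resolve)); /* the flags must be used exclusively */

    if (wp) {
        pte |= PTE_UFFD_WP;
        pte &= ~PTE_RW;
    } else if (resolve) {
        pte &= ~PTE_UFFD_WP;
        if (may_write)        /* stays read-only when COW still applies */
            pte |= PTE_RW;
    }
    return pte;
}

/* Decide what a write access to this PTE triggers; uffd-wp goes first. */
static inline enum wp_fault_action classify_write_fault(unsigned long pte)
{
    if (pte & PTE_RW)
        return WP_FAULT_NONE;
    if (pte & PTE_UFFD_WP)
        return WP_FAULT_NOTIFY_UFFD;
    return WP_FAULT_COW;
}
```

Walking a COW-shared page through the model reproduces the sequence above: the first write is classified as a uffd notification, the write after the resolve is classified as COW, and only once the write bit is set does the access go through.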
    • userfaultfd: wp: hook userfault handler to write protection fault · 529b930b
      Andrea Arcangeli authored
      There are several cases in which a write protection fault happens.  It
      could be a write to the zero page, a swapped page or a userfault write
      protected page.  When the fault happens, there is no way to know whether
      userfault write protected the page before.  Here we just blindly issue a
      userfault notification for any vma with VM_UFFD_WP, regardless of whether
      the app has write protected it yet.  The application should be ready to
      handle such a wp fault.
      
      In the swapin case, always swapin as readonly.  This will cause false
      positive userfaults.  We need to decide later whether to eliminate them
      with a flag like soft-dirty in the swap entry (see _PAGE_SWP_SOFT_DIRTY).
      
      hugetlbfs wouldn't need to worry about swapouts, and tmpfs would be
      handled by a swap entry bit like anonymous memory.
      
      The main problem, with no easy solution to eliminate the false positives,
      will be if/when userfaultfd is extended to real filesystem pagecache.
      When the pagecache is freed by reclaim we can't leave the radix tree
      pinned if the inode and in turn the radix tree is reclaimed as well.
      
      The estimation is that full accuracy and lack of false positives could be
      easily provided only to anonymous memory (as long as there's no fork or as
      long as MADV_DONTFORK is used on the userfaultfd anonymous range), tmpfs
      and hugetlbfs.  It's most certainly worth achieving, but in a later
      incremental patch.
      
      [peterx@redhat.com: don't conditionally drop FAULT_FLAG_WRITE in do_swap_page]
      Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Reviewed-by: Jerome Glisse <jglisse@redhat.com>
      Cc: Shaohua Li <shli@fb.com>
      Cc: Bobby Powers <bobbypowers@gmail.com>
      Cc: Brian Geffon <bgeffon@google.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Denis Plotnikov <dplotnikov@virtuozzo.com>
      Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
      Cc: Martin Cracauer <cracauer@cons.org>
      Cc: Marty McFadden <mcfadden8@llnl.gov>
      Cc: Maya Gokhale <gokhale2@llnl.gov>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Cc: Rik van Riel <riel@redhat.com>
      Link: http://lkml.kernel.org/r/20200220163112.11409-3-peterx@redhat.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: remove CONFIG_TRANSPARENT_HUGE_PAGECACHE · 396bcc52
      Matthew Wilcox (Oracle) authored
      Commit e496cf3d ("thp: introduce CONFIG_TRANSPARENT_HUGE_PAGECACHE")
      notes that it should be reverted when the PowerPC problem was fixed.  The
      commit fixing the PowerPC problem (953c66c2) did not revert the
      commit; instead setting CONFIG_TRANSPARENT_HUGE_PAGECACHE to the same as
      CONFIG_TRANSPARENT_HUGEPAGE.  Checking with Kirill and Aneesh, this was an
      oversight, so remove the Kconfig symbol and undo the work of commit
      e496cf3d.
      Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
      Link: http://lkml.kernel.org/r/20200318140253.6141-6-willy@infradead.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/vma: make vma_is_accessible() available for general use · 3122e80e
      Anshuman Khandual authored
      Let's move the vma_is_accessible() helper to include/linux/mm.h, which
      makes it available for general use.  While here, this replaces all
      remaining open encodings of the VMA access check with vma_is_accessible().
      Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Acked-by: Guo Ren <guoren@kernel.org>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paul Burton <paulburton@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Nick Piggin <npiggin@gmail.com>
      Cc: Paul Mackerras <paulus@ozlabs.org>
      Cc: Will Deacon <will@kernel.org>
      Link: http://lkml.kernel.org/r/1582520593-30704-3-git-send-email-anshuman.khandual@arm.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
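As a rough illustration of what such an access check amounts to, the sketch below treats a VMA as accessible when any of the read, write or exec permission bits is set. It is a userspace model: struct vma_model and ACCESS_MASK are hypothetical names, and the flag values are assumptions chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed flag values for the model. */
#define VM_READ   0x1UL
#define VM_WRITE  0x2UL
#define VM_EXEC   0x4UL
#define VM_SHARED 0x8UL

/* Hypothetical name for the combined permission mask. */
#define ACCESS_MASK (VM_READ | VM_WRITE | VM_EXEC)

/* Minimal stand-in for struct vm_area_struct. */
struct vma_model {
    unsigned long vm_flags;
};

/* True when the mapping permits at least one kind of access. */
static inline bool vma_is_accessible(const struct vma_model *vma)
{
    assert(vma != NULL);
    return (vma->vm_flags & ACCESS_MASK) != 0;
}
```

An open-coded check such as `vma->vm_flags & (VM_READ | VM_EXEC | VM_WRITE)` at each call site expresses the same test; naming it once is what lets the remaining call sites be replaced consistently. A PROT_NONE style mapping, which has none of the three bits, is reported as inaccessible.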
  13. 03 Apr, 2020 2 commits
  14. 25 Mar, 2020 1 commit
    • mm: Split huge pages on write-notify or COW · 327e9fd4
      Thomas Hellstrom (VMware) authored
      The functions wp_huge_pmd() and wp_huge_pud() currently rely on the
      huge_fault() callback to split huge page table entries if needed.
      However, for module users that requires exporting the split_huge_xxx()
      functionality, which may be undesired. Instead, split pre-existing huge
      page-table entries on VM_FAULT_FALLBACK return.
      
      We currently only do COW and write-notify on the PTE level, so if the
      huge_fault() handler returns VM_FAULT_FALLBACK on wp faults,
      split the huge pages and page-table entries. Also do this for huge PUDs
      if there is no huge_fault() handler and the vma is not anonymous, similar
      to how it's done for PMDs.
      
      Note that fs/dax.c still does the splitting in the huge_fault() handler.
      A follow-up patch can remove the dax.c split_huge_pmd() if needed.
      
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Ralph Campbell <rcampbell@nvidia.com>
      Cc: "Jérôme Glisse" <jglisse@redhat.com>
      Cc: "Christian König" <christian.koenig@amd.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
      Acked-by: Christian König <christian.koenig@amd.com>
      Acked-by: Andrew Morton <akpm@linux-foundation.org>
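The control flow this commit describes for a write-protect fault on a huge PMD can be modeled in userspace roughly as below. This is a sketch under assumptions: struct fault_model, the handler type and the two example handlers are hypothetical, the anonymous-VMA path is reduced to a placeholder return, and setting pmd_split stands in for the actual page-table split.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed value for the model's fallback code. */
#define FAULT_FALLBACK 0x800u

typedef unsigned int (*huge_fault_fn)(void);

/* Hypothetical stand-in for the fault context. */
struct fault_model {
    bool anon;                /* anonymous VMA: handled by a separate path */
    huge_fault_fn huge_fault; /* optional driver/filesystem callback */
    bool pmd_split;           /* set when the huge entry was split */
};

/* Example callbacks: one handles the fault, one asks for fallback. */
static inline unsigned int handler_done(void)     { return 0u; }
static inline unsigned int handler_fallback(void) { return FAULT_FALLBACK; }

static inline unsigned int wp_huge_pmd_model(struct fault_model *f)
{
    assert(f != NULL);

    if (f->anon)
        return 0u; /* placeholder for the anonymous huge-page wp path */

    if (f->huge_fault) {
        unsigned int ret = f->huge_fault();
        if (!(ret & FAULT_FALLBACK))
            return ret; /* the callback fully handled the fault */
    }

    /* COW or write-notify is handled at PTE level, so split the entry. */
    f->pmd_split = true;
    return FAULT_FALLBACK;
}
```

The design point is that the split happens in core code after the callback returns, so a module's huge_fault() handler only has to report the fallback and never needs access to the split functions itself.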
  15. 06 Mar, 2020 1 commit
  16. 16 Jan, 2020 2 commits
    • mm, drm/ttm: Fix vm page protection handling · 5379e4dd
      Thomas Hellstrom authored
      TTM graphics buffer objects may, transparently to user-space, move
      between IO and system memory. When that happens, all PTEs pointing to the
      old location are zapped before the move and then faulted in again if
      needed. At that point, the page protection caching-mode and
      encryption bits may change and differ from those of
      struct vm_area_struct::vm_page_prot.
      
      We were using an ugly hack to set the page protection correctly.
      Fix that and instead export and use vmf_insert_mixed_prot() or use
      vmf_insert_pfn_prot().
      Also get the default page protection from
      struct vm_area_struct::vm_page_prot rather than using vm_get_page_prot().
      This way we catch modifications done by the vm system for drivers that
      want write-notification.
      
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Ralph Campbell <rcampbell@nvidia.com>
      Cc: "Jérôme Glisse" <jglisse@redhat.com>
      Cc: "Christian König" <christian.koenig@amd.com>
      Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Acked-by: Andrew Morton <akpm@linux-foundation.org>
    • mm: Add a vmf_insert_mixed_prot() function · 574c5b3d
      Thomas Hellstrom authored
      The TTM module today uses a hack to be able to set a different page
      protection than struct vm_area_struct::vm_page_prot. To be able to do
      this properly, add the needed vm functionality as vmf_insert_mixed_prot().
      
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Ralph Campbell <rcampbell@nvidia.com>
      Cc: "Jérôme Glisse" <jglisse@redhat.com>
      Cc: "Christian König" <christian.koenig@amd.com>
      Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
      Acked-by: Christian König <christian.koenig@amd.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Andrew Morton <akpm@linux-foundation.org>