1. 01 Jul, 2021 2 commits
  2. 30 Jun, 2021 2 commits
  3. 25 Jun, 2021 3 commits
  4. 17 Jun, 2021 2 commits
    • mm/memory-failure: make sure wait for page writeback in memory_failure · e8675d29
      yangerkun committed
      Our syzkaller run triggered the "BUG_ON(!list_empty(&inode->i_wb_list))" in
      clear_inode:
      
        kernel BUG at fs/inode.c:519!
        Internal error: Oops - BUG: 0 [#1] SMP
        Modules linked in:
        Process syz-executor.0 (pid: 249, stack limit = 0x00000000a12409d7)
        CPU: 1 PID: 249 Comm: syz-executor.0 Not tainted 4.19.95
        Hardware name: linux,dummy-virt (DT)
        pstate: 80000005 (Nzcv daif -PAN -UAO)
        pc : clear_inode+0x280/0x2a8
        lr : clear_inode+0x280/0x2a8
        Call trace:
          clear_inode+0x280/0x2a8
          ext4_clear_inode+0x38/0xe8
          ext4_free_inode+0x130/0xc68
          ext4_evict_inode+0xb20/0xcb8
          evict+0x1a8/0x3c0
          iput+0x344/0x460
          do_unlinkat+0x260/0x410
          __arm64_sys_unlinkat+0x6c/0xc0
          el0_svc_common+0xdc/0x3b0
          el0_svc_handler+0xf8/0x160
          el0_svc+0x10/0x218
        Kernel panic - not syncing: Fatal exception
      
      A crash dump of this problem shows that someone called __munlock_pagevec
      to clear page LRU without lock_page: do_mmap -> mmap_region -> do_munmap
      -> munlock_vma_pages_range -> __munlock_pagevec.
      
      As a result, memory_failure will call identify_page_state without
      wait_on_page_writeback.  Then, after truncate_error_page clears the
      mapping of this page, end_page_writeback won't call
      sb_clear_inode_writeback to clear inode->i_wb_list.  That will trigger
      the BUG_ON in clear_inode!
      
      Fix it by also checking PageWriteback when deciding whether to skip
      wait_on_page_writeback.
      
      Link: https://lkml.kernel.org/r/20210604084705.3729204-1-yangerkun@huawei.com
      Fixes: 0bc1f8b0 ("hwpoison: fix the handling path of the victimized page frame that belong to non-LRU")
      Signed-off-by: yangerkun <yangerkun@huawei.com>
      Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Oscar Salvador <osalvador@suse.de>
      Cc: Yu Kuai <yukuai3@huawei.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
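      
      The decision described in this commit can be modeled outside the kernel.
      The following is a minimal user-space sketch, not the kernel patch itself:
      struct model_page and the model_* helpers are hypothetical stand-ins for
      PageLRU() and PageWriteback(), and the sketch assumes the pre-fix check
      looked only at the LRU flag.
      
      ```c
      #include <stdbool.h>
      
      /* Hypothetical stand-in for the two page flags discussed above. */
      struct model_page {
          bool lru;        /* models PageLRU() */
          bool writeback;  /* models PageWriteback() */
      };
      
      /* Assumed pre-fix decision: skip the writeback wait for any non-LRU page. */
      static bool model_skip_wait_before(const struct model_page *p)
      {
          return !p->lru;
      }
      
      /* Post-fix decision: never skip while the page is still under writeback. */
      static bool model_skip_wait_after(const struct model_page *p)
      {
          return !p->lru && !p->writeback;
      }
      ```
      
      For a page whose LRU flag was cleared while writeback is still in flight,
      the first helper returns true (the wait is skipped) and the second returns
      false (the wait still happens).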
    • mm,hwpoison: fix race with hugetlb page allocation · 25182f05
      Naoya Horiguchi committed
      When a hugetlb page fault (in an overcommit situation) races with
      memory_failure(), VM_BUG_ON_PAGE() is triggered by the following
      sequence:
      
          CPU0:                           CPU1:
      
                                          gather_surplus_pages()
                                            page = alloc_surplus_huge_page()
          memory_failure_hugetlb()
            get_hwpoison_page(page)
              __get_hwpoison_page(page)
                get_page_unless_zero(page)
                                            zero = put_page_testzero(page)
                                            VM_BUG_ON_PAGE(!zero, page)
                                            enqueue_huge_page(h, page)
            put_page(page)
      
      __get_hwpoison_page() only checks the page refcount before taking an
      additional one for memory error handling, which is not enough because
      there's a time window where compound pages have non-zero refcount during
      hugetlb page initialization.
      
      So make __get_hwpoison_page() check the page status more thoroughly for
      hugetlb pages with get_hwpoison_huge_page().  Checking hugetlb-specific
      flags under hugetlb_lock makes sure that the hugetlb page is not in a
      transitional state.  Notably, another new function, HWPoisonHandlable(),
      helps prevent a race against other transitional page states (like a
      generic compound page just before PageHuge becomes true).
      
      Link: https://lkml.kernel.org/r/20210603233632.2964832-2-nao.horiguchi@gmail.com
      Fixes: ead07f6a ("mm/memory-failure: introduce get_hwpoison_page() for consistent refcount handling")
      Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
      Reported-by: Muchun Song <songmuchun@bytedance.com>
      Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Oscar Salvador <osalvador@suse.de>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: <stable@vger.kernel.org>	[5.12+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
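      
      The difference between a refcount-only check and a state check under a
      lock can be illustrated with a small user-space model. This is only a
      sketch under stated assumptions: struct model_hpage, model_hugetlb_lock,
      and the model_* helpers are hypothetical, and the hugetlb-specific flags
      are collapsed into a single boolean that is set once initialization is
      complete.
      
      ```c
      #include <stdatomic.h>
      #include <stdbool.h>
      
      /* Hypothetical user-space stand-in for a compound page. */
      struct model_hpage {
          atomic_int refcount;
          bool hugetlb_active;  /* assumed: set once hugetlb setup is complete */
      };
      
      /* Tiny spinlock standing in for hugetlb_lock. */
      static atomic_flag model_hugetlb_lock = ATOMIC_FLAG_INIT;
      
      /* Models get_page_unless_zero(): take a reference only if the count is non-zero. */
      static bool model_get_unless_zero(struct model_hpage *p)
      {
          int old = atomic_load(&p->refcount);
          while (old != 0) {
              /* On failure, old is reloaded with the current count. */
              if (atomic_compare_exchange_weak(&p->refcount, &old, old + 1))
                  return true;
          }
          return false;
      }
      
      /* Refcount-only check: succeeds even for a page that is still being set up. */
      static bool model_get_refcount_only(struct model_hpage *p)
      {
          return model_get_unless_zero(p);
      }
      
      /* Checks the page state under the lock before taking a reference. */
      static bool model_get_checked(struct model_hpage *p)
      {
          bool got = false;
      
          while (atomic_flag_test_and_set(&model_hugetlb_lock))
              ;  /* spin until the lock is free */
          if (p->hugetlb_active)
              got = model_get_unless_zero(p);
          atomic_flag_clear(&model_hugetlb_lock);
          return got;
      }
      ```
      
      With a page that has a non-zero refcount but is not yet active,
      model_get_refcount_only() takes the extra reference that the allocation
      path does not expect, while model_get_checked() leaves the count alone.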
  5. 07 May, 2021 1 commit
  6. 01 May, 2021 1 commit
  7. 27 Feb, 2021 1 commit
  8. 25 Feb, 2021 1 commit
  9. 25 Jan, 2021 1 commit
  10. 13 Jan, 2021 1 commit
  11. 16 Dec, 2020 8 commits
  12. 15 Nov, 2020 1 commit
    • hugetlbfs: fix anon huge page migration race · 336bf30e
      Mike Kravetz committed
      Qian Cai reported the following BUG in [1]:
      
        LTP: starting move_pages12
        BUG: unable to handle page fault for address: ffffffffffffffe0
        ...
        RIP: 0010:anon_vma_interval_tree_iter_first+0xa2/0x170 avc_start_pgoff at mm/interval_tree.c:63
        Call Trace:
          rmap_walk_anon+0x141/0xa30 rmap_walk_anon at mm/rmap.c:1864
          try_to_unmap+0x209/0x2d0 try_to_unmap at mm/rmap.c:1763
          migrate_pages+0x1005/0x1fb0
          move_pages_and_store_status.isra.47+0xd7/0x1a0
          __x64_sys_move_pages+0xa5c/0x1100
          do_syscall_64+0x5f/0x310
          entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Hugh Dickins diagnosed this as a migration bug caused by code introduced
      to use i_mmap_rwsem for pmd sharing synchronization.  Specifically, the
      routine unmap_and_move_huge_page() is always passing the TTU_RMAP_LOCKED
      flag to try_to_unmap() while holding i_mmap_rwsem.  This is wrong for
      anon pages as the anon_vma_lock should be held in this case.  Further
      analysis suggested that i_mmap_rwsem was not required to be held at all
      when calling try_to_unmap for anon pages as an anon page could never be
      part of a shared pmd mapping.
      
      Discussion also revealed that the hack in hugetlb_page_mapping_lock_write
      to drop page lock and acquire i_mmap_rwsem is wrong.  There is no way to
      keep mapping valid while dropping page lock.
      
      This patch does the following:
      
       - Do not take i_mmap_rwsem and set TTU_RMAP_LOCKED for anon pages when
         calling try_to_unmap.
      
       - Remove the hacky code in hugetlb_page_mapping_lock_write. The routine
         will now simply do a 'trylock' while still holding the page lock. If
         the trylock fails, it will return NULL. This could impact the
         callers:
      
          - migration calling code will receive -EAGAIN and retry up to the
            hard coded limit (10).
      
          - memory error code will treat the page as BUSY. This will force
            killing (SIGKILL) any mapping tasks instead of sending SIGBUS.
      
         Do note that this change in behavior only happens when there is a
         race. None of the standard kernel testing suites actually hit this
         race, but it is possible.
      
      [1] https://lore.kernel.org/lkml/20200708012044.GC992@lca.pw/
      [2] https://lore.kernel.org/linux-mm/alpine.LSU.2.11.2010071833100.2214@eggly.anvils/
      
      Fixes: c0d0381a ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization")
      Reported-by: Qian Cai <cai@lca.pw>
      Suggested-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lkml.kernel.org/r/20201105195058.78401-1-mike.kravetz@oracle.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
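      
      The trylock-and-retry behavior described for the migration caller can be
      sketched in user space as follows. This is an illustration under stated
      assumptions, not the kernel code: struct model_mapping and every model_*
      function are hypothetical, and only the limit of 10 retries and the
      -EAGAIN return value come from the commit message.
      
      ```c
      #include <errno.h>
      #include <stdbool.h>
      #include <stddef.h>
      
      #define MODEL_MAX_RETRIES 10  /* the hard-coded limit mentioned above */
      
      /* Hypothetical mapping whose lock can only be try-locked. */
      struct model_mapping {
          bool locked;
      };
      
      /* Models the reworked routine: trylock only, NULL on contention. */
      static struct model_mapping *model_mapping_trylock(struct model_mapping *m)
      {
          if (m == NULL || m->locked)
              return NULL;
          m->locked = true;
          return m;
      }
      
      static void model_mapping_unlock(struct model_mapping *m)
      {
          m->locked = false;
      }
      
      /* One migration attempt: -EAGAIN if the lock cannot be taken. */
      static int model_migrate_once(struct model_mapping *m)
      {
          struct model_mapping *held = model_mapping_trylock(m);
      
          if (held == NULL)
              return -EAGAIN;
          /* The unmap-and-move work would happen here. */
          model_mapping_unlock(held);
          return 0;
      }
      
      /* Caller-side retry loop, bounded by the retry limit. */
      static int model_migrate(struct model_mapping *m, int *attempts)
      {
          int rc = -EAGAIN;
      
          for (*attempts = 0; *attempts < MODEL_MAX_RETRIES && rc == -EAGAIN; ) {
              rc = model_migrate_once(m);
              (*attempts)++;
          }
          return rc;
      }
      ```
      
      When the lock is free, model_migrate() succeeds on the first attempt.
      When another holder never releases it, the loop gives up after
      MODEL_MAX_RETRIES attempts and returns -EAGAIN.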
  13. 19 Oct, 2020 1 commit
  14. 17 Oct, 2020 12 commits
  15. 14 Oct, 2020 2 commits
  16. 25 Sep, 2020 1 commit