1. 04 Oct, 2022 1 commit
  2. 12 Sep, 2022 1 commit
  3. 03 Aug, 2022 1 commit
  4. 04 Jul, 2022 3 commits
  5. 29 Jun, 2022 1 commit
  6. 28 May, 2022 1 commit
  7. 20 May, 2022 2 commits
  8. 13 May, 2022 3 commits
  9. 10 May, 2022 3 commits
  10. 18 Mar, 2022 1 commit
    • mm: swap: get rid of livelock in swapin readahead · 029c4628
      Authored by Guo Ziliang
      In our testing, a livelocked task was found.  Through sysrq printing,
      the same stack was seen every time, as follows:
      
        __swap_duplicate+0x58/0x1a0
        swapcache_prepare+0x24/0x30
        __read_swap_cache_async+0xac/0x220
        read_swap_cache_async+0x58/0xa0
        swapin_readahead+0x24c/0x628
        do_swap_page+0x374/0x8a0
        __handle_mm_fault+0x598/0xd60
        handle_mm_fault+0x114/0x200
        do_page_fault+0x148/0x4d0
        do_translation_fault+0xb0/0xd4
        do_mem_abort+0x50/0xb0
      
      The reason for the livelock is that swapcache_prepare() always returns
      EEXIST, indicating that SWAP_HAS_CACHE has not been cleared, so the
      task can never break out of the retry loop.  We suspected that the
      task which clears the SWAP_HAS_CACHE flag never gets a chance to run.
      To test this, we lowered the priority of the livelocked task so that
      the flag-clearing task could run; the system returned to normal once
      the priority was lowered.
      
      In our testing, multiple real-time tasks are bound to the same core,
      and the livelocked task is the highest-priority task on that core, so
      it cannot be preempted.
      
      Although __read_swap_cache_async() calls cond_resched(), cond_resched()
      is effectively an empty function on a preemptible kernel and so does
      not release the CPU.  A high-priority task releases the CPU only when
      preempted by an even higher-priority task; when it is already the
      highest-priority task on its core, no other task can be scheduled.  So
      we replace cond_resched() with schedule_timeout_uninterruptible(1),
      which first calls set_current_state() to change the task state and
      remove the task from the run queue, actually yielding the CPU and
      preventing the task from running in kernel mode for too long.
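      
      A minimal sketch of the retry loop in __read_swap_cache_async()
      (mm/swap_state.c), simplified for illustration rather than quoted
      verbatim from the kernel source:
      
        /* Sketch only: allocation, locking and error paths are elided. */
        while (1) {
                int err = swapcache_prepare(entry); /* try to set SWAP_HAS_CACHE */
                if (!err)
                        break;          /* we claimed the entry; add page to cache */
                if (err != -EEXIST)
                        return NULL;    /* swap entry went away; give up */
                /*
                 * -EEXIST: another task owns SWAP_HAS_CACHE.  cond_resched()
                 * here is a no-op for the highest-priority RT task on a
                 * preemptible kernel, so this loop can spin forever.  Sleeping
                 * for one tick takes the task off the run queue and lets the
                 * flag owner run and clear SWAP_HAS_CACHE.
                 */
                schedule_timeout_uninterruptible(1);
        }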
      
      (akpm: ugly hack becomes uglier.  But it fixes the issue in a
      backportable-to-stable fashion while we hopefully work on something
      better)
      
      Link: https://lkml.kernel.org/r/20220221111749.1928222-1-cgel.zte@gmail.com
      Signed-off-by: Guo Ziliang <guo.ziliang@zte.com.cn>
      Reported-by: Zeal Robot <zealci@zte.com.cn>
      Reviewed-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
      Reviewed-by: Jiang Xuexin <jiang.xuexin@zte.com.cn>
      Reviewed-by: Yang Yang <yang.yang29@zte.com.cn>
      Acked-by: Hugh Dickins <hughd@google.com>
      Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Roger Quadros <rogerq@kernel.org>
      Cc: Ziliang Guo <guo.ziliang@zte.com.cn>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  11. 15 Mar, 2022 1 commit
  12. 18 Oct, 2021 1 commit
  13. 21 Aug, 2021 1 commit
  14. 30 Jun, 2021 4 commits
  15. 07 May, 2021 1 commit
  16. 06 May, 2021 1 commit
  17. 01 May, 2021 1 commit
  18. 27 Feb, 2021 2 commits
  19. 25 Feb, 2021 3 commits
  20. 16 Dec, 2020 2 commits
  21. 17 Oct, 2020 1 commit
  22. 14 Oct, 2020 3 commits
  23. 15 Aug, 2020 2 commits
    • mm/swap_state: mark various intentional data races · b96a3db2
      Authored by Qian Cai
      swap_cache_info.* could be accessed concurrently, as noticed by KCSAN:
      
       BUG: KCSAN: data-race in lookup_swap_cache / lookup_swap_cache
      
       write to 0xffffffff85517318 of 8 bytes by task 94138 on cpu 101:
        lookup_swap_cache+0x12e/0x460
        lookup_swap_cache at mm/swap_state.c:322
        do_swap_page+0x112/0xeb0
        __handle_mm_fault+0xc7a/0xd00
        handle_mm_fault+0xfc/0x2f0
        do_page_fault+0x263/0x6f9
        page_fault+0x34/0x40
      
       read to 0xffffffff85517318 of 8 bytes by task 91655 on cpu 100:
        lookup_swap_cache+0x117/0x460
        lookup_swap_cache at mm/swap_state.c:322
        shmem_swapin_page+0xc7/0x9e0
        shmem_getpage_gfp+0x2ca/0x16c0
        shmem_fault+0xef/0x3c0
        __do_fault+0x9e/0x220
        do_fault+0x4a0/0x920
        __handle_mm_fault+0xc69/0xd00
        handle_mm_fault+0xfc/0x2f0
        do_page_fault+0x263/0x6f9
        page_fault+0x34/0x40
      
       Reported by Kernel Concurrency Sanitizer on:
       CPU: 100 PID: 91655 Comm: systemd-journal Tainted: G        W  O L 5.5.0-next-20200204+ #6
       Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019
      
       write to 0xffffffff8d717308 of 8 bytes by task 11365 on cpu 87:
         __delete_from_swap_cache+0x681/0x8b0
         __delete_from_swap_cache at mm/swap_state.c:178
      
       read to 0xffffffff8d717308 of 8 bytes by task 11275 on cpu 53:
         __delete_from_swap_cache+0x66e/0x8b0
         __delete_from_swap_cache at mm/swap_state.c:178
      
      Both the reads and writes are lockless.  Since the swap_cache_info.*
      counters are only used to print out statistics, missing a few
      increments due to data races is harmless, so just mark them as
      intentional data races using the data_race() macro.
      
      While at it, fix a checkpatch.pl warning:
      
      WARNING: Single statement macros should not use a do {} while (0) loop
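      
      A minimal sketch of the resulting change, assuming the INC_CACHE_INFO()
      counter macro in mm/swap_state.c (the exact macro body here is
      illustrative):
      
        /* Before: plain lockless increment, flagged by KCSAN, wrapped in
         * the do { } while (0) that checkpatch warns about. */
        #define INC_CACHE_INFO(x)       do { swap_cache_info.x++; } while (0)
      
        /* After: data_race() from <linux/compiler.h> marks the access as an
         * intentional data race, and the single-statement macro no longer
         * needs the loop wrapper. */
        #define INC_CACHE_INFO(x)       data_race(swap_cache_info.x++)
      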
      Signed-off-by: Qian Cai <cai@lca.pw>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Marco Elver <elver@google.com>
      Link: http://lkml.kernel.org/r/20200207003715.1578-1-cai@lca.pw
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: replace hpage_nr_pages with thp_nr_pages · 6c357848
      Authored by Matthew Wilcox (Oracle)
      The thp prefix is more frequently used than hpage, and we should be
      consistent across the various functions.
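      
      A one-line illustration of the rename; the call site is hypothetical,
      but thp_nr_pages() is the replacement named by the patch:
      
        nr_pages = hpage_nr_pages(page);        /* old name */
        nr_pages = thp_nr_pages(page);          /* new, consistent thp_ prefix */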
      
      [akpm@linux-foundation.org: fix mm/migrate.c]
      Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: William Kucharski <william.kucharski@oracle.com>
      Reviewed-by: Zi Yan <ziy@nvidia.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Link: http://lkml.kernel.org/r/20200629151959.15779-6-willy@infradead.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>