1. 04 Jul, 2022 (1 commit)
  2. 17 Jun, 2022 (1 commit)
  3. 28 May, 2022 (2 commits)
  4. 27 May, 2022 (1 commit)
    • mm/page_alloc: always attempt to allocate at least one page during bulk allocation · c572e488
      Authored by Mel Gorman
      Peter Pavlisko reported the following problem on kernel bugzilla 216007.
      
      	When I try to extract an uncompressed tar archive (2.6 million
      	files, 760.3 GiB in size) on newly created (empty) XFS file system,
      	after first low tens of gigabytes extracted the process hangs in
      	iowait indefinitely. One CPU core is 100% occupied with iowait,
      	the other CPU core is idle (on 2-core Intel Celeron G1610T).
      
      It was bisected to c9fa5630 ("xfs: use alloc_pages_bulk_array() for
      buffers") but XFS is only the messenger.  The problem is that nothing is
      waking kswapd to reclaim some pages at a time the PCP lists cannot be
      refilled until some reclaim happens.  The bulk allocator checks that there
      are some pages in the array and the original intent was that a bulk
      allocator did not necessarily need all the requested pages and it was best
      to return as quickly as possible.
      
      This was fine for the first user of the API, but both NFS and XFS
      require the requested number of pages to be available before making
      progress.  Both could be adjusted to call the page allocator directly
      if a bulk allocation fails, but that puts a burden on users of the API.
      Adjust the semantics to attempt at least one allocation via
      __alloc_pages() before returning, so that kswapd is woken if necessary.
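
      A minimal sketch of the adjusted semantics, assuming a heavily
      simplified bulk allocator: take_page_from_pcp_list() is a hypothetical
      stand-in for the per-CPU fast path, and this is not the actual
      mm/page_alloc.c change.

        static unsigned long bulk_alloc_sketch(gfp_t gfp, int preferred_nid,
                                               unsigned long nr_pages,
                                               struct page **page_array)
        {
                unsigned long nr_populated = 0;
                bool allocated = false;
                struct page *page;

                /* Count the array slots the caller has already populated. */
                while (nr_populated < nr_pages && page_array[nr_populated])
                        nr_populated++;

                /* Fast path: take pages straight from the per-CPU (PCP) lists. */
                while (nr_populated < nr_pages) {
                        page = take_page_from_pcp_list(gfp, preferred_nid); /* hypothetical */
                        if (!page)
                                break;          /* PCP lists are empty */
                        page_array[nr_populated++] = page;
                        allocated = true;
                }

                /*
                 * New semantics: attempt at least one allocation via
                 * __alloc_pages() before returning.  Its slow path wakes
                 * kswapd when needed, so reclaim can refill the PCP lists
                 * and callers that retry until the array is full make
                 * progress instead of spinning forever.
                 */
                if (!allocated && nr_populated < nr_pages) {
                        page = __alloc_pages(gfp, 0, preferred_nid, NULL);
                        if (page)
                                page_array[nr_populated++] = page;
                }

                return nr_populated;
        }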
      
      It was reported via bugzilla that the patch addressed the problem and that
      the tar extraction completed successfully.  This may also address bug
      215975 but has yet to be confirmed.
      
      BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=216007
      BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=215975
      Link: https://lkml.kernel.org/r/20220526091210.GC3441@techsingularity.net
      Fixes: 387ba26f ("mm/page_alloc: add a bulk page allocator")
      Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
      Cc: "Darrick J. Wong" <djwong@kernel.org>
      Cc: Dave Chinner <dchinner@redhat.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Jesper Dangaard Brouer <brouer@redhat.com>
      Cc: Chuck Lever <chuck.lever@oracle.com>
      Cc: <stable@vger.kernel.org>	[5.13+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  5. 26 May, 2022 (1 commit)
    • mm: fix a potential infinite loop in start_isolate_page_range() · 88ee1343
      Authored by Zi Yan
      In isolate_single_pageblock(), called by start_isolate_page_range(),
      there are several pageblock isolation issues that can cause a potential
      infinite loop when isolating a page range.  This was reported by Qian
      Cai.
      
      1. The pageblock was isolated by just changing the pageblock
         migratetype, without checking for unmovable pages.  Call
         set_migratetype_isolate() to isolate the pageblock properly.
      2. An off-by-one error caused pages to be migrated unnecessarily even
         though the page does not cross a pageblock boundary.
      3. Migrating a compound page across a pageblock boundary and then
         splitting the resulting free page later leaves a small race window
         in which the free page might be allocated again, so the code
         retries, causing a potential infinite loop.  Temporarily set the
         to-be-migrated page's pageblock to MIGRATE_ISOLATE to prevent that,
         and bail out early if no free page is found after page migration
         (see the sketch after this list).
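
      A minimal control-flow sketch of the approach in point 3, assuming
      hypothetical helpers migrate_compound_page() and
      find_free_page_after_migration(); it is not the real
      isolate_single_pageblock() code.

        /* Keep allocators away from the pages migration is about to free. */
        set_pageblock_migratetype(page, MIGRATE_ISOLATE);

        if (migrate_compound_page(page))                   /* hypothetical */
                return -EBUSY;

        /*
         * Look for the free page left behind by migration; bail out instead
         * of retrying forever if it has already been allocated again.
         */
        free_page = find_free_page_after_migration(pfn);   /* hypothetical */
        if (!free_page)
                return -EBUSY;

        split_free_page(free_page, order, split_pfn_offset);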
      
      An additional fix to split_free_page() aims to avoid crashing in
      __free_one_page().  When the free page is split at the specified
      split_pfn_offset, free_page_order should be derived from both the
      lowest set bit of free_page_pfn and the highest set bit of the
      remaining split_pfn_offset, using the smaller of the two.  For example,
      if free_page_pfn=0x10000 and split_pfn_offset=0xc000, free_page_order
      should first correspond to 0x8000 pages and then to 0x4000, instead of
      0x4000 then 0x8000 as the original algorithm did.
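
      The arithmetic can be checked with a small standalone program (an
      illustration of the example above, not the kernel's split_free_page()
      itself); the bit helpers below are local to the sketch:

        #include <stdio.h>

        /* Lowest set bit of x, e.g. 0x10000 -> 0x10000, 0x18000 -> 0x8000. */
        static unsigned long lowest_set_bit(unsigned long x)
        {
                return x & -x;
        }

        /* Highest set bit of x, e.g. 0xc000 -> 0x8000. */
        static unsigned long highest_set_bit(unsigned long x)
        {
                unsigned long bit = 1;

                while (x >>= 1)
                        bit <<= 1;
                return bit;
        }

        int main(void)
        {
                unsigned long pfn = 0x10000;   /* free_page_pfn from the example */
                unsigned long offset = 0xc000; /* split_pfn_offset from the example */

                while (offset) {
                        /*
                         * Each chunk must respect both the alignment of the
                         * current pfn and the amount still to be freed.
                         */
                        unsigned long chunk = lowest_set_bit(pfn);

                        if (chunk > highest_set_bit(offset))
                                chunk = highest_set_bit(offset);

                        printf("free 0x%lx pages at pfn 0x%lx\n", chunk, pfn);
                        pfn += chunk;
                        offset -= chunk;
                }
                return 0;  /* prints 0x8000 at 0x10000, then 0x4000 at 0x18000 */
        }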
      
      [akpm@linux-foundation.org: suppress min() warning]
      Link: https://lkml.kernel.org/r/20220524194756.1698351-1-zi.yan@sent.com
      Fixes: b2c9e2fb ("mm: make alloc_contig_range work at pageblock granularity")
      Signed-off-by: Zi Yan <ziy@nvidia.com>
      Reported-by: Qian Cai <quic_qiancai@quicinc.com>
      Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Eric Ren <renzhengeek@gmail.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Oscar Salvador <osalvador@suse.de>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  6. 20 May, 2022 (2 commits)
  7. 13 May, 2022 (6 commits)
  8. 10 May, 2022 (1 commit)
  9. 30 Apr, 2022 (1 commit)
  10. 29 Apr, 2022 (6 commits)
  11. 25 Apr, 2022 (1 commit)
  12. 16 Apr, 2022 (1 commit)
    • mm, page_alloc: fix build_zonerefs_node() · e553f62f
      Authored by Juergen Gross
      Since commit 6aa303de ("mm, vmscan: only allocate and reclaim from
      zones with pages managed by the buddy allocator") only zones with free
      memory are included in a built zonelist.  This is problematic when, for
      example, all memory of a zone has been ballooned out at the time the
      zonelists are rebuilt.
      
      The decision whether to rebuild the zonelists when onlining new memory
      is based on populated_zone() returning 0 for the zone the memory will
      be added to.  The new zone is added to the zonelists only if it has
      free memory pages (managed_zone() returns a non-zero value) after the
      memory has been onlined.  This implies that onlining memory will always
      free the added pages to the allocator immediately, but this is not true
      in all cases: when running as a Xen guest, for example, the onlined new
      memory is added only to the ballooned memory list; it is freed only
      when the guest is ballooned up afterwards.
      
      Another problem with using managed_zone() to decide whether a zone is
      added to the zonelists is that a zone whose memory is entirely in use
      would in fact be removed from all zonelists if the zonelists happen to
      be rebuilt.
      
      Use populated_zone() when building a zonelist, as was done before that
      commit.
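
      A minimal sketch of the corrected decision, assuming a simplified
      zoneref-building loop; add_zone_to_zonerefs() is a hypothetical
      placeholder, and the comments paraphrase the populated_zone() versus
      managed_zone() distinction rather than quoting the kernel source.

        /*
         * populated_zone(zone): the zone has present pages at all, even if
         *     every one of them is currently ballooned out of the buddy
         *     allocator.
         * managed_zone(zone):   the buddy allocator currently manages at
         *     least one page of the zone.
         *
         * A fully ballooned-out zone is populated but not managed, so the
         * zonelist has to be built from populated zones.
         */
        int zone_type, nr_zones = 0;

        for (zone_type = MAX_NR_ZONES - 1; zone_type >= 0; zone_type--) {
                struct zone *zone = &pgdat->node_zones[zone_type];

                if (populated_zone(zone))        /* was: managed_zone(zone) */
                        add_zone_to_zonerefs(zone);        /* hypothetical */
        }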
      
      There was a report that QubesOS (based on Xen) is hitting this problem.
      Xen switched to using the zone device functionality in kernel 5.9, and
      QubesOS wants to use memory hotplugging for guests in order to be able
      to start a guest with minimal memory and expand it as needed.  This was
      the report that led to this patch.
      
      Link: https://lkml.kernel.org/r/20220407120637.9035-1-jgross@suse.com
      Fixes: 6aa303de ("mm, vmscan: only allocate and reclaim from zones with pages managed by the buddy allocator")
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Acked-by: David Hildenbrand <david@redhat.com>
      Cc: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
      Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  13. 05 Apr, 2022 (1 commit)
  14. 02 Apr, 2022 (1 commit)
  15. 31 Mar, 2022 (1 commit)
  16. 25 Mar, 2022 (13 commits)