1. 06 May, 2021 (7 commits)
  2. 27 Feb, 2021 (3 commits)
  3. 16 Dec, 2020 (2 commits)
  4. 13 Aug, 2020 (3 commits)
  5. 04 Jul, 2020 (1 commit)
    •
      mm/cma.c: use exact_nid true to fix possible per-numa cma leak · 40366bd7
      Committed by Barry Song
      Calling cma_declare_contiguous_nid() with exact_nid=false for per-NUMA
      reservations can easily cause a CMA leak and various confusion.  For
      example, mm/hugetlb.c tries to reserve per-NUMA CMA for gigantic
      pages, but it can easily leak CMA and confuse users when the system
      has memoryless nodes.
      
      Suppose the system has 4 NUMA nodes and only node0 has memory.  If we
      set hugetlb_cma=4G in bootargs, mm/hugetlb.c will get 4 CMA areas for
      the 4 different NUMA nodes.  Since exact_nid=false in the current
      code, all 4 NUMA nodes will successfully get CMA from node0, but
      hugetlb_cma[1] to hugetlb_cma[3] will never be available to
      hugepages, since mm/hugetlb.c will only allocate memory from
      hugetlb_cma[0].
      
      Now suppose the system has 4 NUMA nodes where node0 and node2 have
      memory and the other nodes have none.  If we set hugetlb_cma=4G in
      bootargs, mm/hugetlb.c will get 4 CMA areas for the 4 different NUMA
      nodes.  Since exact_nid=false in the current code, all 4 NUMA nodes
      will successfully get CMA from node0 or node2, but hugetlb_cma[1] and
      hugetlb_cma[3] will never be available to hugepages, as mm/hugetlb.c
      will only allocate memory from hugetlb_cma[0] and hugetlb_cma[2].
      This causes a permanent leak of the CMA areas that were supposed to
      be used by the memoryless nodes.
      
      Of course we could work around the issue by letting mm/hugetlb.c scan
      all CMA areas in alloc_gigantic_page() even when node_mask includes
      only node0; that way, when node_mask includes only node0, we could
      still get pages from hugetlb_cma[1] to hugetlb_cma[3].  But this
      would crash the kernel in free_gigantic_page() when it frees the page
      by:
      cma_release(hugetlb_cma[page_to_nid(page)], page, 1 << order)
      
      On the other hand, exact_nid=false doesn't consider NUMA distance, so
      it might not be that useful to leverage CMA areas on remote nodes.  I
      feel it is much simpler to make exact_nid true and make everything
      clear.  After that, memoryless nodes won't be able to reserve
      per-NUMA CMA from other nodes which have memory.
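
      As a rough illustration of the fix, here is a minimal sketch of the
      reservation path, assuming the v5.7-era memblock_alloc_range_nid()
      signature (size, align, start, end, nid, exact_nid); the surrounding
      plumbing in cma_declare_contiguous_nid() is simplified, so treat this
      as a sketch rather than the literal patch:

        /* Inside cma_declare_contiguous_nid(), simplified: the last
         * argument is exact_nid.  With false, memblock may satisfy the
         * request from any node, so a memoryless node silently "reserves"
         * another node's memory; with true, the reservation simply fails
         * on memoryless nodes, which is what we want here. */
        phys_addr_t addr;

        addr = memblock_alloc_range_nid(size, alignment, 0, limit,
                                        nid, true /* was false */);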
      
      Fixes: cf11e85f ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
      Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Roman Gushchin <guro@fb.com>
      Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Cc: Aslan Bakirov <aslan@fb.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Andreas Schaufler <andreas.schaufler@gmx.de>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: <stable@vger.kernel.org>
      Link: http://lkml.kernel.org/r/20200628074345.27228-1-song.bao.hua@hisilicon.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  6. 11 Apr, 2020 (1 commit)
  7. 02 Dec, 2019 (1 commit)
  8. 17 Jul, 2019 (2 commits)
  9. 24 May, 2019 (1 commit)
  10. 15 May, 2019 (2 commits)
  11. 13 Mar, 2019 (1 commit)
    •
      memblock: emphasize that memblock_alloc_range() returns a physical address · 8a770c2a
      Committed by Mike Rapoport
      Rename memblock_alloc_range() to memblock_phys_alloc_range() to
      emphasize that it returns a physical address.
      
      While at it, remove the 'enum memblock_flags' parameter from this
      function, since its only user sets it to MEMBLOCK_NONE anyway, which
      is the default for most memblock allocations.
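
      A minimal sketch of the before/after prototypes (modifiers such as
      __init are omitted, so treat the exact annotations as simplified):

        /* Before: the name says nothing about the return type, and the
         * flags argument was always MEMBLOCK_NONE in practice. */
        phys_addr_t memblock_alloc_range(phys_addr_t size, phys_addr_t align,
                                         phys_addr_t start, phys_addr_t end,
                                         enum memblock_flags flags);

        /* After: the phys_ prefix makes the physical-address return value
         * explicit, and the redundant flags argument is gone. */
        phys_addr_t memblock_phys_alloc_range(phys_addr_t size,
                                              phys_addr_t align,
                                              phys_addr_t start,
                                              phys_addr_t end);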
      
      Link: http://lkml.kernel.org/r/1548057848-15136-6-git-send-email-rppt@linux.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Guo Ren <ren_guo@c-sky.com>				[c-sky]
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Juergen Gross <jgross@suse.com>			[Xen]
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  12. 06 Mar, 2019 (1 commit)
  13. 29 Dec, 2018 (1 commit)
  14. 18 Aug, 2018 (1 commit)
  15. 25 May, 2018 (1 commit)
    •
      Revert "mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE" · d883c6cf
      Committed by Joonsoo Kim
      This reverts the following commits, which changed the CMA design in MM.
      
       3d2054ad ("ARM: CMA: avoid double mapping to the CMA area if CONFIG_HIGHMEM=y")
      
       1d47a3ec ("mm/cma: remove ALLOC_CMA")
      
       bad8c6c0 ("mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE")
      
      Ville reported the following error on i386.
      
        Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
        microcode: microcode updated early to revision 0x4, date = 2013-06-28
        Initializing CPU#0
        Initializing HighMem for node 0 (000377fe:00118000)
        Initializing Movable for node 0 (00000001:00118000)
        BUG: Bad page state in process swapper  pfn:377fe
        page:f53effc0 count:0 mapcount:-127 mapping:00000000 index:0x0
        flags: 0x80000000()
        raw: 80000000 00000000 00000000 ffffff80 00000000 00000100 00000200 00000001
        page dumped because: nonzero mapcount
        Modules linked in:
        CPU: 0 PID: 0 Comm: swapper Not tainted 4.17.0-rc5-elk+ #145
        Hardware name: Dell Inc. Latitude E5410/03VXMC, BIOS A15 07/11/2013
        Call Trace:
         dump_stack+0x60/0x96
         bad_page+0x9a/0x100
         free_pages_check_bad+0x3f/0x60
         free_pcppages_bulk+0x29d/0x5b0
         free_unref_page_commit+0x84/0xb0
         free_unref_page+0x3e/0x70
         __free_pages+0x1d/0x20
         free_highmem_page+0x19/0x40
         add_highpages_with_active_regions+0xab/0xeb
         set_highmem_pages_init+0x66/0x73
         mem_init+0x1b/0x1d7
         start_kernel+0x17a/0x363
         i386_start_kernel+0x95/0x99
         startup_32_smp+0x164/0x168
      
      The reason for this error is that the span of the MOVABLE zone is
      extended to the whole node span for future CMA initialization, and
      normal memory is wrongly freed here.  I submitted a fix and it seemed
      to work, but then another problem happened.

      It is too late in the cycle to fix that latter problem, so I decided
      to revert the series.
      Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Acked-by: Laura Abbott <labbott@redhat.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  16. 12 Apr, 2018 (1 commit)
    •
      mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE · bad8c6c0
      Committed by Joonsoo Kim
      Patch series "mm/cma: manage the memory of the CMA area by using the
      ZONE_MOVABLE", v2.
      
      0. History
      
      This patchset is the follow-up to the discussion about "Introduce
      ZONE_CMA (v7)" [1].  Please refer to it if more information is needed.
      
      1. What does this patch do?
      
      This patch changes how the memory of the CMA area is managed in the
      MM subsystem.  Currently the memory of a CMA area is managed by the
      zone its PFNs belong to.  However, this approach has some problems,
      since the MM subsystem doesn't have enough logic to handle a
      situation where memories with different characteristics sit in a
      single zone.  To solve this issue, this patch manages all the memory
      of the CMA area by using the MOVABLE zone.  From the MM subsystem's
      point of view, the characteristics of the memory in the MOVABLE zone
      and the memory of the CMA area are the same, so managing the memory
      of the CMA area in the MOVABLE zone causes no problem.
      
      2. Motivation
      
      There are some problems with the current approach; see the following.
      Although these problems are not inherent and could be fixed without
      this conceptual change, doing so would require adding many hooks in
      various code paths, would be intrusive to core MM, and would be
      really error-prone.  Therefore, I try to solve them with this new
      approach.  The following are the problems of the current
      implementation.
      
      o CMA memory utilization
      
      First, the following is the freepage calculation logic in MM:
      
       - For movable allocation: freepage = total freepage
       - For unmovable allocation: freepage = total freepage - CMA freepage
      
      Freepages in the CMA area are used only after the normal freepages in
      the zone the CMA memory belongs to are exhausted.  At that moment the
      number of normal freepages is zero, so

       - For movable allocation: freepage = total freepage = CMA freepage
       - For unmovable allocation: freepage = 0

      If an unmovable allocation comes at this moment, the request fails
      the watermark check and reclaim is started.  After reclaim, normal
      freepages exist again, so the freepages in the CMA area are still not
      used.
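
      To make the accounting above concrete, here is a small self-contained
      model of the watermark decision (the helper and its names are
      illustrative assumptions, not the kernel's actual code):

        /* Model of the freepage accounting described above: CMA freepages
         * do not count toward allocations that cannot use CMA memory. */
        static bool watermark_ok(long total_free, long cma_free,
                                 long watermark, bool alloc_can_use_cma)
        {
                long usable = total_free;

                if (!alloc_can_use_cma)
                        usable -= cma_free;  /* unmovable allocation */

                return usable > watermark;   /* false triggers reclaim */
        }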
      
      FYI, there is another attempt [2] at solving this problem on lkml.
      And, as far as I know, Qualcomm also has an out-of-tree solution for
      it.
      
      o Useless reclaim
      
      There is no logic to distinguish CMA pages in the reclaim path, so
      CMA pages get reclaimed even when the system merely needs a page that
      is usable for a kernel allocation.
      
      o Atomic allocation failure
      
      This is also related to the fallback allocation policy for the memory
      of the CMA area.  Consider a situation where the number of normal
      freepages is *zero* because a bunch of movable allocation requests
      have come in.  Kswapd would not be woken up, due to the following
      freepage calculation logic:

      - For movable allocation: freepage = total freepage = CMA freepage

      If an atomic unmovable allocation request comes at this moment, it
      fails due to the following logic:

      - For unmovable allocation: freepage = total freepage - CMA freepage = 0

      This was reported by Aneesh [3].
      
      o Useless compaction
      
      The usual high-order allocation request is an unmovable allocation
      request, which cannot be served from the memory of the CMA area.  Yet
      in compaction the migration scanner tries to migrate pages in the CMA
      area and build high-order pages there.  As mentioned above, those
      pages cannot be used for unmovable allocation requests, so the work
      is simply wasted.
      
      3. Current approach and new approach
      
      The current approach is that the memory of the CMA area is managed by
      the zone its PFNs belong to.  However, this memory should be
      distinguishable, since it has a strong limitation.  So it is marked
      as MIGRATE_CMA in the pageblock flags and handled specially.
      However, as mentioned in section 2, the MM subsystem doesn't have
      enough logic to deal with this special pageblock, so many problems
      arise.

      The new approach is that the memory of the CMA area is managed by the
      MOVABLE zone.  MM already has enough logic to deal with special zones
      such as HIGHMEM and MOVABLE, so managing the memory of the CMA area
      in the MOVABLE zone just naturally works well, because the constraint
      on the memory of the CMA area, that it must always be migratable, is
      the same as the constraint on the MOVABLE zone.
      
      There is one side-effect for the usability of the memory of the CMA
      area.  Use of the MOVABLE zone is only allowed for requests with
      GFP_HIGHMEM && GFP_MOVABLE, so the memory of the CMA area is now also
      only available to that gfp combination.  Before this patchset, a
      request with just GFP_MOVABLE could use it.  IMO, it would not be a
      big issue, since most GFP_MOVABLE requests also have the GFP_HIGHMEM
      flag; for example, file cache pages and anonymous pages.  However,
      file cache pages for blockdev files are an exception: requests for
      them carry no GFP_HIGHMEM flag.  There are pros and cons to this
      exception.  In my experience, blockdev file cache pages are one of
      the top reasons cma_alloc() fails temporarily, so we get a stronger
      guarantee of cma_alloc() success by discarding this case.
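
      A sketch of the eligibility rule just described (the real mapping
      lives in gfp_zone() in include/linux/gfp.h; this helper is an
      illustrative assumption, not kernel code):

        /* Only requests carrying both __GFP_HIGHMEM and __GFP_MOVABLE can
         * land in ZONE_MOVABLE, and thus, after this series, in the
         * memory of the CMA area. */
        static inline bool may_use_movable_zone(gfp_t gfp_mask)
        {
                const gfp_t need = __GFP_HIGHMEM | __GFP_MOVABLE;

                return (gfp_mask & need) == need;
        }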
      
      Note that there is no change from the admin's point of view, since
      this patchset is just an internal implementation change in the MM
      subsystem.  The one minor difference for admins is that the memory
      stats for the CMA area will be printed under the MOVABLE zone.
      That's all.
      
      4. Result
      
      The following is the experimental result for the utilization problem.
      
      8 CPUs, 1024 MB, VIRTUAL MACHINE
      make -j16
      
      <Before>
        CMA area:               0 MB            512 MB
        Elapsed-time:           92.4            186.5
        pswpin:                 82              18647
        pswpout:                160             69839

      <After>
        CMA area:               0 MB            512 MB
        Elapsed-time:           93.1            93.4
        pswpin:                 84              46
        pswpout:                183             92
      
      akpm: "kernel test robot" reported a 26% improvement in
      vm-scalability.throughput:
      http://lkml.kernel.org/r/20180330012721.GA3845@yexl-desktop
      
      [1]: http://lkml.kernel.org/r/1491880640-9944-1-git-send-email-iamjoonsoo.kim@lge.com
      [2]: https://lkml.org/lkml/2014/10/15/623
      [3]: http://www.spinics.net/lists/linux-mm/msg100562.html
      
      Link: http://lkml.kernel.org/r/1512114786-5085-2-git-send-email-iamjoonsoo.kim@lge.com
      Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Tested-by: Tony Lindgren <tony@atomide.com>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Laura Abbott <lauraa@codeaurora.org>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Michal Nazarewicz <mina86@mina86.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  17. 06 Apr, 2018 (2 commits)
  18. 16 Nov, 2017 (1 commit)
  19. 14 Oct, 2017 (1 commit)
  20. 11 Jul, 2017 (2 commits)
    •
      cma: fix calculation of aligned offset · e048cb32
      Committed by Doug Berger
      The align_offset parameter is used by bitmap_find_next_zero_area_off()
      to represent the offset of the map's base from the previous alignment
      boundary; the function ensures that the returned index, plus the
      align_offset, honors the specified align_mask.
      
      The logic introduced by commit b5be83e3 ("mm: cma: align to physical
      address, not CMA region position") has the CMA driver calculate the
      offset to the *next* alignment boundary.  In most cases, the base
      alignment is greater than that specified when making allocations,
      resulting in a zero offset whether we align up or down.  In the
      example given with the commit, the base alignment (8MB) was half the
      requested alignment (16MB), so the math also happened to work, since
      the offset is 8MB in both directions.  However, when requesting
      allocations with an alignment greater than twice that of the base,
      the returned index would not be correctly aligned.
      
      Also, the align_order arguments of cma_bitmap_aligned_mask() and
      cma_bitmap_aligned_offset() should never be negative, so the argument
      type was changed to unsigned.
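
      A minimal sketch of the corrected offset calculation (simplified from
      mm/cma.c of that era; the field names follow struct cma, but treat
      the exact body as an illustrative assumption):

        /* Offset of the CMA base from the PREVIOUS align_order boundary,
         * in bitmap units.  Masking the low bits of base_pfn (aligning
         * down) replaces the old ALIGN()-up computation, so the result
         * stays correct even when the requested alignment exceeds twice
         * the base alignment. */
        static unsigned long cma_bitmap_aligned_offset(const struct cma *cma,
                                                       unsigned int align_order)
        {
                return (cma->base_pfn & ((1UL << align_order) - 1))
                        >> cma->order_per_bit;
        }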
      
      Fixes: b5be83e3 ("mm: cma: align to physical address, not CMA region position")
      Link: http://lkml.kernel.org/r/20170628170742.2895-1-opendmb@gmail.com
      Signed-off-by: Angus Clark <angus@angusclark.org>
      Signed-off-by: Doug Berger <opendmb@gmail.com>
      Acked-by: Gregory Fong <gregory.0xf0@gmail.com>
      Cc: Doug Berger <opendmb@gmail.com>
      Cc: Angus Clark <angus@angusclark.org>
      Cc: Laura Abbott <labbott@redhat.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Lucas Stach <l.stach@pengutronix.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Shiraz Hashim <shashim@codeaurora.org>
      Cc: Jaewon Kim <jaewon31.kim@samsung.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    •
      mm/cma.c: warn if the CMA area could not be activated · e35ef639
      Committed by Anshuman Khandual
      While activating a CMA area we check that all the PFNs in the range
      are inside the same zone, which is a requirement for
      alloc_contig_range() to work.  Any CMA area failing the check is
      disabled for good.  This currently happens silently, making failure
      of all future cma_alloc() calls on that area inevitable.
      
      Here we add an error message stating that the CMA area could not be
      activated, which makes it easier to explain any future cma_alloc()
      failures on it.  While in there, change the bail-out goto label from
      'err' to 'not_in_zone', which makes more sense.
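
      A minimal sketch of the resulting bail-out path in
      cma_activate_area() (simplified; the surrounding cleanup details are
      an assumption here):

        not_in_zone:
                /* Record the failure loudly instead of disabling the area
                 * silently, so later cma_alloc() failures on it can be
                 * explained. */
                pr_err("CMA area %s could not be activated\n", cma->name);
                kfree(cma->bitmap);
                cma->count = 0;
                return -EINVAL;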
      
      Link: http://lkml.kernel.org/r/20170605023729.26303-1-khandual@linux.vnet.ibm.com
      Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  21. 19 Apr, 2017 (2 commits)
  22. 25 Feb, 2017 (3 commits)