1. 02 Dec 2019, 2 commits
  2. 08 Oct 2019, 1 commit
  3. 25 Sep 2019, 5 commits
  4. 19 Jul 2019, 11 commits
    • mm/sparsemem: cleanup 'section number' data types · 9a845030
      Committed by Dan Williams
      David points out that there is a mixture of 'int' and 'unsigned long'
      usage for section number data types.  Update the memory hotplug path to
      use 'unsigned long' consistently for section numbers.
      
      [akpm@linux-foundation.org: fix printk format]
      Link: http://lkml.kernel.org/r/156107543656.1329419.11505835211949439815.stgit@dwillia2-desk3.amr.corp.intel.com
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Reported-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Oscar Salvador <osalvador@suse.de>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/sparsemem: support sub-section hotplug · ba72b4c8
      Committed by Dan Williams
      The libnvdimm sub-system has suffered a series of hacks and broken
      workarounds for the memory-hotplug implementation's awkward
      section-aligned (128MB) granularity.
      
      For example the following backtrace is emitted when attempting
      arch_add_memory() with physical address ranges that intersect 'System
      RAM' (RAM) with 'Persistent Memory' (PMEM) within a given section:
      
          # cat /proc/iomem | grep -A1 -B1 Persistent\ Memory
          100000000-1ffffffff : System RAM
          200000000-303ffffff : Persistent Memory (legacy)
          304000000-43fffffff : System RAM
          440000000-23ffffffff : Persistent Memory
          2400000000-43bfffffff : Persistent Memory
            2400000000-43bfffffff : namespace2.0
      
          WARNING: CPU: 38 PID: 928 at arch/x86/mm/init_64.c:850 add_pages+0x5c/0x60
          [..]
          RIP: 0010:add_pages+0x5c/0x60
          [..]
          Call Trace:
           devm_memremap_pages+0x460/0x6e0
           pmem_attach_disk+0x29e/0x680 [nd_pmem]
           ? nd_dax_probe+0xfc/0x120 [libnvdimm]
           nvdimm_bus_probe+0x66/0x160 [libnvdimm]
      
      It was discovered that the problem goes beyond RAM vs PMEM collisions,
      as some platforms produce PMEM vs PMEM collisions within a given
      section.
      The libnvdimm workaround for that case revealed that the libnvdimm
      section-alignment-padding implementation has been broken for a long
      while.
      
      A fix for that long-standing breakage introduces as many problems as it
      solves as it would require a backward-incompatible change to the
      namespace metadata interpretation.  Instead of that dubious route [1],
      address the root problem in the memory-hotplug implementation.
      
      Note that EEXIST is no longer treated as success, as that is how
      sparse_add_section() reports subsection collisions.  That special case
      was also obviated by recent changes to perform the request_region() for
      'System RAM' before arch_add_memory() in the add_memory() sequence.
      
      [1] https://lore.kernel.org/r/155000671719.348031.2347363160141119237.stgit@dwillia2-desk3.amr.corp.intel.com
      
      [osalvador@suse.de: fix deactivate_section for early sections]
        Link: http://lkml.kernel.org/r/20190715081549.32577-2-osalvador@suse.de
      Link: http://lkml.kernel.org/r/156092354368.979959.6232443923440952359.stgit@dwillia2-desk3.amr.corp.intel.com
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Oscar Salvador <osalvador@suse.de>
      Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>	[ppc64]
      Reviewed-by: Oscar Salvador <osalvador@suse.de>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Logan Gunthorpe <logang@deltatee.com>
      Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Jane Chu <jane.chu@oracle.com>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Jérôme Glisse <jglisse@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Toshi Kani <toshi.kani@hpe.com>
      Cc: Wei Yang <richardw.yang@linux.intel.com>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/sparsemem: prepare for sub-section ranges · 7ea62160
      Committed by Dan Williams
      Prepare the memory hot-{add,remove} paths for handling sub-section
      ranges by plumbing the starting page frame and number of pages being
      handled through arch_{add,remove}_memory() to
      sparse_{add,remove}_one_section().
      
      This is simply plumbing, small cleanups, and some identifier renames.
      No intended functional changes.
      
      Link: http://lkml.kernel.org/r/156092353780.979959.9713046515562743194.stgit@dwillia2-desk3.amr.corp.intel.com
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com>
      Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>	[ppc64]
      Reviewed-by: Oscar Salvador <osalvador@suse.de>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Logan Gunthorpe <logang@deltatee.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Jane Chu <jane.chu@oracle.com>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Jérôme Glisse <jglisse@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Toshi Kani <toshi.kani@hpe.com>
      Cc: Wei Yang <richardw.yang@linux.intel.com>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/sparsemem: convert kmalloc_section_memmap() to populate_section_memmap() · e9c0a3f0
      Committed by Dan Williams
      Allow sub-section sized ranges to be added to the memmap.
      
      populate_section_memmap() takes an explicit pfn range rather than
      assuming a full section, and those parameters are plumbed all the way
      through to vmemmap_populate().  There should be no sub-section usage in
      current deployments.  New warnings are added to clarify which memmap
      allocation paths are sub-section capable.
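      
      A minimal userspace model of the change (an illustration only, with
      simplified signatures and stub bodies so it compiles standalone; the
      real kernel functions return the memmap and take nid/altmap
      parameters):
      
          #include <stdio.h>
      
          #define PAGES_PER_SECTION 32768UL   /* 128MB of 4KB pages */
      
          /* before (modeled): the pfn range is implicit, one full section */
          static void kmalloc_section_memmap_model(unsigned long section_nr)
          {
                  printf("populate section %lu: all %lu pages\n",
                         section_nr, PAGES_PER_SECTION);
          }
      
          /* after (modeled): an explicit, sub-section capable pfn range
           * that is plumbed down to vmemmap_populate() */
          static void populate_section_memmap_model(unsigned long pfn,
                                                    unsigned long nr_pages)
          {
                  printf("populate pfns [%lu, %lu)\n", pfn, pfn + nr_pages);
          }
      
          int main(void)
          {
                  kmalloc_section_memmap_model(1);
                  populate_section_memmap_model(32768, 512); /* one 2MB chunk */
                  return 0;
          }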
      
      Link: http://lkml.kernel.org/r/156092352058.979959.6551283472062305149.stgit@dwillia2-desk3.amr.corp.intel.com
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com>
      Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>	[ppc64]
      Reviewed-by: Oscar Salvador <osalvador@suse.de>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Logan Gunthorpe <logang@deltatee.com>
      Cc: Jane Chu <jane.chu@oracle.com>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Jérôme Glisse <jglisse@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Toshi Kani <toshi.kani@hpe.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Wei Yang <richardw.yang@linux.intel.com>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/sparsemem: add helpers track active portions of a section at boot · f46edbd1
      Committed by Dan Williams
      Prepare for hot{plug,remove} of sub-ranges of a section by tracking a
      sub-section active bitmask, each bit representing a PMD_SIZE span of the
      architecture's memory hotplug section size.
      
      The implication of a partially populated section is that pfn_valid()
      needs to go beyond a valid_section() check and either determine that the
      section is an "early section", or read the sub-section active ranges
      from the bitmask.  The expectation is that the bitmask (subsection_map)
      fits in the same cacheline as the valid_section() / early_section()
      data, so the incremental performance overhead to pfn_valid() should be
      negligible.
      
      The rationale for using early_section() to short-circuit the
      subsection_map check is that there are legacy code paths that use
      pfn_valid() at section granularity before validating the pfn against
      pgdat data.  So, the early_section() check allows those traditional
      assumptions to persist while also permitting subsection_map to tell the
      truth for purposes of populating the unused portions of early sections
      with PMEM and other ZONE_DEVICE mappings.
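      
      A minimal userspace model (an illustration under simplifying
      assumptions, not the kernel implementation) of the described
      pfn_valid() logic: a pfn in an early section is always considered
      valid, otherwise the per-section subsection_map is consulted:
      
          #include <stdbool.h>
          #include <stdio.h>
      
          #define PAGES_PER_SUBSECTION 512UL      /* 2MB of 4KB pages */
          #define SUBSECTIONS_PER_SECTION 64UL
      
          struct mem_section_model {
                  bool early;                   /* stands in for early_section() */
                  unsigned long subsection_map; /* one bit per 2MB span */
          };
      
          static bool pfn_valid_model(const struct mem_section_model *ms,
                                      unsigned long pfn)
          {
                  unsigned long idx =
                          (pfn / PAGES_PER_SUBSECTION) % SUBSECTIONS_PER_SECTION;
      
                  if (ms->early)  /* legacy expectation: whole section valid */
                          return true;
                  return ms->subsection_map & (1UL << idx);
          }
      
          int main(void)
          {
                  /* hot-added section with only the first two 2MB spans active */
                  struct mem_section_model ms = { .early = false,
                                                  .subsection_map = 0x3 };
      
                  printf("%d %d %d\n",
                         pfn_valid_model(&ms, 0),      /* 1 */
                         pfn_valid_model(&ms, 512),    /* 1 */
                         pfn_valid_model(&ms, 1024));  /* 0 */
                  return 0;
          }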
      
      Link: http://lkml.kernel.org/r/156092350874.979959.18185938451405518285.stgit@dwillia2-desk3.amr.corp.intel.com
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Reported-by: Qian Cai <cai@lca.pw>
      Tested-by: Jane Chu <jane.chu@oracle.com>
      Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>	[ppc64]
      Reviewed-by: Oscar Salvador <osalvador@suse.de>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Logan Gunthorpe <logang@deltatee.com>
      Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Jérôme Glisse <jglisse@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Toshi Kani <toshi.kani@hpe.com>
      Cc: Wei Yang <richardw.yang@linux.intel.com>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/sparsemem: introduce a SECTION_IS_EARLY flag · 326e1b8f
      Committed by Dan Williams
      In preparation for sub-section hotplug, track whether a given section
      was created during early memory initialization, or later via memory
      hotplug.  This distinction is needed to maintain the coarse expectation
      that pfn_valid() returns true for any pfn within a given section even if
      that section has pages that are reserved from the page allocator.
      
      For example, one of the goals of subsection hotplug is to support
      cases where the system physical memory layout collides System RAM with
      PMEM within a section.  Several pfn_valid() users expect to just check
      if a section is valid, but they are not careful to check if the given
      pfn is within a "System RAM" boundary and instead expect pgdat
      information to further validate the pfn.
      
      Rather than unwind those paths to make their pfn_valid() queries more
      precise, a follow-on patch uses the SECTION_IS_EARLY flag to maintain the
      traditional expectation that pfn_valid() returns true for all early
      sections.
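      
      A toy model of the new flag (assumed flag values in the style of
      sparsemem's convention of keeping state bits in the low bits of the
      encoded section_mem_map word; not copied from the kernel headers):
      
          #include <stdio.h>
      
          #define SECTION_MARKED_PRESENT  (1UL << 0)
          #define SECTION_IS_ONLINE       (1UL << 2)
          #define SECTION_IS_EARLY        (1UL << 3)  /* the new flag */
      
          struct mem_section_model { unsigned long section_mem_map; };
      
          static int early_section(const struct mem_section_model *ms)
          {
                  return !!(ms->section_mem_map & SECTION_IS_EARLY);
          }
      
          int main(void)
          {
                  struct mem_section_model boot = {
                          SECTION_MARKED_PRESENT | SECTION_IS_EARLY };
                  struct mem_section_model hotplug = { SECTION_MARKED_PRESENT };
      
                  /* only the boot-time section passes the early check */
                  printf("boot: %d, hotplug: %d\n",
                         early_section(&boot), early_section(&hotplug));
                  return 0;
          }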
      
      Link: https://lore.kernel.org/lkml/1560366952-10660-1-git-send-email-cai@lca.pw/
      Link: http://lkml.kernel.org/r/156092350358.979959.5817209875548072819.stgit@dwillia2-desk3.amr.corp.intel.com
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Reported-by: Qian Cai <cai@lca.pw>
      Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>	[ppc64]
      Reviewed-by: Oscar Salvador <osalvador@suse.de>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Logan Gunthorpe <logang@deltatee.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
      Cc: Jane Chu <jane.chu@oracle.com>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Jérôme Glisse <jglisse@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Toshi Kani <toshi.kani@hpe.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Wei Yang <richardw.yang@linux.intel.com>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/sparsemem: introduce struct mem_section_usage · f1eca35a
      Committed by Dan Williams
      Patch series "mm: Sub-section memory hotplug support", v10.
      
      The memory hotplug section is an arbitrary / convenient unit for memory
      hotplug.  'Section-size' units have bled into the user interface
      ('memblock' sysfs) and can not be changed without breaking existing
      userspace.  The section-size constraint, while mostly benign for typical
      memory hotplug, has and continues to wreak havoc with 'device-memory'
      use cases, persistent memory (pmem) in particular.  Recall that pmem
      uses devm_memremap_pages(), and subsequently arch_add_memory(), to
      allocate a 'struct page' memmap for pmem.  However, it does not use the
      'bottom half' of memory hotplug, i.e.  never marks pmem pages online and
      never exposes the userspace memblock interface for pmem.  This leaves an
      opening to redress the section-size constraint.
      
      To date, the libnvdimm subsystem has attempted to inject padding to
      satisfy the internal constraints of arch_add_memory().  Beyond
      complicating the code, leading to bugs [2], wasting memory, and limiting
      configuration flexibility, the padding hack is broken when the platform
      changes the physical memory alignment of pmem from one boot to the
      next.  Device failure (intermittent or permanent) and physical
      reconfiguration are events that can cause the platform firmware to
      change the physical placement of pmem on a subsequent boot, and device
      failure is an everyday event in a data-center.
      
      It turns out that sections are only a hard requirement of the
      user-facing interface for memory hotplug and with a bit more
      infrastructure sub-section arch_add_memory() support can be added for
      kernel internal usages like devm_memremap_pages().  Here is an analysis
      of the current design assumptions in the current code and how they are
      addressed in the new implementation:
      
      Current design assumptions:
      
       - Sections that describe boot memory (early sections) are never
         unplugged / removed.
      
       - pfn_valid(), in the CONFIG_SPARSEMEM_VMEMMAP=y case, devolves to a
         valid_section() check
      
       - __add_pages() and helper routines assume all operations occur in
         PAGES_PER_SECTION units.
      
       - The memblock sysfs interface only comprehends full sections
      
      New design assumptions:
      
       - Sections are instrumented with a sub-section bitmask to track (on
         x86) individual 2MB sub-divisions of a 128MB section.
      
       - Partially populated early sections can be extended with additional
         sub-sections, and those sub-sections can be removed with
         arch_remove_memory(). With this in place we no longer lose usable
         memory capacity to padding.
      
       - pfn_valid() is updated to look deeper than valid_section() to also
         check the active-sub-section mask. This indication is in the same
         cacheline as the valid_section() so the performance impact is
         expected to be negligible. So far the lkp robot has not reported any
         regressions.
      
       - Outside of the core vmemmap population routines which are replaced,
         other helper routines like shrink_{zone,pgdat}_span() are updated to
         handle the smaller granularity. Core memory hotplug routines that
         deal with online memory are not touched.
      
       - The existing memblock sysfs user api guarantees / assumptions are not
         touched since this capability is limited to !online
         !memblock-sysfs-accessible sections.
      
      Meanwhile the issue reports continue to roll in from users that do not
      understand when and how the 128MB constraint will bite them.  The current
      implementation relied on being able to support at least one misaligned
      namespace, but that immediately falls over on any moderately complex
      namespace creation attempt.  Beyond the initial problem of 'System RAM'
      colliding with pmem, and the unsolvable problem of physical alignment
      changes, Linux is now being exposed to platforms that collide pmem ranges
      with other pmem ranges by default [3].  In short, devm_memremap_pages()
      has pushed the venerable section-size constraint past the breaking point,
      and the simplicity of section-aligned arch_add_memory() is no longer
      tenable.
      
      These patches are exposed to the kbuild robot on a subsection-v10 branch
      [4], and a preview of the unit test for this functionality is available
      on the 'subsection-pending' branch of ndctl [5].
      
      [2]: https://lore.kernel.org/r/155000671719.348031.2347363160141119237.stgit@dwillia2-desk3.amr.corp.intel.com
      [3]: https://github.com/pmem/ndctl/issues/76
      [4]: https://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm.git/log/?h=subsection-v10
      [5]: https://github.com/pmem/ndctl/commit/7c59b4867e1c
      
      This patch (of 13):
      
      Towards enabling memory hotplug to track partial population of a section,
      introduce 'struct mem_section_usage'.
      
      A pointer to a 'struct mem_section_usage' instance replaces the existing
      pointer to a 'pageblock_flags' bitmap.  Effectively it adds one more
      'unsigned long' beyond the 'pageblock_flags' (usemap) allocation to house
      a new 'subsection_map' bitmap.  The new bitmap enables the memory
      hot{plug,remove} implementation to act on incremental sub-divisions of a
      section.
      
      SUBSECTION_SHIFT is defined as global constant instead of per-architecture
      value like SECTION_SIZE_BITS in order to allow cross-arch compatibility of
      subsection users.  Specifically a common subsection size allows for the
      possibility that persistent memory namespace configurations be made
      compatible across architectures.
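      
      As a concrete illustration (a sketch only, assuming the x86_64 values
      named in this series: SECTION_SIZE_BITS = 27 for 128MB sections and
      SUBSECTION_SHIFT = 21 for 2MB PMD_SIZE sub-sections; not copied from
      the kernel headers), the derived constants show why one extra
      'unsigned long' suffices for 'subsection_map':
      
          #include <stdio.h>
      
          #define PAGE_SHIFT              12  /* 4KB base pages */
          #define SECTION_SIZE_BITS       27  /* 128MB hotplug section */
          #define SUBSECTION_SHIFT        21  /* 2MB (PMD_SIZE) sub-section */
      
          #define PAGES_PER_SECTION       (1UL << (SECTION_SIZE_BITS - PAGE_SHIFT))
          #define PAGES_PER_SUBSECTION    (1UL << (SUBSECTION_SHIFT - PAGE_SHIFT))
          #define SUBSECTIONS_PER_SECTION (1UL << (SECTION_SIZE_BITS - SUBSECTION_SHIFT))
      
          /* shape of the new allocation described above (model, not kernel code):
           * the subsection bitmap sits in front of the usemap */
          struct mem_section_usage_model {
                  unsigned long subsection_map;    /* 64 bits: 1 per 2MB span */
                  unsigned long pageblock_flags[]; /* pre-existing usemap */
          };
      
          int main(void)
          {
                  printf("pages per section:       %lu\n", PAGES_PER_SECTION);    /* 32768 */
                  printf("pages per subsection:    %lu\n", PAGES_PER_SUBSECTION); /* 512 */
                  printf("subsections per section: %lu\n",
                         SUBSECTIONS_PER_SECTION); /* 64, one bit each */
                  return 0;
          }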
      
      The primary motivation for this functionality is to support platforms that
      mix "System RAM" and "Persistent Memory" within a single section, or
      multiple PMEM ranges with different mapping lifetimes within a single
      section.  The section restriction for hotplug has caused an ongoing saga
      of hacks and bugs for devm_memremap_pages() users.
      
      Beyond the fixups to teach existing paths how to retrieve the 'usemap'
      from a section, and updates to usemap allocation path, there are no
      expected behavior changes.
      
      Link: http://lkml.kernel.org/r/156092349845.979959.73333291612799019.stgit@dwillia2-desk3.amr.corp.intel.com
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Reviewed-by: Oscar Salvador <osalvador@suse.de>
      Reviewed-by: Wei Yang <richardw.yang@linux.intel.com>
      Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>	[ppc64]
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Logan Gunthorpe <logang@deltatee.com>
      Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Jérôme Glisse <jglisse@redhat.com>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Jane Chu <jane.chu@oracle.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Qian Cai <cai@lca.pw>
      Cc: Toshi Kani <toshi.kani@hpe.com>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: section numbers use the type "unsigned long" · 2491f0a2
      Committed by David Hildenbrand
      Patch series "mm: Further memory block device cleanups", v1.
      
      Some further cleanups around memory block devices.  In particular, clean
      up and simplify walk_memory_range(), along with some other minor
      cleanups.
      
      This patch (of 6):
      
      We are using a mixture of "int" and "unsigned long".  Let's make this
      consistent by using "unsigned long" everywhere.  We'll do the same with
      memory block ids next.
      
      While at it, turn the "unsigned long i" in removable_show() into an int
      - sections_per_block is an int.
      
      [akpm@linux-foundation.org: s/unsigned long i/unsigned long nr/]
      [david@redhat.com: v3]
        Link: http://lkml.kernel.org/r/20190620183139.4352-2-david@redhat.com
      Link: http://lkml.kernel.org/r/20190614100114.311-2-david@redhat.com
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "Rafael J. Wysocki" <rafael@kernel.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Wei Yang <richard.weiyang@gmail.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Arun KS <arunks@codeaurora.org>
      Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
      Cc: Oscar Salvador <osalvador@suse.de>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Baoquan He <bhe@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/sparse.c: set section nid for hot-add memory · 26f26bed
      Committed by Wei Yang
      When NODE_NOT_IN_PAGE_FLAGS is set, we store a section's node id in
      section_to_node_table[], but for hot-added memory this step is missed.
      Without this information, page_to_nid() may not give the right node id.
      
      Incidentally, the current online_pages() works because it leverages the
      nid stored in memory_block.  But node ids should be tracked at
      mem_section granularity.
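      
      A userspace model (an illustration, not the kernel source) of the fix:
      in the NODE_NOT_IN_PAGE_FLAGS case the node id lives in a per-section
      lookup table, so the hot-add path must record it there or later
      page-to-node lookups see stale data:
      
          #include <stdio.h>
      
          #define NR_MEM_SECTIONS 16
          static int section_to_node_table[NR_MEM_SECTIONS];
      
          /* the step the hot-add path was missing */
          static void set_section_nid(unsigned long section_nr, int nid)
          {
                  section_to_node_table[section_nr] = nid;
          }
      
          /* model of page_to_nid() when the nid is not in page->flags */
          static int page_to_nid_model(unsigned long section_nr)
          {
                  return section_to_node_table[section_nr];
          }
      
          int main(void)
          {
                  set_section_nid(3, 1);  /* hot-add a section on node 1 */
                  printf("nid = %d\n", page_to_nid_model(3)); /* prints 1 */
                  return 0;
          }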
      
      Link: http://lkml.kernel.org/r/20190618005537.18878-1-richardw.yang@linux.intel.com
      Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
      Reviewed-by: Oscar Salvador <osalvador@suse.de>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/memory_hotplug: remove "zone" parameter from sparse_remove_one_section · b9bf8d34
      Committed by David Hildenbrand
      The parameter is unused, so let's drop it.  Memory removal paths should
      never care about zones.  This is the job of memory offlining and will
      require more refactorings.
      
      Link: http://lkml.kernel.org/r/20190527111152.16324-12-david@redhat.com
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Dan Williams <dan.j.williams@intel.com>
      Reviewed-by: Wei Yang <richardw.yang@linux.intel.com>
      Reviewed-by: Oscar Salvador <osalvador@suse.de>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Alex Deucher <alexander.deucher@amd.com>
      Cc: Andrew Banman <andrew.banman@hpe.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Arun KS <arunks@codeaurora.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chintan Pandya <cpandya@codeaurora.org>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Jun Yao <yaojun8558363@gmail.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Logan Gunthorpe <logang@deltatee.com>
      Cc: Mark Brown <broonie@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Mathieu Malaterre <malat@debian.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: "mike.travis@hpe.com" <mike.travis@hpe.com>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Qian Cai <cai@lca.pw>
      Cc: "Rafael J. Wysocki" <rafael@kernel.org>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Yu Zhao <yuzhao@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/memory_hotplug: allow arch_remove_memory() without CONFIG_MEMORY_HOTREMOVE · 80ec922d
      Committed by David Hildenbrand
      We want to improve error handling while adding memory by allowing the
      use of arch_remove_memory() and __remove_pages() even if
      CONFIG_MEMORY_HOTREMOVE is not set, to e.g. implement something like:
      
      	arch_add_memory();
      	rc = do_something();
      	if (rc) {
      		arch_remove_memory();
      	}
      
      We won't get rid of CONFIG_MEMORY_HOTREMOVE for now, as it will require
      quite some dependencies for memory offlining.
      
      Link: http://lkml.kernel.org/r/20190527111152.16324-7-david@redhat.com
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "Rafael J. Wysocki" <rafael@kernel.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Oscar Salvador <osalvador@suse.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Alex Deucher <alexander.deucher@amd.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Mark Brown <broonie@kernel.org>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: "mike.travis@hpe.com" <mike.travis@hpe.com>
      Cc: Andrew Banman <andrew.banman@hpe.com>
      Cc: Arun KS <arunks@codeaurora.org>
      Cc: Qian Cai <cai@lca.pw>
      Cc: Mathieu Malaterre <malat@debian.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Logan Gunthorpe <logang@deltatee.com>
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chintan Pandya <cpandya@codeaurora.org>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Jun Yao <yaojun8558363@gmail.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Oscar Salvador <osalvador@suse.de>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Wei Yang <richard.weiyang@gmail.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yu Zhao <yuzhao@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  5. 15 May 2019, 1 commit
  6. 30 Mar 2019, 1 commit
    • mm/hotplug: fix offline undo_isolate_page_range() · 9b7ea46a
      Committed by Qian Cai
      Commit f1dd2cd1 ("mm, memory_hotplug: do not associate hotadded
      memory to zones until online") introduced move_pfn_range_to_zone(),
      which calls memmap_init_zone() while onlining a memory block.
      memmap_init_zone() resets the pagetype flags and sets the migrate type
      to MOVABLE.
      
      However, in __offline_pages(), it also calls undo_isolate_page_range()
      after offline_isolated_pages() to do the same thing.  Because commit
      2ce13640 ("mm: __first_valid_page skip over offline pages") changed
      __first_valid_page() to skip offline pages, undo_isolate_page_range()
      here just wastes CPU cycles looping around the offlining PFN range while
      doing nothing, because __first_valid_page() will return NULL as
      offline_isolated_pages() has already marked all memory sections within
      the pfn range as offline via offline_mem_sections().
      
      Also, after calling the "useless" undo_isolate_page_range() here, the
      code reaches the point of no return by notifying MEM_OFFLINE.  Those
      pages will be marked as MIGRATE_MOVABLE again once onlined.  The only
      thing left to do is to decrease the zone counter of isolated
      pageblocks, which would otherwise leave some page allocation paths
      slower, a slowdown the above commit introduced.
      
      Even if alloc_contig_range() can be used to isolate 16GB-hugetlb pages
      on ppc64, an "int" should still be enough to represent the number of
      pageblocks there.  Fix an incorrect comment along the way.
      
      [cai@lca.pw: v4]
        Link: http://lkml.kernel.org/r/20190314150641.59358-1-cai@lca.pw
      Link: http://lkml.kernel.org/r/20190313143133.46200-1-cai@lca.pw
      Fixes: 2ce13640 ("mm: __first_valid_page skip over offline pages")
      Signed-off-by: Qian Cai <cai@lca.pw>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Reviewed-by: Oscar Salvador <osalvador@suse.de>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: <stable@vger.kernel.org>	[4.13+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  7. 13 Mar 2019, 2 commits
    • memblock: drop memblock_alloc_*_nopanic() variants · 26fb3dae
      Committed by Mike Rapoport
      As all the memblock allocation functions return NULL in case of error
      rather than panic(), the duplicates with _nopanic suffix can be removed.
      
      Link: http://lkml.kernel.org/r/1548057848-15136-22-git-send-email-rppt@linux.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Reviewed-by: Petr Mladek <pmladek@suse.com>		[printk]
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Guo Ren <ren_guo@c-sky.com>				[c-sky]
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Juergen Gross <jgross@suse.com>			[Xen]
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • treewide: add checks for the return value of memblock_alloc*() · 8a7f97b9
      Committed by Mike Rapoport
      Add checks for the return value of the memblock_alloc*() functions and
      call panic() in case of error.  The panic message repeats the one used
      by the panicking memblock allocators, with the parameters adjusted to
      include only the relevant ones.
      
      The replacement was mostly automated with semantic patches like the one
      below with manual massaging of format strings.
      
        @@
        expression ptr, size, align;
        @@
        ptr = memblock_alloc(size, align);
        + if (!ptr)
        + 	panic("%s: Failed to allocate %lu bytes align=0x%lx\n", __func__, size, align);
      
      [anders.roxell@linaro.org: use '%pa' with 'phys_addr_t' type]
        Link: http://lkml.kernel.org/r/20190131161046.21886-1-anders.roxell@linaro.org
      [rppt@linux.ibm.com: fix format strings for panics after memblock_alloc]
        Link: http://lkml.kernel.org/r/1548950940-15145-1-git-send-email-rppt@linux.ibm.com
      [rppt@linux.ibm.com: don't panic if the allocation in sparse_buffer_init fails]
        Link: http://lkml.kernel.org/r/20190131074018.GD28876@rapoport-lnx
      [akpm@linux-foundation.org: fix xtensa printk warning]
      Link: http://lkml.kernel.org/r/1548057848-15136-20-git-send-email-rppt@linux.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
      Reviewed-by: Guo Ren <ren_guo@c-sky.com>		[c-sky]
      Acked-by: Paul Burton <paul.burton@mips.com>		[MIPS]
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>	[s390]
      Reviewed-by: Juergen Gross <jgross@suse.com>		[Xen]
      Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>	[m68k]
      Acked-by: Max Filippov <jcmvbkbc@gmail.com>		[xtensa]
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  8. 06 Mar 2019, 1 commit
    • mm/sparse: fix a bad comparison · d778015a
      Committed by Qian Cai
      next_present_section_nr() can only return -1 as an unsigned number, so
      just check for that value specifically; compilers will convert -1 to
      unsigned where needed.
      
        mm/sparse.c: In function 'sparse_init_nid':
        mm/sparse.c:200:20: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits]
               ((section_nr >= 0) &&    \
                            ^~
        mm/sparse.c:478:2: note: in expansion of macro
        'for_each_present_section_nr'
          for_each_present_section_nr(pnum_begin, pnum) {
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~
        mm/sparse.c:200:20: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits]
               ((section_nr >= 0) &&    \
                            ^~
        mm/sparse.c:497:2: note: in expansion of macro
        'for_each_present_section_nr'
          for_each_present_section_nr(pnum_begin, pnum) {
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~
        mm/sparse.c: In function 'sparse_init':
        mm/sparse.c:200:20: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits]
               ((section_nr >= 0) &&    \
                            ^~
        mm/sparse.c:520:2: note: in expansion of macro
        'for_each_present_section_nr'
          for_each_present_section_nr(pnum_begin + 1, pnum_end) {
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~
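      
      A minimal demonstration (a sketch mirroring the macro's shape, not the
      kernel code) of why the fix works: 'section_nr' is unsigned, so -1
      wraps to ULONG_MAX and '>= 0' is vacuously true, while comparing
      against (unsigned long)-1 terminates the walk as intended:
      
          #include <stdio.h>
      
          /* toy: sections 0..2 are present, then the walk ends with -1 */
          static unsigned long next_present_section_nr(unsigned long nr)
          {
                  return nr < 2 ? nr + 1 : (unsigned long)-1;
          }
      
          int main(void)
          {
                  unsigned long nr;
      
                  /* fixed loop: '!= -1' instead of the always-true '>= 0' */
                  for (nr = 0; nr != (unsigned long)-1;
                       nr = next_present_section_nr(nr))
                          printf("present section %lu\n", nr);
                  return 0;
          }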
      
      Link: http://lkml.kernel.org/r/20190228181839.86504-1-cai@lca.pw
      Fixes: c4e1be9e ("mm, sparsemem: break out of loops early")
      Signed-off-by: Qian Cai <cai@lca.pw>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  9. 29 Dec 2018, 3 commits
  10. 15 Dec 2018, 1 commit
  11. 31 Oct 2018, 5 commits
    • memblock: stop using implicit alignment to SMP_CACHE_BYTES · 7e1c4e27
      Committed by Mike Rapoport
      When the memblock allocation APIs are called with align = 0, the
      alignment is implicitly set to SMP_CACHE_BYTES.
      
      Implicit alignment is done deep in the memblock allocator and it can
      come as a surprise.  Not that such an alignment would be wrong, but it
      is better to be explicit for the sake of clarity and the principle of
      least surprise.
      
      Replace all such uses of memblock APIs with the 'align' parameter
      explicitly set to SMP_CACHE_BYTES and stop implicit alignment assignment
      in the memblock internal allocation functions.
      
      For the case when memblock APIs are used via helper functions, e.g.  like
      iommu_arena_new_node() in Alpha, the helper functions were detected with
      Coccinelle's help and then manually examined and updated where
      appropriate.
      
      The direct memblock APIs users were updated using the semantic patch below:
      
      @@
      expression size, min_addr, max_addr, nid;
      @@
      (
      |
      - memblock_alloc_try_nid_raw(size, 0, min_addr, max_addr, nid)
      + memblock_alloc_try_nid_raw(size, SMP_CACHE_BYTES, min_addr, max_addr,
      nid)
      |
      - memblock_alloc_try_nid_nopanic(size, 0, min_addr, max_addr, nid)
      + memblock_alloc_try_nid_nopanic(size, SMP_CACHE_BYTES, min_addr, max_addr,
      nid)
      |
      - memblock_alloc_try_nid(size, 0, min_addr, max_addr, nid)
      + memblock_alloc_try_nid(size, SMP_CACHE_BYTES, min_addr, max_addr, nid)
      |
      - memblock_alloc(size, 0)
      + memblock_alloc(size, SMP_CACHE_BYTES)
      |
      - memblock_alloc_raw(size, 0)
      + memblock_alloc_raw(size, SMP_CACHE_BYTES)
      |
      - memblock_alloc_from(size, 0, min_addr)
      + memblock_alloc_from(size, SMP_CACHE_BYTES, min_addr)
      |
      - memblock_alloc_nopanic(size, 0)
      + memblock_alloc_nopanic(size, SMP_CACHE_BYTES)
      |
      - memblock_alloc_low(size, 0)
      + memblock_alloc_low(size, SMP_CACHE_BYTES)
      |
      - memblock_alloc_low_nopanic(size, 0)
      + memblock_alloc_low_nopanic(size, SMP_CACHE_BYTES)
      |
      - memblock_alloc_from_nopanic(size, 0, min_addr)
      + memblock_alloc_from_nopanic(size, SMP_CACHE_BYTES, min_addr)
      |
      - memblock_alloc_node(size, 0, nid)
      + memblock_alloc_node(size, SMP_CACHE_BYTES, nid)
      )
      
      [mhocko@suse.com: changelog update]
      [akpm@linux-foundation.org: coding-style fixes]
      [rppt@linux.ibm.com: fix missed uses of implicit alignment]
        Link: http://lkml.kernel.org/r/20181016133656.GA10925@rapoport-lnx
      Link: http://lkml.kernel.org/r/1538687224-17535-1-git-send-email-rppt@linux.vnet.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Suggested-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Paul Burton <paul.burton@mips.com>	[MIPS]
      Acked-by: Michael Ellerman <mpe@ellerman.id.au>	[powerpc]
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: remove include/linux/bootmem.h · 57c8a661
      Committed by Mike Rapoport
      Move remaining definitions and declarations from include/linux/bootmem.h
      into include/linux/memblock.h and remove the redundant header.
      
      The includes were replaced with the semantic patch below and then
      semi-automated removal of duplicated '#include <linux/memblock.h>' lines.
      
      @@
      @@
      - #include <linux/bootmem.h>
      + #include <linux/memblock.h>
      
      [sfr@canb.auug.org.au: dma-direct: fix up for the removal of linux/bootmem.h]
        Link: http://lkml.kernel.org/r/20181002185342.133d1680@canb.auug.org.au
      [sfr@canb.auug.org.au: powerpc: fix up for removal of linux/bootmem.h]
        Link: http://lkml.kernel.org/r/20181005161406.73ef8727@canb.auug.org.au
      [sfr@canb.auug.org.au: x86/kaslr, ACPI/NUMA: fix for linux/bootmem.h removal]
        Link: http://lkml.kernel.org/r/20181008190341.5e396491@canb.auug.org.au
      Link: http://lkml.kernel.org/r/1536927045-23536-30-git-send-email-rppt@linux.vnet.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Ley Foon Tan <lftan@altera.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Palmer Dabbelt <palmer@sifive.com>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Serge Semin <fancer.lancer@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • memblock: replace BOOTMEM_ALLOC_* with MEMBLOCK variants · 97ad1087
      Committed by Mike Rapoport
      Drop BOOTMEM_ALLOC_ACCESSIBLE and BOOTMEM_ALLOC_ANYWHERE in favor of
      identical MEMBLOCK definitions.
      
      Link: http://lkml.kernel.org/r/1536927045-23536-29-git-send-email-rppt@linux.vnet.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Ley Foon Tan <lftan@altera.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Palmer Dabbelt <palmer@sifive.com>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Serge Semin <fancer.lancer@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • memblock: add align parameter to memblock_alloc_node() · 3913c8f9
      Committed by Mike Rapoport
      With the align parameter memblock_alloc_node() can be used as drop in
      replacement for alloc_bootmem_pages_node() and __alloc_bootmem_node(),
      which is done in the following patches.
      
      Link: http://lkml.kernel.org/r/1536927045-23536-15-git-send-email-rppt@linux.vnet.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Ley Foon Tan <lftan@altera.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Palmer Dabbelt <palmer@sifive.com>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Serge Semin <fancer.lancer@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • memblock: remove _virt from APIs returning virtual address · eb31d559
      Committed by Mike Rapoport
      The conversion is done using
      
      sed -i 's@memblock_virt_alloc@memblock_alloc@g' \
      	$(git grep -l memblock_virt_alloc)
      
      Link: http://lkml.kernel.org/r/1536927045-23536-8-git-send-email-rppt@linux.vnet.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Ley Foon Tan <lftan@altera.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Palmer Dabbelt <palmer@sifive.com>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Serge Semin <fancer.lancer@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  12. 27 Oct 2018, 1 commit
    • mm: provide kernel parameter to allow disabling page init poisoning · f682a97a
      Committed by Alexander Duyck
      Patch series "Address issues slowing persistent memory initialization", v5.
      
      The main thing this patch set achieves is that it allows us to initialize
      each node's worth of persistent memory independently.  As a result we
      reduce page init time by about 2 minutes: instead of taking 30 to 40
      seconds per node and going through each node one at a time, we process
      all 4 nodes in parallel in the case of a 12TB persistent memory setup
      spread evenly over 4 nodes.
      
      This patch (of 3):
      
      On systems with a large amount of memory it can take a significant amount
      of time to initialize all of the page structs with the PAGE_POISON_PATTERN
      value.  I have seen it take over 2 minutes to initialize a system with
      over 12TB of RAM.
      
      In order to work around the issue I had to disable CONFIG_DEBUG_VM and
      then the boot time returned to something much more reasonable as the
      arch_add_memory call completed in milliseconds versus seconds.  However in
      doing that I had to disable all of the other VM debugging on the system.
      
      In order to work around a kernel that might have CONFIG_DEBUG_VM enabled
      on a system that has a large amount of memory I have added a new kernel
      parameter named "vm_debug" that can be set to "-" in order to disable it.
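      
      For example, the option is consumed from the kernel command line; a
      hypothetical GRUB menu entry (the option name comes from this patch,
      while the kernel path and root device are illustrative only) would
      look like:
      
          linux /boot/vmlinuz root=/dev/sda1 ro vm_debug=-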
      
      Link: http://lkml.kernel.org/r/20180925201921.3576.84239.stgit@localhost.localdomain
      Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>
      Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  13. 18 Aug 2018, 6 commits
    • mm/sparse: delete old sparse_init and enable new one · 2a3cb8ba
      Committed by Pavel Tatashin
      Rename new_sparse_init() to sparse_init(), which enables it.  Delete the
      old sparse_init() and all the code that became obsolete with it.
      
      [pasha.tatashin@oracle.com: remove unused sparse_mem_maps_populate_node()]
        Link: http://lkml.kernel.org/r/20180716174447.14529-6-pasha.tatashin@oracle.com
      Link: http://lkml.kernel.org/r/20180712203730.8703-6-pasha.tatashin@oracle.com
      Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
      Tested-by: Michael Ellerman <mpe@ellerman.id.au>	[powerpc]
      Tested-by: Oscar Salvador <osalvador@suse.de>
      Reviewed-by: Oscar Salvador <osalvador@suse.de>
      Cc: Pasha Tatashin <Pavel.Tatashin@microsoft.com>
      Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jérôme Glisse <jglisse@redhat.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Souptick Joarder <jrdr.linux@gmail.com>
      Cc: Steven Sistare <steven.sistare@oracle.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Wei Yang <richard.weiyang@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/sparse: add new sparse_init_nid() and sparse_init() · 85c77f79
      Committed by Pavel Tatashin
      sparse_init() requires temporarily allocating two large buffers:
      usemap_map and map_map.  Baoquan He has identified that these buffers are
      so large that Linux is not bootable on small-memory machines, such as a
      kdump boot.  The buffers are especially large when CONFIG_X86_5LEVEL is
      set, as they are scaled to the maximum physical memory size.
      
      Baoquan provided a fix that reduces the sizes of these buffers, but it
      is much better to get rid of them entirely.
      
      Add a new way to initialize sparse memory: sparse_init_nid(), which only
      operates within one memory node, and thus allocates memory either in one
      large contiguous block or section by section.  This eliminates the need
      for temporary buffers.
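      
      A toy model (an assumption-laden sketch, not the kernel code) of the
      resulting control flow: walk the present sections once and hand each
      node's contiguous run to a per-node initializer, with no global scratch
      buffers:
      
          #include <stdio.h>
      
          #define NR_SECTIONS 8
          /* toy layout: which node each present section belongs to */
          static const int section_nid[NR_SECTIONS] = { 0, 0, 0, 1, 1, 2, 2, 2 };
      
          /* model of sparse_init_nid(): init one node's section range */
          static void sparse_init_nid(int nid, int pnum_begin, int pnum_end)
          {
                  printf("node %d: init sections [%d, %d)\n",
                         nid, pnum_begin, pnum_end);
          }
      
          /* model of sparse_init(): split the section space at node edges */
          int main(void)
          {
                  int pnum_begin = 0, pnum;
      
                  for (pnum = 1; pnum <= NR_SECTIONS; pnum++) {
                          if (pnum == NR_SECTIONS ||
                              section_nid[pnum] != section_nid[pnum_begin]) {
                                  sparse_init_nid(section_nid[pnum_begin],
                                                  pnum_begin, pnum);
                                  pnum_begin = pnum;
                          }
                  }
                  return 0;
          }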
      
      To simplify bisecting and review, temporarily call the new sparse_init()
      new_sparse_init(); the new interface is enabled, and the old code
      removed, in the next patch.
      
      Link: http://lkml.kernel.org/r/20180712203730.8703-5-pasha.tatashin@oracle.com
      Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
      Reviewed-by: Oscar Salvador <osalvador@suse.de>
      Tested-by: Oscar Salvador <osalvador@suse.de>
      Tested-by: Michael Ellerman <mpe@ellerman.id.au>	[powerpc]
      Cc: Pasha Tatashin <Pavel.Tatashin@microsoft.com>
      Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jérôme Glisse <jglisse@redhat.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Souptick Joarder <jrdr.linux@gmail.com>
      Cc: Steven Sistare <steven.sistare@oracle.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Wei Yang <richard.weiyang@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      85c77f79
    • P
      mm/sparse: move buffer init/fini to the common place · afda57bc
      Pavel Tatashin committed
      Now that both variants of sparse memory use the same buffers to
      populate the memory map, we can move
      sparse_buffer_init()/sparse_buffer_fini() to the common place.
      
      Link: http://lkml.kernel.org/r/20180712203730.8703-4-pasha.tatashin@oracle.com
      Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
      Tested-by: Michael Ellerman <mpe@ellerman.id.au>	[powerpc]
      Tested-by: Oscar Salvador <osalvador@suse.de>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Pasha Tatashin <Pavel.Tatashin@microsoft.com>
      Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jérôme Glisse <jglisse@redhat.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Souptick Joarder <jrdr.linux@gmail.com>
      Cc: Steven Sistare <steven.sistare@oracle.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Wei Yang <richard.weiyang@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      afda57bc
    • P
      mm/sparse: use the new sparse buffer functions in non-vmemmap · e131c06b
      Pavel Tatashin committed
      Non-vmemmap sparse also allocates a large contiguous chunk of memory
      and, if that fails, falls back to smaller allocations.  Use the same
      buffer-allocation functions as vmemmap sparse.
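
      For illustration, a hedged userspace C model of that shape follows
      (the sparsemap_buf pointers mimic the shared buffer; SECTION_MAP_SIZE
      and the populate function are simplified stand-ins, not the real
      mm/sparse.c code):

          #include <stdlib.h>
          #include <string.h>

          #define SECTION_MAP_SIZE 4096UL  /* stand-in memmap size */

          /* minimal model of the shared bump allocator over one chunk */
          static char *sparsemap_buf, *sparsemap_buf_end;

          static void *sparse_buffer_alloc_model(unsigned long size)
          {
                  char *p = NULL;

                  if (sparsemap_buf &&
                      sparsemap_buf + size <= sparsemap_buf_end) {
                          p = sparsemap_buf;
                          sparsemap_buf += size;
                  }
                  return p;
          }

          /*
           * Non-vmemmap memmap population, same shape as the vmemmap
           * one: carve a slice out of the big contiguous buffer when
           * possible, otherwise fall back to a per-section allocation.
           */
          static void *sparse_mem_map_populate_model(void)
          {
                  void *map = sparse_buffer_alloc_model(SECTION_MAP_SIZE);

                  if (!map)
                          map = malloc(SECTION_MAP_SIZE);  /* fallback */
                  if (map)
                          memset(map, 0, SECTION_MAP_SIZE);
                  return map;  /* would be installed as the memmap */
          }

          int main(void)
          {
                  /* pretend the big chunk covers only two sections */
                  sparsemap_buf = malloc(2 * SECTION_MAP_SIZE);
                  if (sparsemap_buf)
                          sparsemap_buf_end =
                                  sparsemap_buf + 2 * SECTION_MAP_SIZE;

                  for (int i = 0; i < 4; i++)  /* last two fall back */
                          sparse_mem_map_populate_model();
                  return 0;
          }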
      
      Link: http://lkml.kernel.org/r/20180712203730.8703-3-pasha.tatashin@oracle.com
      Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
      Tested-by: Michael Ellerman <mpe@ellerman.id.au>	[powerpc]
      Reviewed-by: Oscar Salvador <osalvador@suse.de>
      Tested-by: Oscar Salvador <osalvador@suse.de>
      Cc: Pasha Tatashin <Pavel.Tatashin@microsoft.com>
      Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jérôme Glisse <jglisse@redhat.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Souptick Joarder <jrdr.linux@gmail.com>
      Cc: Steven Sistare <steven.sistare@oracle.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Wei Yang <richard.weiyang@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e131c06b
    • P
      mm/sparse: abstract sparse buffer allocations · 35fd1eb1
      Pavel Tatashin committed
      Patch series "sparse_init rewrite", v6.
      
      In sparse_init() we allocate two large buffers to temporarily hold the
      usemap and memmap for the whole machine.  However, we can avoid doing
      that by changing sparse_init() to operate on a per-node basis instead
      of on the whole machine beforehand.
      
      As shown by Baoquan
        http://lkml.kernel.org/r/20180628062857.29658-1-bhe@redhat.com
      
      the buffers are large enough to prevent the machine from booting on
      small-memory systems.
      
      Another benefit of these changes is that they also obsolete
      CONFIG_SPARSEMEM_ALLOC_MEM_MAP_TOGETHER.
      
      This patch (of 5):
      
      When struct pages are allocated for the sparse-vmemmap VA layout, we
      first try to allocate one large buffer, and then, if that fails,
      allocate struct pages for each section as we go.
      
      The code that allocates the buffer uses global variables and is spread
      across several call sites.
      
      Clean up the code by introducing three functions to handle the global
      buffer:
      
      sparse_buffer_init()	initialize the buffer
      sparse_buffer_fini()	free the remaining part of the buffer
      sparse_buffer_alloc()	allocate from the buffer; return NULL when
      				the buffer is empty
      
      Define these functions in sparse.c instead of sparse-vmemmap.c because
      later we will use them for non-vmemmap sparse allocations as well.
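
      A minimal userspace model of this API, assuming a simple bump
      allocator over one malloc'ed region (the kernel allocates the region
      per node with memblock, PTR_ALIGN()s returned pointers, and frees
      only the unused tail; none of that is modeled here):

          #include <stdio.h>
          #include <stdlib.h>

          static char *buf_start;  /* whole region, kept for freeing */
          static char *buf;        /* next unconsumed byte */
          static char *buf_end;    /* one past the end of the region */

          /* grab one large region up front */
          static void sparse_buffer_init(unsigned long size)
          {
                  buf_start = buf = malloc(size);
                  buf_end = buf ? buf + size : NULL;
          }

          /* hand out chunks from the front; NULL when exhausted */
          static void *sparse_buffer_alloc(unsigned long size)
          {
                  void *ptr = NULL;

                  if (buf && buf + size <= buf_end) {
                          ptr = buf;
                          buf += size;
                  }
                  return ptr;  /* caller allocates on its own on NULL */
          }

          /* free the remaining part of the buffer */
          static void sparse_buffer_fini(void)
          {
                  /* malloc cannot free partially, so the model frees
                   * the whole region instead of just the unused tail */
                  free(buf_start);
                  buf_start = buf = buf_end = NULL;
          }

          int main(void)
          {
                  sparse_buffer_init(4096);
                  void *a = sparse_buffer_alloc(1024);  /* succeeds */
                  void *b = sparse_buffer_alloc(8192);  /* NULL */
                  printf("a=%p b=%p\n", a, b);
                  sparse_buffer_fini();
                  return 0;
          }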
      
      [akpm@linux-foundation.org: use PTR_ALIGN()]
      [akpm@linux-foundation.org: s/BUG_ON/WARN_ON/]
      Link: http://lkml.kernel.org/r/20180712203730.8703-2-pasha.tatashin@oracle.com
      Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
      Tested-by: Michael Ellerman <mpe@ellerman.id.au>	[powerpc]
      Reviewed-by: Oscar Salvador <osalvador@suse.de>
      Tested-by: Oscar Salvador <osalvador@suse.de>
      Cc: Pasha Tatashin <Pavel.Tatashin@microsoft.com>
      Cc: Steven Sistare <steven.sistare@oracle.com>
      Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jérôme Glisse <jglisse@redhat.com>
      Cc: Souptick Joarder <jrdr.linux@gmail.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Wei Yang <richard.weiyang@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      35fd1eb1
    • B
      mm/sparse: optimize memmap allocation during sparse_init() · c98aff64
      Baoquan He committed
      In sparse_init(), two temporary pointer arrays, usemap_map and
      map_map, are allocated with the size of NR_MEM_SECTIONS.  They are
      used to store each memory section's usemap and memmap if it is marked
      as present.  With the help of these two arrays, a contiguous memory
      chunk is allocated for the usemaps and memmaps of the memory sections
      on one node.  This avoids excessive memory fragmentation.  In the
      diagram below, '1' indicates a present memory section, '0' an absent
      one.  The number 'n' can be much smaller than NR_MEM_SECTIONS on most
      systems.
      
        |1|1|1|1|0|0|0|0|1|1|0|0|...|1|0||1|0|...|1||0|1|...|0|
        -------------------------------------------------------
         0 1 2 3         4 5         i   i+1     n-1   n
      
      If we fail to populate the page tables to map one section's memmap,
      its ->section_mem_map will finally be cleared to indicate that the
      section is not present.  After use, these two arrays are released at
      the end of sparse_init().
      
      In 4-level paging mode each array costs 4M, which is negligible.  In
      5-level paging they cost 256M each, 512M altogether.  A kdump kernel
      usually reserves only very little memory, e.g. 256M, so even though
      the arrays are temporarily allocated, this is still not acceptable.
      
      In fact, there's no need to allocate them with the size of
      NR_MEM_SECTIONS.  Since the ->section_mem_map clearing has been
      deferred to the end, the number of present memory sections stays the
      same during sparse_init() until we finally clear out a memory
      section's ->section_mem_map because its usemap or memmap was not
      correctly handled.  Thus, whenever the for_each_present_section_nr()
      loop is taken in the middle, the i-th present memory section is always
      the same one.
      
      Here, allocate usemap_map and map_map with the size of
      'nr_present_sections' only.  For the i-th present memory section,
      install its usemap and memmap to usemap_map[i] and map_map[i] during
      allocation.  Then, in the last for_each_present_section_nr() loop,
      which clears the failed memory sections' ->section_mem_map, fetch the
      usemap and memmap from the usemap_map[] and map_map[] arrays and set
      them into mem_section[] accordingly.
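
      A hedged sketch of the resulting two-pass scheme in userspace C (the
      sizes and the present[] bitmap are made-up stand-ins for the real
      section machinery):

          #include <stdbool.h>
          #include <stdlib.h>

          #define NR_MEM_SECTIONS 64  /* tiny value for the model */

          static bool present[NR_MEM_SECTIONS];  /* sections that exist */

          static void sparse_init_model(void)
          {
                  unsigned long nr_present_sections = 0, pnum, i;

                  for (pnum = 0; pnum < NR_MEM_SECTIONS; pnum++)
                          if (present[pnum])
                                  nr_present_sections++;

                  /* before: both arrays were sized by NR_MEM_SECTIONS
                   * (256M each with 5-level paging); after: one slot
                   * per present section only */
                  void **usemap_map =
                          calloc(nr_present_sections, sizeof(void *));
                  void **map_map =
                          calloc(nr_present_sections, sizeof(void *));

                  if (!usemap_map || !map_map)
                          goto out;

                  /* allocation pass: i-th present section fills slot i */
                  i = 0;
                  for (pnum = 0; pnum < NR_MEM_SECTIONS; pnum++) {
                          if (!present[pnum])
                                  continue;
                          usemap_map[i] = malloc(64);   /* usemap */
                          map_map[i] = malloc(4096);    /* memmap */
                          i++;
                  }

                  /* fixup pass: ->section_mem_map clearing is deferred,
                   * so walking present sections again visits them in
                   * the same order and slot i names the same section */
                  i = 0;
                  for (pnum = 0; pnum < NR_MEM_SECTIONS; pnum++) {
                          if (!present[pnum])
                                  continue;
                          /* install usemap_map[i]/map_map[i] for pnum;
                           * they would live on in mem_section[] */
                          i++;
                  }
          out:
                  free(usemap_map);
                  free(map_map);
          }

          int main(void)
          {
                  present[0] = present[1] = present[5] = true;
                  sparse_init_model();
                  return 0;
          }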
      
      [akpm@linux-foundation.org: coding-style fixes]
      Link: http://lkml.kernel.org/r/20180628062857.29658-5-bhe@redhat.com
      Signed-off-by: Baoquan He <bhe@redhat.com>
      Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
      Cc: Pasha Tatashin <Pavel.Tatashin@microsoft.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Oscar Salvador <osalvador@techadventures.net>
      Cc: Pankaj Gupta <pagupta@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c98aff64