1. 02 Sep, 2020 7 commits
    • mm: reclaim small amounts of memory when an external fragmentation event occurs · 9bcadc70
      Mel Gorman authored
      to #28825456
      
      commit 1c30844d2dfe272d58c8fc000960b835d13aa2ac upstream.
      
      An external fragmentation event was previously described as
      
          When the page allocator fragments memory, it records the event using
          the mm_page_alloc_extfrag event. If the fallback_order is smaller
          than a pageblock order (order-9 on 64-bit x86) then it's considered
          an event that will cause external fragmentation issues in the future.
      
      The kernel reduces the probability of such events by increasing the
      watermark sizes by calling set_recommended_min_free_kbytes early in the
      lifetime of the system.  This works reasonably well in general but if
      there are enough sparsely populated pageblocks then the problem can still
      occur as enough memory is free overall and kswapd stays asleep.
      
      This patch introduces a watermark_boost_factor sysctl that allows a zone
      watermark to be temporarily boosted when an external-fragmentation-causing
      event occurs.  The boosting will stall allocations that would decrease
      free memory below the boosted low watermark, and kswapd is woken, if the
      calling context allows, to reclaim an amount of memory relative to the size
      of the high watermark and the watermark_boost_factor until the boost is
      cleared.  When kswapd finishes, it wakes kcompactd at the pageblock order
      to clean some of the pageblocks that may have been affected by the
      fragmentation event.  kswapd avoids any writeback, slab shrinkage and swap
      from reclaim context during this operation to avoid excessive system
      disruption in the name of fragmentation avoidance.  Care is taken so that
      kswapd will do normal reclaim work if the system is really low on memory.
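
The boosting described above can be illustrated with a small model. This is a hedged sketch, not the kernel's code: it assumes the factor is expressed in units of 1/10000 of the high watermark, that each fragmentation event raises the boost by one pageblock, and that the boost is capped. The function names and constants are hypothetical; only the order-9 pageblock size comes from the text.

```python
# Illustrative model of a temporary watermark boost (all names hypothetical).
# Assumptions, not taken from the commit message:
#   - boost_factor is in units of 1/10000 of the high watermark
#   - each fragmentation event adds one pageblock worth of pages
#   - the boost is capped, but the cap is never below one pageblock

PAGEBLOCK_NR_PAGES = 1 << 9  # order-9 pageblock, as on 64-bit x86


def max_boost(high_wmark_pages: int, boost_factor: int) -> int:
    """Return the assumed upper bound on the boost, in pages."""
    if boost_factor == 0:
        return 0  # a factor of 0 is treated as "boosting disabled"
    cap = high_wmark_pages * boost_factor // 10000
    return max(cap, PAGEBLOCK_NR_PAGES)


def boost_after_event(current_boost: int, high_wmark_pages: int,
                      boost_factor: int) -> int:
    """Return the boost after one more fragmentation event is recorded."""
    cap = max_boost(high_wmark_pages, boost_factor)
    return min(current_boost + PAGEBLOCK_NR_PAGES, cap)
```

Under these assumptions, a zone with a high watermark of 10000 pages and a factor of 15000 could accumulate a boost of at most 15000 pages, one pageblock at a time, until kswapd clears it.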
      
      This was evaluated using the same workloads as "mm, page_alloc: Spread
      allocations across zones before introducing fragmentation".
      
      1-socket Skylake machine
      config-global-dhp__workload_thpfioscale XFS (no special madvise)
      4 fio threads, 1 THP allocating thread
      --------------------------------------
      
      4.20-rc3 extfrag events < order 9:   804694
      4.20-rc3+patch:                      408912 (49% reduction)
      4.20-rc3+patch1-4:                    18421 (98% reduction)
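
The reduction percentages quoted with these counts follow directly from the raw event numbers. A minimal sketch of the arithmetic, using a hypothetical helper name and assuming the figures are rounded to the nearest whole percent:

```python
def reduction_percent(baseline: int, patched: int) -> int:
    """Percentage drop from baseline to patched, rounded to a whole percent."""
    return round(100 * (baseline - patched) / baseline)


# Event counts from the 1-socket Skylake run without madvise.
baseline = 804694
print(reduction_percent(baseline, 408912))  # this patch alone
print(reduction_percent(baseline, 18421))   # patches 1-4 combined
```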
      
                                         4.20.0-rc3             4.20.0-rc3
                                       lowzone-v5r8             boost-v5r8
      Amean     fault-base-1      653.58 (   0.00%)      652.71 (   0.13%)
      Amean     fault-huge-1        0.00 (   0.00%)      178.93 * -99.00%*
      
                                    4.20.0-rc3             4.20.0-rc3
                                  lowzone-v5r8             boost-v5r8
      Percentage huge-1        0.00 (   0.00%)        5.12 ( 100.00%)
      
      Note that external-fragmentation-causing events are massively reduced by
      this patch, whether in comparison to the previous kernel or the vanilla
      kernel.  The fault latency for huge pages appears to be increased but that
      is only because THP allocations were successful with the patch applied.
      
      1-socket Skylake machine
      global-dhp__workload_thpfioscale-madvhugepage-xfs (MADV_HUGEPAGE)
      -----------------------------------------------------------------
      
      4.20-rc3 extfrag events < order 9:  291392
      4.20-rc3+patch:                     191187 (34% reduction)
      4.20-rc3+patch1-4:                   13464 (95% reduction)
      
      thpfioscale Fault Latencies
                                         4.20.0-rc3             4.20.0-rc3
                                       lowzone-v5r8             boost-v5r8
      Min       fault-base-1      912.00 (   0.00%)      905.00 (   0.77%)
      Min       fault-huge-1      127.00 (   0.00%)      135.00 (  -6.30%)
      Amean     fault-base-1     1467.55 (   0.00%)     1481.67 (  -0.96%)
      Amean     fault-huge-1     1127.11 (   0.00%)     1063.88 *   5.61%*
      
                                    4.20.0-rc3             4.20.0-rc3
                                  lowzone-v5r8             boost-v5r8
      Percentage huge-1       77.64 (   0.00%)       83.46 (   7.49%)
      
      As before, massive reduction in external fragmentation events, some jitter
      on latencies and an increase in THP allocation success rates.
      
      2-socket Haswell machine
      config-global-dhp__workload_thpfioscale XFS (no special madvise)
      4 fio threads, 5 THP allocating threads
      ----------------------------------------------------------------
      
      4.20-rc3 extfrag events < order 9:  215698
      4.20-rc3+patch:                     200210 (7% reduction)
      4.20-rc3+patch1-4:                   14263 (93% reduction)
      
                                         4.20.0-rc3             4.20.0-rc3
                                       lowzone-v5r8             boost-v5r8
      Amean     fault-base-5     1346.45 (   0.00%)     1306.87 (   2.94%)
      Amean     fault-huge-5     3418.60 (   0.00%)     1348.94 (  60.54%)
      
                                    4.20.0-rc3             4.20.0-rc3
                                  lowzone-v5r8             boost-v5r8
      Percentage huge-5        0.78 (   0.00%)        7.91 ( 910.64%)
      
      There is a 93% reduction in fragmentation-causing events, a big
      reduction in the huge page fault latency, and a higher allocation
      success rate.
      
      2-socket Haswell machine
      global-dhp__workload_thpfioscale-madvhugepage-xfs (MADV_HUGEPAGE)
      -----------------------------------------------------------------
      
      4.20-rc3 extfrag events < order 9: 166352
      4.20-rc3+patch:                    147463 (11% reduction)
      4.20-rc3+patch1-4:                  11095 (93% reduction)
      
      thpfioscale Fault Latencies
                                         4.20.0-rc3             4.20.0-rc3
                                       lowzone-v5r8             boost-v5r8
      Amean     fault-base-5     6217.43 (   0.00%)     7419.67 * -19.34%*
      Amean     fault-huge-5     3163.33 (   0.00%)     3263.80 (  -3.18%)
      
                                    4.20.0-rc3             4.20.0-rc3
                                  lowzone-v5r8             boost-v5r8
      Percentage huge-5       95.14 (   0.00%)       87.98 (  -7.53%)
      
      There is a large reduction in fragmentation events with some jitter around
      the latencies and success rates.  As before, the high THP allocation
      success rate does mean the system is under a lot of pressure.  However, as
      the fragmentation events are reduced, it would be expected that the
      long-term allocation success rate would be higher.
      
      Link: http://lkml.kernel.org/r/20181123114528.28802-5-mgorman@techsingularity.net
      Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Zi Yan <zi.yan@cs.rutgers.edu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Xu Yu <xuyu@linux.alibaba.com>
      Reviewed-by: Yang Shi <yang.shi@linux.alibaba.com>
    • arm64: Enable the support of pseudo-NMIs · beaa4f75
      Julien Thierry authored
      task #25552995
      
      commit bc3c03ccb4641fb940b27a0d369431876923a8fe upstream
      
      Add a build option and a command line parameter to build and enable the
      support of pseudo-NMIs.
      Signed-off-by: Julien Thierry <julien.thierry@arm.com>
      Suggested-by: Daniel Thompson <daniel.thompson@linaro.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Zou Cao <zoucao@linux.alibaba.com>
      Reviewed-by: luanshi <zhangliguang@linux.alibaba.com>
    • irqchip/gic-v3: Detect if GIC can support pseudo-NMIs · 7bf70240
      Julien Thierry authored
      task #25552995
      
      commit d98d0a990ca1446d3c0ca8f0b9ac127a66e40cdf upstream
      
      The values non-secure EL1 needs to use for the PMR and RPR registers
      depend on the value of SCR_EL3.FIQ.
      
      The values non-secure EL1 sees from the distributor and redistributor
      depend on whether security is enabled for the GIC or not.
      
      To avoid having to deal with two sets of values for PMR
      masking/unmasking, only enable pseudo-NMIs when the GIC has a non-secure
      view of priorities.
      
      Also, add firmware requirements related to SCR_EL3.
      Signed-off-by: Julien Thierry <julien.thierry@arm.com>
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Jason Cooper <jason@lakedaemon.net>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Zou Cao <zoucao@linux.alibaba.com>
      Reviewed-by: luanshi <zhangliguang@linux.alibaba.com>
    • arm64: Update silicon-errata.txt for Neoverse-N1 #1349291 · 52e57fc5
      James Morse authored
      task #28924046
      
      [ Upstream commit 3276cc248964 ]
      
      Neoverse-N1 affected by #1349291 may report an Uncontained RAS Error
      as Unrecoverable. The kernel's architecture code already considers
      Unrecoverable errors as fatal because, without kernel-first support, no
      further error handling is possible.
      
      Now that KVM attributes SError to the host/guest more precisely,
      the host's architecture code will always handle host errors that
      become pending during world-switch.
      Errors misclassified by this erratum that affected the guest will be
      re-injected to the guest as an implementation-defined SError, which can
      be uncontained.
      
      Until kernel-first support is implemented, no workaround is needed
      for this issue.
      Signed-off-by: James Morse <james.morse@arm.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Bin Yu <jkchen@linux.alibaba.com>
      Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
      Reviewed-by: zou cao <zoucao@linux.alibaba.com>
    • arm64: Handle erratum 1418040 as a superset of erratum 1188873 · ca26fded
      Marc Zyngier authored
      task #28924046
      
      [ Upstream commit a5325089bd05 ]
      
      We already mitigate erratum 1188873 affecting Cortex-A76 and
      Neoverse-N1 r0p0 to r2p0. It turns out that revisions r0p0 to
      r3p1 of the same cores are affected by erratum 1418040, which
      has the same workaround as 1188873.
      
      Let's expand the range of affected revisions to match 1418040,
      and repaint all occurrences of 1188873 to 1418040. Whilst we're
      there, do a bit of reformatting in silicon-errata.txt and drop
      a now-unnecessary dependency on ARM_ARCH_TIMER_OOL_WORKAROUND.
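
The widening of the affected revision range can be sketched as a check on a decoded MIDR_EL1 value. This is an illustrative model only, not the kernel's matching code: the helper names are hypothetical, and it assumes the usual MIDR_EL1 layout (implementer in bits 31:24, variant in 23:20, part number in 15:4, revision in 3:0), where rXpY maps to variant X and revision Y.

```python
# Hypothetical helpers modelling an erratum revision-range check.
IMPLEMENTER_ARM = 0x41
PART_CORTEX_A76 = 0xD0B
PART_NEOVERSE_N1 = 0xD0C
AFFECTED_PARTS = (PART_CORTEX_A76, PART_NEOVERSE_N1)


def decode_midr(midr: int):
    """Split a MIDR_EL1 value into (implementer, variant, part, revision)."""
    implementer = (midr >> 24) & 0xFF
    variant = (midr >> 20) & 0xF
    part = (midr >> 4) & 0xFFF
    revision = midr & 0xF
    return implementer, variant, part, revision


def in_revision_range(midr: int, rmin, rmax) -> bool:
    """True if the CPU is an affected part with rmin <= rXpY <= rmax."""
    implementer, variant, part, revision = decode_midr(midr)
    if implementer != IMPLEMENTER_ARM or part not in AFFECTED_PARTS:
        return False
    # Tuples compare lexicographically: variant first, then revision.
    return rmin <= (variant, revision) <= rmax


def affected_by_1188873(midr: int) -> bool:
    return in_revision_range(midr, (0, 0), (2, 0))  # r0p0 to r2p0


def affected_by_1418040(midr: int) -> bool:
    return in_revision_range(midr, (0, 0), (3, 1))  # r0p0 to r3p1
```

Because the second range contains the first, every CPU matched by the narrower check is also matched by the wider one, which is why a single workaround entry can cover both errata.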
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Bin Yu <jkchen@linux.alibaba.com>
      Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
      Reviewed-by: zou cao <zoucao@linux.alibaba.com>
    • arm64: Apply ARM64_ERRATUM_1188873 to Neoverse-N1 · b429dc25
      Marc Zyngier authored
      task #28924046
      
      [ Upstream commit 6989303a3b2d864fd8e17d3fa3365d3e9649a598 ]
      
      Neoverse-N1 is also affected by ARM64_ERRATUM_1188873, so let's
      add it to the list of affected CPUs.
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      [will: Update silicon-errata.txt]
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Bin Yu <jkchen@linux.alibaba.com>
      Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
      Reviewed-by: zou cao <zoucao@linux.alibaba.com>
    • alinux: introduce deferred_meminit boot parameter · 05f6ed40
      chenxiangzuo authored
      fix #27418285
      
      We introduce a boot parameter 'deferred_meminit' for the deferred
      page init feature. It is disabled by default; pass
      'deferred_meminit' to enable it.
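
One way to confirm from user space whether the flag was passed at boot is to look for it as a bare token on the kernel command line. A minimal sketch, assuming a Linux system that exposes /proc/cmdline; the helper names are hypothetical and this is not how the kernel itself parses the option.

```python
def boot_flag_enabled(cmdline: str, flag: str = "deferred_meminit") -> bool:
    """True if `flag` appears as a standalone token on the command line."""
    return flag in cmdline.split()


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the running kernel's command line (Linux only)."""
    with open(path) as f:
        return f.read()


# Example with literal command lines rather than the live system:
print(boot_flag_enabled("root=/dev/sda1 deferred_meminit quiet"))
print(boot_flag_enabled("root=/dev/sda1 quiet"))
```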
      Signed-off-by: chenxiangzuo <cxz18821786681@linux.alibaba.com>
      Reviewed-by: Xu Yu <xuyu@linux.alibaba.com>
      Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      Acked-by: Shile Zhang <shile.zhang@linux.alibaba.com>
  2. 23 Jun, 2020 1 commit
  3. 28 Apr, 2020 1 commit
    • Documentation: Rename and update intel_rdt_ui.txt to resctrl_ui.txt · 5f0beb81
      Babu Moger authored
      to #26613714
      
      commit a6f771c9bf4eea2da1516e70c283ede61a7d666f upstream.
      
      Rename intel_rdt_ui.txt to generic resctrl_ui.txt and update the
      documentation for AMD.
      Signed-off-by: Babu Moger <babu.moger@amd.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Brijesh Singh <brijesh.singh@amd.com>
      Cc: "Chang S. Bae" <chang.seok.bae@intel.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Dmitry Safonov <dima@arista.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jann Horn <jannh@google.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Kate Stewart <kstewart@linuxfoundation.org>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: <linux-doc@vger.kernel.org>
      Cc: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Philippe Ombredanne <pombredanne@nexb.com>
      Cc: Pu Wen <puwen@hygon.cn>
      Cc: <qianyue.zj@alibaba-inc.com>
      Cc: "Rafael J. Wysocki" <rafael@kernel.org>
      Cc: Reinette Chatre <reinette.chatre@intel.com>
      Cc: Rian Hunter <rian@alum.mit.edu>
      Cc: Sherry Hurwitz <sherry.hurwitz@amd.com>
      Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Lendacky <Thomas.Lendacky@amd.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: <xiaochen.shen@intel.com>
      Link: https://lkml.kernel.org/r/20181121202811.4492-13-babu.moger@amd.com
      Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>
      Tested-by: WANG Siyuan <Siyuan.Wang@amd.com>
      Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>
  4. 13 Apr, 2020 1 commit
  5. 18 Mar, 2020 12 commits
  6. 17 Jan, 2020 3 commits
  7. 15 Jan, 2020 15 commits