1. 15 Aug, 2020 2 commits
  2. 13 Aug, 2020 1 commit
  3. 08 Aug, 2020 2 commits
  4. 03 Aug, 2020 2 commits
  5. 08 Jul, 2020 1 commit
  6. 22 Jun, 2020 4 commits
  7. 10 Jun, 2020 3 commits
  8. 05 Jun, 2020 1 commit
  9. 04 Jun, 2020 6 commits
    • mm: memcontrol: delete unused lrucare handling · d9eb1ea2
      Johannes Weiner committed
      Swapin faults were the last event to charge pages after they had already
      been put on the LRU list.  Now that we charge directly on swapin, the
      lrucare portion of the charge code is unused.
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Alex Shi <alex.shi@linux.alibaba.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Roman Gushchin <guro@fb.com>
      Cc: Balbir Singh <bsingharora@gmail.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Link: http://lkml.kernel.org/r/20200508183105.225460-19-hannes@cmpxchg.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: memcontrol: convert anon and file-thp to new mem_cgroup_charge() API · 9d82c694
      Johannes Weiner committed
      With the page->mapping requirement gone from memcg, we can charge anon and
      file-thp pages in one single step, right after they're allocated.
      
      This removes two out of three API calls - especially the tricky commit
      step that needed to happen at just the right time between when the page is
      "set up" and when it's "published" - somewhat vague and fluid concepts
      that varied by page type.  All we need is a freshly allocated page and a
      memcg context to charge.
      
      v2: prevent double charges on pre-allocated hugepages in khugepaged
      
      [hannes@cmpxchg.org: Fix crash - *hpage could be ERR_PTR instead of NULL]
        Link: http://lkml.kernel.org/r/20200512215813.GA487759@cmpxchg.org
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Alex Shi <alex.shi@linux.alibaba.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Roman Gushchin <guro@fb.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Balbir Singh <bsingharora@gmail.com>
      Cc: Qian Cai <cai@lca.pw>
      Link: http://lkml.kernel.org/r/20200508183105.225460-13-hannes@cmpxchg.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: memcontrol: switch to native NR_FILE_PAGES and NR_SHMEM counters · 0d1c2072
      Johannes Weiner committed
      Memcg maintains private MEMCG_CACHE and NR_SHMEM counters.  This
      divergence from the generic VM accounting means unnecessary code overhead,
      and creates a dependency for memcg that page->mapping is set up at the
      time of charging, so that page types can be told apart.
      
      Convert the generic accounting sites to mod_lruvec_page_state and friends
      to maintain the per-cgroup vmstat counters of NR_FILE_PAGES and NR_SHMEM.
      The page is already locked in these places, so page->mem_cgroup is stable;
      we only need minimal tweaks of two mem_cgroup_migrate() calls to ensure
      it's set up in time.
      
      Then replace MEMCG_CACHE with NR_FILE_PAGES and delete the private
      NR_SHMEM accounting sites.
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Alex Shi <alex.shi@linux.alibaba.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Roman Gushchin <guro@fb.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Balbir Singh <bsingharora@gmail.com>
      Link: http://lkml.kernel.org/r/20200508183105.225460-10-hannes@cmpxchg.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: memcontrol: convert page cache to a new mem_cgroup_charge() API · 3fea5a49
      Johannes Weiner committed
      The try/commit/cancel protocol that memcg uses dates back to when pages
      used to be uncharged upon removal from the page cache, and thus couldn't
      be committed before the insertion had succeeded.  Nowadays, pages are
      uncharged when they are physically freed; it doesn't matter whether the
      insertion was successful or not.  For the page cache, the transaction
      dance has become unnecessary.
      
      Introduce a mem_cgroup_charge() function that simply charges a newly
      allocated page to a cgroup and sets up page->mem_cgroup in one single
      step.  If the insertion fails, the caller doesn't have to do anything but
      free/put the page.
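
The single-step idea can be pictured with a toy userspace model. This is only a sketch: `toy_cgroup`, `toy_page`, `toy_charge()`, `toy_free_page()` and `toy_add_to_cache()` are hypothetical names invented for illustration and are not kernel APIs. It assumes that uncharging happens when the page is freed, as the message above describes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy userspace model; none of these names are kernel APIs. */
struct toy_cgroup {
    long usage;   /* pages currently charged */
    long limit;   /* maximum pages allowed */
};

struct toy_page {
    struct toy_cgroup *memcg;  /* owner, NULL while uncharged */
    long nr_pages;             /* 1 for a base page, more for a huge page */
};

/* Single-step charge: account the page and record its owner at once. */
static int toy_charge(struct toy_page *page, struct toy_cgroup *cg)
{
    assert(page->memcg == NULL);  /* only freshly allocated pages */
    if (cg->usage + page->nr_pages > cg->limit)
        return -ENOMEM;
    cg->usage += page->nr_pages;
    page->memcg = cg;
    return 0;
}

/* Uncharging happens when the page is freed, wherever that occurs. */
static void toy_free_page(struct toy_page *page)
{
    if (page->memcg) {
        page->memcg->usage -= page->nr_pages;
        page->memcg = NULL;
    }
}

/* Caller pattern: charge, try to insert, and on failure just free. */
static int toy_add_to_cache(struct toy_page *page, struct toy_cgroup *cg,
                            int insertion_fails)
{
    int err = toy_charge(page, cg);

    if (err)
        return err;
    if (insertion_fails) {
        toy_free_page(page);  /* no separate cancel step is needed */
        return -EEXIST;
    }
    return 0;
}
```

In this model a failed insertion leaves the cgroup usage unchanged, because freeing the page is the only cleanup the caller performs.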
      
      Then switch the page cache over to this new API.
      
      Subsequent patches will also convert anon pages, but that needs a bit more
      prep work.  Right now, memcg depends on page->mapping being already set up
      at the time of charging, so that it can maintain its own MEMCG_CACHE and
      MEMCG_RSS counters.  For anon, page->mapping is set under the same pte
      lock under which the page is published, so a single charge point that can
      block doesn't work there just yet.
      
      The following prep patches will replace the private memcg counters with
      the generic vmstat counters, thus removing the page->mapping dependency,
      then complete the transition to the new single-point charge API and delete
      the old transactional scheme.
      
      v2: leave shmem swapcache when charging fails to avoid double IO (Joonsoo)
      v3: rebase on preceding shmem simplification patch
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Alex Shi <alex.shi@linux.alibaba.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Roman Gushchin <guro@fb.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Balbir Singh <bsingharora@gmail.com>
      Link: http://lkml.kernel.org/r/20200508183105.225460-6-hannes@cmpxchg.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: memcontrol: drop @compound parameter from memcg charging API · 3fba69a5
      Johannes Weiner committed
      The memcg charging API carries a boolean @compound parameter that tells
      whether the page we're dealing with is a hugepage.
      mem_cgroup_commit_charge() has another boolean @lrucare that indicates
      whether the page needs LRU locking or not while charging.  The majority of
      callsites know those parameters at compile time, which results in a lot of
      naked "false, false" argument lists.  This makes for cryptic code and is a
      breeding ground for subtle mistakes.
      
      Thankfully, the huge page state can be inferred from the page itself and
      doesn't need to be passed along.  This is safe because charging completes
      before the page is published and somebody may split it.
      
      Simplify the callsites by removing @compound, and let memcg infer the
      state by using hpage_nr_pages() unconditionally.  That function does
      PageTransHuge() to identify huge pages, which also helpfully asserts that
      nobody passes in tail pages by accident.
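
A toy userspace sketch of inferring the size from the page rather than from a caller-supplied flag follows. All names here (`toy_hpage`, `toy_nr_pages()`, `toy_charge_inferred()`) are hypothetical stand-ins, and the value 512 assumes 2 MiB huge pages built from 4 KiB base pages; the real hpage_nr_pages() is not reproduced.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model; toy_hpage and its helpers are hypothetical, not kernel types. */
#define TOY_HPAGE_NR 512  /* assumed: 2 MiB huge page of 4 KiB base pages */

struct toy_hpage {
    bool head_of_huge;  /* stands in for the PageTransHuge() check */
    bool tail;          /* tail pages must never be charged directly */
};

/* Analogue of hpage_nr_pages(): derive the size from the page itself. */
static long toy_nr_pages(const struct toy_hpage *page)
{
    assert(!page->tail);  /* catches callers passing a tail page */
    return page->head_of_huge ? TOY_HPAGE_NR : 1;
}

/* Charging takes no "compound" flag; the page carries that information. */
static long toy_charge_inferred(long *usage, const struct toy_hpage *page)
{
    *usage += toy_nr_pages(page);
    return *usage;
}
```

Call sites shrink to passing the page alone, which removes the bare "false, false" argument lists the message complains about.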
      
      The following patches will introduce a new charging API; best not to carry
      over unnecessary weight.
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Alex Shi <alex.shi@linux.alibaba.com>
      Reviewed-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Reviewed-by: Shakeel Butt <shakeelb@google.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Roman Gushchin <guro@fb.com>
      Cc: Balbir Singh <bsingharora@gmail.com>
      Link: http://lkml.kernel.org/r/20200508183105.225460-4-hannes@cmpxchg.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: fix NUMA node file count error in replace_page_cache() · f4129ea3
      Johannes Weiner committed
      Patch series "mm: memcontrol: charge swapin pages on instantiation", v2.
      
      This patch series reworks memcg to charge swapin pages directly at
      swapin time, rather than at fault time, which may be much later, or
      not happen at all.
      
      Changes in version 2:
      - prevent double charges on pre-allocated hugepages in khugepaged
      - leave shmem swapcache when charging fails to avoid double IO (Joonsoo)
      - fix temporary accounting bug by switching rmap<->commit (Joonsoo)
      - fix double swap charge bug in cgroup1/cgroup2 code gating
      - simplify swapin error checking (Joonsoo)
      - mm: memcontrol: document the new swap control behavior (Alex)
      - review tags
      
      The delayed swapin charging scheme we have right now causes problems:
      
      - Alex's per-cgroup lru_lock patches rely on pages that have been
        isolated from the LRU to have a stable page->mem_cgroup; otherwise
        the lock may change underneath him. Swapcache pages are charged only
        after they are added to the LRU, and charging doesn't follow the LRU
        isolation protocol.
      
      - Joonsoo's anon workingset patches need a suitable LRU at the time
        the page enters the swap cache and displaces the non-resident
        info. But the correct LRU is only available after charging.
      
      - It's a containment hole / DoS vector. Users can trigger arbitrarily
        large swap readahead using MADV_WILLNEED. The memory is never
        charged unless somebody actually touches it.
      
      - It complicates the page->mem_cgroup stabilization rules
      
      In order to charge pages directly at swapin time, the memcg code base
      needs to be prepared, and several overdue cleanups become a necessity:
      
      To charge pages at swapin time, we need to always have cgroup
      ownership tracking of swap records. We also cannot rely on
      page->mapping to tell apart page types at charge time, because that's
      only set up during a page fault.
      
      To eliminate the page->mapping dependency, memcg needs to ditch its
      private page type counters (MEMCG_CACHE, MEMCG_RSS, NR_SHMEM) in favor
      of the generic vmstat counters and accounting sites, such as
      NR_FILE_PAGES, NR_ANON_MAPPED etc.
      
      To switch to generic vmstat counters, the charge sequence must be
      adjusted such that page->mem_cgroup is set up by the time these
      counters are modified.
      
      The series is structured as follows:
      
      1. Bug fixes
      2. Decoupling charging from rmap
      3. Swap controller integration into memcg
      4. Direct swapin charging
      
      This patch (of 19):
      
      When replacing one page with another one in the cache, we have to decrease
      the file count of the old page's NUMA node and increase the one of the new
      NUMA node, otherwise the old node leaks the count and the new node
      eventually underflows its counter.
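
The accounting rule can be illustrated with a toy per-node counter. This is a sketch only: `toy_node_stats` and the `toy_cache_*()` helpers are hypothetical names, and the array merely stands in for a per-node file page count.

```c
#include <assert.h>

#define TOY_NR_NODES 2

/* Toy per-node file page counters; a stand-in for the per-node file count. */
struct toy_node_stats {
    long nr_file_pages[TOY_NR_NODES];
};

/* A page on node nid enters the cache. */
static void toy_cache_add(struct toy_node_stats *s, int nid)
{
    s->nr_file_pages[nid]++;
}

/* A page on node nid leaves the cache. */
static void toy_cache_remove(struct toy_node_stats *s, int nid)
{
    s->nr_file_pages[nid]--;
}

/* Replacement moves the count from the old page's node to the new one. */
static void toy_cache_replace(struct toy_node_stats *s, int old_nid,
                              int new_nid)
{
    assert(s->nr_file_pages[old_nid] > 0);  /* old page must be accounted */
    s->nr_file_pages[old_nid]--;
    s->nr_file_pages[new_nid]++;
}
```

If the replace step skipped both adjustments, a later removal of the new page would drive its node's counter to -1 while the old node stayed at 1, which is the leak and underflow described above.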
      
      Fixes: 74d60958 ("page cache: Add and replace pages using the XArray")
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Alex Shi <alex.shi@linux.alibaba.com>
      Reviewed-by: Shakeel Butt <shakeelb@google.com>
      Reviewed-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Reviewed-by: Balbir Singh <bsingharora@gmail.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Roman Gushchin <guro@fb.com>
      Link: http://lkml.kernel.org/r/20200508183105.225460-1-hannes@cmpxchg.org
      Link: http://lkml.kernel.org/r/20200508183105.225460-2-hannes@cmpxchg.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  10. 03 Jun, 2020 1 commit
  11. 25 May, 2020 1 commit
  12. 08 Apr, 2020 1 commit
    • mm: huge tmpfs: try to split_huge_page() when punching hole · 71725ed1
      Hugh Dickins committed
      Yang Shi writes:
      
      Currently, when truncating a shmem file, if the range is partly in a THP
      (start or end is in the middle of THP), the pages actually will just get
      cleared rather than being freed, unless the range covers the whole THP.
      Even though all the subpages are truncated (randomly or sequentially), the
      THP may still be kept in page cache.
      
      This might be fine for some use cases which prefer preserving THP, but
      balloon inflation is handled in base page size.  So when using shmem THP
      as memory backend, QEMU inflation actually doesn't work as expected since
      it doesn't free memory.  But the inflation use case really needs to get the
      memory freed.  (Anonymous THP will also not get freed right away, but will
      be freed eventually when all subpages are unmapped: whereas shmem THP
      still stays in page cache.)
      
      Split THP right away when doing partial hole punch, and if split fails
      just clear the page so that read of the punched area will return zeroes.
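
The decision described here can be sketched as a small pure function. It is a toy model, not the shmem code: `toy_punch_action()` and the `toy_action` enum are hypothetical, ranges are half-open page indices, and whether the split succeeds is simply passed in as a flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Possible outcomes for one huge page touched by a hole punch (toy model). */
enum toy_action {
    TOY_FREE_WHOLE,  /* range covers the whole huge page: drop it */
    TOY_SPLIT,       /* partial range, split succeeded: free the subpages */
    TOY_CLEAR_ONLY   /* partial range, split failed: zero the punched part */
};

/*
 * page_start/nr describe the huge page in page indices; [start, end) is the
 * punched range and is assumed to overlap the page. split_ok models whether
 * splitting would succeed (for example, no extra references are held).
 */
static enum toy_action toy_punch_action(unsigned long page_start,
                                        unsigned long nr,
                                        unsigned long start,
                                        unsigned long end,
                                        bool split_ok)
{
    unsigned long page_end = page_start + nr;

    assert(start < end);
    assert(start < page_end && end > page_start);  /* ranges overlap */

    if (start <= page_start && end >= page_end)
        return TOY_FREE_WHOLE;
    return split_ok ? TOY_SPLIT : TOY_CLEAR_ONLY;
}
```

Only the `TOY_CLEAR_ONLY` outcome keeps the memory allocated; reads of the punched area still return zeroes in every case.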
      
      Hugh Dickins adds:
      
      Our earlier "team of pages" huge tmpfs implementation worked in the way
      that Yang Shi proposes; and we have been using this patch to continue to
      split the huge page when hole-punched or truncated, since converting over
      to the compound page implementation.  Although huge tmpfs gives out huge
      pages when available, if the user specifically asks to truncate or punch a
      hole (perhaps to free memory, perhaps to reduce the memcg charge), then
      the filesystem should do so as best it can, splitting the huge page.
      
      That is not always possible: any additional reference to the huge page
      prevents split_huge_page() from succeeding, so the result can be flaky.
      But in practice it works successfully enough that we've not seen any
      problem from that.
      
      Add shmem_punch_compound() to encapsulate the decision of when a split is
      needed, and doing the split if so.  Using this simplifies the flow in
      shmem_undo_range(); and the first (trylock) pass does not need to do any
      page clearing on failure, because the second pass will either succeed or
      do that clearing.  Following the example of zero_user_segment() when
      clearing a partial page, add flush_dcache_page() and set_page_dirty() when
      clearing a hole - though I'm not certain that either is needed.
      
      But: split_huge_page() would be sure to fail if shmem_undo_range()'s
      pagevec holds further references to the huge page.  The easiest way to fix
      that is for find_get_entries() to return early, as soon as it has put one
      compound head or tail into the pagevec.  At first this felt like a hack;
      but on examination, this convention better suits all its callers - or will
      do, if the slight one-page-per-pagevec slowdown in shmem_unlock_mapping()
      and shmem_seek_hole_data() is transformed into a 512-page-per-pagevec
      speedup by checking for compound pages there.
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Yang Shi <yang.shi@linux.alibaba.com>
      Cc: Alexander Duyck <alexander.duyck@gmail.com>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2002261959020.10801@eggly.anvils
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  13. 03 Apr, 2020 7 commits
  14. 01 Feb, 2020 1 commit
  15. 01 Dec, 2019 4 commits
  16. 19 Oct, 2019 1 commit
  17. 25 Sep, 2019 2 commits