1. 09 Jun 2020 — 10 commits
  2. 06 Jun 2020 — 3 commits
  3. 05 Jun 2020 — 2 commits
    • mm/debug: add tests validating architecture page table helpers · 399145f9
      Anshuman Khandual authored
      This adds tests which will validate architecture page table helpers and
      other accessors in their compliance with expected generic MM semantics.
      This will help various architectures in validating changes to existing
      page table helpers or addition of new ones.
      
      This test covers basic page table entry transformations, including but
      not limited to old, young, dirty, clean, write and write-protect, at
      various levels, along with populating intermediate entries with the next
      page table page and validating them.
      
      Test page table pages are allocated from system memory with required size
      and alignments.  The mapped pfns at page table levels are derived from a
      real pfn representing a valid kernel text symbol.  This test gets called
      via late_initcall().
      
      This test gets built and run when CONFIG_DEBUG_VM_PGTABLE is selected.
      Any architecture willing to subscribe to this test will need to select
      ARCH_HAS_DEBUG_VM_PGTABLE.  For now this is limited to the arc, arm64,
      x86, s390 and powerpc platforms, where the test is known to build and
      run successfully.  Going forward, other architectures can subscribe to
      the test after fixing any build or runtime problems with their page
      table helpers.
      
      Folks interested in making sure that a given platform's page table
      helpers conform to expected generic MM semantics should enable the above
      config, which will trigger this test during boot.  Any non-conformity
      will be reported as a warning, which will need to be fixed.  This test
      will help catch any changes to the agreed-upon semantics expected from
      generic MM and enable platforms to accommodate them thereafter.
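      
      As a flavor of what gets validated, here is a minimal sketch in the
      spirit of the pte-level checks (these are the standard generic MM
      helpers; the exact checks in mm/debug_vm_pgtable.c may differ):
      
	#include <linux/mm.h>
	
	/* Sketch: each helper pair must round-trip set/clear/test. */
	static void __init pte_basic_tests(unsigned long pfn, pgprot_t prot)
	{
		pte_t pte = pfn_pte(pfn, prot);
	
		WARN_ON(!pte_same(pte, pte));
		WARN_ON(!pte_young(pte_mkyoung(pte_mkold(pte))));
		WARN_ON(!pte_dirty(pte_mkdirty(pte_mkclean(pte))));
		WARN_ON(!pte_write(pte_mkwrite(pte_wrprotect(pte))));
		WARN_ON(pte_young(pte_mkold(pte_mkyoung(pte))));
		WARN_ON(pte_dirty(pte_mkclean(pte_mkdirty(pte))));
		WARN_ON(pte_write(pte_wrprotect(pte_mkwrite(pte))));
	}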
      
      [anshuman.khandual@arm.com: v17]
        Link: http://lkml.kernel.org/r/1587436495-22033-3-git-send-email-anshuman.khandual@arm.com
      [anshuman.khandual@arm.com: v18]
        Link: http://lkml.kernel.org/r/1588564865-31160-3-git-send-email-anshuman.khandual@arm.com
      Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Qian Cai <cai@lca.pw>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Tested-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>	[s390]
      Tested-by: Christophe Leroy <christophe.leroy@c-s.fr>	[ppc32]
      Reviewed-by: Ingo Molnar <mingo@kernel.org>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Link: http://lkml.kernel.org/r/1583919272-24178-1-git-send-email-anshuman.khandual@arm.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kcov: collect coverage from interrupts · 5ff3b30a
      Andrey Konovalov authored
      This change extends kcov remote coverage support to allow collecting
      coverage from soft interrupts in addition to kernel background threads.
      
      To collect coverage from code that is executed in softirq context, a
      part of that code has to be annotated with kcov_remote_start/stop(),
      similar to how it is done for global kernel background threads.  Then
      the handle used for the annotations has to be passed to the
      KCOV_REMOTE_ENABLE ioctl.
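      
      For illustration, a minimal sketch of the annotation side, assuming a
      hypothetical tasklet handler and a made-up handle value (0x42); the
      kcov_remote_start()/kcov_remote_stop() calls are the real API:
      
	#include <linux/kcov.h>
	
	/* Hypothetical softirq-context handler; 0x42 stands in for a real,
	 * pre-agreed remote handle passed to KCOV_REMOTE_ENABLE.
	 */
	static void example_tasklet_fn(unsigned long data)
	{
		kcov_remote_start(0x42);
		/* ... code whose softirq coverage is collected remotely ... */
		kcov_remote_stop();
	}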
      
      Internally this patch adjusts the __sanitizer_cov_trace_pc() compiler
      inserted callback to not bail out when called from softirq context.
      kcov_remote_start/stop() are updated to save/restore the current per task
      kcov state in a per-cpu area (in case the softirq came when the kernel was
      already collecting coverage in task context).  Coverage from softirqs is
      collected into pre-allocated per-cpu areas, whose size is controlled by
      the new CONFIG_KCOV_IRQ_AREA_SIZE.
      
      [andreyknvl@google.com: turn current->kcov_softirq into unsigned int to fix objtool warning]
        Link: http://lkml.kernel.org/r/841c778aa3849c5cb8c3761f56b87ce653a88671.1585233617.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Konovalov <andreyknvl@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Marco Elver <elver@google.com>
      Link: http://lkml.kernel.org/r/469bd385c431d050bc38a593296eff4baae50666.1584655448.git.andreyknvl@google.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. 04 Jun 2020 — 10 commits
  5. 03 Jun 2020 — 9 commits
    • RDMA/core: Remove FMR device ops · 3a578152
      Max Gurtovoy authored
      After removing FMR support from all the RDMA ULPs and providers, there
      is no need to keep the FMR operations for IB devices.
      
      Link: https://lore.kernel.org/r/11-v3-f58e6669d5d3+2cf-fmr_removal_jgg@mellanox.com
      Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • RDMA/core: Remove FMR pool API · 4e373d54
      Max Gurtovoy authored
      This ancient and unsafe method for memory registration is no longer used
      by any RDMA-based ULP. Remove the FMR pool API from the core driver.
      
      Link: https://lore.kernel.org/r/4-v3-f58e6669d5d3+2cf-fmr_removal_jgg@mellanox.com
      Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • mm: remove map_vm_range · ed1f324c
      Christoph Hellwig authored
      Switch all callers to map_kernel_range, which is symmetric to the unmap
      side (as well as to the _noflush versions).
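      
      A minimal sketch of the now-symmetric pairing (the helper and its
      arguments are hypothetical; map_kernel_range()/unmap_kernel_range()
      are the interfaces this series converges on):
      
	#include <linux/mm.h>
	#include <linux/vmalloc.h>
	
	/* Hypothetical: back an existing kernel VA range with 'count' pages. */
	static int example_map_pages(unsigned long addr, struct page **pages,
				     unsigned int count)
	{
		int ret = map_kernel_range(addr, count * PAGE_SIZE,
					   PAGE_KERNEL, pages);
	
		if (ret < 0)
			return ret;
		/* ... use the mapping, then tear it down symmetrically ... */
		unmap_kernel_range(addr, count * PAGE_SIZE);
		return 0;
	}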
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: David Airlie <airlied@linux.ie>
      Cc: Gao Xiang <xiang@kernel.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Laura Abbott <labbott@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Michael Kelley <mikelley@microsoft.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Sakari Ailus <sakari.ailus@linux.intel.com>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Cc: Sumit Semwal <sumit.semwal@linaro.org>
      Cc: Wei Liu <wei.liu@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Paul Mackerras <paulus@ozlabs.org>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: http://lkml.kernel.org/r/20200414131348.444715-17-hch@lst.de
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/memcg: automatically penalize tasks with high swap use · 4b82ab4f
      Jakub Kicinski authored
      Add a memory.swap.high knob, which can be used to protect the system
      from swap exhaustion.  The mechanism used for penalizing is similar to
      the memory.high penalty (sleep on return to user space).
      
      That is not to say that the knob itself is equivalent to memory.high.
      The objective is more to protect the system from potentially buggy tasks
      consuming a lot of swap and impacting other tasks, or even bringing the
      whole system to a standstill with complete swap exhaustion, ideally
      without the need to find per-task hard limits.
      
      Slowing misbehaving tasks down gradually allows user space oom killers
      or other protection mechanisms to react.  oomd and earlyoom already do
      killing based on swap exhaustion, and memory.swap.high protection will
      help implement such userspace oom policies more reliably.
      
      We can use one counter for the number of pages allocated under pressure
      to save struct task space and avoid two separate hierarchy walks on the
      hot path.  The exact overage is calculated on return to user space,
      anyway.
      
      Take the new high limit into account when determining if swap is "full".
      Borrowing the explanation from Johannes:
      
        The idea behind "swap full" is that as long as the workload has plenty
        of swap space available and it's not changing its memory contents, it
        makes sense to generously hold on to copies of data in the swap device,
        even after the swapin.  A later reclaim cycle can drop the page without
        any IO.  Trading disk space for IO.
      
        But the only two ways to reclaim a swap slot are when the pages are
        faulted in and the references go away, or by scanning the virtual
        address space like swapoff does - which is very expensive (one could
        argue it's too expensive even for swapoff; it's often more practical
        to just reboot).
      
        So at some point in the fill level, we have to start freeing up swap
        slots on fault/swapin.  Otherwise we could eventually run out of swap
        slots while they're filled with copies of data that is also in RAM.
      
        We don't want to OOM a workload because its available swap space is
        filled with redundant cache.
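      
      As a usage illustration, a small user-space sketch of setting the knob
      (the cgroup path and the "500M" value are examples, not defaults):
      
	#include <fcntl.h>
	#include <unistd.h>
	
	int main(void)
	{
		/* Example cgroup path; adjust to the target cgroup. */
		int fd = open("/sys/fs/cgroup/workload/memory.swap.high",
			      O_WRONLY);
	
		if (fd < 0)
			return 1;
		/* Above this much swap, the cgroup's tasks get throttled. */
		write(fd, "500M", 4);
		close(fd);
		return 0;
	}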
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Chris Down <chris@chrisdown.name>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Hugh Dickins <hughd@google.com>
      Link: http://lkml.kernel.org/r/20200527195846.102707-5-kuba@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm, memcg: add workingset_restore in memory.stat · a6f5576b
      Yafang Shao authored
      There's a new workingset counter introduced in commit 1899ad18 ("mm:
      workingset: tell cache transitions from workingset thrashing").  With
      the help of this counter we can tell whether the workingset is
      transitioning or thrashing.  To leverage the benefit of this counter for
      memcg, we should introduce it into memory.stat.  Then we can understand
      the workingset of the workload inside a memcg better.
      
      Below is the verification of this new counter in memory.stat.  Read a
      file into memory and then read it again to make these pages active.
      The size of this file is 1G (memory.max is greater than the file size).
      The counters in memory.stat will be:
      
      	inactive_file 0
      	active_file 1073639424
      
      	workingset_refault 0
      	workingset_activate 0
      	workingset_restore 0
      	workingset_nodereclaim 0
      
      Trigger memcg reclaim by setting memory.high to a lower value; some
      pages will then be demoted to the inactive list, and some pages on the
      inactive list will be evicted to storage.
      
      	inactive_file 498094080
      	active_file 310063104
      
      	workingset_refault 0
      	workingset_activate 0
      	workingset_restore 0
      	workingset_nodereclaim 0
      
      Then restore memory.high and read the file into memory again.  As a
      result, the transition occurs.  Below is the result of this transition:
      
      	inactive_file 498094080
      	active_file 575397888
      
      	workingset_refault 64746
      	workingset_activate 64746
      	workingset_restore 64746
      	workingset_nodereclaim 0
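      
      To watch the counter from user space, a small sketch along these lines
      would do (the cgroup path is an example):
      
	#include <stdio.h>
	#include <string.h>
	
	int main(void)
	{
		/* Example cgroup path; adjust to the cgroup under test. */
		FILE *f = fopen("/sys/fs/cgroup/workload/memory.stat", "r");
		char key[64];
		unsigned long long val;
	
		if (!f)
			return 1;
		/* memory.stat is a flat list of "name value" pairs. */
		while (fscanf(f, "%63s %llu", key, &val) == 2)
			if (!strcmp(key, "workingset_restore"))
				printf("workingset_restore %llu\n", val);
		fclose(f);
		return 0;
	}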
      Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Chris Down <chris@chrisdown.name>
      Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Link: http://lkml.kernel.org/r/20200504153522.11553-1-laoar.shao@gmail.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/writeback: discard NR_UNSTABLE_NFS, use NR_WRITEBACK instead · 8d92890b
      NeilBrown authored
      After an NFS page has been written it is considered "unstable" until a
      COMMIT request succeeds.  If the COMMIT fails, the page will be
      re-written.
      
      These "unstable" pages are currently accounted as "reclaimable", either
      in WB_RECLAIMABLE, or in NR_UNSTABLE_NFS which is included in a
      'reclaimable' count.  This might have made sense when sending the COMMIT
      required a separate action by the VFS/MM (e.g.  releasepage() used to
      send a COMMIT).  However now that all writes generated by ->writepages()
      will automatically be followed by a COMMIT (since commit 919e3bd9
      ("NFS: Ensure we commit after writeback is complete")) it makes more
      sense to treat them as writeback pages.
      
      So this patch removes NR_UNSTABLE_NFS and accounts unstable pages in
      NR_WRITEBACK and WB_WRITEBACK.
      
      A particular effect of this change is that when
      wb_check_background_flush() calls wb_over_bg_threshold(), the latter
      will report 'true' a lot less often as the 'unstable' pages are no
      longer considered 'dirty' (as there is nothing that writeback can do
      about them anyway).
      
      Currently wb_check_background_flush() will trigger writeback to NFS even
      when there are relatively few dirty pages (if there are lots of unstable
      pages); this can result in small writes going to the server (tens of
      kilobytes rather than a megabyte), which hurts throughput.  With this
      patch, there are fewer writes which are each larger on average.
      
      Where the NR_UNSTABLE_NFS count was included in statistics virtual
      files, the entry is retained, but the value is hard-coded as zero.
      Static trace points and warning printks which mentioned this counter no
      longer report it.
      
      [akpm@linux-foundation.org: re-layout comment]
      [akpm@linux-foundation.org: fix printk warning]
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Acked-by: Trond Myklebust <trond.myklebust@hammerspace.com>
      Acked-by: Michal Hocko <mhocko@suse.com>	[mm]
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Chuck Lever <chuck.lever@oracle.com>
      Link: http://lkml.kernel.org/r/87d06j7gqa.fsf@notabene.neil.brown.name
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: add readahead address space operation · 8151b4c8
      Matthew Wilcox (Oracle) authored
      This replaces ->readpages with a saner interface:
       - Return void instead of an ignored error code.
       - Page cache is already populated with locked pages when ->readahead
         is called.
       - New arguments can be passed to the implementation without changing
         all the filesystems that use a common helper function like
         mpage_readahead().
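      
      As an illustration, a sketch of how a filesystem built on the common
      mpage helper might wire up the new op (the example_* names are
      hypothetical; mpage_readahead() and struct readahead_control come from
      this series):
      
	#include <linux/fs.h>
	#include <linux/mpage.h>
	
	/* Hypothetical filesystem with an existing get_block_t callback. */
	static int example_get_block(struct inode *inode, sector_t iblock,
				     struct buffer_head *bh_result, int create);
	
	static void example_readahead(struct readahead_control *rac)
	{
		mpage_readahead(rac, example_get_block);
	}
	
	static const struct address_space_operations example_aops = {
		.readahead	= example_readahead,
		/* .readpage, .writepages, etc. unchanged */
	};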
      Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: John Hubbard <jhubbard@nvidia.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: William Kucharski <william.kucharski@oracle.com>
      Cc: Chao Yu <yuchao0@huawei.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Darrick J. Wong <darrick.wong@oracle.com>
      Cc: Dave Chinner <dchinner@redhat.com>
      Cc: Eric Biggers <ebiggers@google.com>
      Cc: Gao Xiang <gaoxiang25@huawei.com>
      Cc: Jaegeuk Kim <jaegeuk@kernel.org>
      Cc: Joseph Qi <joseph.qi@linux.alibaba.com>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Zi Yan <ziy@nvidia.com>
      Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      Cc: Miklos Szeredi <mszeredi@redhat.com>
      Link: http://lkml.kernel.org/r/20200414150233.24495-12-willy@infradead.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Documentation/vm/slub.rst: s/Toggle/Enable/ · a3df6927
      Andrew Morton authored
      "toggle" means to change a boolean thing's state.  This operation
      doesn't do that - it sets it to "true".
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Rafael Aquini <aquini@redhat.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Documentation: security: core.rst: add missing argument · 352780b6
      Ben Boeckel authored
      This argument was just never documented in the first place.
      Signed-off-by: Ben Boeckel <mathstuf@gmail.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
  6. 02 Jun 2020 — 6 commits