1. 25 2月, 2021 1 次提交
    • M
      mm: memcontrol: convert NR_SHMEM_THPS account to pages · 57b2847d
      Muchun Song 提交于
      Currently we use struct per_cpu_nodestat to cache the vmstat counters,
      which leads to inaccurate statistics especially THP vmstat counters.  In
      the systems with hundreds of processors it can be GBs of memory.  For
      example, for a 96 CPUs system, the threshold is the maximum number of 125.
      And the per cpu counters can cache 23.4375 GB in total.
      
      The THP page is already a form of batched addition (it will add 512 worth
      of memory in one go) so skipping the batching seems like sensible.
      Although every THP stats update overflows the per-cpu counter, resorting
      to atomic global updates.  But it can make the statistics more accuracy
      for the THP vmstat counters.
      
      So we convert the NR_SHMEM_THPS account to pages.  This patch is
      consistent with 8f182270 ("mm/swap.c: flush lru pvecs on compound page
      arrival").  Doing this also can make the unit of vmstat counters more
      unified.  Finally, the unit of the vmstat counters are pages, kB and
      bytes.  The B/KB suffix can tell us that the unit is bytes or kB.  The
      rest which is without suffix are pages.
      
      Link: https://lkml.kernel.org/r/20201228164110.2838-5-songmuchun@bytedance.comSigned-off-by: NMuchun Song <songmuchun@bytedance.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Feng Tang <feng.tang@intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: NeilBrown <neilb@suse.de>
      Cc: Pankaj Gupta <pankaj.gupta@cloud.ionos.com>
      Cc: Rafael. J. Wysocki <rafael@kernel.org>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Roman Gushchin <guro@fb.com>
      Cc: Sami Tolvanen <samitolvanen@google.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      57b2847d
  2. 24 1月, 2021 5 次提交
    • C
      fs: make helpers idmap mount aware · 549c7297
      Christian Brauner 提交于
      Extend some inode methods with an additional user namespace argument. A
      filesystem that is aware of idmapped mounts will receive the user
      namespace the mount has been marked with. This can be used for
      additional permission checking and also to enable filesystems to
      translate between uids and gids if they need to. We have implemented all
      relevant helpers in earlier patches.
      
      As requested we simply extend the exisiting inode method instead of
      introducing new ones. This is a little more code churn but it's mostly
      mechanical and doesnt't leave us with additional inode methods.
      
      Link: https://lore.kernel.org/r/20210121131959.646623-25-christian.brauner@ubuntu.com
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: linux-fsdevel@vger.kernel.org
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NChristian Brauner <christian.brauner@ubuntu.com>
      549c7297
    • C
      stat: handle idmapped mounts · 0d56a451
      Christian Brauner 提交于
      The generic_fillattr() helper fills in the basic attributes associated
      with an inode. Enable it to handle idmapped mounts. If the inode is
      accessed through an idmapped mount map it into the mount's user
      namespace before we store the uid and gid. If the initial user namespace
      is passed nothing changes so non-idmapped mounts will see identical
      behavior as before.
      
      Link: https://lore.kernel.org/r/20210121131959.646623-12-christian.brauner@ubuntu.com
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: linux-fsdevel@vger.kernel.org
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NJames Morris <jamorris@linux.microsoft.com>
      Signed-off-by: NChristian Brauner <christian.brauner@ubuntu.com>
      0d56a451
    • C
      acl: handle idmapped mounts · e65ce2a5
      Christian Brauner 提交于
      The posix acl permission checking helpers determine whether a caller is
      privileged over an inode according to the acls associated with the
      inode. Add helpers that make it possible to handle acls on idmapped
      mounts.
      
      The vfs and the filesystems targeted by this first iteration make use of
      posix_acl_fix_xattr_from_user() and posix_acl_fix_xattr_to_user() to
      translate basic posix access and default permissions such as the
      ACL_USER and ACL_GROUP type according to the initial user namespace (or
      the superblock's user namespace) to and from the caller's current user
      namespace. Adapt these two helpers to handle idmapped mounts whereby we
      either map from or into the mount's user namespace depending on in which
      direction we're translating.
      Similarly, cap_convert_nscap() is used by the vfs to translate user
      namespace and non-user namespace aware filesystem capabilities from the
      superblock's user namespace to the caller's user namespace. Enable it to
      handle idmapped mounts by accounting for the mount's user namespace.
      
      In addition the fileystems targeted in the first iteration of this patch
      series make use of the posix_acl_chmod() and, posix_acl_update_mode()
      helpers. Both helpers perform permission checks on the target inode. Let
      them handle idmapped mounts. These two helpers are called when posix
      acls are set by the respective filesystems to handle this case we extend
      the ->set() method to take an additional user namespace argument to pass
      the mount's user namespace down.
      
      Link: https://lore.kernel.org/r/20210121131959.646623-9-christian.brauner@ubuntu.com
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: linux-fsdevel@vger.kernel.org
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NChristian Brauner <christian.brauner@ubuntu.com>
      e65ce2a5
    • C
      attr: handle idmapped mounts · 2f221d6f
      Christian Brauner 提交于
      When file attributes are changed most filesystems rely on the
      setattr_prepare(), setattr_copy(), and notify_change() helpers for
      initialization and permission checking. Let them handle idmapped mounts.
      If the inode is accessed through an idmapped mount map it into the
      mount's user namespace. Afterwards the checks are identical to
      non-idmapped mounts. If the initial user namespace is passed nothing
      changes so non-idmapped mounts will see identical behavior as before.
      
      Helpers that perform checks on the ia_uid and ia_gid fields in struct
      iattr assume that ia_uid and ia_gid are intended values and have already
      been mapped correctly at the userspace-kernelspace boundary as we
      already do today. If the initial user namespace is passed nothing
      changes so non-idmapped mounts will see identical behavior as before.
      
      Link: https://lore.kernel.org/r/20210121131959.646623-8-christian.brauner@ubuntu.com
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: linux-fsdevel@vger.kernel.org
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NChristian Brauner <christian.brauner@ubuntu.com>
      2f221d6f
    • C
      inode: make init and permission helpers idmapped mount aware · 21cb47be
      Christian Brauner 提交于
      The inode_owner_or_capable() helper determines whether the caller is the
      owner of the inode or is capable with respect to that inode. Allow it to
      handle idmapped mounts. If the inode is accessed through an idmapped
      mount it according to the mount's user namespace. Afterwards the checks
      are identical to non-idmapped mounts. If the initial user namespace is
      passed nothing changes so non-idmapped mounts will see identical
      behavior as before.
      
      Similarly, allow the inode_init_owner() helper to handle idmapped
      mounts. It initializes a new inode on idmapped mounts by mapping the
      fsuid and fsgid of the caller from the mount's user namespace. If the
      initial user namespace is passed nothing changes so non-idmapped mounts
      will see identical behavior as before.
      
      Link: https://lore.kernel.org/r/20210121131959.646623-7-christian.brauner@ubuntu.com
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: linux-fsdevel@vger.kernel.org
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NJames Morris <jamorris@linux.microsoft.com>
      Signed-off-by: NChristian Brauner <christian.brauner@ubuntu.com>
      21cb47be
  3. 21 1月, 2021 1 次提交
  4. 16 12月, 2020 3 次提交
  5. 17 10月, 2020 1 次提交
  6. 14 10月, 2020 1 次提交
  7. 20 9月, 2020 1 次提交
  8. 04 9月, 2020 2 次提交
  9. 13 8月, 2020 2 次提交
  10. 08 8月, 2020 3 次提交
  11. 25 7月, 2020 1 次提交
  12. 10 6月, 2020 2 次提交
    • M
      mmap locking API: convert mmap_sem comments · c1e8d7c6
      Michel Lespinasse 提交于
      Convert comments that reference mmap_sem to reference mmap_lock instead.
      
      [akpm@linux-foundation.org: fix up linux-next leftovers]
      [akpm@linux-foundation.org: s/lockaphore/lock/, per Vlastimil]
      [akpm@linux-foundation.org: more linux-next fixups, per Michel]
      Signed-off-by: NMichel Lespinasse <walken@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Reviewed-by: NVlastimil Babka <vbabka@suse.cz>
      Reviewed-by: NDaniel Jordan <daniel.m.jordan@oracle.com>
      Cc: Davidlohr Bueso <dbueso@suse.de>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Laurent Dufour <ldufour@linux.ibm.com>
      Cc: Liam Howlett <Liam.Howlett@oracle.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ying Han <yinghan@google.com>
      Link: http://lkml.kernel.org/r/20200520052908.204642-13-walken@google.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c1e8d7c6
    • M
      mm: don't include asm/pgtable.h if linux/mm.h is already included · e31cf2f4
      Mike Rapoport 提交于
      Patch series "mm: consolidate definitions of page table accessors", v2.
      
      The low level page table accessors (pXY_index(), pXY_offset()) are
      duplicated across all architectures and sometimes more than once.  For
      instance, we have 31 definition of pgd_offset() for 25 supported
      architectures.
      
      Most of these definitions are actually identical and typically it boils
      down to, e.g.
      
      static inline unsigned long pmd_index(unsigned long address)
      {
              return (address >> PMD_SHIFT) & (PTRS_PER_PMD - 1);
      }
      
      static inline pmd_t *pmd_offset(pud_t *pud, unsigned long address)
      {
              return (pmd_t *)pud_page_vaddr(*pud) + pmd_index(address);
      }
      
      These definitions can be shared among 90% of the arches provided
      XYZ_SHIFT, PTRS_PER_XYZ and xyz_page_vaddr() are defined.
      
      For architectures that really need a custom version there is always
      possibility to override the generic version with the usual ifdefs magic.
      
      These patches introduce include/linux/pgtable.h that replaces
      include/asm-generic/pgtable.h and add the definitions of the page table
      accessors to the new header.
      
      This patch (of 12):
      
      The linux/mm.h header includes <asm/pgtable.h> to allow inlining of the
      functions involving page table manipulations, e.g.  pte_alloc() and
      pmd_alloc().  So, there is no point to explicitly include <asm/pgtable.h>
      in the files that include <linux/mm.h>.
      
      The include statements in such cases are remove with a simple loop:
      
      	for f in $(git grep -l "include <linux/mm.h>") ; do
      		sed -i -e '/include <asm\/pgtable.h>/ d' $f
      	done
      Signed-off-by: NMike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Cain <bcain@codeaurora.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Ley Foon Tan <ley.foon.tan@intel.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mike Rapoport <rppt@kernel.org>
      Cc: Nick Hu <nickhu@andestech.com>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vincent Chen <deanbo422@gmail.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Link: http://lkml.kernel.org/r/20200514170327.31389-1-rppt@kernel.org
      Link: http://lkml.kernel.org/r/20200514170327.31389-2-rppt@kernel.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e31cf2f4
  13. 04 6月, 2020 7 次提交
  14. 22 4月, 2020 3 次提交
  15. 08 4月, 2020 7 次提交