1. 21 5月, 2019 3 次提交
  2. 17 5月, 2019 1 次提交
  3. 15 5月, 2019 7 次提交
    • A
      mm: use mm_zero_struct_page from SPARC on all 64b architectures · 5470dea4
      Alexander Duyck 提交于
      Patch series "Deferred page init improvements", v7.
      
      This patchset is essentially a refactor of the page initialization logic
      that is meant to provide for better code reuse while providing a
      significant improvement in deferred page initialization performance.
      
      In my testing on an x86_64 system with 384GB of RAM I have seen the
      following.  In the case of regular memory initialization the deferred init
      time was decreased from 3.75s to 1.38s on average.  This amounts to a 172%
      improvement for the deferred memory initialization performance.
      
      I have called out the improvement observed with each patch.
      
      This patch (of 4):
      
      Use the same approach that was already in use on Sparc on all the
      architectures that support a 64b long.
      
      This is mostly motivated by the fact that 7 to 10 store/move instructions
      are likely always going to be faster than having to call into a function
      that is not specialized for handling page init.
      
      An added advantage to doing it this way is that the compiler can get away
      with combining writes in the __init_single_page call.  As a result the
      memset call will be reduced to only about 4 write operations, or at least
      that is what I am seeing with GCC 6.2 as the flags, LRU pointers, and
      count/mapcount seem to be cancelling out at least 4 of the 8 assignments
      on my system.
      
      One change I had to make to the function was to reduce the minimum page
      size to 56 to support some powerpc64 configurations.
      
      This change should introduce no change on SPARC since it already had this
      code.  In the case of x86_64 I saw a reduction from 3.75s to 2.80s when
      initializing 384GB of RAM per node.  Pavel Tatashin tested on a system
      with Broadcom's Stingray CPU and 48GB of RAM and found that
      __init_single_page() takes 19.30ns / 64-byte struct page before this patch
      and with this patch it takes 17.33ns / 64-byte struct page.  Mike Rapoport
      ran a similar test on a OpenPower (S812LC 8348-21C) with Power8 processor
      and 128GB or RAM.  His results per 64-byte struct page were 4.68ns before,
      and 4.59ns after this patch.
      
      Link: http://lkml.kernel.org/r/20190405221213.12227.9392.stgit@localhost.localdomainSigned-off-by: NAlexander Duyck <alexander.h.duyck@linux.intel.com>
      Reviewed-by: NPavel Tatashin <pavel.tatashin@microsoft.com>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Jiang <dave.jiang@intel.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Khalid Aziz <khalid.aziz@oracle.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Laurent Dufour <ldufour@linux.vnet.ibm.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: <yi.z.zhang@linux.intel.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5470dea4
    • A
      hugetlb: allow to free gigantic pages regardless of the configuration · 4eb0716e
      Alexandre Ghiti 提交于
      On systems without CONTIG_ALLOC activated but that support gigantic pages,
      boottime reserved gigantic pages can not be freed at all.  This patch
      simply enables the possibility to hand back those pages to memory
      allocator.
      
      Link: http://lkml.kernel.org/r/20190327063626.18421-5-alex@ghiti.frSigned-off-by: NAlexandre Ghiti <alex@ghiti.fr>
      Acked-by: David S. Miller <davem@davemloft.net> [sparc]
      Reviewed-by: NMike Kravetz <mike.kravetz@oracle.com>
      Cc: Andy Lutomirsky <luto@kernel.org>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: "H . Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      4eb0716e
    • A
      mm: simplify MEMORY_ISOLATION && COMPACTION || CMA into CONTIG_ALLOC · 8df995f6
      Alexandre Ghiti 提交于
      This condition allows to define alloc_contig_range, so simplify it into a
      more accurate naming.
      
      Link: http://lkml.kernel.org/r/20190327063626.18421-4-alex@ghiti.frSigned-off-by: NAlexandre Ghiti <alex@ghiti.fr>
      Suggested-by: NVlastimil Babka <vbabka@suse.cz>
      Acked-by: NVlastimil Babka <vbabka@suse.cz>
      Cc: Andy Lutomirsky <luto@kernel.org>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: "H . Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8df995f6
    • A
      sparc: advertise gigantic page support · b53f4569
      Alexandre Ghiti 提交于
      sparc actually supports gigantic pages and selecting
      ARCH_HAS_GIGANTIC_PAGE allows it to allocate and free gigantic pages at
      runtime.
      
      sparc allows configuration such as huge pages of 16GB, pages of 8KB and
      MAX_ORDER = 13 (default): HPAGE_SHIFT (34) - PAGE_SHIFT (13) = 21 >=
      MAX_ORDER (13)
      
      Link: http://lkml.kernel.org/r/20190327063626.18421-3-alex@ghiti.frSigned-off-by: NAlexandre Ghiti <alex@ghiti.fr>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Cc: Andy Lutomirsky <luto@kernel.org>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: "H . Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b53f4569
    • M
      init: free_initmem: poison freed init memory · f4039999
      Mike Rapoport 提交于
      Various architectures including x86 poison the freed init memory.  Do the
      same in the generic free_initmem implementation and switch sparc32
      architecture that is identical to the generic code over to it now.
      
      Link: http://lkml.kernel.org/r/1550515285-17446-4-git-send-email-rppt@linux.ibm.comSigned-off-by: NMike Rapoport <rppt@linux.ibm.com>
      Reviewed-by: NAndrew Morton <akpm@linux-foundation.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Palmer Dabbelt <palmer@sifive.com>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f4039999
    • C
      initramfs: poison freed initrd memory · f94f7434
      Christoph Hellwig 提交于
      Various architectures including x86 poison the freed initrd memory.  Do
      the same in the generic free_initrd_mem implementation and switch a few
      more architectures that are identical to the generic code over to it now.
      
      Link: http://lkml.kernel.org/r/20190213174621.29297-9-hch@lst.deSigned-off-by: NChristoph Hellwig <hch@lst.de>
      Acked-by: NMike Rapoport <rppt@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>	[arm64]
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>	[m68k]
      Cc: Steven Price <steven.price@arm.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f94f7434
    • I
      mm/gup: change GUP fast to use flags rather than a write 'bool' · 73b0140b
      Ira Weiny 提交于
      To facilitate additional options to get_user_pages_fast() change the
      singular write parameter to be gup_flags.
      
      This patch does not change any functionality.  New functionality will
      follow in subsequent patches.
      
      Some of the get_user_pages_fast() call sites were unchanged because they
      already passed FOLL_WRITE or 0 for the write parameter.
      
      NOTE: It was suggested to change the ordering of the get_user_pages_fast()
      arguments to ensure that callers were converted.  This breaks the current
      GUP call site convention of having the returned pages be the final
      parameter.  So the suggestion was rejected.
      
      Link: http://lkml.kernel.org/r/20190328084422.29911-4-ira.weiny@intel.com
      Link: http://lkml.kernel.org/r/20190317183438.2057-4-ira.weiny@intel.comSigned-off-by: NIra Weiny <ira.weiny@intel.com>
      Reviewed-by: NMike Marshall <hubcap@omnibond.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Hogan <jhogan@kernel.org>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      73b0140b
  4. 10 5月, 2019 3 次提交
    • V
      cpufreq: Call transition notifier only once for each policy · df24014a
      Viresh Kumar 提交于
      Currently, the notifiers are called once for each CPU of the policy->cpus
      cpumask. It would be more optimal if the notifier can be called only
      once and all the relevant information be provided to it. Out of the 23
      drivers that register for the transition notifiers today, only 4 of them
      do per-cpu updates and the callback for the rest can be called only once
      for the policy without any impact.
      
      This would also avoid multiple function calls to the notifier callbacks
      and reduce multiple iterations of notifier core's code (which does
      locking as well).
      
      This patch adds pointer to the cpufreq policy to the struct
      cpufreq_freqs, so the notifier callback has all the information
      available to it with a single call. The five drivers which perform
      per-cpu updates are updated to use the cpufreq policy. The freqs->cpu
      field is redundant now and is removed.
      
      Acked-by: David S. Miller <davem@davemloft.net> (sparc)
      Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      df24014a
    • M
      sparc64: simplify reduce_memory() function · f4d9a23d
      Mike Rapoport 提交于
      The reduce_memory() function clampls the available memory to a limit
      defined by the "mem=" command line parameter. It takes into account the
      amount of already reserved memory and excludes it from the limit
      calculations.
      
      Rather than traverse memblocks and remove them by hand, use
      memblock_reserved_size() to account the reserved memory and
      memblock_enforce_memory_limit() to clamp the available memory.
      Signed-off-by: NMike Rapoport <rppt@linux.ibm.com>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f4d9a23d
    • G
      sparc: use struct_size() in kzalloc() · bc0025b6
      Gustavo A. R. Silva 提交于
      One of the more common cases of allocation size calculations is finding the
      size of a structure that has a zero-sized array at the end, along with memory
      for some number of elements for that array. For example:
      
      struct foo {
          int stuff;
          void *entry[];
      };
      
      instance = kzalloc(sizeof(struct foo) + sizeof(void *) * count, GFP_KERNEL);
      
      Instead of leaving these open-coded and prone to type mistakes, we can now
      use the new struct_size() helper:
      
      instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL);
      
      This code was detected with the help of Coccinelle.
      Signed-off-by: NGustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bc0025b6
  5. 09 5月, 2019 11 次提交
  6. 20 4月, 2019 1 次提交
  7. 18 4月, 2019 1 次提交
  8. 15 4月, 2019 1 次提交
  9. 11 4月, 2019 1 次提交
  10. 09 4月, 2019 1 次提交
    • S
      treewide: Switch printk users from %pf and %pF to %ps and %pS, respectively · d75f773c
      Sakari Ailus 提交于
      %pF and %pf are functionally equivalent to %pS and %ps conversion
      specifiers. The former are deprecated, therefore switch the current users
      to use the preferred variant.
      
      The changes have been produced by the following command:
      
      	git grep -l '%p[fF]' | grep -v '^\(tools\|Documentation\)/' | \
      	while read i; do perl -i -pe 's/%pf/%ps/g; s/%pF/%pS/g;' $i; done
      
      And verifying the result.
      
      Link: http://lkml.kernel.org/r/20190325193229.23390-1-sakari.ailus@linux.intel.com
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: sparclinux@vger.kernel.org
      Cc: linux-um@lists.infradead.org
      Cc: xen-devel@lists.xenproject.org
      Cc: linux-acpi@vger.kernel.org
      Cc: linux-pm@vger.kernel.org
      Cc: drbd-dev@lists.linbit.com
      Cc: linux-block@vger.kernel.org
      Cc: linux-mmc@vger.kernel.org
      Cc: linux-nvdimm@lists.01.org
      Cc: linux-pci@vger.kernel.org
      Cc: linux-scsi@vger.kernel.org
      Cc: linux-btrfs@vger.kernel.org
      Cc: linux-f2fs-devel@lists.sourceforge.net
      Cc: linux-mm@kvack.org
      Cc: ceph-devel@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Signed-off-by: NSakari Ailus <sakari.ailus@linux.intel.com>
      Acked-by: David Sterba <dsterba@suse.com> (for btrfs)
      Acked-by: Mike Rapoport <rppt@linux.ibm.com> (for mm/memblock.c)
      Acked-by: Bjorn Helgaas <bhelgaas@google.com> (for drivers/pci)
      Acked-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: NPetr Mladek <pmladek@suse.com>
      d75f773c
  11. 08 4月, 2019 2 次提交
  12. 05 4月, 2019 2 次提交
    • S
      syscalls: Remove start and number from syscall_set_arguments() args · 32d92586
      Steven Rostedt (VMware) 提交于
      After removing the start and count arguments of syscall_get_arguments() it
      seems reasonable to remove them from syscall_set_arguments(). Note, as of
      today, there are no users of syscall_set_arguments(). But we are told that
      there will be soon. But for now, at least make it consistent with
      syscall_get_arguments().
      
      Link: http://lkml.kernel.org/r/20190327222014.GA32540@altlinux.org
      
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Cc: Dave Martin <dave.martin@arm.com>
      Cc: "Dmitry V. Levin" <ldv@altlinux.org>
      Cc: x86@kernel.org
      Cc: linux-snps-arc@lists.infradead.org
      Cc: linux-kernel@vger.kernel.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-c6x-dev@linux-c6x.org
      Cc: uclinux-h8-devel@lists.sourceforge.jp
      Cc: linux-hexagon@vger.kernel.org
      Cc: linux-ia64@vger.kernel.org
      Cc: linux-mips@vger.kernel.org
      Cc: nios2-dev@lists.rocketboards.org
      Cc: openrisc@lists.librecores.org
      Cc: linux-parisc@vger.kernel.org
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: linux-riscv@lists.infradead.org
      Cc: linux-s390@vger.kernel.org
      Cc: linux-sh@vger.kernel.org
      Cc: sparclinux@vger.kernel.org
      Cc: linux-um@lists.infradead.org
      Cc: linux-xtensa@linux-xtensa.org
      Cc: linux-arch@vger.kernel.org
      Acked-by: Max Filippov <jcmvbkbc@gmail.com> # For xtensa changes
      Acked-by: Will Deacon <will.deacon@arm.com> # For the arm64 bits
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de> # for x86
      Reviewed-by: NDmitry V. Levin <ldv@altlinux.org>
      Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      32d92586
    • S
      syscalls: Remove start and number from syscall_get_arguments() args · b35f549d
      Steven Rostedt (Red Hat) 提交于
      At Linux Plumbers, Andy Lutomirski approached me and pointed out that the
      function call syscall_get_arguments() implemented in x86 was horribly
      written and not optimized for the standard case of passing in 0 and 6 for
      the starting index and the number of system calls to get. When looking at
      all the users of this function, I discovered that all instances pass in only
      0 and 6 for these arguments. Instead of having this function handle
      different cases that are never used, simply rewrite it to return the first 6
      arguments of a system call.
      
      This should help out the performance of tracing system calls by ptrace,
      ftrace and perf.
      
      Link: http://lkml.kernel.org/r/20161107213233.754809394@goodmis.org
      
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Cc: Dave Martin <dave.martin@arm.com>
      Cc: "Dmitry V. Levin" <ldv@altlinux.org>
      Cc: x86@kernel.org
      Cc: linux-snps-arc@lists.infradead.org
      Cc: linux-kernel@vger.kernel.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-c6x-dev@linux-c6x.org
      Cc: uclinux-h8-devel@lists.sourceforge.jp
      Cc: linux-hexagon@vger.kernel.org
      Cc: linux-ia64@vger.kernel.org
      Cc: linux-mips@vger.kernel.org
      Cc: nios2-dev@lists.rocketboards.org
      Cc: openrisc@lists.librecores.org
      Cc: linux-parisc@vger.kernel.org
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: linux-riscv@lists.infradead.org
      Cc: linux-s390@vger.kernel.org
      Cc: linux-sh@vger.kernel.org
      Cc: sparclinux@vger.kernel.org
      Cc: linux-um@lists.infradead.org
      Cc: linux-xtensa@linux-xtensa.org
      Cc: linux-arch@vger.kernel.org
      Acked-by: Paul Burton <paul.burton@mips.com> # MIPS parts
      Acked-by: Max Filippov <jcmvbkbc@gmail.com> # For xtensa changes
      Acked-by: Will Deacon <will.deacon@arm.com> # For the arm64 bits
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de> # for x86
      Reviewed-by: NDmitry V. Levin <ldv@altlinux.org>
      Reported-by: NAndy Lutomirski <luto@amacapital.net>
      Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      b35f549d
  13. 03 4月, 2019 5 次提交
    • W
      locking/rwsem: Remove rwsem-spinlock.c & use rwsem-xadd.c for all archs · 390a0c62
      Waiman Long 提交于
      Currently, we have two different implementation of rwsem:
      
       1) CONFIG_RWSEM_GENERIC_SPINLOCK (rwsem-spinlock.c)
       2) CONFIG_RWSEM_XCHGADD_ALGORITHM (rwsem-xadd.c)
      
      As we are going to use a single generic implementation for rwsem-xadd.c
      and no architecture-specific code will be needed, there is no point
      in keeping two different implementations of rwsem. In most cases, the
      performance of rwsem-spinlock.c will be worse. It also doesn't get all
      the performance tuning and optimizations that had been implemented in
      rwsem-xadd.c over the years.
      
      For simplication, we are going to remove rwsem-spinlock.c and make all
      architectures use a single implementation of rwsem - rwsem-xadd.c.
      
      All references to RWSEM_GENERIC_SPINLOCK and RWSEM_XCHGADD_ALGORITHM
      in the code are removed.
      Suggested-by: NPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: NWaiman Long <longman@redhat.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-c6x-dev@linux-c6x.org
      Cc: linux-m68k@lists.linux-m68k.org
      Cc: linux-riscv@lists.infradead.org
      Cc: linux-um@lists.infradead.org
      Cc: linux-xtensa@linux-xtensa.org
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: nios2-dev@lists.rocketboards.org
      Cc: openrisc@lists.librecores.org
      Cc: uclinux-h8-devel@lists.sourceforge.jp
      Link: https://lkml.kernel.org/r/20190322143008.21313-3-longman@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      390a0c62
    • W
      locking/rwsem: Remove arch specific rwsem files · 46ad0840
      Waiman Long 提交于
      As the generic rwsem-xadd code is using the appropriate acquire and
      release versions of the atomic operations, the arch specific rwsem.h
      files will not be that much faster than the generic code as long as the
      atomic functions are properly implemented. So we can remove those arch
      specific rwsem.h and stop building asm/rwsem.h to reduce maintenance
      effort.
      
      Currently, only x86, alpha and ia64 have implemented architecture
      specific fast paths. I don't have access to alpha and ia64 systems for
      testing, but they are legacy systems that are not likely to be updated
      to the latest kernel anyway.
      
      By using a rwsem microbenchmark, the total locking rates on a 4-socket
      56-core 112-thread x86-64 system before and after the patch were as
      follows (mixed means equal # of read and write locks):
      
                            Before Patch              After Patch
         # of Threads  wlock   rlock   mixed     wlock   rlock   mixed
         ------------  -----   -----   -----     -----   -----   -----
              1        29,201  30,143  29,458    28,615  30,172  29,201
              2         6,807  13,299   1,171     7,725  15,025   1,804
              4         6,504  12,755   1,520     7,127  14,286   1,345
              8         6,762  13,412     764     6,826  13,652     726
             16         6,693  15,408     662     6,599  15,938     626
             32         6,145  15,286     496     5,549  15,487     511
             64         5,812  15,495      60     5,858  15,572      60
      
      There were some run-to-run variations for the multi-thread tests. For
      x86-64, using the generic C code fast path seems to be a little bit
      faster than the assembly version with low lock contention.  Looking at
      the assembly version of the fast paths, there are assembly to/from C
      code wrappers that save and restore all the callee-clobbered registers
      (7 registers on x86-64). The assembly generated from the generic C
      code doesn't need to do that. That may explain the slight performance
      gain here.
      
      The generic asm rwsem.h can also be merged into kernel/locking/rwsem.h
      with no code change as no other code other than those under
      kernel/locking needs to access the internal rwsem macros and functions.
      Signed-off-by: NWaiman Long <longman@redhat.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-c6x-dev@linux-c6x.org
      Cc: linux-m68k@lists.linux-m68k.org
      Cc: linux-riscv@lists.infradead.org
      Cc: linux-um@lists.infradead.org
      Cc: linux-xtensa@linux-xtensa.org
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: nios2-dev@lists.rocketboards.org
      Cc: openrisc@lists.librecores.org
      Cc: uclinux-h8-devel@lists.sourceforge.jp
      Link: https://lkml.kernel.org/r/20190322143008.21313-2-longman@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      46ad0840
    • P
      arch/tlb: Clean up simple architectures · 6137fed0
      Peter Zijlstra 提交于
      For the architectures that do not implement their own tlb_flush() but
      do already use the generic mmu_gather, there are two options:
      
       1) the platform has an efficient flush_tlb_range() and
          asm-generic/tlb.h doesn't need any overrides at all.
      
       2) the platform lacks an efficient flush_tlb_range() and
          we select MMU_GATHER_NO_RANGE to minimize full invalidates.
      
      Convert all 'simple' architectures to one of these two forms.
      
      alpha:	    has no range invalidate -> 2
      arc:	    already used flush_tlb_range() -> 1
      c6x:	    has no range invalidate -> 2
      hexagon:    has an efficient flush_tlb_range() -> 1
                  (flush_tlb_mm() is in fact a full range invalidate,
      	     so no need to shoot down everything)
      m68k:	    has inefficient flush_tlb_range() -> 2
      microblaze: has no flush_tlb_range() -> 2
      mips:	    has efficient flush_tlb_range() -> 1
      	    (even though it currently seems to use flush_tlb_mm())
      nds32:	    already uses flush_tlb_range() -> 1
      nios2:	    has inefficient flush_tlb_range() -> 2
      	    (no limit on range iteration)
      openrisc:   has inefficient flush_tlb_range() -> 2
      	    (no limit on range iteration)
      parisc:	    already uses flush_tlb_range() -> 1
      sparc32:    already uses flush_tlb_range() -> 1
      unicore32:  has inefficient flush_tlb_range() -> 2
      	    (no limit on range iteration)
      xtensa:	    has efficient flush_tlb_range() -> 1
      
      Note this also fixes a bug in the existing code for a number
      platforms. Those platforms that did:
      
        tlb_end_vma() -> if (!full_mm) flush_tlb_*()
        tlb_flush -> if (full_mm) flush_tlb_mm()
      
      missed the case of shift_arg_pages(), which doesn't have @fullmm set,
      nor calls into tlb_*vma(), but still frees page-tables and thus needs
      an invalidate. The new code handles this by detecting a non-empty
      range, and either issuing the matching range invalidate or a full
      invalidate, depending on the capabilities.
      
      No change in behavior intended.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Ley Foon Tan <lftan@altera.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Nick Piggin <npiggin@gmail.com>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      6137fed0
    • P
      asm-generic/tlb, arch: Invert CONFIG_HAVE_RCU_TABLE_INVALIDATE · 96bc9567
      Peter Zijlstra 提交于
      Make issuing a TLB invalidate for page-table pages the normal case.
      
      The reason is twofold:
      
       - too many invalidates is safer than too few,
       - most architectures use the linux page-tables natively
         and would thus require this.
      
      Make it an opt-out, instead of an opt-in.
      
      No change in behavior intended.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: NWill Deacon <will.deacon@arm.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      96bc9567
    • P
      asm-generic/tlb, arch: Provide generic VIPT cache flush · e7fd28a7
      Peter Zijlstra 提交于
      The one obvious thing SH and ARM want is a sensible default for
      tlb_start_vma(). (also: https://lkml.org/lkml/2004/1/15/6 )
      
      Avoid all VIPT architectures providing their own tlb_start_vma()
      implementation and rely on architectures to provide a no-op
      flush_cache_range() when it is not relevant.
      
      This patch makes tlb_start_vma() default to flush_cache_range(), which
      should be right and sufficient. The only exceptions that I found where
      (oddly):
      
        - m68k-mmu
        - sparc64
        - unicore
      
      Those architectures appear to have flush_cache_range(), but their
      current tlb_start_vma() does not call it.
      
      No change in behavior intended.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: NWill Deacon <will.deacon@arm.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nick Piggin <npiggin@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      e7fd28a7
  14. 29 3月, 2019 1 次提交
    • M
      KVM: export <linux/kvm_para.h> and <asm/kvm_para.h> iif KVM is supported · 3d9683cf
      Masahiro Yamada 提交于
      I do not see any consistency about headers_install of <linux/kvm_para.h>
      and <asm/kvm_para.h>.
      
      According to my analysis of Linux 5.1-rc1, there are 3 groups:
      
       [1] Both <linux/kvm_para.h> and <asm/kvm_para.h> are exported
      
          alpha, arm, hexagon, mips, powerpc, s390, sparc, x86
      
       [2] <asm/kvm_para.h> is exported, but <linux/kvm_para.h> is not
      
          arc, arm64, c6x, h8300, ia64, m68k, microblaze, nios2, openrisc,
          parisc, sh, unicore32, xtensa
      
       [3] Neither <linux/kvm_para.h> nor <asm/kvm_para.h> is exported
      
          csky, nds32, riscv
      
      This does not match to the actual KVM support. At least, [2] is
      half-baked.
      
      Nor do arch maintainers look like they care about this. For example,
      commit 0add5371 ("microblaze: Add missing kvm_para.h to Kbuild")
      exported <asm/kvm_para.h> to user-space in order to fix an in-kernel
      build error.
      
      We have two ways to make this consistent:
      
       [A] export both <linux/kvm_para.h> and <asm/kvm_para.h> for all
           architectures, irrespective of the KVM support
      
       [B] Match the header export of <linux/kvm_para.h> and <asm/kvm_para.h>
           to the KVM support
      
      My first attempt was [A] because the code looks cleaner, but Paolo
      suggested [B].
      
      So, this commit goes with [B].
      
      For most architectures, <asm/kvm_para.h> was moved to the kernel-space.
      I changed include/uapi/linux/Kbuild so that it checks generated
      asm/kvm_para.h as well as check-in ones.
      
      After this commit, there will be two groups:
      
       [1] Both <linux/kvm_para.h> and <asm/kvm_para.h> are exported
      
          arm, arm64, mips, powerpc, s390, x86
      
       [2] Neither <linux/kvm_para.h> nor <asm/kvm_para.h> is exported
      
          alpha, arc, c6x, csky, h8300, hexagon, ia64, m68k, microblaze,
          nds32, nios2, openrisc, parisc, riscv, sh, sparc, unicore32, xtensa
      Signed-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com>
      Acked-by: NCornelia Huck <cohuck@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      3d9683cf