1. 13 12月, 2019 1 次提交
  2. 05 12月, 2019 4 次提交
    • H
      parisc/hugetlb: use pgtable-nopXd instead of 4level-fixup · 2fa245c1
      Helge Deller 提交于
      Link: http://lkml.kernel.org/r/1572938135-31886-10-git-send-email-rppt@kernel.orgSigned-off-by: NHelge Deller <deller@gmx.de>
      Signed-off-by: NMike Rapoport <rppt@linux.ibm.com>
      Cc: Anatoly Pugachev <matorola@gmail.com>
      Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Peter Rosin <peda@axentia.se>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rolf Eike Beer <eike-kernel@sf-tec.de>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Russell King <rmk+kernel@armlinux.org.uk>
      Cc: Sam Creasey <sammy@sammy.net>
      Cc: Vincent Chen <deanbo422@gmail.com>
      Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2fa245c1
    • M
      parisc: use pgtable-nopXd instead of 4level-fixup · d96885e2
      Mike Rapoport 提交于
      parisc has two or three levels of page tables and can use appropriate
      pgtable-nopXd and folding of the upper layers.
      
      Replace usage of include/asm-generic/4level-fixup.h and explicit
      definitions of __PAGETABLE_PxD_FOLDED in parisc with
      include/asm-generic/pgtable-nopmd.h for two-level configurations and
      with include/asm-generic/pgtable-nopud.h for three-lelve configurations
      and adjust page table manipulation macros and functions accordingly.
      
      Link: http://lkml.kernel.org/r/1572938135-31886-9-git-send-email-rppt@kernel.orgSigned-off-by: NMike Rapoport <rppt@linux.ibm.com>
      Acked-by: NHelge Deller <deller@gmx.de>
      Cc: Anatoly Pugachev <matorola@gmail.com>
      Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Peter Rosin <peda@axentia.se>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rolf Eike Beer <eike-kernel@sf-tec.de>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Russell King <rmk+kernel@armlinux.org.uk>
      Cc: Sam Creasey <sammy@sammy.net>
      Cc: Vincent Chen <deanbo422@gmail.com>
      Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d96885e2
    • M
      arch: sembuf.h: make uapi asm/sembuf.h self-contained · 0fb9dc28
      Masahiro Yamada 提交于
      Userspace cannot compile <asm/sembuf.h> due to some missing type
      definitions.  For example, building it for x86 fails as follows:
      
          CC      usr/include/asm/sembuf.h.s
        In file included from <command-line>:32:0:
        usr/include/asm/sembuf.h:17:20: error: field `sem_perm' has incomplete type
          struct ipc64_perm sem_perm; /* permissions .. see ipc.h */
                            ^~~~~~~~
        usr/include/asm/sembuf.h:24:2: error: unknown type name `__kernel_time_t'
          __kernel_time_t sem_otime; /* last semop time */
          ^~~~~~~~~~~~~~~
        usr/include/asm/sembuf.h:25:2: error: unknown type name `__kernel_ulong_t'
          __kernel_ulong_t __unused1;
          ^~~~~~~~~~~~~~~~
        usr/include/asm/sembuf.h:26:2: error: unknown type name `__kernel_time_t'
          __kernel_time_t sem_ctime; /* last change time */
          ^~~~~~~~~~~~~~~
        usr/include/asm/sembuf.h:27:2: error: unknown type name `__kernel_ulong_t'
          __kernel_ulong_t __unused2;
          ^~~~~~~~~~~~~~~~
        usr/include/asm/sembuf.h:29:2: error: unknown type name `__kernel_ulong_t'
          __kernel_ulong_t sem_nsems; /* no. of semaphores in array */
          ^~~~~~~~~~~~~~~~
        usr/include/asm/sembuf.h:30:2: error: unknown type name `__kernel_ulong_t'
          __kernel_ulong_t __unused3;
          ^~~~~~~~~~~~~~~~
        usr/include/asm/sembuf.h:31:2: error: unknown type name `__kernel_ulong_t'
          __kernel_ulong_t __unused4;
          ^~~~~~~~~~~~~~~~
      
      It is just a matter of missing include directive.
      
      Include <asm/ipcbuf.h> to make it self-contained, and add it to
      the compile-test coverage.
      
      Link: http://lkml.kernel.org/r/20191030063855.9989-3-yamada.masahiro@socionext.comSigned-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0fb9dc28
    • M
      arch: msgbuf.h: make uapi asm/msgbuf.h self-contained · 9ef0e004
      Masahiro Yamada 提交于
      Userspace cannot compile <asm/msgbuf.h> due to some missing type
      definitions.  For example, building it for x86 fails as follows:
      
          CC      usr/include/asm/msgbuf.h.s
        In file included from usr/include/asm/msgbuf.h:6:0,
                         from <command-line>:32:
        usr/include/asm-generic/msgbuf.h:25:20: error: field `msg_perm' has incomplete type
          struct ipc64_perm msg_perm;
                            ^~~~~~~~
        usr/include/asm-generic/msgbuf.h:27:2: error: unknown type name `__kernel_time_t'
          __kernel_time_t msg_stime; /* last msgsnd time */
          ^~~~~~~~~~~~~~~
        usr/include/asm-generic/msgbuf.h:28:2: error: unknown type name `__kernel_time_t'
          __kernel_time_t msg_rtime; /* last msgrcv time */
          ^~~~~~~~~~~~~~~
        usr/include/asm-generic/msgbuf.h:29:2: error: unknown type name `__kernel_time_t'
          __kernel_time_t msg_ctime; /* last change time */
          ^~~~~~~~~~~~~~~
        usr/include/asm-generic/msgbuf.h:41:2: error: unknown type name `__kernel_pid_t'
          __kernel_pid_t msg_lspid; /* pid of last msgsnd */
          ^~~~~~~~~~~~~~
        usr/include/asm-generic/msgbuf.h:42:2: error: unknown type name `__kernel_pid_t'
          __kernel_pid_t msg_lrpid; /* last receive pid */
          ^~~~~~~~~~~~~~
      
      It is just a matter of missing include directive.
      
      Include <asm/ipcbuf.h> to make it self-contained, and add it to
      the compile-test coverage.
      
      Link: http://lkml.kernel.org/r/20191030063855.9989-2-yamada.masahiro@socionext.comSigned-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9ef0e004
  3. 27 11月, 2019 1 次提交
    • E
      sysctl: Remove the sysctl system call · 61a47c1a
      Eric W. Biederman 提交于
      This system call has been deprecated almost since it was introduced, and
      in a survey of the linux distributions I can no longer find any of them
      that enable CONFIG_SYSCTL_SYSCALL.  The only indication that I can find
      that anyone might care is that a few of the defconfigs in the kernel
      enable CONFIG_SYSCTL_SYSCALL.  However this appears in only 31 of 414
      defconfigs in the kernel, so I suspect this symbols presence is simply
      because it is harmless to include rather than because it is necessary.
      
      As there appear to be no users of the sysctl system call, remove the
      code.  As this removes one of the few uses of the internal kernel mount
      of proc I hope this allows for even more simplifications of the proc
      filesystem.
      
      Cc: Alex Smith <alex.smith@imgtec.com>
      Cc: Anders Berg <anders.berg@lsi.com>
      Cc: Apelete Seketeli <apelete@seketeli.net>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Chee Nouk Phoon <cnphoon@altera.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Christian Ruppert <christian.ruppert@abilis.com>
      Cc: Greg Ungerer <gerg@uclinux.org>
      Cc: Harvey Hunt <harvey.hunt@imgtec.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Hongliang Tao <taohl@lemote.com>
      Cc: Hua Yan <yanh@lemote.com>
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: John Crispin <blogic@openwrt.org>
      Cc: Jonas Jensen <jonas.jensen@gmail.com>
      Cc: Josh Boyer <jwboyer@gmail.com>
      Cc: Jun Nie <jun.nie@linaro.org>
      Cc: Kevin Hilman <khilman@linaro.org>
      Cc: Kevin Wells <kevin.wells@nxp.com>
      Cc: Kumar Gala <galak@codeaurora.org>
      Cc: Lars-Peter Clausen <lars@metafoo.de>
      Cc: Ley Foon Tan <lftan@altera.com>
      Cc: Linus Walleij <linus.walleij@linaro.org>
      Cc: Markos Chandras <markos.chandras@imgtec.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Noam Camus <noamc@ezchip.com>
      Cc: Olof Johansson <olof@lixom.net>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Phil Edworthy <phil.edworthy@renesas.com>
      Cc: Pierrick Hascoet <pierrick.hascoet@abilis.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Roland Stigge <stigge@antcom.de>
      Cc: Santosh Shilimkar <santosh.shilimkar@ti.com>
      Cc: Scott Telford <stelford@cadence.com>
      Cc: Stephen Boyd <sboyd@codeaurora.org>
      Cc: Steven J. Hill <Steven.Hill@imgtec.com>
      Cc: Tanmay Inamdar <tinamdar@apm.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Wolfram Sang <w.sang@pengutronix.de>
      Acked-by: NAndi Kleen <ak@linux.intel.com>
      Reviewed-by: NKees Cook <keescook@chromium.org>
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      61a47c1a
  4. 21 11月, 2019 1 次提交
  5. 15 11月, 2019 1 次提交
    • A
      y2038: ipc: remove __kernel_time_t reference from headers · caf5e32d
      Arnd Bergmann 提交于
      There are two structures based on time_t that conflict between libc and
      kernel: timeval and timespec. Both are now renamed to __kernel_old_timeval
      and __kernel_old_timespec.
      
      For time_t, the old typedef is still __kernel_time_t. There is nothing
      wrong with that name, but it would be nice to not use that going forward
      as this type is used almost only in deprecated interfaces because of
      the y2038 overflow.
      
      In the IPC headers (msgbuf.h, sembuf.h, shmbuf.h), __kernel_time_t is only
      used for the 64-bit variants, which are not deprecated.
      
      Change these to a plain 'long', which is the same type as __kernel_time_t
      on all 64-bit architectures anyway, to reduce the number of users of the
      old type.
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      caf5e32d
  6. 12 11月, 2019 1 次提交
  7. 06 11月, 2019 1 次提交
    • M
      module/ftrace: handle patchable-function-entry · a1326b17
      Mark Rutland 提交于
      When using patchable-function-entry, the compiler will record the
      callsites into a section named "__patchable_function_entries" rather
      than "__mcount_loc". Let's abstract this difference behind a new
      FTRACE_CALLSITE_SECTION, so that architectures don't have to handle this
      explicitly (e.g. with custom module linker scripts).
      
      As parisc currently handles this explicitly, it is fixed up accordingly,
      with its custom linker script removed. Since FTRACE_CALLSITE_SECTION is
      only defined when DYNAMIC_FTRACE is selected, the parisc module loading
      code is updated to only use the definition in that case. When
      DYNAMIC_FTRACE is not selected, modules shouldn't have this section, so
      this removes some redundant work in that case.
      
      To make sure that this is keep up-to-date for modules and the main
      kernel, a comment is added to vmlinux.lds.h, with the existing ifdeffery
      simplified for legibility.
      
      I built parisc generic-{32,64}bit_defconfig with DYNAMIC_FTRACE enabled,
      and verified that the section made it into the .ko files for modules.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Acked-by: NHelge Deller <deller@gmx.de>
      Acked-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Reviewed-by: NTorsten Duwe <duwe@suse.de>
      Tested-by: NAmit Daniel Kachhap <amit.kachhap@arm.com>
      Tested-by: NSven Schnelle <svens@stackframe.org>
      Tested-by: NTorsten Duwe <duwe@suse.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Jessica Yu <jeyu@kernel.org>
      Cc: linux-parisc@vger.kernel.org
      a1326b17
  8. 05 11月, 2019 1 次提交
    • K
      parisc: Move EXCEPTION_TABLE to RO_DATA segment · 6e85e23e
      Kees Cook 提交于
      Since the EXCEPTION_TABLE is read-only, collapse it into RO_DATA.
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Acked-by: NHelge Deller <deller@gmx.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: linux-alpha@vger.kernel.org
      Cc: linux-arch@vger.kernel.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-c6x-dev@linux-c6x.org
      Cc: linux-ia64@vger.kernel.org
      Cc: linux-parisc@vger.kernel.org
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: linux-s390@vger.kernel.org
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
      Cc: Segher Boessenkool <segher@kernel.crashing.org>
      Cc: Sven Schnelle <svens@stackframe.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: x86-ml <x86@kernel.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Link: https://lkml.kernel.org/r/20191029211351.13243-24-keescook@chromium.org
      6e85e23e
  9. 04 11月, 2019 4 次提交
    • K
      vmlinux.lds.h: Replace RW_DATA_SECTION with RW_DATA · c9174047
      Kees Cook 提交于
      Rename RW_DATA_SECTION to RW_DATA. (Calling this a "section" is a lie,
      since it's multiple sections and section flags cannot be applied to
      the macro.)
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # s390
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: linux-alpha@vger.kernel.org
      Cc: linux-arch@vger.kernel.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-c6x-dev@linux-c6x.org
      Cc: linux-ia64@vger.kernel.org
      Cc: linux-s390@vger.kernel.org
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
      Cc: Segher Boessenkool <segher@kernel.crashing.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: x86-ml <x86@kernel.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Link: https://lkml.kernel.org/r/20191029211351.13243-14-keescook@chromium.org
      c9174047
    • K
      vmlinux.lds.h: Replace RO_DATA_SECTION with RO_DATA · 93240b32
      Kees Cook 提交于
      Finish renaming RO_DATA_SECTION to RO_DATA. (Calling this a "section"
      is a lie, since it's multiple sections and section flags cannot be
      applied to the macro.)
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # s390
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: linux-alpha@vger.kernel.org
      Cc: linux-arch@vger.kernel.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-c6x-dev@linux-c6x.org
      Cc: linux-ia64@vger.kernel.org
      Cc: linux-s390@vger.kernel.org
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
      Cc: Segher Boessenkool <segher@kernel.crashing.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: x86-ml <x86@kernel.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Link: https://lkml.kernel.org/r/20191029211351.13243-13-keescook@chromium.org
      93240b32
    • K
      vmlinux.lds.h: Move NOTES into RO_DATA · eaf93707
      Kees Cook 提交于
      The .notes section should be non-executable read-only data. As such,
      move it to the RO_DATA macro instead of being per-architecture defined.
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # s390
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: linux-alpha@vger.kernel.org
      Cc: linux-arch@vger.kernel.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-c6x-dev@linux-c6x.org
      Cc: linux-ia64@vger.kernel.org
      Cc: linux-s390@vger.kernel.org
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
      Cc: Segher Boessenkool <segher@kernel.crashing.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: x86-ml <x86@kernel.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Link: https://lkml.kernel.org/r/20191029211351.13243-11-keescook@chromium.org
      eaf93707
    • J
      parisc: Avoid spurious inequivalent alias kernel error messages · e9c837c6
      John David Anglin 提交于
      This patch changes flush_dcache_page() to only print inequivalent alias error
      messages on systems that require coherency.  Inequivalent aliases can occur on
      systems that don't require coherency and this can cause spurious messages.
      Signed-off-by: NJohn David Anglin <dave.anglin@bell.net>
      Signed-off-by: NHelge Deller <deller@gmx.de>
      e9c837c6
  10. 31 10月, 2019 1 次提交
  11. 15 10月, 2019 3 次提交
  12. 11 10月, 2019 1 次提交
  13. 26 9月, 2019 2 次提交
    • M
      mm: introduce MADV_PAGEOUT · 1a4e58cc
      Minchan Kim 提交于
      When a process expects no accesses to a certain memory range for a long
      time, it could hint kernel that the pages can be reclaimed instantly but
      data should be preserved for future use.  This could reduce workingset
      eviction so it ends up increasing performance.
      
      This patch introduces the new MADV_PAGEOUT hint to madvise(2) syscall.
      MADV_PAGEOUT can be used by a process to mark a memory range as not
      expected to be used for a long time so that kernel reclaims *any LRU*
      pages instantly.  The hint can help kernel in deciding which pages to
      evict proactively.
      
      A note: It doesn't apply SWAP_CLUSTER_MAX LRU page isolation limit
      intentionally because it's automatically bounded by PMD size.  If PMD
      size(e.g., 256) makes some trouble, we could fix it later by limit it to
      SWAP_CLUSTER_MAX[1].
      
      - man-page material
      
      MADV_PAGEOUT (since Linux x.x)
      
      Do not expect access in the near future so pages in the specified
      regions could be reclaimed instantly regardless of memory pressure.
      Thus, access in the range after successful operation could cause
      major page fault but never lose the up-to-date contents unlike
      MADV_DONTNEED. Pages belonging to a shared mapping are only processed
      if a write access is allowed for the calling process.
      
      MADV_PAGEOUT cannot be applied to locked pages, Huge TLB pages, or
      VM_PFNMAP pages.
      
      [1] https://lore.kernel.org/lkml/20190710194719.GS29695@dhcp22.suse.cz/
      
      [minchan@kernel.org: clear PG_active on MADV_PAGEOUT]
        Link: http://lkml.kernel.org/r/20190802200643.GA181880@google.com
      [akpm@linux-foundation.org: resolve conflicts with hmm.git]
      Link: http://lkml.kernel.org/r/20190726023435.214162-5-minchan@kernel.orgSigned-off-by: NMinchan Kim <minchan@kernel.org>
      Reported-by: Nkbuild test robot <lkp@intel.com>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Daniel Colascione <dancol@google.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Hillf Danton <hdanton@sina.com>
      Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Oleksandr Natalenko <oleksandr@redhat.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Sonny Rao <sonnyrao@google.com>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Tim Murray <timmurray@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1a4e58cc
    • M
      mm: introduce MADV_COLD · 9c276cc6
      Minchan Kim 提交于
      Patch series "Introduce MADV_COLD and MADV_PAGEOUT", v7.
      
      - Background
      
      The Android terminology used for forking a new process and starting an app
      from scratch is a cold start, while resuming an existing app is a hot
      start.  While we continually try to improve the performance of cold
      starts, hot starts will always be significantly less power hungry as well
      as faster so we are trying to make hot start more likely than cold start.
      
      To increase hot start, Android userspace manages the order that apps
      should be killed in a process called ActivityManagerService.
      ActivityManagerService tracks every Android app or service that the user
      could be interacting with at any time and translates that into a ranked
      list for lmkd(low memory killer daemon).  They are likely to be killed by
      lmkd if the system has to reclaim memory.  In that sense they are similar
      to entries in any other cache.  Those apps are kept alive for
      opportunistic performance improvements but those performance improvements
      will vary based on the memory requirements of individual workloads.
      
      - Problem
      
      Naturally, cached apps were dominant consumers of memory on the system.
      However, they were not significant consumers of swap even though they are
      good candidate for swap.  Under investigation, swapping out only begins
      once the low zone watermark is hit and kswapd wakes up, but the overall
      allocation rate in the system might trip lmkd thresholds and cause a
      cached process to be killed(we measured performance swapping out vs.
      zapping the memory by killing a process.  Unsurprisingly, zapping is 10x
      times faster even though we use zram which is much faster than real
      storage) so kill from lmkd will often satisfy the high zone watermark,
      resulting in very few pages actually being moved to swap.
      
      - Approach
      
      The approach we chose was to use a new interface to allow userspace to
      proactively reclaim entire processes by leveraging platform information.
      This allowed us to bypass the inaccuracy of the kernel’s LRUs for pages
      that are known to be cold from userspace and to avoid races with lmkd by
      reclaiming apps as soon as they entered the cached state.  Additionally,
      it could provide many chances for platform to use much information to
      optimize memory efficiency.
      
      To achieve the goal, the patchset introduce two new options for madvise.
      One is MADV_COLD which will deactivate activated pages and the other is
      MADV_PAGEOUT which will reclaim private pages instantly.  These new
      options complement MADV_DONTNEED and MADV_FREE by adding non-destructive
      ways to gain some free memory space.  MADV_PAGEOUT is similar to
      MADV_DONTNEED in a way that it hints the kernel that memory region is not
      currently needed and should be reclaimed immediately; MADV_COLD is similar
      to MADV_FREE in a way that it hints the kernel that memory region is not
      currently needed and should be reclaimed when memory pressure rises.
      
      This patch (of 5):
      
      When a process expects no accesses to a certain memory range, it could
      give a hint to kernel that the pages can be reclaimed when memory pressure
      happens but data should be preserved for future use.  This could reduce
      workingset eviction so it ends up increasing performance.
      
      This patch introduces the new MADV_COLD hint to madvise(2) syscall.
      MADV_COLD can be used by a process to mark a memory range as not expected
      to be used in the near future.  The hint can help kernel in deciding which
      pages to evict early during memory pressure.
      
      It works for every LRU pages like MADV_[DONTNEED|FREE]. IOW, It moves
      
      	active file page -> inactive file LRU
      	active anon page -> inacdtive anon LRU
      
      Unlike MADV_FREE, it doesn't move active anonymous pages to inactive file
      LRU's head because MADV_COLD is a little bit different symantic.
      MADV_FREE means it's okay to discard when the memory pressure because the
      content of the page is *garbage* so freeing such pages is almost zero
      overhead since we don't need to swap out and access afterward causes just
      minor fault.  Thus, it would make sense to put those freeable pages in
      inactive file LRU to compete other used-once pages.  It makes sense for
      implmentaion point of view, too because it's not swapbacked memory any
      longer until it would be re-dirtied.  Even, it could give a bonus to make
      them be reclaimed on swapless system.  However, MADV_COLD doesn't mean
      garbage so reclaiming them requires swap-out/in in the end so it's bigger
      cost.  Since we have designed VM LRU aging based on cost-model, anonymous
      cold pages would be better to position inactive anon's LRU list, not file
      LRU.  Furthermore, it would help to avoid unnecessary scanning if system
      doesn't have a swap device.  Let's start simpler way without adding
      complexity at this moment.  However, keep in mind, too that it's a caveat
      that workloads with a lot of pages cache are likely to ignore MADV_COLD on
      anonymous memory because we rarely age anonymous LRU lists.
      
      * man-page material
      
      MADV_COLD (since Linux x.x)
      
      Pages in the specified regions will be treated as less-recently-accessed
      compared to pages in the system with similar access frequencies.  In
      contrast to MADV_FREE, the contents of the region are preserved regardless
      of subsequent writes to pages.
      
      MADV_COLD cannot be applied to locked pages, Huge TLB pages, or VM_PFNMAP
      pages.
      
      [akpm@linux-foundation.org: resolve conflicts with hmm.git]
      Link: http://lkml.kernel.org/r/20190726023435.214162-2-minchan@kernel.orgSigned-off-by: NMinchan Kim <minchan@kernel.org>
      Reported-by: Nkbuild test robot <lkp@intel.com>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Daniel Colascione <dancol@google.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Hillf Danton <hdanton@sina.com>
      Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Oleksandr Natalenko <oleksandr@redhat.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Sonny Rao <sonnyrao@google.com>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Tim Murray <timmurray@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9c276cc6
  14. 25 9月, 2019 2 次提交
  15. 12 9月, 2019 1 次提交
  16. 08 9月, 2019 4 次提交
  17. 07 9月, 2019 1 次提交
  18. 05 9月, 2019 1 次提交
  19. 04 9月, 2019 1 次提交
    • C
      parisc: don't set ARCH_NO_COHERENT_DMA_MMAP · 5128da32
      Christoph Hellwig 提交于
      parisc is the only architecture that sets ARCH_NO_COHERENT_DMA_MMAP
      when an MMU is enabled.  AFAIK this is because parisc CPUs use VIVT
      caches, which means exporting normally cachable memory to userspace is
      relatively dangrous due to cache aliasing.
      
      But normally cachable memory is only allocated by dma_alloc_coherent
      on parisc when using the sba_iommu or ccio_iommu drivers, so just
      remove the .mmap implementation for them so that we don't have to set
      ARCH_NO_COHERENT_DMA_MMAP, which I plan to get rid of.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      5128da32
  20. 25 8月, 2019 1 次提交
  21. 21 8月, 2019 1 次提交
  22. 13 8月, 2019 2 次提交
  23. 11 8月, 2019 1 次提交
  24. 03 8月, 2019 3 次提交
新手
引导
客服 返回
顶部