1. 18 11月, 2017 1 次提交
  2. 16 11月, 2017 1 次提交
    • P
      sparc64: optimize struct page zeroing · 78c94366
      Pavel Tatashin 提交于
      Add an optimized mm_zero_struct_page(), so struct page's are zeroed
      without calling memset().  We do eight to ten regular stores based on
      the size of struct page.  Compiler optimizes out the conditions of
      switch() statement.
      
      SPARC-M6 with 15T of memory, single thread performance:
      
                                     BASE            FIX  OPTIMIZED_FIX
              bootmem_init   28.440467985s   2.305674818s   2.305161615s
      free_area_init_nodes  202.845901673s 225.343084508s 172.556506560s
                            --------------------------------------------
      Total                 231.286369658s 227.648759326s 174.861668175s
      
      BASE:  current linux
      FIX:   This patch series without "optimized struct page zeroing"
      OPTIMIZED_FIX: This patch series including the current patch.
      
      bootmem_init() is where memory for struct pages is zeroed during
      allocation.  Note, about two seconds in this function is a fixed time:
      it does not increase as memory is increased.
      
      Link: http://lkml.kernel.org/r/20171013173214.27300-11-pasha.tatashin@oracle.comSigned-off-by: NPavel Tatashin <pasha.tatashin@oracle.com>
      Reviewed-by: NSteven Sistare <steven.sistare@oracle.com>
      Reviewed-by: NDaniel Jordan <daniel.m.jordan@oracle.com>
      Reviewed-by: NBob Picco <bob.picco@oracle.com>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      78c94366
  3. 15 11月, 2017 5 次提交
    • J
      fcntl: don't cap l_start and l_end values for F_GETLK64 in compat syscall · 4d2dc2cc
      Jeff Layton 提交于
      Currently, we're capping the values too low in the F_GETLK64 case. The
      fields in that structure are 64-bit values, so we shouldn't need to do
      any sort of fixup there.
      
      Make sure we check that assumption at build time in the future however
      by ensuring that the sizes we're copying will fit.
      
      With this, we no longer need COMPAT_LOFF_T_MAX either, so remove it.
      
      Fixes: 94073ad7 (fs/locks: don't mess with the address limit in compat_fcntl64)
      Reported-by: NVitaly Lipatov <lav@etersoft.ru>
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Reviewed-by: NDavid Howells <dhowells@redhat.com>
      4d2dc2cc
    • N
      sparc64: Fix page table walk for PUD hugepages · 70f3c8b7
      Nitin Gupta 提交于
      For a PUD hugepage entry, we need to propagate bits [32:22]
      from virtual address to resolve at 4M granularity. However,
      the current code was incorrectly propagating bits [29:19].
      This bug can cause incorrect data to be returned for pages
      backed with 16G hugepages.
      Signed-off-by: NNitin Gupta <nitin.m.gupta@oracle.com>
      Reported-by: NAl Viro <viro@ZenIV.linux.org.uk>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      70f3c8b7
    • V
      sparc64: Define SPARC default __fls function · be52bbe3
      Vijay Kumar 提交于
      __fls will now require a boot time patching on T4 and above.
      Redefining it under arch/sparc/lib.
      Signed-off-by: NVijay Kumar <vijay.ac.kumar@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      be52bbe3
    • V
      sparc64: Define SPARC default fls function · 41413a60
      Vijay Kumar 提交于
      fls will now require a boot time patching on T4 and above.
      Redefining it under arch/sparc/lib.
      Signed-off-by: NVijay Kumar <vijay.ac.kumar@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      41413a60
    • N
      vDSO for sparc · 9a08862a
      Nagarathnam Muthusamy 提交于
      Following patch is based on work done by Nick Alcock on 64-bit vDSO for sparc
      in Oracle linux. I have extended it to include support for 32-bit vDSO for sparc
      on 64-bit kernel.
      
      vDSO for sparc is based on the X86 implementation. This patch
      provides vDSO support for both 64-bit and 32-bit programs on 64-bit kernel.
      vDSO will be disabled on 32-bit linux kernel on sparc.
      
      *) vclock_gettime.c contains all the vdso functions. Since data page is mapped
         before the vdso code page, the pointer to data page is got by subracting offset
         from an address in the vdso code page. The return address stored in
         %i7 is used for this purpose.
      *) During compilation, both 32-bit and 64-bit vdso images are compiled and are
         converted into raw bytes by vdso2c program to be ready for mapping into the
         process. 32-bit images are compiled only if CONFIG_COMPAT is enabled. vdso2c
         generates two files vdso-image-64.c and vdso-image-32.c which contains the
         respective vDSO image in C structure.
      *) During vdso initialization, required number of vdso pages are allocated and
         raw bytes are copied into the pages.
      *) During every exec, these pages are mapped into the process through
         arch_setup_additional_pages and the location of mapping is passed on to the
         process through aux vector AT_SYSINFO_EHDR which is used by glibc.
      *) A new update_vsyscall routine for sparc is added to keep the data page in
         vdso updated.
      *) As vDSO cannot contain dynamically relocatable references, a new version of
         cpu_relax is added for the use of vDSO.
      
      This change also requires a putback to glibc to use vDSO. For testing,
      programs planning to try vDSO can be compiled against the generated
      vdso(64/32).so in the source.
      
      Testing:
      
      ========
      [root@localhost ~]# cat vdso_test.c
      int main() {
              struct timespec tv_start, tv_end;
              struct timeval tv_tmp;
      	int i;
              int count = 1 * 1000 * 10000;
      	long long diff;
      
              clock_gettime(0, &tv_start);
              for (i = 0; i < count; i++)
                    gettimeofday(&tv_tmp, NULL);
              clock_gettime(0, &tv_end);
              diff = (long long)(tv_end.tv_sec -
      		tv_start.tv_sec)*(1*1000*1000*1000);
              diff += (tv_end.tv_nsec - tv_start.tv_nsec);
      	printf("Start sec: %d\n", tv_start.tv_sec);
      	printf("End sec  : %d\n", tv_end.tv_sec);
              printf("%d cycles in %lld ns = %f ns/cycle\n", count, diff,
      		(double)diff / (double)count);
              return 0;
      }
      
      [root@localhost ~]# cc vdso_test.c -o t32_without_fix -m32 -lrt
      [root@localhost ~]# ./t32_without_fix
      Start sec: 1502396130
      End sec  : 1502396140
      10000000 cycles in 9565148528 ns = 956.514853 ns/cycle
      [root@localhost ~]# cc vdso_test.c -o t32_with_fix -m32 ./vdso32.so.dbg
      [root@localhost ~]# ./t32_with_fix
      Start sec: 1502396168
      End sec  : 1502396169
      10000000 cycles in 798141262 ns = 79.814126 ns/cycle
      [root@localhost ~]# cc vdso_test.c -o t64_without_fix -m64 -lrt
      [root@localhost ~]# ./t64_without_fix
      Start sec: 1502396208
      End sec  : 1502396218
      10000000 cycles in 9846091800 ns = 984.609180 ns/cycle
      [root@localhost ~]# cc vdso_test.c -o t64_with_fix -m64 ./vdso64.so.dbg
      [root@localhost ~]# ./t64_with_fix
      Start sec: 1502396257
      End sec  : 1502396257
      10000000 cycles in 380984048 ns = 38.098405 ns/cycle
      
      V1 to V2 Changes:
      =================
      	Added hot patching code to switch the read stick instruction to read
      tick instruction based on the hardware.
      
      V2 to V3 Changes:
      =================
      	Merged latest changes from sparc-next and moved the initialization
      of clocksource_tick.archdata.vclock_mode to time_init_early. Disabled
      queued spinlock and rwlock configuration when simulating 32-bit config
      to compile 32-bit VDSO.
      
      V3 to V4 Changes:
      =================
      	Hardcoded the page size as 8192 in linker script for both 64-bit and
      32-bit binaries. Removed unused variables in vdso2c.h. Added -mv8plus flag to
      Makefile to prevent the generation of relocation entries for __lshrdi3 in 32-bit
      vdso binary.
      Signed-off-by: NNick Alcock <nick.alcock@oracle.com>
      Signed-off-by: NNagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com>
      Reviewed-by: NShannon Nelson <shannon.nelson@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9a08862a
  4. 08 11月, 2017 1 次提交
  5. 02 11月, 2017 2 次提交
    • G
      License cleanup: add SPDX license identifier to uapi header files with no license · 6f52b16c
      Greg Kroah-Hartman 提交于
      Many user space API headers are missing licensing information, which
      makes it hard for compliance tools to determine the correct license.
      
      By default are files without license information under the default
      license of the kernel, which is GPLV2.  Marking them GPLV2 would exclude
      them from being included in non GPLV2 code, which is obviously not
      intended. The user space API headers fall under the syscall exception
      which is in the kernels COPYING file:
      
         NOTE! This copyright does *not* cover user programs that use kernel
         services by normal system calls - this is merely considered normal use
         of the kernel, and does *not* fall under the heading of "derived work".
      
      otherwise syscall usage would not be possible.
      
      Update the files which contain no license information with an SPDX
      license identifier.  The chosen identifier is 'GPL-2.0 WITH
      Linux-syscall-note' which is the officially assigned identifier for the
      Linux syscall exception.  SPDX license identifiers are a legally binding
      shorthand, which can be used instead of the full boiler plate text.
      
      This patch is based on work done by Thomas Gleixner and Kate Stewart and
      Philippe Ombredanne.  See the previous patch in this series for the
      methodology of how this patch was researched.
      Reviewed-by: NKate Stewart <kstewart@linuxfoundation.org>
      Reviewed-by: NPhilippe Ombredanne <pombredanne@nexb.com>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6f52b16c
    • G
      License cleanup: add SPDX GPL-2.0 license identifier to files with no license · b2441318
      Greg Kroah-Hartman 提交于
      Many source files in the tree are missing licensing information, which
      makes it harder for compliance tools to determine the correct license.
      
      By default all files without license information are under the default
      license of the kernel, which is GPL version 2.
      
      Update the files which contain no license information with the 'GPL-2.0'
      SPDX license identifier.  The SPDX identifier is a legally binding
      shorthand, which can be used instead of the full boiler plate text.
      
      This patch is based on work done by Thomas Gleixner and Kate Stewart and
      Philippe Ombredanne.
      
      How this work was done:
      
      Patches were generated and checked against linux-4.14-rc6 for a subset of
      the use cases:
       - file had no licensing information it it.
       - file was a */uapi/* one with no licensing information in it,
       - file was a */uapi/* one with existing licensing information,
      
      Further patches will be generated in subsequent months to fix up cases
      where non-standard license headers were used, and references to license
      had to be inferred by heuristics based on keywords.
      
      The analysis to determine which SPDX License Identifier to be applied to
      a file was done in a spreadsheet of side by side results from of the
      output of two independent scanners (ScanCode & Windriver) producing SPDX
      tag:value files created by Philippe Ombredanne.  Philippe prepared the
      base worksheet, and did an initial spot review of a few 1000 files.
      
      The 4.13 kernel was the starting point of the analysis with 60,537 files
      assessed.  Kate Stewart did a file by file comparison of the scanner
      results in the spreadsheet to determine which SPDX license identifier(s)
      to be applied to the file. She confirmed any determination that was not
      immediately clear with lawyers working with the Linux Foundation.
      
      Criteria used to select files for SPDX license identifier tagging was:
       - Files considered eligible had to be source code files.
       - Make and config files were included as candidates if they contained >5
         lines of source
       - File already had some variant of a license header in it (even if <5
         lines).
      
      All documentation files were explicitly excluded.
      
      The following heuristics were used to determine which SPDX license
      identifiers to apply.
      
       - when both scanners couldn't find any license traces, file was
         considered to have no license information in it, and the top level
         COPYING file license applied.
      
         For non */uapi/* files that summary was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0                                              11139
      
         and resulted in the first patch in this series.
      
         If that file was a */uapi/* path one, it was "GPL-2.0 WITH
         Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0 WITH Linux-syscall-note                        930
      
         and resulted in the second patch in this series.
      
       - if a file had some form of licensing information in it, and was one
         of the */uapi/* ones, it was denoted with the Linux-syscall-note if
         any GPL family license was found in the file or had no licensing in
         it (per prior point).  Results summary:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|------
         GPL-2.0 WITH Linux-syscall-note                       270
         GPL-2.0+ WITH Linux-syscall-note                      169
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
         LGPL-2.1+ WITH Linux-syscall-note                      15
         GPL-1.0+ WITH Linux-syscall-note                       14
         ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
         LGPL-2.0+ WITH Linux-syscall-note                       4
         LGPL-2.1 WITH Linux-syscall-note                        3
         ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
         ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1
      
         and that resulted in the third patch in this series.
      
       - when the two scanners agreed on the detected license(s), that became
         the concluded license(s).
      
       - when there was disagreement between the two scanners (one detected a
         license but the other didn't, or they both detected different
         licenses) a manual inspection of the file occurred.
      
       - In most cases a manual inspection of the information in the file
         resulted in a clear resolution of the license that should apply (and
         which scanner probably needed to revisit its heuristics).
      
       - When it was not immediately clear, the license identifier was
         confirmed with lawyers working with the Linux Foundation.
      
       - If there was any question as to the appropriate license identifier,
         the file was flagged for further research and to be revisited later
         in time.
      
      In total, over 70 hours of logged manual review was done on the
      spreadsheet to determine the SPDX license identifiers to apply to the
      source files by Kate, Philippe, Thomas and, in some cases, confirmation
      by lawyers working with the Linux Foundation.
      
      Kate also obtained a third independent scan of the 4.13 code base from
      FOSSology, and compared selected files where the other two scanners
      disagreed against that SPDX file, to see if there was new insights.  The
      Windriver scanner is based on an older version of FOSSology in part, so
      they are related.
      
      Thomas did random spot checks in about 500 files from the spreadsheets
      for the uapi headers and agreed with SPDX license identifier in the
      files he inspected. For the non-uapi files Thomas did random spot checks
      in about 15000 files.
      
      In initial set of patches against 4.14-rc6, 3 files were found to have
      copy/paste license identifier errors, and have been fixed to reflect the
      correct identifier.
      
      Additionally Philippe spent 10 hours this week doing a detailed manual
      inspection and review of the 12,461 patched files from the initial patch
      version early this week with:
       - a full scancode scan run, collecting the matched texts, detected
         license ids and scores
       - reviewing anything where there was a license detected (about 500+
         files) to ensure that the applied SPDX license was correct
       - reviewing anything where there was no detection but the patch license
         was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
         SPDX license was correct
      
      This produced a worksheet with 20 files needing minor correction.  This
      worksheet was then exported into 3 different .csv files for the
      different types of files to be modified.
      
      These .csv files were then reviewed by Greg.  Thomas wrote a script to
      parse the csv files and add the proper SPDX tag to the file, in the
      format that the file expected.  This script was further refined by Greg
      based on the output to detect more types of files automatically and to
      distinguish between header and source .c files (which need different
      comment types.)  Finally Greg ran the script using the .csv files to
      generate the patches.
      Reviewed-by: NKate Stewart <kstewart@linuxfoundation.org>
      Reviewed-by: NPhilippe Ombredanne <pombredanne@nexb.com>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b2441318
  6. 25 10月, 2017 1 次提交
    • M
      locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns... · 6aa7de05
      Mark Rutland 提交于
      locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns to READ_ONCE()/WRITE_ONCE()
      
      Please do not apply this to mainline directly, instead please re-run the
      coccinelle script shown below and apply its output.
      
      For several reasons, it is desirable to use {READ,WRITE}_ONCE() in
      preference to ACCESS_ONCE(), and new code is expected to use one of the
      former. So far, there's been no reason to change most existing uses of
      ACCESS_ONCE(), as these aren't harmful, and changing them results in
      churn.
      
      However, for some features, the read/write distinction is critical to
      correct operation. To distinguish these cases, separate read/write
      accessors must be used. This patch migrates (most) remaining
      ACCESS_ONCE() instances to {READ,WRITE}_ONCE(), using the following
      coccinelle script:
      
      ----
      // Convert trivial ACCESS_ONCE() uses to equivalent READ_ONCE() and
      // WRITE_ONCE()
      
      // $ make coccicheck COCCI=/home/mark/once.cocci SPFLAGS="--include-headers" MODE=patch
      
      virtual patch
      
      @ depends on patch @
      expression E1, E2;
      @@
      
      - ACCESS_ONCE(E1) = E2
      + WRITE_ONCE(E1, E2)
      
      @ depends on patch @
      expression E;
      @@
      
      - ACCESS_ONCE(E)
      + READ_ONCE(E)
      ----
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: davem@davemloft.net
      Cc: linux-arch@vger.kernel.org
      Cc: mpe@ellerman.id.au
      Cc: shuah@kernel.org
      Cc: snitzer@redhat.com
      Cc: thor.thayer@linux.intel.com
      Cc: tj@kernel.org
      Cc: viro@zeniv.linux.org.uk
      Cc: will.deacon@arm.com
      Link: http://lkml.kernel.org/r/1508792849-3115-19-git-send-email-paulmck@linux.vnet.ibm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      6aa7de05
  7. 24 10月, 2017 1 次提交
    • W
      linux/compiler.h: Split into compiler.h and compiler_types.h · d1515582
      Will Deacon 提交于
      linux/compiler.h is included indirectly by linux/types.h via
      uapi/linux/types.h -> uapi/linux/posix_types.h -> linux/stddef.h
      -> uapi/linux/stddef.h and is needed to provide a proper definition of
      offsetof.
      
      Unfortunately, compiler.h requires a definition of
      smp_read_barrier_depends() for defining lockless_dereference() and soon
      for defining READ_ONCE(), which means that all
      users of READ_ONCE() will need to include asm/barrier.h to avoid splats
      such as:
      
         In file included from include/uapi/linux/stddef.h:1:0,
                          from include/linux/stddef.h:4,
                          from arch/h8300/kernel/asm-offsets.c:11:
         include/linux/list.h: In function 'list_empty':
      >> include/linux/compiler.h:343:2: error: implicit declaration of function 'smp_read_barrier_depends' [-Werror=implicit-function-declaration]
           smp_read_barrier_depends(); /* Enforce dependency ordering from x */ \
           ^
      
      A better alternative is to include asm/barrier.h in linux/compiler.h,
      but this requires a type definition for "bool" on some architectures
      (e.g. x86), which is defined later by linux/types.h. Type "bool" is also
      used directly in linux/compiler.h, so the whole thing is pretty fragile.
      
      This patch splits compiler.h in two: compiler_types.h contains type
      annotations, definitions and the compiler-specific parts, whereas
      compiler.h #includes compiler-types.h and additionally defines macros
      such as {READ,WRITE.ACCESS}_ONCE().
      
      uapi/linux/stddef.h and linux/linkage.h are then moved over to include
      linux/compiler_types.h, which fixes the build for h8 and blackfin.
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1508840570-22169-2-git-send-email-will.deacon@arm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      d1515582
  8. 19 10月, 2017 2 次提交
  9. 10 10月, 2017 3 次提交
  10. 28 9月, 2017 2 次提交
  11. 10 9月, 2017 1 次提交
    • A
      sparc64: speed up etrap/rtrap on NG2 and later processors · a7159a87
      Anthony Yznaga 提交于
      For many sun4v processor types, reading or writing a privileged register
      has a latency of 40 to 70 cycles.  Use a combination of the low-latency
      allclean, otherw, normalw, and nop instructions in etrap and rtrap to
      replace 2 rdpr and 5 wrpr instructions and improve etrap/rtrap
      performance.  allclean, otherw, and normalw are available on NG2 and
      later processors.
      
      The average ticks to execute the flush windows trap ("ta 0x3") with and
      without this patch on select platforms:
      
       CPU            Not patched     Patched    % Latency Reduction
      
       NG2            1762            1558            -11.58
       NG4            3619            3204            -11.47
       M7             3015            2624            -12.97
       SPARC64-X      829             770              -7.12
      Signed-off-by: NAnthony Yznaga <anthony.yznaga@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a7159a87
  12. 09 9月, 2017 1 次提交
    • M
      vga: optimise console scrolling · ac036f95
      Matthew Wilcox 提交于
      Where possible, call memset16(), memmove() or memcpy() instead of using
      open-coded loops.  I don't like the calling convention that uses a byte
      count instead of a count of u16s, but it's a little late to change that.
      Reduces code size of fbcon.o by almost 400 bytes on my laptop build.
      
      [akpm@linux-foundation.org: fix build]
      Link: http://lkml.kernel.org/r/20170720184539.31609-9-willy@infradead.orgSigned-off-by: NMatthew Wilcox <mawilcox@microsoft.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: David Miller <davem@davemloft.net>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
      Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Russell King <rmk+kernel@armlinux.org.uk>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ac036f95
  13. 26 8月, 2017 1 次提交
    • J
      futex: Remove duplicated code and fix undefined behaviour · 30d6e0a4
      Jiri Slaby 提交于
      There is code duplicated over all architecture's headers for
      futex_atomic_op_inuser. Namely op decoding, access_ok check for uaddr,
      and comparison of the result.
      
      Remove this duplication and leave up to the arches only the needed
      assembly which is now in arch_futex_atomic_op_inuser.
      
      This effectively distributes the Will Deacon's arm64 fix for undefined
      behaviour reported by UBSAN to all architectures. The fix was done in
      commit 5f16a046 (arm64: futex: Fix undefined behaviour with
      FUTEX_OP_OPARG_SHIFT usage). Look there for an example dump.
      
      And as suggested by Thomas, check for negative oparg too, because it was
      also reported to cause undefined behaviour report.
      
      Note that s390 removed access_ok check in d12a2970 ("s390/uaccess:
      remove pointless access_ok() checks") as access_ok there returns true.
      We introduce it back to the helper for the sake of simplicity (it gets
      optimized away anyway).
      Signed-off-by: NJiri Slaby <jslaby@suse.cz>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NRussell King <rmk+kernel@armlinux.org.uk>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> [s390]
      Acked-by: Chris Metcalf <cmetcalf@mellanox.com> [for tile]
      Reviewed-by: NDarren Hart (VMware) <dvhart@infradead.org>
      Reviewed-by: Will Deacon <will.deacon@arm.com> [core/arm64]
      Cc: linux-mips@linux-mips.org
      Cc: Rich Felker <dalias@libc.org>
      Cc: linux-ia64@vger.kernel.org
      Cc: linux-sh@vger.kernel.org
      Cc: peterz@infradead.org
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: sparclinux@vger.kernel.org
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: linux-s390@vger.kernel.org
      Cc: linux-arch@vger.kernel.org
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: linux-hexagon@vger.kernel.org
      Cc: Helge Deller <deller@gmx.de>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: linux-snps-arc@lists.infradead.org
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: linux-xtensa@linux-xtensa.org
      Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
      Cc: openrisc@lists.librecores.org
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: linux-parisc@vger.kernel.org
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: linux-alpha@vger.kernel.org
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: "David S. Miller" <davem@davemloft.net>
      Link: http://lkml.kernel.org/r/20170824073105.3901-1-jslaby@suse.cz
      30d6e0a4
  14. 17 8月, 2017 1 次提交
  15. 16 8月, 2017 3 次提交
  16. 11 8月, 2017 2 次提交
  17. 10 8月, 2017 5 次提交
  18. 05 8月, 2017 2 次提交
  19. 04 8月, 2017 1 次提交
  20. 25 7月, 2017 1 次提交
    • E
      signal: Remove kernel interal si_code magic · cc731525
      Eric W. Biederman 提交于
      struct siginfo is a union and the kernel since 2.4 has been hiding a union
      tag in the high 16bits of si_code using the values:
      __SI_KILL
      __SI_TIMER
      __SI_POLL
      __SI_FAULT
      __SI_CHLD
      __SI_RT
      __SI_MESGQ
      __SI_SYS
      
      While this looks plausible on the surface, in practice this situation has
      not worked well.
      
      - Injected positive signals are not copied to user space properly
        unless they have these magic high bits set.
      
      - Injected positive signals are not reported properly by signalfd
        unless they have these magic high bits set.
      
      - These kernel internal values leaked to userspace via ptrace_peek_siginfo
      
      - It was possible to inject these kernel internal values and cause the
        the kernel to misbehave.
      
      - Kernel developers got confused and expected these kernel internal values
        in userspace in kernel self tests.
      
      - Kernel developers got confused and set si_code to __SI_FAULT which
        is SI_USER in userspace which causes userspace to think an ordinary user
        sent the signal and that it was not kernel generated.
      
      - The values make it impossible to reorganize the code to transform
        siginfo_copy_to_user into a plain copy_to_user.  As si_code must
        be massaged before being passed to userspace.
      
      So remove these kernel internal si codes and make the kernel code simpler
      and more maintainable.
      
      To replace these kernel internal magic si_codes introduce the helper
      function siginfo_layout, that takes a signal number and an si_code and
      computes which union member of siginfo is being used.  Have
      siginfo_layout return an enumeration so that gcc will have enough
      information to warn if a switch statement does not handle all of union
      members.
      
      A couple of architectures have a messed up ABI that defines signal
      specific duplications of SI_USER which causes more special cases in
      siginfo_layout than I would like.  The good news is only problem
      architectures pay the cost.
      
      Update all of the code that used the previous magic __SI_ values to
      use the new SIL_ values and to call siginfo_layout to get those
      values.  Escept where not all of the cases are handled remove the
      defaults in the switch statements so that if a new case is missed in
      the future the lack will show up at compile time.
      
      Modify the code that copies siginfo si_code to userspace to just copy
      the value and not cast si_code to a short first.  The high bits are no
      longer used to hold a magic union member.
      
      Fixup the siginfo header files to stop including the __SI_ values in
      their constants and for the headers that were missing it to properly
      update the number of si_codes for each signal type.
      
      The fixes to copy_siginfo_from_user32 implementations has the
      interesting property that several of them perviously should never have
      worked as the __SI_ values they depended up where kernel internal.
      With that dependency gone those implementations should work much
      better.
      
      The idea of not passing the __SI_ values out to userspace and then
      not reinserting them has been tested with criu and criu worked without
      changes.
      
      Ref: 2.4.0-test1
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      cc731525
  21. 20 7月, 2017 1 次提交
    • E
      signal/sparc: Document a conflict with SI_USER with SIGFPE · cc9f72e4
      Eric W. Biederman 提交于
      Setting si_code to __SI_FAULT results in a userspace seeing
      an si_code of 0.  This is the same si_code as SI_USER.  Posix
      and common sense requires that SI_USER not be a signal specific
      si_code.  As such this use of 0 for the si_code is a pretty
      horribly broken ABI.
      
      This was introduced in 2.3.41 so this mess has had a long time for
      people to be able to start depending on it.
      
      As this bug has existed for 17 years already I don't know if it is
      worth fixing.  It is definitely worth documenting what is going
      on so that no one decides to copy this bad decision.
      
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: sparclinux@vger.kernel.org
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      cc9f72e4
  22. 19 7月, 2017 1 次提交
    • R
      sparc64: Prevent perf from running during super critical sections · fc290a11
      Rob Gardner 提交于
      This fixes another cause of random segfaults and bus errors that may
      occur while running perf with the callgraph option.
      
      Critical sections beginning with spin_lock_irqsave() raise the interrupt
      level to PIL_NORMAL_MAX (14) and intentionally do not block performance
      counter interrupts, which arrive at PIL_NMI (15).
      
      But some sections of code are "super critical" with respect to perf
      because the perf_callchain_user() path accesses user space and may cause
      TLB activity as well as faults as it unwinds the user stack.
      
      One particular critical section occurs in switch_mm:
      
              spin_lock_irqsave(&mm->context.lock, flags);
              ...
              load_secondary_context(mm);
              tsb_context_switch(mm);
              ...
              spin_unlock_irqrestore(&mm->context.lock, flags);
      
      If a perf interrupt arrives in between load_secondary_context() and
      tsb_context_switch(), then perf_callchain_user() could execute with
      the context ID of one process, but with an active TSB for a different
      process. When the user stack is accessed, it is very likely to
      incur a TLB miss, since the h/w context ID has been changed. The TLB
      will then be reloaded with a translation from the TSB for one process,
      but using a context ID for another process. This exposes memory from
      one process to another, and since it is a mapping for stack memory,
      this usually causes the new process to crash quickly.
      
      This super critical section needs more protection than is provided
      by spin_lock_irqsave() since perf interrupts must not be allowed in.
      
      Since __tsb_context_switch already goes through the trouble of
      disabling interrupts completely, we fix this by moving the secondary
      context load down into this better protected region.
      
      Orabug: 25577560
      Signed-off-by: NDave Aldridge <david.j.aldridge@oracle.com>
      Signed-off-by: NRob Gardner <rob.gardner@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fc290a11
  23. 17 7月, 2017 1 次提交